    Massive torsion modes from Adler-Bell-Jackiw and scaling anomalies
    3 citations · 1 reference · 10 related papers
    Abstract:
    Regularization of quantum field theories introduces a mass scale which breaks axial rotational and scaling invariances. We demonstrate from first principles that axial torsion and torsion trace modes have non-transverse vacuum polarization tensors, and become massive as a result. The underlying reasons are similar to those responsible for the Adler-Bell-Jackiw (ABJ) and scaling anomalies. Since these are the only torsion components that can couple minimally to spin 1/2 particles, the anomalous generation of masses for these modes, naturally of the order of the regulator scale, may help to explain why torsion and its associated effects, including CPT violation in chiral gravity, have so far escaped detection. As a simpler manifestation of the reasons underpinning the ABJ anomaly than triangle diagrams, the vacuum polarization demonstration is also pedagogically useful.
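    The anomaly structure the abstract appeals to can be made concrete with the textbook form of the ABJ anomaly. The following is the standard result for a single Dirac fermion coupled to a U(1) gauge field, given here only as background; the paper's torsion analysis is analogous (with the axial torsion and torsion-trace modes in place of the gauge field), not identical:

    ```latex
    % Anomalous divergence of the axial current (standard ABJ result):
    \partial_\mu j^\mu_5 \;=\; \frac{e^2}{16\pi^2}\,
        \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma}
    % Schematically, a non-transverse vacuum polarization for a mode S_\mu,
    %   q_\mu \Pi^{\mu\nu}(q) \neq 0,
    % signals an effective mass term \tfrac{1}{2} m^2 S_\mu S^\mu,
    % with m naturally of the order of the regulator scale.
    ```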
    Keywords:
    Regularization
    Language models have been shown to exhibit positive scaling, where performance improves as models are scaled up in terms of size, compute, or data. In this work, we introduce NeQA, a dataset consisting of questions with negation in which language models do not exhibit straightforward positive scaling. We show that this task can exhibit inverse scaling, U-shaped scaling, or positive scaling, and the three scaling trends shift in this order as we use more powerful prompting methods or model families. We hypothesize that solving NeQA depends on two subtasks: question answering (task 1) and negation understanding (task 2). We find that task 1 has linear scaling, while task 2 has sigmoid-shaped scaling with an emergent transition point, and composing these two scaling trends yields the final scaling trend of NeQA. Our work reveals and provides a way to analyze the complex scaling trends of language models.
    Negation
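    The composition hypothesis above can be sketched numerically. The following is a minimal illustration, not the authors' fitted model: the functional forms, midpoint, and steepness are assumptions chosen only to show how a linear task-1 trend and a sigmoid task-2 trend can compose into inverse, then U-shaped, scaling. The composition rule assumes a model that misunderstands negation effectively answers the un-negated question, so its QA ability works against it.

    ```python
    import numpy as np

    def task1_qa(scale):
        # Task 1 (question answering): roughly linear in normalized scale [0, 1].
        return 0.5 + 0.5 * scale

    def task2_negation(scale, midpoint=0.6, steepness=20.0):
        # Task 2 (negation understanding): sigmoid with an emergent transition.
        return 1.0 / (1.0 + np.exp(-steepness * (scale - midpoint)))

    def neqa_accuracy(scale):
        t1, t2 = task1_qa(scale), task2_negation(scale)
        # If negation is understood, QA ability applies directly; if not,
        # the model in effect answers the un-negated question, so its QA
        # ability is inverted.
        return t2 * t1 + (1.0 - t2) * (1.0 - t1)

    scales = np.linspace(0.0, 1.0, 11)
    acc = neqa_accuracy(scales)
    # Accuracy first falls with scale (inverse scaling) while task 2 is near
    # zero, then rises sharply after the task-2 transition (U-shaped scaling).
    ```

    With a weaker task-2 sigmoid (transition beyond the largest model), only the falling branch is visible, which matches the abstract's point that the observed trend depends on where the emergent transition sits relative to the evaluated scale range.
    
    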
    Scaling up language models has been empirically shown to improve performance on a wide range of downstream tasks. However, if we were to observe worse performance as a function of scale ("inverse scaling") on certain tasks, this would indicate that scaling can also encourage behaviors that are misaligned with human preferences. The Inverse Scaling Prize (McKenzie et al. 2022) identified eleven such inverse scaling tasks, evaluated on models of up to 280B parameters and up to 500 zettaFLOPs of training compute. This paper takes a closer look at these inverse scaling tasks. We evaluate models of up to 540B parameters, trained on five times more compute than those evaluated in the Inverse Scaling Prize. With this increased range of model sizes and training compute, only four out of the eleven tasks remain inverse scaling. Six out of the eleven tasks exhibit "U-shaped scaling", where performance decreases up to a certain size, and then increases again up to the largest model evaluated (the one remaining task displays positive scaling). In addition, we find that 1-shot examples and chain-of-thought can help mitigate undesirable scaling patterns even further. U-shaped scaling suggests that the inverse scaling trend observed in McKenzie et al. (2022) may not continue to hold for larger models, which we attribute to the presence of distractor tasks that only sufficiently large models can avoid.
    Improved approximations of displacements and stresses, achieved by the following types of scaling, are presented: a. scaling of the initial stiffness matrix; b. a new type of scaling of displacements; and c. mixed scaling of stiffness and displacements, where the two types of scaling are combined. The geometric interpretation of the various scaling types is illustrated, and methods for selecting the scaling multipliers based on geometrical considerations, mathematical criteria, and the reduced-basis approach are demonstrated and compared. It is shown that high-quality approximations can be achieved for very large changes in cross-section and geometrical variables with small computational effort. The results presented indicate that scaling procedures have high potential in future applications where effective reanalysis is essential.
    Basis (linear algebra)
    Matrix (chemical analysis)
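    The displacement-scaling idea in the abstract above can be sketched in a few lines. This is a minimal illustration of one such scheme (type b, scaling the initial displacement vector by a multiplier chosen from a mathematical criterion), not the authors' exact procedure; the matrices and the least-squares selection rule are assumptions for the sketch:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Initial design: stiffness K0, load f, exact initial displacements u0.
    n = 6
    A = rng.standard_normal((n, n))
    K0 = A @ A.T + n * np.eye(n)          # symmetric positive definite
    f = rng.standard_normal(n)
    u0 = np.linalg.solve(K0, f)

    # Modified design: K = K0 + dK (a large change in cross sections/geometry).
    B = 0.4 * rng.standard_normal((n, n))
    K = K0 + B @ B.T

    # Reanalysis by displacement scaling: approximate u ~ alpha * u0, with the
    # multiplier alpha chosen to minimize the residual ||f - K (alpha u0)||.
    Ku0 = K @ u0
    alpha = (Ku0 @ f) / (Ku0 @ Ku0)
    u_approx = alpha * u0

    # Compare residuals of the scaled and unscaled approximations.
    res_scaled = np.linalg.norm(f - K @ u_approx)
    res_unscaled = np.linalg.norm(f - K @ u0)
    ```

    Because alpha = 1 is always an admissible choice, the least-squares multiplier can never produce a larger residual than reusing u0 unscaled, which is the basic appeal of scaling-based reanalysis: one cheap scalar recovers much of the accuracy of a full re-solve.
    
    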