How Do Transformers Model Physics? Investigating the Simple Harmonic Oscillator
Abstract
1. Introduction
2. Related Work
3. Developing Criteria for Intermediates with Linear Regression
- What is an intermediate?
- How can intermediates be encoded and how can we robustly probe for them?
- When, or under what circumstances, are intermediates encoded?
3.1. What Is an Intermediate?
3.2. How Can Intermediates Be Encoded and How Can We Robustly Probe for Them?
3.3. When, or Under What Circumstances, Are Intermediates Encoded?
3.4. Key Question: How Can We Use Intermediates to Demonstrate That a Transformer Is Actually Using a Method in Its Computations?
- Criteria for use of a method g with an associated unique intermediate I (a minimal probing/intervention sketch follows this list):
- If the model uses a method g, then model performance should improve as I is better represented (shown in Figure 3).
- If and only if the model uses g, then we expect some hidden state's variance to be almost fully explained by I (shown in Figure 4).
- If and only if the model uses g, then we can intervene on the model's hidden states to change the encoded intermediate I and thereby predictably change the model's output (shown in Figure 4).
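A minimal sketch of how such a linear probe and hidden-state intervention might be implemented; the array shapes, layer choice, and the pseudo-inverse patching step are illustrative assumptions rather than the authors' exact pipeline.

```python
import numpy as np

# Hypothetical data: hidden states recorded at one layer for N input sequences,
# and the candidate intermediate I (e.g., entries of an update matrix) for each.
N, d_model, d_I = 1000, 64, 4
hidden = np.random.randn(N, d_model)       # stand-in for recorded activations
intermediate = np.random.randn(N, d_I)     # stand-in for the method's intermediate I

# Encoding criterion: fit a linear probe hidden -> I and score it with R^2.
W, *_ = np.linalg.lstsq(hidden, intermediate, rcond=None)
pred = hidden @ W
ss_res = ((intermediate - pred) ** 2).sum(axis=0)
ss_tot = ((intermediate - intermediate.mean(axis=0)) ** 2).sum(axis=0)
r2 = 1.0 - ss_res / ss_tot
print("probe R^2 per intermediate dimension:", r2)

# Intervention criterion: map a *target* value of I back into activation space
# (here via the probe's pseudo-inverse) and patch it into a hidden state; one
# would then rerun the model from this patched state and check that the output
# shifts as method g predicts.
W_pinv = np.linalg.pinv(W)
target_I = intermediate[0] * 1.1           # perturbed intermediate we want to impose
patched_hidden = hidden[0] + (target_I - intermediate[0]) @ W_pinv
```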
4. Investigating the Simple Harmonic Oscillator
- Can a transformer successfully predict the position/velocity at the SHO’s next timestep?
- Can we determine what computational method the transformer is using in this prediction?
4.1. Mathematical and Computational Setup
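This section presumably uses the standard undamped simple harmonic oscillator; a minimal sketch of that setup in the usual notation, assuming x is position, v is velocity, ω the angular frequency, and Δt the discretization step:

```latex
% Undamped SHO and its first-order (state-space) form.
\ddot{x} = -\omega^2 x
\quad\Longleftrightarrow\quad
\frac{d}{dt}\begin{pmatrix} x \\ v \end{pmatrix}
= \underbrace{\begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}}_{A}
\begin{pmatrix} x \\ v \end{pmatrix},
\qquad
\begin{pmatrix} x_{t+\Delta t} \\ v_{t+\Delta t} \end{pmatrix}
= e^{A\Delta t} \begin{pmatrix} x_t \\ v_t \end{pmatrix}.
```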
4.2. What Methods Could the Transformer Use to Model the Simple Harmonic Oscillator?
4.3. Evaluating Methods for the Undamped Harmonic Oscillator
| Criterion | Linear Multistep | Taylor Expansion | Matrix Exponential |
|---|---|---|---|
| 1. Intermediate encoding | 0.66/0.51 | 0.67/0.25 | 0.84/0.54 |
| 2. Performance, encoding correlation | 0.73/0.44 | 0.74/0.39 | 0.89/0.44 |
| 3. Intermediate’s explanatory power | 0.42/0.15 | 0.53/0.11 | 0.78/0.16 |
| 4. Intervention success | 0.44/X | 0.44/X | 0.72/X |
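Concretely, the three candidate methods correspond to different one-step update rules for the state (x, v). The sketch below shows what each rule might look like for the undamped oscillator; the truncation order, coefficients, and function names are illustrative assumptions and need not match the paper's exact parameterization.

```python
import numpy as np
from scipy.linalg import expm

def step_matrix_exponential(x, v, omega, dt):
    """Exact update: (x, v) <- e^{A dt} (x, v) with A = [[0, 1], [-omega^2, 0]]."""
    A = np.array([[0.0, 1.0], [-omega**2, 0.0]])
    return expm(A * dt) @ np.array([x, v])

def step_taylor(x, v, omega, dt):
    """Second-order Taylor expansion of the trajectory about time t."""
    a = -omega**2 * x                      # acceleration from x'' = -omega^2 x
    x_new = x + v * dt + 0.5 * a * dt**2
    v_new = v + a * dt + 0.5 * (-omega**2 * v) * dt**2
    return np.array([x_new, v_new])

def step_linear_multistep(x_prev, x_curr, omega, dt):
    """Two-step recurrence: x_{t+1} as a fixed linear combination of earlier
    positions; for the exact SHO the coefficients are (2 cos(omega dt), -1)."""
    return 2.0 * np.cos(omega * dt) * x_curr - x_prev
```

Probing for the corresponding intermediates then amounts to asking whether the hidden states linearly encode, respectively, the multistep coefficients, the Taylor-series coefficients, or the entries of e^{AΔt}.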
4.4. Extension to the Damped Harmonic Oscillator (γ > 0)
5. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Additional Results for Linear Regression
Appendix B. Undamped Harmonic Oscillator Appendices
Appendix C. Undamped Harmonic Oscillator Appendices
Appendix D. Investigating the Damped Harmonic Oscillator (γ > 0)
Appendix D.1. Mathematical Setup
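Given the γ > 0 in the appendix title, the damped setup is presumably the conventional one below; the factor of 2 on the damping term is one common convention and is an assumption here.

```latex
% Damped harmonic oscillator; gamma = 0 recovers the undamped case.
\ddot{x} + 2\gamma \dot{x} + \omega^2 x = 0,
\qquad
\frac{d}{dt}\begin{pmatrix} x \\ v \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ -\omega^2 & -2\gamma \end{pmatrix}
\begin{pmatrix} x \\ v \end{pmatrix},
\qquad \text{underdamped when } \gamma < \omega.
```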
Appendix D.2. Computational Setup for the Damped Harmonic Oscillator
Appendix D.3. Criteria Are Less Aligned for the Underdamped Harmonic Oscillator
References
- Vaswani, A.; Shazeer, N.M.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All you Need. In Proceedings of the Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019. [Google Scholar]
- Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S.; et al. PaLM: Scaling Language Modeling with Pathways. J. Mach. Learn. Res. 2022, 24, 240:1–240:113. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 9992–10002. [Google Scholar]
- Elhage, N.; Hume, T.; Olsson, C.; Schiefer, N.; Henighan, T.; Kravec, S.; Hatfield-Dodds, Z.; Lasenby, R.; Drain, D.; Chen, C.; et al. Toy Models of Superposition. arXiv 2022, arXiv:2209.10652. [Google Scholar]
- Olsson, C.; Elhage, N.; Nanda, N.; Joseph, N.; DasSarma, N.; Henighan, T.; Mann, B.; Askell, A.; Bai, Y.; Chen, A.; et al. In-context Learning and Induction Heads. arXiv 2022, arXiv:2209.11895. Available online: https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html (accessed on 6 November 2024).
- Liu, Z.; Kitouni, O.; Nolte, N.S.; Michaud, E.; Tegmark, M.; Williams, M. Towards understanding grokking: An effective theory of representation learning. Adv. Neural Inf. Process. Syst. 2022, 35, 34651–34663. [Google Scholar]
- Elhage, N.; Nanda, N.; Olsson, C.; Henighan, T.; Joseph, N.; Mann, B.; Askell, A.; Bai, Y.; Chen, A.; Conerly, T.; et al. A Mathematical Framework for Transformer Circuits. Transform. Circuits Thread 2021, 1, 12. Available online: https://transformer-circuits.pub/2021/framework/index.html (accessed on 6 November 2024).
- Chughtai, B.; Chan, L.; Nanda, N. A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations. In Proceedings of the Fortieth International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023. [Google Scholar]
- Gurnee, W.; Nanda, N.; Pauly, M.; Harvey, K.; Troitskii, D.; Bertsimas, D. Finding Neurons in a Haystack: Case Studies with Sparse Probing. arXiv 2023, arXiv:2305.01610. [Google Scholar]
- Wang, K.R.; Variengien, A.; Conmy, A.; Shlegeris, B.; Steinhardt, J. Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Conmy, A.; Mavor-Parker, A.N.; Lynch, A.; Heimersheim, S.; Garriga-Alonso, A. Towards Automated Circuit Discovery for Mechanistic Interpretability. arXiv 2023, arXiv:2304.14997. [Google Scholar]
- Nanda, N.; Chan, L.; Lieberum, T.; Smith, J.; Steinhardt, J. Progress measures for grokking via mechanistic interpretability. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Zhong, Z.; Liu, Z.; Tegmark, M.; Andreas, J. The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks. arXiv 2023, arXiv:2306.17844. [Google Scholar]
- Liu, T.J.B.; Boullé, N.; Sarfati, R.; Earls, C.J. LLMs learn governing principles of dynamical systems, revealing an in-context neural scaling law. arXiv 2024, arXiv:2402.00795. [Google Scholar]
- Shapiro, J.A. Classical Mechanics; Lagrange’s and Hamilton’s Equations, 1st ed.; Rutgers University: Piscataway, NJ, USA, 2010; Chapter 2. [Google Scholar]
- Gurnee, W.; Tegmark, M. Language Models Represent Space and Time. arXiv 2023, arXiv:2310.02207. [Google Scholar]
- Sourmpis, C.; Petersen, C.; Gerstner, W.; Bellec, G. Biologically informed cortical models predict optogenetic perturbations. bioRxiv 2024. [Google Scholar] [CrossRef]
- Schrimpf, M.; McGrath, P.; Margalit, E.; DiCarlo, J.J. Do Topographic Deep ANN Models of the Primate Ventral Stream Predict the Perceptual Effects of Direct IT Cortical Interventions? bioRxiv 2024. [Google Scholar] [CrossRef]
- Alain, G.; Bengio, Y. Understanding intermediate layers using linear classifier probes. arXiv 2016, arXiv:1610.01644. [Google Scholar]
- Wei, J.; Tay, Y.; Bommasani, R.; Raffel, C.; Zoph, B.; Borgeaud, S.; Yogatama, D.; Bosma, M.; Zhou, D.; Metzler, D.; et al. Emergent Abilities of Large Language Models. arXiv 2022, arXiv:2206.07682. [Google Scholar]
- Schaeffer, R.; Miranda, B.; Koyejo, S. Are Emergent Abilities of Large Language Models a Mirage? arXiv 2023, arXiv:2304.15004. [Google Scholar]
- Udrescu, S.M.; Tegmark, M. AI Feynman: A physics-inspired method for symbolic regression. Sci. Adv. 2020, 6, eaay2631. [Google Scholar] [CrossRef]
- Liu, Z.; Tegmark, M. Machine Learning Conservation Laws from Trajectories. Phys. Rev. Lett. 2021, 126, 180604. [Google Scholar] [CrossRef]
- Cranmer, M.; Greydanus, S.; Hoyer, S.; Battaglia, P.W.; Spergel, D.N.; Ho, S. Lagrangian Neural Networks. arXiv 2020, arXiv:2003.04630. [Google Scholar]
- Greydanus, S.; Dzamba, M.; Yosinski, J. Hamiltonian Neural Networks. Adv. Neural Inf. Process. Syst. 2019, 32, 15353–15363. [Google Scholar]
- Kantamneni, S.; Liu, Z.; Tegmark, M. OptPDE: Discovering Novel Integrable Systems via AI-Human Collaboration. arXiv 2024, arXiv:2405.04484. [Google Scholar]
- Akyürek, E.; Schuurmans, D.; Andreas, J.; Ma, T.; Zhou, D. What learning algorithm is in-context learning? Investigations with linear models. arXiv 2022, arXiv:2211.15661. [Google Scholar]
- Garg, S.; Tsipras, D.; Liang, P.; Valiant, G. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes. arXiv 2022, arXiv:2208.01066. [Google Scholar]
- Härdle, W.; Simar, L. Canonical Correlation Analysis. In Applied Multivariate Statistical Analysis; Springer: Berlin/Heidelberg, Germany, 2007; pp. 321–330. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Frankle, J.; Carbin, M. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. arXiv 2018, arXiv:1803.03635. [Google Scholar]
- Liu, Z.; Tegmark, M. A Neural Scaling Law from Lottery Ticket Ensembling. arXiv 2023, arXiv:2310.02258. [Google Scholar]
- Ravichander, A.; Belinkov, Y.; Hovy, E.H. Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance? In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics, St. Julian’s, Malta, 21–22 March 2020. [Google Scholar]
- Butcher, J.C. Chapter 3: Runge-Kutta Methods. In Numerical Methods for Ordinary Differential Equations; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2016; pp. 143–331. [Google Scholar] [CrossRef]
- Butcher, J.C. Numerical Methods for Ordinary Differential Equations; John Wiley & Sons: Hoboken, NJ, USA, 2016. [Google Scholar]
- University of Victoria. ODEs: Matrix Exponentials. Adapted for Math 204 at the University of Victoria. Available online: https://web.uvic.ca/~tbazett/diffyqs/sec_matexp.html (accessed on 16 May 2024).
- Reuther, A.; Kepner, J.; Byun, C.; Samsi, S.; Arcand, W.; Bestor, D.; Bergeron, B.; Gadepally, V.; Houle, M.; Hubbell, M.; et al. Interactive supercomputing on 40,000 cores for machine learning and data analysis. In Proceedings of the 2018 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 25–27 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar]
- Brunton, S.L. Notes on Koopman Operator Theory; University of Washington, Department of Mechanical Engineering: Seattle, WA, USA, 2019. [Google Scholar]
| Method | Intermediate I |
|---|---|
| Linear Multistep | |
| Taylor Expansion | |
| Matrix Exponential | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).