Hacker News • 95 days ago

There Will Be a Scientific Theory of Deep Learning

IMP

8/10

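Key Summary

Jamie Simon and 13 co-authors argue that a scientific theory is emerging that explains the training process, weights, and performance of deep neural networks. The paper unifies five major lines of research under a new perspective the authors call learning mechanics. It matters to both researchers and practitioners because it argues that the groundwork now exists for characterizing, in mathematical and mechanical terms, how neural networks long treated as "black boxes" actually work.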

Translated Text

Statistics > Machine Learning

arXiv:2604.21691 (stat) [Submitted on 23 Apr 2026]

Title: There Will Be a Scientific Theory of Deep Learning

Authors: Jamie Simon, Daniel Kunin, Alexander Atanasov, Enric Boix-Adserà, Blake Bordelon, Jeremy Cohen, Nikhil Ghosh, Florentin Guth, Arthur Jacot, Mason Kamb, Dhruva Karkada, Eric J. Michaud, Berkan Ottlik, Joseph Turnbull

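Abstract: In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory that characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We pull together major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: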

(a) solvable idealized settings that provide intuition for learning dynamics in realistic systems;
(b) tractable limits that reveal insights into fundamental learning phenomena;
(c) simple mathematical laws that capture important macroscopic observables;
(d) theories of hyperparameters that disentangle the hyperparameters from the rest of the training process, leaving simpler systems behind; and
(e) universal behaviors shared across systems and settings, which clarify which phenomena call for explanation.
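To make item (a) concrete, here is a minimal sketch of one textbook solvable setting: gradient descent on a diagonal quadratic loss, where the error along each coordinate shrinks by a factor of (1 - lr * curvature) per step. This example is not taken from the paper. The function names, curvatures, learning rate, and step count are all hypothetical, chosen only to show a simulation agreeing with a closed-form prediction.

```python
# Illustrative sketch only: gradient descent on a diagonal quadratic loss,
#   L(w) = 0.5 * sum_i lam[i] * (w[i] - w_star[i]) ** 2.
# All names and numbers here are hypothetical, not taken from the paper.


def gradient_descent(lam, w_star, w0, lr, steps):
    """Run plain gradient descent and return the final weights."""
    w = list(w0)
    for _ in range(steps):
        # The gradient along coordinate i is lam[i] * (w[i] - w_star[i]).
        w = [wi - lr * li * (wi - si) for wi, li, si in zip(w, lam, w_star)]
    return w


def closed_form(lam, w_star, w0, lr, steps):
    """Exact solution: each error mode shrinks by (1 - lr * lam[i]) per step."""
    return [si + (1.0 - lr * li) ** steps * (wi - si)
            for wi, li, si in zip(w0, lam, w_star)]


lam = [1.0, 0.1, 0.01]      # curvature of the loss along each coordinate
w_star = [1.0, -2.0, 0.5]   # minimizer of the loss
w0 = [0.0, 0.0, 0.0]        # initialization
lr, steps = 0.5, 50

simulated = gradient_descent(lam, w_star, w0, lr, steps)
predicted = closed_form(lam, w_star, w0, lr, steps)
max_gap = max(abs(s - p) for s, p in zip(simulated, predicted))
print("simulated:", simulated)
print("predicted:", predicted)
print("largest gap:", max_gap)
```

In this toy problem the high-curvature coordinate converges almost immediately while the low-curvature one is still far from its target after 50 steps, and the formula predicts both rates exactly. That is the kind of quantitative, checkable statement about training dynamics that the list above describes, here in a setting far simpler than a real network.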

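Taken together, these bodies of work share certain broad traits: they are concerned with the dynamics of the training process, they primarily seek to describe coarse aggregate statistics, and they emphasize falsifiable quantitative predictions. We argue that the emerging theory is best thought of as a mechanics of the learning process, and suggest the name learning mechanics. We discuss the relationship between this mechanics perspective and other approaches to building a theory of deep learning, including the statistical and information-theoretic perspectives. In particular, we anticipate a symbiotic relationship between learning mechanics and mechanistic interpretability. We also review and address common arguments that fundamental theory will not be possible or is not important. We conclude with a portrait of important open directions in learning mechanics and advice for beginners. Further introductory materials, perspectives, and open questions are hosted on the website linked in the abstract.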

Comments: 41 pages, 6 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2604.21691 [stat.ML]
Submission history: Daniel Kunin, [v1] 23 Apr 2026 (3,519 KB)

Original Text (English)

Statistics > Machine Learning

arXiv:2604.21691 (stat) [Submitted on 23 Apr 2026]

Title: There Will Be a Scientific Theory of Deep Learning

Authors: Jamie Simon, Daniel Kunin, Alexander Atanasov, Enric Boix-Adserà, Blake Bordelon, Jeremy Cohen, Nikhil Ghosh, Florentin Guth, Arthur Jacot, Mason Kamb, Dhruva Karkada, Eric J. Michaud, Berkan Ottlik, Joseph Turnbull

Abstract: In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory which characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We pull together major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: (a) solvable idealized settings that provide intuition for learning dynamics in realistic systems; (b) tractable limits that reveal insights into fundamental learning phenomena; (c) simple mathematical laws that capture important macroscopic observables; (d) theories of hyperparameters that disentangle them from the rest of the training process, leaving simpler systems behind; and (e) universal behaviors shared across systems and settings which clarify which phenomena call for explanation. Taken together, these bodies of work share certain broad traits: they are concerned with the dynamics of the training process; they primarily seek to describe coarse aggregate statistics; and they emphasize falsifiable quantitative predictions. We argue that the emerging theory is best thought of as a mechanics of the learning process, and suggest the name learning mechanics. We discuss the relationship between this mechanics perspective and other approaches for building a theory of deep learning, including the statistical and information-theoretic perspectives. In particular, we anticipate a symbiotic relationship between learning mechanics and mechanistic interpretability. We also review and address common arguments that fundamental theory will not be possible or is not important. We conclude with a portrait of important open directions in learning mechanics and advice for beginners. We host further introductory materials, perspectives, and open questions at this http URL.

Comments: 41 pages, 6 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2604.21691 [stat.ML] (or arXiv:2604.21691v1 [stat.ML] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.21691 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From Daniel Kunin, [v1] Thu, 23 Apr 2026 13:58:12 UTC (3,519 KB)

deep learning, theory, learning mechanics, neural networks, machine learning, arXiv, paper