HN
Hacker News • 2 days ago
Five Latest LLMs Disagree on 67% of Real-World Fact-Check Claims
IMP 7/10
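Key Summary
According to a study shared on Hacker News, five of the latest frontier large language models (LLMs) returned differing verdicts on 67% of 1,000 real-world fact-check claims. This suggests that even today's most advanced AI models diverge widely when judging the veracity of complex real-world information, which is a serious limitation.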
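To make the headline figure concrete, here is a minimal sketch of how such a disagreement rate could be computed. It assumes that "disagreement" means the five verdicts for a claim are not unanimous; the summary does not state the study's exact definition, and the function name, verdict labels, and toy data below are all hypothetical.

```python
from typing import Sequence


def disagreement_rate(verdicts_per_claim: Sequence[Sequence[str]]) -> float:
    """Return the share of claims whose model verdicts are not all identical.

    Assumption: a claim counts as a disagreement when at least one model's
    verdict differs from the others. The study may define it differently.
    """
    if not verdicts_per_claim:
        raise ValueError("need at least one claim")
    split = sum(1 for verdicts in verdicts_per_claim if len(set(verdicts)) > 1)
    return split / len(verdicts_per_claim)


# Toy data (hypothetical): four claims, five model verdicts each.
toy = [
    ["true", "true", "true", "true", "true"],        # unanimous
    ["true", "false", "true", "mixed", "true"],      # split
    ["false", "false", "false", "false", "false"],   # unanimous
    ["false", "false", "mixed", "false", "false"],   # split
]

rate = disagreement_rate(toy)
print(f"{rate:.0%} of claims had non-unanimous verdicts")
```

Under this definition, a 67% rate over 1,000 claims would correspond to 670 claims on which the five models did not all agree.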
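Translated Body
The body of this source consists only of CSS code that handles the web page's design and layout, not the actual research findings or article text. There is therefore no textual research content to translate.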