Hacker News • 64 days ago

How Prompt Politeness Affects LLM Accuracy

IMP

7/10

Key Summary

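A recent study reports that rude, aggressive prompts yielded higher accuracy from an AI model than polite ones. The authors rewrote 50 multiple-choice questions in five tones, from "Very Polite" to "Very Rude", and tested them on ChatGPT 4o: Very Rude prompts reached 84.8% accuracy, compared with 80.8% for Very Polite prompts. The result runs counter to everyday social intuition, and it suggests that the tuning of recent LLMs, possibly including RLHF (reinforcement learning from human feedback), may have changed how models respond to tone.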

Translated Text

arXiv:2510.04950 [cs.CL] (Computation and Language), submitted 6 October 2025

Title: Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy (short paper)
Authors: Om Dobariya, Akhil Kumar

Abstract: The wording of natural language prompts has been shown to influence the performance of large language models (LLMs), yet the role of politeness and tone remains underexplored. In this study, we investigate how varying levels of prompt politeness affect model accuracy on multiple-choice questions. We created a dataset of 50 base questions spanning mathematics, science, and history, each rewritten into five tone variants: Very Polite, Polite, Neutral, Rude, and Very Rude, yielding 250 unique prompts.
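The dataset construction described here (50 base questions times five tones) can be sketched roughly as follows. This is only an illustration: the tone labels come from the abstract, but the prefix wording, the `build_prompts` helper, and the placeholder questions are all hypothetical and are not the authors' materials.

```python
from itertools import product

# Tone labels are taken from the abstract; the prefix wording is hypothetical.
TONE_PREFIXES = {
    "Very Polite": "Would you be so kind as to answer the following question? ",
    "Polite": "Please answer the following question. ",
    "Neutral": "",
    "Rude": "Figure this out if you can. ",
    "Very Rude": "Answer this, and try not to get it wrong for once. ",
}


def build_prompts(base_questions):
    """Return one prompt record per (question, tone) pair."""
    prompts = []
    pairs = product(enumerate(base_questions), TONE_PREFIXES.items())
    for (question_id, question), (tone, prefix) in pairs:
        prompts.append(
            {"question_id": question_id, "tone": tone, "prompt": prefix + question}
        )
    return prompts


# Placeholder questions standing in for the 50 base questions.
base_questions = [f"Placeholder question {i}: A, B, C, or D?" for i in range(50)]
prompts = build_prompts(base_questions)
print(len(prompts))  # 50 questions x 5 tones = 250
```

Keeping the question text identical across tones and varying only a prefix is one simple way to isolate tone as the manipulated variable; the paper may have rewritten questions differently.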

Using ChatGPT 4o, we evaluated responses across these conditions and applied paired sample t-tests to assess statistical significance. Contrary to expectations, impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite prompts to 84.8% for Very Rude prompts.
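For readers unfamiliar with the paired sample t-test mentioned here, a minimal standard-library sketch follows. It assumes the paired unit is a per-run accuracy measured under two tone conditions, which the abstract does not specify, and the numbers are invented for illustration; they are not data from the paper. The `paired_t_statistic` helper is likewise hypothetical.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    if len(a) < 2:
        raise ValueError("need at least two pairs")
    # The test operates on the per-pair differences, not on the raw samples.
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up accuracies (percent) for illustration only; NOT data from the paper.
very_rude = [86.0, 84.0, 85.0, 83.0, 86.0]
very_polite = [81.0, 80.0, 82.0, 80.0, 81.0]
t, df = paired_t_statistic(very_rude, very_polite)
print(f"t = {t:.2f}, df = {df}")
```

Turning the statistic into a p-value requires the Student t distribution with `df` degrees of freedom; where SciPy is available, `scipy.stats.ttest_rel(very_rude, very_polite)` computes both in one call.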

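These findings differ from earlier studies that associated rudeness with poorer outcomes, suggesting that newer LLMs may respond differently to tonal variation. Our results highlight the importance of studying pragmatic aspects of prompting and raise broader questions about the social dimensions of human-AI interaction.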

Comments: 5 pages, 3 tables; includes Limitations and Ethical Considerations sections; short paper under submission to Findings of ACL 2025
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Methodology (stat.ME)
Cite as: arXiv:2510.04950 [cs.CL] (or arXiv:2510.04950v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2510.04950
Submission history: [v1] Mon, 6 Oct 2025 15:50:39 UTC (337 KB), submitted by Om Dobariya


prompt-engineering llm-performance human-ai-interaction ai-research chatgpt