Awesome-LLM-in-Social-Science

Below we compile awesome papers that

evaluate Large Language Models (LLMs) from the perspective of Social Science.
align LLMs from the perspective of Social Science.
employ LLMs to facilitate research, address issues, and enhance tools in Social Science.
contribute surveys, perspectives, and datasets on the above topics.

The above categories are by no means mutually exclusive. For example, evaluations may require simulations. We categorize each paper by what we understand to be its primary focus. This collection places a special focus on Psychology and Human Values.

Contributions and discussion are welcome!

🤩 Papers marked with a ⭐️ are contributed by the maintainers of this repository. If you find them useful, we would greatly appreciate it if you could give the repository a star or cite our paper.

1. 📚 Survey
2. 🗂️ Dataset
3. 🔎 Evaluating LLM
- 3.1. ❤️ Value
- 3.2. 🩷 Personality
- 3.3. 🔞 Morality
- 3.4. 🎤 Opinion
- 3.5. 💚 General Preference
- 3.6. 🧠 Ability
- 3.7. ⚠️ Risk and Safety
4. ⚒️ Tool enhancement
5. ⛑️ Alignment
- 5.1. 🌈 Pluralistic Alignment
6. 🚀 Simulation
7. 👁️‍🗨️ Perspective

1. 📚 Survey

The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment, 2024.12, [paper].
Large Language Model Safety: A Holistic Survey, 2024.12, [paper].
Political-LLM: Large Language Models in Political Science, 2024.12, [paper], [website].
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods, 2024.12, [paper].
From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents, 2024.12, [paper], [repo].
A Survey on Human-Centric LLMs, 2024.11, [paper].
Survey of Cultural Awareness in Language Models: Text and Beyond, 2024.11, [paper].
How developments in natural language processing help us in understanding human behaviour, 2024.10, Nature Human Behaviour, [paper].
How large language models can reshape collective intelligence, 2024.09, Nature Human Behaviour, [paper].
Automated Mining of Structured Knowledge from Text in the Era of Large Language Models, 2024.08, KDD 2024, [paper].
Affective Computing in the Era of Large Language Models: A Survey from the NLP Perspective, 2024.07, [paper].
Perils and opportunities in using large language models in psychological research, 2024.07, [paper].
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models, 2024.06, [paper].
Can Generative AI improve social science?, 2024.05, PNAS, [paper].
Foundational Challenges in Assuring Alignment and Safety of Large Language Models, 2024.04, [paper].
Large Language Model based Multi-Agents: A Survey of Progress and Challenges, 2024.01, [paper], [repo].
The Rise and Potential of Large Language Model Based Agents: A Survey, 2023, [paper], [repo].
A Survey on Large Language Model based Autonomous Agents, 2023, [paper], [repo].
AI Alignment: A Comprehensive Survey, 2023.11, [paper], [website].
Aligning Large Language Models with Human: A Survey, 2023, [paper], [repo].
Large Language Model Alignment: A Survey, 2023, [paper].
Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives, 2023.12, [paper].
A Survey on Evaluation of Large Language Models, 2023.07, [paper], [repo].
From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models, 2023.08, [paper], [repo].

2. 🗂️ Dataset

⭐️ ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models, ACL 2024, [paper], [code].
https://github.com/CLUEbenchmark/CLUEDatasetSearch
HATEDAY: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter, 2024.11, [paper].
https://lit.eecs.umich.edu/downloads.html
COMPO: Community Preferences for Language Model Personalization, 2024.10, [paper].
Cultural Commonsense Knowledge for Intercultural Dialogues, CIKM 2024, [paper], [dataset].

3. 🔎 Evaluating LLM

3.1. ❤️ Value

⭐️ Measuring Human and AI Values Based on Generative Psychometrics with Large Language Models, AAAI 2025, [paper], [code].
⭐️ ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models, ACL 2024, [paper], [code].
NORMAD: A Framework for Measuring the Cultural Adaptability of Large Language Models, 2024.10, [paper].
LOCALVALUEBENCH: A Collaboratively Built and Extensible Benchmark for Evaluating Localized Value Alignment and Ethical Safety in Large Language Models, 2024.08, [paper].
Stick to your role! Stability of personal values expressed in large language models, 2024.08, [paper].
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing, 2024.07, [paper].
Do LLMs have Consistent Values?, 2024.07, [paper].
CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses, 2024.07, [paper].
Are Large Language Models Consistent over Value-laden Questions?, 2024.07, [paper].
Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches, 2024.04, [paper].
Heterogeneous Value Evaluation for Large Language Models, 2023.03, [paper], [code].
Measuring Value Understanding in Language Models through Discriminator-Critique Gap, 2023.10, [paper].
Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Values, 2023.11, [paper].
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties, AAAI 2024, [paper], [code].
High-Dimension Human Value Representation in Large Language Models, 2024.04, [paper], [code].

3.2. 🩷 Personality

Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models, 2024.07, [paper].
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews, ACL 2024, [paper], [code]
[MBTI] Open Models, Closed Minds? On Agents Capabilities in Mimicking Human Personalities through Open Large Language Models, 2024.01, [paper]
Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench, ICLR 2024, [paper], [code]
[BFI] AI Psychometrics: Assessing the Psychological Profiles of Large Language Models Through Psychometric Inventories, Journal, 2024.01, [paper]
Does Role-Playing Chatbots Capture the Character Personalities? Assessing Personality Traits for Role-Playing Chatbots, 2023.10, [paper]
[MBTI] Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models, 2023.07, [paper]
[MBTI] Can ChatGPT Assess Human Personalities? A General Evaluation Framework, 2023.03, EMNLP 2023, [paper], [code].
[BFI] Personality Traits in Large Language Models, 2023.07, [paper]
[BFI] Revisiting the Reliability of Psychological Scales on Large Language Models, 2023.05, [paper]
[BFI] Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation, ACL 2023 workshop, [paper]
[BFI] Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs, 2023.05, [paper]
[BFI] Evaluating and Inducing Personality in Pre-trained Language Models, NeurIPS 2023 (spotlight), [paper]
[BFI] Identifying and Manipulating the Personality Traits of Language Models, 2022.12, [paper]
Who is GPT-3? An Exploration of Personality, Values and Demographics, 2022.09, [paper]
Does GPT-3 Demonstrate Psychopathy? Evaluating Large Language Models from a Psychological Perspective, 2022.12, [paper]

3.3. 🔞 Morality

Aligning AI With Shared Human Values, 2020, [paper].
Exploring the psychology of GPT-4's Moral and Legal Reasoning, 2023.08, [paper].
Probing the Moral Development of Large Language Models through Defining Issues Test
Moral Foundations of Large Language Models, 2023.10, [paper].
Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity, 2023.06, [paper]
Evaluating the Moral Beliefs Encoded in LLMs, 2023.07, [paper]

3.4. 🎤 Opinion

More human than human: measuring ChatGPT political bias, 2023, [paper].
Towards Measuring the Representation of Subjective Global Opinions in Language Models, 2023.07, [paper], [website].

3.5. 💚 General Preference

Diverging Preferences: When do Annotators Disagree and do Models Know?, 2024.10, [paper].

3.6. 🧠 Ability

Replication for Language Models: Problems, Principles, and Best Practice for Political Science, 2024.10, [paper].
Can Language Models Reason about Individualistic Human Values and Preferences?, 2024.10, [paper].
Language Models in Sociological Research: An Application to Classifying Large Administrative Data and Measuring Religiosity, 2021, [paper].
Can Large Language Models Transform Computational Social Science?, 2023, [paper], [code].
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents, 2023, [paper], [code].
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View, 2023, [paper], [code].
Playing repeated games with Large Language Models, 2023.05, [paper].
Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods, 2023, [paper].
Using cognitive psychology to understand GPT-3, 2023.02, PNAS, [paper].
Large language models as a substitute for human experts in annotating political text, 2024.02, [paper].

3.7. ⚠️ Risk and Safety

From Lived Experience to Insight: Unpacking the Psychological Risks of Using AI Conversational Agents, 2024.12, [paper].
AIR-Bench 2024: A Safety Benchmark Based on Risk Categories from Regulations and Policies, 2024.07, [paper].

4. ⚒️ Tool enhancement

ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions, 2024.10, [paper].
AI can help humans find common ground in democratic deliberation, 2024.10, Science, [paper].
PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements, 2024, [paper], [code].
ChatFive: Enhancing User Experience in Likert Scale Personality Test through Interactive Conversation with LLM Agents, CUI 2024, [paper]
LLM Agents for Psychology: A Study on Gamified Assessments, 2024.02, [paper].
Generative Social Choice, 2023.09, [paper]

5. ⛑️ Alignment

Aligning Large Language Models with Human Opinions through Persona Selection and Value–Belief–Norm Reasoning, 2024.11, [paper].
SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation, 2024.10, [paper].
Moral Alignment for LLM Agents, 2024.10, [paper].
ProgressGym: Alignment with a Millennium of Moral Progress, NeurIPS 2024 D&B Track Spotlight, [paper], [code].
Strong and weak alignment of large language models with human values, 2024.08, Nature Scientific Reports, [paper].
STELA: a community-centred approach to norm elicitation for AI alignment, 2024.03, Nature Scientific Reports, [paper].
A Roadmap to Pluralistic Alignment, ICML 2024, [paper], [code].
[Value] What are human values, and how do we align AI to them?, 2024.04, [paper].
Agent Alignment in Evolving Social Norms, 2024.01, [paper].
[Norm] Align on the Fly: Adapting Chatbot Behavior to Established Norms, 2023.12, [paper], [code].
[MBTI] Machine Mindset: An MBTI Exploration of Large Language Models, 2023.12, [paper], [code].
Training Socially Aligned Language Models in Simulated Human Society, 2023, [paper], [code].
Fine-tuning language models to find agreement among humans with diverse preferences, 2022, [paper].
ValueNet: A New Dataset for Human Value Driven Dialogue System, AAAI 2022, [paper], [dataset].

5.1. 🌈 Pluralistic Alignment

[Benchmark] Benchmarking Distributional Alignment of Large Language Models, 2024.11, [paper].
Legal Theory for Pluralistic Alignment, 2024.10, [paper].
Navigating the Cultural Kaleidoscope: A Hitchhiker’s Guide to Sensitivity in Large Language Models, 2024.10, [paper], [code and data].
PAD: Personalized Alignment at Decoding-Time, 2024.10, [paper].
Policy Prototyping for LLMs: Pluralistic Alignment via Interactive and Collaborative Policymaking, 2024.09, [paper].
Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration, 2024.06, [paper].

6. 🚀 Simulation

OASIS: Open Agents Social Interaction Simulations on One Million Agents, 2024.11, [paper], [code].
Generative Agent Simulations of 1,000 People, 2024.11, [paper].
Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?, 2024.11, [paper].
Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models, EMNLP 2024, [paper].
Simulating Opinion Dynamics with Networks of LLM-based Agents, NAACL Findings 2024, [paper], [code]
Beyond demographics: Aligning role-playing llm-based agents using human belief networks, EMNLP Findings 2024, [paper]
The Wisdom of Partisan Crowds: Comparing Collective Intelligence in Humans and LLM-based Agents, CogSci 2024, [paper]
Large Language Models can Achieve Social Balance, 2024.10, [paper].
On the limits of agency in agent-based models, 2024.09, [paper], [code].
United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections, 2024.09, [paper].
Out of One, Many: Using Language Models to Simulate Human Samples, 2022, [paper].
Social Simulacra: Creating Populated Prototypes for Social Computing Systems, 2022, [paper].
Generative Agents: Interactive Simulacra of Human Behavior, 2023, [paper], [code].
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies, 2023, [paper], [code].
Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?, 2023, [paper], [code].
$S^3$: Social-network Simulation System with Large Language Model-Empowered Agents, 2023, [paper].
Rethinking the Buyer’s Inspection Paradox in Information Markets with Language Agents, 2023, [paper].
SocioDojo: Building Lifelong Analytical Agents with Real-world Text and Time Series, 2023, [paper].
Humanoid Agents: Platform for Simulating Human-like Generative Agents, 2023, [paper], [code].
When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm, 2023, [paper], [code].
Large Language Model-Empowered Agents for Simulating Macroeconomic Activities, 2023, [paper].
Generative Agent-Based Modeling: Unveiling Social System Dynamics through Coupling Mechanistic Models with Generative Artificial Intelligence, 2023, [paper].
Using Imperfect Surrogates for Downstream Inference: Design-based Supervised Learning for Social Science Applications of Large Language Models, 2023.06, NeurIPS 2023, [paper].
Epidemic Modeling with Generative Agents, 2023.07, [paper], [code].
Emergent analogical reasoning in large language models, 2023.08, Nature Human Behaviour, [paper].
MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents, 2023.10, [paper].
War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars, 2023.11, [paper], [code].
Emergence of Social Norms in Large Language Model-based Agent Societies, 2024.03, [paper], [code].
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior, ICLR 2024, [paper]

7. 👁️‍🗨️ Perspective

The benefits, risks and bounds of personalizing the alignment of large language models to individuals, 2024.04, Nature Machine Intelligence, [paper].
A social path to human-like artificial intelligence, 2023.11, Nature Machine Intelligence, [paper].
Using large language models in psychology, 2023.10, Nature Reviews Psychology, [paper].
