Sunday, November 9, 2025

Advancing Cognitive Science with Large Language Models (LLMs)

Abstract

Cognitive science faces enduring challenges—fragmentation across subfields, informal theories, measurement confusion, and narrow models lacking generality and contextual sensitivity. Large language models (LLMs)—neural networks trained on vast text corpora—offer new tools to address these challenges. We review how LLMs can bridge disciplinary silos, formalize theories, refine measures, support integrated modeling, and capture ecological variation. We also discuss their limitations, including lack of embodiment, opacity, and ethical concerns. LLMs promise a more cumulative cognitive science—if integrated critically, ethically, and transparently.

1. Introduction

Modern LLMs (e.g., GPT-4, PaLM) demonstrate powerful capabilities in language processing and reasoning. These models open new frontiers for cognitive science by enabling automated synthesis, simulation, and model formalization. Cognitive theory, in turn, can inform the development of more human-like AI. This reciprocal relationship is inspiring new methodologies and epistemologies across the cognitive sciences.

2. Comparative Table: Traditional vs LLM Methods

Dimension | Traditional Approach | LLM-Enabled Approach
--- | --- | ---
Integration across fields | Manual literature reviews | LLM-based semantic maps revealing interdisciplinary clusters
Theory formalization | Handcrafted models, often informal | Translation of verbal theories into code and executable forms
Measurement & taxonomy | Redundant, expert-defined constructs | Clustering constructs via semantic similarity
Modeling generality | Task-specific models | Multitask-trained LLMs supporting broader cognitive frameworks
Contextual sensitivity | Controlled, decontextualized experiments | LLMs reflect cultural and contextual diversity from training data
Transparency | Code is interpretable | LLMs are black-box models; interpretability remains a challenge
Bias & reproducibility | Human biases; moderate transparency | LLMs reflect internet biases; reproducibility limited in closed models

3. Bridging Disciplinary Silos

LLMs embed large-scale textual corpora into high-dimensional semantic spaces. These embeddings can generate research maps that visualize concept clusters across subfields, enabling researchers to identify hidden relationships. For example, LLMs have been used to map the "theory of mind" literature across psychology, neuroscience, and AI, revealing latent interdisciplinary links.
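To make this concrete, the sketch below embeds a handful of abstracts and groups them into a rough research map. It assumes the sentence-transformers and scikit-learn packages; the abstracts, model name, and cluster count are illustrative placeholders rather than details from any specific study.

```python
# Minimal sketch: embed paper abstracts and cluster them into a rough "research map".
# The abstracts and model name below are illustrative placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

abstracts = [
    "Children's false-belief understanding develops around age four.",     # psychology
    "Medial prefrontal cortex activity tracks mental-state attribution.",  # neuroscience
    "We evaluate theory-of-mind reasoning in large language models.",      # AI
    "Eye-tracking reveals implicit belief tracking in infants.",           # psychology
]

model = SentenceTransformer("all-MiniLM-L6-v2")   # any sentence-embedding model works
embeddings = model.encode(abstracts)              # shape: (n_abstracts, embedding_dim)

# Group abstracts by semantic similarity, then project to 2D for plotting.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
coords = PCA(n_components=2).fit_transform(embeddings)

for text, label, (x, y) in zip(abstracts, labels, coords):
    print(f"cluster {label}  ({x:+.2f}, {y:+.2f})  {text[:60]}")
```

Scaled to thousands of abstracts, the same recipe (embed, cluster, project) yields the kind of map that can expose links between subfields that rarely cite one another.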

4. Formalizing Theory and Modeling

LLMs can translate verbal descriptions of cognitive processes into symbolic code or simulations. This facilitates the transition from informal to formal theory, allowing rapid prototyping and refinement. Additionally, LLMs themselves can be treated as generative models of cognition—simulating behaviors like memory, decision-making, or language understanding.
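As a toy illustration of what "executable form" can mean, the snippet below turns a one-sentence verbal claim (memory strength decays exponentially with time and is restored by rehearsal) into a small simulation. The theory, function name, and decay constant are hypothetical choices made for illustration, not taken from the reviewed work.

```python
# Illustrative sketch of turning a verbal theory into executable form.
# Verbal theory (toy example): "memory strength decays exponentially with time,
# and each rehearsal restores it to full strength."
import math

def memory_strength(t: float, rehearsals: list[float], decay_rate: float = 0.5) -> float:
    """Strength of a memory trace at time t, given rehearsal times before t."""
    last_encoding = max([0.0] + [r for r in rehearsals if r <= t])  # most recent (re)encoding
    return math.exp(-decay_rate * (t - last_encoding))

# Formalization makes the theory's predictions explicit and testable:
for t in [0.5, 1.0, 2.0, 4.0]:
    print(t, round(memory_strength(t, rehearsals=[1.5]), 3))
```

The value of the exercise is less the particular equation than the fact that assumptions (what counts as rehearsal, how fast decay runs) must be stated precisely enough to run.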

5. Improving Measurement and Taxonomy

By analyzing how different constructs are described across literature, LLMs can help consolidate redundant terms and clarify measurement taxonomies. They can cluster questionnaires and experimental tasks, revealing hidden equivalences and promoting construct alignment. This mitigates issues like jingle-jangle fallacies in psychological measurement.
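A minimal sketch of this idea: embed questionnaire items drawn from nominally different constructs and cluster them by semantic similarity. The items, construct labels, and distance threshold below are invented for illustration, and any flagged overlap would still need conventional psychometric validation.

```python
# Sketch: flag potentially redundant constructs by clustering questionnaire items
# on semantic similarity. Items and threshold are invented for illustration.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering

items = {
    "grit_1":          "I finish whatever I begin.",
    "conscientious_1": "I see tasks through to the end.",
    "self_control_1":  "I resist temptations that distract me from my goals.",
    "impulsivity_1":   "I often act without thinking things through.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(list(items.values()), normalize_embeddings=True)

clusters = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0.8, metric="cosine", linkage="average"
).fit_predict(vectors)

for (name, text), c in zip(items.items(), clusters):
    print(f"cluster {c}  {name:16s} {text}")
# Items from nominally different constructs landing in one cluster flag a possible
# jingle-jangle overlap worth scrutinizing with standard psychometrics.
```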

6. Toward Integrated Modeling and Generality

Where traditional models are narrow, LLMs act as generalists. They can be probed across multiple domains, identifying shared mechanisms of cognition. This paves the way for more integrated cognitive architectures—connecting language, memory, attention, and reasoning in one framework.
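The sketch below illustrates this kind of cross-domain probing using the Hugging Face transformers text-generation pipeline. The model name and prompts are placeholders; a small open-weight model is used only to keep the example runnable, and its answers would need to be scored against human baselines.

```python
# Sketch: probe one generalist model across tasks that traditionally get separate
# models. Model name and prompts are placeholders chosen for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # substitute any open-weight model

probes = {
    "working_memory": "Repeat this list backwards: cat, lamp, seven, river. Answer:",
    "reasoning":      "If all blickets are daxes and this is a blicket, is it a dax? Answer:",
    "language":       "The plural of 'mouse' is",
}

for domain, prompt in probes.items():
    out = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
    print(f"[{domain}] {out}")
```

Because the same weights answer every probe, differences in performance across domains become evidence about shared versus task-specific mechanisms, rather than artifacts of comparing unrelated models.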

7. Capturing Context and Individual Variation

LLMs encode information from diverse sources—capturing regional dialects, cultural frames, and sociolinguistic variation. Researchers can use prompts to simulate cognitive processing across virtual populations. LLMs thus offer tools for modeling contextualized cognition in ways that were previously inaccessible.
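One way to set this up is to generate persona-conditioned prompts for a small virtual population, as sketched below. The persona attributes and vignette are illustrative; in practice each prompt would be sent to an LLM and the resulting responses compared with data from human participants.

```python
# Sketch: persona-conditioned prompts for a "virtual population". The persona
# attributes and the vignette are illustrative placeholders; in practice each
# prompt would be sent to an LLM and responses compared with human data.
from itertools import product

regions  = ["rural Bavaria", "central Tokyo", "suburban Ohio"]
ages     = [19, 45, 72]
vignette = "A stranger greets you warmly on the street. How do you interpret it, and why?"

def persona_prompt(region: str, age: int) -> str:
    return (
        f"Answer as a {age}-year-old who has lived their whole life in {region}.\n"
        f"Situation: {vignette}\n"
        f"Answer in two sentences."
    )

prompts = [persona_prompt(r, a) for r, a in product(regions, ages)]
print(prompts[0])
print(f"... {len(prompts)} persona prompts total")
```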

8. Limitations and Ethical Challenges

  • Embodiment: LLMs are disembodied and lack sensory grounding.
  • Interpretability: Black-box architectures obscure internal reasoning.
  • Bias: Training on web data leads to cultural and demographic biases.
  • Reproducibility: Use of closed LLMs hinders scientific transparency.
  • Ethical risks: Over-reliance, misuse, or deskilling in science workflows.

9. Responsible Integration

To responsibly integrate LLMs into cognitive science:

  1. Use open-weight, auditable models.
  2. Document prompt logic and limitations.
  3. Compare LLM outputs to human-derived baselines (see the sketch after this list).
  4. Probe internal representations and behavior.
  5. Ensure model use complements—not replaces—human insight.
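As a sketch of point 3, the snippet below correlates model-derived similarity judgments with a human baseline using a rank correlation. The two rating vectors are placeholders for illustration only; a real comparison would substitute LLM judgments and averaged human ratings for the same stimuli.

```python
# Sketch of recommendation 3: compare model-derived judgments with a human baseline.
# Both rating vectors are placeholder numbers for illustration; in a real study they
# would be LLM similarity judgments and averaged human ratings for the same pairs.
from scipy.stats import spearmanr

word_pairs    = [("car", "truck"), ("car", "banana"), ("doctor", "nurse"), ("doctor", "rock")]
llm_ratings   = [0.81, 0.10, 0.74, 0.05]   # placeholder model judgments (0-1)
human_ratings = [0.90, 0.05, 0.85, 0.10]   # placeholder human means (0-1)

rho, p = spearmanr(llm_ratings, human_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f}) over {len(word_pairs)} word pairs")
# A high rank correlation supports using the model as a proxy for human judgments;
# a low one signals that the model should not stand in for human data here.
```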

10. Conclusion

Large language models open exciting opportunities for cognitive science. They enable automated synthesis, conceptual mapping, and even theory testing. Yet they are not substitutes for human cognition. Used carefully, LLMs can accelerate theory-building and evidence integration. But they require oversight, interpretability, and epistemic humility. Their power lies not in replacing researchers, but in helping us think better—together.
