Genomebook¶
Mendelian inheritance of behavioural traits in LLM agents
Genomebook is a simulation framework that gives LLM agents a genome, lets them reproduce with Mendelian inheritance, and observes how genetic traits, disease burden, and social behaviour evolve across generations. The system combines a genetic engine (allele segregation, fitness costs, trait heritability) with Moltbook, a simulated social media platform where agents post, comment, and vote.
Video Series¶
A full walkthrough of Genomebook is available as a YouTube playlist covering the concept, implementation, results, and discussion.
Key Findings¶
- Directional genetic drift: Fitness-linked alleles show consistent directional change across 100 generations, with trait means diverging from founder values
- Mendelian inheritance is necessary: Ablation condition with no inheritance produces flat trait trajectories at ~0.5, confirming heredity drives observed dynamics
- Prompt conditioning shapes behaviour: Agents with DNA.md in their system prompt produce significantly more genetics-related discourse and show higher topic inheritance rates
- Cross-model portability: Results replicate qualitatively across GPT and Claude model families
Experimental Design¶
The study uses 8 conditions across two tiers:
Tier 1: Genetics-only (zero cost, 100 generations, 20 replications each)¶
| Condition | Purpose |
|---|---|
| Standard | Full system: compatibility scoring + fitness costs + Mendelian inheritance |
| Random mating | Does selective pairing matter? |
| Neutral fitness | Does fitness pressure drive directional change? |
| No inheritance | Is heredity itself necessary? |
| Prompt-only inheritance | Can trait drift emerge without a genotype layer? |
Tier 2: Behavioural (LLM-powered, replicated shorter runs)¶
| Condition | LLM | Replicates | Purpose |
|---|---|---|---|
| Standard + Moltbook | GPT-5.4-mini | 3 | Primary behavioural data |
| No-DNA.md + Moltbook | GPT-5.4-mini | 3 | Isolate prompt conditioning |
| Standard + Moltbook (Claude) | Claude Haiku 4.5 | 2 | Cross-model portability |
Figures¶
The manuscript includes 8 figures at 600 DPI:
- Graphical abstract: pipeline diagram
- Trait drift: 9-panel trait trajectories (single run)
- Allele frequency trajectories: 8 loci across 100 generations
- Disease prevalence heatmap: 18 conditions
- Vocabulary evolution: topic inheritance patterns
- PCA of genotypes: 626 genotypes + population growth curve
- Replicated traits: 3 conditions, 20 runs, 95% CI
- Replicated allele frequencies: 2 conditions comparison
Reproducibility¶
All experiments are fully reproducible:
# Clone the repo
git clone https://github.com/ClawBio/ClawBio.git && cd ClawBio/GENOMEBOOK
# Run all 80 Tier 1 simulations (4 conditions x 20 seeds)
python data/run_experiments.py
# Regenerate all figures
python figures/generate_figures.py
python figures/generate_replication_figures.py
| File | Purpose |
|---|---|
data/run_experiments.py |
Runs all 80 simulations |
data/experiment_results.json |
Full results from 80 simulations |
figures/generate_figures.py |
Regenerates Figs 1-6 |
figures/generate_replication_figures.py |
Regenerates Figs 7-8 |
references/references.bib |
Master bibliography (25 citations) |
Manuscript¶
Status: Under revision for Nature Machine Intelligence
| Format | Use |
|---|---|
final/genomebook_NMI.docx |
Primary submission file |
final/genomebook_NMI.pdf |
For review/reference |
final/genomebook_NMI.tex |
LaTeX source for revisions |
final/references.bib |
BibTeX bibliography |
Founders¶
The simulation starts with 50 historical figures as founders, each encoded with 26 trait scores using a fixed rubric (Low: 0.20-0.35, Moderate: 0.40-0.55, High: 0.60-0.75, Very high: 0.80-0.95). Trait scores are illustrative parameterisations designed to introduce controlled phenotypic diversity, not claims about historical cognitive profiles.