ClawBio Hackathon
Agentic Genomics

Build a bioinformatics skill that any AI agent can run.
One afternoon. Real biology. Real code. Real impact.

Sir Michael Uren Hub, Imperial College London · 19 March 2026

444
GitHub Stars
3,100+
Clones
28
Skills
3 weeks
Since launch
Jay Moore · Manuel Corpas · Nathan Skene · Josh Beale
Overview of the Day

7 hours. One skill.

12:00
Doors open, get set up
12:30
Welcome — Nathan & Manuel
12:45
Overview of the day — you are here
13:15
Databricks — Toz Ozturk, genomics on Databricks
13:30
Pizza 🍕
14:00
Guided tutorial with helpers
14:30
Build — solo or teams, helpers circulate
15:30
Breakout tables — reproducibility, sensitive biodata, open theme
16:30
Build — final stretch, polish SKILL.md
17:30
Show and Tell — ~3 min per team
18:30
Menti Poll & wrap-up
📖
docs.clawbio.ai/hackathon
ClawBio tutorial
🧪
Skills Cookbook
Jay Moore's agentic tutorial
💬
WhatsApp + Discord
Ask anything, anytime
The Problem

Biology is trapped in PDFs.
AI hallucinates the rest.

📄
26% reproducibility
Only about 1 in 4 computational biology papers can be reproduced without emailing the authors. The rest fail on broken links, wrong Python versions, and hardcoded paths.
Garijo et al., PLOS ONE 2013; Collberg & Proebsting 2016
🤖
LLMs hallucinate biology
ChatGPT calls CYP2D6 *4 "reduced function". It is no function. For a patient on codeine, that is the difference between pain relief and nothing.
🔒
Data cannot leave
Genomic data stays on-device. Cloud-only platforms are a non-starter. Every analysis must run locally and produce a verifiable audit trail.
Why This Matters

These are not hypothetical risks.

7%
CYP2D6 Poor Metabolisers
Codeine gives them zero pain relief, yet it keeps being prescribed to them. That is ~1 in 14 people. ClawBio catches this in under one second from a consumer genetic test.
0.5%
DPYD Variant Carriers
A standard fluorouracil chemotherapy dose can be lethal. Half a percent of the population. These variants are known, testable, and actionable today.
Published papers should ship as
executable skills, not PDFs.
What You Are Building

Anatomy of a ClawBio skill

SKILL.md
The contract. YAML frontmatter (name, inputs, outputs) plus three sections: Domain Decisions, Safety Rules, Agent Boundary. This is the part that makes it a skill and not just a script.
Demo Data
Synthetic test data so anyone can run the skill without their own files. Never real patient data. Should exercise at least one edge case.
Python Script
Takes --input, --output, and --demo flags. Reads input, runs analysis, writes a report. Paths are derived from __file__, never hardcoded.
Pull Request
Fork the repo, add your skill directory, open a PR. Terminal or GitHub web interface, both work. That is your submission.
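A minimal sketch of the Python Script piece, assuming a skill that reads a text file and writes a Markdown report. The demo file location (demo/demo_input.txt), the default report name, and the placeholder analysis are hypothetical and only illustrate the three flags and the __file__-relative paths; they are not taken from an existing ClawBio skill.

```python
import argparse
from pathlib import Path

# Resolve bundled files relative to this script, not the caller's working directory.
SKILL_DIR = Path(__file__).resolve().parent
DEMO_INPUT = SKILL_DIR / "demo" / "demo_input.txt"  # hypothetical demo data location


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Skeleton of a skill script (sketch).")
    parser.add_argument("--input", type=Path, help="Path to the input file.")
    parser.add_argument("--output", type=Path, default=Path("report.md"),
                        help="Where to write the report.")
    parser.add_argument("--demo", action="store_true",
                        help="Run on the bundled synthetic demo data.")
    return parser


def resolve_input(args: argparse.Namespace) -> Path:
    """Pick the demo file when --demo is set, otherwise require --input."""
    if args.demo:
        return DEMO_INPUT
    if args.input is None:
        raise SystemExit("Provide --input or use --demo.")
    return args.input


def run_analysis(lines: list[str]) -> str:
    """Placeholder analysis: count non-blank, non-comment records."""
    records = [line for line in lines if line.strip() and not line.startswith("#")]
    return f"# Report\n\nRecords analysed: {len(records)}\n"


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    input_path = resolve_input(args)
    report = run_analysis(input_path.read_text(encoding="utf-8").splitlines())
    args.output.write_text(report, encoding="utf-8")
    print(f"Report written to {args.output}")
    return 0
```

When saving this as a script, end the file with the usual `if __name__ == "__main__": raise SystemExit(main())` guard so that `--demo` works from the command line.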
The Solution

ClawBio: the execution layer for biological AI.

# One command. 12 genes. 51 drugs. CPIC guidelines.
$ python3 skills/pharmgx-reporter/pharmgx_reporter.py --input my_23andme.txt

✓ Report generated in 0.8 seconds
✓ 10 drugs AVOID · 20 drugs CAUTION · 21 drugs OK
✓ SHA-256 verified · Fully reproducible
🏠
Local-first
Data never leaves your machine
🔎
Inspectable
SKILL.md encodes every decision
🔁
Reproducible
commands.sh + checksums.sha256
🤖
Agent-native
Any AI agent can discover + run
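One way the Reproducible card could work in practice, sketched with the standard library only. It assumes checksums.sha256 uses the same two-space "digest  path" layout that `sha256sum` emits; the actual manifest format used by ClawBio is not shown in these slides, and the function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 so large outputs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(output_dir: Path, manifest_name: str = "checksums.sha256") -> Path:
    """Record a digest for every file under output_dir (the manifest itself excluded)."""
    manifest = output_dir / manifest_name
    files = sorted(p for p in output_dir.rglob("*") if p.is_file() and p != manifest)
    lines = [f"{sha256_of(p)}  {p.relative_to(output_dir).as_posix()}" for p in files]
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_checksums(output_dir: Path, manifest_name: str = "checksums.sha256") -> list[str]:
    """Return the relative paths that are missing or no longer match the manifest."""
    mismatches = []
    manifest = output_dir / manifest_name
    for line in manifest.read_text(encoding="utf-8").splitlines():
        expected, _, rel_path = line.partition("  ")
        target = output_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(rel_path)
    return mismatches
```

An empty list from `verify_checksums` means every recorded output is byte-for-byte identical to the run that produced the manifest.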
Traction · 3 weeks since launch

11,000 views. 444 stars. Zero marketing.

444
GitHub Stars
all organic
3,100+
Clones
645 unique cloners
28
Skills
5 domains covered
70+
Here Today
Imperial, KCL, Crick, UCL
First community PR merged within 48 hours of launch. Presented at the London Bioinformatics Meetup, DoraHacks Imperial, and the Galaxy ML SIG. Contributors across 3 countries. Traffic from LinkedIn, Google, GitHub, Slack, and Teams — people are finding it and sharing it at work.
Today's Challenge

Three themes. Your skill.

1
Gaps in Genomics Tools
What analyses are still manual, brittle, or unreproducible? Build a skill that fills one of those gaps. Candidate areas: proteomics, clinical diagnostics, variant annotation, pathway analysis.
2
Trustworthy Agentic Approaches
How do we make AI agents safe for biology? Encode domain decisions and safety rules into SKILL.md so agents execute with proven logic, not hallucinations.
3
Accessible Complexity
Genomics data is vast and complex. Build skills that present results so clearly that a clinician, a patient, or a policy-maker can act on them.
You do not need a genomics background for every track.
AI engineers + domain experts = the most powerful combination.
Choose Your Track

Five tracks. All levels welcome.

A
AI Engineers
No genomics needed. Wire public APIs (PubMed, gnomAD, ClinicalTrials.gov) into skills.
B
Genomics Researchers
Wrap your existing workflows (VCF annotation, QC, DE) into reproducible skills.
C
Proteomics
Zero proteomics skills exist in ClawBio. Build the first one. High impact.
D
Clinical
ACMG classification, drug interactions, tumour mutational burden, rare disease matching.
E
Epidemiology
Outbreak clustering, vaccine equity, AMR profiling, GBD visualisation.
Full skill ideas and step-by-step guide at docs.clawbio.ai/hackathon
Minimum Viable Submission

What counts as done?

A new skills/your-skill/ directory
SKILL.md with Domain Decisions, Safety Rules, Agent Boundary
Synthetic demo data (never real patient data)
A runnable --demo command that produces output
A pull request to ClawBio/ClawBio
A focused, well-documented skill with clear domain decisions is better than an ambitious but incomplete one.
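A quick self-check before opening the pull request, as a rough sketch. It assumes the three required sections appear as Markdown headings (lines starting with #) and that the frontmatter opens with a --- line; neither detail is specified in these slides, and `check_skill_dir` is a hypothetical helper, not part of ClawBio.

```python
from pathlib import Path

REQUIRED_SECTIONS = ("Domain Decisions", "Safety Rules", "Agent Boundary")


def check_skill_dir(skill_dir: Path) -> list[str]:
    """Return a list of problems found in a skill directory (empty means it looks complete)."""
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        return ["SKILL.md is missing"]

    problems = []
    text = skill_md.read_text(encoding="utf-8")

    # Assumption: YAML frontmatter is delimited by a leading '---' line.
    if not text.lstrip().startswith("---"):
        problems.append("SKILL.md does not start with YAML frontmatter")

    # Assumption: each required section is a Markdown heading.
    headings = [line.lstrip("#").strip().lower()
                for line in text.splitlines() if line.startswith("#")]
    for section in REQUIRED_SECTIONS:
        if not any(section.lower() in heading for heading in headings):
            problems.append(f"missing section: {section}")

    if not any(skill_dir.glob("*.py")):
        problems.append("no Python script found")
    return problems
```

Calling `check_skill_dir(Path("skills/your-skill"))` and printing the result gives a short to-do list. It does not replace actually running the --demo command.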
Let's Go
The best skills come from
scratching your own itch.
$ git clone https://github.com/ClawBio/ClawBio.git
$ cd ClawBio && pip3 install -r requirements.txt
$ python3 skills/pharmgx-reporter/pharmgx_reporter.py --demo
✓ You're in. Now build something.
ClawBio is a research and educational tool. It is not a medical device and does not provide clinical diagnoses.