ClawBio Hackathon
Build agents for genomics, pharmacogenomics, and digital health
23 April 2026 · University of Westminster · 115 New Cavendish St, London
Biology has become a data-saturated science. The tools to analyse it have not kept up.
A single human genome produces three billion base pairs. Clinical interpretation requires alignment, variant calling, functional annotation, literature cross-referencing, and synthesis into actionable reports. Each step has its own software, dependencies, version history, and configuration.
Workflow managers (Nextflow, Snakemake, Galaxy) emerged because manual orchestration is unsustainable. But the barrier to entry remains high. A biologist who wants to analyse their own sequencing data must learn to code, hire someone who can, or rely on graphical interfaces that may not support the analysis they need.
Modern LLMs can write, debug, and execute code. They can plan multi-step operations, adapt based on intermediate results, and coordinate dozens of tools.
But a general-purpose LLM generating bioinformatics code from scratch is unreliable. Output varies between sessions. It lacks the specificity that domain experts build into workflows over years. It hallucinates gene-drug associations and invents variant classifications.
The problem is not the model. The problem is the harness: what constrains the model, what tools it can call, what guardrails prevent silent errors. Agentic engineering is the discipline of building that harness.
The first wave of LLMs in the life sciences was information retrieval: summarising papers, answering questions about pathways, extracting structured data from text. Useful but incremental.
The second wave is qualitatively different. When connected to file systems, databases, and command-line tools, LLMs become autonomous agents that plan multi-step operations, execute them, and adapt based on intermediate results.
The shift: from AI that tells you things to AI that does things. The researcher's role moves from constructing the analysis to evaluating it. From production to judgement.
Agentic genomics is the use of autonomous AI agents, powered by large language models and operating within domain-constrained skill libraries, to discover, plan, execute, and iteratively refine multi-step genomic analyses, where the agent exercises runtime decision-making over tool selection, parameterisation, error handling, and output evaluation.
Corpas, Fatumo, Guio. "Agentic Genomics: From Pipeline Automation to Autonomous Validation." Cell Genomics (submitted).
Four necessary conditions: autonomy (runtime decisions), domain constraint (skill libraries, not ad hoc code), iterative refinement (error diagnosis and self-repair), and natural language mediation (no programming required).
The old bottleneck: code production. The new bottleneck: validation and judgement.
Nextflow and Snakemake execute a fixed DAG specified by a human. The workflow manager does not decide at runtime which tools to use. Agentic genomics builds on workflow infrastructure but is not reducible to it.
Systems that search over hyperparameters operate within a fixed search space defined by a human engineer. They optimise within constraints; they do not formulate the analysis strategy.
Using a chatbot to generate a Python script for variant filtering is useful but is not agentic. The LLM produces code; the human executes it. Sometimes called "vibe coding," it is a precursor but lacks the defining properties.
Biomedical AI copilots that answer clinical questions operate as information retrieval systems. They do not execute multi-step computational analyses against real data.
| Metric | Human-directed | Agent-mediated |
|---|---|---|
| Setup time | 2-4 hours | 5-15 minutes |
| Monitoring required | Continuous | Minimal (agent handles errors) |
| Reproducibility | Variable | High (skill specification fixed) |
| Error recovery | Manual debugging | Automated diagnosis and repair |
| Prerequisite expertise | Bioinformatics training | Domain knowledge for validation |
| Primary failure mode | Config errors, version conflicts | Silent plausible-looking errors |
Based on exome and scRNA-seq comparisons across multiple independent systems (Corpas et al., Cell Genomics).
Several independent groups have converged on architectures that share the core properties of agentic genomics:
Agentic framework for single-cell RNA-seq. Dialogue-driven, document-to-analysis automation. Prioritises reproducibility over flexibility.
Autonomous agent for multi-omic analyses. Users supply minimal input; agent plans, generates, executes, and self-repairs code.
Multi-agent system with self-reflection protocols, shared knowledge database, and structured human-agent collaboration.
Community-driven skill library. Domain experts encode knowledge into agent-executable modules without needing to become software engineers.
This convergence from independent groups, using different architectures but arriving at similar design principles, is evidence that agentic genomics reflects a genuine structural shift.
An open-source toolkit of AI agent skills for genomic analysis.
A skill is a self-contained, versioned unit of bioinformatics functionality: a SKILL.md contract that encapsulates code, configuration, data references, I/O specifications, and test suites. Skills are designed to be discovered and executed by AI agents through semantic search and natural language invocation.
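A hypothetical SKILL.md might look like the sketch below. Field names and values are illustrative only; the actual template ships with the repository.

```markdown
# skill: variant-annotator
version: 0.1.0
description: Annotate a VCF with ClinVar significance and gnomAD allele frequency.
inputs:
  - name: variants
    format: VCF (bgzipped, indexed)
outputs:
  - name: annotated
    format: TSV with clinvar_significance and gnomad_af columns
demo: runs against bundled demo data, no network required
tests: tests/test_annotate.py
gotchas:
  - An empty input must raise an error, never report "no findings".
```

The contract is what makes a skill discoverable: the agent matches a natural-language request against the description and I/O specification, then runs the demo and tests to confirm the skill behaves as declared.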
VEP, ClinVar, gnomAD, ACMG classification, pharmacogenomics (CPIC). One command, structured report.
Query 9 databases in parallel, compute polygenic risk scores, fine-map loci with SuSiE. All from summary statistics.
QC, doublet removal, clustering, marker genes. Scanpy pipeline wrapped in a single skill with demo data.
HEIM equity scorer measures how well a dataset represents diverse populations. Flags ancestry bias in analyses.
PubMed summariser, UKB Navigator, Galaxy Bridge, protocols.io. Connect your analysis to the knowledge layer.
WES clinical reports (English and Spanish), profile reports. From raw variants to PDF, with disclaimers and provenance.
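To make the polygenic risk score skill concrete: a PRS is, at its core, a weighted sum of effect-allele dosages. A minimal sketch, with made-up variant IDs and effect sizes (real skills would read weights from GWAS summary statistics):

```python
# Minimal polygenic-risk-score sketch: PRS = sum over variants of
# beta_i (effect size from summary statistics) * dosage_i (0, 1, or 2
# copies of the effect allele). Variant IDs and betas are illustrative.

def polygenic_score(weights: dict[str, float], dosages: dict[str, float]) -> float:
    """Sum beta_i * dosage_i over variants present in the weights file."""
    return sum(beta * dosages.get(snp, 0.0) for snp, beta in weights.items())

weights = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}  # betas (illustrative)
dosages = {"rs0001": 2, "rs0002": 1, "rs0003": 0}            # genotype dosages

print(polygenic_score(weights, dosages))  # 0.12*2 - 0.05*1 + 0.30*0 = 0.19
```

Production skills add the parts that matter clinically: strand and allele harmonisation between the weights and the genotypes, handling of missing dosages, and ancestry-aware score normalisation.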
Agentic genomics lowers the barrier to generating analyses. It does not lower the barrier to evaluating them.
The primary failure mode across all agentic systems: results that look correct but are not. One skill returned "all normal" pharmacogenomics for an empty input file.
Agents can cite non-existent gene-disease associations, fabricate references, or generate variant annotations that conflate unrelated loci. Skill constraints reduce but do not eliminate this.
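One cheap guardrail against the empty-input failure above is to fail loudly rather than return a plausible clean report. A sketch, where the report shape and field names are assumptions for illustration:

```python
# Guardrail sketch: refuse to emit a "normal" report when nothing was parsed.
# The report structure and the "clinvar" field are hypothetical.

class EmptyInputError(ValueError):
    """Raised when an input file yields zero variants."""

def pharmacogenomics_report(variants: list[dict]) -> dict:
    if not variants:
        # An empty or unparseable VCF must never become "all normal".
        raise EmptyInputError("no variants parsed; refusing to report 'normal'")
    flagged = [v for v in variants if v.get("clinvar") == "pathogenic"]
    return {"n_variants": len(variants), "flagged": flagged}
```

Encoding checks like this in the skill itself, rather than trusting the agent to notice, is the point of the harness: the error becomes impossible to produce silently.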
Agents default to European-ancestry resources unless explicitly constrained. 86% of GWAS data is European. Agentic genomics can automate existing biases at unprecedented scale.
Domain expertise, familiarity with common failure modes, intuition about plausible results: these are built over years and cannot be shortcut by AI. A novice user will not catch the errors.
The democratisation is real, but partial. It expands the capacity to produce; it does not expand the capacity to judge.
Today. Right now. In this room.
Add a bioinformatics skill to ClawBio. Variant annotation, pathway analysis, clinical reporting, or your own idea. Must conform to SKILL.md template with demo data and tests.
Chain multiple skills to solve a multi-step genomics question. Produce something no single skill could produce alone.
Use the HEIM equity framework to audit representation gaps. Score a GWAS, compare PRS accuracy across populations, or build an equity dashboard.
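For the equity track, the core computation is simple enough to sketch: compare a cohort's ancestry proportions against a reference distribution and flag under-represented groups. The threshold, group labels, and reference shares below are rough assumptions for illustration, not the HEIM specification:

```python
# Illustrative representation audit: flag ancestry groups whose observed
# share of a cohort falls below a fraction of their expected share.
# Group labels, reference proportions, and the 0.5 threshold are assumptions.

def representation_gaps(cohort: dict[str, int], reference: dict[str, float],
                        threshold: float = 0.5) -> dict[str, float]:
    """Return {group: observed/expected ratio} for under-represented groups."""
    total = sum(cohort.values())
    gaps = {}
    for group, expected in reference.items():
        observed = cohort.get(group, 0) / total if total else 0.0
        ratio = observed / expected if expected else 0.0
        if ratio < threshold:
            gaps[group] = round(ratio, 2)
    return gaps

cohort = {"EUR": 860, "AFR": 40, "EAS": 60, "SAS": 40}            # participants
reference = {"EUR": 0.16, "AFR": 0.17, "EAS": 0.24, "SAS": 0.25}  # approx. world shares

print(representation_gaps(cohort, reference))
# {'AFR': 0.24, 'EAS': 0.25, 'SAS': 0.16} -- EUR is over-represented, not flagged
```

An equity dashboard is this computation run across many datasets, with the ratios tracked over time; PRS accuracy comparisons apply the same logic to model performance rather than sample counts.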
| Time | Activity |
|---|---|
| 12:00 | Arrival and setup |
| 12:30 | Introduction to ClawBio and challenge briefing |
| 13:00 | Team formation and hacking begins |
| 16:30 | Demos and judging |
| 17:30 | Prizes and networking |
| 19:00 | Close |
Team formation: We will pair domain experts (biologists, clinicians) with developers. No bioinformatics experience required. Teams of 2-4.
```shell
git clone https://github.com/ClawBio/ClawBio.git
cd ClawBio
pip install -r requirements.txt
python skills/bio-orchestrator/orchestrator.py --list-skills
```
| Criterion | Weight |
|---|---|
| Does it work? `--demo` runs without errors | 30% |
| Real-world impact. Solves a genuine problem | 25% |
| SKILL.md quality. Complete, follows template, gotchas documented | 20% |
| Tests. At least one test file, edge cases considered | 15% |
| Presentation. Clear 3-minute demo | 10% |
Submissions go in `hackathon/your-team-name/skills/your-skill-name/`. Prizes: Best new skill, best agent workflow, best equity hack. All winning skills will be merged into ClawBio with full attribution.
Let's build.