An AI research team that works while you sleep
ScienceClaw is an autonomous research platform for life sciences. It scans news, conducts deep research across your portfolio, finds cross-domain connections, analyses your datasets, detects anomalous clinical trial designs, and answers ad-hoc questions — all through email, with no software to install.
What it does
Daily Autonomous Research
Every morning, the platform scans hundreds of pharma, biotech, and regulatory sources. It tags stories to your topics, conducts two-pass research (direct answers, then cross-domain connections), and synthesises findings across your entire portfolio. Reports land in your inbox before lunch.
On-Demand Market Research
Need a competitive landscape memo? Email a question with the topics you care about. The platform draws on its accumulated knowledge base to produce a structured analysis — key takeaways, supporting evidence, risks, and next steps. If the evidence isn't there, it says so.
On-Demand Data Analysis
Email a CSV file with a question in plain English. The platform computes real statistics from your data first, then uses AI to interpret the results and answer your specific question. The AI cannot invent numbers; it works from computed facts.
Self-Expanding Coverage
The platform detects news stories that don't match any existing research topic and proposes new areas for investigation. You approve or reject proposals via email. Approved topics start receiving daily research automatically. Coverage gradually expands toward where the market is moving.
Trial Intelligence
The platform continuously monitors ClinicalTrials.gov for newly registered oncology trials whose design deviates from established norms: unusual endpoints, novel comparators, missing biomarkers, or new entrants. Anomalies are investigated and filtered before reporting. No anomalies means no email.
The opportunity
Life sciences teams spend significant time on three recurring problems: gathering and synthesising market intelligence across fragmented sources, extracting reliable conclusions from experimental datasets, and tracking the design landscape of clinical trials in their therapeutic area. All three are manual, repetitive, and error-prone.
For market intelligence, competitive landscapes shift daily. Clinical trial readouts, regulatory decisions, and partnership announcements can alter strategic priorities overnight. Most organisations respond with periodic, manual research cycles — quarterly reports, ad-hoc analyst requests, internal briefings that are stale before they're circulated. ScienceClaw replaces that with a continuous, autonomous research process. The knowledge base grows every day. Cross-domain connections that analysts often miss — because they sit in different teams or therapeutic areas — are surfaced automatically.
For data analysis, the standard approach is to upload spreadsheets to a chatbot or web tool and hope the AI gets the numbers right. It often doesn't. LLMs estimate statistics rather than computing them, quietly ignore small sample sizes, and produce confident-sounding conclusions from fabricated figures. ScienceClaw takes a different approach: real statistics are computed deterministically before the AI sees anything, and the model is bound by those computed facts. It cannot invent numbers or contradict the ground truth.
For trial intelligence, no systematic method exists for continuously detecting when a newly registered trial's design deviates from established norms. Commercial intelligence services address this through curated databases and periodic reports, but these are retrospective, expensive, and optimised for landscape coverage rather than early signal detection. ScienceClaw builds empirical baselines from registry data, compares each new trial against the relevant baseline, investigates anomalies using public biomedical APIs, and reports only findings that survive a three-part confidence filter.
All three workflows share the same interface — email — and the same design philosophy: evidence only, gaps flagged, human always in the loop.
Design principles
This is not a chatbot. It is a set of structured research, analysis, and surveillance workflows with anti-hallucination guardrails, computed statistics as ground truth, baseline-deviation anomaly detection, and a self-expanding topic registry. The AI is a component of the system, not the system itself.
Market Research Workflow
Send an email with your research question and the topics you care about. Get a structured memo back — grounded entirely in accumulated evidence, with gaps and unknowns flagged transparently.
How it works
You send an email
Tag the subject line with your topics (e.g. GLP-1 agonists, CRISPR therapeutics). Describe what you need — competitive landscape, pipeline comparison, regulatory outlook. Free text works, or use structured headings for more focused results.
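As a minimal sketch, assuming a hypothetical bracketed-tag convention for subject lines (the platform's actual tagging format is not specified here), topic extraction might look like:

```python
import re

def parse_subject(subject: str) -> tuple[list[str], str]:
    """Split a subject line into bracketed topic tags and a free-text question.

    Assumes a hypothetical convention of [Topic] tags at the start of the
    subject; everything else is treated as the research request.
    """
    topics = re.findall(r"\[([^\]]+)\]", subject)
    question = re.sub(r"\[[^\]]+\]", "", subject).strip()
    return topics, question

topics, question = parse_subject(
    "[GLP-1 agonists][CRISPR therapeutics] Competitive landscape for oral delivery"
)
# topics   -> ['GLP-1 agonists', 'CRISPR therapeutics']
# question -> 'Competitive landscape for oral delivery'
```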
The knowledge base is consulted
The platform matches your topics against its accumulated intelligence: daily market briefs, cross-domain synthesis notes, and news scan tags. Only evidence that actually exists in the knowledge base is used. No external searches are performed at this stage — the quality depends on what the daily research cycle has gathered.
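A sketch of what this consultation step could look like, assuming a hypothetical knowledge-base layout of one note per file with a leading topics header (the real storage format is not documented here):

```python
from pathlib import Path

def gather_evidence(topics: list[str], kb_dir: str = "knowledge_base"):
    """Collect stored research notes whose recorded topic tags overlap the query.

    Hypothetical layout: one markdown file per brief, with a first line like
    'topics: GLP-1 agonists; obesity devices'. Only material that already
    exists in the knowledge base is returned; nothing is fetched externally.
    """
    wanted = {t.strip().lower() for t in topics}
    evidence = []
    for path in Path(kb_dir).glob("**/*.md"):
        header, _, body = path.read_text().partition("\n")
        tags = {t.strip().lower() for t in header.removeprefix("topics:").split(";")}
        if wanted & tags:  # any overlap between query topics and note tags
            evidence.append((path.name, body))
    return evidence
```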
AI produces a structured analysis
A large language model reads the knowledge base excerpts alongside your question. It is explicitly instructed to use only the provided evidence, separate facts from interpretation, and flag anything it cannot answer. The model cannot invent data or fill gaps with plausible-sounding guesses.
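The evidence-only constraint can be pictured as prompt assembly along these lines; the instruction wording below is illustrative, not the platform's actual system prompt:

```python
def build_prompt(question: str, evidence: list[tuple[str, str]]) -> str:
    """Assemble an evidence-only prompt from knowledge-base excerpts.

    The instruction block is a sketch of the documented constraints:
    use only provided evidence, separate facts from interpretation,
    and flag gaps instead of guessing.
    """
    sources = "\n\n".join(f"[{name}]\n{text}" for name, text in evidence)
    return (
        "Answer using ONLY the evidence below. Separate facts from "
        "interpretation. If the evidence does not cover part of the "
        "question, say so explicitly instead of guessing.\n\n"
        f"EVIDENCE:\n{sources}\n\nQUESTION:\n{question}"
    )
```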
You receive a memo
A reply arrives in the same email thread, structured as: Key Takeaways, Supporting Evidence, Risks and Unknowns, and Recommended Next Steps. If the knowledge base lacked coverage for a topic, that's stated clearly rather than papered over.
What makes this different
Knowledge-base grounded
Every claim in the response traces back to specific research the system has already conducted and stored. This is not a generic internet search — it draws on a curated, growing body of domain-specific intelligence built by the daily autonomous cycle.
Honest about what it doesn't know
If the knowledge base has no coverage for a topic, the response says so. If the evidence is weak or contradictory, that's flagged in the Risks and Unknowns section. The system is designed to be useful precisely because it is honest about its limitations.
Cross-topic synthesis built in
The daily research cycle includes a dedicated synthesis stage that reads across all topics to find connections. When you ask about GLP-1 agonists, the platform can surface relevant signals from adjacent areas — obesity devices, metabolic biomarkers, regulatory precedents — that a siloed analyst might miss.
Example output
The following is a real response from ScienceClaw to a multi-topic market research query about pre-clinical workflow integration across lab automation, screening platforms, and AI drug discovery. Note the structured evidence tables, explicit confidence levels, and gaps flagged throughout.
Self-expanding research coverage
The platform doesn't only research topics you've defined. Each day, the retrospective stage scans for news stories that don't match any existing topic, recurring references to unrecognised entities, and gaps in the synthesis. When it finds persistent signals, it proposes new research areas via a daily email digest.
You reply with a simple approve or reject command. Approved topics immediately join the daily research cycle and become available for on-demand queries. This means the platform gradually expands its coverage toward where the market is moving, not just where you originally pointed it.
Guardrails: the AI can propose at most three new topics per day, each requiring at least two independent source signals. Your manually curated topic registry is never modified automatically. All proposals require explicit human approval before activation.
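A sketch of how these guardrails might be enforced in code; the candidate structure is hypothetical, but the limits (three proposals per day, at least two independent source signals) come from the description above:

```python
MAX_PROPOSALS_PER_DAY = 3
MIN_INDEPENDENT_SIGNALS = 2

def filter_proposals(candidates: list[dict]) -> list[dict]:
    """Apply the topic-proposal guardrails described above.

    `candidates` is a hypothetical list of dicts like
    {"topic": "oral GLP-1 delivery", "sources": {"source_a", "source_b"}}.
    Proposals are ranked by signal count. The curated registry is never
    touched here; activation still requires an explicit email approval.
    """
    eligible = [
        c for c in candidates
        if len(c["sources"]) >= MIN_INDEPENDENT_SIGNALS
    ]
    eligible.sort(key=lambda c: len(c["sources"]), reverse=True)
    return eligible[:MAX_PROPOSALS_PER_DAY]
```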
Data Analysis Workflow
Email a spreadsheet with your question in plain English. The platform computes real statistics first, then uses AI to interpret the results — never the other way around.
How it works
You email a dataset
Attach a CSV file and describe what you want to know. No special format is needed; column descriptions, context about the experiment, and specific questions all help. The more context you provide, the more targeted the analysis.
Statistics are computed first
Before the AI sees anything, the platform computes a numeric overview from your data: row counts, column-by-column statistics (means, minimums, maximums, ranges). This becomes the ground truth that the AI is bound by — it cannot contradict these numbers.
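This computation-first step can be sketched as follows; the exact field set is an assumption, but the ordering (deterministic statistics before any model call) is the documented design:

```python
import pandas as pd

def numeric_overview(csv_path: str) -> dict:
    """Deterministic statistical overview computed before any model call.

    Mirrors the description above: row counts plus per-column mean, min,
    max, and range for numeric columns. These values become the ground
    truth that the AI's interpretation is bound by.
    """
    df = pd.read_csv(csv_path)
    overview = {"rows": len(df), "columns": {}}
    for col in df.select_dtypes("number").columns:
        s = df[col].dropna()
        overview["columns"][col] = {
            "n": int(s.count()),
            "mean": float(s.mean()),
            "min": float(s.min()),
            "max": float(s.max()),
            "range": float(s.max() - s.min()),
        }
    return overview
```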
AI interprets and answers
The AI reads your question and the pre-computed statistics together. It provides domain context drawn from your question (not its own assumptions), highlights patterns, and flags where sample sizes are too small for confident conclusions. All interpretation is anchored to the computed facts.
You receive a structured report
A reply arrives in the same email thread with Key Takeaways, Supporting Evidence (always referencing the computed statistics), and Risks and Unknowns (including explicit small-sample caveats). If the AI cannot interpret the data, you still receive the raw statistical overview.
Why computation-first matters
Typical AI data analysis
- The AI reads raw data and estimates statistics — sometimes incorrectly
- Confident-sounding answers with no way to verify the underlying numbers
- Small sample sizes are quietly ignored; conclusions appear definitive
- Requires uploading data to a third-party web tool or chatbot
ScienceClaw's approach
- Real statistics are computed deterministically before the AI is involved
- The AI is given pre-computed facts and cannot invent or contradict them
- Small samples, missing columns, and data gaps are flagged explicitly
- Your data stays on your own infrastructure and never leaves the server
Example output
The following is a real response from ScienceClaw to a protein design analysis request. A CSV of 50 antibody variants (three design methods, multiple target antigens) was emailed with a request to compare methods. Note the computed statistics tables, modelling recommendations with explicit sample-size caveats, and the extensive Risks and Unknowns section.
What you can analyse
The data analysis workflow is domain-agnostic. It computes the same statistical overview regardless of what your spreadsheet contains. All domain understanding comes from the question you write in the email body — the AI uses your context to interpret the numbers, rather than imposing its own assumptions about what the data means.
Clinical trial data
Endpoint summaries, response rates across cohorts, adverse event frequency comparisons. Describe the trial design in your email for best results.
Compound screening results
Hit-list analysis, activity distributions, structure-activity patterns. Include assay descriptions and what "active" means in your context.
Portfolio and competitive data
Pipeline comparisons, deal valuations, patent filing patterns. The AI interprets relative to the context you provide about the competitive landscape.
Limitations to be aware of: the current statistical overview is deliberately simple (means, min, max per column). It does not perform group-by analysis, correlations, or distribution fitting. The AI may infer these from the numbers but cannot compute them. More sophisticated analysis capabilities are planned.
Trial Intelligence Workflow
Automated anomaly detection for oncology clinical trial design. The system builds empirical baselines of what is normal for each indication and modality, then flags new trials that deviate — investigating each anomaly before reporting.
Download the methods paper (PDF)
How it works
Baselines are built from registry data
For each indication–modality pair (e.g. first-line NSCLC checkpoint inhibitors), the system queries ClinicalTrials.gov for all active Phase 2 and Phase 3 interventional trials — approximately 12,950 across oncology. It captures the modal primary endpoint, typical comparator, standard biomarkers, expected sample size range, and prevailing design architecture. Baselines require at least 15 trials to be usable.
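A sketch of baseline construction using the public ClinicalTrials.gov v2 studies API; the parameter and field names follow that API's published schema, but the query details and baseline fields here are illustrative rather than the production pipeline:

```python
import statistics
from collections import Counter

import requests

def build_baseline(condition: str, intervention: str, min_trials: int = 15):
    """Sketch of empirical baseline construction from registry data.

    Queries the ClinicalTrials.gov v2 'studies' endpoint and derives the
    modal primary endpoint and the enrolment interquartile range, per the
    description above. Status filters and page sizing are assumptions.
    """
    resp = requests.get(
        "https://clinicaltrials.gov/api/v2/studies",
        params={
            "query.cond": condition,
            "query.intr": intervention,
            "filter.overallStatus": "RECRUITING,ACTIVE_NOT_RECRUITING",
            "pageSize": 1000,
        },
        timeout=30,
    )
    studies = resp.json()["studies"]
    if len(studies) < min_trials:
        return None  # too few trials: this pair cannot be monitored

    endpoints = Counter()
    enrolments = []
    for s in studies:
        proto = s["protocolSection"]
        for outcome in proto.get("outcomesModule", {}).get("primaryOutcomes", []):
            endpoints[outcome["measure"]] += 1
        count = proto.get("designModule", {}).get("enrollmentInfo", {}).get("count")
        if count:
            enrolments.append(count)

    q1, _, q3 = statistics.quantiles(enrolments, n=4)  # quartile cut points
    return {
        "modal_endpoint": endpoints.most_common(1)[0][0],
        "endpoint_distribution": endpoints,
        "enrolment_iqr": (q1, q3),
        "n_trials": len(studies),
    }
```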
New trials are compared against baselines
Each newly registered or updated trial is compared against its relevant baseline across six dimensions. A deviation on any dimension flags the trial as an anomaly candidate. Most trials match their baseline and are filed without further analysis — only deviations receive investigation.
Anomalies are investigated
Each flagged trial undergoes automated investigation: ChEMBL for compound mechanism, Open Targets for genetic evidence, bioRxiv for preclinical publications, openFDA for safety signals, and PubMed for regulatory guidance. Investigations are bounded by a decision budget of 10–20 API calls to prevent unbounded resource consumption.
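The decision budget can be pictured as a simple bounded loop; the source clients here are hypothetical wrappers around the APIs listed above:

```python
def investigate(anomaly: dict, sources: list[tuple], budget: int = 15) -> dict:
    """Bounded investigation loop for one flagged trial.

    Each source lookup spends one unit of the decision budget (10-20 calls,
    per the description above); when the budget is exhausted, the
    investigation stops with whatever has been found. `sources` is a
    hypothetical list of (name, lookup_fn) pairs wrapping ChEMBL,
    Open Targets, bioRxiv, openFDA, and PubMed clients.
    """
    findings = {}
    for name, lookup in sources:
        if budget <= 0:
            break
        result = lookup(anomaly)  # one API call against one public source
        budget -= 1
        if result:
            findings[name] = result
    return findings
```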
A three-part confidence filter decides what to report
Every anomaly must pass three checks: Is the deviation real (not a data artefact)? Is it novel (not previously reported)? Can it be triangulated against an independent source? Only findings that pass all three checks are emailed. On days with no reportable anomalies, no email is sent; silence is a valid result.
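As a sketch, the filter reduces to three ordered predicates; their implementations below are assumptions, while the three-part structure (real, novel, triangulated) is as described:

```python
def passes_confidence_filter(anomaly: dict, findings: dict,
                             reported_before: set) -> bool:
    """The three-part confidence filter described above. All checks must pass.

    The individual predicates here are illustrative stand-ins; only the
    structure of the filter comes from the description.
    """
    is_real = not anomaly.get("data_artifact")             # not a registry glitch
    is_novel = anomaly["trial_id"] not in reported_before  # not already reported
    is_triangulated = len(findings) >= 1                   # independent source agrees
    return is_real and is_novel and is_triangulated
```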
Six deviation dimensions
Endpoint deviation
Primary endpoint does not match the baseline's modal endpoint for its indication–modality pair.
Comparator deviation
Trial uses a comparator type (active, placebo, or none) that diverges from the baseline distribution.
Enrichment deviation
Eligibility criteria include a novel biomarker or omit one that the baseline shows as standard.
Sample size deviation
Enrolment target falls outside the baseline's interquartile range for its phase.
Design deviation
Trial uses a design architecture (basket, umbrella, adaptive platform) not represented in the baseline.
New entrant deviation
A sponsor or therapeutic modality not previously seen in this indication's baseline.
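Taken together, the six checks amount to a comparison function like the sketch below; field names and data shapes are hypothetical, while the per-dimension tests follow the descriptions above:

```python
def detect_deviations(trial: dict, baseline: dict) -> list[str]:
    """Compare one trial against its baseline across the six dimensions above.

    `trial` and `baseline` are hypothetical dicts (biomarkers, designs, and
    sponsors held as sets). Any non-empty result flags the trial as an
    anomaly candidate; an empty list means it matches the baseline and is
    filed without further analysis.
    """
    flags = []
    if trial["primary_endpoint"] != baseline["modal_endpoint"]:
        flags.append("endpoint")
    if trial["comparator_type"] not in baseline["comparator_types"]:
        flags.append("comparator")
    if trial["biomarkers"] ^ baseline["standard_biomarkers"]:  # novel or missing
        flags.append("enrichment")
    q1, q3 = baseline["enrolment_iqr"]
    if not q1 <= trial["enrolment"] <= q3:  # outside the interquartile range
        flags.append("sample_size")
    if trial["design"] not in baseline["designs"]:
        flags.append("design")
    if trial["sponsor"] not in baseline["sponsors"]:
        flags.append("new_entrant")
    return flags
```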
What makes this different
Anomaly detection, not landscape analysis
The system does not attempt to track all oncology trials. It builds baselines of what is normal, then scans for deviations. Most trials are filed without analysis. Only meaningful deviations receive investigation and reporting.
Investigation, not just flagging
Detecting a deviation is not enough. The system searches for regulatory guidance changes, competitor readouts, safety signals, and published validation studies to explain each anomaly. Findings are reported as "explained" or "unexplained"; the latter are often the most interesting.
All public, free data sources
ClinicalTrials.gov, ChEMBL, Open Targets, bioRxiv, openFDA, and PubMed — all accessed via free public APIs. No commercial data subscriptions required. Reproducible by any research group.
Illustrative baseline data
Different indication–modality pairs exhibit meaningfully different design conventions. NSCLC Phase 3 trials concentrate heavily on progression-free survival (36.8% of primary endpoint designations), creating a clear baseline against which deviations are detectable. Melanoma trials are spread more evenly across PFS, ORR, OS, and DFS, requiring a higher deviation threshold.
NSCLC Phase 3 (100 trials, 136 endpoints)
- PFS dominates at 36.8% — a new trial choosing DoR or a PRO as sole primary endpoint would constitute a deviation
- OS is second at 22.8%, consistent with regulatory precedent
- DFS at 9.6% reflects perioperative trial designs
- "Other" endpoints at 11.0% — manageable heterogeneity
Melanoma Phase 3 (50 trials, 72 endpoints)
- PFS, ORR, OS, and DFS each represent 12–23% — no single dominant endpoint
- "Other" category at 36.1% — substantial heterogeneity from broad clinical spectrum
- Adjuvant, metastatic, and response-focused trial designs coexist
- Higher deviation threshold required to avoid false positives
Illustrative anomaly scenarios
The following scenarios are constructed from real registry data to demonstrate the detection logic. They illustrate the types of findings the system is designed to surface.
Endpoint deviation in NSCLC
A new first-line NSCLC checkpoint inhibitor trial registers with a patient-reported outcome (PRO-CTCAE) as co-primary alongside PFS — no other trial in the baseline uses a PRO co-primary. Investigation finds a recent FDA draft guidance recommending PRO co-primaries. Classified as "explained" with regulatory precedent.
Novel modality on emerging target
The gastric cancer baseline includes ADC and CAR-T trials targeting CLDN18.2, but no bispecifics. A new Phase 2 bispecific T-cell engager registers. Investigation chains through ChEMBL, Open Targets, and bioRxiv — no preclinical publications found. Classified as "unexplained" with a modality diversification note.
Clustered terminations
Two TIGIT inhibitor trials from the same sponsor change to TERMINATED within six months. Investigation finds no safety signal in openFDA, but identifies negative pivotal data from a competitor's TIGIT programme. Classified as "explained" with a class-level efficacy concern.
Limitations to be aware of: baselines depend on ClinicalTrials.gov data quality, which is sponsor-submitted and intentionally vague on design rationale. Indication–modality pairs with fewer than 15 trials cannot be monitored. The system detects anomalous design choices — it does not assess whether those choices are good or bad. Clinical judgment is required to interpret every finding. No prospective validation has been performed; a 12-week pilot is planned.