§0 · Pipeline

How LivingMeta turns a literature corpus into structured understanding

A continuous pipeline. Multiple AI perspectives. Human reviewers with the final word. No black box: every step is visible, every claim is traceable, every dataset is downloadable.

§1 · Sources

Where the papers come from

OpenAlex as the primary index (350 million+ academic works). arXiv for preprints. Twenty-nine thesis repositories across Europe. Journal feeds the instance owner specifies.

New papers are picked up on a rolling basis. The instance owner controls when updates are deployed, typically after each collection cycle. Between deploys, the Lab (§6) remains the live layer.

§2 · Screening

Relevance at scale

Every paper with an abstract is scored by an AI classifier on domain relevance, methodology type, and theme. A topic filter built from the instance's own classifications then removes off-topic papers.

Classifier decisions are visible and overridable. You can re-classify with a new taxonomy without re-extracting. Papers scoring below the relevance threshold do not appear on the public side of the instance — they remain in the database, available for re-evaluation.
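The threshold-plus-override rule can be pictured with a small Python sketch. It is illustrative only: the `ScreenedPaper` record, the `override` field, and the 0.5 cut-off are assumptions made for this example, not the instance schema or the real threshold.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical cut-off; the actual threshold is not stated on this page.
RELEVANCE_THRESHOLD = 0.5


@dataclass
class ScreenedPaper:
    paper_id: str
    relevance: float                 # classifier score, assumed to lie in [0, 1]
    override: Optional[bool] = None  # human decision; None means "no override"


def is_public(paper: ScreenedPaper, threshold: float = RELEVANCE_THRESHOLD) -> bool:
    """A human override wins; otherwise the classifier score decides."""
    if paper.override is not None:
        return paper.override
    return paper.relevance >= threshold


def split_by_visibility(
    papers: List[ScreenedPaper], threshold: float = RELEVANCE_THRESHOLD
) -> Tuple[List[ScreenedPaper], List[ScreenedPaper]]:
    """Return (public, held_back). Nothing is deleted, so held-back papers
    stay available for re-evaluation."""
    public = [p for p in papers if is_public(p, threshold)]
    held_back = [p for p in papers if not is_public(p, threshold)]
    return public, held_back
```

Keeping the held-back list rather than discarding it mirrors the point above: a low score hides a paper from the public side without removing it from the database.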

§3 · Extraction

Six perspectives, one consensus

Top-relevance papers are read by six AI perspectives independently. Each one extracts the same fields — methodology, sample, effect sizes, study design, quality indicators, identified gaps, reusable resources — but with a different cognitive emphasis: a balanced reviewer, a methodological rigorist, a synthesizer, a skeptic, a resource scout, and a quantitative meta-analyst.

A consensus algorithm reconciles the six extractions field by field. Where perspectives disagree, the disagreement is shown. Agreement scores are visible on every paper card. Every factual claim must carry a verbatim quote from the source paper; claims without textual evidence are flagged.
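One way to picture field-by-field consensus is a simple majority vote per field. The sketch below assumes exactly that; the actual consensus algorithm is not specified here, and the function names and field names are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def field_consensus(values: List[Hashable]) -> Tuple[Hashable, float, List[Hashable]]:
    """Majority vote for one field.

    Returns the winning value, the share of perspectives that agreed with it,
    and the dissenting values so that disagreement can be shown, not hidden.
    """
    counts = Counter(values)
    winner, votes = counts.most_common(1)[0]
    agreement = votes / len(values)
    dissent = [value for value in counts if value != winner]
    return winner, agreement, dissent


def consensus(extractions: List[Dict[str, Hashable]]) -> Dict[str, tuple]:
    """Apply the vote to every field; each extraction is one perspective's output."""
    fields = extractions[0].keys()
    return {
        field: field_consensus([extraction[field] for extraction in extractions])
        for field in fields
    }
```

With six perspectives, five reporting one study design and one reporting another, this yields an agreement share of 5/6 for that field and lists the minority value as dissent.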

Lower-relevance papers receive abstract-only extraction (single perspective) — sufficient for gap mapping but not for meta-analysis-grade synthesis.

§4 · Resources

Datasets, instruments, code — surfaced once

As papers are read, mentions of datasets, validated questionnaires, code repositories, APIs, software tools, and measurement instruments are extracted, classified, and FAIR-scored.

They live in a curated registry that the Lab agent and human readers can search by type, access level, or keyword. Bidirectional links: from any resource you find the papers using it; from any paper you see the resources it draws on.
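The bidirectional links amount to keeping an inverted index next to the per-paper resource lists. A minimal sketch, assuming a hypothetical mapping from paper identifiers to resource identifiers (neither name comes from the instance schema):

```python
from collections import defaultdict
from typing import Dict, List, Set


def papers_by_resource(resources_by_paper: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert paper -> resources into resource -> papers."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for paper_id, resource_ids in resources_by_paper.items():
        for resource_id in resource_ids:
            index[resource_id].add(paper_id)
    return dict(index)
```

The forward mapping answers "which resources does this paper draw on"; the inverted one answers "which papers use this resource".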

URLs are health-checked; dead links are flagged.

§5 · Gaps

From “further research” to a priority agenda

Most papers end with a “further research” or “limitations” section. LivingMeta extracts those statements, clusters them across the corpus, classifies them by gap type (evidence, knowledge, practice, methodological, empirical, theoretical, population, integration), and scores them on frequency, impact, feasibility, recency, and coverage.

The result is a Priority Research Agenda that represents the field's own self-stated open questions — not an outsider's opinion. The agenda re-runs as the corpus grows and as researchers contribute gap analyses through the Lab.
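As a rough illustration of how five criterion scores could be combined into one ranking, the sketch below uses a weighted average. The equal default weights, the assumption that each score is normalised to [0, 1], and the record layout are all hypothetical; the actual scoring formula is not stated here.

```python
from typing import Dict, List, Optional

CRITERIA = ("frequency", "impact", "feasibility", "recency", "coverage")


def priority_score(
    scores: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted average of the five criterion scores (each assumed in [0, 1])."""
    weights = weights or {criterion: 1.0 for criterion in CRITERIA}
    total_weight = sum(weights[criterion] for criterion in CRITERIA)
    weighted_sum = sum(weights[criterion] * scores[criterion] for criterion in CRITERIA)
    return weighted_sum / total_weight


def rank_gaps(
    gaps: List[Dict], weights: Optional[Dict[str, float]] = None
) -> List[Dict]:
    """Order gap clusters from highest to lowest priority."""
    return sorted(
        gaps, key=lambda gap: priority_score(gap["scores"], weights), reverse=True
    )
```

Because the ranking is a pure function of the scores, re-running it as the corpus grows only requires refreshed scores, which matches the re-run behaviour described above.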

§6 · The Lab

Where you work with the agent

The Lab is a forum-like research space. You start a thread with a question, choose your role (layperson, junior, senior, supervisor), and converse with an AI agent that has access to the entire pipeline — papers, extractions, resources, gaps.

The agent cites every claim by paper identifier and quote, refuses to answer outside its grounded corpus, and surfaces resources proactively when you reach the design, execute, or write phases of a research project.

Coaching adapts to role. A layperson gets explanation-first responses; a junior researcher, such as a PhD student, gets Socratic scaffolding; a senior gets peer-level discourse; a supervisor gets meta-level analysis and teaching materials.

§7 · What we are not

Boundaries — for honest expectations

We are not a general open-government search engine. That work is already done well by opub.nl, 1848.nl, and openstate.eu, and we link to them where their data is genuinely complementary to a research question.

We are not a literature search engine in the Google Scholar / Semantic Scholar sense. We use OpenAlex as our collection layer and add what it does not provide: multi-perspective extraction, consensus quality scoring, gap taxonomy, resource curation, and Lab orchestration.

We are not a chatbot wrapper. The Lab agent's edge is its grounding in the corpus and its citation discipline, not raw LLM capability.

§A · Open data

Open data, reproducibility, no vendor lock-in

  • Everything an instance produces is published as static JSON files, downloadable without authentication for public-tier instances.
  • The pipeline scripts and instance schema are publicly documented.
  • If LivingMeta disappeared tomorrow, every instance would still function as a static website and every dataset could be forked.

Want one for your field?

We build the corpus, run the pipeline, and hand you the keys.