Catalyst Data

Catalyst Data is the shared SQL layer for Sustainable Catalyst. It connects entities, sources, indicators,
and measurements so that analysis remains traceable and reproducible. It is infrastructure, not a dashboard.

Principle: every number and claim should be traceable to the source it came from.

What it is

Catalyst Data is a normalized database schema that gives the platform a common language.
Instead of each module inventing its own tables and identifiers, Catalyst Data provides shared structure:
entities, time periods, sources, metrics/indicators, and the measurements that connect them.

  • Unifies the system of record across modules
  • Preserves provenance (sources, methods, time)
  • Supports reproducibility for analytics and exports
  • Reduces drift between narrative and measurement

Why it matters

  • Close the gap between story and signal

    Narrative work and metrics work often live in different worlds. Catalyst Data helps keep them connected by
    referencing the same entities, periods, and sources.

    Outcome: fewer contradictions, clearer accountability

  • Auditability without bureaucracy

    Provenance isn’t paperwork—it’s a design choice. When the schema expects sources and definitions,
    audit trails become natural.

    Outcome: defensible outputs under scrutiny

  • Reproducible analysis

    Analytics becomes repeatable when it’s grounded in consistent entities, measurements, and time periods.
    This makes scenario work and indicator computation easier to rerun.

    Outcome: fewer “one-off” spreadsheets

  • Durability over time

    People change, tools change, and projects evolve. A shared data layer helps the platform survive turnover
    and limits the ad hoc, per-module tables that lead to “schema chaos.”

    Outcome: continuity and coherence

Conceptual model

At a high level, Catalyst Data keeps a clear separation between what something is (entities),
where it came from (sources), what it means (indicators/definitions),
and what was observed (measurements).

  • Entities — organizations, topics, geographies, instruments, programs, etc.
  • Time — periods, intervals, reporting windows
  • Sources — links, datasets, documents, publications
  • Indicators / Metrics — definitions, units, method notes
  • Measurements — values tied to entity + period + source (+ method)

This structure supports traceability: each measurement links back to its indicator definition, source, entity, and period.
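
To make the model concrete, here is a minimal sketch of how the five concepts above could map to tables, using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table names, column names, the helper functions build_demo_db and trace, and the sample row are hypothetical and are not the actual Catalyst Data schema.

```python
import sqlite3

# Hypothetical tables, one per concept in the conceptual model.
# Names and columns are illustrative, not the real Catalyst Data schema.
SCHEMA = """
CREATE TABLE entity (
    entity_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    kind        TEXT NOT NULL      -- e.g. organization, geography, program
);
CREATE TABLE period (
    period_id   INTEGER PRIMARY KEY,
    starts_on   TEXT NOT NULL,     -- ISO 8601 date
    ends_on     TEXT NOT NULL
);
CREATE TABLE source (
    source_id   INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    url         TEXT
);
CREATE TABLE indicator (
    indicator_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    unit         TEXT NOT NULL,
    method_note  TEXT
);
-- A measurement is a value tied to entity + period + source + indicator.
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    entity_id      INTEGER NOT NULL REFERENCES entity(entity_id),
    period_id      INTEGER NOT NULL REFERENCES period(period_id),
    source_id      INTEGER NOT NULL REFERENCES source(source_id),
    indicator_id   INTEGER NOT NULL REFERENCES indicator(indicator_id),
    value          REAL NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up measurement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entity VALUES (1, 'Example City', 'geography')")
    conn.execute("INSERT INTO period VALUES (1, '2023-01-01', '2023-12-31')")
    conn.execute("INSERT INTO source VALUES (1, 'Example Annual Report', NULL)")
    conn.execute(
        "INSERT INTO indicator VALUES (1, 'Renewable share', 'percent', 'illustrative only')"
    )
    conn.execute("INSERT INTO measurement VALUES (1, 1, 1, 1, 1, 42.0)")
    conn.commit()
    return conn


def trace(conn, measurement_id):
    """Return a measurement with its indicator, source, entity, and period."""
    return conn.execute(
        """
        SELECT m.value, i.name, i.unit, s.title, e.name, p.starts_on, p.ends_on
        FROM measurement AS m
        JOIN indicator AS i ON i.indicator_id = m.indicator_id
        JOIN source    AS s ON s.source_id    = m.source_id
        JOIN entity    AS e ON e.entity_id    = m.entity_id
        JOIN period    AS p ON p.period_id    = m.period_id
        WHERE m.measurement_id = ?
        """,
        (measurement_id,),
    ).fetchone()
```

Calling trace(build_demo_db(), 1) returns the value together with the indicator, source, entity, and period it is tied to. Because every foreign key on the measurement table is NOT NULL and enforced, a value cannot be stored without its provenance, which is one possible way to make the traceability described above a property of the schema rather than a convention.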

How it connects to other modules

  • Analytics R

    Uses shared entities and measurements to compute indicators, run scenarios, and export reproducible outputs.

    Link: Catalyst Analytics R

  • Global Impact Catalyst

    Builds SDG-style indicator pipelines on consistent definitions, periods, and provenance.

    Link: Global Impact Catalyst

  • Narrative Risk

    Keeps claims and evidence aligned to shared sources and timelines—reducing drift between what’s said and what’s supported.

    Link: Narrative Risk

  • Infrastructure

    Catalyst Data is one of the core building blocks in the infrastructure layer: standards, exports, provenance, and durability.

    Link: Infrastructure

Boundaries

Catalyst Data is a schema and system of record. It does not claim to provide “real-time intelligence,” proprietary datasets,
or turnkey enterprise hosting. The value is structure: clarity, traceability, and reproducibility.

If you want implementation guidance or measurement design help, see Consulting.

