Catalyst Data

Catalyst Data is the shared SQL layer for Sustainable Catalyst—designed to connect entities, sources, indicators,
and measurements so analysis remains traceable and reproducible. It’s infrastructure, not a dashboard.

Principle: every number and claim should be traceable to what it came from.

What it is

Catalyst Data is a normalized database schema that gives the platform a common language.
Instead of each module inventing its own tables and identifiers, Catalyst Data provides shared structure:
entities, time periods, sources, metrics/indicators, and the measurements that connect them.

Unifies the system of record across modules
Preserves provenance (sources, methods, time)
Supports reproducibility for analytics and exports
Reduces drift between narrative and measurement

View on GitHub →

Why it matters

Close the gap between story and signal

Narrative work and metrics work often live in different worlds. Catalyst Data helps keep them connected by
referencing the same entities, periods, and sources.

Outcome: fewer contradictions, clearer accountability

Auditability without bureaucracy

Provenance isn’t paperwork—it’s a design choice. When the schema expects sources and definitions,
audit trails become natural.
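
As a minimal sketch of a schema that "expects sources," the Python example below uses the standard sqlite3 module with two hypothetical tables. The table names, column names, and the record helper are assumptions for illustration, not the actual Catalyst Data schema. A measurement without a known source is rejected by the database itself, so the audit trail does not depend on anyone remembering to fill it in.

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical source/measurement tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE source (
            source_id INTEGER PRIMARY KEY,
            title     TEXT NOT NULL
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            value          REAL NOT NULL,
            -- Every measurement must point at an existing source row.
            source_id      INTEGER NOT NULL REFERENCES source(source_id)
        );
        INSERT INTO source (source_id, title) VALUES (1, 'Example annual report');
        """
    )
    return conn


def record(conn, value, source_id):
    """Try to store a measurement; return False if provenance is missing or unknown."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO measurement (value, source_id) VALUES (?, ?)",
                (value, source_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(record(db, 1250.0, 1))     # known source
    print(record(db, 1250.0, None))  # no source given
    print(record(db, 1250.0, 99))    # source that does not exist
```

Only the first call stores a row. The other two fail on the NOT NULL and foreign key constraints respectively.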

Outcome: defensible outputs under scrutiny

Reproducible analysis

Analytics becomes repeatable when it’s grounded in consistent entities, measurements, and time periods.
This makes scenario work and indicator computation easier to rerun.
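
To illustrate what "easier to rerun" can look like, here is a small Python sketch using sqlite3. The single flattened measurement table, its text keys, and the yearly_change function are simplifying assumptions made for this example, not part of Catalyst Data. The computation reads only from stored measurements and orders them explicitly, so rerunning it on the same data gives the same result.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory table of measurements (hypothetical, flattened layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE measurement (
            entity    TEXT NOT NULL,
            period    TEXT NOT NULL,   -- reporting year, as text
            indicator TEXT NOT NULL,
            value     REAL NOT NULL
        );
        INSERT INTO measurement VALUES
            ('Example City', '2022', 'water_use_m3',   1000.0),
            ('Example City', '2023', 'water_use_m3',   1250.0),
            ('Example City', '2023', 'energy_use_kwh',  300.0);
        """
    )
    return conn


def yearly_change(conn, entity, indicator):
    """Return (period, change versus the previous period) for one entity and indicator."""
    rows = conn.execute(
        """
        SELECT period, value
        FROM measurement
        WHERE entity = ? AND indicator = ?
        ORDER BY period          -- explicit ordering keeps reruns deterministic
        """,
        (entity, indicator),
    ).fetchall()
    # Pair each row with the one before it and take the difference in value.
    return [
        (cur_period, cur_value - prev_value)
        for (_, prev_value), (cur_period, cur_value) in zip(rows, rows[1:])
    ]


if __name__ == "__main__":
    db = make_db()
    print(yearly_change(db, "Example City", "water_use_m3"))
```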

Outcome: fewer “one-off” spreadsheets

Durability over time

People change, tools change, and projects evolve. A shared data layer helps the platform survive turnover
and prevents “schema chaos.”

Outcome: continuity and coherence

Conceptual model

At a high level, Catalyst Data keeps a clear separation between what something is (entities),
where it came from (sources), what it means (indicators/definitions),
and what was observed (measurements).

Entities — organizations, topics, geographies, instruments, programs, etc.
Time — periods, intervals, reporting windows
Sources — links, datasets, documents, publications
Indicators / Metrics — definitions, units, method notes
Measurements — values tied to entity + period + source (+ method)

This structure supports traceability: from any measurement you can reach its indicator definition, its source, and the entity and period it describes.
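
The five concepts above could be sketched as tables like the following. This is an illustrative Python example using sqlite3. Every table name, column name, and sample row is a hypothetical stand-in, since the actual Catalyst Data schema is not shown on this page. The trace function walks from one measurement back to its indicator definition, source, entity, and period.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
SCHEMA = """
CREATE TABLE entity (
    entity_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    kind      TEXT NOT NULL            -- e.g. organization, geography, program
);
CREATE TABLE period (
    period_id INTEGER PRIMARY KEY,
    starts_on TEXT NOT NULL,           -- ISO 8601 date
    ends_on   TEXT NOT NULL
);
CREATE TABLE source (
    source_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    url       TEXT
);
CREATE TABLE indicator (
    indicator_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    unit         TEXT NOT NULL,
    method_note  TEXT
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    entity_id      INTEGER NOT NULL REFERENCES entity(entity_id),
    period_id      INTEGER NOT NULL REFERENCES period(period_id),
    source_id      INTEGER NOT NULL REFERENCES source(source_id),
    indicator_id   INTEGER NOT NULL REFERENCES indicator(indicator_id),
    value          REAL NOT NULL
);
"""


def build_demo_db():
    """Create the sketch schema in memory and load one made-up measurement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entity VALUES (1, 'Example City', 'geography')")
    conn.execute("INSERT INTO period VALUES (1, '2023-01-01', '2023-12-31')")
    conn.execute("INSERT INTO source VALUES (1, 'Example annual report', NULL)")
    conn.execute("INSERT INTO indicator VALUES (1, 'Water use', 'm3', 'Metered total')")
    # Columns: measurement_id, entity_id, period_id, source_id, indicator_id, value
    conn.execute("INSERT INTO measurement VALUES (1, 1, 1, 1, 1, 1250.0)")
    conn.commit()
    return conn


def trace(conn, measurement_id):
    """Return the full provenance of one measurement as a dict, or None if absent."""
    row = conn.execute(
        """
        SELECT m.value, i.name, i.unit, s.title, e.name, p.starts_on, p.ends_on
        FROM measurement AS m
        JOIN indicator AS i ON i.indicator_id = m.indicator_id
        JOIN source    AS s ON s.source_id    = m.source_id
        JOIN entity    AS e ON e.entity_id    = m.entity_id
        JOIN period    AS p ON p.period_id    = m.period_id
        WHERE m.measurement_id = ?
        """,
        (measurement_id,),
    ).fetchone()
    if row is None:
        return None
    keys = ("value", "indicator", "unit", "source", "entity", "starts_on", "ends_on")
    return dict(zip(keys, row))


if __name__ == "__main__":
    print(trace(build_demo_db(), 1))
```

Keeping definitions (indicator) apart from observations (measurement) means a unit or method note is stored once and shared by every value that uses it.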

How it connects to other modules

Analytics R

Uses shared entities and measurements to compute indicators, run scenarios, and export reproducible outputs.

Link: Catalyst Analytics R

Global Impact Catalyst

Builds SDG-style indicator pipelines on consistent definitions, periods, and provenance.

Link: Global Impact Catalyst

Narrative Risk

Keeps claims and evidence aligned to shared sources and timelines—reducing drift between what’s said and what’s supported.

Link: Narrative Risk

Infrastructure

Catalyst Data is one of the core building blocks in the infrastructure layer: standards, exports, provenance, and durability.

Link: Infrastructure

Boundaries

Catalyst Data is a schema and system of record. It does not claim to provide “real-time intelligence,” proprietary datasets,
or turnkey enterprise hosting. The value is structure: clarity, traceability, and reproducibility.

If you want implementation guidance or measurement design help, see Consulting.

