Counterfactual Reasoning in Algorithmic Systems: What Would Have Changed?

Last Updated June 21, 2026

Counterfactual reasoning in algorithmic systems asks what would have changed if inputs, rules, thresholds, assumptions, histories, design choices, or institutional conditions had been different. Many algorithms describe what happened, predict what may happen, or classify what is likely. Counterfactual reasoning asks a different question: under a specified alternative condition, would the result have changed?

This matters because computational systems increasingly influence decisions. A risk score may classify a person as high risk. A ranking system may make one item visible and another invisible. A policy model may recommend one intervention over another. A platform rule may trigger review, denial, escalation, or exclusion. Counterfactual reasoning helps analysts examine whether those outcomes depended on particular variables, thresholds, rules, data histories, or structural assumptions.

This article introduces counterfactual reasoning as a disciplined part of computational reasoning. It explains actual worlds, alternative worlds, contrastive explanation, causal counterfactuals, algorithmic recourse, threshold changes, feature changes, institutional conditions, fairness review, sensitivity analysis, validation limits, governance, and representation risk.

Series context: This article is part of the Algorithms & Computational Reasoning knowledge series, which examines algorithms as formal methods for problem solving, decision-making, representation, efficiency, search, optimization, data organization, computational limits, distributed systems, information retrieval, and responsible reasoning in technical and institutional systems.

Figure: Counterfactual reasoning shown as structured comparison between what happened, what might have happened, and how algorithmic systems change under different assumptions or interventions.

This article explains counterfactual reasoning, contrastive explanation, causal counterfactuals, alternative scenarios, algorithmic recourse, threshold sensitivity, feature dependence, decision rules, fairness review, intervention assumptions, validation, contestability, and governance. It emphasizes that counterfactual claims require more than a nearby data point or a convenient alternative. They require meaningful alternatives, plausible interventions, clear assumptions, and responsible interpretation.

Why Counterfactual Reasoning Matters

Counterfactual reasoning matters because many computational decisions are only understandable when compared with an alternative. A person may ask why a decision occurred. A regulator may ask whether a different threshold would have changed outcomes. An engineer may ask whether an error came from data, model structure, or deployment context. A community may ask whether an institutional rule produced harm that could have been avoided.

Ordinary explanation often describes the actual path. Counterfactual explanation compares the actual path with an alternative path. It asks what difference made the difference. That makes counterfactual reasoning central to algorithmic explanation, debugging, causal analysis, fairness review, policy design, and responsible automation.

Computational question	Actual-world version	Counterfactual version
Classification	The system assigned a class.	Would the class change if selected inputs changed?
Ranking	The system placed one item above another.	Would a different rule, weight, or signal change visibility?
Risk scoring	The system produced a risk score.	Which assumptions or features would lower or raise the score?
Policy modeling	A model estimated an outcome under current conditions.	What would happen under a different policy condition?
Automated decision	A rule triggered denial, review, escalation, or approval.	Would a different threshold or rule have changed the decision?
Algorithmic harm review	Harm occurred after system deployment.	Would the harm have occurred under a different design or governance structure?

Counterfactual reasoning helps computational systems become explainable as choices among alternatives, not just outputs from procedures.

Counterfactual Reasoning Defined

Counterfactual reasoning is the disciplined comparison between what happened and what would have happened under a specified alternative condition. In algorithmic systems, the alternative condition might involve a changed input, a different decision rule, a shifted threshold, a revised model parameter, a different dataset, a different institutional rule, or a different intervention.

A counterfactual is not any imagined alternative. It must be framed carefully. The alternative should be meaningful, feasible, ethically interpretable, and connected to the question being asked. A counterfactual explanation that tells someone to change an immutable characteristic, for example, is not useful recourse. A counterfactual scenario that changes many hidden conditions at once may not explain what caused the decision.

Counterfactual element	Meaning	Review question
Actual condition	The observed case, decision, data state, rule, or outcome.	What actually happened?
Alternative condition	The changed input, rule, threshold, assumption, or system state.	What is being changed?
Outcome comparison	The difference between actual and alternative outcomes.	Would the result change?
Plausibility constraint	Limits on which alternatives are meaningful or feasible.	Could this alternative reasonably occur?
Intervention assumption	The claim that a change could be made without incoherent side effects.	Is the change actionable or merely mathematical?
Interpretation boundary	The scope within which the counterfactual should be understood.	What does this counterfactual not prove?

Counterfactual reasoning is the logic of asking “what would have been different if this condition had been different?”

Actual Worlds and Alternative Worlds

A counterfactual begins with an actual world: the observed state of the system. It then defines an alternative world: a changed condition whose consequences are being examined. The alternative world may be close to the actual world, such as changing a score threshold by a small amount, or more structural, such as changing an eligibility rule, institutional process, or design architecture.

The challenge is that not all alternative worlds are equally meaningful. Some alternatives are mathematically possible but socially incoherent. Some require changing variables that cannot be intervened on directly. Some ignore relationships among variables. Some hide causal assumptions. A serious counterfactual analysis must say what changed, what stayed fixed, and why that comparison matters.

Alternative-world type	Example	Interpretation risk
Feature change	A value in the input vector changes.	The feature may not be directly changeable.
Threshold change	A decision cutoff moves from 0.70 to 0.75.	The threshold may shift errors across groups.
Rule change	An eligibility rule is rewritten.	The rule may interact with other institutional processes.
Data-history change	A training dataset excludes a biased historical signal.	The model may still learn proxies for the removed signal.
Policy change	A support intervention replaces a punishment-oriented response.	The model may not capture implementation conditions.
Structural change	The institution changes how cases are routed or measured.	The whole data-generating process may change.

Counterfactual reasoning is strongest when it distinguishes local alternatives from structural alternatives.
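The requirement to say what changed and what stayed fixed can be made explicit in code. The sketch below is a minimal illustration under stated assumptions: `compare_worlds`, the condition names, and the `scope` labels are hypothetical and not part of any standard library.

```python
def compare_worlds(actual: dict, alternative: dict) -> dict:
    """List which conditions changed and which were held fixed."""
    if actual.keys() != alternative.keys():
        raise ValueError("Both worlds must describe the same set of conditions.")
    changed = {
        name: (actual[name], alternative[name])
        for name in actual
        if actual[name] != alternative[name]
    }
    held_fixed = sorted(name for name in actual if name not in changed)
    # Changing several conditions at once makes it harder to say which one mattered.
    scope = "single_condition" if len(changed) <= 1 else "multi_condition"
    return {"changed": changed, "held_fixed": held_fixed, "scope": scope}


# Hypothetical case: only the decision threshold moves, from 0.70 to 0.75.
actual_world = {"score": 0.72, "threshold": 0.70, "review_layer": 0.0}
alternative_world = {"score": 0.72, "threshold": 0.75, "review_layer": 0.0}

report = compare_worlds(actual_world, alternative_world)
print(report)
```

Recording the held-fixed conditions alongside the changed ones keeps the comparison reviewable: a reader can see that the score and the review layer were assumed constant, and can challenge that assumption.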

Inputs, Rules, Thresholds, and Institutional Conditions

In algorithmic systems, counterfactuals often focus on inputs: what if this feature had a different value? But systems are not only input-output machines. They also include rules, thresholds, model weights, training data, preprocessing choices, monitoring systems, review pathways, institutional policies, and human decision procedures.

A counterfactual explanation that focuses only on individual inputs may miss the system-level condition that produced the outcome. If a person is denied by an automated decision system, the relevant counterfactual may not be “what if this person had a slightly different feature value?” It may be “what if the system used a different threshold, included a human review process, measured need differently, or evaluated risk as a signal for support rather than exclusion?”

System layer	Counterfactual question	Governance question
Input data	Would a changed input alter the output?	Is the input accurate, relevant, and fair to use?
Preprocessing	Would different cleaning, encoding, or aggregation change the result?	Are transformations documented and reviewable?
Model structure	Would a different model class produce a different decision?	Why was this model selected?
Threshold	Would a different cutoff change approvals, denials, or escalations?	Who chose the threshold and for what purpose?
Institutional rule	Would a different policy rule change the outcome?	Is the rule legitimate, contestable, and accountable?
Human review	Would meaningful review alter the decision?	Is review real or merely symbolic?

The most important counterfactual may live at the system-design level rather than the feature level.

Counterfactuals in Algorithms and Models

Algorithms can use counterfactual reasoning in several ways. A model may generate alternative inputs that would change a classification. A simulation may compare outcomes under different policies. A causal model may estimate potential outcomes under intervention. A debugging workflow may test whether an output is sensitive to a rule, parameter, or data condition. A governance audit may ask whether a harmful outcome would have occurred under a safer design.

These uses are related, but they are not identical. A counterfactual explanation for a classifier is not the same as a causal effect estimate. A threshold sensitivity test is not the same as a policy simulation. A plausible alternative input is not necessarily an actionable intervention.

Counterfactual use	Computational role	Review concern
Explanation	Show what would change an output.	Does the explanation reflect meaningful alternatives?
Recourse	Identify actions that could change a decision.	Are actions feasible, fair, and within the person’s control?
Sensitivity analysis	Test how fragile an output is to changes.	Does fragility reveal instability or hidden dependence?
Causal inference	Estimate what would happen under intervention.	Are identification assumptions justified?
Policy simulation	Compare outcomes across alternative policies.	Are structural assumptions realistic?
Governance audit	Ask whether harm would have occurred under safer design.	Can institutional responsibility be traced?

Counterfactual reasoning becomes clearer when its purpose is named: explanation, recourse, sensitivity, causality, simulation, or accountability.

Counterfactual Explanations and Recourse

A counterfactual explanation often takes the form: if certain inputs had been different, the system would have produced a different output. For example, an explanation might say that an application would have been approved if a debt ratio had been lower or a documentation field had been complete. This kind of explanation can be useful, but it is not automatically responsible.

Algorithmic recourse asks whether a person has a meaningful path to change a decision. Recourse is stronger than explanation. It requires that the recommended changes be feasible, actionable, stable, proportional, and not based on protected, immutable, or ethically inappropriate attributes. Recourse also requires that the system will still recognize the change when the person acts.

Recourse criterion	Meaning	Failure mode
Actionability	The person can realistically change the condition.	The explanation relies on immutable or inaccessible variables.
Proportionality	The required change is not excessive relative to the decision.	Recourse demands burdensome or unrealistic changes.
Stability	The path remains valid over time and across model updates.	Advice becomes obsolete before it can be used.
Legibility	The explanation is understandable to affected people.	Recourse is technically precise but practically opaque.
Fairness	Groups are not given systematically harder paths to change outcomes.	Some people face higher recourse burdens than others.
Contestability	The person can challenge errors, assumptions, and decision rules.	Recourse substitutes for appeal instead of supporting it.

A counterfactual explanation tells what would change an output. Responsible recourse asks whether that change is meaningful, fair, and possible.
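One way to separate explanation from recourse is to filter candidate changes by actionability and burden before presenting them. The sketch below assumes the candidates have already been shown to flip the model's decision. The `ACTIONABLE` map, the feature names, and the use of summed absolute change as a burden measure are all hypothetical choices for illustration.

```python
# Hypothetical feature metadata: which inputs a person could realistically change.
ACTIONABLE = {"document_completeness": True, "debt_ratio": True, "age_band": False}


def filter_recourse(candidates: list, max_burden: float) -> list:
    """Keep candidate changes that touch only actionable features and stay under a burden cap."""
    kept = []
    for change in candidates:
        # Unknown features are treated as non-actionable by default.
        touches_only_actionable = all(ACTIONABLE.get(name, False) for name in change)
        burden = sum(abs(delta) for delta in change.values())
        if touches_only_actionable and burden <= max_burden:
            kept.append(change)
    return kept


# Each candidate maps feature names to the change that would flip the decision.
candidates = [
    {"age_band": 1.0},                                      # immutable: not recourse
    {"debt_ratio": -0.10},                                  # actionable and modest
    {"document_completeness": 0.20, "debt_ratio": -0.40},   # actionable but burdensome
]
feasible = filter_recourse(candidates, max_burden=0.30)
print(feasible)
```

The filter covers only actionability and proportionality. Stability, legibility, fairness across groups, and contestability need review that a filter like this cannot provide.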

Causal Counterfactuals and Prediction

Counterfactuals can be predictive, causal, or explanatory. A predictive counterfactual changes input values and observes how the model output changes. A causal counterfactual asks what would happen under an intervention in the world. These are not the same. A model may say that changing a feature changes a prediction, but that does not prove that changing the real-world condition would change the outcome.

This distinction is central to computational reasoning. Predictive models often encode associations. Causal counterfactuals require assumptions about how the world changes under intervention. A counterfactual explanation from a classifier may be useful for model transparency, but it should not be mistaken for evidence that an intervention will work.

Counterfactual type	Question	What it can support
Predictive counterfactual	Would the model output change if this input changed?	Model explanation and sensitivity review.
Causal counterfactual	Would the real outcome change under intervention?	Policy, treatment, and intervention reasoning.
Structural counterfactual	Would the outcome change if the system architecture changed?	Institutional redesign and governance review.
Historical counterfactual	Would a different history have produced a different dataset?	Bias analysis and data-generating-process review.
Threshold counterfactual	Would a different cutoff change decisions?	Policy calibration and error tradeoff analysis.
Recourse counterfactual	What action could change this decision?	Affected-person guidance and appeal design.

Prediction can reveal how a model behaves. Causality asks whether intervention changes the world.
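The gap between a predictive and a causal counterfactual can be shown with a toy setup. In the sketch below, both `model_score` and `real_outcome` are hypothetical functions invented for illustration: the model relies on a proxy signal, while the assumed data-generating process depends only on an underlying cause.

```python
def model_score(features: dict) -> float:
    """Hypothetical predictive model that learned to rely on a proxy signal."""
    return 0.2 + 0.6 * features["proxy_signal"]


def real_outcome(conditions: dict) -> float:
    """Hypothetical data-generating process: only the underlying cause matters."""
    return 0.1 + 0.8 * conditions["underlying_cause"]


case = {"proxy_signal": 0.0, "underlying_cause": 0.0}
edited = {**case, "proxy_signal": 1.0}  # change the proxy, leave the cause alone

# Predictive counterfactual: how the model output moves.
predictive_shift = model_score(edited) - model_score(case)
# Causal counterfactual: how the (assumed) real outcome moves.
causal_shift = real_outcome(edited) - real_outcome(case)

print("model output shift:", round(predictive_shift, 3))
print("real outcome shift:", round(causal_shift, 3))
```

In this constructed example the model output moves while the real outcome does not. In practice the data-generating process is not known, which is why the causal claim needs its own assumptions and evidence.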

Decision Systems and Policy Change

Counterfactual reasoning is especially important in decision systems because decisions are often presented as outputs of a model rather than consequences of choices. A classification threshold, error tolerance, appeal process, triage category, risk cutoff, or eligibility rule reflects institutional judgment. Counterfactual reasoning makes those judgments visible.

A policy counterfactual might ask what would happen if a support program replaced a sanction, if a threshold were recalibrated, if a human review layer were added, if certain proxy variables were excluded, or if the system optimized for harm reduction rather than throughput. These are not merely technical variations. They represent different institutional futures.

Decision-system choice	Counterfactual analysis	Institutional implication
Threshold selection	Compare decisions under different cutoffs.	Shows tradeoffs between false positives, false negatives, access, and burden.
Objective function	Compare outcomes under different optimization targets.	Reveals what the institution is prioritizing.
Appeal pathway	Ask whether meaningful review would change outcomes.	Tests whether contestability is substantive.
Feature inclusion	Compare outputs with and without disputed variables.	Identifies dependence on proxies or questionable measurements.
Support versus sanction	Compare policy responses to the same risk signal.	Separates risk identification from institutional response.
Deployment context	Compare results across settings or populations.	Tests whether the system travels responsibly.

Counterfactual reasoning shows that many algorithmic outcomes are not inevitable. They depend on design and governance choices.
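A threshold comparison of this kind can be sketched as a small error table. The scores and eligibility labels below are hypothetical, and `error_table` is an illustrative helper rather than an established method.

```python
# Hypothetical scored cases: (model score, whether the case was truly eligible).
cases = [(0.91, 1), (0.78, 1), (0.74, 0), (0.69, 1), (0.55, 0), (0.40, 0)]


def error_table(scored_cases, thresholds):
    """Count approvals, false positives, and false negatives under each cutoff."""
    rows = []
    for t in thresholds:
        approved = sum(1 for s, _ in scored_cases if s >= t)
        false_pos = sum(1 for s, y in scored_cases if s >= t and y == 0)
        false_neg = sum(1 for s, y in scored_cases if s < t and y == 1)
        rows.append({
            "threshold": t,
            "approved": approved,
            "false_positives": false_pos,
            "false_negatives": false_neg,
        })
    return rows


table = error_table(cases, [0.60, 0.70, 0.80])
for row in table:
    print(row)
```

Raising the cutoff in this toy data removes the false positive but adds false negatives. Which of those errors is more tolerable is an institutional judgment, not something the table can settle.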

Fairness, Harm, and Contestability

Counterfactual reasoning is often used in fairness analysis. A system might be examined by asking whether a decision would have changed if a protected attribute or proxy had been different. But this approach must be handled carefully. Some counterfactuals about identity are ethically and conceptually problematic because they treat social position as if it were a simple editable variable.

Fairness review should not reduce injustice to a one-feature substitution. A meaningful counterfactual may need to examine measurement systems, historical disadvantage, institutional rules, exposure to surveillance, unequal access, and different recourse burdens. Counterfactual fairness is most useful when connected to structural reasoning, not when it pretends that social categories can be swapped without changing the world around them.

Fairness question	Counterfactual framing	Responsible interpretation
Feature dependence	Would the output change if a disputed feature changed?	Check whether the feature is legitimate and causally meaningful.
Proxy dependence	Would removing a proxy alter decisions?	Review whether hidden substitutes preserve unfair patterns.
Recourse burden	Do different groups face different costs to change outcomes?	Evaluate fairness of actionable pathways, not only predictions.
Error distribution	Would another threshold distribute errors differently?	Make tradeoffs visible and accountable.
Historical dependence	Would a different data history produce different outputs?	Identify inherited institutional patterns.
Appeal effectiveness	Would review change wrongful or harmful outcomes?	Test whether contestability has power.

Counterfactual fairness must be grounded in institutional reality, not only mathematical substitution.
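The recourse-burden question can be examined with a simple group summary. The sketch below assumes an audit has already recorded, for each denied case, the smallest change that would flip the decision. The records, group labels, and `burden_by_group` helper are hypothetical.

```python
from statistics import mean

# Hypothetical audit records: smallest change that would flip each denied case.
recourse_records = [
    {"group": "A", "burden": 0.10},
    {"group": "A", "burden": 0.20},
    {"group": "B", "burden": 0.30},
    {"group": "B", "burden": 0.50},
    {"group": "B", "burden": None},  # no recourse path found in the search
]


def burden_by_group(records):
    """Summarize recourse burden and missing recourse paths per group."""
    summary = {}
    for group in sorted({r["group"] for r in records}):
        rows = [r for r in records if r["group"] == group]
        found = [r["burden"] for r in rows if r["burden"] is not None]
        summary[group] = {
            "denied_cases": len(rows),
            "no_path_found": len(rows) - len(found),
            "mean_burden": round(mean(found), 4) if found else None,
        }
    return summary


summary = burden_by_group(recourse_records)
print(summary)
```

A gap in mean burden or in missing paths is a prompt for structural review, not a finding on its own. The burden numbers inherit every assumption of the search and of the measurement systems behind the features.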

Validation, Sensitivity, and Limits

Counterfactual claims are difficult to validate because the alternative condition is not directly observed. Analysts can test local model behavior, compare historical analogues, run simulations, conduct experiments, examine causal assumptions, audit recourse stability, and evaluate whether counterfactual explanations remain valid after model updates. But counterfactual reasoning always involves limits.

A counterfactual may be valid as a model-behavior statement but invalid as a real-world intervention statement. A small change may cross a decision threshold without representing a meaningful life change. A suggested recourse path may fail because the model, institution, or data source changes. A scenario may appear plausible while ignoring constraints, feedback loops, or hidden dependencies.

Validation practice	Question	Evidence
Local sensitivity test	Does the output change under small input changes?	Perturbation or local explanation report.
Threshold sweep	How do decisions change across cutoffs?	Decision-rate and error-rate tables.
Recourse stability test	Does recommended action remain valid over time?	Model-update comparison.
Feasibility review	Can people actually make the suggested changes?	Domain, stakeholder, and institutional review.
Causal review	Does the counterfactual imply real-world intervention effects?	Causal graph, assumptions, and evidence design.
Scenario stress test	Does the conclusion hold under alternative assumptions?	Sensitivity and robustness analysis.

Counterfactual reasoning should be treated as a claim to be tested, not a visualization to be trusted automatically.
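A recourse stability test can be sketched by replaying earlier advice against an updated model. Both scoring functions, the threshold, and the case values below are hypothetical. The sketch only shows the shape of the check.

```python
THRESHOLD = 0.60


def score_v1(document_completeness: float, stability_score: float) -> float:
    """Hypothetical model at the time the advice was given."""
    return 0.5 * document_completeness + 0.5 * stability_score


def score_v2(document_completeness: float, stability_score: float) -> float:
    """Hypothetical updated model that weights stability more heavily."""
    return 0.3 * document_completeness + 0.7 * stability_score


def advice_still_valid(model, case, advice, threshold=THRESHOLD) -> bool:
    """Apply the advised change to the case and check whether the model approves."""
    doc = min(1.0, case["document_completeness"] + advice["document_increase"])
    stab = min(1.0, case["stability_score"] + advice["stability_increase"])
    return model(doc, stab) >= threshold


case = {"document_completeness": 0.50, "stability_score": 0.50}
advice = {"document_increase": 0.25, "stability_increase": 0.00}

valid_under_v1 = advice_still_valid(score_v1, case, advice)
valid_under_v2 = advice_still_valid(score_v2, case, advice)
print("valid under v1:", valid_under_v1)
print("valid under v2:", valid_under_v2)
```

Here the advice crosses the threshold under the first model and fails under the update, which is the failure mode the stability criterion is meant to catch.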

Governance and Responsible Use

Counterfactual reasoning can improve transparency, but it can also create false reassurance. A system may provide a tidy explanation while hiding the larger structure that produced the decision. Governance should ask who defines the counterfactual, which alternatives are allowed, which variables are treated as changeable, whose burden is increased, and whether affected people can challenge the explanation.

Counterfactual governance is especially important when systems affect access, rights, benefits, safety, employment, health, education, housing, finance, public services, or public visibility. In those settings, a counterfactual explanation should not replace due process. It should support meaningful review, correction, appeal, and institutional accountability.

Governance concern	Counterfactual question	Documentation
Purpose clarity	Is the counterfactual for explanation, recourse, audit, or policy design?	Counterfactual-use statement.
Actionability	Are recommended changes under a person’s control?	Recourse feasibility record.
Variable legitimacy	Should this variable be used or changed?	Feature legitimacy review.
Threshold accountability	Who selected the cutoff and error tradeoff?	Threshold decision memo.
Contestability	Can affected people challenge facts, assumptions, and rules?	Appeal and review pathway.
Use boundary	Where should the counterfactual not be used?	Scope and limitation statement.

Responsible counterfactual reasoning makes alternatives visible without shifting institutional responsibility onto affected people.

Representation Risk

Representation risk appears when counterfactuals make a system look more understandable, fair, or actionable than it really is. A counterfactual may suggest that a person could change a decision by altering one feature, while ignoring that the feature is hard to change, measured unfairly, or embedded in structural inequality. A counterfactual may imply causality even when it only describes model behavior. A scenario analysis may create the appearance of policy knowledge while hiding fragile assumptions.

Another risk is responsibility shifting. An institution may say, in effect, “the system told you what to change,” while leaving the underlying rule, threshold, objective, or data history unquestioned. Counterfactual reasoning should not become a way to individualize systemic problems.

Representation risk	How it appears	Review response
Model behavior as causality	A feature change is treated as a real-world intervention effect.	Separate predictive counterfactuals from causal counterfactuals.
Impossible recourse	The recommended change is not feasible or ethical.	Apply actionability and burden review.
Threshold mystification	A cutoff appears technical rather than institutional.	Document threshold choices and tradeoffs.
Structural erasure	Individual feature changes hide institutional design choices.	Include system-level counterfactuals.
False precision	A narrow counterfactual is presented with unwarranted certainty.	State uncertainty, assumptions, and scope.
Responsibility shifting	The burden of change is placed on affected people.	Preserve institutional accountability and appeal rights.

Counterfactual reasoning should clarify the conditions of a decision, not disguise institutional choice as technical inevitability.

Examples of Counterfactual Reasoning

The examples below show how counterfactual reasoning appears across algorithmic explanation, policy analysis, auditing, fairness review, and system design.

Decision threshold review

A system compares approvals, denials, false positives, and false negatives under alternative cutoffs.

Algorithmic recourse

A model identifies which actionable changes would move a case from denial to approval.

Feature sensitivity

An audit tests whether disputed features or proxies strongly influence individual outcomes.

Policy scenario modeling

A workflow compares outcomes under alternative institutional rules or interventions.

Fairness counterfactuals

A review asks whether comparable cases receive different outcomes because of protected attributes or proxies.

Appeal analysis

A governance review asks whether meaningful human review would have corrected a harmful decision.

Debugging

Engineers test whether a result depended on a data error, preprocessing choice, or model update.

Historical data audit

Analysts ask how outcomes would differ if biased historical labels or measurement systems had been changed.

Across these examples, counterfactual reasoning asks what difference made the difference.

Mathematics, Computation, and Modeling

A simple model-output counterfactual can be written as:

\[
f(x') \neq f(x)
\]

Interpretation: The model output changes when the input changes from the actual case \(x\) to the alternative case \(x'\).

A minimal counterfactual search can be written as an optimization problem:

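\[
x^* = \arg\min_{x'} d(x, x') \quad \text{subject to} \quad f(x') = y_{\text{target}}
\]

Interpretation: Find the alternative case \(x^*\) that is closest to the actual case \(x\) under a chosen distance measure \(d\) and that produces the target model outcome \(y_{\text{target}}\).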
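The search above can be sketched as a brute-force grid search with an L1 distance. Everything in the sketch is an assumption made for illustration: the linear `model_decision`, the weights, the cutoff, and the use of integer percentage points instead of values in [0, 1], which avoids floating-point ties at the threshold.

```python
from itertools import product


def model_decision(x: tuple) -> int:
    """Hypothetical linear rule on (document_completeness, stability_score) in percent."""
    document_completeness, stability_score = x
    return 1 if 6 * document_completeness + 4 * stability_score >= 500 else 0


def nearest_counterfactual(x: tuple, target: int, steps: list):
    """Return the closest grid point (L1 distance) whose decision equals target."""
    best = None
    for deltas in product(steps, repeat=len(x)):
        candidate = tuple(min(100, xi + d) for xi, d in zip(x, deltas))
        if model_decision(candidate) != target:
            continue
        distance = sum(abs(c - xi) for c, xi in zip(candidate, x))
        if best is None or distance < best[1]:
            best = (candidate, distance)
    return best  # None when no grid point reaches the target


actual = (40, 50)  # denied: 6*40 + 4*50 = 440 < 500
result = nearest_counterfactual(actual, target=1, steps=[0, 5, 10, 15, 20])
print(result)
```

The grid only allows increases, which encodes an assumption about which directions of change are meaningful. The result is the nearest point on this grid, not a guarantee that the change is feasible for a real person.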

A threshold counterfactual can be written as:

\[
D_i(t) = \mathbb{1}[s_i \geq t]
\]

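Interpretation: The decision \(D_i(t)\) is 1 when score \(s_i\) is at or above threshold \(t\) and 0 otherwise, so the decision changes when an alternative threshold falls on the other side of \(s_i\).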
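The indicator rule can be applied directly to find which cases flip between two cutoffs. The scores below are hypothetical, and the two thresholds are chosen only as an example of a small cutoff shift.

```python
def decision(score: float, threshold: float) -> int:
    """Indicator rule: 1 when the score is at or above the threshold, else 0."""
    return 1 if score >= threshold else 0


# Hypothetical scores for four cases.
scores = {"case_1": 0.66, "case_2": 0.71, "case_3": 0.74, "case_4": 0.80}

current_threshold = 0.70
alternative_threshold = 0.75

# Cases whose decision differs between the actual and the alternative cutoff.
flipped = [
    case_id
    for case_id, score in scores.items()
    if decision(score, current_threshold) != decision(score, alternative_threshold)
]
print(flipped)
```

Only cases whose scores lie between the two cutoffs change decision, so a threshold counterfactual concentrates its effect on the people closest to the boundary.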

A causal counterfactual can be expressed using potential outcomes:

\[
Y_i(a) - Y_i(a')
\]

Interpretation: The counterfactual contrast compares the outcome for unit \(i\) under action \(a\) with the outcome under alternative action \(a'\).
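The contrast can be computed directly only when both potential outcomes are known, which happens in synthetic data but not in observed data, where each unit receives one action. The sketch below uses hypothetical units and hypothetical action names (`support`, `sanction`) constructed for illustration.

```python
from statistics import mean

# Hypothetical synthetic units where both potential outcomes are known by construction.
units = [
    {"unit": 1, "y_support": 1, "y_sanction": 0},
    {"unit": 2, "y_support": 1, "y_sanction": 1},
    {"unit": 3, "y_support": 0, "y_sanction": 0},
    {"unit": 4, "y_support": 1, "y_sanction": 0},
]

# Unit-level contrast: outcome under support minus outcome under sanction.
contrasts = [u["y_support"] - u["y_sanction"] for u in units]
average_contrast = mean(contrasts)

print("unit contrasts:", contrasts)
print("average contrast:", average_contrast)
```

With real records, one of the two outcome columns is missing for every unit, so the contrast has to be estimated under identification assumptions rather than read off a table.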

A structural counterfactual can be written as:

\[
Y_{M'} - Y_M
\]

Interpretation: The outcome under an alternative system model \(M'\) is compared with the outcome under the current system model \(M\).
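A structural contrast can be sketched by running the same cases through two system designs. The cutoffs, the review band, and the reviewer judgments below are hypothetical, and the outcome compared is simply the number of approvals.

```python
def system_m(score: float) -> int:
    """Current design: a single automated cutoff."""
    return 1 if score >= 0.70 else 0


def system_m_prime(score: float, reviewer_approves: bool) -> int:
    """Hypothetical redesign: cases just below the cutoff go to human review."""
    if score >= 0.70:
        return 1
    if score >= 0.60:
        return 1 if reviewer_approves else 0
    return 0


# Hypothetical cases: (score, what a reviewer would decide if asked).
cases = [(0.82, False), (0.66, True), (0.63, False), (0.41, True)]

y_m = sum(system_m(score) for score, _ in cases)
y_m_prime = sum(system_m_prime(score, reviewer) for score, reviewer in cases)
structural_contrast = y_m_prime - y_m

print("approvals under M:", y_m)
print("approvals under M':", y_m_prime)
print("contrast:", structural_contrast)
```

The inputs are identical across both runs. The difference comes entirely from the routing rule, which is the sense in which the counterfactual sits at the system-design level rather than the feature level.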

These formulas show why counterfactual reasoning sits between optimization, causal inference, decision analysis, and governance.

Python Workflow: Counterfactual Audit

The Python workflow below creates a dependency-light counterfactual audit. It generates synthetic decision cases, applies a score threshold, searches for simple actionable counterfactuals, sweeps alternative thresholds, records recourse burdens, and writes reproducible CSV and JSON outputs.

# counterfactual_reasoning_algorithmic_systems_audit.py
# Dependency-light workflow for threshold counterfactuals, recourse search,
# sensitivity review, and governance documentation.

from __future__ import annotations

from dataclasses import asdict, dataclass
from pathlib import Path
from statistics import mean
import csv
import json
import math
import random
from datetime import datetime, timezone

ARTICLE_ROOT = Path(__file__).resolve().parents[1]
TABLES = ARTICLE_ROOT / "outputs" / "tables"
JSON_DIR = ARTICLE_ROOT / "outputs" / "json"


@dataclass(frozen=True)
class CounterfactualAuditConfig:
    article: str
    seed: int
    n: int
    decision_threshold: float
    target_decision: int


@dataclass(frozen=True)
class CounterfactualReviewItem:
    item: str
    description: str
    review_question: str
    status: str


def timestamp_utc() -> str:
    return datetime.now(timezone.utc).isoformat()


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


def write_csv(path: Path, rows: list[dict[str, object]]) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    if not rows:
        path.write_text("", encoding="utf-8")
        return
    fieldnames = sorted({key for row in rows for key in row.keys()})
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)


def write_json(path: Path, payload: object) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")


def default_config() -> CounterfactualAuditConfig:
    return CounterfactualAuditConfig(
        article="counterfactual_reasoning_in_algorithmic_systems",
        seed=2026,
        n=600,
        decision_threshold=0.68,
        target_decision=1,
    )


def score_case(case: dict[str, float]) -> float:
    return sigmoid(
        -1.10
        + 1.35 * case["document_completeness"]
        + 1.10 * case["stability_score"]
        - 0.95 * case["burden_index"]
        + 0.60 * case["prior_access"]
    )


def generate_cases(config: CounterfactualAuditConfig) -> list[dict[str, object]]:
    rng = random.Random(config.seed)
    rows: list[dict[str, object]] = []
    for case_id in range(1, config.n + 1):
        prior_access = max(0.0, min(1.0, rng.gauss(0.52, 0.22)))
        burden_index = max(0.0, min(1.0, rng.gauss(0.42 + 0.25 * (1 - prior_access), 0.18)))
        document_completeness = max(0.0, min(1.0, rng.gauss(0.58 + 0.20 * prior_access, 0.20)))
        stability_score = max(0.0, min(1.0, rng.gauss(0.55 - 0.15 * burden_index, 0.16)))
        features = {
            "prior_access": prior_access,
            "burden_index": burden_index,
            "document_completeness": document_completeness,
            "stability_score": stability_score,
        }
        score = score_case(features)
        decision = 1 if score >= config.decision_threshold else 0
        rows.append({
            "case_id": case_id,
            "prior_access": round(prior_access, 6),
            "burden_index": round(burden_index, 6),
            "document_completeness": round(document_completeness, 6),
            "stability_score": round(stability_score, 6),
            "score": round(score, 6),
            "threshold": config.decision_threshold,
            "decision": decision,
            "interpretation": "Synthetic case for counterfactual reasoning; not an operational decision record.",
        })
    return rows


def find_actionable_counterfactual(row: dict[str, object], threshold: float) -> dict[str, object]:
    current_score = float(row["score"])
    current_decision = int(row["decision"])
    if current_decision == 1:
        return {
            "case_id": row["case_id"],
            "current_decision": current_decision,
            "counterfactual_found": False,
            "recommended_change": "already_meets_target_decision",
            "new_score": current_score,
            "recourse_burden": 0.0,
            "interpretation": "No recourse needed because target decision is already reached.",
        }

    base = {
        "prior_access": float(row["prior_access"]),
        "burden_index": float(row["burden_index"]),
        "document_completeness": float(row["document_completeness"]),
        "stability_score": float(row["stability_score"]),
    }

    candidates: list[dict[str, object]] = []
    for document_increase in [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]:
        for stability_increase in [0.00, 0.05, 0.10, 0.15, 0.20]:
            altered = dict(base)
            altered["document_completeness"] = min(1.0, altered["document_completeness"] + document_increase)
            altered["stability_score"] = min(1.0, altered["stability_score"] + stability_increase)
            new_score = score_case(altered)
            new_decision = 1 if new_score >= threshold else 0
            burden = document_increase + stability_increase
            if new_decision == 1:
                candidates.append({
                    "document_increase": document_increase,
                    "stability_increase": stability_increase,
                    "new_score": new_score,
                    "recourse_burden": burden,
                })

    if not candidates:
        return {
            "case_id": row["case_id"],
            "current_decision": current_decision,
            "counterfactual_found": False,
            "recommended_change": "no_simple_actionable_counterfactual_found",
            "new_score": current_score,
            "recourse_burden": None,
            "interpretation": "No simple actionable change crosses the threshold under this search grid.",
        }

    best = min(candidates, key=lambda item: float(item["recourse_burden"]))
    return {
        "case_id": row["case_id"],
        "current_decision": current_decision,
        "counterfactual_found": True,
        "recommended_change": f"increase_document_completeness_by_{best['document_increase']:.2f}_and_stability_by_{best['stability_increase']:.2f}",
        "new_score": round(float(best["new_score"]), 6),
        "recourse_burden": round(float(best["recourse_burden"]), 6),
        "interpretation": "This is a model-behavior counterfactual, not proof that the real-world intervention will cause approval.",
    }


def threshold_sweep(rows: list[dict[str, object]], thresholds: list[float]) -> list[dict[str, object]]:
    # Recount approvals under each alternative cutoff; the scores themselves are not recomputed.
    output: list[dict[str, object]] = []
    for threshold in thresholds:
        approved = sum(1 for row in rows if float(row["score"]) >= threshold)
        output.append({
            "threshold": round(threshold, 3),
            "approved_count": approved,
            "denied_count": len(rows) - approved,
            # Guard against an empty case list so the sweep never divides by zero.
            "approval_rate": round(approved / len(rows), 6) if rows else None,
            "interpretation": "Threshold counterfactual showing how decision volume changes under alternative cutoffs.",
        })
    return output


def review_register() -> list[dict[str, object]]:
    items = [
        CounterfactualReviewItem("purpose", "State whether the counterfactual is for explanation, recourse, audit, or policy design.", "What is this counterfactual for?", "complete"),
        CounterfactualReviewItem("actionability", "Check whether recommended changes are feasible and under the affected person's control.", "Can the person actually act on this?", "needs_review"),
        CounterfactualReviewItem("causal_status", "Distinguish model-behavior counterfactuals from causal intervention claims.", "Does this imply real-world causality?", "needs_review"),
        CounterfactualReviewItem("threshold_accountability", "Document who selected the threshold and why.", "Who chose the cutoff?", "partial"),
        CounterfactualReviewItem("recourse_burden", "Compare how costly recourse is across cases and groups.", "Are burdens distributed fairly?", "partial"),
        CounterfactualReviewItem("contestability", "Provide ways to challenge data, rules, assumptions, and outcomes.", "Can the affected person appeal?", "needs_review"),
    ]
    return [asdict(item) for item in items]


def main() -> None:
    config = default_config()
    rows = generate_cases(config)
    recourse_rows = [find_actionable_counterfactual(row, config.decision_threshold) for row in rows]
    sweep_rows = threshold_sweep(rows, [0.55, 0.60, 0.65, 0.68, 0.70, 0.75, 0.80])
    register_rows = review_register()

    denied = [row for row in rows if int(row["decision"]) == 0]
    recourse_found = [row for row in recourse_rows if row["counterfactual_found"] is True]
    burdens = [float(row["recourse_burden"]) for row in recourse_found if row["recourse_burden"] is not None]

    summary = {
        "article": config.article,
        "timestamp_utc": timestamp_utc(),
        "n": config.n,
        "decision_threshold": config.decision_threshold,
        "approved_count": sum(1 for row in rows if int(row["decision"]) == 1),
        "denied_count": len(denied),
        "recourse_found_count": len(recourse_found),
        "average_recourse_burden": round(mean(burdens), 6) if burdens else None,
        "review_items_needing_attention": sum(1 for row in register_rows if row["status"] in {"partial", "needs_review"}),
        "interpretation": "Counterfactual audit separates model-output changes from actionable, causal, and governable alternatives.",
    }

    write_csv(TABLES / "counterfactual_synthetic_cases.csv", rows)
    write_csv(TABLES / "counterfactual_recourse_results.csv", recourse_rows)
    write_csv(TABLES / "counterfactual_threshold_sweep.csv", sweep_rows)
    write_csv(TABLES / "counterfactual_review_register.csv", register_rows)
    write_csv(TABLES / "counterfactual_audit_summary.csv", [summary])

    write_json(JSON_DIR / "counterfactual_audit_config.json", asdict(config))
    write_json(JSON_DIR / "counterfactual_synthetic_cases.json", rows)
    write_json(JSON_DIR / "counterfactual_recourse_results.json", recourse_rows)
    write_json(JSON_DIR / "counterfactual_threshold_sweep.json", sweep_rows)
    write_json(JSON_DIR / "counterfactual_review_register.json", register_rows)
    write_json(JSON_DIR / "counterfactual_audit_summary.json", summary)

    print("Counterfactual audit complete.")
    print(TABLES / "counterfactual_audit_summary.csv")


if __name__ == "__main__":
    main()

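The workflow is intentionally simple. It separates model-behavior counterfactuals from causal claims, writes review artifacts, and gives the repository a reproducible audit layer rather than a one-off code sample. The recourse search is a coarse grid over two features, so a result of "no counterfactual found" holds only for that grid, not for every possible change.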

R Workflow: Counterfactual Summary and Diagnostics

The R workflow reads the Python-generated outputs and creates summary diagnostics for threshold sensitivity, recourse burden, and decision distribution.

# counterfactual_reasoning_algorithmic_systems_summary.R
# Reads generated CSV outputs and produces simple diagnostics.

args <- commandArgs(trailingOnly = FALSE)
file_arg <- grep("^--file=", args, value = TRUE)
if (length(file_arg) > 0) {
  script_path <- normalizePath(sub("^--file=", "", file_arg[1]), mustWork = TRUE)
  article_root <- normalizePath(file.path(dirname(script_path), ".."), mustWork = TRUE)
} else {
  article_root <- getwd()
}

setwd(article_root)
tables_dir <- file.path(article_root, "outputs", "tables")
figures_dir <- file.path(article_root, "outputs", "figures")
dir.create(tables_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(figures_dir, recursive = TRUE, showWarnings = FALSE)

summary_path <- file.path(tables_dir, "counterfactual_audit_summary.csv")
if (!file.exists(summary_path)) stop(paste("Missing", summary_path, "Run the Python workflow first."))
summary_data <- read.csv(summary_path, stringsAsFactors = FALSE)

sweep_path <- file.path(tables_dir, "counterfactual_threshold_sweep.csv")
if (file.exists(sweep_path)) {
  sweep <- read.csv(sweep_path, stringsAsFactors = FALSE)
  png(file.path(figures_dir, "counterfactual_threshold_sweep.png"), width = 1300, height = 850)
  plot(sweep$threshold, sweep$approval_rate, type = "b", xlab = "Threshold", ylab = "Approval rate", main = "Counterfactual Threshold Sweep")
  grid()
  dev.off()
}

recourse_path <- file.path(tables_dir, "counterfactual_recourse_results.csv")
if (file.exists(recourse_path)) {
  recourse <- read.csv(recourse_path, stringsAsFactors = FALSE)
  burden_values <- recourse$recourse_burden[!is.na(recourse$recourse_burden)]
  if (length(burden_values) > 0) {
    png(file.path(figures_dir, "counterfactual_recourse_burden.png"), width = 1200, height = 850)
    hist(burden_values, xlab = "Recourse burden", main = "Distribution of Counterfactual Recourse Burden")
    grid()
    dev.off()
  }
}

r_summary <- data.frame(
  n = summary_data$n[1],
  decision_threshold = summary_data$decision_threshold[1],
  approved_count = summary_data$approved_count[1],
  denied_count = summary_data$denied_count[1],
  recourse_found_count = summary_data$recourse_found_count[1],
  average_recourse_burden = summary_data$average_recourse_burden[1],
  review_items_needing_attention = summary_data$review_items_needing_attention[1]
)

write.csv(r_summary, file.path(tables_dir, "r_counterfactual_summary.csv"), row.names = FALSE)
print(r_summary)

The R layer turns the audit outputs into review summaries and plots that can be inspected by analysts, editors, or governance reviewers.

GitHub Repository

The companion repository contains reproducible workflows, synthetic data, audit outputs, calculators, documentation, and multilingual examples for this article.

Complete Code Repository

Companion article folder with Python, R, Julia, SQL, Haskell, C, C++, Fortran, Rust, Go, Java, TypeScript, Prolog, Racket, notebooks, documentation, synthetic teaching data, generated outputs, schemas, calculators, and Canvas-ready workflow artifacts for counterfactual reasoning, alternative outcomes, decision thresholds, recourse, sensitivity analysis, causal comparison, fairness review, appeals, governance documentation, and responsible algorithmic interpretation.

A Practical Method for Reviewing Counterfactual Reasoning

A practical counterfactual review should begin by naming the purpose of the counterfactual. Is it explaining a model output, offering recourse, testing sensitivity, comparing policy alternatives, auditing fairness, or assigning institutional responsibility? The purpose determines what kinds of evidence and constraints are needed.

Next, define the actual condition and the alternative condition. State what changes and what remains fixed. Identify whether the change is a feature substitution, threshold shift, rule change, data-history change, policy intervention, or system redesign. Then ask whether the alternative is feasible, meaningful, and ethically interpretable.
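One way to make this step concrete is a small record that stores the actual and alternative conditions side by side and reports what changed and what stayed fixed. The sketch below is an illustration under stated assumptions: the class name `CounterfactualSpec`, its fields, and the `CHANGE_TYPES` labels are hypothetical, chosen to mirror the change types listed above rather than taken from the companion code.

```python
from dataclasses import dataclass

# Labels mirror the change types named in the text; the spellings are assumptions.
CHANGE_TYPES = {
    "feature_substitution",
    "threshold_shift",
    "rule_change",
    "data_history_change",
    "policy_intervention",
    "system_redesign",
}


@dataclass(frozen=True)
class CounterfactualSpec:
    purpose: str
    change_type: str
    actual: dict
    alternative: dict

    def __post_init__(self) -> None:
        if self.change_type not in CHANGE_TYPES:
            raise ValueError(f"unknown change type: {self.change_type}")

    def changed(self) -> dict:
        """Map each differing key to its (actual, alternative) pair."""
        keys = set(self.actual) | set(self.alternative)
        return {
            key: (self.actual.get(key), self.alternative.get(key))
            for key in sorted(keys)
            if self.actual.get(key) != self.alternative.get(key)
        }

    def held_fixed(self) -> list:
        """List keys that appear in both conditions with identical values."""
        return sorted(
            key for key in self.actual
            if key in self.alternative and self.actual[key] == self.alternative[key]
        )


spec = CounterfactualSpec(
    purpose="audit",
    change_type="threshold_shift",
    actual={"threshold": 0.68, "document_completeness": 0.6},
    alternative={"threshold": 0.65, "document_completeness": 0.6},
)
print(spec.changed())
print(spec.held_fixed())
```

Recording the fixed conditions explicitly matters as much as recording the change, because a counterfactual is only interpretable relative to what it assumes stays the same.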

Review step	Question	Output
Define purpose	Why is the counterfactual being generated?	Explanation, recourse, audit, policy, or governance note.
Define actual condition	What happened under the current system?	Observed decision or outcome record.
Define alternative condition	What is changed in the counterfactual?	Feature, threshold, rule, policy, or system-change statement.
Constrain alternatives	Which changes are feasible, ethical, and coherent?	Actionability and plausibility register.
Separate model behavior from causality	Does the counterfactual show prediction change or real-world intervention effect?	Causal-status note.
Govern use	How will this counterfactual shape decisions or appeals?	Use-boundary and accountability record.

The goal is to make alternative conditions visible enough that they can be examined before they are used as explanation, advice, policy, or justification.

Common Pitfalls

A common pitfall is treating any model-output change as meaningful recourse. A classifier may say that a small feature change would alter the result, but the feature may not be changeable, measurable, legitimate, or causally connected to the real outcome. Another pitfall is treating a local counterfactual as a system explanation. A nearby alternative may explain one output while hiding broader institutional causes.
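A minimal sketch of an actionability filter shows how the first pitfall can be guarded against before any recourse is offered. The function name `filter_actionable`, the choice of which features count as actionable, and the burden cap are all assumptions made for illustration; in practice those choices are governance decisions, not technical defaults.

```python
def filter_actionable(
    candidates: list[dict[str, float]],
    actionable_features: set[str],
    max_total_change: float,
) -> list[dict[str, float]]:
    """Keep candidate changes that touch only actionable features and stay within a burden cap."""
    kept = []
    for candidate in candidates:
        touched = {name for name, delta in candidate.items() if delta != 0}
        if not touched <= actionable_features:
            continue  # Relies on a feature the person cannot change.
        if sum(abs(delta) for delta in candidate.values()) > max_total_change:
            continue  # Technically possible but excessive.
        kept.append(candidate)
    return kept


# Hypothetical candidates: each maps a feature name to a proposed change.
candidates = [
    {"document_completeness": 0.1},
    {"prior_access": 0.2},
    {"document_completeness": 0.5, "stability_score": 0.4},
]
actionable = {"document_completeness", "stability_score"}  # Assumed for illustration.
print(filter_actionable(candidates, actionable, max_total_change=0.5))
```

A filter like this only checks feasibility as declared by the reviewer. It does not establish that the change is legitimate or causally connected to the real outcome, which still requires separate review.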

Common pitfalls include:

model behavior as causality: a predictive counterfactual is treated as an intervention effect;
inactionable recourse: recommended changes are impossible, excessive, or outside a person’s control;
identity substitution: protected or socially embedded attributes are treated as simple editable variables;
threshold hiding: cutoff choices are treated as technical facts rather than governance decisions;
proxy persistence: removing one feature leaves proxy dependence intact;
fragile explanations: counterfactuals change after minor model updates or data shifts;
structural erasure: individual-level explanations hide institutional design choices;
overprecision: a speculative alternative is presented as certain;
recourse burden inequality: some groups face harder paths to change decisions;
governance gap: counterfactual explanations are provided without appeal, review, or accountability.

The remedy is disciplined counterfactual review: define purpose, constrain alternatives, distinguish prediction from causality, evaluate actionability, test sensitivity, audit burdens, and preserve contestability.
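The "audit burdens" step can be sketched as a per-group comparison of how often recourse is found and how costly it is. This is an illustrative sketch: the `group` field and the function name `burden_by_group` are hypothetical, and the rows are made-up teaching values rather than results from the workflow above.

```python
from collections import defaultdict
from statistics import mean


def burden_by_group(rows: list[dict]) -> dict:
    """Summarize recourse availability and mean burden per group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["group"]].append(row["recourse_burden"])

    report = {}
    for group, burdens in sorted(grouped.items()):
        found = [b for b in burdens if b is not None]  # None means no recourse was found.
        report[group] = {
            "cases": len(burdens),
            "recourse_found_rate": len(found) / len(burdens),
            "mean_burden": mean(found) if found else None,
        }
    return report


# Made-up rows for illustration only.
rows = [
    {"group": "A", "recourse_burden": 0.25},
    {"group": "A", "recourse_burden": 0.75},
    {"group": "B", "recourse_burden": 0.5},
    {"group": "B", "recourse_burden": None},
]
report = burden_by_group(rows)
for group, stats in report.items():
    print(group, stats)
```

Reporting the found rate alongside the mean burden matters: a group can show a low average burden simply because its hardest cases have no recourse path at all and therefore drop out of the average.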

Why Counterfactual Reasoning Is Computational Reasoning

Counterfactual reasoning is computational reasoning because algorithms do not only process what is actual. They also help compare alternatives. A model can be asked what would happen if an input changed. A simulation can compare policy scenarios. A causal workflow can estimate potential outcomes. A governance audit can ask whether a harmful decision would have occurred under a different rule, threshold, or review process.
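Asking a model what would happen if an input changed can be sketched as a one-at-a-time sensitivity check. The sketch assumes features scaled to the range 0 to 1 and a hypothetical weighted score (`toy_score` with made-up weights); it describes model behavior only and says nothing about whether changing the feature in the real world would change the outcome.

```python
from typing import Callable

# Hypothetical weights for illustration; not the article's scoring rule.
WEIGHTS = {"document_completeness": 0.6, "stability_score": 0.4}


def toy_score(features: dict[str, float]) -> float:
    return sum(WEIGHTS[name] * value for name, value in features.items())


def one_at_a_time_sensitivity(
    score_fn: Callable[[dict[str, float]], float],
    features: dict[str, float],
    delta: float,
    threshold: float,
) -> dict[str, dict]:
    """Perturb each feature alone and record the score change and any decision flip."""
    base_score = score_fn(features)
    base_decision = base_score >= threshold
    results = {}
    for name in features:
        altered = dict(features)
        # Clamp to [0, 1], assuming features live on that scale.
        altered[name] = min(1.0, max(0.0, altered[name] + delta))
        new_score = score_fn(altered)
        results[name] = {
            "score_change": new_score - base_score,
            "decision_flips": (new_score >= threshold) != base_decision,
        }
    return results


case = {"document_completeness": 0.5, "stability_score": 0.5}
sensitivity = one_at_a_time_sensitivity(toy_score, case, delta=0.1, threshold=0.55)
for name, result in sensitivity.items():
    print(name, result)
```

With these assumed weights, the same size of change flips the decision for one feature but not the other, which is the kind of dependence a counterfactual audit is meant to surface.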

This makes counterfactual reasoning powerful, but also risky. A counterfactual can clarify a decision, support recourse, reveal threshold dependence, expose unfair burdens, and test design choices. It can also create false confidence, shift responsibility onto affected people, imply causality where none has been shown, or make institutional choices look like technical inevitabilities.

The strongest counterfactual workflows make alternatives explicit. They show the actual condition, alternative condition, changed variables, fixed assumptions, output difference, actionability constraints, causal status, uncertainty, fairness implications, and governance boundaries.

The next article turns to causal algorithms and intervention modeling: how causal graphs, do-calculus, intervention design, and policy analysis support computational reasoning about what changes the world, not merely what changes a model output.

References

Barocas, S., Hardt, M. and Narayanan, A. (2019) Fairness and Machine Learning: Limitations and Opportunities. Available at: https://fairmlbook.org/.
Karimi, A.-H., Barthe, G., Schölkopf, B. and Valera, I. (2022) ‘A survey of algorithmic recourse: contrastive explanations and consequential recommendations’, ACM Computing Surveys, 55(5), pp. 1–29.
Lewis, D. (1973) Counterfactuals. Oxford: Blackwell.
Miller, T. (2019) ‘Explanation in artificial intelligence: insights from the social sciences’, Artificial Intelligence, 267, pp. 1–38.
Molnar, C. (2022) Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. 2nd edn. Available at: https://christophm.github.io/interpretable-ml-book/.
Morgan, S.L. and Winship, C. (2015) Counterfactuals and Causal Inference: Methods and Principles for Social Research. 2nd edn. Cambridge: Cambridge University Press.
Pearl, J. (2009) Causality: Models, Reasoning, and Inference. 2nd edn. Cambridge: Cambridge University Press.
Pearl, J., Glymour, M. and Jewell, N.P. (2016) Causal Inference in Statistics: A Primer. Chichester: Wiley.
Peters, J., Janzing, D. and Schölkopf, B. (2017) Elements of Causal Inference: Foundations and Learning Algorithms. Cambridge, MA: MIT Press.
Wachter, S., Mittelstadt, B. and Russell, C. (2017) ‘Counterfactual explanations without opening the black box: automated decisions and the GDPR’, Harvard Journal of Law & Technology, 31(2), pp. 841–887.
Woodward, J. (2003) Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
