Last Updated June 11, 2026
Abstraction and representation are the central acts of mathematical modeling. A model does not reproduce the world in full. It selects, simplifies, encodes, and organizes some features of a situation so that they can be reasoned about mathematically. To model is to decide what matters, what can be ignored, what must be preserved, and what formal structure can carry the relationships needed for explanation, prediction, simulation, optimization, or decision support.
Abstraction removes detail while preserving structure. Representation gives that structure a form: a variable, parameter, equation, inequality, diagram, probability distribution, graph, simulation, algorithm, or computational workflow. The model becomes a bridge between a real-world target and a formal system. That bridge can clarify, but it can also distort. A model reveals by leaving things out.
This article examines how abstraction and representation work inside mathematical models. It explains why simplification is not a weakness by itself, why representation is never neutral, how different formal systems make different features visible, and why models must be assessed according to purpose, evidence, uncertainty, and consequences of use.

A mathematical model is therefore neither the world itself nor a decorative symbol for it. It is a selective representation. The quality of a model depends on whether the abstraction preserves the relationships relevant to the model’s purpose, whether the representation makes those relationships inspectable, whether the assumptions are explicit, and whether the model’s limits are communicated honestly.
Why Abstraction and Representation Matter
Abstraction and representation matter because mathematical models are not transparent windows onto reality. They are structured ways of seeing. A model makes some features visible, suppresses others, and gives selected relationships a formal shape. This is why the same real-world system can support many valid models depending on the question being asked.
A traffic system can be represented as a network of roads and intersections, a queueing system, a set of individual agents, a flow optimization problem, a stochastic process, or a spatial simulation. A disease outbreak can be represented as differential equations, a branching process, a contact network, an agent-based model, or a statistical forecasting model. A climate system can be represented through physical equations, parameterized submodels, statistical emulators, scenarios, or integrated assessment models.
These representations are not interchangeable. Each preserves different structure. Each supports different questions. Each creates different risks. A network representation may reveal connectivity but hide internal dynamics. A differential equation may clarify rates of change but hide individual heterogeneity. A statistical model may predict well while leaving mechanisms unclear. A simulation may show plausible behavior while making assumptions harder to inspect.
| Modeling act | What it does | What can go wrong |
|---|---|---|
| Abstraction | Removes detail while preserving selected structure. | Removes a feature that is actually essential. |
| Representation | Gives selected structure a formal form. | Uses a form that makes the wrong relationships appear important. |
| Idealization | Introduces simplifying assumptions that are knowingly false but useful. | Treats the idealization as literal reality. |
| Aggregation | Combines many units into a summary quantity. | Hides variation, inequality, thresholds, or local failure. |
| Parameterization | Compresses unresolved processes into parameters. | Turns unknown mechanisms into numbers that appear known. |
| Visualization | Makes patterns and relationships visible. | Creates persuasive images that exceed the model’s evidence. |
The question is not whether a model simplifies. Every model simplifies. The real question is whether the simplification is appropriate for the model’s purpose, whether the representation preserves the structure that matters, and whether users understand what has been left out.
What Abstraction Means in Modeling
Abstraction is the process of moving from a particular situation to a more general structure. It asks what can be removed without destroying the relationship the model is meant to examine. A reservoir becomes a stock with inflows and outflows. A population becomes a state variable. A supply chain becomes a graph. A policy choice becomes a decision variable. A bridge becomes a system of loads, supports, materials, and constraints.
Abstraction is not the same as vagueness. A good abstraction is often more precise than the messy situation it represents. It replaces raw complexity with a disciplined structure: variables, parameters, equations, assumptions, constraints, and outputs. It is selective, but not arbitrary.
Abstraction always answers a purpose. A model for teaching may abstract differently from a model for engineering certification. A model for short-term prediction may abstract differently from a model for causal explanation. A model for stakeholder deliberation may need to preserve interpretability, while a high-performance simulation may preserve detailed mechanism at the cost of transparency.
| Real-world feature | Possible abstraction | Preserved structure | Lost detail |
|---|---|---|---|
| Water stored in a reservoir | Stock variable \(S_t\) | Accumulation, depletion, capacity. | Spatial variation, water quality, operating rules unless added. |
| People in a disease outbreak | Compartments \(S,I,R\) | Population-level transitions. | Individual contact histories and heterogeneous susceptibility. |
| Transportation system | Graph \(G=(V,E)\) | Connectivity, paths, bottlenecks. | Driver behavior, timing, lane-level detail unless represented. |
| Economic demand | Function \(D(p)\) | Relationship between price and quantity. | Institutional, cultural, and behavioral heterogeneity. |
| Engineering material | Constitutive relation | Stress-strain behavior. | Microstructure, fatigue, defects unless modeled. |
| Climate pathway | Scenario trajectory | Long-term emissions or forcing assumptions. | Political uncertainty, regional variation, social response. |
Good abstraction keeps a connection to interpretation. A variable should have a meaning. A parameter should have a source or rationale. A constraint should represent a physical, legal, ethical, operational, or mathematical limit. A model output should connect back to the original question. If these connections are unclear, the abstraction may be mathematically elegant but practically weak.
What Representation Means in Modeling
Representation is the formal expression of an abstraction. It is how selected structure becomes mathematically usable. A representation may be symbolic, graphical, computational, statistical, geometric, algorithmic, or visual. The same abstract system can often be represented in multiple ways, and each representation supports different forms of reasoning.
A dynamic system can be represented by a differential equation, a recurrence relation, a simulation algorithm, a state-space model, or a diagram of stocks and flows. A relationship among entities can be represented by a graph, an adjacency matrix, a flow network, or an agent-based simulation. An uncertain quantity can be represented by a probability distribution, confidence interval, Bayesian posterior, scenario range, or ensemble output.
| Representation | Typical form | Strength | Limitation |
|---|---|---|---|
| Equation | \(y=f(x;\theta)\) | Compact, analyzable, precise. | May hide assumptions behind notation. |
| Differential equation | \(\frac{dx}{dt}=f(x,t,\theta)\) | Represents continuous change and rates. | May require assumptions of smoothness and homogeneity. |
| Recurrence relation | \(x_{t+1}=f(x_t,\theta)\) | Represents stepwise evolution. | Time-step choice can shape results. |
| Graph | \(G=(V,E)\) | Represents relationships, paths, connectivity. | May hide internal state or dynamics of nodes. |
| Probability model | \(Y\sim P(y\mid\theta)\) | Represents randomness and inference. | Distributional assumptions may dominate interpretation. |
| Optimization model | \(\min f(x)\) subject to constraints. | Clarifies objectives and feasible choices. | Objective function may omit values that matter. |
| Simulation | Rules, algorithms, time steps, events. | Explores complex behavior when closed-form analysis is unavailable. | Can be difficult to validate and interpret. |
| Visualization | Plot, map, diagram, dashboard. | Makes patterns visible. | Can persuade beyond the evidence. |
Representation is not merely communication after the model is built. It shapes discovery. A poor representation can make a simple structure look complicated. A good representation can reveal invariants, feedback loops, constraints, symmetries, or dependencies that were previously hidden.
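As a minimal sketch of how one abstraction can take different formal shapes, the snippet below stores the same small graph as an adjacency list and as an adjacency matrix. The node names, the edges, and the helper `two_step_walks` are hypothetical and chosen only for illustration. The degree of each node is a property of the abstract graph, so both forms must agree on it, while each form makes a different question convenient to ask.

```python
# Hypothetical four-node network; names and edges are illustrative only.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]

# Representation 1: adjacency list (makes "who are my neighbors?" easy).
adj_list = {n: set() for n in nodes}
for u, v in edges:
    adj_list[u].add(v)
    adj_list[v].add(u)

# Representation 2: adjacency matrix (makes matrix-style reasoning easy).
index = {n: i for i, n in enumerate(nodes)}
adj_matrix = [[0] * len(nodes) for _ in nodes]
for u, v in edges:
    adj_matrix[index[u]][index[v]] = 1
    adj_matrix[index[v]][index[u]] = 1

# Degree belongs to the abstract graph, so both forms must give the same answer.
degree_from_list = {n: len(adj_list[n]) for n in nodes}
degree_from_matrix = {n: sum(adj_matrix[index[n]]) for n in nodes}


def two_step_walks(a, b):
    """Count walks of length two from a to b (an entry of the squared matrix)."""
    i, j = index[a], index[b]
    return sum(adj_matrix[i][k] * adj_matrix[k][j] for k in range(len(nodes)))


print(degree_from_list)
print(two_step_walks("A", "D"))
```

Neither form is more correct. The list form keeps neighborhood queries cheap and readable, while the matrix form exposes path counting as arithmetic.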
Modelers therefore need representational judgment. They must ask not only whether a representation is mathematically valid, but whether it preserves the relationships relevant to the question, whether it supports the needed analysis, and whether it can be interpreted responsibly by the people who will use it.
The Model-Target Relation
A mathematical model usually represents a target system: the real or imagined system the model is about. The target may be a physical object, population, ecosystem, economy, disease process, infrastructure network, material, institution, decision process, or data-generating mechanism. The model-target relation is the relationship between the formal representation and what it is meant to stand for.
This relation is not simple copying. A model does not need to resemble its target in every way. A map represents a city by ignoring most of what the city contains. A circuit diagram represents an electrical system without drawing every physical detail. A compartmental disease model may represent population-level transitions without representing individual lives. The representation is useful because selected relationships are preserved.
| Question | Why it matters | Example |
|---|---|---|
| What is the target system? | Defines what the model is about. | A watershed, market, bridge, population, network, or dataset. |
| Which features are represented? | Clarifies what the model can reason about. | Storage, flow, connectivity, transmission, cost, risk. |
| Which features are omitted? | Identifies limits of interpretation. | Spatial variation, individual heterogeneity, behavior, policy constraints. |
| What relationship connects model and target? | Explains why outputs might teach us about the world. | Similarity, mechanism, structural correspondence, calibration, analogy. |
| What would break the representation? | Identifies failure conditions. | Regime change, nonstationarity, missing feedback, invalid assumptions. |
The model-target relation should be documented. Without such documentation, model users may assume the representation is broader or more literal than it is. A model that represents average behavior may be mistaken for a model of individual outcomes. A model calibrated to one region may be applied to another. A model built for scenario exploration may be treated as a precise forecast.
Representational adequacy is therefore purpose-dependent. The question is not whether the model is “realistic” in every respect. The question is whether the representation is adequate for the intended use, given its assumptions, evidence, uncertainty, and consequences.
Selective Simplification, Idealization, and Distortion
Simplification is unavoidable in modeling. The problem is not simplification itself. The problem is uncontrolled simplification, hidden simplification, or simplification that removes the structure needed for the model’s purpose.
Idealization is a special kind of simplification. It knowingly introduces assumptions that are not literally true in order to make analysis possible. A frictionless plane, perfectly mixed population, representative agent, point mass, continuous fluid, or normally distributed error term may be useful even though it is not literal reality. Idealizations can clarify mechanisms, but they must be interpreted carefully.
Distortion occurs when simplification changes the model’s meaning or encourages conclusions that the representation cannot support. Aggregating people into one average may erase inequality. Treating a nonlinear system as linear may hide thresholds. Treating a dynamic process as static may hide delays and path dependence. Treating uncertain inputs as fixed may produce false precision.
| Modeling move | Useful when | Dangerous when |
|---|---|---|
| Simplification | It removes irrelevant detail and improves clarity. | It removes a mechanism central to the question. |
| Idealization | It isolates a relationship for analysis. | It is treated as literal reality. |
| Aggregation | Average behavior is the object of interest. | Distribution, inequality, or local variation matters. |
| Linearization | Behavior near an operating point is approximately linear. | Thresholds, saturation, or feedback dominate outcomes. |
| Parameterization | Unresolved processes can be represented by calibrated terms. | Parameters hide mechanisms that drive conclusions. |
| Scenario reduction | A small set of cases clarifies a range of possibilities. | Important futures are excluded from analysis. |
A strong model does not hide its simplifications. It states them, explains why they were made, tests their importance where possible, and revises them when evidence shows that they distort the system in ways that matter.
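The aggregation risk described above can be made concrete with a small sketch. All numbers below are invented for illustration, and `minimum_need` is an assumed threshold, not an official standard. The point is only that an aggregate mean can pass a test that one of the underlying groups fails.

```python
from statistics import mean

# Hypothetical daily water deliveries (liters per person) in two districts.
deliveries = {"district_1": [120, 130, 125], "district_2": [40, 45, 35]}
minimum_need = 50  # assumed per-person threshold, for illustration only

# Aggregated view: pool every observation into one average.
all_values = [v for values in deliveries.values() for v in values]
aggregate_mean = mean(all_values)
aggregate_ok = aggregate_mean >= minimum_need

# Disaggregated view: check each district against the same threshold.
district_means = {d: mean(values) for d, values in deliveries.items()}
below_need = [d for d, m in district_means.items() if m < minimum_need]

print(f"aggregate mean = {aggregate_mean:.1f}, adequate = {aggregate_ok}")
print(f"districts below need: {below_need}")
```

Reporting both views side by side is one simple way to test whether an aggregation matters for the question being asked.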
Features, Structure, and Invariance
Abstraction often works by distinguishing features from structure. A feature is an attribute of a system: a value, label, property, or local detail. Structure is the pattern of relationships among features. Mathematical modeling usually preserves structure more than surface appearance.
For example, two systems may look very different but share a common structure of accumulation and depletion. A bank account, reservoir, inventory, population, and carbon stock can all be represented as stock-flow systems. Their meanings differ, but the structural relation is similar: next state equals current state plus inflows minus outflows.
\[
X_{t+1}=X_t+\text{inflows}_t-\text{outflows}_t
\]
Interpretation: A stock-flow abstraction preserves the structure of accumulation across many different systems.
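A short sketch can show the shared structure at work. The functions `stock_flow_step` and `simulate` are hypothetical names, and the numbers are invented. The same update rule is applied to a bank balance and to an inventory, and only the interpretation of the quantities changes.

```python
def stock_flow_step(stock, inflow, outflow):
    """One step of X_{t+1} = X_t + inflows_t - outflows_t."""
    return stock + inflow - outflow


def simulate(initial, inflows, outflows):
    """Return the stock trajectory, including the initial value."""
    trajectory = [initial]
    for inflow, outflow in zip(inflows, outflows):
        trajectory.append(stock_flow_step(trajectory[-1], inflow, outflow))
    return trajectory


# Same structure, different meanings. All numbers are hypothetical.
bank_balance = simulate(1000, inflows=[200, 200, 200], outflows=[150, 400, 100])
inventory = simulate(50, inflows=[10, 0, 30], outflows=[20, 15, 5])

print(bank_balance)
print(inventory)
```

Nothing in the code knows whether it is tracking money or goods. That knowledge lives in the interpretation, which is why the interpretation has to be documented alongside the formula.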
Invariance is especially important. An invariant is something that remains unchanged under transformation. Conservation laws, ratios, symmetries, ranks, topological properties, and equilibrium conditions can all serve as modeling anchors. When a model preserves an invariant, it may support reasoning across different representations.
| Structural idea | Modeling role | Example |
|---|---|---|
| Conservation | Tracks what is preserved through change. | Mass, energy, charge, population balance. |
| Connectivity | Represents relationships among units. | Road networks, supply chains, contagion networks. |
| Feedback | Shows how outputs influence future inputs. | Predator-prey systems, control systems, policy resistance. |
| Equilibrium | Identifies stable or balanced states. | Market equilibrium, mechanical equilibrium, steady state. |
| Threshold | Marks a qualitative change in behavior. | Tipping points, failure limits, epidemic thresholds. |
| Scaling | Shows how behavior changes with size or resolution. | Dimensional analysis, power laws, nondimensional groups. |
Good abstraction is often the art of preserving the right invariants. A model may omit many details but remain useful because it preserves the relationship, constraint, or transformation that matters most for the question.
Major Forms of Mathematical Representation
Mathematical models use many forms of representation. These forms are not only technical choices. They are epistemic choices: they shape what can be known, measured, simulated, optimized, explained, or communicated.
| Representation family | What it emphasizes | Useful for | Typical limitation |
|---|---|---|---|
| Algebraic representation | Static relationships among quantities. | Equilibrium, proportionality, constraints, closed-form reasoning. | May hide time, uncertainty, and dynamics. |
| Dynamic representation | Change over time. | Growth, decay, feedback, accumulation, oscillation. | Requires assumptions about time scale and mechanisms. |
| Probabilistic representation | Uncertainty, variation, inference. | Risk, prediction, estimation, noise, measurement error. | Distributional assumptions may be hard to justify. |
| Network representation | Relationships among entities. | Connectivity, contagion, infrastructure, social systems. | May underrepresent internal node dynamics. |
| Spatial representation | Location, geometry, distance, movement. | Ecology, urban systems, climate, diffusion, land use. | Resolution and boundary choices can dominate results. |
| Optimization representation | Objectives and constraints. | Resource allocation, design, scheduling, policy trade-offs. | Objective functions can encode narrow values. |
| Agent-based representation | Local rules and heterogeneous actors. | Emergence, diffusion, adaptation, social behavior. | Calibration and validation can be difficult. |
| Simulation representation | Executable process over time or events. | Complex systems, nonlinear behavior, scenario exploration. | Outputs may be hard to explain without diagnostics. |
The best representation is not always the most complex one. It is the representation that best matches the purpose, evidence, scale, and interpretive needs of the model. A simple algebraic model may be better than a complex simulation if the goal is conceptual clarity. A detailed simulation may be necessary if the goal is engineering performance under nonlinear constraints. A probabilistic model may be essential when uncertainty is the central issue.
Representation choice should therefore be a documented modeling decision. It should not be hidden behind software convenience or disciplinary habit.
Scale, Boundaries, and Resolution
Abstraction and representation depend on scale. A model may represent atoms, cells, individuals, populations, organizations, cities, ecosystems, nations, or planetary systems. A representation that works at one scale may fail at another. Individual behavior may not aggregate cleanly to population behavior. Local stability may not imply system-wide resilience. Short-term accuracy may not imply long-term reliability.
Boundaries define what belongs inside the model. Resolution defines how finely the model distinguishes states, space, time, groups, or mechanisms. Scale, boundary, and resolution are linked. A model with a wide boundary often requires lower resolution. A model with high resolution often requires a narrower boundary or more data.
| Design choice | Question | Modeling risk |
|---|---|---|
| Spatial scale | What geographic or relational space is represented? | External effects may be missed or wrongly internalized. |
| Temporal scale | What time horizon and time step are used? | Delays, cycles, or long-term effects may disappear. |
| Population scale | Who or what is included? | Excluded groups may disappear from analysis. |
| Mechanism scale | Which processes are represented explicitly? | Key mechanisms may be hidden in parameters. |
| Data resolution | How detailed are observations? | Model resolution may exceed data quality. |
| Output resolution | How detailed are conclusions? | Outputs may imply precision that the model cannot support. |
Resolution can create false confidence. A high-resolution map, simulation, or dashboard may look authoritative even when underlying assumptions are uncertain. A model can be visually detailed and epistemically weak. Conversely, a coarse model may be useful if it is honest about its purpose and uncertainty.
Good modeling practice states the scale, boundary, and resolution explicitly. It also explains why those choices are appropriate for the intended use.
The Abstraction Ladder
Abstraction often occurs in layers. A modeler may begin with a concrete situation, identify relevant features, define variables, choose relationships, formulate equations, implement computation, and produce interpretable outputs. Each step moves upward on an abstraction ladder.
| Layer | Question | Artifact |
|---|---|---|
| World situation | What is happening? | Narrative description, observation, domain context. |
| Modeling question | What do we need to understand or decide? | Question statement, intended use. |
| Conceptual abstraction | What structure matters? | System diagram, boundary note, mechanism list. |
| Formal representation | How can the structure be expressed mathematically? | Variables, parameters, constraints, equations, graphs. |
| Computational representation | How can the model be executed or analyzed? | Code, solver, simulation, notebook, workflow. |
| Output representation | How are results summarized? | Tables, plots, intervals, scenarios, diagnostics. |
| Interpretive representation | What do outputs mean in the original context? | Conclusion, uncertainty statement, decision note. |
Errors can enter at any layer. A good equation cannot fix a poorly framed question. A good simulation cannot fix an invalid abstraction. A beautiful visualization cannot fix weak evidence. A precise output cannot fix unreported uncertainty. The abstraction ladder helps modelers see where a model’s credibility is built and where it may fail.
Moving down the ladder is just as important as moving up. Once results are produced, they must be translated back into the real-world context. This return movement is where interpretation, judgment, and responsibility enter.
Choosing a Representation
Choosing a representation is one of the most consequential decisions in modeling. The representation should be selected because it fits the question, not because it is familiar, fashionable, or convenient. The modeler should ask what structure the question requires and which representation makes that structure analyzable.
| If the question asks… | Consider representing… | Possible model family |
|---|---|---|
| How does a quantity change over time? | State variables and rates. | Differential equations, recurrence relations, system dynamics. |
| How are entities connected? | Nodes and edges. | Network model, graph model, flow network. |
| What is the best choice under limits? | Decision variables, objectives, constraints. | Optimization model. |
| How uncertain is the outcome? | Probability distributions and error structures. | Stochastic model, Bayesian model, Monte Carlo simulation. |
| How do local interactions produce system behavior? | Agents, rules, interaction spaces. | Agent-based model. |
| How does location matter? | Coordinates, regions, adjacency, movement. | Spatial model, PDE, geospatial model. |
| How do events unfold in sequence? | States, queues, transitions, event timing. | Discrete-event simulation. |
Representation choice is also shaped by available evidence. A highly detailed representation may be inappropriate if data cannot support it. A statistically rich model may be misleading if the variables are weak proxies. A mechanistic model may be unjustified if mechanisms are unknown. A coarse representation may be acceptable for screening but inadequate for final design.
A useful test is to ask: what would this representation make easy, and what would it make difficult? Every representation has affordances and blind spots. Good modeling practice names both.
Mathematical Lens: Abstraction as a Mapping
One way to formalize abstraction is to view it as a mapping from a real-world situation into a mathematical structure. Let \(W\) represent the world or target system, and let \(M\) represent the model. The abstraction function selects and transforms features of \(W\) into formal objects in \(M\).
\[
\alpha: W \rightarrow M
\]
Interpretation: The abstraction map \(\alpha\) translates selected features of the world \(W\) into a model \(M\).
The model can be represented as a structured tuple:
\[
M=(V,P,A,R,C,O)
\]
Interpretation: A model \(M\) can be described by variables \(V\), parameters \(P\), assumptions \(A\), relationships \(R\), constraints \(C\), and outputs \(O\).
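One way to make this tuple inspectable is to store it as a plain data structure. The sketch below is an assumption about how that might look, not a prescribed format: the class name `ModelSpec`, its field names, the method `undocumented_parameters`, and all values are hypothetical. It mirrors the six components and adds one small check for parameters that have no recorded source.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    """Hypothetical container mirroring M = (V, P, A, R, C, O)."""
    variables: dict      # V: name -> meaning
    parameters: dict     # P: name -> value
    assumptions: list    # A: plain-language statements
    relationships: list  # R: equations written as strings
    constraints: list    # C: limits on variables
    outputs: list        # O: reported quantities

    def undocumented_parameters(self, sources):
        """Return parameters that have no recorded source or rationale."""
        return [name for name in self.parameters if name not in sources]


# Illustrative values only; none of these come from a real system.
reservoir = ModelSpec(
    variables={"S": "stored resource in each period"},
    parameters={"K": 100.0, "loss_rate": 0.05},
    assumptions=["storage is well described by a single quantity"],
    relationships=["next storage = storage + inflow - demand - losses"],
    constraints=["0 <= S <= K"],
    outputs=["storage path", "shortage periods"],
)

missing = reservoir.undocumented_parameters(sources={"K": "design document"})
print(missing)
```

Keeping assumptions next to variables and parameters makes it harder for them to drift out of sight once the model becomes code.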
The representation is adequate when the model preserves the structure needed for the purpose. If \(\phi(W)\) is the target relationship in the world and \(\psi(M)\) is the corresponding relationship in the model, then adequacy depends on whether the model preserves enough of the relevant relationship for the intended use.
\[
\psi(M) \approx \phi(W) \quad \text{for purpose } U
\]
Interpretation: A model representation is adequate only relative to a purpose \(U\), not as a complete copy of the world.
This formal lens helps clarify why modeling is not just equation writing. The model’s credibility depends on the abstraction map, the formal structure, the interpretation map back to the world, and the evidence supporting the model-target relation.
\[
W \xrightarrow{\alpha} M \xrightarrow{\text{analysis}} O \xrightarrow{\iota} I
\]
Interpretation: Modeling moves from world \(W\) to model \(M\) through the abstraction map \(\alpha\), from model analysis to outputs \(O\), and from outputs to interpretation \(I\) through an interpretation map \(\iota\).
The final interpretation is not automatic. It requires judgment about assumptions, uncertainty, validation, scale, boundary, and consequences of use.
Example: Representing a Resource System
Consider a resource system such as a reservoir. The real-world system includes rainfall, inflow, evaporation, storage, water quality, demand, infrastructure, regulation, ecological requirements, weather variability, seasonal use, political choices, and social consequences. A model cannot represent all of this in its first form.
A basic abstraction might preserve only storage, inflow, demand, losses, and capacity. The state variable is storage:
\[
S_t = \text{stored resource at time } t
\]
Interpretation: The state variable \(S_t\) abstracts a complex physical system into a quantity that can be tracked over time.
The stock-flow representation is:
\[
S_{t+1}=S_t+I_t-D_t-L_t
\]
Interpretation: Storage in the next period equals current storage plus inflow, minus demand and losses.
A bounded representation adds capacity and nonnegativity:
\[
S_{t+1}=\min\left(K,\max\left(0,S_t+I_t-D_t-L_t\right)\right)
\]
Interpretation: The model represents storage as constrained between zero and maximum capacity \(K\).
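The bounded update can be sketched in a few lines. The function names and all input values below are hypothetical, in arbitrary volume units. The sketch also records the periods in which a bound was binding, because the clamp itself silently discards how far the unconstrained balance fell below zero or rose above capacity.

```python
def bounded_step(storage, inflow, demand, loss, capacity):
    """S_{t+1} = min(K, max(0, S_t + I_t - D_t - L_t))."""
    unconstrained = storage + inflow - demand - loss
    return min(capacity, max(0.0, unconstrained))


def simulate_reservoir(initial, inflows, demands, losses, capacity):
    """Return the storage path and the periods where each bound was binding."""
    path, empty_periods, full_periods = [initial], [], []
    for t, (i, d, l) in enumerate(zip(inflows, demands, losses)):
        unconstrained = path[-1] + i - d - l
        if unconstrained < 0:
            empty_periods.append(t)   # demand and losses exceeded what was available
        elif unconstrained > capacity:
            full_periods.append(t)    # excess inflow could not be stored
        path.append(bounded_step(path[-1], i, d, l, capacity))
    return path, empty_periods, full_periods


# Hypothetical inputs, chosen so that both bounds bind at least once.
path, empty_periods, full_periods = simulate_reservoir(
    initial=50.0,
    inflows=[80.0, 10.0, 0.0, 5.0],
    demands=[20.0, 30.0, 40.0, 40.0],
    losses=[5.0, 5.0, 5.0, 5.0],
    capacity=100.0,
)

print(path)
print("empty in periods:", empty_periods, "full in periods:", full_periods)
```

Tracking when the bounds bind is a small design choice, but it keeps shortage and spill visible instead of letting the `min` and `max` absorb them.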
This representation is useful for understanding accumulation, depletion, shortage risk, and capacity limits. But it hides many things. It may not represent water quality, legal rights, ecological flows, behavioral adaptation, distributional consequences, infrastructure failure, or climate-conditioned inflow variability.
| What the representation preserves | What it hides | Possible extension |
|---|---|---|
| Storage accumulation. | Spatial variation inside the reservoir. | Spatial or hydrodynamic model. |
| Inflow and outflow balance. | Stochastic hydrology. | Probabilistic inflow ensemble. |
| Capacity constraint. | Operating rules and legal allocation. | Policy constraint layer. |
| Demand pressure. | Demand response and conservation behavior. | Behavioral or econometric demand model. |
| Shortage periods. | Distribution of shortage across users. | Equity and allocation model. |
| Simple loss term. | Temperature, surface area, leakage, seasonality. | Process-based loss model. |
The model is not wrong because it leaves things out. It becomes wrong if users forget what it leaves out, or if they use it for a purpose that requires the hidden features.
Computation, Simulation, and Representation
Computation is itself a form of representation. When a model becomes code, equations become functions, states become data structures, parameters become configuration values, and assumptions become executable logic. The computational representation can clarify a model by making every operation explicit, but it can also hide assumptions inside software defaults, solver settings, data pipelines, or undocumented code.
A computational representation includes more than the mathematical formula. It includes:
- data structures for variables, parameters, and outputs;
- algorithms for updating state or solving equations;
- numerical methods and solver tolerances;
- random seeds and sampling methods;
- input files and scenario definitions;
- validation and diagnostic outputs;
- visualization and reporting choices.
This matters because two computational models can implement the same formal equation differently. A continuous-time model approximated by Euler integration may produce different results from the same model solved with a higher-order method. A stochastic simulation may produce different outputs depending on random seed and sample size. A spatial model may depend on grid resolution. A network model may depend on how edges are defined.
| Computational choice | Representational effect | Review question |
|---|---|---|
| Time step | Defines temporal resolution. | Do results change when the time step changes? |
| Solver method | Defines numerical approximation. | Has numerical error been checked? |
| Data structure | Defines what quantities are tracked. | Are important variables omitted? |
| Scenario file | Defines external assumptions. | Are scenarios documented and plausible? |
| Random seed | Controls stochastic reproducibility. | Are ensemble results reported rather than one run? |
| Plot or dashboard | Shapes interpretation of outputs. | Does the visualization show uncertainty and limits? |
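
The time-step question in the table can be checked directly. The sketch below uses exponential decay, \(\frac{dx}{dt}=-kx\), as an assumed test problem because its exact solution is known. The function name `euler_decay` and the values of `x0`, `k`, and `t_end` are hypothetical. It integrates the same equation with forward Euler at several step sizes and compares each result with the exact value.

```python
import math


def euler_decay(x0, k, t_end, dt):
    """Forward Euler for dx/dt = -k * x, integrated from 0 to t_end."""
    steps = round(t_end / dt)
    x = x0
    for _ in range(steps):
        x += dt * (-k * x)
    return x


x0, k, t_end = 100.0, 0.5, 4.0          # hypothetical values
exact = x0 * math.exp(-k * t_end)       # closed-form solution at t_end

step_sizes = (1.0, 0.5, 0.1, 0.01)
errors = []
for dt in step_sizes:
    approx = euler_decay(x0, k, t_end, dt)
    errors.append(abs(approx - exact))
    print(f"dt={dt:<5} euler={approx:8.4f} abs error={errors[-1]:.4f}")
```

In this example the error shrinks as the step shrinks, which is the expected behavior for a stable first-order method. When no exact solution is available, the same comparison can be run between successive step sizes to see whether results have stopped changing.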
Professional modeling workflows should treat code as a representational artifact that requires review, testing, documentation, and version control. The computational model should make the abstraction traceable rather than hiding it.
Validation, Uncertainty, and Representational Adequacy
Validation asks whether a model representation is adequate for a specific purpose. It does not ask whether the model is a complete copy of reality. No model is. Instead, validation asks whether the abstraction and representation preserve enough relevant structure for the intended use.
Uncertainty enters at every representational layer. The target system may be poorly understood. Measurements may be noisy. Parameters may be uncertain. The selected model form may omit important mechanisms. Numerical implementation may introduce approximation error. Future scenarios may be unknown. These uncertainties affect what the representation can support.
| Uncertainty source | Representational issue | Possible response |
|---|---|---|
| Measurement uncertainty | Variables may not measure what they claim. | Data-quality review, measurement-error model. |
| Parameter uncertainty | Model behavior depends on uncertain values. | Sensitivity analysis, calibration intervals, Bayesian inference. |
| Structural uncertainty | The chosen model form may be incomplete. | Model comparison, ensemble modeling, expert review. |
| Boundary uncertainty | Relevant processes may be outside the model. | Boundary critique, expanded scenarios, system mapping. |
| Numerical uncertainty | Computation approximates formal structure. | Solver comparison, convergence testing, tolerance checks. |
| Interpretive uncertainty | Outputs may not map cleanly back to the real question. | Model card, use limitations, decision-context review. |
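
As one illustration of responding to uncertainty in inputs, the sketch below runs a bounded stock-flow model many times with random inflows and reports the spread of outcomes instead of a single run. Everything here is an assumption made for illustration: the function name `shortage_count`, the uniform inflow distribution, and every numeric value. A real study would replace the inflow draw with an evidence-based hydrological model.

```python
import random
from statistics import mean


def shortage_count(seed, periods=120, capacity=100.0, demand=12.0):
    """Count periods where the unconstrained balance falls below zero.

    Inflows are uniform on [0, 24], an assumed stand-in for real hydrology
    whose mean happens to equal the assumed demand.
    """
    rng = random.Random(seed)            # private generator, reproducible per seed
    storage, shortages = 50.0, 0
    for _ in range(periods):
        inflow = rng.uniform(0.0, 24.0)
        balance = storage + inflow - demand
        if balance < 0:
            shortages += 1
        storage = min(capacity, max(0.0, balance))
    return shortages


single_run = shortage_count(seed=1)
ensemble = [shortage_count(seed=s) for s in range(200)]

print(f"single run: {single_run} shortage periods")
print(f"ensemble: min={min(ensemble)}, mean={mean(ensemble):.1f}, max={max(ensemble)}")
```

A single seed gives one possible history. Reporting the minimum, mean, and maximum across the ensemble shows how much of the result depends on the particular random draw.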
A representation may be adequate for screening but not for final design. It may be adequate for explanation but not prediction. It may be adequate for aggregate planning but not equity analysis. It may be adequate for classroom demonstration but not policy deployment. Representational adequacy is always tied to use.
A responsible model therefore includes an adequacy statement: what it represents, what it omits, what evidence supports it, what uncertainty remains, and what uses it should not be put to.
Ethics of Abstraction and Representation
Abstraction has ethical consequences. When a model selects features of the world, it also decides what becomes visible and what disappears. A model that represents cost but not harm may favor efficiency over justice. A model that represents average outcomes but not distribution may hide inequality. A model that represents risk without uncertainty may produce false confidence. A model that represents people as interchangeable units may ignore agency, dignity, and context.
This does not mean mathematical representation is inherently harmful. It means representation is a responsibility. Modelers should be explicit about boundaries, proxies, assumptions, excluded values, uncertainty, and affected groups. They should ask who benefits from the abstraction, who is made invisible, and who has the ability to challenge the representation.
| Representational choice | Ethical risk | Responsible practice |
|---|---|---|
| Proxy variable | Proxy may not represent the intended concept. | Validate proxy and document limitations. |
| Aggregation | Subgroup harms may disappear. | Report distributional and subgroup outputs. |
| Objective function | Important values may be omitted. | Expose trade-offs and include stakeholder review. |
| Risk score | Uncertainty may be hidden behind a single number. | Report uncertainty, error rates, and appeal mechanisms. |
| Boundary choice | Externalities may be excluded. | State boundary assumptions and test alternatives. |
| Visualization | Charts may imply certainty or neutrality. | Show uncertainty and explain model limits. |
The ethical question is not only whether the mathematics is correct. It is whether the representation is a responsible one in the context where it will be used. Models can clarify reality, but they can also authorize narrow ways of seeing. The modeling process should make this visible.
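The aggregation risk in the table above can be shown with a few lines of arithmetic. The sketch below reports a pooled mean next to subgroup means; the function name, the group labels, and the served-demand fractions are invented for illustration.

```python
from statistics import mean


def subgroup_report(outcomes: dict[str, list[float]]) -> dict[str, float]:
    """Return the pooled mean, each subgroup mean, and the largest gap between subgroup means."""
    if not outcomes or any(not values for values in outcomes.values()):
        raise ValueError("Every subgroup needs at least one observation.")
    group_means = {group: mean(values) for group, values in outcomes.items()}
    pooled = mean([value for values in outcomes.values() for value in values])
    report = {"pooled_mean": pooled}
    for group, group_mean in group_means.items():
        report[f"mean_{group}"] = group_mean
    report["largest_gap"] = max(group_means.values()) - min(group_means.values())
    return report


if __name__ == "__main__":
    # Hypothetical fraction of demand served, by user group.
    served = {
        "group_a": [0.95, 0.97, 0.96],
        "group_b": [0.60, 0.62, 0.61],
    }
    for key, value in subgroup_report(served).items():
        print(f"{key}: {value:.3f}")
```

In this invented example the pooled mean looks moderate while one group is served far less than the other. Reporting the gap alongside the average keeps that difference visible to reviewers.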
Python Workflow: Representation Audit and Scenario Model
The Python workflow below treats abstraction and representation as auditable modeling decisions. It defines a simple stock-flow system, records what is represented and omitted, simulates scenarios, and exports a model representation card.
# abstraction_representation_workflow.py
# Dependency-light workflow for auditing abstraction and representation choices.
from __future__ import annotations

import csv
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from statistics import mean

# Outputs are written relative to the folder one level above this script.
ARTICLE_ROOT = Path(__file__).resolve().parents[1]
OUTPUTS = ARTICLE_ROOT / "outputs"
TABLES = OUTPUTS / "tables"
JSON_DIR = OUTPUTS / "json"


@dataclass(frozen=True)
class RepresentationChoice:
    target_feature: str
    abstraction: str
    formal_representation: str
    preserved_structure: str
    omitted_detail: str
    review_question: str


@dataclass(frozen=True)
class StockFlowScenario:
    name: str
    initial_stock: float
    capacity: float
    inflow: float
    demand: float
    loss_rate: float
    periods: int


def bounded_update(stock: float, inflow: float, demand: float, losses: float, capacity: float) -> float:
    # Keep the stock between zero and capacity after applying the balance.
    return min(capacity, max(0.0, stock + inflow - demand - losses))


def simulate_stock_flow(scenario: StockFlowScenario) -> list[dict[str, float | str | int]]:
    if scenario.initial_stock < 0:
        raise ValueError("initial_stock must be nonnegative.")
    if scenario.capacity <= 0:
        raise ValueError("capacity must be positive.")
    if scenario.initial_stock > scenario.capacity:
        raise ValueError("initial_stock cannot exceed capacity.")

    stock = scenario.initial_stock
    rows: list[dict[str, float | str | int]] = []

    for period in range(scenario.periods + 1):
        losses = scenario.loss_rate * stock
        # Shortage is the part of demand plus losses that the stock and inflow cannot cover.
        shortage = max(0.0, scenario.demand + losses - (stock + scenario.inflow))
        rows.append({
            "scenario": scenario.name,
            "period": period,
            "stock": round(stock, 6),
            "inflow": round(scenario.inflow, 6),
            "demand": round(scenario.demand, 6),
            "losses": round(losses, 6),
            "shortage": round(shortage, 6),
            "capacity": round(scenario.capacity, 6),
        })
        stock = bounded_update(stock, scenario.inflow, scenario.demand, losses, scenario.capacity)

    return rows


def summarize(rows: list[dict[str, float | str | int]]) -> dict[str, float | str | int]:
    stocks = [float(row["stock"]) for row in rows]
    shortages = [float(row["shortage"]) for row in rows]
    return {
        "scenario": str(rows[0]["scenario"]),
        "final_stock": round(stocks[-1], 6),
        "mean_stock": round(mean(stocks), 6),
        "min_stock": round(min(stocks), 6),
        "max_stock": round(max(stocks), 6),
        "shortage_periods": sum(1 for value in shortages if value > 0),
        "total_shortage": round(sum(shortages), 6),
    }


def representation_audit() -> list[RepresentationChoice]:
    return [
        RepresentationChoice(
            target_feature="Stored resource",
            abstraction="Aggregate stock",
            formal_representation="S_t",
            preserved_structure="Accumulation and depletion over time",
            omitted_detail="Spatial distribution, quality, ownership, and local access",
            review_question="Does aggregate storage answer the intended question?",
        ),
        RepresentationChoice(
            target_feature="Resource additions",
            abstraction="External inflow",
            formal_representation="I_t",
            preserved_structure="Input to the stock-flow balance",
            omitted_detail="Seasonality, stochastic hydrology, upstream governance",
            review_question="Should inflow be stochastic or scenario-based?",
        ),
        RepresentationChoice(
            target_feature="Resource use",
            abstraction="Demand term",
            formal_representation="D_t",
            preserved_structure="Outflow due to use",
            omitted_detail="Heterogeneous users, conservation behavior, price response",
            review_question="Does demand need subgroup or behavioral structure?",
        ),
        RepresentationChoice(
            target_feature="Physical limit",
            abstraction="Capacity constraint",
            formal_representation="0 <= S_t <= K",
            preserved_structure="Upper and lower feasibility limits",
            omitted_detail="Operating rules, emergency reserves, safety margins",
            review_question="Is physical capacity the same as usable capacity?",
        ),
    ]


def write_csv(path: Path, rows: list[dict[str, object]]) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    if not rows:
        raise ValueError(f"No rows supplied for {path}")
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def write_json(path: Path, payload: object) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2, sort_keys=True)


def main() -> None:
    # Positional arguments: name, initial_stock, capacity, inflow, demand, loss_rate, periods.
    scenarios = [
        StockFlowScenario("aggregate_baseline", 80.0, 100.0, 8.0, 6.0, 0.015, 60),
        StockFlowScenario("low_inflow", 80.0, 100.0, 5.0, 6.0, 0.015, 60),
        StockFlowScenario("higher_losses", 80.0, 100.0, 8.0, 6.0, 0.035, 60),
        StockFlowScenario("lower_capacity", 70.0, 75.0, 8.0, 6.0, 0.015, 60),
    ]

    all_rows: list[dict[str, object]] = []
    summary_rows: list[dict[str, object]] = []

    for scenario in scenarios:
        rows = simulate_stock_flow(scenario)
        all_rows.extend(rows)
        summary_rows.append(summarize(rows))

    audit_rows = [asdict(item) for item in representation_audit()]

    write_csv(TABLES / "stock_flow_timeseries.csv", all_rows)
    write_csv(TABLES / "stock_flow_summary.csv", summary_rows)
    write_csv(TABLES / "representation_audit.csv", audit_rows)

    write_json(JSON_DIR / "representation_card.json", {
        "article": "Abstraction and Representation in Mathematical Models",
        "formal_model": "S[t+1] = min(K, max(0, S[t] + I[t] - D[t] - L[t]))",
        "abstraction_map": "World resource system -> aggregate stock-flow representation",
        "represented_features": [row["target_feature"] for row in audit_rows],
        "review_questions": [row["review_question"] for row in audit_rows],
        "known_limits": [row["omitted_detail"] for row in audit_rows],
    })

    print("Abstraction and representation workflow complete.")
    print(f"Wrote outputs to {OUTPUTS}")


if __name__ == "__main__":
    main()
This workflow does not treat the model as a neutral formula. It makes representation choices explicit, stores omitted detail, and turns abstraction into a reviewable artifact.
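The bounded update in the workflow is itself a representation choice: clamping the stock to the range from zero to capacity discards whatever falls outside it. The sketch below is a variant of `bounded_update` that also returns what the clamp removes. The name `bounded_step` and the `spill` quantity are assumptions added for illustration; the shortage term follows the same expression the workflow uses.

```python
def bounded_step(stock: float, inflow: float, demand: float,
                 loss_rate: float, capacity: float) -> tuple[float, float, float]:
    """Return (next_stock, spill, shortage) for one period of the bounded balance."""
    losses = loss_rate * stock
    unbounded = stock + inflow - demand - losses
    next_stock = min(capacity, max(0.0, unbounded))
    spill = max(0.0, unbounded - capacity)   # hypothetical: inflow the capacity bound cuts off
    shortage = max(0.0, -unbounded)          # demand plus losses that could not be met
    return next_stock, spill, shortage


if __name__ == "__main__":
    cases = {
        "interior": (80.0, 8.0, 6.0, 0.015, 100.0),
        "hits capacity": (99.0, 8.0, 6.0, 0.0, 100.0),
        "hits zero": (1.0, 0.0, 6.0, 0.0, 100.0),
    }
    for label, inputs in cases.items():
        next_stock, spill, shortage = bounded_step(*inputs)
        print(f"{label}: next_stock={next_stock:.3f}, spill={spill:.3f}, shortage={shortage:.3f}")
```

Recording spill and shortage separately keeps the information that `min` and `max` would otherwise hide, which matters when the question is about unmet demand or wasted inflow and not only about the stock level.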
R Workflow: Comparing Abstraction Levels
The R workflow below compares scenario outputs generated by the Python workflow. It supports the article’s central point: different abstraction choices can lead to different interpretations, and outputs should be reviewed alongside the representation that produced them.
# abstraction_representation_review.R
# Base R workflow for reviewing abstraction levels and scenario summaries.

# Locate the article root from the script path when run with Rscript;
# otherwise fall back to the current working directory.
args <- commandArgs(trailingOnly = FALSE)
file_arg <- grep("^--file=", args, value = TRUE)

if (length(file_arg) > 0) {
  script_path <- normalizePath(sub("^--file=", "", file_arg[1]), mustWork = TRUE)
  article_root <- normalizePath(file.path(dirname(script_path), ".."), mustWork = TRUE)
} else {
  article_root <- getwd()
}

tables_dir <- file.path(article_root, "outputs", "tables")
figures_dir <- file.path(article_root, "outputs", "figures")
dir.create(tables_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(figures_dir, recursive = TRUE, showWarnings = FALSE)

summary_path <- file.path(tables_dir, "stock_flow_summary.csv")
audit_path <- file.path(tables_dir, "representation_audit.csv")

if (!file.exists(summary_path)) {
  stop("Missing stock_flow_summary.csv. Run the Python workflow first.")
}

summary_data <- read.csv(summary_path, stringsAsFactors = FALSE)

# Flag scenarios with any shortage period for representational review.
summary_data$shortage_flag <- ifelse(
  summary_data$shortage_periods > 0,
  "shortage observed",
  "no shortage"
)
summary_data$representation_review <- ifelse(
  summary_data$shortage_periods > 0,
  "review abstraction and scenario assumptions",
  "acceptable under stated abstraction"
)

write.csv(
  summary_data,
  file.path(tables_dir, "r_abstraction_level_review.csv"),
  row.names = FALSE
)

# The audit table is optional; build a review queue only when it exists.
if (file.exists(audit_path)) {
  audit_data <- read.csv(audit_path, stringsAsFactors = FALSE)
  audit_review <- data.frame(
    target_feature = audit_data$target_feature,
    represented_as = audit_data$formal_representation,
    review_question = audit_data$review_question,
    omitted_detail = audit_data$omitted_detail
  )
  write.csv(
    audit_review,
    file.path(tables_dir, "r_representation_review_queue.csv"),
    row.names = FALSE
  )
}

png(file.path(figures_dir, "r_final_stock_by_representation_scenario.png"), width = 1100, height = 720)
barplot(
  height = summary_data$final_stock,
  names.arg = summary_data$scenario,
  las = 2,
  ylab = "Final stock",
  main = "Final Stock by Scenario Under Aggregate Stock-Flow Representation"
)
grid()
dev.off()

print(summary_data)
The R layer is useful for review and communication. It can be extended with uncertainty intervals, subgroup outputs, residual diagnostics, and side-by-side comparisons between aggregate, disaggregated, stochastic, and network representations.
Haskell Workflow: Typed Model Representations
A typed language such as Haskell suits this topic because abstraction and representation depend on distinctions that should not be flattened. A variable is not a parameter. An assumption is not evidence. A proxy is not the same as the target concept. A simplified representation is not a validated representation. Algebraic data types make these distinctions explicit and let the compiler enforce them.
{-# OPTIONS_GHC -Wall #-}
module Main where

-- The formal shape a model object is given.
data RepresentationForm
  = Equation
  | Inequality
  | Graph
  | ProbabilityModel
  | OptimizationModel
  | Simulation
  | Diagram
  | DataTable
  deriving (Eq, Show)

data AbstractionStatus
  = PreservesRelevantStructure
  | RequiresReview
  | KnownIdealization
  | PotentialDistortion
  deriving (Eq, Show)

-- The role a named object plays inside the model.
data ModelObject
  = StateVariable String
  | Parameter String
  | Assumption String
  | Constraint String
  | ProxyVariable String
  | OutputMetric String
  deriving (Eq, Show)

data RepresentationRecord = RepresentationRecord
  { modelObject :: ModelObject
  , representationForm :: RepresentationForm
  , meaning :: String
  , preservedStructure :: String
  , omittedDetail :: String
  , status :: AbstractionStatus
  , reviewQuestion :: String
  } deriving (Eq, Show)

records :: [RepresentationRecord]
records =
  [ RepresentationRecord
      (StateVariable "S_t")
      Equation
      "Aggregate storage at time t."
      "Accumulation and depletion."
      "Spatial distribution, quality, ownership, and access."
      PreservesRelevantStructure
      "Is aggregate storage sufficient for the intended use?"
  , RepresentationRecord
      (Parameter "K")
      -- K enters the model through the bound 0 <= S_t <= K, so its form is an
      -- inequality. Constraint is a ModelObject constructor and cannot be used here.
      Inequality
      "Maximum storage capacity."
      "Upper feasibility bound."
      "Operating rules, safety reserves, and infrastructure condition."
      RequiresReview
      "Is physical capacity equivalent to usable capacity?"
  , RepresentationRecord
      (Assumption "well-mixed system")
      Simulation
      "The system is treated as internally homogeneous."
      "Aggregate dynamic behavior."
      "Local heterogeneity and spatial variation."
      KnownIdealization
      "Does heterogeneity affect conclusions?"
  , RepresentationRecord
      (ProxyVariable "shortage risk")
      DataTable
      "Periods with shortage are used as a risk indicator."
      "Failure frequency under scenarios."
      "Severity distribution, affected users, and recovery time."
      PotentialDistortion
      "Does this proxy represent the risk stakeholders care about?"
  ]

-- Every status other than PreservesRelevantStructure is queued for review.
needsReview :: RepresentationRecord -> Bool
needsReview record =
  case status record of
    PreservesRelevantStructure -> False
    _ -> True

main :: IO ()
main = do
  putStrLn "Typed abstraction and representation records:"
  mapM_ print records
  putStrLn "\nRecords requiring representational review:"
  mapM_ print (filter needsReview records)
This typed approach is useful in professional modeling repositories because it encodes the difference between model objects, representation forms, abstraction status, and review questions. It helps prevent a repository from treating every artifact as merely a file or every quantity as merely a number.
GitHub Repository
The companion repository for this article is designed as a reproducible mathematical-modeling workspace. It contains article-specific code, data, documentation, notebooks, schemas, and generated outputs for abstraction audits, representation reviews, stock-flow modeling, scenario comparison, omitted-detail registers, typed Haskell representation records, and reproducible engineering/statistical workflows.
A Practical Method for Abstraction and Representation
A practical abstraction and representation method begins with the model’s purpose and proceeds through reviewable choices. The goal is not to find the most impressive formalism. The goal is to preserve the structure needed for the question while making simplifications visible.
| Step | Task | Practical question | Artifact |
|---|---|---|---|
| 1 | Define the target system | What is the model about? | Target-system statement. |
| 2 | State the purpose | What should the model explain, predict, simulate, optimize, or support? | Purpose statement. |
| 3 | Identify relevant structure | What relationships must be preserved? | Structure list or system diagram. |
| 4 | Select abstractions | What features become variables, parameters, constraints, or outputs? | Variable and parameter table. |
| 5 | Record omissions | What is left out, aggregated, idealized, or parameterized? | Omitted-detail register. |
| 6 | Choose representation form | Should the model be an equation, graph, probability model, simulation, or optimization problem? | Formal representation. |
| 7 | Check scale and boundary | Do the model’s scale and boundary fit the intended use? | Boundary and resolution note. |
| 8 | Test adequacy | Does the representation preserve enough structure for the purpose? | Validation and uncertainty plan. |
| 9 | Revise | What must be changed after evidence, diagnostics, or review? | Revision log. |
This method is especially useful when multiple representations are possible. It prevents disciplinary habit or software convenience from determining the model before the modeling question has been understood.
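The nine steps above can be tracked as a simple checklist so that a review shows which artifacts are still missing. The sketch below is a hypothetical helper: the artifact names are copied from the table, while the function name and the example project state are assumptions.

```python
# Artifact names follow the method table; everything else is illustrative.
METHOD_ARTIFACTS = [
    "Target-system statement",
    "Purpose statement",
    "Structure list or system diagram",
    "Variable and parameter table",
    "Omitted-detail register",
    "Formal representation",
    "Boundary and resolution note",
    "Validation and uncertainty plan",
    "Revision log",
]


def outstanding_steps(completed: set[str]) -> list[tuple[int, str]]:
    """Return (step number, artifact) pairs for artifacts not yet produced."""
    unknown = completed - set(METHOD_ARTIFACTS)
    if unknown:
        raise ValueError(f"Unrecognized artifacts: {sorted(unknown)}")
    return [
        (step, artifact)
        for step, artifact in enumerate(METHOD_ARTIFACTS, start=1)
        if artifact not in completed
    ]


if __name__ == "__main__":
    # Hypothetical project state: a formalism was chosen before the purpose was written down.
    done = {"Target-system statement", "Formal representation"}
    for step, artifact in outstanding_steps(done):
        print(f"Step {step}: missing {artifact}")
```

A project that has a formal representation but no purpose statement or omitted-detail register is a typical sign that the formalism was chosen first and the question second.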
Common Pitfalls
Abstraction and representation fail in predictable ways. The most common failures involve treating a selective representation as if it were complete.
- Confusing model and reality: treating the representation as a miniature copy of the world rather than a selective formal structure.
- Over-aggregation: hiding subgroup variation, spatial heterogeneity, distributional effects, or local failure modes.
- Representation by habit: using a familiar equation, graph, or simulation form without asking whether it fits the question.
- Hidden idealization: relying on assumptions such as linearity, homogeneity, equilibrium, or independence without documenting them.
- Proxy confusion: treating a measurable proxy as if it were the concept of interest.
- False precision: representing uncertainty with a point estimate because it is easier to communicate.
- Visual overconfidence: using detailed maps, dashboards, or simulations that look more certain than the model is.
- Scale mismatch: applying conclusions at a scale different from the one the model represents.
- Purpose drift: using a model built for exploration as if it were validated for operational decision-making.
- Unreviewed omissions: failing to ask what the model leaves out and whether those omissions matter.
These pitfalls are not reasons to avoid abstraction. They are reasons to make abstraction explicit. A model that states what it preserves and what it omits is more trustworthy than a model that hides its simplifications behind technical detail.
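The false-precision pitfall has a simple partial remedy: report the spread of outcomes across scenarios and not only a single number. The sketch below is a minimal illustration; the function name and the scenario values are invented, and a range across a handful of scenarios is not a statistical confidence interval.

```python
from statistics import median


def summarize_range(values: list[float]) -> dict[str, float]:
    """Summarize scenario outputs as a low, central, and high value."""
    if not values:
        raise ValueError("At least one scenario output is required.")
    return {"low": min(values), "central": median(values), "high": max(values)}


if __name__ == "__main__":
    # Hypothetical final-stock outputs from four scenarios.
    final_stocks = [100.0, 48.0, 60.0, 75.0]
    summary = summarize_range(final_stocks)
    print(
        f"Final stock: {summary['central']:.1f} "
        f"(range {summary['low']:.1f} to {summary['high']:.1f} across scenarios)"
    )
```

Stating the range next to the central value tells the reader how much the answer depends on scenario assumptions, which a point estimate alone cannot do.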
Conclusion: Models Reveal by Selecting
Abstraction and representation are not secondary details in mathematical modeling. They are the heart of the practice. A model becomes useful because it selects structure from the world and gives that structure a formal shape. It reveals by simplifying. It explains by leaving out. It supports reasoning because it is not the world in full.
This selective power is both the strength and the danger of modeling. A model can clarify a system by preserving the relationships that matter. It can also mislead by omitting relationships that matter, hiding assumptions, aggregating away variation, or presenting uncertain results as precise conclusions.
The discipline of mathematical modeling therefore requires representational awareness. Modelers must ask what the model represents, what it omits, what formal structure it uses, why that structure is appropriate, how it connects back to the target system, and what evidence would show that the representation is adequate or inadequate.
Good models do not pretend to be reality. They show how reality has been abstracted. They make their representation available for inspection, revision, and responsible use. That is what allows mathematical models to clarify complexity without confusing the model for the world it represents.
Related Articles
- What Is Mathematical Modeling?
- The Modeling Process: From World to Formal Representation
- Assumptions, Simplification, and Model Design
- Model Boundaries, Scale, and Scope
- Model Purpose: Explanation, Prediction, Control, and Decision Support
- Variables, Parameters, and Constraints
- Functional Relationships and Mathematical Structure
- Validation and Model Assessment
- Uncertainty in Mathematical Models
- Limits, Failure, and the Ethics of Modeling
Further Reading
- Frigg, R. and Hartmann, S. (2020) ‘Models in Science’, The Stanford Encyclopedia of Philosophy. Available at: https://plato.stanford.edu/entries/models-science/
- Frigg, R. (2020) ‘Scientific Representation’, The Stanford Encyclopedia of Philosophy. Available at: https://plato.stanford.edu/entries/scientific-representation/
- Garfunkel, S. and Montgomery, M. (eds.) (2019) GAIMME: Guidelines for Assessment and Instruction in Mathematical Modeling Education. 2nd edn. Philadelphia: Society for Industrial and Applied Mathematics. Available at: https://www.siam.org/publications/reports/guidelines-for-assessment-and-instruction-in-mathematical-modeling-education/
- COMAP (n.d.) Mathematical Modeling Handbook. Consortium for Mathematics and Its Applications. Available at: https://www.comap.com/membership/member-resources/item/mathematical-modeling-handbook
- Morgan, M.S. and Morrison, M. (eds.) (1999) Models as Mediators: Perspectives on Natural and Social Science. Cambridge: Cambridge University Press. Available at: https://www.cambridge.org/core/books/models-as-mediators/FBB3EA4AECAF824AD6F1E6C650CAE3AE
- Weisberg, M. (2013) Simulation and Similarity: Using Models to Understand the World. Oxford: Oxford University Press. Available at: https://global.oup.com/academic/product/simulation-and-similarity-9780199933662
- Giere, R.N. (2006) Scientific Perspectivism. Chicago: University of Chicago Press. Available at: https://press.uchicago.edu/ucp/books/book/chicago/S/bo4094708.html
- Morrison, M. (2015) Reconstructing Reality: Models, Mathematics, and Simulations. Oxford: Oxford University Press. Available at: https://global.oup.com/academic/product/reconstructing-reality-9780199380275
- Winsberg, E. (2010) Science in the Age of Computer Simulation. Chicago: University of Chicago Press. Available at: https://press.uchicago.edu/ucp/books/book/chicago/S/bo9003670.html
- National Research Council (2012) Assessing the Reliability of Complex Models: Mathematical and Statistical Foundations of Verification, Validation, and Uncertainty Quantification. Washington, DC: National Academies Press. Available at: https://doi.org/10.17226/13395
- Oberkampf, W.L. and Roy, C.J. (2010) Verification and Validation in Scientific Computing. Cambridge: Cambridge University Press. Available at: https://www.cambridge.org/core/books/verification-and-validation-in-scientific-computing/05CA1F8F3CCB5AE5445FDF55239A0183
- Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M. and Tarantola, S. (2008) Global Sensitivity Analysis: The Primer. Chichester: Wiley. Available at: https://doi.org/10.1002/9780470725184
