Introduction to the Ontology

Why ontologies exist, what problems they solve, and where they fit between raw data and the applications that depend on it.

⚡ intermediate
⏱️ 45 minutes
👤 SuperML Team

· Ontology · 4 min read

📋 Prerequisites

  • Comfort with relational data modeling (tables, primary/foreign keys)
  • Basic familiarity with REST APIs

🎯 What You'll Learn

  • Define what an ontology is in the context of a data platform
  • Articulate the problems an ontology solves that a data warehouse alone does not
  • Identify the three core primitives: object types, link types, action types
  • Recognize when an ontology adds value versus when it is overkill

The problem ontologies solve

Most organizations have the same painful story:

  • Marketing has a definition of “active customer.”
  • Finance has a different one.
  • The data warehouse has a third.
  • The production database has a fourth, encoded implicitly in business logic across 12 microservices.

When the CEO asks “how many active customers do we have?” four people answer with four numbers, and nobody is wrong — they are each correct against a different definition.

The same fragmentation shows up everywhere: when is a shipment “in transit”? What counts as a “completed” order? Who is the “owner” of an account? Every team rebuilds the answers in code, in SQL, and in spreadsheets, slightly differently each time.

The ontology is the one place where these answers live.
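
To make the disagreement concrete, here is a small Python sketch. The customer records, the reference date, and both rules are invented for illustration; they are assumptions, not definitions taken from any real marketing or finance team.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class CustomerRecord:
    customer_id: str
    last_login: date
    last_purchase: Optional[date]  # None means the customer never purchased


# Hypothetical sample data, invented for this illustration.
CUSTOMERS = [
    CustomerRecord("c1", date(2024, 5, 30), date(2024, 5, 1)),
    CustomerRecord("c2", date(2024, 5, 28), None),
    CustomerRecord("c3", date(2024, 1, 10), date(2024, 3, 15)),
    CustomerRecord("c4", date(2023, 11, 2), date(2024, 4, 20)),
]
TODAY = date(2024, 6, 1)


def active_for_marketing(c: CustomerRecord) -> bool:
    """Assumed marketing rule: logged in within the last 30 days."""
    return TODAY - c.last_login <= timedelta(days=30)


def active_for_finance(c: CustomerRecord) -> bool:
    """Assumed finance rule: purchased within the last 90 days."""
    return c.last_purchase is not None and TODAY - c.last_purchase <= timedelta(days=90)


# Same four customers, two honest answers.
marketing_count = sum(active_for_marketing(c) for c in CUSTOMERS)
finance_count = sum(active_for_finance(c) for c in CUSTOMERS)
print(marketing_count, finance_count)
```

Both functions are correct against their own rule, yet they disagree on the same records. Moving the rule into one shared definition is what removes the disagreement.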

What is an ontology?

An ontology is a typed, governed, semantic model of the real-world entities your business operates on — and the relationships and actions that connect them.

Concretely, an ontology is made of three primitives:

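| Primitive   | What it is                            | Example                                              |
|-------------|---------------------------------------|------------------------------------------------------|
| Object type | A noun in the business: an entity     | Customer, Order, Shipment, Driver                    |
| Link type   | A verb between objects: a relationship | Customer → places → Order, Driver → operates → Vehicle |
| Action type | A typed mutation to ontology state    | markShipmentDelivered, assignDriverToRoute           |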
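
One way to picture the three primitives is as plain data definitions. The sketch below is a minimal Python model, not the API of any particular ontology platform: the class names, fields, and the `Ontology` registry are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectType:
    """A noun in the business, for example Customer or Order."""
    name: str
    primary_key: str
    properties: dict[str, str] = field(default_factory=dict)  # property name -> type name


@dataclass(frozen=True)
class LinkType:
    """A relationship between two object types, referenced by name."""
    name: str
    source: str
    target: str


@dataclass(frozen=True)
class ActionType:
    """A typed mutation, described by its parameter names and types."""
    name: str
    parameters: dict[str, str]


@dataclass
class Ontology:
    """Hypothetical registry holding the three kinds of definitions."""
    object_types: dict[str, ObjectType] = field(default_factory=dict)
    link_types: dict[str, LinkType] = field(default_factory=dict)
    action_types: dict[str, ActionType] = field(default_factory=dict)

    def add_object_type(self, object_type: ObjectType) -> None:
        self.object_types[object_type.name] = object_type

    def add_link_type(self, link_type: LinkType) -> None:
        # A link may only connect object types that are already defined.
        for endpoint in (link_type.source, link_type.target):
            if endpoint not in self.object_types:
                raise ValueError(f"unknown object type: {endpoint}")
        self.link_types[link_type.name] = link_type

    def add_action_type(self, action_type: ActionType) -> None:
        self.action_types[action_type.name] = action_type


ontology = Ontology()
ontology.add_object_type(ObjectType("Customer", "customerId", {"name": "string"}))
ontology.add_object_type(ObjectType("Order", "orderId", {"total": "double"}))
ontology.add_link_type(LinkType("places", source="Customer", target="Order"))
ontology.add_action_type(ActionType("markShipmentDelivered", {"shipmentId": "string"}))
```

Registering a link whose endpoints are not defined yet, such as Driver → operates → Vehicle here, raises a `ValueError`. That check is a small instance of what "typed" means for a link.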

Sometimes you also work with functions (typed computations over the ontology) and interfaces (contracts that multiple object types can satisfy), but both build on top of the three primitives above.
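
As a rough analogy in Python, an interface behaves like a `typing.Protocol` and a function like an ordinary typed function that accepts anything satisfying it. The `Locatable` interface, the two object classes, and `within_bounding_box` are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class Locatable(Protocol):
    """Hypothetical interface: any object type that exposes a position."""
    latitude: float
    longitude: float


@dataclass
class Vehicle:
    vehicle_id: str
    latitude: float
    longitude: float


@dataclass
class Shipment:
    shipment_id: str
    latitude: float
    longitude: float


def within_bounding_box(
    items: Iterable[Locatable],
    south: float,
    north: float,
    west: float,
    east: float,
) -> list[Locatable]:
    """A function in the ontology sense: typed compute over objects.

    It depends only on the interface, so it works for every object
    type that satisfies Locatable.
    """
    return [
        item
        for item in items
        if south <= item.latitude <= north and west <= item.longitude <= east
    ]


items = [
    Vehicle("v1", 52.5, 13.4),
    Shipment("s1", 48.1, 11.6),
    Shipment("s2", 40.7, -74.0),
]
inside = within_bounding_box(items, south=47.0, north=55.0, west=5.0, east=15.0)
```

Neither `Vehicle` nor `Shipment` declares that it implements `Locatable`. Having the right typed properties is enough, which is the sense in which several object types can satisfy one contract.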

Where the ontology sits

A useful way to picture it:

┌──────────────────────────────────────────┐
│   Applications, dashboards, AI agents    │
├──────────────────────────────────────────┤
│              ONTOLOGY                    │  ← typed business model
│   Object Types · Link Types · Actions    │
├──────────────────────────────────────────┤
│  Datasets · Streams · APIs · Files       │  ← raw data
└──────────────────────────────────────────┘

Below the ontology: raw data — Parquet files, Postgres tables, Kafka topics, REST APIs from SaaS tools.

Above the ontology: every consumer — operational apps, BI dashboards, ML models, AI agents — speaking a single, shared vocabulary.
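
The boundary between the bottom two layers can be sketched as a mapping from an untyped raw row to a typed object. The column names (`shp_id`, `wt_kg`, `status_cd`) and the status codes below are invented for illustration and do not come from a real dataset.

```python
from dataclasses import dataclass
from enum import Enum


class ShipmentStatus(Enum):
    IN_TRANSIT = "in_transit"
    DELIVERED = "delivered"


@dataclass(frozen=True)
class Shipment:
    """Typed object as seen by consumers above the ontology."""
    shipment_id: str
    weight_kg: float
    status: ShipmentStatus


# Hypothetical raw status codes used by the source system.
STATUS_CODES = {"T": ShipmentStatus.IN_TRANSIT, "D": ShipmentStatus.DELIVERED}


def shipment_from_row(row: dict[str, str]) -> Shipment:
    """Convert one raw row (all strings) into a typed Shipment.

    A missing column or an unknown status code raises KeyError, and a
    non-numeric weight raises ValueError, so bad rows fail here rather
    than inside an application.
    """
    return Shipment(
        shipment_id=row["shp_id"].strip(),
        weight_kg=float(row["wt_kg"]),
        status=STATUS_CODES[row["status_cd"]],
    )


raw_row = {"shp_id": " S-1001 ", "wt_kg": "12.5", "status_cd": "T"}
shipment = shipment_from_row(raw_row)
```

Consumers above the ontology only ever see `Shipment`, with a clean ID, a numeric weight, and a named status. The cryptic column names stay below the line.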

Ontology vs. data warehouse

A common question: “isn’t this just a data warehouse?” No — they overlap, but the focus is different.

| Concern         | Data warehouse                  | Ontology                               |
|-----------------|---------------------------------|----------------------------------------|
| Primary purpose | Analytical queries              | Operational model + queries            |
| Schema style    | Star / snowflake, denormalized  | Object-centric, typed graph            |
| Write semantics | Append-only ETL                 | Typed actions with validation          |
| Consumed by     | BI tools, analysts              | BI and apps, agents, services          |
| Identity        | Surrogate keys                  | Stable, business-meaningful IDs        |
| Governance      | Column docs                     | Object-, property-, row-level policies |

A warehouse asks “what happened?” An ontology asks “what is true right now, and how do I change it?”
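
The difference between the two read questions can be shown on one small set of rows. This is a deliberately simplified sketch: the event tuples are made up, and real warehouses and ontologies do far more than a counter and a dictionary.

```python
from collections import Counter

# Hypothetical append-only event rows: (order_id, status, day).
# Rows are assumed to be listed in time order.
EVENTS = [
    ("o1", "placed", 1),
    ("o2", "placed", 1),
    ("o1", "shipped", 2),
    ("o1", "completed", 3),
    ("o2", "cancelled", 3),
    ("o3", "placed", 3),
]

# Warehouse-style question: what happened? Aggregate over the history.
what_happened = Counter(status for _order_id, status, _day in EVENTS)

# Ontology-style question: what is true right now? One current state per object.
current_status: dict[str, str] = {}
for order_id, status, _day in EVENTS:
    current_status[order_id] = status  # later events overwrite earlier ones

print(what_happened["placed"])  # number of "placed" events ever recorded
print(current_status["o1"])     # the state order o1 is in today
```

Three orders were placed over the period, but only `o3` is still in the placed state. An operational app needs the second view, keyed by a stable order ID, and an analyst often wants the first.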

A worked example

Imagine a logistics company. Their raw data:

  • shipments.csv — 200M rows, updated nightly from the operational DB
  • vehicle_telemetry — a Kafka stream, 50k events/second
  • drivers table in HR’s Workday account, synced via REST
  • customer_contracts — PDFs, parsed by an OCR pipeline

Without an ontology, every team that needs “the shipment with its current driver and the customer it belongs to” writes a join across all four sources — and each team writes it slightly differently.

With an ontology:

  • Shipment is one object type, backed by the shipments data from the operational DB and enriched by the vehicle_telemetry stream.
  • Driver and Customer are their own object types.
  • Link types wire them together: Shipment → assignedTo → Driver, Shipment → orderedBy → Customer.
  • An action markDelivered(shipmentId, deliveredAt, signature) is the only way a shipment can transition to delivered, and every call is validated, logged, and permission-checked.

Every app, dashboard, and AI agent now reads and writes through the same typed surface.
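
The shape of that action can be sketched in a few lines of Python. Everything beyond the action's three parameters is an assumption made for illustration: the in-memory `ShipmentStore`, the role names, the status strings, and the audit-log format are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Shipment:
    shipment_id: str
    status: str        # "in_transit" or "delivered" (assumed status values)
    assigned_to: str   # link Shipment -> assignedTo -> Driver, stored as a driver ID
    ordered_by: str    # link Shipment -> orderedBy -> Customer, stored as a customer ID
    delivered_at: Optional[datetime] = None
    signature: Optional[str] = None


class ActionError(Exception):
    """Raised when an action is not permitted or fails validation."""


@dataclass
class ShipmentStore:
    shipments: dict[str, Shipment] = field(default_factory=dict)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def mark_delivered(
        self,
        actor: str,
        actor_roles: set[str],
        shipment_id: str,
        delivered_at: datetime,
        signature: str,
    ) -> None:
        # Permissioned: only assumed roles may run the action.
        if not actor_roles & {"driver", "dispatcher"}:
            raise ActionError(f"{actor} may not mark shipments delivered")
        # Validated: the shipment must exist and be in transit.
        shipment = self.shipments.get(shipment_id)
        if shipment is None:
            raise ActionError(f"unknown shipment: {shipment_id}")
        if shipment.status != "in_transit":
            raise ActionError(f"{shipment_id} is {shipment.status}, not in_transit")
        if not signature.strip():
            raise ActionError("signature is required")
        # State changes only after every check has passed.
        shipment.status = "delivered"
        shipment.delivered_at = delivered_at
        shipment.signature = signature
        # Logged: record who did what to which object.
        self.audit_log.append((actor, "markDelivered", shipment_id))


store = ShipmentStore()
store.shipments["S-1"] = Shipment("S-1", "in_transit", assigned_to="D-7", ordered_by="C-42")
store.mark_delivered("alice", {"driver"}, "S-1", datetime(2024, 6, 1, 14, 30), "J. Doe")
```

Calling `mark_delivered` a second time on `S-1` raises `ActionError`, because the shipment is no longer in transit. Since no other code path sets `status`, the rule holds for every app that goes through the action.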

When to use an ontology — and when not to

Use an ontology when:

  • Multiple teams build on the same domain concepts.
  • You need operational reads and writes, not just analytics.
  • Definitions disagree across teams and the disagreement costs you.
  • You want AI agents or no-code apps to safely operate on real data.

Skip it when:

  • You have one app, one team, one database. Just use the database.
  • The domain is throwaway — a one-off analysis, a research notebook.
  • You do not yet have the data integrated. Get the data flowing first.

Key terms to remember

  • Object type — a definition of an entity (a class). The instances are objects.
  • Property — a typed field on an object type (firstName: string, weightKg: double).
  • Link type — a typed relationship between two object types.
  • Action type — a typed, validated mutation to ontology state.
  • Function — a typed compute over ontology data.
  • Datasource — the underlying dataset / stream / API backing an object type.

What’s next

Now that you know what an ontology is and why it exists, the next lesson covers the broader pattern it implements: the semantic layer.

Then we will look at the architecture that makes the three primitives work together.


Happy modeling! 🧭
