Designing Your Object Model
From whiteboard to schema: domain interviews, normalization, and naming conventions
Why design before code
The most expensive bug in an ontology is the wrong primary key. The second is splitting one concept into three object types because three teams used three names for the same thing.
Both bugs are cheap to fix on a whiteboard. They are extremely expensive to fix in production. Design first.
We will work through a full design exercise for our running example — a logistics company we’ll call Northwind Freight. By the end of this lesson, you will have a complete object model ready for the next lesson, where we implement it.
Step 1 — Domain interviews
Before drawing anything, talk to the people who do the work. Three roles per domain is a good baseline:
- An operator — the person clicking the buttons every day. Dispatcher, support agent, account manager.
- A manager — the person reading the dashboards, accountable for outcomes.
- A subject-matter expert — operations research, compliance, finance.
Ask open questions:
- “Walk me through a typical day.”
- “What’s the first thing you do when X happens?”
- “Show me the screens you use.” (Then take screenshots.)
- “What goes wrong? When was the last time it went wrong?”
- “What words do you use for [thing]? Does anyone else call it something different?”
Two big outcomes from interviews:
- A vocabulary list. Every distinct word, with its definition and who uses it.
- A handful of workflows. End-to-end stories: “a shipment from creation to delivery.”
For Northwind, the vocabulary might include:
- shipment, package, parcel — the dispatcher and warehouse use these interchangeably
- order — the customer-facing concept; one order can produce multiple shipments
- hub, depot, warehouse — hub is the formal name; the team uses “depot” colloquially
- driver, courier — internal drivers are called drivers; contractors are called couriers
- route, leg — a route is end-to-end; a leg is one hop in a route
- customer, account — a customer can have multiple billing accounts
Already we have decisions to make. Three rules of thumb:
Rule 1 — Use the most precise term, not the most popular. If “depot” and “hub” are used by different teams for the same thing, pick Hub (the formal name) and educate the teams that say “depot”. Drift comes from vagueness.
Rule 2 — Make distinctions that exist; ignore those that do not. Internal drivers and external couriers behave differently (different contracts, certifications, dispatching) — that is a real distinction. Split them. “Shipment” and “package” are used interchangeably by the same people for the same thing — collapse them.
Rule 3 — When two concepts look identical today but might diverge, plan for divergence early. A Customer and a BillingAccount could be one type today, but the day Finance asks “show me an account’s contract history independent of the customer”, you will wish you had split them.
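The three rules can be written down as a small alias table, so that every word heard in an interview resolves to exactly one object type. This is a minimal sketch, assuming a plain dictionary is enough at design time; `ALIASES` and `canonical_type` are illustrative names, not part of any ontology tooling.

```python
# Hypothetical design-time alias table built from the Northwind vocabulary list.
# Keys are words heard in interviews; values are the object type names we settled on.
ALIASES = {
    # Rule 1: one precise term per concept.
    "hub": "Hub",
    "depot": "Hub",
    # Rule 2: same people, same thing, so collapse.
    "shipment": "Shipment",
    "package": "Shipment",
    "parcel": "Shipment",
    # Rule 2: real behavioral difference, so keep separate.
    "driver": "Driver",
    "courier": "Courier",
    # Rule 3: may diverge later, so split now.
    "customer": "Customer",
    "account": "BillingAccount",
}


def canonical_type(term: str) -> str:
    """Resolve an interview term to its object type, failing loudly on unknown words."""
    key = term.strip().lower()
    if key not in ALIASES:
        raise KeyError(f"'{term}' is not in the vocabulary list yet")
    return ALIASES[key]
```

A term that raises `KeyError` is a prompt to go back to the interviewees, not something to guess at.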
Step 2 — Extract nouns
From the vocabulary and workflows, list the candidate object types (nouns):
Candidate object types (first pass):
  Customer
  BillingAccount
  Order
  OrderLine
  Shipment
  ShipmentLeg
  Hub
  Route
  Driver
  Courier (split from Driver)
  Vehicle
  Telemetry (one ping)
Now: prune.
- Is each of these a real, identifiable entity in the business?
- Does each have a stable identity (a primary key)?
- Does each have at least 3-5 properties that meaningfully describe it?
A failed candidate: Telemetry. Each ping can be told apart technically, but no business decision is made about an individual ping; it is a stream of measurements about a Vehicle. Decision: drop Telemetry as an object type and project it onto Vehicle.currentLocation.
A reinforced candidate: ShipmentLeg. A single shipment can pass through multiple hubs. Each leg has a distinct origin, destination, vehicle, driver, and timestamp. It is an entity in its own right. Keep it.
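The three pruning questions can be applied mechanically once the answers are written down. The sketch below is one possible way to record them; the `Candidate` class, the `keep` helper, and the answers entered for each candidate are illustrative judgments, not output of any tool.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate object type with the answers to the three pruning questions."""
    name: str
    is_business_entity: bool   # real, identifiable entity in the business?
    has_stable_identity: bool  # is there a primary key that does not change?
    property_count: int        # properties that meaningfully describe it


def keep(candidate: Candidate, min_properties: int = 3) -> bool:
    """A candidate survives only if it passes all three questions."""
    return (
        candidate.is_business_entity
        and candidate.has_stable_identity
        and candidate.property_count >= min_properties
    )


candidates = [
    # origin, destination, vehicle, driver, timestamp
    Candidate("ShipmentLeg", True, True, 5),
    # Assumed answers: a ping has an id and a few fields,
    # but it is a measurement stream rather than a business entity.
    Candidate("Telemetry", False, True, 3),
]
kept = [c.name for c in candidates if keep(c)]
```

Here `kept` contains only `ShipmentLeg`, matching the decisions above.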
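After pruning (Route is also gone; presumably a route is fully described by a shipment's ordered legs, so it needs no object type of its own):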
Object types:
  Customer · BillingAccount · Order · OrderLine ·
  Shipment · ShipmentLeg · Hub · Driver · Courier · Vehicle
Step 3 — Extract verbs
Now list candidate link types (verbs):
Customer ─owns→ BillingAccount
Customer ─placed→ Order
Order ─containsLine→ OrderLine
Order ─producedShipment→ Shipment
Shipment ─hasLeg→ ShipmentLeg
ShipmentLeg ─originatesAt→ Hub
ShipmentLeg ─terminatesAt→ Hub
ShipmentLeg ─carriedBy→ Vehicle
ShipmentLeg ─driverAssigned→ Driver OR Courier
Driver ─basedAt→ Hub
Vehicle ─basedAt→ Hub
Notice the driverAssigned link has a twist: it can target either Driver or Courier. This is the perfect motivation for an interface:
interface: Operator # someone who can drive a leg
  implements: [Driver, Courier]
  properties: [operatorId, displayName, currentStatus]
# Now the link is unambiguous:
ShipmentLeg ─operatedBy→ Operator
We have just used the interface lesson in practice.
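To make the idea concrete outside the ontology notation, here is a rough Python analogue using `typing.Protocol`. It is only a sketch: the snake_case field names are Python spellings of the three interface properties, and `home_hub_id` and `contract_id` are hypothetical type-specific properties added for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class Operator(Protocol):
    """Anyone who can drive a leg; mirrors the interface's three properties."""
    operator_id: str
    display_name: str
    current_status: str


@dataclass
class Driver:
    operator_id: str
    display_name: str
    current_status: str
    home_hub_id: str   # hypothetical field standing in for Driver basedAt Hub


@dataclass
class Courier:
    operator_id: str
    display_name: str
    current_status: str
    contract_id: str   # hypothetical contractor-only property


@dataclass
class ShipmentLeg:
    leg_id: str
    operated_by: Operator  # one link target instead of "Driver OR Courier"


def operator_label(leg: ShipmentLeg) -> str:
    """Works for either implementation because it touches only interface properties."""
    op = leg.operated_by
    return f"{op.display_name} ({op.current_status})"
```

Code that reads a leg never has to ask which kind of operator it got, which is the same benefit the interface gives the link type.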
Step 4 — Properties
For each object type, list every property a consumer might want to read. Group by source:
Shipment:
  Identity:
    shipmentId, externalTrackingNumber
  State:
    status, currentLocation, currentLegId
  Constraints:
    weightKg, dimensions, hazardCodes
  Lifecycle:
    createdAt, expectedDeliveryAt, deliveredAt
  Customer-facing:
    deliveryInstructions, signatureRequired
  Derived:
    delayRiskScore, currentEta, totalDistanceKm
A few prompts that surface missed properties:
- “What do you wish you could see on this screen?”
- “What does the customer ask about?”
- “What goes into a monthly report?”
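Once the prompts stop surfacing new properties, the grouped Shipment list can be sketched as a typed record to check that every property has an obvious type and unit. The version below covers a subset of the list; all types and defaults are assumptions made for illustration, and the field names are Python spellings of the camelCase properties.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Shipment:
    # Identity
    shipment_id: str
    external_tracking_number: str
    # State
    status: str
    # Constraints: the unit is part of the name
    weight_kg: float
    # Lifecycle: timestamps end in "at"
    created_at: datetime
    expected_delivery_at: datetime
    delivered_at: Optional[datetime] = None   # empty until delivery
    # State that is empty until the shipment is routed
    current_leg_id: Optional[str] = None
    # Constraints and customer-facing values with assumed defaults
    hazard_codes: list[str] = field(default_factory=list)
    delivery_instructions: str = ""
    signature_required: bool = False
```

If a property resists being given a type here, that is usually a sign its definition is still vague.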
For derived properties, write down the formula during design — even if the implementation comes later:
delayRiskScore = ML model over weightKg, route, hub backlog, weather
currentEta = max(leg.expectedArrival for leg in remaining legs)
totalDistanceKm = sum(leg.distanceKm)
Step 5 — Actions
What state changes happen? Each is a candidate action type:
Shipment:
  createShipment(orderId, ...)
  routeShipment(shipmentId, legPlan)
  markShipmentDispatched(shipmentId, legId)
  markLegArrived(legId, arrivedAt)
  markShipmentDelivered(shipmentId, deliveredAt, signature)
  markShipmentException(shipmentId, reason, evidence)
  rerouteShipment(shipmentId, newLegPlan)
  cancelShipment(shipmentId, reason)
Order:
  placeOrder(customerId, lines, ...)
  cancelOrder(orderId, reason)
Driver / Courier:
  assignOperatorToLeg(legId, operatorId)
  unassignOperator(legId, reason)
Two design questions to ask of every candidate action:
- What state does it transition from / to? If you cannot draw the state machine, the action is under-specified.
- What’s the smallest meaningful business intent? “Update shipment” is too coarse — split into the specific transitions.
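To see what "smallest meaningful business intent" buys us, compare a generic "update shipment" with two narrow actions that each carry only the parameters their intent needs. This is a minimal sketch: the shipment is a plain dict, `apply_action` and the `cancelReason` key are hypothetical, and checks on the starting state are deliberately left out because they belong to the state machine in Step 6.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Union


@dataclass(frozen=True)
class MarkShipmentDelivered:
    shipment_id: str
    delivered_at: datetime
    signature: str


@dataclass(frozen=True)
class CancelShipment:
    shipment_id: str
    reason: str


ShipmentAction = Union[MarkShipmentDelivered, CancelShipment]


def apply_action(shipment: dict, action: ShipmentAction) -> dict:
    """Return an updated copy; each action writes only the fields its intent owns."""
    if action.shipment_id != shipment["shipmentId"]:
        raise ValueError("action targets a different shipment")
    if isinstance(action, MarkShipmentDelivered):
        return {**shipment, "status": "delivered",
                "deliveredAt": action.delivered_at, "signature": action.signature}
    if isinstance(action, CancelShipment):
        if not action.reason.strip():
            raise ValueError("cancelShipment requires a reason")
        return {**shipment, "status": "cancelled", "cancelReason": action.reason}
    raise TypeError(f"unknown action: {type(action).__name__}")
```

Because each action names its intent, a validation such as "a cancellation needs a reason" has an obvious home, which a catch-all update would not give it.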
Step 6 — The state machines
For object types with meaningful state, draw the legal transitions:
Shipment.status:
  created ──routeShipment──→ in_transit
  in_transit ──markLegArrived (last leg)──→ out_for_delivery
  in_transit ──markLegArrived (intermediate)──→ in_transit
  out_for_delivery ──markShipmentDelivered──→ delivered
  any non-terminal ──markShipmentException──→ exception
  any non-terminal ──cancelShipment──→ cancelled
This is where your action validations come from. Each arrow is one validation rule. Each missing arrow is a state transition that should not be allowed.
A state machine that is hard to draw is a state machine that will be hard to debug. Simplify until it is drawable.
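The Shipment.status diagram translates almost line for line into a lookup table. The sketch below assumes that delivered, exception, and cancelled are terminal because no arrows leave them in the diagram; `next_status` and the `is_last_leg` flag are illustrative names.

```python
# Each entry is one arrow from the diagram: (from_state, action) -> to_state.
TRANSITIONS = {
    ("created", "routeShipment"): "in_transit",
    ("in_transit", "markLegArrived:intermediate"): "in_transit",
    ("in_transit", "markLegArrived:last"): "out_for_delivery",
    ("out_for_delivery", "markShipmentDelivered"): "delivered",
}

# Arrows drawn from "any non-terminal" state.
FROM_ANY_NON_TERMINAL = {
    "markShipmentException": "exception",
    "cancelShipment": "cancelled",
}

# Assumption: states with no outgoing arrow in the diagram are terminal.
NON_TERMINAL = {"created", "in_transit", "out_for_delivery"}


def next_status(status: str, action: str, *, is_last_leg: bool = False) -> str:
    """Return the new status, or raise ValueError when the arrow is missing."""
    if status in NON_TERMINAL and action in FROM_ANY_NON_TERMINAL:
        return FROM_ANY_NON_TERMINAL[action]
    if action == "markLegArrived":
        action = "markLegArrived:last" if is_last_leg else "markLegArrived:intermediate"
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {action} from {status}") from None
```

Anything not in the table is rejected, so a missing arrow becomes a validation error without extra code.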
Step 7 — Naming pass
Now do a single pass through every name in the model and apply the conventions:
- Object types in PascalCase.
- Properties in camelCase, with units in the name, an At suffix for timestamps, and an Id suffix for foreign keys.
- Enums in PascalCase, values in snake_case.
- Actions in imperative verb-first.
Edit the model. This is much cheaper than renaming after the code exists.
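A naming pass is easier to repeat if the conventions are executable. The following is a small sketch of such a check, not an existing linter: the `kind` labels and the `ACTION_VERBS` list are assumptions, and treating action names as camelCase is inferred from the examples in Step 5.

```python
import re

PASCAL_CASE = re.compile(r"[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)*")
CAMEL_CASE = re.compile(r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)*")
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")

# Hypothetical verb list; extend it with the verbs your own actions use.
ACTION_VERBS = {"create", "route", "reroute", "mark", "cancel", "place", "assign", "unassign"}


def naming_problems(kind: str, name: str) -> list[str]:
    """Return convention violations for one name; an empty list means it is clean."""
    problems = []
    if kind in ("object_type", "enum"):
        if not PASCAL_CASE.fullmatch(name):
            problems.append(f"{name}: {kind} should be PascalCase")
    elif kind in ("property", "timestamp", "foreign_key", "action"):
        if not CAMEL_CASE.fullmatch(name):
            problems.append(f"{name}: {kind} should be camelCase")
        if kind == "timestamp" and not name.endswith("At"):
            problems.append(f"{name}: timestamps should end in 'At'")
        if kind == "foreign_key" and not name.endswith("Id"):
            problems.append(f"{name}: foreign keys should end in 'Id'")
        if kind == "action":
            verb = re.match(r"[a-z]*", name).group(0)  # leading lowercase word
            if verb not in ACTION_VERBS:
                problems.append(f"{name}: actions should start with an imperative verb")
    elif kind == "enum_value":
        if not SNAKE_CASE.fullmatch(name):
            problems.append(f"{name}: enum values should be snake_case")
    else:
        raise ValueError(f"unknown kind: {kind}")
    return problems
```

The "units in the name" rule is left to human review, since a name alone does not reveal whether a property carries a unit.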
Step 8 — Sanity-check
Three questions to ask of the finished model:
1. Can a new engineer read the model and answer “what does this business do”?
The answer should be yes. If not, your descriptions are weak, or your object types are too granular to convey purpose.
2. For every property on every object type, can you answer “where does this value come from?”
For intrinsic properties: a source system. For derived: a function or formula. If you cannot answer, it does not belong.
3. Can every action you listed be expressed as a state transition or a typed write to a small set of objects?
If an action seems to need “five updates across four object types and a webhook call”, it is probably two actions plus an orchestration layer.
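Question 2 in particular lends itself to a mechanical check: every property records either a source system or a formula, and anything with neither is flagged. The sketch below reuses two of the derived-property formulas from Step 4; `PropertySpec`, the `"wms"` source name, and the `priorityHint` property are hypothetical and exist only to illustrate the check.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


def total_distance_km(legs: list[dict]) -> float:
    """totalDistanceKm = sum(leg.distanceKm)"""
    return sum(leg["distanceKm"] for leg in legs)


def current_eta(remaining_legs: list[dict]) -> datetime:
    """currentEta = max(leg.expectedArrival for leg in remaining legs)"""
    return max(leg["expectedArrival"] for leg in remaining_legs)


@dataclass
class PropertySpec:
    name: str
    source_system: Optional[str] = None             # intrinsic: where it is loaded from
    formula: Optional[Callable[..., object]] = None  # derived: how it is computed


SHIPMENT_PROPERTIES = [
    PropertySpec("weightKg", source_system="wms"),   # hypothetical source system name
    PropertySpec("totalDistanceKm", formula=total_distance_km),
    PropertySpec("currentEta", formula=current_eta),
    PropertySpec("priorityHint"),                    # hypothetical property with no known origin
]


def unexplained(props: list[PropertySpec]) -> list[str]:
    """Names of properties that have neither a source system nor a formula."""
    return [p.name for p in props if p.source_system is None and p.formula is None]
```

Anything `unexplained` returns either gets a source during review or is removed from the model.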
The finished model (preview)
By the end of design we have:
- 10 object types with primary keys, title keys, descriptions, and property lists.
- 12 link types with cardinalities and intersection object types where needed.
- 1 interface (Operator).
- ~15 action types with parameters and state transitions.
- ~8 derived properties / functions with formulas.
This is what we will implement in the next two lessons.
Tools for the design phase
You do not need fancy software. The combinations that work in practice:
- Whiteboard + camera for the noun-verb extraction.
- A shared markdown document for vocabulary and descriptions.
- draw.io / Excalidraw for object graph and state machines.
- A simple table (sheet, markdown) for the property lists.
We will commit the design document into docs/design/ alongside the code so the why never gets lost.
Anti-patterns at the modeling stage
Anti-pattern 1 — Modeling the source systems instead of the business. If your object types mirror your tables one-to-one, you have not modeled — you have renamed. Interpret.
Anti-pattern 2 — Premature generalization. Two object types that might share behavior in the future should not be collapsed today. Refactor when the need is real, not speculative.
Anti-pattern 3 — Skipping the description. A model without descriptions is a model nobody else can use. Write the descriptions during design, not as a future TODO.
Anti-pattern 4 — Modeling alone. A model produced by one person reflects that person’s view of the business. Review with the operators, managers, and SMEs you interviewed.
Key takeaways
- Design starts with people, not with schemas.
- Extract nouns → object types, verbs → links, state changes → actions.
- Apply naming conventions once, before code.
- Sanity-check: a new engineer should be able to read the model and understand the business.
What’s next
In the next lesson we implement this model. We will create the object types and link types in code, bind them to fixture datasources, and stand up a working multi-entity ontology.
Plans are cheap; refactors are not. 📐