Property Types and Data Types
Strings, integers, geo-points, arrays, structs, attachments — and how typing enables validation and reuse
Why typing matters
Loose typing is the most expensive mistake in data modeling. A weight field that is sometimes pounds, sometimes kilograms, and sometimes empty can cost more in production incidents over five years than the entire ontology effort costs to set up.
The ontology pushes types as far down as it can: every property has a declared type, every consumer reads it knowing the shape, and the type system catches misuse before runtime.
Primitive types
Every platform supports the basics. The names vary; the meaning does not:
| Type | Use for | Example |
|---|---|---|
| string | Free text, IDs, codes | firstName, customerId |
| int / long | Whole-number counts | seatCount, retryCount |
| double / float | Continuous measurements | weightKg, priceUsd |
| boolean | True/false flags | isActive, requiresSignature |
| timestamp | Instant in time (UTC) | createdAt, deliveredAt |
| date | Calendar date with no time | birthDate, policyStartDate |
| bytes | Raw binary | signatureBlob |
Use timestamp for events, date for dates. Mixing them is a common bug source.
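The timestamp/date distinction can be illustrated with a minimal Python sketch. The helper names `require_utc_instant` and `same_calendar_day` are hypothetical and not part of any platform API; the point is only that an instant and a calendar date are different types and should be compared after an explicit conversion.

```python
from datetime import date, datetime, timezone

# An instant in time: timezone-aware and expressed in UTC.
delivered_at = datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc)

# A calendar date: no time of day and no timezone attached.
policy_start_date = date(2024, 3, 1)


def require_utc_instant(value: datetime) -> datetime:
    """Reject naive datetimes and normalize aware ones to UTC."""
    if value.tzinfo is None or value.utcoffset() is None:
        raise ValueError("timestamp must be timezone-aware")
    return value.astimezone(timezone.utc)


def same_calendar_day(instant: datetime, day: date) -> bool:
    """Compare an instant with a date only after converting it to a UTC date."""
    return require_utc_instant(instant).date() == day
```

Making the conversion explicit is the design choice here: the same instant can fall on different calendar dates depending on the timezone, so the code has to say which one it means.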
Avoid string for things that are not strings. A phone number is not a string — it has a format. An email is not a string — it has structure. Reach for semantic types where the platform supports them.
Semantic types
Semantic types are primitives with meaning attached. They are checked, formatted, and rendered specially:
| Semantic type | Backed by | Why it matters |
|---|---|---|
| Money | double + currency | Currency mismatches caught at compile time |
| LatLng / GeoPoint | double, double | Map rendering, distance queries, geo indexes |
| Polygon / GeoShape | GeoJSON | Spatial joins (“which region contains this point?”) |
| EmailAddress | string with RFC-5322 validation | Catches malformed emails at ingestion |
| PhoneNumber | string + region | E.164 normalization |
| URL | validated string | UI can render as a link |
| Duration | long ms | Math without unit confusion |
Always prefer the semantic type if your platform supports it. Money with explicit currency is always better than priceUsd: double.
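As a rough illustration of what a semantic type buys you, here is a small Python sketch of a `Money` value. It is an assumption-laden stand-in, not the platform's implementation: it uses `Decimal` instead of the `double` listed in the table to keep the example free of binary rounding, and it catches currency mismatches at runtime rather than at compile time.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Hypothetical semantic type: an amount that always carries its currency."""

    amount: Decimal
    currency: str  # assumed to be an ISO 4217 code such as "USD"

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            return NotImplemented
        # Refuse to combine amounts in different currencies.
        if other.currency != self.currency:
            raise ValueError(f"cannot add {other.currency} to {self.currency}")
        return Money(self.amount + other.amount, self.currency)
```

With a bare `priceUsd: double`, adding a euro amount to a dollar amount is just arithmetic; with a type like this, the same mistake fails loudly at the point where it happens.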
Enums
Enums turn unbounded strings into a closed set of allowed values. Use them whenever the set is small, known, and stable-ish.

    ShipmentStatus:
      values:
        - created
        - in_transit
        - out_for_delivery
        - delivered
        - exception
        - cancelled

Design rules:
- Lowercase, snake_case values — they appear in URLs, logs, and indexes.
- Stable values. Renaming an enum value is a breaking change. The display label can change freely; the API value cannot.
- Reserve `unknown` when the closed-world assumption is risky. Better to have `unknown` than to force stray values into a member that does not fit.
- Document the lifecycle: which transitions are legal? That belongs near the enum definition.
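The design rules above can be sketched in Python. The enum values mirror the `ShipmentStatus` example; the `unknown` member, the `parse` helper, and the transition table are illustrative assumptions, since the lesson does not define which transitions are legal.

```python
from enum import Enum


class ShipmentStatus(str, Enum):
    CREATED = "created"
    IN_TRANSIT = "in_transit"
    OUT_FOR_DELIVERY = "out_for_delivery"
    DELIVERED = "delivered"
    EXCEPTION = "exception"
    CANCELLED = "cancelled"
    UNKNOWN = "unknown"  # reserved for values outside the closed set

    @classmethod
    def parse(cls, raw: str) -> "ShipmentStatus":
        """Map a raw API value to a member, falling back to UNKNOWN."""
        try:
            return cls(raw)
        except ValueError:
            return cls.UNKNOWN


# Hypothetical lifecycle, kept next to the enum so readers see both together.
LEGAL_TRANSITIONS = {
    ShipmentStatus.CREATED: {ShipmentStatus.IN_TRANSIT, ShipmentStatus.CANCELLED},
    ShipmentStatus.IN_TRANSIT: {ShipmentStatus.OUT_FOR_DELIVERY, ShipmentStatus.EXCEPTION},
    ShipmentStatus.OUT_FOR_DELIVERY: {ShipmentStatus.DELIVERED, ShipmentStatus.EXCEPTION},
}


def can_transition(current: ShipmentStatus, target: ShipmentStatus) -> bool:
    """Return True only for transitions listed in the table above."""
    return target in LEGAL_TRANSITIONS.get(current, set())
```

The member names are Python-side labels and can change freely; the lowercase string values are the stable API values.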
Anti-pattern: stuffing two concepts into one enum.

    # BAD - mixes status and exception reason
    ShipmentStatus:
      values: [created, in_transit, delivered, lost, damaged, customs_held]

    # GOOD - separate concerns
    ShipmentStatus:
      values: [created, in_transit, delivered, exception]
    ExceptionReason:
      values: [lost, damaged, customs_held, address_invalid]

`Shipment.status` and `Shipment.exceptionReason` can now evolve independently.
Structs
A struct is a property whose value is itself a typed record:

    properties:
      - name: dimensions
        type: struct
        fields:
          - { name: length, type: double }
          - { name: width, type: double }
          - { name: height, type: double }
          - { name: unit, type: enum<LengthUnit> }

Use structs when fields belong together and have no meaning apart from each other. Splitting `dimensions` into `length`, `width`, `height` at the top level invites bugs: someone reads three of them, forgets the unit, mixes inches with centimeters.
When to prefer a separate object type instead of a struct: if the struct has its own identity, can be referenced from multiple places, or needs to be queried independently. Address is on the edge — small businesses keep it as a struct; large ones promote it to an object type so they can deduplicate.
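A struct keeps the unit attached to the numbers it qualifies. The following Python sketch models the `dimensions` struct as a frozen dataclass; the `LengthUnit` members and the `volume_cm3` helper are assumptions made for illustration, since the lesson does not list the enum's values.

```python
from dataclasses import dataclass
from enum import Enum


class LengthUnit(Enum):
    CM = "cm"
    IN = "in"


# Conversion factors to centimeters (1 inch is exactly 2.54 cm).
_CM_PER_UNIT = {LengthUnit.CM: 1.0, LengthUnit.IN: 2.54}


@dataclass(frozen=True)
class Dimensions:
    """The struct travels as one value, so the unit cannot be forgotten."""

    length: float
    width: float
    height: float
    unit: LengthUnit

    def volume_cm3(self) -> float:
        # Convert each side to centimeters before multiplying.
        factor = _CM_PER_UNIT[self.unit]
        return (self.length * factor) * (self.width * factor) * (self.height * factor)
```

Because callers receive a `Dimensions` value rather than three loose numbers, any calculation has the unit in hand and has to deal with it.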
Arrays
A property can be a typed array:

    - name: tags
      type: array<string>
    - name: stopOverHubIds
      type: array<string>
    - name: hazardCodes
      type: array<enum<HazardCode>>

Watch for:
- Arrays of mutable items: it is hard to update one entry without rewriting the whole array. If you find yourself wanting that, model the items as a linked object type.
- Order-significance: does the order matter? Document it. `route: array<string>` (origin → stop → destination) is order-significant. `tags: array<string>` is not.
- Unbounded growth: an array property growing into the millions is a sign you need a link to a separate object type.
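One way to make order-significance explicit in consuming code is to choose a container that encodes it. This Python sketch is only an analogy for the array properties above: the `ShipmentArrays` class, the `from_lists` helper, and the hub IDs in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class ShipmentArrays:
    route: Tuple[str, ...]  # order-significant: origin, stops, destination
    tags: FrozenSet[str]    # order-insignificant: membership is all that matters


def from_lists(route: List[str], tags: List[str]) -> ShipmentArrays:
    """Build the value from plain arrays as they might arrive over an API."""
    return ShipmentArrays(route=tuple(route), tags=frozenset(tags))
```

Two values built from `["HUB_A", "HUB_B"]` and `["HUB_B", "HUB_A"]` compare as different routes, while tags given in a different order compare as equal.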
Attachments
Attachments are a special property type for files referenced by an object:

    - name: proofOfDeliveryPhoto
      type: attachment
      contentTypes: ["image/jpeg", "image/png"]
      maxSizeBytes: 5242880  # 5 MiB

The file content lives in object storage; the property holds a reference. The ontology layer takes care of:
- Streaming the file when requested.
- Enforcing content type and size limits.
- Applying the same security policies as any other property.
Use attachments for proofs, contracts, photos, generated PDFs — anything binary that belongs to a specific object instance.
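To make the reference-plus-limits idea concrete, here is a minimal Python sketch of the content type and size checks. The `AttachmentRef` shape, the `storage_key` field, and `validate_attachment` are assumptions for illustration; the limits are copied from the `proofOfDeliveryPhoto` example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AttachmentRef:
    """What the property holds: a pointer to the file, not the bytes themselves."""

    storage_key: str   # hypothetical object-storage key
    content_type: str
    size_bytes: int


ALLOWED_CONTENT_TYPES = frozenset({"image/jpeg", "image/png"})
MAX_SIZE_BYTES = 5 * 1024 * 1024  # 5242880, as in the example above


def validate_attachment(ref: AttachmentRef) -> List[str]:
    """Return a list of violations; an empty list means the reference is acceptable."""
    errors: List[str] = []
    if ref.content_type not in ALLOWED_CONTENT_TYPES:
        errors.append(f"content type {ref.content_type!r} is not allowed")
    if ref.size_bytes > MAX_SIZE_BYTES:
        errors.append(f"size {ref.size_bytes} exceeds {MAX_SIZE_BYTES} bytes")
    return errors
```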
Nullability
Every property is nullable or non-nullable. Choose deliberately.
- Non-nullable is the default to aim for. It is a contract: every consumer can assume the value is present.
- Nullable is for properties that legitimately may not have a value yet (`Shipment.deliveredAt` before delivery) or do not apply to every instance.
The mistake to avoid: making everything nullable “just in case.” That pushes null-handling onto every consumer and erodes the value of typing.
If a field is nullable, the reason must be obvious from context or stated in the description.
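The cost of nullability shows up in consumer code. In this Python sketch, only `delivered_at` is optional, so only code that touches it needs a null check; the `Shipment` fields other than `deliveredAt` and the `time_to_deliver` helper are assumed for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Shipment:
    shipment_id: str                         # non-nullable: always present
    created_at: datetime                     # non-nullable: always present
    delivered_at: Optional[datetime] = None  # null until delivery is confirmed


def time_to_deliver(shipment: Shipment) -> Optional[timedelta]:
    """Return the delivery duration, or None while the shipment is undelivered."""
    if shipment.delivered_at is None:
        return None
    return shipment.delivered_at - shipment.created_at
```

If every field were `Optional` "just in case", every function would need the same guard for every field it reads.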
Validation
Properties can carry validation rules beyond their type. Common ones:

    - name: customerId
      type: string
      validation:
        pattern: "^cust_[a-zA-Z0-9]{8,16}$"
    - name: weightKg
      type: double
      validation:
        min: 0
        max: 50000  # nothing heavier than 50 metric tons in our system
    - name: emailAddress
      type: EmailAddress  # built-in semantic type does this for you
    - name: scheduledAt
      type: timestamp
      validation:
        notInPast: true

Validation at the property level is enforced on every write: through actions, through ingestion, through APIs. Catching bad data at the boundary keeps the rest of the model honest.
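The rules above can be read as small predicates that run on every write. This Python sketch mirrors the pattern, range, and not-in-past rules from the example; the function names are hypothetical, and passing `now` in explicitly is a choice made here to keep the check deterministic and testable.

```python
import re
from datetime import datetime

# Same pattern as the customerId rule above.
CUSTOMER_ID_PATTERN = re.compile(r"^cust_[a-zA-Z0-9]{8,16}$")


def validate_customer_id(value: str) -> bool:
    """True when the whole value matches the customerId pattern."""
    return CUSTOMER_ID_PATTERN.fullmatch(value) is not None


def validate_weight_kg(value: float) -> bool:
    """True when the weight is within the min/max bounds (NaN is rejected)."""
    return 0 <= value <= 50000


def validate_scheduled_at(value: datetime, now: datetime) -> bool:
    """True when the scheduled instant is not in the past.

    Both arguments are assumed to be timezone-aware; Python raises TypeError
    when comparing a naive datetime with an aware one.
    """
    return value >= now
```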
Property descriptions — write them
A property without a description is a debt. The cost shows up the third or fourth time someone asks “is this in seconds or milliseconds?”
Good descriptions answer:
- Units, if not obvious from the name.
- Source — where does this value come from?
- Nullability rationale — when and why is this null?
- Edge cases — known exceptions.
Example:

    - name: deliveredAt
      type: timestamp
      nullable: true
      description: >
        UTC timestamp when the recipient signed for the shipment.
        Null until delivery is confirmed through the markDelivered action.
        For exceptions (lost, damaged) this remains null even when the
        shipment lifecycle has ended.

A future engineer reading this will not have to guess what null means.
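If descriptions matter this much, it can help to check for them mechanically. The following Python sketch assumes property definitions have already been loaded into plain dictionaries with `name` and `description` keys; `missing_descriptions` is a hypothetical helper, not a platform feature.

```python
from typing import Dict, List


def missing_descriptions(properties: List[Dict[str, object]]) -> List[str]:
    """Return the names of properties whose description is absent or blank."""
    missing: List[str] = []
    for prop in properties:
        description = str(prop.get("description") or "").strip()
        if not description:
            missing.append(str(prop["name"]))
    return missing
```

A check like this could run in review or CI so that undocumented properties are caught before anyone has to ask about units.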
A worked example — Customer revisited
Putting the typing lessons together:

    objectType: Customer
    properties:
      - name: customerId
        type: string
        validation: { pattern: "^cust_[a-zA-Z0-9]{8,16}$" }
        description: Stable identifier minted at signup.
      - name: companyName
        type: string
      - name: primaryEmail
        type: EmailAddress
      - name: billingAddress
        type: struct
        fields:
          - { name: line1, type: string }
          - { name: line2, type: string, nullable: true }
          - { name: city, type: string }
          - { name: region, type: string }
          - { name: postalCode, type: string }
          - { name: country, type: enum<CountryCode> }
      - name: headquartersLocation
        type: GeoPoint
        nullable: true
      - name: tier
        type: enum<CustomerTier>  # bronze | silver | gold | platinum
      - name: tags
        type: array<string>
      - name: signedAt
        type: timestamp
      - name: lifetimeValueUsd
        type: Money
        derived: true
        description: Computed by function customerLifetimeValue(customer)

Every property is typed, validated where it matters, scoped, and documented.
Key takeaways
- The ontology’s type system is your single biggest leverage point for data quality.
- Prefer semantic types (`Money`, `EmailAddress`, `GeoPoint`) over raw primitives.
- Use enums for closed sets; reserve `unknown` if the world is not closed.
- Structs group fields that belong together; arrays model collections; attachments model files.
- Choose nullability deliberately, and always describe properties so future readers do not guess.
What’s next
You now have well-typed objects sitting in isolation. The next lesson connects them: link types — how object types relate to each other, and how to model relationships that hold up over time.
Strongly typed. Strongly opinionated. 💪