
Data Model

Uniflow uses a multi-table DynamoDB design with 6 dedicated tables. Each entity type has its own table for clarity and independent scaling.

| Table | Partition key | Sort key | GSI | Description |
| --- | --- | --- | --- | --- |
| profilesTable | userId (S) | sortKey (S) | — | User profile (`META`) and event history (`EVENT#ts#id`); TTL enabled, PITR on |
| identityTable | anonymousId (S) | — | — | Maps an anonymous visitor to a known userId |
| sourcesTable | id (S) | — | writeKeyHashIndex on writeKeyHash | Event sources with write-key authentication |
| destinationsTable | id (S) | — | — | Destination connector configurations |
| segmentsTable | id (S) | — | — | Segment definitions (rules) |
| segmentMembersTable | segmentId (S) | userId (S) | — | User–segment membership records |

All tables use PAY_PER_REQUEST (on-demand) billing, so you pay only for actual reads and writes with no capacity planning, and a RETAIN removal policy, so table data survives stack deletion.
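As a sketch, the profilesTable above could be provisioned with DynamoDB's low-level CreateTable API roughly as follows. The key schema, attribute types, and billing mode come from the table above; expressing it as a plain boto3-style request dict is an assumption (in practice these tables may be declared via CDK or CloudFormation instead).

```python
# Sketch of a CreateTable request for profilesTable, assuming boto3-style
# request shapes. Key schema and billing mode mirror the table above.
profiles_table_spec = {
    "TableName": "profilesTable",
    "AttributeDefinitions": [
        {"AttributeName": "userId", "AttributeType": "S"},
        {"AttributeName": "sortKey", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "userId", "KeyType": "HASH"},    # partition key
        {"AttributeName": "sortKey", "KeyType": "RANGE"},  # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # on-demand: no capacity planning
}

# To actually create the table:
# import boto3
# boto3.client("dynamodb").create_table(**profiles_table_spec)
```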

Common access patterns:

Table: profilesTable
Key: { userId: "user_123", sortKey: "META" }
→ Returns the merged profile with all traits

Table: profilesTable
Key: { userId: "user_123", sortKey begins_with "EVENT#" }
→ Returns all events for the user, sorted by timestamp

Table: identityTable
Key: { anonymousId: "abc-123" }
→ Returns the linked userId for an anonymous visitor

Table: sourcesTable
GSI: writeKeyHashIndex
Key: { writeKeyHash: "<sha256>" }
→ Returns the source matching a write key

Table: segmentMembersTable
Key: { segmentId: "seg_456" }
→ Returns all users in a segment
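These patterns translate directly into low-level DynamoDB requests. A minimal sketch, assuming boto3-style request shapes; table, index, and attribute names come from this page, while the helper names and the example write key are illustrative:

```python
import hashlib


def profile_events_query(user_id: str) -> dict:
    """Query request for all of a user's events, sorted by the EVENT#ts#id sort key."""
    return {
        "TableName": "profilesTable",
        "KeyConditionExpression": "userId = :uid AND begins_with(sortKey, :p)",
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":p": {"S": "EVENT#"},
        },
    }


def identity_lookup(anonymous_id: str) -> dict:
    """GetItem request mapping an anonymous visitor to a known userId."""
    return {
        "TableName": "identityTable",
        "Key": {"anonymousId": {"S": anonymous_id}},
    }


def source_by_write_key(write_key: str) -> dict:
    """Query the writeKeyHashIndex GSI; the raw write key is hashed before lookup."""
    digest = hashlib.sha256(write_key.encode()).hexdigest()
    return {
        "TableName": "sourcesTable",
        "IndexName": "writeKeyHashIndex",
        "KeyConditionExpression": "writeKeyHash = :h",
        "ExpressionAttributeValues": {":h": {"S": digest}},
    }


# Usage, e.g.: boto3.client("dynamodb").query(**profile_events_query("user_123"))
```

Hashing the write key before the GSI lookup means the raw key never needs to be stored in the table, only its SHA-256 digest.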

Raw events are also stored in S3 via Kinesis Firehose as GZIP-compressed NDJSON (newline-delimited JSON), partitioned by date:

s3://uniflow-events-{account}/
  raw/
    year=2025/
      month=03/
        day=08/
          events-00001.json.gz
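To make the storage format concrete, here is a small sketch that builds the date-partitioned key prefix and round-trips events through GZIP-compressed NDJSON, the same layout shown above. The helper names and sample event fields are illustrative, not part of Uniflow:

```python
import gzip
import json
from datetime import datetime


def partition_prefix(ts: datetime) -> str:
    """Build the raw/ date-partition prefix used in the bucket layout above."""
    return f"raw/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"


def to_ndjson_gz(events: list[dict]) -> bytes:
    """Serialize events as newline-delimited JSON, then GZIP-compress."""
    body = "".join(json.dumps(e) + "\n" for e in events)
    return gzip.compress(body.encode("utf-8"))


def from_ndjson_gz(blob: bytes) -> list[dict]:
    """Inverse of to_ndjson_gz: decompress, then parse one event per line."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```

One event per line means downstream readers (Glue, Athena, a plain `zcat`) can stream the file without loading a whole JSON array into memory.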

This data lake powers the Glue PySpark audience builder for segment evaluation. A Glue Catalog table provides the schema for SQL access. Segment membership results are also written to S3 as Parquet files under s3://{processed-bucket}/segments/{segment_id}/members.parquet.
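The membership output path can be derived mechanically, and segment evaluation itself reduces to filtering profiles against a rule. A pure-Python sketch: the equality-only rule format below is an assumption for illustration, since the real audience builder runs as a Glue PySpark job against the Catalog table and may support richer operators:

```python
def members_key(processed_bucket: str, segment_id: str) -> str:
    """S3 location of a segment's membership Parquet file, per the layout above."""
    return f"s3://{processed_bucket}/segments/{segment_id}/members.parquet"


def evaluate_segment(profiles: list[dict], rule: dict) -> list[str]:
    """Return userIds whose traits match every key/value pair in `rule`.

    Illustrative only: assumes simple trait equality, whereas real segment
    rules may include comparisons, event counts, etc.
    """
    return [
        p["userId"]
        for p in profiles
        if all(p.get("traits", {}).get(k) == v for k, v in rule.items())
    ]
```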
