Skip to main content

Data Model

Design Principles

The schema follows four consistent patterns across all tables.

1. Root + Revision Pattern

Authorable content entities — modules, lessons, assignments, and materials — use a two-table structure:

  • A root table holds the stable identity of the entity (ID, class association, created-by, soft-delete columns).
  • A revision table holds the versioned content (title, body, configuration) with a visibility_state of DRAFT or PUBLISHED.

This allows Teachers to edit draft content without affecting what students currently see, and to publish when ready.

2. Inline Draft/Published Columns

Grading rows use a different pattern. assignment_grade_rows stores both draft and published values as separate columns on a single row. This supports the operational autosave pattern where a Teacher's in-progress scoring is saved continuously without affecting what students see until explicitly published.

3. Computed Aggregates

Lesson and module grade aggregates are not stored. They are calculated at query time as weighted or simple averages of the underlying assignment grade rows. This avoids stale aggregate state.

4. Soft Delete Everywhere

No product flow performs a hard delete. Every deletable entity has:

  • deleted_at — timestamp of deletion
  • deleted_by_user_id — who performed the deletion

Rows with a non-null deleted_at are excluded from normal queries but remain in the database for audit purposes.


Entity Groups

Identity

TablePurpose
app_usersApplication user records linked to Supabase Auth

Classes

TablePurpose
classesClass roots
class_membershipsStudent and Teacher enrollment records

Content

TablePurpose
modulesModule roots
module_revisionsVersioned module content (draft/published)
lesson_unitsLesson roots
lesson_unit_revisionsVersioned lesson content (draft/published)

Materials

TablePurpose
materialsMaterial roots (uploaded files)
material_revisionsVersioned material metadata
lesson_unit_materialsJoin table linking materials to lessons
material_chunksText chunks and vector embeddings for RAG

Assignments

TablePurpose
assignmentsAssignment roots
assignment_revisionsVersioned assignment content and configuration
assignment_submissionsStudent submission records

Grading

TablePurpose
assignment_grade_rowsPer-student grade rows with inline draft and published columns

Chat

TablePurpose
chatsChat session records (scoped to student + context)
chat_messagesIndividual messages within a chat
chat_message_citationsSource chunk citations attached to assistant messages
chat_message_feedbackStudent thumbs-up/down feedback on messages

Admin

TablePurpose
app_configGlobal application configuration key-value store
admin_model_backendsConfigured LLM provider connections
admin_modelsIndividual model definitions within a backend
admin_knowledge_settingsRAG retrieval and knowledge base configuration

Audit

TablePurpose
audit_logsImmutable log of auditable events

Key Enumerations

EnumValues
global_roleUNASSIGNED, STUDENT, TEACHER, ADMIN
visibility_stateDRAFT, PUBLISHED
processing_statusPENDING, PROCESSING, READY, FAILED
message_roleUSER, ASSISTANT
generation_statusSTREAMING, COMPLETED, STOPPED, FAILED
source_modeLESSON, CLASS, GENERAL

PostgreSQL Extensions

The following extensions must be enabled before running migrations:

ExtensionPurpose
pgcryptoUUID generation and cryptographic functions
vector (pgvector)Vector column type and similarity search operators
citextCase-insensitive text type (used for email columns)

Vector Index

Material chunk embeddings are indexed for approximate nearest-neighbor search:

  • Index type: IVFFlat
  • Distance metric: Cosine similarity
  • Dimension: VECTOR(1536) — matches the OpenAI text-embedding-3-small output dimension. This value is configurable if a different embedding model is used.

When switching embedding models, the vector dimension in the schema and the index must be updated, and all existing embeddings must be re-generated.