Container taxonomy — data model rework

companion: multipart-prototypes.html (behavior) · research.md (full record) · 2026-06-10 · §6/D7 (above the assignment) added 2026-06-11

PROPOSAL — NOTHING DECIDED · for team discussion

0 · Read this first: the decision load is smaller than it looks

ALREADY DECIDED · Your team's rename: knowledge_components → question_containers, and assignment slots point at the container. This doc builds on that; it doesn't reopen it.

FORK №1 · D1: what is a multipart? A new sequence_container entity (this proposal, sections 3–5) or a MULTIPART question type (the earlier, smaller design). Both work; both are prototyped or designed.

FORK №2 · 06-11 · D7: the tree ABOVE the assignment. This doc modeled below the assignment; the curriculum needs units → sections → lessons above it (today: only flat modules). Typed tables (proposed, section 6) or nested modules. Same rework-appetite question as D1, so measure both in one spike.

FOLLOW FROM D1 · D2–D4 are defaults that come with the entity choice (steps point at containers; assignments stay assignments; typed tables, no polymorphism). If D1 goes the other way, they evaporate. Collapsed below; read only if you want to challenge a default.

ALREADY PROVEN · D5–D6 (resources owned by containers; responses on served questions, rollups as projections) are running in the interactive prototypes. Carried forward, not open.

NEXT STEP = MEASURE · Not a decision: a migration spike. Write the real rename + assignment_items migration locally, run typecheck, count what breaks. D1 gets decided with a number, not a feeling.

1 · Today: what the tables actually do

The example throughout: a teacher assigns practice on equivalent ratios, plus the grape-catch mini. Here's how today's schema holds (and fails to hold) that.

knowledge_components …is secretly a CONTAINER: it groups interchangeable variations so serving can pick one

id	name	note
#501	Equivalent ratios	really: a variation container

questions variations point at their KC

id	stem	knowledge_component_id
#9301	Which ratio is equivalent to 2:3? (variation A)	#501
#9302	Which ratio is equivalent to 3:4? (variation B)	#501
#9303	Which ratio is equivalent to 1:5? (variation C)	#501

assignment_questions ⚠ points at ONE raw variation — serving must hop question → KC → siblings

assignment_id	question_id	order	note
#77	#9301	1	indirection: the slot means "KC #501" but says "question #9301"
#77	#4000	2	the mini: ONE giant D3 bundle question, ONE giant response

Three problems: (1) the KC table is a container wearing a pedagogy costume; (2) assignment slots point at a raw variation and the container is recovered by indirection; (3) sequences (minis/decks/testlets) don't fit at all — they get crushed into one mega-question. And resources hang off to the side, unable to participate in any of it.
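Problem (2) is easiest to see as code. The sketch below is illustrative only: it holds the example rows in plain dictionaries (a hypothetical stand-in for the real tables) and walks the hop that serving has to make today, from the slot's raw question to its KC and back out to the siblings.

```python
# Hypothetical in-memory copies of the example rows; not the real data access layer.
questions = {
    9301: {"stem": "Which ratio is equivalent to 2:3?", "knowledge_component_id": 501},
    9302: {"stem": "Which ratio is equivalent to 3:4?", "knowledge_component_id": 501},
    9303: {"stem": "Which ratio is equivalent to 1:5?", "knowledge_component_id": 501},
}
assignment_questions = [
    {"assignment_id": 77, "question_id": 9301, "order": 1},
]

def siblings_for_slot(slot):
    """Recover the variation set for a slot: question -> KC -> siblings."""
    kc_id = questions[slot["question_id"]]["knowledge_component_id"]
    return sorted(
        qid for qid, q in questions.items()
        if q["knowledge_component_id"] == kc_id
    )

variations = siblings_for_slot(assignment_questions[0])
```

The slot stores 9301, but what serving actually needs is the whole set for KC #501. After the rename the slot would store the container id directly and this hop disappears.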

2 · The rename: your team's plan, unchanged

Same rows, honest names. knowledge_components → question_containers (a variation set: members are interchangeable, serving picks ONE) and assignment slots point at the container, not a raw variation.

today

knowledge_components: #501 · Equivalent ratios

assignment_questions: #77 → question #9301 · order 1

after rename

question_containers (serve-ONE-of): #501 · Equivalent ratios · variation container

assignment_items: #77 → question_container #501 · position 1 · slot says what it means

3 · The proposal: add exactly ONE sibling, the sequence container

A question_container says "serve ONE of my members" (variations, unordered). The new sequence_container says "serve ALL of my members, in order" (steps: questions and resources). That one sibling is the entire multipart/deck/mini/testlet model. Fixed depth, every arrow a real foreign key:

assignment #77 · the teacher-assignable unit, name and identity unchanged
  question_container #501 · "Equivalent ratios" · serve ONE
    question #9301 · variation A
    question #9302 · variation B
    question #9303 · variation C
  sequence_container #70 · "Grape Catch mini" · serve ALL in order · template/nav/feedback config lives here
    resource #61 · dot plot · owned by #70 (context)
    question_container #511 · step 1 · serve ONE
      question #9311 · Sakeem=3
      question #9312 · Sakeem=4
    question_container #512 · step 2 · serve ONE
      question #9321 · count 5s
      question #9322 · count 2s

The payoff hiding in this picture: because sequence steps point at question_containers (not raw questions), every step of a mini serves variations — "GO AGAIN" deals fresh numbers, retries and spaced review get parallel variants — using the same serving logic assignment slots already use. The hardest open question from this week (multipart variations) is answered by the structure itself.

External precedent (added 2026-06-11): 1EdTech QTI — the dominant assessment interchange standard — independently converged on the same two semantics: an assessmentSection carries selection + ordering rules ("select 1 of N, shuffled" = serve ONE; "present all, in order" = serve ALL), and QTI item templates (templated variables, re-randomized per attempt) are exactly the fresh-numbers variation story. We're not inventing a novel shape — we're giving a typed home to a twenty-year-old one. Bonus: a future QTI import path (existing item banks) maps onto these tables nearly 1:1.

4 · Every table, every row, for that one picture

assignment_items replaces assignment_questions · CHECK: exactly one FK set · role added 2026-06-11 (instructional | practice | check — what the slot means; see section 6)

assignment_id	position	role	question_container_id	sequence_container_id
#77	1	practice	#501	NULL
#77	2	instructional	NULL	#70
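A minimal sketch of the "exactly one FK set" CHECK, using Python's built-in sqlite3 as a stand-in for Postgres. The column names come from the table above; the primary key and the exact CHECK expression are assumptions for illustration (in Postgres the same rule could be written with `num_nonnulls(...) = 1`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE assignment_items (
        assignment_id         INTEGER NOT NULL,
        position              INTEGER NOT NULL,
        role                  TEXT NOT NULL
            CHECK (role IN ('instructional', 'practice', 'check')),
        question_container_id INTEGER,
        sequence_container_id INTEGER,
        -- exactly one of the two container columns must be set
        -- (SQLite evaluates IS NOT NULL to 0 or 1, so the sum counts set columns)
        CHECK ((question_container_id IS NOT NULL)
             + (sequence_container_id IS NOT NULL) = 1),
        PRIMARY KEY (assignment_id, position)  -- assumed key, not stated in the doc
    )
""")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO assignment_items VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((77, 1, "practice", 501, None)),      # question container only
    try_insert((77, 2, "instructional", None, 70)),  # sequence container only
    try_insert((77, 3, "practice", 501, 70)),        # both set: rejected
    try_insert((77, 4, "practice", None, None)),     # neither set: rejected
]
```

The first two rows are the two rows of the table above; the last two are the shapes the constraint exists to keep out.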

sequence_containers NEW

id	title	config (jsonb)
#70	Grape Catch mini	{ template: quick-hitter, navigation: linear, feedback: immediate, context: [resource #61] }

sequence_items NEW ordered · question_container_id XOR resource_id

sequence_container_id	position	question_container_id	resource_id
#70	1	#511	NULL
#70	2	#512	NULL

question_containers renamed knowledge_components — both standalone (#501) and step (#511/#512) containers are the same thing

id	name	concept (pedagogy tag)
#501	Equivalent ratios	ratio-equivalence
#511	Read Sakeem's dot	read-dot-plot
#512	Count a value's dots	frequency

resources owner moves to the container level

id	title	owner_container_id
#61	Stimulus: grape-catch dot plot	owned by #70
#482	Passage: The Inventor's Notebook	NULL — shared library

questions and responses keep their exact shapes — questions gain nothing, responses still attach only to the served question. The mini-level rollup stays a projection (now: sequence #70 → items → containers → served-question responses).
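The projection chain in that last sentence can be sketched in a few lines. Everything here is a hypothetical in-memory stand-in: in particular the `served` mapping assumes the serving path records which variation each student received, which is exactly the unverified open call (b).

```python
# sequence_container_id -> question_container ids, in step order
sequence_items = {70: [511, 512]}

# Hypothetical served log: (student, question_container_id) -> served question id
served = {("maria", 511): 9311, ("maria", 512): 9322}

# Responses attach only to served questions: (student, question_id) -> correct?
responses = {("maria", 9311): True}

def sequence_rollup(student, sequence_id):
    """Project mini-level progress from responses on served questions."""
    steps = sequence_items[sequence_id]
    answered = correct = 0
    for container_id in steps:
        question_id = served.get((student, container_id))
        if question_id is None:
            continue  # step not served yet
        result = responses.get((student, question_id))
        if result is None:
            continue  # served but not answered yet
        answered += 1
        correct += int(result)
    return {
        "steps": len(steps),
        "answered": answered,
        "correct": correct,
        "complete": answered == len(steps),
    }

rollup = sequence_rollup("maria", 70)
```

Nothing is stored at the mini level; the rollup is recomputed from the rows that already exist.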

5 · Serving walkthrough: two students, same assignment

One rule runs everywhere: slot → container → serve ONE member, for the plain practice slot AND for every step inside the mini. The trace below is Maria's.

slot 1 → question_container #501 → serve ONE → #9302 (variation B)

slot 2 → sequence_container #70 → serve ALL in order:

context → resource #61 (dot plot) displayed, never answered

step 1 → question_container #511 → serve ONE → #9311 (Sakeem=3)

step 2 → question_container #512 → serve ONE → #9322 (count 2s)

responses for Maria attach to: #9302 #9311 #9322 — never to the containers

…and on GO AGAIN, the mini's steps re-serve: fresh variations from the same containers. Response rows attach to the served question ids above, which is why the data model must record which variation was served (open call "b").
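The walkthrough above, as a rough sketch. The data structures are hypothetical in-memory stand-ins for the tables in section 4, and the random pick is only a placeholder: the doc does not specify how serving chooses among variations.

```python
import random

question_containers = {501: [9301, 9302, 9303], 511: [9311, 9312], 512: [9321, 9322]}
sequence_containers = {70: {"context": [61], "steps": [511, 512]}}
assignment_items = [
    {"position": 1, "question_container_id": 501, "sequence_container_id": None},
    {"position": 2, "question_container_id": None, "sequence_container_id": 70},
]

def serve_one(container_id, rng):
    """The one rule: pick ONE member of a question container."""
    return rng.choice(question_containers[container_id])

def serve_assignment(items, rng):
    """Return (question_container_id, served_question_id) pairs in serving order."""
    served = []
    for item in sorted(items, key=lambda i: i["position"]):
        if item["question_container_id"] is not None:
            container_ids = [item["question_container_id"]]
        else:
            # serve ALL steps in order; context resources are displayed, never answered
            container_ids = sequence_containers[item["sequence_container_id"]]["steps"]
        for container_id in container_ids:
            served.append((container_id, serve_one(container_id, rng)))
    return served

first_attempt = serve_assignment(assignment_items, random.Random(1))
go_again = serve_assignment(assignment_items, random.Random(2))
```

Both calls walk the same containers in the same order; only the served question ids can differ. The returned pairs are the linkage a served-question record would need to persist.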

6 · Above the assignment: the missing half of the tree

Sections 1–5 model everything BELOW the assignment. Above it, today's schema has exactly one grouping primitive, modules, while the curriculum being authored (NY Grade 6 Math) has a real tree: unit → section → lesson, with the lesson owning four assignments by role (ONE interleaved building-blocks assignment + the three synthesis assignments) and the unit owning the unit test. Inside an assignment, items carry their own role (instructional | practice | check); that item-level role is what lets building blocks be one assignment instead of three (see the #210 table below).

Naming (settled 2026-06-11): the entity is called lessons. The domain word wins: teachers and the scope & sequence already say "Lesson 5," and an invented term would tax every conversation forever. The collision with today's config.mode: "lesson" resolves by deprecating that mode value, not by renaming the entity. A deck or mini-series was never a "lesson"; it's the INSTRUCTIONAL part of one. So the item role is instructional and the mode value dies with no successor (mode keeps only how-it-runs values: sequential, assessment, mastery, survey, collab). Bonus: checkType: "lesson" becomes correct, because it finally points at a real lesson. The takeaway from Open edX, properly read: one name per thing, aligned WITH domain language. Calling it "cluster" internally while every human says "lesson" would have re-created their mess.

modules + assignment_modules today's ONLY above-assignment grouping — flat, ≤1 module per assignment (unique index), teacher-owned (created_by → teacher_profiles, is_locked, teacher_modules)

module	assignment_id	order	note
"Unit 0" (a module wearing a unit costume)	#206	1	no sections, no lessons, no roles — one flat level

Same disease as section 1, one level up: the module table is a unit wearing a classroom costume, and everything the curriculum knows ("this is the mastery check of Lesson 5 in Section B of Unit 0") has nowhere to live. checkType: "section" sitting in assignment config today is the fossil of someone needing section-level attachment with no table to put it in.

course "NY Grade 6 Math" · course identity, already exists
  unit "Fractions & Decimal Operations" · position 0 → display "Unit 0" · owns the unit test + assessment bank
    section "Multiplying Fractions" · position 2 → "Section B" · the writer's unit of thought
      lesson "1/n × Whole" · numbering derives GLOBALLY across sections → "Lesson 5" · the unit of progress gating · also owns its LEARNING OBJECTIVES
        learning objectives · "SWBAT…" ×N · edges → concepts (primary KCs, via CTA) and → standards
        #210 · bb · ONE assignment: per-KC minis + practice + checks, INTERLEAVED (items carry role)
        #204 · syn-instructional · the deck
        #205 · syn-practice
        #206 · syn-check · the mastery check

Every leaf in this tree is a plain assignment — the entity sections 3–5 dissect, unchanged (that's D3 doing its job). The lesson layer doesn't touch serving; it gives "the check for this lesson" a foreign key instead of a naming convention. And the role makes mode/checkType stop carrying structural weight they were never meant to hold.

units / sections / lessons NEW three typed tables, fixed depth — D4's logic, one level up · all carry external_id (uuid, unique) so publish is an idempotent upsert

table	parent fk	position	example row
units	course_id	0	Fractions & Decimal Operations
sections	unit_id	2	Multiplying Fractions
lessons	section_id	5	1/n × Whole

Position is data, identity is not: display numbering ("Unit 0", "Section B", "Lesson 5") derives from position; slugs/external_ids never encode it. Reordering is an UPDATE, not a cascade of renames — the same rule the authoring repo already enforces for its directory names.
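A small sketch of "position is data, identity is not." The label formats are inferred from the examples above (position 0 → "Unit 0", position 2 → "Section B"); the external ids and the choice to start lesson numbering at 1 are assumptions for illustration. Sorting by (section position, lesson position) gives the global number whether lesson positions are stored per section or across the whole unit.

```python
def unit_label(position):
    # The unit's position is displayed as-is: position 0 -> "Unit 0".
    return f"Unit {position}"

def section_label(position):
    # Position 2 displays as "Section B", so letters are assumed to start at position 1.
    return "Section " + chr(ord("A") + position - 1)

def lesson_numbers(lessons):
    """Number lessons globally: sort by (section position, lesson position), count from 1."""
    ordered = sorted(lessons, key=lambda row: (row["section_position"], row["position"]))
    return {row["external_id"]: n for n, row in enumerate(ordered, start=1)}

# Hypothetical external ids; note that none of them encodes a number.
lessons = [
    {"external_id": "l-a", "section_position": 1, "position": 1},
    {"external_id": "l-b", "section_position": 1, "position": 2},
    {"external_id": "l-c", "section_position": 1, "position": 3},
    {"external_id": "l-d", "section_position": 2, "position": 1},
    {"external_id": "l-e", "section_position": 2, "position": 2},
]
before = lesson_numbers(lessons)

# Reordering is an UPDATE of position: swap the two lessons in Section B.
lessons[3]["position"], lessons[4]["position"] = 2, 1
after = lesson_numbers(lessons)
```

After the swap the display numbers change and the external ids do not, which is the property that keeps publish idempotent.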

lesson_assignments NEW role is a closed enum of four · UNIQUE (lesson_id, role)

lesson_id	role	assignment_id
#12	bb	#210
#12	syn-instructional	#204
#12	syn-practice	#205
#12	syn-check	#206
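The two constraints named in the caption (a closed enum of four roles, and UNIQUE (lesson_id, role)) can be sketched with sqlite3 standing in for Postgres. The CHECK-based enum is an assumption; a Postgres enum type would serve the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE lesson_assignments (
        lesson_id     INTEGER NOT NULL,
        role          TEXT NOT NULL
            CHECK (role IN ('bb', 'syn-instructional', 'syn-practice', 'syn-check')),
        assignment_id INTEGER NOT NULL,
        UNIQUE (lesson_id, role)  -- at most one assignment per role per lesson
    )
""")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO lesson_assignments VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted = [
    try_insert((12, "bb", 210)),
    try_insert((12, "syn-instructional", 204)),
    try_insert((12, "syn-practice", 205)),
    try_insert((12, "syn-check", 206)),
]
second_bb = try_insert((12, "bb", 999))          # same lesson, same role: rejected
unknown_role = try_insert((12, "homework", 998))  # not in the closed enum: rejected
```

With these two constraints, "the check for this lesson" is a lookup on (lesson_id, 'syn-check') that can return at most one row.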

assignment_items for #210 role ON THE ITEM · 2026-06-11 the building-blocks interleave, one KC at a time: learn it → practice it → prove it · role = instructional | practice | check

position	role	question_container_id	sequence_container_id	note
1	instructional	NULL	#71	KC-1's mini (quick-hitter)
2	practice	#551	NULL	KC-1 practice
3	practice	#552	NULL	KC-1 practice
4	check	#561	NULL	KC-1 check — mastery semantics key off the ITEM role
5	instructional	NULL	#72	KC-2's mini
6	practice	#553	NULL	KC-2 practice
7	practice	#554	NULL	KC-2 practice
8	check	#562	NULL	KC-2 check

Why role lives on the item: today the only way to say "this is practice" vs "this is a check" is the assignment-level mode — which is exactly why the curriculum first looked like six assignments per lesson. With role on assignment_items, per-item behavior (check items gate + lock; practice retries freely; instructional items teach) coexists inside one assignment, and the assignment's mode shrinks to pure orchestration — how the assignment RUNS (sequential, assessment lockdown, mastery scheduling, …). mode: "lesson" is deprecated with NO successor: it answered "what is this," and that job moved to item roles + sequence templates. (Legacy Sidekick lessons keep it until the player keys off roles/templates; then the enum value dies. The same fate eventually awaits mode: "check", but checks have live mastery behavior keyed on mode — later cleanup, not now.) A "block" needs no table: it's the span from one instructional item to the next, and remediation ("check at position 4 failed → re-serve positions 1–3, fresh variations") is just reading the order. Open question: synthesis could collapse the same way (one assignment: deck → practice → mastery check) — kept as three pending the product call on independent scheduling.
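The "a block needs no table" claim can be checked against the #210 rows. This is a sketch under one assumption: a block starts at the nearest instructional item before the failed check and ends just before the check itself.

```python
# (position, role) pairs from the assignment_items table for #210
items = [
    (1, "instructional"), (2, "practice"), (3, "practice"), (4, "check"),
    (5, "instructional"), (6, "practice"), (7, "practice"), (8, "check"),
]

def remediation_span(items, failed_check_position):
    """Positions to re-serve when the check at the given position fails."""
    ordered = sorted(items)
    roles = dict(ordered)
    if roles.get(failed_check_position) != "check":
        raise ValueError("position does not hold a check item")
    # The block starts at the latest instructional item before the check;
    # if there is none, fall back to the first item of the assignment.
    start = max(
        (pos for pos, role in ordered
         if role == "instructional" and pos < failed_check_position),
        default=ordered[0][0],
    )
    return [pos for pos, _ in ordered if start <= pos < failed_check_position]

first_block = remediation_span(items, 4)
second_block = remediation_span(items, 8)
```

The span is derived purely from order and role, so inserting or removing a practice item changes the block without any schema change.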

unit_assignments NEW the unit test lives at unit level · section-level attachment deliberately NOT built until something real needs it

unit_id	role	assignment_id
#3	unit-test	#200 (mode: assessment)

learning_objectives NEW · 2026-06-11 owned by the lesson · NO LO table exists in today's schema at all — net-new, rides the same migration

id	lesson_id	text	edges
#31	#12	SWBAT see 1/n × m as repeated addition	lo_concepts → primary KCs (via CTA + curation) · lo_standards → standards

Why LOs are entities, not prose: the builder flow gives them three structural jobs — CTA tags which KCs are primary per LO; LOs link to standards (with "stock LOs" generated for any KC not covered through an LO, so the KC→standard chain never breaks); LOs tie to the lesson's assignments. Edges need identity; a markdown bullet can't hold them. Open question (flagged in the Curriculum Building doc itself): do LO↔concept edges carry parameters, the way item↔concept edges carry weights (3 gatekeeper / 2 required / 1 supporting)? Undecided — the table ships without params and they're additive later.

LOs are lesson-owned, 1:N — shared-LO (M:N) considered and rejected (2026-06-11). The cross-lesson shared layer already exists: concepts (the registry) and standards (external, fixed). An LO is the lesson's LOCAL framing — "this lesson intends these concepts, stated for humans." When two lessons "share an objective," the shared truth is the concepts their LOs point at — already queryable. Making LOs shared would mint a second course-wide vocabulary (concepts with SWBAT phrasing) and drag drafty lesson-local statements into the registry's edit-affects-everyone consistency class. Two lessons with identical LO text = two rows; spiraling = progressively different LOs over the same concepts. Two named tripwires (cheap, additive if hit): real LO reuse appears → add the M:N join then; an assignment genuinely needs to live in TWO lessons (today: UNIQUE(lesson_id, role) + the repo's owned dirs = exclusive ownership, copy-per-lesson) → revisit the lesson's structure/meaning split then. And on that conflation worry: lessons holding both grouping and pedagogy is fine where knowledge_components wasn't, because the KC table's two jobs PULLED APART (1:N variation grouping vs M:N weighted tagging) while a lesson's meaning just IS its membership + order + LOs + gate — same fact, no divergence, no pressure.

authoring layer (the workspace repo) — generic on purpose

Dirs + files, any shape; course.json/unit.json manifests give the tree meaning; a reconciler renders it and degrades gracefully (typo'd manifest → raw tree + notice, content never hidden). Draft-tolerant, branch-isolated, git-durable. This layer ALREADY EXISTS and already authors exactly this tree.

published layer (postgres) — typed on purpose

Publish runs the reconciler ONCE and persists its output by external_id. Students never receive an uninterpreted tree — there is no acceptable student-dashboard equivalent of "manifest invalid, here are raw directories." The repo gets to stay permissive BECAUSE the DB isn't.
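"Persists its output by external_id" could look roughly like the upsert below. sqlite3 stands in for Postgres (both accept INSERT ... ON CONFLICT ... DO UPDATE); the column set beyond external_id, course_id and position, and the sample uuid, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE units (
        id          INTEGER PRIMARY KEY,
        external_id TEXT NOT NULL UNIQUE,  -- stable identity from the authoring repo
        course_id   INTEGER NOT NULL,
        position    INTEGER NOT NULL,
        title       TEXT NOT NULL
    )
""")

def publish_unit(external_id, course_id, position, title):
    """Insert the unit, or update it in place if this external_id was published before."""
    conn.execute(
        """
        INSERT INTO units (external_id, course_id, position, title)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (external_id) DO UPDATE SET
            course_id = excluded.course_id,
            position  = excluded.position,
            title     = excluded.title
        """,
        (external_id, course_id, position, title),
    )

UNIT_UUID = "3f0c1d9e-0000-4000-8000-000000000003"  # hypothetical external_id
publish_unit(UNIT_UUID, 1, 0, "Fractions & Decimal Operations")
first_id = conn.execute("SELECT id FROM units").fetchone()[0]

publish_unit(UNIT_UUID, 1, 0, "Fractions & Decimal Operations")  # re-publish: no new row
publish_unit(UNIT_UUID, 1, 1, "Fractions & Decimal Operations")  # reorder: plain UPDATE
rows = conn.execute("SELECT id, position FROM units").fetchall()
```

Publishing the same tree twice leaves one row with the same primary key, so anything that references the unit by id survives a re-publish.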

What happens to modules: nothing. Modules stay the teacher-facing classroom grouping they already are (different lifecycle: teacher-created, mutable, lockable) — they are simply not part of the curriculum model. Students see the tree itself: the curriculum ships with a student course view that reads units/sections/lessons directly ("you're in Unit 0 → Section B → Lesson 5") — a flat module can't even fake that (one module per unit = ~89 assignments in a flat list), and the bb interleave needs a player that understands item roles anyway. Assignment-level machinery (assigning, due dates, completion, progress) keys on assignments and keeps working with zero help. A publish-time module projection (derived module per unit, D6-style) stays in the back pocket ONLY as a fallback if some legacy surface unexpectedly needs it — not the plan of record. Retrofitting modules into the curriculum tree was considered and rejected: see D7.

External precedent here too: every major LMS standard splits along this exact seam. Common Cartridge / SCORM: a package is files + a manifest whose <organizations> tree gives meaning to generic resources — that's the repo layer, standardized. Open edX OLX: typed, fixed-depth course → chapter → sequential → vertical at runtime — the closest large-scale analog to units → sections → lessons, holding up for a decade. Nobody serves students out of a cartridge; platforms import it into their own typed model. The standards ecosystem already agrees with the two-layer split.

7 · The decisions: D1, D7, and the five that take care of themselves

Two forks are open: D1 (below the assignment) and D7 (above it, added 2026-06-11). D2–D4 are defaults that ride with D1 (expand to challenge one); D5–D6 are already running in the prototypes.

D1 · Is the multipart a separate entity (sequence_container), or a question type? · KEYSTONE

proposed · Separate entity. The team is making containers first-class (the rename). One container as a real table + another hidden in question jsonb is incoherent. Sequences and variation sets become siblings sharing one mental model.

the alternative (what we'd designed before) · MULTIPART question type (parent-as-question). Virtue: everything that takes a question id today (pivots, pickers, Sidekick navigation, dashboards) keeps working without learning a new entity. Far smaller blast radius.

Hinges on: the real rework appetite. Separate entity means every consumer of "assignment = list of questions" learns assignment_items with two FK types. If the team only wants the rename + repointing, parent-as-question is the honest choice.

D7 · The curriculum tree above the assignment: typed tables, or nested modules? · ADDED 2026-06-11

proposed · Three typed tables (units → sections → lessons) + role-keyed join tables (section 6). D4's own argument, one level up: the three levels BEHAVE differently (a unit owns the unit test + assessment bank and is the dashboard identity; a section is the writer's unit of thought; a lesson has closed role-keyed semantics, one bb + three synthesis assignments, and is the unit of progress gating). One nested table turns every difference into if (kind === …) branches, the depth-3 invariant into app-level validation, and tree reads into recursive CTEs.

the alternative (Josh floated this) · Nested modules: add parent_id (+ kind) to modules; assignment_modules grows a role. The classic adjacency-list design, smallest migration, teachers already know modules. Costs: published-curriculum rows and teacher-owned classroom rows (is_locked, teacher_modules) start sharing a table and a lifecycle; invariants can't be foreign keys.

Hinges on: the same rework appetite as D1 — fold it into the SAME migration spike and decide both with one measured number. Flips to nested only if course shapes genuinely vary in depth (sub-sections, missing levels) across the real roadmap — today's curriculum is honestly unit/section/lesson.

D2 Steps point at containers only — or raw questions? default: containers only

D2 · Do sequence steps point at containers only, or sometimes raw questions?

proposed · Containers only. Every step gets variation serving; ONE serving path. A question that has no variations gets a singleton container.

alternative · question_id XOR question_container_id on sequence_items. No wrapper rows for singletons, but two serving paths forever, and steps can't gain variations later without a migration.

Cost being accepted: thousands of one-member container rows + creation ceremony on every authored CFU. Cheap in Postgres, noisy in admin UIs unless filtered.
Stress-tested: "a sequence needs a SPECIFIC question at the right step" — covered today by singleton containers (pinned by construction); exotic cases (pin within a shared family, lockstep variation across steps) would be additive (one nullable column / serving config). Not proposed; no current requirement.
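"Pinned by construction" in miniature. The sketch is illustrative only: the id generator, the question id 9400, and the default pick function are all hypothetical. The point it shows is that a singleton container goes through the same serving function as a real variation set.

```python
import itertools

# Hypothetical in-memory stand-ins; real ids would come from the database.
_container_ids = itertools.count(9000)
question_containers = {511: [9311, 9312]}  # a real variation set from the example

def wrap_in_singleton(question_id):
    """Create a one-member container for a question that has no variations."""
    container_id = next(_container_ids)
    question_containers[container_id] = [question_id]
    return container_id

def serve_one(container_id, pick=lambda members: members[0]):
    """The single serving path: choose one member of the container."""
    return pick(question_containers[container_id])

pinned = wrap_in_singleton(9400)  # 9400 is a hypothetical question id
```

Whatever pick function serving uses, a one-member container can only ever yield its one question; adding a second variation later is a new row in the container, not a migration.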

D3 Does "assignment" stay an assignment? default: yes — rework only its pivot

D3 · Does "assignment" stay an assignment, or become a container too?

proposed · Stays an assignment. Being assignable IS the identity: due dates, states, courses, modules hang off it. It's the curation boundary (teacher composes/reorders/swaps). Rework its pivot (assignment_items), keep the entity.

alternative (Josh floated this) · assignments_container: full generalization, with one container model everywhere, one membership pattern, one CRUD service. Maximal elegance; assignment becomes container type "collection".

If the alternative wins, D4's guardrail is what stops it from sliding into the uber-container. Also: do modules join the family next? Where does it stop? (2026-06-11: answered — it stops AT the assignment. The curriculum tree above it gets its own typed layer, section 6 / D7; modules stay teacher-facing.)

D4 Typed tables with real FKs — or one polymorphic container table? default: typed, fixed depth

D4 · Typed tables with real FKs, or one generic polymorphic container table?

proposed · Typed + fixed depth (3 levels). Two named tables, nullable typed FK columns with CHECK exactly-one. Real referential integrity; queries never branch on a type column; no recursion.

alternative · containers + container_members(member_type, member_id), self-nesting. One table to rule them all, and the configuration-driven-polymorphism antipattern: no FKs, type-branching in every query, unbounded depth.
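The "no FKs" cost can be demonstrated directly. In this sketch (sqlite3 as a stand-in for Postgres, tables reduced to the columns that matter, so not the full proposed schema) the same dangling reference is rejected by the typed column and silently accepted by the polymorphic one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE question_containers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- typed: the column can only ever point at a question container
    CREATE TABLE sequence_items (
        sequence_container_id INTEGER NOT NULL,
        position              INTEGER NOT NULL,
        question_container_id INTEGER REFERENCES question_containers(id)
    );

    -- polymorphic alternative: member_id means whatever member_type says
    CREATE TABLE container_members (
        container_id INTEGER NOT NULL,
        member_type  TEXT NOT NULL,
        member_id    INTEGER NOT NULL
    );

    INSERT INTO question_containers VALUES (511, 'Read Sakeem''s dot');
""")

def accepted(sql, row):
    """Return True if the insert succeeds, False on an integrity error."""
    try:
        conn.execute(sql, row)
        return True
    except sqlite3.IntegrityError:
        return False

# Container 999 does not exist.
typed_ok = accepted("INSERT INTO sequence_items VALUES (?, ?, ?)", (70, 1, 999))
poly_ok = accepted(
    "INSERT INTO container_members VALUES (?, ?, ?)", (70, "question_container", 999)
)
```

The polymorphic table has no way to know that 999 is missing, so that check moves into application code.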

Strong opinion, weakly held only if the team foresees many more container kinds (playlists? units?) — at which point we should debate that roadmap, not pre-generalize. (2026-06-11: units/sections/lessons arrived and did NOT join the container family — they're their own typed layer above the assignment, section 6 / D7. The guardrail held.)

D5 Resources owned by containers, library when unowned proven in prototypes

D5 · Resources: owned by containers, library when unowned · carried over

proposed · owner_container_id (was owner_question_id in the prototype): a deck's slides are owned by its sequence container; NULL = shared library; sequence membership makes resources composable content. Integrity index (container↔resource) gives delete-protection + "where used?".

alternative · Status quo: resources as side-panel attachments only. (This is the thing everyone is unhappy with.)

Commitments: deep-copy owned resources on clone; delete-restrict via the index; live-ref vs snapshot policy for shared resources in assessments.
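A rough sketch of the first commitment, deep-copy on clone. The ids, the id generator, and the clone target #170 are hypothetical, and shared-library resources are referenced live here only as a placeholder: the live-ref vs snapshot policy is still an open commitment.

```python
import itertools

_resource_ids = itertools.count(1000)  # hypothetical id generator

resources = {
    61: {"title": "Stimulus: grape-catch dot plot", "owner_container_id": 70},
    482: {"title": "Passage: The Inventor's Notebook", "owner_container_id": None},
}

def clone_resource_refs(source_container_id, new_container_id, referenced_ids):
    """Return the resource ids the cloned container should reference."""
    result = []
    for resource_id in referenced_ids:
        resource = resources[resource_id]
        if resource["owner_container_id"] == source_container_id:
            # Owned resource: copy it so the clone owns its own row.
            new_id = next(_resource_ids)
            resources[new_id] = {**resource, "owner_container_id": new_container_id}
            result.append(new_id)
        else:
            # Shared library (or owned elsewhere): keep the reference as-is.
            result.append(resource_id)
    return result

cloned_refs = clone_resource_refs(70, 170, [61, 482])
```

Because the copy gets its own owner, deleting the clone can never take the original's dot plot with it.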

D6 Responses on served questions; rollups are projections proven in prototypes

D6 · Responses attach to the served variation; everything mini-level is a projection · carried over

proposed · Unchanged from the prototypes: response rows only on served child questions; completion/score = one projection function consumed by gate, dashboard, and player.

prerequisite to verify · Open call (b): does the current serving path durably record WHICH variation was served per student/slot? The rollup needs served-question linkage. Unverified; needs a code read.

No alternative really — this is the anti-giant-response principle the whole multipart effort exists for.

The whole meeting, in one sentence: argue D1 (entity vs question type) and D7 (typed lesson tree vs nested modules), accept or challenge the defaults that ride with them, and commission ONE migration spike covering both so each choice is made with a measured blast radius. Migration order if it all passes: rename → assignment_items → sequence_containers → resources owner column → units/sections/lessons + lesson_assignments (+ the publish-time module projection).