Bogdan Timofte
authored
2 weeks ago

# HealthProbe - Database-Led Refactoring Plan

**Last Updated:** 2026-05-24
**Status:** Active planning document

## Goal

Move HealthProbe from the current SwiftData/snapshot/anomaly prototype toward the target architecture:

- SQLite archive/analysis database as source of truth;
- differential observation storage;
- SQL-first analysis for large datasets;
- Core Data UI/report cache;
- recovery-compatible exports;
- iOS 15-era legacy-device support;
- Time Machine UI over local observations;
- destructive reset/reinitialization of prototype/test stores; old database
  compatibility is not required.

UI refactoring happens after the storage and query foundations exist.

## Milestone 0 - Freeze Legacy Direction

**Purpose:** Stop work from deepening the old architecture.

Checklist:
- [ ] Mark SwiftData as legacy/prototype in active implementation tickets.
- [ ] Stop adding new SwiftData entities.
- [ ] Stop adding features that require recurring complete snapshots.
- [ ] Mark existing prototype/test installation data as disposable for archive v2.
- [ ] Point all storage agents to [`../02-architecture/Database-Design.md`](../02-architecture/Database-Design.md).
- [ ] Confirm root docs only bootstrap into `HealthProbe/Doc/`.

Acceptance:
- [ ] No active task describes SwiftData as target persistence.
- [ ] No active task proposes full periodic snapshot storage.
- [ ] No active task requires old prototype-store compatibility.
- [ ] `HealthProbe/Doc/README.md` points DB work to `Database-Design.md`.

## Milestone 1 - Lock Database Decisions

**Purpose:** Resolve irreversible archive choices before coding schema v2.

Checklist:
- [x] Decide timestamp storage convention.
- [x] Decide hash/salt/key strategy for source/device identifiers.
- [x] Define strict fingerprint foundation.
- [x] Define semantic/fuzzy fingerprint policy.
- [x] Define timezone policy for daily/monthly aggregate buckets.
- [x] Decide whether visibility ranges are maintained eagerly or rebuilt from events.
- [x] Define relationship preservation policy for workouts/samples/events.
- [x] Record prototype data policy: discard/reset old SwiftData and prototype SQLite stores; no compatibility migration.
- [x] Define export manifest canonicalization and hash algorithm.

Acceptance:
- [x] `Database-Design.md` open questions are answered or explicitly deferred.
- [x] Schema v2 can be implemented without guessing.
- [x] Test-install reset/reinitialization policy is documented.
- [x] Privacy implications of identifiers/provenance are documented.

## Milestone 2 - Synthetic Large-Data Test Harness

**Purpose:** Prove the new design can be tested before real HealthKit data is involved.

Checklist:
- [ ] Create synthetic observation generator.
- [ ] Generate low, medium, and high-volume sample sets.
- [ ] Include appeared/disappeared/representationChanged scenarios.
- [ ] Include consolidation-like high-frequency thinning scenarios.
- [ ] Include source/device/metadata variation.
- [ ] Include relationship fixtures.
- [ ] Add memory/performance measurement for large diff/export operations.

Acceptance:
- [ ] Tests can create a large synthetic archive without real health data.
- [ ] Large diff test does not require loading all records into Swift arrays.
- [ ] Export test streams/pages output.
- [ ] Fixtures contain no personal, device, location, or real health data.
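
One way to satisfy both the "no real health data" and the "no full in-memory arrays" constraints is a seeded, lazy generator. The following is a minimal Python sketch of that idea, not the project's harness: the names `SyntheticSample`, `synthetic_samples`, and `thin`, the field layout, and the three type labels are all assumptions made for illustration.

```python
import hashlib
import random
from typing import Iterator, NamedTuple


class SyntheticSample(NamedTuple):
    uuid_hash: str   # stand-in for a hashed sample identifier (hypothetical field)
    type_code: str   # hypothetical sample type label
    start: int       # seconds since an arbitrary, non-real epoch
    value: float


def synthetic_samples(count: int, seed: int = 0) -> Iterator[SyntheticSample]:
    """Yield `count` deterministic fake samples one at a time (never a full list)."""
    rng = random.Random(seed)
    types = ["heart_rate", "step_count", "active_energy"]
    for i in range(count):
        # Identifier derived only from seed and index, so no real data is involved.
        digest = hashlib.sha256(f"{seed}:{i}".encode()).hexdigest()
        yield SyntheticSample(
            uuid_hash=digest[:32],
            type_code=rng.choice(types),
            start=i * 60,
            value=round(rng.uniform(1.0, 200.0), 2),
        )


def thin(samples, keep_every: int):
    """Simulate consolidation-like thinning by keeping every Nth sample."""
    for index, sample in enumerate(samples):
        if index % keep_every == 0:
            yield sample
```

Because the generator is seeded, two runs with the same seed produce identical fixtures, which keeps diff and export regressions reproducible. Feeding `thin(synthetic_samples(n), k)` into a second observation is one possible way to stage a high-frequency thinning scenario.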

## Milestone 3 - SQLite Archive V2 Schema

**Purpose:** Create the new archive foundation.

Checklist:
- [x] Implement `schema_migrations`.
- [x] Implement `archive_metadata`.
- [x] Implement `device_chains`.
- [x] Implement `observations`.
- [x] Implement `sample_types`.
- [x] Implement `observation_type_runs`.
- [x] Implement `sources`.
- [x] Implement `source_revisions`.
- [x] Implement `hk_devices`.
- [x] Implement `metadata_blobs`.
- [x] Implement `samples`.
- [x] Implement `sample_versions`.
- [x] Implement `sample_observation_events`.
- [x] Implement `sample_visibility_ranges`.
- [x] Implement `sample_relationships`.
- [x] Implement `observation_type_summaries`.
- [x] Implement `daily_type_aggregates`.
- [x] Implement `export_manifests`.
- [x] Implement `export_items`.
- [x] Add required indexes.
- [x] Add archive integrity report for schema version, required tables, `PRAGMA integrity_check`, and foreign keys.
- [x] Add SQLite integrity/open/schema-version tests.

Acceptance:
- [x] Fresh archive initializes successfully.
- [x] Schema version is recorded.
- [x] Archive v2 can initialize after old prototype stores are removed or ignored.
- [x] `PRAGMA integrity_check` passes.
- [x] Required indexes exist.
- [x] Empty archive queries return valid empty results.
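
The acceptance checks above can be illustrated with a small sketch. This is not the project's schema: only three of the listed tables appear, every column name is an assumption, and `SCHEMA_VERSION = 2` is an assumed value. The point is the shape of the flow: idempotent initialization, a recorded version, and a report built from `PRAGMA integrity_check` and `PRAGMA foreign_key_check`.

```python
import sqlite3

SCHEMA_VERSION = 2  # assumed version number for archive v2

# Illustrative subset of the schema; column names are assumptions.
DDL = """
CREATE TABLE IF NOT EXISTS schema_migrations (
    version    INTEGER PRIMARY KEY,
    applied_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sample_types (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS samples (
    id             INTEGER PRIMARY KEY,
    sample_type_id INTEGER NOT NULL REFERENCES sample_types(id),
    uuid_hash      TEXT NOT NULL UNIQUE
);
CREATE INDEX IF NOT EXISTS idx_samples_type ON samples(sample_type_id);
"""


def open_archive(path: str) -> sqlite3.Connection:
    """Create or open an archive; safe to call repeatedly on the same file."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    conn.execute(
        "INSERT OR IGNORE INTO schema_migrations (version, applied_at) "
        "VALUES (?, datetime('now'))",
        (SCHEMA_VERSION,),
    )
    conn.commit()
    return conn


def integrity_report(conn, required_tables):
    """Summarize schema version, missing tables, and SQLite's own checks."""
    existing = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    return {
        "schema_version": conn.execute(
            "SELECT MAX(version) FROM schema_migrations").fetchone()[0],
        "missing_tables": sorted(set(required_tables) - existing),
        "integrity_check": conn.execute("PRAGMA integrity_check").fetchone()[0],
        "foreign_key_violations": len(
            conn.execute("PRAGMA foreign_key_check").fetchall()),
    }
```

A fresh `open_archive(":memory:")` followed by `integrity_report(...)` gives a test a single dictionary to assert on, and an empty archive still answers `SELECT` queries with valid empty results.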

## Milestone 4 - Differential Write Path

**Purpose:** Write observations without storing full recurring snapshots.

Checklist:
- [x] Create observation transaction wrapper.
- [x] Attach HealthKit pages, deleted-object evidence, and final type verification to the same user-visible archive observation.
- [x] Store the archive observation id on the legacy `HealthSnapshot` bridge model for transition screens.
- [x] Upsert sample types.
- [x] Upsert source/source revision/device/metadata rows.
- [x] Upsert sample identity.
- [x] Upsert sample payload version only when payload changes.
- [x] Insert appeared/verified/representationChanged events.
- [x] Record `HKDeletedObject` evidence by UUID hash.
- [x] Close visibility ranges for disappeared/deleted samples.
- [x] Maintain open visibility ranges for visible samples.
- [x] Rebuild/update affected type summaries and daily aggregates after capture/delete observations.
- [x] Commit SQLite before Core Data/cache work.
- [x] Make repeated capture page writes idempotent.
- [x] Stop writing the legacy `archive_samples` mirror during capture.
- [x] Move verification/delete bookkeeping to archive v2 tables.
- [x] Remove remaining `archive_samples` schema/update remnants.

Acceptance:
- [x] Initial import stores identities and versions once.
- [x] Re-running the same page does not duplicate sample identities or payload versions.
- [x] Representation change creates a new version, not a new logical sample.
- [x] Disappearance closes visibility range.
- [x] No full observation copy table is created or written.
- [x] A user-visible capture has one archive observation id that SQL diff/cache/UI layers can reference.
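
The idempotency and "new version, not new sample" criteria can be sketched as follows. This is a simplified illustration under stated assumptions: the three tables are reduced to a few hypothetical columns, payloads are plain strings hashed with SHA-256, and one event per sample per observation is enforced by a unique constraint. The real write path covers far more (sources, devices, metadata, visibility ranges).

```python
import hashlib
import sqlite3

# Reduced, hypothetical versions of three archive tables.
DDL = """
CREATE TABLE samples (
    id        INTEGER PRIMARY KEY,
    uuid_hash TEXT NOT NULL UNIQUE
);
CREATE TABLE sample_versions (
    id           INTEGER PRIMARY KEY,
    sample_id    INTEGER NOT NULL REFERENCES samples(id),
    payload_hash TEXT NOT NULL,
    UNIQUE (sample_id, payload_hash)
);
CREATE TABLE sample_observation_events (
    observation_id INTEGER NOT NULL,
    sample_id      INTEGER NOT NULL REFERENCES samples(id),
    kind           TEXT NOT NULL,
    UNIQUE (observation_id, sample_id)
);
"""


def write_page(conn, observation_id, page):
    """Write one capture page; `page` is a list of (uuid_hash, payload) pairs."""
    with conn:  # one transaction per page: commit on success, roll back on error
        for uuid_hash, payload in page:
            payload_hash = hashlib.sha256(payload.encode()).hexdigest()
            # Identity is inserted once; replays hit the UNIQUE constraint and are ignored.
            conn.execute(
                "INSERT OR IGNORE INTO samples (uuid_hash) VALUES (?)", (uuid_hash,))
            sample_id = conn.execute(
                "SELECT id FROM samples WHERE uuid_hash = ?", (uuid_hash,)).fetchone()[0]
            known = {row[0] for row in conn.execute(
                "SELECT payload_hash FROM sample_versions WHERE sample_id = ?",
                (sample_id,))}
            if not known:
                kind = "appeared"
            elif payload_hash not in known:
                kind = "representationChanged"
            else:
                kind = "verified"
            # A payload version row is written only when the payload is new.
            if payload_hash not in known:
                conn.execute(
                    "INSERT INTO sample_versions (sample_id, payload_hash) VALUES (?, ?)",
                    (sample_id, payload_hash))
            # The first event recorded for (observation, sample) wins on replay.
            conn.execute(
                "INSERT OR IGNORE INTO sample_observation_events "
                "(observation_id, sample_id, kind) VALUES (?, ?, ?)",
                (observation_id, sample_id, kind))
```

Replaying the same page under the same observation id leaves row counts unchanged, while a later observation with a different payload for a known `uuid_hash` adds exactly one `sample_versions` row and a `representationChanged` event.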

## Milestone 5 - SQL Analysis Layer

**Purpose:** Make the archive useful without RAM-heavy processing.

Checklist:
- [x] Implement point-in-time visible-record query.
- [x] Implement paged record table query.
- [x] Implement appeared query between observations.
- [x] Implement disappeared query between observations.
- [x] Implement representationChanged query between observations.
- [x] Implement diff counts using temp tables or equivalent SQL-first strategy.
- [x] Implement aggregate comparison query.
- [x] Implement consolidation-likely evidence query.
- [x] Implement source/provenance breakdown query.
- [x] Add large synthetic diff/pagination regression.
- [x] Add formal query timing/memory metrics on synthetic large datasets.

Acceptance:
- [x] Observation T can be reconstructed from ranges/events.
- [x] Large diff returns counts and first page without loading all rows.
- [x] Query results are deterministic and ordered.
- [x] Consolidation evidence includes count, aggregate, coverage, density, and uncertainty data.
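
As a rough illustration of the first three acceptance points, the sketch below reconstructs "what was visible at observation T" from visibility ranges, pages through the result with keyset pagination, and computes appeared/disappeared counts entirely in SQL. The column names `opened_observation_id` and `closed_observation_id`, and the convention that a range closed at observation N is no longer visible at N, are assumptions; the actual definitions live in `Database-Design.md`.

```python
import sqlite3

# Hypothetical, reduced shape of the visibility-range table.
DDL = """
CREATE TABLE sample_visibility_ranges (
    sample_id             INTEGER NOT NULL,
    opened_observation_id INTEGER NOT NULL,
    closed_observation_id INTEGER          -- NULL while the sample is still visible
);
CREATE INDEX idx_ranges_opened ON sample_visibility_ranges(opened_observation_id);
"""

VISIBLE_PAGE = """
SELECT sample_id FROM sample_visibility_ranges
WHERE opened_observation_id <= :t
  AND (closed_observation_id IS NULL OR closed_observation_id > :t)
  AND sample_id > :after
ORDER BY sample_id
LIMIT :limit
"""

DIFF_COUNTS = """
WITH at_a AS (
    SELECT sample_id FROM sample_visibility_ranges
    WHERE opened_observation_id <= :a
      AND (closed_observation_id IS NULL OR closed_observation_id > :a)
), at_b AS (
    SELECT sample_id FROM sample_visibility_ranges
    WHERE opened_observation_id <= :b
      AND (closed_observation_id IS NULL OR closed_observation_id > :b)
)
SELECT
    (SELECT COUNT(*) FROM at_b WHERE sample_id NOT IN (SELECT sample_id FROM at_a)),
    (SELECT COUNT(*) FROM at_a WHERE sample_id NOT IN (SELECT sample_id FROM at_b))
"""


def visible_page(conn, t, after_sample_id=0, limit=100):
    """Return one deterministically ordered page of sample ids visible at observation t."""
    rows = conn.execute(
        VISIBLE_PAGE, {"t": t, "after": after_sample_id, "limit": limit})
    return [row[0] for row in rows]


def diff_counts(conn, a, b):
    """Return (appeared, disappeared) counts between observations a and b."""
    appeared, disappeared = conn.execute(DIFF_COUNTS, {"a": a, "b": b}).fetchone()
    return appeared, disappeared
```

Only the two counts and one page of ids cross into application memory, whatever the archive size. Passing the last id of a page as `after_sample_id` fetches the next page without an `OFFSET` scan.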

## Milestone 6 - Core Data UI/Report Cache

**Purpose:** Cache expensive presentation/report values while keeping SQLite authoritative.

Checklist:
- [x] Define Core Data model for observation rows.
- [x] Define type summary cache entity.
- [x] Define daily/monthly aggregate cache entity.
- [x] Define diff summary cache entity.
- [x] Define export manifest/status cache entity.
- [x] Define archive health/status cache entity.
- [x] Implement initial cache rebuild from SQLite for observation/type/daily/diff/export/health rows.
- [x] Include archive schema/cache schema/version/hash fields on rebuilt rows.
- [x] Implement delete-cache-and-rebuild flow.
- [x] Add cache schema/version and rebuild tests.
- [ ] Wire Core Data cache into UI-facing view models.
- [ ] Add targeted partial invalidation for affected observation/type ranges.

Acceptance:
- [x] Deleting Core Data cache does not lose forensic data.
- [x] Cache rebuild restores dashboard/timeline/report summaries.
- [x] Cache rows include source observation ids and archive/cache schema versions.
- [ ] SQLite wins on disagreement.
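
The "SQLite wins" rule can be pictured with a toy model in which a plain dictionary stands in for the Core Data cache. Everything here is an assumption for illustration: the `observation_type_summaries` columns, the `CACHE_SCHEMA_VERSION` value, and the use of a SHA-256 hash over per-type counts as the staleness marker. The sketch only shows the decision: a cached row is trusted while its version and source hash match the archive, and is rebuilt from SQLite otherwise.

```python
import hashlib
import sqlite3

CACHE_SCHEMA_VERSION = 1  # assumed cache schema version


def source_hash(conn, observation_id):
    """Hash the authoritative per-type counts for one observation."""
    digest = hashlib.sha256()
    rows = conn.execute(
        "SELECT type_code, sample_count FROM observation_type_summaries "
        "WHERE observation_id = ? ORDER BY type_code", (observation_id,))
    for type_code, sample_count in rows:
        digest.update(f"{type_code}={sample_count};".encode())
    return digest.hexdigest()


def cached_total(conn, cache, observation_id):
    """Return the cached total, rebuilding from SQLite when the cache is missing or stale."""
    current = source_hash(conn, observation_id)
    row = cache.get(observation_id)
    stale = (
        row is None
        or row["cache_schema_version"] != CACHE_SCHEMA_VERSION
        or row["source_hash"] != current
    )
    if stale:
        total = conn.execute(
            "SELECT COALESCE(SUM(sample_count), 0) FROM observation_type_summaries "
            "WHERE observation_id = ?", (observation_id,)).fetchone()[0]
        row = {
            "cache_schema_version": CACHE_SCHEMA_VERSION,
            "source_hash": current,
            "total": total,
        }
        cache[observation_id] = row
    return row["total"]
```

Clearing the dictionary loses nothing, since every value is derivable from the archive. In a real cache the hash would more plausibly be stored alongside the archive rows at write time rather than recomputed on every read.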

## Milestone 7 - Export Layer

**Purpose:** Produce scoped, recovery-compatible exports.

Checklist:
- [ ] Define JSON export envelope.
- [ ] Define CSV record-table export.
- [ ] Define manifest hash algorithm.
- [ ] Include archive/app/schema/observation metadata.
- [ ] Include sample identity and payload version hashes.
- [ ] Include values/dates/units/type fields.
- [ ] Include source/provenance metadata where available and allowed.
- [ ] Include relationships where available.
- [ ] Include provenance-loss warning for external HealthKit re-publication.
- [ ] Stream/page export from SQLite.
- [ ] Store export manifest rows.
- [ ] Add reproducibility test for export manifests.

Acceptance:
- [ ] Large export does not materialize full record set in RAM.
- [ ] Export can be verified against archive hashes.
- [ ] Export contains enough structure for external recovery/salvage tooling.
- [ ] App still does not perform restore, backup patching, or HealthKit re-publication.
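
A streamed export with a reproducible manifest hash might look roughly like the sketch below. It is not the specified format: the `export_source` table, the record fields, JSON Lines output, sorted-key canonicalization, and SHA-256 are all assumptions chosen for illustration, since the envelope and hash algorithm are defined in the export specification rather than here.

```python
import hashlib
import json
import sqlite3
from typing import Iterator


def canonical_line(record: dict) -> bytes:
    """Serialize one record with sorted keys and no insignificant whitespace."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8") + b"\n"


def export_rows(conn, page_size: int = 500) -> Iterator[dict]:
    """Stream rows in id order via keyset pagination, holding one page at a time."""
    last_id = 0
    while True:
        # `export_source` is a hypothetical table or view used only in this sketch.
        rows = conn.execute(
            "SELECT id, uuid_hash, payload_hash FROM export_source "
            "WHERE id > ? ORDER BY id LIMIT ?", (last_id, page_size)).fetchall()
        if not rows:
            return
        for row_id, uuid_hash, payload_hash in rows:
            yield {"id": row_id, "uuid_hash": uuid_hash, "payload_hash": payload_hash}
        last_id = rows[-1][0]


def write_export(conn, out, page_size: int = 500) -> dict:
    """Write canonical JSON Lines to `out` and return a manifest for the written bytes."""
    digest = hashlib.sha256()
    count = 0
    for record in export_rows(conn, page_size):
        line = canonical_line(record)
        out.write(line)
        digest.update(line)  # hash is updated incrementally, never over a full buffer
        count += 1
    return {
        "algorithm": "sha256",
        "record_count": count,
        "content_hash": digest.hexdigest(),
    }
```

Because ordering and serialization are fixed, the manifest does not depend on page size, which is the property a reproducibility test would assert. A verifier can recompute the hash over the exported file and compare it with the stored manifest row.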

## Milestone 8 - UI/Data Flow Migration

**Purpose:** Move UI from prototype storage to target cache/query flow.

Checklist:
- [ ] Replace direct SwiftData `@Query` dependencies for target screens.
- [ ] Dashboard reads Core Data cache.
- [ ] Observation timeline reads Core Data cache.
- [ ] Observation detail uses cached summaries plus paged SQLite DTOs.
- [ ] Diff detail uses cached summary plus paged SQLite DTOs.
- [ ] Data type screens use target change labels.
- [ ] Export preview uses export query/manifest APIs.
- [ ] Archive status reflects SQLite/Core Data cache health.
- [ ] Legacy/small-device UI mode simplifies heavy visualizations.

Acceptance:
- [ ] Core Time Machine flows work without SwiftData as target persistence.
- [ ] UI copy uses observation/diff/export language.
- [ ] No count-only critical data loss messaging.
- [ ] Large record tables are paged.
- [ ] Legacy mode preserves capture/report/export.

## Milestone 9 - Legacy SwiftData Retirement

**Purpose:** Remove prototype persistence from the target architecture.

Checklist:
- [ ] Identify all remaining SwiftData imports.
- [ ] Replace SwiftData models used by active flows.
- [ ] Remove/disable `ModelContainer` as required for target builds.
- [x] Add prototype-store ignore/delete/reset path for test installs.
- [ ] Verify no old-store compatibility layer remains in active flows.
- [ ] Lower deployment target as far as dependencies allow.
- [ ] Verify build for iOS 15-era target constraints.

Acceptance:
- [ ] SwiftData is not required for normal app launch.
- [ ] Active flows use SQLite + Core Data cache.
- [x] Prototype data handling is explicit: old stores are ignored/deleted/reset for test installs.

## Milestone 10 - Acceptance Gate

**Purpose:** Decide whether the refactor is complete enough to build product features on top.

Checklist:
- [ ] Point-in-time reconstruction works.
- [ ] Large diff works SQL-first.
- [ ] Materialized aggregates can be rebuilt and verified.
- [ ] Core Data cache can be deleted and rebuilt.
- [ ] Large export streams/pages.
- [ ] Recovery-compatible manifest is present.
- [ ] SQLite integrity checks pass.
- [ ] Low-memory synthetic tests pass.
- [ ] UI no longer depends on SwiftData as foundation.
- [ ] Docs match implementation.

Acceptance:
- [ ] Product can safely proceed to UI polish and higher-level workflows.
- [ ] Database is no longer the main unresolved architectural risk.

## Parallelization Guide

Can run in parallel after Milestone 1:
- synthetic data harness;
- schema implementation;
- Core Data cache model drafting;
- export format drafting;
- UI DTO contract design.

Must not run before dependencies:
- UI migration before SQL query layer and Core Data cache exist;
- export implementation before manifest design is locked;
- legacy SwiftData removal before replacement flows exist;
- archive v2 initialization before reset/reinitialization policy is documented.

## Agent Assignment Hints

| Workstream | Primary Doc |
|------------|-------------|
| SQLite schema/write path/query layer | [`../02-architecture/Database-Design.md`](../02-architecture/Database-Design.md) |
| HealthKit capture integration | [`../02-architecture/Implementation-Guide.md`](../02-architecture/Implementation-Guide.md) |
| Core Data cache | [`../02-architecture/Core-Data-Cache-Design.md`](../02-architecture/Core-Data-Cache-Design.md) |
| Export formats/manifests | [`../02-architecture/Export-Specification.md`](../02-architecture/Export-Specification.md) |
| UI migration | [`../00-agent-guides/CLAUDE.md`](../00-agent-guides/CLAUDE.md) |
| Product language/non-goals | [`../01-product/MVP-Specification.md`](../01-product/MVP-Specification.md) |
| Status updates | [`IMPLEMENTATION_STATUS.md`](IMPLEMENTATION_STATUS.md) |