Bogdan Timofte authored 2 weeks ago
# HealthProbe - Database-Led Refactoring Plan

**Last Updated:** 2026-05-24
**Status:** Active planning document

            
## Goal

Move HealthProbe from the current SwiftData/snapshot/anomaly prototype toward the target architecture:

- SQLite archive/analysis database as source of truth;
- differential observation storage;
- SQL-first analysis for large datasets;
- Core Data UI/report cache;
- recovery-compatible exports;
- iOS 15-era legacy-device support;
- Time Machine UI over local observations;
- destructive reset/reinitialization of prototype/test stores; old database
  compatibility is not required.

UI refactoring happens after the storage and query foundations exist.

            
## Milestone 0 - Freeze Legacy Direction

**Purpose:** Stop work from deepening the old architecture.

Checklist:
- [ ] Mark SwiftData as legacy/prototype in active implementation tickets.
- [ ] Stop adding new SwiftData entities.
- [ ] Stop adding features that require recurring complete snapshots.
- [ ] Mark existing prototype/test installation data as disposable for archive v2.
- [ ] Point all storage agents to [`../02-architecture/Database-Design.md`](../02-architecture/Database-Design.md).
- [ ] Confirm root docs only bootstrap into `HealthProbe/Doc/`.

Acceptance:
- [ ] No active task describes SwiftData as target persistence.
- [ ] No active task proposes full periodic snapshot storage.
- [ ] No active task requires old prototype-store compatibility.
- [ ] `HealthProbe/Doc/README.md` points DB work to `Database-Design.md`.

            
## Milestone 1 - Lock Database Decisions

**Purpose:** Resolve irreversible archive choices before coding schema v2.

Checklist:
- [x] Decide timestamp storage convention.
- [x] Decide hash/salt/key strategy for source/device identifiers.
- [x] Define strict fingerprint foundation.
- [x] Define semantic/fuzzy fingerprint policy.
- [x] Define timezone policy for daily/monthly aggregate buckets.
- [x] Decide whether visibility ranges are maintained eagerly or rebuilt from events.
- [x] Define relationship preservation policy for workouts/samples/events.
- [x] Record prototype data policy: discard/reset old SwiftData and prototype SQLite stores; no compatibility migration.
- [x] Define export manifest canonicalization and hash algorithm.

Acceptance:
- [x] `Database-Design.md` open questions are answered or explicitly deferred.
- [x] Schema v2 can be implemented without guessing.
- [x] Test-install reset/reinitialization policy is documented.
- [x] Privacy implications of identifiers/provenance are documented.

            
## Milestone 2 - Synthetic Large-Data Test Harness

**Purpose:** Prove the new design can be tested before real HealthKit data is involved.

Checklist:
- [ ] Create synthetic observation generator.
- [ ] Generate low, medium, and high-volume sample sets.
- [ ] Include appeared/disappeared/representationChanged scenarios.
- [ ] Include consolidation-like high-frequency thinning scenarios.
- [ ] Include source/device/metadata variation.
- [ ] Include relationship fixtures.
- [ ] Add memory/performance measurement for large diff/export operations.

Acceptance:
- [ ] Tests can create a large synthetic archive without real health data.
- [ ] Large diff test does not require loading all records into Swift arrays.
- [ ] Export test streams/pages output.
- [ ] Fixtures contain no personal, device, location, or real health data.
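
One possible shape for the synthetic observation generator is sketched below in Python, chosen only to keep the example short and runnable; the real harness would presumably live in the Swift test target. `SyntheticEvent`, `synthetic_events`, and the `churn` parameter are hypothetical names, not part of the plan. The sketch yields events lazily and uses a fixed seed so fixtures stay reproducible and contain no real identifiers.

```python
import random
from typing import Iterator, NamedTuple


class SyntheticEvent(NamedTuple):
    observation_id: int
    sample_key: str       # synthetic key, never a real HealthKit UUID
    kind: str             # "appeared", "disappeared" or "representationChanged"
    payload_version: int  # version visible at (or just before) this event


def synthetic_events(observations: int, samples_per_observation: int,
                     churn: float = 0.1, seed: int = 0) -> Iterator[SyntheticEvent]:
    """Yield events lazily; only the currently visible set is held in memory."""
    rng = random.Random(seed)   # fixed seed keeps fixtures reproducible
    visible = {}                # sample_key -> current payload version
    next_key = 0
    for obs in range(1, observations + 1):
        # Churn samples that were visible before this observation.
        for key in list(visible):
            roll = rng.random()
            if roll < churn / 2:
                yield SyntheticEvent(obs, key, "disappeared", visible.pop(key))
            elif roll < churn:
                visible[key] += 1
                yield SyntheticEvent(obs, key, "representationChanged", visible[key])
        # Then add this observation's new samples.
        for _ in range(samples_per_observation):
            key = f"synthetic-{next_key}"
            next_key += 1
            visible[key] = 1
            yield SyntheticEvent(obs, key, "appeared", 1)
```

Low, medium, and high-volume sets could then be produced by scaling `observations` and `samples_per_observation`, with the consumer writing each event straight into the archive instead of collecting a list.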

            
## Milestone 3 - SQLite Archive V2 Schema

**Purpose:** Create the new archive foundation.

Checklist:
- [x] Implement `schema_migrations`.
- [x] Implement `archive_metadata`.
- [x] Implement `device_chains`.
- [x] Implement `observations`.
- [x] Implement `sample_types`.
- [x] Implement `observation_type_runs`.
- [x] Implement `sources`.
- [x] Implement `source_revisions`.
- [x] Implement `hk_devices`.
- [x] Implement `metadata_blobs`.
- [x] Implement `samples`.
- [x] Implement `sample_versions`.
- [x] Implement `sample_observation_events`.
- [x] Implement `sample_visibility_ranges`.
- [x] Implement `sample_relationships`.
- [x] Implement `observation_type_summaries`.
- [x] Implement `daily_type_aggregates`.
- [x] Implement `export_manifests`.
- [x] Implement `export_items`.
- [x] Add required indexes.
- [x] Add archive integrity report for schema version, required tables, `PRAGMA integrity_check`, and foreign keys.
- [x] Add SQLite integrity/open/schema-version tests.

Acceptance:
- [x] Fresh archive initializes successfully.
- [x] Schema version is recorded.
- [x] Archive v2 can initialize after old prototype stores are removed or ignored.
- [x] `PRAGMA integrity_check` passes.
- [x] Required indexes exist.
- [x] Empty archive queries return valid empty results.
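
To illustrate the integrity report described above, here is a minimal Python `sqlite3` sketch. It is not the project's implementation: the `schema_migrations` columns (`version`, `applied_at`), the integer `SCHEMA_VERSION = 2`, and the function names are assumptions. Only the PRAGMA statements are standard SQLite.

```python
import sqlite3

SCHEMA_VERSION = 2  # assumption: "archive v2" is recorded as integer version 2


def open_archive(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) an archive and record the schema version once."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS schema_migrations ("
            " version INTEGER PRIMARY KEY,"
            " applied_at TEXT NOT NULL)"
        )
        conn.execute(
            "INSERT OR IGNORE INTO schema_migrations (version, applied_at)"
            " VALUES (?, datetime('now'))",
            (SCHEMA_VERSION,),
        )
    return conn


def integrity_report(conn: sqlite3.Connection, required_tables) -> dict:
    """Collect schema version, missing tables, and SQLite's own checks."""
    existing = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    version = conn.execute(
        "SELECT MAX(version) FROM schema_migrations").fetchone()[0]
    return {
        "schema_version": version,
        "missing_tables": sorted(set(required_tables) - existing),
        # integrity_check returns a single row "ok" when nothing is wrong.
        "integrity_check": [r[0] for r in conn.execute("PRAGMA integrity_check")],
        # foreign_key_check returns one row per violation, none when clean.
        "foreign_key_violations": conn.execute("PRAGMA foreign_key_check").fetchall(),
    }
```

A fresh archive would be expected to report the recorded version, an `["ok"]` integrity result, and no foreign-key violations; any table from the checklist that has not been created yet shows up under `missing_tables`.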

            
## Milestone 4 - Differential Write Path

**Purpose:** Write observations without storing full recurring snapshots.

Checklist:
- [x] Create observation transaction wrapper.
- [x] Attach HealthKit pages, deleted-object evidence, and final type verification to the same user-visible archive observation.
- [x] Store the archive observation id on the legacy `HealthSnapshot` bridge model for transition screens.
- [x] Upsert sample types.
- [x] Upsert source/source revision/device/metadata rows.
- [x] Upsert sample identity.
- [x] Upsert sample payload version only when payload changes.
- [x] Insert appeared/verified/representationChanged events.
- [x] Record `HKDeletedObject` evidence by UUID hash.
- [x] Close visibility ranges for disappeared/deleted samples.
- [x] Maintain open visibility ranges for visible samples.
- [x] Rebuild/update affected type summaries and daily aggregates after capture/delete observations.
- [x] Commit SQLite before Core Data/cache work.
- [x] Make repeated capture page writes idempotent.
- [x] Stop writing the legacy `archive_samples` mirror during capture.
- [x] Move verification/delete bookkeeping to archive v2 tables.
- [x] Remove remaining `archive_samples` schema/update remnants.

Acceptance:
- [x] Initial import stores identities and versions once.
- [x] Re-running same page does not duplicate sample identities or payload versions.
- [x] Representation change creates a new version, not a new logical sample.
- [x] Disappearance closes visibility range.
- [x] No full observation copy table is created or written.
- [x] A user-visible capture has one archive observation id that SQL diff/cache/UI layers can reference.
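
The idempotency criteria above can be pictured with a small Python `sqlite3` sketch. The columns (`uuid_hash`, `payload_hash`), the uniqueness constraints, and the SHA-256-over-canonical-JSON payload hash are all assumptions made for illustration; the authoritative schema and fingerprint rules are in `Database-Design.md`.

```python
import hashlib
import json
import sqlite3


def make_archive() -> sqlite3.Connection:
    """Create a tiny stand-in for the identity and version tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE samples (
            id INTEGER PRIMARY KEY,
            uuid_hash TEXT NOT NULL UNIQUE
        );
        CREATE TABLE sample_versions (
            id INTEGER PRIMARY KEY,
            sample_id INTEGER NOT NULL REFERENCES samples(id),
            payload_hash TEXT NOT NULL,
            UNIQUE (sample_id, payload_hash)
        );
    """)
    return conn


def payload_hash(payload: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) so equal payloads hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def write_page(conn: sqlite3.Connection, page) -> None:
    """Write one capture page of (uuid_hash, payload) pairs in one transaction."""
    with conn:
        for uuid_hash, payload in page:
            # Identity: inserted once, ignored on every later sighting.
            conn.execute(
                "INSERT OR IGNORE INTO samples (uuid_hash) VALUES (?)",
                (uuid_hash,))
            sample_id = conn.execute(
                "SELECT id FROM samples WHERE uuid_hash = ?",
                (uuid_hash,)).fetchone()[0]
            # Version: a new row only when this payload hash is new for the sample.
            conn.execute(
                "INSERT OR IGNORE INTO sample_versions (sample_id, payload_hash)"
                " VALUES (?, ?)",
                (sample_id, payload_hash(payload)))
```

Under these assumptions, replaying the same page is a no-op, while a changed payload for a known `uuid_hash` adds a version row and leaves the identity row alone. One simplification to note: a payload that reverts to an earlier representation would not produce a new version row here.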

            
## Milestone 5 - SQL Analysis Layer

**Purpose:** Make the archive useful without RAM-heavy processing.

Checklist:
- [x] Implement point-in-time visible-record query.
- [x] Implement paged record table query.
- [x] Implement appeared query between observations.
- [x] Implement disappeared query between observations.
- [x] Implement representationChanged query between observations.
- [x] Implement diff counts using temp tables or equivalent SQL-first strategy.
- [x] Implement aggregate comparison query.
- [x] Implement consolidation-likely evidence query.
- [x] Implement source/provenance breakdown query.
- [x] Add large synthetic diff/pagination regression.
- [x] Add formal query timing/memory metrics on synthetic large datasets.

Acceptance:
- [x] Observation T can be reconstructed from ranges/events.
- [x] Large diff returns counts and first page without loading all rows.
- [x] Query results are deterministic and ordered.
- [x] Consolidation evidence includes count, aggregate, coverage, density, and uncertainty data.
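
As a rough illustration of the point-in-time and appeared/disappeared queries, the Python `sqlite3` sketch below keeps the set logic in SQL and pages with a keyset cursor. The `sample_visibility_ranges` columns (`sample_id`, `first_observation_id`, `last_observation_id`, with `NULL` meaning "still visible") are assumed, and `EXCEPT` stands in for the "equivalent SQL-first strategy"; the actual queries may use temp tables instead.

```python
import sqlite3


def visible_sql(param: str) -> str:
    """SQL selecting the samples visible at observation :param (assumed columns)."""
    return (
        "SELECT sample_id FROM sample_visibility_ranges "
        f"WHERE first_observation_id <= :{param} "
        f"AND (last_observation_id IS NULL OR last_observation_id >= :{param})"
    )


def visible_at(conn: sqlite3.Connection, t: int) -> list:
    """Reconstruct which samples were visible at observation t."""
    sql = visible_sql("t") + " ORDER BY sample_id"
    return [r[0] for r in conn.execute(sql, {"t": t})]


def diff_count(conn: sqlite3.Connection, present_at: int, absent_at: int) -> int:
    """Count samples visible at present_at but not at absent_at, entirely in SQL."""
    sql = f"SELECT COUNT(*) FROM ({visible_sql('p')} EXCEPT {visible_sql('a')})"
    return conn.execute(sql, {"p": present_at, "a": absent_at}).fetchone()[0]


def diff_page(conn: sqlite3.Connection, present_at: int, absent_at: int,
              after_id: int = 0, limit: int = 100) -> list:
    """Return one deterministic page; pass the last id back in as after_id."""
    sql = (
        f"SELECT sample_id FROM ({visible_sql('p')} EXCEPT {visible_sql('a')}) "
        "WHERE sample_id > :after ORDER BY sample_id LIMIT :limit"
    )
    params = {"p": present_at, "a": absent_at, "after": after_id, "limit": limit}
    return [r[0] for r in conn.execute(sql, params)]
```

With observations A and B, "appeared" would be `diff_page(conn, present_at=B, absent_at=A)` and "disappeared" the same call with the arguments swapped. Only the count and the requested page cross into application memory.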

            
## Milestone 6 - Core Data UI/Report Cache

**Purpose:** Cache expensive presentation/report values while keeping SQLite authoritative.

Checklist:
- [x] Define Core Data model for observation rows.
- [x] Define type summary cache entity.
- [x] Define daily/monthly aggregate cache entity.
- [x] Define diff summary cache entity.
- [x] Define export manifest/status cache entity.
- [x] Define archive health/status cache entity.
- [x] Implement initial cache rebuild from SQLite for observation/type/daily/diff/export/health rows.
- [x] Include archive schema/cache schema/version/hash fields on rebuilt rows.
- [x] Implement delete-cache-and-rebuild flow.
- [x] Add cache schema/version and rebuild tests.
- [ ] Wire Core Data cache into UI-facing view models.
- [ ] Add targeted partial invalidation for affected observation/type ranges.

Acceptance:
- [x] Deleting Core Data cache does not lose forensic data.
- [x] Cache rebuild restores dashboard/timeline/report summaries.
- [x] Cache rows include source observation ids and archive/cache schema versions.
- [ ] SQLite wins on disagreement.
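
The "derived cache, SQLite authoritative" rule can be shown with a toy Python model in which a plain dictionary stands in for the Core Data store. This is only a sketch of the invariant, not of Core Data itself; the `observation_type_summaries` columns, `CACHE_SCHEMA_VERSION`, and both function names are hypothetical.

```python
import sqlite3

CACHE_SCHEMA_VERSION = 1  # hypothetical cache schema version


def rebuild_cache(conn: sqlite3.Connection, archive_schema_version: int) -> dict:
    """Derive every cache row from SQLite, stamping the versions it came from."""
    cache = {}
    rows = conn.execute(
        "SELECT observation_id, sample_type, sample_count "
        "FROM observation_type_summaries "
        "ORDER BY observation_id, sample_type")
    for observation_id, sample_type, sample_count in rows:
        cache[(observation_id, sample_type)] = {
            "sample_count": sample_count,
            "source_observation_id": observation_id,
            "archive_schema_version": archive_schema_version,
            "cache_schema_version": CACHE_SCHEMA_VERSION,
        }
    return cache


def reconcile(conn: sqlite3.Connection, cache: dict, archive_schema_version: int):
    """SQLite wins: return the freshly derived rows and whether the cache was stale."""
    fresh = rebuild_cache(conn, archive_schema_version)
    return fresh, cache != fresh
```

Because every cached value can be re-derived, deleting the cache (an empty dictionary here) costs a rebuild but loses no forensic data. A production version would more likely compare stored hashes or versions than rebuild everything on each check.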

            
## Milestone 7 - Export Layer

**Purpose:** Produce scoped, recovery-compatible exports.

Checklist:
- [ ] Define JSON export envelope.
- [ ] Define CSV record-table export.
- [ ] Define manifest hash algorithm.
- [ ] Include archive/app/schema/observation metadata.
- [ ] Include sample identity and payload version hashes.
- [ ] Include values/dates/units/type fields.
- [ ] Include source/provenance metadata where available and allowed.
- [ ] Include relationships where available.
- [ ] Include provenance-loss warning for external HealthKit re-publication.
- [ ] Stream/page export from SQLite.
- [ ] Store export manifest rows.
- [ ] Add reproducibility test for export manifests.

Acceptance:
- [ ] Large export does not materialize full record set in RAM.
- [ ] Export can be verified against archive hashes.
- [ ] Export contains enough structure for external recovery/salvage tooling.
- [ ] App still does not perform restore, backup patching, or HealthKit re-publication.
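
A streamed export with an incrementally computed manifest hash might look like the Python sketch below. The JSON Lines layout, the SHA-256 choice, the `samples(id, uuid_hash)` columns, and the manifest keys are placeholders assumed for illustration; the real envelope and hash algorithm are whatever the export specification defines.

```python
import hashlib
import io
import json
import sqlite3


def stream_export(conn: sqlite3.Connection, out: io.TextIOBase,
                  page_size: int = 1000) -> dict:
    """Write rows page by page and hash exactly the bytes that were written."""
    digest = hashlib.sha256()  # assumption: placeholder for the manifest hash
    record_count = 0
    # A stable ORDER BY makes the output, and therefore the hash, reproducible.
    cursor = conn.execute("SELECT id, uuid_hash FROM samples ORDER BY id")
    while True:
        rows = cursor.fetchmany(page_size)  # at most one page in memory
        if not rows:
            break
        for sample_id, uuid_hash in rows:
            line = json.dumps(
                {"id": sample_id, "uuid_hash": uuid_hash},
                sort_keys=True, separators=(",", ":")) + "\n"
            out.write(line)
            digest.update(line.encode("utf-8"))
            record_count += 1
    return {"record_count": record_count, "sha256": digest.hexdigest()}
```

Since the digest is updated line by line, the manifest does not depend on `page_size`, and a reproducibility test can simply export twice and compare the returned manifests.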

            
## Milestone 8 - UI/Data Flow Migration

**Purpose:** Move UI from prototype storage to target cache/query flow.

Checklist:
- [ ] Replace direct SwiftData `@Query` dependencies for target screens.
- [ ] Dashboard reads Core Data cache.
- [ ] Observation timeline reads Core Data cache.
- [ ] Observation detail uses cached summaries plus paged SQLite DTOs.
- [ ] Diff detail uses cached summary plus paged SQLite DTOs.
- [ ] Data type screens use target change labels.
- [ ] Export preview uses export query/manifest APIs.
- [ ] Archive status reflects SQLite/Core Data cache health.
- [ ] Legacy/small-device UI mode simplifies heavy visualizations.

Acceptance:
- [ ] Core Time Machine flows work without SwiftData as target persistence.
- [ ] UI copy uses observation/diff/export language.
- [ ] No critical data-loss messaging is based on counts alone.
- [ ] Large record tables are paged.
- [ ] Legacy mode preserves capture/report/export.

            
## Milestone 9 - Legacy SwiftData Retirement

**Purpose:** Remove prototype persistence from the target architecture.

Checklist:
- [ ] Identify all remaining SwiftData imports.
- [ ] Replace SwiftData models used by active flows.
- [ ] Remove/disable `ModelContainer` as required for target builds.
- [x] Add prototype-store ignore/delete/reset path for test installs.
- [ ] Verify no old-store compatibility layer remains in active flows.
- [ ] Lower deployment target as far as dependencies allow.
- [ ] Verify build for iOS 15-era target constraints.

Acceptance:
- [ ] SwiftData is not required for normal app launch.
- [ ] Active flows use SQLite + Core Data cache.
- [x] Prototype data handling is explicit: old stores are ignored/deleted/reset for test installs.

            
## Milestone 10 - Acceptance Gate

**Purpose:** Decide whether the refactor is complete enough to build product features on top.

Checklist:
- [ ] Point-in-time reconstruction works.
- [ ] Large diff works SQL-first.
- [ ] Materialized aggregates can be rebuilt and verified.
- [ ] Core Data cache can be deleted and rebuilt.
- [ ] Large export streams/pages.
- [ ] Recovery-compatible manifest is present.
- [ ] SQLite integrity checks pass.
- [ ] Low-memory synthetic tests pass.
- [ ] UI no longer depends on SwiftData as foundation.
- [ ] Docs match implementation.

Acceptance:
- [ ] Product can safely proceed to UI polish and higher-level workflows.
- [ ] Database is no longer the main unresolved architectural risk.

            
## Parallelization Guide

Can run in parallel after Milestone 1:
- synthetic data harness;
- schema implementation;
- Core Data cache model drafting;
- export format drafting;
- UI DTO contract design.

Must not start before their dependencies are in place:
- UI migration before SQL query layer and Core Data cache exist;
- export implementation before manifest design is locked;
- legacy SwiftData removal before replacement flows exist;
- archive v2 initialization before reset/reinitialization policy is documented.

            
## Agent Assignment Hints

| Workstream | Primary Doc |
|------------|-------------|
| SQLite schema/write path/query layer | [`../02-architecture/Database-Design.md`](../02-architecture/Database-Design.md) |
| HealthKit capture integration | [`../02-architecture/Implementation-Guide.md`](../02-architecture/Implementation-Guide.md) |
| Core Data cache | [`../02-architecture/Core-Data-Cache-Design.md`](../02-architecture/Core-Data-Cache-Design.md) |
| Export formats/manifests | [`../02-architecture/Export-Specification.md`](../02-architecture/Export-Specification.md) |
| UI migration | [`../00-agent-guides/CLAUDE.md`](../00-agent-guides/CLAUDE.md) |
| Product language/non-goals | [`../01-product/MVP-Specification.md`](../01-product/MVP-Specification.md) |
| Status updates | [`IMPLEMENTATION_STATUS.md`](IMPLEMENTATION_STATUS.md) |