Showing 5 changed files with 117 additions and 82 deletions
+1 -1
HealthProbe/Doc/02-architecture/Database-Design.md
@@ -488,7 +488,7 @@ Do not include SQLite row ids in fingerprints. If HealthKit UUID is available, a
 
 `payload_hash` is `SHA-256` over the canonical sample payload representation, including dates, value/unit/category/workout fields, source revision fields, device provenance hashes, metadata hash, and relationship payload when available. A new `sample_versions` row is created when `payload_hash` changes.
 
-Implementation note, 2026-05-24: archive v2 capture must derive `payload_hash` from the same normalized row values that are persisted. Unknown HealthKit OS versions, including `0.0.0`, are stored as absent. Capture must not depend on the legacy `archive_samples` mirror, and capture pages must leave that transitional table empty while remaining v2 schema cleanup is in progress.
+Implementation note, 2026-05-24: archive v2 capture must derive `payload_hash` from the same normalized row values that are persisted. Unknown HealthKit OS versions, including `0.0.0`, are stored as absent. Capture, verification, and deletion bookkeeping must use archive v2 identity/version/visibility tables, not the removed legacy `archive_samples` mirror.
 
 `semantic_fingerprint` is type-specific and optional. It supports consolidation heuristics and fuzzy backup/export reconciliation, but it is never sufficient by itself to prove record identity.
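The `payload_hash` rule in this hunk (hash the same normalized values that get persisted, and treat `0.0.0` as absent) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `NormalizedSampleRow`, `normalizedOSVersion`, `payloadHash`, and the length-prefixed encoding are all hypothetical names and choices, and the real canonical representation covers more fields than shown here.

```swift
import CryptoKit
import Foundation

// Hypothetical, heavily reduced stand-in for a normalized archive row.
struct NormalizedSampleRow {
    var startDate: Double
    var endDate: Double
    var value: Double?
    var unit: String?
    var sourceOSVersion: String?
}

// Assumption: unknown OS versions (empty or "0.0.0") are normalized to nil
// before both persistence and hashing, so the two can never disagree.
func normalizedOSVersion(_ raw: String?) -> String? {
    guard let raw = raw, !raw.isEmpty, raw != "0.0.0" else { return nil }
    return raw
}

// Hashes the fields in a fixed order. Each present field is length-prefixed
// and each absent field gets an explicit marker, so that ("ab", "c") and
// ("a", "bc") or (nil, "x") and ("x", nil) cannot collide by concatenation.
func payloadHash(_ row: NormalizedSampleRow) -> String {
    let fields: [String?] = [
        String(row.startDate),
        String(row.endDate),
        row.value.map { String($0) },
        row.unit,
        row.sourceOSVersion
    ]
    var canonical = Data()
    for field in fields {
        if let field = field {
            let bytes = Data(field.utf8)
            canonical.append(contentsOf: "S\(bytes.count):".utf8)
            canonical.append(bytes)
        } else {
            canonical.append(contentsOf: "N;".utf8)
        }
    }
    // SHA-256 digest rendered as lowercase hex.
    return SHA256.hash(data: canonical)
        .map { String(format: "%02x", $0) }
        .joined()
}
```

Because the row passed to `payloadHash` is the already-normalized one, a sample reported with OS version `0.0.0` and one reported with no OS version hash identically, which matches the "stored as absent" rule above.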
 
+8 -9
HealthProbe/Doc/04-project/IMPLEMENTATION_STATUS.md
@@ -25,7 +25,7 @@ There are no real deployments, only test installations. Existing prototype datab
 |------|----------------|--------------------|
 | Product docs | Updated | Keep `HealthProbe/Doc/README.md` as canonical index |
 | HealthKit capture | Prototype exists | Adapt capture to write differential SQLite observations first |
-| SQLite archive | Archive v2 schema, differential write path, daily aggregate rebuilds, integrity report, v2 record reads, SQL diff/count/aggregate/provenance/consolidation-evidence APIs, large synthetic diff pagination coverage, formal timing/memory metrics, and XCTest coverage are in place; capture no longer writes the legacy `archive_samples` mirror | Remove the now-empty legacy schema/update remnants after v2 verification/delete flows are complete, then start Core Data cache work |
+| SQLite archive | Archive v2 schema, differential write path, v2 verification/delete bookkeeping, daily aggregate rebuilds, integrity report, v2 record reads, SQL diff/count/aggregate/provenance/consolidation-evidence APIs, large synthetic diff pagination coverage, formal timing/memory metrics, and XCTest coverage are in place; the legacy `archive_samples` mirror has been removed | Start Core Data cache work |
 | Core Data cache | Not implemented | Add rebuildable cache for expensive counts, summaries, report metadata, UI state |
 | SwiftData cache | Exists | Treat as disposable prototype data; reset/ignore during v2 transition |
 | UI | Prototype exists | Reframe screens around observations, diffs, export, archive status |
@@ -38,22 +38,21 @@ There are no real deployments, only test installations. Existing prototype datab
 
 Detailed checkable milestones live in [`Refactoring-Plan.md`](Refactoring-Plan.md).
 
-1. Remove the remaining empty legacy `archive_samples` schema/update remnants once v2 verification/delete paths no longer reference them.
-2. Add Core Data UI/report cache and rebuild pipeline.
-3. Replace SwiftData UI dependencies with Core Data/cache DTOs.
-4. Update UI language from anomaly/status to observation/diff/export.
-5. Add streaming exports with manifests.
-6. Validate on low-memory/legacy-class devices.
+1. Add Core Data UI/report cache and rebuild pipeline.
+2. Replace SwiftData UI dependencies with Core Data/cache DTOs.
+3. Update UI language from anomaly/status to observation/diff/export.
+4. Add streaming exports with manifests.
+5. Validate on low-memory/legacy-class devices.
 
 ## Known Prototype Mismatches
 
 - SwiftData currently blocks iOS 15-era device support.
 - Existing `Anomaly*` model/service names are legacy language.
 - Some screens still imply snapshot-count monitoring rather than Time Machine inspection.
-- Remaining `archive_samples` table/update statements are transitional leftovers; capture writes only archive v2 identity/version/visibility tables.
+- Current UI/cache layers still depend on SwiftData prototype models.
 - Existing implementation may decode or cache too much data for low-end devices.
 - Old prototype database compatibility is no longer required.
-- Initial SQLite archive tests cover open/init/reset/idempotency, legacy mirror non-use, small observation diffs, large synthetic diff pagination, formal timing/memory metrics, materialized aggregate comparison, source/provenance breakdowns, and consolidation-evidence labels, but not yet export behavior.
+- Initial SQLite archive tests cover open/init/reset/idempotency, legacy mirror removal, small observation diffs, large synthetic diff pagination, formal timing/memory metrics, materialized aggregate comparison, source/provenance breakdowns, and consolidation-evidence labels, but not yet export behavior.
 
 ## Verification Checklist
 
+3 -2
HealthProbe/Doc/04-project/Refactoring-Plan.md
@@ -131,14 +131,15 @@ Checklist:
 - [x] Commit SQLite before Core Data/cache work.
 - [x] Make repeated capture page writes idempotent.
 - [x] Stop writing the legacy `archive_samples` mirror during capture.
-- [ ] Remove remaining empty `archive_samples` schema/update remnants after v2 verification/delete paths are complete.
+- [x] Move verification/delete bookkeeping to archive v2 tables.
+- [x] Remove remaining `archive_samples` schema/update remnants.
 
 Acceptance:
 - [x] Initial import stores identities and versions once.
 - [x] Re-running same page does not duplicate sample identities or payload versions.
 - [x] Representation change creates a new version, not a new logical sample.
 - [x] Disappearance closes visibility range.
-- [x] No full observation copy table is written during capture.
+- [x] No full observation copy table is created or written.
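
The idempotency criteria above (re-running a page adds nothing, a representation change adds a version) can be illustrated with a uniqueness constraint plus `INSERT OR IGNORE`. The sketch below is an assumption about how such behavior might be obtained, not a description of the archive v2 schema: the `sketch_sample_versions` table, its columns, and `recordVersion` are hypothetical and exist only for this example.

```swift
import Foundation
import SQLite3

// Hypothetical, simplified table: one row per (logical sample, payload hash).
// The real archive v2 tables are richer; this is not the project's schema.
let sketchSchema = """
CREATE TABLE IF NOT EXISTS sketch_sample_versions (
    sample_key TEXT NOT NULL,
    payload_hash TEXT NOT NULL,
    UNIQUE (sample_key, payload_hash)
)
"""

struct VersionSketchError: Error { let message: String }

// Same SQLITE_TRANSIENT idiom the test helper in this PR uses: SQLite copies
// the bound string, so the Swift temporary may be released afterwards.
let sketchTransient = unsafeBitCast(-1, to: sqlite3_destructor_type.self)

/// Returns true when a new version row was inserted and false when the same
/// (sampleKey, payloadHash) pair was already present.
func recordVersion(sampleKey: String, payloadHash: String, db: OpaquePointer?) throws -> Bool {
    let sql = "INSERT OR IGNORE INTO sketch_sample_versions (sample_key, payload_hash) VALUES (?, ?)"
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw VersionSketchError(message: String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }
    sqlite3_bind_text(statement, 1, sampleKey, -1, sketchTransient)
    sqlite3_bind_text(statement, 2, payloadHash, -1, sketchTransient)
    guard sqlite3_step(statement) == SQLITE_DONE else {
        throw VersionSketchError(message: String(cString: sqlite3_errmsg(db)))
    }
    // An ignored duplicate completes successfully but changes zero rows.
    return sqlite3_changes(db) == 1
}
```

Under these assumptions, writing the same page twice reports an insert only the first time, while a changed `payload_hash` for the same `sample_key` yields a second row rather than a second logical sample.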
 
 ## Milestone 5 - SQL Analysis Layer
 
+62 -69
HealthProbe/Services/SQLiteHealthArchiveStore.swift
@@ -68,19 +68,37 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
         let db = try openDatabase()
         defer { sqlite3_close(db) }
         try prepareSchemaIfNeeded(db)
-
-        let sql = """
-        UPDATE archive_samples
-        SET last_verified_at = ?, last_seen_at = COALESCE(last_seen_at, ?)
-        WHERE type_identifier = ? AND disappeared_at IS NULL
-        """
-        try withStatement(sql, db: db) { statement in
-            sqlite3_bind_double(statement, 1, verifiedAt.timeIntervalSinceReferenceDate)
-            sqlite3_bind_double(statement, 2, verifiedAt.timeIntervalSinceReferenceDate)
-            bindText(sampleType.identifier, to: 3, in: statement)
-            guard sqlite3_step(statement) == SQLITE_DONE else {
-                throw SQLiteHealthArchiveStoreError.stepFailed(lastErrorMessage(db))
-            }
+        try execute("BEGIN IMMEDIATE TRANSACTION", db: db)
+        do {
+            let observationID = try createObservation(
+                observedAt: verifiedAt,
+                triggerReason: "verification",
+                status: "completed",
+                db: db
+            )
+            let sampleTypeID = try upsertSampleType(typeIdentifier: sampleType.identifier, db: db)
+            let visibleCount = try visibleAggregate(sampleTypeID: sampleTypeID, db: db).visibleRecordCount
+            try insertObservationTypeRun(
+                observationID: observationID,
+                sampleTypeID: sampleTypeID,
+                status: "completed",
+                observedAt: verifiedAt,
+                insertedEventCount: 0,
+                deletedEventCount: 0,
+                verifiedVisibleCount: visibleCount,
+                db: db
+            )
+            try rebuildTypeSummary(observationID: observationID, sampleTypeID: sampleTypeID, db: db)
+            try rebuildDailyAggregates(
+                observationID: observationID,
+                sampleTypeID: sampleTypeID,
+                observedAt: verifiedAt,
+                db: db
+            )
+            try execute("COMMIT", db: db)
+        } catch {
+            try? execute("ROLLBACK", db: db)
+            throw error
         }
     }
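
The new verification path repeats the `BEGIN IMMEDIATE` / `COMMIT` / `ROLLBACK` shape already used by the delete path further down. One way that shape could be factored out is sketched here; `exec`, `withImmediateTransaction`, and `TransactionSketchError` are hypothetical names for this illustration and are not helpers that exist in `SQLiteHealthArchiveStore`.

```swift
import Foundation
import SQLite3

struct TransactionSketchError: Error { let message: String }

// Runs one or more SQL statements that need no parameters and return no rows.
func exec(_ sql: String, db: OpaquePointer?) throws {
    var errorMessage: UnsafeMutablePointer<CChar>?
    guard sqlite3_exec(db, sql, nil, nil, &errorMessage) == SQLITE_OK else {
        let message = errorMessage.map { String(cString: $0) } ?? "unknown SQLite error"
        sqlite3_free(errorMessage)
        throw TransactionSketchError(message: message)
    }
}

// Wraps `body` in a write transaction. BEGIN IMMEDIATE takes the write lock
// up front, so lock contention surfaces at BEGIN rather than midway through
// the bookkeeping. Any error thrown by `body` or by COMMIT triggers a
// best-effort ROLLBACK, and the original error is rethrown.
func withImmediateTransaction<T>(db: OpaquePointer?, _ body: () throws -> T) throws -> T {
    try exec("BEGIN IMMEDIATE TRANSACTION", db: db)
    do {
        let result = try body()
        try exec("COMMIT", db: db)
        return result
    } catch {
        try? exec("ROLLBACK", db: db)
        throw error
    }
}
```

With a helper like this, the observation insert, type run insert, and summary/aggregate rebuilds would sit inside a single closure, and a failure in any step would leave no partial verification observation behind.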
 
@@ -123,20 +141,6 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
                 )
             }
 
-            let sql = """
-            UPDATE archive_samples
-            SET disappeared_at = ?, last_verified_at = ?
-            WHERE sample_uuid_hash = ? AND type_identifier = ?
-            """
-            try withStatement(sql, db: db) { statement in
-                sqlite3_bind_double(statement, 1, observedMissingAt.timeIntervalSinceReferenceDate)
-                sqlite3_bind_double(statement, 2, observedMissingAt.timeIntervalSinceReferenceDate)
-                bindText(sampleUUIDHash, to: 3, in: statement)
-                bindText(sampleTypeIdentifier, to: 4, in: statement)
-                guard sqlite3_step(statement) == SQLITE_DONE else {
-                    throw SQLiteHealthArchiveStoreError.stepFailed(lastErrorMessage(db))
-                }
-            }
             try execute("COMMIT", db: db)
         } catch {
             try? execute("ROLLBACK", db: db)
@@ -1276,48 +1280,6 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
             PRIMARY KEY (export_manifest_id, sample_id, version_id)
         )
         """, db: db)
-        try createLegacyArchiveSamplesTable(db)
-    }
-
-    private func createLegacyArchiveSamplesTable(_ db: OpaquePointer?) throws {
-        try execute("""
-        CREATE TABLE IF NOT EXISTS archive_samples (
-            sample_uuid_hash TEXT PRIMARY KEY NOT NULL,
-            type_identifier TEXT NOT NULL,
-            strict_fingerprint TEXT NOT NULL,
-            semantic_fingerprint TEXT,
-            start_date REAL NOT NULL,
-            end_date REAL NOT NULL,
-            first_seen_at REAL NOT NULL,
-            last_seen_at REAL,
-            last_verified_at REAL,
-            disappeared_at REAL,
-            observed_count INTEGER NOT NULL DEFAULT 1,
-            value_kind TEXT,
-            value REAL,
-            unit TEXT,
-            category_value INTEGER,
-            workout_activity_type INTEGER,
-            duration_seconds REAL,
-            source_name TEXT,
-            source_bundle_identifier TEXT,
-            source_product_type TEXT,
-            source_version TEXT,
-            source_operating_system_version TEXT,
-            device_name TEXT,
-            device_manufacturer TEXT,
-            device_model TEXT,
-            device_hardware_version TEXT,
-            device_firmware_version TEXT,
-            device_software_version TEXT,
-            device_local_identifier TEXT,
-            device_udi_device_identifier TEXT,
-            metadata_json TEXT,
-            archived_at REAL NOT NULL
-        )
-        """, db: db)
-        try execute("CREATE INDEX IF NOT EXISTS idx_archive_samples_type_date ON archive_samples(type_identifier, start_date)", db: db)
-        try execute("CREATE INDEX IF NOT EXISTS idx_archive_samples_strict_fingerprint ON archive_samples(strict_fingerprint)", db: db)
     }
 
     private func seedArchiveMetadata(_ db: OpaquePointer?) throws {
@@ -1498,6 +1460,37 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
         return sqlite3_last_insert_rowid(db)
     }
 
+    private func insertObservationTypeRun(
+        observationID: Int64,
+        sampleTypeID: Int64,
+        status: String,
+        observedAt: Date,
+        insertedEventCount: Int,
+        deletedEventCount: Int,
+        verifiedVisibleCount: Int?,
+        db: OpaquePointer?
+    ) throws {
+        let sql = """
+        INSERT OR REPLACE INTO observation_type_runs (
+            observation_id, sample_type_id, status, started_at, ended_at,
+            inserted_event_count, deleted_event_count, verified_visible_count
+        ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+        """
+        try withStatement(sql, db: db) { statement in
+            bindInt64(observationID, to: 1, in: statement)
+            bindInt64(sampleTypeID, to: 2, in: statement)
+            bindText(status, to: 3, in: statement)
+            sqlite3_bind_double(statement, 4, observedAt.timeIntervalSince1970)
+            sqlite3_bind_double(statement, 5, observedAt.timeIntervalSince1970)
+            bindInt(insertedEventCount, to: 6, in: statement)
+            bindInt(deletedEventCount, to: 7, in: statement)
+            bindInt(verifiedVisibleCount, to: 8, in: statement)
+            guard sqlite3_step(statement) == SQLITE_DONE else {
+                throw SQLiteHealthArchiveStoreError.stepFailed(lastErrorMessage(db))
+            }
+        }
+    }
+
     private func upsertCurrentDeviceChain(_ db: OpaquePointer?) throws -> Int64 {
         let resolution = KeychainService.resolveDeviceID(swiftDataStoreIsEmpty: false)
         let chainHash = HashService.archiveContentHash(domain: "hp:v2:device_chain", parts: [resolution.id])
+43 -1
HealthProbeTests/SQLiteHealthArchiveStoreTests.swift
@@ -73,7 +73,7 @@ final class SQLiteHealthArchiveStoreTests: XCTestCase {
         XCTAssertEqual(try countRows(in: "sample_versions", at: url), 1, versionDebugRows)
         XCTAssertEqual(try countRows(in: "sample_visibility_ranges", at: url), 1)
         XCTAssertEqual(try countRows(in: "source_revisions", at: url), 1)
-        XCTAssertEqual(try countRows(in: "archive_samples", at: url), 0)
+        XCTAssertFalse(try tableExists("archive_samples", at: url))
         XCTAssertEqual(secondWrite.insertedCount, 0)
         XCTAssertEqual(secondWrite.updatedCount, 0)
         XCTAssertEqual(secondWrite.unchangedCount, 1)
@@ -82,6 +82,21 @@ final class SQLiteHealthArchiveStoreTests: XCTestCase {
         XCTAssertTrue(report.passed)
     }
 
+    func testVerificationUsesArchiveV2TablesWithoutLegacyMirror() async throws {
+        let url = databaseURL()
+        let store = SQLiteHealthArchiveStore(databaseURL: url)
+        let sample = makeStepCountSample()
+
+        _ = try await store.upsertSamples([sample], observedAt: Date(timeIntervalSince1970: 2_000))
+        try await store.markVerification(sampleType: sample.sampleType, verifiedAt: Date(timeIntervalSince1970: 2_060))
+        let observationIDs = try observationIDs(at: url)
+
+        XCTAssertEqual(observationIDs.count, 2)
+        XCTAssertEqual(try countRows(in: "observation_type_runs", at: url), 1)
+        XCTAssertEqual(try countRows(in: "observation_type_summaries WHERE observation_id = \(observationIDs[1]) AND visible_record_count = 1", at: url), 1)
+        XCTAssertFalse(try tableExists("archive_samples", at: url))
+    }
+
     func testDiffSummaryAndRecordsBetweenObservationsUseSQLVisibility() async throws {
         let url = databaseURL()
         let store = SQLiteHealthArchiveStore(databaseURL: url)
@@ -444,6 +459,33 @@ final class SQLiteHealthArchiveStoreTests: XCTestCase {
         return Int(sqlite3_column_int(statement, 0))
     }
 
+    private func tableExists(_ tableName: String, at url: URL) throws -> Bool {
+        var db: OpaquePointer?
+        guard sqlite3_open_v2(url.path, &db, SQLITE_OPEN_READONLY | SQLITE_OPEN_FULLMUTEX, nil) == SQLITE_OK else {
+            sqlite3_close(db)
+            XCTFail("Could not open test database")
+            return false
+        }
+        defer { sqlite3_close(db) }
+
+        let sql = """
+        SELECT 1
+        FROM sqlite_master
+        WHERE type = 'table' AND name = ?
+        LIMIT 1
+        """
+        var statement: OpaquePointer?
+        guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
+            sqlite3_finalize(statement)
+            XCTFail("Could not prepare table existence query")
+            return false
+        }
+        defer { sqlite3_finalize(statement) }
+
+        sqlite3_bind_text(statement, 1, tableName, -1, unsafeBitCast(-1, to: sqlite3_destructor_type.self))
+        return sqlite3_step(statement) == SQLITE_ROW
+    }
+
     private func observationIDs(at url: URL) throws -> [Int64] {
         var db: OpaquePointer?
         guard sqlite3_open_v2(url.path, &db, SQLITE_OPEN_READONLY | SQLITE_OPEN_FULLMUTEX, nil) == SQLITE_OK else {