Showing 5 changed files with 184 additions and 40 deletions
+2 -2
HealthProbe/Doc/04-project/IMPLEMENTATION_STATUS.md
@@ -24,8 +24,8 @@ There are no real deployments, only test installations. Existing prototype datab
24 24
 | Area | Current Status | Target / Next Work |
25 25
 |------|----------------|--------------------|
26 26
 | Product docs | Updated | Keep `HealthProbe/Doc/README.md` as canonical index |
27
-| HealthKit capture | Capture now opens one archive observation per user-visible snapshot, attaches HealthKit pages, deleted-object evidence, and type verification to that observation id before finishing it, no longer aborts initial full-history imports after a fixed 30-minute wall-clock cap while page-level HealthKit timeouts remain in place, defers grouped observation summary/daily aggregate rebuilds until per-type verification instead of rebuilding after every imported page, and persists large HealthKit pages in smaller archive chunks while using smaller and more conservative initial-import page sizes for high-volume types, adaptive write chunk sizing, explicit task yields, and lower-allocation streaming loops to avoid long monolithic SQLite stalls | Continue moving UI/cache reads to archive-backed observation ids and revisit full checkpoint/resume and background collection separately |
28
-| SQLite archive | Archive v2 schema, snapshot-level observation grouping, differential write path, v2 verification/delete bookkeeping, daily aggregate rebuilds, integrity report, v2 record reads, SQL diff/count/aggregate/provenance/consolidation-evidence APIs, large synthetic diff pagination coverage, formal timing/memory metrics, and XCTest coverage are in place; the legacy `archive_samples` mirror has been removed, the hot write path now reuses prepared SQLite statements within grouped page writes instead of reparsing the same SQL for every sample, processes sample rows in a lower-allocation streaming loop, and opens SQLite connections with import-friendly busy timeout / synchronous / temp-store pragmas | Continue moving capture/Dashboard actions to archive/cache DTOs |
27
+| HealthKit capture | Capture now opens one archive observation per user-visible snapshot, attaches HealthKit pages, deleted-object evidence, and type verification to that observation id before finishing it, no longer aborts initial full-history imports after a fixed 30-minute wall-clock cap while page-level HealthKit timeouts remain in place, defers grouped observation summary/daily aggregate rebuilds until per-type verification instead of rebuilding after every imported page, and persists large HealthKit pages in smaller archive chunks while using smaller and more conservative initial-import page sizes for high-volume types, adaptive write chunk sizing, batched deleted-object persistence, explicit task yields, and lower-allocation streaming loops to avoid long monolithic SQLite stalls | Continue moving UI/cache reads to archive-backed observation ids and revisit full checkpoint/resume and background collection separately |
28
+| SQLite archive | Archive v2 schema, snapshot-level observation grouping, differential write path, v2 verification/delete bookkeeping, daily aggregate rebuilds, integrity report, v2 record reads, SQL diff/count/aggregate/provenance/consolidation-evidence APIs, large synthetic diff pagination coverage, formal timing/memory metrics, and XCTest coverage are in place; the legacy `archive_samples` mirror has been removed, the hot write path now reuses prepared SQLite statements within grouped page writes instead of reparsing the same SQL for every sample, processes sample rows in a lower-allocation streaming loop, batches same-page deleted-object evidence in one transaction, and opens SQLite connections with import-friendly busy timeout / synchronous / temp-store pragmas | Continue moving capture/Dashboard actions to archive/cache DTOs |
29 29
 | Core Data cache | Initial programmatic Core Data model, full-cache rebuild service, read DTOs for observation/type/diff/health rows, and Dashboard archive-cache status wiring are in place | Move remaining export/report paths to cache DTOs and add targeted partial invalidation |
30 30
 | SwiftData cache | Exists; test builds now reset legacy prototype UI/archive/cache stores once for archive v2 so old SwiftData-only snapshots are not treated as backed-up observations. Metric timeout calibration, local device profile settings, operation logging, ContentView preview, Settings data maintenance, legacy detail/PDF views, unused legacy repair/observer services, Dashboard view/view-model access, and legacy anomaly/count-drop review have moved outside SwiftData or been removed. Remaining SwiftData imports are inventoried in [`SwiftData-Retirement-Inventory.md`](SwiftData-Retirement-Inventory.md) | Treat as disposable prototype data; stop returning/storing `HealthSnapshot` bridge handles before removing `ModelContainer` |
31 31
 | UI | Prototype exists; Dashboard status reads archive/cache observation rows and shows cache health, and Dashboard view/view-model code no longer imports SwiftData or reads `ModelContext`; capture/review actions now route through DTOs and snapshot ids, with the remaining legacy bridge isolated in `HealthKitService`. Snapshots and Data Types tab roots no longer import SwiftData, load Core Data cached observation rows, and open archive/cache-backed detail rows; `SnapshotArchiveDetailView` and `DataTypeArchiveDetailView` read Core Data type/diff summaries and page record drill-down through SQLite; unused legacy SwiftData snapshot/type detail and PDF views have been deleted; record-change evolution and temporal distribution screens now receive DTO rows/cache input instead of querying SwiftData directly; export preview reads the archive export API before showing/exporting JSON; simplified detail mode replaces heavy charts with summary rows on small/accessibility layouts or when enabled in Settings; visible change labels now use neutral new/missing/change-review language | Stop writing prototype `HealthSnapshot` bridge rows during capture/review |
+65 -15
HealthProbe/Services/HealthKitService.swift
@@ -1105,7 +1105,7 @@ final class HealthKitService {
1105 1105
                     limit: incrementalStrategy.queryPageLimit
1106 1106
                 )
1107 1107
             }
1108
-            persistenceState = try await archivePage(
1108
+            let archiveResult = try await archivePage(
1109 1109
                 page,
1110 1110
                 sampleType: sampleType,
1111 1111
                 observationID: archiveObservationID,
@@ -1117,6 +1117,7 @@ final class HealthKitService {
1117 1117
                 processedEventCount: processedEventCount,
1118 1118
                 persistenceState: persistenceState
1119 1119
             )
1120
+            persistenceState = archiveResult.persistenceState
1120 1121
             anchor = page.anchor
1121 1122
 
1122 1123
             if page.samples.isEmpty, page.deletedObjects.isEmpty,
@@ -1204,7 +1205,7 @@ final class HealthKitService {
1204 1205
                     limit: incrementalStrategy.queryPageLimit
1205 1206
                 )
1206 1207
             }
1207
-            persistenceState = try await archivePage(
1208
+            let archiveResult = try await archivePage(
1208 1209
                 page,
1209 1210
                 sampleType: sampleType,
1210 1211
                 observationID: archiveObservationID,
@@ -1216,6 +1217,7 @@ final class HealthKitService {
1216 1217
                 processedEventCount: processedEventCount,
1217 1218
                 persistenceState: persistenceState
1218 1219
             )
1220
+            persistenceState = archiveResult.persistenceState
1219 1221
             anchor = page.anchor
1220 1222
 
1221 1223
             applyDistributionPage(page, sampleType: sampleType, to: &recordMap)
@@ -1356,7 +1358,7 @@ final class HealthKitService {
1356 1358
                     limit: importStrategy.queryPageLimit
1357 1359
                 )
1358 1360
             }
1359
-            persistenceState = try await archivePage(
1361
+            let archiveResult = try await archivePage(
1360 1362
                 page,
1361 1363
                 sampleType: sampleType,
1362 1364
                 observationID: archiveObservationID,
@@ -1368,6 +1370,7 @@ final class HealthKitService {
1368 1370
                 processedEventCount: processedEventCount,
1369 1371
                 persistenceState: persistenceState
1370 1372
             )
1373
+            persistenceState = archiveResult.persistenceState
1371 1374
             anchor = page.anchor
1372 1375
 
1373 1376
             for sample in page.samples {
@@ -1571,8 +1574,11 @@ final class HealthKitService {
1571 1574
         progressStarted: Date? = nil,
1572 1575
         processedEventCount: Int? = nil,
1573 1576
         persistenceState: DistributionPagePersistenceState
1574
-    ) async throws -> DistributionPagePersistenceState {
1577
+    ) async throws -> DistributionPageArchiveResult {
1575 1578
         var persistenceState = persistenceState
1579
+        let completedEventCountBeforePage = processedEventCount ?? 0
1580
+        var persistedSampleCount = 0
1581
+        var persistedDeletedCount = 0
1576 1582
         let observedAt = Date()
1577 1583
         if !page.samples.isEmpty {
1578 1584
             var batchStart = 0
@@ -1593,12 +1599,12 @@ final class HealthKitService {
1593 1599
                     progress.updateBlockProgress(
1594 1600
                         typeIdentifier,
1595 1601
                         detail: detail,
1596
-                        recordCount: recordCount,
1602
+                        recordCount: recordCount.map { $0 + persistedSampleCount },
1597 1603
                         elapsedSeconds: progressStarted.map { Date().timeIntervalSince($0) },
1598 1604
                         samplesPerSecond: {
1599
-                            guard let progressStarted, let processedEventCount else { return nil }
1605
+                            guard let progressStarted else { return nil }
1600 1606
                             return Self.samplesPerSecond(
1601
-                                processedCount: processedEventCount,
1607
+                                processedCount: completedEventCountBeforePage + persistedSampleCount + persistedDeletedCount,
1602 1608
                                 elapsedSeconds: Date().timeIntervalSince(progressStarted)
1603 1609
                             )
1604 1610
                         }()
@@ -1610,20 +1616,56 @@ final class HealthKitService {
1610 1616
                     observedAt: observedAt,
1611 1617
                     observationID: observationID
1612 1618
                 )
1619
+                persistedSampleCount += sampleBatch.count
1613 1620
                 persistenceState.registerBatchDuration(Date().timeIntervalSince(batchStartedAt))
1614 1621
                 batchStart = batchEnd
1615 1622
                 await Task.yield()
1616 1623
             }
1617 1624
         }
1618
-        for deletedObject in page.deletedObjects {
1619
-            try await archiveStore.recordDisappearance(
1620
-                sampleUUIDHash: HashService.sampleUUIDHash(deletedObject.uuid.uuidString),
1621
-                sampleTypeIdentifier: sampleType.identifier,
1622
-                observedMissingAt: observedAt,
1623
-                observationID: observationID
1624
-            )
1625
+        if !page.deletedObjects.isEmpty {
1626
+            let deletedHashes = page.deletedObjects.map { HashService.sampleUUIDHash($0.uuid.uuidString) }
1627
+            var deleteBatchStart = 0
1628
+            while deleteBatchStart < deletedHashes.count {
1629
+                let deleteBatchEnd = min(
1630
+                    deleteBatchStart + DistributionCaptureConfiguration.deleteBatchSize,
1631
+                    deletedHashes.count
1632
+                )
1633
+                let deleteBatch = Array(deletedHashes[deleteBatchStart..<deleteBatchEnd])
1634
+                if let progress, let typeIdentifier, let progressDetail {
1635
+                    let detail = deletedHashes.count <= DistributionCaptureConfiguration.deleteBatchSize
1636
+                        ? "\(progressDetail) (deletes \(deleteBatchEnd)/\(deletedHashes.count))"
1637
+                        : "\(progressDetail) (deletes \(deleteBatchStart + 1)-\(deleteBatchEnd)/\(deletedHashes.count))"
1638
+                    progress.updateBlockProgress(
1639
+                        typeIdentifier,
1640
+                        detail: detail,
1641
+                        recordCount: recordCount.map { $0 + persistedSampleCount },
1642
+                        elapsedSeconds: progressStarted.map { Date().timeIntervalSince($0) },
1643
+                        samplesPerSecond: {
1644
+                            guard let progressStarted else { return nil }
1645
+                            return Self.samplesPerSecond(
1646
+                                processedCount: completedEventCountBeforePage + persistedSampleCount + persistedDeletedCount,
1647
+                                elapsedSeconds: Date().timeIntervalSince(progressStarted)
1648
+                            )
1649
+                        }()
1650
+                    )
1651
+                }
1652
+                let deletedCount = try await archiveStore.recordDisappearances(
1653
+                    sampleUUIDHashes: deleteBatch,
1654
+                    sampleTypeIdentifier: sampleType.identifier,
1655
+                    observedMissingAt: observedAt,
1656
+                    observationID: observationID
1657
+                )
1658
+                _ = deletedCount
1659
+                persistedDeletedCount += deleteBatch.count
1660
+                deleteBatchStart = deleteBatchEnd
1661
+                await Task.yield()
1662
+            }
1625 1663
         }
1626
-        return persistenceState
1664
+        return DistributionPageArchiveResult(
1665
+            persistenceState: persistenceState,
1666
+            persistedSampleCount: persistedSampleCount,
1667
+            persistedDeletedCount: persistedDeletedCount
1668
+        )
1627 1669
     }
1628 1670
 
1629 1671
     private static func archiveAnchor(_ anchor: HKQueryAnchor) -> Data? {
@@ -2500,6 +2542,12 @@ struct DistributionPagePersistenceState: Equatable {
2500 2542
     }
2501 2543
 }
2502 2544
 
2545
+struct DistributionPageArchiveResult: Equatable {
2546
+    let persistenceState: DistributionPagePersistenceState
2547
+    let persistedSampleCount: Int
2548
+    let persistedDeletedCount: Int
2549
+}
2550
+
2503 2551
 enum DistributionCaptureConfiguration {
2504 2552
     static let pageTimeoutSeconds: TimeInterval = 60
2505 2553
     static let incrementalStrategy = DistributionCaptureStrategy(
@@ -2537,4 +2585,6 @@ enum DistributionCaptureConfiguration {
2537 2585
             severeBatchThresholdSeconds: 4
2538 2586
         )
2539 2587
     }
2588
+
2589
+    static let deleteBatchSize = 100
2540 2590
 }
+1 -0
HealthProbe/Services/Protocols/HealthArchiveStore.swift
@@ -11,6 +11,7 @@ protocol HealthArchiveStore {
11 11
     func markVerification(sampleType: HKSampleType, verifiedAt: Date, observationID: Int64) async throws
12 12
     func recordDisappearance(sampleUUIDHash: String, sampleTypeIdentifier: String, observedMissingAt: Date) async throws
13 13
     func recordDisappearance(sampleUUIDHash: String, sampleTypeIdentifier: String, observedMissingAt: Date, observationID: Int64) async throws
14
+    func recordDisappearances(sampleUUIDHashes: [String], sampleTypeIdentifier: String, observedMissingAt: Date, observationID: Int64) async throws -> Int
14 15
     func records(for request: HealthArchiveRecordRequest) async throws -> [ArchivedHealthRecord]
15 16
     func diffSummary(_ request: HealthArchiveDiffRequest) async throws -> HealthArchiveDiffSummary
16 17
     func diffRecords(_ request: HealthArchiveDiffRecordRequest) async throws -> [ArchivedHealthRecord]
+83 -23
HealthProbe/Services/SQLiteHealthArchiveStore.swift
@@ -223,6 +223,35 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
223 223
         }
224 224
     }
225 225
 
226
+    func recordDisappearances(
227
+        sampleUUIDHashes: [String],
228
+        sampleTypeIdentifier: String,
229
+        observedMissingAt: Date,
230
+        observationID: Int64
231
+    ) async throws -> Int {
232
+        guard !sampleUUIDHashes.isEmpty else { return 0 }
233
+
234
+        let db = try openDatabase()
235
+        defer { sqlite3_close(db) }
236
+        try prepareSchemaIfNeeded(db)
237
+        try execute("BEGIN IMMEDIATE TRANSACTION", db: db)
238
+        do {
239
+            let deletedCount = try recordDisappearances(
240
+                sampleUUIDHashes: sampleUUIDHashes,
241
+                sampleTypeIdentifier: sampleTypeIdentifier,
242
+                observedMissingAt: observedMissingAt,
243
+                observationID: observationID,
244
+                db: db,
245
+                rebuildDerivedState: false
246
+            )
247
+            try execute("COMMIT", db: db)
248
+            return deletedCount
249
+        } catch {
250
+            try? execute("ROLLBACK", db: db)
251
+            throw error
252
+        }
253
+    }
254
+
226 255
     func records(for request: HealthArchiveRecordRequest) async throws -> [ArchivedHealthRecord] {
227 256
         let db = try openDatabase()
228 257
         defer { sqlite3_close(db) }
@@ -1920,47 +1949,77 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
1920 1949
         db: OpaquePointer?,
1921 1950
         rebuildDerivedState: Bool
1922 1951
     ) throws {
1952
+        _ = try recordDisappearances(
1953
+            sampleUUIDHashes: [sampleUUIDHash],
1954
+            sampleTypeIdentifier: sampleTypeIdentifier,
1955
+            observedMissingAt: observedMissingAt,
1956
+            observationID: observationID,
1957
+            db: db,
1958
+            rebuildDerivedState: rebuildDerivedState
1959
+        )
1960
+    }
1961
+
1962
+    private func recordDisappearances(
1963
+        sampleUUIDHashes: [String],
1964
+        sampleTypeIdentifier: String,
1965
+        observedMissingAt: Date,
1966
+        observationID: Int64,
1967
+        db: OpaquePointer?,
1968
+        rebuildDerivedState: Bool
1969
+    ) throws -> Int {
1923 1970
         let statementCache = SQLiteStatementCache(db: db)
1924 1971
         defer { statementCache.finalizeAll() }
1925 1972
         guard let sampleTypeID = try sampleTypeID(
1926 1973
                 typeIdentifier: sampleTypeIdentifier,
1927 1974
                 db: db,
1928 1975
                 statementCache: statementCache
1929
-              ),
1930
-              let sampleID = try sampleID(
1976
+              ) else {
1977
+            return 0
1978
+        }
1979
+
1980
+        var deletedCount = 0
1981
+        for sampleUUIDHash in sampleUUIDHashes {
1982
+            guard let sampleID = try sampleID(
1931 1983
                 sampleUUIDHash: sampleUUIDHash,
1932 1984
                 sampleTypeID: sampleTypeID,
1933 1985
                 db: db,
1934 1986
                 statementCache: statementCache
1935
-              ) else {
1936
-            return
1987
+            ) else {
1988
+                continue
1989
+            }
1990
+
1991
+            try insertObservationEvent(
1992
+                observationID: observationID,
1993
+                sampleID: sampleID,
1994
+                versionID: nil,
1995
+                eventKind: "disappeared",
1996
+                evidenceKind: "deleted_object",
1997
+                observedAt: observedMissingAt,
1998
+                db: db,
1999
+                statementCache: statementCache
2000
+            )
2001
+            try closeOpenVisibilityRanges(
2002
+                sampleID: sampleID,
2003
+                excludingVersionID: nil,
2004
+                closedAtObservationID: observationID,
2005
+                observedAt: observedMissingAt,
2006
+                db: db,
2007
+                statementCache: statementCache
2008
+            )
2009
+            deletedCount += 1
2010
+        }
2011
+
2012
+        guard deletedCount > 0 else {
2013
+            return 0
1937 2014
         }
1938 2015
 
1939
-        try insertObservationEvent(
1940
-            observationID: observationID,
1941
-            sampleID: sampleID,
1942
-            versionID: nil,
1943
-            eventKind: "disappeared",
1944
-            evidenceKind: "deleted_object",
1945
-            observedAt: observedMissingAt,
1946
-            db: db,
1947
-            statementCache: statementCache
1948
-        )
1949
-        try closeOpenVisibilityRanges(
1950
-            sampleID: sampleID,
1951
-            excludingVersionID: nil,
1952
-            closedAtObservationID: observationID,
1953
-            observedAt: observedMissingAt,
1954
-            db: db,
1955
-            statementCache: statementCache
1956
-        )
1957 2016
         try insertObservationTypeRun(
1958 2017
             observationID: observationID,
1959 2018
             sampleTypeID: sampleTypeID,
1960 2019
             status: "running",
1961 2020
             observedAt: observedMissingAt,
1962 2021
             insertedEventCount: 0,
1963
-            deletedEventCount: 1,
2022
+            deletedEventCount: deletedCount,
1964 2023
             verifiedVisibleCount: nil,
1965 2024
             db: db,
1966 2025
             statementCache: statementCache
@@ -1974,6 +2033,7 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
1974 2033
                 db: db
1975 2034
             )
1976 2035
         }
2036
+        return deletedCount
1977 2037
     }
1978 2038
 
1979 2039
     private func upsertArchiveV2Sample(
+33 -0
HealthProbeTests/SQLiteHealthArchiveStoreTests.swift
@@ -261,6 +261,39 @@ final class SQLiteHealthArchiveStoreTests: XCTestCase {
261 261
         XCTAssertGreaterThan(try countRows(in: "daily_type_aggregates WHERE observation_id = \(observationID)", at: url), 0)
262 262
     }
263 263
 
264
+    func testGroupedObservationCanBatchDeletedObjectsInSingleRun() async throws {
265
+        let url = databaseURL()
266
+        let store = SQLiteHealthArchiveStore(databaseURL: url)
267
+        let samples = [
268
+            makeStepCountSample(value: 42, start: 1_000),
269
+            makeStepCountSample(value: 7, start: 2_000),
270
+            makeStepCountSample(value: 9, start: 3_000)
271
+        ]
272
+        let typeIdentifier = HKQuantityTypeIdentifier.stepCount.rawValue
273
+
274
+        _ = try await store.upsertSamples(samples, observedAt: Date(timeIntervalSince1970: 4_000))
275
+        let observationID = try await store.beginObservation(
276
+            observedAt: Date(timeIntervalSince1970: 4_060),
277
+            triggerReason: "manual",
278
+            selectedTypeSetHash: "selected-types"
279
+        )
280
+        let deletedCount = try await store.recordDisappearances(
281
+            sampleUUIDHashes: samples.map { HashService.sampleUUIDHash($0.uuid.uuidString) },
282
+            sampleTypeIdentifier: typeIdentifier,
283
+            observedMissingAt: Date(timeIntervalSince1970: 4_060),
284
+            observationID: observationID
285
+        )
286
+        try await store.markVerification(
287
+            sampleType: samples[0].sampleType,
288
+            verifiedAt: Date(timeIntervalSince1970: 4_060),
289
+            observationID: observationID
290
+        )
291
+
292
+        XCTAssertEqual(deletedCount, 3)
293
+        XCTAssertEqual(try countRows(in: "sample_observation_events WHERE observation_id = \(observationID) AND event_kind = 'disappeared'", at: url), 3)
294
+        XCTAssertEqual(try countRows(in: "observation_type_runs WHERE observation_id = \(observationID) AND deleted_event_count = 3 AND status = 'completed'", at: url), 1)
295
+    }
296
+
264 297
     func testLargeSyntheticDiffReturnsCountsAndBoundedPages() async throws {
265 298
         let url = databaseURL()
266 299
         let store = SQLiteHealthArchiveStore(databaseURL: url)