Showing 3 changed files with 97 additions and 16 deletions
+7 -7
HealthProbe/Doc/00-agent-guides/AGENTS.md
@@ -222,13 +222,13 @@ struct TypeDistributionBin: Codable, Sendable {
 // representation context, not treated as inherently alarming.
 
 // Interface updated 2026-05-17 — see AGENTS.md
-// Models/TypeCount.detailCacheData stores precomputed detail data for the current
-// TypeCount compared with the immediately previous snapshot on the same device.
-// The cache contains aggregate added/disappeared counts, capped preview records for
-// UI drill-down, and daily change bins for temporal charts. It must be computed when
-// snapshots are saved and refreshed for neighboring snapshots when snapshot deletion
-// changes chain links. Existing stores are backfilled incrementally with a strict
-// per-launch TypeCount cap to avoid decoding many large archives in one run.
+// Models/TypeCount.detailCacheData is a legacy SwiftData UI cache for comparing a
+// TypeCount with the immediately previous snapshot on the same device. It contains
+// aggregate added/disappeared counts, capped preview records, and daily change bins.
+// It may be precomputed only for bounded archive sizes; high-volume types must skip
+// this legacy cache and rely on SQLite/Core Data archive views instead. Existing
+// stores are backfilled incrementally with strict caps to avoid decoding many large
+// archives in one run.
 
 // Interface updated 2026-05-17 — see AGENTS.md
 // Models/HealthSnapshot.contentEquivalentSnapshotID marks snapshots whose TypeCount
+47 -6
HealthProbe/Doc/04-project/Import-Optimization-Log.md
@@ -361,6 +361,42 @@ records when a small number of records changed. A pure no-delta report is still
 needed to validate the aggregate-copy path. The unexplained global gap between
 wall clock and summed metric total also needs timing if it persists.
 
+### 2026-06-03 Incremental Snapshot With Large Legacy Detail Cache Crash
+
+Commit context: after `19ba656`. Source: user-provided diagnostic report plus
+console crash log.
+
+This was still not a pure no-delta run: total record count increased from
+1,579,239 to 1,579,243 (+4 records). The report shows that archive inserts
+remained effectively eliminated, while aggregate finalization improved versus
+the previous changed incremental run.
+
+| Metric | Previous | Current | Change |
+|--------|----------|---------|--------|
+| Total records | 1,579,239 | 1,579,243 | +4 |
+| Wall clock | 51.3s | 44.7s | -6.6s |
+| Summed metric total | 34.3s | 26.7s | -7.6s |
+| Summed fetch | 0.2s | 0.2s | flat |
+| Summed processing | 21.1s | 18.4s | -2.7s |
+| Summed insert | 0.1s | 0.0s | -0.1s |
+| Summed finalize | 12.1s | 7.4s | -4.7s |
+| Heart Rate processing | 13.9s | 13.5s | -0.4s |
+| Heart Rate finalize | 8.7s | 4.8s | -3.9s |
+| Active Energy processing | 5.0s | 4.9s | -0.1s |
+| Active Energy finalize | 1.9s | 1.8s | -0.1s |
+
+The follow-up console log identifies the remaining post-import issue:
+`healthKit.precomputeDetailCaches` built legacy SwiftData detail caches for
+Heart Rate and Active Energy, scanning archives of about 137 MB / 922k records
+and 52 MB / 349k records. Immediately after that work completed, Core Data
+aborted on the main thread with a mutated-while-enumerated exception during
+change processing.
+
+Conclusion: the SQLite archive path is not the crash source. The expensive and
+crash-prone work is the legacy `TypeCount.detailCacheData` precompute after the
+snapshot save. Large type detail caches should be skipped and served from the
+SQLite/Core Data archive/cache path instead.
+
 ## Optimization Iterations
 
 | Date | Commit | Change | Result / Status |
@@ -385,6 +421,7 @@ wall clock and summed metric total also needs timing if it persists.
 | 2026-06-02 | pending | Moved Dashboard archive cache refresh/rebuild off the UI path after snapshot completion. | Awaiting real-device confirmation that the app no longer stays unresponsive for roughly one minute after a completed snapshot. |
 | 2026-06-03 | `f60f09a` | Fast-path unchanged samples whose current version already has an open visibility range. | Confirmed on repeated no-delta capture: `SummedInsertElapsed` remained 0.0s; remaining cost is `SummedFinalizeElapsed` 12.9s, with Heart Rate finalize 9.1s. |
 | 2026-06-03 | `19ba656` | Copy previous type summaries and daily aggregates for unchanged metric observations instead of rebuilding from visible ranges. | First follow-up was not no-delta (+71 records), so it is inconclusive for the unchanged path. It did show `SummedInsertElapsed` 0.1s and a new changed-metric processing bottleneck: Heart Rate processing 13.9s, Active Energy processing 5.0s. |
+| 2026-06-03 | pending | Skip legacy SwiftData detail-cache precompute for large type archives. | Triggered by a crash after building Heart Rate and Active Energy detail caches. Expected signal: no post-import `NSGenericException`, lower post-import wall-clock gap, and `healthKit.detailCache.skipLargeLegacyArchive` logs for high-volume types. |
 
 ## Current Diagnosis
 
@@ -401,18 +438,20 @@ The likely bottleneck is per-row SQLite work:
 - multiple dependent writes per sample;
 - commit / transaction shape;
 - no-op visibility range maintenance for unchanged existing samples;
-- full daily aggregate rebuilds for unchanged metric observations;
 - processing existing high-volume metrics when only a small number of records changed;
-- Core Data or UI refresh work after SQLite completes. This became the active
-  bottleneck after a no-delta incremental snapshot completed in 9.2s but the UI
-  remained unresponsive for roughly one minute.
+- legacy SwiftData detail-cache precompute after SQLite completes. This became
+  the active crash/performance issue when Heart Rate and Active Energy detail
+  caches scanned large archives and Core Data aborted during change processing.
 
 ## Open Issues / Observations
 
 - Very small pages reduced freeze risk but introduced visible overhead.
 - Some progress timing displayed in the UI did not include overhead, so elapsed time and rates looked better than the real operation.
 - A previous Heart Rate import appeared to stall for long periods around roughly 900k records, but later progress resumed; avoid classifying this as a hard timeout without report evidence.
-- After a completed import, the app may remain unresponsive for more than one minute. A no-delta incremental report shows the import itself completed in 9.2s, so post-import cache rebuild / UI refresh is the likely cause.
+- After a completed import, the app may remain unresponsive or crash in legacy
+  post-import cache work. A 2026-06-03 console log showed Heart Rate and Active
+  Energy `TypeCount.detailCacheData` precompute immediately before a Core Data
+  mutated-while-enumerated abort.
 - Partial / old imported observations can pollute comparisons. Fresh first-snapshot performance comparisons should use a confirmed reset database.
 - Non-chain-start full scans can be slower than first imports if unchanged existing samples still write per-sample archive evidence.
 
@@ -420,7 +459,9 @@ The likely bottleneck is per-row SQLite work:
 
 Prioritize experiments in this order:
 
-1. Confirm whether the background dashboard cache refresh removes the post-import UI freeze. If not, add explicit timings around cache rebuild, dashboard refresh, diagnostic report generation, and sheet dismissal.
+1. Run an incremental snapshot after skipping large legacy detail caches. Confirm
+   that Heart Rate / Active Energy emit `healthKit.detailCache.skipLargeLegacyArchive`
+   and that the app does not crash or freeze after import.
 2. Run a repeated no-delta benchmark after copying unchanged metric summaries and daily aggregates. Compare `SummedFinalizeElapsed`, `Heart Rate finalizeElapsed`, `Active Energy finalizeElapsed`, and wall clock.
 3. Add or inspect timing around per-record processing for changed high-volume metrics, especially Heart Rate, to separate sample DTO/fingerprint work from SQLite idempotency checks.
 4. Run a non-chain-start/full-scan benchmark after skipping unchanged `verified` events and fast-pathing already-open visibility ranges. Compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, `Steps insertElapsed`, and `Walking + Running Distance insertElapsed`.
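Review note: before running experiment 1, it may help to sanity-check that the two archives named in the 2026-06-03 console log would actually cross the limits this change adds to `HealthKitService.swift` (250,000 records and 32 MiB). The sketch below is a standalone approximation, not the app's code: `ArchiveSample` and `skipReason(for:)` are hypothetical stand-ins for `TypeCount` and `legacyDetailCachePrecomputeSkipReason`, and the byte counts are rounded from the "about 137 MB" and "about 52 MB" figures in the log.

```swift
// Hypothetical stand-in for TypeCount; only the two fields the check reads.
struct ArchiveSample {
    let name: String
    let recordCount: Int
    let archiveBytes: Int
}

// Limits copied from the constants added in this change.
let recordLimit = 250_000
let archiveByteLimit = 32 * 1_024 * 1_024  // 33,554,432 bytes

// Mirrors the order of checks in the diff: record count first, then byte size.
func skipReason(for sample: ArchiveSample) -> String? {
    if sample.recordCount > recordLimit { return "recordLimit" }
    if sample.archiveBytes > archiveByteLimit { return "archiveByteLimit" }
    return nil
}

// Approximate values taken from the console log summary (assumed rounding).
let samples = [
    ArchiveSample(name: "Heart Rate", recordCount: 922_000, archiveBytes: 137_000_000),
    ArchiveSample(name: "Active Energy", recordCount: 349_000, archiveBytes: 52_000_000),
]

for sample in samples {
    print(sample.name, skipReason(for: sample) ?? "build")
}
```

Under these assumed figures both types exceed the record limit before the byte limit is consulted, so the expected `reason` in the `healthKit.detailCache.skipLargeLegacyArchive` log would be `recordLimit` for each.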
+43 -3
HealthProbe/Services/HealthKitService.swift
@@ -5,6 +5,8 @@ import UIKit
 import os.log
 
 private let logger = Logger(subsystem: "ro.xdev.healthprobe", category: "HealthKitService")
+private let legacyDetailCachePrecomputeRecordLimit = 250_000
+private let legacyDetailCachePrecomputeArchiveByteLimit = 32 * 1_024 * 1_024
 
 enum TypeCategory: String, CaseIterable {
     case activity    = "Activity"
@@ -665,6 +667,7 @@ final class HealthKitService {
         )
         var builtCount = 0
         var skippedAliasCount = 0
+        var skippedLargeCount = 0
 
         for typeCount in typeCounts {
             if typeCount.isContentAlias {
@@ -673,15 +676,34 @@ final class HealthKitService {
                 continue
             }
 
+            let previousType = previousByType[typeCount.typeIdentifier]
+            if let skipReason = legacyDetailCachePrecomputeSkipReason(
+                current: typeCount,
+                previous: previousType
+            ) {
+                typeCount.setDetailCache(nil)
+                skippedLargeCount += 1
+                MemoryLog.log("healthKit.detailCache.skipLargeLegacyArchive", metadata: detailCacheMetadata(
+                    current: typeCount,
+                    previous: previousType,
+                    source: "snapshotSave"
+                ).merging([
+                    "reason": skipReason,
+                    "recordLimit": "\(legacyDetailCachePrecomputeRecordLimit)",
+                    "archiveByteLimit": MemoryLog.format(UInt64(legacyDetailCachePrecomputeArchiveByteLimit))
+                ]) { _, new in new })
+                continue
+            }
+
             MemoryLog.log("healthKit.detailCache.buildBegin", metadata: detailCacheMetadata(
                 current: typeCount,
-                previous: previousByType[typeCount.typeIdentifier],
+                previous: previousType,
                 source: "snapshotSave"
             ))
             typeCount.setDetailCache(
                 TypeCountDetailCacheBuilder.build(
                     current: typeCount,
-                    previous: previousByType[typeCount.typeIdentifier],
+                    previous: previousType,
                     baselineSnapshotID: previous.id
                 )
             )
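Review note: the skip branch in the hunk above combines the base metadata with skip-specific keys through `Dictionary.merging(_:uniquingKeysWith:)`, using `{ _, new in new }` so that the skip-specific value wins on any key collision. A minimal sketch of that behavior follows; the keys and values are hypothetical, since the real output of `detailCacheMetadata` is not shown in this diff.

```swift
// Hypothetical base metadata; the real detailCacheMetadata keys are not shown here.
let baseMetadata: [String: String] = [
    "typeIdentifier": "HKQuantityTypeIdentifierHeartRate",
    "source": "snapshotSave",
]

// Skip-specific keys. "source" is repeated only to demonstrate collision handling.
let skipMetadata: [String: String] = [
    "reason": "recordLimit",
    "recordLimit": "\(250_000)",
    "source": "override",
]

// The closure receives (existing, new) for each duplicate key; returning `new`
// means the value from skipMetadata replaces the value from baseMetadata.
let merged = baseMetadata.merging(skipMetadata) { _, new in new }

print(merged["source"] ?? "missing")
print(merged.count)
```

If the base metadata ever gains a `reason`, `recordLimit`, or `archiveByteLimit` key, the skip log would silently overwrite it, which seems to be the intended precedence here.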
@@ -694,7 +716,8 @@ final class HealthKitService {
         }
         MemoryLog.log("healthKit.precomputeDetailCaches.end", metadata: [
             "builtCount": "\(builtCount)",
-            "skippedAliasCount": "\(skippedAliasCount)"
+            "skippedAliasCount": "\(skippedAliasCount)",
+            "skippedLargeCount": "\(skippedLargeCount)"
         ])
     }
 
@@ -770,6 +793,23 @@ final class HealthKitService {
         ]
     }
 
+    private func legacyDetailCachePrecomputeSkipReason(current: TypeCount, previous: TypeCount?) -> String? {
+        let largestRecordCount = max(max(current.count, 0), max(previous?.count ?? 0, 0))
+        if largestRecordCount > legacyDetailCachePrecomputeRecordLimit {
+            return "recordLimit"
+        }
+
+        let largestArchiveByteCount = max(
+            current.recordArchiveData?.count ?? 0,
+            previous?.recordArchiveData?.count ?? 0
+        )
+        if largestArchiveByteCount > legacyDetailCachePrecomputeArchiveByteLimit {
+            return "archiveByteLimit"
+        }
+
+        return nil
+    }
+
     private func hasAmbiguousCompleteDisappearance(
         snapshot: HealthSnapshot,
         typeCounts: [TypeCount],