Showing 2 changed files with 13 additions and 9 deletions
+10 -2
HealthProbe/Doc/04-project/Import-Optimization-Log.md
@@ -587,6 +587,7 @@ rows exist".
 | 2026-06-04 | `457fd80` | Incrementally replace changed daily aggregate buckets. | Follow-up full-profile delta report completed in `31.2s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=119, delta=8, initialImport=0`, and `DeltaEvents: 22`. Compared with the prior `42.1s` run, `SummedFinalizeElapsed` dropped `19.6s -> 15.5s`, Active Energy finalize dropped `4.8s -> 1.8s`, and total wall clock dropped `42.1s -> 31.2s`. Heart Rate finalize barely moved (`8.9s -> 8.7s`) despite only `5` delta events, proving that daily aggregate replacement helped some changed metrics but Heart Rate is still dominated by type summary `visibleAggregate` full scans. |
 | 2026-06-04 | `2ebfab3` | Incrementally update changed type summaries. | Follow-up full-profile delta report completed in `27.5s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=118, delta=9, initialImport=0`, and `DeltaEvents: 46`. Compared with the prior `31.2s` run, `SummedFinalizeElapsed` dropped `15.5s -> 11.7s`; Heart Rate finalize dropped `8.7s -> 4.8s`; Active Energy finalize stayed bounded at `1.7s`. Remaining cost moved back to delta archive processing: `SummedProcessingElapsed` was `11.6s`, with Heart Rate processing `6.1s`, Active Energy `2.3s`, and Basal Energy `1.9s` for small deltas. |
 | 2026-06-04 | `4894b77` | Patch compact archives from delta without full record maps. | Follow-up full-profile delta report completed in `52.1s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=106, delta=21, initialImport=0`, and `DeltaEvents: 11,093`. This is not comparable to the previous `46`-event baseline: Active Energy had `2,377` events, Basal Energy `2,347`, Cycling Distance `6,052`, and Heart Rate `231`. Processing remained bounded relative to delta size (`SummedProcessingElapsed: 16.0s`; Heart Rate `5.9s`, Active Energy `2.2s`, Basal Energy `1.9s`, Cycling Distance `4.2s`), but wall clock rose because fetch `16.1s`, insert `2.3s`, and finalize `14.8s` all had real work. Conclusion: compact dictionary removal did not regress and looks healthy for large deltas, but a small-delta repeat is still needed to validate the original `6.1s` Heart Rate target. |
+| 2026-06-04 | pending | Hash delta compact archives in recorded order. | Small-delta follow-up completed in `24.0s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=124, delta=3, initialImport=0`, and `DeltaEvents: 13`. This confirmed that the remaining bottleneck is delta processing rather than delta size: Heart Rate still spent `5.7s` processing only `6` events, Active Energy spent `2.1s` for `5` events, and Basal Energy spent `1.8s` for `2` events. Before this change, the delta rebuild no longer built a full UUID record map, but it still collected every fingerprint into a large array and sorted it for the per-type hash. Delta rebuild now uses the same recorded-order `TypeHashBuilder` strategy as initial import, avoiding the all-record fingerprint array and sort. Expected signal: lower Heart Rate processing than `5.7s` on the next small-delta run. |
 
 ## Current Diagnosis
 
@@ -666,6 +667,12 @@ The likely bottleneck is per-row SQLite work:
   seconds in an empty anchored HealthKit query (`Wheelchair Pushes`, `Wheezing`,
   `Zinc`). Treat this as a HealthKit fetch / profile scheduling issue, separate
   from compact archive rebuild.
+- The next small-delta report after `4894b77` confirmed that compact archive
+  rebuild itself is still expensive even with tiny deltas. Heart Rate had only
+  `6` delta events but still spent `5.7s` in processing. The current hypothesis
+  is that per-type hash sorting over all record fingerprints is a large part of
+  that residual cost. Delta hashing now uses recorded order, matching the
+  initial-import hash builder path and removing the full fingerprint array/sort.
 
 ## Open Issues / Observations
 
@@ -707,8 +714,9 @@ Prioritize experiments in this order:
 4. Run a full-profile repeated capture after compact-delta archive patching.
    Compare `SummedProcessingElapsed`, Heart Rate processing time, and
    `DeltaEvents`. Expected success is Heart Rate processing below the previous
-   `6.1s` baseline when its delta remains small. The `52.1s` / `11,093`-event
-   report is useful stress evidence, but not the small-delta validation run.
+   `5.7s` baseline when its delta remains small. The `52.1s` / `11,093`-event
+   report is useful stress evidence; the later `24.0s` / `13`-event report is
+   the current small-delta baseline before recorded-order delta hashing.
 5. Keep using `DeltaEvents` to quantify changed high-volume metrics, especially
    Heart Rate, Active Energy, and Basal Energy. If delta events are small while
    finalize remains large, optimize aggregate rebuild/finalization rather than
+3 -7
HealthProbe/Services/HealthKitService.swift
@@ -1388,14 +1388,13 @@ final class HealthKitService {
             typeIdentifier: typeIdentifier,
             estimatedRecordCount: max(0, previousDistribution.count + deltaSampleCount - deltaDeletedCount)
         )
-        var recordFingerprints: [String] = []
-        recordFingerprints.reserveCapacity(max(0, previousDistribution.count + deltaSampleCount - deltaDeletedCount))
+        var hashBuilder = HashService.TypeHashBuilder(typeIdentifier: typeIdentifier)
         var earliestDate: Date?
         var latestDate: Date?
         var recordCount = 0
 
         func append(_ value: HealthRecordValue) {
-            recordFingerprints.append(value.recordFingerprint)
+            hashBuilder.append(recordFingerprint: value.recordFingerprint)
             earliestDate = min(earliestDate ?? value.startDate, value.startDate)
             latestDate = max(latestDate ?? value.endDate, value.endDate)
             writer.append(value)
@@ -1433,10 +1432,7 @@ final class HealthKitService {
 
         return RebuiltRecordArchive(
             count: recordCount,
-            contentHash: HashService.typeHash(
-                typeIdentifier: typeIdentifier,
-                recordFingerprints: recordFingerprints
-            ),
+            contentHash: hashBuilder.finalize(),
             earliestDate: earliestDate,
             latestDate: latestDate,
             recordArchiveData: writer.finalize()
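
Note on the hashing change: the diff swaps a collect-then-sort hash for an incremental builder, but `HashService.TypeHashBuilder` itself is not shown. The following is a minimal sketch of what such a builder could look like, not the project's implementation. `SketchTypeHashBuilder`, `sketchTypeHash`, and the FNV-1a hash are all assumptions chosen to keep the example dependency-free; only the `init(typeIdentifier:)`, `append(recordFingerprint:)`, and `finalize()` call shapes come from the diff.

```swift
// Hypothetical stand-in for HashService.TypeHashBuilder (the real type is not
// shown in this diff). It folds each fingerprint into a running hash, so memory
// stays constant and no sort over all records is needed.
struct SketchTypeHashBuilder {
    // FNV-1a 64-bit offset basis; FNV-1a is an assumption, used only because
    // it needs no imports. The real builder may use a cryptographic hash.
    private var state: UInt64 = 0xcbf29ce484222325

    init(typeIdentifier: String) {
        absorb(typeIdentifier)
    }

    // Mixes one record fingerprint into the running state.
    mutating func append(recordFingerprint: String) {
        absorb(recordFingerprint)
    }

    // Returns the current state as a hex string.
    func finalize() -> String {
        String(state, radix: 16)
    }

    private mutating func absorb(_ text: String) {
        for byte in text.utf8 {
            state ^= UInt64(byte)
            state = state &* 0x100000001b3 // FNV prime, wrapping multiply
        }
        // 0xFF never occurs in UTF-8, so it separates adjacent fields:
        // ["ab", "c"] and ["a", "bc"] feed different byte streams.
        state ^= 0xff
        state = state &* 0x100000001b3
    }
}

// Convenience wrapper (hypothetical) that hashes fingerprints in the order given.
func sketchTypeHash(typeIdentifier: String, fingerprints: [String]) -> String {
    var builder = SketchTypeHashBuilder(typeIdentifier: typeIdentifier)
    for fingerprint in fingerprints {
        builder.append(recordFingerprint: fingerprint)
    }
    return builder.finalize()
}
```

One property worth checking on the next run, under the assumption that the real builder is order-sensitive like this sketch: a recorded-order hash only matches across captures if the archive replays records in a stable order, whereas the removed sort made the hash independent of order.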