Showing 1 changed file with 54 additions and 9 deletions
HealthProbe/Doc/04-project/Import-Optimization-Log.md
@@ -318,6 +318,49 @@ aggregate/materialized summary work. The next optimization should avoid
 rescanning all visible records when a metric has no appeared, disappeared, or
 representation-changed events in the current observation.
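
The skip condition described above can be sketched as a small decision function. This is a minimal illustration in Python, not the project's implementation: `ObservationEvents`, `finalize_strategy`, and the two strategy names are hypothetical, and the only thing taken from the log is the rule that a metric with no appeared, disappeared, or representation-changed events does not need a rescan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservationEvents:
    """Per-metric event counts for one observation (hypothetical shape)."""
    appeared: int = 0
    disappeared: int = 0
    representation_changed: int = 0

    @property
    def is_unchanged(self) -> bool:
        # Unchanged means all three event counts are zero.
        return (self.appeared, self.disappeared, self.representation_changed) == (0, 0, 0)


def finalize_strategy(events: ObservationEvents) -> str:
    """Pick the cheap path only when nothing about the metric changed."""
    if events.is_unchanged:
        return "copy_previous_aggregates"
    return "rebuild_from_visible_ranges"


print(finalize_strategy(ObservationEvents()))            # copy_previous_aggregates
print(finalize_strategy(ObservationEvents(appeared=3)))  # rebuild_from_visible_ranges
```

Under this rule a single appeared record is enough to send a metric down the rebuild path, which is why a run with any delta cannot validate the copy path for the metrics that changed.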
 
+### 2026-06-03 Incremental Snapshot After Aggregate Copy Path
+
+Commit context: after `19ba656`. Source: user-provided diagnostic report with
+`previousSnapshotID` present and `isChainStart: false`.
+
+This was not a pure no-delta run: total record count increased from 1,579,168 to
+1,579,239 (+71 records). That means high-volume changed metrics still needed
+normal processing/finalization paths, so the aggregate-copy optimization was not
+fully exercised for Heart Rate or Active Energy.
+
+| Metric | Value |
+|--------|-------|
+| Wall clock | 51.3s |
+| Summed metric total | 34.3s |
+| Summed fetch | 0.2s |
+| Summed processing | 21.1s |
+| Summed insert | 0.1s |
+| Summed finalize | 12.1s |
+| Total records | 1,579,239 |
+| Heart Rate processing | 13.9s |
+| Heart Rate finalize | 8.7s |
+| Active Energy processing | 5.0s |
+| Active Energy finalize | 1.9s |
+| Steps processing | 1.1s |
+| Walking + Running Distance processing | 0.9s |
+
347
+Comparison against the previous repeated snapshot report:
348
+
349
+| Metric | Previous | Current | Change |
350
+|--------|----------|---------|--------|
351
+| Total records | 1,579,168 | 1,579,239 | +71 |
352
+| Wall clock | 13.6s | 51.3s | +37.7s |
353
+| Summed processing | 0.0s | 21.1s | +21.1s |
354
+| Summed insert | 0.0s | 0.1s | +0.1s |
355
+| Summed finalize | 12.9s | 12.1s | -0.8s |
356
+| Heart Rate finalize | 9.1s | 8.7s | -0.4s |
357
+
+Conclusion: writes remain bounded (`SummedInsertElapsed` 0.1s), but this report
+shows a new bottleneck for changed high-volume metrics: existing records are
+reprocessed even when only a small number of records changed. A pure no-delta
+report is still needed to validate the aggregate-copy path. The unexplained
+17.0s gap between wall clock (51.3s) and summed metric total (34.3s) also needs
+timing if it persists.
+
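
The gaps in this report can be recomputed directly from the table values. A minimal sketch in Python, using only numbers from the tables above; the variable names are illustrative and are not fields of the diagnostic report.

```python
# Values copied from the 2026-06-03 report tables (seconds).
wall_clock = 51.3
summed_metric_total = 34.3
phases = {"fetch": 0.2, "processing": 21.1, "insert": 0.1, "finalize": 12.1}

# Time spent outside any per-metric timer.
global_gap = round(wall_clock - summed_metric_total, 1)

# Per-metric time not attributed to one of the four named phases.
phase_gap = round(summed_metric_total - sum(phases.values()), 1)

print(f"global gap: {global_gap}s")                    # global gap: 17.0s
print(f"unattributed per-metric time: {phase_gap}s")   # unattributed per-metric time: 0.8s
```

The time outside the per-metric timers is roughly a third of wall clock, while the four named phases account for all but 0.8s of the summed metric total, so any new timing is better spent outside the per-metric loop than inside it.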
 ## Optimization Iterations
 
 | Date | Commit | Change | Result / Status |
@@ -341,7 +384,7 @@ representation-changed events in the current observation.
 | 2026-06-02 | `06ee6be` | Removed unused `sample_versions(start_date, end_date)` and redundant `sample_visibility_ranges(sample_id, last_observation_id)` indexes. | Comparable first-import report was flat: wall clock 12m43s -> 12m39s and summed insert 10m11s -> 10m07s. Treat as no material performance change. |
 | 2026-06-02 | pending | Moved Dashboard archive cache refresh/rebuild off the UI path after snapshot completion. | Awaiting real-device confirmation that the app no longer stays unresponsive for roughly one minute after a completed snapshot. |
 | 2026-06-03 | `f60f09a` | Fast-path unchanged samples whose current version already has an open visibility range. | Confirmed on repeated no-delta capture: `SummedInsertElapsed` remained 0.0s; remaining cost is `SummedFinalizeElapsed` 12.9s, with Heart Rate finalize 9.1s. |
-| 2026-06-03 | pending | Copy previous type summaries and daily aggregates for unchanged metric observations instead of rebuilding from visible ranges. | Awaiting repeated no-delta report. Expected signal is lower `SummedFinalizeElapsed`, especially Heart Rate finalize. |
+| 2026-06-03 | `19ba656` | Copy previous type summaries and daily aggregates for unchanged metric observations instead of rebuilding from visible ranges. | First follow-up was not no-delta (+71 records), so it is inconclusive for the unchanged path. It did show `SummedInsertElapsed` 0.1s and a new changed-metric processing bottleneck: Heart Rate processing 13.9s, Active Energy processing 5.0s. |
 
 ## Current Diagnosis
 
@@ -359,6 +402,7 @@ The likely bottleneck is per-row SQLite work:
 - commit / transaction shape;
 - no-op visibility range maintenance for unchanged existing samples;
 - full daily aggregate rebuilds for unchanged metric observations;
+- processing existing high-volume metrics when only a small number of records changed;
 - Core Data or UI refresh work after SQLite completes. This became the active
   bottleneck after a no-delta incremental snapshot completed in 9.2s but the UI
   remained unresponsive for roughly one minute.
@@ -378,17 +422,18 @@ Prioritize experiments in this order:
 
 1. Confirm whether the background dashboard cache refresh removes the post-import UI freeze. If not, add explicit timings around cache rebuild, dashboard refresh, diagnostic report generation, and sheet dismissal.
 2. Run a repeated no-delta benchmark after copying unchanged metric summaries and daily aggregates. Compare `SummedFinalizeElapsed`, `Heart Rate finalizeElapsed`, `Active Energy finalizeElapsed`, and wall clock.
-3. Run a non-chain-start/full-scan benchmark after skipping unchanged `verified` events and fast-pathing already-open visibility ranges. Compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, `Steps insertElapsed`, and `Walking + Running Distance insertElapsed`.
-4. Reduce any remaining per-sample SQLite writes for unchanged existing samples during non-chain-start full scans.
-5. Profile whether index maintenance dominates first-import insert cost.
-6. Consider a guarded bulk-import mode for first observations:
+3. Add or inspect timing around per-record processing for changed high-volume metrics, especially Heart Rate, to separate sample DTO/fingerprint work from SQLite idempotency checks.
+4. Run a non-chain-start/full-scan benchmark after skipping unchanged `verified` events and fast-pathing already-open visibility ranges. Compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, `Steps insertElapsed`, and `Walking + Running Distance insertElapsed`.
+5. Reduce any remaining per-sample SQLite writes for unchanged existing samples during non-chain-start full scans.
+6. Profile whether index maintenance dominates first-import insert cost.
+7. Consider a guarded bulk-import mode for first observations:
    - keep archive semantics unchanged;
    - only relax work that can be safely reconstructed or validated;
    - re-enable normal idempotent paths for incremental observations.
-7. Run a fresh first-import benchmark after the unused-index removal and compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, and `Active Energy insertElapsed`.
-8. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce `samples` / `sample_versions` / `sample_observation_events` write cost without weakening final archive integrity.
-9. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
-10. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
+8. Run a fresh first-import benchmark after the unused-index removal and compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, and `Active Energy insertElapsed`.
+9. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce `samples` / `sample_versions` / `sample_observation_events` write cost without weakening final archive integrity.
+10. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
+11. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
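
The per-record timing split proposed in the new step 3 could look roughly like the following. This is a sketch in Python under stated assumptions, not the app's code: `PhaseTimer`, `process_records`, the phase names, the SHA-256 fingerprint, and the two-column `sample_versions` schema are all hypothetical stand-ins chosen to show how fingerprint work and SQLite idempotency checks can be accumulated separately.

```python
import hashlib
import sqlite3
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase across many records."""

    def __init__(self):
        self.elapsed = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.elapsed[name] += time.perf_counter() - start
            self.calls[name] += 1


def process_records(records, db, timer):
    """Return how many records have no stored version with the same fingerprint."""
    changed = 0
    for sample_id, payload in records:
        with timer.phase("fingerprint"):
            fingerprint = hashlib.sha256(payload).hexdigest()
        with timer.phase("idempotency_check"):
            row = db.execute(
                "SELECT 1 FROM sample_versions WHERE sample_id = ? AND fingerprint = ?",
                (sample_id, fingerprint),
            ).fetchone()
        if row is None:
            changed += 1
    return changed


# Hypothetical schema and data, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sample_versions (sample_id TEXT, fingerprint TEXT)")
db.execute(
    "INSERT INTO sample_versions VALUES (?, ?)",
    ("a", hashlib.sha256(b"hr:72").hexdigest()),
)

timer = PhaseTimer()
changed = process_records([("a", b"hr:72"), ("b", b"hr:80")], db, timer)
print(changed, dict(timer.calls))
```

Accumulating into per-phase totals rather than logging each record keeps the instrumentation overhead small relative to roughly 1.5 million records, and the two totals can be reported next to the existing processing figure.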
 
 ## Verification Checklist For Each Optimization