| 2026-06-04 | `7e3b997` | Avoid summary full scans when delta safely replaces extremes. | Follow-up after `d4de48c` still showed `SummedFinalizeElapsed: 9.3s`, with Heart Rate finalize at `5.0s` despite only `4` Heart Rate delta events. `markVerification` already tries to update type summaries incrementally, but it fell back to a full visible-row aggregate scan whenever a removed row matched the previous earliest/latest/max, even if the same delta added a row that safely preserved or extended that extreme. The fallback is now narrower: full scan is still used when an extreme becomes unknown, but not when added rows replace the removed earliest/latest/max with an equivalent or stronger value. Follow-up report with `reportSchemaVersion: 3` and `buildFingerprint: 1.0(1)-1780603665-92064` completed in `21.0s`, with `127/127` complete, `CaptureModes: unchangedDelta=121, delta=6`, and `DeltaEvents: 20`. This was effectively flat versus the `d4de48c` report: wall `21.1s -> 21.0s`, processing `7.8s -> 7.4s`, finalize `9.3s -> 9.4s`, and Heart Rate finalize `5.0s -> 4.8s` while Heart Rate had `7` delta events. Conclusion: this optimization did not materially move the bottleneck; either the safe-extreme case did not trigger or type-summary fallback is no longer the dominant finalize cost. |
| 2026-06-04 | `f73f076` | Add archive finalization phase timings to diagnostics. | The post-`7e3b997` report proved the top-level `finalizeElapsed` bucket was too opaque for the next optimization. Diagnostics now split finalization into event-count/previous-summary lookup, type-summary work, daily-aggregate work, observation-type-run update, and residual other time. Follow-up report with `reportSchemaVersion: 3` and `buildFingerprint: 1.0(1)-1780606903-92064` completed in `22.6s`, with `127/127` complete, `CaptureModes: unchangedDelta=119, delta=8`, and `DeltaEvents: 27`. Finalization was `10.3s`: event-count/previous-summary lookup `1.8s`, type-summary `0.0s`, daily aggregates `7.3s`, run update `0.0s`, and other `1.2s`. Heart Rate had `9` delta events and spent `4.8s` finalizing, of which `3.8s` was daily aggregate work and `0.9s` was event-count/previous-summary lookup. Conclusion: the remaining finalize bottleneck is not type-summary fallback; it is changed-type daily aggregate maintenance, especially Heart Rate. |
| 2026-06-04 | older build / schema v2 | Captured large first-import baseline on a bigger device database. | Initial full-profile snapshot on an older build completed with `127/127` metrics and `8,421,978` records, but it used `reportSchemaVersion: 2` and has no build fingerprint. Treat it as a volume/shape baseline, not a precise current-build comparison. Wall clock was `166m10s`; summed fetch `5m19s`, processing `20m29s`, insert `137m31s`, finalize `1m53s`. The high-volume types dominated: Heart Rate `2,225,738` records and `46m57s` total (`39m16s` insert), Active Energy `1,914,449` records and `41m35s` total (`35m21s` insert), another high-volume type around `2,007,920` records and `41m20s` total (`34m29s` insert), and Basal Energy `1,116,074` records and `21m37s` total (`17m48s` insert). Conclusion: for clean first imports on very large databases, SQLite insert/index/write-path cost remains the central risk; incremental daily-aggregate optimization should not add first-import indexes without measurement. |
-| 2026-06-05 | pending | Split daily aggregate finalization timings. | The first finalization phase report identified daily aggregate work as the remaining changed-type bottleneck, but `finalizeDailyAggregateElapsed` still mixed affected-bucket lookup, previous aggregate copy, destination delete, affected-bucket rebuild, replacement insert, and residual SQL/transaction overhead. Diagnostics now emit aggregate and per-type daily subphase fields: bucket lookup, copy, delete, rebuild, insert, and other. Expected signal: the next repeated full-profile report should say whether Heart Rate's `~3.8s` daily aggregate cost is mostly copying previous materialized rows or rebuilding affected buckets from visible samples. Do not add `sample_versions`/visibility date indexes until this split shows rebuild is dominant, because the `8.4M`-record first-import baseline shows insert/index overhead is already the large-database risk. |
+| 2026-06-05 | `6041bac` | Split daily aggregate finalization timings. | The first finalization phase report identified daily aggregate work as the remaining changed-type bottleneck, but `finalizeDailyAggregateElapsed` still mixed affected-bucket lookup, previous aggregate copy, destination delete, affected-bucket rebuild, replacement insert, and residual SQL/transaction overhead. Diagnostics now emit aggregate and per-type daily subphase fields: bucket lookup, copy, delete, rebuild, insert, and other. Follow-up report with `buildFingerprint: 1.0(1)-1780618540-92064` completed in `23.5s`, with `127/127` complete, `CaptureModes: unchangedDelta=118, delta=9`, and `DeltaEvents: 97`. Finalization was `10.5s`, daily aggregate work was `7.4s`, and daily rebuild alone was `6.9s`; daily copy was only `0.5s`. Heart Rate had `40` delta events and spent `4.8s` finalizing, of which `3.8s` was daily aggregate rebuild. Conclusion: copying previous materialized daily rows is not the bottleneck; affected-bucket rebuild scans are. |
+| 2026-06-05 | pending | Rebuild changed daily aggregate buckets from time-ranged versions. | The changed-bucket rebuild query previously started from all samples for the type and only then filtered version `start_date`; for Heart Rate this can traverse roughly `900k` visible rows to rebuild a few affected days. The query now starts from `sample_versions(start_date, sample_id)` for the affected date window, joins to `samples` for type filtering, and joins open visibility ranges by `(sample_id, version_id, last_observation_id)`. Expected signal: repeated full-profile captures should reduce `SummedFinalizeDailyAggregateRebuildElapsed`, especially Heart Rate's `3.8s` daily rebuild. Risk to monitor: the new `sample_versions(start_date, sample_id)` index adds first-import write/index cost, so keep checking large first-import insert timing before accepting this as a permanent schema tradeoff. |
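
The time-ranged rebuild described in the pending entry above can be sketched as follows. This is a minimal illustration using Python's `sqlite3`, not the app's actual query. The `visibility_ranges` table, the `type` and `value` columns, the index names, the helper names `rebuild_buckets` and `query_plan`, epoch-second dates, and the reading of an "open" range as `last_observation_id IS NULL` are all assumptions. Only the `sample_versions(start_date, sample_id)` index and the visibility join keys come from the log.

```python
import sqlite3

# Hypothetical minimal schema. Only the sample_versions(start_date, sample_id)
# index and the visibility join keys are taken from the log entry.
SCHEMA = """
CREATE TABLE samples (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
CREATE TABLE sample_versions (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES samples(id),
    start_date REAL NOT NULL,    -- epoch seconds (assumed)
    value REAL NOT NULL
);
CREATE TABLE visibility_ranges (
    sample_id INTEGER NOT NULL,
    version_id INTEGER NOT NULL,
    last_observation_id INTEGER  -- NULL = range still open (assumed)
);
CREATE INDEX idx_versions_start ON sample_versions(start_date, sample_id);
CREATE INDEX idx_visibility
    ON visibility_ranges(sample_id, version_id, last_observation_id);
"""

# Start from the date window on sample_versions, then filter by type and
# by open visibility, and aggregate per day bucket.
REBUILD_SQL = """
SELECT CAST(v.start_date / 86400 AS INTEGER) AS day,
       COUNT(*)     AS n,
       SUM(v.value) AS total,
       MAX(v.value) AS peak
FROM sample_versions AS v
JOIN samples AS s
  ON s.id = v.sample_id AND s.type = :type
JOIN visibility_ranges AS r
  ON r.sample_id = v.sample_id
 AND r.version_id = v.id
 AND r.last_observation_id IS NULL
WHERE v.start_date >= :lo AND v.start_date < :hi
GROUP BY day
ORDER BY day
"""


def rebuild_buckets(conn, sample_type, lo, hi):
    """Recompute daily buckets for one type inside the window [lo, hi)."""
    params = {"type": sample_type, "lo": lo, "hi": hi}
    return conn.execute(REBUILD_SQL, params).fetchall()


def query_plan(conn, sample_type, lo, hi):
    """Return the detail column of EXPLAIN QUERY PLAN for the rebuild query."""
    params = {"type": sample_type, "lo": lo, "hi": hi}
    rows = conn.execute("EXPLAIN QUERY PLAN " + REBUILD_SQL, params).fetchall()
    return [row[3] for row in rows]
```

Printing `query_plan(...)` on a real database is one way to check whether SQLite actually starts from the date range on `sample_versions`. The planner's choice depends on the real schema and statistics, so it is worth confirming rather than assuming.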
## Current Diagnosis
|
|
599
|
600
|
|
|
710
|
711
|
  then rebuilds affected buckets; distinguish copy cost from affected-bucket
  rebuild cost before adding new SQLite indexes, because first-import reports
  show insert/index overhead is already the dominant large-database risk.
+- Daily aggregate subphase timings proved the problem is affected-bucket rebuild,
+  not daily aggregate copy. On the `6041bac` report, daily rebuild was `6.9s`
+  of `7.4s` daily aggregate work, while copy was `0.5s`. Heart Rate alone spent
+  `3.8s` rebuilding affected daily buckets. The next experiment changes the
+  affected-bucket rebuild query shape so it starts from time-ranged
+  `sample_versions` instead of all samples of the type.
- A large older-build first import on an `8.4M`-record database completed but
  took `166m10s`, with `137m31s` summed insert time. This confirms that full
  authorized backup volume can be much larger than the original 15-type test
|
|
773
|
780
|
   identity unless the build provenance is otherwise certain. `sourceCommit`
   and `sourceDirty` are useful when present, but may be `unknown` for normal
   Xcode test installs.
-8. Run a repeated full-profile capture with daily aggregate subphase timings.
-   The current known target is Heart Rate small deltas:
-   `finalizeDailyAggregateElapsed` was `3.8s` for `9` events. Compare
-   `finalizeDailyAggregateCopyElapsed`,
-   `finalizeDailyAggregateRebuildElapsed`,
-   `finalizeDailyAggregateBucketLookupElapsed`,
-   `finalizeDailyAggregateInsertElapsed`, and
-   `finalizeDailyAggregateOtherElapsed` before adding indexes that could slow
-   first import.
+8. Run a repeated full-profile capture after the time-ranged daily aggregate
+   rebuild query. Compare `SummedFinalizeDailyAggregateRebuildElapsed` and Heart
+   Rate `finalizeDailyAggregateRebuildElapsed` against the `6.9s` total /
+   `3.8s` Heart Rate baseline from `6041bac`. Also watch first-import insert
+   timing on the next clean large-database import because the new
+   `sample_versions(start_date, sample_id)` index is a write-path tradeoff.
9. Investigate replacing legacy compact `recordArchiveData` delta rebuild with
   a SQLite-derived capture-state/hash path. The current repeated full-profile
   reports still spend about `4s` processing Heart Rate for tiny deltas because