HealthProbe Import Optimization Log

Canonical path: HealthProbe/Doc/04-project/Import-Optimization-Log.md
Created: 2026-06-02
Purpose: Track import performance work, measured results, regressions, and next experiments.

This is a living project log. Update it after each import optimization commit and after each real-device import report.

Scope

The current optimization target is the initial / full-history HealthKit import into the SQLite archive.

Primary goals:

  • complete large first-run imports without app freeze or watchdog-like stalls;
  • keep memory bounded for low-end devices;
  • reduce wall-clock duration enough to make future background / scheduled collection realistic;
  • keep archive writes idempotent and differential;
  • preserve SQLite as the source of truth.

Non-goals for this log:

  • redesigning HealthKit background collection strategy;
  • changing archive semantics;
  • optimizing UI rendering after import, except when post-import work blocks the app.

Measurement Fields

Use the HealthProbe diagnostic report fields below for comparisons:

| Field | Meaning |
| --- | --- |
| WallClockDuration / Duration | User-visible total operation time. |
| SummedFetchElapsed | Time spent fetching HealthKit samples. Per-metric sums may overlap. |
| SummedProcessingElapsed | Time spent converting HealthKit samples into archive rows. |
| SummedInsertElapsed | Time spent writing rows to SQLite. Current main bottleneck. |
| SummedFinalizeElapsed | Type verification, aggregate rebuild, and finalization cost. |
| Per-type insertElapsed | Most useful field for high-volume types such as Heart Rate and Active Energy. |

Important interpretation:

  • per-metric timing sums can exceed wall-clock time because type fetches overlap;
  • progress rates shown during import may overestimate throughput if overhead is not included;
  • compare first snapshots only against first snapshots after a database reset.
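
To illustrate why a summed per-metric timing can exceed wall clock, here is a minimal Python sketch. The intervals are made-up numbers rather than values from any report, and the function names are hypothetical; the app's actual timing code may be structured differently.

```python
def summed_elapsed(intervals):
    """Sum of individual durations; overlapping time is counted once per interval."""
    return sum(end - start for start, end in intervals)


def wall_clock(intervals):
    """Length of the union of the intervals: time during which any of them was active."""
    total = 0.0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged span.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the merged span.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical fetch windows in seconds, one (start, end) pair per metric.
fetch_windows = [(0.0, 20.0), (5.0, 30.0), (25.0, 40.0)]
print(summed_elapsed(fetch_windows))  # 60.0
print(wall_clock(fetch_windows))      # 40.0
```

The summed value counts the overlapping seconds once per metric, which is why it should be read as a cost indicator rather than as elapsed time.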

Real-Device Results

2026-06-02 Baseline Before Latest Batch/Chunk Work

Source: user-provided diagnostic report.

| Metric | Value |
| --- | --- |
| Wall clock | 20m 25s |
| Summed fetch | 1m 03s |
| Summed processing | 1m 31s |
| Summed insert | 17m 02s |
| Summed finalize | 9.2s |
| Heart Rate count | 923,466 |
| Heart Rate insert | 10m 45s |
| Active Energy insert | 4m 29s |
| Steps insert | 27.1s |
| Walking + Running Distance insert | 21.8s |

Conclusion: SQLite insert dominated the run. HealthKit fetch was not the limiting factor.

2026-06-02 After Batched Initial Archive Writes

Source: user-provided diagnostic report after commit a026566.

| Metric | Value |
| --- | --- |
| Wall clock | 18m 21s |
| Summed insert | 15m 44s |
| Heart Rate insert | 10m 03s |
| Active Energy insert | 3m 51s |
| Steps insert | 28.1s |
| Walking + Running Distance insert | 25.4s |

Conclusion: batching produced a useful improvement, but insert remained dominant.

2026-06-02 After Larger Initial Write Chunks

Source: user-provided diagnostic report after commit c138b7b.

| Metric | Value |
| --- | --- |
| Wall clock | 18m 30s |
| Summed metric total | 18m 02s |
| Summed fetch | 46.8s |
| Summed processing | 1m 37s |
| Summed insert | 15m 24s |
| Summed finalize | 10.5s |
| Heart Rate count | 922,404 |
| Heart Rate total | 11m 23s |
| Heart Rate fetch | 21.2s |
| Heart Rate processing | 56.1s |
| Heart Rate insert | 9m 58s |
| Active Energy count | 348,635 |
| Active Energy insert | 3m 48s |
| Steps insert | 24.2s |
| Walking + Running Distance insert | 20.0s |

Conclusion: larger chunks gave only marginal gains. Further optimization should reduce per-sample SQLite work rather than only increasing page/chunk size.
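
The effect of chunk size can be sketched in Python with the standard `sqlite3` module. This is an illustration only: the `samples_demo` table, its columns, and `insert_in_chunks` are hypothetical and do not reflect the app's schema or Swift write path.

```python
import sqlite3


def insert_in_chunks(conn, rows, chunk_size):
    """Insert rows using one transaction per chunk; returns the number of commits."""
    commits = 0
    for offset in range(0, len(rows), chunk_size):
        chunk = rows[offset:offset + chunk_size]
        with conn:  # commits on success, rolls back on exception
            conn.executemany(
                "INSERT INTO samples_demo (uuid_hash, value) VALUES (?, ?)", chunk
            )
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples_demo (uuid_hash TEXT PRIMARY KEY, value REAL)")
rows = [(f"hash-{i}", float(i)) for i in range(10_000)]
commits = insert_in_chunks(conn, rows, chunk_size=2_500)
print(commits)  # 4
print(conn.execute("SELECT COUNT(*) FROM samples_demo").fetchone()[0])  # 10000
```

Doubling `chunk_size` halves the number of commits, but every row still pays for its primary-key check and index updates. That is consistent with the marginal gain measured above once chunks were already large.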

2026-06-02 After Direct Inserts For New Archive Samples

Commit: 44d9ebd (Use direct inserts for new archive samples)
Source: user-provided first-import diagnostic report after database reset.

Note: an earlier user-provided report looked significantly faster, but it was a reimport rather than a fresh first snapshot and is not used as a direct comparison against first-import runs.

| Metric | Value |
| --- | --- |
| Wall clock | 17m 13s |
| Summed metric total | 17m 13s |
| Summed fetch | 43.0s |
| Summed processing | 1m 40s |
| Summed insert | 14m 38s |
| Summed finalize | 9.5s |
| Heart Rate count | 922,431 |
| Heart Rate total | 10m 25s |
| Heart Rate fetch | 20.8s |
| Heart Rate processing | 57.0s |
| Heart Rate insert | 8m 59s |
| Active Energy count | 348,669 |
| Active Energy insert | 3m 54s |
| Steps insert | 24.2s |
| Walking + Running Distance insert | 20.7s |

Comparison against the previous comparable first-import run (c138b7b):

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Wall clock | 18m 30s | 17m 13s | -1m 17s / -7% |
| Summed insert | 15m 24s | 14m 38s | -46s / -5% |
| Heart Rate insert | 9m 58s | 8m 59s | -59s / -10% |
| Active Energy insert | 3m 48s | 3m 54s | +6s / +3% |
| Steps insert | 24.2s | 24.2s | flat |
| Walking + Running Distance insert | 20.0s | 20.7s | +0.7s / +4% |

Conclusion: direct inserts for brand-new dependent rows produced a valid but modest first-import gain. The large reimport improvement was not representative of a clean first snapshot. SQLite insert remains the dominant bottleneck.
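
The Change column in comparison tables like this one can be reproduced with a small helper. This is a Python sketch; `parse_duration` and `change` are hypothetical names, and the accepted formats are only the ones that appear in this log.

```python
import re


def parse_duration(text):
    """Parse report durations such as '18m 30s', '17m13s', or '24.2s' into seconds."""
    match = re.fullmatch(r"\s*(?:(\d+)m)?\s*(?:(\d+(?:\.\d+)?)s)?\s*", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0)
    return minutes * 60 + seconds


def change(previous, current):
    """Return (delta_seconds, rounded_percent_change) between two report durations."""
    before, after = parse_duration(previous), parse_duration(current)
    delta = after - before
    return delta, round(100 * delta / before)


print(change("18m 30s", "17m 13s"))  # (-77.0, -7)
print(change("9m 58s", "8m 59s"))    # (-59.0, -10)
```

Percentages are relative to the previous run, so -77s on an 18m 30s baseline is about -7%.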

2026-06-02 Non-Chain-Start Full Scan After Index Removal

Commit context: after ff59257 (Drop unused sample import indexes)
Source: user-provided diagnostic report with previousSnapshotID present and isChainStart: false.

This is not a comparable first-import benchmark for the unused-index removal, but it is important because it shows that non-initial captures can be slower than first imports when the app performs a full-history scan.

| Metric | Value |
| --- | --- |
| Wall clock | 22m 33s |
| Summed metric total | 22m 14s |
| Summed fetch | 52.0s |
| Summed processing | 2m 23s |
| Summed insert | 18m 44s |
| Summed finalize | 11.5s |
| Heart Rate count | 922,440 |
| Heart Rate total | 13m 30s |
| Heart Rate fetch | 24.3s |
| Heart Rate processing | 1m 29s |
| Heart Rate insert | 11m 25s |
| Active Energy count | 348,698 |
| Active Energy insert | 4m 44s |
| Steps insert | 40.4s |
| Walking + Running Distance insert | 36.0s |

Conclusion: this run should not be used to judge first-import index removal. However, it indicates a separate bottleneck: subsequent full scans still spend most of their time in SQLite writes, likely because unchanged samples are still touching the archive write path. The next implementation target should reduce per-sample work for unchanged existing samples during verification/full-scan captures.
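
One way to keep unchanged samples off the write path is to compare a stored fingerprint before writing. The Python sketch below assumes a simplified `sample_state_demo` table with one fingerprint per sample; the table, columns, and `apply_observation` are hypothetical, and the real archive schema (versions, visibility ranges, observation events) is richer than this. Loading every known fingerprint into memory is also a simplification; a large archive would more plausibly look them up per chunk.

```python
import sqlite3


def apply_observation(conn, samples):
    """Write only new or changed samples; return counts per outcome."""
    known = dict(conn.execute("SELECT uuid_hash, fingerprint FROM sample_state_demo"))
    counts = {"new": 0, "changed": 0, "unchanged": 0}
    to_write = []
    for uuid_hash, fingerprint in samples:
        previous = known.get(uuid_hash)
        if previous == fingerprint:
            counts["unchanged"] += 1  # no SQLite write for this sample
            continue
        counts["new" if previous is None else "changed"] += 1
        to_write.append((uuid_hash, fingerprint))
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO sample_state_demo (uuid_hash, fingerprint) "
            "VALUES (?, ?)",
            to_write,
        )
    return counts


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_state_demo (uuid_hash TEXT PRIMARY KEY, fingerprint TEXT)"
)
first = apply_observation(conn, [("a", "f1"), ("b", "f2")])
second = apply_observation(conn, [("a", "f1"), ("b", "f2-edited"), ("c", "f3")])
print(first)   # {'new': 2, 'changed': 0, 'unchanged': 0}
print(second)  # {'new': 1, 'changed': 1, 'unchanged': 1}
```

In a full scan where almost every sample is unchanged, the write batch in this sketch is nearly empty, which is the behavior the next implementation target aims for.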

2026-06-02 First Import After Index Removal And Reset Fortification

Commit context: after 3dd5f48 (Fortify scheduled test database reset), with the unused index removal from ff59257 included.
Source: user-provided diagnostic report with previousSnapshotID: none and isChainStart: true.

This is a comparable first-import benchmark. The a281c51 verified-event change is included in the build, but it should not materially affect this run because a clean first import creates brand-new samples rather than unchanged existing samples.

| Metric | Value |
| --- | --- |
| Wall clock | 12m 43s |
| Summed metric total | 12m 42s |
| Summed fetch | 40.4s |
| Summed processing | 1m 37s |
| Summed insert | 10m 11s |
| Summed finalize | 10.8s |
| Total records | 1,579,168 |
| Heart Rate count | 922,450 |
| Heart Rate total | 8m 06s |
| Heart Rate fetch | 18.6s |
| Heart Rate processing | 56.1s |
| Heart Rate insert | 6m 41s |
| Active Energy count | 348,701 |
| Active Energy insert | 2m 09s |
| Steps insert | 21.6s |
| Walking + Running Distance insert | 19.2s |

Comparison against the previous comparable first-import run (44d9ebd):

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Wall clock | 17m 13s | 12m 43s | -4m 30s / -26% |
| Summed insert | 14m 38s | 10m 11s | -4m 27s / -30% |
| Heart Rate insert | 8m 59s | 6m 41s | -2m 18s / -26% |
| Active Energy insert | 3m 54s | 2m 09s | -1m 45s / -45% |
| Steps insert | 24.2s | 21.6s | -2.6s / -11% |
| Walking + Running Distance insert | 20.7s | 19.2s | -1.5s / -7% |

Conclusion: first-import reset is now clean and the unused-index removal produced a large measurable gain. SQLite insert remains the dominant phase, but summed insert time dropped from 14m38s to 10m11s.
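
For reference, this is how indexes on a table can be listed and an unused one dropped, shown as a Python sketch against an in-memory database. The `samples_demo` table and the `idx_demo_*` names are hypothetical stand-ins, not the app's real index names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE samples_demo (
    id INTEGER PRIMARY KEY,
    sample_type_id INTEGER NOT NULL,
    sample_uuid_hash TEXT NOT NULL,
    fingerprint TEXT NOT NULL
);
CREATE UNIQUE INDEX idx_demo_type_uuid
    ON samples_demo (sample_type_id, sample_uuid_hash);
CREATE INDEX idx_demo_fingerprint ON samples_demo (fingerprint);
""")


def index_names(conn, table):
    """Names of the indexes on `table`, as reported by PRAGMA index_list."""
    return sorted(row[1] for row in conn.execute(f"PRAGMA index_list({table})"))


before = index_names(conn, "samples_demo")
conn.execute("DROP INDEX IF EXISTS idx_demo_fingerprint")
after = index_names(conn, "samples_demo")
print(before)  # ['idx_demo_fingerprint', 'idx_demo_type_uuid']
print(after)   # ['idx_demo_type_uuid']

# Check that a lookup by (type, uuid hash) is still served by the remaining index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM samples_demo "
    "WHERE sample_type_id = ? AND sample_uuid_hash = ?",
    (1, "abc"),
).fetchall()
uses_remaining_index = any("idx_demo_type_uuid" in row[-1] for row in plan)
```

Every secondary index adds a B-tree update to each inserted row, so an index that no query needs is pure write cost. Checking the query plan before and after a drop is a cheap guard against removing one that a lookup still depends on.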

2026-06-02 First Import After Additional Index Removal

Commit context: after 06ee6be (Drop additional import write indexes).
Source: user-provided diagnostic report with previousSnapshotID: none, isChainStart: true, and the same total record count as the previous clean run.

This is a comparable first-import benchmark for removing idx_sample_versions_time and idx_visibility_sample_open.

| Metric | Value |
| --- | --- |
| Wall clock | 12m 39s |
| Summed metric total | 12m 39s |
| Summed fetch | 43.8s |
| Summed processing | 1m 35s |
| Summed insert | 10m 07s |
| Summed finalize | 10.1s |
| Total records | 1,579,168 |
| Heart Rate count | 922,450 |
| Heart Rate total | 8m 08s |
| Heart Rate fetch | 19.2s |
| Heart Rate processing | 55.0s |
| Heart Rate insert | 6m 44s |
| Active Energy count | 348,701 |
| Active Energy insert | 2m 07s |
| Steps insert | 20.7s |
| Walking + Running Distance insert | 18.2s |

Comparison against the previous comparable first-import run (3dd5f48 context):

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Wall clock | 12m 43s | 12m 39s | -4s / -1% |
| Summed insert | 10m 11s | 10m 07s | -4s / -1% |
| Heart Rate insert | 6m 41s | 6m 44s | +3s / +1% |
| Active Energy insert | 2m 09s | 2m 07s | -2s / -2% |
| Steps insert | 21.6s | 20.7s | -0.9s / -4% |
| Walking + Running Distance insert | 19.2s | 18.2s | -1.0s / -5% |

Conclusion: removing these two extra indexes did not materially change first import performance. The small differences are within expected run-to-run noise. The larger first-import gain remains attributable to the earlier hot samples index removal plus clean reset conditions.

2026-06-02 Incremental No-Delta Snapshot With Post-Import UI Freeze

Commit context: after 06ee6be.
Source: user-provided diagnostic report with previousSnapshotID present, isChainStart: false, and the same total record count as the preceding first snapshot.

| Metric | Value |
| --- | --- |
| Wall clock | 9.2s |
| Summed metric total | 8.9s |
| Summed fetch | 0.2s |
| Summed processing | 0.0s |
| Summed insert | 0.0s |
| Summed finalize | 8.5s |
| Total records | 1,579,168 |
| Heart Rate finalize | 4.8s |
| Active Energy finalize | 1.8s |

Conclusion: repeated no-delta capture is now fast and does not write unchanged records. The user still observed roughly one minute of app unresponsiveness after finalization, so the remaining issue is outside the measured import phases. The most likely culprit is synchronous post-import Core Data cache rebuild / dashboard refresh work. The dashboard cache refresh was moved to a background task after this report; next reports should distinguish import time from post-import UI recovery.

2026-06-03 Incremental No-Delta Snapshot After Visibility Fast-Path

Commit context: after f60f09a.
Source: user-provided diagnostic report with previousSnapshotID present, isChainStart: false, and total record count unchanged at 1,579,168.

| Metric | Value |
| --- | --- |
| Wall clock | 13.6s |
| Summed metric total | 13.2s |
| Summed fetch | 0.2s |
| Summed processing | 0.0s |
| Summed insert | 0.0s |
| Summed finalize | 12.9s |
| Total records | 1,579,168 |
| Heart Rate finalize | 9.1s |
| Active Energy finalize | 1.8s |
| Steps finalize | 0.5s |
| Walking + Running Distance finalize | 0.4s |

Conclusion: the repeated no-delta write path is effectively eliminated. The remaining measured import cost is finalization, especially Heart Rate daily aggregate/materialized summary work. The next optimization should avoid rescanning all visible records when a metric has no appeared, disappeared, or representation-changed events in the current observation.
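
The proposed shortcut can be sketched as follows. This is a Python illustration under simplifying assumptions: the aggregate is a plain per-day record count, `events` is a list of the appeared / disappeared / representation-changed events for one metric, and all function names are hypothetical.

```python
from collections import defaultdict


def rebuild_daily_counts(visible_records):
    """Full rescan: count visible records per day (the expensive path)."""
    counts = defaultdict(int)
    for day in visible_records:
        counts[day] += 1
    return dict(counts)


def finalize_metric(previous_aggregate, events, load_visible_records):
    """Reuse the previous aggregate when the observation recorded no events."""
    if not events and previous_aggregate is not None:
        return dict(previous_aggregate), "copied"
    return rebuild_daily_counts(load_visible_records()), "rebuilt"


def loader():
    """Stand-in for the visible-record scan; counts how often it is called."""
    loader.calls += 1
    return ["2026-06-01", "2026-06-01", "2026-06-02", "2026-06-03"]


loader.calls = 0
previous = {"2026-06-01": 2, "2026-06-02": 1}

no_delta = finalize_metric(previous, [], loader)
calls_after_no_delta = loader.calls
with_delta = finalize_metric(previous, ["appeared"], loader)

print(no_delta[1], calls_after_no_delta)  # copied 0
print(with_delta[1], loader.calls)        # rebuilt 1
```

The scan callback is never invoked on the no-delta path, so finalize cost for an unchanged metric no longer depends on how many records that metric holds.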

2026-06-03 Incremental Snapshot After Aggregate Copy Path

Commit context: after 19ba656.
Source: user-provided diagnostic report with previousSnapshotID present and isChainStart: false.

This was not a pure no-delta run: total record count increased from 1,579,168 to 1,579,239 (+71 records). That means high-volume changed metrics still needed normal processing/finalization paths, so the aggregate-copy optimization was not fully exercised for Heart Rate or Active Energy.

| Metric | Value |
| --- | --- |
| Wall clock | 51.3s |
| Summed metric total | 34.3s |
| Summed fetch | 0.2s |
| Summed processing | 21.1s |
| Summed insert | 0.1s |
| Summed finalize | 12.1s |
| Total records | 1,579,239 |
| Heart Rate processing | 13.9s |
| Heart Rate finalize | 8.7s |
| Active Energy processing | 5.0s |
| Active Energy finalize | 1.9s |
| Steps processing | 1.1s |
| Walking + Running Distance processing | 0.9s |

Comparison against the previous repeated snapshot report:

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Total records | 1,579,168 | 1,579,239 | +71 |
| Wall clock | 13.6s | 51.3s | +37.7s |
| Summed processing | 0.0s | 21.1s | +21.1s |
| Summed insert | 0.0s | 0.1s | +0.1s |
| Summed finalize | 12.9s | 12.1s | -0.8s |
| Heart Rate finalize | 9.1s | 8.7s | -0.4s |

Conclusion: writes remain bounded (SummedInsertElapsed 0.1s), but this report shows a new bottleneck for changed high-volume metrics: processing existing records when a small number of records changed. A pure no-delta report is still needed to validate the aggregate-copy path. The unexplained 17.0s gap between wall clock (51.3s) and summed metric total (34.3s) also needs its own timing if it persists.

2026-06-03 Incremental Snapshot With Large Legacy Detail Cache Crash

Commit context: after 19ba656.
Source: user-provided diagnostic report plus console crash log.

This was still not a pure no-delta run: total record count increased from 1,579,239 to 1,579,243 (+4 records). The report shows that archive inserts remained effectively eliminated, while aggregate finalization improved versus the previous changed incremental run.

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Total records | 1,579,239 | 1,579,243 | +4 |
| Wall clock | 51.3s | 44.7s | -6.6s |
| Summed metric total | 34.3s | 26.7s | -7.6s |
| Summed fetch | 0.2s | 0.2s | flat |
| Summed processing | 21.1s | 18.4s | -2.7s |
| Summed insert | 0.1s | 0.0s | -0.1s |
| Summed finalize | 12.1s | 7.4s | -4.7s |
| Heart Rate processing | 13.9s | 13.5s | -0.4s |
| Heart Rate finalize | 8.7s | 4.8s | -3.9s |
| Active Energy processing | 5.0s | 4.9s | -0.1s |
| Active Energy finalize | 1.9s | 1.8s | -0.1s |

The follow-up console log identifies the remaining post-import issue: healthKit.precomputeDetailCaches built legacy SwiftData detail caches for Heart Rate and Active Energy, scanning archives of about 137 MB / 922k records and 52 MB / 349k records. Immediately after that work completed, Core Data aborted on the main thread with a mutated-while-enumerated exception during change processing.

Conclusion: the SQLite archive path is not the crash source. The expensive and crash-prone work is the legacy TypeCount.detailCacheData precompute after the snapshot save. Large type detail caches should be skipped and served from the SQLite/Core Data archive/cache path instead.

2026-06-03 Incremental Snapshot After Large Detail Cache Skip

Commit context: after e49a79d.
Source: user-provided diagnostic report. The user still observed app unresponsiveness after the operation completed.

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Total records | 1,579,243 | 1,579,253 | +10 |
| Wall clock | 44.7s | 31.3s | -13.4s |
| Summed metric total | 26.7s | 30.6s | +3.9s |
| Summed fetch | 0.2s | 0.2s | flat |
| Summed processing | 18.4s | 18.1s | -0.3s |
| Summed insert | 0.0s | 0.0s | flat |
| Summed finalize | 7.4s | 11.5s | +4.1s |
| Heart Rate processing | 13.5s | 13.1s | -0.4s |
| Heart Rate finalize | 4.8s | 8.8s | +4.0s |
| Active Energy processing | 4.9s | 4.7s | -0.2s |
| Active Energy finalize | 1.8s | 1.9s | +0.1s |

Conclusion: the large legacy detail-cache skip removed the earlier crash signal and reduced wall clock, but the app can still become unresponsive after the snapshot reports success. The remaining suspect is the Dashboard Core Data cache refresh because the view model still awaited cache rebuild before releasing the post-snapshot UI path. That refresh should run fire-and-forget and publish its result back to the main actor only when complete.

2026-06-03 Final Freeze Log: Small Detail Cache Still Crashes

Source: user-provided overnight freeze/crash log after the large-cache skip.

The log shows healthKit.precomputeDetailCaches still building two small legacy SwiftData detail caches:

| Type | Current archive | Current count | Result |
| --- | --- | --- | --- |
| Stand Hours | 1.1 MB | 7,727 | built, 2 added |
| Environmental Sound Levels | 2 MB | 13,384 | built, 1 added |

Heart Rate and Active Energy were correctly skipped by the large-cache guard, but Core Data still aborted immediately after healthKit.precomputeDetailCaches.end with the same mutated-while-enumerated exception during change processing.

Conclusion: the issue is not only large archive size. Mutating legacy TypeCount.detailCacheData during snapshot save is unsafe in this SwiftData / Core Data context. Snapshot save must not precompute or clear this legacy cache. The Snapshots list should also read observation timeline rows directly from SQLite so it does not depend on a delayed Core Data cache rebuild to show the freshly finished observation.

2026-06-03 Successful Snapshot Still Freezes After Copying Report

Commit context: after 1229f19.
Source: user-provided diagnostic report copied from the app before a new freeze. Unlike the previous overnight log, this report contains no healthKit.detailCache.buildBegin evidence and the snapshot itself completed successfully.

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Total records | 1,579,253 | 1,579,445 | +192 |
| Wall clock | 31.3s | 35.5s | +4.2s |
| Summed metric total | 30.6s | 35.0s | +4.4s |
| Summed fetch | 0.2s | 0.2s | flat |
| Summed processing | 18.1s | 21.0s | +2.9s |
| Summed insert | 0.0s | 0.2s | +0.2s |
| Summed finalize | 11.5s | 12.6s | +1.1s |

Conclusion: the legacy detail-cache crash path was removed, but the app can still freeze after the user copies the diagnostic report. The remaining post-snapshot culprit is the automatic Dashboard Core Data cache rebuild, which can consume enough device I/O/CPU to make the app appear frozen even when run from a detached task. Automatic post-snapshot refresh should read the small timeline/status data directly from SQLite; full Core Data cache rebuild should remain explicit/manual until partial invalidation exists.

2026-06-03 Core Data Cache Rebuild Crash Stack

Source: user-provided LLDB backtrace after a fast crash. The stack still pointed to an app binary that called CoreDataArchiveCacheStore.rebuild from DashboardViewModel.startArchiveCacheRefresh, which was removed in 199d2ef. However, the stack also exposed a real cache-store bug: rebuild inserted NSManagedObjects through container.viewContext while running on a Swift utility task.

Crash location:

  • CoreDataArchiveCacheStore.insertDailyAggregateRows
  • NSEntityDescription.insertNewObject
  • NSManagedObjectContext insertObject
  • Core Data __CFBasicHashAddValue / EXC_BAD_ACCESS

Conclusion: even though automatic post-snapshot rebuild has been removed, manual cache rebuild must also be safe. CoreDataArchiveCacheStore.rebuild and deleteCache should use a dedicated background context and perform all Core Data mutations on that context's queue.

2026-06-03 No Crash, But Snapshot Detail Lost Data Types

Commit context: after disabling automatic/legacy Core Data cache work.
Source: three user-provided reports and a screenshot of SnapshotArchiveDetailView.

Observed sequence:

| Time | Total records | Duration | Notes |
| --- | --- | --- | --- |
| 13:36 | 1,579,596 | 36.2s | Successful incremental snapshot, no crash. |
| 13:45 | 1,579,601 | 31.3s | Not a true no-delta run: +5 records versus the prior snapshot. |
| 13:47 | 1,579,601 | 2.5s | True no-delta run; processing and insert were 0.0s, finalize about 2.0s. |

The detail screen showed Metrics: 15, Records: 1579596, and Record Changes: 175, but Data Range and Data Types were empty.

Conclusion: the archive/import path is working, and no-delta is now fast. The UI regression came from remaining reads of CachedTypeSummary and CachedDiffSummary through CoreDataArchiveCacheStore. Since automatic cache rebuild is intentionally disabled to prevent freezes/crashes, snapshot detail, Data Types, drilldown screens, and charts must read materialized SQLite archive summaries directly. Do not re-enable automatic full Core Data rebuild as a fix for missing UI details.

2026-06-03 Data Types Restored, Startup Shows False Empty State

Commit context: after 2a82f67.
Source: user confirmation, screenshots, and two reports.

The user confirmed that Snapshot detail data types reappeared. A follow-up snapshot at 14:10 was a true no-delta run:

| Metric | Value |
| --- | --- |
| Total records | 1,579,611 |
| Wall clock | 3.0s |
| Summed fetch | 0.2s |
| Summed processing | 0.0s |
| Summed insert | 0.0s |
| Summed finalize | 2.2s |

Screenshots also showed a UI startup issue: the app briefly opened Dashboard / Snapshots as if there were no archive observations, then populated the real SQLite rows about three seconds later.

Conclusion: this is a loading-state bug, not an archive loss. Main tabs must distinguish "SQLite rows are still loading" from "SQLite query completed and no rows exist".
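
The distinction can be modeled as an explicit state rather than inferred from an empty list. The sketch below uses Python for illustration only; in the app this would more naturally be a Swift enum with associated values. `Phase`, `ArchiveRowsState`, and the placeholder strings are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Phase(Enum):
    LOADING = auto()  # SQLite query has not completed yet
    LOADED = auto()   # SQLite query completed; rows may be empty


@dataclass(frozen=True)
class ArchiveRowsState:
    phase: Phase
    rows: tuple = ()

    def placeholder(self):
        """Text to show instead of the list, or None when rows should be rendered."""
        if self.phase is Phase.LOADING:
            return "Loading archive..."
        if not self.rows:
            return "No snapshots yet"
        return None


at_launch = ArchiveRowsState(Phase.LOADING)
truly_empty = ArchiveRowsState(Phase.LOADED)
populated = ArchiveRowsState(Phase.LOADED, rows=("observation-1",))

print(at_launch.placeholder())    # Loading archive...
print(truly_empty.placeholder())  # No snapshots yet
print(populated.placeholder())    # None
```

Both `at_launch` and `truly_empty` hold zero rows, but only the second one is allowed to render the empty state.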

Optimization Iterations

| Date | Commit | Change | Result / Status |
| --- | --- | --- | --- |
| 2026-06-02 | fd08ded | Added explicit fetch / processing / insert / finalize timings to reports. | Made phase comparisons possible without inferring from UI progress. |
| 2026-06-02 | 87f1a85 | Cached repeated SQLite write-path lookups within grouped imports. | Reduced repeated id lookup pressure in hot path. |
| 2026-06-02 | 7294a01 | Fast-pathed visibility writes for new archive samples. | Removed redundant visibility close/existence work for brand-new samples. |
| 2026-06-02 | 585d77f | Tightened archive verification aggregate queries. | Reduced finalization / verification rescans. |
| 2026-06-02 | 2dd279c | Used rowid fast path for new archive sample rows. | Avoided follow-up id lookup queries when SQLite confirmed new inserts. |
| 2026-06-02 | f569b6c | Fixed scheduled test database reset. | Restored ability to compare fresh first-snapshot imports. |
| 2026-06-02 | 986f343 | Increased Heart Rate import write chunks. | Early attempt to reduce paging/write overhead for the largest metric. |
| 2026-06-02 | c1ebd37 | Sped up archive verification finalization. | Reduced finalize pressure; insert remained dominant. |
| 2026-06-02 | bcbf9a5 | Cleaned up import diagnostic timings. | Corrected date-fetch wall-clock measurement and report text. |
| 2026-06-02 | a026566 | Batched initial import archive writes across several fetched pages. | Wall clock improved from about 20m25s to 18m21s on the measured first import. |
| 2026-06-02 | c138b7b | Increased initial import write chunk sizes. | Marginal improvement: summed insert from 15m44s to 15m24s on the next comparable run. |
| 2026-06-02 | 44d9ebd | Used direct inserts for dependent rows when the samples insert creates a new sample. | Confirmed modest first-import gain: wall clock 18m30s -> 17m13s, summed insert 15m24s -> 14m38s, Heart Rate insert 9m58s -> 8m59s. |
| 2026-06-02 | ff59257 | Removed unused samples indexes on global UUID hash and semantic fingerprint. | Confirmed large first-import gain after clean reset: wall clock 17m13s -> 12m43s, summed insert 14m38s -> 10m11s, Heart Rate insert 8m59s -> 6m41s. Deleted-object lookup remains covered by (sample_type_id, sample_uuid_hash). |
| 2026-06-02 | pending | Captured non-chain-start full-scan report after index removal. | Not comparable for first-import performance; reveals a separate full-scan/unchanged-sample write bottleneck. |
| 2026-06-02 | a281c51 | Stopped writing verified observation events for unchanged existing samples. | Awaiting comparable non-chain-start/full-scan report. Expected signal is lower SummedInsertElapsed and especially lower Heart Rate insert time when most rows are unchanged. |
| 2026-06-02 | 3dd5f48 | Fortified scheduled test database reset with a disk marker and extra SQLite sidecar cleanup. | Real-device report confirmed reset produced previousSnapshotID: none, isChainStart: true, and a clean first-snapshot timeline. |
| 2026-06-02 | 06ee6be | Removed unused sample_versions(start_date, end_date) and redundant sample_visibility_ranges(sample_id, last_observation_id) indexes. | Comparable first-import report was flat: wall clock 12m43s -> 12m39s and summed insert 10m11s -> 10m07s. Treat as no material performance change. |
| 2026-06-02 | pending | Moved Dashboard archive cache refresh/rebuild off the UI path after snapshot completion. | Awaiting real-device confirmation that the app no longer stays unresponsive for roughly one minute after a completed snapshot. |
| 2026-06-03 | f60f09a | Fast-path unchanged samples whose current version already has an open visibility range. | Confirmed on repeated no-delta capture: SummedInsertElapsed remained 0.0s; remaining cost is SummedFinalizeElapsed 12.9s, with Heart Rate finalize 9.1s. |
| 2026-06-03 | 19ba656 | Copy previous type summaries and daily aggregates for unchanged metric observations instead of rebuilding from visible ranges. | First follow-up was not no-delta (+71 records), so it is inconclusive for the unchanged path. It did show SummedInsertElapsed 0.1s and a new changed-metric processing bottleneck: Heart Rate processing 13.9s, Active Energy processing 5.0s. |
| 2026-06-03 | e49a79d | Skip legacy SwiftData detail-cache precompute for large type archives. | Triggered by a crash after building Heart Rate and Active Energy detail caches. Expected signal: no post-import NSGenericException, lower post-import wall-clock gap, and healthKit.detailCache.skipLargeLegacyArchive logs for high-volume types. |
| 2026-06-03 | 7d52262 | Start Dashboard archive cache refresh without awaiting it after snapshot completion. | Triggered by continued app unresponsiveness after a successful 31.3s incremental snapshot. Expected signal: progress sheet/result UI remains responsive while cache rows refresh later. |
| 2026-06-03 | 1229f19 | Disable legacy SwiftData detail-cache precompute completely and load Snapshots timeline from SQLite. | Triggered by overnight crash after two small detail caches were built. Expected signal: no healthKit.detailCache.buildBegin logs during snapshot save, no Core Data mutated-while-enumerated abort, and the new SQLite observation appears in Snapshots without waiting for cache rebuild. |
| 2026-06-03 | 199d2ef | Stop automatic Dashboard Core Data cache rebuild after snapshot; refresh latest rows from SQLite only. | Triggered by freeze after copying a successful diagnostic report. Expected signal: copying diagnostics and returning to Dashboard/Snapshots remains responsive; Core Data cache rebuild is no longer started automatically after snapshot completion. |
| 2026-06-03 | 3abf63d | Run Core Data cache rebuild/delete on a dedicated background context. | Triggered by EXC_BAD_ACCESS inside Core Data object insertion during cache rebuild. Expected signal: manual Settings cache rebuild no longer crashes due to NSManagedObjectContext queue misuse. |
| 2026-06-03 | 2a82f67 | Load snapshot/type detail UI from SQLite materialized summaries instead of Core Data cache. | Triggered by successful snapshots whose detail screens showed no data types after automatic cache rebuild was disabled. Expected signal: Snapshot detail, Data Types, per-type drilldown, and evolution chart show current archive details without rebuilding Core Data cache. |
| 2026-06-03 | ec7ee29 | Add explicit loading states for Dashboard, Snapshots, and Data Types archive rows. | Triggered by false "no observations/no snapshots/not enough data" states during the first few seconds after app launch. Expected signal: startup shows loading state until SQLite rows are available, then shows real archive data without flicker. |
| 2026-06-03 | e231eaf | Use the HealthKit registry as SQLite sample type display-name fallback. | Triggered by Snapshot detail showing raw identifiers such as HKCategoryTypeIdentifierAppleStandHour after UI moved from Core Data cache to SQLite summaries. Expected signal: existing and new archive rows show human-readable names such as Stand Hours without requiring reset/reimport. |
| 2026-06-03 | 5fafcdd | Expand the HealthKit type registry for full-dataset discovery while keeping the original 15-type profile as the tested default. | Triggered by the decision that import/storage cannot be considered complete based only on the restricted v1 dataset. Expected signal: Settings/authorization can expose a much broader quantity/category/workout catalog, unsupported types are explicit, and real-device coverage reports can measure full authorized backup volume. |
| 2026-06-03 | committed | Add explicit capture profile controls for full-dataset discovery. | A real-device report after registry expansion still showed Types: 15/15 processed and the old monitored type-set hash because selectedTypeIDs persisted the v1 core profile in UserDefaults. Settings now exposes Select All Available Types, Select Core Profile, and selected/available counts so the next real-device run can deliberately switch from the v1 sample set to the expanded supported registry. |
| 2026-06-03 | pending | Migrate legacy core-profile selections to full available capture by default. | A follow-up real-device report still showed the old 4907... monitored type-set hash and Types: 15/15 processed, proving the running app still used the old persisted selected type set. New installs and pre-profile settings that exactly match the old core profile now migrate to All available; only an explicit Select Core Profile action persists the core subset. Settings also shows the active profile label (All available, Core, or Custom) for quick verification before capture. |

Current Diagnosis

The import is no longer primarily a HealthKit fetch problem. On the comparable first-import run after the unused-index removal (3dd5f48 context; the later 06ee6be run was flat within noise):

  • total wall clock was 12m43s after the unused-index removal and clean reset;
  • summed fetch was only 40.4s;
  • summed insert was 10m11s;
  • Heart Rate alone spent 6m41s inserting.

The likely bottleneck is per-row SQLite work:

  • uniqueness checks on hot tables;
  • index maintenance while importing high-volume rows;
  • multiple dependent writes per sample;
  • commit / transaction shape;
  • no-op visibility range maintenance for unchanged existing samples;
  • processing existing high-volume metrics when only a small number of records changed;
  • legacy SwiftData detail-cache precompute after SQLite completes. This became the active crash/performance issue even for small caches; any snapshot-save mutation of TypeCount.detailCacheData is unsafe and should remain disabled.
  • Dashboard Core Data archive cache refresh after snapshot completion, when the app starts a full rebuild immediately after the import. Even detached rebuilds can overwhelm real-device I/O/CPU, so automatic post-snapshot UI refresh should use SQLite summary rows only.
  • Core Data cache rebuild must not mutate viewContext from background Swift tasks. Rebuild/delete should use a private background context.
  • Snapshot and Data Types UI must not rely on Core Data cache rows being present. SQLite observation rows and type summaries are already materialized during archive finalization and should be the primary UI source for fresh snapshots.
  • UI state should distinguish loading from empty results. A nil or empty in-memory row list during app launch is not evidence that the archive is empty.
  • The validated import metrics are based on the original 15-type profile. The next correctness/performance question is full-dataset coverage and volume, not further confidence from the restricted sample alone.

Open Issues / Observations

  • Very small pages reduced freeze risk but introduced visible overhead.
  • Some progress timing displayed in the UI did not include overhead, so elapsed time and rates looked better than the real operation.
  • A previous Heart Rate import appeared to stall for long periods at roughly 900k records, but progress later resumed; avoid classifying this as a hard timeout without report evidence.
  • After a completed import, the app may remain unresponsive or crash in legacy post-import cache work. A 2026-06-03 console log showed Heart Rate and Active Energy TypeCount.detailCacheData precompute immediately before a Core Data mutated-while-enumerated abort.
  • A later 2026-06-03 overnight log showed the same abort after only Stand Hours and Environmental Sound Levels detail caches were built. Size limits are not enough; the whole snapshot-save detail-cache precompute path is disabled.
  • Partial / old imported observations can pollute comparisons. Fresh first-snapshot performance comparisons should use a confirmed reset database.
  • Non-chain-start full scans can be slower than first imports if unchanged existing samples still write per-sample archive evidence.

Next Experiments

Prioritize experiments in this order:

  1. Run full-dataset discovery with the expanded registry: request/refresh permissions, inspect supported vs unsupported types, and run a capture with all desired supported types enabled on a real device. Record type count, total records, failed/unsupported/empty types, and phase timings.
  2. Run an incremental snapshot after removing automatic Core Data cache rebuild. Confirm there are no healthKit.detailCache.buildBegin logs, copying the diagnostic report does not freeze the app, and Dashboard/Snapshots show the latest observation from SQLite. Also verify Snapshot detail and Data Types show per-type summaries without a manual cache rebuild.
  3. Run a repeated no-delta benchmark after copying unchanged metric summaries and daily aggregates. Compare SummedFinalizeElapsed, Heart Rate finalizeElapsed, Active Energy finalizeElapsed, and wall clock.
  4. Add or inspect timing around per-record processing for changed high-volume metrics, especially Heart Rate, to separate sample DTO/fingerprint work from SQLite idempotency checks.
  5. Run a non-chain-start/full-scan benchmark after skipping unchanged verified events and fast-pathing already-open visibility ranges. Compare SummedInsertElapsed, Heart Rate insertElapsed, Steps insertElapsed, and Walking + Running Distance insertElapsed.
  6. Reduce any remaining per-sample SQLite writes for unchanged existing samples during non-chain-start full scans.
  7. Profile whether index maintenance dominates first-import insert cost.
  8. Consider a guarded bulk-import mode for first observations:
    • keep archive semantics unchanged;
    • only relax work that can be safely reconstructed or validated;
    • re-enable normal idempotent paths for incremental observations.
  9. Run a fresh first-import benchmark after the unused-index removal and compare SummedInsertElapsed, Heart Rate insertElapsed, and Active Energy insertElapsed.
  10. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce samples / sample_versions / sample_observation_events write cost without weakening final archive integrity.
  11. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
  12. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
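
As a starting point for the deferred-index experiment in the list above, the general shape is to load rows into a table without secondary indexes and build the unique index once at the end. The Python sketch below is an assumption-laden illustration: `staging_demo`, its columns, and `bulk_load_then_index` are hypothetical, and it does not model sample_versions or sample_observation_events.

```python
import sqlite3


def bulk_load_then_index(conn, rows):
    """Load rows with no secondary index, then build the unique index once.

    Building the index after the load also validates uniqueness: it raises
    sqlite3.IntegrityError if the staged rows contain duplicates, and the
    surrounding transaction is rolled back.
    """
    with conn:
        conn.executemany("INSERT INTO staging_demo VALUES (?, ?, ?)", rows)
        conn.execute(
            "CREATE UNIQUE INDEX idx_staging_demo_key "
            "ON staging_demo (sample_type_id, uuid_hash)"
        )


def staged_count(conn):
    return conn.execute("SELECT COUNT(*) FROM staging_demo").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging_demo (sample_type_id INTEGER, uuid_hash TEXT, value REAL)"
)

duplicate_rows = [(1, "a", 60.0), (1, "a", 61.0)]
try:
    bulk_load_then_index(conn, duplicate_rows)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
count_after_rejection = staged_count(conn)

clean_rows = [(1, "a", 60.0), (1, "b", 61.0), (2, "a", 5.0)]
bulk_load_then_index(conn, clean_rows)
count_after_load = staged_count(conn)
```

The point of the rollback path is that relaxing per-row uniqueness checks during the load does not have to weaken final integrity: a duplicate still fails the import, just at index-build time instead of at insert time.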

Verification Checklist For Each Optimization

  • [ ] git diff --check passes.
  • [ ] SQLite archive store tests pass.
  • [ ] Import configuration tests pass if capture strategy changed.
  • [ ] Repeated-page/idempotency behavior remains covered.
  • [ ] A real-device report is attached or summarized in this log.
  • [ ] The next experiment is recorded before moving on.
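
For the repeated-page/idempotency item, the property under test is that delivering the same page twice leaves the archive unchanged. A minimal Python sketch of such a check, using a hypothetical `archive_demo` table rather than the app's real store or test suite:

```python
import sqlite3


def write_page(conn, page):
    """Idempotent page write: rows whose key already exists are ignored."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO archive_demo (uuid_hash, value) VALUES (?, ?)",
            page,
        )


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM archive_demo").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive_demo (uuid_hash TEXT PRIMARY KEY, value REAL)")

page = [("a", 61.0), ("b", 64.0), ("c", 58.0)]
write_page(conn, page)
count_after_first = row_count(conn)
write_page(conn, page)  # the same page delivered again
count_after_repeat = row_count(conn)

print(count_after_first, count_after_repeat)  # 3 3
```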