Drop unused sample import indexes · ff59257

+5 -3

HealthProbe/Doc/04-project/Import-Optimization-Log.md

@@ -158,6 +158,7 @@ of a clean first snapshot. SQLite insert remains the dominant bottleneck.
 | 2026-06-02 | `a026566` | Batched initial import archive writes across several fetched pages. | Wall clock improved from about 20m25s to 18m21s on the measured first import. |
 | 2026-06-02 | `c138b7b` | Increased initial import write chunk sizes. | Marginal improvement: summed insert from 15m44s to 15m24s on the next comparable run. |
 | 2026-06-02 | `44d9ebd` | Used direct inserts for dependent rows when `samples` creates a new sample. | Confirmed modest first-import gain: wall clock 18m30s -> 17m13s, summed insert 15m24s -> 14m38s, Heart Rate insert 9m58s -> 8m59s. |
+| 2026-06-02 | `d0914b1` | Removed unused `samples` indexes on global UUID hash and semantic fingerprint. | Awaiting comparable first-import report. Expected signal is lower `SummedInsertElapsed`; deleted-object lookup remains covered by `(sample_type_id, sample_uuid_hash)`. |
 
 ## Current Diagnosis
 
@@ -193,9 +194,10 @@ Prioritize experiments in this order:
    - keep archive semantics unchanged;
    - only relax work that can be safely reconstructed or validated;
    - re-enable normal idempotent paths for incremental observations.
-4. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce `samples` / `sample_versions` / `sample_observation_events` write cost without weakening final archive integrity.
-5. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
-6. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
+4. Run a fresh first-import benchmark after the unused-index removal and compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, and `Active Energy insertElapsed`.
+5. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce `samples` / `sample_versions` / `sample_observation_events` write cost without weakening final archive integrity.
+6. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
+7. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
 
 ## Verification Checklist For Each Optimization
 


+2 -2

HealthProbe/Services/SQLiteHealthArchiveStore.swift

@@ -1659,10 +1659,10 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
             UNIQUE(sample_type_id, strict_fingerprint)
         )
         """, db: db)
-        try execute("CREATE INDEX IF NOT EXISTS idx_samples_uuid_hash ON samples(sample_uuid_hash)", db: db)
         try execute("CREATE INDEX IF NOT EXISTS idx_samples_type_id ON samples(sample_type_id, id)", db: db)
         try execute("CREATE INDEX IF NOT EXISTS idx_samples_type_uuid_hash ON samples(sample_type_id, sample_uuid_hash)", db: db)
-        try execute("CREATE INDEX IF NOT EXISTS idx_samples_type_semantic ON samples(sample_type_id, semantic_fingerprint)", db: db)
+        try execute("DROP INDEX IF EXISTS idx_samples_uuid_hash", db: db)
+        try execute("DROP INDEX IF EXISTS idx_samples_type_semantic", db: db)
         try execute("""
         CREATE TABLE IF NOT EXISTS sample_versions (
             id INTEGER PRIMARY KEY,

