Add build identity to diagnostic reports · c9091de

bogdan committed 3 days ago

main

1 parent d4de48c

commit c9091de

Showing 2 changed files with 54 additions and 12 deletions

+26 -11

HealthProbe/Doc/04-project/Import-Optimization-Log.md

@@ -588,7 +588,8 @@ rows exist".
 | 2026-06-04 | `2ebfab3` | Incrementally update changed type summaries. | Follow-up full-profile delta report completed in `27.5s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=118, delta=9, initialImport=0`, and `DeltaEvents: 46`. Compared with the prior `31.2s` run, `SummedFinalizeElapsed` dropped `15.5s -> 11.7s`; Heart Rate finalize dropped `8.7s -> 4.8s`; Active Energy finalize stayed bounded at `1.7s`. Remaining cost moved back to delta archive processing: `SummedProcessingElapsed` was `11.6s`, with Heart Rate processing `6.1s`, Active Energy `2.3s`, and Basal Energy `1.9s` for small deltas. |
 | 2026-06-04 | `4894b77` | Patch compact archives from delta without full record maps. | Follow-up full-profile delta report completed in `52.1s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=106, delta=21, initialImport=0`, and `DeltaEvents: 11,093`. This is not comparable to the previous `46`-event baseline: Active Energy had `2,377` events, Basal Energy `2,347`, Cycling Distance `6,052`, and Heart Rate `231`. Processing remained bounded relative to delta size (`SummedProcessingElapsed: 16.0s`; Heart Rate `5.9s`, Active Energy `2.2s`, Basal Energy `1.9s`, Cycling Distance `4.2s`), but wall clock rose because fetch `16.1s`, insert `2.3s`, and finalize `14.8s` all had real work. Conclusion: compact dictionary removal did not regress and looks healthy for large deltas, but a small-delta repeat is still needed to validate the original `6.1s` Heart Rate target. |
 | 2026-06-04 | `1ba6c38` | Hash delta compact archives in recorded order. | Small-delta follow-up completed in `24.0s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=124, delta=3, initialImport=0`, and `DeltaEvents: 13`. This finally validated the remaining bottleneck: Heart Rate still spent `5.7s` processing only `6` events, Active Energy spent `2.1s` for `5` events, and Basal Energy spent `1.8s` for `2` events. The delta rebuild no longer builds a full UUID record map, but it still collected every fingerprint into a large array and sorted it for the per-type hash. Delta rebuild now uses the same recorded-order `TypeHashBuilder` strategy as initial import, avoiding the all-record fingerprint array and sort. |
-| 2026-06-04 | pending | Copy unchanged daily aggregates inside SQLite. | Follow-up small-delta run after `1ba6c38` completed in `23.6s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=120, delta=7, initialImport=0`, and `DeltaEvents: 11`. The hash change helped processing: `SummedProcessingElapsed` dropped `9.6s -> 8.1s`; Heart Rate processing dropped `5.7s -> 4.2s`, Active Energy `2.1s -> 1.7s`, and Basal Energy `1.8s -> 1.5s`. The bottleneck shifted to finalization: `SummedFinalizeElapsed` rose `10.2s -> 11.4s`, with Heart Rate still at `4.8s`. Changed daily aggregates were copying all previous daily rows through Swift before replacing affected buckets. Copying those unchanged rows now happens with `INSERT ... SELECT` in SQLite, while affected buckets remain recalculated. Expected signal: lower Heart Rate finalize than `4.8s` on the next small-delta run. |
+| 2026-06-04 | `d4de48c` | Copy unchanged daily aggregates inside SQLite. | First small-delta run before this commit completed in `23.6s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=120, delta=7`, and `DeltaEvents: 11`. The hash change helped processing (`9.6s -> 8.1s`; Heart Rate `5.7s -> 4.2s`), but finalization stayed high (`11.4s`, Heart Rate `4.8s`). Changed daily aggregates were copying all previous daily rows through Swift before replacing affected buckets. This commit moved unchanged daily aggregate copying to SQLite `INSERT ... SELECT`, while affected buckets remain recalculated. Follow-up report completed in `21.1s` with `127/127` complete, `0` degraded, `CaptureModes: unchangedDelta=123, delta=4`, and `DeltaEvents: 50`. `SummedFinalizeElapsed` improved `11.4s -> 9.3s` and wall clock improved `23.6s -> 21.1s`; however Heart Rate finalize was still `5.0s` with `4` events, so this helped overall finalize cost but did not remove the high-volume changed-type floor. |
+| 2026-06-04 | pending | Include build identity in diagnostic reports. | The latest diagnostic report only included `App Version: 1.0(1)` near the end and did not include a commit/build source identifier. This makes it too easy to compare reports from the wrong installed binary. Diagnostics now emit `appVersion`, `buildFingerprint`, `sourceCommit`, and `sourceDirty` in `OPERATION METADATA`. `buildFingerprint` is derived from the installed executable and should change when a different binary is installed; `sourceCommit/sourceDirty` remain available for builds that inject those Info.plist keys. Expected signal: future pasted reports have enough build identity to detect wrong-version reports immediately. |
 
 ## Current Diagnosis
 
@@ -681,6 +682,12 @@ The likely bottleneck is per-row SQLite work:
   time copying unchanged materialized daily rows before replacing affected
   buckets. Those copied rows now use SQLite `INSERT ... SELECT`; changed buckets
   are still rebuilt normally.
+- Reports before `reportSchemaVersion: 3` do not identify the build beyond
+  `App Version: 1.0(1)`, which is not enough for performance attribution during
+  rapid test-install iteration. Treat older report-to-commit mapping as inferred
+  from conversation/order unless backed by an external build note. New reports
+  should include `buildFingerprint`; `sourceCommit` and `sourceDirty` may still
+  be `unknown` unless the build pipeline injects those Info.plist keys.
 
 ## Open Issues / Observations
 
@@ -730,22 +737,30 @@ Prioritize experiments in this order:
    HealthKit fetch, SQLite insert, or legacy compact archive reconstruction.
 6. Run a full-profile repeated capture after SQL-side daily aggregate copying.
    Compare `SummedFinalizeElapsed` and Heart Rate finalize against the current
-   `11.4s` total finalize / `4.8s` Heart Rate finalize baseline.
-7. Investigate full-profile empty anchored-query cost for zero-count types.
+   `11.4s` total finalize / `4.8s` Heart Rate finalize baseline. The first
+   follow-up improved total finalize to `9.3s`, but Heart Rate stayed around
+   `5.0s`; keep this as evidence that there is still a high-volume
+   changed-type floor.
+7. Verify the next diagnostic report contains `reportSchemaVersion: 3` and
+   `buildFingerprint`. Do not compare performance reports without a build
+   identity unless the build provenance is otherwise certain. `sourceCommit`
+   and `sourceDirty` are useful when present, but may be `unknown` for normal
+   Xcode test installs.
+8. Investigate full-profile empty anchored-query cost for zero-count types.
    Compare slow empty types across reports before changing behavior; any skip or
    lower-frequency strategy must preserve the promise that full authorized
    backup can notice newly appearing data.
-8. Run a non-chain-start/full-scan benchmark after skipping unchanged `verified` events and fast-pathing already-open visibility ranges. Compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, `Steps insertElapsed`, and `Walking + Running Distance insertElapsed`.
-9. Reduce any remaining per-sample SQLite writes for unchanged existing samples during non-chain-start full scans.
-10. Profile whether index maintenance dominates first-import insert cost.
-11. Consider a guarded bulk-import mode for first observations:
+9. Run a non-chain-start/full-scan benchmark after skipping unchanged `verified` events and fast-pathing already-open visibility ranges. Compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, `Steps insertElapsed`, and `Walking + Running Distance insertElapsed`.
+10. Reduce any remaining per-sample SQLite writes for unchanged existing samples during non-chain-start full scans.
+11. Profile whether index maintenance dominates first-import insert cost.
+12. Consider a guarded bulk-import mode for first observations:
    - keep archive semantics unchanged;
    - only relax work that can be safely reconstructed or validated;
    - re-enable normal idempotent paths for incremental observations.
-12. Run a fresh first-import benchmark after the unused-index removal and compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, and `Active Energy insertElapsed`.
-13. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce `samples` / `sample_versions` / `sample_observation_events` write cost without weakening final archive integrity.
-14. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
-15. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
+13. Run a fresh first-import benchmark after the unused-index removal and compare `SummedInsertElapsed`, `Heart Rate insertElapsed`, and `Active Energy insertElapsed`.
+14. Investigate whether first-import-only deferred index creation or temporary staging tables can reduce `samples` / `sample_versions` / `sample_observation_events` write cost without weakening final archive integrity.
+15. Revisit adaptive page sizes only after SQLite write-path costs are reduced.
+16. Revisit background / scheduled collection once initial import can finish reliably and post-import UI recovery is bounded.
 
 ## Verification Checklist For Each Optimization
 


+28 -1

HealthProbe/Views/Dashboard/DashboardView.swift

@@ -330,8 +330,12 @@ struct DashboardView: View {
         lines.append("operationResult: \(operationResultValue())")
         lines.append("primaryObjectType: snapshot")
         lines.append("primaryObjectID: \(snapshotID)")
-        lines.append("reportSchemaVersion: 2")
+        lines.append("reportSchemaVersion: 3")
         lines.append("reportGeneratedAt: \(reportGeneratedAt)")
+        lines.append("appVersion: \(Bundle.main.appVersion)")
+        lines.append("buildFingerprint: \(Bundle.main.buildFingerprint)")
+        lines.append("sourceCommit: \(Bundle.main.sourceCommit)")
+        lines.append("sourceDirty: \(Bundle.main.sourceDirty)")
         lines.append("")
         lines.append("OPERATION SUMMARY")
         lines.append("Snapshot:   \(snapshotID)")
@@ -504,6 +508,9 @@ struct DashboardView: View {
         lines.append("DEVICE/APP CONTEXT")
         lines.append("OS: \(UIDevice.current.systemVersion)")
         lines.append("App Version: \(Bundle.main.appVersion)")
+        lines.append("Build Fingerprint: \(Bundle.main.buildFingerprint)")
+        lines.append("Source Commit: \(Bundle.main.sourceCommit)")
+        lines.append("Source Dirty: \(Bundle.main.sourceDirty)")
         lines.append("")
         lines.append("END HEALTHPROBE REPORT")
 
@@ -1458,6 +1465,26 @@ extension Bundle {
         }
         return "\(version)(\(build))"
     }
+
+    var sourceCommit: String {
+        let value = infoDictionary?["HPSourceCommit"] as? String
+        return value?.isEmpty == false ? value! : "unknown"
+    }
+
+    var sourceDirty: String {
+        let value = infoDictionary?["HPSourceDirty"] as? String
+        return value?.isEmpty == false ? value! : "unknown"
+    }
+
+    var buildFingerprint: String {
+        guard let url = executableURL,
+              let values = try? url.resourceValues(forKeys: [.contentModificationDateKey, .fileSizeKey]) else {
+            return "unknown"
+        }
+        let modified = values.contentModificationDate?.timeIntervalSince1970 ?? 0
+        let size = values.fileSize ?? 0
+        return "\(appVersion)-\(Int(modified))-\(size)"
+    }
 }
 
 #Preview {
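
The planning notes in this commit say not to compare performance reports that lack a build identity. A minimal sketch of how a pasted report could be checked for that, assuming the metadata lines keep the `key: value` shape appended in `DashboardView`. `BuildIdentity` and `parseBuildIdentity(from:)` are hypothetical names for illustration and are not part of this commit.

```swift
import Foundation

/// Build identity fields as they appear in the report metadata block.
/// Hypothetical type, not part of the HealthProbe sources.
struct BuildIdentity: Equatable {
    let schemaVersion: Int
    let buildFingerprint: String
    let sourceCommit: String
}

/// Parses `key: value` lines from a pasted report.
/// Returns nil when the report predates `reportSchemaVersion: 3`
/// or carries no usable `buildFingerprint`.
func parseBuildIdentity(from report: String) -> BuildIdentity? {
    var fields: [String: String] = [:]
    for line in report.split(whereSeparator: \.isNewline) {
        // Split on the first colon only, so values may contain colons.
        guard let colon = line.firstIndex(of: ":") else { continue }
        let key = line[..<colon].trimmingCharacters(in: .whitespaces)
        let value = line[line.index(after: colon)...]
            .trimmingCharacters(in: .whitespaces)
        // Keep the first occurrence so the metadata block wins
        // over any later section that reuses a key.
        if fields[key] == nil { fields[key] = value }
    }
    guard let version = fields["reportSchemaVersion"].flatMap({ Int($0) }),
          version >= 3,
          let fingerprint = fields["buildFingerprint"],
          fingerprint != "unknown" else {
        return nil
    }
    return BuildIdentity(
        schemaVersion: version,
        buildFingerprint: fingerprint,
        sourceCommit: fields["sourceCommit"] ?? "unknown"
    )
}
```

A caller could refuse to attribute timings when either report returns `nil`, and flag a pair whose `buildFingerprint` values are identical as coming from the same installed binary. `sourceCommit` is carried along but not required, since it may be `unknown` for normal Xcode test installs.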

