@@ -1,271 +1,14 @@ |
||
| 1 |
-# HealthProbe – Multi-Model Development Guide |
|
| 1 |
+# HealthProbe Agent Bootstrap |
|
| 2 | 2 |
|
| 3 |
-## Overview |
|
| 3 |
+Canonical agent instructions live in: |
|
| 4 | 4 |
|
| 5 |
-HealthProbe is built by multiple AI models, each owning a distinct domain. |
|
| 6 |
-This document defines boundaries, interfaces, and handoff contracts. |
|
| 5 |
+- `HealthProbe/Doc/00-agent-guides/AGENTS.md` |
|
| 7 | 6 |
|
| 8 |
-**Agentic reality:** The repo is developed largely via agents (Codex CLI, Claude, and dedicated model sessions). When updating product scope, update docs first, then implement behind flags, and add tests for the new behavior. |
|
| 7 |
+Before working, read: |
|
| 9 | 8 |
|
| 9 |
+1. `HealthProbe/Doc/README.md` |
|
| 10 |
+2. the specific chapter linked there for your task |
|
| 11 |
+3. `HealthProbe/Doc/00-agent-guides/AGENTS.md` |
|
| 10 | 12 |
|
| 11 |
-## Model Allocation |
|
| 12 |
- |
|
| 13 |
-| Domain | Owner | Tools | |
|
| 14 |
-|--------|-------|-------| |
|
| 15 |
-| **UI / SwiftUI Views** | Claude Code | Xcode, SwiftUI, CLAUDE.md | |
|
| 16 |
-| **Archive Store** | Dedicated model session | SQLite/local archive format, HealthKit metadata mapping | |
|
| 17 |
-| **Data Models (SwiftData)** | Dedicated model session | Xcode, Swift; derived UI/cache/settings/log models only | |
|
| 18 |
-| **HealthKit Integration** | Dedicated model session | Xcode, HealthKit docs | |
|
| 19 |
-| **Anomaly Detection Algorithms** | Dedicated model session | Swift, statistical references | |
|
| 20 |
-| **Context Monitoring** | Dedicated model session | Xcode; logs Health/iCloud state as context only | |
|
| 21 |
-| **Documentation** | Claude Code + dedicated session | Markdown | |
|
| 22 |
-| **Tests** | Dedicated model session | XCTest, Swift Testing | |
|
| 23 |
- |
|
| 24 |
- |
|
| 25 |
-## Directory Ownership |
|
| 26 |
- |
|
| 27 |
-``` |
|
| 28 |
-HealthProbe/ |
|
| 29 |
-├── Views/ ← Claude Code (UI) |
|
| 30 |
-├── ViewModels/ ← Claude Code (UI) |
|
| 31 |
-├── Utilities/ ← Claude Code (shared helpers, mocks) |
|
| 32 |
-├── Models/ ← Models agent (SwiftData UI/cache schemas) |
|
| 33 |
-├── Services/ ← Services agent (HealthKit, archive store, anomaly, context) |
|
| 34 |
-└── Tests/ ← Tests agent |
|
| 35 |
-``` |
|
| 36 |
- |
|
| 37 |
-**Rule:** Each agent writes only within its owned directories. |
|
| 38 |
-Cross-boundary changes require an explicit interface contract (protocol) first. |
|
| 39 |
- |
|
| 40 |
-**Documentation scope:** `HealthProbe/Doc/` is shared. Keep it consistent with shipped behavior, and add a dated entry when objectives change. |
|
| 41 |
- |
|
| 42 |
- |
|
| 43 |
-## Interface Contracts |
|
| 44 |
- |
|
| 45 |
-All service boundaries are defined as Swift protocols. |
|
| 46 |
-Claude Code (UI) consumes protocols — never concrete implementations. |
|
| 47 |
- |
|
| 48 |
-### HealthMonitorProtocol |
|
| 49 |
- |
|
| 50 |
-```swift |
|
| 51 |
-/// Owned by: Services agent |
|
| 52 |
-/// Consumed by: UI (DashboardViewModel) |
|
| 53 |
-protocol HealthMonitorProtocol {
|
|
| 54 |
- var currentStatus: HealthStatus { get }
|
|
| 55 |
- var lastChecked: Date? { get }
|
|
| 56 |
- func runCheck() async throws |
|
| 57 |
-} |
|
| 58 |
-``` |
|
| 59 |
- |
|
| 60 |
-### AnomalyStoreProtocol |
|
| 61 |
- |
|
| 62 |
-```swift |
|
| 63 |
-/// Owned by: Services agent |
|
| 64 |
-/// Consumed by: UI (AnomalyListViewModel) |
|
| 65 |
-protocol AnomalyStoreProtocol {
|
|
| 66 |
- var anomalies: [DetectedAnomaly] { get }
|
|
| 67 |
- func markResolved(_ anomaly: DetectedAnomaly) async throws |
|
| 68 |
-} |
|
| 69 |
-``` |
|
| 70 |
- |
|
| 71 |
-### AuditTrailProtocol |
|
| 72 |
- |
|
| 73 |
-```swift |
|
| 74 |
-/// Owned by: Services agent |
|
| 75 |
-/// Consumed by: UI (AuditTrailView) |
|
| 76 |
-protocol AuditTrailProtocol {
|
|
| 77 |
- var entries: [AuditTrailEntry] { get }
|
|
| 78 |
- func export() async throws -> Data // JSON |
|
| 79 |
-} |
|
| 80 |
-``` |
|
| 81 |
- |
|
| 82 |
-### ContextMonitorProtocol |
|
| 83 |
- |
|
| 84 |
-```swift |
|
| 85 |
-/// Owned by: Services agent |
|
| 86 |
-/// Consumed by: UI (ContextViewModel) |
|
| 87 |
-protocol ContextMonitorProtocol {
|
|
| 88 |
- var iCloudEnabled: Bool { get }
|
|
| 89 |
- var lastObservedAt: Date? { get }
|
|
| 90 |
- var stateChanges: [ContextStateChange] { get }
|
|
| 91 |
-} |
|
| 92 |
-``` |
|
| 93 |
- |
|
| 94 |
- |
|
| 95 |
-## Shared Types (Models Agent) |
|
| 96 |
- |
|
| 97 |
-These types are defined once in `Models/` and shared across all agents: |
|
| 98 |
- |
|
| 99 |
-```swift |
|
| 100 |
-// Models/TypeDistributionBin.swift |
|
| 101 |
-@Model |
|
| 102 |
-final class TypeDistributionBin {
|
|
| 103 |
- var bucketStart: Date |
|
| 104 |
- var bucketEnd: Date |
|
| 105 |
- var count: Int |
|
| 106 |
-} |
|
| 107 |
- |
|
| 108 |
-// Models/TypeCount.swift |
|
| 109 |
-// TypeCount owns zero or more TypeDistributionBin records. |
|
| 110 |
-// These bins store sample counts and import anchors, not raw health values. |
|
| 111 |
- |
|
| 112 |
-// Interface updated 2026-05-12 — see AGENTS.md |
|
| 113 |
-// Models/HealthRecord.swift |
|
| 114 |
-// HealthRecord stores one anonymized HealthKit record fingerprint plus its start/end dates. |
|
| 115 |
-// It intentionally does not store raw health values, device identifiers, or source metadata. |
|
| 116 |
-// UI may compare HealthRecord fingerprints between adjacent snapshots to expose losses |
|
| 117 |
-// that are masked by newly-added records with the same total count. |
|
| 118 |
-// High-volume snapshots store these records in TypeCount.recordArchiveData instead of |
|
| 119 |
-// creating one SwiftData model per record, avoiding main-thread stalls after import. |
|
| 120 |
- |
|
| 121 |
-// Interface updated 2026-05-13 — see AGENTS.md |
|
| 122 |
-// TypeDistributionBin also stores content hashes and HealthKit query anchors. |
|
| 123 |
-// Import uses a global anchored query per data type so follow-up snapshots fetch only |
|
| 124 |
-// HealthKit deltas instead of scanning calendar blocks with fixed per-query latency. |
|
| 125 |
- |
|
| 126 |
-// Interface updated 2026-05-18 — see AGENTS.md |
|
| 127 |
-// SwiftData is not the forensic source of truth. TypeCount and related rows store |
|
| 128 |
-// precomputed UI/index data only. Complete HealthKit samples and metadata belong |
|
| 129 |
-// in the local archive store, in one schema that can preserve relationships across |
|
| 130 |
-// data types, sources, devices, workouts, and metadata. |
|
| 131 |
- |
|
| 132 |
-// Interface updated 2026-05-18 — see AGENTS.md |
|
| 133 |
-// Services/Protocols/HealthArchiveStore.swift defines the local archive boundary. |
|
| 134 |
-// SQLiteHealthArchiveStore is the current implementation. HealthKit anchored-query |
|
| 135 |
-// pages must be written to this archive before SwiftData UI/cache rows are saved. |
|
| 136 |
-// Deletions are recorded by sampleUUIDHash because HKDeletedObject exposes UUIDs, |
|
| 137 |
-// not complete sample payloads. |
|
| 138 |
- |
|
| 139 |
-// Interface updated 2026-05-17 — see AGENTS.md |
|
| 140 |
-// Models/TypeCount.detailCacheData stores precomputed detail data for the current |
|
| 141 |
-// TypeCount compared with the immediately previous snapshot on the same device. |
|
| 142 |
-// The cache contains aggregate added/disappeared counts, capped preview records for |
|
| 143 |
-// UI drill-down, and daily change bins for temporal charts. It must be computed when |
|
| 144 |
-// snapshots are saved and refreshed for neighboring snapshots when snapshot deletion |
|
| 145 |
-// changes chain links. Existing stores are backfilled incrementally with a strict |
|
| 146 |
-// per-launch TypeCount cap to avoid decoding many large archives in one run. |
|
| 147 |
- |
|
| 148 |
-// Interface updated 2026-05-17 — see AGENTS.md |
|
| 149 |
-// Models/HealthSnapshot.contentEquivalentSnapshotID marks snapshots whose TypeCount |
|
| 150 |
-// content is identical to a previous snapshot on the same device. These snapshots are |
|
| 151 |
-// retained as temporal labels but behave as aliases to the representative content |
|
| 152 |
-// snapshot for expensive detail cache/diff work. |
|
| 153 |
- |
|
| 154 |
-// Interface updated 2026-05-17 — see AGENTS.md |
|
| 155 |
-// Models/TypeCount.contentEquivalentTypeCountID marks individual data types whose |
|
| 156 |
-// content is identical to the previous snapshot's same TypeCount. This allows a |
|
| 157 |
-// snapshot to contain real changes for some metrics while long-stable metrics behave |
|
| 158 |
-// as temporal aliases and skip per-type detail cache/diff work. |
|
| 159 |
- |
|
| 160 |
-// Interface updated 2026-05-17 — see AGENTS.md |
|
| 161 |
-// Models/HealthSnapshot stores cached overview scalars for UI consumption: |
|
| 162 |
-// tracked type count, aggregate record count, and overall oldest/newest record dates. |
|
| 163 |
-// These values must be computed during snapshot save while TypeCount data is already |
|
| 164 |
-// in memory, so snapshot list/detail screens never recompute them by traversing |
|
| 165 |
-// snapshot.typeCounts on the UI thread. |
|
| 166 |
- |
|
| 167 |
-// Interface updated 2026-05-17 — see AGENTS.md |
|
| 168 |
-// Models/SnapshotDelta stores cached list/detail summary scalars derived from TypeDelta. |
|
| 169 |
-// Overview screens consume these scalars and type-delta summaries directly instead of |
|
| 170 |
-// recalculating per-snapshot changes from HealthSnapshot.typeCounts. |
|
| 171 |
- |
|
| 172 |
-// Models/DetectedAnomaly.swift |
|
| 173 |
-enum AnomalyType: String, Codable {
|
|
| 174 |
- case historicalInsertion = "historical_insertion" |
|
| 175 |
- case silentDeletion = "silent_deletion" |
|
| 176 |
- case duplicate = "duplicate" |
|
| 177 |
- case divergence = "divergence" |
|
| 178 |
-} |
|
| 179 |
- |
|
| 180 |
-enum Severity: String, Codable, Comparable {
|
|
| 181 |
- case info, warning, critical |
|
| 182 |
-} |
|
| 183 |
- |
|
| 184 |
-enum HealthStatus: String {
|
|
| 185 |
- case healthy, warning, critical, unknown |
|
| 186 |
-} |
|
| 187 |
-``` |
|
| 188 |
- |
|
| 189 |
-Any model changes must be announced in this file before other agents consume them. |
|
| 190 |
- |
|
| 191 |
- |
|
| 192 |
-## Handoff Process |
|
| 193 |
- |
|
| 194 |
-When a module is ready to be consumed by another agent: |
|
| 195 |
- |
|
| 196 |
-1. **Define the protocol** in `Services/Protocols/` (services agent) |
|
| 197 |
-2. **Implement a mock** in `Utilities/Mocks.swift` (Claude Code) |
|
| 198 |
-3. **Build UI against the mock** (Claude Code) |
|
| 199 |
-4. **Replace mock with real implementation** (services agent) |
|
| 200 |
-5. **Integration test** (tests agent) |
|
| 201 |
- |
|
| 202 |
-This allows UI development and service development to proceed in parallel. |
|
| 203 |
- |
|
| 204 |
- |
|
| 205 |
-## Algorithms & Detection Logic |
|
| 206 |
- |
|
| 207 |
-The following modules involve non-trivial logic and should be reviewed carefully: |
|
| 208 |
- |
|
| 209 |
-| Module | File | Description | |
|
| 210 |
-|--------|------|-------------| |
|
| 211 |
-| **Anomaly Detector** | `Services/AnomalyDetector.swift` | Statistical detection: insertions, deletions, duplicates, divergence | |
|
| 212 |
-| **Divergence Engine** | `Services/DivergenceEngine.swift` | Time-series trend analysis, σ comparison | |
|
| 213 |
-| **Fingerprinter** | `Services/SampleFingerprinter.swift` | Duplicate detection via sample hashing | |
|
| 214 |
-| **Snapshot Comparator** | `Services/SnapshotComparator.swift` | Diff between two HealthKit snapshots | |
|
| 215 |
-| **Distribution Comparator** | `Services/SnapshotDiffService.swift` | Daily per-type distribution diff to reveal old-data disappearance masked by new data | |
|
| 216 |
- |
|
| 217 |
-**Guidelines for algorithm modules:** |
|
| 218 |
-- Document assumptions explicitly (e.g., "assumes continuous monitoring since install") |
|
| 219 |
-- All thresholds (e.g., `age > 7 days`) must be configurable constants, not magic numbers |
|
| 220 |
-- Include unit tests for edge cases (empty snapshots, partial data, clock skew) |
|
| 221 |
-- No UI code; return plain Swift types only |
|
| 222 |
- |
|
| 223 |
- |
|
| 224 |
-## Privacy Directives — All Agents |
|
| 225 |
- |
|
| 226 |
-**Mandatory across all modules:** |
|
| 227 |
-- No credentials, API keys, tokens, or certificates in any file |
|
| 228 |
-- No personal data: names, emails, phone numbers, dates of birth |
|
| 229 |
-- No device identifiers: UDID, serial number, advertising ID, device name |
|
| 230 |
-- No account identifiers: Apple ID, iCloud account info, CloudKit record IDs |
|
| 231 |
-- No raw health values in code, tests, previews, logs, or comments |
|
| 232 |
-- No location data or patterns enabling re-identification |
|
| 233 |
-- Synthetic data only in tests and previews |
|
| 234 |
- |
|
| 235 |
-**Clarification:** “No raw health values” applies to this repository’s contents. The app may optionally store a user's raw HealthKit samples *locally on-device* for forensic backup purposes, but such samples must never appear in source control, logs, or docs. |
|
| 236 |
- |
|
| 237 |
- |
|
| 238 |
-## Communication Between Agents |
|
| 239 |
- |
|
| 240 |
-When one agent needs to communicate a decision or change to another: |
|
| 241 |
- |
|
| 242 |
-1. **Update this file** (`AGENTS.md`) with the protocol/interface change |
|
| 243 |
-2. **Update the relevant protocol** in `Services/Protocols/` |
|
| 244 |
-3. **Add a comment** in the affected file: `// Interface updated YYYY-MM-DD — see AGENTS.md` |
|
| 245 |
- |
|
| 246 |
- |
|
| 247 |
-## Current Status |
|
| 248 |
- |
|
| 249 |
-| Module | Status | Owner | |
|
| 250 |
-|--------|--------|-------| |
|
| 251 |
-| SwiftData Models | ✅ Done | Models agent | |
|
| 252 |
-| HealthKit Integration | ✅ Done | Services agent | |
|
| 253 |
-| Snapshot Diff Service | ✅ Done | Services agent | |
|
| 254 |
-| Service Protocols | ⏳ Not started | Services agent | |
|
| 255 |
-| Anomaly Detection | ⏳ Not started | Services agent | |
|
| 256 |
-| Sync Monitor | ⏳ Not started | Services agent | |
|
| 257 |
-| UI – App entry + TabView | ✅ Done | Claude Code | |
|
| 258 |
-| UI – Dashboard | ✅ Done (functional, minimal) | Claude Code | |
|
| 259 |
-| UI – Snapshots + Detail | ✅ Done | Claude Code | |
|
| 260 |
-| UI – Data Types | ✅ Done | Claude Code | |
|
| 261 |
-| UI – Settings | ✅ Done | Claude Code | |
|
| 262 |
-| Unit Tests | ⏳ Not started | Tests agent | |
|
| 13 |
+The repository root must not contain substantive project documentation. Keep |
|
| 14 |
+canonical docs under `HealthProbe/Doc/` so agents do not discover stale copies. |
|
@@ -1,239 +1,7 @@ |
||
| 1 |
-# HealthProbe – Claude Code Instructions |
|
| 1 |
+# HealthProbe Claude Bootstrap |
|
| 2 | 2 |
|
| 3 |
-## Project Context |
|
| 3 |
+Canonical Claude/UI instructions live in: |
|
| 4 | 4 |
|
| 5 |
-**HealthProbe** is an iOS app that audits Apple HealthKit data integrity. |
|
| 6 |
-It detects anomalies: data loss, historical insertions, duplicates, divergence trends. |
|
| 7 |
-Full specification: `HealthProbe/Doc/HealthProbe – Complete Specification & Motivations.md` |
|
| 5 |
+- `HealthProbe/Doc/00-agent-guides/CLAUDE.md` |
|
| 8 | 6 |
|
| 9 |
-**Current state:** SwiftUI + SwiftData app is active. Product direction changed on 2026-05-18: HealthProbe is a local audit/capture agent. Do not add HealthProbe CloudKit/iCloud sync. |
|
| 10 |
- |
|
| 11 |
- |
|
| 12 |
-## Claude Code Scope: UI Layer |
|
| 13 |
- |
|
| 14 |
-Claude Code is responsible for: |
|
| 15 |
-- All **SwiftUI Views** (`Views/` directory) |
|
| 16 |
-- All **ViewModels** (`ViewModels/` directory) |
|
| 17 |
-- **Navigation structure** and tab/split layout |
|
| 18 |
-- **Design system** (colors, typography, spacing) |
|
| 19 |
-- **Preview providers** for all views |
|
| 20 |
-- **Accessibility** (VoiceOver, Dynamic Type) |
|
| 21 |
- |
|
| 22 |
-Claude Code does NOT own: |
|
| 23 |
-- `Services/` — HealthKit queries, anomaly detection, archive store, context monitoring (see AGENTS.md) |
|
| 24 |
-- `Models/` — SwiftData models (see AGENTS.md) |
|
| 25 |
-- Entitlements, Info.plist, project configuration |
|
| 26 |
- |
|
| 27 |
-When services are not yet implemented, **consume their protocols and use mock implementations** for UI development. |
|
| 28 |
- |
|
| 29 |
- |
|
| 30 |
-## Target Screen Structure |
|
| 31 |
- |
|
| 32 |
-``` |
|
| 33 |
-App (TabView) |
|
| 34 |
-├── Tab 1: Dashboard → DashboardView |
|
| 35 |
-├── Tab 2: Anomalies → AnomalyListView → AnomalyDetailView |
|
| 36 |
-├── Tab 3: Audit Trail → AuditTrailView |
|
| 37 |
-├── Tab 4: Archive Status → ArchiveStatusView |
|
| 38 |
-└── Tab 5: Settings → SettingsView |
|
| 39 |
-``` |
|
| 40 |
- |
|
| 41 |
-### DashboardView |
|
| 42 |
-- Large status indicator: ✅ Healthy / ⚠️ Check / 🚨 Critical |
|
| 43 |
-- Last check timestamp |
|
| 44 |
-- Summary cards: samples tracked, anomalies found (all-time) |
|
| 45 |
-- Up to 3 recent active alerts (tappable → AnomalyDetailView) |
|
| 46 |
-- "Check Now" button (calls monitoring service) |
|
| 47 |
- |
|
| 48 |
-### AnomalyListView |
|
| 49 |
-- List of `DetectedAnomaly` sorted by date (most recent first) |
|
| 50 |
-- Filter: All / Critical / Warning / Info |
|
| 51 |
-- Filter: by type (deletion, insertion, duplicate, divergence) |
|
| 52 |
-- Each row: severity badge, type, sample type, date |
|
| 53 |
-- Swipe to mark resolved |
|
| 54 |
- |
|
| 55 |
-### AnomalyDetailView |
|
| 56 |
-- Full anomaly details |
|
| 57 |
-- Evidence dictionary displayed as key-value rows |
|
| 58 |
-- Severity badge |
|
| 59 |
-- Share button → exports as Markdown (for bug reports) |
|
| 60 |
-- "Mark Resolved" action |
|
| 61 |
- |
|
| 62 |
-### AuditTrailView |
|
| 63 |
-- Chronological list of `AuditTrailEntry` |
|
| 64 |
-- Each row: timestamp, event type chip, message |
|
| 65 |
-- Search/filter by event type |
|
| 66 |
-- Export button → JSON |
|
| 67 |
- |
|
| 68 |
-### ArchiveStatusView |
|
| 69 |
-- Current local archive health |
|
| 70 |
-- Last archive verification timestamp |
|
| 71 |
-- Selected data types covered by forensic capture |
|
| 72 |
-- Recent Health/iCloud context events (for correlation only; no HealthProbe sync) |
|
| 73 |
- |
|
| 74 |
-### SettingsView |
|
| 75 |
-- Check frequency: 2h / 6h / 12h / 24h (Picker) |
|
| 76 |
-- Sample types to monitor (MultiSelect toggle list) |
|
| 77 |
-- Alert thresholds (severity level for push notifications) |
|
| 78 |
-- Point export/report actions for selected findings |
|
| 79 |
-- Delete all audit data (destructive, confirm alert) |
|
| 80 |
- |
|
| 81 |
- |
|
| 82 |
-## Design Guidelines |
|
| 83 |
- |
|
| 84 |
-**Tone:** Professional, calm, medical-adjacent. Not alarming unless critical. |
|
| 85 |
- |
|
| 86 |
-**Color System:** |
|
| 87 |
-```swift |
|
| 88 |
-// Status colors |
|
| 89 |
-.healthyGreen // SF: green — all clear |
|
| 90 |
-.warningAmber // SF: yellow — attention needed |
|
| 91 |
-.criticalRed // SF: red — action required |
|
| 92 |
-.neutralGray // SF: gray — informational / resolved |
|
| 93 |
-``` |
|
| 94 |
- |
|
| 95 |
-**Typography:** SF Pro (system font). No custom fonts. |
|
| 96 |
- |
|
| 97 |
-**Spacing:** 8pt grid. Use `VStack(spacing: 12)` as baseline. |
|
| 98 |
- |
|
| 99 |
-**Icons:** SF Symbols only. No third-party icon sets. |
|
| 100 |
- |
|
| 101 |
-**Key SF Symbols:** |
|
| 102 |
-- `checkmark.shield.fill` — healthy status |
|
| 103 |
-- `exclamationmark.triangle.fill` — warning |
|
| 104 |
-- `xmark.shield.fill` — critical |
|
| 105 |
-- `clock.arrow.circlepath` — audit trail |
|
| 106 |
-- `externaldrive.fill.badge.checkmark` — archive status |
|
| 107 |
-- `waveform.path.ecg` — health data |
|
| 108 |
-- `doc.text.magnifyingglass` — anomaly detail |
|
| 109 |
- |
|
| 110 |
-**Dark mode:** Required. Test in both modes. |
|
| 111 |
- |
|
| 112 |
-**Privacy-first UI:** |
|
| 113 |
-- Health metric values are **never shown in plain text** in list rows |
|
| 114 |
-- Values visible only in `AnomalyDetailView` after tap |
|
| 115 |
-- Evidence dictionary values shown as monospace text, not highlighted |
|
| 116 |
- |
|
| 117 |
- |
|
| 118 |
-## SwiftData Integration |
|
| 119 |
- |
|
| 120 |
-Models are defined in `Models/`. Reference them read-only from views: |
|
| 121 |
- |
|
| 122 |
-```swift |
|
| 123 |
-// In views, use @Query — never write directly from a View |
|
| 124 |
-@Query(sort: \DetectedAnomaly.detectedAt, order: .reverse) |
|
| 125 |
-private var anomalies: [DetectedAnomaly] |
|
| 126 |
- |
|
| 127 |
-// Mutations go through ViewModels or services only |
|
| 128 |
-``` |
|
| 129 |
- |
|
| 130 |
-Until `Models/` are implemented, use mock data via `PreviewProvider`. |
|
| 131 |
- |
|
| 132 |
- |
|
| 133 |
-## ViewModel Pattern |
|
| 134 |
- |
|
| 135 |
-```swift |
|
| 136 |
-// Pattern for all ViewModels |
|
| 137 |
-@MainActor |
|
| 138 |
-@Observable |
|
| 139 |
-final class DashboardViewModel {
|
|
| 140 |
- private let monitor: HealthMonitorProtocol // protocol, not concrete type |
|
| 141 |
- |
|
| 142 |
- var status: HealthStatus = .unknown |
|
| 143 |
- var recentAnomalies: [DetectedAnomaly] = [] |
|
| 144 |
- var lastChecked: Date? |
|
| 145 |
- |
|
| 146 |
- init(monitor: HealthMonitorProtocol = HealthMonitorService.shared) {
|
|
| 147 |
- self.monitor = monitor |
|
| 148 |
- } |
|
| 149 |
- |
|
| 150 |
- func refresh() async {
|
|
| 151 |
- try? await monitor.runCheck() // runCheck() is declared `async throws`; this sketch ignores the error |
|
| 152 |
- } |
|
| 153 |
-} |
|
| 154 |
-``` |
|
| 155 |
- |
|
| 156 |
-Always inject dependencies via protocols — makes previews and tests possible without real HealthKit. |
|
| 157 |
- |
|
| 158 |
- |
|
| 159 |
-## Mock Data Protocol |
|
| 160 |
- |
|
| 161 |
-Until services are ready, define preview mocks in `Utilities/Mocks.swift`: |
|
| 162 |
- |
|
| 163 |
-```swift |
|
| 164 |
-struct MockHealthMonitor: HealthMonitorProtocol {
|
|
| 165 |
- func runCheck() async { }
|
|
| 166 |
- var currentStatus: HealthStatus { .warning }
- var lastChecked: Date? { nil }
|
|
| 167 |
-} |
|
| 168 |
- |
|
| 169 |
-extension DetectedAnomaly {
|
|
| 170 |
- static var preview: DetectedAnomaly {
|
|
| 171 |
- DetectedAnomaly( |
|
| 172 |
- detectedAt: .now, |
|
| 173 |
- type: "silent_deletion", |
|
| 174 |
- severity: "warning", |
|
| 175 |
- sampleType: "Steps", |
|
| 176 |
- summary: "72 samples missing without deletion event", |
|
| 177 |
- evidence: ["loss_count": "72", "loss_percent": "23.4"] |
|
| 178 |
- ) |
|
| 179 |
- } |
|
| 180 |
-} |
|
| 181 |
-``` |
|
| 182 |
- |
|
| 183 |
- |
|
| 184 |
-## File Organization |
|
| 185 |
- |
|
| 186 |
-``` |
|
| 187 |
-HealthProbe/ |
|
| 188 |
-├── Views/ |
|
| 189 |
-│ ├── Dashboard/ |
|
| 190 |
-│ │ ├── DashboardView.swift |
|
| 191 |
-│ │ └── StatusCardView.swift |
|
| 192 |
-│ ├── Anomalies/ |
|
| 193 |
-│ │ ├── AnomalyListView.swift |
|
| 194 |
-│ │ └── AnomalyDetailView.swift |
|
| 195 |
-│ ├── AuditTrail/ |
|
| 196 |
-│ │ └── AuditTrailView.swift |
|
| 197 |
-│ ├── Archive/ |
|
| 198 |
-│ │ └── ArchiveStatusView.swift |
|
| 199 |
-│ └── Settings/ |
|
| 200 |
-│ └── SettingsView.swift |
|
| 201 |
-├── ViewModels/ |
|
| 202 |
-│ ├── DashboardViewModel.swift |
|
| 203 |
-│ ├── AnomalyListViewModel.swift |
|
| 204 |
-│ └── ArchiveStatusViewModel.swift |
|
| 205 |
-├── Models/ ← NOT owned by Claude Code |
|
| 206 |
-├── Services/ ← NOT owned by Claude Code |
|
| 207 |
-└── Utilities/ |
|
| 208 |
- ├── Mocks.swift |
|
| 209 |
- ├── DateFormatters.swift |
|
| 210 |
- └── DesignSystem.swift |
|
| 211 |
-``` |
|
| 212 |
- |
|
| 213 |
- |
|
| 214 |
-## Privacy Directives |
|
| 215 |
- |
|
| 216 |
-**Mandatory — no exceptions:** |
|
| 217 |
-- No credentials, tokens, or API keys in any file |
|
| 218 |
-- No personal data, device identifiers, or account identifiers |
|
| 219 |
-- No real health values in code, comments, previews, or tests |
|
| 220 |
-- Synthetic preview data only (see Mocks.swift above) |
|
| 221 |
- |
|
| 222 |
- |
|
| 223 |
-## Before Marking a Task Complete |
|
| 224 |
- |
|
| 225 |
-- [ ] View renders in both Light and Dark mode (use Preview) |
|
| 226 |
-- [ ] VoiceOver labels set on interactive elements |
|
| 227 |
-- [ ] Dynamic Type tested (at least xSmall and AX3) |
|
| 228 |
-- [ ] Works with mock data (no real HealthKit dependency in View layer) |
|
| 229 |
-- [ ] No health values displayed without explicit user tap |
|
| 230 |
-- [ ] Compiles without warnings |
|
| 7 |
+Read `HealthProbe/Doc/README.md` first, then the UI agent guide above. |
|
@@ -1,46 +0,0 @@ |
||
| 1 |
-# Contributing to HealthProbe |
|
| 2 |
- |
|
| 3 |
-## ⚠️ Privacy Rules — Non-Negotiable |
|
| 4 |
- |
|
| 5 |
-Before submitting any code, issue, PR, or documentation: |
|
| 6 |
- |
|
| 7 |
-**Never include:** |
|
| 8 |
-- Credentials, API keys, tokens, or certificates |
|
| 9 |
-- Personal data: names, emails, phone numbers, dates of birth |
|
| 10 |
-- Device identifiers: UDID, serial number, advertising ID, device name |
|
| 11 |
-- Account identifiers: Apple ID, iCloud account, CloudKit record IDs |
|
| 12 |
-- Raw health data: actual measurements, records, or workout details |
|
| 13 |
-- Location data: GPS coordinates, location history |
|
| 14 |
-- Any combination of fields that could identify a person or device |
|
| 15 |
- |
|
| 16 |
-**For examples and tests, use synthetic data only:** |
|
| 17 |
-``` |
|
| 18 |
-Device: "iPhone-TESTDEVICE-001" |
|
| 19 |
-User: "Test User" |
|
| 20 |
-Date: 2000-01-01 |
|
| 21 |
-Value: 0 (or clearly fictional) |
|
| 22 |
-``` |
|
| 23 |
- |
|
| 24 |
-Submissions containing real credentials or personal data will be closed without review. |
|
| 25 |
- |
|
| 26 |
- |
|
| 27 |
-## Contribution Standards |
|
| 28 |
- |
|
| 29 |
-- **Observations ≠ conclusions:** Label theories and speculation explicitly |
|
| 30 |
-- **Read-only HealthKit:** No code that modifies or deletes health data |
|
| 31 |
-- **Evidence-based:** Bug reports require reproduction steps, device model, and iOS version |
|
| 32 |
-- **No raw health exports:** Aggregated counts only; never raw sample values |
|
| 33 |
- |
|
| 34 |
-## Bug Reports |
|
| 35 |
- |
|
| 36 |
-Include: |
|
| 37 |
-- Device model (e.g., iPhone 15 Pro) — no serial/UDID |
|
| 38 |
-- iOS version |
|
| 39 |
-- HealthProbe version |
|
| 40 |
-- Observed vs. expected behavior |
|
| 41 |
-- Anonymized screenshot or export (values redacted) |
|
| 42 |
- |
|
| 43 |
-## License |
|
| 44 |
- |
|
| 45 |
-By contributing you agree your code is released under the project license. |
|
@@ -14,6 +14,21 @@ |
||
| 14 | 14 |
439832862FA4933F003C0182 /* Exceptions for "HealthProbe" folder in "HealthProbe" target */ = {
|
| 15 | 15 |
isa = PBXFileSystemSynchronizedBuildFileExceptionSet; |
| 16 | 16 |
membershipExceptions = ( |
| 17 |
+ "Doc/00-agent-guides/AGENTS.md", |
|
| 18 |
+ "Doc/00-agent-guides/CLAUDE.md", |
|
| 19 |
+ "Doc/01-product/Forensics-Limitations.md", |
|
| 20 |
+ "Doc/01-product/MVP-Specification.md", |
|
| 21 |
+ "Doc/01-product/Product-Specification.md", |
|
| 22 |
+ "Doc/02-architecture/Core-Data-Cache-Design.md", |
|
| 23 |
+ "Doc/02-architecture/Database-Design.md", |
|
| 24 |
+ "Doc/02-architecture/Export-Specification.md", |
|
| 25 |
+ "Doc/02-architecture/Implementation-Guide.md", |
|
| 26 |
+ "Doc/03-ui/README.md", |
|
| 27 |
+ "Doc/04-project/IMPLEMENTATION_STATUS.md", |
|
| 28 |
+ "Doc/04-project/Refactoring-Plan.md", |
|
| 29 |
+ "Doc/99-archive/DATA_TYPE_VIEWS_OPTIMIZATION.md", |
|
| 30 |
+ "Doc/99-archive/REFACTORING_DATA_TYPE_VIEWS.md", |
|
| 31 |
+ Doc/README.md, |
|
| 17 | 32 |
Info.plist, |
| 18 | 33 |
); |
| 19 | 34 |
target = 439832782FA4933E003C0182 /* HealthProbe */; |
@@ -91,7 +106,7 @@ |
||
| 91 | 106 |
attributes = {
|
| 92 | 107 |
BuildIndependentTargetsInParallel = 1; |
| 93 | 108 |
LastSwiftUpdateCheck = 2640; |
| 94 |
- LastUpgradeCheck = 2640; |
|
| 109 |
+ LastUpgradeCheck = 2650; |
|
| 95 | 110 |
TargetAttributes = {
|
| 96 | 111 |
439832782FA4933E003C0182 = {
|
| 97 | 112 |
CreatedOnToolsVersion = 26.4.1; |
@@ -278,6 +293,7 @@ |
||
| 278 | 293 |
MTL_FAST_MATH = YES; |
| 279 | 294 |
ONLY_ACTIVE_ARCH = YES; |
| 280 | 295 |
SDKROOT = iphoneos; |
| 296 |
+ STRING_CATALOG_GENERATE_SYMBOLS = YES; |
|
| 281 | 297 |
SWIFT_ACTIVE_COMPILATION_CONDITIONS = "DEBUG $(inherited)"; |
| 282 | 298 |
SWIFT_OPTIMIZATION_LEVEL = "-Onone"; |
| 283 | 299 |
}; |
@@ -335,6 +351,7 @@ |
||
| 335 | 351 |
MTL_ENABLE_DEBUG_INFO = NO; |
| 336 | 352 |
MTL_FAST_MATH = YES; |
| 337 | 353 |
SDKROOT = iphoneos; |
| 354 |
+ STRING_CATALOG_GENERATE_SYMBOLS = YES; |
|
| 338 | 355 |
SWIFT_COMPILATION_MODE = wholemodule; |
| 339 | 356 |
VALIDATE_PRODUCT = YES; |
| 340 | 357 |
}; |
@@ -0,0 +1,341 @@ |
||
| 1 |
+# HealthProbe - Multi-Model Development Guide |
|
| 2 |
+ |
|
| 3 |
+Canonical path: `HealthProbe/Doc/00-agent-guides/AGENTS.md` |
|
| 4 |
+ |
|
| 5 |
+Start every documentation lookup from [`../README.md`](../README.md). |
|
| 6 |
+ |
|
| 7 |
+## Overview |
|
| 8 |
+ |
|
| 9 |
+HealthProbe is built by multiple AI models, each owning a distinct domain. |
|
| 10 |
+This document defines boundaries, interfaces, and handoff contracts. |
|
| 11 |
+ |
|
| 12 |
+**Agentic reality:** The repo is developed largely via agents (Codex CLI, Claude, and dedicated model sessions). When updating product scope, update docs first, then implement behind flags, and add tests for the new behavior. |
|
| 13 |
+ |
|
| 14 |
+**Objective updated 2026-05-23:** HealthProbe is a single-device local |
|
| 15 |
+Health database time machine. It captures the local HealthKit database over time, |
|
| 16 |
+lets the user inspect how accessible health data looked at a chosen observation |
|
| 17 |
+date, explains what changed between local observations, and preserves exportable |
|
| 18 |
+evidence that may no longer be available after Apple Health consolidates, |
|
| 19 |
+aggregates, or prunes historical records. The app no longer treats raw |
|
| 20 |
+record-count drops as inherently alarming, no longer studies differences between |
|
| 21 |
+snapshots from different devices, and does not sync HealthProbe data through |
|
| 22 |
+CloudKit/iCloud. Device metadata may still be stored as local provenance for the |
|
| 23 |
+current device's chain, but UI and algorithms should compare snapshots only |
|
| 24 |
+within one local device timeline. |
|
| 25 |
+ |
|
| 26 |
+**Storage decision updated 2026-05-23:** HealthProbe targets legacy devices still |
|
| 27 |
+used for Health collection, including iPhone 6s / Apple Watch Series 3 class |
|
| 28 |
+setups. SwiftData is therefore not an acceptable long-term foundation because it |
|
| 29 |
+requires newer OS versions. The target architecture is: |
|
| 30 |
+1. a direct SQLite archive/analysis database as source of truth; |
|
| 31 |
+2. differential observation storage, never recurring complete snapshot copies; |
|
| 32 |
+3. SQL-first analysis using indexes, joins, CTEs, temporary tables, and paged |
|
| 33 |
+ result sets without loading large archives into RAM; |
|
| 34 |
+4. a rebuildable Core Data UI/reporting cache for expensive counts, summaries, |
|
| 35 |
+ timeline rows, report metadata, and display state. |
|
| 36 |
+ |
|
| 37 |
+Current SwiftData models are legacy/prototype implementation details until the |
|
| 38 |
+Core Data cache replacement is implemented. New product/storage work should not |
|
| 39 |
+expand SwiftData dependency. |
|
| 40 |
+ |
|
| 41 |
+**Deployment/reset note updated 2026-05-23:** HealthProbe has no real |
|
| 42 |
+deployments, only test installations. Existing SwiftData stores and prototype |
|
| 43 |
+SQLite archives are disposable for the archive v2 refactor: agents should reset, |
|
| 44 |
+ignore, or reinitialize them rather than building one-way migrations or backward |
|
| 45 |
+compatibility layers. Future real archive versions may require migrations, but |
|
| 46 |
+the current prototype schema does not. |
|
| 47 |
+ |
|
| 48 |
+**Recovery compatibility note updated 2026-05-23:** HealthProbe will not perform |
|
| 49 |
+disaster-recovery workflows such as transplanting Health database files into |
|
| 50 |
+encrypted iOS backups or re-publishing archived values into HealthKit. However, |
|
| 51 |
+the local archive and user exports should preserve enough structure to support |
|
| 52 |
+external recovery/salvage procedures: stable record identity, values, dates, |
|
| 53 |
+units, source/provenance metadata where available, observation history, |
|
| 54 |
+relationships, hashes, and manifests. Recovery compatibility is an archive/export |
|
| 55 |
+requirement, not an in-app restore feature. |
|
| 56 |
+ |
|
| 57 |
+--- |
|
| 58 |
+ |
|
| 59 |
+## Model Allocation |
|
| 60 |
+ |
|
| 61 |
+| Domain | Owner | Tools | |
|
| 62 |
+|--------|-------|-------| |
|
| 63 |
+| **UI / SwiftUI Views** | Claude Code | Xcode, SwiftUI, `HealthProbe/Doc/00-agent-guides/CLAUDE.md` | |
|
| 64 |
+| **Archive Store** | Dedicated model session | SQLite/local archive format, HealthKit metadata mapping | |
|
| 65 |
+| **Data Models (Core Data cache)** | Dedicated model session | Xcode, Swift; derived UI/cache/settings/log/report models only | |
|
| 66 |
+| **HealthKit Integration** | Dedicated model session | Xcode, HealthKit docs | |
|
| 67 |
+| **Change Explanation Algorithms** | Dedicated model session | Swift, archive diffing, consolidation heuristics | |
|
| 68 |
+| **Context Monitoring** | Dedicated model session | Xcode; logs Health/iCloud state as context only | |
|
| 69 |
+| **Documentation** | Claude Code + dedicated session | Markdown | |
|
| 70 |
+| **Tests** | Dedicated model session | XCTest, Swift Testing | |
|
| 71 |
+ |
|
| 72 |
+--- |
|
| 73 |
+ |
|
| 74 |
+## Directory Ownership |
|
| 75 |
+ |
|
| 76 |
+``` |
|
| 77 |
+HealthProbe/ |
|
| 78 |
+├── Views/ ← Claude Code (UI) |
|
| 79 |
+├── ViewModels/ ← Claude Code (UI) |
|
| 80 |
+├── Utilities/ ← Claude Code (shared helpers, mocks) |
|
| 81 |
+├── Models/ ← Models agent (legacy SwiftData now; target Core Data UI/cache schemas) |
|
| 82 |
+├── Services/ ← Services agent (HealthKit, archive store, change explanation, context) |
|
| 83 |
+└── Tests/ ← Tests agent |
|
| 84 |
+``` |
|
| 85 |
+ |
|
| 86 |
+**Rule:** Each agent writes only within its owned directories. |
|
| 87 |
+Cross-boundary changes require an explicit interface contract (protocol) first. |
|
| 88 |
+ |
|
| 89 |
+**Documentation scope:** `HealthProbe/Doc/` is shared. Keep it consistent with shipped behavior, and add a dated entry when objectives change. |
|
| 90 |
+ |
|
| 91 |
+**Database work starts here:** `HealthProbe/Doc/02-architecture/Database-Design.md`. |
|
| 92 |
+The database is the central project artifact. Agents changing archive schema, |
|
| 93 |
+capture persistence, diff logic, aggregate caches, exports, reset behavior, or |
|
| 94 |
+future migrations must read and update that document before code changes. |
|
| 95 |
+ |
|
| 96 |
+--- |
|
| 97 |
+ |
|
| 98 |
+## Interface Contracts |
|
| 99 |
+ |
|
| 100 |
+All service boundaries are defined as Swift protocols. |
|
| 101 |
+Claude Code (UI) consumes protocols — never concrete implementations. |
|
| 102 |
+ |
|
| 103 |
+### CaptureMonitorProtocol |
|
| 104 |
+ |
|
| 105 |
+```swift |
|
| 106 |
+/// Owned by: Services agent |
|
| 107 |
+/// Consumed by: UI (DashboardViewModel) |
|
| 108 |
+protocol CaptureMonitorProtocol {
|
|
| 109 |
+ var archiveStatus: ArchiveStatus { get }
|
|
| 110 |
+ var lastObservedAt: Date? { get }
|
|
| 111 |
+ func captureNow() async throws |
|
| 112 |
+} |
|
| 113 |
+``` |
|
| 114 |
+ |
|
| 115 |
+### ChangeSummaryStoreProtocol |
|
| 116 |
+ |
|
| 117 |
+```swift |
|
| 118 |
+/// Owned by: Services agent |
|
| 119 |
+/// Consumed by: UI (change timeline/detail views) |
|
| 120 |
+protocol ChangeSummaryStoreProtocol {
|
|
| 121 |
+ var changes: [DetectedChange] { get }
|
|
| 122 |
+ func markReviewed(_ change: DetectedChange) async throws |
|
| 123 |
+} |
|
| 124 |
+``` |
|
| 125 |
+ |
|
| 126 |
+### AuditTrailProtocol |
|
| 127 |
+ |
|
| 128 |
+```swift |
|
| 129 |
+/// Owned by: Services agent |
|
| 130 |
+/// Consumed by: UI (AuditTrailView) |
|
| 131 |
+protocol AuditTrailProtocol {
|
|
| 132 |
+ var entries: [AuditTrailEntry] { get }
|
|
| 133 |
+ func export() async throws -> Data // JSON |
|
| 134 |
+} |
|
| 135 |
+``` |
|
| 136 |
+ |
|
| 137 |
+### ContextMonitorProtocol |
|
| 138 |
+ |
|
| 139 |
+```swift |
|
| 140 |
+/// Owned by: Services agent |
|
| 141 |
+/// Consumed by: UI (ContextViewModel) |
|
| 142 |
+protocol ContextMonitorProtocol {
|
|
| 143 |
+ var iCloudEnabled: Bool { get }
|
|
| 144 |
+ var lastObservedAt: Date? { get }
|
|
| 145 |
+ var stateChanges: [ContextStateChange] { get }
|
|
| 146 |
+} |
|
| 147 |
+``` |
|
| 148 |
+ |
|
| 149 |
+--- |
|
| 150 |
+ |
|
| 151 |
+## Shared Types (Models Agent) |
|
| 152 |
+ |
|
| 153 |
+These types are defined once in `Models/` and shared across all agents: |
|
| 154 |
+ |
|
| 155 |
+```swift |
|
| 156 |
+// Models/TypeDistributionBin.swift |
|
| 157 |
+@Model |
|
| 158 |
+final class TypeDistributionBin {
|
|
| 159 |
+ var bucketStart: Date |
|
| 160 |
+ var bucketEnd: Date |
|
| 161 |
+ var count: Int |
|
| 162 |
+} |
|
| 163 |
+ |
|
| 164 |
+// Models/TypeCount.swift |
|
| 165 |
+// TypeCount owns zero or more TypeDistributionBin records. |
|
| 166 |
+// These bins store sample counts and import anchors, not raw health values. |
|
| 167 |
+ |
|
| 168 |
+// Interface updated 2026-05-12 — see AGENTS.md |
|
| 169 |
+// Models/HealthRecord.swift |
|
| 170 |
+// HealthRecord stores one anonymized HealthKit record fingerprint plus its start/end dates. |
|
| 171 |
+// It intentionally does not store raw health values, device identifiers, or source metadata. |
|
| 172 |
+// UI may compare HealthRecord fingerprints between adjacent snapshots to expose local |
|
| 173 |
+// record-level changes that are masked by newly-added records with the same total count. |
|
| 174 |
+// High-volume snapshots store these records in TypeCount.recordArchiveData instead of |
|
| 175 |
+// creating one SwiftData model per record, avoiding main-thread stalls after import. |
|
| 176 |
+ |
|
| 177 |
+// Interface updated 2026-05-13 — see AGENTS.md |
|
| 178 |
+// TypeDistributionBin also stores content hashes and HealthKit query anchors. |
|
| 179 |
+// Import uses a global anchored query per data type so follow-up snapshots fetch only |
|
| 180 |
+// HealthKit deltas instead of scanning calendar blocks with fixed per-query latency. |
|
| 181 |
+ |
|
| 182 |
+// Interface updated 2026-05-18 — see AGENTS.md |
|
| 183 |
+// SwiftData is not the forensic source of truth and is legacy/prototype storage |
|
| 184 |
+// for the current app. Target architecture uses Core Data as a rebuildable |
|
| 185 |
+// UI/reporting cache only. Complete HealthKit samples and metadata belong in the |
|
| 186 |
+// SQLite archive store, in one schema that can preserve relationships across data |
|
| 187 |
+// types, sources, devices, workouts, and metadata. |
|
| 188 |
+ |
|
| 189 |
+// Interface updated 2026-05-18 — see AGENTS.md |
|
| 190 |
+// Services/Protocols/HealthArchiveStore.swift defines the local archive boundary. |
|
| 191 |
+// SQLiteHealthArchiveStore is the current implementation. HealthKit anchored-query |
|
| 192 |
+// pages must be written to this archive before UI/cache rows are saved. |
|
| 193 |
+// Deletions are recorded by sampleUUIDHash because HKDeletedObject exposes UUIDs, |
|
| 194 |
+// not complete sample payloads. |
|
| 195 |
+ |
|
| 196 |
+// Storage objective updated 2026-05-23 — see AGENTS.md |
|
| 197 |
+// Recurring complete snapshots are out of scope for the target architecture. |
|
| 198 |
+// Store differential observations, versioned sample payloads, observation ranges, |
|
| 199 |
+// and materialized aggregates. Expensive counts used by reports/UI should be |
|
| 200 |
+// cached in Core Data and be rebuildable from SQLite. |
|
| 201 |
+ |
|
| 202 |
+// Objective updated 2026-05-23 — see AGENTS.md |
|
| 203 |
+// HealthProbe is a local Health DB Time Machine. Snapshot/device identifiers are |
|
| 204 |
+// retained only to preserve local provenance and keep comparisons within one |
|
| 205 |
+// device chain. Record-count drops must be explained with aggregate and |
|
| 206 |
+// representation context, not treated as inherently alarming. |
|
| 207 |
+ |
|
| 208 |
+// Interface updated 2026-05-17 — see AGENTS.md |
|
| 209 |
+// Models/TypeCount.detailCacheData stores precomputed detail data for the current |
|
| 210 |
+// TypeCount compared with the immediately previous snapshot on the same device. |
|
| 211 |
+// The cache contains aggregate added/disappeared counts, capped preview records for |
|
| 212 |
+// UI drill-down, and daily change bins for temporal charts. It must be computed when |
|
| 213 |
+// snapshots are saved and refreshed for neighboring snapshots when snapshot deletion |
|
| 214 |
+// changes chain links. Existing stores are backfilled incrementally with a strict |
|
| 215 |
+// per-launch TypeCount cap to avoid decoding many large archives in one run. |
|
| 216 |
+ |
|
| 217 |
+// Interface updated 2026-05-17 — see AGENTS.md |
|
| 218 |
+// Models/HealthSnapshot.contentEquivalentSnapshotID marks snapshots whose TypeCount |
|
| 219 |
+// content is identical to a previous snapshot on the same device. These snapshots are |
|
| 220 |
+// retained as temporal labels but behave as aliases to the representative content |
|
| 221 |
+// snapshot for expensive detail cache/diff work. |
|
| 222 |
+ |
|
| 223 |
+// Interface updated 2026-05-17 — see AGENTS.md |
|
| 224 |
+// Models/TypeCount.contentEquivalentTypeCountID marks individual data types whose |
|
| 225 |
+// content is identical to the previous snapshot's same TypeCount. This allows a |
|
| 226 |
+// snapshot to contain real changes for some metrics while long-stable metrics behave |
|
| 227 |
+// as temporal aliases and skip per-type detail cache/diff work. |
|
| 228 |
+ |
|
| 229 |
+// Interface updated 2026-05-17 — see AGENTS.md |
|
| 230 |
+// Models/HealthSnapshot stores cached overview scalars for UI consumption: |
|
| 231 |
+// tracked type count, aggregate record count, and overall oldest/newest record dates. |
|
| 232 |
+// These values must be computed during snapshot save while TypeCount data is already |
|
| 233 |
+// in memory, so snapshot list/detail screens never recompute them by traversing |
|
| 234 |
+// snapshot.typeCounts on the UI thread. |
|
| 235 |
+ |
|
| 236 |
+// Interface updated 2026-05-17 — see AGENTS.md |
|
| 237 |
+// Models/SnapshotDelta stores cached list/detail summary scalars derived from TypeDelta. |
|
| 238 |
+// Overview screens consume these scalars and type-delta summaries directly instead of |
|
| 239 |
+// recalculating per-snapshot changes from HealthSnapshot.typeCounts. |
|
| 240 |
+ |
|
| 241 |
+// Interface updated 2026-05-23 — see AGENTS.md |
|
| 242 |
+// Future UI/domain naming should prefer "change" or "observation diff" over |
|
| 243 |
+// "anomaly". Existing AnomalyRecord/AnomalyType code is legacy naming until the |
|
| 244 |
+// model replacement/refactor is implemented. |
|
| 245 |
+enum ChangeClassification: String, Codable {
|
|
| 246 |
+ case appeared |
|
| 247 |
+ case disappeared |
|
| 248 |
+ case representationChanged = "representation_changed" |
|
| 249 |
+ case consolidationLikely = "consolidation_likely" |
|
| 250 |
+ case aggregateChanged = "aggregate_changed" |
|
| 251 |
+ case uncertain |
|
| 252 |
+} |
|
| 253 |
+ |
|
| 254 |
+enum ReviewPriority: String, Codable, Comparable {
|
|
| 255 |
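+ case info, review, important |
+ // Raw-value enums get no synthesized Comparable; order follows declaration. |
+ private var rank: Int { self == .info ? 0 : (self == .review ? 1 : 2) } |
+ static func < (lhs: ReviewPriority, rhs: ReviewPriority) -> Bool { lhs.rank < rhs.rank } |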
|
| 256 |
+} |
|
| 257 |
+ |
|
| 258 |
+enum ArchiveStatus: String {
|
|
| 259 |
+ case ready, needsCapture, degraded, unknown |
|
| 260 |
+} |
|
| 261 |
+``` |
|
| 262 |
+ |
|
| 263 |
+Any model changes must be announced in this file before other agents consume them. |
|
| 264 |
+ |
|
| 265 |
+--- |
|
| 266 |
+ |
|
| 267 |
+## Handoff Process |
|
| 268 |
+ |
|
| 269 |
+When a module is ready to be consumed by another agent: |
|
| 270 |
+ |
|
| 271 |
+1. **Define the protocol** in `Services/Protocols/` (services agent) |
|
| 272 |
+2. **Implement a mock** in `Utilities/Mocks.swift` (Claude Code) |
|
| 273 |
+3. **Build UI against the mock** (Claude Code) |
|
| 274 |
+4. **Replace mock with real implementation** (services agent) |
|
| 275 |
+5. **Integration test** (tests agent) |
|
| 276 |
+ |
|
| 277 |
+This allows UI development and service development to proceed in parallel. |
|
| 278 |
+ |
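Steps 2 and 3 of the handoff can be sketched as follows. `CaptureMonitorProtocol` and `ArchiveStatus` are redeclared from this guide so the block compiles on its own; `MockCaptureMonitor` and `statusLine(for:)` are hypothetical names used only for illustration, and the timestamp is synthetic.

```swift
import Foundation

// Redeclared from this guide so the sketch compiles standalone.
enum ArchiveStatus: String {
    case ready, needsCapture, degraded, unknown
}

protocol CaptureMonitorProtocol {
    var archiveStatus: ArchiveStatus { get }
    var lastObservedAt: Date? { get }
    func captureNow() async throws
}

/// Hypothetical mock for previews and UI tests; synthetic values only.
final class MockCaptureMonitor: CaptureMonitorProtocol {
    private(set) var archiveStatus: ArchiveStatus = .needsCapture
    private(set) var lastObservedAt: Date?

    func captureNow() async throws {
        // Pretend a capture succeeded at a fixed synthetic timestamp.
        lastObservedAt = Date(timeIntervalSince1970: 1_700_000_000)
        archiveStatus = .ready
    }
}

/// UI-side code depends on the protocol, never on the concrete type.
func statusLine(for monitor: CaptureMonitorProtocol) -> String {
    switch monitor.archiveStatus {
    case .ready: return "Archive ready"
    case .needsCapture: return "Capture needed"
    case .degraded: return "Archive degraded"
    case .unknown: return "Status unknown"
    }
}
```

Because `statusLine(for:)` takes the protocol type, swapping the mock for the real services implementation in step 4 should not require UI changes.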
|
| 279 |
+--- |
|
| 280 |
+ |
|
| 281 |
+## Algorithms & Change Explanation Logic |
|
| 282 |
+ |
|
| 283 |
+The following modules involve non-trivial logic and should be reviewed carefully: |
|
| 284 |
+ |
|
| 285 |
+| Module | File | Description | |
|
| 286 |
+|--------|------|-------------| |
|
| 287 |
+| **Change Explainer** | `Services/AnomalyDetector.swift` *(legacy name)* | Classify appeared/disappeared/representation-changed records without assuming loss | |
|
| 288 |
+| **Consolidation Heuristics** | `Services/DivergenceEngine.swift` *(legacy name)* | Compare aggregates, intervals, and density to identify likely HealthKit consolidation | |
|
| 289 |
+| **Fingerprinter** | `Services/SampleFingerprinter.swift` | Record matching via sample and semantic hashes | |
|
| 290 |
+| **Snapshot Comparator** | `Services/SnapshotComparator.swift` | Diff between observations in one local device timeline | |
|
| 291 |
+| **Distribution Comparator** | `Services/SnapshotDiffService.swift` | Daily per-type distribution diff to distinguish detail thinning from aggregate change | |
|
| 292 |
+ |
|
| 293 |
+**Guidelines for algorithm modules:** |
|
| 294 |
+- Document assumptions explicitly (e.g., "HealthProbe can only preserve detail it observed") |
|
| 295 |
+- All thresholds (e.g., `age > 7 days`) must be configurable constants, not magic numbers |
|
| 296 |
+- Include unit tests for edge cases (empty observations, partial data, clock skew, consolidation-like rewrites) |
|
| 297 |
+- No UI code; return plain Swift types only |
|
| 298 |
+ |
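The guidelines above can be sketched as a small classifier. This is an illustration under stated assumptions, not the logic in `AnomalyDetector.swift` or `DivergenceEngine.swift`: `ConsolidationThresholds`, `classifyDrop`, and the default values are hypothetical, and only a subset of `ChangeClassification` is redeclared so the block stands alone.

```swift
import Foundation

// Subset of ChangeClassification from Shared Types, redeclared for a standalone sketch.
enum ChangeClassification: String, Codable {
    case disappeared
    case consolidationLikely = "consolidation_likely"
    case aggregateChanged = "aggregate_changed"
    case uncertain
}

/// Hypothetical threshold bundle; defaults are placeholders, not tuned values.
struct ConsolidationThresholds {
    var minimumRecordAge: TimeInterval = 7 * 24 * 60 * 60   // the "age > 7 days" example
    var maxAggregateDeltaPercent: Double = 1.0
}

/// Classifies a drop in record count for one data type and time window.
/// Assumption: HealthProbe can only reason about detail it observed.
func classifyDrop(
    newestMissingRecordEnd: Date,
    observedAt: Date,
    aggregateBefore: Double?,
    aggregateAfter: Double?,
    thresholds: ConsolidationThresholds = ConsolidationThresholds()
) -> ChangeClassification {
    // Partial data: without both aggregates nothing can be concluded.
    guard let before = aggregateBefore, let after = aggregateAfter else { return .uncertain }
    // Clock skew or future-dated records: refuse to guess.
    let age = observedAt.timeIntervalSince(newestMissingRecordEnd)
    guard age >= 0 else { return .uncertain }
    // Avoid dividing by zero when the earlier aggregate was empty.
    guard before != 0 else { return after == 0 ? .uncertain : .aggregateChanged }

    let deltaPercent = abs(after - before) / abs(before) * 100
    if deltaPercent > thresholds.maxAggregateDeltaPercent { return .aggregateChanged }
    return age > thresholds.minimumRecordAge ? .consolidationLikely : .uncertain
}
```

The function returns a plain enum and takes its thresholds as a parameter, so tests can cover edge cases such as missing aggregates or skewed clocks without touching UI or storage code.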
|
| 299 |
+--- |
|
| 300 |
+ |
|
| 301 |
+## Privacy Directives — All Agents |
|
| 302 |
+ |
|
| 303 |
+**Mandatory across all modules:** |
|
| 304 |
+- No credentials, API keys, tokens, or certificates in any file |
|
| 305 |
+- No personal data: names, emails, phone numbers, dates of birth |
|
| 306 |
+- No device identifiers: UDID, serial number, advertising ID, device name |
|
| 307 |
+- No account identifiers: Apple ID, iCloud account info, CloudKit record IDs |
|
| 308 |
+- No raw health values in code, tests, previews, logs, or comments |
|
| 309 |
+- No location data or patterns enabling re-identification |
|
| 310 |
+- Synthetic data only in tests and previews |
|
| 311 |
+ |
|
| 312 |
+**Clarification:** “No raw health values” applies to this repository’s contents. The app may optionally store a user's raw HealthKit samples *locally on-device* for forensic backup purposes, but such samples must never appear in source control, logs, or docs. |
|
| 313 |
+ |
|
| 314 |
+--- |
|
| 315 |
+ |
|
| 316 |
+## Communication Between Agents |
|
| 317 |
+ |
|
| 318 |
+When one agent needs to communicate a decision or change to another: |
|
| 319 |
+ |
|
| 320 |
+1. **Update this file** (`HealthProbe/Doc/00-agent-guides/AGENTS.md`) with the protocol/interface change |
|
| 321 |
+2. **Update the relevant protocol** in `Services/Protocols/` |
|
| 322 |
+3. **Add a comment** in the affected file: `// Interface updated YYYY-MM-DD — see AGENTS.md` |
|
| 323 |
+ |
|
| 324 |
+--- |
|
| 325 |
+ |
|
| 326 |
+## Current Status |
|
| 327 |
+ |
|
| 328 |
+| Module | Status | Owner | |
|
| 329 |
+|--------|--------|-------| |
|
| 330 |
+| Core Data UI/Report Cache | ⏳ Planned replacement of SwiftData | Models agent | |
|
| 331 |
+| HealthKit Integration | ✅ Done | Services agent | |
|
| 332 |
+| Snapshot Diff Service | ✅ Done | Services agent | |
|
| 333 |
+| Service Protocols | ⏳ Not started | Services agent | |
|
| 334 |
+| Change Explanation / Consolidation Heuristics | ⏳ Needs refocus | Services agent | |
|
| 335 |
+| Context Monitor | ⏳ Not started | Services agent | |
|
| 336 |
+| UI – App entry + TabView | ✅ Done | Claude Code | |
|
| 337 |
+| UI – Dashboard | ✅ Done (functional, minimal) | Claude Code | |
|
| 338 |
+| UI – Snapshots + Detail | ✅ Done | Claude Code | |
|
| 339 |
+| UI – Data Types | ✅ Done | Claude Code | |
|
| 340 |
+| UI – Settings | ✅ Done | Claude Code | |
|
| 341 |
+| Unit Tests | ⏳ Not started | Tests agent | |
|
@@ -0,0 +1,157 @@ |
||
| 1 |
+# HealthProbe - Claude/UI Agent Guide |
|
| 2 |
+ |
|
| 3 |
+## Read First |
|
| 4 |
+ |
|
| 5 |
+Before UI work, read: |
|
| 6 |
+ |
|
| 7 |
+1. [`../README.md`](../README.md) |
|
| 8 |
+2. [`../01-product/MVP-Specification.md`](../01-product/MVP-Specification.md) |
|
| 9 |
+3. [`../02-architecture/Implementation-Guide.md`](../02-architecture/Implementation-Guide.md) |
|
| 10 |
+4. this file |
|
| 11 |
+ |
|
| 12 |
+## Current UI Objective |
|
| 13 |
+ |
|
| 14 |
+HealthProbe is not an alert-first anomaly dashboard. It is a local Health DB Time Machine. |
|
| 15 |
+ |
|
| 16 |
+The UI should help the user: |
|
| 17 |
+- browse local observations over time; |
|
| 18 |
+- inspect how HealthKit-accessible data looked at a selected observation date; |
|
| 19 |
+- compare two local observations; |
|
| 20 |
+- understand appeared/disappeared/representation-changed records without assuming data loss; |
|
| 21 |
+- export selected point-in-time views and diffs; |
|
| 22 |
+- see archive/cache health and capture status. |
|
| 23 |
+ |
|
| 24 |
+## UI Ownership |
|
| 25 |
+ |
|
| 26 |
+Claude/UI owns: |
|
| 27 |
+- SwiftUI views in `HealthProbe/Views/`; |
|
| 28 |
+- view models in `HealthProbe/ViewModels/`; |
|
| 29 |
+- navigation and screen composition; |
|
| 30 |
+- visual design, accessibility, Dynamic Type, and legacy-device UI simplification; |
|
| 31 |
+- mock data for previews. |
|
| 32 |
+ |
|
| 33 |
+Claude/UI does not own: |
|
| 34 |
+- SQLite archive schema or analysis queries; |
|
| 35 |
+- Core Data cache model design or storage replacement strategy; |
|
| 36 |
+- HealthKit capture internals; |
|
| 37 |
+- recovery/salvage tooling; |
|
| 38 |
+- project entitlements or signing. |
|
| 39 |
+ |
|
| 40 |
+Cross-boundary needs should be expressed as protocols or view-ready DTOs. |
|
| 41 |
+ |
|
| 42 |
+## Target Screens |
|
| 43 |
+ |
|
| 44 |
+Current target surfaces: |
|
| 45 |
+ |
|
| 46 |
+```text |
|
| 47 |
+App |
|
| 48 |
+├── Dashboard / Capture Status |
|
| 49 |
+├── Observation Timeline |
|
| 50 |
+├── Observation Detail |
|
| 51 |
+├── Diff Detail |
|
| 52 |
+├── Data Types |
|
| 53 |
+├── Export Preview / Export History |
|
| 54 |
+├── Archive Status |
|
| 55 |
+└── Settings |
|
| 56 |
+``` |
|
| 57 |
+ |
|
| 58 |
+Legacy/low-memory devices may get simplified tables and summaries instead of heavy charts. Capture, reporting, and export remain more important than visual richness. |
|
| 59 |
+ |
|
| 60 |
+## Legacy Device UI Mode |
|
| 61 |
+ |
|
| 62 |
+Use the simplified UI when the app runs on iOS 15-era devices or very small screens, or when memory/performance instrumentation shows repeated pressure on chart/detail screens. |
|
| 63 |
+ |
|
| 64 |
+Simplify by: |
|
| 65 |
+- preferring tables and summary rows over dense charts; |
|
| 66 |
+- limiting default record previews; |
|
| 67 |
+- loading one detail surface at a time; |
|
| 68 |
+- using paged SQLite DTOs for drill-down; |
|
| 69 |
+- hiding non-essential visual comparisons behind explicit taps. |
|
| 70 |
+ |
|
| 71 |
+Do not remove: |
|
| 72 |
+- capture controls; |
|
| 73 |
+- archive health/status; |
|
| 74 |
+- cached report summaries; |
|
| 75 |
+- export flows; |
|
| 76 |
+- paged record inspection. |
|
| 77 |
+ |
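The trigger conditions for the simplified UI could be expressed as a single policy check. This is a sketch under assumptions: `DeviceProfile`, `LegacyUIPolicy`, and every threshold value are hypothetical and would need to be tuned from real instrumentation.

```swift
import Foundation

/// Hypothetical inputs gathered by the app at launch or during use.
struct DeviceProfile {
    var osMajorVersion: Int
    var screenWidthPoints: Double
    var recentMemoryWarnings: Int
}

/// Placeholder thresholds kept as named, configurable values.
struct LegacyUIPolicy {
    var maxLegacyOSMajorVersion = 15
    var minComfortableWidthPoints = 375.0
    var memoryWarningLimit = 2

    /// True when any one of the three documented conditions holds.
    func useSimplifiedUI(for device: DeviceProfile) -> Bool {
        device.osMajorVersion <= maxLegacyOSMajorVersion
            || device.screenWidthPoints < minComfortableWidthPoints
            || device.recentMemoryWarnings >= memoryWarningLimit
    }
}
```

The policy only decides how much visual detail to show; capture controls, archive status, cached summaries, export flows, and paged record inspection stay available in both modes.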
|
| 78 |
+## Language Rules |
|
| 79 |
+ |
|
| 80 |
+Prefer: |
|
| 81 |
+- "records no longer visible in this observation" |
|
| 82 |
+- "representation changed" |
|
| 83 |
+- "consolidation likely" |
|
| 84 |
+- "aggregate changed" |
|
| 85 |
+- "cause not inferred" |
|
| 86 |
+- "export selected evidence" |
|
| 87 |
+ |
|
| 88 |
+Avoid: |
|
| 89 |
+- "Apple lost your data" |
|
| 90 |
+- "critical data loss" from counts alone |
|
| 91 |
+- "sync bug" |
|
| 92 |
+- "cross-device truth" |
|
| 93 |
+- "restore from HealthProbe" |
|
| 94 |
+ |
|
| 95 |
+## Design Guidelines |
|
| 96 |
+ |
|
| 97 |
+Tone: |
|
| 98 |
+- calm, technical, evidence-oriented; |
|
| 99 |
+- no emergency language unless the user explicitly chooses an interpretation; |
|
| 100 |
+- make uncertainty visible. |
|
| 101 |
+ |
|
| 102 |
+Visual hierarchy: |
|
| 103 |
+- summary first, details on demand; |
|
| 104 |
+- paged record tables for large datasets; |
|
| 105 |
+- chart only when it is cheap and helpful; |
|
| 106 |
+- no UI that implies all data is loaded in memory. |
|
| 107 |
+ |
|
| 108 |
+Controls: |
|
| 109 |
+- segmented controls for observation/diff modes; |
|
| 110 |
+- filters for type/date/source/change classification; |
|
| 111 |
+- export buttons only for explicit user-triggered actions; |
|
| 112 |
+- destructive actions require confirmation. |
|
| 113 |
+ |
|
| 114 |
+Accessibility: |
|
| 115 |
+- support Dynamic Type; |
|
| 116 |
+- keep tables readable on small devices; |
|
| 117 |
+- provide VoiceOver labels for charts and summary cards; |
|
| 118 |
+- avoid color-only meaning. |
|
| 119 |
+ |
|
| 120 |
+## Data Access Pattern |
|
| 121 |
+ |
|
| 122 |
+Views should consume: |
|
| 123 |
+- view models; |
|
| 124 |
+- protocols; |
|
| 125 |
+- paged DTOs; |
|
| 126 |
+- Core Data cached summaries once implemented. |
|
| 127 |
+ |
|
| 128 |
+Views should not: |
|
| 129 |
+- query the large SQLite archive directly; |
|
| 130 |
+- decode full record archives into arrays; |
|
| 131 |
+- mutate HealthKit; |
|
| 132 |
+- treat SwiftData as target architecture. |
|
| 133 |
+ |
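A paged DTO boundary might look like the sketch below. The names `RecordRowDTO`, `RecordPage`, `RecordPageProviding`, and `SyntheticRecordProvider` are assumptions for illustration; the real protocol would be defined by the services agent. All values are synthetic.

```swift
import Foundation

/// Hypothetical view-ready row; values are synthetic placeholders.
struct RecordRowDTO: Equatable {
    let type: String
    let value: Double
    let sourceHash: String
    let recordHash: String
}

struct RecordPage {
    let rows: [RecordRowDTO]
    let nextOffset: Int?   // nil when there are no further pages
}

/// Hypothetical boundary a view model consumes instead of querying SQLite.
protocol RecordPageProviding {
    func page(offset: Int, limit: Int) -> RecordPage
}

/// Mock provider that fabricates rows on demand, so no full array is ever held.
struct SyntheticRecordProvider: RecordPageProviding {
    let totalCount: Int

    func page(offset: Int, limit: Int) -> RecordPage {
        guard offset >= 0, limit > 0, offset < totalCount else {
            return RecordPage(rows: [], nextOffset: nil)
        }
        let end = min(offset + limit, totalCount)
        let rows = (offset..<end).map { index in
            RecordRowDTO(type: "step_count", value: 42,
                         sourceHash: "source_hash_example",
                         recordHash: "record_hash_example_\(index)")
        }
        return RecordPage(rows: rows, nextOffset: end < totalCount ? end : nil)
    }
}
```

A view model would request the next page only when the user scrolls or taps for more, which keeps memory use bounded on legacy devices.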
|
| 134 |
+## Preview Data |
|
| 135 |
+ |
|
| 136 |
+Preview/mock data must be synthetic: |
|
| 137 |
+- no real health values; |
|
| 138 |
+- no real source names; |
|
| 139 |
+- no device identifiers; |
|
| 140 |
+- no real dates tied to a person. |
|
| 141 |
+ |
|
| 142 |
+Use examples such as: |
|
| 143 |
+- type: `step_count` |
|
| 144 |
+- value: `42` |
|
| 145 |
+- source hash: `source_hash_example` |
|
| 146 |
+- record hash: `record_hash_example` |
|
| 147 |
+ |
|
| 148 |
+## Completion Checklist |
|
| 149 |
+ |
|
| 150 |
+- [ ] UI copy follows Time Machine / observation language. |
|
| 151 |
+- [ ] No count-only critical loss messaging. |
|
| 152 |
+- [ ] Large record lists are paged or mocked as paged. |
|
| 153 |
+- [ ] Works on narrow/small-device layouts. |
|
| 154 |
+- [ ] Dark mode and Dynamic Type reviewed. |
|
| 155 |
+- [ ] VoiceOver labels exist for interactive elements. |
|
| 156 |
+- [ ] No real health values in previews/tests. |
|
| 157 |
+- [ ] No new dependency on SwiftData as target storage. |
|
@@ -0,0 +1,225 @@ |
||
| 1 |
+# HealthProbe - Risks, Limitations & Forensic Capabilities |
|
| 2 |
+ |
|
| 3 |
+**Version:** 1.5 |
|
| 4 |
+**Last Updated:** 2026-05-23 |
|
| 5 |
+ |
|
| 6 |
+## 1. Product Boundary |
|
| 7 |
+ |
|
| 8 |
+HealthProbe is a local Health DB Time Machine. It preserves selected HealthKit-accessible observations on one device, explains how those observations differ over time, and exports scoped evidence. |
|
| 9 |
+ |
|
| 10 |
+HealthProbe is not: |
|
| 11 |
+- a proof engine for Apple Health bugs |
|
| 12 |
+- a cross-device HealthKit database comparator |
|
| 13 |
+- a CloudKit/iCloud sync product |
|
| 14 |
+- a guarantee that HealthKit currently exposes all historical detail |
|
| 15 |
+- a replacement for Apple Health or Apple exports |
|
| 16 |
+- a disaster-recovery tool that mutates HealthKit or patches iOS backups |
|
| 17 |
+ |
|
| 18 |
+## 2. Known Limitations |
|
| 19 |
+ |
|
| 20 |
+### 2.1 HealthKit Framework Constraints |
|
| 21 |
+ |
|
| 22 |
+| Gap | Why | Mitigation | |
|
| 23 |
+|-----|-----|------------| |
|
| 24 |
+| **No stable raw database contract** | HealthKit exposes API objects, not a forensic SQLite dump | Store what the API exposes at each observation | |
|
| 25 |
+| **Representation changes** | Older high-frequency samples can become intervalized, thinned, or aggregated | Compare records and aggregates; label consolidation separately from loss | |
|
| 26 |
+| **Modifications without explicit change events** | HealthKit primarily reports added/deleted objects | Use observation diffs and fingerprints | |
|
| 27 |
+| **Missed background windows** | iOS can delay or skip background work | Manual capture, app-launch capture, observation timestamps | |
|
| 28 |
+| **Private Health types** | Some data is not available to third-party apps | Document inaccessible gaps | |
|
| 29 |
+| **Cross-device divergence** | Apple devices may expose different local HealthKit states | Compare only within the current device timeline | |
|
| 30 |
+ |
|
| 31 |
+### 2.2 Interpretation Limits |
|
| 32 |
+ |
|
| 33 |
+A disappeared fingerprint does not automatically mean permanent user data loss. It may mean: |
|
| 34 |
+- Apple Health consolidated older records |
|
| 35 |
+- a permission changed |
|
| 36 |
+- a source app rewrote or re-imported data |
|
| 37 |
+- HealthKit query timing changed |
|
| 38 |
+- the user deleted or edited data |
|
| 39 |
+- the app missed one or more background windows |
|
| 40 |
+ |
|
| 41 |
+HealthProbe should present evidence and uncertainty. Count-only drops must not be shown as critical alerts without supporting detail. |
|
| 42 |
+ |
|
| 43 |
+### 2.3 Data Retention Constraints |
|
| 44 |
+ |
|
| 45 |
+HealthProbe can only preserve detail after it has observed it. It cannot reconstruct records that were already aggregated or unavailable before installation. |
|
| 46 |
+ |
|
| 47 |
+The local archive can preserve selected details beyond what future HealthKit queries or Apple exports expose, but only for data types the user has enabled and granted permission to read. |
|
| 48 |
+ |
|
| 49 |
+## 3. Privacy & Security Risks |
|
| 50 |
+ |
|
| 51 |
+| Risk | Impact | Mitigation | |
|
| 52 |
+|------|--------|------------| |
|
| 53 |
+| **Raw health data exposure** | Critical | Local-only archive; explicit user exports only | |
|
| 54 |
+| **Device/source re-identification** | High | Hash or redact identifiers where possible; avoid personal data in logs/docs/tests | |
|
| 55 |
+| **Behavior inference from timestamps** | Medium | No automatic cloud sync; scoped exports | |
|
| 56 |
+| **Lost archive on uninstall/device loss** | High | Encourage explicit exports for important evidence | |
|
| 57 |
+| **Local device compromise** | High | Rely on iOS data protection; no network copy by default | |
|
| 58 |
+ |
|
| 59 |
+## 4. Questions HealthProbe Can Answer |
|
| 60 |
+ |
|
| 61 |
+### Q1: "What did my Health data look like at this observation date?" |
|
| 62 |
+ |
|
| 63 |
+Method: |
|
| 64 |
+1. Select an observation. |
|
| 65 |
+2. Load archived records visible at that observation. |
|
| 66 |
+3. Show per-type counts, date ranges, aggregates, source summaries, and record tables. |
|
| 67 |
+4. Export the selected view if needed. |
|
| 68 |
+ |
|
| 69 |
+### Q2: "What changed between these two observations?" |
|
| 70 |
+ |
|
| 71 |
+Method: |
|
| 72 |
+1. Compare adjacent or selected observations from the same local device timeline. |
|
| 73 |
+2. Group records as appeared, disappeared, retained, or representation-changed. |
|
| 74 |
+3. Compare aggregate totals and coverage windows. |
|
| 75 |
+4. Label likely consolidation when record detail changed but aggregate meaning is preserved. |
|
| 76 |
+ |
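Step 2 of this method can be sketched with set operations over record hashes. The sketch assumes each observation is available as a map from a stable record hash to a payload version hash; `ObservationDiff` and `diffObservations` are hypothetical names. It treats "same identity, different payload" as a representation change, which is a simplification: many-to-one rewrites also need the aggregate comparison from steps 3 and 4.

```swift
import Foundation

/// Hypothetical result type for grouping records between two observations.
struct ObservationDiff {
    let appeared: Set<String>
    let disappeared: Set<String>
    let retained: Set<String>
    let representationChanged: Set<String>
}

/// Each observation maps a stable record hash to a payload version hash.
func diffObservations(from older: [String: String],
                      to newer: [String: String]) -> ObservationDiff {
    let oldKeys = Set(older.keys)
    let newKeys = Set(newer.keys)
    let common = oldKeys.intersection(newKeys)
    // Same identity, different payload: counted as a representation change.
    let changed = common.filter { older[$0] != newer[$0] }
    return ObservationDiff(
        appeared: newKeys.subtracting(oldKeys),
        disappeared: oldKeys.subtracting(newKeys),
        retained: common.subtracting(changed),
        representationChanged: changed
    )
}
```

For large archives the same grouping would be done in SQL with joins rather than in memory; the sets here only show the intended semantics.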
|
| 77 |
+### Q3: "Can I export details that are no longer available from current HealthKit?" |
|
| 78 |
+ |
|
| 79 |
+Method: |
|
| 80 |
+1. Query the local archive by observation date and sample type. |
|
| 81 |
+2. Select the historical record set. |
|
| 82 |
+3. Export JSON/CSV plus manifest hashes and observation metadata. |
|
| 83 |
+ |
|
| 84 |
+### Q4: "When did this record first become visible to HealthProbe?" |
|
| 85 |
+ |
|
| 86 |
+Method: |
|
| 87 |
+1. Search the archive observation history for the record fingerprint. |
|
| 88 |
+2. Report first-seen, last-seen, last-verified, and disappeared-at timestamps. |
|
| 89 |
+3. Include context events without claiming causality. |
|
| 90 |
+ |
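The timestamps in step 2 can be derived from an ordered observation history, as in this sketch. `RecordVisibility` and `visibility(of:in:)` are hypothetical names, the history is modeled as in-memory tuples rather than SQLite rows, and last-verified is omitted because its definition is not given here.

```swift
import Foundation

/// Hypothetical summary of when a record fingerprint was visible.
struct RecordVisibility: Equatable {
    let firstSeen: Date
    let lastSeen: Date
    let disappearedAt: Date?
}

/// `observations` are (capture date, visible record hashes) pairs in any order.
func visibility(of recordHash: String,
                in observations: [(at: Date, visible: Set<String>)]) -> RecordVisibility? {
    let ordered = observations.sorted { $0.at < $1.at }
    let sightings = ordered.filter { $0.visible.contains(recordHash) }.map { $0.at }
    guard let first = sightings.first, let last = sightings.last else { return nil }
    // The first capture after the last sighting is when the record was found missing.
    let disappearedAt = ordered.first(where: { $0.at > last })?.at
    return RecordVisibility(firstSeen: first, lastSeen: last, disappearedAt: disappearedAt)
}
```

Note that `disappearedAt` is the time HealthProbe noticed the absence, not the time the record actually changed in HealthKit, which is consistent with reporting context without claiming causality.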
|
| 91 |
+### Q5: "Is a change probably consolidation rather than loss?" |
|
| 92 |
+ |
|
| 93 |
+Method: |
|
| 94 |
+1. Compare missing old records with newer interval or aggregate records over the same time window. |
|
| 95 |
+2. Check value sums, date coverage, and source metadata. |
|
| 96 |
+3. Report "consolidation likely" only when aggregate evidence supports it; otherwise report "uncertain." |
|
| 97 |
+ |
|
| 98 |
+## 5. Export Formats |
|
| 99 |
+ |
|
| 100 |
+### JSON Diff Report |
|
| 101 |
+ |
|
| 102 |
+```json |
|
| 103 |
+{
|
|
| 104 |
+ "report_id": "DIFF_SYNTHETIC_001", |
|
| 105 |
+ "exported_at": "2026-05-23T12:00:00Z", |
|
| 106 |
+ "from_observation": "2026-04-23T12:00:00Z", |
|
| 107 |
+ "to_observation": "2026-05-23T12:00:00Z", |
|
| 108 |
+ "sample_type": "HKQuantityTypeIdentifierStepCount", |
|
| 109 |
+ "summary": {
|
|
| 110 |
+ "appeared": 12, |
|
| 111 |
+ "disappeared": 84, |
|
| 112 |
+ "representation_changed": 6, |
|
| 113 |
+ "aggregate_delta_percent": 0.1, |
|
| 114 |
+ "label": "consolidation_likely" |
|
| 115 |
+ }, |
|
| 116 |
+ "manifest_hash": "synthetic-example-hash" |
|
| 117 |
+} |
|
| 118 |
+``` |
|
| 119 |
+ |
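A consumer of the JSON diff report could decode it with Foundation's `JSONDecoder`, as sketched below. `DiffReport` and `decodeDiffReport` are hypothetical names that mirror the example above; they are not a published HealthProbe schema.

```swift
import Foundation

/// Hypothetical Codable mirror of the example report.
struct DiffReport: Codable {
    struct Summary: Codable {
        let appeared: Int
        let disappeared: Int
        let representationChanged: Int
        let aggregateDeltaPercent: Double
        let label: String
    }
    let reportId: String
    let exportedAt: Date
    let fromObservation: Date
    let toObservation: Date
    let sampleType: String
    let summary: Summary
    let manifestHash: String
}

func decodeDiffReport(_ data: Data) throws -> DiffReport {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase   // report_id -> reportId
    decoder.dateDecodingStrategy = .iso8601               // e.g. "2026-05-23T12:00:00Z"
    return try decoder.decode(DiffReport.self, from: data)
}
```

Keeping `label` as a plain string lets external tools read reports that carry classification values they do not yet know about.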
|
| 120 |
+### CSV Point-In-Time Table |
|
| 121 |
+ |
|
| 122 |
+```csv |
|
| 123 |
+observation_at,type,start_date,end_date,value,unit,source_hash,record_hash |
|
| 124 |
+2026-05-23T12:00:00Z,step_count,2026-01-01T00:00:00Z,2026-01-01T00:10:00Z,42,count,source_hash_example,record_hash_example |
|
| 125 |
+``` |
|
| 126 |
+ |
|
| 127 |
+### Markdown Summary |
|
| 128 |
+ |
|
| 129 |
+```markdown |
|
| 130 |
+# HealthProbe Observation Diff |
|
| 131 |
+ |
|
| 132 |
+Observation A: 2026-04-23 12:00 UTC |
|
| 133 |
+Observation B: 2026-05-23 12:00 UTC |
|
| 134 |
+ |
|
| 135 |
+## Summary |
|
| 136 |
+- Appeared records: 12 |
|
| 137 |
+- Disappeared records: 84 |
|
| 138 |
+- Representation changes: 6 |
|
| 139 |
+- Aggregate delta: 0.1% |
|
| 140 |
+- Interpretation: consolidation likely |
|
| 141 |
+ |
|
| 142 |
+## Notes |
|
| 143 |
+This report describes local HealthKit observations from one device. |
|
| 144 |
+It does not infer cause or compare against another device as a source of truth. |
|
| 145 |
+``` |
|
| 146 |
+ |
|
| 147 |
+## 6. Forensic Techniques Enabled |
|
| 148 |
+ |
|
| 149 |
+**Timeline reconstruction:** Walk observation history and rebuild visible records for a selected date. |
|
| 150 |
+ |
|
| 151 |
+**Representation-change analysis:** Compare record counts, intervals, value sums, value maxima, and coverage windows to distinguish detail thinning from meaningful aggregate drift. |
|
| 152 |
+ |
|
| 153 |
+**Archive-backed export:** Export data from HealthProbe's local archive even when a current HealthKit query no longer exposes the same record-level detail. |
|
| 154 |
+ |
|
| 155 |
+**Context correlation:** Place local changes near iOS/app/permission/iCloud-state events while avoiding causal claims. |
|
| 156 |
+ |
|
| 157 |
+## 7. Recovery-Compatible Export Boundary |
|
| 158 |
+ |
|
| 159 |
+External tools may use HealthProbe exports to: |
|
| 160 |
+- inspect what HealthProbe observed at a point in time; |
|
| 161 |
+- compare exported records with backup/XML/database evidence; |
|
| 162 |
+- build salvage workflows outside the app; |
|
| 163 |
+- re-publish values into another system with explicit user consent. |
|
| 164 |
+ |
|
| 165 |
+HealthProbe exports should therefore preserve: |
|
| 166 |
+- values, dates, units, and type identifiers; |
|
| 167 |
+- stable hashes and payload version hashes; |
|
| 168 |
+- source/provenance hashes where available; |
|
| 169 |
+- relationships where available and in scope; |
|
| 170 |
+- observation history and manifest/item hashes. |
|
| 171 |
+ |
|
| 172 |
+HealthProbe exports cannot guarantee: |
|
| 173 |
+- original Apple/private database primary keys; |
|
| 174 |
+- original HealthKit metadata that was never exposed to the app; |
|
| 175 |
+- proof that a disappeared record was deleted by Apple; |
|
| 176 |
+- lossless re-publication into HealthKit with original provenance. |
|
| 177 |
+ |
|
| 178 |
+The iOS app remains read-only. Any backup transplant, HealthKit re-publication, or disaster-recovery procedure is external tooling, not an in-app feature. |
|
| 179 |
+ |
|
| 180 |
+## 8. Recommended Usage |
|
| 181 |
+ |
|
| 182 |
+### Individual Users |
|
| 183 |
+ |
|
| 184 |
+1. Enable the data types that matter most. |
|
| 185 |
+2. Let HealthProbe build an initial archive. |
|
| 186 |
+3. Capture manually before/after OS updates, restores, migrations, or major app changes. |
|
| 187 |
+4. Use the timeline to inspect old observations. |
|
| 188 |
+5. Export selected views when a detail set matters. |
|
| 189 |
+ |
|
| 190 |
+### Researchers And Support |
|
| 191 |
+ |
|
| 192 |
+1. Work from scoped exports, not raw device databases. |
|
| 193 |
+2. Treat HealthProbe evidence as local observation history. |
|
| 194 |
+3. Avoid claiming root cause without additional Apple/system evidence. |
|
| 195 |
+4. Prefer cause-neutral language: changed, appeared, disappeared, consolidated, uncertain. |
|
| 196 |
+ |
|
| 197 |
+## 9. Future Enhancements |
|
| 198 |
+ |
|
| 199 |
+- richer point-in-time query tools |
|
| 200 |
+- improved consolidation heuristics |
|
| 201 |
+- archive integrity audits and repair workflows |
|
| 202 |
+- optional encrypted archive copy chosen by the user |
|
| 203 |
+- macOS analysis of explicit HealthProbe exports |
|
| 204 |
+- recovery-compatible archive/export manifests that external tools can use as input, without adding restore or re-publication features to the app |
|
| 205 |
+ |
|
| 206 |
+## 10. Troubleshooting |
|
| 207 |
+ |
|
| 208 |
+| Issue | Cause | Fix | |
|
| 209 |
+|-------|-------|-----| |
|
| 210 |
+| **No recent observations** | Background refresh disabled or app not opened | Open app and run manual capture | |
|
| 211 |
+| **Some types missing** | HealthKit permission not granted or type unavailable | Review permissions and selected types | |
|
| 212 |
+| **Large count drop shown** | Possible consolidation or query/permission change | Inspect diff and aggregate evidence before interpreting | |
|
| 213 |
+| **Old detail unavailable** | HealthProbe did not observe it before aggregation | Only future observations can be preserved | |
|
| 214 |
+| **Export too large** | High-frequency type selected over long interval | Narrow date/type filter | |
|
| 215 |
+ |
|
| 216 |
+## 11. References |
|
| 217 |
+ |
|
| 218 |
+- Apple HealthKit Framework: https://developer.apple.com/documentation/healthkit/ |
|
| 219 |
+- HKAnchoredObjectQuery: https://developer.apple.com/documentation/healthkit/hkanchoredobjectquery |
|
| 220 |
+- Core Data: https://developer.apple.com/documentation/coredata |
|
| 221 |
+- DearApple Issue #001: historical context for reported Apple Health data anomalies |
|
| 222 |
+ |
|
| 223 |
+--- |
|
| 224 |
+ |
|
| 225 |
+*HealthProbe - A local time machine for HealthKit-accessible data.* |
|
@@ -0,0 +1,154 @@ |
||
| 1 |
+# HealthProbe iOS - Specification (MVP) |
|
| 2 |
+ |
|
| 3 |
+**Version:** 1.5 |
|
| 4 |
+**Last Updated:** 2026-05-23 |
|
| 5 |
+**Status:** MVP iOS local Health DB Time Machine |
|
| 6 |
+ |
|
| 7 |
+## Overview |
|
| 8 |
+ |
|
| 9 |
+HealthProbe is a read-only iOS app that captures selected HealthKit data into a local archive so the user can revisit how their Health database looked at earlier observation dates. |
|
| 10 |
+ |
|
| 11 |
+The MVP is not a cloud sync product and not a cross-device comparator. It is a single-device local timeline: |
|
| 12 |
+- capture what HealthKit exposes today |
|
| 13 |
+- preserve selected details before they are aggregated, consolidated, or no longer queryable |
|
| 14 |
+- show what changed between local observations |
|
| 15 |
+- export selected historical views and record tables |
|
| 16 |
+ |
|
| 17 |
+The original motivation was data-loss detection. The current objective is broader and calmer: help the user understand the evolution of their local Health database over time. A record-count drop is a change to explain, not automatically an emergency. |
|
| 18 |
+ |
|
| 19 |
+## Core Principles |
|
| 20 |
+ |
|
| 21 |
+1. **Read-only with respect to HealthKit** |
|
| 22 |
+ - Never modify or delete HealthKit data |
|
| 23 |
+ - Only observe, archive, compare, and export |
|
| 24 |
+ |
|
| 25 |
+2. **Single-device local timeline** |
|
| 26 |
+ - Compare only observations captured on this device |
|
| 27 |
+ - Do not infer correctness by comparing HealthKit databases from different devices |
|
| 28 |
+ |
|
| 29 |
+3. **Local-first storage** |
|
| 30 |
+ - HealthProbe data works without network access |
|
| 31 |
+ - No HealthProbe CloudKit/iCloud sync for raw samples, digests, reports, or caches |
|
| 32 |
+ |
|
| 33 |
+4. **Archive-first design** |
|
| 34 |
+ - The local archive is the source of truth |
|
| 35 |
+ - SQLite stores differential observations and performs large analyses |
|
| 36 |
+ - Core Data stores UI/report/cache/settings/history data that can be rebuilt from the archive |
|
| 37 |
+ |
|
| 38 |
+5. **Legacy-device support** |
|
| 39 |
+ - Keep iOS 15-era Health collection devices in scope |
|
| 40 |
+ - Do not require SwiftData for target architecture |
|
| 41 |
+ - Simplify heavy visualizations on legacy devices while preserving capture, reporting, and export |
|
| 42 |
+ |
|
| 43 |
+6. **Consolidation-aware interpretation** |
|
| 44 |
+ - Apple Health may aggregate or rewrite older high-frequency records |
|
| 45 |
+ - Count-only alerts are not reliable evidence of loss |
|
| 46 |
+ - UI language should describe additions, removals, consolidation, and uncertainty |
|
| 47 |
+ |
|
| 48 |
+7. **Differential storage** |
|
| 49 |
+ - Do not append complete periodic snapshots for large datasets |
|
| 50 |
+ - Store sample identities, payload versions, observation events/ranges, and aggregates |
|
| 51 |
+ |
|
| 52 |
+## MVP Features |
|
| 53 |
+ |
|
| 54 |
+### 1. Local HealthKit Capture |
|
| 55 |
+ |
|
| 56 |
+Use: |
|
| 57 |
+- `HKAnchoredObjectQuery` for incremental capture |
|
| 58 |
+- `HKObserverQuery` as a prompt to refresh when iOS wakes the app |
|
| 59 |
+- manual capture from the app UI |
|
| 60 |
+ |
|
| 61 |
+Track initially: |
|
| 62 |
+- workouts |
|
| 63 |
+- steps and activity quantities |
|
| 64 |
+- heart rate and other high-frequency quantities selected by the user |
|
| 65 |
+- sleep and relevant category samples |
|
| 66 |
+- additional types through a selected-type registry |
|
| 67 |
+ |
|
| 68 |
+Persist differentially in the local SQLite archive where HealthKit exposes it: |
|
| 69 |
+- sample UUID hash, type, start/end date, value, and unit |
|
| 70 |
+- source and source revision metadata |
|
| 71 |
+- HealthKit metadata dictionaries |
|
| 72 |
+- device provenance exposed by HealthKit, subject to privacy redaction/hash policy |
|
| 73 |
+- first-seen, last-seen, last-verified, disappeared-at observations |
|
| 74 |
+- fingerprints for internal matching and explicit user exports |
|
| 75 |
+- materialized per-type/per-day aggregates needed by reports and presentation |
|
| 76 |
+ |
|
| 77 |
+### 2. Health DB Time Machine |
|
| 78 |
+ |
|
| 79 |
+The app must let the user answer: |
|
| 80 |
+- "What did my accessible HealthKit data look like on this observation date?" |
|
| 81 |
+- "What changed between these two observations?" |
|
| 82 |
+- "Which older details can HealthProbe still export even if HealthKit later consolidates them?" |
|
| 83 |
+ |
|
| 84 |
+Core views: |
|
| 85 |
+- observation timeline |
|
| 86 |
+- point-in-time summary |
|
| 87 |
+- per-type detail table |
|
| 88 |
+- adjacent-observation diff |
|
| 89 |
+- selected-record export preview |
|
| 90 |
+ |
|
| 91 |
+On legacy/low-memory devices, heavy charts may be replaced with tables, summaries, and generated reports. Export/report correctness is more important than rich visualization. |
|
| 92 |
+ |
|
| 93 |
+### 3. Change Explanation |
|
| 94 |
+ |
|
| 95 |
+Changes are classified as observations, not accusations: |
|
| 96 |
+- **Appeared:** record/fingerprint was absent before and present now |
|
| 97 |
+- **Disappeared:** record/fingerprint was present before and absent now |
|
| 98 |
+- **Changed representation:** aggregates remain similar but record granularity, intervals, timestamps, or values changed |
|
| 99 |
+- **Consolidation likely:** old high-frequency detail appears thinned or intervalized while aggregate totals remain explainable |
|
| 100 |
+- **Uncertain:** HealthKit/API/background constraints prevent confident classification |
|
| 101 |
+ |
|
| 102 |
+Severity should be used sparingly. The MVP should prefer neutral labels and evidence summaries over alarm language. |
|
| 103 |
+ |
|
| 104 |
+### 4. Exports |
|
| 105 |
+ |
|
| 106 |
+Exports are a primary product feature. They preserve selected historical evidence and support external analysis. |
|
| 107 |
+ |
|
| 108 |
+MVP exports: |
|
| 109 |
+- selected point-in-time table as JSON/CSV |
|
| 110 |
+- diff report between two observations |
|
| 111 |
+- manifest with hashes and observation metadata |
|
| 112 |
+- selected disappeared/appeared/changed records |
|
| 113 |
+ |
|
| 114 |
+Routine full-database export is not an MVP goal. The archive itself is the local backup; user-facing exports are scoped to what the user is inspecting. |
|
| 115 |
+ |
|
| 116 |
+Exports must stream or page from SQLite. The app must not materialize large high-frequency record sets in RAM before writing an export. |
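
The paging requirement above can be sketched as follows. This is a minimal illustration, not the app's export code: `ExportRow`, `streamCSV`, and the `fetchPage` closure are hypothetical names, and the in-memory array stands in for a paged query against the SQLite archive. Fields are assumed to contain no commas or quotes, so no CSV escaping is shown.

```swift
import Foundation

// Hypothetical row shape; a real export would carry more columns (hashes, source, metadata).
struct ExportRow {
    let typeIdentifier: String
    let startDate: String
    let value: Double
    let unit: String
}

/// Writes rows as CSV one page at a time, so at most `pageSize` rows are held in memory.
/// `fetchPage` is an assumed stand-in for a LIMIT/OFFSET (or keyset) query on the archive.
/// Returns the number of rows written.
@discardableResult
func streamCSV(pageSize: Int,
               fetchPage: (_ offset: Int, _ limit: Int) -> [ExportRow],
               write: (String) -> Void) -> Int {
    precondition(pageSize > 0, "pageSize must be positive")
    write("type,start,value,unit\n")
    var offset = 0
    while true {
        let page = fetchPage(offset, pageSize)
        if page.isEmpty { break }
        for row in page {
            write("\(row.typeIdentifier),\(row.startDate),\(row.value),\(row.unit)\n")
        }
        offset += page.count
        if page.count < pageSize { break }  // a short page means the result set is exhausted
    }
    return offset
}

// Usage with an in-memory stand-in for the archive.
let allRows = (0..<5).map {
    ExportRow(typeIdentifier: "HKQuantityTypeIdentifierStepCount",
              startDate: "2026-05-0\($0 + 1)",
              value: Double($0 * 100),
              unit: "count")
}
var csv = ""
var largestPage = 0
let written = streamCSV(pageSize: 2, fetchPage: { offset, limit in
    let page = Array(allRows.dropFirst(offset).prefix(limit))
    largestPage = max(largestPage, page.count)
    return page
}, write: { csv += $0 })
print("rows written: \(written), largest page: \(largestPage)")
```

In a real export the `write` closure would append to a file handle rather than a string, so neither the rows nor the output accumulate in RAM.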
|
| 117 |
+ |
|
| 118 |
+### 5. Context Logging |
|
| 119 |
+ |
|
| 120 |
+Health/iCloud state may be logged as context only: |
|
| 121 |
+- HealthKit permission changes |
|
| 122 |
+- app version / iOS version |
|
| 123 |
+- local capture start/end/failure |
|
| 124 |
+- iCloud sign-in state if available |
|
| 125 |
+ |
|
| 126 |
+Context logging must not become HealthProbe cloud sync and must not imply that iCloud state proves why a change happened. |
|
| 127 |
+ |
|
| 128 |
+## Out Of Scope |
|
| 129 |
+ |
|
| 130 |
+- HealthProbe CloudKit/iCloud sync |
|
| 131 |
+- comparing snapshots from different devices |
|
| 132 |
+- claiming Apple lost data solely because sample counts changed |
|
| 133 |
+- predicting future loss |
|
| 134 |
+- modifying HealthKit data |
|
| 135 |
+- restoring by patching/transplanting Health database files into iOS backups |
|
| 136 |
+- re-publishing archived samples into HealthKit as HealthProbe-owned replacement records |
|
| 137 |
+- automatic upload or community reporting |
|
| 138 |
+ |
|
| 139 |
+The archive/export format should still preserve enough structure for external recovery tools to use it as input. |
|
| 140 |
+ |
|
| 141 |
+## Success Criteria |
|
| 142 |
+ |
|
| 143 |
+| Objective | MVP Target | |
|
| 144 |
+|-----------|------------| |
|
| 145 |
+| Historical inspection | User can select an observation and inspect per-type counts, ranges, and records | |
|
| 146 |
+| Change explanation | User can compare adjacent observations with neutral, consolidation-aware labels | |
|
| 147 |
+| Preservation | User can export selected records/details that may no longer be available from HealthKit later | |
|
| 148 |
+| Legacy support | Capture/report/export works on iOS 15-era devices with simplified visuals where needed | |
|
| 149 |
+| Privacy | All HealthProbe data remains local unless the user explicitly exports a file | |
|
| 150 |
+| Performance | High-frequency capture streams to the archive without blocking UI | |
|
| 151 |
+ |
|
| 152 |
+--- |
|
| 153 |
+ |
|
| 154 |
+*HealthProbe iOS: a local time machine for HealthKit-accessible data.* |
|
@@ -0,0 +1,510 @@ |
||
| 1 |
+# HealthProbe – Complete Specification & Motivations |
|
| 2 |
+ |
|
| 3 |
+**Version:** 1.5 |
|
| 4 |
+**Status:** MVP (iOS local Health DB Time Machine) |
|
| 5 |
+**Last Updated:** 2026-05-23 |
|
| 6 |
+ |
|
| 7 |
+--- |
|
| 8 |
+ |
|
| 9 |
+## 1. Executive Summary |
|
| 10 |
+ |
|
| 11 |
+HealthProbe is a **local time machine for Apple HealthKit-accessible data**. It captures selected local HealthKit observations over time so the user can inspect how their Health database looked at a chosen date, understand what changed, and export details that may later be unavailable after aggregation, consolidation, pruning, or API limitations. |
|
| 12 |
+ |
|
| 13 |
+**Core Problem:** Apple Health is not a stable forensic record store. The same user-visible health history may be represented differently over time: high-frequency samples can be aggregated, old records can become intervalized, exports can lose detail, and different Apple devices may expose different local HealthKit database states. A count drop is therefore not reliable proof of loss. |
|
| 14 |
+ |
|
| 15 |
+**Solution:** HealthProbe incrementally captures selected HealthKit data into a robust local SQLite archive and presents a single-device observation timeline. The archive is differential and analysis-capable: it stores observations, sample identities, payload versions, visibility ranges/events, and materialized aggregates rather than recurring complete snapshot copies. Core Data is the target UI/reporting cache for expensive counts, summaries, report metadata, and display state; it is rebuildable and not the source of truth. |
|
| 16 |
+ |
|
| 17 |
+--- |
|
| 18 |
+ |
|
| 19 |
+## 2. Motivation & Product History |
|
| 20 |
+ |
|
| 21 |
+### 2.1 Original Trigger |
|
| 22 |
+ |
|
| 23 |
+HealthProbe started after a user-observed mass disappearance of Apple Health detail. The first idea was simple: count records, compare snapshots, and warn the user when Apple Health appeared to "lose" data. |
|
| 24 |
+ |
|
| 25 |
+That framing was useful for discovery but incomplete. |
|
| 26 |
+ |
|
| 27 |
+### 2.2 What Changed |
|
| 28 |
+ |
|
| 29 |
+Further observation showed that older HealthKit data can change representation without necessarily representing user-meaningful loss: |
|
| 30 |
+- high-frequency samples may be thinned |
|
| 31 |
+- old point samples may become interval samples |
|
| 32 |
+- per-record values may change while daily/monthly aggregates remain explainable |
|
| 33 |
+- HealthKit exports taken months apart may not contain the same record-level detail |
|
| 34 |
+- different devices can expose different local HealthKit database states |
|
| 35 |
+ |
|
| 36 |
+Because of this, record-by-record cross-device comparison is out of scope and count-only alerts create false alarms. |
|
| 37 |
+ |
|
| 38 |
+### 2.3 Current Objective |
|
| 39 |
+ |
|
| 40 |
+HealthProbe is now a single-device local Health DB Time Machine: |
|
| 41 |
+- capture selected HealthKit-accessible data as it exists at observation time |
|
| 42 |
+- reconstruct how the local Health database looked at a chosen date |
|
| 43 |
+- show additions, removals, representation changes, and aggregate changes between observations |
|
| 44 |
+- preserve local evidence that HealthKit may later aggregate or no longer export |
|
| 45 |
+- export scoped historical views for personal backup, support, research, or external analysis |
|
| 46 |
+ |
|
| 47 |
+### 2.4 Interpretation Model |
|
| 48 |
+ |
|
| 49 |
+HealthProbe describes changes neutrally: |
|
| 50 |
+- **Appeared:** a record/fingerprint is newly visible |
|
| 51 |
+- **Disappeared:** a record/fingerprint is no longer visible |
|
| 52 |
+- **Representation changed:** timestamps, intervals, values, or granularity changed |
|
| 53 |
+- **Consolidation likely:** record detail decreased while aggregates remain explainable |
|
| 54 |
+- **Uncertain:** available evidence cannot distinguish user action, HealthKit behavior, app permissions, background timing, or system state |
|
| 55 |
+ |
|
| 56 |
+These classifications are evidence labels, not claims about Apple's intent or definitive proof of corruption. |
|
| 57 |
+ |
|
| 58 |
+### 2.5 Why This Matters |
|
| 59 |
+ |
|
| 60 |
+| Concern | Impact | HealthProbe Role | |
|
| 61 |
+|---------|--------|-----------------| |
|
| 62 |
+| **Historical detail is unstable** | Older HealthKit detail may later be aggregated or unavailable | Preserve selected observations locally | |
|
| 63 |
+| **Counts mislead** | A sample-count drop may be consolidation, not loss | Explain changes with record and aggregate evidence | |
|
| 64 |
+| **No built-in time travel** | Health.app shows current state, not prior database states | Provide local point-in-time views | |
|
| 65 |
+| **Exports are time-sensitive** | Future exports may not contain old detail | Export selected evidence while it exists locally | |
|
| 66 |
+| **Privacy of monitoring** | Health data is sensitive | Local-only archive, explicit user exports | |
|
| 67 |
+ |
|
| 68 |
+--- |
|
| 69 |
+ |
|
| 70 |
+## 3. Core Architecture |
|
| 71 |
+ |
|
| 72 |
+### 3.1 Design Principles |
|
| 73 |
+ |
|
| 74 |
+1. **Read-only operations** (never modify HealthKit data) |
|
| 75 |
+2. **Local-first** (full functionality without network) |
|
| 76 |
+3. **Incremental queries** (efficient, avoid repeating work) |
|
| 77 |
+4. **Single archive store** (do not split the forensic store per data type; cross-type relationships and shared metadata matter) |
|
| 78 |
+5. **Auditability** (every observation logged, timestamped, reproducible) |
|
| 79 |
+6. **Privacy by default** (no HealthProbe cloud sync; local storage remains under user control) |
|
| 80 |
+7. **Time-machine capture** (selected data types are archived locally so prior HealthKit-accessible states can be revisited) |
|
| 81 |
+8. **Single-device timeline** (snapshot comparisons stay within the local device chain; cross-device record comparison is not a product goal) |
|
| 82 |
+9. **Consolidation-aware explanations** (record-count changes are described with uncertainty and aggregate context) |
|
| 83 |
+10. **Legacy device support** (target iOS 15-era Health collection devices; avoid SwiftData as a required foundation) |
|
| 84 |
+11. **SQL-first analysis** (large diffs and reports run inside SQLite using indexes, temporary tables, joins, and paged results) |
|
| 85 |
+ |
|
| 86 |
+### 3.2 Threading Model |
|
| 87 |
+ |
|
| 88 |
+``` |
|
| 89 |
+┌─────────────────────────────────────────┐ |
|
| 90 |
+│ Main Thread (UI) │ |
|
| 91 |
+│ - Display archive and capture status │ |
|
| 92 |
+│ - Show timeline, diffs, exports │ |
|
| 93 |
+│ - User interaction │ |
|
| 94 |
+└──────────────┬──────────────────────────┘ |
|
| 95 |
+ │ |
|
| 96 |
+ ├─ Delegate query results |
|
| 97 |
+ │ |
|
| 98 |
+┌──────────────▼──────────────────────────┐ |
|
| 99 |
+│ Background Queue (HealthKit Queries) │ |
|
| 100 |
+│ - HKAnchoredObjectQuery (efficient) │ |
|
| 101 |
+│ - HKObserverQuery (reactive) │ |
|
| 102 |
+│ - Observation comparisons │ |
|
| 103 |
+│ - Change explanation logic │ |
|
| 104 |
+└──────────────┬──────────────────────────┘ |
|
| 105 |
+ │ |
|
| 106 |
+ ├─ Write observations and change summaries |
|
| 107 |
+ │ |
|
| 108 |
+┌──────────────▼──────────────────────────┐ |
|
| 109 |
+│ Local Archive Store │ |
|
| 110 |
+│ - Canonical HealthKit samples │ |
|
| 111 |
+│ - Sources, devices, metadata │ |
|
| 112 |
+│ - Cross-type relationships │ |
|
| 113 |
+│ - Fingerprints and verification hashes │ |
|
| 114 |
+└──────────────┬──────────────────────────┘ |
|
| 115 |
+ │ |
|
| 116 |
+┌──────────────▼──────────────────────────┐ |
|
| 117 |
+│ Core Data UI/Report Cache │ |
|
| 118 |
+│ - Precomputed counts/statistics │ |
|
| 119 |
+│ - Visualization state and settings │ |
|
| 120 |
+│ - Logs, history, report indexes │ |
|
| 121 |
+└─────────────────────────────────────────┘ |
|
| 122 |
+``` |
|
| 123 |
+ |
|
| 124 |
+### 3.3 Storage Model |
|
| 125 |
+ |
|
| 126 |
+**SQLite Archive Store (source of truth):** |
|
| 127 |
+- one robust local database for all archived samples and analysis, not one archive per data type |
|
| 128 |
+- differential observation storage, not recurring complete snapshot copies |
|
| 129 |
+- normalized entities for samples, sample payload versions, workouts, sources, source revisions, devices, metadata, relationships, and observations |
|
| 130 |
+- multiple fingerprints per sample: HealthKit UUID hash, strict fingerprint, semantic fingerprint, and fuzzy matching keys for export/backup reconciliation |
|
| 131 |
+- append-only observation history (`firstSeen`, `lastSeen`, `lastVerified`, disappearance evidence) |
|
| 132 |
+- visibility ranges/events so point-in-time reconstruction can be queried without duplicating every record per observation |
|
| 133 |
+- materialized aggregates for expensive counts and report inputs |
|
| 134 |
+- snapshot/observation-level and table-level hashes for integrity checks |
|
| 135 |
+- SQL-first analysis using indexes, temporary tables, joins, CTEs, and streaming/paged result sets |
|
| 136 |
+ |
|
| 137 |
+**Core Data UI/Report Cache (derived/cache layer):** |
|
| 138 |
+- settings and selected data types |
|
| 139 |
+- import job state and progress |
|
| 140 |
+- precomputed counts, temporal bins, display ranges, and summary statistics |
|
| 141 |
+- audit log entries and report indexes |
|
| 142 |
+- change summaries and links into the archive store |
|
| 143 |
+ |
|
| 144 |
+Core Data cache rows must be rebuildable from the local archive store. If the two disagree, the SQLite archive wins. Current SwiftData models are legacy/prototype implementation details until this cache layer replaces them. |
|
| 145 |
+ |
|
| 146 |
+There are no real deployments, only test installations. Existing prototype stores are disposable during the archive v2 refactor and do not require backward-compatible migration. |
|
| 147 |
+ |
|
| 148 |
+### 3.4 Storage Architecture Decision |
|
| 149 |
+ |
|
| 150 |
+HealthProbe keeps the database on-device. The current objectives do not require a server or external analytical engine for the iOS app. SQLite is the durable archive and analysis engine because it supports the operations needed for large local Health datasets: indexes, temporary tables, joins, CTEs, transactions, and paged/streaming reads. |
|
| 151 |
+ |
|
| 152 |
+Core Data is the right destination for cached counts and UI/reporting state because: |
|
| 153 |
+- it supports legacy iOS versions that SwiftData does not; |
|
| 154 |
+- it is well-suited for bounded object graphs and presentation-ready summaries; |
|
| 155 |
+- cached rows can be deleted and rebuilt from SQLite; |
|
| 156 |
+- expensive counts can be persisted without forcing every screen/report to recalculate them. |
|
| 157 |
+ |
|
| 158 |
+Core Data is not the right place for heavy archive analysis. Diffs across large observations, export selection, consolidation heuristics over high-frequency records, and record table pagination should run against SQLite and return bounded result pages or materialized summary rows. |
|
| 159 |
+ |
|
| 160 |
+The target storage split is therefore: |
|
| 161 |
+- `HealthProbeArchive.sqlite`: source of truth, differential observation storage, SQL analysis, export source; |
|
| 162 |
+- Core Data cache store: rebuildable UI/report summaries, expensive counts, timeline rows, progress, settings, and export metadata. |
|
| 163 |
+ |
|
| 164 |
+This resolves the product tension: the app remains usable on older Health collection devices while still allowing large-dataset analysis without loading entire snapshots into RAM. |
|
| 165 |
+ |
|
| 166 |
+--- |
|
| 167 |
+ |
|
| 168 |
+## 4. Time Machine Features (MVP) |
|
| 169 |
+ |
|
| 170 |
+### 4.1 Incremental Capture |
|
| 171 |
+ |
|
| 172 |
+**Using `HKAnchoredObjectQuery`:** |
|
| 173 |
+``` |
|
| 174 |
+Query pattern: |
|
| 175 |
+├─ Initial query: anchor = nil (no stored anchor) → captures all existing data |
|
| 176 |
+├─ Store anchor locally |
|
| 177 |
+├─ Periodic queries: anchor = stored → captures only new/modified samples |
|
| 178 |
+└─ Update anchor → efficient incremental updates |
|
| 179 |
+``` |
|
| 180 |
+ |
|
| 181 |
+**What triggers capture:** |
|
| 182 |
+- App launch |
|
| 183 |
+- Background refresh (iOS schedules it opportunistically; timing is not guaranteed) |
|
| 184 |
+- User manually triggers capture |
|
| 185 |
+- Every 12-24 hours (configurable) |
|
| 186 |
+ |
|
| 187 |
+### 4.2 Tracked Sample Types (Extensible) |
|
| 188 |
+ |
|
| 189 |
+| Type | Why Captured | Change Signal | |
|
| 190 |
+|------|---------------|----------------| |
|
| 191 |
+| **Workouts** | High-value user records | Appeared/disappeared records, metadata changes | |
|
| 192 |
+| **Heart Rate** | High-frequency detail likely to be consolidated | Granularity changes, intervalization, aggregate drift | |
|
| 193 |
+| **Activity Summary** | Auto-computed, depends on other types | Recalculation between observations | |
|
| 194 |
+| **Steps** | Cumulative and often consolidated | Aggregate preservation vs record thinning | |
|
| 195 |
+| **Sleep** | Frequently edited and reclassified | Stage/category representation changes | |
|
| 196 |
+| **Blood Pressure** | Manual/clinical-style records | Point-in-time history and export preservation | |
|
| 197 |
+| **Audio Exposure** | Often high-frequency/device-specific | Detail retention and later aggregation | |
|
| 198 |
+ |
|
| 199 |
+### 4.3 Change Explanation Logic |
|
| 200 |
+ |
|
| 201 |
+#### A. Point-In-Time Reconstruction |
|
| 202 |
+``` |
|
| 203 |
+Input: observation timestamp T and selected sample type |
|
| 204 |
+Use: archived records whose observation history makes them visible at T |
|
| 205 |
+Output: table, aggregates, source breakdown, and manifest hash |
|
| 206 |
+``` |
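
A minimal sketch of the reconstruction rule above, assuming each archived sample carries a first-seen timestamp and an optional disappearance timestamp. `VisibilityRange` and `visibleSampleHashes` are hypothetical names, and the spec expects this to run as a SQL query over the archive; the in-memory form only illustrates the visibility predicate.

```swift
import Foundation

// Hypothetical shape of one archived visibility range; field names are illustrative.
struct VisibilityRange {
    let sampleHash: String
    let firstSeen: Date       // first observation at which the sample was visible
    let disappearedAt: Date?  // first observation at which it was no longer visible, if any
}

/// Returns the hashes of samples visible at observation time `t`:
/// first seen at or before `t`, and not yet recorded as disappeared at `t`.
func visibleSampleHashes(at t: Date, in ranges: [VisibilityRange]) -> Set<String> {
    var visible = Set<String>()
    for range in ranges where range.firstSeen <= t {
        if let gone = range.disappearedAt, gone <= t { continue }
        visible.insert(range.sampleHash)
    }
    return visible
}

// Usage: three samples with different visibility histories.
let day: TimeInterval = 86_400
let t0 = Date(timeIntervalSince1970: 0)
let ranges = [
    VisibilityRange(sampleHash: "a", firstSeen: t0, disappearedAt: nil),
    VisibilityRange(sampleHash: "b", firstSeen: t0, disappearedAt: t0.addingTimeInterval(2 * day)),
    VisibilityRange(sampleHash: "c", firstSeen: t0.addingTimeInterval(3 * day), disappearedAt: nil),
]
let visibleOnDay1 = visibleSampleHashes(at: t0.addingTimeInterval(1 * day), in: ranges)
let visibleOnDay4 = visibleSampleHashes(at: t0.addingTimeInterval(4 * day), in: ranges)
print(visibleOnDay1.sorted(), visibleOnDay4.sorted())
```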
|
| 207 |
+ |
|
| 208 |
+#### B. Adjacent Observation Diff |
|
| 209 |
+``` |
|
| 210 |
+Previous observation: S_prev |
|
| 211 |
+Current observation: S_now |
|
| 212 |
+ |
|
| 213 |
+Appeared = S_now - S_prev |
|
| 214 |
+Disappeared = S_prev - S_now |
|
| 215 |
+Retained = S_now ∩ S_prev |
|
| 216 |
+ |
|
| 217 |
+For retained semantic groups: |
|
| 218 |
+ compare record count, interval length, value sum, value max, and source metadata |
|
| 219 |
+``` |
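
The set arithmetic above maps directly onto Swift's `Set` operations. The sketch below uses plain strings as fingerprints; `ObservationDiff` and `diffObservations` are hypothetical names. For large observations the spec calls for running the equivalent comparison inside SQLite, so this in-memory form is for illustration only.

```swift
// Hypothetical result type; fingerprints are plain strings here for illustration.
struct ObservationDiff {
    let appeared: Set<String>
    let disappeared: Set<String>
    let retained: Set<String>
}

/// Computes Appeared = S_now - S_prev, Disappeared = S_prev - S_now, Retained = S_now ∩ S_prev.
func diffObservations(previous: Set<String>, current: Set<String>) -> ObservationDiff {
    return ObservationDiff(
        appeared: current.subtracting(previous),
        disappeared: previous.subtracting(current),
        retained: current.intersection(previous)
    )
}

// Usage: one fingerprint disappears, one appears, two are retained.
let sPrev: Set<String> = ["fp1", "fp2", "fp3"]
let sNow: Set<String> = ["fp2", "fp3", "fp4"]
let observationDiff = diffObservations(previous: sPrev, current: sNow)
print(observationDiff.appeared, observationDiff.disappeared, observationDiff.retained.sorted())
```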
|
| 220 |
+ |
|
| 221 |
+#### C. Consolidation Heuristic |
|
| 222 |
+``` |
|
| 223 |
+IF old high-frequency records disappear |
|
| 224 |
+AND newer interval records cover similar time ranges |
|
| 225 |
+AND aggregate sums remain within tolerance |
|
| 226 |
+THEN classify as "consolidation likely" |
|
| 227 |
+ELSE classify as "changed/uncertain" with evidence |
|
| 228 |
+``` |
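
One possible reading of this heuristic, sketched under stated assumptions: `GroupEvidence`, `ChangeLabel`, the 1% relative tolerance, and the 60-second coverage slack are all hypothetical choices made for the example, not values from the spec.

```swift
import Foundation

// Hypothetical per-group evidence gathered from two adjacent observations.
struct GroupEvidence {
    let highFrequencyRecordsDisappeared: Bool
    let oldCoverage: ClosedRange<Date>    // time range covered by the old detailed records
    let newCoverage: ClosedRange<Date>?   // coverage of newer interval records, if any
    let oldSum: Double
    let newSum: Double
}

enum ChangeLabel: Equatable {
    case consolidationLikely
    case changedUncertain
}

/// Labels a group "consolidation likely" only when all three conditions hold:
/// detail disappeared, newer records cover a similar range, and sums agree within tolerance.
func classify(_ e: GroupEvidence,
              relativeTolerance: Double = 0.01,
              coverageSlack: TimeInterval = 60) -> ChangeLabel {
    guard e.highFrequencyRecordsDisappeared, let newCoverage = e.newCoverage else {
        return .changedUncertain
    }
    let startsMatch = abs(newCoverage.lowerBound.timeIntervalSince(e.oldCoverage.lowerBound)) <= coverageSlack
    let endsMatch = abs(newCoverage.upperBound.timeIntervalSince(e.oldCoverage.upperBound)) <= coverageSlack
    let scale = max(abs(e.oldSum), 1e-9)  // guard against division by zero
    let sumsMatch = abs(e.newSum - e.oldSum) / scale <= relativeTolerance
    return (startsMatch && endsMatch && sumsMatch) ? .consolidationLikely : .changedUncertain
}

// Usage: same coverage, sums differ by 0.1% versus 20%.
let start = Date(timeIntervalSince1970: 1_700_000_000)
let end = start.addingTimeInterval(3_600)
let thinned = GroupEvidence(highFrequencyRecordsDisappeared: true,
                            oldCoverage: start...end, newCoverage: start...end,
                            oldSum: 5_000, newSum: 4_995)
let drifted = GroupEvidence(highFrequencyRecordsDisappeared: true,
                            oldCoverage: start...end, newCoverage: start...end,
                            oldSum: 5_000, newSum: 4_000)
let uncovered = GroupEvidence(highFrequencyRecordsDisappeared: true,
                              oldCoverage: start...end, newCoverage: nil,
                              oldSum: 5_000, newSum: 5_000)
print(classify(thinned), classify(drifted), classify(uncovered))
```

Anything that fails a condition falls back to the neutral "changed/uncertain" label, matching the ELSE branch above; the evidence itself would still be attached to the report.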
|
| 229 |
+ |
|
| 230 |
+#### D. Export Preservation |
|
| 231 |
+``` |
|
| 232 |
+For selected records or diffs: |
|
| 233 |
+ export archived details, observation metadata, hashes, and explanatory labels |
|
| 234 |
+ never require a cloud round trip |
|
| 235 |
+``` |
|
| 236 |
+ |
|
| 237 |
+--- |
|
| 238 |
+ |
|
| 239 |
+## 5. Context Logging |
|
| 240 |
+ |
|
| 241 |
+HealthProbe does **not** sync its own archive through iCloud or CloudKit. Observed HealthKit databases can diverge between devices, and HealthProbe no longer attempts to compare snapshots from different devices. The product scope is one local observation timeline on the current device. |
|
| 242 |
+ |
|
| 243 |
+Health/iCloud state is still useful as **context** for interpreting local changes, but it is not treated as proof of cause. |
|
| 244 |
+ |
|
| 245 |
+### 5.1 Context Tracking |
|
| 246 |
+ |
|
| 247 |
+**Observe HealthKit permission & coarse system context:** |
|
| 248 |
+```swift |
|
| 249 |
+HKHealthStore().requestAuthorization(...) |
|
| 250 |
+// → Requests read access; HealthKit does not report whether read access was granted, so grants/revocations can only be inferred |
|
| 251 |
+ |
|
| 252 |
+// Monitor iCloud sign-in state as context only |
|
| 253 |
+FileManager.default.ubiquityIdentityToken |
|
| 254 |
+// → nil means no iCloud account; observe NSUbiquityIdentityDidChange for sign-in/sign-out |
|
| 255 |
+// → Logs context for later correlation |
|
| 256 |
+``` |
|
| 257 |
+ |
|
| 258 |
+**Capture lifecycle events:** |
|
| 259 |
+- iCloud sign-in detected → log context and schedule a local archive verification pass |
|
| 260 |
+- iCloud sign-out detected → note local-only mode |
|
| 261 |
+- Before a device backup → user-triggered manual capture (iOS gives apps no backup-start signal) |
|
| 262 |
+- App backgrounded/foregrounded → capture if needed |
|
| 263 |
+ |
|
| 264 |
+### 5.2 Context Documentation |
|
| 265 |
+ |
|
| 266 |
+**Audit trail entries:** |
|
| 267 |
+``` |
|
| 268 |
+[2026-05-01 14:23:15] SYNC_STATE_CHANGE: iCloud enabled |
|
| 269 |
+ - Previous: local-only |
|
| 270 |
+ - Action: archive verification scheduled |
|
| 271 |
+ - Result: no HealthProbe cloud sync performed |
|
| 272 |
+ |
|
| 273 |
+[2026-05-01 14:24:02] CAPTURE_COMPLETED: local HealthKit observation |
|
| 274 |
+ - Samples observed: 87 new, 3 no longer visible |
|
| 275 |
+ - Representation changes: 2 groups |
|
| 276 |
+ - Cause: not inferred |
|
| 277 |
+ |
|
| 278 |
+[2026-05-01 16:15:00] CHANGE_SUMMARY: Historical record appeared |
|
| 279 |
+ - Type: Workout |
|
| 280 |
+ - Record date: 2024-03-15 |
|
| 281 |
+ - First observed by HealthProbe: 2026-05-01 |
|
| 282 |
+ - Label: appeared; cause unknown |
|
| 283 |
+``` |
|
| 284 |
+ |
|
| 285 |
+### 5.3 Background Monitoring |
|
| 286 |
+ |
|
| 287 |
+**iOS Background Modes enabled:** |
|
| 288 |
+- `fetch` (the "Background fetch" capability): periodic archive and context checks |
|
| 289 |
+- `remote-notification`: not required, because HealthProbe does not sync its archive |
|
| 290 |
+ |
|
| 291 |
+**Check frequency:** |
|
| 292 |
+- Min: 2 hours |
|
| 293 |
+- Max: 24 hours |
|
| 294 |
+- Adapts based on archive cost and user preference; iOS decides the actual wake times, so these are targets, not guarantees |
|
| 295 |
+ |
|
| 296 |
+--- |
|
| 297 |
+ |
|
| 298 |
+## 6. Local Archive, Reports & Forensics |
|
| 299 |
+ |
|
| 300 |
+### 6.1 Local Archive Store |
|
| 301 |
+ |
|
| 302 |
+The main backup artifact is the on-device archive store. It is populated incrementally from HealthKit and is not dependent on Apple Health ZIP exports or full encrypted iPhone backups. |
|
| 303 |
+ |
|
| 304 |
+The archive must preserve as much HealthKit information as the API exposes: |
|
| 305 |
+- sample UUID, type, start/end date, value, unit, and metadata |
|
| 306 |
+- source, source revision, bundle identifier, product type, version/build if available |
|
| 307 |
+- device fields exposed by `HKDevice` |
|
| 308 |
+- relationships between workouts, samples, events, and other linked records where available |
|
| 309 |
+- first-seen / last-seen / last-verified observations |
|
| 310 |
+- fingerprints suitable for matching against Apple Health XML exports and extracted backup databases |
|
| 311 |
+ |
|
| 312 |
+Capture is limited to user-selected data types for performance and privacy, but all captured types are stored in **one schema** so later analysis can follow relationships between types. |
|
| 313 |
+ |
|
| 314 |
+### 6.2 Reports and Point Exports |
|
| 315 |
+ |
|
| 316 |
+HealthProbe does not need to optimize for routine complete exports. The local archive is the backup; point exports are the user-facing way to preserve or share a historical view. |
|
| 317 |
+ |
|
| 318 |
+Export is scoped to what the user is inspecting: |
|
| 319 |
+- point-in-time record tables |
|
| 320 |
+- diff reports between two observations |
|
| 321 |
+- point-in-time manifests and hashes |
|
| 322 |
+- selected record sets needed for external analysis |
|
| 323 |
+ |
|
| 324 |
+### 6.3 Forensic Query Examples |
|
| 325 |
+ |
|
| 326 |
+**"What did my step data look like on March 1?"** |
|
| 327 |
+``` |
|
| 328 |
+1. Select the March 1 observation |
|
| 329 |
+2. Load archived visible records and aggregates for Steps |
|
| 330 |
+3. Show counts, time range, daily totals, and source breakdown |
|
| 331 |
+4. Offer JSON/CSV export for the selected view |
|
| 332 |
+``` |
|
| 333 |
+ |
|
| 334 |
+**"What changed since the previous observation?"** |
|
| 335 |
+``` |
|
| 336 |
+1. Compare adjacent local observations |
|
| 337 |
+2. Group appeared, disappeared, retained, and representation-changed records |
|
| 338 |
+3. Explain likely consolidation when aggregates remain stable |
|
| 339 |
+``` |
|
| 340 |
+ |
|
| 341 |
+**"Can I still export detail that HealthKit no longer shows?"** |
|
| 342 |
+``` |
|
| 343 |
+1. Search the local archive for the earlier observation |
|
| 344 |
+2. Select the preserved records or diff set |
|
| 345 |
+3. Export archived details with manifest hashes and observation metadata |
|
| 346 |
+``` |
|
| 347 |
+ |
|
| 348 |
+--- |
|
| 349 |
+ |
|
| 350 |
+## 7. User-Facing Features |
|
| 351 |
+ |
|
| 352 |
+### 7.1 Dashboard (iOS App) |
|
| 353 |
+ |
|
| 354 |
+**Home Screen:** |
|
| 355 |
+- **Latest Observation** — timestamp and capture quality |
|
| 356 |
+- **Archive Coverage** — selected data types, date range, storage use |
|
| 357 |
+- **Recent Changes** — neutral summary of appeared/disappeared/changed records |
|
| 358 |
+- **Export Shortcuts** — selected observation or diff report |
|
| 359 |
+ |
|
| 360 |
+**Detail Views:** |
|
| 361 |
+- **Timeline** — local observations over time |
|
| 362 |
+- **Observation Detail** — point-in-time tables and aggregates |
|
| 363 |
+- **Diff Detail** — changes between two observations |
|
| 364 |
+- **Audit Trail** — complete immutable log |
|
| 365 |
+- **Archive Status** — current local archive health, last verification, selected data types |
|
| 366 |
+ |
|
| 367 |
+**Settings:** |
|
| 368 |
+- Capture frequency |
|
| 369 |
+- Sample types to track |
|
| 370 |
+- Change-label thresholds/tolerances |
|
| 371 |
+- Local archive retention and report export options |
|
| 372 |
+ |
|
| 373 |
+### 7.2 Notifications |
|
| 374 |
+ |
|
| 375 |
+Notification-led alerting is not a current product objective. The app may later add reminders for scheduled capture or completed exports, but alerts about presumed data loss are explicitly out of scope. |
|
| 376 |
+ |
|
| 377 |
+--- |
|
| 378 |
+ |
|
| 379 |
+## 8. Future Work Parking Lot |
|
| 380 |
+ |
|
| 381 |
+Items in this section are not active product objectives. They require a separate scope decision before implementation. |
|
| 382 |
+ |
|
| 383 |
+### 8.1 Better Reconstruction |
|
| 384 |
+- richer point-in-time query language |
|
| 385 |
+- improved consolidation heuristics |
|
| 386 |
+- archive compaction without losing observation history |
|
| 387 |
+ |
|
| 388 |
+### 8.2 External Analysis Tools |
|
| 389 |
+- Analyze explicit HealthProbe exports outside the iOS app if needed for the DearApple investigation/article |
|
| 390 |
+- Do not treat a macOS companion, community sharing, or open-source publication as committed product scope |
|
| 391 |
+ |
|
| 392 |
+### 8.3 Recovery-Compatible Archives |
|
| 393 |
+ |
|
| 394 |
+HealthProbe will not perform recovery workflows. It will not patch iOS backups, transplant Health database files, or re-publish archived values into HealthKit. |
|
| 395 |
+ |
|
| 396 |
+However, HealthProbe archives and exports should be suitable input for external recovery/salvage procedures, including: |
|
| 397 |
+ |
|
| 398 |
+1. **Backup transplant restoration outside the app** |
|
| 399 |
+ - External tooling may use HealthProbe evidence alongside the reverse of the DearApple scratchpad HealthDB extraction/reinsertion workflow. |
|
| 400 |
+ - Archive/export requirements: preserve source database identity where available, record identity, payload versions, dates, values, units, metadata, relationships, observation timestamps, and manifest hashes. |
|
| 401 |
+ |
|
| 402 |
+2. **HealthKit re-publication outside the app** |
|
| 403 |
+ - External tooling may choose to write missing values back through HealthKit as new app-owned samples. |
|
| 404 |
+ - Archive/export requirements: preserve enough value/date/unit/type detail to recreate user-visible values, plus explicit provenance warnings because original source/device/sync metadata may be lost. |
|
| 405 |
+ |
|
| 406 |
+Recovery compatibility is therefore an archive/export design requirement, not an in-app restore feature. The app remains read-only. |
|
| 407 |
+ |
|
| 408 |
+--- |
|
| 409 |
+ |
|
| 410 |
+## 9. Technical Specifications |
|
| 411 |
+ |
|
| 412 |
+### 9.1 Platform |
|
| 413 |
+- **iOS 15.0+** (HealthKit framework support; keeps iPhone 6s-era Health collection devices in scope) |
|
| 414 |
+- Legacy Apple Watch devices remain relevant as Health data sources paired to the target iPhone, but HealthProbe itself is scoped as an iOS app. |
|
| 415 |
+ |
|
| 416 |
+### 9.2 Permissions Required |
|
| 417 |
+- `HealthKit` — read-only access to specified types |
|
| 418 |
+- `Background Modes` — "Background Fetch" |
|
| 419 |
+ |
|
| 420 |
+### 9.3 Data Storage |
|
| 421 |
+- **SQLite Archive Store:** canonical differential HealthKit observation archive and analysis engine (source of truth) |
|
| 422 |
+- **Core Data:** derived UI/report/cache/settings/log/history store, rebuildable from SQLite |
|
| 423 |
+- **No CloudKit sync:** HealthProbe data remains local unless the user exports a report or selected record table |
|
| 424 |
+ |
|
| 425 |
+### 9.4 Performance |
|
| 426 |
+- Query time: < 5 seconds (anchored queries) |
|
| 427 |
+- UI/report cache size: bounded, rebuildable, and safe to purge |
|
| 428 |
+- Archive storage: differential; depends on selected high-frequency data types and number of representation changes, not number of full periodic snapshots |
|
| 429 |
+- Large analysis: runs in SQLite with paged results; Swift must not load full high-frequency datasets into RAM |
|
| 430 |
+ |
|
| 431 |
+--- |
|
| 432 |
+ |
|
| 433 |
+## 10. Privacy & Security |
|
| 434 |
+ |
|
| 435 |
+### 10.1 What HealthProbe Never Does |
|
| 436 |
+- ❌ Exports raw health samples to cloud |
|
| 437 |
+- ❌ Identifies users by name/account |
|
| 438 |
+- ❌ Shares device location or personal context |
|
| 439 |
+- ❌ Modifies any HealthKit data |
|
| 440 |
+- ❌ Patches, transplants, or rewrites iOS backup databases |
|
| 441 |
+- ❌ Sells or shares data with third parties |
|
| 442 |
+ |
|
| 443 |
+### 10.2 What HealthProbe Collects (Local Only) |
|
| 444 |
+- ✅ Aggregated counts and per-sample archive data for user-selected types |
|
| 445 |
+- ✅ Observation and change timestamps |
|
| 446 |
+- ✅ Device model & iOS version (for context) |
|
| 447 |
+- ✅ Change labels and evidence summaries |
|
| 448 |
+ |
|
| 449 |
+**Local archive:** |
|
| 450 |
+- ✅ Per-sample archive for user-selected types, stored on-device and exportable by user |
|
| 451 |
+- ✅ Metadata needed for recognition in Apple Health XML exports, backup database extracts, and future datasets |
|
| 452 |
+ |
|
| 453 |
+### 10.3 Cloud Policy |
|
| 454 |
+- No HealthProbe CloudKit/iCloud sync |
|
| 455 |
+- No cross-device HealthProbe snapshot comparison |
|
| 456 |
+- No automatic upload of raw samples, digests, reports, or device fingerprints |
|
| 457 |
+- User-triggered exports are explicit, scoped, and local-file based |
|
| 458 |
+ |
|
| 459 |
+--- |
|
| 460 |
+ |
|
| 461 |
+## 11. Success Criteria |
|
| 462 |
+ |
|
| 463 |
+| Objective | Metric | Target | |
|
| 464 |
+|-----------|--------|--------| |
|
| 465 |
+| **Time-machine inspection** | User can inspect a selected observation | All captured types | |
|
| 466 |
+| **Change explanation** | Diffs include neutral labels and evidence | > 95% of visible changes classified or marked uncertain | |
|
| 467 |
+| **Export preservation** | Selected historical records can be exported | JSON/CSV with manifest hashes | |
|
| 468 |
+| **False alarms** | Count-only drops framed as critical loss | 0 by design | |
|
| 469 |
+| **Privacy** | % of users comfortable with data practices | > 90% | |
|
| 470 |
+| **Performance** | Background capture battery impact | < 2% drain/day | |
|
| 471 |
+| **Reproducibility** | Users can preserve scoped evidence | High relevance for personal analysis and the DearApple investigation context | |
|
| 472 |
+ |
|
| 473 |
+--- |
|
| 474 |
+ |
|
| 475 |
+## 12. References & Related Work |
|
| 476 |
+ |
|
| 477 |
+- [DearApple Issue #001](https://github.com/overbog/dear-apple/issues/0001-apple-health-mass-data-loss.md) — historical context for reported Apple Health data anomalies |
|
| 478 |
+- [Apple HealthKit Documentation](https://developer.apple.com/documentation/healthkit/) |
|
| 479 |
+- [HKAnchoredObjectQuery](https://developer.apple.com/documentation/healthkit/hkanchoredobjectquery) — Efficient incremental queries |
|
| 480 |
+ |
|
| 481 |
+--- |
|
| 482 |
+ |
|
| 483 |
+## Appendix A: Example Diff Export |
|
| 484 |
+ |
|
| 485 |
+```json |
|
| 486 |
+{
|
|
| 487 |
+ "report_id": "DIFF_20260501_001", |
|
| 488 |
+ "type": "observation_diff", |
|
| 489 |
+ "exported_at": "2026-05-01T14:35:22Z", |
|
| 490 |
+ "from_observation": "2026-04-01T08:00:00Z", |
|
| 491 |
+ "to_observation": "2026-05-01T08:00:00Z", |
|
| 492 |
+ "evidence": {
|
|
| 493 |
+ "sample_type": "HKQuantityTypeIdentifierStepCount", |
|
| 494 |
+ "appeared": 12, |
|
| 495 |
+ "disappeared": 84, |
|
| 496 |
+ "representation_changed": 6, |
|
| 497 |
+ "aggregate_delta": {
|
|
| 498 |
+ "value_sum_percent": 0.1, |
|
| 499 |
+ "covered_days": 31 |
|
| 500 |
+ }, |
|
| 501 |
+ "label": "consolidation_likely", |
|
| 502 |
+ "cause": "not inferred" |
|
| 503 |
+ }, |
|
| 504 |
+ "manifest_hash": "synthetic-example-hash" |
|
| 505 |
+} |
|
| 506 |
+``` |
|
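The example above uses a placeholder for `manifest_hash`. One possible way to compute and verify such a hash is sketched below, assuming (this is not specified by the design) that the hash is a SHA-256 digest over the canonical JSON of the report with the `manifest_hash` field itself excluded. A real manifest would probably also need to cover any exported record files.

```python
import hashlib
import json


def manifest_hash(report: dict) -> str:
    """SHA-256 over canonical JSON of the report, excluding the hash field."""
    body = {key: value for key, value in report.items() if key != "manifest_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(report: dict) -> dict:
    """Return a copy of the report with its manifest hash attached."""
    return dict(report, manifest_hash=manifest_hash(report))


def verify(report: dict) -> bool:
    """True when the stored hash matches the report contents."""
    return report.get("manifest_hash") == manifest_hash(report)


# Synthetic report, shaped like the appendix example.
report = {
    "report_id": "DIFF_20260501_001",
    "type": "observation_diff",
    "exported_at": "2026-05-01T14:35:22Z",
    "evidence": {"appeared": 12, "disappeared": 84, "label": "consolidation_likely"},
}
sealed = seal(report)
tampered = dict(sealed, exported_at="2026-05-02T00:00:00Z")
```

`verify(sealed)` succeeds and `verify(tampered)` fails, because the canonical encoding makes the digest independent of key order but sensitive to any value change.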
| 507 |
+ |
|
| 508 |
+--- |
|
| 509 |
+ |
|
| 510 |
+*HealthProbe — A local time machine for your Health database.* |
|
@@ -0,0 +1,188 @@ |
||
| 1 |
+# HealthProbe - Core Data Cache Design |
|
| 2 |
+ |
|
| 3 |
+**Last Updated:** 2026-05-23 |
|
| 4 |
+**Status:** Target design for UI/report cache |
|
| 5 |
+ |
|
| 6 |
+## 1. Purpose |
|
| 7 |
+ |
|
| 8 |
+Core Data is not the forensic archive. It is the bounded, UI-friendly store for values already derived from the SQLite archive. |
|
| 9 |
+ |
|
| 10 |
+Use Core Data for: |
|
| 11 |
+- observation rows shown in timelines; |
|
| 12 |
+- type summaries and expensive counts; |
|
| 13 |
+- daily/monthly aggregate display rows; |
|
| 14 |
+- diff summary rows; |
|
| 15 |
+- export history/status rows; |
|
| 16 |
+- archive health/status rows; |
|
| 17 |
+- local app state and settings that are not forensic evidence. |
|
| 18 |
+ |
|
| 19 |
+Do not use Core Data for: |
|
| 20 |
+- the only copy of HealthKit samples; |
|
| 21 |
+- raw record payload history; |
|
| 22 |
+- relationship evidence; |
|
| 23 |
+- point-in-time reconstruction truth; |
|
| 24 |
+- large record tables. |
|
| 25 |
+ |
|
| 26 |
+If Core Data and SQLite disagree, SQLite wins. |
|
| 27 |
+ |
|
| 28 |
+## 2. Store Categories |
|
| 29 |
+ |
|
| 30 |
+Core Data may contain two categories of entities. |
|
| 31 |
+ |
|
| 32 |
+**Rebuildable cache entities** |
|
| 33 |
+Can be deleted and rebuilt from SQLite: |
|
| 34 |
+- `CachedObservationRow`; |
|
| 35 |
+- `CachedTypeSummary`; |
|
| 36 |
+- `CachedDailyAggregate`; |
|
| 37 |
+- `CachedDiffSummary`; |
|
| 38 |
+- `CachedExportManifest`; |
|
| 39 |
+- `CachedArchiveHealth`. |
|
| 40 |
+ |
|
| 41 |
+**Local app state/settings** |
|
| 42 |
+Not forensic, not necessarily rebuildable: |
|
| 43 |
+- selected type preferences; |
|
| 44 |
+- UI display preferences; |
|
| 45 |
+- last opened screen/state; |
|
| 46 |
+- feature flags for legacy-device UI simplification. |
|
| 47 |
+ |
|
| 48 |
+Cache rebuild must not delete settings unless the user explicitly resets the app. |
|
| 49 |
+ |
|
| 50 |
+## 3. Entity Contracts |
|
| 51 |
+ |
|
| 52 |
+### CachedObservationRow |
|
| 53 |
+ |
|
| 54 |
+Purpose: timeline/list display. |
|
| 55 |
+ |
|
| 56 |
+Required fields: |
|
| 57 |
+- `observationID`; |
|
| 58 |
+- `observedAt`; |
|
| 59 |
+- `status`; |
|
| 60 |
+- `triggerReason`; |
|
| 61 |
+- `timeZoneIdentifier`; |
|
| 62 |
+- `trackedTypeCount`; |
|
| 63 |
+- `visibleRecordCount`; |
|
| 64 |
+- `appearedCount`; |
|
| 65 |
+- `disappearedCount`; |
|
| 66 |
+- `representationChangedCount`; |
|
| 67 |
+- `archiveSchemaVersion`; |
|
| 68 |
+- `cacheSchemaVersion`; |
|
| 69 |
+- `sourceAggregateHash`; |
|
| 70 |
+- `computedAt`. |
|
| 71 |
+ |
|
| 72 |
+### CachedTypeSummary |
|
| 73 |
+ |
|
| 74 |
+Purpose: per-observation/per-type summary cards and reports. |
|
| 75 |
+ |
|
| 76 |
+Required fields: |
|
| 77 |
+- `observationID`; |
|
| 78 |
+- `sampleTypeIdentifier`; |
|
| 79 |
+- `displayName`; |
|
| 80 |
+- `visibleRecordCount`; |
|
| 81 |
+- `appearedCount`; |
|
| 82 |
+- `disappearedCount`; |
|
| 83 |
+- `representationChangedCount`; |
|
| 84 |
+- `earliestStartDate`; |
|
| 85 |
+- `latestEndDate`; |
|
| 86 |
+- `valueSum`; |
|
| 87 |
+- `valueMax`; |
|
| 88 |
+- `aggregateHash`; |
|
| 89 |
+- `computedAt`. |
|
| 90 |
+ |
|
| 91 |
+### CachedDailyAggregate |
|
| 92 |
+ |
|
| 93 |
+Purpose: charts and report tables. |
|
| 94 |
+ |
|
| 95 |
+Required fields: |
|
| 96 |
+- `observationID`; |
|
| 97 |
+- `sampleTypeIdentifier`; |
|
| 98 |
+- `bucketStart`; |
|
| 99 |
+- `bucketEnd`; |
|
| 100 |
+- `timeZoneIdentifier`; |
|
| 101 |
+- `visibleRecordCount`; |
|
| 102 |
+- `valueSum`; |
|
| 103 |
+- `valueMax`; |
|
| 104 |
+- `sourceRevisionDisplayHash`; |
|
| 105 |
+- `aggregateHash`; |
|
| 106 |
+- `computedAt`. |
|
| 107 |
+ |
|
| 108 |
+### CachedDiffSummary |
|
| 109 |
+ |
|
| 110 |
+Purpose: observation comparison list/detail. |
|
| 111 |
+ |
|
| 112 |
+Required fields: |
|
| 113 |
+- `fromObservationID`; |
|
| 114 |
+- `toObservationID`; |
|
| 115 |
+- `sampleTypeIdentifier`; |
|
| 116 |
+- `appearedCount`; |
|
| 117 |
+- `disappearedCount`; |
|
| 118 |
+- `representationChangedCount`; |
|
| 119 |
+- `consolidationLikely`; |
|
| 120 |
+- `uncertaintyReason`; |
|
| 121 |
+- `sourceAggregateHash`; |
|
| 122 |
+- `computedAt`. |
|
| 123 |
+ |
|
| 124 |
+### CachedExportManifest |
|
| 125 |
+ |
|
| 126 |
+Purpose: export history/status display. |
|
| 127 |
+ |
|
| 128 |
+Required fields: |
|
| 129 |
+- `exportID`; |
|
| 130 |
+- `exportKind`; |
|
| 131 |
+- `createdAt`; |
|
| 132 |
+- `fromObservationID`; |
|
| 133 |
+- `toObservationID`; |
|
| 134 |
+- `filterSummary`; |
|
| 135 |
+- `recordCount`; |
|
| 136 |
+- `manifestHash`; |
|
| 137 |
+- `fileURLBookmarkData`; |
|
| 138 |
+- `status`; |
|
| 139 |
+- `computedAt`. |
|
| 140 |
+ |
|
| 141 |
+### CachedArchiveHealth |
|
| 142 |
+ |
|
| 143 |
+Purpose: archive status screen. |
|
| 144 |
+ |
|
| 145 |
+Required fields: |
|
| 146 |
+- `archiveSchemaVersion`; |
|
| 147 |
+- `cacheSchemaVersion`; |
|
| 148 |
+- `lastIntegrityCheckAt`; |
|
| 149 |
+- `lastIntegrityStatus`; |
|
| 150 |
+- `lastErrorKind`; |
|
| 151 |
+- `lastErrorMessageHash`; |
|
| 152 |
+- `cacheBuildID`; |
|
| 153 |
+- `computedAt`. |
|
| 154 |
+ |
|
| 155 |
+## 4. Invalidation |
|
| 156 |
+ |
|
| 157 |
+Invalidate/rebuild cache rows when: |
|
| 158 |
+- archive schema version changes; |
|
| 159 |
+- archive reset/reinitialization occurs; |
|
| 160 |
+- selected type registry changes; |
|
| 161 |
+- a new observation commits in SQLite; |
|
| 162 |
+- aggregate hashes change; |
|
| 163 |
+- cache schema version changes. |
|
| 164 |
+ |
|
| 165 |
+Rebuild order: |
|
| 166 |
+1. archive health/status; |
|
| 167 |
+2. observation rows; |
|
| 168 |
+3. type summaries; |
|
| 169 |
+4. daily/monthly aggregates; |
|
| 170 |
+5. diff summaries; |
|
| 171 |
+6. export status rows. |
|
| 172 |
+ |
|
| 173 |
+Partial rebuild is allowed when SQLite can identify affected observations/types. Full rebuild must remain available for repair and tests. |
|
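The invalidation triggers above can be read as a small decision function. The following is a sketch only: `CacheInputs`, `plan_rebuild`, and the three result labels are hypothetical names, and treating a lower observation ID as evidence of an archive reset is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class CacheInputs:
    """Values the cache was (or would be) built from."""
    archive_schema_version: int
    cache_schema_version: int
    selected_type_set_hash: str
    latest_observation_id: int
    aggregate_hashes: Dict[str, str]  # sample type identifier -> aggregate hash


def plan_rebuild(cached: Optional[CacheInputs],
                 current: CacheInputs) -> Tuple[str, Set[str]]:
    """Return ("full" | "partial" | "none", affected sample types)."""
    all_types = set(current.aggregate_hashes)
    if cached is None:
        return "full", all_types  # no cache yet, or cache was purged
    if (cached.archive_schema_version != current.archive_schema_version
            or cached.cache_schema_version != current.cache_schema_version
            or cached.selected_type_set_hash != current.selected_type_set_hash
            or current.latest_observation_id < cached.latest_observation_id):
        return "full", all_types  # schema change, registry change, or reset
    changed = {t for t, h in current.aggregate_hashes.items()
               if cached.aggregate_hashes.get(t) != h}
    if changed or current.latest_observation_id > cached.latest_observation_id:
        # A new observation always adds an observation row, even when no
        # per-type aggregate changed (then the affected set is empty).
        return "partial", changed
    return "none", set()


base = CacheInputs(2, 1, "types-a", 5, {"steps": "h1", "heart": "h2"})
newer = replace(base, latest_observation_id=6,
                aggregate_hashes={"steps": "h9", "heart": "h2"})
migrated = replace(base, archive_schema_version=3)
```

A partial plan would then be executed in the rebuild order listed above, restricted to the affected types.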
| 174 |
+ |
|
| 175 |
+## 5. Legacy Device Mode |
|
| 176 |
+ |
|
| 177 |
+Legacy or low-memory UI should still use the same Core Data cache. It may reduce: |
|
| 178 |
+- chart density; |
|
| 179 |
+- default date range; |
|
| 180 |
+- preview row count; |
|
| 181 |
+- simultaneous loaded detail panes. |
|
| 182 |
+ |
|
| 183 |
+It must preserve: |
|
| 184 |
+- capture; |
|
| 185 |
+- cached summaries; |
|
| 186 |
+- report generation; |
|
| 187 |
+- paged SQLite detail/export access. |
|
| 188 |
+ |
|
@@ -0,0 +1,777 @@ |
||
| 1 |
+# HealthProbe - Database Design |
|
| 2 |
+ |
|
| 3 |
+**Version:** 1.1 |
|
| 4 |
+**Last Updated:** 2026-05-23 |
|
| 5 |
+**Status:** Canonical database/storage design |
|
| 6 |
+ |
|
| 7 |
+## 1. Purpose |
|
| 8 |
+ |
|
| 9 |
+The database is the central piece of HealthProbe. The app can only reconstruct, analyze, export, and explain HealthKit history if the archive is complete, correct, queryable, and stable across product changes. |
|
| 10 |
+ |
|
| 11 |
+UI can be refactored cheaply. A wrong archive design can permanently lose evidence, make large analyses impossible on low-end devices, or prevent future recovery-compatible exports. All storage work must start from this document. |
|
| 12 |
+ |
|
| 13 |
+## 2. Non-Negotiable Requirements |
|
| 14 |
+ |
|
| 15 |
+1. **SQLite archive is the source of truth.** |
|
| 16 |
+ Core Data is a rebuildable cache. SwiftData is legacy/prototype only. |
|
| 17 |
+ |
|
| 18 |
+2. **Store differentially.** |
|
| 19 |
+ Do not append recurring complete snapshots of large HealthKit datasets. Store identities, payload versions, observation events/ranges, and aggregates. |
|
| 20 |
+ |
|
| 21 |
+3. **Analyze in SQL, not RAM.** |
|
| 22 |
+ Diffs, counts, point-in-time reconstruction, export selection, and consolidation heuristics must use SQLite indexes, joins, CTEs, temporary tables, and paged/streaming results. |
|
| 23 |
+ |
|
| 24 |
+4. **Support legacy devices.** |
|
| 25 |
+ The target includes iOS 15-era devices such as iPhone 6s-class Health collection setups. Do not require SwiftData. |
|
| 26 |
+ |
|
| 27 |
+5. **Preserve recovery-compatible structure.** |
|
| 28 |
+ The app will not restore or re-publish data, but archives/exports must preserve enough identity, payload, provenance, relationships, hashes, and observation history for external recovery/salvage tooling. |
|
| 29 |
+ |
|
| 30 |
+6. **Never treat counts as sufficient truth.** |
|
| 31 |
+ Counts are cached for reports/UI, but record identity, payload versions, visibility history, and aggregate context are required for interpretation. |
|
| 32 |
+ |
|
| 33 |
+7. **No real personal data in repository artifacts.** |
|
| 34 |
+ Database fixtures, docs, tests, and examples must use synthetic values only. |
|
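Requirement 3 (analyze in SQL, not RAM) implies that callers read results in bounded pages. A minimal sketch of keyset pagination over a reduced, hypothetical `sample_versions` table follows. It uses Python's `sqlite3` module for illustration only; the app itself would issue equivalent SQL from Swift.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_pages(conn: sqlite3.Connection,
               page_size: int) -> Iterator[List[Tuple[int, float]]]:
    """Yield rows in primary-key order, at most page_size rows at a time."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, numeric_value FROM sample_versions "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # resume after the last row already seen


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_versions (id INTEGER PRIMARY KEY, numeric_value REAL)"
)
conn.executemany(
    "INSERT INTO sample_versions (numeric_value) VALUES (?)",
    [(float(i),) for i in range(250)],  # synthetic values only
)

# Collected into a list here only to inspect the demo; real callers would
# process and discard each page before requesting the next one.
page_sizes = [len(page) for page in iter_pages(conn, 100)]

# Aggregates stay in SQL instead of summing rows in application memory.
total = conn.execute("SELECT SUM(numeric_value) FROM sample_versions").fetchone()[0]
```

Keyset pagination (`WHERE id > ?`) is used instead of `OFFSET` so that each page costs an index seek rather than a scan over all skipped rows.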
| 35 |
+ |
|
| 36 |
+## 3. Storage Layers |
|
| 37 |
+ |
|
| 38 |
+### 3.1 SQLite Archive / Analysis Database |
|
| 39 |
+ |
|
| 40 |
+`HealthProbeArchive.sqlite` |
|
| 41 |
+ |
|
| 42 |
+Responsibilities: |
|
| 43 |
+- canonical HealthKit observation history; |
|
| 44 |
+- sample identity and payload versioning; |
|
| 45 |
+- source/device/metadata/relationship preservation; |
|
| 46 |
+- point-in-time reconstruction; |
|
| 47 |
+- adjacent and selected-observation diffs; |
|
| 48 |
+- consolidation heuristics; |
|
| 49 |
+- materialized aggregates; |
|
| 50 |
+- streaming/paged exports; |
|
| 51 |
+- integrity manifests and future schema migrations. |
|
| 52 |
+ |
|
| 53 |
+This database must be queryable without loading high-frequency datasets into Swift arrays. |
|
| 54 |
+ |
|
| 55 |
+### 3.2 Core Data UI / Report Cache |
|
| 56 |
+ |
|
| 57 |
+Core Data cache store. |
|
| 58 |
+ |
|
| 59 |
+Responsibilities: |
|
| 60 |
+- expensive counts already computed from SQLite; |
|
| 61 |
+- observation list rows; |
|
| 62 |
+- dashboard/timeline summaries; |
|
| 63 |
+- per-type summary rows; |
|
| 64 |
+- report/export metadata; |
|
| 65 |
+- app settings and lightweight UI state. |
|
| 66 |
+ |
|
| 67 |
+Rules: |
|
| 68 |
+- cache rows are disposable; |
|
| 69 |
+- cache rows must be rebuildable from SQLite; |
|
| 70 |
+- if Core Data and SQLite disagree, SQLite wins; |
|
| 71 |
+- Core Data must not contain the only copy of any record-level evidence. |
|
| 72 |
+ |
|
| 73 |
+### 3.3 SwiftData Legacy Store |
|
| 74 |
+ |
|
| 75 |
+Current SwiftData models are a prototype implementation detail. New storage work should not expand them. |
|
| 76 |
+ |
|
| 77 |
+There are no real deployments, only test installs. During the archive v2 refactor, old SwiftData stores and prototype SQLite archives may be ignored, deleted, or reinitialized. Do not build backward compatibility or one-way import for the old prototype schema unless a later product decision explicitly changes this policy. |
|
| 78 |
+ |
|
| 79 |
+## 4. Conceptual Model |
|
| 80 |
+ |
|
| 81 |
+### Observation |
|
| 82 |
+ |
|
| 83 |
+An observation is one local capture attempt/result at a specific time on the current device chain. It is not a full copy of all visible records. |
|
| 84 |
+ |
|
| 85 |
+An observation records: |
|
| 86 |
+- when capture started/ended; |
|
| 87 |
+- app/schema/OS context; |
|
| 88 |
+- timezone context at observation time; |
|
| 89 |
+- selected type registry; |
|
| 90 |
+- per-type capture quality; |
|
| 91 |
+- HealthKit anchors; |
|
| 92 |
+- events and aggregate changes observed during the capture. |
|
| 93 |
+ |
|
| 94 |
+### Terminology |
|
| 95 |
+ |
|
| 96 |
+- **Capture**: the act of querying HealthKit and writing results to the archive. |
|
| 97 |
+- **Observation**: the durable archive record created by a capture attempt. |
|
| 98 |
+- **Snapshot**: a reconstructed view of records visible at a selected observation. Do not store snapshot copies for high-volume data. |
|
| 99 |
+- **Diff**: SQL-derived comparison between two observations on the same local device chain. |
|
| 100 |
+ |
|
| 101 |
+### Sample Identity |
|
| 102 |
+ |
|
| 103 |
+A sample identity is the stable record or semantic record HealthProbe tracks over time. |
|
| 104 |
+ |
|
| 105 |
+Identity inputs may include: |
|
| 106 |
+- HealthKit UUID hash when available; |
|
| 107 |
+- strict fingerprint; |
|
| 108 |
+- semantic fingerprint; |
|
| 109 |
+- sample type; |
|
| 110 |
+- date range; |
|
| 111 |
+- value/unit/category/workout fields; |
|
| 112 |
+- source revision where relevant. |
|
| 113 |
+ |
|
| 114 |
+The HealthKit UUID hash is important but not sufficient for every future use case. Matching against Apple exports and backup database extracts may require semantic/fuzzy matching. |
|
| 115 |
+ |
|
| 116 |
+### Sample Version |
|
| 117 |
+ |
|
| 118 |
+A sample version is the payload representation observed for a sample identity. |
|
| 119 |
+ |
|
| 120 |
+A new version is created only when the representation changes: |
|
| 121 |
+- start/end dates; |
|
| 122 |
+- value/unit/category/workout fields; |
|
| 123 |
+- source revision; |
|
| 124 |
+- metadata hash; |
|
| 125 |
+- related sample/workout/event links. |
|
| 126 |
+ |
|
| 127 |
+### Visibility/Event History |
|
| 128 |
+ |
|
| 129 |
+HealthProbe stores visibility as events and/or compressed ranges: |
|
| 130 |
+- appeared; |
|
| 131 |
+- verified/seen; |
|
| 132 |
+- disappeared/no longer visible; |
|
| 133 |
+- representation changed; |
|
| 134 |
+- deleted-object evidence where HealthKit exposes `HKDeletedObject`. |
|
| 135 |
+ |
|
| 136 |
+This allows point-in-time reconstruction without duplicating every visible record into every observation. |
|
| 137 |
+ |
|
| 138 |
+### Aggregate Cache In SQLite |
|
| 139 |
+ |
|
| 140 |
+SQLite stores materialized aggregates because many reports and screens need expensive counts/sums repeatedly. |
|
| 141 |
+ |
|
| 142 |
+Aggregates are archive-derived evidence, not the source of truth. They must be rebuildable from sample/version/event tables. |
|
| 143 |
+ |
|
| 144 |
+## 5. Target SQLite Schema |
|
| 145 |
+ |
|
| 146 |
+Exact names may evolve, but the shape and constraints should remain. |
|
| 147 |
+ |
|
| 148 |
+### 5.1 Schema And Metadata |
|
| 149 |
+ |
|
| 150 |
+```sql |
|
| 151 |
+CREATE TABLE schema_migrations ( |
|
| 152 |
+ version INTEGER PRIMARY KEY, |
|
| 153 |
+ applied_at REAL NOT NULL, |
|
| 154 |
+ description TEXT NOT NULL |
|
| 155 |
+); |
|
| 156 |
+ |
|
| 157 |
+CREATE TABLE archive_metadata ( |
|
| 158 |
+ key TEXT PRIMARY KEY, |
|
| 159 |
+ value TEXT NOT NULL |
|
| 160 |
+); |
|
| 161 |
+``` |
|
| 162 |
+ |
|
| 163 |
+### 5.2 Device Chain And Observations |
|
| 164 |
+ |
|
| 165 |
+```sql |
|
| 166 |
+CREATE TABLE device_chains ( |
|
| 167 |
+ id INTEGER PRIMARY KEY, |
|
| 168 |
+ device_chain_hash TEXT NOT NULL UNIQUE, |
|
| 169 |
+ created_at REAL NOT NULL, |
|
| 170 |
+ recovered_from_keychain INTEGER NOT NULL DEFAULT 0 |
|
| 171 |
+); |
|
| 172 |
+ |
|
| 173 |
+CREATE TABLE observations ( |
|
| 174 |
+ id INTEGER PRIMARY KEY, |
|
| 175 |
+ device_chain_id INTEGER NOT NULL REFERENCES device_chains(id), |
|
| 176 |
+ observed_at REAL NOT NULL, |
|
| 177 |
+ started_at REAL, |
|
| 178 |
+ ended_at REAL, |
|
| 179 |
+ status TEXT NOT NULL, |
|
| 180 |
+ trigger_reason TEXT NOT NULL, |
|
| 181 |
+ app_version TEXT, |
|
| 182 |
+ os_version TEXT, |
|
| 183 |
+ time_zone_identifier TEXT, |
|
| 184 |
+ time_zone_seconds_from_gmt INTEGER, |
|
| 185 |
+ schema_version INTEGER NOT NULL, |
|
| 186 |
+ selected_type_set_hash TEXT, |
|
| 187 |
+ notes TEXT |
|
| 188 |
+); |
|
| 189 |
+ |
|
| 190 |
+CREATE INDEX idx_observations_device_time |
|
| 191 |
+ON observations(device_chain_id, observed_at); |
|
| 192 |
+``` |
|
| 193 |
+ |
|
| 194 |
+### 5.3 Per-Type Capture Runs And Anchors |
|
| 195 |
+ |
|
| 196 |
+```sql |
|
| 197 |
+CREATE TABLE sample_types ( |
|
| 198 |
+ id INTEGER PRIMARY KEY, |
|
| 199 |
+ type_identifier TEXT NOT NULL UNIQUE, |
|
| 200 |
+ display_name TEXT, |
|
| 201 |
+ category TEXT |
|
| 202 |
+); |
|
| 203 |
+ |
|
| 204 |
+CREATE TABLE observation_type_runs ( |
|
| 205 |
+ id INTEGER PRIMARY KEY, |
|
| 206 |
+ observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 207 |
+ sample_type_id INTEGER NOT NULL REFERENCES sample_types(id), |
|
| 208 |
+ status TEXT NOT NULL, |
|
| 209 |
+ started_at REAL, |
|
| 210 |
+ ended_at REAL, |
|
| 211 |
+ anchor_before BLOB, |
|
| 212 |
+ anchor_after BLOB, |
|
| 213 |
+ inserted_event_count INTEGER NOT NULL DEFAULT 0, |
|
| 214 |
+ deleted_event_count INTEGER NOT NULL DEFAULT 0, |
|
| 215 |
+ verified_visible_count INTEGER, |
|
| 216 |
+ error_kind TEXT, |
|
| 217 |
+ error_message_hash TEXT, |
|
| 218 |
+ UNIQUE(observation_id, sample_type_id) |
|
| 219 |
+); |
|
| 220 |
+ |
|
| 221 |
+CREATE INDEX idx_type_runs_type_observation |
|
| 222 |
+ON observation_type_runs(sample_type_id, observation_id); |
|
| 223 |
+``` |
|
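A sketch of how `anchor_before` and `anchor_after` might chain across runs, so that each capture resumes from the last successful anchor for its type. The status values `'complete'` and `'failed'`, the helper names, and the reduced column set are assumptions. In the app, the blobs would presumably hold serialized HealthKit query anchors; here they are synthetic bytes.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation_type_runs (
    id INTEGER PRIMARY KEY,
    observation_id INTEGER NOT NULL,
    sample_type_id INTEGER NOT NULL,
    status TEXT NOT NULL,
    anchor_before BLOB,
    anchor_after BLOB,
    UNIQUE(observation_id, sample_type_id)
);
""")


def latest_anchor(conn: sqlite3.Connection, sample_type_id: int) -> Optional[bytes]:
    """Anchor left behind by the most recent successful run for this type."""
    row = conn.execute(
        "SELECT anchor_after FROM observation_type_runs "
        "WHERE sample_type_id = ? AND status = 'complete' "
        "AND anchor_after IS NOT NULL "
        "ORDER BY observation_id DESC LIMIT 1",
        (sample_type_id,),
    ).fetchone()
    return row[0] if row else None


def record_run(conn: sqlite3.Connection, observation_id: int, sample_type_id: int,
               status: str, anchor_after: Optional[bytes]) -> Optional[bytes]:
    """Store a run and return the anchor it started from."""
    anchor_before = latest_anchor(conn, sample_type_id)
    conn.execute(
        "INSERT INTO observation_type_runs "
        "(observation_id, sample_type_id, status, anchor_before, anchor_after) "
        "VALUES (?, ?, ?, ?, ?)",
        (observation_id, sample_type_id, status, anchor_before, anchor_after),
    )
    return anchor_before


first = record_run(conn, 1, 7, "complete", b"anchor-1")
second = record_run(conn, 2, 7, "failed", None)      # failed run leaves no anchor
third = record_run(conn, 3, 7, "complete", b"anchor-3")
```

The failed run is skipped when looking up the starting anchor, so the third run resumes from `anchor-1` and no changes between runs are lost.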
| 224 |
+ |
|
| 225 |
+### 5.4 Sources, Devices, Metadata |
|
| 226 |
+ |
|
| 227 |
+```sql |
|
| 228 |
+CREATE TABLE sources ( |
|
| 229 |
+ id INTEGER PRIMARY KEY, |
|
| 230 |
+ source_name_hash TEXT, |
|
| 231 |
+ bundle_identifier TEXT |
|
| 232 |
+); |
|
| 233 |
+ |
|
| 234 |
+CREATE TABLE source_revisions ( |
|
| 235 |
+ id INTEGER PRIMARY KEY, |
|
| 236 |
+ source_id INTEGER NOT NULL REFERENCES sources(id), |
|
| 237 |
+ product_type TEXT, |
|
| 238 |
+ version TEXT, |
|
| 239 |
+ operating_system_version TEXT, |
|
| 240 |
+ UNIQUE(source_id, product_type, version, operating_system_version) |
|
| 241 |
+); |
|
| 242 |
+ |
|
| 243 |
+CREATE TABLE hk_devices ( |
|
| 244 |
+ id INTEGER PRIMARY KEY, |
|
| 245 |
+ device_hash TEXT, |
|
| 246 |
+ manufacturer_hash TEXT, |
|
| 247 |
+ model TEXT, |
|
| 248 |
+ hardware_version TEXT, |
|
| 249 |
+ firmware_version TEXT, |
|
| 250 |
+ software_version TEXT, |
|
| 251 |
+ local_identifier_hash TEXT, |
|
| 252 |
+ udi_hash TEXT |
|
| 253 |
+); |
|
| 254 |
+ |
|
| 255 |
+CREATE TABLE metadata_blobs ( |
|
| 256 |
+ id INTEGER PRIMARY KEY, |
|
| 257 |
+ metadata_hash TEXT NOT NULL UNIQUE, |
|
| 258 |
+ metadata_json TEXT NOT NULL |
|
| 259 |
+); |
|
| 260 |
+``` |
|
| 261 |
+ |
|
| 262 |
+Privacy note: raw personal/device identifiers should be hashed or omitted according to policy. Store enough provenance for local analysis and recovery-compatible exports without leaking identifiers into logs or repository fixtures. |
|
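One way to fill the `*_hash` columns is a keyed hash with a per-install secret, sketched below. HMAC-SHA256 and the idea of a per-install key are assumptions for illustration; the design does not specify the hashing policy.

```python
import hashlib
import hmac
from typing import Optional


def hash_identifier(secret_key: bytes, raw: Optional[str]) -> Optional[str]:
    """Keyed hash of an identifier; None stays None (field omitted)."""
    if raw is None:
        return None
    return hmac.new(secret_key, raw.encode("utf-8"), hashlib.sha256).hexdigest()


# Synthetic key and identifier. A real key would be generated per install
# and kept out of logs, exports, and repository fixtures.
install_key = b"synthetic-install-key"
h_first = hash_identifier(install_key, "synthetic-device-id")
h_again = hash_identifier(install_key, "synthetic-device-id")
h_other_key = hash_identifier(b"another-synthetic-key", "synthetic-device-id")
```

The trade-off is that keyed hashes are stable within one archive but cannot be matched across installs or against external datasets unless the key is exported with them, so fields intended for external matching may need a different policy.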
| 263 |
+ |
|
| 264 |
+### 5.5 Samples And Payload Versions |
|
| 265 |
+ |
|
| 266 |
+```sql |
|
| 267 |
+CREATE TABLE samples ( |
|
| 268 |
+ id INTEGER PRIMARY KEY, |
|
| 269 |
+ sample_type_id INTEGER NOT NULL REFERENCES sample_types(id), |
|
| 270 |
+ sample_uuid_hash TEXT, |
|
| 271 |
+ strict_fingerprint TEXT NOT NULL, |
|
| 272 |
+ semantic_fingerprint TEXT, |
|
| 273 |
+ fuzzy_key TEXT, |
|
| 274 |
+ first_seen_observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 275 |
+ first_seen_at REAL NOT NULL, |
|
| 276 |
+ UNIQUE(sample_type_id, strict_fingerprint) |
|
| 277 |
+); |
|
| 278 |
+ |
|
| 279 |
+CREATE INDEX idx_samples_uuid_hash |
|
| 280 |
+ON samples(sample_uuid_hash); |
|
| 281 |
+ |
|
| 282 |
+CREATE INDEX idx_samples_type_semantic |
|
| 283 |
+ON samples(sample_type_id, semantic_fingerprint); |
|
| 284 |
+ |
|
| 285 |
+CREATE TABLE sample_versions ( |
|
| 286 |
+ id INTEGER PRIMARY KEY, |
|
| 287 |
+ sample_id INTEGER NOT NULL REFERENCES samples(id), |
|
| 288 |
+ payload_hash TEXT NOT NULL, |
|
| 289 |
+ start_date REAL NOT NULL, |
|
| 290 |
+ end_date REAL NOT NULL, |
|
| 291 |
+ value_kind TEXT, |
|
| 292 |
+ numeric_value REAL, |
|
| 293 |
+ unit TEXT, |
|
| 294 |
+ category_value INTEGER, |
|
| 295 |
+ workout_activity_type INTEGER, |
|
| 296 |
+ duration_seconds REAL, |
|
| 297 |
+ source_revision_id INTEGER REFERENCES source_revisions(id), |
|
| 298 |
+ hk_device_id INTEGER REFERENCES hk_devices(id), |
|
| 299 |
+ metadata_id INTEGER REFERENCES metadata_blobs(id), |
|
| 300 |
+ created_observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 301 |
+ UNIQUE(sample_id, payload_hash) |
|
| 302 |
+); |
|
| 303 |
+ |
|
| 304 |
+CREATE INDEX idx_sample_versions_sample |
|
| 305 |
+ON sample_versions(sample_id); |
|
| 306 |
+ |
|
| 307 |
+CREATE INDEX idx_sample_versions_time |
|
| 308 |
+ON sample_versions(start_date, end_date); |
|
| 309 |
+``` |
|
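The two `UNIQUE` constraints above are what keep storage differential: an identity row is written once, and a version row only when the payload hash changes. A sketch with a reduced column set follows. `record_sample` is a hypothetical helper, and the fingerprint and payload hash are treated as opaque strings supplied by the caller.

```python
import sqlite3
from typing import Tuple

SCHEMA = """
CREATE TABLE samples (
    id INTEGER PRIMARY KEY,
    sample_type_id INTEGER NOT NULL,
    strict_fingerprint TEXT NOT NULL,
    first_seen_observation_id INTEGER NOT NULL,
    first_seen_at REAL NOT NULL,
    UNIQUE(sample_type_id, strict_fingerprint)
);
CREATE TABLE sample_versions (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES samples(id),
    payload_hash TEXT NOT NULL,
    numeric_value REAL,
    unit TEXT,
    created_observation_id INTEGER NOT NULL,
    UNIQUE(sample_id, payload_hash)
);
"""


def record_sample(conn: sqlite3.Connection, observation_id: int, observed_at: float,
                  sample_type_id: int, strict_fingerprint: str, payload_hash: str,
                  numeric_value: float, unit: str) -> Tuple[int, bool]:
    """Return (sample_id, created_new_version)."""
    # Identity row: inserted on first sighting, ignored afterwards.
    conn.execute(
        "INSERT OR IGNORE INTO samples (sample_type_id, strict_fingerprint, "
        "first_seen_observation_id, first_seen_at) VALUES (?, ?, ?, ?)",
        (sample_type_id, strict_fingerprint, observation_id, observed_at),
    )
    sample_id = conn.execute(
        "SELECT id FROM samples WHERE sample_type_id = ? AND strict_fingerprint = ?",
        (sample_type_id, strict_fingerprint),
    ).fetchone()[0]
    # Version row: only when this payload has not been seen for the sample.
    known = conn.execute(
        "SELECT 1 FROM sample_versions WHERE sample_id = ? AND payload_hash = ?",
        (sample_id, payload_hash),
    ).fetchone()
    if known:
        return sample_id, False
    conn.execute(
        "INSERT INTO sample_versions (sample_id, payload_hash, numeric_value, "
        "unit, created_observation_id) VALUES (?, ?, ?, ?, ?)",
        (sample_id, payload_hash, numeric_value, unit, observation_id),
    )
    return sample_id, True


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
r1 = record_sample(conn, 1, 1000.0, 7, "fp-synthetic", "payload-a", 120.0, "count")
r2 = record_sample(conn, 2, 2000.0, 7, "fp-synthetic", "payload-a", 120.0, "count")
r3 = record_sample(conn, 3, 3000.0, 7, "fp-synthetic", "payload-b", 118.0, "count")
version_count = conn.execute("SELECT COUNT(*) FROM sample_versions").fetchone()[0]
```

Three sightings of the same sample produce one identity row and two version rows; the unchanged second sighting adds nothing to these tables and would only be reflected in the event/range tables.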
| 310 |
+ |
|
| 311 |
+### 5.6 Observation Events And Visibility Ranges |
|
| 312 |
+ |
|
| 313 |
+```sql |
|
| 314 |
+CREATE TABLE sample_observation_events ( |
|
| 315 |
+ id INTEGER PRIMARY KEY, |
|
| 316 |
+ observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 317 |
+ sample_id INTEGER NOT NULL REFERENCES samples(id), |
|
| 318 |
+ version_id INTEGER REFERENCES sample_versions(id), |
|
| 319 |
+ event_kind TEXT NOT NULL, |
|
| 320 |
+ observed_at REAL NOT NULL, |
|
| 321 |
+ evidence_kind TEXT, |
|
| 322 |
+ UNIQUE(observation_id, sample_id, event_kind) |
|
| 323 |
+); |
|
| 324 |
+ |
|
| 325 |
+CREATE INDEX idx_events_observation_kind |
|
| 326 |
+ON sample_observation_events(observation_id, event_kind); |
|
| 327 |
+ |
|
| 328 |
+CREATE INDEX idx_events_sample |
|
| 329 |
+ON sample_observation_events(sample_id, observation_id); |
|
| 330 |
+ |
|
| 331 |
+CREATE TABLE sample_visibility_ranges ( |
|
| 332 |
+ sample_id INTEGER NOT NULL REFERENCES samples(id), |
|
| 333 |
+ version_id INTEGER REFERENCES sample_versions(id), |
|
| 334 |
+ first_observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 335 |
+ last_observation_id INTEGER REFERENCES observations(id), |
|
| 336 |
+ first_seen_at REAL NOT NULL, |
|
| 337 |
+ last_seen_at REAL, |
|
| 338 |
+ PRIMARY KEY (sample_id, version_id, first_observation_id) |
|
| 339 |
+); |
|
| 340 |
+ |
|
| 341 |
+CREATE INDEX idx_visibility_open_ranges |
|
| 342 |
+ON sample_visibility_ranges(last_observation_id); |
|
| 343 |
+ |
|
| 344 |
+CREATE INDEX idx_visibility_point_lookup |
|
| 345 |
+ON sample_visibility_ranges(first_observation_id, last_observation_id); |
|
| 346 |
+``` |
|
| 347 |
+ |
|
| 348 |
+Range convention: |
|
| 349 |
+- `last_observation_id IS NULL` means still visible at the latest verified observation for that type; |
|
| 350 |
+- a closed range (non-NULL `last_observation_id`) covers the observations, from `first_observation_id` through `last_observation_id`, at which the sample/version was visible; |
|
| 351 |
+- deleted-object evidence should create an event even when full payload is not available. |
|
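Under the range convention above, point-in-time reconstruction and diffs reduce to range predicates. The sketch below assumes that observation IDs increase monotonically with time within one device chain, which the schema does not state explicitly, and that closed ranges are inclusive at both ends. Rows are synthetic.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample_visibility_ranges (
    sample_id INTEGER NOT NULL,
    version_id INTEGER,
    first_observation_id INTEGER NOT NULL,
    last_observation_id INTEGER,
    first_seen_at REAL NOT NULL,
    last_seen_at REAL,
    PRIMARY KEY (sample_id, version_id, first_observation_id)
);
INSERT INTO sample_visibility_ranges VALUES
    (1, 10, 1, NULL, 100.0, NULL),   -- visible since observation 1, still open
    (2, 20, 1, 2,    100.0, 200.0),  -- visible at observations 1..2 only
    (3, 30, 3, NULL, 300.0, NULL);   -- first appeared at observation 3
""")

VISIBLE_AT = """
SELECT sample_id, version_id
FROM sample_visibility_ranges
WHERE first_observation_id <= :obs
  AND (last_observation_id IS NULL OR last_observation_id >= :obs)
"""


def visible_at(conn: sqlite3.Connection, obs: int) -> List[Tuple[int, int]]:
    """Sample/version pairs visible at the given observation."""
    return conn.execute(VISIBLE_AT + " ORDER BY sample_id", {"obs": obs}).fetchall()


def appeared_between(conn: sqlite3.Connection, old_obs: int, new_obs: int) -> List[int]:
    """Sample IDs visible at new_obs but not at old_obs, computed in SQL."""
    sql = (
        "SELECT sample_id FROM (" + VISIBLE_AT.replace(":obs", ":new_obs") + ") "
        "EXCEPT "
        "SELECT sample_id FROM (" + VISIBLE_AT.replace(":obs", ":old_obs") + ") "
        "ORDER BY sample_id"
    )
    rows = conn.execute(sql, {"old_obs": old_obs, "new_obs": new_obs}).fetchall()
    return [row[0] for row in rows]


at_first = visible_at(conn, 1)
at_third = visible_at(conn, 3)
appeared = appeared_between(conn, 1, 3)
disappeared = appeared_between(conn, 3, 1)  # same query with arguments swapped
```

No per-observation copy of the visible set is stored: both the snapshot and the diff are derived from three range rows.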
| 352 |
+ |
|
| 353 |
+### 5.7 Relationships |
|
| 354 |
+ |
|
| 355 |
+```sql |
|
| 356 |
+CREATE TABLE sample_relationships ( |
|
| 357 |
+ id INTEGER PRIMARY KEY, |
|
| 358 |
+ observation_id INTEGER REFERENCES observations(id), |
|
| 359 |
+ source_sample_id INTEGER NOT NULL REFERENCES samples(id), |
|
| 360 |
+ target_sample_id INTEGER NOT NULL REFERENCES samples(id), |
|
| 361 |
+ relationship_kind TEXT NOT NULL, |
|
| 362 |
+ metadata_id INTEGER REFERENCES metadata_blobs(id), |
|
| 363 |
+ UNIQUE(observation_id, source_sample_id, target_sample_id, relationship_kind) |
|
| 364 |
+); |
|
| 365 |
+ |
|
| 366 |
+CREATE INDEX idx_relationship_source |
|
| 367 |
+ON sample_relationships(source_sample_id, relationship_kind); |
|
| 368 |
+ |
|
| 369 |
+CREATE INDEX idx_relationship_target |
|
| 370 |
+ON sample_relationships(target_sample_id, relationship_kind); |
|
| 371 |
+``` |
|
| 372 |
+ |
|
| 373 |
+Relationships are required for recovery-compatible archives. Even if iOS HealthKit exposes limited relationships, the schema must not prevent future preservation. |
|
| 374 |
+ |
|
| 375 |
+### 5.8 Materialized Aggregates |
|
| 376 |
+ |
|
| 377 |
+```sql |
|
| 378 |
+CREATE TABLE observation_type_summaries ( |
|
| 379 |
+ observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 380 |
+ sample_type_id INTEGER NOT NULL REFERENCES sample_types(id), |
|
| 381 |
+ visible_record_count INTEGER NOT NULL, |
|
| 382 |
+ appeared_count INTEGER NOT NULL DEFAULT 0, |
|
| 383 |
+ disappeared_count INTEGER NOT NULL DEFAULT 0, |
|
| 384 |
+ representation_changed_count INTEGER NOT NULL DEFAULT 0, |
|
| 385 |
+ earliest_start_date REAL, |
|
| 386 |
+ latest_end_date REAL, |
|
| 387 |
+ value_sum REAL, |
|
| 388 |
+ value_max REAL, |
|
| 389 |
+ aggregate_hash TEXT, |
|
| 390 |
+ PRIMARY KEY (observation_id, sample_type_id) |
|
| 391 |
+); |
|
| 392 |
+ |
|
| 393 |
+CREATE TABLE daily_type_aggregates ( |
|
| 394 |
+ observation_id INTEGER NOT NULL REFERENCES observations(id), |
|
| 395 |
+ sample_type_id INTEGER NOT NULL REFERENCES sample_types(id), |
|
| 396 |
+ bucket_start REAL NOT NULL, |
|
| 397 |
+ bucket_end REAL NOT NULL, |
|
| 398 |
+ visible_record_count INTEGER NOT NULL, |
|
| 399 |
+ value_sum REAL, |
|
| 400 |
+ value_max REAL, |
|
| 401 |
+ source_revision_id INTEGER, |
|
| 402 |
+ aggregate_hash TEXT, |
|
| 403 |
+ PRIMARY KEY (observation_id, sample_type_id, bucket_start, source_revision_id) |
|
| 404 |
+); |
|
| 405 |
+ |
|
| 406 |
+CREATE INDEX idx_daily_type_bucket |
|
| 407 |
+ON daily_type_aggregates(sample_type_id, bucket_start); |
|
| 408 |
+``` |
|
| 409 |
+ |
|
| 410 |
+Aggregates feed reports and the Core Data cache. They also matter for consolidation heuristics, because a count drop with a stable aggregate value may indicate a representation change rather than meaningful loss. |
|
| 411 |
+ |
|
| 412 |
+### 5.9 Exports And Manifests |
|
| 413 |
+ |
|
| 414 |
+```sql |
|
| 415 |
+CREATE TABLE export_manifests ( |
|
| 416 |
+ id INTEGER PRIMARY KEY, |
|
| 417 |
+ export_id TEXT NOT NULL UNIQUE, |
|
| 418 |
+ created_at REAL NOT NULL, |
|
| 419 |
+ export_kind TEXT NOT NULL, |
|
| 420 |
+ from_observation_id INTEGER REFERENCES observations(id), |
|
| 421 |
+ to_observation_id INTEGER REFERENCES observations(id), |
|
| 422 |
+ filter_json TEXT, |
|
| 423 |
+ manifest_hash TEXT NOT NULL, |
|
| 424 |
+ record_count INTEGER NOT NULL |
|
| 425 |
+); |
|
| 426 |
+ |
|
| 427 |
+CREATE TABLE export_items ( |
|
| 428 |
+ export_manifest_id INTEGER NOT NULL REFERENCES export_manifests(id), |
|
| 429 |
+ sample_id INTEGER NOT NULL REFERENCES samples(id), |
|
| 430 |
+ version_id INTEGER REFERENCES sample_versions(id), |
|
| 431 |
+ item_hash TEXT NOT NULL, |
|
| 432 |
+ PRIMARY KEY (export_manifest_id, sample_id, version_id) |
|
| 433 |
+); |
|
| 434 |
+``` |
|
| 435 |
+ |
|
| 436 |
+Exports should be reproducible from the archive when possible. Manifest hashes let external tools verify that a recovery-compatible export matches the archived evidence. |
|
| 437 |
+ |
|
| 438 |
+## 6. Archive V2 Implementation Decisions |
|
| 439 |
+ |
|
| 440 |
+These decisions close Milestone 1 for archive v2 unless a later dated entry changes them. |
|
| 441 |
+ |
|
| 442 |
+### 6.1 Timestamps |
|
| 443 |
+ |
|
| 444 |
+Archive timestamps are stored as Unix seconds in UTC using SQLite `REAL`. |
|
| 445 |
+ |
|
| 446 |
+Rules: |
|
| 447 |
+- write dates with `Date.timeIntervalSince1970`; |
|
| 448 |
+- read dates with `Date(timeIntervalSince1970:)`; |
|
| 449 |
+- never store local-time interpreted timestamps in archive date columns; |
|
| 450 |
+- canonical hash/export text uses ISO 8601 UTC with fractional seconds; |
|
| 451 |
+- aggregate bucket rows store UTC bucket boundaries plus the observation timezone context that produced them. |
|
| 452 |
+ |
|
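A minimal sketch of the timestamp rules above, written in Python purely for illustration (the app itself would use `Date.timeIntervalSince1970` and `Date(timeIntervalSince1970:)`). The helper names and the millisecond precision of the canonical text are assumptions, not part of the decision record.

```python
from datetime import datetime, timezone


def to_archive_seconds(moment: datetime) -> float:
    """Convert an aware datetime to Unix seconds (UTC) for a REAL column."""
    if moment.tzinfo is None:
        # A naive datetime would be interpreted as local time; refuse it.
        raise ValueError("archive timestamps require a timezone-aware datetime")
    return moment.timestamp()


def from_archive_seconds(seconds: float) -> datetime:
    """Read Unix seconds back as an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def canonical_utc_text(seconds: float) -> str:
    """ISO 8601 UTC text with fractional seconds (assumed: 3 digits)."""
    moment = from_archive_seconds(seconds)
    millis = moment.microsecond // 1000
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
```

Rejecting naive datetimes at the write boundary is one way to enforce the "never store local-time interpreted timestamps" rule mechanically.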
| 453 |
+### 6.2 Hashing And Privacy |
|
| 454 |
+ |
|
| 455 |
+Use two hash classes: |
|
| 456 |
+- **Integrity/content hashes:** plain `SHA-256`, lower-case hex. Use for payload hashes, metadata hashes, aggregate hashes, export item hashes, and manifest hashes. |
|
| 457 |
+- **Privacy-sensitive identifiers:** `HMAC-SHA256`, lower-case hex, using a locally stored archive secret. Use for HealthKit UUIDs, device identifiers, source names, local identifiers, UDI-like values, and device-chain identifiers. |
|
| 458 |
+ |
|
| 459 |
+All hash inputs must include a domain/version prefix such as `hp:v2:sample_uuid:` or `hp:v2:payload:`. Do not hash raw strings without a domain prefix. |
|
| 460 |
+ |
|
| 461 |
+The local archive secret is device-local application state. If the secret is lost, future captures may start a new device chain. Export files include already-computed hashes and manifest/item hashes, not the local archive secret. |
|
| 462 |
+ |
|
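The two hash classes can be sketched as follows. This is an illustration in Python using only the standard library; the function names are hypothetical, and byte-level concatenation of the domain prefix with the input is an assumption about how the prefix is applied.

```python
import hashlib
import hmac


def content_hash(domain: str, payload: bytes) -> str:
    """Integrity/content hash: plain SHA-256, lower-case hex.

    `domain` is a prefix such as "hp:v2:payload:".
    """
    return hashlib.sha256(domain.encode("utf-8") + payload).hexdigest()


def private_identifier_hash(archive_secret: bytes, domain: str, identifier: str) -> str:
    """Privacy-sensitive identifier hash: HMAC-SHA256, lower-case hex.

    `archive_secret` stands in for the locally stored archive secret;
    `domain` is a prefix such as "hp:v2:sample_uuid:".
    """
    message = (domain + identifier).encode("utf-8")
    return hmac.new(archive_secret, message, hashlib.sha256).hexdigest()
```

Because the domain prefix is part of the hashed message, the same raw identifier hashed under two domains produces unrelated digests, which prevents values from one column being matched against another.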
| 463 |
+### 6.3 Device Chain Identity |
|
| 464 |
+ |
|
| 465 |
+`device_chain_hash` identifies one local capture chain, not a globally unique device. |
|
| 466 |
+ |
|
| 467 |
+Initial implementation: |
|
| 468 |
+- create or recover a random chain seed from Keychain; |
|
| 469 |
+- compute `device_chain_hash = HMAC-SHA256(archiveSecret, "hp:v2:device_chain:" + chainSeed)`; |
|
| 470 |
+- set `recovered_from_keychain = 1` when the seed survived app reinstall and was reused; |
|
| 471 |
+- start a new chain if the seed is missing or explicitly reset. |
|
| 472 |
+ |
|
| 473 |
+HealthKit device metadata may be stored as hashed provenance in `hk_devices`, but it must not be used as a cross-device comparison key. |
|
| 474 |
+ |
|
| 475 |
+### 6.4 Fingerprints And Payload Versions |
|
| 476 |
+ |
|
| 477 |
+`sample_uuid_hash` is the preferred stable HealthKit identity when HealthKit exposes a UUID. |
|
| 478 |
+ |
|
| 479 |
+`strict_fingerprint` is a deterministic exact fallback/verification key built from canonical fields: |
|
| 480 |
+- type identifier; |
|
| 481 |
+- start/end timestamps in canonical UTC text; |
|
| 482 |
+- value kind and canonical value representation; |
|
| 483 |
+- unit/category/workout fields where applicable; |
|
| 484 |
+- source bundle identifier hash when available. |
|
| 485 |
+ |
|
| 486 |
+Do not include SQLite row ids in fingerprints. If a HealthKit UUID is available, a payload change creates a new `sample_versions` row for the same sample identity. If no HealthKit UUID is available, a change in the strict fingerprint may produce appeared/disappeared evidence, with `semantic_fingerprint` used only as weak consolidation context. |
|
| 487 |
+ |
|
| 488 |
+`payload_hash` is `SHA-256` over the canonical sample payload representation, including dates, value/unit/category/workout fields, source revision fields, device provenance hashes, metadata hash, and relationship payload when available. A new `sample_versions` row is created when `payload_hash` changes. |
|
| 489 |
+ |
|
| 490 |
+`semantic_fingerprint` is type-specific and optional. It supports consolidation heuristics and fuzzy backup/export reconciliation, but it is never sufficient by itself to prove record identity. |
|
| 491 |
+ |
|
| 492 |
+### 6.5 Timezone And Aggregate Buckets |
|
| 493 |
+ |
|
| 494 |
+Raw sample timestamps remain UTC. Daily/monthly aggregate buckets are computed using the device timezone active at observation time because user-facing health summaries are local-day concepts. |
|
| 495 |
+ |
|
| 496 |
+Rules: |
|
| 497 |
+- store `time_zone_identifier` and `time_zone_seconds_from_gmt` on `observations`; |
|
| 498 |
+- bucket boundaries are midnight-to-midnight in that observation timezone, stored as UTC seconds; |
|
| 499 |
+- old observations are not retroactively re-bucketed when the device timezone changes; |
|
| 500 |
+- exports include timezone metadata so external tools can reinterpret buckets if needed. |
|
| 501 |
+ |
|
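The bucket rule above can be sketched as below. This Python illustration uses the stored `time_zone_seconds_from_gmt` as a fixed offset, which is a simplifying assumption: it ignores DST transitions inside a day, so a real implementation would more likely derive boundaries from `time_zone_identifier` with a calendar API. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_day_bucket(sample_seconds: float, seconds_from_gmt: int) -> tuple:
    """Return (bucket_start, bucket_end) as UTC Unix seconds.

    The bucket is the midnight-to-midnight local day that contains the
    sample, using the observation's fixed offset from GMT.
    """
    observation_tz = timezone(timedelta(seconds=seconds_from_gmt))
    local = datetime.fromtimestamp(sample_seconds, tz=observation_tz)
    start = local.replace(hour=0, minute=0, second=0, microsecond=0)
    end = start + timedelta(days=1)
    return start.timestamp(), end.timestamp()
```

A sample at 23:30 UTC falls into the next local day for an observation recorded at UTC+2, which is why the offset has to be stored with the observation rather than inferred later.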
| 502 |
+### 6.6 Visibility Range Maintenance |
|
| 503 |
+ |
|
| 504 |
+Maintain `sample_visibility_ranges` eagerly in the same transaction that writes observation events. Point-in-time queries should read ranges, not rebuild them from events on every request. |
|
| 505 |
+ |
|
| 506 |
+Integrity tools may rebuild ranges from `sample_observation_events` into temporary tables and compare the result with the stored ranges. This validation rebuild is for tests and repair checks, not the normal query path. |
|
| 507 |
+ |
|
| 508 |
+### 6.7 Relationship Preservation |
|
| 509 |
+ |
|
| 510 |
+Store every relationship the capture/import surface can observe. Relationship rows are append-only evidence for what was known at an observation; do not rewrite `relationship_kind` to encode later disappearance. |
|
| 511 |
+ |
|
| 512 |
+If a related sample disappears, endpoint visibility ranges and events explain that disappearance. Relationship exports should include relationships when both endpoints are included, and may include unresolved endpoint hashes when allowed by the export scope. |
|
| 513 |
+ |
|
| 514 |
+### 6.8 Aggregates |
|
| 515 |
+ |
|
| 516 |
+Aggregates are computed in SQLite after each successful observation/type run and materialized in `observation_type_summaries` and `daily_type_aggregates`. |
|
| 517 |
+ |
|
| 518 |
+SQL-first does not mean recomputing every count live for the UI. It means heavy computation is done by SQLite using indexes, joins, CTEs, temporary tables, and paged result sets. Repeated UI/report reads consume materialized SQLite aggregates and the Core Data cache. |
|
| 519 |
+ |
|
| 520 |
+### 6.9 Export Manifest Canonicalization |
|
| 521 |
+ |
|
| 522 |
+Structured exports use a versioned canonical envelope with deterministic ordering. |
|
| 523 |
+ |
|
| 524 |
+Rules: |
|
| 525 |
+- export format version starts at `1`; |
|
| 526 |
+- JSON object keys are sorted for canonical bytes; |
|
| 527 |
+- record/item order is deterministic: sample type, start date, end date, sample identity hash, version hash; |
|
| 528 |
+- each exported item has an `item_hash = SHA-256(canonical item JSON)`; |
|
| 529 |
+- `manifest_hash = SHA-256(canonical export metadata + ordered item_hash list + counts + filter description)`; |
|
| 530 |
+- large exports compute hashes incrementally while streaming/paging rows from SQLite. |
|
| 531 |
+ |
|
| 532 |
+Manifest hashes must cover exported content through item hashes, not only counts or first/last dates. |
|
| 533 |
+ |
|
| 534 |
+## 7. Write Path |
|
| 535 |
+ |
|
| 536 |
+### 7.1 Capture Transaction Shape |
|
| 537 |
+ |
|
| 538 |
+For each observation/type run: |
|
| 539 |
+ |
|
| 540 |
+1. Open SQLite transaction. |
|
| 541 |
+2. Insert/update `observations` and `observation_type_runs`. |
|
| 542 |
+3. For each added/visible sample page: |
|
| 543 |
+ - upsert source/source revision/device/metadata; |
|
| 544 |
+ - upsert `samples`; |
|
| 545 |
+ - upsert `sample_versions`; |
|
| 546 |
+ - insert observation event; |
|
| 547 |
+ - update visibility ranges. |
|
| 548 |
+4. For each `HKDeletedObject`: |
|
| 549 |
+ - find sample by UUID hash; |
|
| 550 |
+ - insert deleted/disappeared event; |
|
| 551 |
+ - close open visibility ranges. |
|
| 552 |
+5. Recompute affected materialized aggregates. |
|
| 553 |
+6. Commit SQLite. |
|
| 554 |
+7. Update/rebuild Core Data cache after SQLite commit. |
|
| 555 |
+ |
|
| 556 |
+SQLite commit must happen before Core Data cache update. Cache rebuild failures must not corrupt archive truth. |
|
| 557 |
+ |
|
| 558 |
+### 7.2 Idempotency |
|
| 559 |
+ |
|
| 560 |
+Capture pages may be retried. Writes must be idempotent using uniqueness constraints: |
|
| 561 |
+- `(sample_type_id, strict_fingerprint)` for samples; |
|
| 562 |
+- `(sample_id, payload_hash)` for versions; |
|
| 563 |
+- `(observation_id, sample_id, event_kind)` for events; |
|
| 564 |
+- range primary key for visibility. |
|
| 565 |
+ |
|
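A retry-safe write can be sketched with `INSERT OR IGNORE` against the uniqueness constraints listed above. The Python below is an illustration only: the tables are trimmed copies of the schema (no foreign keys), and the helper names are hypothetical. It also demonstrates one caveat that is a documented SQLite behavior: NULLs are distinct in UNIQUE and non-integer PRIMARY KEY constraints, so a range row with `version_id IS NULL` is not deduplicated by the range primary key.

```python
import sqlite3


def open_archive() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE sample_observation_events (
            id INTEGER PRIMARY KEY,
            observation_id INTEGER NOT NULL,
            sample_id INTEGER NOT NULL,
            version_id INTEGER,
            event_kind TEXT NOT NULL,
            observed_at REAL NOT NULL,
            UNIQUE(observation_id, sample_id, event_kind)
        )""")
    conn.execute("""
        CREATE TABLE sample_visibility_ranges (
            sample_id INTEGER NOT NULL,
            version_id INTEGER,
            first_observation_id INTEGER NOT NULL,
            last_observation_id INTEGER,
            PRIMARY KEY (sample_id, version_id, first_observation_id)
        )""")
    return conn


def write_event_page(conn, observation_id, observed_at, page):
    """Write one capture page; `page` holds (sample_id, version_id, event_kind)."""
    with conn:  # one transaction per page; a retried page inserts nothing new
        conn.executemany(
            "INSERT OR IGNORE INTO sample_observation_events "
            "(observation_id, sample_id, version_id, event_kind, observed_at) "
            "VALUES (?, ?, ?, ?, ?)",
            [(observation_id, s, v, k, observed_at) for s, v, k in page],
        )


def insert_range(conn, sample_id, version_id, observation_id):
    """Open a visibility range; idempotent only when version_id is not NULL."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO sample_visibility_ranges "
            "(sample_id, version_id, first_observation_id) VALUES (?, ?, ?)",
            (sample_id, version_id, observation_id),
        )


def row_count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

If ranges without a version are expected, one option is to declare `version_id` as `NOT NULL` with a sentinel value, or to add an explicit existence check before the insert; which of these fits the archive is a design decision this sketch does not settle.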
| 566 |
+### 7.3 Anchor Handling |
|
| 567 |
+ |
|
| 568 |
+HealthKit anchors are capture implementation state. Store them per type run, but do not treat anchors as forensic truth. If an anchor is unusable, the archive must still support rebuilding the current visible state from a full type scan. |
|
| 569 |
+ |
|
| 570 |
+## 8. Point-In-Time Reconstruction |
|
| 571 |
+ |
|
| 572 |
+Point-in-time reconstruction should use ranges, not full snapshot tables. |
|
| 573 |
+ |
|
| 574 |
+Conceptual query: |
|
| 575 |
+ |
|
| 576 |
+```sql |
|
| 577 |
+SELECT s.id AS sample_id, sv.id AS version_id, sv.start_date, sv.end_date, |
|
| 578 |
+ sv.value_kind, sv.numeric_value, sv.unit |
|
| 579 |
+FROM sample_visibility_ranges r |
|
| 580 |
+JOIN samples s ON s.id = r.sample_id |
|
| 581 |
+JOIN sample_versions sv ON sv.id = r.version_id |
|
| 582 |
+JOIN observations target ON target.id = :observation_id |
|
| 583 |
+WHERE s.sample_type_id = :sample_type_id |
|
| 584 |
+ AND r.first_observation_id <= target.id |
|
| 585 |
+ AND (r.last_observation_id IS NULL OR r.last_observation_id >= target.id) |
|
| 586 |
+ORDER BY sv.start_date, s.strict_fingerprint |
|
| 587 |
+LIMIT :limit OFFSET :offset; |
|
| 588 |
+``` |
|
| 589 |
+ |
|
| 590 |
+Implementation may optimize this with temporary tables or materialized visible sets for selected observations. |
|
| 591 |
+ |
|
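The range predicate in the conceptual query can be exercised end to end with a small in-memory archive. This Python sketch is illustrative: the table is a trimmed copy of `sample_visibility_ranges`, the helper names are hypothetical, and it assumes that closing a range records the last observation at which the sample was still visible (which is what the inclusive `>=` comparison in the query implies).

```python
import sqlite3


def make_archive() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE sample_visibility_ranges (
            sample_id INTEGER NOT NULL,
            version_id INTEGER NOT NULL,
            first_observation_id INTEGER NOT NULL,
            last_observation_id INTEGER,
            PRIMARY KEY (sample_id, version_id, first_observation_id)
        )""")
    return conn


def open_range(conn, sample_id, version_id, observation_id):
    """Record that a sample/version became visible at this observation."""
    conn.execute(
        "INSERT OR IGNORE INTO sample_visibility_ranges "
        "VALUES (?, ?, ?, NULL)",
        (sample_id, version_id, observation_id),
    )


def close_ranges(conn, sample_id, last_visible_observation_id):
    """Close every open range of a sample at its last visible observation."""
    conn.execute(
        "UPDATE sample_visibility_ranges SET last_observation_id = ? "
        "WHERE sample_id = ? AND last_observation_id IS NULL",
        (last_visible_observation_id, sample_id),
    )


def visible_at(conn, observation_id):
    """Point-in-time read: (sample_id, version_id) pairs visible at an observation."""
    return conn.execute(
        "SELECT sample_id, version_id FROM sample_visibility_ranges "
        "WHERE first_observation_id <= :obs "
        "  AND (last_observation_id IS NULL OR last_observation_id >= :obs) "
        "ORDER BY sample_id, version_id",
        {"obs": observation_id},
    ).fetchall()
```

For example, a sample opened at observation 1 and closed with last visible observation 2 is returned for observations 1 and 2 but not for observation 3.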
| 592 |
+## 9. Diff Between Observations |
|
| 593 |
+ |
|
| 594 |
+Diffs must run in SQLite. |
|
| 595 |
+ |
|
| 596 |
+```sql |
|
| 597 |
+CREATE TEMP TABLE prev_visible AS |
|
| 598 |
+SELECT r.sample_id, r.version_id |
|
| 599 |
+FROM sample_visibility_ranges r |
|
| 600 |
+WHERE r.first_observation_id <= :previous |
|
| 601 |
+ AND (r.last_observation_id IS NULL OR r.last_observation_id >= :previous); |
|
| 602 |
+ |
|
| 603 |
+CREATE TEMP TABLE curr_visible AS |
|
| 604 |
+SELECT r.sample_id, r.version_id |
|
| 605 |
+FROM sample_visibility_ranges r |
|
| 606 |
+WHERE r.first_observation_id <= :current |
|
| 607 |
+ AND (r.last_observation_id IS NULL OR r.last_observation_id >= :current); |
|
| 608 |
+ |
|
| 609 |
+CREATE INDEX temp_prev_sample ON prev_visible(sample_id); |
|
| 610 |
+CREATE INDEX temp_curr_sample ON curr_visible(sample_id); |
|
| 611 |
+ |
|
| 612 |
+-- Disappeared. |
|
| 613 |
+SELECT p.sample_id |
|
| 614 |
+FROM prev_visible p |
|
| 615 |
+LEFT JOIN curr_visible c ON c.sample_id = p.sample_id |
|
| 616 |
+WHERE c.sample_id IS NULL; |
|
| 617 |
+ |
|
| 618 |
+-- Appeared. |
|
| 619 |
+SELECT c.sample_id |
|
| 620 |
+FROM curr_visible c |
|
| 621 |
+LEFT JOIN prev_visible p ON p.sample_id = c.sample_id |
|
| 622 |
+WHERE p.sample_id IS NULL; |
|
| 623 |
+ |
|
| 624 |
+-- Representation changed. |
|
| 625 |
+SELECT c.sample_id, p.version_id AS before_version_id, c.version_id AS after_version_id |
|
| 626 |
+FROM curr_visible c |
|
| 627 |
+JOIN prev_visible p ON p.sample_id = c.sample_id |
|
| 628 |
+WHERE p.version_id IS NOT c.version_id; -- null-safe comparison: version_id is nullable |
|
| 629 |
+``` |
|
| 630 |
+ |
|
| 631 |
+Result sets must be paged. Counts can be materialized into `observation_type_summaries` and Core Data cache. |
|
| 632 |
+ |
|
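The diff pattern above can be run against a small in-memory table to check the three result sets. This Python sketch is illustrative and the function names are hypothetical. It fills the temporary tables with `INSERT ... SELECT` and compares versions with the null-safe `IS NOT` operator because `version_id` is nullable in the schema.

```python
import sqlite3

FILL_VISIBLE = """
INSERT INTO {table} (sample_id, version_id)
SELECT sample_id, version_id FROM sample_visibility_ranges
WHERE first_observation_id <= :obs
  AND (last_observation_id IS NULL OR last_observation_id >= :obs)
"""


def create_ranges_table(conn):
    conn.execute("""
        CREATE TABLE sample_visibility_ranges (
            sample_id INTEGER NOT NULL,
            version_id INTEGER,
            first_observation_id INTEGER NOT NULL,
            last_observation_id INTEGER
        )""")


def diff_observations(conn, previous, current):
    """Return disappeared/appeared sample ids and version changes between two observations."""
    for table, obs in (("prev_visible", previous), ("curr_visible", current)):
        conn.execute(f"DROP TABLE IF EXISTS temp.{table}")
        conn.execute(f"CREATE TEMP TABLE {table} (sample_id INTEGER, version_id INTEGER)")
        conn.execute(FILL_VISIBLE.format(table=table), {"obs": obs})
        conn.execute(f"CREATE INDEX idx_{table}_sample ON {table}(sample_id)")
    disappeared = [r[0] for r in conn.execute(
        "SELECT p.sample_id FROM prev_visible p "
        "LEFT JOIN curr_visible c ON c.sample_id = p.sample_id "
        "WHERE c.sample_id IS NULL ORDER BY p.sample_id")]
    appeared = [r[0] for r in conn.execute(
        "SELECT c.sample_id FROM curr_visible c "
        "LEFT JOIN prev_visible p ON p.sample_id = c.sample_id "
        "WHERE p.sample_id IS NULL ORDER BY c.sample_id")]
    changed = conn.execute(
        "SELECT c.sample_id, p.version_id, c.version_id FROM curr_visible c "
        "JOIN prev_visible p ON p.sample_id = c.sample_id "
        "WHERE p.version_id IS NOT c.version_id ORDER BY c.sample_id").fetchall()
    return {"disappeared": disappeared, "appeared": appeared, "changed": changed}
```

A production version would additionally filter by sample type and page the result sets instead of collecting them into lists; the lists here only keep the sketch short.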
| 633 |
+## 10. Consolidation Heuristics |
|
| 634 |
+ |
|
| 635 |
+Consolidation is likely when: |
|
| 636 |
+- many old high-frequency records disappear; |
|
| 637 |
+- newer records cover similar date windows; |
|
| 638 |
+- aggregate sums remain within tolerance; |
|
| 639 |
+- sample density decreases while duration/interval length increases; |
|
| 640 |
+- source/provenance context is compatible. |
|
| 641 |
+ |
|
| 642 |
+Required evidence: |
|
| 643 |
+- record-count delta; |
|
| 644 |
+- value-sum delta; |
|
| 645 |
+- coverage-window overlap; |
|
| 646 |
+- interval-length/density comparison; |
|
| 647 |
+- source/source-revision breakdown; |
|
| 648 |
+- uncertainty label if evidence is incomplete. |
|
| 649 |
+ |
|
| 650 |
+Never classify count drops alone as loss. |
|
| 651 |
+ |
|
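One way to keep the "never classify count drops alone as loss" rule enforceable is to make the classifier require the full evidence set before it returns anything stronger than an uncertainty label. The Python sketch below is an assumption-heavy illustration: the field names, the label strings, and both thresholds are hypothetical and would need tuning per sample type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TypeDiffEvidence:
    record_count_before: int
    record_count_after: int
    value_sum_before: Optional[float]
    value_sum_after: Optional[float]
    coverage_overlap: Optional[float]      # fraction of the old date window still covered, 0..1
    mean_interval_before: Optional[float]  # mean sample duration in seconds
    mean_interval_after: Optional[float]
    sources_compatible: Optional[bool]


def classify_count_drop(e: TypeDiffEvidence,
                        sum_tolerance: float = 0.01,
                        min_overlap: float = 0.9) -> str:
    """Label a record-count change without ever inferring loss from counts alone."""
    if e.record_count_after >= e.record_count_before:
        return "no_count_drop"
    required = (e.value_sum_before, e.value_sum_after, e.coverage_overlap,
                e.mean_interval_before, e.mean_interval_after, e.sources_compatible)
    if any(value is None for value in required):
        return "uncertain"  # incomplete evidence gets an uncertainty label
    scale = max(abs(e.value_sum_before), 1e-9)
    sums_stable = abs(e.value_sum_after - e.value_sum_before) / scale <= sum_tolerance
    coarser = e.mean_interval_after > e.mean_interval_before
    if sums_stable and e.coverage_overlap >= min_overlap and coarser and e.sources_compatible:
        return "likely_consolidation"
    return "unexplained_change"
```

Note that none of the labels is "loss": a count drop with conflicting evidence is reported as unexplained, leaving the interpretation to record-level review.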
| 652 |
+## 11. Core Data Cache Contract |
|
| 653 |
+ |
|
| 654 |
+Core Data entities should mirror presentation needs, not archive internals. |
|
| 655 |
+ |
|
| 656 |
+Core Data may contain two categories: |
|
| 657 |
+- rebuildable UI/report cache derived from SQLite; |
|
| 658 |
+- non-forensic local app state/settings. |
|
| 659 |
+ |
|
| 660 |
+Deleting rebuildable cache rows must not delete the SQLite archive. User settings may be preserved across cache rebuilds. |
|
| 661 |
+ |
|
| 662 |
+Candidate cache entities: |
|
| 663 |
+- `CachedObservationRow`; |
|
| 664 |
+- `CachedTypeSummary`; |
|
| 665 |
+- `CachedDailyAggregate`; |
|
| 666 |
+- `CachedDiffSummary`; |
|
| 667 |
+- `CachedExportManifest`; |
|
| 668 |
+- `CachedArchiveHealth`; |
|
| 669 |
+- `AppSetting`. |
|
| 670 |
+ |
|
| 671 |
+Every cache row should include: |
|
| 672 |
+- archive schema version; |
|
| 673 |
+- cache schema version; |
|
| 674 |
+- source observation id(s); |
|
| 675 |
+- source aggregate/hash where applicable; |
|
| 676 |
+- computed_at timestamp. |
|
| 677 |
+ |
|
| 678 |
+Invalidation rules: |
|
| 679 |
+- archive reset or future archive migration invalidates cache; |
|
| 680 |
+- selected type registry change invalidates affected summaries; |
|
| 681 |
+- aggregate rebuild invalidates corresponding Core Data rows; |
|
| 682 |
+- app can delete all cache rows and rebuild from SQLite. |
|
| 683 |
+ |
|
| 684 |
+## 12. Export Requirements |
|
| 685 |
+ |
|
| 686 |
+Exports are scoped and recovery-compatible. |
|
| 687 |
+ |
|
| 688 |
+Every structured export should include: |
|
| 689 |
+- export id; |
|
| 690 |
+- archive schema version; |
|
| 691 |
+- app version; |
|
| 692 |
+- observation id(s); |
|
| 693 |
+- selected type filters; |
|
| 694 |
+- record count; |
|
| 695 |
+- manifest hash; |
|
| 696 |
+- per-record sample identity/fingerprint; |
|
| 697 |
+- payload version hash; |
|
| 698 |
+- dates, values, units, category/workout fields; |
|
| 699 |
+- source/provenance metadata where available and allowed; |
|
| 700 |
+- relationships where available; |
|
| 701 |
+- provenance-loss warnings for external re-publication workflows. |
|
| 702 |
+ |
|
| 703 |
+Exports must stream/page from SQLite. Do not build large JSON/CSV exports entirely in RAM. |
|
| 704 |
+ |
|
| 705 |
+## 13. Reset And Future Migration Policy |
|
| 706 |
+ |
|
| 707 |
+Current status: HealthProbe has no real deployments, only test installations. The archive v2 refactor does not need backward compatibility with the old SwiftData/prototype SQLite schema. |
|
| 708 |
+ |
|
| 709 |
+For the current refactor: |
|
| 710 |
+- old SwiftData stores and prototype SQLite archives may be deleted, ignored, or reinitialized; |
|
| 711 |
+- no one-way migration from old prototype stores is required; |
|
| 712 |
+- test users/developers must expect prototype data loss when moving to archive v2; |
|
| 713 |
+- reset behavior must be documented in test/release notes and must not be presented as data preservation; |
|
| 714 |
+- Core Data cache stores remain disposable and rebuildable. |
|
| 715 |
+ |
|
| 716 |
+For future real archives: |
|
| 717 |
+- migrations must be versioned in `schema_migrations`; |
|
| 718 |
+- migrations must be tested on synthetic large archives; |
|
| 719 |
+- migration failure must not silently delete the archive; |
|
| 720 |
+- cache stores may be deleted and rebuilt; |
|
| 721 |
+- archive reset must require explicit user confirmation. |
|
| 722 |
+ |
|
| 723 |
+## 14. Integrity And Verification |
|
| 724 |
+ |
|
| 725 |
+Archive integrity checks: |
|
| 726 |
+- SQLite `PRAGMA integrity_check`; |
|
| 727 |
+- schema version check; |
|
| 728 |
+- missing FK/reference checks; |
|
| 729 |
+- aggregate rebuild spot checks; |
|
| 730 |
+- manifest hash verification; |
|
| 731 |
+- open visibility range sanity checks; |
|
| 732 |
+- duplicate identity checks; |
|
| 733 |
+- export reproducibility checks. |
|
| 734 |
+ |
|
| 735 |
+Use WAL mode for normal operation. Consider periodic checkpoints when safe. |
|
| 736 |
+ |
|
| 737 |
+## 15. Performance Rules |
|
| 738 |
+ |
|
| 739 |
+- Use prepared statements for bulk writes. |
|
| 740 |
+- Batch within transactions. |
|
| 741 |
+- Use indexes listed in the schema and add query-specific indexes only with evidence. |
|
| 742 |
+- Prefer integer primary keys internally. |
|
| 743 |
+- Store archive dates as Unix seconds UTC. |
|
| 744 |
+- Page record tables and exports. |
|
| 745 |
+- Use temporary tables for large selected-observation diffs. |
|
| 746 |
+- Do not decode large archived payloads into Swift collections. |
|
| 747 |
+- Profile on low-memory/legacy-class devices. |
|
| 748 |
+ |
|
| 749 |
+## 16. Testing Requirements |
|
| 750 |
+ |
|
| 751 |
+Unit/integration tests must cover: |
|
| 752 |
+- idempotent repeated capture page writes; |
|
| 753 |
+- first observation for a type; |
|
| 754 |
+- appeared/disappeared/representation-changed events; |
|
| 755 |
+- visibility range open/close behavior; |
|
| 756 |
+- point-in-time reconstruction; |
|
| 757 |
+- diff queries on large synthetic datasets; |
|
| 758 |
+- aggregate rebuild; |
|
| 759 |
+- Core Data cache rebuild after deletion; |
|
| 760 |
+- export manifest reproducibility; |
|
| 761 |
+- recovery-compatible export fields; |
|
| 762 |
+- prototype-store reset/reinitialization behavior for current test installs; |
|
| 763 |
+- future archive migrations once real archive versions exist; |
|
| 764 |
+- memory ceiling during export and diff. |
|
| 765 |
+ |
|
| 766 |
+No real HealthKit data in fixtures. |
|
| 767 |
+ |
|
| 768 |
+## 17. Deferred Design Questions |
|
| 769 |
+ |
|
| 770 |
+These do not block the archive v2 foundation, but they should be revisited before advanced import/export tooling: |
|
| 771 |
+ |
|
| 772 |
+1. Exact semantic/fuzzy fingerprint fields per HealthKit sample family. |
|
| 773 |
+2. Relationship extraction surface available from HealthKit vs backup/XML exports. |
|
| 774 |
+3. Optional user-controlled export profiles for more/less provenance disclosure. |
|
| 775 |
+4. Repair tooling for rebuilding visibility ranges after future real archive migrations. |
|
| 776 |
+ |
|
| 777 |
+Record any changed decision here with a date before implementing schema changes. |
|
@@ -0,0 +1,151 @@ |
||
| 1 |
+# HealthProbe - Export Specification |
|
| 2 |
+ |
|
| 3 |
+**Last Updated:** 2026-05-23 |
|
| 4 |
+**Status:** Target design for recovery-compatible exports |
|
| 5 |
+ |
|
| 6 |
+## 1. Purpose |
|
| 7 |
+ |
|
| 8 |
+Exports let the user preserve selected point-in-time views, diffs, record tables, and evidence summaries from the local SQLite archive. |
|
| 9 |
+ |
|
| 10 |
+HealthProbe does not restore, patch backups, or re-publish HealthKit data. Exports should still preserve enough structure for external recovery/salvage tools to reason about what was observed. |
|
| 11 |
+ |
|
| 12 |
+## 2. Export Kinds |
|
| 13 |
+ |
|
| 14 |
+Supported target kinds: |
|
| 15 |
+- `observation_records_json`; |
|
| 16 |
+- `observation_records_csv`; |
|
| 17 |
+- `observation_diff_json`; |
|
| 18 |
+- `type_summary_json`; |
|
| 19 |
+- `archive_manifest_json`. |
|
| 20 |
+ |
|
| 21 |
+Large exports must stream/page from SQLite. Do not materialize all records into Swift arrays. |
|
| 22 |
+ |
|
| 23 |
+## 3. JSON Envelope |
|
| 24 |
+ |
|
| 25 |
+Every JSON export uses a versioned envelope: |
|
| 26 |
+ |
|
| 27 |
+```json |
|
| 28 |
+{
|
|
| 29 |
+ "export_format_version": 1, |
|
| 30 |
+ "export_id": "UUID", |
|
| 31 |
+ "export_kind": "observation_records_json", |
|
| 32 |
+ "created_at": "2026-05-23T12:00:00.000Z", |
|
| 33 |
+ "app": {
|
|
| 34 |
+ "name": "HealthProbe", |
|
| 35 |
+ "version": "local-build" |
|
| 36 |
+ }, |
|
| 37 |
+ "archive": {
|
|
| 38 |
+ "schema_version": 2, |
|
| 39 |
+ "device_chain_hash": "hex", |
|
| 40 |
+ "from_observation_id": 1, |
|
| 41 |
+ "to_observation_id": null |
|
| 42 |
+ }, |
|
| 43 |
+ "filters": {
|
|
| 44 |
+ "sample_type_identifiers": [], |
|
| 45 |
+ "date_range": null, |
|
| 46 |
+ "include_relationships": true |
|
| 47 |
+ }, |
|
| 48 |
+ "manifest": {
|
|
| 49 |
+ "record_count": 0, |
|
| 50 |
+ "item_hash_algorithm": "sha256", |
|
| 51 |
+ "manifest_hash_algorithm": "sha256", |
|
| 52 |
+ "manifest_hash": "hex" |
|
| 53 |
+ }, |
|
| 54 |
+ "items": [] |
|
| 55 |
+} |
|
| 56 |
+``` |
|
| 57 |
+ |
|
| 58 |
+JSON keys are emitted in deterministic sorted order for canonical hashing. |
|
| 59 |
+ |
|
| 60 |
+## 4. Export Item Contract |
|
| 61 |
+ |
|
| 62 |
+Record items should include: |
|
| 63 |
+- sample identity hash; |
|
| 64 |
+- HealthKit UUID hash when available; |
|
| 65 |
+- strict fingerprint; |
|
| 66 |
+- semantic fingerprint when available; |
|
| 67 |
+- payload version hash; |
|
| 68 |
+- sample type identifier; |
|
| 69 |
+- start/end timestamps as ISO 8601 UTC; |
|
| 70 |
+- value kind, value, unit, category, workout fields; |
|
| 71 |
+- source/provenance hashes or redacted fields allowed by the export scope; |
|
| 72 |
+- metadata hash and optional metadata object when allowed; |
|
| 73 |
+- relationships when both endpoints are in scope, or unresolved endpoint hashes when explicitly allowed; |
|
| 74 |
+- observation visibility fields: first seen, last verified, disappeared evidence when available. |
|
| 75 |
+ |
|
| 76 |
+Every item has: |
|
| 77 |
+- `item_hash = SHA-256(canonical item JSON)`. |
|
| 78 |
+ |
|
| 79 |
+## 5. Manifest Hash |
|
| 80 |
+ |
|
| 81 |
+`manifest_hash` is calculated incrementally: |
|
| 82 |
+ |
|
| 83 |
+```text |
|
| 84 |
+SHA-256( |
|
| 85 |
+ canonical_export_metadata_json |
|
| 86 |
+ + ordered_item_hashes |
|
| 87 |
+ + canonical_counts_json |
|
| 88 |
+ + canonical_filter_json |
|
| 89 |
+) |
|
| 90 |
+``` |
|
| 91 |
+ |
|
| 92 |
+The manifest hash must cover exported content through item hashes. Counts, first dates, or last dates alone are not sufficient. |
|
| 93 |
+ |
|
| 94 |
+Item order: |
|
| 95 |
+1. sample type identifier; |
|
| 96 |
+2. start date; |
|
| 97 |
+3. end date; |
|
| 98 |
+4. sample identity hash; |
|
| 99 |
+5. payload version hash. |
|
| 100 |
+ |
|
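The incremental calculation can be sketched as a running SHA-256 state that absorbs item hashes in export order. This Python illustration makes several assumptions that the specification does not fix: compact sorted-key JSON as the canonical byte form, hex item hashes concatenated as ASCII, and the two domain prefixes, whose names are hypothetical. The item passed to `item_hash` is assumed to exclude its own `item_hash` field.

```python
import hashlib
import json

ITEM_DOMAIN = b"hp:v2:export_item:"          # hypothetical domain prefix
MANIFEST_DOMAIN = b"hp:v2:export_manifest:"  # hypothetical domain prefix


def canonical_json(value) -> bytes:
    """Deterministic bytes: sorted keys, no insignificant whitespace, UTF-8."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def item_hash(item: dict) -> str:
    return hashlib.sha256(ITEM_DOMAIN + canonical_json(item)).hexdigest()


class ManifestHasher:
    """Accumulates the manifest hash while items stream past, one at a time."""

    def __init__(self, export_metadata: dict):
        self._state = hashlib.sha256(MANIFEST_DOMAIN + canonical_json(export_metadata))
        self.record_count = 0

    def add_item(self, item: dict) -> str:
        digest = item_hash(item)
        self._state.update(digest.encode("ascii"))
        self.record_count += 1
        return digest

    def finish(self, filters: dict) -> str:
        self._state.update(canonical_json({"record_count": self.record_count}))
        self._state.update(canonical_json(filters))
        return self._state.hexdigest()
```

Only the running hash state and the counter stay in memory, so the cost is independent of export size. Because item hashes are absorbed in sequence, reordering items changes the manifest hash, which is why the item order must be deterministic.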
| 101 |
+## 6. CSV Contract |
|
| 102 |
+ |
|
| 103 |
+CSV exports are flat record tables for spreadsheet and external tooling. |
|
| 104 |
+ |
|
| 105 |
+Required column order: |
|
| 106 |
+1. `export_id` |
|
| 107 |
+2. `observation_id` |
|
| 108 |
+3. `sample_type_identifier` |
|
| 109 |
+4. `sample_identity_hash` |
|
| 110 |
+5. `sample_uuid_hash` |
|
| 111 |
+6. `strict_fingerprint` |
|
| 112 |
+7. `semantic_fingerprint` |
|
| 113 |
+8. `payload_hash` |
|
| 114 |
+9. `start_date_utc` |
|
| 115 |
+10. `end_date_utc` |
|
| 116 |
+11. `value_kind` |
|
| 117 |
+12. `numeric_value` |
|
| 118 |
+13. `unit` |
|
| 119 |
+14. `category_value` |
|
| 120 |
+15. `workout_activity_type` |
|
| 121 |
+16. `duration_seconds` |
|
| 122 |
+17. `source_hash` |
|
| 123 |
+18. `device_hash` |
|
| 124 |
+19. `metadata_hash` |
|
| 125 |
+20. `first_seen_observation_id` |
|
| 126 |
+21. `last_verified_observation_id` |
|
| 127 |
+22. `disappeared_observation_id` |
|
| 128 |
+23. `item_hash` |
|
| 129 |
+ |
|
| 130 |
+CSV uses RFC 4180 quoting rules and UTF-8. |
|
| 131 |
+ |
|
| 132 |
+Relationships are not flattened into the main CSV row. If needed, export a companion relationships CSV with source/target sample hashes. |
|
| 133 |
+ |
|
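A minimal sketch of the CSV contract, using Python's standard `csv` module for illustration. The column list is copied from the required order above; the function name is hypothetical, and writing missing or `None` values as empty fields is an assumption about how absent values are represented.

```python
import csv
import io

CSV_COLUMNS = [
    "export_id", "observation_id", "sample_type_identifier",
    "sample_identity_hash", "sample_uuid_hash", "strict_fingerprint",
    "semantic_fingerprint", "payload_hash", "start_date_utc", "end_date_utc",
    "value_kind", "numeric_value", "unit", "category_value",
    "workout_activity_type", "duration_seconds", "source_hash", "device_hash",
    "metadata_hash", "first_seen_observation_id",
    "last_verified_observation_id", "disappeared_observation_id", "item_hash",
]


def write_records_csv(rows, out) -> None:
    """Write rows (dicts keyed by column name) in the required column order.

    Fields containing commas, quotes, or line breaks are quoted and embedded
    quotes are doubled, as RFC 4180 describes; records end with CRLF.
    """
    writer = csv.DictWriter(out, fieldnames=CSV_COLUMNS, restval="",
                            lineterminator="\r\n", quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    for row in rows:  # rows may be a generator, so output stays incremental
        writer.writerow(row)
```

When writing to a file, open it with `encoding="utf-8"` and `newline=""` so the CRLF terminators are not translated a second time.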
| 134 |
+## 7. Streaming And Cancellation |
|
| 135 |
+ |
|
| 136 |
+Implementation contract: |
|
| 137 |
+- page records from SQLite with deterministic cursors; |
|
| 138 |
+- write output incrementally; |
|
| 139 |
+- update item/manifest hash state as rows stream; |
|
| 140 |
+- if the user cancels, mark export status as `cancelled` and do not record a completed manifest; |
|
| 141 |
+- failed exports should leave no completed manifest row unless the output is verifiable. |
|
| 142 |
+ |
|
| 143 |
+Resume support is optional for v1 exports. |
|
| 144 |
+ |
|
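Deterministic cursors can be implemented with keyset pagination: each page resumes strictly after the last row of the previous page instead of using a growing `OFFSET`. The Python sketch below is illustrative. It pages a trimmed `sample_versions` table ordered by `(start_date, id)`, which is a simplification of the full export order; the function and exception names are hypothetical.

```python
import sqlite3

PAGE_SQL = """
SELECT id, start_date, payload_hash
FROM sample_versions
WHERE :after_start IS NULL
   OR start_date > :after_start
   OR (start_date = :after_start AND id > :after_id)
ORDER BY start_date, id
LIMIT :page_size
"""


class ExportCancelled(Exception):
    """Raised so the caller can skip recording a completed manifest."""


def stream_versions(conn, page_size=500, should_cancel=lambda: False):
    """Yield rows page by page; memory use is bounded by page_size."""
    after_start, after_id = None, 0
    while True:
        if should_cancel():
            raise ExportCancelled()
        page = conn.execute(PAGE_SQL, {
            "after_start": after_start,
            "after_id": after_id,
            "page_size": page_size,
        }).fetchall()
        if not page:
            return
        yield from page
        # The cursor is the sort key of the last row that was emitted.
        after_id, after_start = page[-1][0], page[-1][1]
```

The tie-breaker on `id` matters: without it, rows sharing a `start_date` across a page boundary could be skipped or emitted twice.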
| 145 |
+## 8. Provenance Warning |
|
| 146 |
+ |
|
| 147 |
+Every user-facing export flow must communicate that: |
|
| 148 |
+- exported data is observed evidence from HealthKit-accessible surfaces; |
|
| 149 |
+- external re-publication to HealthKit may lose original metadata/provenance; |
|
| 150 |
+- HealthProbe itself does not restore or modify HealthKit/iOS backups. |
|
| 151 |
+ |
|
@@ -0,0 +1,403 @@ |
||
| 1 |
+# HealthProbe - Technical Implementation Guide |
|
| 2 |
+ |
|
| 3 |
+**Version:** 1.6 |
|
| 4 |
+**Last Updated:** 2026-05-23 |
|
| 5 |
+**Purpose:** Implementation guide for the iOS local Health DB Time Machine |
|
| 6 |
+ |
|
| 7 |
+For database schema, archive invariants, SQL analysis patterns, Core Data cache |
|
| 8 |
+boundaries, reset policy, and future migration rules, read |
|
| 9 |
+[`Database-Design.md`](Database-Design.md) first. This guide describes |
|
| 10 |
+implementation workflow and cross-module behavior. |
|
| 11 |
+ |
|
| 12 |
+## Privacy Directives - Mandatory |
|
| 13 |
+ |
|
| 14 |
+The following rules apply to all code, logs, examples, tests, and documentation: |
|
| 15 |
+ |
|
| 16 |
+- No credentials, API keys, tokens, passwords, or signing certificates |
|
| 17 |
+- No personal data: names, emails, phone numbers, dates of birth |
|
| 18 |
+- No account identifiers: Apple IDs, iCloud account info, CloudKit record IDs |
|
| 19 |
+- No raw real health values in the repository, tests, fixtures, logs, examples, or documentation |
|
| 20 |
+- No location data or patterns that could identify a user |
|
| 21 |
+- Device/source identifiers must be redacted, hashed, or stored only as local provenance according to the privacy policy |
|
| 22 |
+ |
|
| 23 |
+The app may store a user's HealthKit samples locally on-device when the user grants HealthKit access. Those samples must never be committed to source control or written to diagnostic logs. |
|
| 24 |
+ |
|
| 25 |
+## 1. Product Objective |
|
| 26 |
+ |
|
| 27 |
+HealthProbe is a single-device local archive and time-machine app for HealthKit-accessible data. |
|
| 28 |
+ |
|
| 29 |
+The implementation must prioritize: |
|
| 30 |
+- point-in-time reconstruction of local HealthKit observations |
|
| 31 |
+- neutral change explanation between observations |
|
| 32 |
+- preservation of selected details before HealthKit aggregation/consolidation makes them unavailable |
|
| 33 |
+- scoped user exports |
|
| 34 |
+- no HealthProbe CloudKit/iCloud sync |
|
| 35 |
+- no cross-device record-by-record comparison |
|
| 36 |
+- iOS 15-era legacy device support; SwiftData is not a target dependency |
|
| 37 |
+ |
|
| 38 |
+Record-count drops are not inherently critical. They are evidence to explain with record-level and aggregate context. |
|
| 39 |
+ |
|
| 40 |
+## 2. Test Installation Reset Lifecycle |
|
| 41 |
+ |
|
| 42 |
+HealthProbe has no real deployments at this stage. Existing SwiftData stores and prototype SQLite archives are disposable. |
|
| 43 |
+ |
|
| 44 |
+Archive v2 startup behavior: |
|
| 45 |
+1. Open the archive path. |
|
| 46 |
+2. Read `archive_metadata.schema_version` when present. |
|
| 47 |
+3. If no archive exists, create archive v2. |
|
| 48 |
+4. If a prototype/unknown schema exists, close the database, move it to a timestamped `*.prototype-backup` file for developer inspection, and create a fresh archive v2. |
|
| 49 |
+5. Rebuild/delete Core Data cache rows after archive reset. |
|
| 50 |
+6. Log reset reason without raw health values. |
|
| 51 |
+ |
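The startup steps above can be sketched as follows. This is a minimal illustration, not the app's implementation: Python's `sqlite3` module stands in for the Swift archive layer, and the single-row `archive_metadata (schema_version)` layout, the integer value `2`, the backup file name pattern, and the returned reason strings are all assumptions made for the example.

```python
import os
import sqlite3
import time

TARGET_SCHEMA_VERSION = 2  # assumption: archive v2 is recorded as the integer 2


def read_schema_version(path):
    """Return the stored schema version, or None if it cannot be determined."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT schema_version FROM archive_metadata LIMIT 1"
        ).fetchone()
        return int(row[0]) if row else None
    except (sqlite3.DatabaseError, TypeError, ValueError):
        return None  # missing table, unreadable file, or malformed value
    finally:
        conn.close()


def create_archive_v2(path):
    """Create a fresh archive holding only the (hypothetical) metadata table."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE archive_metadata (schema_version INTEGER NOT NULL)"
        )
        conn.execute(
            "INSERT INTO archive_metadata VALUES (?)", (TARGET_SCHEMA_VERSION,)
        )
    conn.close()


def open_or_reset_archive(path):
    """Return a reset reason for logging, or None if the archive is already v2."""
    if not os.path.exists(path):
        create_archive_v2(path)
        return "no_archive"
    if read_schema_version(path) == TARGET_SCHEMA_VERSION:
        return None
    # The database is closed at this point, so it can be moved aside safely.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    os.replace(path, f"{path}.{stamp}.prototype-backup")
    create_archive_v2(path)
    return "prototype_or_unknown_schema"
```

The returned reason contains no health values, so it can be written to the context log as-is. Rebuilding the Core Data cache (step 5) is outside the scope of this sketch.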
|
| 52 |
+Do not implement one-way migration from the old prototype schema unless a later dated product decision reverses this policy. |
|
| 53 |
+ |
|
| 54 |
+## 3. HealthKit Capture |
|
| 55 |
+ |
|
| 56 |
+Use: |
|
| 57 |
+- `HKAnchoredObjectQuery` for incremental capture |
|
| 58 |
+- `HKObserverQuery` as a wake-up hint |
|
| 59 |
+- manual capture from the app UI |
|
| 60 |
+ |
|
| 61 |
+Capture flow: |
|
| 62 |
+1. Resolve the current local device chain ID. |
|
| 63 |
+2. Start an observation record. |
|
| 64 |
+3. For each selected sample type, run anchored queries. |
|
| 65 |
+4. Write HealthKit samples and deleted-object evidence to the local archive first. |
|
| 66 |
+5. Update materialized aggregate tables in SQLite. |
|
| 67 |
+6. Save/rebuild derived Core Data cache rows only after archive writes succeed. |
|
| 68 |
+7. Compute summary/diff caches for UI and reports. |
|
| 69 |
+ |
|
| 70 |
+Anchors belong to the local device timeline and selected type registry. They are implementation state, not forensic truth. |
|
| 71 |
+ |
|
| 72 |
+### 3.1 Capture State Machine |
|
| 73 |
+ |
|
| 74 |
+Observation statuses: |
|
| 75 |
+- `started`; |
|
| 76 |
+- `partial`; |
|
| 77 |
+- `completed`; |
|
| 78 |
+- `failed`; |
|
| 79 |
+- `cancelled`. |
|
| 80 |
+ |
|
| 81 |
+Type-run statuses: |
|
| 82 |
+- `started`; |
|
| 83 |
+- `completed`; |
|
| 84 |
+- `failed`; |
|
| 85 |
+- `unauthorized`; |
|
| 86 |
+- `timed_out`. |
|
| 87 |
+ |
|
| 88 |
+Rules: |
|
| 89 |
+- one failed type does not invalidate successfully committed type runs; |
|
| 90 |
+- incomplete observations are visible as partial evidence, not as proof of disappearance; |
|
| 91 |
+- anchors are saved only after the corresponding SQLite transaction commits; |
|
| 92 |
+- UI change labels must include uncertainty when either side of a diff has partial/failed type evidence. |
|
| 93 |
+ |
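The rule that anchors are saved only after the SQLite transaction commits can be illustrated with a small sketch. The `captured` staging table, the in-memory `anchors` dictionary, and the function name are hypothetical; in the app the anchor would be an `HKQueryAnchor` persisted by the capture layer.

```python
import sqlite3

# Hypothetical staging table; the real schema lives in Database-Design.md.
SCHEMA = (
    "CREATE TABLE captured ("
    "type_identifier TEXT NOT NULL, fingerprint TEXT NOT NULL UNIQUE)"
)


def commit_page_then_anchor(conn, anchors, type_identifier, fingerprints, new_anchor):
    """Write one capture page; advance the anchor only if the commit succeeded."""
    try:
        with conn:  # commits on success, rolls back the whole page on error
            conn.executemany(
                "INSERT INTO captured (type_identifier, fingerprint) VALUES (?, ?)",
                [(type_identifier, fp) for fp in fingerprints],
            )
    except sqlite3.Error:
        # Anchor is left untouched, so the same page is fetched again next run.
        return "failed"
    anchors[type_identifier] = new_anchor
    return "completed"
```

Because the anchor moves only after a successful commit, a crash or write failure can cause a page to be captured twice but never skipped, which is why page writes also need to be idempotent.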
|
| 94 |
+### 3.2 Anchor Recovery |
|
| 95 |
+ |
|
| 96 |
+If an anchor is missing, corrupt, or rejected by HealthKit: |
|
| 97 |
+- mark the type run with anchor failure context; |
|
| 98 |
+- run a full scan for the affected type when permissions allow; |
|
| 99 |
+- rebuild current visibility for that type from observed samples and deleted-object evidence; |
|
| 100 |
+- continue storing future anchors after the full scan succeeds. |
|
| 101 |
+ |
|
| 102 |
+## 4. Storage Layers |
|
| 103 |
+ |
|
| 104 |
+### 4.1 Local Archive Store |
|
| 105 |
+ |
|
| 106 |
+The archive store is the source of truth. It should be a robust local SQLite database designed for both storage and analysis. |
|
| 107 |
+ |
|
| 108 |
+The canonical database design is [`Database-Design.md`](Database-Design.md). The summary below is intentionally high-level; do not treat it as a competing schema source. |
|
| 109 |
+ |
|
| 110 |
+The archive should support: |
|
| 111 |
+- one schema for all selected sample types |
|
| 112 |
+- differential observation storage; do not store complete recurring snapshots |
|
| 113 |
+- HealthKit UUID hash and internal fingerprints |
|
| 114 |
+- sample payload versions deduplicated across observations |
|
| 115 |
+- type identifier, start/end date, value, unit, and category/workout fields |
|
| 116 |
+- source/source revision metadata |
|
| 117 |
+- HealthKit metadata dictionaries |
|
| 118 |
+- device provenance exposed by HealthKit, redacted or hashed as required |
|
| 119 |
+- first-seen, last-seen, last-verified, and disappeared-at observations |
|
| 120 |
+- visibility ranges/events for point-in-time reconstruction |
|
| 121 |
+- observation history sufficient for point-in-time reconstruction |
|
| 122 |
+- relationship records where HealthKit exposes links between workouts, samples, events, or related records |
|
| 123 |
+- materialized aggregate tables for expensive counts used by reports/UI |
|
| 124 |
+- schema versioning, current test-store reset policy, and future migrations |
|
| 125 |
+- integrity hashes/manifests for exports |
|
| 126 |
+- indexes, temporary tables, joins, CTEs, and paged result sets for large diffs |
|
| 127 |
+- recovery-compatible exports for external tooling, preserving record identity, payload versions, provenance metadata where available, relationships, observation history, and manifest hashes |
|
| 128 |
+ |
|
| 129 |
+The archive must be able to answer: |
|
| 130 |
+- records visible at observation T |
|
| 131 |
+- records that appeared/disappeared between T1 and T2 |
|
| 132 |
+- records whose representation changed while semantic/aggregate meaning may be preserved |
|
| 133 |
+- selected records for streaming export |
|
| 134 |
+ |
|
| 135 |
+Minimum target schema shape is defined in [`Database-Design.md`](Database-Design.md). The archive must at least preserve these concepts: |
|
| 136 |
+ |
|
| 137 |
+- observations; |
|
| 138 |
+- sample identities; |
|
| 139 |
+- sample payload versions; |
|
| 140 |
+- observation events; |
|
| 141 |
+- visibility ranges; |
|
| 142 |
+- sources, source revisions, devices, metadata, and relationships; |
|
| 143 |
+- materialized aggregates; |
|
| 144 |
+- export manifests. |
|
| 145 |
+ |
|
| 146 |
+Historical sketch retained for orientation: |
|
| 147 |
+ |
|
| 148 |
+```sql |
|
| 149 |
+-- One row per local capture attempt/result. |
|
| 150 |
+CREATE TABLE observations ( |
|
| 151 |
+ id INTEGER PRIMARY KEY, |
|
| 152 |
+ observed_at REAL NOT NULL, |
|
| 153 |
+ status TEXT NOT NULL, |
|
| 154 |
+ app_version TEXT, |
|
| 155 |
+ os_version TEXT, |
|
| 156 |
+ device_chain_id TEXT NOT NULL, |
|
| 157 |
+ schema_version INTEGER NOT NULL |
|
| 158 |
+); |
|
| 159 |
+ |
|
| 160 |
+-- Stable identity for a HealthKit-accessible record or semantic record. |
|
| 161 |
+CREATE TABLE samples ( |
|
| 162 |
+ id INTEGER PRIMARY KEY, |
|
| 163 |
+ type_identifier TEXT NOT NULL, |
|
| 164 |
+ sample_uuid_hash TEXT, |
|
| 165 |
+ strict_fingerprint TEXT NOT NULL, |
|
| 166 |
+ semantic_fingerprint TEXT, |
|
| 167 |
+ first_seen_observation_id INTEGER NOT NULL |
|
| 168 |
+); |
|
| 169 |
+ |
|
| 170 |
+-- Deduplicated payload representation. New row only when representation changes. |
|
| 171 |
+CREATE TABLE sample_versions ( |
|
| 172 |
+ id INTEGER PRIMARY KEY, |
|
| 173 |
+ sample_id INTEGER NOT NULL, |
|
| 174 |
+ payload_hash TEXT NOT NULL, |
|
| 175 |
+ start_date REAL NOT NULL, |
|
| 176 |
+ end_date REAL NOT NULL, |
|
| 177 |
+ value REAL, |
|
| 178 |
+ unit TEXT, |
|
| 179 |
+ source_id INTEGER, |
|
| 180 |
+ metadata_hash TEXT |
|
| 181 |
+); |
|
| 182 |
+ |
|
| 183 |
+-- Visibility/event history, not a full snapshot copy. |
|
| 184 |
+CREATE TABLE sample_observation_events ( |
|
| 185 |
+ id INTEGER PRIMARY KEY, |
|
| 186 |
+ observation_id INTEGER NOT NULL, |
|
| 187 |
+ sample_id INTEGER NOT NULL, |
|
| 188 |
+ version_id INTEGER, |
|
| 189 |
+ event_kind TEXT NOT NULL |
|
| 190 |
+); |
|
| 191 |
+ |
|
| 192 |
+-- Optional compressed visibility ranges for point-in-time reconstruction. |
|
| 193 |
+CREATE TABLE sample_visibility_ranges ( |
|
| 194 |
+ sample_id INTEGER NOT NULL, |
|
| 195 |
+ version_id INTEGER, |
|
| 196 |
+ first_observation_id INTEGER NOT NULL, |
|
| 197 |
+ last_observation_id INTEGER, |
|
| 198 |
+ PRIMARY KEY (sample_id, version_id, first_observation_id) |
|
| 199 |
+); |
|
| 200 |
+ |
|
| 201 |
+-- Materialized aggregates feeding reports and Core Data cache. |
|
| 202 |
+CREATE TABLE daily_type_aggregates ( |
|
| 203 |
+ observation_id INTEGER NOT NULL, |
|
| 204 |
+ type_identifier TEXT NOT NULL, |
|
| 205 |
+ bucket_start REAL NOT NULL, |
|
| 206 |
+ record_count INTEGER NOT NULL, |
|
| 207 |
+ value_sum REAL, |
|
| 208 |
+ value_max REAL, |
|
| 209 |
+ PRIMARY KEY (observation_id, type_identifier, bucket_start) |
|
| 210 |
+); |
|
| 211 |
+``` |
|
| 212 |
+ |
|
| 213 |
+Exact naming can evolve, but the constraints must hold: payloads are deduplicated, observations are differential, and aggregates are materialized. |
|
| 214 |
+ |
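As an illustration of how the sketched `sample_visibility_ranges` table could answer "records visible at observation T", the query below selects every range that covers T. It assumes that observation ids increase with capture order, that `last_observation_id` is the last observation at which the version was still seen (inclusive), and that `NULL` marks a range that is still open. These conventions are assumptions for the example; `Database-Design.md` remains authoritative.

```python
import sqlite3

VISIBLE_AT_SQL = """
SELECT sample_id, version_id
FROM sample_visibility_ranges
WHERE first_observation_id <= :t
  AND (last_observation_id IS NULL OR last_observation_id >= :t)
ORDER BY sample_id, version_id
"""


def visible_at(conn, observation_id):
    """Return (sample_id, version_id) pairs visible at the given observation."""
    return conn.execute(VISIBLE_AT_SQL, {"t": observation_id}).fetchall()
```

No snapshot copy is needed: each sample version costs one range row regardless of how many observations it survives.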
|
| 215 |
+### 4.2 Core Data UI/Report Cache |
|
| 216 |
+ |
|
| 217 |
+Detailed entity contracts live in [`Core-Data-Cache-Design.md`](Core-Data-Cache-Design.md). |
|
| 218 |
+ |
|
| 219 |
+Core Data is the target derived/cache layer because it supports older iOS versions than SwiftData and is suitable for UI/report state. It may store: |
|
| 220 |
+- selected data types and app settings |
|
| 221 |
+- observation list and capture status |
|
| 222 |
+- precomputed summaries, temporal bins, and diff previews |
|
| 223 |
+- operation logs and export indexes |
|
| 224 |
+- change labels and links into the archive |
|
| 225 |
+- expensive count results used by reports and presentation |
|
| 226 |
+ |
|
| 227 |
+Core Data must not be the only forensic copy. If Core Data and the archive disagree, the SQLite archive wins. The cache must be safe to delete and rebuild from SQLite. |
|
| 228 |
+ |
|
| 229 |
+Current SwiftData models are legacy/prototype implementation details. New storage work should target Core Data for cache and SQLite for archive/analysis. |
|
| 230 |
+ |
|
| 231 |
+## 5. Change Explanation |
|
| 232 |
+ |
|
| 233 |
+Change logic should be evidence-first and consolidation-aware. |
|
| 234 |
+ |
|
| 235 |
+The basic diff has the set semantics shown below, but it should execute in SQLite rather than by loading full datasets into Swift arrays or sets:
|
| 236 |
+```swift |
|
| 237 |
+let appeared = currentFingerprints.subtracting(previousFingerprints)

| 238 |
+let disappeared = previousFingerprints.subtracting(currentFingerprints)

| 239 |
+let retained = currentFingerprints.intersection(previousFingerprints)
|
| 240 |
+``` |
|
| 241 |
+ |
|
| 242 |
+Conceptual SQL shape: |
|
| 243 |
+```sql |
|
| 244 |
+CREATE TEMP TABLE prev_visible AS |
|
| 245 |
+SELECT sample_id, version_id |
|
| 246 |
+FROM visible_samples |
|
| 247 |
+WHERE observation_id = :previous; |
|
| 248 |
+ |
|
| 249 |
+CREATE TEMP TABLE curr_visible AS |
|
| 250 |
+SELECT sample_id, version_id |
|
| 251 |
+FROM visible_samples |
|
| 252 |
+WHERE observation_id = :current; |
|
| 253 |
+ |
|
| 254 |
+SELECT p.sample_id |
|
| 255 |
+FROM prev_visible p |
|
| 256 |
+LEFT JOIN curr_visible c ON c.sample_id = p.sample_id |
|
| 257 |
+WHERE c.sample_id IS NULL; |
|
| 258 |
+``` |
|
| 259 |
+ |
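The conceptual SQL above reads from a `visible_samples` relation that the schema sketch does not define. One possible way to fill the temp tables is from `sample_visibility_ranges`, as in the following sketch. It assumes inclusive ranges with `NULL` meaning "still open" and at most one visible version per sample at any observation; the helper names are hypothetical.

```python
import sqlite3


def fill_visible(conn, table, observation_id):
    """(Re)build a temp table of samples visible at one observation."""
    # `table` is a fixed internal name, never user input.
    conn.execute(f"DROP TABLE IF EXISTS temp.{table}")
    conn.execute(
        f"CREATE TEMP TABLE {table} (sample_id INTEGER, version_id INTEGER)"
    )
    conn.execute(
        f"""INSERT INTO {table}
            SELECT sample_id, version_id FROM sample_visibility_ranges
            WHERE first_observation_id <= ?
              AND (last_observation_id IS NULL OR last_observation_id >= ?)""",
        (observation_id, observation_id),
    )


def diff_counts(conn, previous, current):
    """Count appeared, disappeared, and representation-changed samples in SQL."""
    fill_visible(conn, "prev_visible", previous)
    fill_visible(conn, "curr_visible", current)
    row = conn.execute(
        """SELECT
             (SELECT COUNT(*) FROM curr_visible c
               WHERE NOT EXISTS (SELECT 1 FROM prev_visible p
                                 WHERE p.sample_id = c.sample_id)),
             (SELECT COUNT(*) FROM prev_visible p
               WHERE NOT EXISTS (SELECT 1 FROM curr_visible c
                                 WHERE c.sample_id = p.sample_id)),
             (SELECT COUNT(*) FROM prev_visible p
               JOIN curr_visible c ON c.sample_id = p.sample_id
              WHERE c.version_id IS NOT p.version_id)"""
    ).fetchone()
    return {
        "appeared": row[0],
        "disappeared": row[1],
        "representationChanged": row[2],
    }
```

Only three integers cross into application memory, regardless of how many records each observation contains. For large archives the temp tables would likely also need an index on `sample_id`.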
|
| 260 |
+Semantic grouping should compare: |
|
| 261 |
+- type identifier |
|
| 262 |
+- start/end coverage |
|
| 263 |
+- value sum and value max where meaningful |
|
| 264 |
+- source/source revision |
|
| 265 |
+- metadata keys relevant to HealthKit interpretation |
|
| 266 |
+- interval length and sample density |
|
| 267 |
+ |
|
| 268 |
+Suggested labels: |
|
| 269 |
+- `appeared` |
|
| 270 |
+- `disappeared` |
|
| 271 |
+- `representationChanged` |
|
| 272 |
+- `consolidationLikely` |
|
| 273 |
+- `aggregateChanged` |
|
| 274 |
+- `uncertain` |
|
| 275 |
+ |
|
| 276 |
+Severity should be reserved for user-facing workflow urgency, not treated as proof of corruption. In particular, a high disappeared count with stable aggregate totals should usually be shown as `consolidationLikely` or `representationChanged`, not as critical loss. |
|
| 277 |
+ |
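A minimal sketch of that labeling rule follows. The function name, its inputs, and the 0.5% tolerance are hypothetical; a real classifier would also weigh coverage, density, and source evidence as listed under semantic grouping.

```python
def classify_change(disappeared, appeared, prev_sum, curr_sum,
                    partial_evidence=False, tolerance=0.005):
    """Pick a neutral change label from counts and aggregate totals."""
    if partial_evidence or prev_sum is None or curr_sum is None:
        return "uncertain"  # never infer loss from incomplete evidence
    scale = max(abs(prev_sum), abs(curr_sum))
    relative = 0.0 if scale == 0 else abs(curr_sum - prev_sum) / scale
    if relative > tolerance:
        return "aggregateChanged"
    if disappeared > 0:
        # Records vanished but totals held steady: likely consolidation.
        return "consolidationLikely"
    if appeared > 0:
        return "appeared"
    return None  # nothing to report
```

With these assumptions, 84 disappeared records and an unchanged daily total yield `consolidationLikely`, while the same count with a 10% drop in the total yields `aggregateChanged`. Neither label is a severity.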
|
| 278 |
+## 6. Exports |
|
| 279 |
+ |
|
| 280 |
+Detailed export formats and manifest rules live in [`Export-Specification.md`](Export-Specification.md). |
|
| 281 |
+ |
|
| 282 |
+Exports are scoped to what the user is inspecting. |
|
| 283 |
+ |
|
| 284 |
+Supported MVP exports: |
|
| 285 |
+- point-in-time record table |
|
| 286 |
+- observation manifest with hashes |
|
| 287 |
+- diff report between two observations |
|
| 288 |
+- selected appeared/disappeared/changed record set |
|
| 289 |
+ |
|
| 290 |
+Export rules: |
|
| 291 |
+- Include observation timestamps and app/build/schema versions. |
|
| 292 |
+- Include hashes so exported evidence can be re-identified within HealthProbe. |
|
| 293 |
+- Do not automatically upload exports. |
|
| 294 |
+- Keep examples synthetic. |
|
| 295 |
+- Allow CSV for spreadsheet inspection and JSON for structured analysis. |
|
| 296 |
+- Stream/page from SQLite. Do not build a full large export in RAM. |
|
| 297 |
+- Preserve enough structure for external recovery/salvage tools to reason about records without making HealthProbe itself a restore tool. |
|
| 298 |
+ |
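The streaming rule can be sketched with keyset pagination and a running hash, so that neither the export nor its manifest hash requires the full record set in memory. The `export_rows` table, its columns, and the CSV layout are placeholders for this example; the real formats are defined in `Export-Specification.md`.

```python
import csv
import hashlib
import io
import sqlite3


def stream_export(conn, out, page_size=500):
    """Write rows page by page; return (row_count, SHA-256 hex of bytes written)."""
    digest = hashlib.sha256()
    last_id, count = 0, 0  # assumes ids are positive integers
    while True:
        page = conn.execute(
            "SELECT id, type_identifier, payload_hash FROM export_rows "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            break
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(page)
        chunk = buf.getvalue().encode("utf-8")
        out.write(chunk)      # only one page is held in memory at a time
        digest.update(chunk)  # hash accumulates without re-reading the file
        last_id = page[-1][0]
        count += len(page)
    return count, digest.hexdigest()
```

Because rows are ordered by `id` and the hash covers the exact bytes written, the result does not depend on the page size, which helps make manifest hashes reproducible.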
|
| 299 |
+## 7. Context Logging |
|
| 300 |
+ |
|
| 301 |
+Context logs help interpret changes but must not claim causality. |
|
| 302 |
+ |
|
| 303 |
+Log: |
|
| 304 |
+- capture start/end/failure |
|
| 305 |
+- HealthKit permission changes |
|
| 306 |
+- selected type registry changes |
|
| 307 |
+- app version and iOS version |
|
| 308 |
+- coarse iCloud sign-in state if available |
|
| 309 |
+- archive reset/schema-version changes and integrity-check results |
|
| 310 |
+ |
|
| 311 |
+Do not log raw health values or personal identifiers. |
|
| 312 |
+ |
|
| 313 |
+## 8. Archive Health And Integrity Failure |
|
| 314 |
+ |
|
| 315 |
+Archive health checks: |
|
| 316 |
+- open database; |
|
| 317 |
+- verify schema version; |
|
| 318 |
+- run `PRAGMA integrity_check`; |
|
| 319 |
+- verify required tables/indexes; |
|
| 320 |
+- spot-check aggregate rebuilds; |
|
| 321 |
+- verify manifest hashes for completed exports. |
|
| 322 |
+ |
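The first few checks could look like the sketch below. `PRAGMA integrity_check` and `sqlite_master` are standard SQLite; the required-table subset, the `archive_metadata.schema_version` column layout, and the problem codes are assumptions for illustration.

```python
import sqlite3

# Illustrative subset; the full list belongs to Database-Design.md.
REQUIRED_TABLES = {"archive_metadata", "observations", "samples", "sample_versions"}


def check_archive_health(conn, expected_schema_version):
    """Return a list of problem codes; an empty list means none were found."""
    problems = []
    # integrity_check returns a single row containing 'ok' when healthy.
    if [row[0] for row in conn.execute("PRAGMA integrity_check")] != ["ok"]:
        problems.append("integrity_check_failed")
    tables = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    missing = sorted(REQUIRED_TABLES - tables)
    if missing:
        problems.append("missing_tables:" + ",".join(missing))
    if "archive_metadata" in tables:
        row = conn.execute(
            "SELECT schema_version FROM archive_metadata LIMIT 1"
        ).fetchone()
        if row is None or row[0] != expected_schema_version:
            problems.append("schema_version_mismatch")
    return problems
```

The problem codes carry no health values, so they are safe to record in the context log.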
|
| 323 |
+If integrity fails: |
|
| 324 |
+- stop write operations; |
|
| 325 |
+- show archive health as degraded; |
|
| 326 |
+- allow export only if the specific query path can be verified safe; |
|
| 327 |
+- offer developer/test reset for current prototype builds; |
|
| 328 |
+- do not silently delete a real archive in future production builds. |
|
| 329 |
+ |
|
| 330 |
+Core Data cache corruption is lower severity: delete and rebuild cache from SQLite. |
|
| 331 |
+ |
|
| 332 |
+## 9. UI Implementation Guidance |
|
| 333 |
+ |
|
| 334 |
+Primary surfaces: |
|
| 335 |
+- observation timeline |
|
| 336 |
+- point-in-time observation detail |
|
| 337 |
+- per-type record table |
|
| 338 |
+- diff detail between observations |
|
| 339 |
+- export preview and export history |
|
| 340 |
+- archive health/status |
|
| 341 |
+ |
|
| 342 |
+Legacy devices may disable or simplify heavy visualizations. They must still support capture, cached summaries, report generation, and export. |
|
| 343 |
+ |
|
| 344 |
+Avoid alarm-first wording. Prefer: |
|
| 345 |
+- "84 records no longer visible in current observation" |
|
| 346 |
+- "Daily aggregate changed by 0.1%" |
|
| 347 |
+- "Consolidation likely" |
|
| 348 |
+- "Cause not inferred" |
|
| 349 |
+ |
|
| 350 |
+Avoid: |
|
| 351 |
+- "Apple lost your data" |
|
| 352 |
+- "Critical loss" based only on count |
|
| 353 |
+- "iCloud broke sync" |
|
| 354 |
+ |
|
| 355 |
+## 10. Testing Strategy |
|
| 356 |
+ |
|
| 357 |
+Unit tests: |
|
| 358 |
+- point-in-time reconstruction |
|
| 359 |
+- appeared/disappeared diff sets |
|
| 360 |
+- consolidation heuristic with stable aggregates |
|
| 361 |
+- changed aggregate with uncertain label |
|
| 362 |
+- empty observations |
|
| 363 |
+- permission/type-registry changes |
|
| 364 |
+- clock skew/context timestamp handling |
|
| 365 |
+- Core Data cache deletion and rebuild from SQLite |
|
| 366 |
+- SQL diff queries on large synthetic datasets without high RAM use |
|
| 367 |
+ |
|
| 368 |
+Integration tests: |
|
| 369 |
+- archive persistence and recovery |
|
| 370 |
+- archive reset/reinitialization for current test installs |
|
| 371 |
+- future archive schema migrations once real archive versions exist |
|
| 372 |
+- Core Data cache rebuild from archive |
|
| 373 |
+- export generation with manifest hashes |
|
| 374 |
+- high-frequency capture memory/performance |
|
| 375 |
+- deletion evidence via `HKDeletedObject` |
|
| 376 |
+ |
|
| 377 |
+Synthetic fixtures only. No real health values or identifiable metadata. |
|
| 378 |
+ |
|
| 379 |
+## 11. Performance Considerations |
|
| 380 |
+ |
|
| 381 |
+| Operation | Target | Notes | |
|
| 382 |
+|-----------|--------|-------| |
|
| 383 |
+| Anchored capture | Background | Stream pages; avoid building huge arrays | |
|
| 384 |
+| Archive write | Background | Commit before Core Data cache update | |
|
| 385 |
+| UI cache update | Short main-thread work | Use precomputed summaries | |
|
| 386 |
+| Diff preview | SQL-first, bounded | Use temp tables/indexes; cap record previews and page full tables | |
|
| 387 |
+| Export | User-initiated | Stream/page from SQLite; support filters for large high-frequency types | |
|
| 388 |
+ |
|
| 389 |
+## 12. Deployment Checklist |
|
| 390 |
+ |
|
| 391 |
+- [ ] HealthKit read permissions declared in Info.plist |
|
| 392 |
+- [ ] Background Modes enabled if used |
|
| 393 |
+- [ ] Core Data cache schema/rebuild tested |
|
| 394 |
+- [ ] Archive reset/reinitialization and schema versioning tested |
|
| 395 |
+- [ ] Archive integrity/manifests tested |
|
| 396 |
+- [ ] Export files verified with synthetic data |
|
| 397 |
+- [ ] Privacy policy matches local archive behavior |
|
| 398 |
+- [ ] UI copy reviewed for neutral, consolidation-aware language |
|
| 399 |
+- [ ] Legacy-device mode reviewed for simplified UI/report/export behavior |
|
| 400 |
+ |
|
| 401 |
+--- |
|
| 402 |
+ |
|
| 403 |
+*HealthProbe Implementation Guide v1.6 - 2026-05-23* |
|
@@ -0,0 +1,12 @@ |
||
| 1 |
+# HealthProbe UI Chapter |
|
| 2 |
+ |
|
| 3 |
+Active UI agent guidance is currently centralized in: |
|
| 4 |
+ |
|
| 5 |
+- [`../00-agent-guides/CLAUDE.md`](../00-agent-guides/CLAUDE.md) |
|
| 6 |
+ |
|
| 7 |
+Historical UI redesign notes are archived in: |
|
| 8 |
+ |
|
| 9 |
+- [`../99-archive/DATA_TYPE_VIEWS_OPTIMIZATION.md`](../99-archive/DATA_TYPE_VIEWS_OPTIMIZATION.md) |
|
| 10 |
+- [`../99-archive/REFACTORING_DATA_TYPE_VIEWS.md`](../99-archive/REFACTORING_DATA_TYPE_VIEWS.md) |
|
| 11 |
+ |
|
| 12 |
+Do not treat archived UI notes as current product scope. Current UI work should follow the Time Machine, observation, diff, and export language in the Claude guide and MVP specification. |
|
@@ -0,0 +1,74 @@ |
||
| 1 |
+# HealthProbe - Implementation Status |
|
| 2 |
+ |
|
| 3 |
+**Last Updated:** 2026-05-23 |
|
| 4 |
+ |
|
| 5 |
+## Current Reality |
|
| 6 |
+ |
|
| 7 |
+The app currently contains a working SwiftUI + SwiftData prototype with HealthKit capture, snapshot/delta screens, and an initial SQLite archive store. |
|
| 8 |
+ |
|
| 9 |
+The product direction has changed. The target architecture is now: |
|
| 10 |
+- iOS 15-era compatible; |
|
| 11 |
+- direct SQLite archive/analysis database as source of truth; |
|
| 12 |
+- differential observation storage; |
|
| 13 |
+- Core Data UI/report cache; |
|
| 14 |
+- Time Machine UI and scoped exports; |
|
| 15 |
+- recovery-compatible archive/export format; |
|
| 16 |
+- no in-app restore, backup patching, or HealthKit re-publication. |
|
| 17 |
+ |
|
| 18 |
+Current SwiftData models and anomaly-oriented naming are legacy/prototype implementation details. |
|
| 19 |
+ |
|
| 20 |
+There are no real deployments, only test installations. Existing prototype databases are disposable: the archive v2 refactor should reset, ignore, or reinitialize old SwiftData/prototype SQLite stores instead of preserving backward compatibility with them. |
|
| 21 |
+ |
|
| 22 |
+## Status By Area |
|
| 23 |
+ |
|
| 24 |
+| Area | Current Status | Target / Next Work | |
|
| 25 |
+|------|----------------|--------------------| |
|
| 26 |
+| Product docs | Updated | Keep `HealthProbe/Doc/README.md` as canonical index | |
|
| 27 |
+| HealthKit capture | Prototype exists | Adapt capture to write differential SQLite observations first | |
|
| 28 |
+| SQLite archive | Archive v2 schema bootstrap exists; legacy write table still active | Move write path from `archive_samples` to observations/samples/versions/events/ranges | |
|
| 29 |
+| Core Data cache | Not implemented | Add rebuildable cache for expensive counts, summaries, report metadata, UI state | |
|
| 30 |
+| SwiftData cache | Exists | Treat as disposable prototype data; reset/ignore during v2 transition | |
|
| 31 |
+| UI | Prototype exists | Reframe screens around observations, diffs, export, archive status | |
|
| 32 |
+| Diff/change explanation | Prototype/legacy anomaly logic exists | Move heavy diffing into SQLite and use neutral change classifications | |
|
| 33 |
+| Export | Prototype scoped JSON export exists | Add recovery-compatible manifests and streaming/paged export | |
|
| 34 |
+| Legacy device support | Not implemented | Remove SwiftData dependency and simplify heavy views for low-memory devices | |
|
| 35 |
+| Recovery workflows | Not supported | Preserve export/archive structure for external recovery tools only | |
|
| 36 |
+ |
|
| 37 |
+## Refactoring Priorities |
|
| 38 |
+ |
|
| 39 |
+Detailed checkable milestones live in [`Refactoring-Plan.md`](Refactoring-Plan.md). |
|
| 40 |
+ |
|
| 41 |
+1. Implement differential write path: observations, samples, payload versions, events/ranges, aggregates. |
|
| 42 |
+2. Add SQLite integrity/open/schema-version tests. |
|
| 43 |
+3. Move large diffs/counts into SQL queries with indexes/temp tables/paged results. |
|
| 44 |
+4. Add Core Data UI/report cache and rebuild pipeline. |
|
| 45 |
+5. Replace SwiftData UI dependencies with Core Data/cache DTOs. |
|
| 46 |
+6. Update UI language from anomaly/status to observation/diff/export. |
|
| 47 |
+7. Add streaming exports with manifests. |
|
| 48 |
+8. Validate on low-memory/legacy-class devices. |
|
| 49 |
+ |
|
| 50 |
+## Known Prototype Mismatches |
|
| 51 |
+ |
|
| 52 |
+- SwiftData currently blocks iOS 15-era device support. |
|
| 53 |
+- Existing `Anomaly*` model/service names are legacy language. |
|
| 54 |
+- Some screens still imply snapshot-count monitoring rather than Time Machine inspection. |
|
| 55 |
+- Current archive schema is not sufficient as the long-term source of truth. |
|
| 56 |
+- Existing implementation may decode or cache too much data for low-end devices. |
|
| 57 |
+- Old prototype database compatibility is no longer required. |
|
| 58 |
+ |
|
| 59 |
+## Verification Checklist |
|
| 60 |
+ |
|
| 61 |
+- [ ] SQLite archive v2 can reconstruct records visible at observation T. |
|
| 62 |
+- [ ] No recurring complete snapshot copies are written for high-volume types. |
|
| 63 |
+- [ ] SQL diff between two observations runs without loading full datasets into Swift arrays. |
|
| 64 |
+- [ ] Expensive counts used by reports/UI are cached and rebuildable. |
|
| 65 |
+- [ ] Deleting Core Data cache and rebuilding from SQLite restores UI/report summaries. |
|
| 66 |
+- [ ] Export can stream large selected record sets. |
|
| 67 |
+- [ ] Export manifests include hashes and observation metadata. |
|
| 68 |
+- [ ] iOS app remains read-only with respect to HealthKit. |
|
| 69 |
+- [ ] Docs and UI do not claim in-app restore/re-publication support. |
|
| 70 |
+- [ ] Legacy/small-device UI mode preserves capture/report/export. |
|
| 71 |
+ |
|
| 72 |
+## Historical Notes |
|
| 73 |
+ |
|
| 74 |
+Older status docs described a completed snapshot/anomaly/SwiftData system. That was true for the prototype direction, but it is no longer the target architecture. |
|
@@ -0,0 +1,293 @@ |
||
| 1 |
+# HealthProbe - Database-Led Refactoring Plan |
|
| 2 |
+ |
|
| 3 |
+**Last Updated:** 2026-05-23 |
|
| 4 |
+**Status:** Active planning document |
|
| 5 |
+ |
|
| 6 |
+## Goal |
|
| 7 |
+ |
|
| 8 |
+Move HealthProbe from the current SwiftData/snapshot/anomaly prototype toward the target architecture: |
|
| 9 |
+ |
|
| 10 |
+- SQLite archive/analysis database as source of truth; |
|
| 11 |
+- differential observation storage; |
|
| 12 |
+- SQL-first analysis for large datasets; |
|
| 13 |
+- Core Data UI/report cache; |
|
| 14 |
+- recovery-compatible exports; |
|
| 15 |
+- iOS 15-era legacy-device support; |
|
| 16 |
+- Time Machine UI over local observations;
|
| 17 |
+- destructive reset/reinitialization of prototype/test stores; old database |
|
| 18 |
+ compatibility is not required. |
|
| 19 |
+ |
|
| 20 |
+UI refactoring happens after the storage and query foundations exist. |
|
| 21 |
+ |
|
| 22 |
+## Milestone 0 - Freeze Legacy Direction |
|
| 23 |
+ |
|
| 24 |
+**Purpose:** Stop work from deepening the old architecture. |
|
| 25 |
+ |
|
| 26 |
+Checklist: |
|
| 27 |
+- [ ] Mark SwiftData as legacy/prototype in active implementation tickets. |
|
| 28 |
+- [ ] Stop adding new SwiftData entities. |
|
| 29 |
+- [ ] Stop adding features that require recurring complete snapshots. |
|
| 30 |
+- [ ] Mark existing prototype/test installation data as disposable for archive v2. |
|
| 31 |
+- [ ] Point all storage agents to [`../02-architecture/Database-Design.md`](../02-architecture/Database-Design.md). |
|
| 32 |
+- [ ] Confirm root docs only bootstrap into `HealthProbe/Doc/`. |
|
| 33 |
+ |
|
| 34 |
+Acceptance: |
|
| 35 |
+- [ ] No active task describes SwiftData as target persistence. |
|
| 36 |
+- [ ] No active task proposes full periodic snapshot storage. |
|
| 37 |
+- [ ] No active task requires old prototype-store compatibility. |
|
| 38 |
+- [ ] `HealthProbe/Doc/README.md` points DB work to `Database-Design.md`. |
|
| 39 |
+ |
|
| 40 |
+## Milestone 1 - Lock Database Decisions |
|
| 41 |
+ |
|
| 42 |
+**Purpose:** Resolve irreversible archive choices before coding schema v2. |
|
| 43 |
+ |
|
| 44 |
+Checklist: |
|
| 45 |
+- [x] Decide timestamp storage convention. |
|
| 46 |
+- [x] Decide hash/salt/key strategy for source/device identifiers. |
|
| 47 |
+- [x] Define strict fingerprint foundation. |
|
| 48 |
+- [x] Define semantic/fuzzy fingerprint policy. |
|
| 49 |
+- [x] Define timezone policy for daily/monthly aggregate buckets. |
|
| 50 |
+- [x] Decide whether visibility ranges are maintained eagerly or rebuilt from events. |
|
| 51 |
+- [x] Define relationship preservation policy for workouts/samples/events. |
|
| 52 |
+- [x] Record prototype data policy: discard/reset old SwiftData and prototype SQLite stores; no compatibility migration. |
|
| 53 |
+- [x] Define export manifest canonicalization and hash algorithm. |
|
| 54 |
+ |
|
| 55 |
+Acceptance: |
|
| 56 |
+- [x] `Database-Design.md` open questions are answered or explicitly deferred. |
|
| 57 |
+- [x] Schema v2 can be implemented without guessing. |
|
| 58 |
+- [x] Test-install reset/reinitialization policy is documented. |
|
| 59 |
+- [x] Privacy implications of identifiers/provenance are documented. |
|
| 60 |
+ |
|
| 61 |
+## Milestone 2 - Synthetic Large-Data Test Harness |
|
| 62 |
+ |
|
| 63 |
+**Purpose:** Prove the new design can be tested before real HealthKit data is involved. |
|
| 64 |
+ |
|
| 65 |
+Checklist: |
|
| 66 |
+- [ ] Create synthetic observation generator. |
|
| 67 |
+- [ ] Generate low, medium, and high-volume sample sets. |
|
| 68 |
+- [ ] Include appeared/disappeared/representationChanged scenarios. |
|
| 69 |
+- [ ] Include consolidation-like high-frequency thinning scenarios. |
|
| 70 |
+- [ ] Include source/device/metadata variation. |
|
| 71 |
+- [ ] Include relationship fixtures. |
|
| 72 |
+- [ ] Add memory/performance measurement for large diff/export operations. |
|
| 73 |
+ |
|
| 74 |
+Acceptance: |
|
| 75 |
+- [ ] Tests can create a large synthetic archive without real health data. |
|
| 76 |
+- [ ] Large diff test does not require loading all records into Swift arrays. |
|
| 77 |
+- [ ] Export test streams/pages output. |
|
| 78 |
+- [ ] Fixtures contain no personal, device, location, or real health data. |
|
| 79 |
+ |
|
| 80 |
+## Milestone 3 - SQLite Archive V2 Schema |
|
| 81 |
+ |
|
| 82 |
+**Purpose:** Create the new archive foundation. |
|
| 83 |
+ |
|
| 84 |
+Checklist: |
|
| 85 |
+- [x] Implement `schema_migrations`. |
|
| 86 |
+- [x] Implement `archive_metadata`. |
|
| 87 |
+- [x] Implement `device_chains`. |
|
| 88 |
+- [x] Implement `observations`. |
|
| 89 |
+- [x] Implement `sample_types`. |
|
| 90 |
+- [x] Implement `observation_type_runs`. |
|
| 91 |
+- [x] Implement `sources`. |
|
| 92 |
+- [x] Implement `source_revisions`. |
|
| 93 |
+- [x] Implement `hk_devices`. |
|
| 94 |
+- [x] Implement `metadata_blobs`. |
|
| 95 |
+- [x] Implement `samples`. |
|
| 96 |
+- [x] Implement `sample_versions`. |
|
| 97 |
+- [x] Implement `sample_observation_events`. |
|
| 98 |
+- [x] Implement `sample_visibility_ranges`. |
|
| 99 |
+- [x] Implement `sample_relationships`. |
|
| 100 |
+- [x] Implement `observation_type_summaries`. |
|
| 101 |
+- [x] Implement `daily_type_aggregates`. |
|
| 102 |
+- [x] Implement `export_manifests`. |
|
| 103 |
+- [x] Implement `export_items`. |
|
| 104 |
+- [x] Add required indexes. |
|
| 105 |
+- [ ] Add SQLite integrity/open/schema-version tests. |
|
| 106 |
+ |
|
| 107 |
+Acceptance: |
|
| 108 |
+- [ ] Fresh archive initializes successfully. |
|
| 109 |
+- [x] Schema version is recorded. |
|
| 110 |
+- [x] Archive v2 can initialize after old prototype stores are removed or ignored. |
|
| 111 |
+- [ ] `PRAGMA integrity_check` passes. |
|
| 112 |
+- [x] Required indexes exist. |
|
| 113 |
+- [ ] Empty archive queries return valid empty results. |
|
| 114 |
+ |
|
| 115 |
+## Milestone 4 - Differential Write Path |
|
| 116 |
+ |
|
| 117 |
+**Purpose:** Write observations without storing full recurring snapshots. |
|
| 118 |
+ |
|
| 119 |
+Checklist: |
|
| 120 |
+- [ ] Create observation transaction wrapper. |
|
| 121 |
+- [ ] Upsert sample types. |
|
| 122 |
+- [ ] Upsert source/source revision/device/metadata rows. |
|
| 123 |
+- [ ] Upsert sample identity. |
|
| 124 |
+- [ ] Upsert sample payload version only when payload changes. |
|
| 125 |
+- [ ] Insert appeared/verified/representationChanged events. |
|
| 126 |
+- [ ] Record `HKDeletedObject` evidence by UUID hash. |
|
| 127 |
+- [ ] Close visibility ranges for disappeared/deleted samples. |
|
| 128 |
+- [ ] Maintain open visibility ranges for visible samples. |
|
| 129 |
+- [ ] Rebuild/update affected aggregates after capture. |
|
| 130 |
+- [ ] Commit SQLite before Core Data/cache work. |
|
| 131 |
+- [ ] Make repeated capture page writes idempotent. |
|
| 132 |
+ |
|
| 133 |
+Acceptance: |
|
| 134 |
+- [ ] Initial import stores identities and versions once. |
|
| 135 |
+- [ ] Re-running same page does not duplicate records. |
|
| 136 |
+- [ ] Representation change creates a new version, not a new logical sample. |
|
| 137 |
+- [ ] Disappearance closes visibility range. |
|
| 138 |
+- [ ] No full observation copy table is written. |
|
| 139 |
+ |
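One way to meet the idempotency and versioning acceptance criteria is to let uniqueness constraints absorb repeated writes. The sketch below uses a reduced, hypothetical version of the `samples` and `sample_versions` tables and treats `strict_fingerprint` as the identity key, which is an assumption for the example rather than the documented design.

```python
import sqlite3

# Reduced, hypothetical schema; see Database-Design.md for the real one.
SCHEMA = """
CREATE TABLE samples (
    id INTEGER PRIMARY KEY,
    type_identifier TEXT NOT NULL,
    strict_fingerprint TEXT NOT NULL UNIQUE
);
CREATE TABLE sample_versions (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES samples(id),
    payload_hash TEXT NOT NULL,
    UNIQUE (sample_id, payload_hash)
);
"""


def write_page(conn, page):
    """Write (type_identifier, fingerprint, payload_hash) rows idempotently."""
    with conn:  # one transaction per page
        for type_identifier, fingerprint, payload_hash in page:
            conn.execute(
                "INSERT OR IGNORE INTO samples (type_identifier, strict_fingerprint) "
                "VALUES (?, ?)",
                (type_identifier, fingerprint),
            )
            sample_id = conn.execute(
                "SELECT id FROM samples WHERE strict_fingerprint = ?",
                (fingerprint,),
            ).fetchone()[0]
            # A new row appears only when the payload hash is new for this sample.
            conn.execute(
                "INSERT OR IGNORE INTO sample_versions (sample_id, payload_hash) "
                "VALUES (?, ?)",
                (sample_id, payload_hash),
            )
```

Re-running the same page leaves row counts unchanged, and a changed payload adds a version row without adding a second logical sample.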
|
| 140 |
+## Milestone 5 - SQL Analysis Layer |
|
| 141 |
+ |
|
| 142 |
+**Purpose:** Make the archive useful without RAM-heavy processing. |
|
| 143 |
+ |
|
| 144 |
+Checklist: |
|
| 145 |
+- [ ] Implement point-in-time visible-record query. |
|
| 146 |
+- [ ] Implement paged record table query. |
|
| 147 |
+- [ ] Implement appeared query between observations. |
|
| 148 |
+- [ ] Implement disappeared query between observations. |
|
| 149 |
+- [ ] Implement representationChanged query between observations. |
|
| 150 |
+- [ ] Implement diff counts using temp tables or equivalent SQL-first strategy. |
|
| 151 |
+- [ ] Implement aggregate comparison query. |
|
| 152 |
+- [ ] Implement consolidation-likely evidence query. |
|
| 153 |
+- [ ] Implement source/provenance breakdown query. |
|
| 154 |
+- [ ] Add query timing/memory tests on synthetic large datasets. |
|
| 155 |
+ |
|
| 156 |
+Acceptance: |
|
| 157 |
+- [ ] Observation T can be reconstructed from ranges/events. |
|
| 158 |
+- [ ] Large diff returns counts and first page without loading all rows. |
|
| 159 |
+- [ ] Query results are deterministic and ordered. |
|
| 160 |
+- [ ] Consolidation evidence includes count, aggregate, coverage, density, and uncertainty data. |
|
| 161 |
+ |
|
| 162 |
+## Milestone 6 - Core Data UI/Report Cache |
|
| 163 |
+ |
|
| 164 |
+**Purpose:** Cache expensive presentation/report values while keeping SQLite authoritative. |
|
| 165 |
+ |
|
| 166 |
+Checklist: |
|
| 167 |
+- [ ] Define Core Data model for observation rows. |
|
| 168 |
+- [ ] Define type summary cache entity. |
|
| 169 |
+- [ ] Define daily/monthly aggregate cache entity. |
|
| 170 |
+- [ ] Define diff summary cache entity. |
|
| 171 |
+- [ ] Define export manifest/status cache entity. |
|
| 172 |
+- [ ] Define archive health/status cache entity. |
|
| 173 |
+- [ ] Implement cache rebuild from SQLite. |
|
| 174 |
+- [ ] Implement cache invalidation by archive schema/cache schema/version/hash. |
|
| 175 |
+- [ ] Implement delete-cache-and-rebuild flow. |
|
| 176 |
+- [ ] Add cache schema/version and rebuild tests. |
|
| 177 |
+ |
|
| 178 |
+Acceptance: |
|
| 179 |
+- [ ] Deleting Core Data cache does not lose forensic data. |
|
| 180 |
+- [ ] Cache rebuild restores dashboard/timeline/report summaries. |
|
| 181 |
+- [ ] Cache rows include source observation ids and archive/cache schema versions. |
|
| 182 |
+- [ ] SQLite wins on disagreement. |
|
| 183 |
+ |
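The invalidation and delete-and-rebuild items in the Milestone 6 checklist can be illustrated with a small sketch. It is written in plain Python rather than Core Data. The `CacheStamp` fields, the dictionary standing in for the cache, and the `rebuild` callback are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheStamp:
    """Hypothetical version stamp stored alongside cached rows."""
    archive_schema_version: int
    cache_schema_version: int
    archive_content_hash: str


def needs_rebuild(cached, current):
    """True when no stamp is stored or any stamp component differs."""
    return cached is None or cached != current


def refresh_cache(cache, current, rebuild):
    """Drop and rebuild the derived cache when its stamp is stale.

    cache   -- dict standing in for the cache store ('stamp', 'rows')
    current -- CacheStamp describing the authoritative archive right now
    rebuild -- callable that re-derives rows from the archive
    Returns True if a rebuild happened.
    """
    if needs_rebuild(cache.get("stamp"), current):
        cache.clear()              # only derived data is deleted
        cache["rows"] = rebuild()  # the archive stays the source of truth
        cache["stamp"] = current
        return True
    return False
```

Because the cache is only ever derived from the archive, clearing it is always safe. That is the property the "SQLite wins on disagreement" acceptance item relies on.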
|
| 184 |
+## Milestone 7 - Export Layer |
|
| 185 |
+ |
|
| 186 |
+**Purpose:** Produce scoped, recovery-compatible exports. |
|
| 187 |
+ |
|
| 188 |
+Checklist: |
|
| 189 |
+- [ ] Define JSON export envelope. |
|
| 190 |
+- [ ] Define CSV record-table export. |
|
| 191 |
+- [ ] Define manifest hash algorithm. |
|
| 192 |
+- [ ] Include archive/app/schema/observation metadata. |
|
| 193 |
+- [ ] Include sample identity and payload version hashes. |
|
| 194 |
+- [ ] Include values/dates/units/type fields. |
|
| 195 |
+- [ ] Include source/provenance metadata where available and allowed. |
|
| 196 |
+- [ ] Include relationships where available. |
|
| 197 |
+- [ ] Include provenance-loss warning for external HealthKit re-publication. |
|
| 198 |
+- [ ] Stream/page export from SQLite. |
|
| 199 |
+- [ ] Store export manifest rows. |
|
| 200 |
+- [ ] Add reproducibility test for export manifests. |
|
| 201 |
+ |
|
| 202 |
+Acceptance: |
|
| 203 |
+- [ ] Large export does not materialize full record set in RAM. |
|
| 204 |
+- [ ] Export can be verified against archive hashes. |
|
| 205 |
+- [ ] Export contains enough structure for external recovery/salvage tooling. |
|
| 206 |
+- [ ] App still does not perform restore, backup patching, or HealthKit re-publication. |
|
| 207 |
+ |
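A paged export with a verifiable manifest, as the Milestone 7 checklist asks for, might look like the sketch below. SHA-256 is used only as a placeholder because the manifest hash algorithm is still to be defined. The `records` table, its columns, and the manifest keys are hypothetical.

```python
import csv
import hashlib
import io
import sqlite3


def export_csv(conn, out, page_size=1000):
    """Stream records to `out` page by page and return a small manifest.

    The hash is updated per page, so the full record set is never held
    in memory. Rows are ordered so repeated exports are byte-identical.
    """
    digest = hashlib.sha256()
    row_count = 0
    cursor = conn.execute(
        "SELECT sample_id, value, unit FROM records ORDER BY sample_id"
    )
    while True:
        page = cursor.fetchmany(page_size)
        if not page:
            break
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(page)
        chunk = buffer.getvalue()
        out.write(chunk)
        digest.update(chunk.encode("utf-8"))
        row_count += len(page)
    return {"row_count": row_count, "sha256": digest.hexdigest()}


def demo_archive():
    """Build a tiny in-memory archive with synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (sample_id TEXT, value REAL, unit TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?)",
        [("b", 2.0, "count"), ("a", 1.5, "count"), ("c", 72.0, "count/min")],
    )
    return conn
```

Exporting the same archive twice with different page sizes should yield the same manifest, which is one way to phrase the reproducibility test in the checklist.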
|
| 208 |
+## Milestone 8 - UI/Data Flow Migration |
|
| 209 |
+ |
|
| 210 |
+**Purpose:** Move UI from prototype storage to target cache/query flow. |
|
| 211 |
+ |
|
| 212 |
+Checklist: |
|
| 213 |
+- [ ] Replace direct SwiftData `@Query` dependencies for target screens. |
|
| 214 |
+- [ ] Dashboard reads Core Data cache. |
|
| 215 |
+- [ ] Observation timeline reads Core Data cache. |
|
| 216 |
+- [ ] Observation detail uses cached summaries plus paged SQLite DTOs. |
|
| 217 |
+- [ ] Diff detail uses cached summary plus paged SQLite DTOs. |
|
| 218 |
+- [ ] Data type screens use target change labels. |
|
| 219 |
+- [ ] Export preview uses export query/manifest APIs. |
|
| 220 |
+- [ ] Archive status reflects SQLite/Core Data cache health. |
|
| 221 |
+- [ ] Legacy/small-device UI mode simplifies heavy visualizations. |
|
| 222 |
+ |
|
| 223 |
+Acceptance: |
|
| 224 |
+- [ ] Core Time Machine flows work without SwiftData as target persistence. |
|
| 225 |
+- [ ] UI copy uses observation/diff/export language. |
|
| 226 |
+- [ ] No critical data-loss messaging is based on counts alone. |
|
| 227 |
+- [ ] Large record tables are paged. |
|
| 228 |
+- [ ] Legacy mode preserves capture/report/export. |
|
| 229 |
+ |
|
| 230 |
+## Milestone 9 - Legacy SwiftData Retirement |
|
| 231 |
+ |
|
| 232 |
+**Purpose:** Remove prototype persistence from the target architecture. |
|
| 233 |
+ |
|
| 234 |
+Checklist: |
|
| 235 |
+- [ ] Identify all remaining SwiftData imports. |
|
| 236 |
+- [ ] Replace SwiftData models used by active flows. |
|
| 237 |
+- [ ] Remove/disable `ModelContainer` as required for target builds. |
|
| 238 |
+- [ ] Add prototype-store ignore/delete/reset path for test installs. |
|
| 239 |
+- [ ] Verify no old-store compatibility layer remains in active flows. |
|
| 240 |
+- [ ] Lower deployment target as far as dependencies allow. |
|
| 241 |
+- [ ] Verify build for iOS 15-era target constraints. |
|
| 242 |
+ |
|
| 243 |
+Acceptance: |
|
| 244 |
+- [ ] SwiftData is not required for normal app launch. |
|
| 245 |
+- [ ] Active flows use SQLite + Core Data cache. |
|
| 246 |
+- [ ] Prototype data handling is explicit: old stores are ignored/deleted/reset for test installs. |
|
| 247 |
+ |
|
| 248 |
+## Milestone 10 - Acceptance Gate |
|
| 249 |
+ |
|
| 250 |
+**Purpose:** Decide whether the refactor is complete enough to build product features on top. |
|
| 251 |
+ |
|
| 252 |
+Checklist: |
|
| 253 |
+- [ ] Point-in-time reconstruction works. |
|
| 254 |
+- [ ] Large diff works SQL-first. |
|
| 255 |
+- [ ] Materialized aggregates can be rebuilt and verified. |
|
| 256 |
+- [ ] Core Data cache can be deleted and rebuilt. |
|
| 257 |
+- [ ] Large export streams/pages. |
|
| 258 |
+- [ ] Recovery-compatible manifest is present. |
|
| 259 |
+- [ ] SQLite integrity checks pass. |
|
| 260 |
+- [ ] Low-memory synthetic tests pass. |
|
| 261 |
+- [ ] UI no longer depends on SwiftData as foundation. |
|
| 262 |
+- [ ] Docs match implementation. |
|
| 263 |
+ |
|
| 264 |
+Acceptance: |
|
| 265 |
+- [ ] Product can safely proceed to UI polish and higher-level workflows. |
|
| 266 |
+- [ ] Database is no longer the main unresolved architectural risk. |
|
| 267 |
+ |
|
| 268 |
+## Parallelization Guide |
|
| 269 |
+ |
|
| 270 |
+Can run in parallel after Milestone 1: |
|
| 271 |
+- synthetic data harness; |
|
| 272 |
+- schema implementation; |
|
| 273 |
+- Core Data cache model drafting; |
|
| 274 |
+- export format drafting; |
|
| 275 |
+- UI DTO contract design. |
|
| 276 |
+ |
|
| 277 |
+Must not run before dependencies: |
|
| 278 |
+- UI migration before SQL query layer and Core Data cache exist; |
|
| 279 |
+- export implementation before manifest design is locked; |
|
| 280 |
+- legacy SwiftData removal before replacement flows exist; |
|
| 281 |
+- archive v2 initialization before reset/reinitialization policy is documented. |
|
| 282 |
+ |
|
| 283 |
+## Agent Assignment Hints |
|
| 284 |
+ |
|
| 285 |
+| Workstream | Primary Doc | |
|
| 286 |
+|------------|-------------| |
|
| 287 |
+| SQLite schema/write path/query layer | [`../02-architecture/Database-Design.md`](../02-architecture/Database-Design.md) | |
|
| 288 |
+| HealthKit capture integration | [`../02-architecture/Implementation-Guide.md`](../02-architecture/Implementation-Guide.md) | |
|
| 289 |
+| Core Data cache | [`../02-architecture/Core-Data-Cache-Design.md`](../02-architecture/Core-Data-Cache-Design.md) | |
|
| 290 |
+| Export formats/manifests | [`../02-architecture/Export-Specification.md`](../02-architecture/Export-Specification.md) | |
|
| 291 |
+| UI migration | [`../00-agent-guides/CLAUDE.md`](../00-agent-guides/CLAUDE.md) | |
|
| 292 |
+| Product language/non-goals | [`../01-product/MVP-Specification.md`](../01-product/MVP-Specification.md) | |
|
| 293 |
+| Status updates | [`IMPLEMENTATION_STATUS.md`](IMPLEMENTATION_STATUS.md) | |
|
@@ -1,4 +1,15 @@ |
||
| 1 |
-# Data Type Views Optimization – Visual Guide |
|
| 1 |
+# Archived Note - Data Type Views Optimization Visual Guide |
|
| 2 |
+ |
|
| 3 |
+**Status:** Historical implementation note. Do not treat this file as current product scope or UI requirements. |
|
| 4 |
+ |
|
| 5 |
+For active UI direction, read: |
|
| 6 |
+- [`../README.md`](../README.md) |
|
| 7 |
+- [`../00-agent-guides/CLAUDE.md`](../00-agent-guides/CLAUDE.md) |
|
| 8 |
+- [`../01-product/MVP-Specification.md`](../01-product/MVP-Specification.md) |
|
| 9 |
+ |
|
| 10 |
+--- |
|
| 11 |
+ |
|
| 12 |
+# Data Type Views Optimization - Visual Guide |
|
| 2 | 13 |
|
| 3 | 14 |
## Overview |
| 4 | 15 |
|
@@ -1,4 +1,15 @@ |
||
| 1 |
-# Data Type Views Optimization – Visual Redesign |
|
| 1 |
+# Archived Note - Data Type Views Optimization Visual Redesign |
|
| 2 |
+ |
|
| 3 |
+**Status:** Historical implementation note. Do not treat this file as current product scope or UI requirements. |
|
| 4 |
+ |
|
| 5 |
+For active UI direction, read: |
|
| 6 |
+- [`../README.md`](../README.md) |
|
| 7 |
+- [`../00-agent-guides/CLAUDE.md`](../00-agent-guides/CLAUDE.md) |
|
| 8 |
+- [`../01-product/MVP-Specification.md`](../01-product/MVP-Specification.md) |
|
| 9 |
+ |
|
| 10 |
+--- |
|
| 11 |
+ |
|
| 12 |
+# Data Type Views Optimization - Visual Redesign |
|
| 2 | 13 |
|
| 3 | 14 |
## Summary |
| 4 | 15 |
|
@@ -1,409 +0,0 @@ |
||
| 1 |
-# HealthProbe – Risks, Limitations & Forensic Capabilities |
|
| 2 |
- |
|
| 3 |
- |
|
| 4 |
-## 1. Known Limitations |
|
| 5 |
- |
|
| 6 |
-### 1.1 HealthKit Framework Constraints |
|
| 7 |
- |
|
| 8 |
-**What HealthProbe Cannot Detect:** |
|
| 9 |
- |
|
| 10 |
-| Gap | Why | Mitigation | |
|
| 11 |
-|-----|-----|-----------| |
|
| 12 |
-| **Modifications without deletion** | HealthKit has no "modified" event, only "added" and "deleted" | Use snapshot comparison to detect value changes | |
|
| 13 |
-| **Lost deletions** | If deletion notification arrives while app backgrounded, we may miss it | Monitor both anchored queries AND deleted objects | |
|
| 14 |
-| **Timing precision** | Anchored queries may batch multiple changes, lose granular timestamps | Store both sample timestamp AND observation timestamp | |
|
| 15 |
-| **Private HealthKit types** | Some data types not accessible to third-party apps | Accept data available only to Health.app | |
|
| 16 |
-| **Cross-device sync delays** | Watch-to-phone-to-cloud can take 24+ hours | Extend the observation window; don't flag anomalies immediately after a device sync | |
|
| 17 |
-| **Consolidation / downsampling** | Apple Health/iCloud can rewrite high-frequency historical samples in-place (count decreases; values/intervals change) | Store fingerprints + optionally archive raw samples for selected types (local-only forensic backup mode) | |
|
| 18 |
- |
|
| 19 |
-### 1.2 iOS Background Mode Limitations |
|
| 20 |
- |
|
| 21 |
-**Background Fetch Reality:** |
|
| 22 |
-- iOS may delay or skip background fetch requests (battery, network state, user activity) |
|
| 23 |
-- No guarantee that HealthProbe will run at specified interval |
|
| 24 |
-- User can disable background refresh in Settings → HealthProbe |
|
| 25 |
-- System may suspend app if device is low on storage |
|
| 26 |
- |
|
| 27 |
-**Impact:** Anomalies may not be detected for 24-72 hours after occurrence |
|
| 28 |
- |
|
| 29 |
-**Mitigation:** |
|
| 30 |
-- Encourage users to open app regularly (at least weekly) |
|
| 31 |
-- Provide manual "Check Now" button |
|
| 32 |
-- Use `HKObserverQuery` for real-time detection when app is running |
|
| 33 |
- |
|
| 34 |
-### 1.3 Data Retention Constraints |
|
| 35 |
- |
|
| 36 |
-**HealthKit Sample Retention:** |
|
| 37 |
-- HealthKit automatically deletes some transient data (e.g., minute-level HR after 90 days) |
|
| 38 |
-- User can manually delete samples (HealthProbe cannot prevent this) |
|
| 39 |
-- Backups may not restore all data (lossy compression, sync state) |
|
| 40 |
- |
|
| 41 |
-**HealthProbe Impact:** |
|
| 42 |
-- Cannot reconstruct data that was never observed |
|
| 43 |
-- Snapshot from 6 months ago may have samples no longer in HealthKit |
|
| 44 |
-- Gap detection assumes continuous observation (may be false positive if app uninstalled then reinstalled) |
|
| 45 |
-- If Apple consolidates history, **counts alone can be misleading** (a month can “lose” samples but keep the same totals via aggregation); value-level forensics are required for proof |
|
| 46 |
- |
|
| 47 |
- |
|
| 48 |
-## 2. Risk Assessment |
|
| 49 |
- |
|
| 50 |
-### 2.1 Privacy Risks (Mitigation: Excellent ✅) |
|
| 51 |
- |
|
| 52 |
-| Risk | Impact | Mitigation | |
|
| 53 |
-|------|--------|-----------| |
|
| 54 |
-| **Raw health data exfiltration** | CRITICAL: user's personal health history exposed | ✅ Local-only storage, never sends raw samples | |
|
| 55 |
-| **Device fingerprinting** | HIGH: tracking user across services | ✅ Salted hash of device ID, stored locally only | |
|
| 56 |
-| **Timing attacks** (inferring behavior) | MEDIUM: archive/check patterns reveal habits | ✅ No automatic cloud sync; reports are explicit local exports | |
|
| 57 |
-| **App crashes leaking data** | LOW: crash logs may contain HealthKit info | ✅ All logging is aggregated (counts, not values) | |
|
| 58 |
- |
|
| 59 |
-### 2.2 Data Integrity Risks (Mitigation: Good ✅) |
|
| 60 |
- |
|
| 61 |
-| Risk | Impact | Mitigation | |
|
| 62 |
-|------|--------|-----------| |
|
| 63 |
-| **Snapshot corruption** | HIGH: audit trail becomes unreliable | ✅ Use MD5 checksum of snapshots, detect corruption | |
|
| 64 |
-| **Lost audit trail on uninstall** | HIGH: forensic data disappears | ⚠️ PARTIAL: Encourage export before uninstall; future: iCloud backup option | |
|
| 65 |
-| **Silent rewrites (no deletion events)** | HIGH: history can change without HKDeletedObject evidence | ✅ Detect via fingerprint diffs; **Forensic Backup Mode** can preserve per-sample evidence locally for selected types | |
|
| 66 |
-| **Clock skew** (device time wrong) | MEDIUM: timestamps inaccurate, anomaly detection confused | ⚠️ Log both device time + time since boot, detect skew | |
|
| 67 |
-| **Concurrent modification** (app + Health.app) | LOW: race conditions during query | ✅ Anchored queries are atomic | |
|
| 68 |
- |
|
| 69 |
-### 2.3 Security Risks (Mitigation: Good ✅) |
|
| 70 |
- |
|
| 71 |
-| Risk | Impact | Mitigation | |
|
| 72 |
-|------|--------|-----------| |
|
| 73 |
-| **Malicious apps accessing HealthKit** | MEDIUM: third-party apps can read our data | ✅ iOS sandboxing; ask user for HealthKit permission per app | |
|
| 74 |
-| **Jailbroken device** | CRITICAL: all bets off | ⚠️ Not defendable; document assumption: standard iOS device | |
|
| 75 |
-| **iCloud account compromise** | MEDIUM: Apple Health/iCloud state may still affect source data | ✅ HealthProbe archive remains local; no HealthProbe CloudKit sync | |
|
| 76 |
-| **Local device theft** | MEDIUM: thief can see audit trail | ✅ Data encrypted by iOS, requires device unlock | |
|
| 77 |
- |
|
| 78 |
- |
|
| 79 |
-## 3. Forensic Capabilities |
|
| 80 |
- |
|
| 81 |
-### 3.1 Questions HealthProbe Can Answer |
|
| 82 |
- |
|
| 83 |
-**Q1: "Was my data lost?"** |
|
| 84 |
-``` |
|
| 85 |
-Answer method: |
|
| 86 |
- 1. Load all snapshots for "Steps" type |
|
| 87 |
- 2. Create timeline: date → sample count |
|
| 88 |
- 3. Detect gap where count drops significantly |
|
| 89 |
- 4. Report: when, how much, which source |
|
| 90 |
- |
|
| 91 |
-Example output: |
|
| 92 |
- ✅ Yes — On 2026-03-15, step data dropped from 8,234 to 2,100 |
|
| 93 |
- Loss: 6,134 samples (74.5%) |
|
| 94 |
- Source: "iPhone Health App" |
|
| 95 |
- Severity: CRITICAL |
|
| 96 |
-``` |
|
| 97 |
- |
|
| 98 |
-**Q2: "Why did my data diverge?"** |
|
| 99 |
-``` |
|
| 100 |
-Answer method: |
|
| 101 |
- 1. Load historical aggregates (daily sums) |
|
| 102 |
- 2. Fit trend line across 30/60/90 day periods |
|
| 103 |
- 3. Calculate deviation from baseline |
|
| 104 |
- 4. Correlate with sync state changes & OS updates |
|
| 105 |
- |
|
| 106 |
-Example output: |
|
| 107 |
- 📊 Step count down about 15% versus baseline |
|
| 108 |
- Baseline (2025): avg 9,200 steps/day (σ = 1,200) |
|
| 109 |
- Recent (2026): avg 7,800 steps/day (σ = 1,800) |
|
| 110 |
- Correlation: Matches iPhone → Apple Watch priority shift |
|
| 111 |
-``` |
|
| 112 |
- |
|
| 113 |
-**Q3: "When did this data first appear?"** |
|
| 114 |
-``` |
|
| 115 |
-Answer method: |
|
| 116 |
- 1. Search anomaly trail for "historical_insertion" |
|
| 117 |
- 2. Find sample in audit trail with matching ID |
|
| 118 |
- 3. Report exact timestamp (±1 sync cycle) |
|
| 119 |
- |
|
| 120 |
-Example output: |
|
| 121 |
- 🔍 Workout "Morning Run" (2025-01-15) |
|
| 122 |
- First observed: 2026-05-01 at 14:35:22 UTC |
|
| 123 |
- Age: 471 days |
|
| 124 |
- Context: iCloud sync completed 2 minutes prior |
|
| 125 |
-``` |
|
| 126 |
- |
|
| 127 |
-**Q4: "Is my device syncing correctly?"** |
|
| 128 |
-``` |
|
| 129 |
-Answer method: |
|
| 130 |
- 1. Load sync state changes |
|
| 131 |
- 2. Check for state = "icloud_sync_active" frequency |
|
| 132 |
- 3. Measure time between sync completions |
|
| 133 |
- 4. Compare to baseline (typical: every 2-4 hours) |
|
| 134 |
- |
|
| 135 |
-Example output: |
|
| 136 |
- 📡 Sync frequency: ABNORMAL |
|
| 137 |
- Expected: sync every 2-4 hours |
|
| 138 |
- Observed: Last sync 6 days ago |
|
| 139 |
- Status: ⚠️ iCloud sync may be stuck |
|
| 140 |
-``` |
|
| 141 |
- |
|
| 142 |
-**Q5: "Which devices are contributing data?"** |
|
| 143 |
-``` |
|
| 144 |
-Answer method: |
|
| 145 |
- 1. Analyze source distribution in snapshots |
|
| 146 |
- 2. Track which source each sample came from |
|
| 147 |
- 3. Report composition over time |
|
| 148 |
- |
|
| 149 |
-Example output: |
|
| 150 |
- 📱 Data source composition: |
|
| 151 |
- iPhone Health: 45% (4,532 samples) |
|
| 152 |
- Apple Watch: 50% (5,041 samples) |
|
| 153 |
- Manual entry: 5% (504 samples) |
|
| 154 |
- |
|
| 155 |
- Trend: Watch contribution increased from 30% (Jan) to 50% (Now) |
|
| 156 |
-``` |
|
| 157 |
- |
|
| 158 |
-### 3.2 Export Formats for Analysis |
|
| 159 |
- |
|
| 160 |
-**Format 1: JSON Forensic Report** |
|
| 161 |
-```json |
|
| 162 |
-{
|
|
| 163 |
- "export_date": "2026-05-01T14:35:22Z", |
|
| 164 |
- "device": "iPhone 15 Pro", |
|
| 165 |
- "ios_version": "18.4.1", |
|
| 166 |
- "app_version": "1.0.0", |
|
| 167 |
- "observation_period": {
|
|
| 168 |
- "start": "2026-01-01T00:00:00Z", |
|
| 169 |
- "end": "2026-05-01T14:35:22Z", |
|
| 170 |
- "days": 120 |
|
| 171 |
- }, |
|
| 172 |
- "anomalies_summary": {
|
|
| 173 |
- "total": 12, |
|
| 174 |
- "critical": 2, |
|
| 175 |
- "warning": 3, |
|
| 176 |
- "info": 7, |
|
| 177 |
- "by_type": {
|
|
| 178 |
- "historical_insertion": 5, |
|
| 179 |
- "silent_deletion": 2, |
|
| 180 |
- "duplicate": 3, |
|
| 181 |
- "divergence": 2 |
|
| 182 |
- } |
|
| 183 |
- }, |
|
| 184 |
- "anomalies": [ |
|
| 185 |
- {
|
|
| 186 |
- "id": "ANML_20260501_001", |
|
| 187 |
- "type": "silent_deletion", |
|
| 188 |
- "severity": "critical", |
|
| 189 |
- "timestamp": "2026-04-15T08:30:00Z", |
|
| 190 |
- "description": "72 step samples lost without deletion notification", |
|
| 191 |
- "evidence": {
|
|
| 192 |
- "sample_type": "Steps", |
|
| 193 |
- "loss_count": 72, |
|
| 194 |
- "loss_percent": 23.4, |
|
| 195 |
- "affected_dates": ["2026-04-13", "2026-04-14", "2026-04-15"] |
|
| 196 |
- } |
|
| 197 |
- } |
|
| 198 |
- ], |
|
| 199 |
- "snapshots": [ |
|
| 200 |
- {
|
|
| 201 |
- "timestamp": "2026-05-01T14:35:00Z", |
|
| 202 |
- "type": "Steps", |
|
| 203 |
- "count": 305432, |
|
| 204 |
- "sources": {
|
|
| 205 |
- "iPhone Health": 152716, |
|
| 206 |
- "Apple Watch": 152716 |
|
| 207 |
- } |
|
| 208 |
- } |
|
| 209 |
- ] |
|
| 210 |
-} |
|
| 211 |
-``` |
|
| 212 |
- |
|
| 213 |
-**Format 2: CSV Timeline (for Excel/Sheets)** |
|
| 214 |
-```csv |
|
| 215 |
-Date,Time,Event,Type,Severity,Description,Details |
|
| 216 |
-2026-01-01,08:15,Initial Snapshot,snapshot,info,Baseline established,305432 steps |
|
| 217 |
-2026-02-15,14:20,Historical Insertion,anomaly,medium,Data appeared retroactively,Workout from 2025-01-15 |
|
| 218 |
-2026-03-15,09:00,Silent Deletion,anomaly,critical,Data gap detected,72 steps lost |
|
| 219 |
-2026-04-20,16:45,Sync State Change,sync,info,iCloud sync completed,Samples added: 87 |
|
| 220 |
-``` |
|
| 221 |
- |
|
| 222 |
-**Format 3: Markdown Report (for Bug Submissions)** |
|
| 223 |
-```markdown |
|
| 224 |
-# Apple Health Data Integrity Issue Report |
|
| 225 |
- |
|
| 226 |
-## Timeline |
|
| 227 |
-- **Start observation:** 2026-01-01 |
|
| 228 |
-- **Last check:** 2026-05-01 |
|
| 229 |
-- **Total observations:** 120 days |
|
| 230 |
- |
|
| 231 |
-## Issue Summary |
|
| 232 |
-3 critical anomalies detected involving 280+ data samples and 15 days of missing data. |
|
| 233 |
- |
|
| 234 |
-### Critical Finding #1: Silent Deletion (2026-03-15) |
|
| 235 |
-- **Type:** 72 step samples disappeared without notification |
|
| 236 |
-- **Date affected:** 2026-04-13 to 2026-04-15 |
|
| 237 |
-- **Detection:** Snapshot comparison (305,432 → 305,360) |
|
| 238 |
-- **Severity:** Critical |
|
| 239 |
-- **Context:** Occurred 3 days after iOS 18.4 update |
|
| 240 |
- |
|
| 241 |
-### Critical Finding #2: Historical Insertion (2026-02-15) |
|
| 242 |
-- **Type:** Workout appears 396 days after original date |
|
| 243 |
-- **Sample:** "Morning Run" from 2025-01-15 |
|
| 244 |
-- **First observed:** 2026-02-15 14:20 UTC |
|
| 245 |
-- **Context:** 2 minutes after iCloud sync completed |
|
| 246 |
-- **Severity:** Medium (likely restore from backup) |
|
| 247 |
- |
|
| 248 |
-## Recommendations |
|
| 249 |
-1. Verify data integrity on other devices |
|
| 250 |
-2. Compare with iCloud.com Health export |
|
| 251 |
-3. Review iOS 18.4 release notes for HealthKit changes |
|
| 252 |
-4. Check if backup restore was interrupted |
|
| 253 |
-``` |
|
| 254 |
- |
|
| 255 |
-### 3.3 Forensic Techniques Enabled by HealthProbe |
|
| 256 |
- |
|
| 257 |
-**Technique 1: Timeline Reconstruction** |
|
| 258 |
-``` |
|
| 259 |
-Given: Snapshots at [T0, T1, T2, ...] |
|
| 260 |
-Compute: Δ_count = snapshot[T_i] - snapshot[T_i-1] |
|
| 261 |
-Result: Visual timeline of when data appeared/disappeared |
|
| 262 |
-Use: Correlate with sync events, OS updates, app launches |
|
| 263 |
-``` |
|
| 264 |
- |
|
| 265 |
-**Technique 2: Source Attribution** |
|
| 266 |
-``` |
|
| 267 |
-Given: Source field in each snapshot |
|
| 268 |
-Track: iPhone vs. Watch vs. Manual contributions |
|
| 269 |
-Result: Identify which device is unreliable |
|
| 270 |
-Use: Isolate whether issue is device, OS, or iCloud |
|
| 271 |
-``` |
|
| 272 |
- |
|
| 273 |
-**Technique 3: Anomaly Clustering** |
|
| 274 |
-``` |
|
| 275 |
-Given: All anomalies with timestamps |
|
| 276 |
-Cluster: Group nearby anomalies (e.g., within 24 hours) |
|
| 277 |
-Result: Pattern detection — is this systemic or isolated? |
|
| 278 |
-Use: Determine if it's device-specific or iOS version issue |
|
| 279 |
-``` |
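The clustering step in Technique 3 could be sketched as a simple gap-based grouping: anomalies are sorted by time, and a new cluster starts whenever the gap to the previous anomaly exceeds the window. This is one possible reading of "group nearby anomalies", not the documented algorithm. The function name and the 24-hour default are assumptions.

```python
from datetime import datetime, timedelta


def cluster_by_gap(timestamps, max_gap=timedelta(hours=24)):
    """Group timestamps so neighbours within max_gap share a cluster."""
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)   # close enough to the previous anomaly
        else:
            clusters.append([ts])     # gap too large: start a new cluster
    return clusters


# Synthetic anomaly times for illustration only.
anomaly_times = [
    datetime(2026, 3, 15, 9, 0),
    datetime(2026, 4, 20, 16, 45),
    datetime(2026, 3, 16, 10, 0),
    datetime(2026, 3, 15, 20, 0),
]
clusters = cluster_by_gap(anomaly_times)
```

With this chaining rule a cluster can span more than 24 hours overall, as long as each consecutive gap stays within the window. A fixed-window variant would behave differently.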
|
| 280 |
- |
|
| 281 |
-**Technique 4: Cross-Device Correlation** (Future: macOS) |
|
| 282 |
-``` |
|
| 283 |
-Given: Multiple devices' HealthProbe exports |
|
| 284 |
-Compare: Are anomalies synchronized across devices? |
|
| 285 |
-Result: Distinguish local bug from iCloud sync issue |
|
| 286 |
-Use: Report "this affects all devices" vs. "only this device" |
|
| 287 |
-``` |
|
| 288 |
- |
|
| 289 |
- |
|
| 290 |
-## 4. Comparison: HealthProbe vs. Alternatives |
|
| 291 |
- |
|
| 292 |
-| Feature | HealthProbe | Health.app | Third-party apps | |
|
| 293 |
-|---------|------------|-----------|-----------------| |
|
| 294 |
-| **Real-time monitoring** | ✅ Yes | ❌ No | ⚠️ Partial | |
|
| 295 |
-| **Audit trail** | ✅ Yes | ❌ No | ❌ No | |
|
| 296 |
-| **Detects data loss** | ✅ Yes | ❌ No (silent) | ❌ No | |
|
| 297 |
-| **Privacy (no exfiltration)** | ✅ Yes | ✅ Yes | ❌ Often sells data | |
|
| 298 |
-| **Local-only** | ✅ Yes | ✅ Yes | ❌ Often cloud-based | |
|
| 299 |
-| **Open source** | 🔄 Future | ❌ No | ⚠️ Some are | |
|
| 300 |
-| **Forensic export** | ✅ Yes | ❌ No | ⚠️ Limited | |
|
| 301 |
- |
|
| 302 |
- |
|
| 303 |
-## 5. Recommended Usage Patterns |
|
| 304 |
- |
|
| 305 |
-### 5.1 For Individual Users (Personal Monitoring) |
|
| 306 |
- |
|
| 307 |
-``` |
|
| 308 |
-Baseline: |
|
| 309 |
- 1. Install HealthProbe |
|
| 310 |
- 2. Let it run for 30 days to establish baseline |
|
| 311 |
- 3. Export first "clean" snapshot |
|
| 312 |
- |
|
| 313 |
-Ongoing: |
|
| 314 |
- 1. Check app weekly (or enable background notifications) |
|
| 315 |
- 2. If anomaly alert → Screenshot it |
|
| 316 |
- 3. If critical → Export full report immediately |
|
| 317 |
- 4. Keep exported reports in Notes/Files app as backup |
|
| 318 |
- |
|
| 319 |
-Post-incident: |
|
| 320 |
- 1. Export complete forensic report |
|
| 321 |
- 2. Attach to Apple Feedback Assistant ticket |
|
| 322 |
- 3. Include link to DearApple issue #001 |
|
| 323 |
-``` |
|
| 324 |
- |
|
| 325 |
-### 5.2 For Researchers (Data Collection) |
|
| 326 |
- |
|
| 327 |
-``` |
|
| 328 |
-Setup: |
|
| 329 |
- 1. Export anonymized anomaly summaries manually |
|
| 330 |
- 2. Keep raw archive local |
|
| 331 |
- |
|
| 332 |
-Analysis: |
|
| 333 |
- 1. Correlate own data loss with iOS release dates |
|
| 334 |
- 2. Compare patterns with other HealthProbe users |
|
| 335 |
- 3. Contribute findings to DearApple repository |
|
| 336 |
-``` |
|
| 337 |
- |
|
| 338 |
-### 5.3 For Apple Support / Developers |
|
| 339 |
- |
|
| 340 |
-``` |
|
| 341 |
-When submitting feedback: |
|
| 342 |
- 1. Include HealthProbe forensic export |
|
| 343 |
- 2. Specify device model, iOS version, exact reproduction steps |
|
| 344 |
- 3. Include timeline showing when anomalies appeared |
|
| 345 |
- 4. Mention if pattern repeats across multiple devices |
|
| 346 |
-``` |
|
| 347 |
- |
|
| 348 |
- |
|
| 349 |
-## 6. Future Enhancements (Post-MVP) |
|
| 350 |
- |
|
| 351 |
-### 6.1 Machine Learning Anomaly Scoring |
|
| 352 |
-``` |
|
| 353 |
-Current: Binary detection (anomaly or not) |
|
| 354 |
-Future: Confidence scoring (0-100%) |
|
| 355 |
- - Low risk: Temporary duplicates, minor drifts |
|
| 356 |
- - High risk: Permanent loss, systematic divergence |
|
| 357 |
- - Enable severity-based alerting |
|
| 358 |
-``` |
|
| 359 |
- |
|
| 360 |
-### 6.2 Community Pattern Database |
|
| 361 |
-``` |
|
| 362 |
-Current: Single-device observation |
|
| 363 |
-Future: Anonymized multi-device dataset |
|
| 364 |
- - iOS 18.4 affected 23% of users |
|
| 365 |
- - "Morning Run" workout loses 15% post-sync (systematic) |
|
| 366 |
- - Identify if issue is iOS, device model, or iCloud region specific |
|
| 367 |
-``` |
|
| 368 |
- |
|
| 369 |
-### 6.3 Predictive Detection |
|
| 370 |
-``` |
|
| 371 |
-Current: Detect after anomaly occurs |
|
| 372 |
-Future: Alert before data is lost |
|
| 373 |
- - Watch for sync stall patterns |
|
| 374 |
- - Pre-loss indicators (e.g., rapid duplicates → deletion) |
|
| 375 |
-``` |
|
| 376 |
- |
|
| 377 |
- |
|
| 378 |
-## 7. Troubleshooting HealthProbe Itself |
|
| 379 |
- |
|
| 380 |
-### Common Issues |
|
| 381 |
- |
|
| 382 |
-| Issue | Cause | Fix | |
|
| 383 |
-|-------|-------|-----| |
|
| 384 |
-| **No anomalies detected for weeks** | Background fetch disabled | Settings → HealthProbe → Background Refresh | |
|
| 385 |
-| **Snapshots not being saved** | Insufficient storage or archive write failure | Free up space; verify local archive health and SwiftData cache rebuild | |
|
| 386 |
-| **Sync state not updating** | iCloud token check failing | Sign out/in to iCloud; restart device | |
|
| 387 |
-| **Old audit trail entries missing** | SwiftData cache/log retention policy or migration issue | Rebuild derived views from archive where possible; export reports before uninstall | |
|
| 388 |
- |
|
| 389 |
- |
|
| 390 |
-## 8. References |
|
| 391 |
- |
|
| 392 |
-- **iOS HealthKit Framework:** https://developer.apple.com/documentation/healthkit/ |
|
| 393 |
-- **HKAnchoredObjectQuery:** https://developer.apple.com/documentation/healthkit/hkanchoredobjectquery |
|
| 394 |
-- **SwiftData Persistence:** https://developer.apple.com/documentation/swiftdata |
|
| 395 |
-- **DearApple Issue #001:** Apple Health mass data loss investigation |
|
| 396 |
-- **Apple Privacy:** https://www.apple.com/privacy/ |
|
| 397 |
- |
|
| 398 |
- |
|
| 399 |
-*HealthProbe — Forensics for Your Health Data* |
|
| 400 |
-*Document v1.0 — 2026-05-01* |
|
@@ -1,75 +0,0 @@ |
||
| 1 |
-# HealthProbe iOS – Specification (MVP) |
|
| 2 |
- |
|
| 3 |
-## Overview |
|
| 4 |
-HealthProbe is an iOS application designed to **audit and monitor the integrity of HealthKit data**. |
|
| 5 |
-It detects anomalies such as: |
|
| 6 |
-- unexpected historical insertions |
|
| 7 |
-- silent deletions of past data |
|
| 8 |
-- duplicate records |
|
| 9 |
-- divergence trends over time |
|
| 10 |
- |
|
| 11 |
-The application operates as a **local audit and capture agent**. It does not sync HealthProbe data via CloudKit/iCloud; HealthKit databases can evolve differently per device, so the MVP keeps each device archive local and explicit. |
|
| 12 |
- |
|
| 13 |
-⚠️ This document describes ONLY the iOS application (MVP phase). |
|
| 14 |
-A future macOS application will act as a visualization/analysis layer. |
|
| 15 |
- |
|
| 16 |
- |
|
| 17 |
-## Core Principles |
|
| 18 |
- |
|
| 19 |
-1. **Read-only with respect to HealthKit** |
|
| 20 |
- - Never modify or delete HealthKit data |
|
| 21 |
- - Only observe and audit |
|
| 22 |
- |
|
| 23 |
-2. **Local-first architecture** |
|
| 24 |
- - All detection must work without network access |
|
| 25 |
- |
|
| 26 |
-3. **Incremental observation** |
|
| 27 |
- - Use anchored queries to track changes |
|
| 28 |
- |
|
| 29 |
-4. **No app cloud sync** |
|
| 30 |
- - HealthProbe does not sync raw samples, digests, or reports through CloudKit/iCloud |
|
| 31 |
- |
|
| 32 |
-5. **Robust local archive** |
|
| 33 |
- - Store captured HealthKit data in one local archive store, not per-data-type silos |
|
| 34 |
- - SwiftData is used for derived UI data, settings, logs, and history only |
|
| 35 |
- |
|
| 36 |
- |
|
| 37 |
-## Features (MVP) |
|
| 38 |
- |
|
| 39 |
-### 1. HealthKit Monitoring |
|
| 40 |
-Use: |
|
| 41 |
-- `HKAnchoredObjectQuery` |
|
| 42 |
-- `HKObserverQuery` |
|
| 43 |
- |
|
| 44 |
-Track: |
|
| 45 |
-- Workouts (`HKWorkoutType`) |
|
| 46 |
-- Heart Rate (`HKQuantityTypeIdentifierHeartRate`) |
|
| 47 |
-- High Heart Rate Events |
|
| 48 |
-- Other relevant samples (extensible) |
|
| 49 |
- |
|
| 50 |
-Persist: |
|
| 51 |
-- sample values and units |
|
| 52 |
-- source and source revision metadata |
|
| 53 |
-- device metadata exposed by HealthKit |
|
| 54 |
-- HealthKit metadata dictionaries |
|
| 55 |
-- first-seen / last-seen / last-verified timestamps |
|
| 56 |
-- fingerprints for matching against Apple Health XML exports and backup database extracts |
|
| 57 |
- |
|
| 58 |
- |
|
| 59 |
-### 2. Anomaly Detection |
|
| 60 |
- |
|
| 61 |
-#### A. Historical Insertions |
|
| 62 |
-Detect samples where: |
|
| 63 |
-- `startDate << now` |
|
| 64 |
-- AND `firstSeenAt ≈ now` |
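The two conditions above can be written as a small predicate. The specification does not say how far in the past counts as `<<` or how close counts as `≈`, so the 30-day and 24-hour thresholds below are placeholders, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def is_historical_insertion(
    start_date,
    first_seen_at,
    now,
    min_age=timedelta(days=30),         # assumed meaning of "startDate << now"
    recent_window=timedelta(hours=24),  # assumed meaning of "firstSeenAt ≈ now"
):
    """True when an old sample was first observed only recently."""
    is_old = now - start_date >= min_age
    is_newly_seen = timedelta(0) <= now - first_seen_at <= recent_window
    return is_old and is_newly_seen
```

A caller would probably skip this check during the initial import, because every sample is newly seen at that point and old samples would all be flagged.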
|
| 65 |
- |
|
| 66 |
-#### B. Deletions |
|
| 67 |
-Detect via: |
|
| 68 |
-- `HKDeletedObject` |
|
| 69 |
-- Compare with previously stored snapshot |
|
| 70 |
- |
|
| 71 |
-#### C. Duplicate Detection |
|
| 72 |
-Fingerprint: |
|
@@ -1,558 +0,0 @@ |
||
| 1 |
-# HealthProbe – Complete Specification & Motivations |
|
| 2 |
- |
|
| 3 |
-**Version:** 1.2 |
|
| 4 |
-**Status:** MVP (iOS monitoring agent) |
|
| 5 |
-**Last Updated:** 2026-05-18 |
|
| 6 |
- |
|
| 7 |
- |
|
| 8 |
-## 1. Executive Summary |
|
| 9 |
- |
|
| 10 |
-HealthProbe is an **audit and integrity monitoring tool for Apple HealthKit**, designed to detect and document anomalies in health data that would otherwise go unnoticed. It serves as a **local sentinel** — read-only observation with forensic-grade logging for later analysis. |
|
| 11 |
- |
|
| 12 |
-**Core Problem:** Apple Health data loss events (confirmed Sept 2025 incident, ongoing sporadic reports) lack detective mechanisms. Users do not know when data has been lost, corrupted, or silently modified. |
|
| 13 |
- |
|
| 14 |
-**Solution:** HealthProbe incrementally captures HealthKit data into a robust local archive and maintains an audit trail for post-incident forensic analysis. SwiftData is used for UI assistance, settings, logs, history, and precomputed values; it is not the source of truth. The local archive exists because Apple Health/iCloud can **rewrite / downsample / consolidate** historical samples in-place (not only add/delete), which cannot be proven by counts alone. |
|
| 15 |
- |
|
| 16 |
- |
|
| 17 |
-## 2. Motivations & Concrete Observed Cases |
|
| 18 |
- |
|
| 19 |
-### 2.1 The September 2025 Mass Data Loss Event |
|
| 20 |
- |
|
| 21 |
-**What happened:** |
|
| 22 |
-- Large-scale loss of Apple Health records reported across multiple devices and iOS versions |
|
| 23 |
-- Timeframe: September 2025 (correlated with iOS 26 release) |
|
| 24 |
-- Suspected triggers: |
|
| 25 |
- - Device migration (iCloud sync state transitions) |
|
| 26 |
- - OS upgrade/downgrade cycles |
|
| 27 |
- - Backup restore operations |
|
| 28 |
- - HealthKit database re-indexing |
|
| 29 |
- - iCloud sync divergence |
|
| 30 |
- |
|
| 31 |
-**Why undetected:** |
|
| 32 |
-- No notification from Apple Health |
|
| 33 |
-- Users discover loss retrospectively (weeks/months later) |
|
| 34 |
-- No audit trail to identify exactly *when* or *what* was lost |
|
| 35 |
-- No differentiation between: user deletion, sync loss, corruption, or app bug |
|
| 36 |
- |
|
| 37 |
-**HealthProbe's answer:** |
|
| 38 |
-- Continuous monitoring detects *first occurrence* of loss |
|
| 39 |
-- Timestamped snapshots enable forensic reconstruction |
|
| 40 |
-- Pattern detection identifies trends (gradual loss vs. sudden wipe) |
|
| 41 |
-- Allows users to file reproducible bug reports with evidence |
|
| 42 |
- |
|
| 43 |
-### 2.2 Observed Anomalies |
|
| 44 |
- |
|
| 45 |
-#### A. Historical Insertions (Backdated Data) |
|
| 46 |
-**Pattern:** HealthKit receives samples with `startDate` far in the past, but `firstSeen` ≈ now |
|
| 47 |
-- **Examples:** |
|
| 48 |
- - Workout from Jan 2023 suddenly appears in Feb 2026 |
|
| 49 |
- - Step count "corrected" retroactively without user action |
|
| 50 |
- - Heart rate baseline recalibration affecting past months |
|
| 51 |
- |
|
| 52 |
-**Root cause theories:** |
|
| 53 |
- - iCloud sync restoring from outdated backup |
|
| 54 |
- - Third-party fitness app injecting historical reconstructions |
|
| 55 |
- - HealthKit recovery logic applying retroactive corrections |
|
| 56 |
- - Cross-device sync desynchronization |
|
| 57 |
- |
|
| 58 |
-**Detection method:** Anchored queries + timestamp comparison |
|
| 59 |
- |
|
| 60 |
- |
|
| 61 |
-#### B. Silent Deletions |
|
| 62 |
-**Pattern:** Samples present in previous snapshot, absent in current, no `HKDeletedObject` notification |
|
| 63 |
-- **Examples:** |
|
| 64 |
- - 2-week gap of step data (no deletion events logged) |
|
| 65 |
- - Entire workout history from old iPhone missing post-restore |
|
| 66 |
- - Selective loss (e.g., only workouts, heart rate preserved) |
|
| 67 |
- |
|
| 68 |
-**Root cause theories:** |
|
| 69 |
- - Incomplete restore from backup |
|
| 70 |
- - Selective iCloud sync pruning based on storage limits |
|
| 71 |
- - Corrupted local database during indexing |
|
| 72 |
- - Race condition during multi-device sync |
|
| 73 |
- |
|
| 74 |
-**Detection method:** Snapshot comparison + gap detection |
|
| 75 |
- |
|
| 76 |
- |
|
| 77 |
-#### C. Duplicate Records |
|
| 78 |
-**Pattern:** Identical samples (same type, time, value) appearing multiple times |
|
| 79 |
-- **Examples:** |
|
| 80 |
- - Duplicate step counts from watch syncing (15 min apart) |
|
| 81 |
- - Duplicate workouts after iCloud re-sync |
|
| 82 |
- - Conflicting HR readings within 30-second window |
|
| 83 |
- |
|
| 84 |
-**Root cause theories:** |
|
| 85 |
- - Multi-device sync collision (watch + phone + iPad) |
|
| 86 |
- - Retry logic without deduplication |
|
| 87 |
- - Backup restore merging with live data |
|
| 88 |
- |
|
| 89 |
-**Detection method:** Fingerprinting (type + date + value + source) |
|
| 90 |
- |
|
| 91 |
- |
|
| 92 |
-#### D. Divergence Trends |
|
| 93 |
-**Pattern:** Measurable drift in aggregated metrics over time |
|
| 94 |
-- **Examples:** |
|
| 95 |
- - Active energy expenditure trending down 30% without behavior change |
|
| 96 |
- - Sleep records shifting systematically earlier/later |
|
| 97 |
- - Heart rate variability calculation method changing unexpectedly |
|
| 98 |
- |
|
| 99 |
-**Root cause theories:** |
|
| 100 |
- - Algorithm updates not backfilled uniformly |
|
| 101 |
- - Device calibration drift |
|
| 102 |
- - Source priority shifting (watch → phone) |
|
| 103 |
- - Health app recalculation without user visibility |
|
| 104 |
- |
|
| 105 |
-**Detection method:** Time-series aggregation + statistical outlier detection |
|
| 106 |
- |
|
| 107 |
- |
|
| 108 |
-#### E. Consolidation / Downsampling (Historical Rewrites) |
|
| 109 |
-**Pattern:** For high-frequency data types, Apple Health/iCloud may rewrite historical samples such that: |
|
| 110 |
-- total sample `count` decreases for older months |
|
| 111 |
-- timestamps in a later snapshot/export become a **subset** of an earlier snapshot/export (thinning) |
|
| 112 |
-- some samples become **interval-based** (`endDate > startDate`) and/or values become fractional |
|
| 113 |
-- for cumulative quantities, `value_sum` can remain stable while per-sample `value_max` increases (consolidation) |
|
| 114 |
- |
|
| 115 |
-**Why it matters:** A “record counter” cannot distinguish discard vs consolidation. HealthProbe must support value-level forensics and (optionally) preserve complete evidence locally. |
|
| 116 |
- |
|
| 117 |
-**Detection method:** Snapshot comparison of fingerprints *plus* optional per-sample archives for selected data types. |
|
| 118 |
- |
|
| 119 |
- |
|
| 120 |
-### 2.3 Why This Matters |
|
| 121 |
- |
|
| 122 |
-| Concern | Impact | HealthProbe Role | |
|
| 123 |
-|---------|--------|-----------------| |
|
| 124 |
-| **Data loss undetected** | Users lose personal health history with no notification | Immediate detection & alert | |
|
| 125 |
-| **No forensic trail** | Impossible to reproduce for bug reports | Audit trail enables Apple debugging | |
|
| 126 |
-| **Blame uncertainty** | "Is it sync? Backup? A bug?" | Precise classification of anomaly type | |
|
| 127 |
-| **Third-party apps** | Apps assume data is trustworthy, may make wrong decisions | Detect corruption before downstream use | |
|
| 128 |
-| **Privacy of monitoring** | Users fear data exfiltration by health apps | Local-only observation, no cloud upload | |
|
| 129 |
- |
|
| 130 |
- |
|
| 131 |
-## 3. Core Architecture |
|
| 132 |
- |
|
| 133 |
-### 3.1 Design Principles |
|
| 134 |
- |
|
| 135 |
-1. **Read-only operations** (never modify HealthKit data) |
|
| 136 |
-2. **Local-first** (full functionality without network) |
|
| 137 |
-3. **Incremental queries** (efficient, avoid repeating work) |
|
| 138 |
-4. **Single archive store** (do not split the forensic store per data type; cross-type relationships and shared metadata matter) |
|
| 139 |
-5. **Auditability** (every observation logged, timestamped, reproducible) |
|
| 140 |
-6. **Privacy by default** (no HealthProbe cloud sync; local storage remains under user control) |
|
| 141 |
-7. **Forensic capture** (selected data types are archived locally as complete per-sample records with metadata to preserve evidence against silent rewrites) |
|
| 142 |
- |
|
| 143 |
-### 3.2 Threading Model |
|
| 144 |
- |
|
| 145 |
-``` |
|
| 146 |
-┌─────────────────────────────────────────┐ |
|
| 147 |
-│ Main Thread (UI) │ |
|
| 148 |
-│ - Display current health status │ |
|
| 149 |
-│ - Show alerts & anomalies │ |
|
| 150 |
-│ - User interaction │ |
|
| 151 |
-└──────────────┬──────────────────────────┘ |
|
| 152 |
- │ |
|
| 153 |
- ├─ Delegate query results |
|
| 154 |
- │ |
|
| 155 |
-┌──────────────▼──────────────────────────┐ |
|
| 156 |
-│ Background Queue (HealthKit Queries) │ |
|
| 157 |
-│ - HKAnchoredObjectQuery (efficient) │ |
|
| 158 |
-│ - HKObserverQuery (reactive) │ |
|
| 159 |
-│ - Snapshot comparisons │ |
|
| 160 |
-│ - Anomaly detection logic │ |
|
| 161 |
-└──────────────┬──────────────────────────┘ |
|
| 162 |
- │ |
|
| 163 |
- ├─ Write detected anomalies |
|
| 164 |
- │ |
|
| 165 |
-┌──────────────▼──────────────────────────┐ |
|
| 166 |
-│ Local Archive Store │ |
|
| 167 |
-│ - Canonical HealthKit samples │ |
|
| 168 |
-│ - Sources, devices, metadata │ |
|
| 169 |
-│ - Cross-type relationships │ |
|
| 170 |
-│ - Fingerprints and verification hashes │ |
|
| 171 |
-└──────────────┬──────────────────────────┘ |
|
| 172 |
- │ |
|
| 173 |
-┌──────────────▼──────────────────────────┐ |
|
| 174 |
-│ SwiftData UI Store │ |
|
| 175 |
-│ - Precomputed counts/statistics │ |
|
| 176 |
-│ - Visualization state and settings │ |
|
| 177 |
-│ - Logs, history, report indexes │ |
|
| 178 |
-└─────────────────────────────────────────┘ |
|
| 179 |
-``` |
|
| 180 |
- |
|
| 181 |
-### 3.3 Storage Model |
|
| 182 |
- |
|
| 183 |
-**Local Archive Store (source of truth):** |
|
| 184 |
-- one robust local database for all archived samples, not one archive per data type |
|
| 185 |
-- normalized entities for samples, workouts, sources, source revisions, devices, metadata, relationships, and observations |
|
| 186 |
-- multiple fingerprints per sample: HealthKit UUID, strict fingerprint, semantic fingerprint, and fuzzy matching keys for export/backup reconciliation |
|
| 187 |
-- append-only observation history (`firstSeen`, `lastSeen`, `lastVerified`, disappearance evidence) |
|
| 188 |
-- snapshot-level and table-level hashes for integrity checks |
|
| 189 |
- |
|
| 190 |
-**SwiftData UI Store (derived/cache layer):** |
|
| 191 |
-- settings and selected data types |
|
| 192 |
-- import job state and progress |
|
| 193 |
-- precomputed counts, temporal bins, display ranges, and summary statistics |
|
| 194 |
-- audit log entries and report indexes |
|
| 195 |
-- anomaly summaries and links into the archive store |
|
| 196 |
- |
|
| 197 |
-SwiftData rows must be rebuildable from the local archive store. If the two disagree, the archive store wins. |
|
| 198 |
- |
|
| 199 |
- |
|
| 200 |
-## 4. Monitoring Features (MVP) |
|
| 201 |
- |
|
| 202 |
-### 4.1 Incremental Change Detection |
|
| 203 |
- |
|
| 204 |
-**Using `HKAnchoredObjectQuery`:** |
|
| 205 |
-``` |
|
| 206 |
-Query pattern: |
|
| 207 |
-├─ Initial query: anchor = 0 → captures all existing data |
|
| 208 |
-├─ Store anchor locally |
|
| 209 |
-├─ Periodic queries: anchor = stored → captures only new/modified samples |
|
| 210 |
-└─ Update anchor → efficient incremental updates |
|
| 211 |
-``` |
|
| 212 |
- |
|
| 213 |
-**What triggers a query:** |
|
| 214 |
-- App launch |
|
| 215 |
-- Background refresh (iOS allows periodic background queries) |
|
| 216 |
-- User manually triggers "Check Now" |
|
| 217 |
-- Every 12-24 hours (configurable) |
|
| 218 |
- |
|
| 219 |
-### 4.2 Tracked Sample Types (Extensible) |
|
| 220 |
- |
|
| 221 |
-| Type | Why Monitored | Anomaly Signal | |
|
| 222 |
-|------|---------------|----------------| |
|
| 223 |
-| **Workouts** | High-value data, often synced from watch | Historical insertions, duplicates | |
|
| 224 |
-| **Heart Rate** | Continuous stream, high modification risk | Gaps, divergence | |
|
| 225 |
-| **Activity Summary** | Auto-computed, depends on other types | Recalculation without notice | |
|
| 226 |
-| **Steps** | Cross-device (watch/phone), sync-heavy | Duplicate from retries | |
|
| 227 |
-| **Sleep** | Frequently "corrected" post-recording | Backdated entries, loss | |
|
| 228 |
-| **Blood Pressure** | Manual entry, sync state-dependent | Divergence trends | |
|
| 229 |
-| **Audio Exposure** | Often device-specific | Selective loss | |
|
| 230 |
- |
|
| 231 |
-### 4.3 Anomaly Detection Logic |
|
| 232 |
- |
|
| 233 |
-#### A. Historical Insertion Detection |
|
| 234 |
-``` |
|
| 235 |
-For each sample: |
|
| 236 |
- Δt = now - startDate (age of sample) |
|
| 237 |
- Δt_observed = now - firstSeen (how long we've known about it) |
|
| 238 |
- |
|
| 239 |
- IF Δt >> Δt_observed (e.g., Δt ≈ 6 months, Δt_observed ≈ 5 minutes): |
|
| 240 |
- → Flag as "historical insertion" |
|
| 241 |
- → Severity: MEDIUM (might be legitimate correction) |
|
| 242 |
-``` |
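
The rule above can be sketched in Swift as follows. This is a minimal illustration, not the project's implementation: `ObservedSample`, `isHistoricalInsertion`, and both thresholds (30 days of age, 24 hours since first observation) are assumptions chosen for the example.

```swift
import Foundation

/// Hypothetical record of one archived sample; field names are illustrative.
struct ObservedSample {
    let startDate: Date   // when the sample claims to have been recorded
    let firstSeen: Date   // when the archive first observed it
}

/// Returns true when the sample is at least `minAge` seconds old but was
/// first observed no more than `maxObserved` seconds ago.
/// Both default thresholds are assumptions, not values from the spec.
func isHistoricalInsertion(
    _ sample: ObservedSample,
    now: Date,
    minAge: TimeInterval = 30 * 24 * 3600,
    maxObserved: TimeInterval = 24 * 3600
) -> Bool {
    let age = now.timeIntervalSince(sample.startDate)         // Δt
    let observedFor = now.timeIntervalSince(sample.firstSeen) // Δt_observed
    return age >= minAge && observedFor <= maxObserved
}
```

On the very first import every sample is newly observed, so a detector along these lines would presumably need to skip the initial baseline run.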
|
| 243 |
- |
|
| 244 |
-#### B. Deletion Detection |
|
| 245 |
-``` |
|
| 246 |
-Current snapshot: S_now |
|
| 247 |
-Previous snapshot: S_prev |
|
| 248 |
- |
|
| 249 |
-Missing = S_prev - S_now |
|
| 250 |
- (samples present before, absent now) |
|
| 251 |
- |
|
| 252 |
-IF |Missing| > 0 AND no HKDeletedObject notification: |
|
| 253 |
- → Flag as "silent deletion" |
|
| 254 |
- → Severity: CRITICAL |
|
| 255 |
- → Record gap_duration = time between last observation and absence |
|
| 256 |
-``` |
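
A sketch of the set difference above, assuming samples are tracked by UUID and that the UUIDs reported through `HKDeletedObject` have been collected separately. The function name and parameter labels are hypothetical.

```swift
import Foundation

/// Splits the samples that vanished between two snapshots into those that
/// were announced by a deletion notification and those that were not.
func classifyMissing(
    previous: Set<UUID>,
    current: Set<UUID>,
    announcedDeletions: Set<UUID>
) -> (announced: Set<UUID>, silent: Set<UUID>) {
    let missing = previous.subtracting(current)               // S_prev - S_now
    let announced = missing.intersection(announcedDeletions)  // regular deletions
    let silent = missing.subtracting(announcedDeletions)      // candidates for "silent deletion"
    return (announced, silent)
}
```

Only the `silent` set would feed the CRITICAL flag; the `announced` set is still worth logging as context.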
|
| 257 |
- |
|
| 258 |
-#### C. Duplicate Detection |
|
| 259 |
-``` |
|
| 260 |
-Fingerprint = (sampleType, startDate, value, unit, source) |
|
| 261 |
- |
|
| 262 |
-IF count(fingerprint) > 1: |
|
| 263 |
- → Flag as "duplicate record" |
|
| 264 |
- → Severity: LOW (data integrity risk) |
|
| 265 |
- → Calculate time between duplicates |
|
| 266 |
-``` |
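
The fingerprint tuple maps naturally onto a `Hashable` struct. The sketch below is an assumption about how that could look: `Fingerprint` and `duplicateGroups` are hypothetical names, and values are compared as exact `Double`s, which a real implementation might want to round or normalise per unit first.

```swift
import Foundation

/// Hypothetical strict fingerprint built from the five fields named above.
struct Fingerprint: Hashable {
    let sampleType: String
    let startDate: Date
    let value: Double
    let unit: String
    let source: String
}

/// Returns every fingerprint that occurs more than once, with its count.
func duplicateGroups(in fingerprints: [Fingerprint]) -> [Fingerprint: Int] {
    var counts: [Fingerprint: Int] = [:]
    for fingerprint in fingerprints {
        counts[fingerprint, default: 0] += 1
    }
    // Keep only entries whose occurrence count exceeds one.
    return counts.filter { $0.value > 1 }
}
```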
|
| 267 |
- |
|
| 268 |
-#### D. Divergence Detection |
|
| 269 |
-``` |
|
| 270 |
-Track aggregated metrics: |
|
| 271 |
- total_steps_per_day[date] |
|
| 272 |
- active_energy_per_day[date] |
|
| 273 |
- hr_average_per_day[date] |
|
| 274 |
- |
|
| 275 |
-For each metric over time: |
|
| 276 |
- σ_expected = standard deviation (normal range) |
|
| 277 |
- σ_observed = standard deviation over the recent window |
|
| 278 |
- |
|
| 279 |
- IF σ_observed > 2 * σ_expected: |
|
| 280 |
- → Flag as "divergence trend" |
|
| 281 |
- → Severity: MEDIUM |
|
| 282 |
- → Record trend direction & magnitude |
|
| 283 |
-``` |
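
One way to read the check above is to compare the spread of a recent window of daily aggregates against the spread of a longer baseline. The sketch assumes both inputs are plain arrays of daily values; `standardDeviation`, `isDivergent`, and the default factor of 2 mirror the pseudocode but are otherwise illustrative.

```swift
import Foundation

/// Population standard deviation; returns 0 for fewer than two values.
func standardDeviation(_ values: [Double]) -> Double {
    guard values.count > 1 else { return 0 }
    let mean = values.reduce(0, +) / Double(values.count)
    let variance = values.reduce(0) { $0 + ($1 - mean) * ($1 - mean) } / Double(values.count)
    return variance.squareRoot()
}

/// True when the recent spread exceeds `factor` times the baseline spread.
/// A flat baseline (spread of 0) is treated as "cannot decide" and returns false.
func isDivergent(baseline: [Double], recent: [Double], factor: Double = 2) -> Bool {
    let expected = standardDeviation(baseline)
    guard expected > 0 else { return false }
    return standardDeviation(recent) > factor * expected
}
```

This only captures a change in variability; a steady downward drift with constant spread would need a trend test in addition.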
|
| 284 |
- |
|
| 285 |
- |
|
| 286 |
-## 5. Sync Context Logging |
|
| 287 |
- |
|
| 288 |
-HealthProbe does **not** sync its own archive through iCloud or CloudKit. Observed HealthKit databases can diverge between devices, so cross-device HealthProbe sync would increase complexity without providing a reliable forensic source of truth. |
|
| 289 |
- |
|
| 290 |
-Health/iCloud state is still useful as **context** for anomalies. |
|
| 291 |
- |
|
| 292 |
-### 5.1 Context Tracking |
|
| 293 |
- |
|
| 294 |
-**Observe HealthKit permission & sync state:** |
|
| 295 |
-```swift |
|
| 296 |
-HKHealthStore().requestAuthorization(...) |
|
| 297 |
-// → Prompts for read access; HealthKit does not reveal whether read access was granted or later revoked |
|
| 298 |
- |
|
| 299 |
-// Monitor iCloud state |
|
| 300 |
-FileManager.default.ubiquityIdentityToken |
|
| 301 |
-// → Detects iCloud sign-in/sign-out |
|
| 302 |
-// → Logs context for later correlation |
|
| 303 |
-``` |
|
| 304 |
- |
|
| 305 |
-**Capture lifecycle events:** |
|
| 306 |
-- iCloud sign-in detected → log context and schedule a local archive verification pass |
|
| 307 |
-- iCloud sign-out detected → note local-only mode |
|
| 308 |
-- Device backup initiated → pre-backup snapshot |
|
| 309 |
-- App backgrounded/foregrounded → check for sync activity |
|
| 310 |
- |
|
| 311 |
-### 5.2 Context Documentation |
|
| 312 |
- |
|
| 313 |
-**Audit trail entries:** |
|
| 314 |
-``` |
|
| 315 |
-[2026-05-01 14:23:15] SYNC_STATE_CHANGE: iCloud enabled |
|
| 316 |
- - Previous: local-only |
|
| 317 |
- - Action: archive verification scheduled |
|
| 318 |
- - Result: no HealthProbe cloud sync performed |
|
| 319 |
- |
|
| 320 |
-[2026-05-01 14:24:02] SYNC_COMPLETE: iCloud data merged |
|
| 321 |
- - Samples added: 87 |
|
| 322 |
- - Samples deleted: 3 |
|
| 323 |
- - Duplicates found: 2 |
|
| 324 |
- - Divergence detected: NO |
|
| 325 |
- |
|
| 326 |
-[2026-05-01 16:15:00] ANOMALY_DETECTED: Historical insertion |
|
| 327 |
- - Sample: Workout "Running" |
|
| 328 |
- - Original date: 2024-03-15 |
|
| 329 |
- - First observed: 2026-05-01 |
|
| 330 |
- - Age: 777 days |
|
| 331 |
- - Severity: MEDIUM |
|
| 332 |
-``` |
|
| 333 |
- |
|
| 334 |
-### 5.3 Background Monitoring |
|
| 335 |
- |
|
| 336 |
-**iOS Background Modes enabled:** |
|
| 337 |
-- `background-fetch` — periodic archive and context checks |
|
| 338 |
-- `remote-notification` → not required for HealthProbe archive sync |
|
| 339 |
- |
|
| 340 |
-**Check frequency:** |
|
| 341 |
-- Min: 2 hours |
|
| 342 |
-- Max: 24 hours |
|
| 343 |
-- Adapts based on anomaly detection frequency |
|
| 344 |
- |
|
| 345 |
- |
|
| 346 |
-## 6. Local Archive, Reports & Forensics |
|
| 347 |
- |
|
| 348 |
-### 6.1 Local Archive Store |
|
| 349 |
- |
|
| 350 |
-The main backup artifact is the on-device archive store. It is populated incrementally from HealthKit and is not dependent on Apple Health ZIP exports or full encrypted iPhone backups. |
|
| 351 |
- |
|
| 352 |
-The archive must preserve as much HealthKit information as the API exposes: |
|
| 353 |
-- sample UUID, type, start/end date, value, unit, and metadata |
|
| 354 |
-- source, source revision, bundle identifier, product type, version/build if available |
|
| 355 |
-- device fields exposed by `HKDevice` |
|
| 356 |
-- relationships between workouts, samples, events, and other linked records where available |
|
| 357 |
-- first-seen / last-seen / last-verified observations |
|
| 358 |
-- fingerprints suitable for matching against Apple Health XML exports and extracted backup databases |
|
| 359 |
- |
|
| 360 |
-The archive is selected by data type for performance and privacy, but it is stored in **one schema** so later analysis can follow relationships between types. |
|
| 361 |
- |
|
| 362 |
-### 6.2 Reports and Point Exports |
|
| 363 |
- |
|
| 364 |
-HealthProbe does not need to optimize for routine complete exports. The local archive is the backup. |
|
| 365 |
- |
|
| 366 |
-Export is scoped to what the user is inspecting: |
|
| 367 |
-- anomaly reports |
|
| 368 |
-- tables of records shown in UI (e.g., “these 1,000 HK records disappeared”) |
|
| 369 |
-- point-in-time manifests and hashes |
|
| 370 |
-- selected record sets needed for external analysis |
|
| 371 |
- |
|
| 372 |
-### 6.3 Forensic Query Examples |
|
| 373 |
- |
|
| 374 |
-**"Has my step data been compromised?"** |
|
| 375 |
-``` |
|
| 376 |
-1. Load all snapshots for "Steps" type |
|
| 377 |
-2. Plot sample count over time |
|
| 378 |
-3. Identify gaps > 6 hours |
|
| 379 |
-4. Report: when, how many missing, context |
|
| 380 |
-``` |
|
| 381 |
- |
|
| 382 |
-**"Did iCloud sync break my data?"** |
|
| 383 |
-``` |
|
| 384 |
-1. Correlate anomalies with observed Health/iCloud state changes |
|
| 385 |
-2. Show timeline: before state change, during reconciliation, after |
|
| 386 |
-3. Calculate: samples lost, duplicates introduced |
|
| 387 |
-``` |
|
| 388 |
- |
|
| 389 |
-**"Is my health data drifting?"** |
|
| 390 |
-``` |
|
| 391 |
-1. Compute daily aggregates (steps, energy, HR) |
|
| 392 |
-2. Fit trend line over 30-90 days |
|
| 393 |
-3. Report: slope (drift direction), R² (confidence) |
|
| 394 |
-4. Compare to device baseline |
|
| 395 |
-``` |
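
Steps 2 and 3 above amount to an ordinary least-squares fit over equally spaced daily values. The following is a minimal sketch under that assumption; `linearTrend` is a hypothetical helper, and days are simply indexed 0, 1, 2, and so on.

```swift
import Foundation

/// Least-squares line through (0, y[0]), (1, y[1]), ...
/// Returns the slope per day and R², or nil for fewer than two points.
func linearTrend(_ y: [Double]) -> (slope: Double, rSquared: Double)? {
    guard y.count >= 2 else { return nil }
    let n = Double(y.count)
    let meanX = (n - 1) / 2
    let meanY = y.reduce(0, +) / n
    var sxy = 0.0
    var sxx = 0.0
    var syy = 0.0
    for (index, value) in y.enumerated() {
        let dx = Double(index) - meanX
        let dy = value - meanY
        sxy += dx * dy
        sxx += dx * dx
        syy += dy * dy
    }
    let slope = sxy / sxx
    // R² is undefined for a constant series; it is reported as 0 here.
    let rSquared = syy == 0 ? 0 : (sxy * sxy) / (sxx * syy)
    return (slope, rSquared)
}
```

The sign of `slope` gives the drift direction, and `rSquared` indicates how much of the variation the straight line explains.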
|
| 396 |
- |
|
| 397 |
- |
|
| 398 |
-## 7. User-Facing Features |
|
| 399 |
- |
|
| 400 |
-### 7.1 Dashboard (iOS App) |
|
| 401 |
- |
|
| 402 |
-**Home Screen:** |
|
| 403 |
-- **Health Status** — "✅ Healthy" / "⚠️ Check" / "🚨 Critical" |
|
| 404 |
-- **Last Check** — timestamp of last monitoring run |
|
| 405 |
-- **Quick Stats** — samples tracked, anomalies found (all-time) |
|
| 406 |
-- **Active Alerts** — up to 3 most recent anomalies |
|
| 407 |
- |
|
| 408 |
-**Detail Views:** |
|
| 409 |
-- **Anomalies** — sortable list by date/severity |
|
| 410 |
-- **Snapshots** — historical timeline of known-good snapshots |
|
| 411 |
-- **Audit Trail** — complete immutable log |
|
| 412 |
-- **Archive Status** — current local archive health, last verification, selected data types |
|
| 413 |
- |
|
| 414 |
-**Settings:** |
|
| 415 |
-- Check frequency |
|
| 416 |
-- Sample types to track |
|
| 417 |
-- Alert thresholds |
|
| 418 |
-- Local archive retention and report export options |
|
| 419 |
- |
|
| 420 |
-### 7.2 Alerts |
|
| 421 |
- |
|
| 422 |
-**Push Notifications (opt-in):** |
|
| 423 |
-- 🚨 "Critical data loss detected" (> 10% samples missing) |
|
| 424 |
-- ⚠️ "Unexpected historical data inserted" (> 100 samples) |
|
| 425 |
-- ℹ️ "Archive check completed, 2 duplicates found" |
|
| 426 |
- |
|
| 427 |
- |
|
| 428 |
-## 8. Future Enhancements (Beyond MVP) |
|
| 429 |
- |
|
| 430 |
-### 8.1 macOS Companion (Visualization Layer) |
|
| 431 |
-- Open and analyze exported HealthProbe reports or archive copies |
|
| 432 |
-- Long-term trend visualization (6-12 month history) |
|
| 433 |
-- Cross-device anomaly correlation |
|
| 434 |
-- Export to reproducible bug reports |
|
| 435 |
- |
|
| 436 |
-### 8.2 Machine Learning |
|
| 437 |
-- Personalized baseline generation |
|
| 438 |
-- Anomaly confidence scoring |
|
| 439 |
-- Predictive detection (flag drift before threshold hit) |
|
| 440 |
- |
|
| 441 |
-### 8.3 Community Patterns |
|
| 442 |
-- Anonymized digest sharing → identify systemic issues |
|
| 443 |
-- Detect if data loss correlates with: iOS version, device model, iCloud region, etc. |
|
| 444 |
-- Contribute to DearApple bug reports with statistical evidence |
|
| 445 |
- |
|
| 446 |
- |
|
| 447 |
-## 9. Technical Specifications |
|
| 448 |
- |
|
| 449 |
-### 9.1 Platform |
|
| 450 |
-- **iOS 17.0+** (minimum release that supports SwiftData; HealthKit itself is available earlier) |
|
| 451 |
-- **watchOS 8.0+** (future sync awareness) |
|
| 452 |
-- **macOS 12.0+** (visualization, analysis) |
|
| 453 |
- |
|
| 454 |
-### 9.2 Permissions Required |
|
| 455 |
-- `HealthKit` — read-only access to specified types |
|
| 456 |
-- `Background Modes` — "Background Fetch" |
|
| 457 |
- |
|
| 458 |
-### 9.3 Data Storage |
|
| 459 |
-- **Local Archive Store:** canonical HealthKit sample archive (source of truth) |
|
| 460 |
-- **SwiftData:** derived UI/cache/settings/log/history store |
|
| 461 |
-- **No CloudKit sync:** HealthProbe data remains local unless the user exports a report or selected record table |
|
| 462 |
- |
|
| 463 |
-### 9.4 Performance |
|
| 464 |
-- Query time: < 5 seconds (anchored queries) |
|
| 465 |
-- Snapshot/index size: ≈ 5-10 KB per type per snapshot in SwiftData |
|
| 466 |
-- Archive storage: depends on selected high-frequency data types; report per-type storage costs in settings |
|
| 467 |
- |
|
| 468 |
- |
|
| 469 |
-## 10. Privacy & Security |
|
| 470 |
- |
|
| 471 |
-### 10.1 What HealthProbe Never Does |
|
| 472 |
-- ❌ Exports raw health samples to cloud |
|
| 473 |
-- ❌ Identifies users by name/account |
|
| 474 |
-- ❌ Shares device location or personal context |
|
| 475 |
-- ❌ Modifies any HealthKit data |
|
| 476 |
-- ❌ Sells or shares data with third parties |
|
| 477 |
- |
|
| 478 |
-### 10.2 What HealthProbe Collects (Local Only) |
|
| 479 |
-- ✅ Aggregated counts (not samples) |
|
| 480 |
-- ✅ Timestamps of anomalies |
|
| 481 |
-- ✅ Device model & iOS version (for context) |
|
| 482 |
-- ✅ Anomaly types & severity |
|
| 483 |
- |
|
| 484 |
-**Local archive:** |
|
| 485 |
-- ✅ Per-sample archive for user-selected types, stored on-device and exportable by user |
|
| 486 |
-- ✅ Metadata needed for recognition in Apple Health XML exports, backup database extracts, and future datasets |
|
| 487 |
- |
|
| 488 |
-### 10.3 Cloud Policy |
|
| 489 |
-- No HealthProbe CloudKit/iCloud sync |
|
| 490 |
-- No automatic upload of raw samples, digests, reports, or device fingerprints |
|
| 491 |
-- User-triggered exports are explicit, scoped, and local-file based |
|
| 492 |
- |
|
| 493 |
- |
|
| 494 |
-## 11. Success Criteria |
|
| 495 |
- |
|
| 496 |
-| Objective | Metric | Target | |
|
| 497 |
-|-----------|--------|--------| |
|
| 498 |
-| **Detect loss** | Time to detection after loss occurs | < 24 hours | |
|
| 499 |
-| **Forensic completeness** | % of anomalies with sufficient evidence | > 95% | |
|
| 500 |
-| **False positives** | Alerts user shouldn't worry about | < 5% of total | |
|
| 501 |
-| **Privacy** | % of users comfortable with data practices | > 90% | |
|
| 502 |
-| **Performance** | Background capture battery impact | < 2% drain/day | |
|
| 503 |
-| **Adoption** | Users can reproduce bugs with HealthProbe data | High relevance in Apple feedback | |
|
| 504 |
- |
|
| 505 |
- |
|
| 506 |
-## 12. References & Related Work |
|
| 507 |
- |
|
| 508 |
-- [DearApple Issue #001](https://github.com/overbog/dear-apple/issues/0001-apple-health-mass-data-loss.md) — Sept 2025 mass data loss |
|
| 509 |
-- [Apple HealthKit Documentation](https://developer.apple.com/documentation/healthkit/) |
|
| 510 |
-- [HKAnchoredObjectQuery](https://developer.apple.com/documentation/healthkit/hkanchoredobjectquery) — Efficient incremental queries |
|
| 511 |
- |
|
| 512 |
- |
|
| 513 |
-## Appendix A: Example Anomaly Report |
|
| 514 |
- |
|
| 515 |
-```json |
|
| 516 |
-{
|
|
| 517 |
- "anomaly_id": "ANML_20260501_001", |
|
| 518 |
- "type": "historical_insertion", |
|
| 519 |
- "timestamp_detected": "2026-05-01T14:35:22Z", |
|
| 520 |
- "severity": "MEDIUM", |
|
| 521 |
- "evidence": {
|
|
| 522 |
- "sample_type": "HKWorkout", |
|
| 523 |
- "workout_type": "Running", |
|
| 524 |
- "start_date": "2025-01-15T07:30:00Z", |
|
| 525 |
- "end_date": "2025-01-15T08:15:00Z", |
|
| 526 |
- "duration_minutes": 45, |
|
| 527 |
- "calories": 420, |
|
| 528 |
- "first_observed": "2026-05-01T14:35:00Z", |
|
| 529 |
- "age_days": 471, |
|
| 530 |
- "source": "Health.app", |
|
| 531 |
- "context": "iCloud sync completed 2 hours prior" |
|
| 532 |
- }, |
|
| 533 |
- "classification": "Likely data recovery from cloud", |
|
| 534 |
- "recommended_action": "Monitor for similar patterns" |
|
| 535 |
-} |
|
| 536 |
-``` |
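
The `age_days` field can be derived from `start_date` and `first_observed`. A minimal sketch, assuming both are ISO 8601 UTC strings as in the report above and that age is counted in whole elapsed days; `ageInDays` is a hypothetical helper, not part of the report format.

```swift
import Foundation

/// Whole days elapsed between two ISO 8601 timestamps,
/// or nil if either string fails to parse.
func ageInDays(startDate: String, firstObserved: String) -> Int? {
    let formatter = ISO8601DateFormatter()
    guard let start = formatter.date(from: startDate),
          let observed = formatter.date(from: firstObserved) else {
        return nil
    }
    // Truncates any partial day.
    return Int(observed.timeIntervalSince(start) / 86_400)
}
```

Computing the field at report time rather than storing it by hand keeps it consistent with the two dates it summarises.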
|
| 537 |
- |
|
| 538 |
- |
|
| 539 |
-*HealthProbe — Guarding the integrity of your health data.* |
|
@@ -1,639 +0,0 @@ |
||
| 1 |
-# HealthProbe – Technical Implementation Guide |
|
| 2 |
- |
|
| 3 |
-**Document Purpose:** Step-by-step guide for iOS app implementation |
|
| 4 |
-**Target Audience:** iOS developers |
|
| 5 |
-**Prerequisite Reading:** "Complete Specification & Motivations" |
|
| 6 |
- |
|
| 7 |
- |
|
| 8 |
-## ⚠️ Privacy Directives — Mandatory |
|
| 9 |
- |
|
| 10 |
-The following rules apply to **all code, logs, examples, tests, and documentation** in this project: |
|
| 11 |
- |
|
| 12 |
-- **No credentials** — no API keys, tokens, passwords, or signing certificates |
|
| 13 |
-- **No personal data** — no names, email addresses, phone numbers, or dates of birth |
|
| 14 |
-- **No device identifiers** — no UDIDs, serial numbers, advertising IDs, or device names |
|
| 15 |
-- **No account identifiers** — no Apple IDs, iCloud account info, or CloudKit record IDs |
|
| 16 |
-- **No raw health values in the repository** — do not include real health records, measurements, or workouts in code, tests, logs, examples, or documentation. The app may optionally store a user's raw samples **locally on-device** for forensic backup, but nothing real belongs in this repo. |
|
| 17 |
-- **No location data** — no GPS coordinates or location history |
|
| 18 |
-- **No recognizable patterns** — no logs or exports where combining fields could identify a person or device |
|
| 19 |
- |
|
| 20 |
-If adding examples, use clearly synthetic data: `"Device: iPhone-TESTDEVICE"`, `"User: Test User"`, `"2000-01-01"`. |
|
| 21 |
- |
|
| 22 |
- |
|
| 23 |
-## 1. HealthKit Integration |
|
| 24 |
- |
|
| 25 |
-### 1.1 Permission Model |
|
| 26 |
- |
|
| 27 |
-```swift |
|
| 28 |
-import HealthKit |
|
| 29 |
- |
|
| 30 |
-class HealthKitManager {
|
|
| 31 |
- static let shared = HealthKitManager() |
|
| 32 |
- let healthStore = HKHealthStore() |
|
| 33 |
- |
|
| 34 |
- let typesToRead: Set<HKObjectType> = [ // HKActivitySummaryType is an HKObjectType, not an HKSampleType |
|
| 35 |
- HKWorkoutType.workoutType(), |
|
| 36 |
- HKQuantityType.quantityType(forIdentifier: .heartRate)!, |
|
| 37 |
- HKQuantityType.quantityType(forIdentifier: .stepCount)!, |
|
| 38 |
- HKQuantityType.quantityType(forIdentifier: .activeEnergyBurned)!, |
|
| 39 |
- HKCategoryType.categoryType(forIdentifier: .sleepAnalysis)!, |
|
| 40 |
- HKActivitySummaryType.activitySummaryType(), |
|
| 41 |
- ] |
|
| 42 |
- |
|
| 43 |
- func requestAuthorization(completion: @escaping (Bool, Error?) -> Void) {
|
|
| 44 |
- healthStore.requestAuthorization(toShare: [], read: typesToRead) { success, error in
|
|
| 45 |
- completion(success, error) |
|
| 46 |
- } |
|
| 47 |
- } |
|
| 48 |
-} |
|
| 49 |
-``` |
|
| 50 |
- |
|
| 51 |
-### 1.2 Anchored Query Pattern |
|
| 52 |
- |
|
| 53 |
-**Purpose:** Efficient incremental queries that only fetch changes since last check |
|
| 54 |
- |
|
| 55 |
-```swift |
|
| 56 |
-class AnchoredQueryManager {
|
|
| 57 |
- let defaults = UserDefaults(suiteName: "group.com.healthprobe.data") |
|
| 58 |
- |
|
| 59 |
- func loadAnchor(for sampleType: HKSampleType) -> HKQueryAnchor? {
|
|
| 60 |
- guard let data = defaults?.data(forKey: "anchor_\(sampleType.identifier)") else {
|
|
| 61 |
- return nil |
|
| 62 |
- } |
|
| 63 |
- return try? NSKeyedUnarchiver.unarchivedObject(ofClass: HKQueryAnchor.self, from: data) |
|
| 64 |
- } |
|
| 65 |
- |
|
| 66 |
- func saveAnchor(_ anchor: HKQueryAnchor, for sampleType: HKSampleType) {
|
|
| 67 |
- let data = try? NSKeyedArchiver.archivedData(withRootObject: anchor, requiringSecureCoding: true) |
|
| 68 |
- defaults?.set(data, forKey: "anchor_\(sampleType.identifier)") |
|
| 69 |
- } |
|
| 70 |
- |
|
| 71 |
- func executeAnchoredQuery( |
|
| 72 |
- sampleType: HKSampleType, |
|
| 73 |
- completion: @escaping ([HKSample], [HKDeletedObject], HKQueryAnchor) -> Void |
|
| 74 |
- ) {
|
|
| 75 |
- let anchor = loadAnchor(for: sampleType) // nil on the first run, which fetches all existing samples |
|
| 76 |
- let query = HKAnchoredObjectQuery( |
|
| 77 |
- type: sampleType, |
|
| 78 |
- predicate: nil, |
|
| 79 |
- anchor: anchor, |
|
| 80 |
- limit: HKObjectQueryNoLimit |
|
| 81 |
- ) { _, samples, deletedObjects, newAnchor, error in
|
|
| 82 |
- guard let newAnchor = newAnchor else { return }
|
|
| 83 |
- self.saveAnchor(newAnchor, for: sampleType) |
|
| 84 |
- completion(samples ?? [], deletedObjects ?? [], newAnchor) |
|
| 85 |
- } |
|
| 86 |
- |
|
| 87 |
- HealthKitManager.shared.healthStore.execute(query) |
|
| 88 |
- } |
|
| 89 |
-} |
|
| 90 |
-``` |
|
| 91 |
- |
|
| 92 |
-### 1.3 Observer Query (Real-time Changes) |
|
| 93 |
- |
|
| 94 |
-```swift |
|
| 95 |
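-class HealthKitObserver {
- /// Strong references that keep the observer queries alive after setup returns. |
- private var activeQueries: [HKObserverQuery] = [] |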
|
|
| 96 |
- func setupObserverQueries(for types: [HKSampleType], handler: @escaping (HKSampleType) -> Void) {
|
|
| 97 |
- for sampleType in types {
|
|
| 98 |
- let query = HKObserverQuery(sampleType: sampleType, predicate: nil) { _, completionHandler, error in
|
|
| 99 |
- if error == nil {
|
|
| 100 |
- handler(sampleType) |
|
| 101 |
- } |
|
| 102 |
- completionHandler() |
|
| 103 |
- } |
|
| 104 |
- |
|
| 105 |
- HealthKitManager.shared.healthStore.execute(query) |
|
| 106 |
- |
|
| 107 |
- // Important: Keep strong reference to prevent query from being deallocated |
|
| 108 |
- activeQueries.append(query) |
|
| 109 |
- } |
|
| 110 |
- } |
|
| 111 |
- |
|
| 112 |
- // Call this when background notification arrives |
|
| 113 |
- func backgroundFetch(completionHandler: @escaping (UIBackgroundFetchResult) -> Void) {
|
|
| 114 |
- // Re-run anchored queries to detect changes |
|
| 115 |
- // Update snapshots and detect anomalies |
|
| 116 |
- // Persist any findings |
|
| 117 |
- completionHandler(.newData) |
|
| 118 |
- } |
|
| 119 |
-} |
|
| 120 |
-``` |
|
| 121 |
- |
|
| 122 |
- |
|
| 123 |
-## 2. Storage Implementation |
|
| 124 |
- |
|
| 125 |
-HealthProbe uses two storage layers: |
|
| 126 |
- |
|
| 127 |
-1. **Local Archive Store (source of truth)** |
|
| 128 |
- - Stores canonical HealthKit samples and all metadata exposed by the API |
|
| 129 |
- - Uses one schema for all selected data types, so workouts, samples, sources, devices, and metadata can be related later |
|
| 130 |
- - Maintains `firstSeen`, `lastSeen`, `lastVerified`, strict/semantic/fuzzy fingerprints, and integrity hashes |
|
| 131 |
- - Should be implemented with an explicit local database/archive format (not SwiftData model graphs for millions of samples) |
|
| 132 |
- |
|
| 133 |
-2. **SwiftData UI Store (derived/cache layer)** |
|
| 134 |
- - Stores settings, logs, import/check history, anomaly summaries, and precomputed values used by charts |
|
| 135 |
- - Can be rebuilt from the archive store |
|
| 136 |
- - Must not be treated as the only forensic copy |
|
| 137 |
- |
|
| 138 |
-### 2.1 SwiftData UI Models |
|
| 139 |
- |
|
| 140 |
-```swift |
|
| 141 |
-import SwiftData |
|
| 142 |
-import Foundation |
|
| 143 |
- |
|
| 144 |
-// MARK: - Core Models |
|
| 145 |
- |
|
| 146 |
-@Model |
|
| 147 |
-final class HealthSnapshot {
|
|
| 148 |
- /// Unique identifier |
|
| 149 |
- @Attribute(.unique) var id: String = UUID().uuidString |
|
| 150 |
- |
|
| 151 |
- /// When this snapshot was captured |
|
| 152 |
- var capturedAt: Date |
|
| 153 |
- |
|
| 154 |
- /// Sample type (e.g., "HKWorkout", "HKQuantity:HeartRate") |
|
| 155 |
- var sampleType: String |
|
| 156 |
- |
|
| 157 |
- /// Source device (e.g., "iPhone 15 Pro", "Apple Watch") |
|
| 158 |
- var sourceDevice: String |
|
| 159 |
- |
|
| 160 |
- /// Total samples of this type at capture time |
|
| 161 |
- var recordCount: Int |
|
| 162 |
- |
|
| 163 |
- /// MD5 of aggregated sample IDs (for integrity checking) |
|
| 164 |
- var integrityChecksum: String |
|
| 165 |
- |
|
| 166 |
- /// Aggregated counts by source: { "iPhone Health": 1200, "Apple Watch": 450 }
|
|
| 167 |
- var sourceDistribution: [String: Int] |
|
| 168 |
- |
|
| 169 |
- /// Metadata |
|
| 170 |
- var iosVersion: String |
|
| 171 |
- var appVersion: String |
|
| 172 |
- |
|
| 173 |
- init( |
|
| 174 |
- capturedAt: Date, |
|
| 175 |
- sampleType: String, |
|
| 176 |
- sourceDevice: String, |
|
| 177 |
- recordCount: Int, |
|
| 178 |
- integrityChecksum: String, |
|
| 179 |
- sourceDistribution: [String: Int], |
|
| 180 |
- iosVersion: String, |
|
| 181 |
- appVersion: String |
|
| 182 |
- ) {
|
|
| 183 |
- self.capturedAt = capturedAt |
|
| 184 |
- self.sampleType = sampleType |
|
| 185 |
- self.sourceDevice = sourceDevice |
|
| 186 |
- self.recordCount = recordCount |
|
| 187 |
- self.integrityChecksum = integrityChecksum |
|
| 188 |
- self.sourceDistribution = sourceDistribution |
|
| 189 |
- self.iosVersion = iosVersion |
|
| 190 |
- self.appVersion = appVersion |
|
| 191 |
- } |
|
| 192 |
-} |
|
| 193 |
- |
|
| 194 |
-@Model |
|
| 195 |
-final class AuditTrailEntry {
|
|
| 196 |
- @Attribute(.unique) var id: String = UUID().uuidString |
|
| 197 |
- var timestamp: Date |
|
| 198 |
- var eventType: String // "snapshot", "sync_event", "anomaly_detected", etc. |
|
| 199 |
- var message: String |
|
| 200 |
- var context: [String: String] // JSON-serializable context |
|
| 201 |
- |
|
| 202 |
- init(timestamp: Date, eventType: String, message: String, context: [String: String] = [:]) {
|
|
| 203 |
- self.timestamp = timestamp |
|
| 204 |
- self.eventType = eventType |
|
| 205 |
- self.message = message |
|
| 206 |
- self.context = context |
|
| 207 |
- } |
|
| 208 |
-} |
|
| 209 |
- |
|
| 210 |
-@Model |
|
| 211 |
-final class DetectedAnomaly {
|
|
| 212 |
- @Attribute(.unique) var id: String = UUID().uuidString |
|
| 213 |
- var detectedAt: Date |
|
| 214 |
- var type: String // "historical_insertion", "silent_deletion", "duplicate", "divergence" |
|
| 215 |
- var severity: String // "info", "warning", "critical" |
|
| 216 |
- var sampleType: String |
|
| 217 |
- var summary: String |
|
| 218 |
- var evidence: [String: String] // Forensic data |
|
| 219 |
- var resolved: Bool = false |
|
| 220 |
- var resolvedAt: Date? |
|
| 221 |
- |
|
| 222 |
- init( |
|
| 223 |
- detectedAt: Date, |
|
| 224 |
- type: String, |
|
| 225 |
- severity: String, |
|
| 226 |
- sampleType: String, |
|
| 227 |
- summary: String, |
|
| 228 |
- evidence: [String: String] = [:] |
|
| 229 |
- ) {
|
|
| 230 |
- self.detectedAt = detectedAt |
|
| 231 |
- self.type = type |
|
| 232 |
- self.severity = severity |
|
| 233 |
- self.sampleType = sampleType |
|
| 234 |
- self.summary = summary |
|
| 235 |
- self.evidence = evidence |
|
| 236 |
- } |
|
| 237 |
-} |
|
| 238 |
- |
|
| 239 |
-@Model |
|
| 240 |
-final class ContextStateChange {
|
|
| 241 |
- @Attribute(.unique) var id: String = UUID().uuidString |
|
| 242 |
- var timestamp: Date |
|
| 243 |
- var previousState: String // "local_only", "icloud_enabled", "icloud_sync_active" |
|
| 244 |
- var newState: String |
|
| 245 |
- var details: String |
|
| 246 |
- |
|
| 247 |
- init(timestamp: Date, previousState: String, newState: String, details: String = "") {
|
|
| 248 |
- self.timestamp = timestamp |
|
| 249 |
- self.previousState = previousState |
|
| 250 |
- self.newState = newState |
|
| 251 |
- self.details = details |
|
| 252 |
- } |
|
| 253 |
-} |
|
| 254 |
- |
|
| 255 |
-// MARK: - Model Container Setup |
|
| 256 |
- |
|
| 257 |
-func createModelContainer() throws -> ModelContainer {
|
|
| 258 |
- let schema = Schema([ |
|
| 259 |
- HealthSnapshot.self, |
|
| 260 |
- AuditTrailEntry.self, |
|
| 261 |
- DetectedAnomaly.self, |
|
| 262 |
- ContextStateChange.self, |
|
| 263 |
- ]) |
|
| 264 |
- |
|
| 265 |
- let modelConfiguration = ModelConfiguration( |
|
| 266 |
- schema: schema, |
|
| 267 |
- isStoredInMemoryOnly: false, |
|
| 268 |
- cloudKitDatabase: .none // Local only in MVP |
|
| 269 |
- ) |
|
| 270 |
- |
|
| 271 |
- return try ModelContainer(for: schema, configurations: [modelConfiguration]) |
|
| 272 |
-} |
|
| 273 |
-``` |
|
| 274 |
- |
|
| 275 |
-### 2.2 Local Archive Store Contract |
|
| 276 |
- |
|
| 277 |
-The archive store should expose a small service interface rather than leaking SQL/archive details into UI code: |
|
| 278 |
- |
|
| 279 |
-```swift |
|
| 280 |
-protocol HealthArchiveStore {
|
|
| 281 |
- func upsertSamples(_ samples: [HKSample], observedAt: Date) async throws -> HealthArchiveWriteSummary |
|
| 282 |
- func markVerification(sampleType: HKSampleType, verifiedAt: Date) async throws |
|
| 283 |
- func recordDisappearance(sampleUUIDHash: String, sampleTypeIdentifier: String, observedMissingAt: Date) async throws |
|
| 284 |
- func records(for request: HealthArchiveRecordRequest) async throws -> [ArchivedHealthRecord] |
|
| 285 |
- func exportReport(_ request: HealthArchiveReportRequest) async throws -> URL |
|
| 286 |
-} |
|
| 287 |
-``` |
|
| 288 |
- |
|
| 289 |
-Archive rows should preserve: |
|
| 290 |
-- HealthKit UUID where exposed |
|
| 291 |
-- type identifier, start/end date, value, unit |
|
| 292 |
-- source, source revision, bundle identifier, version/build/product type where available |
|
| 293 |
-- `HKDevice` fields exposed by HealthKit |
|
| 294 |
-- full metadata dictionary as structured data |
|
| 295 |
-- relationship keys for workouts, events, and related samples where available |
|
| 296 |
-- fingerprints for matching records across HealthProbe, Apple Health XML exports, and backup database extracts |
|
| 297 |
- |
|
| 298 |
-The MVP implementation is `SQLiteHealthArchiveStore`, an actor-isolated SQLite archive in Application Support. It is populated from HealthKit anchored-query pages before SwiftData receives derived snapshot/index rows. |
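The strict and semantic fingerprints mentioned above could be derived roughly as follows. This is a minimal sketch, not the archive's actual scheme: `RecordKey`, `canonicalFields`, `strictFingerprint`, `semanticFingerprint`, and the use of FNV-1a are all assumptions made for illustration, and the struct is a HealthKit-free stand-in so the snippet runs anywhere Foundation is available.

```swift
import Foundation

/// Hypothetical stand-in for the identifying fields of one archived sample.
struct RecordKey {
    let typeIdentifier: String
    let start: Date
    let end: Date
    let value: Double
    let unit: String
    let sourceBundleID: String
}

/// 64-bit FNV-1a over the UTF-8 bytes, rendered as 16 hex digits.
/// Chosen only because it needs no dependencies; it is not collision-resistant,
/// so a real archive would more likely use a cryptographic hash.
func fnv1a64(_ text: String) -> String {
    var hash: UInt64 = 0xcbf29ce484222325
    for byte in text.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3
    }
    let hex = String(hash, radix: 16)
    return String(repeating: "0", count: 16 - hex.count) + hex
}

/// Joins the fields into one canonical string.
/// Whole seconds since 1970 avoid formatter and sub-second differences.
func canonicalFields(_ record: RecordKey, includeSource: Bool) -> String {
    var parts = [
        record.typeIdentifier,
        String(Int64(record.start.timeIntervalSince1970.rounded())),
        String(Int64(record.end.timeIntervalSince1970.rounded())),
        String(format: "%.3f", record.value),
        record.unit,
    ]
    if includeSource { parts.append(record.sourceBundleID) }
    return parts.joined(separator: "|")
}

/// Strict: every identifying field, including the source.
func strictFingerprint(_ record: RecordKey) -> String {
    fnv1a64(canonicalFields(record, includeSource: true))
}

/// Semantic: the same measurement regardless of which source reported it.
func semanticFingerprint(_ record: RecordKey) -> String {
    fnv1a64(canonicalFields(record, includeSource: false))
}
```

Leaving the source out of the semantic variant is what would let a record seen through HealthKit match the same measurement found in an XML export, while the strict variant still tells the two sources apart.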
|
| 299 |
- |
|
| 300 |
- |
|
| 301 |
-## 3. Anomaly Detection Implementation |
|
| 302 |
- |
|
| 303 |
-```swift |
|
| 304 |
-class AnomalyDetector {
|
|
| 305 |
- private let modelContext: ModelContext |
|
| 306 |
- private let healthKitManager: HealthKitManager |
|
| 307 |
- |
|
| 308 |
- // MARK: - Historical Insertion Detection |
|
| 309 |
- |
|
| 310 |
- func detectHistoricalInsertions( |
|
| 311 |
- newSamples: [HKSample], |
|
| 312 |
- completion: @escaping ([DetectedAnomaly]) -> Void |
|
| 313 |
- ) {
|
|
| 314 |
- var anomalies: [DetectedAnomaly] = [] |
|
| 315 |
- let now = Date() |
|
| 316 |
- |
|
| 317 |
- for sample in newSamples {
|
|
| 318 |
- let ageInDays = Calendar.current.dateComponents([.day], from: sample.startDate, to: now).day ?? 0 |
|
| 319 |
- |
|
| 320 |
- // Check if sample is older than 7 days but was just added |
|
| 321 |
- if ageInDays > 7 {
|
|
| 322 |
- let anomaly = DetectedAnomaly( |
|
| 323 |
- detectedAt: now, |
|
| 324 |
- type: "historical_insertion", |
|
| 325 |
- severity: "warning", |
|
| 326 |
- sampleType: sample.sampleType.identifier, |
|
| 327 |
- summary: "Sample from \(ageInDays) days ago appeared in HealthKit", |
|
| 328 |
- evidence: [ |
|
| 329 |
- "original_date": ISO8601DateFormatter().string(from: sample.startDate), |
|
| 330 |
- "age_days": String(ageInDays), |
|
| 331 |
- "sample_id": sample.uuid.uuidString, |
|
| 332 |
- ] |
|
| 333 |
- ) |
|
| 334 |
- anomalies.append(anomaly) |
|
| 335 |
- } |
|
| 336 |
- } |
|
| 337 |
- |
|
| 338 |
- completion(anomalies) |
|
| 339 |
- } |
|
| 340 |
- |
|
| 341 |
- // MARK: - Silent Deletion Detection |
|
| 342 |
- |
|
| 343 |
- func detectSilentDeletions( |
|
| 344 |
- previousSnapshot: HealthSnapshot, |
|
| 345 |
- currentSnapshot: HealthSnapshot, |
|
| 346 |
- completion: @escaping ([DetectedAnomaly]) -> Void |
|
| 347 |
- ) {
|
|
| 348 |
- var anomalies: [DetectedAnomaly] = [] |
|
| 349 |
- |
|
| 350 |
- let previousCount = previousSnapshot.recordCount |
|
| 351 |
- let currentCount = currentSnapshot.recordCount |
|
| 352 |
- let loss = previousCount - currentCount |
|
| 353 |
- |
|
| 354 |
- if loss > 0 {
|
|
| 355 |
- let lossPercent = Double(loss) / Double(previousCount) * 100 |
|
| 356 |
- let severity = lossPercent > 10 ? "critical" : lossPercent > 5 ? "warning" : "info" |
|
| 357 |
- |
|
| 358 |
- let anomaly = DetectedAnomaly( |
|
| 359 |
- detectedAt: Date(), |
|
| 360 |
- type: "silent_deletion", |
|
| 361 |
- severity: severity, |
|
| 362 |
- sampleType: previousSnapshot.sampleType, |
|
| 363 |
- summary: "\(loss) samples missing (\(String(format: "%.1f", lossPercent))%)", |
|
| 364 |
- evidence: [ |
|
| 365 |
- "previous_count": String(previousCount), |
|
| 366 |
- "current_count": String(currentCount), |
|
| 367 |
- "loss_count": String(loss), |
|
| 368 |
- "loss_percent": String(format: "%.1f", lossPercent), |
|
| 369 |
- "time_gap": String(describing: Date().timeIntervalSince(previousSnapshot.capturedAt)), |
|
| 370 |
- ] |
|
| 371 |
- ) |
|
| 372 |
- anomalies.append(anomaly) |
|
| 373 |
- } |
|
| 374 |
- |
|
| 375 |
- completion(anomalies) |
|
| 376 |
- } |
|
| 377 |
- |
|
| 378 |
- // MARK: - Duplicate Detection |
|
| 379 |
- |
|
| 380 |
- func detectDuplicates( |
|
| 381 |
- samples: [HKSample], |
|
| 382 |
- completion: @escaping ([DetectedAnomaly]) -> Void |
|
| 383 |
- ) {
|
|
| 384 |
- var anomalies: [DetectedAnomaly] = [] |
|
| 385 |
- var fingerprints: [String: [HKSample]] = [:] |
|
| 386 |
- |
|
| 387 |
- // Group by fingerprint |
|
| 388 |
- for sample in samples {
|
|
| 389 |
- let fingerprint = createFingerprint(for: sample) |
|
| 390 |
- fingerprints[fingerprint, default: []].append(sample) |
|
| 391 |
- } |
|
| 392 |
- |
|
| 393 |
- // Find duplicates |
|
| 394 |
- for (fingerprint, dupes) in fingerprints where dupes.count > 1 {
|
|
| 395 |
- let anomaly = DetectedAnomaly( |
|
| 396 |
- detectedAt: Date(), |
|
| 397 |
- type: "duplicate", |
|
| 398 |
- severity: "info", |
|
| 399 |
- sampleType: dupes[0].sampleType.identifier, |
|
| 400 |
- summary: "\(dupes.count) duplicate records found", |
|
| 401 |
- evidence: [ |
|
| 402 |
- "fingerprint": fingerprint, |
|
| 403 |
- "count": String(dupes.count), |
|
| 404 |
- ] |
|
| 405 |
- ) |
|
| 406 |
- anomalies.append(anomaly) |
|
| 407 |
- } |
|
| 408 |
- |
|
| 409 |
- completion(anomalies) |
|
| 410 |
- } |
|
| 411 |
- |
|
| 412 |
- // MARK: - Divergence Detection |
|
| 413 |
- |
|
| 414 |
- func detectDivergence( |
|
| 415 |
- currentTrend: [Date: Double], |
|
| 416 |
- historicalBaseline: [Date: Double], |
|
| 417 |
- completion: @escaping ([DetectedAnomaly]) -> Void |
|
| 418 |
- ) {
|
|
| 419 |
- // Calculate standard deviations |
|
| 420 |
- let baselineStdDev = standardDeviation(values: Array(historicalBaseline.values)) |
|
| 421 |
- let currentStdDev = standardDeviation(values: Array(currentTrend.values)) |
|
| 422 |
- |
|
| 423 |
- if currentStdDev > baselineStdDev * 2.0 {
|
|
| 424 |
- let anomaly = DetectedAnomaly( |
|
| 425 |
- detectedAt: Date(), |
|
| 426 |
- type: "divergence", |
|
| 427 |
- severity: "warning", |
|
| 428 |
- sampleType: "aggregated_metric", |
|
| 429 |
- summary: "Unusual trend detected (σ increased \(String(format: "%.2f", currentStdDev / baselineStdDev))x)", |
|
| 430 |
- evidence: [ |
|
| 431 |
- "baseline_stddev": String(format: "%.2f", baselineStdDev), |
|
| 432 |
- "current_stddev": String(format: "%.2f", currentStdDev), |
|
| 433 |
- "ratio": String(format: "%.2f", currentStdDev / baselineStdDev), |
|
| 434 |
- ] |
|
| 435 |
- ) |
|
| 436 |
- completion([anomaly]) |
|
| 437 |
- } else {
|
|
| 438 |
- completion([]) |
|
| 439 |
- } |
|
| 440 |
- } |
|
| 441 |
- |
|
| 442 |
- // MARK: - Helpers |
|
| 443 |
- |
|
| 444 |
- private func createFingerprint(for sample: HKSample) -> String {
|
|
| 445 |
- let formatter = ISO8601DateFormatter() |
|
| 446 |
- let startStr = formatter.string(from: sample.startDate) |
|
| 447 |
- let endStr = formatter.string(from: sample.endDate) |
|
| 448 |
- let type = sample.sampleType.identifier |
|
| 449 |
- let source = sample.sourceRevision.source.name |
|
| 450 |
- |
|
| 451 |
- return "\(type)|\(startStr)|\(endStr)|\(source)".addingPercentEncoding(withAllowedCharacters: .alphanumerics) ?? "" |
|
| 452 |
- } |
|
| 453 |
- |
|
| 454 |
- private func standardDeviation(values: [Double]) -> Double {
|
|
| 455 |
- let mean = values.reduce(0, +) / Double(values.count) |
|
| 456 |
- let squaredDiffs = values.map { pow($0 - mean, 2) }
|
|
| 457 |
- let variance = squaredDiffs.reduce(0, +) / Double(values.count) |
|
| 458 |
- return sqrt(variance) |
|
| 459 |
- } |
|
| 460 |
-} |
|
| 461 |
-``` |
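The divergence check above divides by the number of values and by the baseline deviation, so an empty series or a perfectly flat baseline produces NaN or infinity. A minimal sketch of a guarded variant follows; `populationStdDev` and `divergenceRatio` are hypothetical names, and returning `nil` for the undefined cases is an assumption about how callers would want to handle them.

```swift
import Foundation

/// Population standard deviation (divides by N, like the helper above).
/// Returns nil for an empty input instead of NaN.
func populationStdDev(_ values: [Double]) -> Double? {
    guard !values.isEmpty else { return nil }
    let count = Double(values.count)
    let mean = values.reduce(0.0, +) / count
    let variance = values.reduce(0.0) { $0 + ($1 - mean) * ($1 - mean) } / count
    return variance.squareRoot()
}

/// Ratio of current spread to baseline spread.
/// Returns nil when either series is empty or the baseline has no spread,
/// because the ratio is undefined in those cases.
func divergenceRatio(current: [Double], baseline: [Double]) -> Double? {
    guard let currentSD = populationStdDev(current),
          let baselineSD = populationStdDev(baseline),
          baselineSD > 0 else { return nil }
    return currentSD / baselineSD
}
```

A caller could then flag divergence only when the ratio exists and exceeds 2.0, which matches the threshold used in `detectDivergence`.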
|
| 462 |
- |
|
| 463 |
- |
|
| 464 |
-## 4. Context Monitoring (Background Thread) |
|
| 465 |
- |
|
| 466 |
-HealthProbe does not sync its own database through iCloud/CloudKit. This service only logs Health/iCloud state as context for later forensic correlation. |
|
| 467 |
- |
|
| 468 |
-```swift |
|
| 469 |
-class ContextMonitor: ObservableObject {
|
|
| 470 |
- private let modelContext: ModelContext |
|
| 471 |
- private let queue = DispatchQueue(label: "com.healthprobe.sync-monitor", qos: .background) |
|
| 472 |
- |
|
| 473 |
- private var previousHealthCloudState: String = "unknown" |
|
| 474 |
- |
|
| 475 |
- func startMonitoring() {
|
|
| 476 |
- queue.async {
|
|
| 477 |
- self.monitorContext() |
|
| 478 |
- } |
|
| 479 |
- } |
|
| 480 |
- |
|
| 481 |
- private func monitorContext() {
|
|
| 482 |
- // Check iCloud state |
|
| 483 |
- let iCloudToken = FileManager.default.ubiquityIdentityToken |
|
| 484 |
- let currentState = iCloudToken != nil ? "icloud_enabled" : "local_only" |
|
| 485 |
- |
|
| 486 |
- if currentState != previousHealthCloudState {
|
|
| 487 |
- logContextChange(from: previousHealthCloudState, to: currentState) |
|
| 488 |
- previousHealthCloudState = currentState |
|
| 489 |
- |
|
| 490 |
- // Schedule archive verification on state change |
|
| 491 |
- DispatchQueue.main.async {
|
|
| 492 |
- NotificationCenter.default.post(name: NSNotification.Name("HealthContextChanged"), object: nil)
|
|
| 493 |
- } |
|
| 494 |
- } |
|
| 495 |
- } |
|
| 496 |
- |
|
| 497 |
- private func logContextChange(from: String, to: String) {
|
|
| 498 |
- let change = ContextStateChange( |
|
| 499 |
- timestamp: Date(), |
|
| 500 |
- previousState: from, |
|
| 501 |
- newState: to, |
|
| 502 |
- details: "iCloud state changed" |
|
| 503 |
- ) |
|
| 504 |
- |
|
| 505 |
- do {
|
|
| 506 |
- modelContext.insert(change) |
|
| 507 |
- try modelContext.save() |
|
| 508 |
- |
|
| 509 |
- let auditEntry = AuditTrailEntry( |
|
| 510 |
- timestamp: Date(), |
|
| 511 |
- eventType: "health_context_change", |
|
| 512 |
- message: "Health cloud context: \(from) → \(to)", |
|
| 513 |
- context: ["previous": from, "current": to] |
|
| 514 |
- ) |
|
| 515 |
- modelContext.insert(auditEntry) |
|
| 516 |
- try modelContext.save() |
|
| 517 |
- } catch {
|
|
| 518 |
- print("Error logging context change: \(error)")
|
|
| 519 |
- } |
|
| 520 |
- } |
|
| 521 |
-} |
|
| 522 |
-``` |
|
| 523 |
- |
|
| 524 |
- |
|
| 525 |
-## 5. Integration into App Lifecycle |
|
| 526 |
- |
|
| 527 |
-```swift |
|
| 528 |
-@main |
|
| 529 |
-struct HealthProbeApp: App {
|
|
| 530 |
- @StateObject private var healthKitManager = HealthKitManager.shared |
|
| 531 |
- @StateObject private var contextMonitor: ContextMonitor |
|
| 532 |
- let modelContainer: ModelContainer |
|
| 533 |
- |
|
| 534 |
- init() {
|
|
| 535 |
- do {
|
|
| 536 |
- modelContainer = try createModelContainer() |
|
| 537 |
- let context = ModelContext(modelContainer) |
|
| 538 |
- _contextMonitor = StateObject(wrappedValue: ContextMonitor(modelContext: context)) |
|
| 539 |
- } catch {
|
|
| 540 |
- fatalError("Could not initialize model container: \(error)")
|
|
| 541 |
- } |
|
| 542 |
- } |
|
| 543 |
- |
|
| 544 |
- var body: some Scene {
|
|
| 545 |
- WindowGroup {
|
|
| 546 |
- ContentView() |
|
| 547 |
- .modelContainer(modelContainer) |
|
| 548 |
- .onAppear {
|
|
| 549 |
- // Request HealthKit permissions |
|
| 550 |
- healthKitManager.requestAuthorization { success, error in
|
|
| 551 |
- if success {
|
|
| 552 |
- // Start context monitoring and archive capture |
|
| 553 |
- contextMonitor.startMonitoring() |
|
| 554 |
- captureInitialSnapshot() |
|
| 555 |
- } |
|
| 556 |
- } |
|
| 557 |
- } |
|
| 558 |
- .onReceive(Timer.publish(every: 3600, on: .main, in: .common).autoconnect()) { _ in
|
|
| 559 |
- // Periodic check every hour |
|
| 560 |
- refreshHealthData() |
|
| 561 |
- } |
|
| 562 |
- } |
|
| 563 |
- } |
|
| 564 |
- |
|
| 565 |
- private func captureInitialSnapshot() {
|
|
| 566 |
- // Implement snapshot capture |
|
| 567 |
- } |
|
| 568 |
- |
|
| 569 |
- private func refreshHealthData() {
|
|
| 570 |
- // Implement periodic refresh |
|
| 571 |
- } |
|
| 572 |
-} |
|
| 573 |
-``` |
|
| 574 |
- |
|
| 575 |
- |
|
| 576 |
-## 6. Testing Strategy |
|
| 577 |
- |
|
| 578 |
-### Unit Tests |
|
| 579 |
-```swift |
|
| 580 |
-class AnomalyDetectorTests: XCTestCase {
|
|
| 581 |
- var detector: AnomalyDetector! |
|
| 582 |
- |
|
| 583 |
- override func setUp() {
|
|
| 584 |
- super.setUp() |
|
| 585 |
- detector = AnomalyDetector(...) |
|
| 586 |
- } |
|
| 587 |
- |
|
| 588 |
- func testDetectsHistoricalInsertion() {
|
|
| 589 |
- // Create sample from 30 days ago |
|
| 590 |
- // Assert: anomaly detected |
|
| 591 |
- } |
|
| 592 |
- |
|
| 593 |
- func testDetectsSilentDeletion() {
|
|
| 594 |
- // Create two snapshots, second has fewer records |
|
| 595 |
- // Assert: anomaly detected with correct loss percentage |
|
| 596 |
- } |
|
| 597 |
-} |
|
| 598 |
-``` |
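The silent-deletion test stub above is easier to fill in if the severity rule is a pure function that needs no SwiftData models. A minimal sketch, assuming a hypothetical helper named `deletionSeverity`; the 5% and 10% thresholds are the ones used in `detectSilentDeletions`.

```swift
/// Maps a drop in record count to a severity label.
/// Returns nil when nothing was lost; otherwise "info", "warning", or "critical".
func deletionSeverity(previousCount: Int, currentCount: Int) -> String? {
    let loss = previousCount - currentCount
    // A positive loss implies previousCount > 0, but the guard states it explicitly.
    guard loss > 0, previousCount > 0 else { return nil }
    let lossPercent = Double(loss) / Double(previousCount) * 100
    if lossPercent > 10 { return "critical" }
    if lossPercent > 5 { return "warning" }
    return "info"
}
```

An XCTest case could then assert one value per band, for example a 1% loss for "info", a 6% loss for "warning", and a 20% loss for "critical".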
|
| 599 |
- |
|
| 600 |
-### Integration Tests |
|
| 601 |
-- ✅ HealthKit query performance (anchor efficiency) |
|
| 602 |
-- ✅ Local archive persistence and recovery |
|
| 603 |
-- ✅ SwiftData cache rebuild from archive |
|
| 604 |
-- ✅ Background context monitoring accuracy |
|
| 605 |
-- ✅ Anomaly detection on real HealthKit data |
|
| 606 |
- |
|
| 607 |
- |
|
| 608 |
-## 7. Performance Considerations |
|
| 609 |
- |
|
| 610 |
-| Operation | Target | Notes | |
|
| 611 |
-|-----------|--------|-------| |
|
| 612 |
-| Anchored query | < 5 sec | Runs in the background; users notice delays over 2 s | |
|
| 613 |
-| Anomaly detection | < 2 sec | Should not block UI | |
|
| 614 |
-| SwiftData cache update | < 1 sec | Can run on main thread only after archive work completes | |
|
| 615 |
-| Archive write | Background | Stream large imports; never build full high-frequency datasets in memory | |
|
| 616 |
-| Background check | < 30 sec | iOS typically grants about 30 seconds of runtime per background fetch | |
|
| 617 |
- |
|
| 618 |
- |
|
| 619 |
-## 8. Deployment Checklist |
|
| 620 |
- |
|
| 621 |
-- [ ] HealthKit read permissions declared in Info.plist |
|
| 622 |
-- [ ] Background Modes enabled ("Background Fetch")
|
|
| 623 |
-- [ ] SwiftData model migrations tested |
|
| 624 |
-- [ ] Local archive schema migrations tested |
|
| 625 |
-- [ ] Privacy Policy updated (what data is collected) |
|
| 626 |
-- [ ] Accessibility review (VoiceOver, Dynamic Type) |
|
| 627 |
- |
|
| 628 |
- |
|
| 629 |
-*HealthProbe Implementation Guide v1.0 — 2026-05-01* |
|
@@ -1,438 +0,0 @@ |
||
| 1 |
-# HealthProbe – Open Source Publication Guidelines |
|
| 2 |
- |
|
| 3 |
-**Purpose:** Ensure documentation is accurate, responsible, and suitable for public release |
|
| 4 |
-**Date:** 2026-05-01 |
|
| 5 |
-**Status:** Pre-publication review |
|
| 6 |
- |
|
| 7 |
- |
|
| 8 |
-## 1. Key Principles for Open Source |
|
| 9 |
- |
|
| 10 |
-1. **Neutrality:** Describe *observed behavior*, not conspiracy |
|
| 11 |
-2. **Precision:** Distinguish between *facts*, *patterns*, and *theories* |
|
| 12 |
-3. **Humility:** Acknowledge unknowns and limitations |
|
| 13 |
-4. **Responsibility:** Don't speculate about Apple's intentions |
|
| 14 |
-5. **Reproducibility:** All claims must be testable |
|
| 15 |
- |
|
| 16 |
- |
|
| 17 |
-## 2. Content Review – Flagged Items |
|
| 18 |
- |
|
| 19 |
-### 🔴 HIGH PRIORITY: Reframe Tone |
|
| 20 |
- |
|
| 21 |
-**Issue 1: Section 2.1 "The September 2025 Mass Data Loss Event"** |
|
| 22 |
- |
|
| 23 |
-**Current language:** |
|
| 24 |
-``` |
|
| 25 |
-Suspected triggers: |
|
| 26 |
- - Device migration (iCloud sync state transitions) |
|
| 27 |
- - OS upgrade/downgrade cycles |
|
| 28 |
- - Backup restore operations |
|
| 29 |
- - HealthKit database re-indexing |
|
| 30 |
- - iCloud sync divergence |
|
| 31 |
-``` |
|
| 32 |
- |
|
| 33 |
-**Problem:** Lists "suspected triggers" without evidence; reads like accusations. |
|
| 34 |
- |
|
| 35 |
-**Revision for open source:** |
|
| 36 |
-``` |
|
| 37 |
-**Preliminary observations from user reports suggest correlation with:** |
|
| 38 |
- - Device migration or iCloud sync state changes |
|
| 39 |
- - OS updates (particularly iOS 26.x) |
|
| 40 |
- - Backup restore operations |
|
| 41 |
- - Data re-indexing |
|
| 42 |
- |
|
| 43 |
-**NOTE:** These are patterns observed in reports, not confirmed causal links. |
|
| 44 |
-Actual root causes require access to Apple system logs. |
|
| 45 |
-``` |
|
| 46 |
- |
|
| 47 |
- |
|
| 48 |
-**Issue 2: "Root cause theories" sections** |
|
| 49 |
- |
|
| 50 |
-**Current language:** |
|
| 51 |
-``` |
|
| 52 |
-**Root cause theories:** |
|
| 53 |
- - iCloud sync restoring from outdated backup |
|
| 54 |
- - Third-party fitness app injecting historical reconstructions |
|
| 55 |
- - HealthKit recovery logic applying retroactive corrections |
|
| 56 |
- - Cross-device sync desynchronization |
|
| 57 |
-``` |
|
| 58 |
- |
|
| 59 |
-**Problem:** "Theories" is vague. Some are highly speculative; "third-party apps injecting" sounds accusatory. |
|
| 60 |
- |
|
| 61 |
-**Revision for open source:** |
|
| 62 |
-``` |
|
| 63 |
-**Possible mechanisms** (listed for documentation, not as conclusions): |
|
| 64 |
- - iCloud sync merging data from outdated backup |
|
| 65 |
- - Legitimate algorithmic recalculation (e.g., HR baseline updates) |
|
| 66 |
- - Data misalignment across multiple devices during sync |
|
| 67 |
- - Timestamp reconciliation during restore operations |
|
| 68 |
- |
|
| 69 |
-**These possibilities are inferred from observed patterns, not system internals.** |
|
| 70 |
-Apple has not confirmed mechanisms. |
|
| 71 |
-``` |
|
| 72 |
- |
|
| 73 |
- |
|
| 74 |
-### 🟡 MEDIUM PRIORITY: Add Disclaimers |
|
| 75 |
- |
|
| 76 |
-**Issue 3: "Concrete observed cases" section 2.2** |
|
| 77 |
- |
|
| 78 |
-**Current:** Lists examples without caveats. |
|
| 79 |
- |
|
| 80 |
-**Add disclaimer:** |
|
| 81 |
-``` |
|
| 82 |
-## 2.2 Observed Anomalies – Data Note |
|
| 83 |
- |
|
| 84 |
-⚠️ **IMPORTANT:** These patterns have been observed in user reports and |
|
| 85 |
-HealthProbe testing, but represent a limited dataset. They are NOT |
|
| 86 |
-confirmed bugs, and may have benign explanations: |
|
| 87 |
- |
|
| 88 |
-- Historical insertions could be legitimate corrections/backfills |
|
| 89 |
-- Silent deletions could be user actions or incomplete HealthKit queries |
|
| 90 |
-- Duplicates could be transient sync artifacts (self-healing within 24h) |
|
| 91 |
-- Divergence could reflect algorithm updates or device recalibration |
|
| 92 |
- |
|
| 93 |
-HealthProbe documents *observations*, not diagnoses. |
|
| 94 |
-``` |
|
| 95 |
- |
|
| 96 |
- |
|
| 97 |
-**Issue 4: "Why undetected" section** |
|
| 98 |
- |
|
| 99 |
-**Current language:** |
|
| 100 |
-``` |
|
| 101 |
-**Why undetected:** |
|
| 102 |
-- No notification from Apple Health |
|
| 103 |
-- Users discover loss retrospectively (weeks/months later) |
|
| 104 |
-- No audit trail to identify exactly *when* or *what* was lost |
|
| 105 |
-``` |
|
| 106 |
- |
|
| 107 |
-**Problem:** Reads like Apple is hiding data loss intentionally. |
|
| 108 |
- |
|
| 109 |
-**Revision:** |
|
| 110 |
-``` |
|
| 111 |
-**Why current mechanisms may not catch this:** |
|
| 112 |
-- Health.app provides no built-in audit trail for historical changes |
|
| 113 |
-- Data loss is often not immediately obvious (daily view may not change much) |
|
| 114 |
-- Users cannot easily compare snapshots over time |
|
| 115 |
-- Some anomalies resolve automatically within 24-72 hours (self-healing sync) |
|
| 116 |
-``` |
|
| 117 |
- |
|
| 118 |
- |
|
| 119 |
-### 🟡 MEDIUM PRIORITY: Soften Certainty Language |
|
| 120 |
- |
|
| 121 |
-**Issue 5: Executive summary opening** |
|
| 122 |
- |
|
| 123 |
-**Current:** |
|
| 124 |
-``` |
|
| 125 |
-Apple Health data loss events (confirmed Sept 2025 incident, ongoing sporadic reports) |
|
| 126 |
-``` |
|
| 127 |
- |
|
| 128 |
-**Problem:** "Confirmed incident" is too strong without official Apple acknowledgment. |
|
| 129 |
- |
|
| 130 |
-**Revision:** |
|
| 131 |
-``` |
|
| 132 |
-Reports of Apple Health data loss (September 2025 timeframe, ongoing user reports) |
|
| 133 |
-``` |
|
| 134 |
- |
|
| 135 |
- |
|
| 136 |
-**Issue 6: Throughout documentation** |
|
| 137 |
- |
|
| 138 |
-**Replace** these phrases: |
|
| 139 |
-| Current | Replace with | |
|
| 140 |
-|---------|--------------| |
|
| 141 |
-| "Apple Health data loss" | "Reported Apple Health data anomalies" or "User-observed data gaps" | |
|
| 142 |
-| "confirmed bug" | "potential issue" or "reported anomaly" | |
|
| 143 |
-| "undetected" | "not immediately visible to users" | |
|
| 144 |
-| "corrupted" | "inconsistent" or "unexpected state" | |
|
| 145 |
- |
|
| 146 |
- |
|
| 147 |
-### 🟡 MEDIUM PRIORITY: Privacy/Security Section Expansion |
|
| 148 |
- |
|
| 149 |
-**Current Limitation:** Section 10 exists but is brief. |
|
| 150 |
- |
|
| 151 |
-**Add to "Risks & Limitations" document:** |
|
| 152 |
- |
|
| 153 |
-```markdown |
|
| 154 |
-## Important Caveats for Open Source Users |
|
| 155 |
- |
|
| 156 |
-### What HealthProbe Cannot Know |
|
| 157 |
-- Whether data loss is a bug, user action, or legitimate system operation |
|
| 158 |
-- Exact root cause (only observations, not system internals) |
|
| 159 |
-- Cross-device behavior (requires manual export from multiple devices) |
|
| 160 |
-- iCloud backend state (only observes local HealthKit) |
|
| 161 |
- |
|
| 162 |
-### What Users Should Understand |
|
| 163 |
-- **False positives expected:** Some "anomalies" may resolve automatically |
|
| 164 |
-- **Incomplete record:** Uninstalling HealthProbe loses all audit history |
|
| 165 |
-- **No guarantees:** HealthProbe itself could have bugs; don't rely solely on it |
|
| 166 |
-- **Comparison not validation:** Snapshot comparison detects differences, not errors |
|
| 167 |
- |
|
| 168 |
-### Recommended Usage |
|
| 169 |
-- Use as **documentation tool**, not as truth source |
|
| 170 |
-- Export data regularly as backup |
|
| 171 |
-- Compare findings with an Apple Health XML export when possible |
|
| 172 |
-- Report patterns, not individual anomalies, to Apple |
|
| 173 |
-``` |
|
| 174 |
- |
|
| 175 |
- |
|
| 176 |
-## 3. Content Audit Checklist |
|
| 177 |
- |
|
| 178 |
-Before release, verify: |
|
| 179 |
- |
|
| 180 |
-### Documentation Quality |
|
| 181 |
-- [ ] Every claim is either observable fact OR clearly labeled as theory/speculation |
|
| 182 |
-- [ ] "Suspected," "possible," "may" used where causality unclear |
|
| 183 |
-- [ ] Root causes described as inferences, not conclusions |
|
| 184 |
-- [ ] No language implying Apple intentionally hides issues |
|
| 185 |
-- [ ] Disclaimers present before speculative sections |
|
| 186 |
- |
|
| 187 |
-### Technical Accuracy |
|
| 188 |
-- [ ] HealthKit API descriptions verified against Apple docs |
|
| 189 |
-- [ ] Code examples tested/executable |
|
| 190 |
-- [ ] Performance claims have measurement basis |
|
| 191 |
-- [ ] Known limitations documented explicitly |
|
| 192 |
- |
|
| 193 |
-### Privacy Compliance |
|
| 194 |
-- [ ] No raw health sample data in examples |
|
| 195 |
-- [ ] HealthProbe CloudKit/iCloud sync is not described as a product goal |
|
| 196 |
-- [ ] User consent documented |
|
| 197 |
-- [ ] Data retention policy clear |
|
| 198 |
-- [ ] No tracking/analytics hidden in code |
|
| 199 |
- |
|
| 200 |
-### Responsible Disclosure |
|
| 201 |
-- [ ] References to Apple issues are neutral, not accusatory |
|
| 202 |
-- [ ] Links to DearApple properly contextualized |
|
| 203 |
-- [ ] No suggestion of intentional misconduct by Apple |
|
| 204 |
-- [ ] Recommendations for bug reporting included |
|
| 205 |
- |
|
| 206 |
- |
|
| 207 |
-## 4. Specific Revisions Needed |
|
| 208 |
- |
|
| 209 |
-### File: "Complete Specification & Motivations.md" |
|
| 210 |
- |
|
| 211 |
-**Location:** Section 2.1 (3 major edits) |
|
| 212 |
-``` |
|
| 213 |
-CHANGE: "Large-scale loss of Apple Health records reported" |
|
| 214 |
-TO: "Reports of large-scale Apple Health data anomalies" |
|
| 215 |
- |
|
| 216 |
-CHANGE: Entire "Why undetected" subsection |
|
| 217 |
-TO: [See Issue #4 above] |
|
| 218 |
- |
|
| 219 |
-CHANGE: The "suspected triggers" list, which lacks confidence qualifiers |
|
| 220 |
-TO: [See Issue #1 above] |
|
| 221 |
-``` |
|
| 222 |
- |
|
| 223 |
-**Location:** Section 2.2 (add disclaimer at top) |
|
| 224 |
-``` |
|
| 225 |
-ADD: [See Issue #3 above - the full disclaimer block] |
|
| 226 |
-``` |
|
| 227 |
- |
|
| 228 |
- |
|
| 229 |
-### File: "Forensics & Limitations.md" |
|
| 230 |
- |
|
| 231 |
-**Location:** Section 1 "Known Limitations" (add) |
|
| 232 |
-``` |
|
| 233 |
-ADD: Section 1.4 "Data Interpretation" |
|
| 234 |
- |
|
| 235 |
-1.4 Data Interpretation Risks |
|
| 236 |
- |
|
| 237 |
-HealthProbe documents observations, not diagnoses: |
|
| 238 |
- |
|
| 239 |
-| Finding | What it means | What it does NOT mean | |
|
| 240 |
-|---------|---------------|----------------------| |
|
| 241 |
-| **Silent deletion detected** | Samples in snapshot A absent in B | Data is corrupted or lost forever | |
|
| 242 |
-| **Historical insertion** | Sample has old date, recent first-seen | Apple maliciously backdated data | |
|
| 243 |
-| **Duplicates found** | Multiple identical samples present | System is broken; may auto-deduplicate | |
|
| 244 |
-| **Divergence trend** | Metric value changing over time | Algorithm bug; could be calibration or update | |
|
| 245 |
- |
|
| 246 |
-Always validate findings before drawing conclusions. |
|
| 247 |
-``` |
|
| 248 |
- |
|
| 249 |
-**Location:** Section 2.1 "Privacy Risks" |
|
| 250 |
-``` |
|
| 251 |
-CHANGE: "CRITICAL: user's personal health history exposed" |
|
| 252 |
-TO: "CRITICAL RISK IF: raw health data were exfiltrated" |
|
| 253 |
-(Emphasis: HealthProbe doesn't do this) |
|
| 254 |
-``` |
|
| 255 |
- |
|
| 256 |
- |
|
| 257 |
-### File: "Implementation Guide.md" |
|
| 258 |
- |
|
| 259 |
-**Add Section:** 0.5 "Ethical Implementation Notes" |
|
| 260 |
- |
|
| 261 |
-```markdown |
|
| 262 |
-## 0.5 Ethical Implementation Notes |
|
| 263 |
- |
|
| 264 |
-As an open-source health monitoring tool, HealthProbe should: |
|
| 265 |
- |
|
| 266 |
-1. **Never store/transmit raw health data** |
|
| 267 |
- - Code review required before adding any health sample export |
|
| 268 |
- |
|
| 269 |
-2. **Always ask before background operations** |
|
| 270 |
- - Background fetch enabled only with user consent |
|
| 271 |
- - Notify user of sync frequency in settings |
|
| 272 |
- |
|
| 273 |
-3. **Respect user autonomy** |
|
| 274 |
- - Easy to disable all monitoring |
|
| 275 |
- - Easy to export/delete all data |
|
| 276 |
- - Audit trail visible to user (not hidden) |
|
| 277 |
- |
|
| 278 |
-4. **Accept limitations gracefully** |
|
| 279 |
- - Don't claim certainty you don't have |
|
| 280 |
- - Document where you're guessing |
|
| 281 |
- - Encourage validation via Apple's tools |
|
| 282 |
-``` |
|
| 283 |
- |
|
| 284 |
- |
|
| 285 |
-## 5. External References & Linking |
|
| 286 |
- |
|
| 287 |
-### How to Reference DearApple Issues |
|
| 288 |
- |
|
| 289 |
-**Current:** Direct references to issues as "bugs" |
|
| 290 |
- |
|
| 291 |
-**Revision:** Neutral framing |
|
| 292 |
- |
|
| 293 |
-```markdown |
|
| 294 |
-**BAD:** |
|
| 295 |
-"See DearApple Issue #001 – mass data loss bug" |
|
| 296 |
- |
|
| 297 |
-**GOOD:** |
|
| 298 |
-"See DearApple Issue #001 – documented reports of Apple Health data loss" |
|
| 299 |
- |
|
| 300 |
-**BETTER:** |
|
| 301 |
-"For additional context on the observed anomalies, see [DearApple Issue #001] |
|
| 302 |
-(https://github.com/overbog/dear-apple/issues/...) which collects user reports |
|
| 303 |
-of similar patterns." |
|
| 304 |
-``` |
|
| 305 |
- |
|
| 306 |
- |
|
| 307 |
-## 6. README & Contributing Guidelines |
|
| 308 |
- |
|
| 309 |
-**Add file:** `CONTRIBUTING.md` (for open source) |
|
| 310 |
- |
|
| 311 |
-```markdown |
|
| 312 |
-# Contributing to HealthProbe |
|
| 313 |
- |
|
| 314 |
-## Data Integrity First |
|
| 315 |
- |
|
| 316 |
-When reporting anomalies or contributing code: |
|
| 317 |
- |
|
| 318 |
-1. **Distinguish facts from theories** |
|
| 319 |
- - Observed: "On 2026-03-15, step count dropped from 5000 to 2500" |
|
| 320 |
- - Theory: "This might be due to iCloud sync" |
|
| 321 |
- - Avoid: "iCloud sync corrupted my data" |
|
| 322 |
- |
|
| 323 |
-2. **Include evidence** |
|
| 324 |
- - Screenshots of HealthProbe audit trail |
|
| 325 |
- - Export from Health.app for comparison |
|
| 326 |
- - Device model, iOS version, app version |
|
| 327 |
- |
|
| 328 |
-3. **Respect privacy** |
|
| 329 |
- - Redact dates if identifying |
|
| 330 |
- - Remove specific health values if sensitive |
|
| 331 |
-   - Describe data in aggregate (e.g., "10 days of step data"), not exact values |
|
| 332 |
- |
|
| 333 |
-4. **Acknowledge unknowns** |
|
| 334 |
- - "I observed X, but I don't know if it's a bug or expected behavior" |
|
| 335 |
- |
|
| 336 |
-## Code Standards |
|
| 337 |
- |
|
| 338 |
-- Read-only HealthKit operations only |
|
| 339 |
-- No exfiltration of raw health data |
|
| 340 |
-- User consent required before new data collection |
|
| 341 |
-- Audit trail for all operations |
|
| 342 |
-``` |
|
| 343 |
- |
|
| 344 |
- |
|
| 345 |
-## 7. Release Checklist |
|
| 346 |
- |
|
| 347 |
-Before tagging v1.0.0: |
|
| 348 |
- |
|
| 349 |
-- [ ] All flagged content revised (Issues #1-6) |
|
| 350 |
-- [ ] Added disclaimers in 3 places (Issues #3, #4, #7) |
|
| 351 |
-- [ ] Softened certainty language throughout (Issue #5) |
|
| 352 |
-- [ ] Privacy/Security section expanded (Issue #4) |
|
| 353 |
-- [ ] Added "Ethical Implementation" section to code guide |
|
| 354 |
-- [ ] New CONTRIBUTING.md with data integrity guidelines |
|
| 355 |
-- [ ] License file present (recommend: MIT or Apache 2.0) |
|
| 356 |
-- [ ] README includes clear link to DearApple context |
|
| 357 |
-- [ ] Code examples tested and run-verified |
|
| 358 |
-- [ ] No hardcoded debugging/logging left in |
|
| 359 |
-- [ ] Legal review of liability disclaimers |
|
| 360 |
- |
|
| 361 |
- |
|
| 362 |
-## 8. Statement of Purpose (For README) |
|
| 363 |
- |
|
| 364 |
-```markdown |
|
| 365 |
-## Purpose |
|
| 366 |
- |
|
| 367 |
-HealthProbe is a **documentation and monitoring tool** designed to help users |
|
| 368 |
-understand their Apple HealthKit data state over time. |
|
| 369 |
- |
|
| 370 |
-**It is NOT:** |
|
| 371 |
-- A diagnostic tool (cannot confirm bugs) |
|
| 372 |
-- A data recovery tool |
|
| 373 |
-- A security auditing tool |
|
| 374 |
-- A replacement for Apple's Health app |
|
| 375 |
- |
|
| 376 |
-**It IS:** |
|
| 377 |
-- A local audit trail (what changed, when) |
|
| 378 |
-- An anomaly detector (unusual patterns documented) |
|
| 379 |
-- A forensic aid (exportable evidence for bug reports) |
|
| 380 |
-- Privacy-respecting (all local, no exfiltration) |
|
| 381 |
- |
|
| 382 |
-**Appropriate uses:** |
|
| 383 |
-- Personal monitoring of your own health data |
|
| 384 |
-- Documenting anomalies to report to Apple |
|
| 385 |
-- Researching HealthKit behavior (with proper ethics) |
|
| 386 |
-- Contributing data to DearApple investigation (with consent) |
|
| 387 |
- |
|
| 388 |
-**Inappropriate uses:** |
|
| 389 |
-- Claiming definitive proof of bugs without Apple confirmation |
|
| 390 |
-- Identifying or tracking other users |
|
| 391 |
-- Replacing professional medical advice |
|
| 392 |
-- Distributing unvalidated health data claims |
|
| 393 |
-``` |
|
| 394 |
- |
|
| 395 |
- |
|
| 396 |
-## 9. Review Before Publish |
|
| 397 |
- |
|
| 398 |
-Suggested external reviewers: |
|
| 399 |
-1. **Apple developer relations** — verify no confidential info disclosed |
|
| 400 |
-2. **Privacy researcher** — check data handling assumptions |
|
| 401 |
-3. **Legal counsel** — health data liability disclaimers |
|
| 402 |
-4. **DearApple maintainers** — coordinate messaging |
|
| 403 |
- |
|
| 404 |
- |
|
| 405 |
-## Summary: Key Changes for v1.0.0 Public Release |
|
| 406 |
- |
|
| 407 |
-| Issue | Severity | Action | Effort | |
|
| 408 |
-|-------|----------|--------|--------| |
|
| 409 |
-| Tone: "confirmed bug" → "reported anomaly" | 🔴 HIGH | S&R in 3 docs | 30 min | |
|
| 410 |
-| Add data interpretation disclaimers | 🟡 MED | New section in Forensics | 45 min | |
|
| 411 |
-| Soften causality language | 🟡 MED | S&R throughout | 20 min | |
|
| 412 |
-| Add ethics section to Implementation | 🟡 MED | New section | 30 min | |
|
| 413 |
-| Create CONTRIBUTING.md | 🟡 MED | New file | 30 min | |
|
| 414 |
-| Final legal/privacy review | 🟡 MED | External | 2-4 hours | |
|
| 415 |
- |
|
| 416 |
-**Total estimated effort:** 3-5 hours to make publication-ready |
|
| 417 |
- |
|
| 418 |
- |
|
| 419 |
-*HealthProbe – Open Source Governance v1.0* |
|
@@ -1,106 +1,118 @@ |
||
| 1 | 1 |
# HealthProbe Documentation Index |
| 2 | 2 |
|
| 3 |
-## Quick Navigation |
|
| 3 |
+**Canonical documentation root:** `HealthProbe/Doc/` |
|
| 4 | 4 |
|
| 5 |
-### 📋 Core Documentation |
|
| 5 |
+This directory is the only place for substantive HealthProbe documentation. Root-level `AGENTS.md` and `CLAUDE.md` are bootstrap pointers only, kept so agent tools can find this index. |
|
| 6 | 6 |
|
| 7 |
-1. **[Complete Specification & Motivations](HealthProbe%20–%20Complete%20Specification%20&%20Motivations.md)** |
|
| 8 |
- - Complete system design |
|
| 9 |
- - Concrete observed cases (Sept 2025 data loss + ongoing issues) |
|
| 10 |
- - Motivations for each feature |
|
| 11 |
- - Technical architecture & threading model |
|
| 12 |
- - Privacy & security guarantees |
|
| 7 |
+## Current Product Direction |
|
| 13 | 8 |
|
| 14 |
-2. **[MVP Specification](HealthProbe%20iOS%20–%20Specification%20(MVP).md)** *(original)* |
|
| 15 |
- - Feature scope for iOS 1.0 |
|
| 16 |
- - Core HealthKit monitoring approach |
|
| 9 |
+HealthProbe is a single-device, local Health DB Time Machine: |
|
| 10 |
+- capture selected HealthKit-accessible observations over time; |
|
| 11 |
+- reconstruct how the local Health database looked at a chosen observation date; |
|
| 12 |
+- explain local changes with consolidation-aware labels; |
|
| 13 |
+- preserve recovery-compatible archives and exports; |
|
| 14 |
+- keep the iOS app read-only with respect to HealthKit and iOS backups. |
|
| 17 | 15 |
|
| 16 |
+Target storage architecture: |
|
| 17 |
+- SQLite archive/analysis database is the source of truth; |
|
| 18 |
+- observations are stored differentially, not as recurring complete snapshots; |
|
| 19 |
+- Core Data is the rebuildable UI/report cache for expensive counts and summaries; |
|
| 20 |
+- SwiftData is legacy/prototype only and should not be expanded; |
|
| 21 |
+- existing prototype/test databases are disposable and may be reset for archive v2. |
|
| 18 | 22 |
|
| 19 |
-## Project Status |
|
| 23 |
+## How To Point Agents |
|
| 20 | 24 |
|
| 21 |
-| Component | Status | Notes | |
|
| 22 |
-|-----------|--------|-------| |
|
| 23 |
-| **iOS App Foundation** | ✅ Started | SwiftUI + SwiftData scaffolding in place | |
|
| 24 |
-| **Core Architecture** | 📋 Designed | See "Complete Specification" | |
|
| 25 |
-| **HealthKit Integration** | ⏳ Pending | Implement anchored queries, observer queries | |
|
| 26 |
-| **Anomaly Detection** | 📋 Designed | Logic documented, pending implementation | |
|
| 27 |
-| **Sync Context Logging** | 📋 Designed | Log Health/iCloud state as forensic context; do not sync HealthProbe data via iCloud | |
|
| 28 |
-| **UI Dashboard** | ⏳ Pending | Wireframes in Complete Specification | |
|
| 29 |
-| **Local Archive Store** | 📋 Designed | Robust on-device archive is the source of truth | |
|
| 30 |
-| **Reports & Point Exports** | 📋 Designed | Export only selected reports/record tables, not a complete routine dump | |
|
| 31 |
-| **macOS Companion** | 🔄 Future | Post-MVP enhancement | |
|
| 25 |
+Use the chapter map below. Send agents to the narrowest document that matches their task. |
|
| 32 | 26 |
|
| 27 |
+| If the task is about... | Send the agent to... | |
|
| 28 |
+|-------------------------|----------------------| |
|
| 29 |
+| Overall product scope, non-goals, future parking lot | [`01-product/Product-Specification.md`](01-product/Product-Specification.md) | |
|
| 30 |
+| MVP behavior and out-of-scope boundaries | [`01-product/MVP-Specification.md`](01-product/MVP-Specification.md) | |
|
| 31 |
+| Database design, archive schema, differential storage, SQL analysis | [`02-architecture/Database-Design.md`](02-architecture/Database-Design.md) | |
|
| 32 |
+| Core Data cache schema and invalidation | [`02-architecture/Core-Data-Cache-Design.md`](02-architecture/Core-Data-Cache-Design.md) | |
|
| 33 |
+| Export formats, manifests, streaming contract | [`02-architecture/Export-Specification.md`](02-architecture/Export-Specification.md) | |
|
| 34 |
+| Implementation workflow, HealthKit capture, exports, tests | [`02-architecture/Implementation-Guide.md`](02-architecture/Implementation-Guide.md) | |
|
| 35 |
+| Forensic limits, export meaning, recovery compatibility | [`01-product/Forensics-Limitations.md`](01-product/Forensics-Limitations.md) | |
|
| 36 |
+| General agent ownership and handoff rules | [`00-agent-guides/AGENTS.md`](00-agent-guides/AGENTS.md) | |
|
| 37 |
+| SwiftUI/UI work | [`00-agent-guides/CLAUDE.md`](00-agent-guides/CLAUDE.md) | |
|
| 38 |
+| Refactoring milestones and sequencing | [`04-project/Refactoring-Plan.md`](04-project/Refactoring-Plan.md) | |
|
| 39 |
+| Project status and refactoring priorities | [`04-project/IMPLEMENTATION_STATUS.md`](04-project/IMPLEMENTATION_STATUS.md) | |
|
| 40 |
+| Historical UI notes only | [`99-archive/`](99-archive/) | |
|
| 33 | 41 |
|
| 34 |
-## Motivation: Why HealthProbe Exists |
|
| 42 |
+## Chapters |
|
| 35 | 43 |
|
| 36 |
-**The Problem:** Apple Health data loss events (confirmed September 2025, ongoing sporadic reports) lack any detection mechanism. Users don't know their data has been lost, corrupted, or silently modified. |
|
| 44 |
+### 00 Agent Guides |
|
| 37 | 45 |
|
| 38 |
-**Concrete Examples:** |
|
| 39 |
-- **Historical insertions:** Workouts from 6+ months ago suddenly appearing |
|
| 40 |
-- **Silent deletions:** Multi-week gaps with no deletion notification |
|
| 41 |
-- **Duplicates:** Same workout syncing multiple times across devices |
|
| 42 |
-- **Divergence:** Metrics (steps, energy, HR) drifting without user action |
|
| 46 |
+- [`00-agent-guides/AGENTS.md`](00-agent-guides/AGENTS.md) |
|
| 47 |
+ Multi-agent development guide, ownership boundaries, protocol contracts, and current architecture decisions. |
|
| 43 | 48 |
|
| 44 |
-See **Complete Specification § 2** for detailed observed cases and forensic implications. |
|
| 49 |
+- [`00-agent-guides/CLAUDE.md`](00-agent-guides/CLAUDE.md) |
|
| 50 |
+ UI/SwiftUI-specific instructions aligned with the current Time Machine objective. |
|
| 45 | 51 |
|
| 52 |
+### 01 Product |
|
| 46 | 53 |
|
| 47 |
-## Next Steps |
|
| 54 |
+- [`01-product/Product-Specification.md`](01-product/Product-Specification.md) |
|
| 55 |
+ Full current product specification, motivations, architecture decision record, and non-goals. |
|
| 48 | 56 |
|
| 49 |
-1. **Implement HealthKit Integration** (`Sources/HealthKitMonitor.swift`) |
|
| 50 |
- - `HKAnchoredObjectQuery` for efficient incremental queries |
|
| 51 |
- - `HKObserverQuery` for real-time change notifications |
|
| 52 |
- - Track: Workouts, Heart Rate, Steps, Sleep, Activity Summary |
|
| 57 |
+- [`01-product/MVP-Specification.md`](01-product/MVP-Specification.md) |
|
| 58 |
+ MVP scope for the iOS app, including read-only behavior, differential storage expectations, exports, and out-of-scope restore/republication. |
|
| 53 | 59 |
|
| 54 |
-2. **Build Anomaly Detection** (`Sources/AnomalyDetector.swift`) |
|
| 55 |
- - Historical insertion detection |
|
| 56 |
- - Silent deletion detection |
|
| 57 |
- - Duplicate fingerprinting |
|
| 58 |
- - Divergence trend analysis |
|
| 60 |
+- [`01-product/Forensics-Limitations.md`](01-product/Forensics-Limitations.md) |
|
| 61 |
+ What the app can and cannot prove, how to interpret changes, and what recovery-compatible exports should preserve. |
|
| 59 | 62 |
|
| 60 |
-3. **Implement Local Archive Store** (`Sources/ArchiveStore.swift`) |
|
| 61 |
- - Single robust local database for all archived samples |
|
| 62 |
- - Preserve cross-type relationships, sources, devices, metadata, and fingerprints |
|
| 63 |
- - Keep SwiftData as UI/cache/settings/history only |
|
| 63 |
+### 02 Architecture |
|
| 64 | 64 |
|
| 65 |
-4. **Create UI Dashboard** (`Views/HealthStatusView.swift`) |
|
| 66 |
- - Show current health status |
|
| 67 |
- - Display active alerts |
|
| 68 |
- - Timeline of anomalies |
|
| 69 |
- - Audit trail viewer |
|
| 65 |
+- [`02-architecture/Database-Design.md`](02-architecture/Database-Design.md) |
|
| 66 |
+ Canonical database design. Start here for SQLite archive v2, Core Data cache boundaries, differential storage, point-in-time reconstruction, SQL diffs, recovery-compatible exports, reset policy, future migrations, and DB tests. |
|
| 70 | 67 |
|
| 68 |
+- [`02-architecture/Core-Data-Cache-Design.md`](02-architecture/Core-Data-Cache-Design.md) |
|
| 69 |
+ Target Core Data cache schema, invalidation rules, rebuild order, and legacy-device cache behavior. |
|
| 71 | 70 |
|
| 72 |
-## Key Design Decisions |
|
| 71 |
+- [`02-architecture/Export-Specification.md`](02-architecture/Export-Specification.md) |
|
| 72 |
+ JSON/CSV export envelope, canonical manifest hashing, item hashing, streaming/cancellation behavior, and provenance warnings. |
|
| 73 | 73 |
|
| 74 |
-| Decision | Rationale | |
|
| 75 |
-|----------|-----------| |
|
| 76 |
-| **Read-only + HealthKit** | Never modify health data; pure observation only | |
|
| 77 |
-| **Local-first storage** | Full functionality without internet; privacy-first | |
|
| 78 |
-| **Archive DB as truth** | Store HealthKit samples and metadata in a robust local database, not split per data type | |
|
| 79 |
-| **SwiftData as UI cache** | Keep precomputed values, settings, logs, and history for visualization only | |
|
| 80 |
-| **Anchored queries** | Minimize HealthKit load, reduce battery impact | |
|
| 81 |
-| **No HealthProbe iCloud sync** | Device HealthKit databases evolve independently; CloudKit sync adds complexity without proven forensic benefit | |
|
| 82 |
-| **Selective forensic capture** | When Health/iCloud rewrites or downsamples historical data, counts alone are insufficient; HealthProbe archives complete samples for selected types in one local store | |
|
| 74 |
+- [`02-architecture/Implementation-Guide.md`](02-architecture/Implementation-Guide.md) |
|
| 75 |
+ Technical implementation guide for HealthKit capture flow, change explanation, exports, context logging, UI integration, and testing. It references the canonical database design instead of redefining it. |
|
| 83 | 76 |
|
| 77 |
+### 03 UI |
|
| 78 |
+ |
|
| 79 |
+- [`03-ui/README.md`](03-ui/README.md) |
|
| 80 |
+ Entry point for active UI guidance and links to archived UI notes. |
|
| 81 |
+ |
|
| 82 |
+### 04 Project |
|
| 83 |
+ |
|
| 84 |
+- [`04-project/Refactoring-Plan.md`](04-project/Refactoring-Plan.md) |
|
| 85 |
+ Checkable milestone plan for the database-led refactor from prototype architecture to SQLite archive v2 + Core Data cache. |
|
| 86 |
+ |
|
| 87 |
+- [`04-project/IMPLEMENTATION_STATUS.md`](04-project/IMPLEMENTATION_STATUS.md) |
|
| 88 |
+ Current implementation status and refactoring priorities. |
|
| 89 |
+ |
|
| 90 |
+### 99 Archive |
|
| 91 |
+ |
|
| 92 |
+Files in [`99-archive/`](99-archive/) are historical implementation notes. They are kept for context only and are not product requirements. |
|
| 93 |
+ |
|
| 94 |
+## Removed / Replaced Objectives |
|
| 95 |
+ |
|
| 96 |
+These are not current objectives: |
|
| 97 |
+- HealthProbe CloudKit/iCloud sync; |
|
| 98 |
+- cross-device record-by-record comparison; |
|
| 99 |
+- count-drop-as-data-loss alerting; |
|
| 100 |
+- notification-led monitoring; |
|
| 101 |
+- community reporting or open-source publication commitments; |
|
| 102 |
+- macOS companion as committed product scope; |
|
| 103 |
+- in-app backup transplant, restore, or HealthKit re-publication; |
|
| 104 |
+- SwiftData as target persistence foundation; |
|
| 105 |
+- backward compatibility with prototype/test databases; |
|
| 106 |
+- recurring complete snapshots for large HealthKit datasets. |
|
| 84 | 107 |
|
| 85 |
-## Document Versions |
|
| 108 |
+## Document Maintenance Rules |
|
| 86 | 109 |
|
| 87 |
-- **v1.0** — 2026-05-01 — Initial comprehensive specification |
|
| 88 |
- - Concrete cases from DearApple |
|
| 89 |
- - Full technical architecture |
|
| 90 |
- - MVP feature scope + future roadmap |
|
| 91 |
-- **v1.1** — 2026-05-17 — Objective extension (post-findings) |
|
| 92 |
- - New observed behavior: Apple-side consolidation/downsampling can rewrite historical samples (not just add/delete) |
|
| 93 |
- - HealthProbe scope extended: from “counter of records” → “forensic backup agent” (local-only archives for selected types) |
|
| 94 |
-- **v1.2** — 2026-05-18 — Storage direction update |
|
| 95 |
- - Robust single local archive store becomes the source of truth |
|
| 96 |
- - SwiftData is limited to precomputed UI data, settings, logs, and history |
|
| 97 |
- - CloudKit/iCloud sync removed from product goals; reports and point exports replace routine complete export ambitions |
|
| 110 |
+1. Add new substantive docs only under `HealthProbe/Doc/`. |
|
| 111 |
+2. Update this index whenever a document is added, renamed, archived, or removed. |
|
| 112 |
+3. If a document becomes stale but may still be useful, move it to `99-archive/` and add a warning header. |
|
| 113 |
+4. Do not leave old copies in the repository root. |
|
| 114 |
+5. Product scope changes must update product docs first, then implementation docs, then code. |
|
| 98 | 115 |
|
| 99 | 116 |
--- |
| 100 | 117 |
|
| 101 |
-*HealthProbe: Guarding the integrity of your health data.* |
|
| 118 |
+*HealthProbe: local HealthKit observation history, recovery-compatible archives, no in-app restore.* |
|
@@ -19,13 +19,62 @@ struct HealthArchiveWriteSummary: Equatable, Sendable {
|
||
| 19 | 19 |
struct HealthArchiveRecordRequest: Equatable, Sendable {
|
| 20 | 20 |
let sampleTypeIdentifier: String? |
| 21 | 21 |
let fingerprints: Set<String> |
| 22 |
+ let disappearedOnly: Bool |
|
| 23 |
+ let firstSeenAfter: Date? |
|
| 24 |
+ let firstSeenBefore: Date? |
|
| 25 |
+ let afterCursor: RecordCursor? |
|
| 22 | 26 |
let limit: Int? |
| 27 |
+ |
|
| 28 |
+ init( |
|
| 29 |
+ sampleTypeIdentifier: String? = nil, |
|
| 30 |
+ fingerprints: Set<String> = [], |
|
| 31 |
+ disappearedOnly: Bool = false, |
|
| 32 |
+ firstSeenAfter: Date? = nil, |
|
| 33 |
+ firstSeenBefore: Date? = nil, |
|
| 34 |
+ afterCursor: RecordCursor? = nil, |
|
| 35 |
+ limit: Int? = nil |
|
| 36 |
+ ) {
|
|
| 37 |
+ self.sampleTypeIdentifier = sampleTypeIdentifier |
|
| 38 |
+ self.fingerprints = fingerprints |
|
| 39 |
+ self.disappearedOnly = disappearedOnly |
|
| 40 |
+ self.firstSeenAfter = firstSeenAfter |
|
| 41 |
+ self.firstSeenBefore = firstSeenBefore |
|
| 42 |
+ self.afterCursor = afterCursor |
|
| 43 |
+ self.limit = limit |
|
| 44 |
+ } |
|
| 45 |
+} |
|
| 46 |
+ |
|
| 47 |
+struct RecordCursor: Equatable, Sendable {
|
|
| 48 |
+ let startDate: Date |
|
| 49 |
+ let strictFingerprint: String |
|
| 23 | 50 |
} |
| 24 | 51 |
|
| 25 | 52 |
struct HealthArchiveReportRequest: Equatable, Sendable {
|
| 26 | 53 |
let reportID: UUID |
| 27 | 54 |
let title: String |
| 28 | 55 |
let includedFingerprints: Set<String> |
| 56 |
+ let typeIdentifierFilter: String? |
|
| 57 |
+ let disappearedOnly: Bool |
|
| 58 |
+ let firstSeenAfter: Date? |
|
| 59 |
+ let firstSeenBefore: Date? |
|
| 60 |
+ |
|
| 61 |
+ init( |
|
| 62 |
+ reportID: UUID, |
|
| 63 |
+ title: String, |
|
| 64 |
+ includedFingerprints: Set<String> = [], |
|
| 65 |
+ typeIdentifierFilter: String? = nil, |
|
| 66 |
+ disappearedOnly: Bool = false, |
|
| 67 |
+ firstSeenAfter: Date? = nil, |
|
| 68 |
+ firstSeenBefore: Date? = nil |
|
| 69 |
+ ) {
|
|
| 70 |
+ self.reportID = reportID |
|
| 71 |
+ self.title = title |
|
| 72 |
+ self.includedFingerprints = includedFingerprints |
|
| 73 |
+ self.typeIdentifierFilter = typeIdentifierFilter |
|
| 74 |
+ self.disappearedOnly = disappearedOnly |
|
| 75 |
+ self.firstSeenAfter = firstSeenAfter |
|
| 76 |
+ self.firstSeenBefore = firstSeenBefore |
|
| 77 |
+ } |
|
| 29 | 78 |
} |
| 30 | 79 |
|
| 31 | 80 |
struct ArchivedHealthRecord: Identifiable, Equatable, Sendable, Encodable {
|
@@ -40,4 +89,31 @@ struct ArchivedHealthRecord: Identifiable, Equatable, Sendable, Encodable {
|
||
| 40 | 89 |
let lastSeenAt: Date? |
| 41 | 90 |
let lastVerifiedAt: Date? |
| 42 | 91 |
let disappearedAt: Date? |
| 92 |
+ |
|
| 93 |
+ // Value fields |
|
| 94 |
+ let valueKind: String? // "quantity", "category", "workout", nil |
|
| 95 |
+ let value: Double? |
|
| 96 |
+ let unit: String? |
|
| 97 |
+ let categoryValue: Int? |
|
| 98 |
+ let workoutActivityType: Int? |
|
| 99 |
+ let durationSeconds: Double? |
|
| 100 |
+ |
|
| 101 |
+ // Source/device metadata |
|
| 102 |
+ let sourceName: String? |
|
| 103 |
+ let sourceBundleIdentifier: String? |
|
| 104 |
+ let deviceName: String? |
|
| 105 |
+ |
|
| 106 |
+ // Display helper |
|
| 107 |
+ var displayValue: String? {
|
|
| 108 |
+ if let value, let unit {
|
|
| 109 |
+ return "\(String(format: "%.1f", value)) \(unit)" |
|
| 110 |
+ } |
|
| 111 |
+ if let categoryValue {
|
|
| 112 |
+ return "Category: \(categoryValue)" |
|
| 113 |
+ } |
|
| 114 |
+ if let durationSeconds {
|
|
| 115 |
+ return String(format: "%.1f min", durationSeconds / 60.0) |
|
| 116 |
+ } |
|
| 117 |
+ return nil |
|
| 118 |
+ } |
|
| 43 | 119 |
} |
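Review note: the defaulted initializers added in this hunk let call sites name only the filters they need. The sketch below shows how a caller might build a first-page request and then derive the next page from a cursor. It re-declares the two structs from the diff so it compiles on its own. The `nextPage(after:from:)` helper, the date window, and the use of `"HKQuantityTypeIdentifierStepCount"` as the stored type identifier are assumptions for illustration, not part of this change.

```swift
import Foundation

// Re-declared from the diff above so this sketch is self-contained.
struct RecordCursor: Equatable, Sendable {
    let startDate: Date
    let strictFingerprint: String
}

struct HealthArchiveRecordRequest: Equatable, Sendable {
    let sampleTypeIdentifier: String?
    let fingerprints: Set<String>
    let disappearedOnly: Bool
    let firstSeenAfter: Date?
    let firstSeenBefore: Date?
    let afterCursor: RecordCursor?
    let limit: Int?

    init(
        sampleTypeIdentifier: String? = nil,
        fingerprints: Set<String> = [],
        disappearedOnly: Bool = false,
        firstSeenAfter: Date? = nil,
        firstSeenBefore: Date? = nil,
        afterCursor: RecordCursor? = nil,
        limit: Int? = nil
    ) {
        self.sampleTypeIdentifier = sampleTypeIdentifier
        self.fingerprints = fingerprints
        self.disappearedOnly = disappearedOnly
        self.firstSeenAfter = firstSeenAfter
        self.firstSeenBefore = firstSeenBefore
        self.afterCursor = afterCursor
        self.limit = limit
    }
}

// Hypothetical helper: copy every filter and swap in a new cursor.
func nextPage(
    after cursor: RecordCursor,
    from request: HealthArchiveRecordRequest
) -> HealthArchiveRecordRequest {
    HealthArchiveRecordRequest(
        sampleTypeIdentifier: request.sampleTypeIdentifier,
        fingerprints: request.fingerprints,
        disappearedOnly: request.disappearedOnly,
        firstSeenAfter: request.firstSeenAfter,
        firstSeenBefore: request.firstSeenBefore,
        afterCursor: cursor,
        limit: request.limit
    )
}

// Assumed one-week window; real callers would take this from the UI.
let windowStart = Date(timeIntervalSinceReferenceDate: 0)
let windowEnd = windowStart.addingTimeInterval(7 * 24 * 60 * 60)

// Only the filters of interest are named; the rest use their defaults.
let firstPage = HealthArchiveRecordRequest(
    sampleTypeIdentifier: "HKQuantityTypeIdentifierStepCount", // assumed stored form
    disappearedOnly: true,
    firstSeenAfter: windowStart,
    firstSeenBefore: windowEnd,
    limit: 100
)
```

Because the request is an immutable value type, deriving the next page by copying keeps the filters identical across pages, which keyset pagination relies on.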
@@ -6,12 +6,14 @@ private enum SQLiteHealthArchiveStoreError: Error {
|
||
| 6 | 6 |
case openFailed(String) |
| 7 | 7 |
case prepareFailed(String) |
| 8 | 8 |
case stepFailed(String) |
| 9 |
+ case incompatibleSchema(Int) |
|
| 9 | 10 |
case exportEncodingFailed |
| 10 | 11 |
} |
| 11 | 12 |
|
| 12 | 13 |
// Interface updated 2026-05-18 — see AGENTS.md |
| 13 | 14 |
actor SQLiteHealthArchiveStore: HealthArchiveStore {
|
| 14 | 15 |
static let shared = SQLiteHealthArchiveStore() |
| 16 |
+ nonisolated private static let archiveSchemaVersion = 2 |
|
| 15 | 17 |
|
| 16 | 18 |
private let databaseURL: URL |
| 17 | 19 |
private var didPrepareSchema = false |
@@ -93,11 +95,25 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
|
||
| 93 | 95 |
if !request.fingerprints.isEmpty {
|
| 94 | 96 |
clauses.append("strict_fingerprint IN (\(Array(repeating: "?", count: request.fingerprints.count).joined(separator: ",")))")
|
| 95 | 97 |
} |
| 98 |
+ if request.disappearedOnly {
|
|
| 99 |
+ clauses.append("disappeared_at IS NOT NULL")
|
|
| 100 |
+ } |
|
| 101 |
+ if request.firstSeenAfter != nil {
|
|
| 102 |
+ clauses.append("first_seen_at >= ?")
|
|
| 103 |
+ } |
|
| 104 |
+ if request.firstSeenBefore != nil {
|
|
| 105 |
+ clauses.append("first_seen_at <= ?")
|
|
| 106 |
+ } |
|
| 107 |
+ if request.afterCursor != nil {
|
|
| 108 |
+ clauses.append("(start_date > ? OR (start_date = ? AND strict_fingerprint > ?))")
|
|
| 109 |
+ } |
|
| 96 | 110 |
let whereClause = clauses.isEmpty ? "" : "WHERE \(clauses.joined(separator: " AND "))" |
| 97 | 111 |
let limitClause = request.limit.map { "LIMIT \(max($0, 0))" } ?? ""
|
| 98 | 112 |
let sql = """ |
| 99 | 113 |
SELECT sample_uuid_hash, type_identifier, strict_fingerprint, semantic_fingerprint, |
| 100 |
- start_date, end_date, first_seen_at, last_seen_at, last_verified_at, disappeared_at |
|
| 114 |
+ start_date, end_date, first_seen_at, last_seen_at, last_verified_at, disappeared_at, |
|
| 115 |
+ value_kind, value, unit, category_value, workout_activity_type, duration_seconds, |
|
| 116 |
+ source_name, source_bundle_identifier, device_name |
|
| 101 | 117 |
FROM archive_samples |
| 102 | 118 |
\(whereClause) |
| 103 | 119 |
ORDER BY start_date ASC, strict_fingerprint ASC |
@@ -114,6 +130,22 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
|
||
| 114 | 130 |
bindText(fingerprint, to: index, in: statement) |
| 115 | 131 |
index += 1 |
| 116 | 132 |
} |
| 133 |
+ if let firstSeenAfter = request.firstSeenAfter {
|
|
| 134 |
+ sqlite3_bind_double(statement, index, firstSeenAfter.timeIntervalSinceReferenceDate) |
|
| 135 |
+ index += 1 |
|
| 136 |
+ } |
|
| 137 |
+ if let firstSeenBefore = request.firstSeenBefore {
|
|
| 138 |
+ sqlite3_bind_double(statement, index, firstSeenBefore.timeIntervalSinceReferenceDate) |
|
| 139 |
+ index += 1 |
|
| 140 |
+ } |
|
| 141 |
+ if let cursor = request.afterCursor {
|
|
| 142 |
+ sqlite3_bind_double(statement, index, cursor.startDate.timeIntervalSinceReferenceDate) |
|
| 143 |
+ index += 1 |
|
| 144 |
+ sqlite3_bind_double(statement, index, cursor.startDate.timeIntervalSinceReferenceDate) |
|
| 145 |
+ index += 1 |
|
| 146 |
+ bindText(cursor.strictFingerprint, to: index, in: statement) |
|
| 147 |
+ index += 1 |
|
| 148 |
+ } |
|
| 117 | 149 |
|
| 118 | 150 |
var records: [ArchivedHealthRecord] = [] |
| 119 | 151 |
while sqlite3_step(statement) == SQLITE_ROW {
|
@@ -128,7 +160,16 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
|
||
| 128 | 160 |
firstSeenAt: columnDate(statement, 6) ?? Date(timeIntervalSinceReferenceDate: 0), |
| 129 | 161 |
lastSeenAt: columnDate(statement, 7), |
| 130 | 162 |
lastVerifiedAt: columnDate(statement, 8), |
| 131 |
- disappearedAt: columnDate(statement, 9) |
|
| 163 |
+ disappearedAt: columnDate(statement, 9), |
|
| 164 |
+ valueKind: columnText(statement, 10), |
|
| 165 |
+ value: columnDouble(statement, 11), |
|
| 166 |
+ unit: columnText(statement, 12), |
|
| 167 |
+ categoryValue: columnInt(statement, 13), |
|
| 168 |
+ workoutActivityType: columnInt(statement, 14), |
|
| 169 |
+ durationSeconds: columnDouble(statement, 15), |
|
| 170 |
+ sourceName: columnText(statement, 16), |
|
| 171 |
+ sourceBundleIdentifier: columnText(statement, 17), |
|
| 172 |
+ deviceName: columnText(statement, 18) |
|
| 132 | 173 |
)) |
| 133 | 174 |
} |
| 134 | 175 |
return records |
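Review note: the `afterCursor` clause added in this hunk is a keyset (seek) pagination predicate paired with `ORDER BY start_date ASC, strict_fingerprint ASC`. The sketch below models the same predicate in memory to show why the tie-break on `strict_fingerprint` is needed when several rows share a `start_date`. `Row`, `Cursor`, `page(of:after:limit:)`, and `allPages(of:pageSize:)` are hypothetical stand-ins, not types from this change. It also assumes fingerprints are plain ASCII strings, so Swift's string ordering matches SQLite's default comparison.

```swift
import Foundation

// Hypothetical stand-ins for an archived row and its paging cursor.
struct Row: Equatable {
    let startDate: Date
    let strictFingerprint: String
}

struct Cursor: Equatable {
    let startDate: Date
    let strictFingerprint: String
}

/// In-memory equivalent of:
///   WHERE (start_date > ? OR (start_date = ? AND strict_fingerprint > ?))
///   ORDER BY start_date ASC, strict_fingerprint ASC
///   LIMIT ?
func page(of rows: [Row], after cursor: Cursor?, limit: Int) -> [Row] {
    let sorted = rows.sorted {
        ($0.startDate, $0.strictFingerprint) < ($1.startDate, $1.strictFingerprint)
    }
    let remaining = sorted.filter { row in
        guard let cursor else { return true } // no cursor: start from the first row
        return row.startDate > cursor.startDate
            || (row.startDate == cursor.startDate
                && row.strictFingerprint > cursor.strictFingerprint)
    }
    return Array(remaining.prefix(max(limit, 0)))
}

/// Walks every page by feeding the last row of each page back in as the cursor.
func allPages(of rows: [Row], pageSize: Int) -> [[Row]] {
    var pages: [[Row]] = []
    var cursor: Cursor? = nil
    while true {
        let next = page(of: rows, after: cursor, limit: pageSize)
        guard let last = next.last else { break } // an empty page ends the walk
        pages.append(next)
        cursor = Cursor(startDate: last.startDate, strictFingerprint: last.strictFingerprint)
    }
    return pages
}
```

Compared with `OFFSET`, seeking past the last returned key does not skip or repeat rows when records are inserted between page fetches, provided the `(start_date, strict_fingerprint)` pair is unique per row.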
@@ -136,11 +177,15 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
|
||
| 136 | 177 |
} |
| 137 | 178 |
|
| 138 | 179 |
func exportReport(_ request: HealthArchiveReportRequest) async throws -> URL {
|
| 139 |
- let records = try await records(for: HealthArchiveRecordRequest( |
|
| 140 |
- sampleTypeIdentifier: nil, |
|
| 180 |
+ let recordRequest = HealthArchiveRecordRequest( |
|
| 181 |
+ sampleTypeIdentifier: request.typeIdentifierFilter, |
|
| 141 | 182 |
fingerprints: request.includedFingerprints, |
| 183 |
+ disappearedOnly: request.disappearedOnly, |
|
| 184 |
+ firstSeenAfter: request.firstSeenAfter, |
|
| 185 |
+ firstSeenBefore: request.firstSeenBefore, |
|
| 142 | 186 |
limit: nil |
| 143 |
- )) |
|
| 187 |
+ ) |
|
| 188 |
+ let records = try await records(for: recordRequest) |
|
| 144 | 189 |
let payload = HealthArchiveReportPayload( |
| 145 | 190 |
reportID: request.reportID, |
| 146 | 191 |
title: request.title, |
@@ -173,6 +218,333 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
|
||
| 173 | 218 |
guard !didPrepareSchema else { return }
|
| 174 | 219 |
try execute("PRAGMA journal_mode = WAL", db: db)
|
| 175 | 220 |
try execute("PRAGMA foreign_keys = ON", db: db)
|
| 221 |
+ |
|
| 222 |
+ let existingVersion = try archiveSchemaVersionIfPresent(db) |
|
| 223 |
+ if let existingVersion, existingVersion > Self.archiveSchemaVersion {
|
|
| 224 |
+ throw SQLiteHealthArchiveStoreError.incompatibleSchema(existingVersion) |
|
| 225 |
+ } |
|
| 226 |
+ if existingVersion != Self.archiveSchemaVersion {
|
|
| 227 |
+ let needsReset = existingVersion != nil ? true : try hasUserTables(db) |
|
| 228 |
+ if needsReset {
|
|
| 229 |
+ try resetPrototypeSchema(db) |
|
| 230 |
+ } |
|
| 231 |
+ } |
|
| 232 |
+ |
|
| 233 |
+ try createArchiveV2Schema(db) |
|
| 234 |
+ try seedArchiveMetadata(db) |
|
| 235 |
+ didPrepareSchema = true |
|
| 236 |
+ } |
|
| 237 |
+ |
|
| 238 |
+ private func archiveSchemaVersionIfPresent(_ db: OpaquePointer?) throws -> Int? {
|
|
| 239 |
+ guard try tableExists("archive_metadata", db: db) else { return nil }
|
|
| 240 |
+ let sql = "SELECT value FROM archive_metadata WHERE key = 'schema_version' LIMIT 1" |
|
| 241 |
+ return try withStatement(sql, db: db) { statement in
|
|
| 242 |
+ guard sqlite3_step(statement) == SQLITE_ROW, |
|
| 243 |
+ let value = columnText(statement, 0) else {
|
|
| 244 |
+ return nil |
|
| 245 |
+ } |
|
| 246 |
+ return Int(value) |
|
| 247 |
+ } |
|
| 248 |
+ } |
|
| 249 |
+ |
|
| 250 |
+ private func hasUserTables(_ db: OpaquePointer?) throws -> Bool {
|
|
| 251 |
+ let sql = """ |
|
| 252 |
+ SELECT name |
|
| 253 |
+ FROM sqlite_master |
|
| 254 |
+ WHERE type = 'table' AND name NOT LIKE 'sqlite_%' |
|
| 255 |
+ LIMIT 1 |
|
| 256 |
+ """ |
|
| 257 |
+ return try withStatement(sql, db: db) { statement in
|
|
| 258 |
+ sqlite3_step(statement) == SQLITE_ROW |
|
| 259 |
+ } |
|
| 260 |
+ } |
|
| 261 |
+ |
|
| 262 |
+ private func tableExists(_ tableName: String, db: OpaquePointer?) throws -> Bool {
|
|
| 263 |
+ let sql = """ |
|
| 264 |
+ SELECT 1 |
|
| 265 |
+ FROM sqlite_master |
|
| 266 |
+ WHERE type = 'table' AND name = ? |
|
| 267 |
+ LIMIT 1 |
|
| 268 |
+ """ |
|
| 269 |
+ return try withStatement(sql, db: db) { statement in
|
|
| 270 |
+ bindText(tableName, to: 1, in: statement) |
|
| 271 |
+ return sqlite3_step(statement) == SQLITE_ROW |
|
| 272 |
+ } |
|
| 273 |
+ } |
|
| 274 |
+ |
|
| 275 |
+    private func resetPrototypeSchema(_ db: OpaquePointer?) throws {
+        // Prototype/test installs are disposable for archive v2. Future real archives
+        // must use explicit migrations instead of destructive reset.
+        try execute("PRAGMA foreign_keys = OFF", db: db)
+        for objectName in try schemaObjectNames(types: ["view", "trigger"], db: db) {
+            try execute("DROP \(objectName.kind.uppercased()) IF EXISTS \(quotedIdentifier(objectName.name))", db: db)
+        }
+        for tableName in try schemaObjectNames(types: ["table"], db: db) {
+            try execute("DROP TABLE IF EXISTS \(quotedIdentifier(tableName.name))", db: db)
+        }
+        try execute("PRAGMA foreign_keys = ON", db: db)
+    }
+
+    private func schemaObjectNames(types: [String], db: OpaquePointer?) throws -> [(kind: String, name: String)] {
+        let typeList = types.map { "'\($0)'" }.joined(separator: ",")
+        let sql = """
+        SELECT type, name
+        FROM sqlite_master
+        WHERE type IN (\(typeList)) AND name NOT LIKE 'sqlite_%'
+        ORDER BY type, name
+        """
+        return try withStatement(sql, db: db) { statement in
+            var names: [(kind: String, name: String)] = []
+            while sqlite3_step(statement) == SQLITE_ROW {
+                guard let kind = columnText(statement, 0),
+                      let name = columnText(statement, 1) else {
+                    continue
+                }
+                names.append((kind, name))
+            }
+            return names
+        }
+    }
+
+    private func createArchiveV2Schema(_ db: OpaquePointer?) throws {
+        try execute("""
+        CREATE TABLE IF NOT EXISTS schema_migrations (
+            version INTEGER PRIMARY KEY,
+            applied_at REAL NOT NULL,
+            description TEXT NOT NULL
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS archive_metadata (
+            key TEXT PRIMARY KEY,
+            value TEXT NOT NULL
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS device_chains (
+            id INTEGER PRIMARY KEY,
+            device_chain_hash TEXT NOT NULL UNIQUE,
+            created_at REAL NOT NULL,
+            recovered_from_keychain INTEGER NOT NULL DEFAULT 0
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS observations (
+            id INTEGER PRIMARY KEY,
+            device_chain_id INTEGER NOT NULL REFERENCES device_chains(id),
+            observed_at REAL NOT NULL,
+            started_at REAL,
+            ended_at REAL,
+            status TEXT NOT NULL,
+            trigger_reason TEXT NOT NULL,
+            app_version TEXT,
+            os_version TEXT,
+            time_zone_identifier TEXT,
+            time_zone_seconds_from_gmt INTEGER,
+            schema_version INTEGER NOT NULL,
+            selected_type_set_hash TEXT,
+            notes TEXT
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_observations_device_time ON observations(device_chain_id, observed_at)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS sample_types (
+            id INTEGER PRIMARY KEY,
+            type_identifier TEXT NOT NULL UNIQUE,
+            display_name TEXT,
+            category TEXT
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS observation_type_runs (
+            id INTEGER PRIMARY KEY,
+            observation_id INTEGER NOT NULL REFERENCES observations(id),
+            sample_type_id INTEGER NOT NULL REFERENCES sample_types(id),
+            status TEXT NOT NULL,
+            started_at REAL,
+            ended_at REAL,
+            anchor_before BLOB,
+            anchor_after BLOB,
+            inserted_event_count INTEGER NOT NULL DEFAULT 0,
+            deleted_event_count INTEGER NOT NULL DEFAULT 0,
+            verified_visible_count INTEGER,
+            error_kind TEXT,
+            error_message_hash TEXT,
+            UNIQUE(observation_id, sample_type_id)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_type_runs_type_observation ON observation_type_runs(sample_type_id, observation_id)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS sources (
+            id INTEGER PRIMARY KEY,
+            source_name_hash TEXT,
+            bundle_identifier TEXT
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS source_revisions (
+            id INTEGER PRIMARY KEY,
+            source_id INTEGER NOT NULL REFERENCES sources(id),
+            product_type TEXT,
+            version TEXT,
+            operating_system_version TEXT,
+            UNIQUE(source_id, product_type, version, operating_system_version)
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS hk_devices (
+            id INTEGER PRIMARY KEY,
+            device_hash TEXT,
+            manufacturer_hash TEXT,
+            model TEXT,
+            hardware_version TEXT,
+            firmware_version TEXT,
+            software_version TEXT,
+            local_identifier_hash TEXT,
+            udi_hash TEXT
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS metadata_blobs (
+            id INTEGER PRIMARY KEY,
+            metadata_hash TEXT NOT NULL UNIQUE,
+            metadata_json TEXT NOT NULL
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS samples (
+            id INTEGER PRIMARY KEY,
+            sample_type_id INTEGER NOT NULL REFERENCES sample_types(id),
+            sample_uuid_hash TEXT,
+            strict_fingerprint TEXT NOT NULL,
+            semantic_fingerprint TEXT,
+            fuzzy_key TEXT,
+            first_seen_observation_id INTEGER NOT NULL REFERENCES observations(id),
+            first_seen_at REAL NOT NULL,
+            UNIQUE(sample_type_id, strict_fingerprint)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_samples_uuid_hash ON samples(sample_uuid_hash)", db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_samples_type_semantic ON samples(sample_type_id, semantic_fingerprint)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS sample_versions (
+            id INTEGER PRIMARY KEY,
+            sample_id INTEGER NOT NULL REFERENCES samples(id),
+            payload_hash TEXT NOT NULL,
+            start_date REAL NOT NULL,
+            end_date REAL NOT NULL,
+            value_kind TEXT,
+            numeric_value REAL,
+            unit TEXT,
+            category_value INTEGER,
+            workout_activity_type INTEGER,
+            duration_seconds REAL,
+            source_revision_id INTEGER REFERENCES source_revisions(id),
+            hk_device_id INTEGER REFERENCES hk_devices(id),
+            metadata_id INTEGER REFERENCES metadata_blobs(id),
+            created_observation_id INTEGER NOT NULL REFERENCES observations(id),
+            UNIQUE(sample_id, payload_hash)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_sample_versions_sample ON sample_versions(sample_id)", db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_sample_versions_time ON sample_versions(start_date, end_date)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS sample_observation_events (
+            id INTEGER PRIMARY KEY,
+            observation_id INTEGER NOT NULL REFERENCES observations(id),
+            sample_id INTEGER NOT NULL REFERENCES samples(id),
+            version_id INTEGER REFERENCES sample_versions(id),
+            event_kind TEXT NOT NULL,
+            observed_at REAL NOT NULL,
+            evidence_kind TEXT,
+            UNIQUE(observation_id, sample_id, event_kind)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_events_observation_kind ON sample_observation_events(observation_id, event_kind)", db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_events_sample ON sample_observation_events(sample_id, observation_id)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS sample_visibility_ranges (
+            sample_id INTEGER NOT NULL REFERENCES samples(id),
+            version_id INTEGER REFERENCES sample_versions(id),
+            first_observation_id INTEGER NOT NULL REFERENCES observations(id),
+            last_observation_id INTEGER REFERENCES observations(id),
+            first_seen_at REAL NOT NULL,
+            last_seen_at REAL,
+            PRIMARY KEY (sample_id, version_id, first_observation_id)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_visibility_open_ranges ON sample_visibility_ranges(last_observation_id)", db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_visibility_point_lookup ON sample_visibility_ranges(first_observation_id, last_observation_id)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS sample_relationships (
+            id INTEGER PRIMARY KEY,
+            observation_id INTEGER REFERENCES observations(id),
+            source_sample_id INTEGER NOT NULL REFERENCES samples(id),
+            target_sample_id INTEGER NOT NULL REFERENCES samples(id),
+            relationship_kind TEXT NOT NULL,
+            metadata_id INTEGER REFERENCES metadata_blobs(id),
+            UNIQUE(observation_id, source_sample_id, target_sample_id, relationship_kind)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_relationship_source ON sample_relationships(source_sample_id, relationship_kind)", db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_relationship_target ON sample_relationships(target_sample_id, relationship_kind)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS observation_type_summaries (
+            observation_id INTEGER NOT NULL REFERENCES observations(id),
+            sample_type_id INTEGER NOT NULL REFERENCES sample_types(id),
+            visible_record_count INTEGER NOT NULL,
+            appeared_count INTEGER NOT NULL DEFAULT 0,
+            disappeared_count INTEGER NOT NULL DEFAULT 0,
+            representation_changed_count INTEGER NOT NULL DEFAULT 0,
+            earliest_start_date REAL,
+            latest_end_date REAL,
+            value_sum REAL,
+            value_max REAL,
+            aggregate_hash TEXT,
+            PRIMARY KEY (observation_id, sample_type_id)
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS daily_type_aggregates (
+            observation_id INTEGER NOT NULL REFERENCES observations(id),
+            sample_type_id INTEGER NOT NULL REFERENCES sample_types(id),
+            bucket_start REAL NOT NULL,
+            bucket_end REAL NOT NULL,
+            visible_record_count INTEGER NOT NULL,
+            value_sum REAL,
+            value_max REAL,
+            source_revision_id INTEGER,
+            aggregate_hash TEXT,
+            PRIMARY KEY (observation_id, sample_type_id, bucket_start, source_revision_id)
+        )
+        """, db: db)
+        try execute("CREATE INDEX IF NOT EXISTS idx_daily_type_bucket ON daily_type_aggregates(sample_type_id, bucket_start)", db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS export_manifests (
+            id INTEGER PRIMARY KEY,
+            export_id TEXT NOT NULL UNIQUE,
+            created_at REAL NOT NULL,
+            export_kind TEXT NOT NULL,
+            from_observation_id INTEGER REFERENCES observations(id),
+            to_observation_id INTEGER REFERENCES observations(id),
+            filter_json TEXT,
+            manifest_hash TEXT NOT NULL,
+            record_count INTEGER NOT NULL
+        )
+        """, db: db)
+        try execute("""
+        CREATE TABLE IF NOT EXISTS export_items (
+            export_manifest_id INTEGER NOT NULL REFERENCES export_manifests(id),
+            sample_id INTEGER NOT NULL REFERENCES samples(id),
+            version_id INTEGER REFERENCES sample_versions(id),
+            item_hash TEXT NOT NULL,
+            PRIMARY KEY (export_manifest_id, sample_id, version_id)
+        )
+        """, db: db)
+        try createLegacyArchiveSamplesTable(db)
+    }
+
+    private func createLegacyArchiveSamplesTable(_ db: OpaquePointer?) throws {
         try execute("""
         CREATE TABLE IF NOT EXISTS archive_samples (
             sample_uuid_hash TEXT PRIMARY KEY NOT NULL,
@@ -211,7 +583,46 @@ actor SQLiteHealthArchiveStore: HealthArchiveStore {
         """, db: db)
         try execute("CREATE INDEX IF NOT EXISTS idx_archive_samples_type_date ON archive_samples(type_identifier, start_date)", db: db)
         try execute("CREATE INDEX IF NOT EXISTS idx_archive_samples_strict_fingerprint ON archive_samples(strict_fingerprint)", db: db)
-        didPrepareSchema = true
+    }
+
+    private func seedArchiveMetadata(_ db: OpaquePointer?) throws {
+        try upsertMetadata(key: "schema_version", value: "\(Self.archiveSchemaVersion)", db: db)
+        try insertMetadataIfMissing(key: "created_at_unix", value: "\(Date().timeIntervalSince1970)", db: db)
+        try upsertMetadata(key: "timestamp_convention", value: "unix_seconds_utc_real", db: db)
+        try upsertMetadata(key: "identifier_hash_algorithm", value: "hmac-sha256-local-secret", db: db)
+        try upsertMetadata(key: "content_hash_algorithm", value: "sha256", db: db)
+        try upsertMetadata(key: "prototype_reset_policy", value: "reset_or_reinitialize_test_installs", db: db)
+        try withStatement(
+            "INSERT OR IGNORE INTO schema_migrations (version, applied_at, description) VALUES (?, ?, ?)",
+            db: db
+        ) { statement in
+            sqlite3_bind_int64(statement, 1, sqlite3_int64(Self.archiveSchemaVersion))
+            sqlite3_bind_double(statement, 2, Date().timeIntervalSince1970)
+            bindText("Initialize archive v2 schema", to: 3, in: statement)
+            guard sqlite3_step(statement) == SQLITE_DONE else {
+                throw SQLiteHealthArchiveStoreError.stepFailed(lastErrorMessage(db))
+            }
+        }
+    }
+
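Reviewer note: `seedArchiveMetadata` records the timestamp convention as `unix_seconds_utc_real` and writes values with `timeIntervalSince1970`, while the existing `columnDate` helper shown later in this diff decodes with `Date(timeIntervalSinceReferenceDate:)`. Whether any archive v2 reader goes through `columnDate` is not visible here, so the following is only a sketch of why the two conventions must not be mixed. The helper name `dateFromUnixSeconds` is hypothetical and not part of the PR.

```swift
import Foundation

// Hypothetical helper (not in the PR): decode a REAL column stored under the
// "unix_seconds_utc_real" convention.
func dateFromUnixSeconds(_ seconds: Double) -> Date {
    Date(timeIntervalSince1970: seconds)
}

// 2001-01-01T00:00:00Z expressed as Unix seconds.
let stored = 978_307_200.0

let correct = dateFromUnixSeconds(stored)
// Decoding the same number as seconds since the 2001 reference date shifts it by 31 years.
let misread = Date(timeIntervalSinceReferenceDate: stored)

print(correct.timeIntervalSinceReferenceDate) // 0.0
print(misread.timeIntervalSince(correct))     // 978307200.0
```

The offset equals `Date.timeIntervalBetween1970AndReferenceDate`, so a mixed-up decoder would still produce plausible-looking dates, which makes the mistake easy to miss in review.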
+    private func upsertMetadata(key: String, value: String, db: OpaquePointer?) throws {
+        try withStatement("INSERT OR REPLACE INTO archive_metadata (key, value) VALUES (?, ?)", db: db) { statement in
+            bindText(key, to: 1, in: statement)
+            bindText(value, to: 2, in: statement)
+            guard sqlite3_step(statement) == SQLITE_DONE else {
+                throw SQLiteHealthArchiveStoreError.stepFailed(lastErrorMessage(db))
+            }
+        }
+    }
+
+    private func insertMetadataIfMissing(key: String, value: String, db: OpaquePointer?) throws {
+        try withStatement("INSERT OR IGNORE INTO archive_metadata (key, value) VALUES (?, ?)", db: db) { statement in
+            bindText(key, to: 1, in: statement)
+            bindText(value, to: 2, in: statement)
+            guard sqlite3_step(statement) == SQLITE_DONE else {
+                throw SQLiteHealthArchiveStoreError.stepFailed(lastErrorMessage(db))
+            }
+        }
     }
 
     private func upsertSamples(_ samples: [HKSample], observedAt: Date, db: OpaquePointer?) throws -> HealthArchiveWriteSummary {
@@ -532,6 +943,20 @@ nonisolated private func columnDate(_ statement: OpaquePointer?, _ index: Int32)
     return Date(timeIntervalSinceReferenceDate: sqlite3_column_double(statement, index))
 }
 
+nonisolated private func columnDouble(_ statement: OpaquePointer?, _ index: Int32) -> Double? {
+    guard sqlite3_column_type(statement, index) != SQLITE_NULL else { return nil }
+    return sqlite3_column_double(statement, index)
+}
+
+nonisolated private func columnInt(_ statement: OpaquePointer?, _ index: Int32) -> Int? {
+    guard sqlite3_column_type(statement, index) != SQLITE_NULL else { return nil }
+    return Int(sqlite3_column_int64(statement, index))
+}
+
+nonisolated private func quotedIdentifier(_ value: String) -> String {
+    "\"\(value.replacingOccurrences(of: "\"", with: "\"\""))\""
+}
+
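Reviewer note: `quotedIdentifier` is what keeps the `DROP ... IF EXISTS` statements in `resetPrototypeSchema` safe when an object name contains a double quote. A minimal standalone sketch of its behaviour follows; the function body is copied from the hunk above, and the object names are hypothetical examples rather than names from the archive schema.

```swift
import Foundation

// Standalone copy of the helper from the diff: wraps a name in double quotes
// and doubles any embedded double quote, as SQL identifier quoting requires.
func quotedIdentifier(_ value: String) -> String {
    "\"\(value.replacingOccurrences(of: "\"", with: "\"\""))\""
}

// Hypothetical object names, for illustration only.
let plain = quotedIdentifier("archive_samples")
let tricky = quotedIdentifier("odd\"name")

print("DROP TABLE IF EXISTS \(plain)")  // DROP TABLE IF EXISTS "archive_samples"
print("DROP TABLE IF EXISTS \(tricky)") // DROP TABLE IF EXISTS "odd""name"
```

Quoting is needed here because SQLite cannot bind identifiers as statement parameters; only values can use `?` placeholders.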
 nonisolated private func lastErrorMessage(_ db: OpaquePointer?) -> String {
     guard let message = sqlite3_errmsg(db) else { return "unknown SQLite error" }
     return String(cString: message)
@@ -1,239 +0,0 @@
-# HealthProbe Implementation Status
-
-## Overview
-
-HealthProbe's comprehensive snapshot + delta system has been implemented according to the detailed plan. The project builds successfully with no compilation errors.
-
-## Completed Components (100%)
-
-### Models (Step 1-3)
-✅ **SnapshotQuality.swift**: All quality states (complete, partial, unauthorized, loading, failed)
-✅ **AnomalyType.swift**: All anomaly types + Severity + TypeTransition + TypeDeltaReason enums
-✅ **HealthSnapshot.swift**: Chain metadata, quality, trigger context, registry fingerprinting, timezone context
-✅ **TypeCount.swift**: Count with hash, date range, quality, yearly counts with cascade relationship
-✅ **SnapshotDelta.swift**: Local delta with checksums and cascade relationship to TypeDeltas
-✅ **TypeDelta.swift**: Per-type local delta with transition, reason, quality before/after, yearly count note
-✅ **AnomalyRecord.swift**: Anomaly record with deltaID set structurally by detector, never by caller
-✅ **OperationLog.swift**: Audit trail for destructive operations with JSON-encoded affected snapshot IDs
-✅ **YearlyCount.swift**: Per-year sample counts with approximation flag
-
-### Services (Step 4-12)
-
-#### Step 5: HashService ✅
-- `typeHash()`: SHA256 of typeIdentifier|count|earliest|latest (ISO8601 with fractional seconds)
-- `snapshotChecksum()`: Filters on quality==.complete (not hash!=""), concatenates type hashes
-- `typeSetHash()`: SHA256 of sorted active typeIdentifiers (covers full intended registry)
-
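Reviewer note: the `typeHash()` description above can be sketched as follows. This is a minimal illustration under stated assumptions, not the deleted HashService code: the function signature, the UTC time zone, lowercase hex output, and encoding a missing date as an empty field are all assumptions. Only the field order, the `|` separator, SHA256, and ISO8601 with fractional seconds come from the status document.

```swift
import Foundation
import CryptoKit

// Sketch only: the real HashService signature is not shown in this diff.
func typeHash(typeIdentifier: String, count: Int, earliest: Date?, latest: Date?) -> String {
    let formatter = ISO8601DateFormatter()
    formatter.formatOptions = [.withInternetDateTime, .withFractionalSeconds]
    formatter.timeZone = TimeZone(secondsFromGMT: 0) // assumption: hash in UTC

    // Assumption: a missing date becomes an empty field.
    let fields = [
        typeIdentifier,
        String(count),
        earliest.map { formatter.string(from: $0) } ?? "",
        latest.map { formatter.string(from: $0) } ?? ""
    ]
    let digest = SHA256.hash(data: Data(fields.joined(separator: "|").utf8))
    // Render the digest as lowercase hex.
    return digest.map { String(format: "%02x", $0) }.joined()
}

let earliest = Date(timeIntervalSince1970: 1_600_000_000)
let latest = Date(timeIntervalSince1970: 1_700_000_000.5)
let first = typeHash(typeIdentifier: "HKQuantityTypeIdentifierStepCount", count: 120, earliest: earliest, latest: latest)
let repeated = typeHash(typeIdentifier: "HKQuantityTypeIdentifierStepCount", count: 120, earliest: earliest, latest: latest)
let changed = typeHash(typeIdentifier: "HKQuantityTypeIdentifierStepCount", count: 121, earliest: earliest, latest: latest)
print(first == repeated, first == changed)
```

Because the input covers only the count and the date range, two different sample sets with the same count and range hash identically, which matches the document's remark that `silentReplacement` detection is best-effort.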
-#### Step 11 & 11b: HealthKitService & ObserverService ✅
-- Per-type fetch with **15-second combined timeout** (distribution + earliestDate + latestDate)
-- Concurrency capped at 6 simultaneous type fetches (prevents HealthKit resource exhaustion)
-- Per-type quality detection (unauthorized, failed, complete)
-- Real earliestDate/latestDate from separate HKSampleQuery (NOT from bin boundaries)
-- YearlyCount population from distribution bins with isApproximate flag
-- Snapshot quality aggregation (loading > unauthorized > partial > complete)
-- Chain metadata set before save (previousSnapshotID, isChainStart, monitoredTypeSetHash)
-- Auto-detect post-restore (full deny → complete transition, or chain start > 1000 records)
-- **Post-save pipeline**: DeltaService → AnomalyDetector → OperationLog
-- **ObserverService**: debounce (10 min), manual overlap suppression, all-monitored-types snapshot
-- **Background delivery**: .immediate for heart rate/steps, .daily for others
-
-#### Step 7: DeltaService ✅
-- Computes and saves SnapshotDelta with TypeDeltas
-- **Reason assignment with priority**: authorizationChanged > unsupported > registryChanged > unknown > normal
-- **Unavailable count guard**: if either quality != .complete, countDelta = 0 (never from -1)
-- **YearlyCount timezone guard**: if timezone changes, set countDelta = 0 and yearlyCountNote
-- **Delta merge** (for intermediate deletion):
-  - Recomputes checksums from surrounding snapshots (never carries old checksums)
-  - Handles disappeared→appeared transition (remove from merged delta if type existed only in deleted snapshot)
-  - Applies unavailable count guard and reason priority to merged result
-  - Sets timezone note if either source had it
-
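Reviewer note: the "unavailable count guard" above is the rule that stops a sentinel count of -1 from leaking into a delta. A minimal sketch, assuming hypothetical type names (`Quality`, `TypeCountSide`) that stand in for the real SwiftData models:

```swift
// Hypothetical stand-ins for the real models; only the rule itself comes
// from the status document.
enum Quality { case complete, partial, unauthorized, loading, failed }

struct TypeCountSide {
    let count: Int        // -1 when the count is unavailable
    let quality: Quality
}

// If either side is not .complete, report 0 instead of subtracting from -1.
func countDelta(before: TypeCountSide, after: TypeCountSide) -> Int {
    guard before.quality == .complete, after.quality == .complete else { return 0 }
    return after.count - before.count
}

let guarded = countDelta(
    before: TypeCountSide(count: -1, quality: .failed),
    after: TypeCountSide(count: 120, quality: .complete)
)
let normal = countDelta(
    before: TypeCountSide(count: 100, quality: .complete),
    after: TypeCountSide(count: 120, quality: .complete)
)
print(guarded, normal) // 0 20
```

Without the guard the first case would report a delta of 121, which would look like a large insertion to any downstream detector.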
-#### Step 8: AnomalyDetector ✅
-- **Pure function**: no context mutation, receives TypeCount maps, returns DetectionResult
-- **Quality gate**: both snapshots must be .complete (suppresses ALL detection including first auth after full deny)
-- **Registry gate**: skips appeared/disappeared anomalies if reason != .normal
-- **count = -1 guard**: skips any TypeDelta with qualityBefore or qualityAfter != .complete
-- **Anomaly detection rules**:
-  - `historicalInsertion`: countDelta > 0 AND (earlier earliest date OR recent latest with increased count)
-  - `deletion`: countDelta < 0 (severity based on % loss)
-  - `duplication`: countDelta > 50% AND date ranges within 1 day
-  - `silentReplacement`: countDelta == 0 AND hash differs (best-effort, MVP limitation)
-  - `syncAnomaly`: ≥4 types with |delta| > 10% (critical severity)
-- **isPostRestore suppression**:
-  - Suppresses syncAnomaly if previous.isPostRestore && previous.isPostRestoreSuppressedDeltaID == nil
-  - Suppression token consumed via DetectionResult, persisted by HealthKitService
-  - Forwarded past low-quality successors (quality gate prevents consumption on incomplete snapshots)
-- **AnomalyRecord.deltaID**: set internally, structural guarantee (impossible to return record without deltaID)
-
-#### Step 4: KeychainService ✅
-- Stable device ID persisted in Keychain (service: "ro.xdev.healthprobe.deviceid", account: "stable_device_id")
-- Detects DB reset: swiftDataStoreIsEmpty + existing keychain ID → recoveredDeviceID = true
-- In-process cache for repeated lookups
-
-#### Step 6 & 9: IntegrityService & Quality Aggregation ✅
-- `validate()`, strict mode:
-  - Recomputes checksum from TypeCounts
-  - Compares with delta.checksumAfter
-  - Returns .valid or .checksumMismatch / .missingDelta / .corrupted
-- `validateChain()`, walk backwards from latest via previousSnapshotID:
-  - **Fork detection**: asserts no duplicate previousSnapshotID (returns .corrupted immediately)
-  - Stops at first mismatch (no auto-repair, no skips)
-- **Quality aggregation**: loading > unauthorized (only if ALL) > partial (any failed/unauthorized) > complete
-
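Reviewer note: the quality aggregation rule above reads as a small precedence function. The sketch below is one interpretation, not the deleted implementation. The enum cases are the ones the document lists for SnapshotQuality; the function name, treating any non-complete per-type quality as "partial", and returning `.complete` for an empty list are assumptions.

```swift
// Cases as listed in the status document.
enum SnapshotQuality { case complete, partial, unauthorized, loading, failed }

// Precedence: loading > unauthorized (only if ALL) > partial > complete.
func aggregateQuality(_ perType: [SnapshotQuality]) -> SnapshotQuality {
    if perType.contains(.loading) { return .loading }
    if !perType.isEmpty && perType.allSatisfy({ $0 == .unauthorized }) { return .unauthorized }
    // Assumption: any remaining non-complete type downgrades the snapshot to partial.
    if perType.contains(where: { $0 != .complete }) { return .partial }
    return .complete
}

print(aggregateQuality([.complete, .complete]))         // complete
print(aggregateQuality([.complete, .unauthorized]))     // partial
print(aggregateQuality([.unauthorized, .unauthorized])) // unauthorized
print(aggregateQuality([.failed, .loading]))            // loading
```

The second and third cases correspond to checklist items 4 and 5: one revoked permission yields a partial snapshot, while revoking all permissions yields an unauthorized one.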
-#### Step 10: SnapshotLifecycleService ✅
-- `previewDeletion()`: advisory integrity check, surfaces willBreakChain warning to UI
-- `delete()` handles all position cases (oldest, latest, intermediate):
-  - **Oldest**: set next as chain start
-  - **Latest**: just delete
-  - **Intermediate**: merge deltas, recompute checksums, update nextSnapshot.previousSnapshotID
-- **OperationLog**: always written atomically with destructive changes
-- **Post-save verification**: re-fetches log by ID, recovery re-insert if missing, logs critical error
-
-#### Step 12: Local-only storage refactor ✅
-- Removed CloudKitSyncService and CloudKit-pending chain states
-- **ModelContainer split**:
-  - uiCacheConfig: HealthSnapshot, TypeCount, YearlyCount, SnapshotDelta, TypeDelta, AnomalyRecord (derived local UI/index data)
-  - localConfig: OperationLog, DeviceProfile, MetricTimeoutProfile (local-only settings and operation metadata)
-- Added `HealthArchiveStore` protocol for the single local archive store source of truth
-- Added `SQLiteHealthArchiveStore`: actor-isolated SQLite archive with WAL, per-sample upsert, disappearance marking, verification timestamps, semantic fingerprints, metadata JSON, and scoped JSON report export
-- HealthKit anchored-query pages now archive samples/deletions before SwiftData snapshot/index rows are built
-- Schema migration recovery: removes legacy SwiftData stores and retries once on failure
-
-### UI (Step 13)
-
-✅ **SnapshotRow** shows:
-  - Chain indicators: "Chain start" / "DB reset / recovered device ID" / "Post-restore baseline" / "Observer-triggered snapshot"
-  - Anomaly warning badge (exclamationmark.triangle) if anomalyFlags non-empty
-  - Incomplete snapshot warning if quality != .complete
-
-✅ **SnapshotTypeCountRow** shows:
-  - "Unsupported" for isUnsupported = true (read directly, no delta needed)
-  - "Unavailable" for count = -1
-  - Numeric count with warning color if quality != .complete
-  - Delta badge vs. baseline (green/amber)
-
-✅ **DashboardView**, anomaly summary section:
-  - Counts unresolved anomalies by severity (critical/warning)
-  - Shows only if unresolved anomalies exist
-
-✅ **Full feature coverage**:
-  - Snapshot creation with observer triggers
-  - Chain visualization and deletion with integrity warnings
-  - Quality badges and anomaly indicators
-  - Timezone/registry change awareness
-  - Baseline comparison across multiple devices
-
-## Build Status
-
-```
-✅ BUILD SUCCEEDED
-   Target: HealthProbe (iOS 26.4)
-   No compilation errors or warnings
-   App signs successfully
-```
-
-## Verification Checklist (32 items from plan)
-
-These tests should be run to ensure all backend functionality is correct:
-
-### Basic Snapshot & Chain (1-3)
-- [ ] 1. Build succeeds with no errors
-- [ ] 2. First snapshot: isChainStart=true, previousSnapshotID=nil, no delta created
-- [ ] 3. Second snapshot: SnapshotDelta created with correct checksumBefore/After
-
-### Quality & Anomalies (4-7)
-- [ ] 4. Revoke permission → type quality=.unauthorized, snapshot=.partial, no anomalies
-- [ ] 5. All permissions revoked → snapshot=.unauthorized, no anomalies
-- [ ] 6. Timeout simulation (1ms) → count=-1, quality=.failed, "Unavailable" in UI
-- [ ] 7. Post-authorize after full deny → first delta suppressed, snapshot marked post-restore
-
-### Chain Operations (8-10)
-- [ ] 8. 3 snapshots A→B→C, delete B → single merged delta A→C, C.previousSnapshotID==A.id
-- [ ] 9. Hash stability → no changes between snapshots = identical hashes/checksums
-- [ ] 10. Integrity strict mode → corrupted checksum = validation stops, no auto-repair
-
-### Advanced Features (11-20)
-- [ ] 11. DB reset with Keychain survival → same deviceID, isChainStart=true, recoveredDeviceID=true
-- [ ] 12. Local-only launch → app functions without iCloud/CloudKit entitlements
-- [ ] 13. Observer debounce → 10 rapid callbacks = exactly 1 snapshot (triggerReason=observerCallback)
-- [ ] 14. Unsupported type → TypeCount(count=-1, quality=.failed, isUnsupported=true), "Unsupported" UI
-- [ ] 15. YearlyCount timezone → Calendar.current used, isApproximate=true if bucket > day
-- [ ] 16. Delta merge with unavailable counts → merged countDelta=0, impaired reason preserved
-- [ ] 17. Missing local delta/typeDeltas → integrity validation surfaces the fault, never hides it as sync latency
-- [ ] 18. First auth after full deny (quality gate) → no anomalies, current.isPostRestore=true, isPostRestoreInferred=true
-- [ ] 19. Chain fork → validateChain() returns .corrupted(reason: "chain fork detected"), stops
-- [ ] 20. disappeared→appeared merge with -1 source → merged countDelta=0, reason != .normal
-
-### Reason Priority & Suppression (21-26)
-- [ ] 21. TypeDelta reason priority → .authorizationChanged wins over .registryChanged when both apply
-- [ ] 22. Debounce + manual overlap → no observer snapshot if manual created during debounce
-- [ ] 23. completionHandler unconditional → called via defer, never gated on scheduling success
-- [ ] 24. isPostRestore forwarding → suppression forwarded past low-quality, consumed on next .complete
-- [ ] 25. Missing delta → validateChain() returns .missingDelta and stops
-- [ ] 26. OperationLog verification → recovery re-insert if missing after save, log critical error
-
-### Coherence & Edge Cases (27-32)
-- [ ] 27. Per-type query concurrency → max 6 simultaneous HK queries (not 3N at N=20)
-- [ ] 28. YearlyCount timezone drift → countDelta=0, yearlyCountNote set, no anomalies
-- [ ] 29. isUnsupported on TypeCount → UI shows "Unsupported" without delta context
-- [ ] 30. count/quality coherence assert → debug assert fires, release corrects to -1
-- [ ] 31. snapshotChecksum filter → uses quality==.complete, not hash!="" (determinism)
-- [ ] 32. AnomalyRecord.deltaID structural → every record has deltaID==delta.id (no external setter)
-
-## Architectural Highlights
-
-### Purity & Immutability
-- **AnomalyDetector** is pure: no SwiftData mutations, explicit TypeCount maps, DetectionResult metadata
-- **DeltaService** never carries old checksums during merge (recomputes from surrounding snapshots)
-- **OperationLog** atomicity: log + destructive changes in single context.save()
-
-### Quality Gates
-- **Snapshot quality** aggregation prevents false positives:
-  - All detection requires both snapshots .complete
-  - Covers first authorization after full deny (quality gate alone is complete suppression)
-  - isPostRestore suppression forwarded past low-quality successors
-
-### Chain Integrity
-- **previousSnapshotID** is the sole source of chain truth (not localSequenceNumber)
-- **Fork detection** prevents chain divergence (asserts no duplicate previousSnapshotID)
-- **Checksum validation** ensures data wasn't corrupted between snapshots
-
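Reviewer note: the chain rules above (walk backwards via `previousSnapshotID`, reject forks) can be sketched as below. This is a simplified illustration, not the deleted IntegrityService: it omits checksum and delta validation entirely, the `SnapshotLink` and `ChainResult` types are hypothetical, and the "missing snapshot" and "cycle detected" reasons are additions for the sketch. Only the "chain fork detected" reason string appears in the checklist.

```swift
import Foundation

// Hypothetical minimal view of a snapshot: just the chain pointers.
struct SnapshotLink {
    let id: UUID
    let previousSnapshotID: UUID?
}

enum ChainResult: Equatable {
    case valid(length: Int)
    case corrupted(reason: String)
}

func validateChain(latest: UUID, snapshots: [SnapshotLink]) -> ChainResult {
    // Fork detection: no two snapshots may name the same predecessor.
    var seenPredecessors = Set<UUID>()
    for snapshot in snapshots {
        if let previous = snapshot.previousSnapshotID,
           !seenPredecessors.insert(previous).inserted {
            return .corrupted(reason: "chain fork detected")
        }
    }
    let byID = Dictionary(snapshots.map { ($0.id, $0) }, uniquingKeysWith: { first, _ in first })
    // Walk backwards from the latest snapshot until the chain start (nil predecessor).
    var visited = Set<UUID>()
    var cursor: UUID? = latest
    while let id = cursor {
        guard let snapshot = byID[id] else { return .corrupted(reason: "missing snapshot") }
        guard visited.insert(id).inserted else { return .corrupted(reason: "cycle detected") }
        cursor = snapshot.previousSnapshotID
    }
    return .valid(length: visited.count)
}

let a = UUID(), b = UUID(), c = UUID()
let linear = [
    SnapshotLink(id: a, previousSnapshotID: nil),
    SnapshotLink(id: b, previousSnapshotID: a),
    SnapshotLink(id: c, previousSnapshotID: b)
]
let forked = [
    SnapshotLink(id: a, previousSnapshotID: nil),
    SnapshotLink(id: b, previousSnapshotID: a),
    SnapshotLink(id: c, previousSnapshotID: a)
]
print(validateChain(latest: c, snapshots: linear))
print(validateChain(latest: c, snapshots: forked))
```

Checking for forks before walking means a divergent chain is reported even when the walk from `latest` would never reach the second branch.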
-### Local Archive Direction
-- CloudKit/iCloud sync is not a product goal
-- SwiftData rows are derived UI/index data and must be rebuildable from the local archive store
-- Missing deltas or type deltas are treated as local integrity faults, not remote sync latency
-
-### Observability
-- **Reason priority** makes anomaly suppression deterministic
-  - authorizationChanged > unsupported > registryChanged > unknown > normal
-  - Prevents .registryChanged from masking .authorizationChanged
-- **YearlyCount timezone guard** prevents false loss attribution across DST
-- **TypeDelta.yearlyCountNote** signals unreliable year-level attribution
-
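Reviewer note: the reason priority above is a total order, so one way to make it deterministic is to encode it in the enum itself. The sketch below assumes the case names from the document; the raw values, the `Comparable` conformance, and the `resolveReason` function are illustrative choices, not the deleted DeltaService code.

```swift
// Raw values encode priority: a higher value wins.
// Order from the document: authorizationChanged > unsupported > registryChanged > unknown > normal.
enum TypeDeltaReason: Int, Comparable {
    case normal = 0
    case unknown
    case registryChanged
    case unsupported
    case authorizationChanged

    static func < (lhs: TypeDeltaReason, rhs: TypeDeltaReason) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

// Pick the highest-priority reason among all that apply; default to .normal.
func resolveReason(_ candidates: [TypeDeltaReason]) -> TypeDeltaReason {
    candidates.max() ?? .normal
}

print(resolveReason([.registryChanged, .authorizationChanged])) // authorizationChanged
print(resolveReason([.unknown, .unsupported]))                  // unsupported
print(resolveReason([]))                                        // normal
```

Keeping the order in one place means a merged delta can reuse the same function on the reasons of both source deltas, which is consistent with the merge rule "applies ... reason priority to merged result".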
-## Known Limitations (MVP)
-
-1. **Hash** covers only count + date range, not distribution (silentReplacement is best-effort)
-2. **YearlyCount** precision requires daily bucket granularity (noted if isApproximate)
-3. **Archive query/report UI is still pending** (store exists, UI still mostly reads SwiftData cache)
-4. **No automatic cross-device reconstruction**; cross-device analysis is future macOS/report work
-
-## Next Steps
-
-### Immediate (Testing)
-1. Run all 32 verification checks against real HealthKit data
-2. Create unit tests for delta merge, reason priority, anomaly detection
-3. Test observer callback debounce with real HKObserverQuery
-4. Add archive status/report UI backed by `HealthArchiveStore`
-
-### Post-MVP
-1. Integrate actual BGTask expiration guard for observer snapshots (capture partial results)
-2. Add delta comparison view showing TypeDelta reason and suppression explanations
-3. Implement OperationLog viewer in UI (audit trail dashboard)
-4. Add historical trend analysis (divergence detection, anomaly patterns)
-
-
-**Built with:** SwiftUI, SwiftData, HealthKit, CryptoKit
-**Minimum iOS:** 17.0
-**Target iOS:** 26.4
-**Swift Version:** 5.9+