Bogdan Timofte authored 2 weeks ago

# HealthProbe - Export Specification

**Last Updated:** 2026-05-23
**Status:** Target design for recovery-compatible exports

## 1. Purpose

Exports let the user preserve selected point-in-time views, diffs, record tables, and evidence summaries from the local SQLite archive.

HealthProbe does not restore data, patch backups, or re-publish data to HealthKit. Exports should still preserve enough structure for external recovery or salvage tools to reason about what was observed.

## 2. Export Kinds

Supported target kinds:
- `observation_records_json`;
- `observation_records_csv`;
- `observation_diff_json`;
- `type_summary_json`;
- `archive_manifest_json`.

Large exports must stream/page from SQLite. Do not materialize all records into Swift arrays.

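
The paging requirement can be illustrated with a keyset cursor. This is a minimal sketch in Python (chosen for brevity; the app itself is not written in it), and the `records` table, its columns, and the page size are hypothetical names, not part of the HealthProbe schema. It assumes a positive integer primary key; a real export would page on the item order keys instead.

```python
import sqlite3
from typing import Iterator, Tuple

def iter_records(conn: sqlite3.Connection, page_size: int = 500) -> Iterator[Tuple]:
    """Yield rows page by page; at most `page_size` rows are held in memory."""
    last_id = 0  # assumption: ids are positive integers
    while True:
        # Keyset paging: resume strictly after the last id already emitted.
        rows = conn.execute(
            "SELECT id, sample_type_identifier FROM records "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        for row in rows:
            yield row
        last_id = rows[-1][0]
```

Keyset paging (`WHERE id > ?`) is used here instead of `OFFSET` so that each page costs the same regardless of how far the export has progressed.
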
## 3. JSON Envelope

Every JSON export uses a versioned envelope:

```json
{
  "export_format_version": 1,
  "export_id": "UUID",
  "export_kind": "observation_records_json",
  "created_at": "2026-05-23T12:00:00.000Z",
  "app": {
    "name": "HealthProbe",
    "version": "local-build"
  },
  "archive": {
    "schema_version": 2,
    "device_chain_hash": "hex",
    "from_observation_id": 1,
    "to_observation_id": null
  },
  "filters": {
    "sample_type_identifiers": [],
    "date_range": null,
    "include_relationships": true
  },
  "manifest": {
    "record_count": 0,
    "item_hash_algorithm": "sha256",
    "manifest_hash_algorithm": "sha256",
    "manifest_hash": "hex"
  },
  "items": []
}
```

JSON keys are emitted in deterministic sorted order for canonical hashing.

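
One way to obtain a deterministic key order is sketched below. The specification only requires sorted keys; the compact separators, UTF-8 encoding, and absence of a trailing newline are assumptions made for this illustration, and `canonical_json` is a hypothetical helper name.

```python
import json

def canonical_json(value) -> bytes:
    """Serialize with recursively sorted keys so equal objects give equal bytes."""
    text = json.dumps(
        value,
        sort_keys=True,          # deterministic key order at every nesting level
        separators=(",", ":"),   # assumption: no insignificant whitespace
        ensure_ascii=False,      # assumption: emit UTF-8 rather than \u escapes
    )
    return text.encode("utf-8")
```

Two dictionaries built in different insertion orders serialize to the same bytes, which is the property the hashes depend on.
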
## 4. Export Item Contract

Record items should include:
- sample identity hash;
- HealthKit UUID hash when available;
- strict fingerprint;
- semantic fingerprint when available;
- payload version hash;
- sample type identifier;
- start/end timestamps as ISO 8601 UTC;
- value kind, value, unit, category, workout fields;
- source/provenance hashes or redacted fields allowed by the export scope;
- metadata hash and optional metadata object when allowed;
- relationships when both endpoints are in scope, or unresolved endpoint hashes when explicitly allowed;
- observation visibility fields: first seen, last verified, disappeared evidence when available.

Every item has:
- `item_hash = SHA-256(canonical item JSON)`.

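
The `item_hash` rule can be sketched as follows. Two points are assumptions, since the text above does not settle them: the hash is computed over the item with its own `item_hash` field excluded, and the digest is stored as lowercase hex. The helper names `seal_item` and `verify_item` are hypothetical.

```python
import hashlib
import json

def item_hash(item: dict) -> str:
    """SHA-256 over the canonical JSON of the item, ignoring any stored item_hash."""
    body = {k: v for k, v in item.items() if k != "item_hash"}  # assumption
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def seal_item(item: dict) -> dict:
    """Return a copy of the item with its item_hash attached."""
    return {**item, "item_hash": item_hash(item)}

def verify_item(item: dict) -> bool:
    """Recompute the hash and compare it with the stored value."""
    return item.get("item_hash") == item_hash(item)
```

An external salvage tool could run `verify_item` on each exported item to detect edits made after export.
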
## 5. Manifest Hash

`manifest_hash` is calculated incrementally:

```text
SHA-256(
  canonical_export_metadata_json
  + ordered_item_hashes
  + canonical_counts_json
  + canonical_filter_json
)
```

The manifest hash must cover exported content through item hashes. Counts, first dates, or last dates alone are not sufficient.

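
A hedged sketch of the incremental calculation follows. The byte framing is an assumption: the parts are fed to the hasher in the listed order with no delimiters, item hashes are fed as ASCII hex strings, and the counts object contains only `record_count`. `ManifestHasher` and `canonical_json` are hypothetical names. Because counts come after the item hashes, the total does not need to be known before streaming starts.

```python
import hashlib
import json
from typing import Tuple

def canonical_json(value) -> bytes:
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")

class ManifestHasher:
    """Accumulates the manifest hash while items stream past."""

    def __init__(self, metadata: dict, filters: dict) -> None:
        self._sha = hashlib.sha256()
        self._sha.update(canonical_json(metadata))  # part 1: export metadata
        self._filters = filters
        self._count = 0

    def add_item_hash(self, item_hash_hex: str) -> None:
        self._sha.update(item_hash_hex.encode("ascii"))  # part 2: ordered item hashes
        self._count += 1

    def finalize(self) -> Tuple[str, int]:
        self._sha.update(canonical_json({"record_count": self._count}))  # part 3: counts
        self._sha.update(canonical_json(self._filters))                  # part 4: filters
        return self._sha.hexdigest(), self._count
```
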
Items are ordered by the following keys, in order of precedence:
1. sample type identifier;
2. start date;
3. end date;
4. sample identity hash;
5. payload version hash.

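
The ordering above can be expressed as a composite sort key. In this sketch the key names are borrowed from the CSV column names in this document, with `payload_hash` standing in for the payload version hash (an assumption). It also assumes timestamps share one fixed-width ISO 8601 UTC format, so that string order matches chronological order.

```python
from operator import itemgetter

# Sort keys in order of precedence.
ITEM_ORDER = (
    "sample_type_identifier",
    "start_date_utc",
    "end_date_utc",
    "sample_identity_hash",
    "payload_hash",
)

# For streamed exports the ordering belongs in the SQL query itself.
ORDER_BY_SQL = "ORDER BY " + ", ".join(ITEM_ORDER)

# In-memory equivalent, useful for checking small exports in tests.
item_sort_key = itemgetter(*ITEM_ORDER)
```

The in-memory key is meant for verification only; sorting a large export in memory would conflict with the streaming requirement.
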
## 6. CSV Contract

CSV exports are flat record tables for spreadsheet and external tooling.

Required column order:
1. `export_id`
2. `observation_id`
3. `sample_type_identifier`
4. `sample_identity_hash`
5. `sample_uuid_hash`
6. `strict_fingerprint`
7. `semantic_fingerprint`
8. `payload_hash`
9. `start_date_utc`
10. `end_date_utc`
11. `value_kind`
12. `numeric_value`
13. `unit`
14. `category_value`
15. `workout_activity_type`
16. `duration_seconds`
17. `source_hash`
18. `device_hash`
19. `metadata_hash`
20. `first_seen_observation_id`
21. `last_verified_observation_id`
22. `disappeared_observation_id`
23. `item_hash`

CSV uses RFC 4180 quoting rules and UTF-8.

Relationships are not flattened into the main CSV row. If needed, export a companion relationships CSV with source/target sample hashes.

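
A minimal sketch of a writer that honors the column order and quoting rules, using Python's standard `csv` module. Emitting a header row and writing missing or null values as empty fields are assumptions, as the contract above does not state either; `write_csv` is a hypothetical name.

```python
import csv
from typing import Iterable, TextIO

CSV_COLUMNS = [
    "export_id", "observation_id", "sample_type_identifier",
    "sample_identity_hash", "sample_uuid_hash", "strict_fingerprint",
    "semantic_fingerprint", "payload_hash", "start_date_utc", "end_date_utc",
    "value_kind", "numeric_value", "unit", "category_value",
    "workout_activity_type", "duration_seconds", "source_hash", "device_hash",
    "metadata_hash", "first_seen_observation_id",
    "last_verified_observation_id", "disappeared_observation_id", "item_hash",
]

def write_csv(rows: Iterable[dict], out: TextIO) -> int:
    """Write rows one at a time in the required column order; return the row count."""
    writer = csv.DictWriter(
        out,
        fieldnames=CSV_COLUMNS,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields containing , " or line breaks
        lineterminator="\r\n",      # RFC 4180 line ending
        restval="",                 # assumption: missing columns become empty fields
    )
    writer.writeheader()            # assumption: a header row is included
    count = 0
    for row in rows:
        writer.writerow(row)        # unknown keys raise ValueError (the default)
        count += 1
    return count
```

When writing to disk, opening the file with `open(path, "w", newline="", encoding="utf-8")` keeps the line endings untouched and satisfies the UTF-8 requirement.
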
## 7. Streaming And Cancellation

Implementation contract:
- page records from SQLite with deterministic cursors;
- write output incrementally;
- update item/manifest hash state as rows stream;
- if the user cancels, mark export status as `cancelled` and do not record a completed manifest;
- failed exports should leave no completed manifest row unless the output is verifiable.

Resume support is optional for v1 exports.

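
The cancellation and failure rules above can be sketched as a small export loop. Only the `cancelled` status comes from this document; the `completed` and `failed` labels, the result dictionary, and the simplified hashing (each output line hashed, then folded into a running digest) are hypothetical and stand in for the real item and manifest hashing.

```python
import hashlib
import threading
from typing import Iterable, TextIO

def run_export(item_lines: Iterable[str], out: TextIO, cancel: threading.Event) -> dict:
    """Stream lines to `out`; a manifest hash is reported only on full completion."""
    manifest = hashlib.sha256()
    count = 0
    try:
        for line in item_lines:
            if cancel.is_set():
                # Cancelled: report status, never a completed manifest.
                return {"status": "cancelled", "manifest_hash": None, "record_count": count}
            out.write(line + "\n")  # write output incrementally
            line_hash = hashlib.sha256(line.encode("utf-8")).hexdigest()
            manifest.update(line_hash.encode("ascii"))  # update hash state per row
            count += 1
    except Exception as exc:
        # Failed: partial output may exist, but no manifest is recorded.
        return {"status": "failed", "manifest_hash": None, "record_count": count, "error": str(exc)}
    return {"status": "completed", "manifest_hash": manifest.hexdigest(), "record_count": count}
```

Checking the cancel flag once per row keeps cancellation latency bounded by the cost of a single row rather than a whole page.
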
## 8. Provenance Warning

Every user-facing export flow must communicate that:
- exported data is observed evidence from HealthKit-accessible surfaces;
- external re-publication to HealthKit may lose original metadata/provenance;
- HealthProbe itself does not restore or modify HealthKit/iOS backups.