# autoSMART Database Documentation

## Overview

autoSMART uses PostgreSQL as its primary database for storing SMART data, HDD tracking information, predictions, and system configuration. The database is designed for multi-node cluster deployments with comprehensive HDD mobility tracking.

## Database Schema

### Core Tables

#### `hdd_inventory`
The central inventory table that tracks all HDDs across the cluster.

```sql
CREATE TABLE hdd_inventory (
    id                  SERIAL PRIMARY KEY,
    serial_number       VARCHAR(100) NOT NULL,
    model_name          VARCHAR(200) NOT NULL,
    firmware            VARCHAR(50),
    size_gb             INTEGER,
    manufacturer        VARCHAR(100),
    current_device_path VARCHAR(50),
    current_node_id     VARCHAR(50),
    current_slot        VARCHAR(20),
    madagascar_id       VARCHAR(100),
    first_seen          TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    last_seen           TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    status              VARCHAR(20) DEFAULT 'active',
    status_changed_at   TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    notes               TEXT,
    created_at          TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at          TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    
    CONSTRAINT unique_hardware_id UNIQUE (serial_number, model_name)
);
```

**Key Features:**
- **Hardware-based identification**: Uses `serial_number` + `model_name` as unique constraint
- **Current location tracking**: `current_device_path`, `current_node_id` show where HDD is now
- **Lifecycle management**: `first_seen`, `last_seen`, `status` track HDD lifecycle
- **Madagascar integration**: `madagascar_id` field for cluster-specific identification

#### `hdd_presence`
Tracks HDD mobility across cluster nodes - records when HDDs are present on different nodes.

```sql
CREATE TABLE hdd_presence (
    id SERIAL PRIMARY KEY,
    serial_number VARCHAR(64) NOT NULL,
    node VARCHAR(64) NOT NULL,
    data_start TIMESTAMP NOT NULL,
    data_end TIMESTAMP NOT NULL,
    is_current BOOLEAN NOT NULL DEFAULT TRUE
);
```

**Key Features:**
- **Mobility tracking**: Records when HDDs move between nodes
- **Time-based records**: `data_start`/`data_end` define presence periods
- **Current vs Historic**: `is_current` flag marks active presence
- **Independent of inventory**: Works independently of `hdd_inventory` for pure mobility data

**Example Data:**
```sql
 id | serial_number  |   node    |         data_start         |          data_end          | is_current 
----+----------------+-----------+----------------------------+----------------------------+------------
  4 | ZW60K01R       | ebony     | 2025-08-16 22:05:15.863971 | 2025-08-16 22:05:15.863971 | t
  3 | S2HSNXRH402205 | ebony     | 2025-08-16 22:05:15.109956 | 2025-08-16 22:05:15.109956 | t
  2 | ZW60K01R       | baobab    | 2025-08-16 21:47:13.873642 | 2025-08-16 22:03:31.052316 | f
  1 | S2HSNXRH402205 | tapia     | 2025-08-16 21:47:13.078524 | 2025-08-16 22:03:30.268985 | f
```

#### `smart_readings`
Stores SMART data readings with differential storage optimization.

```sql
CREATE TABLE smart_readings (
    id                   BIGSERIAL PRIMARY KEY,
    hdd_id               INTEGER REFERENCES hdd_inventory(id),
    serial_number        VARCHAR(100) NOT NULL,
    device_path          VARCHAR(50),
    node_id              VARCHAR(50),
    timestamp            TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    collection_ok        BOOLEAN DEFAULT true,
    temperature          INTEGER,
    parameters_json      JSONB,
    reading_type         VARCHAR(20) DEFAULT 'full',
    changes_detected     BOOLEAN DEFAULT true,
    changed_parameters   JSONB,
    previous_reading_id  INTEGER REFERENCES smart_readings(id),
    checksum             VARCHAR(64)
);
```

**Reading Types:**
- `baseline`: First reading for an HDD
- `full`: Complete parameter set (forced by time interval)
- `differential`: Only changed parameters (optimization)
- `skipped`: No changes detected

**Key Features:**
- **Differential storage**: Only stores changes to reduce data volume
- **Full context**: Links to `hdd_inventory` and includes node information
- **Change tracking**: `previous_reading_id` creates reading chains
- **JSONB parameters**: Flexible storage for SMART attributes

#### `predictions`
AI-generated failure predictions and analysis.

```sql
CREATE TABLE predictions (
    id                    SERIAL PRIMARY KEY,
    hdd_id                INTEGER REFERENCES hdd_inventory(id),
    serial_number         VARCHAR(100) NOT NULL,
    device_path           VARCHAR(50),
    timestamp             TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    risk_level            VARCHAR(20),
    failure_probability   DECIMAL(5,4),
    predicted_failure_date DATE,
    confidence_score      DECIMAL(5,4),
    analysis_summary      TEXT,
    recommendations       JSONB,
    openai_response       JSONB,
    created_at            TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

#### `alert_history`
Tracks all alerts sent about HDD issues.

```sql
CREATE TABLE alert_history (
    id              SERIAL PRIMARY KEY,
    hdd_id          INTEGER REFERENCES hdd_inventory(id),
    serial_number   VARCHAR(100) NOT NULL,
    alert_type      VARCHAR(50),
    severity        VARCHAR(20),
    message         TEXT,
    sent_at         TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    sent_to         TEXT,
    delivery_status VARCHAR(20) DEFAULT 'pending',
    related_reading_id BIGINT REFERENCES smart_readings(id),
    related_prediction_id INTEGER REFERENCES predictions(id)
);
```

### Configuration Tables

#### `system_config`
Global system configuration parameters.

```sql
CREATE TABLE system_config (
    id          SERIAL PRIMARY KEY,
    config_key  VARCHAR(100) UNIQUE NOT NULL,
    value       TEXT,
    description TEXT,
    created_at  TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at  TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

**Default Configuration:**
- `collection_interval_seconds`: SMART data collection frequency
- `differential_storage_enabled`: Enable/disable storage optimization
- `forced_storage_interval_hours`: Force full readings periodically
- `critical_parameter_force_store`: Always store critical changes
- `temperature_change_threshold`: Temperature delta for storage

#### `smart_thresholds`
SMART parameter warning and critical thresholds.

```sql
CREATE TABLE smart_thresholds (
    id                SERIAL PRIMARY KEY,
    parameter_name    VARCHAR(100) NOT NULL,
    warning_threshold NUMERIC,
    critical_threshold NUMERIC,
    weight            NUMERIC DEFAULT 1.0,
    enabled           BOOLEAN DEFAULT true,
    description       TEXT,
    created_at        TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at        TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

## Views

### `smart_readings_reconstructed`
Reconstructs complete SMART data from differential storage.

```sql
CREATE VIEW smart_readings_reconstructed AS
WITH RECURSIVE reading_chain AS (
    -- Base case: get baseline readings
    SELECT id, hdd_id, serial_number, timestamp, 
           parameters_json, temperature, reading_type,
           previous_reading_id, 1 as chain_level
    FROM smart_readings 
    WHERE reading_type IN ('baseline', 'full')
    
    UNION ALL
    
    -- Recursive case: follow the chain of differential readings
    SELECT sr.id, sr.hdd_id, sr.serial_number, sr.timestamp,
           COALESCE(rc.parameters_json, '{}'::jsonb) || sr.parameters_json as parameters_json,
           COALESCE(sr.temperature, rc.temperature) as temperature,
           sr.reading_type, sr.previous_reading_id,
           rc.chain_level + 1
    FROM smart_readings sr
    JOIN reading_chain rc ON sr.previous_reading_id = rc.id
    WHERE sr.reading_type = 'differential'
)
SELECT id, hdd_id, serial_number, timestamp,
       parameters_json, temperature, reading_type, chain_level
FROM reading_chain;
```

### `latest_smart_readings`
Current SMART status for all active drives.

```sql
CREATE VIEW latest_smart_readings AS
SELECT DISTINCT ON (sr.hdd_id)
    sr.id, sr.hdd_id, sr.serial_number, sr.timestamp,
    sr.parameters_json, sr.temperature,
    hi.model_name, hi.manufacturer, hi.size_gb,
    hi.current_device_path, hi.current_node_id
FROM smart_readings_reconstructed sr
JOIN hdd_inventory hi ON sr.hdd_id = hi.id
ORDER BY sr.hdd_id, sr.timestamp DESC;
```

### `drive_health_summary`
Comprehensive health overview for all drives.

```sql
CREATE VIEW drive_health_summary AS
SELECT 
    hi.id as hdd_id, hi.serial_number, hi.model_name,
    hi.manufacturer, hi.current_device_path, hi.current_node_id,
    hi.status, lsr.timestamp as last_reading, lsr.temperature,
    p.risk_level, p.failure_probability, p.predicted_failure_date,
    EXTRACT(EPOCH FROM (NOW() - lsr.timestamp))/3600 as hours_since_last_reading
FROM hdd_inventory hi
LEFT JOIN latest_smart_readings lsr ON hi.id = lsr.hdd_id
LEFT JOIN LATERAL (
    SELECT risk_level, failure_probability, predicted_failure_date
    FROM predictions 
    WHERE hdd_id = hi.id 
    ORDER BY timestamp DESC 
    LIMIT 1
) p ON true
WHERE hi.status = 'active';
```

## Functions

### `update_hdd_presence()`
Manages HDD presence tracking when a drive is detected on a node.

```sql
CREATE OR REPLACE FUNCTION update_hdd_presence(
    p_serial_number VARCHAR(64),
    p_node VARCHAR(64)
) RETURNS VOID AS $$
BEGIN
    -- Mark all previous presence records for this serial as historic
    UPDATE hdd_presence 
    SET is_current = FALSE 
    WHERE serial_number = p_serial_number AND is_current = TRUE AND node <> p_node;
    
    -- Check if there's already a current presence for this serial/node
    IF EXISTS (SELECT 1 FROM hdd_presence WHERE serial_number = p_serial_number AND node = p_node AND is_current = TRUE) THEN
        -- Update data_end for existing current presence
        UPDATE hdd_presence 
        SET data_end = NOW() 
        WHERE serial_number = p_serial_number AND node = p_node AND is_current = TRUE;
    ELSE
        -- Create new presence record
        INSERT INTO hdd_presence (serial_number, node, data_start, data_end, is_current)
        VALUES (p_serial_number, p_node, NOW(), NOW(), TRUE);
    END IF;
END;
$$ LANGUAGE plpgsql;
```

### `should_store_smart_reading()`
Determines if a SMART reading should be stored based on differential storage logic.

```sql
CREATE OR REPLACE FUNCTION should_store_smart_reading(
    p_hdd_id INTEGER,
    p_parameters_json JSONB,
    p_checksum VARCHAR(64),
    p_timestamp TIMESTAMP WITH TIME ZONE DEFAULT NOW()
) RETURNS TABLE(
    should_store BOOLEAN,
    reading_type VARCHAR(20),
    changes_detected BOOLEAN,
    changed_parameters JSONB,
    previous_reading_id INTEGER
) AS $$
-- Function implementation handles:
-- - Differential storage enabled/disabled
-- - Checksum-based change detection
-- - Force intervals for full readings
-- - Reading type determination
$$;
```

## Indexes

### Performance Indexes
```sql
-- hdd_inventory indexes
CREATE INDEX idx_hdd_inventory_device_path ON hdd_inventory(current_device_path);
CREATE INDEX idx_hdd_inventory_node ON hdd_inventory(current_node_id);
CREATE INDEX idx_hdd_inventory_status ON hdd_inventory(status);
CREATE INDEX idx_hdd_inventory_last_seen ON hdd_inventory(last_seen);

-- hdd_presence indexes
CREATE INDEX idx_hdd_presence_serial_current ON hdd_presence(serial_number, is_current);
CREATE INDEX idx_hdd_presence_node ON hdd_presence(node);
CREATE INDEX idx_hdd_presence_data_end ON hdd_presence(data_end DESC);

-- smart_readings indexes
CREATE INDEX idx_smart_readings_hdd_id ON smart_readings(hdd_id);
CREATE INDEX idx_smart_readings_timestamp ON smart_readings(timestamp DESC);
CREATE INDEX idx_smart_readings_serial ON smart_readings(serial_number);
CREATE INDEX idx_smart_readings_device_path ON smart_readings(device_path);
CREATE INDEX idx_smart_readings_type ON smart_readings(reading_type);
CREATE INDEX idx_smart_readings_checksum ON smart_readings(checksum);
CREATE INDEX idx_smart_readings_previous ON smart_readings(previous_reading_id);

-- JSONB indexes for flexible queries
CREATE INDEX idx_smart_readings_parameters ON smart_readings USING GIN (parameters_json);
CREATE INDEX idx_smart_readings_changed_params ON smart_readings USING GIN (changed_parameters);
```

## Data Flow

### Collection Process
1. **Device Discovery**: Collector scans `/dev/sd*` and `/dev/nvme*` devices
2. **SMART Reading**: Uses `smartctl` to extract device information and parameters
3. **HDD Registration**: `get_or_create_hdd()` adds new devices to `hdd_inventory`
4. **Presence Tracking**: `update_hdd_presence()` records current node location
5. **Data Storage**: Stores SMART readings with differential optimization
6. **Change Detection**: Uses checksums to detect parameter changes

### Mobility Tracking
1. **HDD Detected**: When HDD is found on a new node
2. **Historic Records**: Previous presence records marked `is_current = FALSE`
3. **New Presence**: New record created with `is_current = TRUE`
4. **Timeline**: Complete history maintained with `data_start`/`data_end` timestamps

### Query Examples

#### Find HDD History
```sql
SELECT serial_number, node, data_start, data_end, is_current
FROM hdd_presence 
WHERE serial_number = 'ZW60K01R' 
ORDER BY data_start DESC;
```

#### Current HDD Locations
```sql
SELECT h.serial_number, h.model_name, p.node, h.current_device_path
FROM hdd_inventory h
JOIN hdd_presence p ON h.serial_number = p.serial_number
WHERE p.is_current = TRUE;
```

#### SMART Parameter Trends
```sql
SELECT timestamp, 
       parameters_json->>'Power_On_Hours' as power_hours,
       parameters_json->>'Temperature_Celsius' as temp,
       temperature
FROM smart_readings_reconstructed 
WHERE serial_number = 'ZW60K01R' 
ORDER BY timestamp DESC 
LIMIT 10;
```

#### Health Summary
```sql
SELECT * FROM drive_health_summary 
WHERE current_node_id = 'ebony';
```

## Troubleshooting

### Common Issues

#### 1. Node ID Mismatch
**Problem**: HDD presence shows wrong node name
**Cause**: Deploy script used local hostname instead of target node name
**Solution**: Deploy script now correctly determines target node name from `cluster.json`

#### 2. Empty hdd_presence Table
**Problem**: No mobility tracking data
**Causes**:
- SMART parameter parsing regex incompatible with new smartctl format
- Missing database sequence permissions
- Incomplete smart_readings INSERT statements

**Solutions**:
- Updated regex to support both old and new smartctl formats
- Added sequence permissions: `GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO autosmart;`
- Fixed INSERT to include all required fields

#### 3. Differential Storage Issues
**Problem**: Too much or too little data stored
**Configuration**: Adjust in `system_config` table:
```sql
UPDATE system_config SET value = 'false' WHERE config_key = 'differential_storage_enabled';
UPDATE system_config SET value = '12' WHERE config_key = 'forced_storage_interval_hours';
```

## Permissions

### Database User Setup
```sql
-- Create autosmart user
CREATE USER autosmart WITH PASSWORD 'autoSMART2025!';

-- Grant permissions
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO autosmart;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO autosmart;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO autosmart;
```

### Sequence Permissions
```sql
-- Required for INSERT operations with SERIAL columns
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO autosmart;
```

## Maintenance

### Regular Tasks
1. **Monitor disk usage**: SMART readings table grows over time
2. **Archive old data**: Consider archiving readings older than 1 year
3. **Index maintenance**: REINDEX periodically for performance
4. **Backup**: Regular PostgreSQL backups recommended

### Performance Monitoring
```sql
-- Table sizes
SELECT schemaname, tablename, 
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables 
WHERE schemaname = 'public' 
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

-- Recent activity
SELECT COUNT(*) as readings_today 
FROM smart_readings 
WHERE timestamp > CURRENT_DATE;

SELECT COUNT(*) as active_drives 
FROM hdd_inventory 
WHERE status = 'active';
```
