The autoSMART v1.0 system now implements differential storage optimization to significantly reduce database storage requirements while maintaining full data integrity and analysis capabilities.
Instead of storing complete SMART readings for every collection cycle, the system intelligently stores only:
The system uses multiple methods to detect changes:
ALTER TABLE smart_readings ADD COLUMN reading_type VARCHAR(20) DEFAULT 'full';
ALTER TABLE smart_readings ADD COLUMN changes_detected BOOLEAN DEFAULT true;
ALTER TABLE smart_readings ADD COLUMN changed_parameters JSONB;
ALTER TABLE smart_readings ADD COLUMN previous_reading_id INTEGER REFERENCES smart_readings(id);
ALTER TABLE smart_readings ADD COLUMN checksum VARCHAR(64);
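The `checksum` and `changed_parameters` columns suggest how change detection might work. The following is a minimal sketch, not the project's implementation: it assumes the checksum is a SHA-256 hex digest (which happens to fit `VARCHAR(64)`) over a canonical JSON encoding of the parameters. The function names `reading_checksum` and `changed_parameters` are hypothetical.

```python
import hashlib
import json

_MISSING = object()  # sentinel so an absent parameter differs from a None value


def reading_checksum(parameters: dict) -> str:
    """Return a SHA-256 hex digest of the parameters (assumed checksum scheme)."""
    # Sorted keys and fixed separators make the encoding independent of key order.
    canonical = json.dumps(parameters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_parameters(previous: dict, current: dict) -> list:
    """Return sorted names of parameters that were added, removed, or modified."""
    names = set(previous) | set(current)
    return sorted(
        name
        for name in names
        if previous.get(name, _MISSING) != current.get(name, _MISSING)
    )


if __name__ == "__main__":
    old = {"Temperature_Celsius": 35, "Power_On_Hours": 100}
    new = {"Temperature_Celsius": 35, "Power_On_Hours": 101}
    print(reading_checksum(old) == reading_checksum(new))
    print(changed_parameters(old, new))
```

Comparing checksums first is cheap. The per-parameter comparison is only needed when the checksums differ.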
The should_store_smart_reading() function decides whether a reading should be stored and, if so, in what form:
SELECT should_store_smart_reading(hdd_id, parameters_json, checksum, current_timestamp);
Returns:
- should_store - Boolean indicating if reading should be stored
- reading_type - 'baseline', 'full', or 'differential'
- changes_detected - Boolean indicating if changes were found
- changed_parameters - JSON array of changed parameter names
- previous_reading_id - Reference to previous reading for chaining
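The decision logic behind these return values can be sketched in Python. This is an illustration under stated assumptions, not the body of `should_store_smart_reading()`. The rule order, the `Temperature_Celsius` attribute name, the use of `None` as `reading_type` for skipped readings, and the shape of the `previous` record are all hypothetical. The defaults mirror the configuration values and critical parameter list given elsewhere in this document.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

# Default critical parameters, as listed in this document.
CRITICAL = {
    "Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable",
    "Reallocated_Event_Count", "Spin_Retry_Count",
}
TEMPERATURE = "Temperature_Celsius"  # assumed attribute name


@dataclass
class Decision:
    should_store: bool
    reading_type: Optional[str]  # None when the reading is skipped (assumption)
    changes_detected: bool
    changed_parameters: list = field(default_factory=list)
    previous_reading_id: Optional[int] = None


def decide_storage(previous: Optional[dict], parameters: dict, checksum: str,
                   now: datetime, forced_interval_hours: int = 24,
                   temperature_threshold: float = 5) -> Decision:
    """previous: None, or a dict with id, parameters, checksum, last_full_at."""
    if previous is None:
        # First reading for this drive: store everything as the baseline.
        return Decision(True, "baseline", True)
    old = previous["parameters"]
    changed = sorted(k for k in set(old) | set(parameters)
                     if old.get(k) != parameters.get(k))
    prev_id = previous["id"]
    # Forced full reading once the configured interval has elapsed.
    if now - previous["last_full_at"] >= timedelta(hours=forced_interval_hours):
        return Decision(True, "full", bool(changed), changed, prev_id)
    # Identical checksum (or no differing parameter): nothing to store.
    if checksum == previous["checksum"] or not changed:
        return Decision(False, None, False, [], prev_id)
    # Any change to a critical parameter forces a full reading.
    if CRITICAL.intersection(changed):
        return Decision(True, "full", True, changed, prev_id)
    # A small temperature drift on its own is not worth storing.
    if changed == [TEMPERATURE] and TEMPERATURE in old and TEMPERATURE in parameters:
        if abs(parameters[TEMPERATURE] - old[TEMPERATURE]) < temperature_threshold:
            return Decision(False, None, True, changed, prev_id)
    return Decision(True, "differential", True, changed, prev_id)
```

In this sketch `previous["parameters"]` is the complete parameter set of the last stored reading, so a slow temperature drift is always measured against the last stored value, not the last collected one.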
The smart_readings_reconstructed view uses recursive SQL to rebuild complete SMART data from differential readings:
SELECT * FROM smart_readings_reconstructed WHERE hdd_id = 123;
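What the recursive view does can be illustrated outside the database. The sketch below assumes that a differential row stores only the parameters that changed, and that following `previous_reading_id` eventually reaches a 'baseline' or 'full' row. The `reconstruct` function and the in-memory row layout are hypothetical and do not reflect the view's actual definition.

```python
def reconstruct(readings: dict, reading_id: int) -> dict:
    """Rebuild the complete parameter set for one reading.

    readings maps id -> {"reading_type", "parameters", "previous_reading_id"}.
    """
    chain = []
    seen = set()
    current = reading_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in previous_reading_id chain")
        seen.add(current)
        row = readings[current]
        chain.append(row)
        if row["reading_type"] in ("baseline", "full"):
            break  # a complete reading anchors the chain
        current = row["previous_reading_id"]
    else:
        # The chain ran out without reaching a complete reading.
        raise ValueError("no baseline or full reading found")

    merged = {}
    # Apply the oldest row first so newer values overwrite older ones.
    for row in reversed(chain):
        merged.update(row["parameters"])
    return merged


if __name__ == "__main__":
    rows = {
        1: {"reading_type": "baseline", "previous_reading_id": None,
            "parameters": {"Temperature_Celsius": 35, "Power_On_Hours": 100}},
        2: {"reading_type": "differential", "previous_reading_id": 1,
            "parameters": {"Power_On_Hours": 101}},
    }
    print(reconstruct(rows, 2))
```

Periodic forced full readings keep these chains short, which bounds the cost of reconstruction. The sketch does not model parameters that disappear between readings.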
Add the following settings to the system_config table:
INSERT INTO system_config (key, value, description) VALUES
('differential_storage_enabled', 'true', 'Enable differential storage optimization'),
('forced_storage_interval_hours', '24', 'Hours between forced full readings'),
('critical_parameter_force_store', 'true', 'Force storage for critical parameter changes'),
('temperature_change_threshold', '5', 'Temperature change threshold for storage (Celsius)');
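The values above are stored as strings, so a collector has to convert them before use. A minimal sketch of that conversion follows. The `parse_settings` helper is hypothetical, and falling back to the values shown above when a key is missing is an assumption.

```python
# Fallback values copied from the INSERT statement in this document.
DEFAULTS = {
    "differential_storage_enabled": "true",
    "forced_storage_interval_hours": "24",
    "critical_parameter_force_store": "true",
    "temperature_change_threshold": "5",
}


def _as_bool(text: str) -> bool:
    # Only the literal 'true' (any case) counts as enabled in this sketch.
    return text.strip().lower() == "true"


def parse_settings(rows) -> dict:
    """rows: iterable of (key, value) string pairs read from system_config."""
    raw = {**DEFAULTS, **dict(rows)}
    return {
        "differential_storage_enabled": _as_bool(raw["differential_storage_enabled"]),
        "forced_storage_interval_hours": int(raw["forced_storage_interval_hours"]),
        "critical_parameter_force_store": _as_bool(raw["critical_parameter_force_store"]),
        "temperature_change_threshold": float(raw["temperature_change_threshold"]),
    }


if __name__ == "__main__":
    print(parse_settings([("forced_storage_interval_hours", "12")]))
```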
New methods:
- _should_store_reading() - Check storage requirements
- _insert_smart_reading_differential() - Store with differential info
- _get_recent_storage_stats() - Monitor storage efficiency
Enhanced collection:
Storage optimization:
Expected storage reduction of 60-80% for typical HDD environments:
use SmartCollector;
my $collector = SmartCollector->new($config);
my $result = $collector->collect_all();
print "Storage efficiency: " . $result->{storage_stats}->{efficiency_percent} . "%\n";
print "Differential readings: " . $result->{storage_stats}->{differential} . "\n";
Run the comprehensive test suite:
cd /etc/pve/autoSMART
./scripts/test-differential-storage.pl
This will:
1. Create test HDD entries
2. Test storage decisions for various change scenarios
3. Validate data reconstruction
4. Show storage efficiency statistics
Existing installations can migrate seamlessly:
SELECT
reading_type,
COUNT(*) as count,
COUNT(*) * 100.0 / SUM(COUNT(*)) OVER() as percentage
FROM smart_readings
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY reading_type;
EXPLAIN ANALYZE
SELECT * FROM smart_readings_reconstructed
WHERE hdd_id = 123 AND timestamp > NOW() - INTERVAL '30 days';
-- Assumes skipped cycles are recorded as rows with reading_type = 'skipped'.
-- NULLIF avoids a division-by-zero error when the window contains no rows.
SELECT
COUNT(*) as total_possible_readings,
COUNT(*) FILTER (WHERE reading_type != 'skipped') as stored_readings,
(COUNT(*) FILTER (WHERE reading_type != 'skipped') * 100.0 / NULLIF(COUNT(*), 0)) as storage_percentage,
(100 - (COUNT(*) FILTER (WHERE reading_type != 'skipped') * 100.0 / NULLIF(COUNT(*), 0))) as savings_percentage
FROM smart_readings
WHERE timestamp > NOW() - INTERVAL '30 days';
Default parameters that trigger immediate full storage:
- Reallocated_Sector_Ct
- Current_Pending_Sector
- Offline_Uncorrectable
- Reallocated_Event_Count
- Spin_Retry_Count
To mark a parameter as critical, give it a weight of 8.0 or higher in the smart_thresholds table.
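The weight rule can be sketched as a simple filter. The weights below are made-up sample values, not the contents of `smart_thresholds`, and `critical_parameters` is a hypothetical helper. Only the 8.0 cutoff comes from this document.

```python
CRITICAL_WEIGHT = 8.0  # cutoff stated in this document


def critical_parameters(weights: dict) -> set:
    """Return the parameter names whose weight meets the critical cutoff."""
    return {name for name, weight in weights.items() if weight >= CRITICAL_WEIGHT}


if __name__ == "__main__":
    # Hypothetical sample weights for illustration only.
    sample = {
        "Reallocated_Sector_Ct": 10.0,
        "Spin_Retry_Count": 8.0,
        "Temperature_Celsius": 3.0,
    }
    print(sorted(critical_parameters(sample)))
```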
The differential storage system reduces database storage, by an expected 60-80% in typical HDD environments, while maintaining complete data integrity and analytical capabilities. It adapts automatically to HDD behavior, storing more data when drives show issues and less when drives are stable.
This optimization is particularly beneficial for large-scale deployments like the Madagascar cluster, where hundreds of HDDs generate continuous SMART data over years of operation.