This document serves as the complete guide for developers working on autoSMART. It includes development environment setup, architecture documentation, testing procedures, and a developer-specific changelog.
autoSMART follows a modular architecture with clear separation of concerns. Below are the complete directory structure and file descriptions:
autoSMART/
├── README.md          # Symlink to docs/README.md (end-user documentation)
├── .deployignore      # Files excluded from production deployment
├── config/            # Configuration files and templates
├── docs/              # Documentation (mixed deployment)
├── lib/               # Perl modules and core libraries
├── scripts/           # Executable scripts and utilities
└── sql/               # Database schema and SQL files
/config/ - Configuration Management
Configuration files are organized by scope and environment:
config/
├── cluster.conf         # Cluster-wide settings (shared across nodes)
├── cluster-ebony.conf   # Node-specific configuration for ebony
├── database.conf        # PostgreSQL connection settings
├── openai.conf          # OpenAI API configuration and prompts
├── smart.conf           # SMART parameter thresholds and monitoring rules
├── default              # Default/template configuration
└── debug-ebony.sh       # Development debugging script for ebony node
cluster.conf (88 lines):
database.conf (30 lines):
openai.conf (50 lines):
smart.conf (57 lines):
default (107 lines):
cluster-ebony.conf (13 lines):
debug-ebony.sh (29 lines):
/lib/ - Core Perl Modules
Core business logic implemented as reusable Perl modules:
lib/
├── SmartCollector.pm     # SMART data collection and hardware tracking
└── PredictionEngine.pm   # AI-powered failure prediction engine
SmartCollector.pm (802 lines):
PredictionEngine.pm (607 lines):
/scripts/ - Executable Components
Production scripts and development utilities:
scripts/
├── autosmart-collector.pl          # Main data collection daemon
├── autosmart-predictor.pl          # AI prediction processing
├── autosmart-report.pl             # Report generation engine
├── autosmart-migration-report.pl   # Hardware migration analysis
├── smart-collector-daemon.pl       # Background collection service
├── deploy.sh                       # Unified deployment script
├── deploy-production.sh            # Production cluster deployment
├── install.sh                      # Symlink to deploy.sh for compatibility
├── uninstall.sh                    # Complete system removal
├── monitor-cluster.sh              # Cluster health monitoring
├── test-smart-collection.pl        # SMART collection testing
├── test-differential-storage.pl    # Storage optimization testing
├── test-db-connection.pl           # Database connectivity testing
└── simple-smart-test.pl            # Basic SMART functionality test
autosmart-collector.pl (348 lines):
autosmart-predictor.pl (483 lines):
autosmart-report.pl (662 lines):
smart-collector-daemon.pl (252 lines):
deploy.sh (697 lines):
deploy-production.sh (116 lines):
uninstall.sh (187 lines):
monitor-cluster.sh (515 lines):
test-smart-collection.pl (132 lines):
test-differential-storage.pl (270 lines):
test-db-connection.pl (55 lines):
simple-smart-test.pl (144 lines):
autosmart-migration-report.pl (615 lines):
/sql/ - Database Schema
PostgreSQL database definitions and utilities:
sql/
├── schema.sql         # Complete production database schema
└── schema-fixed.sql   # Schema with specific fixes/patches
- hdd_inventory: Hardware identification and location tracking
- smart_readings: SMART parameter data with differential storage
- hdd_migrations: Drive movement logging between nodes/paths
- predictions: AI-generated failure predictions with confidence scores
- alert_history: Alert notification tracking and escalation
- smart_thresholds: Configurable parameter thresholds and alert rules
- system_config: System-wide configuration parameters
- should_store_smart_reading()
- smart_readings_reconstructed
schema.sql (726 lines):
schema-fixed.sql (423 lines):
/docs/ - Documentation
Documentation organized by audience and deployment status:
docs/
├── README.md                 # End-user guide (DEPLOYED)
├── INSTALLATION.md           # Setup and configuration (DEPLOYED)
├── CHANGELOG.md              # Release notes for end-users (DEPLOYED)
├── API.md                    # OpenAI API configuration (DEPLOYED)
├── DEVELOPMENT.md            # Developer guide (NOT DEPLOYED)
└── DIFFERENTIAL_STORAGE.md   # Technical storage details (NOT DEPLOYED)
smartmontools → SmartCollector.pm → PostgreSQL → PredictionEngine.pm → OpenAI API
      ↓                 ↓               ↓                 ↓
autosmart-collector.pl → Database → autosmart-predictor.pl → Reports
cluster.conf (global) → node-specific.conf → smart.conf → openai.conf
                                  ↓
                  Individual script configurations
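The configuration chain above can be pictured as a layered merge. The following is a minimal sketch, assuming each file has already been parsed into a flat hash and that a later layer overrides an earlier one (the guide does not state the precedence explicitly). `merge_config_layers` and the sample keys are hypothetical and are not taken from the autoSMART code base.

```perl
use strict;
use warnings;

# Hypothetical helper: merge configuration layers from left to right.
# Assumption: a key in a later layer replaces the same key from an
# earlier layer (for example node-specific over cluster-wide).
sub merge_config_layers {
    my (@layers) = @_;
    my %merged;
    for my $layer (@layers) {
        %merged = (%merged, %$layer);
    }
    return \%merged;
}

# Illustrative values only; the real keys live in the config/ files.
my $cluster = { collection_interval => 300, db_host => '192.168.2.102' };
my $node    = { collection_interval => 60 };

my $config = merge_config_layers($cluster, $node);
printf "%s=%s\n", $_, $config->{$_} for sort keys %$config;
```

Keeping the merge in one helper makes the precedence order explicit and easy to change if the real scripts resolve conflicts differently.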
autosmart-collector.pl
├── SmartCollector.pm
├── database.conf
├── smart.conf
└── cluster.conf
autosmart-predictor.pl
├── PredictionEngine.pm
├── SmartCollector.pm (for data access)
├── openai.conf
└── database.conf
# Current test database configuration
Host: 192.168.2.102
Database: autosmart
User: postgres
Password: (no password)
Port: 5432
# Core database modules
cpan install DBI DBD::Pg
# JSON processing
cpan install JSON::XS
# System utilities
cpan install Config::Simple File::Slurp Time::HiRes
# Security and hashing
cpan install Digest::SHA
# HTTP/API clients (for OpenAI integration)
cpan install LWP::UserAgent HTTP::Request::Common
# Optional: Development and testing
cpan install Data::Dumper Test::More Test::Exception
# Clone the project
cd /Users/bogdan/Documents/workspace/
git clone <autoSMART-repo>
cd autoSMART
# Set environment variables
export AUTOSMART_DB_HOST=192.168.2.102
export AUTOSMART_DB_NAME=autosmart
export AUTOSMART_DB_USER=postgres
export AUTOSMART_DB_PASS=
export AUTOSMART_DB_PORT=5432
# Optional: OpenAI API key for AI features
export OPENAI_API_KEY=your-api-key-here
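A script can turn the `AUTOSMART_DB_*` variables exported above into DBI connection arguments. This is a minimal sketch, assuming the scripts fall back to the test database settings listed in this guide when a variable is unset. `db_connect_args` is a hypothetical helper, not an existing autoSMART function.

```perl
use strict;
use warnings;

# Hypothetical helper: build DBI connection arguments from a hash of
# environment variables. Fallbacks mirror the test database settings.
sub db_connect_args {
    my ($env) = @_;
    my $host = $env->{AUTOSMART_DB_HOST} // '192.168.2.102';
    my $name = $env->{AUTOSMART_DB_NAME} // 'autosmart';
    my $user = $env->{AUTOSMART_DB_USER} // 'postgres';
    my $pass = $env->{AUTOSMART_DB_PASS} // '';
    my $port = $env->{AUTOSMART_DB_PORT} // 5432;

    my $dsn = "DBI:Pg:dbname=$name;host=$host;port=$port";
    return ($dsn, $user, $pass, { RaiseError => 1 });
}

my ($dsn, $user) = db_connect_args(\%ENV);
print "DSN: $dsn (user: $user)\n";
```

The returned list can be passed directly to `DBI->connect(...)`. Taking the environment as a hash reference rather than reading `%ENV` inside the helper keeps it easy to test.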
# Initialize the database schema
psql -h 192.168.2.102 -U postgres -d autosmart -f sql/schema.sql
# Verify installation
psql -h 192.168.2.102 -U postgres -d autosmart -c "\\dt"
# Run the differential storage test suite
cd scripts/
perl test-differential-storage.pl
# Test database connectivity
perl -e "
use DBI;
my \$dsn = 'DBI:Pg:dbname=autosmart;host=192.168.2.102;port=5432';
my \$dbh = DBI->connect(\$dsn, 'postgres', '', {RaiseError => 1});
print \"Database connection successful!\\n\";
\$dbh->disconnect();
"
autoSMART Architecture
┌─────────────────────────────────────────────────────────────────┐
│                         Proxmox Cluster                         │
├─────────────────────┬─────────────────────┬─────────────────────┤
│ Node 1              │ Node 2              │ Node 3              │
│                     │                     │                     │
│ ┌─── SmartCollector │ ┌─── SmartCollector │ ┌─── SmartCollector │
│ │ - HDD Scanning    │ │ - HDD Scanning    │ │ - HDD Scanning    │
│ │ - SMART Reading   │ │ - SMART Reading   │ │ - SMART Reading   │
│ │ - Migration Det   │ │ - Migration Det   │ │ - Migration Det   │
│ └─── Data Storage   │ └─── Data Storage   │ └─── Data Storage   │
└─────────────────────┴─────────────────────┴─────────────────────┘
                                 │
                     ┌───────────▼───────────┐
                     │     PostgreSQL DB     │
                     │                       │
                     │  • HDD Inventory      │
                     │  • SMART Readings     │
                     │  • Migrations         │
                     │  • AI Predictions     │
                     └───────────┬───────────┘
                                 │
                     ┌───────────▼───────────┐
                     │     SmartAnalyzer     │
                     │                       │
                     │  • OpenAI API         │
                     │  • Failure Prediction │
                     │  • Pattern Analysis   │
                     └───────────┬───────────┘
                                 │
                     ┌───────────▼───────────┐
                     │     SmartReporter     │
                     │                       │
                     │  • Alert Generation   │
                     │  • Report Creation    │
                     │  • Dashboard Data     │
                     └───────────────────────┘
Collection Phase:
Analysis Phase:
Reporting Phase:
# Hardware identification and migration detection
sub _detect_or_create_hdd($drive_info, $smart_data)
# Differential storage decision making
sub _should_store_reading($hdd_id, $smart_data)
# Optimized data storage
sub _insert_smart_reading_differential($hdd_id, $drive_info, $smart_data, $storage_info)
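The body of `_should_store_reading` is not shown in this guide. The following is a minimal sketch of one possible decision rule, assuming a reading is stored when no previous reading exists, when a forced-storage interval has elapsed, or when any parameter value has changed. `should_store`, the `reason` strings, and the 3600-second default are hypothetical; only the `store` key matches the test example elsewhere in this document.

```perl
use strict;
use warnings;

# Hypothetical decision helper. $previous is the last stored reading
# ({ timestamp => epoch_seconds, parameters => { name => value } })
# or undef if nothing has been stored for this drive yet.
sub should_store {
    my ($previous, $current, $force_interval) = @_;
    $force_interval //= 3600;    # assumed default, in seconds

    return { store => 1, reason => 'first_reading' }
        unless defined $previous;

    my $age = $current->{timestamp} - $previous->{timestamp};
    return { store => 1, reason => 'forced_interval' }
        if $age >= $force_interval;

    # Compare the union of parameter names so added or removed
    # parameters also count as a change.
    my %names = map { $_ => 1 }
        keys %{ $previous->{parameters} }, keys %{ $current->{parameters} };
    for my $name (sort keys %names) {
        my $old = $previous->{parameters}{$name};
        my $new = $current->{parameters}{$name};
        next if defined $old && defined $new && $old eq $new;
        return { store => 1, reason => "changed:$name" };
    }
    return { store => 0, reason => 'unchanged' };
}
```

A production rule would likely restrict the comparison to a configured list of critical parameters; this sketch compares all of them to stay short.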
New SMART Parameters:
```perl
if ($line =~ /New_Parameter.*\s+(\d+)/) {
    $smart_data->{parameters}{'New_Parameter'} = $1;
}
```
Custom Manufacturer Detection:
```perl
sub _detect_manufacturer {
    my ($self, $model) = @_;
    return 'Custom_Manufacturer' if $model =~ /CUSTOM_PATTERN/;
    # ... existing logic
}
```
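The `New_Parameter` regex shown in this section captures one named parameter at a time. A more general sketch parses every row of the attribute table printed by `smartctl -A`, assuming the usual ten-column ATA layout (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value). `parse_smart_attributes` is a hypothetical helper and is not part of SmartCollector.pm.

```perl
use strict;
use warnings;

# Hypothetical parser for the ATA attribute table of `smartctl -A`.
# Rows that do not match the assumed column layout are skipped.
sub parse_smart_attributes {
    my ($output) = @_;
    my %parameters;
    for my $line (split /\n/, $output) {
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        next unless $line =~ /^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+\S+\s+\S+\s+\S+\s+(\d+)/;
        $parameters{$2} = {
            id        => $1 + 0,
            value     => $3 + 0,
            worst     => $4 + 0,
            threshold => $5 + 0,
            raw       => $6 + 0,    # leading integer of the raw column
        };
    }
    return \%parameters;
}

my $sample = <<'END';
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   036   045   000    Old_age   Always       -       36 (Min/Max 20/45)
END

my $parsed = parse_smart_attributes($sample);
print join(', ', sort keys %$parsed), "\n";
```

Keying the result by attribute name matches the `$smart_data->{parameters}{...}` structure used in the snippet for new parameters.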
# OpenAI API call structure
sub _call_openai_api {
    my ($self, $prompt, $smart_data) = @_;
    my $request = HTTP::Request->new(POST => 'https://api.openai.com/v1/chat/completions');
    $request->header('Authorization' => "Bearer $self->{openai_api_key}");
    $request->header('Content-Type' => 'application/json');
    my $payload = {
        model => "gpt-4",
        messages => [
            {
                role => "system",
                content => "You are an expert in HDD failure prediction..."
            },
            {
                role => "user",
                content => $prompt
            }
        ]
    };
    # ... handle response
}
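The snippet above stops before the payload is serialized and the reply is read. The following is a minimal sketch of those two steps. It uses the core JSON::PP module rather than JSON::XS so that it runs without extra installs. `build_chat_payload` and `extract_reply` are hypothetical names, and the reply shape (`choices[0].message.content`) is the one used by chat completion responses.

```perl
use strict;
use warnings;
use JSON::PP qw(encode_json decode_json);

# Hypothetical helper: serialize the chat payload to a JSON string.
sub build_chat_payload {
    my ($prompt) = @_;
    return encode_json({
        model    => 'gpt-4',
        messages => [
            { role => 'system', content => 'You are an expert in HDD failure prediction...' },
            { role => 'user',   content => $prompt },
        ],
    });
}

# Hypothetical helper: return the assistant text from a response body,
# or undef if the body is not valid JSON or has an unexpected shape.
sub extract_reply {
    my ($body) = @_;
    my $data = eval { decode_json($body) };
    return unless ref($data) eq 'HASH' && ref($data->{choices}) eq 'ARRAY';
    return $data->{choices}[0]{message}{content};
}
```

Inside `_call_openai_api`, the JSON string would be attached with `$request->content(build_chat_payload($prompt))` before the request is sent, and the response body would be handed to `extract_reply`.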
-- Always include migration scripts
CREATE TABLE new_feature (
    id SERIAL PRIMARY KEY,
    hdd_id INTEGER REFERENCES hdd_inventory(id),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Add indexes for performance
CREATE INDEX idx_new_feature_hdd_id ON new_feature(hdd_id);
-- Use ALTER statements for compatibility
ALTER TABLE smart_readings ADD COLUMN new_field VARCHAR(100);
CREATE INDEX CONCURRENTLY idx_smart_readings_new_field ON smart_readings(new_field);
-- Use the reconstructed view for complete data
SELECT * FROM smart_readings_reconstructed
WHERE hdd_id = $1
AND timestamp > NOW() - INTERVAL '30 days'
ORDER BY timestamp DESC;
-- Use raw table for storage statistics
SELECT reading_type, COUNT(*)
FROM smart_readings
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY reading_type;
# Example test structure
use Test::More tests => 5;
use lib '../lib';
use SmartCollector;
my $collector = SmartCollector->new({
db_host => '192.168.2.102',
db_name => 'autosmart_test',
# ... test config
});
# Test hardware identification
my $hdd_id = $collector->_detect_or_create_hdd($drive_info, $smart_data);
ok($hdd_id > 0, "HDD identification successful");
# Test differential storage
my $storage_decision = $collector->_should_store_reading($hdd_id, $smart_data);
ok($storage_decision->{store}, "Storage decision made");
# Run the comprehensive test suite
cd scripts/
perl test-differential-storage.pl
# Test with real hardware (if available)
perl collect-smart-data.pl --test-mode --device /dev/sdb
-- Test query performance
EXPLAIN ANALYZE
SELECT * FROM smart_readings_reconstructed
WHERE hdd_id IN (1,2,3,4,5)
AND timestamp > NOW() - INTERVAL '90 days';
-- Monitor storage efficiency
SELECT
reading_type,
COUNT(*) as readings,
AVG(length(parameters_json::text)) as avg_size_bytes
FROM smart_readings
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY reading_type;
# Enable debug logging
$ENV{AUTOSMART_DEBUG} = 3; # Info level; set to 4 for maximum verbosity
# Log levels:
# 1 = Errors only
# 2 = Warnings and errors
# 3 = Info, warnings, errors
# 4 = Debug everything
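The level table above implies a simple threshold check: a message is emitted when its level is less than or equal to `AUTOSMART_DEBUG`. The following is a minimal sketch of such a logger, assuming a default level of 1 when the variable is unset. `should_log` and `log_message` are hypothetical and may differ from the internal `_log` method.

```perl
use strict;
use warnings;

# Level names follow the table in this guide.
my %LEVEL_NAME = (1 => 'ERROR', 2 => 'WARN', 3 => 'INFO', 4 => 'DEBUG');

# Hypothetical helper: true if a message at $level should be emitted.
sub should_log {
    my ($level) = @_;
    my $threshold = $ENV{AUTOSMART_DEBUG} // 1;    # assumed default
    return $level <= $threshold ? 1 : 0;
}

# Hypothetical logger: write the message to STDERR when permitted.
sub log_message {
    my ($message, $level) = @_;
    return unless should_log($level);
    printf STDERR "[%s] %s\n", $LEVEL_NAME{$level} // 'LOG', $message;
    return 1;
}

log_message('collector started', 3);
```

With `AUTOSMART_DEBUG=3`, the call above prints an INFO line, while a level 4 message would be suppressed.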
# Test database connectivity
psql -h 192.168.2.102 -U postgres -d autosmart -c "SELECT version();"
# Check permissions
psql -h 192.168.2.102 -U postgres -d autosmart -c "\\dp smart_readings"
# Test smartctl access
sudo smartctl -a /dev/sda
# Check permissions
ls -la /dev/sd*
-- Check migration logs
SELECT * FROM hdd_migrations
ORDER BY detected_at DESC
LIMIT 10;
-- Verify HDD inventory
SELECT serial_number, model_name, current_device_path, current_node_id
FROM hdd_inventory
WHERE status = 'active';
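The queries above inspect migrations after they have been recorded. As a rough sketch of how a migration could be recognized at collection time, assume a drive is looked up by serial number and a migration means that its node or device path differs from the inventory row. `detect_migration_change` and the `$observed` keys are hypothetical. The `current_node_id` and `current_device_path` fields follow the `hdd_inventory` columns queried above.

```perl
use strict;
use warnings;

# Hypothetical comparison between the stored inventory row for a
# serial number (or undef if unknown) and the location just observed.
sub detect_migration_change {
    my ($inventory_row, $observed) = @_;

    # Unknown serial number: treat as a newly discovered drive.
    return { type => 'new_drive' } unless defined $inventory_row;

    my @changed;
    push @changed, 'node'
        if $inventory_row->{current_node_id} ne $observed->{node_id};
    push @changed, 'device_path'
        if $inventory_row->{current_device_path} ne $observed->{device_path};

    return { type => 'unchanged' } unless @changed;
    return { type => 'migration', changed => \@changed };
}

my $result = detect_migration_change(
    { current_node_id => 'ebony', current_device_path => '/dev/sda' },
    { node_id         => 'ebony', device_path         => '/dev/sdb' },
);
print "$result->{type}: @{ $result->{changed} }\n";
```

A result of type `migration` is the point at which a row would be written to `hdd_migrations` and the inventory record updated.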
-- Monitor table sizes
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
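-- Monitor query performance (requires the pg_stat_statements extension;
-- on PostgreSQL 13 and later the column is named mean_exec_time instead of mean_time)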
SELECT query, mean_time, calls
FROM pg_stat_statements
WHERE query LIKE '%smart_readings%'
ORDER BY mean_time DESC;
# Add timing to critical operations
use Time::HiRes qw(time);
my $start_time = time();
my $result = $self->collect_smart_data($device_path);
my $duration = time() - $start_time;
$self->_log("SMART collection took ${duration}s for $device_path", 3);
Database Setup:
Security Configuration:
Performance Tuning:
# Install on cluster nodes
for node in pve01 pve02 pve03; do
    scp -r autoSMART/ "root@$node:/etc/pve/"
done
# Configure systemd services
systemctl enable autosmart-collector
systemctl start autosmart-collector
# Monitor system in real-time
watch -n 30 'psql -h 192.168.2.102 -U postgres -d autosmart -c "SELECT COUNT(*) FROM smart_readings WHERE timestamp > NOW() - INTERVAL '\''1 hour'\''"'
# Generate performance report
psql -h 192.168.2.102 -U postgres -d autosmart -f sql/performance-report.sql
This section contains detailed technical changes, internal API modifications, and development-specific information that is not relevant for end-users.
- hdd_inventory and hdd_migrations tables for hardware-based identification
- should_store_smart_reading() PostgreSQL function with configurable change detection
- identify_hardware(), detect_migration()
- should_store_reading()
- should_store_smart_reading(jsonb, text, text, interval, text[]) function
- smart_readings_reconstructed view for seamless data access
- (cluster.conf) and node-specific (autosmart.conf)
- force_storage_interval, critical_parameters
- test-differential-storage.pl
- 192.168.2.102 for development and testing
- uuid-ossp and btree_gin extensions
- ../sql/schema.sql
- ../sql/sample-queries.sql
- ../sql/migrations/
- ../config/cluster.conf (shared across all nodes)
- ../config/defaults/autosmart (node-specific settings)
- ../config/openai.conf (API configuration)
- ../scripts/collect-smart-data.pl
- ../scripts/analyze-smart-data.pl
- ../scripts/generate-reports.pl
- ../scripts/test-differential-storage.pl
- ../scripts/deploy.sh, ../scripts/deploy-production.sh
Files to modify:
1. lib/SmartCollector.pm - Add parameter collection logic
2. sql/schema.sql - Update parameter definitions if needed
3. scripts/test-differential-storage.pl - Add parameter tests
4. docs/DIFFERENTIAL_STORAGE.md - Document parameter behavior
Files to modify:
1. lib/SmartAnalyzer.pm - Add new prediction algorithms
2. docs/API.md - Update API integration patterns
3. scripts/analyze-smart-data.pl - Add model selection logic
4. sql/schema.sql - Add prediction result tables if needed
Areas to investigate:
1. docs/DIFFERENTIAL_STORAGE.md - Storage optimization techniques
2. sql/schema.sql - Index optimization
3. lib/SmartCollector.pm - Collection efficiency
4. PostgreSQL query performance using EXPLAIN ANALYZE
Files to modify:
1. lib/SmartCollector.pm - Hardware detection logic
2. docs/MIGRATION_DETECTION.md - Hardware tracking specifications
3. scripts/test-differential-storage.pl - Hardware-specific tests
4. Configuration templates for new hardware types
# Use strict and warnings
use strict;
use warnings;
# Consistent indentation (4 spaces)
sub example_function {
    my ($self, $param) = @_;
    # Clear variable names
    my $smart_data = $self->collect_smart_data($param);
    # Error handling
    return unless defined $smart_data;
    return $smart_data;
}
-- Use transactions for data consistency
BEGIN;
-- Multiple related operations
INSERT INTO hdd_inventory (...) VALUES (...);
INSERT INTO smart_readings (...) VALUES (...);
COMMIT;
-- Use proper indexing
CREATE INDEX CONCURRENTLY idx_smart_readings_timestamp
ON smart_readings(timestamp DESC, serial_number);
# Use parameterized queries to prevent SQL injection (Perl DBI)
my $sth = $dbh->prepare("SELECT * FROM smart_readings WHERE serial_number = ?");
$sth->execute($serial_number);
This development guide provides the foundation for extending and maintaining the autoSMART system. Follow these guidelines to ensure code quality, performance, and reliability.