Development Guide
1. Objectives
The Media Importer project aims to provide a robust, efficient solution for organizing media files by date with proper timezone handling and conflict resolution.
Key objectives:
- Reliable EXIF/metadata extraction and date parsing
- Proper UTC time conversion for QuickTime/Apple media files
- Flexible organization patterns (year/month/day/hour)
- Safe file operations with dry-run capabilities
- Cross-platform compatibility (macOS/Linux)
2. Guide
Development Workflow
When making changes to the project, follow this structured approach:
Changelog Entries Format
All changelog entries should follow this format:
- Date and time
- Bug/Feature description
- Changes made
Example:
- 2025-09-07 10:30
- Fixed data loss issue when processing already-sorted folders
- Added exclusion patterns for sorted/organized/processed folders
- Added --include-sorted flag to override exclusions when needed
File Move Confirmation
Every file move operation should be confirmed:
- After moving a file, the script must check that the file exists at the destination.
- If the file is not present at the destination after the move operation, the script should immediately stop and report an error.
- This ensures data integrity and prevents silent data loss.
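The confirmation rule above can be sketched as a small shell helper. This is a minimal illustration, not the project's actual code: the function name confirmed_move and its error messages are hypothetical, and it assumes it is only called on regular files.

```bash
# Hypothetical helper: move one file, then confirm it exists at the destination.
# Returns non-zero (after printing an error) if the move fails or the file
# cannot be found afterwards.
confirmed_move() {
    local src="$1" dst="$2" target

    # Work out where the file will land: inside dst if dst is an existing
    # directory, otherwise at dst itself.
    if [ -d "$dst" ]; then
        target="$dst/${src##*/}"
    else
        target="$dst"
    fi

    if ! mv -- "$src" "$dst"; then
        echo "ERROR: move failed: $src -> $dst" >&2
        return 1
    fi

    # Confirm the file is present at the destination after the move.
    if [ ! -f "$target" ]; then
        echo "ERROR: file missing at destination after move: $target" >&2
        return 1
    fi
}
```

A caller that wants to stop immediately on a failed confirmation could write `confirmed_move "$file" "$dest_path" || exit 1`.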
Destination Inside Source Handling
- Given the script's purpose, the destination folder may be inside the source folder.
- In this case, all files within the destination folder must be excluded from scanning and processing.
- For extra safety: before processing any file, if its path matches (or is inside) the destination path, the script must report an error and stop immediately.
- This prevents accidental re-processing or moving of files that have already been sorted, ensuring data integrity.
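One way to implement both layers of protection is sketched below, assuming absolute, normalized paths without trailing slashes. The helper names is_inside and list_source_files are hypothetical and not taken from the project.

```bash
# Hypothetical helper: succeed if path $1 equals, or lies inside, directory $2.
# Assumes both are absolute, normalized paths without trailing slashes.
is_inside() {
    case "$1" in
        "$2" | "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: list regular files under source $1 while skipping the
# entire subtree of destination $2.
# Note: -path treats $2 as a pattern, so this sketch assumes the destination
# path contains no glob characters such as * ? [ ].
list_source_files() {
    find "$1" -path "$2" -prune -o -type f -print
}

# Example use (paths are placeholders):
#   list_source_files "$src" "$dst" | while IFS= read -r f; do
#       if is_inside "$f" "$dst"; then
#           echo "ERROR: $f is inside the destination" >&2
#           exit 1
#       fi
#       # ... process "$f" ...
#   done
```

The prefix check compares against "$2"/ rather than "$2" alone so that a sibling such as sorted_old is not mistaken for being inside sorted.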
Testing
Testing is essential to ensure the script's reliability and data safety. The following methodology should be used:
Test Environment Setup
- The samples directory contains a variety of media files for testing.
- Create a dedicated working directory named test for each test run.
- Copy selected files from samples into the test directory to simulate real-world scenarios.
- Perform import operations using the script, targeting the test directory as the source and a subdirectory (e.g., test/sorted) as the destination.
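The setup steps above could be scripted roughly as follows. The function name setup_test_dir and its argument order are hypothetical; the sketch assumes the selected files sit directly in the samples directory.

```bash
# Hypothetical helper: create a fresh working directory and copy the chosen
# sample files into it.
#   $1 = samples directory, $2 = working directory, remaining args = file names
setup_test_dir() {
    local samples="$1" work="$2" name
    shift 2

    # Refuse an empty path so the rm below can never target something unintended.
    [ -n "$work" ] || return 1

    rm -rf -- "$work"
    mkdir -p -- "$work" || return 1

    for name in "$@"; do
        # -p preserves timestamps, which may matter for files without metadata.
        cp -p -- "$samples/$name" "$work/" || return 1
    done
}
```

For example, `setup_test_dir samples test photo.jpg clip.mov` (file names are placeholders) leaves the samples untouched and gives each run a clean test directory.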
Test Execution and Documentation
- Before and after each import operation, run find on both the source and destination directories to capture the file structure:
  - Example: find ./test > test/source_before.txt
  - Example: find ./test/sorted > test/dest_before.txt
- Log all results, including script output and directory listings, into a dedicated log file for each test.
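The capture-and-log steps can be wrapped in two small helpers, sketched here under the assumption that a sorted listing is acceptable. The names snapshot_tree and run_logged are hypothetical. Note that a listing file written inside the scanned directory will appear in later snapshots of that directory.

```bash
# Hypothetical helper: write a sorted listing of directory $1 to file $2.
# A missing directory produces an empty listing instead of an error, which is
# convenient for a destination that does not exist before the first import.
snapshot_tree() {
    local listing
    if [ -d "$1" ]; then
        # Collect the listing first so the output file is not created while
        # find is still walking the tree.
        listing=$(find "$1" | LC_ALL=C sort) || return 1
        printf '%s\n' "$listing" > "$2"
    else
        : > "$2"
    fi
}

# Hypothetical helper: run a command, sending stdout and stderr to log file $1.
# The command's exit status is passed through to the caller.
run_logged() {
    local log="$1"
    shift
    "$@" > "$log" 2>&1
}

# Example use (the import command is a placeholder):
#   snapshot_tree ./test        test/source_before.txt
#   snapshot_tree ./test/sorted test/dest_before.txt
#   run_logged test/import_log.txt ./import_script ./test ./test/sorted
#   snapshot_tree ./test        test/source_after.txt
#   snapshot_tree ./test/sorted test/dest_after.txt
```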
Test Report Format
Each test must generate a comprehensive Markdown report in test/test_report.md with the following structure:
# Test Report: [Test Name/Scenario]
## Test Information
- **Date**: $(date)
- **Scenario**: [Brief description of what is being tested]
- **Objective**: [What specific functionality/behavior is being verified]
- **Files Used**: [List of test files and their characteristics]
## Pre-Test State
### Source Directory Structure
\`\`\`
[Contents of source_before.txt]
\`\`\`
### Destination Directory Structure
\`\`\`
[Contents of dest_before.txt]
\`\`\`
## Test Execution
### Command Used
\`\`\`bash
[Exact command executed]
\`\`\`
### Script Output
\`\`\`
[Full script output from import_log.txt]
\`\`\`
## Post-Test State
### Source Directory Structure
\`\`\`
[Contents of source_after.txt]
\`\`\`
### Destination Directory Structure
\`\`\`
[Contents of dest_after.txt]
\`\`\`
## Analysis and Verification
### Expected Results
- [List what should happen]
### Actual Results
- [List what actually happened]
### Issues Found
- [Any problems, errors, or unexpected behavior]
- [Include error messages, incorrect file placements, etc.]
### Protections Verified
- [ ] Destination exclusion working
- [ ] Move confirmation functional
- [ ] No data loss detected
- [ ] UTC conversion correct (for QuickTime files)
- [ ] Unimportable files handling (if applicable)
## Corrective Actions
### Issues Identified
- [Detailed description of problems found]
### Fixes Applied
- [Code changes made]
- [Configuration adjustments]
- [Process improvements]
### Re-test Results
- [Results after applying fixes]
## Conclusion
### Test Result
- [ ] PASSED
- [ ] FAILED
- [ ] PARTIAL (with notes)
### Notes
[Any additional observations, recommendations, or follow-up actions needed]
### Files Generated
- `test/source_before.txt` - Pre-test source structure
- `test/dest_before.txt` - Pre-test destination structure
- `test/source_after.txt` - Post-test source structure
- `test/dest_after.txt` - Post-test destination structure
- `test/import_log.txt` - Full script execution log
- `test/test_report.md` - This report
Automated Test Runner
A comprehensive test runner script (test_runner.sh) is available to automate the testing process:
./test_runner.sh
The script provides:
- Pre-configured test scenarios for common use cases
- Automatic report generation in Markdown format
- State capture before and after test execution
- Protection verification with checkboxes
- Custom test support for specific scenarios
Test Categories
The test runner provides the following pre-configured test scenarios:
- Basic Functionality Test: Tests processing of files with valid EXIF data to verify correct sorting and organization
- Unimportable Files Test: Tests handling of files without EXIF data in both root and subfolders, without the --collect-unimportable flag
- Mixed Content Test: Tests processing of sortable and unimportable files in separate folders to verify cleanup behavior
- Safety Protections Test: Tests destination exclusion and move confirmation mechanisms to prevent data loss
- UTC Conversion Test: Tests UTC timestamp conversion for QuickTime/Apple EXIF data
- Subdirectory Processing Test: Tests processing of files in nested subdirectories to ensure recursive file discovery
- Custom Test: Allows user-defined test scenarios with custom file sets and commands
Test Result Persistence
The test runner includes automatic result persistence:
- Archival Location: Test results are saved as individual Markdown files in the test_reports/ directory
- Naming Convention: {TestName}_{YYYYMMDD_HHMMSS}.md format for easy identification
- Contents Preserved: Single self-contained Markdown file with:
  - Complete test information and directory structures
  - Full script execution output embedded inline (ANSI codes stripped for readability)
  - Import log content included directly in the report
- Excluded Files: No separate files - everything is consolidated in the Markdown report
- Historical Tracking: Maintains complete test history for debugging and regression testing
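As an illustration of the naming and ANSI-stripping behavior described above, the following sketch shows one possible approach. It is not taken from test_runner.sh: the function names are hypothetical, and replacing spaces in the test name with underscores is an assumption of this sketch.

```bash
# Hypothetical helper: build a report file name as {TestName}_{YYYYMMDD_HHMMSS}.md.
# Spaces in the test name become underscores (an assumption of this sketch).
# An optional second argument supplies the timestamp, which makes the function
# easy to test; by default the current local time is used.
report_filename() {
    local name stamp
    name=$(printf '%s' "$1" | tr ' ' '_')
    stamp=${2:-$(date +%Y%m%d_%H%M%S)}
    printf '%s_%s.md\n' "$name" "$stamp"
}

# Hypothetical helper: strip ANSI color sequences (ESC [ ... m) from stdin.
strip_ansi() {
    local esc
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

# Example use (paths are placeholders):
#   strip_ansi < test/import_log.txt >> "test_reports/$(report_filename "Basic Functionality Test")"
```

The strip_ansi pattern only covers color and style sequences ending in m; cursor-movement codes would need a broader pattern.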
Cleanup
- Review the test report and verify all aspects are documented
- Clean up the test directory after each test run to ensure a fresh environment for subsequent tests
- Archive important test reports in a test_reports/ directory for future reference
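The archive-then-clean order matters: the working directory should only be removed once the report is safely copied. A minimal sketch, with the hypothetical function name archive_and_clean and the assumption that the archive directory lies outside the working directory:

```bash
# Hypothetical helper: copy the report into the archive directory, then remove
# the working directory only if the copy succeeded.
#   $1 = working directory, $2 = archive directory, $3 = archived file name
# Assumes the archive directory is not inside the working directory.
archive_and_clean() {
    local work="$1" archive="$2" name="$3"

    if [ ! -f "$work/test_report.md" ]; then
        echo "ERROR: no test_report.md found in $work" >&2
        return 1
    fi

    mkdir -p -- "$archive" || return 1
    cp -- "$work/test_report.md" "$archive/$name" || return 1

    # Reached only when the report has been archived.
    rm -rf -- "$work"
}
```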
3. Changelog
2025-09-07 21:15 - Test 2 and 3 Enhancements
- Updated Test 2 (Unimportable Files Test) to include files in both root and subfolder
- Removed --collect-unimportable flag from Test 2 to test default behavior
- Updated Test 3 (Mixed Content Test) to use separate folders for sortable vs unimportable files
- Test 3 now verifies that folders with only sortable files are cleaned up while folders with unimportable files are preserved
- Updated menu descriptions to reflect the changes
- Tests now verify proper handling of unimportable files without collection flag
2025-09-07 21:25 - Documentation Enhancement
- Added comprehensive documentation for --collect-unimportable flag in README.md
- Added Example 4 showing how to use --collect-unimportable flag
- Updated Features section to mention unimportable files handling
- Updated Configuration section to explain default behavior for unimportable files
- Added usage example for --collect-unimportable in Basic Usage section
2025-09-07 21:30 - Git Ignore Enhancement
- Added test_reports/ to .gitignore to exclude generated test reports from version control
- Test reports are generated files that don't need to be tracked in Git
- Prevents large numbers of timestamped report files from cluttering the repository
- Added sample/ to .gitignore to exclude test media files from version control
2025-09-07 20:40 - Source Only Test Addition
- Added Test 8: Source Only Test to test runner
- Tests processing with only source parameter (creates sorted subdirectory automatically)
- Verifies that when no destination is specified, files are sorted into source/sorted/
- Updated menu and command line options for new test
2025-09-07 20:45 - Test 7 Refinement
- Updated Test 7 to test --keep-empty-dirs functionality instead of cleanup
- Since cleanup is now default behavior, Test 7 now verifies empty directory preservation
- Renamed from "Cleanup Empty Directories Test" to "Keep Empty Directories Test"
- Updated test scenario to validate --keep-empty-dirs flag behavior
- Added command line option "keep-empty-dirs" for test 7
2025-09-07 19:30 - Test Runner Directory Separation
- Adapted test runner to use separate source and destination directories
- Changed from test/ as source to test/source/ and test/destination/
- Updated all test functions to use proper directory separation
- Improved test isolation and clarity
2025-09-07 19:00 - Default Cleanup Behavior
- Made --cleanup-empty-dirs the default behavior (implicit option)
- Added --keep-empty-dirs flag to disable cleanup if needed
- Updated help text and configuration display to reflect new default
- Cleanup now runs automatically unless explicitly disabled
2025-09-07 18:56 - Cleanup Empty Directories Feature
- Added --cleanup-empty-dirs option to remove empty directories from source after processing
- Added cleanup_empty_directories() function with safe empty directory detection
- Updated final report to show cleanup status
- Maintains safety by not removing source root directories
- Works correctly with dry-run mode
4. Todo
Key areas for future development:
- GPS metadata integration for timezone detection
- Enhanced duplicate detection
- Performance optimizations for large file sets
- Additional organization patterns