ML Model Registry Backup
Versioned, deduplicated backup of model weights with FastCDC.
What Gets Backed Up
BackupEngine automatically detects and backs up model files in all common formats:
- .safetensors: HuggingFace-standard format for transformer models (Llama, Mistral, Phi, etc.)
- .gguf: Quantized models optimized for edge/local inference (llama.cpp, Ollama)
- .onnx: ONNX Runtime model format for cross-platform deployment
- .pt / .pth: PyTorch native format, including mid-training checkpoints
- .bin: Generic binary model weights (legacy TensorFlow, JAX)
- TensorFlow SavedModel directories: Full TensorFlow model bundles with assets and variables
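As a rough illustration of what format detection could look like, the sketch below labels a path by file extension and treats a directory containing `saved_model.pb` (or `saved_model.pbtxt`) as a TensorFlow SavedModel. The names `MODEL_EXTENSIONS`, `is_saved_model_dir`, and `classify` are hypothetical; this is not BackupEngine's actual detection logic.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the list above; the labels are illustrative only.
MODEL_EXTENSIONS = {
    ".safetensors": "safetensors",
    ".gguf": "gguf",
    ".onnx": "onnx",
    ".pt": "pytorch",
    ".pth": "pytorch",
    ".bin": "binary",
}


def is_saved_model_dir(path: Path) -> bool:
    """A TensorFlow SavedModel directory contains saved_model.pb or saved_model.pbtxt."""
    return path.is_dir() and any(
        (path / name).is_file() for name in ("saved_model.pb", "saved_model.pbtxt")
    )


def classify(path: Path) -> Optional[str]:
    """Return a format label for a model file or SavedModel directory, else None."""
    if is_saved_model_dir(path):
        return "tf_saved_model"
    if path.is_file():
        # Compare case-insensitively so MODEL.GGUF is treated like model.gguf.
        return MODEL_EXTENSIONS.get(path.suffix.lower())
    return None
```

Matching on extension alone is a simplification: a `.bin` file is not necessarily a model, so a real scanner would likely combine this with location or header checks.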
Auto-Detected Paths
BackupEngine automatically scans these standard model directories:
| Platform | Detected Paths |
|---|---|
| HuggingFace | ~/.cache/huggingface/hub |
| Ollama | ~/.ollama/models (macOS/Linux) or %USERPROFILE%\.ollama\models (Windows) |
| LM Studio | ~/.lm-studio/models |
| PyTorch Local | ~/projects/*/models (custom folder scans) |
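To see which of these locations exist on a given machine, the patterns in the table can be expanded with `pathlib`. This is a minimal sketch, assuming the macOS/Linux paths only; `DEFAULT_PATTERNS` and `find_model_dirs` are hypothetical names, not part of BackupEngine.

```python
from pathlib import Path

# Patterns copied from the table above, relative to the user's home directory.
# The Windows Ollama location is left out to keep the sketch short.
DEFAULT_PATTERNS = {
    "HuggingFace": ".cache/huggingface/hub",
    "Ollama": ".ollama/models",
    "LM Studio": ".lm-studio/models",
    "PyTorch Local": "projects/*/models",
}


def find_model_dirs(home: Path) -> dict:
    """Map each platform to the existing directories under `home` that match its pattern."""
    found = {}
    for platform, pattern in DEFAULT_PATTERNS.items():
        matches = sorted(p for p in home.glob(pattern) if p.is_dir())
        if matches:
            found[platform] = matches
    return found
```

Calling `find_model_dirs(Path.home())` returns only the platforms whose directories are present, so an empty result means none of the standard locations exist for the current user.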
Custom Paths
Add custom model directories via the desktop agent UI or CLI:
Add custom model path via CLI

    # Add a custom model directory
    backupengine ai-assets add-path \
      --type ai_model \
      --path ~/custom-models/finetunes

    # List all model backup paths
    backupengine ai-assets list-paths --type ai_model
ℹ Note
Custom paths are protected in real time with continuous data protection (CDP). When a new model is added to a custom path, it is included automatically in the next backup run.
CLI Examples
AI model backup and restore commands

    # List all backed-up models
    backupengine ai-assets list --type ai_model

    # Back up a specific model directory
    backupengine ai-assets backup --type ai_model --path ~/projects/my-finetune

    # View all snapshots of a model
    backupengine ai-assets snapshots --asset llama-3-finetune

    # Restore a specific version of a model
    backupengine ai-assets restore --asset llama-3-finetune --version v12

    # Restore the latest version (default)
    backupengine ai-assets restore --asset llama-3-finetune
FastCDC Deduplication
Model development produces many checkpoint versions. Without deduplication, storing 50 checkpoints would cost 50× the storage of one full model. BackupEngine uses content-defined chunking (FastCDC) to split each file into chunks whose boundaries depend on the content itself, so byte ranges that are identical across checkpoints are stored only once.
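The idea behind content-defined chunking can be sketched in a few lines. The code below is a simplified gear-hash chunker, not full FastCDC (which adds normalized chunking with two masks and other optimizations), and the chunk sizes, the seeded gear table, and the `chunk` and `store` helpers are assumptions for illustration rather than BackupEngine's implementation.

```python
import hashlib
import random

# Gear table: 256 pseudo-random 64-bit values, seeded so chunking is reproducible.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk(data: bytes, min_size=2048, avg_size=8192, max_size=65536):
    """Split data at content-defined boundaries using a gear rolling hash.

    avg_size must be a power of two: a cut is made when the low bits of the
    hash are all zero, or when the chunk reaches max_size.
    """
    mask = avg_size - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def store(chunks, index: dict) -> int:
    """Add chunks to a content-addressed index; return the number of new bytes written."""
    new_bytes = 0
    for c in chunks:
        key = hashlib.sha256(c).hexdigest()
        if key not in index:
            index[key] = c
            new_bytes += len(c)
    return new_bytes
```

Because boundaries are chosen from the bytes themselves rather than from fixed offsets, inserting a few bytes near the start of a file shifts only the first chunk or two; later chunks hash to the same keys and `store` writes nothing new for them.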
- Typical dedup ratio for model checkpoints: 80-95% of weights are unchanged between checkpoints.
- Storage footprint: 50 checkpoints cost approximately 3× the size of one full model (not 50×).
- No re-uploading or re-processing: Dedup is transparent and automatic.
- Per-user keying: Each user's chunks are encrypted with their own key for zero-knowledge security.
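The footprint figure can be sanity-checked with simple arithmetic. The sketch below assumes a deliberately simple model: the first checkpoint is stored in full and each later checkpoint adds only its changed fraction as new chunks. `footprint_multiple` is a hypothetical helper, and real ratios depend on how the weights actually change between checkpoints.

```python
def footprint_multiple(num_checkpoints: int, unchanged_fraction: float) -> float:
    """Total stored size, in units of one full model, under the simple model above."""
    new_per_checkpoint = 1.0 - unchanged_fraction
    return 1.0 + (num_checkpoints - 1) * new_per_checkpoint


for unchanged in (0.80, 0.95):
    print(f"{unchanged:.0%} unchanged: {footprint_multiple(50, unchanged):.2f}x")
# 80% unchanged: 10.80x
# 95% unchanged: 3.45x
```

Under these assumptions, a footprint of roughly 3× for 50 checkpoints corresponds to the high end of the quoted range; at 80% unchanged the total is closer to 11×, which is still far below 50×.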
💡 Tip
Fine-tuning runs save intermediate checkpoints throughout training. BackupEngine automatically deduplicates these, so you get version control and disaster recovery without storage bloat.
Retention & Lifecycle
Model versions are retained based on your AI Infra plan:
- Default retention: 1 year on the AI Infra Backup tier.
- Per-asset extension: Set longer retention for critical models (e.g., retain production models for 5 years).
- Automatic cleanup: Older versions are deleted automatically at the end of the retention window.
- Manual deletion: Delete specific versions anytime from the portal or CLI.
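The retention rules above amount to a date comparison per version. The following is a minimal sketch, assuming "1 year" means 365 days and that a per-asset extension is expressed as a number of days; `expiry_date` and `expired_versions` are hypothetical helpers, not BackupEngine APIs.

```python
from datetime import date, timedelta
from typing import Optional

DEFAULT_RETENTION_DAYS = 365  # "1 year" default, approximated as 365 days


def expiry_date(created: date, retention_days: Optional[int] = None) -> date:
    """Date on which a version leaves its retention window."""
    days = DEFAULT_RETENTION_DAYS if retention_days is None else retention_days
    return created + timedelta(days=days)


def expired_versions(versions: dict, today: date,
                     retention_days: Optional[int] = None) -> list:
    """Return labels of versions whose retention window has ended, oldest first."""
    due = [(created, label) for label, created in versions.items()
           if expiry_date(created, retention_days) <= today]
    return [label for _, label in sorted(due)]
```

For example, with versions created on 2023-01-10 and 2024-06-01 and a current date of 2025-03-01, only the first is past the default window; passing `retention_days=5 * 365` for a production model keeps both.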