
ML Model Registry Backup

Versioned, deduplicated backup of model weights with FastCDC.

What Gets Backed Up

BackupEngine automatically detects and backs up model files in the following common formats:

• .safetensors: HuggingFace-standard format for transformer models (Llama, Mistral, Phi, etc.)
• .gguf: Quantized models optimized for edge/local inference (llama.cpp, Ollama)
• .onnx: ONNX Runtime model format for cross-platform deployment
• .pt / .pth: PyTorch native format, including mid-training checkpoints
• .bin: Generic binary model weights (legacy TensorFlow, JAX)
• TensorFlow SavedModel directories: Full TensorFlow model bundles with assets and variables
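
As a rough illustration of how detection by format could work, the sketch below maps the extensions listed above to a format label and treats a directory as a TensorFlow SavedModel when it contains `saved_model.pb`. The function name `classify_model_path` and the labels are hypothetical; this is not BackupEngine's actual detection logic.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the list above; the label strings are illustrative only.
MODEL_EXTENSIONS = {
    ".safetensors": "safetensors",
    ".gguf": "gguf",
    ".onnx": "onnx",
    ".pt": "pytorch",
    ".pth": "pytorch",
    ".bin": "binary",
}


def classify_model_path(path: Path) -> Optional[str]:
    """Return a format label for a model file or SavedModel directory, else None."""
    if path.is_dir():
        # A TensorFlow SavedModel directory has saved_model.pb at its root.
        if (path / "saved_model.pb").is_file():
            return "tf_saved_model"
        return None
    # Match on the final extension, ignoring case.
    return MODEL_EXTENSIONS.get(path.suffix.lower())
```

A real scanner would likely inspect file headers as well, since an extension such as `.bin` is shared by many unrelated file types.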

Auto-Detected Paths

BackupEngine automatically scans these standard model directories:

Platform	Detected Paths
HuggingFace	`~/.cache/huggingface/hub`
Ollama	`~/.ollama/models` (macOS/Linux) or `%USERPROFILE%\.ollama\models` (Windows)
LM Studio	`~/.lm-studio/models`
PyTorch Local	`~/projects/*/models` (custom folder scans)
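
To see which of these locations exist on a given machine, a minimal sketch along the following lines can help. It expands the patterns from the table relative to a home directory; `detect_model_dirs` is a hypothetical helper, and the Windows Ollama location is left out for brevity.

```python
from pathlib import Path

# Patterns from the table above, relative to the user's home directory.
# Assumption: the Windows-specific Ollama path is omitted to keep this short.
DEFAULT_PATTERNS = [
    ".cache/huggingface/hub",
    ".ollama/models",
    ".lm-studio/models",
    "projects/*/models",
]


def detect_model_dirs(home: Path) -> list[Path]:
    """Return existing directories under `home` that match the default patterns."""
    found = []
    for pattern in DEFAULT_PATTERNS:
        # Path.glob handles both literal relative paths and the '*' wildcard.
        found.extend(p for p in home.glob(pattern) if p.is_dir())
    return sorted(found)
```

Calling `detect_model_dirs(Path.home())` lists the candidate directories for the current user.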

Custom Paths

Add custom model directories via the desktop agent UI or CLI:

Add custom model path via CLI

# Add a custom model directory
backupengine ai-assets add-path \
  --type ai_model \
  --path ~/custom-models/finetunes

# List all model backup paths
backupengine ai-assets list-paths --type ai_model

ℹ Note

Custom paths are monitored with continuous data protection (CDP). A model added to a custom path is picked up automatically and included in the next backup run; you do not need to register it separately.

CLI Examples

AI model backup and restore commands

# List all backed-up models
backupengine ai-assets list --type ai_model

# Back up a specific model directory
backupengine ai-assets backup --type ai_model --path ~/projects/my-finetune

# View all snapshots of a model
backupengine ai-assets snapshots --asset llama-3-finetune

# Restore a specific version of a model
backupengine ai-assets restore --asset llama-3-finetune --version v12

# Restore the latest version (default)
backupengine ai-assets restore --asset llama-3-finetune

FastCDC Deduplication

Model development produces many checkpoint versions. Without deduplication, storing 50 checkpoints costs 50× the storage of one full model. BackupEngine uses content-defined chunking (FastCDC) to split each file into variable-size chunks whose boundaries depend on the content itself, so chunks that are byte-identical across checkpoints are stored only once.
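
The idea behind content-defined chunking can be sketched in a few lines. The example below is a simplified gear-hash chunker, not FastCDC itself (FastCDC adds normalized chunking with two masks, among other optimizations) and not BackupEngine's implementation. All names and parameters (`MIN_SIZE`, `MAX_SIZE`, `CUT_BITS`, `chunk`, `stored_bytes`) are assumptions made for illustration.

```python
import hashlib
import random

# Hypothetical parameters chosen for illustration; not BackupEngine's settings.
MIN_SIZE, MAX_SIZE = 256, 8192
CUT_BITS = 10  # past MIN_SIZE, a boundary occurs on average every 2**10 bytes
MASK64 = (1 << 64) - 1

# Fixed table of random 64-bit values, one per byte value (the "gear" table).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]


def chunk(data: bytes) -> list[bytes]:
    """Split data at boundaries chosen by a rolling hash of recent bytes."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Each shift pushes older bytes out, so h reflects only the last 64 bytes.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        at_boundary = length >= MIN_SIZE and (h >> (64 - CUT_BITS)) == 0
        if at_boundary or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def stored_bytes(versions: list[bytes]) -> int:
    """Bytes needed when each distinct chunk (by SHA-256) is stored once."""
    store = {}
    for version in versions:
        for piece in chunk(version):
            store.setdefault(hashlib.sha256(piece).digest(), len(piece))
    return sum(store.values())
```

Because boundaries depend on nearby content rather than on fixed offsets, inserting or changing a few bytes only alters the chunks around the edit; later chunks realign and deduplicate against the earlier version. Passing two nearly identical byte strings to `stored_bytes` should therefore return only slightly more than the size of one of them.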

• Typical dedup ratio for model checkpoints: 80-95% of weights are unchanged between checkpoints.
• Storage footprint: At the high end of that range, 50 checkpoints cost approximately 3× the size of one full model (not 50×).
• No re-uploading or re-processing: Dedup is transparent and automatic.
• Per-user keying: Each user's chunks are encrypted with their own key for zero-knowledge security.
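
The relationship between the unchanged fraction and the storage footprint can be estimated with a simple model. The sketch below assumes the first checkpoint is stored in full and every later checkpoint adds only its changed fraction; real savings depend on how changes are distributed across chunks. `footprint_multiple` is a hypothetical helper, not part of BackupEngine.

```python
def footprint_multiple(num_checkpoints: int, unchanged_fraction: float) -> float:
    """Estimated storage as a multiple of one full model (simple linear model)."""
    if num_checkpoints < 1 or not 0.0 <= unchanged_fraction <= 1.0:
        raise ValueError("need at least one checkpoint and a fraction in [0, 1]")
    # One full copy, plus the changed portion of every subsequent checkpoint.
    return 1.0 + (num_checkpoints - 1) * (1.0 - unchanged_fraction)


for fraction in (0.80, 0.95):
    print(f"{fraction:.0%} unchanged -> {footprint_multiple(50, fraction):.2f}x")
# 80% unchanged -> 10.80x
# 95% unchanged -> 3.45x
```

Under this model, the approximately 3× figure corresponds to checkpoints that are about 95% unchanged; at 80% unchanged, 50 checkpoints would cost closer to 11× one full model, still far below 50×.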

💡 Tip

Fine-tuning runs save intermediate checkpoints throughout training. BackupEngine automatically deduplicates these, so you get version control and disaster recovery without storage bloat.

Retention & Lifecycle

Model versions are retained based on your AI Infra plan:

• Default retention: 1 year on the AI Infra Backup tier.
• Per-asset extension: Set longer retention for critical models (e.g., production models retained for 5 years).
• Automatic cleanup: Older versions are deleted automatically at the end of the retention window.
• Manual deletion: Delete specific versions anytime from the portal or CLI.
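
The retention rules above amount to a date comparison. The following sketch shows one way to reason about when a version becomes eligible for automatic cleanup; it assumes a year is counted as 365 days and that a per-asset extension simply overrides the default. The helper names are hypothetical, and BackupEngine's actual cleanup schedule may differ.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: the 1-year default is modeled as 365 days.
DEFAULT_RETENTION_DAYS = 365


def expiry_date(created: date, retention_days: Optional[int] = None) -> date:
    """Date on which a version leaves its retention window."""
    days = DEFAULT_RETENTION_DAYS if retention_days is None else retention_days
    return created + timedelta(days=days)


def is_expired(created: date, today: date, retention_days: Optional[int] = None) -> bool:
    """True once the retention window for a version has ended."""
    return today >= expiry_date(created, retention_days)
```

For example, a version created on 2023-01-01 with the default retention expires on 2024-01-01, while the same version with a 5-year extension (`retention_days=5 * 365`) is still retained on that date.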
