ML Model Registry Backup

Versioned, deduplicated backup of model weights with FastCDC.

What Gets Backed Up

BackupEngine automatically detects and backs up model files in the following formats:

  • .safetensors — HuggingFace-standard format for transformer models (Llama, Mistral, Phi, etc.)
  • .gguf — Quantized models optimized for edge/local inference (llama.cpp, Ollama)
  • .onnx — ONNX Runtime model format for cross-platform deployment
  • .pt / .pth — PyTorch native format, including checkpoints mid-training
  • .bin — Generic binary model weights (legacy TensorFlow, JAX)
  • TensorFlow SavedModel directories — Full TensorFlow model bundles with assets and variables
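
As a rough illustration of what format detection like this involves, the sketch below classifies files by extension and treats a directory containing `saved_model.pb` as a TensorFlow SavedModel. This is not BackupEngine's detection logic, which is not documented here; the function names and the category labels are hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# Extensions come from the list above; the labels are illustrative only.
MODEL_EXTENSIONS = {
    ".safetensors": "safetensors",
    ".gguf": "gguf",
    ".onnx": "onnx",
    ".pt": "pytorch",
    ".pth": "pytorch",
    ".bin": "generic-binary",
}


def is_saved_model_dir(path: Path) -> bool:
    """A TensorFlow SavedModel directory contains a saved_model.pb file."""
    return path.is_dir() and (path / "saved_model.pb").is_file()


def classify(path: Path) -> str | None:
    """Return a format label for a path, or None if it is not a model."""
    if is_saved_model_dir(path):
        return "tensorflow-savedmodel"
    if path.is_file():
        return MODEL_EXTENSIONS.get(path.suffix.lower())
    return None


def find_models(root: Path) -> dict[str, str]:
    """Walk root recursively and map relative paths to format labels."""
    found = {}
    for p in sorted(root.rglob("*")):
        kind = classify(p)
        if kind is not None:
            found[str(p.relative_to(root))] = kind
    return found
```

Calling `find_models(Path.home() / "custom-models")` would return a dictionary of relative paths and their detected formats for that (assumed) directory.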

Auto-Detected Paths

BackupEngine automatically scans these standard model directories:

Platform        Detected Paths
HuggingFace     ~/.cache/huggingface/hub
Ollama          ~/.ollama/models (macOS/Linux) or %USERPROFILE%\.ollama (Windows)
LM Studio       ~/.lm-studio/models
PyTorch Local   ~/projects/*/models (custom folder scans)
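
To see which of these locations exist on a given machine before the first backup, a small check along the following lines may help. It is a sketch that only covers the POSIX-style entries from the table; the `existing_paths` helper is hypothetical and is unrelated to how the agent itself scans.

```python
from pathlib import Path

# Patterns copied from the table above (macOS/Linux entries only).
AUTO_DETECTED = {
    "HuggingFace": "~/.cache/huggingface/hub",
    "Ollama": "~/.ollama/models",
    "LM Studio": "~/.lm-studio/models",
    "PyTorch Local": "~/projects/*/models",
}


def existing_paths(patterns, home=None):
    """Return {platform: [directories]} for patterns that match under home."""
    home = Path(home) if home is not None else Path.home()
    found = {}
    for platform, pattern in patterns.items():
        # Strip the leading "~/" so the pattern can be globbed relative to home.
        relative = pattern[2:] if pattern.startswith("~/") else pattern
        matches = sorted(p for p in home.glob(relative) if p.is_dir())
        if matches:
            found[platform] = matches
    return found


if __name__ == "__main__":
    for platform, paths in existing_paths(AUTO_DETECTED).items():
        for path in paths:
            print(f"{platform}: {path}")
```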

Custom Paths

Add custom model directories via the desktop agent UI or CLI:

Add custom model path via CLI
# Add a custom model directory
backupengine ai-assets add-path \
  --type ai_model \
  --path ~/custom-models/finetunes

# List all model backup paths
backupengine ai-assets list-paths --type ai_model

ℹ Note

Custom paths are backed up in real time with continuous data protection (CDP). A new model added to a custom path is automatically included in the next backup run.

CLI Examples

AI model backup and restore commands
# List all backed-up models
backupengine ai-assets list --type ai_model

# Back up a specific model directory
backupengine ai-assets backup --type ai_model --path ~/projects/my-finetune

# View all snapshots of a model
backupengine ai-assets snapshots --asset llama-3-finetune

# Restore a specific version of a model
backupengine ai-assets restore --asset llama-3-finetune --version v12

# Restore the latest version (default)
backupengine ai-assets restore --asset llama-3-finetune
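
For scripted restores, the commands above can be wrapped in a small helper. The sketch below uses only the subcommand and flags shown in this section; the Python function names are hypothetical, and it assumes the `backupengine` binary is on the PATH.

```python
import shutil
import subprocess


def restore_command(asset, version=None):
    """Build the argv for a restore; omitting version restores the latest."""
    cmd = ["backupengine", "ai-assets", "restore", "--asset", asset]
    if version is not None:
        cmd += ["--version", version]
    return cmd


def restore(asset, version=None):
    """Run the restore, failing early if the CLI is not installed."""
    if shutil.which("backupengine") is None:
        raise FileNotFoundError("backupengine CLI not found on PATH")
    return subprocess.run(restore_command(asset, version), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with asset names that contain spaces.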

FastCDC Deduplication

Model development produces many checkpoint versions. Without deduplication, storing 50 checkpoints costs 50× the storage of one full model. BackupEngine uses content-defined chunking (FastCDC) to find byte-identical chunks across checkpoints and store each unique chunk only once.
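
The idea behind content-defined chunking can be shown with a simplified gear-hash chunker. This is a teaching sketch, not BackupEngine's implementation and not full FastCDC: it omits FastCDC's normalized chunking, and the chunk sizes, gear table seed, and helper names are all assumptions chosen for illustration.

```python
import hashlib
import random

# Gear table: one pseudo-random 32-bit value per byte value (seeded, so stable).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]

MIN_SIZE = 256        # never cut before this many bytes
MAX_SIZE = 8192       # always cut at this many bytes
MASK = 0xFFC00000     # top 10 bits zero -> a cut roughly every 1 KiB


def chunk(data: bytes):
    """Yield chunks whose boundaries depend on content, not on offsets."""
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Rolling gear hash: older bytes shift out after 32 steps.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]


def store(data: bytes, chunk_store: dict) -> int:
    """Add unseen chunks to chunk_store; return the number of new bytes."""
    new_bytes = 0
    for c in chunk(data):
        key = hashlib.sha256(c).hexdigest()
        if key not in chunk_store:
            chunk_store[key] = c
            new_bytes += len(c)
    return new_bytes


# Two synthetic "checkpoints": v2 is v1 with 1,500 bytes inserted mid-file.
data_rng = random.Random(42)
v1 = bytes(data_rng.getrandbits(8) for _ in range(200_000))
v2 = v1[:50_000] + b"new-layer-bytes" * 100 + v1[50_000:]

chunk_store = {}
first = store(v1, chunk_store)
second = store(v2, chunk_store)
print(f"v1 stored {first} new bytes, v2 stored {second} new bytes")
```

Because boundaries are chosen from the bytes themselves, the insertion shifts every later offset in `v2` yet the chunker resynchronizes shortly after the change, so only the chunks around the edit are stored again. Fixed-size blocks would see every block after the insertion as new.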

  • Typical dedup ratio for model checkpoints: 80-95% of the weight data is unchanged between consecutive checkpoints.
  • Storage footprint: 50 checkpoints cost approximately 3× the size of one full model (not 50×).
  • No re-uploading or re-processing: Dedup is transparent and automatic.
  • Per-user keying: Each user's chunks are encrypted with their own key for zero-knowledge security.
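
The storage figures above can be sanity-checked with a simple model: the first checkpoint is stored in full, and each later checkpoint adds only its changed fraction. This is an assumed simplification (real results depend on chunk size and on how changes are distributed), and `footprint` is a hypothetical helper.

```python
def footprint(num_checkpoints: int, unchanged_fraction: float) -> float:
    """Total stored size, in multiples of one full model."""
    changed = 1.0 - unchanged_fraction
    return 1.0 + (num_checkpoints - 1) * changed


for unchanged in (0.80, 0.95, 0.96):
    size = footprint(50, unchanged)
    print(f"{unchanged:.0%} unchanged -> {size:.2f}x one model")
```

Under this model, 50 checkpoints at 95% unchanged come to roughly 3.5× one model and at 80% unchanged to about 10.8×, so the approximately 3× figure corresponds to the upper end of the quoted dedup range.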

💡 Tip

Fine-tuning runs save intermediate checkpoints throughout training. BackupEngine automatically deduplicates these, so you get version control and disaster recovery without storage bloat.

Retention & Lifecycle

Model versions are retained based on your AI Infra plan:

  • Default retention: 1 year on the AI Infra Backup tier.
  • Per-asset extension: Set longer retention for critical models (e.g., retain production models for 5 years).
  • Automatic cleanup: Older versions are deleted automatically at the end of the retention window.
  • Manual deletion: Delete specific versions at any time from the portal or CLI.
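
To estimate which versions automatic cleanup would remove, the retention window can be applied to version timestamps as in the sketch below. The one-year default and the five-year example come from the list above; treating a year as 365 days, the version labels, and the `expired_versions` helper are assumptions for illustration, and the service's exact cutoff rules may differ.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION = timedelta(days=365)          # "1 year" default
PRODUCTION_RETENTION = timedelta(days=5 * 365)   # "5 years" per-asset example


def expired_versions(versions, now, retention=DEFAULT_RETENTION):
    """Return labels of versions older than the retention window."""
    return [
        label
        for label, created in versions.items()
        if now - created > retention
    ]


# Hypothetical version history for one asset.
versions = {
    "v1": datetime(2022, 1, 10, tzinfo=timezone.utc),
    "v2": datetime(2023, 9, 1, tzinfo=timezone.utc),
    "v3": datetime(2024, 5, 20, tzinfo=timezone.utc),
}
now = datetime(2024, 6, 1, tzinfo=timezone.utc)

print(expired_versions(versions, now))
print(expired_versions(versions, now, PRODUCTION_RETENTION))
```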