feat: add Freeleaps PVC backup job with ArgoCD deployment
- Add Python backup script with PST timezone support
- Create Helm Chart for flexible configuration
- Add ArgoCD Application for GitOps deployment
- Include comprehensive documentation and build scripts
- Support incremental snapshots for cost efficiency
- Process PVCs independently with error handling
- Add .gitignore to exclude Python cache files
parent 2bf8244370
commit a470476c71
36
jobs/freeleaps-data-backup/.gitignore
vendored
Normal file
@@ -0,0 +1,36 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual environments
venv/
env/
ENV/

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db
25
jobs/freeleaps-data-backup/Dockerfile
Normal file
@@ -0,0 +1,25 @@
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy backup script
COPY backup_script.py .

# Make script executable
RUN chmod +x backup_script.py

# Set environment variables
ENV PYTHONUNBUFFERED=1

# Run the backup script
CMD ["python", "backup_script.py"]
277
jobs/freeleaps-data-backup/README.md
Normal file
@@ -0,0 +1,277 @@
# Freeleaps PVC Backup Job

This job creates daily snapshots of critical PVCs in the Freeleaps production environment using the Azure Disk CSI snapshot feature.

## Overview

The backup job runs daily at 00:00 PST (Pacific Standard Time, UTC-8), scheduled as 08:00 UTC, and creates snapshots for the following PVCs:
- `gitea-shared-storage` in namespace `freeleaps-prod`
- `data-freeleaps-prod-gitea-postgresql-ha-postgresql-0` in namespace `freeleaps-prod`

## Components

- **backup_script.py**: Python script that creates snapshots and monitors their status
- **Dockerfile**: Container image definition
- **build.sh**: Script to build the Docker image
- **deploy-argocd.sh**: Script to deploy via ArgoCD
- **helm-pkg/**: Helm Chart for Kubernetes deployment
- **argo-app/**: ArgoCD Application configuration

## Features

- ✅ Creates snapshots with timestamp-based naming (YYYYMMDD format)
- ✅ Uses PST timezone for snapshot naming
- ✅ Monitors snapshot status until ready
- ✅ Comprehensive logging to console
- ✅ Error handling and retry logic
- ✅ RBAC permissions for secure operation
- ✅ Resource limits and security context
- ✅ Concurrency control (prevents overlapping jobs)
- ✅ Helm Chart for flexible configuration
- ✅ ArgoCD integration for GitOps deployment
- ✅ Incremental snapshots for cost efficiency

## Building and Deployment

### Option 1: ArgoCD Deployment (Recommended)

#### 1. Build and Push Docker Image

```bash
# Make build script executable
chmod +x build.sh

# Build the image
./build.sh

# Push to registry
docker push freeleaps-registry.azurecr.io/freeleaps-pvc-backup:latest
```

#### 2. Deploy via ArgoCD

```bash
# Deploy ArgoCD Application
./deploy-argocd.sh
```

#### 3. Monitor in ArgoCD

```bash
# Check ArgoCD application status
kubectl get applications -n freeleaps-devops-system

# Access ArgoCD UI
kubectl port-forward svc/argocd-server -n freeleaps-devops-system 8080:443
```

Then visit `https://localhost:8080` in your browser.

### Option 2: Direct Helm Deployment

#### 1. Build and Push Docker Image

```bash
# Build the image
./build.sh

# Push to registry
docker push freeleaps-registry.azurecr.io/freeleaps-pvc-backup:latest
```

#### 2. Deploy with Helm

```bash
# Deploy using Helm Chart
helm install freeleaps-data-backup ./helm-pkg/freeleaps-data-backup \
  --values helm-pkg/freeleaps-data-backup/values.prod.yaml \
  --namespace freeleaps-prod \
  --create-namespace
```

## Monitoring

### Check CronJob Status

```bash
kubectl get cronjobs -n freeleaps-prod
```

### Check Job History

```bash
kubectl get jobs -n freeleaps-prod
```

### View Job Logs

```bash
# Get the latest job name
kubectl get jobs -n freeleaps-prod --sort-by=.metadata.creationTimestamp

# View logs
kubectl logs -n freeleaps-prod job/freeleaps-data-backup-<timestamp>
```

### Check Snapshots

```bash
kubectl get volumesnapshots -n freeleaps-prod
```

## Configuration

### Schedule

The job runs daily at 00:00 PST. To modify the schedule, edit the `cronjob.schedule` field in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
cronjob:
  schedule: "0 8 * * *"  # UTC 08:00 = PST 00:00
```
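
To sanity-check a cron hour against the intended local time, the conversion can be sketched in Python. This is only an illustration and not part of the repository: `utc_hour_to_local` is a hypothetical helper, and the fixed offsets (UTC-8 for PST, UTC-7 for PDT) are assumptions that ignore the actual DST transition dates.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")  # same fixed offset the backup script uses
PDT = timezone(timedelta(hours=-7), "PDT")  # assumed daylight-saving offset


def utc_hour_to_local(hour_utc: int, tz: timezone) -> str:
    """Return the local HH:MM for a cron hour that is expressed in UTC."""
    # The calendar date is arbitrary; only the time of day matters here.
    utc_time = datetime(2025, 8, 5, hour_utc, 0, tzinfo=timezone.utc)
    return utc_time.astimezone(tz).strftime("%H:%M")


print(utc_hour_to_local(8, PST))  # 00:00
print(utc_hour_to_local(8, PDT))  # 01:00
```

Under these assumptions, `0 8 * * *` lands on midnight only while PST is in effect. During daylight saving time the job would start at 01:00 local time.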

### PVCs to Backup

To add or remove PVCs, modify the `backup.pvcs` list in `helm-pkg/freeleaps-data-backup/values.prod.yaml`. As committed, `backup_script.py` also hardcodes these two PVC names and the CronJob template does not pass `backup.pvcs` to the container, so the list in the script has to be updated to match:

```yaml
backup:
  pvcs:
    - "gitea-shared-storage"
    - "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0"
    # Add more PVCs here
```
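
One way the script could pick up a PVC list from its environment is sketched below. This is an assumption-heavy illustration, not the committed behaviour: the `BACKUP_PVCS` variable and the `pvcs_from_env` helper are hypothetical, and the chart does not currently set such a variable.

```python
import os

# Defaults mirror the two PVCs named in this README.
DEFAULT_PVCS = [
    "gitea-shared-storage",
    "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0",
]


def pvcs_from_env(var_name: str = "BACKUP_PVCS") -> list[str]:
    """Parse a comma-separated PVC list, falling back to the defaults."""
    raw = os.getenv(var_name, "")
    # Drop surrounding whitespace and ignore empty entries such as "a,,b".
    names = [name.strip() for name in raw.split(",") if name.strip()]
    return names or list(DEFAULT_PVCS)


print(pvcs_from_env())
```

With a variable like this, the CronJob template would also need an `env` entry that joins `backup.pvcs` into a single string. That wiring is left out here.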

### Snapshot Class

The job uses the `csi-azuredisk-vsc` snapshot class with incremental snapshots enabled. This can be modified in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
backup:
  snapshotClass: "csi-azuredisk-vsc"
```

### Resource Limits

Resource limits can be configured in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "200m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```

## How It Works

### Snapshot Naming

Snapshots are named using the format: `{PVC_NAME}-snapshot-{YYYYMMDD}`

Examples:
- `gitea-shared-storage-snapshot-20250805`
- `data-freeleaps-prod-gitea-postgresql-ha-postgresql-0-snapshot-20250805`
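
The naming rule can be sketched as a small function. This is a minimal illustration rather than the script's exact code: `snapshot_name` is a hypothetical helper, and the 253-character check assumes VolumeSnapshot names follow the usual Kubernetes DNS-subdomain limit.

```python
from datetime import datetime, timedelta, timezone

MAX_NAME_LENGTH = 253  # assumed limit for DNS-subdomain object names


def snapshot_name(pvc_name: str, when: datetime) -> str:
    """Build `{PVC_NAME}-snapshot-{YYYYMMDD}` using the date in UTC-8."""
    pst = timezone(timedelta(hours=-8))
    name = f"{pvc_name}-snapshot-{when.astimezone(pst).strftime('%Y%m%d')}"
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"snapshot name too long ({len(name)} characters)")
    return name


# 2025-08-05 07:30 UTC is still 2025-08-04 in UTC-8.
early = datetime(2025, 8, 5, 7, 30, tzinfo=timezone.utc)
print(snapshot_name("gitea-shared-storage", early))
# gitea-shared-storage-snapshot-20250804
```

Because the name carries only the date, a second run on the same PST day produces the same name and collides with the existing snapshot.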

### Processing Flow

1. **PVC Verification**: Each PVC is verified to exist before processing
2. **Snapshot Creation**: Individual snapshots are created for each PVC
3. **Status Monitoring**: Each snapshot is monitored until ready
4. **Independent Processing**: PVCs are processed independently (one failure doesn't affect others)
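
The four steps above can be sketched as a loop in which each PVC runs through verify, create, and wait on its own. The `process_pvcs` function and its callable parameters are hypothetical stand-ins for the Kubernetes calls, used here so the flow can run without a cluster.

```python
from typing import Callable, Iterable


def process_pvcs(
    pvc_names: Iterable[str],
    exists: Callable[[str], bool],
    create: Callable[[str], bool],
    wait_ready: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Run verify -> create -> wait for each PVC; return (succeeded, failed)."""
    succeeded, failed = [], []
    for name in pvc_names:
        # `all` short-circuits, so later steps are skipped after a failure,
        # but the loop still moves on to the next PVC.
        steps = (exists, create, wait_ready)
        if all(step(name) for step in steps):
            succeeded.append(name)
        else:
            failed.append(name)
    return succeeded, failed


ok, bad = process_pvcs(
    ["pvc-a", "pvc-missing", "pvc-b"],
    exists=lambda n: n != "pvc-missing",
    create=lambda n: True,
    wait_ready=lambda n: True,
)
print(ok, bad)  # ['pvc-a', 'pvc-b'] ['pvc-missing']
```

Collecting failures instead of raising lets the job report every PVC in its summary and still exit non-zero when any of them failed.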

### Incremental Snapshots

The job uses Azure Disk CSI incremental snapshots, which:
- Save storage costs by only storing changed data blocks
- Are created faster than full snapshots
- Maintain full recovery capability

## Troubleshooting

### Common Issues

1. **Permission Denied**: Ensure RBAC is properly configured
2. **PVC Not Found**: Verify PVC names and namespace
3. **Snapshot Creation Failed**: Check Azure Disk CSI driver status
4. **Job Timeout**: Increase `backup.timeout` (in seconds) in the values file if needed
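
The timeout in item 4 bounds a polling loop on the snapshot's `readyToUse` status. A minimal sketch of such a loop follows. `wait_until_ready` and its injectable `fetch_status`, `sleep`, and `clock` parameters are hypothetical, chosen so the logic can be exercised without a cluster.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], dict],
    timeout: float,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until `readyToUse` is true, an error appears, or time runs out."""
    deadline = clock() + timeout
    while clock() < deadline:
        status = fetch_status() or {}
        if status.get("readyToUse"):
            return True
        if status.get("error"):
            # The snapshot controller reported a failure; stop waiting.
            return False
        sleep(interval)
    return False


# Simulated snapshot that becomes ready on the third poll.
polls = []
ready = wait_until_ready(
    lambda: polls.append(1) or {"readyToUse": len(polls) >= 3},
    timeout=60,
    interval=0,
)
print(ready, len(polls))  # True 3
```

If snapshots of large disks regularly need more than the configured timeout, raising the value is usually simpler than shortening the polling interval.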

### Debug Mode

To run the script locally for testing:

```bash
# Install dependencies
pip install -r requirements.txt

# Run with local kubeconfig
python3 backup_script.py
```

## Security

- The job runs with minimal required permissions
- Non-root user execution
- Dropped capabilities
- Resource limits enforced
- No privileged access

## Maintenance

### Cleanup Old Snapshots

Old snapshots can be cleaned up manually:

```bash
# List all snapshots
kubectl get volumesnapshots -n freeleaps-prod

# Delete specific snapshot
kubectl delete volumesnapshot <snapshot-name> -n freeleaps-prod

# Delete snapshots created before a cutoff (example: set the timestamp to 30 days ago)
kubectl get volumesnapshots -n freeleaps-prod -o jsonpath='{.items[?(@.metadata.creationTimestamp<"2024-07-05T00:00:00Z")].metadata.name}' | xargs kubectl delete volumesnapshot -n freeleaps-prod
```
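
As an alternative to a hardcoded cutoff, the age filter can be sketched in Python over the `items` array returned by `kubectl get volumesnapshots -o json`. The `snapshots_older_than` helper and the sample data are hypothetical, and the sketch assumes timestamps in the second-resolution UTC form Kubernetes normally emits.

```python
from datetime import datetime, timedelta, timezone


def snapshots_older_than(items: list[dict], days: int, now: datetime) -> list[str]:
    """Return names of snapshots whose creationTimestamp is more than `days` old."""
    cutoff = now - timedelta(days=days)
    old = []
    for item in items:
        meta = item["metadata"]
        # Timestamps are assumed to look like 2025-08-05T08:00:12Z (UTC).
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if created < cutoff:
            old.append(meta["name"])
    return old


now = datetime(2025, 8, 5, 8, 0, tzinfo=timezone.utc)
sample = [
    {"metadata": {"name": "gitea-shared-storage-snapshot-20250601",
                  "creationTimestamp": "2025-06-01T08:00:10Z"}},
    {"metadata": {"name": "gitea-shared-storage-snapshot-20250804",
                  "creationTimestamp": "2025-08-04T08:00:09Z"}},
]
print(snapshots_older_than(sample, days=30, now=now))
# ['gitea-shared-storage-snapshot-20250601']
```

The returned names can then be reviewed and passed to `kubectl delete volumesnapshot`, which keeps the destructive step explicit.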

### Updating Configuration

To update the backup configuration:

1. Modify the appropriate values file in `helm-pkg/freeleaps-data-backup/`
2. Commit and push changes to the repository
3. ArgoCD will automatically sync the changes
4. Or manually upgrade with Helm: `helm upgrade freeleaps-data-backup ./helm-pkg/freeleaps-data-backup --values helm-pkg/freeleaps-data-backup/values.prod.yaml --namespace freeleaps-prod`

## Backup Data

### What Gets Backed Up

- **gitea-shared-storage**: Gitea repository data, attachments, and configuration
- **data-freeleaps-prod-gitea-postgresql-ha-postgresql-0**: PostgreSQL database data

### Recovery

To restore from a snapshot:

```bash
# Create a PVC from snapshot
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
  namespace: freeleaps-prod
spec:
  dataSource:
    name: <snapshot-name>
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
```
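
When restores are scripted, the same manifest can be generated rather than hand-edited. The sketch below is illustrative only: `restore_pvc_manifest` is a hypothetical helper, the defaults copy the example above, and the optional `storage_class` parameter is an assumption for clusters where the default StorageClass is not the intended one.

```python
import json
from typing import Optional


def restore_pvc_manifest(
    snapshot_name: str,
    pvc_name: str = "restored-pvc",
    namespace: str = "freeleaps-prod",
    size: str = "10Gi",
    storage_class: Optional[str] = None,
) -> dict:
    """Build a PVC manifest whose dataSource is an existing VolumeSnapshot."""
    spec = {
        "dataSource": {
            "name": snapshot_name,
            "kind": "VolumeSnapshot",
            "apiGroup": "snapshot.storage.k8s.io",
        },
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class:
        # Optional: omit to fall back to the cluster's default StorageClass.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name, "namespace": namespace},
        "spec": spec,
    }


manifest = restore_pvc_manifest("gitea-shared-storage-snapshot-20250805")
print(json.dumps(manifest, indent=2))  # JSON can be piped to `kubectl apply -f -`
```

The requested `size` should be at least as large as the snapshot's source volume. The `10Gi` default is only the placeholder from the example.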
32
jobs/freeleaps-data-backup/argo-app/application.yaml
Normal file
@@ -0,0 +1,32 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: freeleaps-data-backup
  namespace: freeleaps-devops-system
  labels:
    app: freeleaps-data-backup
    component: backup
    environment: production
spec:
  destination:
    name: ""
    namespace: freeleaps-prod
    server: https://kubernetes.default.svc
  source:
    path: jobs/freeleaps-data-backup/helm-pkg/freeleaps-data-backup
    repoURL: https://freeleaps@dev.azure.com/freeleaps/freeleaps-ops/_git/freeleaps-ops
    targetRevision: HEAD
    helm:
      parameters: []
      valueFiles:
        - values.prod.yaml
  sources: []
  project: freeleaps-data-backup
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
227
jobs/freeleaps-data-backup/backup_script.py
Normal file
@@ -0,0 +1,227 @@
#!/usr/bin/env python3
"""
PVC Backup Script for Freeleaps Production Environment
Creates snapshots for specified PVCs and monitors their status
"""

import os
import sys
import time
import logging
from datetime import datetime, timezone, timedelta
from kubernetes import client, config
from kubernetes.client.rest import ApiException

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.StreamHandler(sys.stdout)
    ]
)
logger = logging.getLogger(__name__)

class PVCBackupManager:
    def __init__(self):
        """Initialize the backup manager with Kubernetes client"""
        try:
            # Load in-cluster config when running in Kubernetes
            config.load_incluster_config()
            logger.info("Loaded in-cluster Kubernetes configuration")
        except config.ConfigException:
            # Fallback to kubeconfig for local development
            try:
                config.load_kube_config()
                logger.info("Loaded kubeconfig for local development")
            except config.ConfigException:
                logger.error("Failed to load Kubernetes configuration")
                sys.exit(1)

        self.api_client = client.ApiClient()
        self.snapshot_api = client.CustomObjectsApi(self.api_client)
        self.core_api = client.CoreV1Api(self.api_client)

        # Backup configuration
        self.namespace = os.getenv("BACKUP_NAMESPACE", "freeleaps-prod")
        self.pvcs_to_backup = [
            "gitea-shared-storage",
            "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0"
        ]
        self.snapshot_class = os.getenv("SNAPSHOT_CLASS", "csi-azuredisk-vsc")
        self.timeout = int(os.getenv("TIMEOUT", "300"))

    def get_pst_date(self):
        """Get current date in PST timezone (UTC-8)"""
        pst_tz = timezone(timedelta(hours=-8))
        return datetime.now(pst_tz).strftime("%Y%m%d")

    def generate_snapshot_name(self, pvc_name, timestamp):
        """Generate snapshot name with timestamp"""
        return f"{pvc_name}-snapshot-{timestamp}"

    def create_snapshot_yaml(self, pvc_name, snapshot_name):
        """Create VolumeSnapshot YAML configuration"""
        snapshot_yaml = {
            "apiVersion": "snapshot.storage.k8s.io/v1",
            "kind": "VolumeSnapshot",
            "metadata": {
                "name": snapshot_name,
                "namespace": self.namespace
            },
            "spec": {
                "volumeSnapshotClassName": self.snapshot_class,
                "source": {
                    "persistentVolumeClaimName": pvc_name
                }
            }
        }
        return snapshot_yaml

    def apply_snapshot(self, snapshot_yaml):
        """Apply snapshot to Kubernetes cluster"""
        try:
            logger.info(f"Creating snapshot: {snapshot_yaml['metadata']['name']}")

            # Create the snapshot
            result = self.snapshot_api.create_namespaced_custom_object(
                group="snapshot.storage.k8s.io",
                version="v1",
                namespace=self.namespace,
                plural="volumesnapshots",
                body=snapshot_yaml
            )

            logger.info(f"Successfully created snapshot: {result['metadata']['name']}")
            return result

        except ApiException as e:
            if e.status == 409:
                # The snapshot already exists (for example, the pod was restarted
                # on the same day); reuse it so the retry can still succeed
                logger.warning(f"Snapshot already exists, reusing: {snapshot_yaml['metadata']['name']}")
                return snapshot_yaml
            logger.error(f"Failed to create snapshot: {e}")
            return None

    def wait_for_snapshot_ready(self, snapshot_name, timeout=None):
        """Wait for snapshot to be ready with timeout"""
        if timeout is None:
            timeout = self.timeout
        logger.info(f"Waiting for snapshot {snapshot_name} to be ready...")

        start_time = time.time()
        while time.time() - start_time < timeout:
            try:
                # Get snapshot status
                snapshot = self.snapshot_api.get_namespaced_custom_object(
                    group="snapshot.storage.k8s.io",
                    version="v1",
                    namespace=self.namespace,
                    plural="volumesnapshots",
                    name=snapshot_name
                )

                # Check if snapshot is ready
                if snapshot.get('status', {}).get('readyToUse', False):
                    logger.info(f"Snapshot {snapshot_name} is ready!")
                    return True

                # Check for error conditions
                error = snapshot.get('status', {}).get('error', {})
                if error:
                    logger.error(f"Snapshot {snapshot_name} failed: {error}")
                    return False

                logger.info(f"Snapshot {snapshot_name} still processing...")
                time.sleep(10)

            except ApiException as e:
                logger.error(f"Error checking snapshot status: {e}")
                return False

        logger.error(f"Timeout waiting for snapshot {snapshot_name} to be ready")
        return False

    def verify_pvc_exists(self, pvc_name):
        """Verify that PVC exists in the namespace"""
        try:
            self.core_api.read_namespaced_persistent_volume_claim(
                name=pvc_name,
                namespace=self.namespace
            )
            logger.info(f"Found PVC: {pvc_name}")
            return True
        except ApiException as e:
            if e.status == 404:
                logger.error(f"PVC {pvc_name} not found in namespace {self.namespace}")
            else:
                logger.error(f"Error checking PVC {pvc_name}: {e}")
            return False

    def run_backup(self):
        """Main backup process"""
        logger.info("Starting PVC backup process...")

        timestamp = self.get_pst_date()
        successful_backups = []
        failed_backups = []

        for pvc_name in self.pvcs_to_backup:
            logger.info(f"Processing PVC: {pvc_name}")

            # Verify PVC exists
            if not self.verify_pvc_exists(pvc_name):
                failed_backups.append(pvc_name)
                continue

            # Generate snapshot name
            snapshot_name = self.generate_snapshot_name(pvc_name, timestamp)

            # Create snapshot YAML
            snapshot_yaml = self.create_snapshot_yaml(pvc_name, snapshot_name)

            # Apply snapshot
            result = self.apply_snapshot(snapshot_yaml)
            if not result:
                failed_backups.append(pvc_name)
                continue

            # Wait for snapshot to be ready
            if self.wait_for_snapshot_ready(snapshot_name):
                successful_backups.append(pvc_name)
                logger.info(f"Backup completed successfully for PVC: {pvc_name}")
            else:
                failed_backups.append(pvc_name)
                logger.error(f"Backup failed for PVC: {pvc_name}")

        # Summary
        logger.info("=== Backup Summary ===")
        logger.info(f"Successful backups: {len(successful_backups)}")
        logger.info(f"Failed backups: {len(failed_backups)}")

        if successful_backups:
            logger.info(f"Successfully backed up: {', '.join(successful_backups)}")

        if failed_backups:
            logger.error(f"Failed to backup: {', '.join(failed_backups)}")
            return False

        logger.info("All backups completed successfully!")
        return True

def main():
    """Main entry point"""
    try:
        backup_manager = PVCBackupManager()
        success = backup_manager.run_backup()

        if success:
            logger.info("Backup job completed successfully")
            sys.exit(0)
        else:
            logger.error("Backup job completed with errors")
            sys.exit(1)

    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()
24
jobs/freeleaps-data-backup/build.sh
Executable file
@@ -0,0 +1,24 @@
#!/bin/bash

# Build script for Freeleaps PVC Backup Docker image

set -e

# Configuration
IMAGE_NAME="freeleaps-pvc-backup"
REGISTRY="freeleaps-registry.azurecr.io"
TAG="${1:-latest}"

echo "Building Freeleaps PVC Backup Docker image..."
echo "Image: ${REGISTRY}/${IMAGE_NAME}:${TAG}"

# Build the Docker image
docker buildx build \
    --platform linux/amd64 \
    -f Dockerfile \
    -t "${REGISTRY}/${IMAGE_NAME}:${TAG}" \
    .

echo "Build completed successfully!"
echo "To push the image, run:"
echo "docker push ${REGISTRY}/${IMAGE_NAME}:${TAG}"
26
jobs/freeleaps-data-backup/deploy-argocd.sh
Executable file
@@ -0,0 +1,26 @@
#!/bin/bash

# Deploy script for Freeleaps PVC Backup to ArgoCD

set -e

echo "Deploying Freeleaps PVC Backup to ArgoCD..."

# Apply ArgoCD Application
echo "Applying ArgoCD Application configuration..."
kubectl apply -f argo-app/application.yaml

echo "ArgoCD Application deployed successfully!"
echo ""
echo "To check the status:"
echo "kubectl get applications -n freeleaps-devops-system"
echo ""
echo "To view ArgoCD UI:"
echo "kubectl port-forward svc/argocd-server -n freeleaps-devops-system 8080:443"
echo ""
echo "To check the CronJob after sync:"
echo "kubectl get cronjobs -n freeleaps-prod"
echo ""
echo "To view job logs:"
echo "kubectl get jobs -n freeleaps-prod"
echo "kubectl logs -n freeleaps-prod job/freeleaps-data-backup-<timestamp>"
@@ -0,0 +1,17 @@
apiVersion: v2
name: freeleaps-data-backup
description: Freeleaps PVC Backup CronJob for production environment
type: application
version: 0.1.0
appVersion: "1.0.0"
keywords:
  - backup
  - pvc
  - snapshot
  - cronjob
home: https://freeleaps.com
sources:
  - https://freeleaps@dev.azure.com/freeleaps/freeleaps-ops/_git/freeleaps-ops
maintainers:
  - name: Freeleaps DevOps Team
    email: devops@freeleaps.com
@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "freeleaps-data-backup.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "freeleaps-data-backup.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "freeleaps-data-backup.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "freeleaps-data-backup.labels" -}}
helm.sh/chart: {{ include "freeleaps-data-backup.chart" . }}
{{ include "freeleaps-data-backup.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "freeleaps-data-backup.selectorLabels" -}}
app.kubernetes.io/name: {{ include "freeleaps-data-backup.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "freeleaps-data-backup.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "freeleaps-data-backup.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
@@ -0,0 +1,54 @@
{{- if .Values.cronjob.enabled -}}
apiVersion: batch/v1
kind: CronJob
metadata:
  name: {{ include "freeleaps-data-backup.fullname" . }}
  namespace: {{ .Values.backup.namespace }}
  labels:
    app: {{ include "freeleaps-data-backup.name" . }}
    component: backup
    {{- include "freeleaps-data-backup.labels" . | nindent 4 }}
  {{- with .Values.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  schedule: {{ .Values.cronjob.schedule | quote }}
  concurrencyPolicy: {{ .Values.cronjob.concurrencyPolicy }}
  successfulJobsHistoryLimit: {{ .Values.cronjob.successfulJobsHistoryLimit }}
  failedJobsHistoryLimit: {{ .Values.cronjob.failedJobsHistoryLimit }}
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: {{ include "freeleaps-data-backup.name" . }}
            component: backup
            {{- include "freeleaps-data-backup.labels" . | nindent 12 }}
        spec:
          {{- if .Values.serviceAccount.enabled }}
          serviceAccountName: {{ .Values.serviceAccount.name }}
          {{- end }}
          restartPolicy: {{ .Values.cronjob.restartPolicy }}
          {{- with .Values.global.imagePullSecrets }}
          imagePullSecrets:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          containers:
          - name: backup
            image: "{{ .Values.global.imageRegistry }}/{{ .Values.image.repository }}:{{ .Values.image.tag }}"
            imagePullPolicy: {{ .Values.image.pullPolicy }}
            env:
            - name: TZ
              value: "UTC"
            - name: BACKUP_NAMESPACE
              value: {{ .Values.backup.namespace | quote }}
            - name: SNAPSHOT_CLASS
              value: {{ .Values.backup.snapshotClass | quote }}
            - name: TIMEOUT
              value: {{ .Values.backup.timeout | quote }}
            resources:
              {{- toYaml .Values.resources | nindent 14 }}
            securityContext:
              {{- toYaml .Values.securityContext | nindent 14 }}
{{- end }}
@@ -0,0 +1,37 @@
{{- if .Values.rbac.enabled -}}
{{- if .Values.rbac.create -}}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ include "freeleaps-data-backup.fullname" . }}-role
  labels:
    app: {{ include "freeleaps-data-backup.name" . }}
    component: backup
    {{- include "freeleaps-data-backup.labels" . | nindent 4 }}
rules:
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots", "volumesnapshotclasses"]
    verbs: ["get", "list", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ include "freeleaps-data-backup.fullname" . }}-rolebinding
  labels:
    app: {{ include "freeleaps-data-backup.name" . }}
    component: backup
    {{- include "freeleaps-data-backup.labels" . | nindent 4 }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: {{ include "freeleaps-data-backup.fullname" . }}-role
subjects:
  - kind: ServiceAccount
    name: {{ .Values.serviceAccount.name }}
    namespace: {{ .Values.backup.namespace }}
{{- end }}
{{- end }}
@@ -0,0 +1,17 @@
{{- if .Values.serviceAccount.enabled -}}
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Values.serviceAccount.name }}
  namespace: {{ .Values.backup.namespace }}
  labels:
    app: {{ include "freeleaps-data-backup.name" . }}
    component: backup
    {{- include "freeleaps-data-backup.labels" . | nindent 4 }}
  {{- with .Values.serviceAccount.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
{{- end }}
{{- end }}
@@ -0,0 +1,40 @@
# Production values for freeleaps-data-backup

# Image settings
image:
  repository: freeleaps-pvc-backup
  tag: "latest"
  pullPolicy: Always

# CronJob settings
cronjob:
  enabled: true
  schedule: "0 8 * * *"  # Daily at 08:00 UTC (00:00 PST, UTC-8)
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3
  restartPolicy: OnFailure

# Resource limits for production
resources:
  requests:
    memory: "256Mi"
    cpu: "200m"
  limits:
    memory: "512Mi"
    cpu: "500m"

# Backup configuration for production
backup:
  namespace: "freeleaps-prod"
  pvcs:
    - "gitea-shared-storage"
    - "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0"
  snapshotClass: "csi-azuredisk-vsc"
  timeout: 600  # 10 minutes for production

# Labels for production
labels:
  environment: "production"
  team: "devops"
  component: "backup"
@@ -0,0 +1,65 @@
# Default values for freeleaps-data-backup
# This is a YAML-formatted file.

# Global settings
global:
  imageRegistry: "freeleaps-registry.azurecr.io"
  imagePullSecrets: []

# Image settings
image:
  repository: freeleaps-pvc-backup
  tag: "latest"
  pullPolicy: Always

# CronJob settings
cronjob:
  enabled: true
  schedule: "0 8 * * *"  # Daily at 08:00 UTC (00:00 PST, UTC-8)
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3
  restartPolicy: OnFailure

# Resource limits
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "200m"

# Security context
securityContext:
  allowPrivilegeEscalation: false
  runAsNonRoot: true
  runAsUser: 1000
  capabilities:
    drop:
      - ALL

# RBAC settings
rbac:
  enabled: true
  create: true

# ServiceAccount settings
serviceAccount:
  enabled: true
  create: true
  name: "freeleaps-backup-sa"
  annotations: {}

# Backup configuration
backup:
  namespace: "freeleaps-prod"
  pvcs:
    - "gitea-shared-storage"
    - "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0"
  snapshotClass: "csi-azuredisk-vsc"
  timeout: 300  # seconds

# Labels and annotations
labels: {}
annotations: {}
2
jobs/freeleaps-data-backup/requirements.txt
Normal file
@@ -0,0 +1,2 @@
kubernetes==28.1.0
PyYAML==6.0.1