Longhorn Distributed Storage¶

Longhorn provides distributed block storage for the Kubernetes cluster, offering persistent storage with replication, snapshots, and backup capabilities.

Overview¶

Purpose: Cloud-native distributed block storage for Kubernetes

Key Features:

Distributed storage across cluster nodes
Automatic data replication for high availability
Volume snapshots and backups
Web-based management interface
CSI driver for seamless Kubernetes integration
Cross-node data distribution

Technical Details:

Namespace: longhorn-system
Chart: longhorn/longhorn
Version: v1.8.1
StorageClass: longhorn
Web UI: https://longhorn.kjho.me

Architecture¶

Core Components¶

Longhorn Manager:

Runs as DaemonSet on all nodes
Manages volume creation, deletion, and operations
Handles replica placement and scheduling
Provides API for volume management

Longhorn Engine:

Per-volume process managing I/O operations
Handles data replication between replicas
Performs snapshot and backup operations
Ensures data consistency and integrity

Longhorn UI:

Web-based management interface
Volume monitoring and management
Node and disk management
Backup and snapshot operations


Storage Architecture¶

Data Distribution:

Volumes distributed across available nodes
Configurable replica count (Longhorn default: 3; the longhorn StorageClass in this cluster sets 2)
Automatic replica placement based on node resources
Support for node affinity and anti-affinity rules

Cluster Nodes:

nuc1: 512GB SSD, 32GB RAM
nuc2: 512GB SSD, 16GB RAM
nuc3: 256GB SSD, 8GB RAM
nucx: 1TB SSD, 4-8GB RAM (dynamic)

Configuration¶

Storage Classes¶

Default Longhorn StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  fromBackup: ""
  fsType: "ext4"
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

Key Parameters:

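numberOfReplicas: Number of replicas kept for each volume (data redundancy level)
staleReplicaTimeout: Minutes Longhorn waits before cleaning up a replica that is in the error state
fromBackup: Backup URL to restore new volumes from (empty creates blank volumes)
fsType: Filesystem created on new volumes
allowVolumeExpansion: Allows a PVC's storage request to be increased after creation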
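
As an illustration of how these parameters combine, the following is a minimal sketch of a second StorageClass for data that warrants more redundancy. The name longhorn-critical and the choice of three replicas with a Retain policy are assumptions for the example, not part of this cluster's configuration.

```yaml
# Hypothetical StorageClass for critical data (name and values are illustrative)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-critical
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"      # one more replica than the default class above
  staleReplicaTimeout: "30"  # minutes
  fsType: "ext4"
reclaimPolicy: Retain        # keep the PersistentVolume when the PVC is deleted
allowVolumeExpansion: true
volumeBindingMode: Immediate
```

A PVC selects this class by setting storageClassName: longhorn-critical.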

Volume Management¶

Persistent Volume Claims:

Applications request storage through PVCs that automatically provision Longhorn volumes:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: app-namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
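
To show how a workload consumes the claim, here is a minimal sketch of a Pod that mounts app-data. The Pod name, image, command, and mount path are placeholders chosen for the example.

```yaml
# Hypothetical Pod mounting the app-data PVC (image and paths are placeholders)
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: app-namespace
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data        # the Longhorn volume appears here
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data       # must match the PVC name above
```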

Volume Features:

Dynamic Provisioning: Automatic volume creation
Volume Expansion: Online volume size increase
Snapshots: Point-in-time volume snapshots
Backups: External backup to S3-compatible storage
Cloning: Create volumes from existing volumes/snapshots
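
Cloning can be requested through the standard Kubernetes dataSource field. The sketch below assumes the app-data PVC from the earlier example exists in the same namespace; the clone name is hypothetical, and the requested size is kept equal to the source.

```yaml
# Hypothetical clone of the app-data PVC (clone name is illustrative)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-clone
  namespace: app-namespace      # must be the same namespace as the source PVC
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  dataSource:
    kind: PersistentVolumeClaim
    name: app-data              # source PVC to copy from
  resources:
    requests:
      storage: 10Gi             # same size as the source
```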

High Availability¶

Replica Management¶

Replica Placement:

Automatic replica distribution across nodes
Anti-affinity rules prevent replicas on same node
Node failure tolerance based on replica count
Automatic replica rebuilding on node recovery

Data Consistency:

Synchronous replication between replicas
Writes go to every healthy replica; reads can be served by any healthy replica
Automatic corrupt replica detection
Consistent snapshot creation across replicas

Failure Scenarios¶

Node Failure:

Remaining replicas continue serving I/O
Automatic replica rebuilding on available nodes
No data loss with sufficient replica count
Application pods reschedule to healthy nodes (pods with ReadWriteOnce volumes may wait until the old volume attachment is released)

Disk Failure:

Affected replicas marked as failed
New replicas created on healthy disks
Data rebuilt from healthy replicas
Automatic cleanup of failed replicas
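
The failure behavior described above is influenced by a few Longhorn settings. The fragment below is a sketch of Helm values for the longhorn/longhorn chart, assuming settings are managed through the chart's defaultSettings block; the values shown are illustrative and may differ from what this cluster's HelmRelease actually sets.

```yaml
# Illustrative values fragment for the longhorn/longhorn chart (not this cluster's actual values)
defaultSettings:
  # Never place two replicas of the same volume on one node
  replicaSoftAntiAffinity: false
  # Redistribute replicas when nodes join or recover
  replicaAutoBalance: best-effort
  # Delete pods stuck on a down node so their controllers can recreate them elsewhere
  nodeDownPodDeletionPolicy: delete-both-statefulset-and-deployment-pod
```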

Monitoring and Management¶

Web Interface¶

Dashboard Features:

Volume status and health monitoring
Node and disk utilization
Backup and snapshot management
Performance metrics and alerts
System event logging
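
Outside the web interface, Longhorn metrics can be scraped by Prometheus. The following is a minimal sketch that assumes the Prometheus Operator (and its ServiceMonitor CRD) is installed in the cluster, which this document does not state; the resource name is hypothetical.

```yaml
# Hypothetical ServiceMonitor; requires the Prometheus Operator CRDs
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: longhorn-servicemonitor
  namespace: longhorn-system
spec:
  selector:
    matchLabels:
      app: longhorn-manager     # label on the Longhorn manager service
  namespaceSelector:
    matchNames:
      - longhorn-system
  endpoints:
    - port: manager             # named port that serves /metrics
```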

Access:

URL: https://longhorn.kjho.me
Authentication: Kubernetes RBAC integration
Authorization: Admin access required

Volume Operations¶

Snapshot Management:

# Create a Longhorn snapshot of a volume
kubectl apply -f - <<EOF
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: volume-snapshot
  namespace: longhorn-system
spec:
  volume: pvc-volume-name   # name of the Longhorn volume
  createSnapshot: true      # ask Longhorn to take the snapshot
EOF

# List Longhorn snapshots
kubectl get snapshots.longhorn.io -n longhorn-system
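
Snapshots can also be requested through the Kubernetes CSI snapshot API (snapshot.storage.k8s.io). The sketch below assumes the external-snapshotter CRDs and controller are installed, which this document does not confirm; the class and snapshot names are hypothetical, and app-data refers to the example PVC from the Volume Management section.

```yaml
# Hypothetical CSI snapshot class for Longhorn in-cluster snapshots
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot-vsc
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: snap                    # "snap" = Longhorn snapshot, "bak" = Longhorn backup
---
# Hypothetical snapshot of the app-data PVC; lives in the PVC's namespace
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: app-data-snap
  namespace: app-namespace
spec:
  volumeSnapshotClassName: longhorn-snapshot-vsc
  source:
    persistentVolumeClaimName: app-data
```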

Backup Operations:

Backup Target: S3-compatible storage (optional)
Automatic Backups: Scheduled backup policies
Cross-Cluster Recovery: Restore volumes in different clusters
Incremental Backups: Efficient backup storage
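
For cross-cluster recovery, one approach is a StorageClass whose fromBackup parameter points at an existing backup, so that new PVCs are provisioned from it. This is a sketch only: the class name is hypothetical, and the bucket, region, volume, and backup names are placeholders to be replaced with values from the backup target.

```yaml
# Hypothetical restore StorageClass; the fromBackup URL contains placeholders
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-restore
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  fsType: "ext4"
  # Every PVC created from this class is restored from this backup
  fromBackup: "s3://bucket-name@region/?volume=<volume-name>&backup=<backup-name>"
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate
```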

Performance Optimization¶

Node Configuration¶

Disk Performance:

Use SSDs for better I/O performance
Ensure sufficient disk space for replicas
Monitor disk usage and health
Consider disk locality for performance

Network Optimization:

High-bandwidth network between nodes
Low-latency connections for replication
Monitor network utilization
Consider dedicated storage network

Volume Configuration¶

Performance Tuning:

Adjust replica count based on requirements
Use appropriate filesystem type
Configure volume scheduling policies
Monitor volume I/O patterns
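
One tuning option is data locality, which asks Longhorn to keep a replica on the node where the workload runs so that reads avoid the network. The following sketch uses a hypothetical class name; whether best-effort locality helps depends on the workload.

```yaml
# Hypothetical StorageClass that prefers a local replica
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-local
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  fsType: "ext4"
  dataLocality: "best-effort"   # try to keep one replica on the pod's node
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate
```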

Troubleshooting¶

Common Issues¶

Volume Mount Failures:

# Check PVC status
kubectl get pvc -A

# Check volume status
kubectl get volumes -n longhorn-system

# Check Longhorn system pods
kubectl get pods -n longhorn-system

# View volume details in UI
# Access https://longhorn.kjho.me

Replica Issues:

# Check replica status
kubectl get replicas -n longhorn-system

# View Longhorn manager logs
kubectl logs -n longhorn-system -l app=longhorn-manager

# Check Longhorn node and disk status
kubectl get nodes.longhorn.io -n longhorn-system -o wide

Performance Problems:

# Check pod CPU and memory usage (kubectl top does not report disk I/O)
kubectl top pods -A

# Check node resources
kubectl top nodes

# View Longhorn system events
kubectl get events -n longhorn-system --sort-by=.metadata.creationTimestamp

Recovery Procedures¶

Volume Recovery:

Identify Problem: Check volume status in UI
Assess Replicas: Verify replica health and count
Rebuild Replicas: Trigger replica rebuilding if needed
Data Verification: Verify data integrity after recovery

Node Recovery:

Node Restart: Restart failed node if possible
Replica Migration: Move replicas to healthy nodes
Disk Replacement: Replace failed disks if necessary
Data Rebuild: Rebuild replicas on new storage

Backup Strategy¶

Backup Configuration¶

S3 Backup Target (Optional):

# Configure in Longhorn settings
backupTarget: "s3://bucket-name@region/"
backupTargetCredentialSecret: "s3-credentials-secret"
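
The credential secret referenced above must exist in the longhorn-system namespace. A minimal sketch follows; the key values are placeholders, and AWS_ENDPOINTS is only needed for S3-compatible services other than AWS (the endpoint URL shown is an assumption).

```yaml
# Hypothetical S3 credential secret; replace the placeholder values
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials-secret
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "<access-key>"
  AWS_SECRET_ACCESS_KEY: "<secret-key>"
  AWS_ENDPOINTS: "https://s3.example.com"   # omit when using AWS S3 itself
```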

Backup Policies:

Automated Snapshots: Scheduled volume snapshots
Retention Policies: Automatic snapshot cleanup
Cross-Cluster Backup: Backup to external storage
Disaster Recovery: Full cluster backup strategy
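
Scheduled snapshots and backups with retention can be expressed as Longhorn RecurringJob resources. The sketch below is illustrative: the job names, schedules, and retention counts are assumptions, and the backup job requires a configured backup target.

```yaml
# Hypothetical hourly snapshot job, keeping the last 24 snapshots
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: hourly-snapshot
  namespace: longhorn-system
spec:
  name: hourly-snapshot
  task: snapshot
  cron: "0 * * * *"
  retain: 24
  concurrency: 2        # volumes processed in parallel
  groups:
    - default           # applies to volumes with no other recurring job assigned
---
# Hypothetical daily backup job, keeping the last 7 backups
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  name: daily-backup
  task: backup
  cron: "0 3 * * *"
  retain: 7
  concurrency: 2
  groups:
    - default
```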

Snapshot Management¶

Snapshot Best Practices:

Regular snapshot schedules for critical volumes
Retention policies to manage storage usage
Test snapshot restoration procedures
Document recovery procedures

Useful Commands¶

# Check Longhorn system status
kubectl get pods -n longhorn-system

# List all volumes
kubectl get volumes -n longhorn-system

# Check storage classes
kubectl get storageclass

# View PVC status across namespaces
kubectl get pvc -A

# Check node ephemeral-storage capacity
kubectl get nodes -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.ephemeral-storage

# Monitor Longhorn manager logs
kubectl logs -n longhorn-system -l app=longhorn-manager -f

# Check Longhorn snapshots
kubectl get snapshots.longhorn.io -n longhorn-system

# View Longhorn node disk usage
kubectl get nodes.longhorn.io -n longhorn-system -o yaml | grep -A 10 "diskStatus"

Best Practices¶

Storage Management¶

Volume Planning:

Size volumes appropriately for application needs
Consider growth patterns and expansion requirements
Use appropriate replica counts for data criticality
Monitor storage usage and capacity

Performance Optimization:

Place I/O intensive workloads on performant nodes
Monitor volume performance metrics
Use appropriate filesystem types
Consider storage locality for performance

Maintenance¶

Regular Tasks:

Monitor node disk health and capacity
Review volume usage and performance
Test backup and recovery procedures
Update Longhorn version regularly
Clean up unused volumes and snapshots

Capacity Planning:

Monitor cluster storage growth
Plan for node storage expansion
Consider data lifecycle policies
Implement storage quotas where appropriate
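
Storage quotas can be scoped to the longhorn StorageClass with a standard Kubernetes ResourceQuota. The sketch below uses a hypothetical quota name and illustrative limits for the app-namespace example namespace.

```yaml
# Hypothetical quota limiting Longhorn-backed storage in one namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: longhorn-storage-quota
  namespace: app-namespace
spec:
  hard:
    # Total capacity requested by PVCs that use the longhorn StorageClass
    longhorn.storageclass.storage.k8s.io/requests.storage: 50Gi
    # Number of PVCs that may use the longhorn StorageClass
    longhorn.storageclass.storage.k8s.io/persistentvolumeclaims: "5"
```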
