
The Cloud-Native Data Manual

Engineering Production Databases on Kubernetes

The debate is over. The real question is: how do you manage stateful workloads without losing data or burning cash?

Executive Summary

The debate over whether databases "belong" on Kubernetes is settled. The discussion is now strictly operational: how do you manage stateful workloads at scale without losing data or overpaying for "zombie" resources? Kubernetes is a world-class orchestrator, but it is fundamentally "data-blind." It understands process status, but it is oblivious to database deadlocks or replica divergence.

This manual outlines the operational mandates for high-performance data layers and explains how the Solanica Platform, powered by OpenEverest, codifies these requirements into a Sovereign DBaaS.

CORE_CHALLENGES
Data Blindness: Kubernetes sees pod status, not database health
Cost Leakage: Zombie resources burn cash 24/7 in dev/test
StatefulSet Limitations: 5-10 minute failover windows are unacceptable
Storage Chaos: Navigating EBS, NVMe, and S3 without a strategy

1. Availability: Beyond the "Running" Pod

In a stateless environment, a Running status is sufficient. In a database environment, a pod can be Running while the engine is stuck replaying a corrupted Write-Ahead Log (WAL) or suffering from extreme replication lag. Relying on Kubernetes' default self-healing—killing and restarting a pod—is not a strategy; it is a gamble with your RTO (Recovery Time Objective).

CRITICAL: ENGINE_NATIVE_FAILOVER

The Engine-Native Failover Mandate

Kubernetes' default controller for stateful apps is the StatefulSet. While it provides stable network identities and persistent storage links, it is fundamentally too slow for high-availability database failover. When a node becomes unreachable, the StatefulSet controller waits for the node-monitor-grace-period (typically 40 seconds) before even marking a pod for deletion. Because of the "At-Most-One-Pod" constraint of StatefulSets, a new pod cannot start until the old one is confirmed dead to prevent two pods from writing to the same PersistentVolume (PV) and corrupting data.

This "Mount Lockdown" often results in downtime exceeding 5-10 minutes.

OpenEverest moves failover intelligence into the database layer itself, bypassing the sluggishness of StatefulSet rescheduling. We utilize engine-native replication (Streaming Replication for PostgreSQL, Raft for MongoDB). When a primary node shows signs of hardware failure or network partitioning, the Solanica Platform orchestrates a promotion of a healthy follower—which already has its own local or network-attached PersistentVolume ready—before Kubernetes detects the pod failure. This shrinks failover windows from minutes to seconds.

By the time the StatefulSet finally decides to reschedule the old primary pod, the application has already been re-routed to a newly promoted leader.
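Database-aware health checks are what make this possible: Kubernetes only acts on what the probes tell it. As a minimal sketch (probe commands and thresholds here are illustrative, not OpenEverest's actual configuration), a PostgreSQL container can surface engine-level health via exec probes:

```yaml
# Illustrative probes for a PostgreSQL container. A production operator
# checks far more (replication lag, WAL replay state), but even pg_isready
# is a stronger signal than "the process is running".
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]  # accepts connections?
  periodSeconds: 5
  failureThreshold: 3
livenessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  initialDelaySeconds: 30   # allow time for WAL replay on startup
  periodSeconds: 10
```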

StatefulSet Failover: 5-10 min
OpenEverest Failover: seconds
Replication Method: Engine-Native
[Diagram: failover timeline — while the failing primary's PV stays locked, a standby replica with its own ready PV is promoted and becomes the new primary by t=25s]
POLICY: ANTI_AFFINITY_MANDATE

Physical Topology: The Anti-Affinity Rule

High availability is a physical problem. Solanica mandates strict podAntiAffinity rules for all production deployments. The primary and its replicas are forbidden from landing on the same physical node.

podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          db: primary
      topologyKey: kubernetes.io/hostname
Node Isolation: Required Policy
[Diagram: physical node distribution — Primary on node-1.cluster, Replica 1 on node-2.cluster, Replica 2 on node-3.cluster; co-location prohibited, no single point of failure]
RESILIENCE: TOPOLOGY_SPREAD

Multi-AZ Distribution

Furthermore, we automate topologySpreadConstraints across Availability Zones (AZs). If an entire AWS or GCP region fails, your quorum remains intact in the remaining zones. We treat infrastructure as volatile; our platform assumes the node will die and ensures the database stays alive.
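In plain Kubernetes terms, such a constraint looks roughly like this (label value illustrative; OpenEverest generates the equivalent automatically):

```yaml
topologySpreadConstraints:
  - maxSkew: 1                                # at most 1 pod imbalance across zones
    topologyKey: topology.kubernetes.io/zone  # spread across Availability Zones
    whenUnsatisfiable: DoNotSchedule          # hard requirement, not best-effort
    labelSelector:
      matchLabels:
        app.kubernetes.io/instance: my-db-cluster  # illustrative cluster label
```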

Zone Redundancy · Quorum Protection · Region Failure Ready
[Diagram: multi-AZ topology spread — Primary in us-east-1a, Replica 1 in us-east-1b, Replica 2 in us-east-1c]

2. Compute: The "Guaranteed" Performance Tier

The most common cause of database "flapping" on Kubernetes is the misuse of CPU and Memory requests/limits.

WARNING: ELIMINATE_BURSTABLE_DBS

Eliminate "Burstable" Databases

In stateless apps, setting low requests and high limits is standard practice for "bin-packing." For a database, this is a fatal error. If a pod's memory limit is higher than its request, the Linux OOM (Out-of-Memory) killer targets the database first under node memory pressure.

BURSTABLE_QOS
resources:
  requests:
    memory: 2Gi
  limits:
    memory: 8Gi

⚠️ OOM Kill Target During Pressure

GUARANTEED_QOS
resources:
  requests:
    memory: 8Gi
  limits:
    memory: 8Gi

✓ Protected by Kernel Priority

Solanica enforces Guaranteed QoS (Quality of Service). We set requests == limits by default. This signals to the Linux kernel that the database is the most critical process on the host.
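Note that Kubernetes only assigns the Guaranteed class when requests equal limits for every resource, CPU included; matching memory alone leaves the pod Burstable. A complete sketch:

```yaml
# For Guaranteed QoS, every container must set requests == limits
# for BOTH cpu and memory.
resources:
  requests:
    cpu: "4"
    memory: 8Gi
  limits:
    cpu: "4"       # cpu must also match, or the pod falls back to Burstable
    memory: 8Gi
```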

[Diagram: requests vs limits on a node with 16Gi memory — Burstable QoS (requests 2Gi, limits 8Gi) can burst but is OOM-killed when the node is under pressure; Guaranteed QoS (requests == limits == 8Gi) cannot burst but is protected from the OOM killer. OOM kill priority: BestEffort → Burstable → Guaranteed]
MANDATE: 20_PERCENT_OVERHEAD

The 20% Overhead Mandate

OpenEverest automatically calculates memory allocations to be ~20% higher than the database's internal cache (e.g., PostgreSQL's shared_buffers). This is not padding; it is required space for sidecar processes—backups, monitoring agents, and the OS page cache—and it prevents random OOM-kills during heavy index scans or vacuum operations.

Memory Calculation Formula
Total Memory = shared_buffers × 1.20
Accounts for: sidecars, OS cache, vacuum ops
Auto-Calculated · OOM Prevention · Sidecar Safe
QOS_CLASSES
Guaranteed requests == limits
Burstable requests < limits
BestEffort no requests/limits
OOM Kill Priority:
BestEffort → Burstable → Guaranteed
BEST_PRACTICE
Always use Guaranteed QoS for databases
Include 20% overhead for sidecars
Monitor kernel OOM events
Never rely on bin-packing

3. Storage: A Tale of Three Tiers

Storage is where 80% of Kubernetes database issues live. It is the primary cost driver and the most likely performance bottleneck. Navigating this requires understanding the hierarchy of safety, convenience, and performance.

Tier 3 - Local NVMe: 500k+ IOPS, <1ms latency (Performance Peak)
Tier 2 - EBS / PD: 10-100k IOPS, ~10ms latency (Standard Tier)
Tier 1 - S3 Storage: 11 nines of durability, 3-2-1 Rule (Foundation Layer)
TIER 1: FOUNDATION

S3 as the Immutable Truth

Before discussing disks, we must discuss survival. In Kubernetes, disks are ephemeral by nature. If a cloud region goes down or a volume is corrupted, your "Persistent" Volume is anything but.

The first mandate of any Solanica deployment is a stream to S3-compatible object storage. Following the 3-2-1 Rule (3 copies, 2 media types, 1 off-site), OpenEverest streams WAL logs and snapshots directly to an external bucket. This is your "Zero-Day" insurance policy; it ensures that even if your entire Kubernetes cluster is wiped from the face of the earth, your data is recoverable in minutes.
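One common open-source building block for this pattern is wal-g, which ships WAL segments to object storage as they are produced. A sketch of the wiring (bucket, path, and credentials are placeholders; OpenEverest configures the equivalent automatically):

```yaml
# Illustrative WAL-archiving setup using the open-source wal-g tool.
env:
  - name: WALG_S3_PREFIX
    value: s3://my-db-backups/cluster-1   # hypothetical bucket/path
  - name: AWS_REGION
    value: us-east-1
# Corresponding postgresql.conf settings:
#   archive_mode    = on
#   archive_command = 'wal-g wal-push %p'   # push each WAL segment to S3
```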

Disaster Recovery: Region-level failure protection
Point-in-Time Recovery: WAL streaming for PITR
Off-Site Backup: 3-2-1 compliance by default
S3 Compatible · GCS Support · Azure Blob
TIER 2: STANDARD

Network-Attached Storage (EBS/PD)

For many workloads, network-attached storage like AWS EBS or GCP Persistent Disk is the correct choice. It is resilient, decoupled from the node's lifecycle, and easy to snapshot.

However, there is a hidden tax. As your data grows and your transaction volume increases, you hit the "IOPS Ceiling." To get the performance your database needs, you are forced to pay for "Provisioned IOPS"—a recurring cloud cost that often exceeds the cost of the compute itself. When your database latency starts creeping up due to network congestion between the node and the storage array, network disks transition from a convenience to a bottleneck.

ADVANTAGES
  • Decoupled from node lifecycle
  • Easy snapshots & backups
  • Resilient by default
  • Simple to manage
TRADE_OFFS
  • Network latency overhead
  • Provisioned IOPS costs
  • IOPS ceiling limitations
  • Congestion bottlenecks
AWS EBS · GCP PD · Azure Disk
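The "Provisioned IOPS" knob lives in the StorageClass. A sketch for the AWS EBS CSI driver (names and values illustrative; every IOPS provisioned here is billed whether used or not):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: db-gp3-high-iops        # illustrative name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "16000"                 # provisioned IOPS, billed continuously
  throughput: "600"             # MiB/s
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```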
TIER 3: PERFORMANCE PEAK

The Local NVMe Mandate

When sub-millisecond latency is a requirement—for high-frequency financial transactions or massive real-time APIs—network storage fails. This is where we move to Local NVMe.

THE_BENEFIT

You bypass the network entirely, achieving 500k+ IOPS and near-zero latency. By using local hardware, you eliminate the "Provisioned IOPS" tax, often cutting your storage bill by 40-60%.

THE_TRADE_OFF

If the physical node dies, the data on that specific NVMe is gone.

SOLANICA_STRATEGY

We don't fear node failure; we build for it. OpenEverest manages this risk at the application layer through synchronous replication across different physical nodes. Losing a node is treated as a simple "re-sync" event from the S3 stream or other peers, not a data loss disaster.

500k+ IOPS
<1ms Latency
40-60% Cost Savings
Local NVMe · Sync Replication · Zero Network Tax
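In plain Kubernetes, local NVMe is typically exposed through the `local` volume type with delayed binding, so the scheduler places the pod on the node where the disk physically lives. A sketch (node name, device path, and resource names are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme                          # illustrative name
provisioner: kubernetes.io/no-provisioner   # local PVs are pre-created, not dynamic
volumeBindingMode: WaitForFirstConsumer     # bind only once the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nvme-node1-disk0                    # illustrative
spec:
  capacity:
    storage: 1Ti
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-nvme
  local:
    path: /mnt/nvme0                        # illustrative mount path on the node
  nodeAffinity:                             # pins the PV to its physical node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["node-1.cluster"]
```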

Storage Decision Matrix

Dev/Test Environments
→ EBS/PD (Standard)
General Production Workloads
→ EBS/PD + S3 Backup
High-IOPS Financial Trading
→ Local NVMe + Sync Replication
Real-Time Analytics/AI
→ Local NVMe + S3 Streams

4. FinOps: Eliminating Zombie Infrastructure

Databases are typically "always-on" expenses. Even in Dev/Test environments, they sit idle, burning expensive compute and IOPS.

AUTOMATION: SCALE_TO_ZERO

Scale to Zero

Solanica Platform and OpenEverest integrate easily with KEDA (Kubernetes Event-driven Autoscaling) to monitor connection idleness. For non-production environments, the platform automatically hibernates the database, scaling replicas to zero when no developers are active. Billing for compute and IOPS stops immediately; the database wakes only when a new connection is attempted.
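With KEDA, this policy can be expressed as a ScaledObject. One plausible shape uses KEDA's postgresql scaler to watch active connections (names, query, and thresholds below are illustrative, not OpenEverest's actual resources):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: dev-db-hibernate          # illustrative
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: dev-db                  # illustrative workload name
  minReplicaCount: 0              # hibernate when idle
  maxReplicaCount: 1
  cooldownPeriod: 1800            # 30 min of idleness before scale-down
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT count(*) FROM pg_stat_activity WHERE state = 'active'"
        targetQueryValue: "1"
        connectionFromEnv: PG_CONN_STR   # illustrative env var with the DSN
```

Waking "on first connection" additionally requires a connection-aware proxy in front of the database, since a scaled-to-zero pod cannot observe the attempt itself.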

Before: Always-On Dev DB, $2,160/month (24/7 compute, provisioned IOPS)
After: KEDA Scale-to-Zero, $432/month (8 hrs/day usage, storage only)
80% Cost Reduction
Auto-hibernate on connection idle
Wake on first connection attempt
Perfect for dev/test/staging
KEDA integration included
KEDA · Event-Driven · Auto-Hibernate
OPTIMIZATION: VPA_RIGHTSIZING

VPA Rightsizing

Instead of guessing RAM requirements, use the Vertical Pod Autoscaler (VPA) in recommendation mode. With Kubernetes' in-place pod resource resize, which lets resources change without restarting the pod, dynamic rightsizing becomes even easier.

1
Monitor
VPA observes actual usage patterns
2
Recommend
Suggests optimal requests/limits
3
Apply
Resources adjusted without restart
In Kubernetes 1.27+ (alpha in 1.27, beta in 1.33): in-place resource resize means databases can be rightsized without pod restarts, eliminating downtime from resource adjustments.
VPA · No Restart · Dynamic Sizing
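Recommendation mode is a one-line setting on the VPA resource itself. A minimal sketch (resource names illustrative; requires the VPA components to be installed in the cluster):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: dev-db-vpa         # illustrative
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: dev-db           # illustrative workload name
  updatePolicy:
    updateMode: "Off"      # recommend only; never evict or mutate pods itself
```

Recommendations then appear in the object's status (`kubectl describe vpa dev-db-vpa`) and can be reviewed before applying.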

Common Zombie Resources

Forgotten Dev Databases
Running 24/7 with zero connections
Solution: KEDA scale-to-zero after 30min idle
Over-Provisioned Resources
16GB RAM allocated, 4GB used
Solution: VPA recommendation + auto-resize
Orphaned Test Instances
Created for feature branch, never deleted
Solution: TTL policies + auto-cleanup
Idle GPU Instances
$2,000/month GPU sitting unused
Solution: GPU slicing + aggressive hibernation
COST_IMPACT_SIMULATION
Typical Enterprise Setup
  • 50 dev/test databases
  • Average: 8GB RAM, 4 vCPU, 100GB storage
  • Cloud provider: AWS
  • Actual usage: 8hrs/day (weekdays only)
Monthly Cost (Always-On) $18,750
Monthly Cost (Solanica FinOps) $4,875
Annual Savings $166,500

5. Security: Zero-Trust Data Access

In a modern cluster, IP-based security is obsolete. Any compromised pod in your cluster can attempt to scan your database port.

[Diagram: zero-trust database access — namespaces production-db (strict RBAC), staging-db (team access), and dev-db (self-service) with namespace isolation; client app connects via mTLS/TLS 1.3 to the primary, SSL replication to the standby replica; NetworkPolicy allows ingress from the backend namespace only; ClusterIP Service (internal only), encrypted by default]
LAYER_1: NAMESPACE_ISOLATION

Namespace Isolation

Kubernetes provides namespace isolation by design. Placing resources in separate namespaces and granting teams segregated access to them ensures that only the people who need your data can reach it, and it keeps resource consumption under control, since namespaces support quota limits.

production-db
Critical databases, strict RBAC
staging-db
Team access, quota-limited
dev-db
Developer self-service
RBAC controls per namespace
Resource quotas prevent overuse
Logical separation by environment
RBAC · Resource Quotas · Multi-Tenant
LAYER_2: NETWORK_ISOLATION

Network Isolation

By default, OpenEverest deploys every database behind a ClusterIP Service, which is not reachable from outside the cluster. Specific annotations and other Service types let you control exposure flexibly and integrate with your preferred network proxy tooling, ensuring the least possible exposure for your sensitive data.

LoadBalancer
External access with firewall rules
USE WITH CAUTION
NodePort
Static port on all nodes
AVOID
networkPolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-only   # illustrative name
spec:
  podSelector:
    matchLabels: { app: postgres }
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels: { tier: backend }
ClusterIP Default · NetworkPolicy · Proxy Integration
LAYER_3: ENCRYPTED_BY_DEFAULT

Encrypted by Default

There are two main encryption layers for databases. Transport security encrypts connections between clients (users and applications) and the database proxies. The second layer encrypts traffic between replicas. Even when databases are deployed inside a single Kubernetes cluster, replication traffic must be encrypted to avoid unnecessary risk.

Client-to-Database Encryption
TLS 1.3 for All Connections
Automatic TLS Certificates: cert-manager integration for auto-rotation
Force SSL Mode: Reject unencrypted connections by default
mTLS Support: Mutual authentication for critical workloads
TLS 1.3 · cert-manager · Auto-Rotation
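With cert-manager installed, an auto-rotating server certificate is a single resource. A sketch (issuer, secret, and DNS names are illustrative):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: postgres-server-tls          # illustrative
spec:
  secretName: postgres-server-tls    # TLS secret mounted into the database pod
  duration: 2160h                    # 90-day certificate lifetime
  renewBefore: 360h                  # rotate 15 days before expiry
  dnsNames:
    - postgres.production-db.svc     # illustrative service DNS name
  issuerRef:
    name: internal-ca-issuer         # illustrative ClusterIssuer
    kind: ClusterIssuer
```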
Replica-to-Replica Encryption
Encrypted Replication Streams
WAL Stream Encryption: PostgreSQL streaming replication over SSL
MongoDB TLS Replication: Encrypted replica set communication
Cross-AZ Protection: Encrypted even within VPC boundaries
SSL Replication · Cross-AZ Safe · Engine-Native

Zero-Trust Security Checklist

Databases in dedicated namespaces
RBAC policies per team/environment
ClusterIP service by default
NetworkPolicies for ingress control
TLS 1.3 for client connections
Encrypted replication traffic
Auto-rotating certificates
Secrets stored in encrypted etcd
SECURITY_MANDATE
Never Rely on IP Whitelisting

Pod IPs change constantly. Use NetworkPolicies with label selectors instead of hardcoded IP ranges.

Treat Internal Traffic as Untrusted

A compromised pod can lateral-move within your cluster. Always encrypt, even for "internal" communication.

Automate Certificate Rotation

Manual certificate management always fails. Use cert-manager to handle lifecycle automatically.

Conclusion: The Sovereign Data Platform

Solanica and OpenEverest provide a "Private DBaaS." You get the automation of AWS RDS or MongoDB Atlas, but you maintain 100% control over your data, your infrastructure, and your costs. We don't just "run" databases; we engineer them for survival.

The Five Operational Pillars

1
Availability

Engine-native failover bypasses Kubernetes' StatefulSet delays, shrinking downtime from minutes to sub-30 seconds. Physical anti-affinity and multi-AZ distribution ensure survival.

Failover Time: <30s
2
Compute

Guaranteed QoS prevents OOM kills. The 20% overhead mandate ensures sidecars and OS processes never starve your database of resources.

QoS Class: Guaranteed
3
Storage

S3 streams provide disaster recovery. Network disks offer convenience. Local NVMe delivers 500k+ IOPS when latency is non-negotiable. We build for node failure.

Max IOPS: 500k+
4
FinOps

KEDA scale-to-zero eliminates zombie dev databases. VPA rightsizing prevents over-provisioning. The result: 60-80% cost reduction in non-production environments.

Cost Savings: 60-80%
5
Security

Namespace isolation, ClusterIP by default, TLS 1.3 for all connections, and encrypted replication. Zero-trust architecture where internal traffic is treated as untrusted.

Encryption: E2E TLS
Data Sovereignty is Non-Negotiable

Cloud vendors lock you into their ecosystems. They control your data gravity, your egress fees, and your pricing power. With Solanica, you operate your own sovereign DBaaS on your infrastructure—whether that's AWS, GCP, Azure, or bare metal in your own data center.

Zero Vendor Lock-In: Run anywhere Kubernetes runs
Full Data Control: Your data never leaves your control plane
Transparent Costs: No surprise bills, no egress taxes
Compliance Ready: Satisfy GDPR, HIPAA, SOC2 on your terms
THE_MANDATE

"Kubernetes was designed to orchestrate processes, not data. OpenEverest bridges that gap by embedding data-aware intelligence into the platform itself. Failovers happen before Kubernetes notices. Storage tiers adapt to workload demands. Costs vanish when no one is looking. Security is pervasive, not bolted on."

The debate is over. Databases belong on Kubernetes. The only question is: Do you want to manage the complexity yourself, or let Solanica do it for you?

$ run --mode=production

Start Your Sovereign Data Journey

Deploy production-grade databases on Kubernetes without the operational nightmare. Solanica Platform powered by OpenEverest gives you cloud-level automation with complete control.

100% Open Source
Failover Time: seconds
Cost Savings: 60-80%
solanica@k8s:~$ █