
The Cloud-Native Data Manual

Engineering Production Databases on Kubernetes

The debate is over. The real question is: how do you manage stateful workloads without losing data or burning cash?

Executive Summary

The debate over whether databases "belong" on Kubernetes is settled. The discussion is now strictly operational: how do you manage stateful workloads at scale without losing data or overpaying for "zombie" resources? Kubernetes is a world-class orchestrator, but it is fundamentally "data-blind." It understands process status, but it is oblivious to database deadlocks or replica divergence.

This manual outlines the operational mandates for high-performance data layers and explains how the Solanica Platform, powered by OpenEverest, codifies these requirements into a Sovereign DBaaS.

CORE_CHALLENGES
Data Blindness: Kubernetes sees pod status, not database health
Cost Leakage: Zombie resources burn cash 24/7 in dev/test
StatefulSet Limitations: 5-10 minute failover windows are unacceptable
Storage Chaos: Navigating EBS, NVMe, and S3 without a strategy

1. Availability: Beyond the "Running" Pod

In a stateless environment, a Running status is sufficient. In a database environment, a pod can be Running while the engine is stuck replaying a corrupted Write-Ahead Log (WAL) or suffering from extreme replication lag. Relying on Kubernetes' default self-healing—killing and restarting a pod—is not a strategy; it is a gamble with your RTO (Recovery Time Objective).

CRITICAL: ENGINE_NATIVE_FAILOVER

The Engine-Native Failover Mandate

Kubernetes' default controller for stateful apps is the StatefulSet. While it provides stable network identities and persistent storage links, it is fundamentally too slow for high-availability database failover. When a node becomes unreachable, the StatefulSet controller waits for the node-monitor-grace-period (typically 40 seconds) before even marking a pod for deletion. Because of the "At-Most-One-Pod" constraint of StatefulSets, a new pod cannot start until the old one is confirmed dead to prevent two pods from writing to the same PersistentVolume (PV) and corrupting data.

This "Mount Lockdown" often results in downtime exceeding 5-10 minutes.

OpenEverest moves failover intelligence into the database layer itself, bypassing the sluggishness of StatefulSet rescheduling. We utilize engine-native replication (Streaming Replication for PostgreSQL, Raft for MongoDB). When a primary node shows signs of hardware failure or network partitioning, the Solanica Platform orchestrates a promotion of a healthy follower—which already has its own local or network-attached PersistentVolume ready—before Kubernetes detects the pod failure. This shrinks failover windows from minutes to seconds.

By the time the StatefulSet finally decides to reschedule the old primary pod, the application has already been re-routed to a newly promoted leader.
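Database-aware health checks are what make this possible: Kubernetes only acts on what the probes tell it. As a minimal sketch (probe commands and thresholds here are illustrative, not OpenEverest's actual configuration), a PostgreSQL container can surface engine-level health via exec probes:

```yaml
# Illustrative probes for a PostgreSQL container. A production operator
# checks far more (replication lag, WAL replay state), but even pg_isready
# is a stronger signal than "the process is running".
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]  # accepts connections?
  periodSeconds: 5
  failureThreshold: 3
livenessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  initialDelaySeconds: 30   # allow time for WAL replay on startup
  periodSeconds: 10
```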

StatefulSet Failover: 5-10 min
OpenEverest Failover: seconds
Replication Method: Engine-Native
[Diagram: failover timeline — while the failing primary's PV stays locked, a standby replica with its own ready PV is promoted and becomes the new primary by t=25s]
POLICY: ANTI_AFFINITY_MANDATE

Physical Topology: The Anti-Affinity Rule

High availability is a physical problem. Solanica mandates strict podAntiAffinity rules for all production deployments. The primary and its replicas are forbidden from landing on the same physical node.

podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          db: primary
      topologyKey: kubernetes.io/hostname
Node Isolation: Required Policy
[Diagram: physical node distribution — Primary on node-1.cluster, Replica 1 on node-2.cluster, Replica 2 on node-3.cluster; co-location prohibited, no single point of failure]
RESILIENCE: TOPOLOGY_SPREAD

Multi-AZ Distribution

Furthermore, we automate topologySpreadConstraints across Availability Zones (AZs). If an entire AWS or GCP region fails, your quorum remains intact in the remaining zones. We treat infrastructure as volatile; our platform assumes the node will die and ensures the database stays alive.
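In plain Kubernetes terms, such a constraint looks roughly like this (label value illustrative; OpenEverest generates the equivalent automatically):

```yaml
topologySpreadConstraints:
  - maxSkew: 1                                # at most 1 pod imbalance across zones
    topologyKey: topology.kubernetes.io/zone  # spread across Availability Zones
    whenUnsatisfiable: DoNotSchedule          # hard requirement, not best-effort
    labelSelector:
      matchLabels:
        app.kubernetes.io/instance: my-db-cluster  # illustrative cluster label
```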

Zone Redundancy · Quorum Protection · Region Failure Ready
[Diagram: multi-AZ topology spread — Primary in us-east-1a, Replica 1 in us-east-1b, Replica 2 in us-east-1c]

2. Compute: The "Guaranteed" Performance Tier

The most common cause of database "flapping" on Kubernetes is the misuse of CPU and Memory requests/limits.

WARNING: ELIMINATE_BURSTABLE_DBS

Eliminate "Burstable" Databases

In stateless apps, setting low requests and high limits is standard practice for "bin-packing." For a database, this is a fatal error. If a pod's memory limit is higher than its request, the Linux OOM (Out-of-Memory) killer targets the database first under node memory pressure.

BURSTABLE_QOS
resources:
  requests:
    memory: 2Gi
  limits:
    memory: 8Gi

⚠️ OOM Kill Target During Pressure

GUARANTEED_QOS
resources:
  requests:
    memory: 8Gi
  limits:
    memory: 8Gi

✓ Protected by Kernel Priority

Solanica enforces Guaranteed QoS (Quality of Service). We set requests == limits by default. This signals to the Linux kernel that the database is the most critical process on the host.
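Note that Kubernetes only assigns the Guaranteed class when requests equal limits for every resource, CPU included; matching memory alone leaves the pod Burstable. A complete sketch:

```yaml
# For Guaranteed QoS, every container must set requests == limits
# for BOTH cpu and memory.
resources:
  requests:
    cpu: "4"
    memory: 8Gi
  limits:
    cpu: "4"       # cpu must also match, or the pod falls back to Burstable
    memory: 8Gi
```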

[Diagram: requests vs limits on a node with 16Gi memory — Burstable QoS (requests 2Gi, limits 8Gi) can burst but is OOM-killed when the node is under pressure; Guaranteed QoS (requests == limits == 8Gi) cannot burst but is protected from the OOM killer. OOM kill priority: BestEffort → Burstable → Guaranteed]
MANDATE: 20_PERCENT_OVERHEAD

The 20% Overhead Mandate

OpenEverest automatically calculates memory allocations to be ~20% higher than the database's internal cache (e.g., PostgreSQL's shared_buffers). This is not padding; it is required space for sidecar processes—backups, monitoring agents, and the OS page cache—and it prevents random OOM-kills during heavy index scans or vacuum operations.

Memory Calculation Formula
Total Memory = shared_buffers × 1.20
Accounts for: sidecars, OS cache, vacuum ops
Auto-Calculated · OOM Prevention · Sidecar Safe
QOS_CLASSES
Guaranteed requests == limits
Burstable requests < limits
BestEffort no requests/limits
OOM Kill Priority:
BestEffort → Burstable → Guaranteed
BEST_PRACTICE
Always use Guaranteed QoS for databases
Include 20% overhead for sidecars
Monitor kernel OOM events
Never rely on bin-packing

3. Storage: A Tale of Three Tiers

Storage is where 80% of Kubernetes database issues live. It is the primary cost driver and the most likely performance bottleneck. Navigating this requires understanding the hierarchy of safety, convenience, and performance.

Tier 3 - Local NVMe: 500k+ IOPS, <1ms latency (Performance Peak)
Tier 2 - EBS / PD: 10-100k IOPS, ~10ms latency (Standard Tier)
Tier 1 - S3 Storage: 11 nines of durability, 3-2-1 Rule (Foundation Layer)
TIER 1: FOUNDATION

S3 as the Immutable Truth

Before discussing disks, we must discuss survival. In Kubernetes, disks are ephemeral by nature. If a cloud region goes down or a volume is corrupted, your "Persistent" Volume is anything but.

The first mandate of any Solanica deployment is a stream to S3-compatible object storage. Following the 3-2-1 Rule (3 copies, 2 media types, 1 off-site), OpenEverest streams WAL logs and snapshots directly to an external bucket. This is your "Zero-Day" insurance policy; it ensures that even if your entire Kubernetes cluster is wiped from the face of the earth, your data is recoverable in minutes.
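One common open-source building block for this pattern is wal-g, which ships WAL segments to object storage as they are produced. A sketch of the wiring (bucket, path, and credentials are placeholders; OpenEverest configures the equivalent automatically):

```yaml
# Illustrative WAL-archiving setup using the open-source wal-g tool.
env:
  - name: WALG_S3_PREFIX
    value: s3://my-db-backups/cluster-1   # hypothetical bucket/path
  - name: AWS_REGION
    value: us-east-1
# Corresponding postgresql.conf settings:
#   archive_mode    = on
#   archive_command = 'wal-g wal-push %p'   # push each WAL segment to S3
```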

Disaster Recovery: Region-level failure protection
Point-in-Time Recovery: WAL streaming for PITR
Off-Site Backup: 3-2-1 compliance by default
S3 Compatible · GCS Support · Azure Blob
TIER 2: STANDARD

Network-Attached Storage (EBS/PD)

For many workloads, network-attached storage like AWS EBS or GCP Persistent Disk is the correct choice. It is resilient, decoupled from the node's lifecycle, and easy to snapshot.

However, there is a hidden tax. As your data grows and your transaction volume increases, you hit the "IOPS Ceiling." To get the performance your database needs, you are forced to pay for "Provisioned IOPS"—a recurring cloud cost that often exceeds the cost of the compute itself. When your database latency starts creeping up due to network congestion between the node and the storage array, network disks transition from a convenience to a bottleneck.

ADVANTAGES
  • Decoupled from node lifecycle
  • Easy snapshots & backups
  • Resilient by default
  • Simple to manage
TRADE_OFFS
  • Network latency overhead
  • Provisioned IOPS costs
  • IOPS ceiling limitations
  • Congestion bottlenecks
AWS EBS · GCP PD · Azure Disk
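The "Provisioned IOPS" knob lives in the StorageClass. A sketch for the AWS EBS CSI driver (names and values illustrative; every IOPS provisioned here is billed whether used or not):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: db-gp3-high-iops        # illustrative name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "16000"                 # provisioned IOPS, billed continuously
  throughput: "600"             # MiB/s
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```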
TIER 3: PERFORMANCE PEAK

The Local NVMe Mandate

When sub-millisecond latency is a requirement—for high-frequency financial transactions or massive real-time APIs—network storage fails. This is where we move to Local NVMe.

THE_BENEFIT

You bypass the network entirely, achieving 500k+ IOPS and near-zero latency. By using local hardware, you eliminate the "Provisioned IOPS" tax, often cutting your storage bill by 40-60%.

THE_TRADE_OFF

If the physical node dies, the data on that specific NVMe is gone.

SOLANICA_STRATEGY

We don't fear node failure; we build for it. OpenEverest manages this risk at the application layer through synchronous replication across different physical nodes. Losing a node is treated as a simple "re-sync" event from the S3 stream or other peers, not a data loss disaster.

500k+ IOPS
<1ms Latency
40-60% Cost Savings
Local NVMe · Sync Replication · Zero Network Tax
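In plain Kubernetes, local NVMe is typically exposed through the `local` volume type with delayed binding, so the scheduler places the pod on the node where the disk physically lives. A sketch (node name, device path, and resource names are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme                          # illustrative name
provisioner: kubernetes.io/no-provisioner   # local PVs are pre-created, not dynamic
volumeBindingMode: WaitForFirstConsumer     # bind only once the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nvme-node1-disk0                    # illustrative
spec:
  capacity:
    storage: 1Ti
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-nvme
  local:
    path: /mnt/nvme0                        # illustrative mount path on the node
  nodeAffinity:                             # pins the PV to its physical node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["node-1.cluster"]
```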

Storage Decision Matrix

Dev/Test Environments
→ EBS/PD (Standard)
General Production Workloads
→ EBS/PD + S3 Backup
High-IOPS Financial Trading
→ Local NVMe + Sync Replication
Real-Time Analytics/AI
→ Local NVMe + S3 Streams

4. FinOps: Eliminating Zombie Infrastructure

Databases are typically "always-on" expenses. Even in Dev/Test environments, they sit idle, burning expensive compute and IOPS.

AUTOMATION: SCALE_TO_ZERO

Scale to Zero

Solanica Platform and OpenEverest integrate easily with KEDA (Kubernetes Event-driven Autoscaling) to monitor connection idleness. For non-production environments, the platform automatically hibernates the database, scaling replicas to zero when no developers are active. Billing for compute and IOPS stops immediately; the database wakes only when a new connection is attempted.
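With KEDA, this policy can be expressed as a ScaledObject. One plausible shape uses KEDA's postgresql scaler to watch active connections (names, query, and thresholds below are illustrative, not OpenEverest's actual resources):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: dev-db-hibernate          # illustrative
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: dev-db                  # illustrative workload name
  minReplicaCount: 0              # hibernate when idle
  maxReplicaCount: 1
  cooldownPeriod: 1800            # 30 min of idleness before scale-down
  triggers:
    - type: postgresql
      metadata:
        query: "SELECT count(*) FROM pg_stat_activity WHERE state = 'active'"
        targetQueryValue: "1"
        connectionFromEnv: PG_CONN_STR   # illustrative env var with the DSN
```

Waking "on first connection" additionally requires a connection-aware proxy in front of the database, since a scaled-to-zero pod cannot observe the attempt itself.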

Before: Always-On Dev DB, $2,160/month (24/7 compute, provisioned IOPS)
After: KEDA Scale-to-Zero, $432/month (8 hrs/day usage, storage only)
80% Cost Reduction
Auto-hibernate on connection idle
Wake on first connection attempt
Perfect for dev/test/staging
KEDA integration included
KEDA · Event-Driven · Auto-Hibernate
OPTIMIZATION: VPA_RIGHTSIZING

VPA Rightsizing

Instead of guessing RAM requirements, use the Vertical Pod Autoscaler (VPA) in recommendation mode. With Kubernetes' in-place pod resource resize, which lets resources change without restarting the pod, dynamic rightsizing becomes even easier.

1
Monitor
VPA observes actual usage patterns
2
Recommend
Suggests optimal requests/limits
3
Apply
Resources adjusted without restart
In Kubernetes 1.27+ (alpha in 1.27, beta in 1.33): in-place resource resize means databases can be rightsized without pod restarts, eliminating downtime from resource adjustments.
VPA · No Restart · Dynamic Sizing
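Recommendation mode is a one-line setting on the VPA resource itself. A minimal sketch (resource names illustrative; requires the VPA components to be installed in the cluster):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: dev-db-vpa         # illustrative
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: dev-db           # illustrative workload name
  updatePolicy:
    updateMode: "Off"      # recommend only; never evict or mutate pods itself
```

Recommendations then appear in the object's status (`kubectl describe vpa dev-db-vpa`) and can be reviewed before applying.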

Common Zombie Resources

Forgotten Dev Databases
Running 24/7 with zero connections
Solution: KEDA scale-to-zero after 30min idle
Over-Provisioned Resources
16GB RAM allocated, 4GB used
Solution: VPA recommendation + auto-resize
Orphaned Test Instances
Created for feature branch, never deleted
Solution: TTL policies + auto-cleanup
Idle GPU Instances
$2,000/month GPU sitting unused
Solution: GPU slicing + aggressive hibernation
COST_IMPACT_SIMULATION
Typical Enterprise Setup
  • 50 dev/test databases
  • Average: 8GB RAM, 4 vCPU, 100GB storage
  • Cloud provider: AWS
  • Actual usage: 8hrs/day (weekdays only)
Monthly Cost (Always-On) $18,750
Monthly Cost (Solanica FinOps) $4,875
Annual Savings $166,500

5. Security: Zero-Trust Data Access

In a modern cluster, IP-based security is obsolete. Any compromised pod in your cluster can attempt to scan your database port.

[Diagram: zero-trust database access — namespaces production-db (strict RBAC), staging-db (team access), and dev-db (self-service) with namespace isolation; client app connects via mTLS/TLS 1.3 to the primary, SSL replication to the standby replica; NetworkPolicy allows ingress from the backend namespace only; ClusterIP Service (internal only), encrypted by default]
LAYER_1: NAMESPACE_ISOLATION

Namespace Isolation

Kubernetes provides namespace isolation by design. Placing resources in separate namespaces and granting teams segregated access to them ensures that only the people who need your data can reach it, and it keeps resource consumption under control, since namespaces support quota limits.

production-db
Critical databases, strict RBAC
staging-db
Team access, quota-limited
dev-db
Developer self-service
RBAC controls per namespace
Resource quotas prevent overuse
Logical separation by environment
RBAC · Resource Quotas · Multi-Tenant
LAYER_2: NETWORK_ISOLATION

Network Isolation

By default, OpenEverest deploys every database behind a ClusterIP Service, which is not reachable from outside the cluster. Specific annotations and other Service types let you control exposure flexibly and integrate with your preferred network proxy tooling, ensuring the least possible exposure for your sensitive data.

LoadBalancer
External access with firewall rules
USE WITH CAUTION
NodePort
Static port on all nodes
AVOID
networkPolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-only   # illustrative name
spec:
  podSelector:
    matchLabels: { app: postgres }
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels: { tier: backend }
ClusterIP Default · NetworkPolicy · Proxy Integration
LAYER_3: ENCRYPTED_BY_DEFAULT

Encrypted by Default

There are two main encryption layers for databases. Transport security encrypts connections between clients (users and applications) and the database proxies. The second layer encrypts traffic between replicas. Even when databases are deployed inside a single Kubernetes cluster, replication traffic must be encrypted to avoid unnecessary risk.

Client-to-Database Encryption
TLS 1.3 for All Connections
Automatic TLS Certificates: cert-manager integration for auto-rotation
Force SSL Mode: Reject unencrypted connections by default
mTLS Support: Mutual authentication for critical workloads
TLS 1.3 · cert-manager · Auto-Rotation
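With cert-manager installed, an auto-rotating server certificate is a single resource. A sketch (issuer, secret, and DNS names are illustrative):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: postgres-server-tls          # illustrative
spec:
  secretName: postgres-server-tls    # TLS secret mounted into the database pod
  duration: 2160h                    # 90-day certificate lifetime
  renewBefore: 360h                  # rotate 15 days before expiry
  dnsNames:
    - postgres.production-db.svc     # illustrative service DNS name
  issuerRef:
    name: internal-ca-issuer         # illustrative ClusterIssuer
    kind: ClusterIssuer
```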
Replica-to-Replica Encryption
Encrypted Replication Streams
WAL Stream Encryption: PostgreSQL streaming replication over SSL
MongoDB TLS Replication: Encrypted replica set communication
Cross-AZ Protection: Encrypted even within VPC boundaries
SSL Replication · Cross-AZ Safe · Engine-Native

Zero-Trust Security Checklist

Databases in dedicated namespaces
RBAC policies per team/environment
ClusterIP service by default
NetworkPolicies for ingress control
TLS 1.3 for client connections
Encrypted replication traffic
Auto-rotating certificates
Secrets stored in encrypted etcd
SECURITY_MANDATE
Never Rely on IP Whitelisting

Pod IPs change constantly. Use NetworkPolicies with label selectors instead of hardcoded IP ranges.

Treat Internal Traffic as Untrusted

A compromised pod can lateral-move within your cluster. Always encrypt, even for "internal" communication.

Automate Certificate Rotation

Manual certificate management always fails. Use cert-manager to handle lifecycle automatically.

Conclusion: The Sovereign Data Platform

Solanica and OpenEverest provide a "Private DBaaS." You get the automation of AWS RDS or MongoDB Atlas, but you maintain 100% control over your data, your infrastructure, and your costs. We don't just "run" databases; we engineer them for survival.

The Five Operational Pillars

1
Availability

Engine-native failover bypasses Kubernetes' StatefulSet delays, shrinking downtime from minutes to sub-30 seconds. Physical anti-affinity and multi-AZ distribution ensure survival.

Failover Time: <30s
2
Compute

Guaranteed QoS prevents OOM kills. The 20% overhead mandate ensures sidecars and OS processes never starve your database of resources.

QoS Class: Guaranteed
3
Storage

S3 streams provide disaster recovery. Network disks offer convenience. Local NVMe delivers 500k+ IOPS when latency is non-negotiable. We build for node failure.

Max IOPS: 500k+
4
FinOps

KEDA scale-to-zero eliminates zombie dev databases. VPA rightsizing prevents over-provisioning. The result: 60-80% cost reduction in non-production environments.

Cost Savings: 60-80%
5
Security

Namespace isolation, ClusterIP by default, TLS 1.3 for all connections, and encrypted replication. Zero-trust architecture where internal traffic is treated as untrusted.

Encryption: E2E TLS
Data Sovereignty is Non-Negotiable

Cloud vendors lock you into their ecosystems. They control your data gravity, your egress fees, and your pricing power. With Solanica, you operate your own sovereign DBaaS on your infrastructure—whether that's AWS, GCP, Azure, or bare metal in your own data center.

Zero Vendor Lock-In: Run anywhere Kubernetes runs
Full Data Control: Your data never leaves your control plane
Transparent Costs: No surprise bills, no egress taxes
Compliance Ready: Satisfy GDPR, HIPAA, SOC2 on your terms
THE_MANDATE

"Kubernetes was designed to orchestrate processes, not data. OpenEverest bridges that gap by embedding data-aware intelligence into the platform itself. Failovers happen before Kubernetes notices. Storage tiers adapt to workload demands. Costs vanish when no one is looking. Security is pervasive, not bolted on."

The debate is over. Databases belong on Kubernetes. The only question is: Do you want to manage the complexity yourself, or let Solanica do it for you?

$ run --mode=production

Start Your Sovereign Data Journey

Deploy production-grade databases on Kubernetes without the operational nightmare. Solanica Platform powered by OpenEverest gives you cloud-level automation with complete control.

100% Open Source
Failover Time: seconds
Cost Savings: 60-80%
solanica@k8s:~$ █