At some point while developing an application, you'll need to pause and reflect on the right approach to store and retrieve persisted data.
As illustrated in the following diagram, before implementing anything you need to understand the trade-offs and ask yourself a few questions:
Choosing the right database will impact the performance of your application, so it's important to understand which types of databases are the most appropriate for your use case.
Your application might be a monolith or a set of microservices; either way, the structure of the persisted data might suit some services better than others. You often have to choose between two main types of databases:
Relational Database (SQL)
Non Relational Database (NoSQL)
Relational databases are the most common databases used by professionals. Their main advantage is providing strong consistency for your application.
Relational databases offer the ACID properties: Atomicity, Consistency, Isolation and Durability.
Another aspect is that relational databases require a schema (a fixed structure defined before storing data), which gives you more control over the type of data your application persists.
If you need strong relationships (queries joining multiple tables, etc.), relational databases are the most appropriate choice.
To summarize, you may choose a relational database (SQL) if your application requires the following:
ACID properties
Structured and consistent data
Transactions as a strong priority
Complex queries across data (joins, etc.)
A fixed schema (the structure of the persisted data) defined in advance
Customers table in a SQL database

| CustomerID | FirstName | LastName | Email |
|------------|-----------|----------|------------------------|
| 101 | Bob | Smith | bob.smith@email.com |
| 102 | Alice | Johnson | alice.j@email.com |
| 103 | John | Doe | john.doe@email.com |

Students table

| StudentID | Name | Age | GPA | GraduationYear |
|-----------|--------------|-----|------|----------------|
| S001 | Daniel Monto | 20 | 3.7 | 2026 |
| S002 | Emma Brown | 22 | 3.9 | 2024 |
| S003 | Liz Brian | 21 | 3.5 | 2025 |
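To make the fixed-schema and join ideas concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a server database like PostgreSQL or MySQL; the table and column names are illustrative:

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        first_name  TEXT NOT NULL,
        last_name   TEXT NOT NULL,
        email       TEXT NOT NULL UNIQUE
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total       REAL NOT NULL
    )
""")
conn.execute("INSERT INTO customers VALUES (101, 'Bob', 'Smith', 'bob.smith@email.com')")
conn.execute("INSERT INTO orders VALUES (1, 101, 49.9)")

# A query joining two tables -- the kind of relational query SQL excels at.
row = conn.execute("""
    SELECT c.first_name, o.total
    FROM customers c JOIN orders o ON o.customer_id = c.customer_id
""").fetchone()
print(row)  # ('Bob', 49.9)
```

The schema rejects data that doesn't fit (wrong column count, duplicate emails), which is exactly the control mentioned above.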
Use-case 1 (Financial industry) : If you develop an application for the financial sector, there is a good chance you'll choose a relational database for its strong consistency, to prevent double spending or incorrect balances. You may also have to perform complex queries across customers, accounts and transactions.
Use-case 2 (Ecommerce platform) : If you develop an ecommerce application, you probably need structured data to store your products, categories, customers, orders and reviews. You may need to perform complex queries between customers and orders, and you need transaction guarantees (strong consistency) to ensure reliable inventory management and checkout.
Use-case 3 (Healthcare systems) : If you develop a healthcare application, since you store sensitive data you probably need strong consistency and integrity. Healthcare is also heavily regulated, so a relational database helps ensure the persisted data follows current compliance requirements and regulations.
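As a sketch of how ACID transactions prevent the double-spending problem mentioned in the financial use case, here is a minimal example with Python's sqlite3; the account data is made up, and a real system would use a server database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance >= 0)  -- no overdrafts allowed
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100.0), (2, 0.0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money atomically: either both updates apply, or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
        return True
    except sqlite3.IntegrityError:  # CHECK constraint rejected an overdraft
        return False

transfer(conn, 1, 2, 60.0)       # succeeds: balances become 40 and 60
ok = transfer(conn, 1, 2, 60.0)  # would overdraw account 1 -> rolled back
print(ok)  # False
```

The failed second transfer leaves both balances untouched, which is the atomicity ("A" in ACID) the financial use case relies on.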
Non-relational databases offer more flexibility. You don't need strong consistency, and you don't need a fixed schema before storing your data. NoSQL databases are more appropriate for use cases where you need to store unstructured or semi-structured data (JSON documents, key-value pairs) and you're aiming for lower latency.
Types of unstructured data | Examples |
---|---|
Text documents | Emails, chat messages, Word/PDF files |
Media files | Images, videos, audio recordings |
Social media posts | Tweets, comments, likes |
Logs & event data | Server logs, clickstreams, IoT device messages |
Web pages | HTML content with mixed text, links, metadata |
JSON documents:

```json
[
  {
    "user_id": 123,
    "name": "Alice",
    "purchases": ["Book", "Laptop", "Headphones"],
    "last_login": "2025-09-14"
  },
  {
    "user_id": 124,
    "name": "Bob",
    "purchases": ["Book", "Laptop", "Headphones"],
    "last_login": "2025-09-14"
  },
  {
    "user_id": 125,
    "name": "John",
    "purchases": ["Book", "Laptop", "Headphones"],
    "last_login": "2025-09-14"
  }
]
```
NoSQL databases are a good choice if you don't want to deal with strict rules to manage your data and transactions.
To summarize, you may choose a non-relational database (NoSQL) if your application requires the following:
Unstructured or semi-structured data, with a flexible schema
You expect a huge volume of data and don't want to worry about horizontal scaling
You need a simple design, with the potential to scale fast
Use-case 1 (Real-time analytics and Big Data) : At some point your application may need to track user behavior and activity, which generates a huge volume of data (logs, clickstreams, ...). NoSQL databases are well suited to this kind of unstructured and semi-structured data, and a non-relational database is also a better option for faster writes and horizontal scaling.
Use-case 2 (Content Management & Product Catalog) : If you develop a content management application, or need to store product catalogs, the content changes often (flexible schema), and storing it as JSON documents is more efficient.
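A minimal sketch of the flexible-schema idea described above, using plain Python dictionaries as stand-ins for JSON documents; the product fields and the `find` helper are illustrative, not a real NoSQL API:

```python
# Two product documents with different fields -- no schema migration was
# needed to add "battery_life_h" or "colors" to the second one.
products = [
    {"sku": "BOOK-1", "title": "Database Basics", "pages": 320},
    {"sku": "HEAD-1", "title": "Headphones", "battery_life_h": 30,
     "colors": ["black", "white"]},
]

def find(collection, **criteria):
    """Tiny document-store-style lookup: match documents on the fields they have."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]

print(find(products, sku="HEAD-1")[0]["battery_life_h"])  # 30
```

Each document carries only the fields that make sense for it, which is what makes this shape convenient for catalogs and CMS content.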
Company | Databases used | Use Case | Estimated Active Users | Sources |
---|---|---|---|---|
Stripe | PostgreSQL + Redis | PostgreSQL for transactional payment data (ACID compliance critical), Redis for caching & real-time fraud detection | ~1.3 million active websites using Stripe globally | https://redstagfulfillment.com/how-many-payments-stripe-process-per-day/ |
Notion | PostgreSQL + Redis | PostgreSQL for document + collaboration metadata storage, Redis for session caching and quick lookups | ~20M monthly active users | https://www.boringbusinessnerd.com/top-startups? |
Canva | PostgreSQL + Redis | PostgreSQL for design assets and user accounts, Redis for fast retrieval of frequently accessed templates | ~100M monthly active users | https://www.boringbusinessnerd.com/top-startups? |
Slack | MySQL, PostgreSQL, Redis | MySQL/Postgres for message history & team data, Redis for ephemeral data (presence, typing indicators) | ~42–47M daily active users | https://www.demandsage.com/slack-statistics/? |
By default your database is in a healthy state: all operations (connections, queries) return a successful response. But you may experience unexpected downtime due to failures.
Losing access to your data can have a significant impact on your business activities, revenue and reputation. It's important to understand the most common root causes of database failures:
Replication Lag & Failover Issues
Schema Migrations Gone Wrong
Misconfigurations (DNS, Networking, Permissions)
Lack of Backups & Disaster Recovery Planning
Planning for these scenarios during the design phase is essential. There are two main metrics to define in order to lower the risks when an incident happens:
Recovery Time Objective (RTO) : how long it takes to restore the database in case of failure.
Recovery Point Objective (RPO) : how much data you can afford to lose (the retention of your backups: minutes, days, months, years).
There are several database recovery strategies:
Database recovery strategies | RTO (Recovery Time Objective) | RPO (Recovery Point Objective) | Cost |
---|---|---|---|
Full Backup (Daily/Weekly) | Hours to days (restore from backup) | Up to last backup (24h+ data loss possible) | $ (cheap) |
Asynchronous Replication (Active-Passive) | Seconds to minutes | Seconds to minutes (small lag possible) | $$ (higher cost) |
Synchronous Replication (Active-Standby) | Seconds to minutes (automatic failover) | Zero (no data loss) | $$$ (very expensive) |
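The trade-off in the table above can be sketched as a small selection function: given RTO/RPO targets, pick the cheapest strategy that meets them. The figures below are illustrative assumptions, not vendor numbers:

```python
# (name, worst-case RTO in hours, worst-case RPO in hours, relative cost)
# Values are rough illustrations of the strategies in the table above.
STRATEGIES = [
    ("Full backup (daily)", 24.0, 24.0, 1),
    ("Async replication",    0.1,  0.01, 2),
    ("Sync replication",     0.1,  0.0,  3),
]

def cheapest_strategy(rto_target_h, rpo_target_h):
    """Pick the cheapest strategy whose worst-case RTO and RPO meet the targets."""
    candidates = [s for s in STRATEGIES
                  if s[1] <= rto_target_h and s[2] <= rpo_target_h]
    return min(candidates, key=lambda s: s[3], default=None)

print(cheapest_strategy(4.0, 2.0)[0])  # Async replication (meets tier-2 targets)
```

A zero-RPO target rules out everything except synchronous replication, which is why "no data loss" is the most expensive row in the table.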
Here are the recommended RTO and RPO per industry:
Type of workload | Recommended RTO (Recovery Time Objective) | Recommended RPO (Recovery Point Objective) | Sources |
---|---|---|---|
Mission-critical (tier-1) applications | ≤ 15 minutes | Near-zero | https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-of-on-premises-applications-to-aws/recovery-objectives.html |
Important but not mission-critical (tier-2) applications | ≤ 4 hours | ≤ 2 hours | https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-of-on-premises-applications-to-aws/recovery-objectives.html |
Less critical applications | 8-24 hours | 4 hours | https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-of-on-premises-applications-to-aws/recovery-objectives.html |
The database stores all kinds of data, so you probably want to limit who can access it, especially if you store sensitive and critical information. Understanding who should have access is a first step: is it a service account or a human user?
To restrict access to the database, it's good practice to limit network access, authorizing only requests from specific subnets (usually the backend subnet), and to use security groups to allow incoming traffic on the database's TCP port only from restricted sources (IP addresses, groups of servers, etc.).
Once the user or service account is authenticated, it's important to limit the actions allowed on the database (CREATE DATABASE, CREATE TABLE, UPDATE, ...). Best practice is to assign least privilege (roles that can perform only restricted operations on the database). Credentials can be stored in a secrets store (secrets manager, key vault, ...), with password rotation to enhance security.
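A sketch of the credentials advice above: read the password from an injected secret instead of hardcoding it. Here an environment variable stands in for a secrets-manager lookup, and the variable name is made up:

```python
import os

def get_db_password(secret_name):
    """Stand-in for a secrets-manager lookup (Vault, AWS Secrets Manager, ...).

    The credential never appears in source code or in config committed to git;
    the platform injects it at deploy time.
    """
    password = os.environ.get(secret_name)
    if password is None:
        raise RuntimeError(f"secret {secret_name!r} not provisioned")
    return password

# Simulate the platform injecting the secret (illustrative name and value).
os.environ["APP_DB_PASSWORD"] = "example-only"
print(get_db_password("APP_DB_PASSWORD"))  # example-only
```

Pairing this with rotation means a leaked value ages out quickly, since the application always fetches the current secret rather than a baked-in one.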
It's good practice to secure the traffic between your applications and the database. By enabling SSL mode on your database, you lower the risk of attackers intercepting or tampering with the traffic sent to the database.
Also, if you need to follow regulations such as GDPR, HIPAA, PCI DSS, SOX or HDS, encrypting data in transit is required.
GDPR (General Data Protection Regulation) : A regulation by the EU that governs how organizations collect, process, store, and share personal data of individuals in the EU
HIPAA (Health Insurance Portability and Accountability Act) : sets standards for how healthcare organizations, insurers, and their business partners handle Protected Health Information (PHI) — any data that can identify a patient and relates to their health, treatment, or payment for care
PCI DSS (Payment Card Industry Data Security Standard) : a global security standard created to protect cardholder data (credit/debit card information) and reduce payment fraud. Applies to any organization that stores, processes, or transmits payment card data.
HDS (Hébergement de Données de Santé) : a French regulation requiring that healthcare-related personal data (patient records, medical history, test results, etc.) be stored and processed only by certified Health Data Hosts
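On the transport-encryption side, here is a sketch of a client TLS context using Python's standard ssl module; the intent is the same as PostgreSQL's `sslmode=verify-full`: encrypt the traffic and verify the server's identity:

```python
import ssl

# create_default_context() already enables hostname checking and certificate
# validation (CERT_REQUIRED), which is what you want when talking to a database.
context = ssl.create_default_context()

# Refuse legacy protocol versions; TLS 1.2 is a common compliance baseline.
context.minimum_version = ssl.TLSVersion.TLSv1_2
```

A database driver would then use this context (or its own equivalent settings) when opening the socket, so a misconfigured or impersonated server is rejected before any query is sent.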
Enabling SSL mode on your database adds CPU overhead for encryption and decryption. One option to limit the extra load on your CPU is to use connection pooling.
A connection pool is a cache of open database connections that your application can reuse instead of creating a new one each time, reducing latency and CPU load.
Running a benchmark to measure whether your database performs better with or without connection pooling is recommended.
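A minimal sketch of a connection pool, using Python's queue module and sqlite3 as stand-ins; production systems would typically use a driver's built-in pool or a proxy like PgBouncer:

```python
import queue
import sqlite3

class ConnectionPool:
    """Minimal pool: open N connections once, hand them out and take them back."""

    def __init__(self, dsn, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(sqlite3.connect(dsn, check_same_thread=False))

    def acquire(self):
        # Blocks if every connection is currently in use.
        return self._pool.get()

    def release(self, conn):
        # Return the connection for reuse instead of closing it,
        # avoiding a fresh TCP + TLS handshake on the next query.
        self._pool.put(conn)

pool = ConnectionPool(":memory:", size=2)
conn = pool.acquire()
print(conn.execute("SELECT 1").fetchone()[0])  # 1
pool.release(conn)
```

Because the TLS handshake happens only when a pooled connection is first opened, the per-query CPU cost of SSL drops sharply compared with connect-per-request.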
You've just secured the traffic sent to your database, but what about the data stored in it? It's good practice to encrypt the data at rest, at the disk level, using a key management system. Backups should also be encrypted to prevent unauthorized access to the data.
You may have to handle more load on your database (traffic, queries, ...), and to prevent performance degradation you have a few options to scale it. There are usually two scaling strategies:
Vertical Scaling
Horizontal Scaling
Vertical scaling is the first option if you want to handle more traffic; it's easier to implement, since you just add more resources (more CPU, memory, or faster disk and bandwidth).
Horizontal scaling is a bit more complex to implement. You add replica nodes that handle only read requests: all write queries go to the primary database, and read queries go to the replica nodes to distribute the traffic.
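The read/write split described above can be sketched as a tiny router; the host names are made up, and in practice the routing is usually done by a proxy or the database driver:

```python
import itertools

class ReadWriteRouter:
    """Send writes to the primary; round-robin reads across replicas (a sketch)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        # Naive classification for illustration: SELECTs are reads,
        # everything else (INSERT/UPDATE/DELETE/DDL) goes to the primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary

router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM users"))       # db-replica-1
print(router.route("UPDATE users SET name = 1")) # db-primary
```

One caveat worth noting: replicas can lag slightly behind the primary, so a read issued right after a write may not see it unless the application routes that read to the primary.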
There are several other options you can implement to handle more traffic:
Load balancers or database proxies : help redistribute the traffic (connection pooling)
Autoscaling : cloud-managed databases offer options to scale your database automatically
Sharding and partitioning : for heavy workloads and traffic, you may at some point need sharding and partitioning, where the goal is to split your database into multiple smaller "datasets". It's usually quite complex to implement.
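A sketch of hash-based sharding: a stable hash of the key decides which shard owns the row, so the same key always routes to the same place. The shard names are illustrative:

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key):
    """Map a key deterministically to one shard via a stable hash.

    sha256 is used (rather than Python's built-in hash()) because its output
    is identical across processes and restarts.
    """
    digest = hashlib.sha256(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# All of user 123's rows live on a single shard, so queries scoped
# to one user stay local and never touch the other shards.
assert shard_for("user:123") == shard_for("user:123")
```

Part of what makes sharding complex in practice is hinted at here: adding a shard changes `len(SHARDS)` and remaps most keys, which is why real systems often use consistent hashing or directory-based schemes instead of a plain modulo.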
Let's assume your application is running smoothly and everything works just fine. Users have access to every feature and can perform all operations. Then suddenly you start getting emails from a frustrated user, furious about their experience: some features didn't work, or there is huge latency when performing some actions. And it doesn't end there; more and more users start sending emails about their bad experiences.
You're shocked, as you hadn't noticed any sign of performance degradation; your application seemed to work just fine. By investigating, you then realize that there are a lot of warnings and errors in the logs.
This experience shows the importance of monitoring your infrastructure's activity, to anticipate issues and prevent your users/customers from having bad experiences.
It's crucial to understand which metrics can help you understand what's happening in your database, and to get alerts as soon as certain thresholds are met.
Monitoring Objectives | Description | Core metrics to collect | Alerts and thresholds |
---|---|---|---|
Availability & health | Is the database up and reachable? | Examples of metrics on PostgreSQL: pg_up, pg_connections, pg_stat_activity_count | Example with PostgreSQL: pg_up == 0 |
Performance | Is there any latency to reach the database? Are the queries slow? | Total queries/sec and transactions/sec; 95th/99th percentile query latencies; slow queries count | connections >= 90% of max_connections (connections near max); query running > 5–15 min (long-running query); slow_queries / total_queries > 1% over 5m (slow query rate) |
Resources & capacity | Is the database server overloaded? What is the status of resource consumption (CPU, memory, disk usage, disk latency, network I/O)? | CPU; memory; disk usage; disk latency | CPU > 85% for 5m; memory > 85% for 5m; disk usage > 80% |
Errors & faults | What are the errors in the logs? These should indicate failed queries and connection errors | Error logs | Error rate (failed queries / total queries) |
Security and audits | Logins (successful & failed); privilege changes; suspicious exports | Failed logins; new user creation; privilege escalations; unusual export activity | Failed login attempts spike (possible brute force) |
Replication and backups | Replication lag; backup success; WAL/redo lag | Last successful backup time; backup duration; restore test results | last_backup_time older than expected retention window; lag > 10s (or > 60s depending on the app) |
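The thresholds above can be sketched as a small evaluation loop; the metric names and limits here are illustrative and not tied to a specific monitoring tool:

```python
# Alert rules mirroring the thresholds in the table above (illustrative values).
THRESHOLDS = {
    "cpu_percent":       lambda v: v > 85,
    "disk_used_percent": lambda v: v > 80,
    "connections_ratio": lambda v: v >= 0.90,  # connections / max_connections
    "replication_lag_s": lambda v: v > 10,
}

def evaluate(metrics):
    """Return the names of the collected metrics that crossed their threshold."""
    return [name for name, breached in THRESHOLDS.items()
            if name in metrics and breached(metrics[name])]

print(evaluate({"cpu_percent": 91, "disk_used_percent": 40, "replication_lag_s": 42}))
# ['cpu_percent', 'replication_lag_s']
```

In a real setup these rules would live in the monitoring system (e.g. as alerting rules) rather than in application code, but the logic is the same: compare each collected metric against a threshold and page someone when it breaches.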
Database breaches and incidents happen all the time; 100% protection does not exist. Prestigious companies and firms have experienced security breaches despite tight security controls. It's important to learn from their experiences to understand how you can prevent these scenarios, or at least lower the risk and severity if you experience a breach.
The goal is not to implement the latest security tools and assume you're out of reach; it's better to implement progressive and continuous protection, re-assess your security posture continuously, and improve the protection of the weakest link in your infrastructure.
Here is a summary of recent database breaches grouped by root cause:
Misconfiguration (Open Databases, Wrong ACLs,..)
Credentials (weak passwords, reused credentials, no password rotation, no MFA)
Zero-days / Supply chain attacks
Provider/Partner attacks
Root cause (ranked by frequency) | Example of breach | Year | Impact (records/people) | Why it happened | Sources |
---|---|---|---|---|---|
1. Misconfiguration (Unprotected DBs / Poor Access Controls) | Chinese Surveillance DB | 2025 | ~4 billion records | Database left exposed, no password protection | More details |
1. Misconfiguration (Unprotected DBs / Poor Access Controls) | Microsoft/Apple/Google/PayPal DB leak | 2025 | 184 million | Elasticsearch DB exposed without auth | More details |
1. Misconfiguration (Unprotected DBs / Poor Access Controls) | Texas General Land Office | 2025 | 44,485 people | Access control misconfig allowed cross-user data view | More details |
2. Credential Compromise / Weak Authentication | Snowflake | 2024 | Dozens of orgs: AT&T, Ticketmaster, Santander, etc. | Credential stuffing, reused passwords, weak protection | More details |
2. Credential Compromise / Weak Authentication | 23andMe | 2023-25 | ~5.5 million accounts | Weak MFA/credential hygiene; compromised access | More details |
2. Credential Compromise / Weak Authentication | Kering / Gucci–Balenciaga | 2025 | Millions of customers | Hacker group "Shiny Hunters" stole customer DB creds | More details |
3. Software Vulnerabilities / Supply Chain Exploits | MOVEit Transfer zero-day | 2023 | ~93–100 million people, 2,700+ customer orgs | Zero-day exploited in widely used file transfer system | More details |
3. Software Vulnerabilities / Supply Chain Exploits | Latitude Financial | 2023 | ~14 million records | Exploit in provider's systems; details not fully disclosed | More details |
4. Direct Cyberattacks / Service Provider Compromise | Qantas Airways | 2025 | ~5.7 million customers | Hackers gained access to customer DB | More details |
4. Direct Cyberattacks / Service Provider Compromise | Miljodata | 2025 | ~1.5 million people | IT provider cyberattack leaked sensitive records | More details |