Databases

The Problem

At some point, if you're developing an application, you'll need to pause and reflect on the right approach to store and retrieve the data you persist.

Before starting to implement anything, you need to understand the trade-offs and ask yourself the right questions.


Choosing the right database will impact the performance of your application, so it's important to understand which types of databases are the most appropriate for your use case.


Which type of database for my use case ?

Your application might be a monolith or microservices; either way, the way data is structured and persisted might suit some services better than others. You often have to choose between two main types of databases :

Relational Database (SQL)

Non Relational Database (NoSQL)


Relational Database (SQL)

Relational databases are the most common type of database used by professionals. Their main advantage is providing strong consistency for your application.

Relational databases offer the ACID properties: Atomicity, Consistency, Isolation, and Durability.

Another aspect is that relational databases require a schema (a fixed structure defined before storing data), which gives you more control over the type of data your application persists.

If you need strong relationships (queries that join multiple tables, etc.), relational databases are the most appropriate choice.

To summarize, if your application requires the following, you may choose a relational database (SQL):

ACID properties

Data needs to be structured and consistent

Transactions are a strong priority

You have to perform complex queries across tables (joins, etc.)

You can define a fixed schema in advance (the structure of the persisted data)

            Customers table in a SQL database
            
            | CustomerID | FirstName | LastName | Email                  |
            |------------|-----------|----------|------------------------|
            | 101        | Bob       | Smith    | bob.smith@email.com    |
            | 102        | Alice     | Johnson  | alice.j@email.com      |
            | 103        | John      | Doe      | john.doe@email.com     |
            
            Students table
            
            | StudentID | Name         | Age | GPA  | GraduationYear |
            |-----------|--------------|-----|------|----------------|
            | S001      | Daniel Monto | 20  | 3.7  | 2026           |
            | S002      | Emma Brown   | 22  | 3.9  | 2024           |
            | S003      | Liz Brian    | 21  | 3.5  | 2025           |
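
To make the points above concrete, here is a minimal sketch using Python's built-in sqlite3 module (the schema and values are illustrative, not from a specific application). It defines a fixed schema up front, wraps related writes in one atomic transaction, and joins two tables in a single query:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce relational integrity

    # Fixed schema defined before any data is stored
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, email TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customers(customer_id),
                             amount REAL);
    """)

    # Atomicity: both inserts commit together, or both roll back if an error occurs
    with conn:
        conn.execute("INSERT INTO customers VALUES (101, 'Bob', 'Smith', 'bob.smith@email.com')")
        conn.execute("INSERT INTO orders VALUES (1, 101, 42.5)")

    # Complex query: join customers with their orders
    rows = conn.execute("""
        SELECT c.first_name, c.last_name, o.amount
        FROM customers c JOIN orders o ON o.customer_id = c.customer_id
    """).fetchall()
    print(rows)  # [('Bob', 'Smith', 42.5)]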
              

Use-case 1 (Financial industry) : If you develop an application for the financial sector, there is a good chance you'll want a relational database for strong consistency, to prevent double spending or incorrect balances. You may also have to perform complex queries across customers, accounts, and transactions.

Use-case 2 (E-commerce platform) : If you develop an e-commerce application, you probably need structured data to store your products, categories, customers, orders, and reviews. You may need to perform complex queries between customers and orders, and you need transaction guarantees (strong consistency) to ensure reliable inventory management and checkout.

Use-case 3 (Healthcare systems) : If you develop a healthcare application, since you store sensitive data you probably need strong consistency and integrity. Also, healthcare industries are heavily regulated, so a relational database helps ensure the persisted data meets compliance and regulatory requirements.


Non-Relational Database (NoSQL)

Non-relational databases offer more flexibility. You don't need strong consistency, and you don't need a fixed schema before storing your data. NoSQL databases are more appropriate for use cases where you need to store unstructured or semi-structured data (JSON documents, key-value pairs) and where you are aiming for lower latency.

| Types of unstructured data | Examples |
|----------------------------|----------|
| Text documents | Emails, chat messages, Word/PDF files |
| Media files | Images, videos, audio recordings |
| Social media posts | Tweets, comments, likes |
| Logs & event data | Server logs, clickstreams, IoT device messages |
| Web pages | HTML content with mixed text, links, metadata |
        JSON Document
      [
        {
          "user_id": 123,
          "name": "Alice",
          "purchases": ["Book", "Laptop", "Headphones"],
          "last_login": "2025-09-14"
        },
        {
          "user_id": 124,
          "name": "Bob",
          "purchases": ["Book", "Laptop", "Headphones"],
          "last_login": "2025-09-14"
        },
        {
          "user_id": 125,
          "name": "John",
          "purchases": ["Book", "Laptop", "Headphones"],
          "last_login": "2025-09-14"
        }
      ]
            

NoSQL databases are a good choice if you don't want to deal with strict rules to manage your data and transactions.

To summarize, if your application requires the following, you may choose a non-relational database (NoSQL):

Your data is unstructured or semi-structured, and its structure is flexible

You expect a huge volume of data, and you don't want to worry about horizontal scaling

You need a simple design, with the potential to scale fast.
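
As a rough sketch of that flexibility, the snippet below stores documents with different shapes in the same collection. It assumes a MongoDB instance reachable on localhost and the pymongo driver; the database and collection names are illustrative:

    from pymongo import MongoClient

    # Assumes a local MongoDB instance; no schema has to be declared beforehand
    client = MongoClient("mongodb://localhost:27017")
    users = client["shop"]["users"]

    # Documents in the same collection can have different fields
    users.insert_many([
        {"user_id": 123, "name": "Alice", "purchases": ["Book", "Laptop"]},
        {"user_id": 124, "name": "Bob", "last_login": "2025-09-14", "premium": True},
    ])

    # Query by any field, even one that only some documents contain
    for doc in users.find({"premium": True}):
        print(doc["name"])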


Use-case 1 (Real-time analytics and Big Data) : As your application grows, at some point you may need to understand more about user behavior and activity. This creates a huge volume of data (logs, clickstreams, ...), and NoSQL databases are the best fit for handling this type of unstructured and semi-structured data. Choosing a non-relational database is also a better option for faster writes and horizontal scaling.

Use-case 2 (Content Management & Product Catalog) : You develop a content management application, or you need to store product catalogs. The content changes often (flexible schema), so using JSON documents to store this type of data is more efficient.


Real world companies - use cases

| Company | Database used | Use Case | Estimated active users | Sources |
|---------|---------------|----------|------------------------|---------|
| Stripe | PostgreSQL + Redis | PostgreSQL for transactional payment data (ACID compliance critical), Redis for caching & real-time fraud detection | ~1.3 million active websites using Stripe globally | https://redstagfulfillment.com/how-many-payments-stripe-process-per-day/ |
| Notion | PostgreSQL + Redis | PostgreSQL for document + collaboration metadata storage, Redis for session caching and quick lookups | ~20M monthly active users | https://www.boringbusinessnerd.com/top-startups? |
| Canva | PostgreSQL + Redis | PostgreSQL for design assets and user accounts, Redis for fast retrieval of frequently accessed templates | ~100M monthly active users | https://www.boringbusinessnerd.com/top-startups? |
| Slack | MySQL, PostgreSQL, Redis | MySQL/Postgres for message history & team data, Redis for ephemeral data (presence, typing indicators) | ~42–47M daily active users | https://www.demandsage.com/slack-statistics/? |

How can I prevent failures ?

Your database is by default in a healthy state: all operations (connections, queries) return a successful response. But you may experience unexpected downtime due to failures.


Unavailable access to your data can have a significant impact on your business activities, revenue, and reputation. It's important to understand the most common root causes of database failures:

Replication Lag & Failover Issues

Schema Migrations Gone Wrong

Misconfigurations (DNS, Networking, Permissions)

Lack of Backups & Disaster Recovery Planning

Planning for these scenarios during the design phase is essential. There are two main metrics to define in order to lower the risk when an incident happens :

Recovery Time Objective (RTO) : How long does it take to restore the database in case of failure ?

Recovery Point Objective (RPO) : How much data can you afford to lose (how far back your last backup or recovery point is: minutes, hours, days) ?
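
As a minimal illustration (the two-hour target below is an assumption, not a recommendation), this sketch compares the age of the last backup with a target RPO and flags when the objective is no longer met:

    from datetime import datetime, timedelta, timezone

    RPO = timedelta(hours=2)  # assumed target: lose at most 2 hours of data

    def rpo_at_risk(last_backup_at, now=None):
        """Return True if the time since the last recovery point exceeds the RPO."""
        now = now or datetime.now(timezone.utc)
        return now - last_backup_at > RPO

    # Example: the last backup finished 3 hours ago, so the RPO is already breached
    print(rpo_at_risk(datetime.now(timezone.utc) - timedelta(hours=3)))  # True

The same kind of check applies to RTO: measure how long a restore actually takes during disaster-recovery drills and compare it with the objective.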

There are several database recovery strategies :

| Database recovery strategy | RTO (Recovery Time Objective) | RPO (Recovery Point Objective) | Cost |
|----------------------------|-------------------------------|--------------------------------|------|
| Full Backup (Daily/Weekly) | Hours to days (restore from backup) | Up to last backup (24h+ data loss possible) | $ (Cheap) |
| Asynchronous Replication (Active-Passive) | Seconds to minutes | Seconds to minutes (small lag possible) | $$ (higher cost) |
| Synchronous Replication (Active-Standby) | Seconds to minutes (automatic failover) | Zero (no data loss) | $$$ (Very expensive) |

Here are the recommended RTO and RPO per industry :

| Type of workload | Recommended RTO (Recovery Time Objective) | Recommended RPO (Recovery Point Objective) | Sources |
|------------------|-------------------------------------------|--------------------------------------------|---------|
| Mission-critical (tier-1) applications | ≤ 15 minutes | “near-zero” | https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-of-on-premises-applications-to-aws/recovery-objectives.html |
| Important but not mission-critical (tier-2) applications | ≤ 4 hours | ≤ 2 hours | https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-of-on-premises-applications-to-aws/recovery-objectives.html |
| Less critical applications | 8–24 hours | 4 hours | https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-of-on-premises-applications-to-aws/recovery-objectives.html |

How can I limit access to the database ?

The database stores all kinds of data, so you probably want to limit who has access to it, especially if you store sensitive and critical information. Understanding who should have access is a first step: is it a service account or a human user?


To harden access to the database, it's good practice to limit network access, authorizing only requests from specific subnets (usually the backend subnet), and to use security groups to authorize incoming traffic on the database's TCP port only from restricted sources (IP addresses, groups of servers, etc.).


Once the user or service account is authenticated, it's important to limit the actions allowed on the database (CREATE DATABASE, CREATE TABLE, UPDATE, ...). Best practices include assigning least privileges (roles that can only perform restricted operations on the database). Credentials can be stored in a secrets store (secrets manager, key vault, ...), with password rotation to enhance security.
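
A possible sketch of this pattern, assuming a PostgreSQL database accessed with psycopg2 and credentials kept in AWS Secrets Manager via boto3; the secret name, host, and role are hypothetical:

    import json

    import boto3
    import psycopg2

    # Fetch credentials from a secrets store instead of hardcoding them in the application
    secret = boto3.client("secretsmanager").get_secret_value(SecretId="prod/orders-db/readonly")  # hypothetical secret name
    creds = json.loads(secret["SecretString"])

    # Connect as a role that only has the privileges it needs
    # (e.g. the DBA granted it SELECT on a few tables, nothing more)
    conn = psycopg2.connect(
        host=creds["host"],
        dbname=creds["dbname"],
        user=creds["username"],      # e.g. a read-only service account
        password=creds["password"],
        sslmode="require",
    )

With rotation enabled on the secret, the application picks up the new password on its next connection instead of relying on a value baked into its configuration.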


Do I need to secure the data in transit ?

It's a good practice to secure the traffic between your applications and the database. By enabling SSL mode in your database, you lower the risk of attackers intercepting or tampering with the traffic sent to the database.

Also, if you need to follow regulations such as GDPR, HIPAA, PCI DSS, SOX, or HDS, enabling encryption to secure the data in transit is required.


GDPR (General Data Protection Regulation) : An EU regulation that governs how organizations collect, process, store, and share the personal data of individuals in the EU.

HIPAA (Health Insurance Portability and Accountability Act) : Sets standards for how healthcare organizations, insurers, and their business partners handle Protected Health Information (PHI), i.e. any data that can identify a patient and relates to their health, treatment, or payment for care.

PCI DSS (Payment Card Industry Data Security Standard) : A global security standard created to protect cardholder data (credit/debit card information) and reduce payment fraud. It applies to any organization that stores, processes, or transmits payment card data.

HDS (Hébergement de Données de Santé) : A French regulation that requires healthcare-related personal data (patient records, medical history, test results, etc.) to be stored and processed only by certified Health Data Hosts.


Enabling SSL mode on your database does add CPU overhead for encryption and decryption. One option to limit the extra load on your CPU is to use connection pooling.

Connection pooling is a cache of open database connections that your application can reuse instead of creating a new one each time, reducing latency and CPU load.

Running a benchmark to measure whether your database performs better with or without connection pooling is recommended.
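
For example, a minimal pool of SSL connections, assuming PostgreSQL and the psycopg2 driver (host and credentials are placeholders):

    from psycopg2 import pool

    # A small pool of TLS connections, opened once and reused by the application
    db_pool = pool.SimpleConnectionPool(
        minconn=1,
        maxconn=10,
        host="db.internal",        # placeholder
        dbname="app",
        user="app_user",
        password="change-me",
        sslmode="require",         # encrypt data in transit
    )

    conn = db_pool.getconn()       # reuse an existing connection, no new TLS handshake
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
    finally:
        db_pool.putconn(conn)      # return the connection to the pool instead of closing it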


How can I secure the database at rest ?

You've just secured the traffic sent to your database. But what about the data stored in it? It's a good practice to encrypt the data at rest, at the disk level, using a key management system. Backups should also be encrypted to prevent unauthorized access to the data.


What happens if I need to scale the database ?

You may have to handle more load on your database (traffic, queries, ...), and to prevent performance degradation you have several options to scale it. Usually there are two scaling strategies :

Vertical Scaling

Horizontal Scaling


Vertical scaling is the first option if you want to handle more traffic. It's easier to implement: you just add more resources (more CPU, memory, or a faster disk and more bandwidth).


Horizontal scaling is a bit more complex to implement. You add replica nodes that handle only read-only requests. All write queries go to the primary DB, and read queries go to the replica nodes to distribute the traffic.
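
A simplified sketch of this read/write split, assuming PostgreSQL with psycopg2 and placeholder hostnames for the primary and a read replica:

    import psycopg2

    # One connection to the primary (writes) and one to a read replica (reads)
    primary = psycopg2.connect(host="db-primary.internal", dbname="app", user="app_user", password="change-me")
    replica = psycopg2.connect(host="db-replica-1.internal", dbname="app", user="app_user", password="change-me")

    def run(sql, params=(), write=False):
        """Route write queries to the primary and read queries to the replica."""
        conn = primary if write else replica
        with conn, conn.cursor() as cur:   # 'with conn' commits or rolls back the transaction
            cur.execute(sql, params)
            return cur.fetchall() if cur.description else None

    run("INSERT INTO orders (customer_id, amount) VALUES (%s, %s)", (101, 42.5), write=True)
    rows = run("SELECT customer_id, amount FROM orders")

Keep in mind that replicas can lag slightly behind the primary, so reads that must immediately see their own writes should still go to the primary.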


There are several other options you can implement to handle more traffic :

Load balancers or database proxies : help you redistribute the traffic (connection pooling)

Autoscaling : cloud-managed databases offer options to scale your database automatically

Sharding and partitioning : for heavy workloads and traffic, you may at some point need sharding and partitioning. The goal is to split your database into multiple smaller "datasets". It's usually quite complex to implement (see the sketch below).
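
A rough sketch of the core idea behind sharding: a stable hash of the shard key decides which smaller database a row lives in (the shard names and key choice are illustrative):

    import hashlib

    SHARDS = ["orders_shard_0", "orders_shard_1", "orders_shard_2", "orders_shard_3"]

    def shard_for(customer_id):
        """Map a shard key to one of the smaller databases, consistently across calls."""
        digest = hashlib.sha256(str(customer_id).encode()).hexdigest()
        return SHARDS[int(digest, 16) % len(SHARDS)]

    print(shard_for("customer-101"))  # the same customer always lands on the same shard

The hard part is everything around this function: queries that span shards, rebalancing when shards are added or removed, and keeping transactions within a single shard.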


How can I monitor the activity on the database ?

Let's assume your application is running smoothly and everything works just fine. Users have access to every feature of your application and can perform all operations. Then suddenly, you start getting emails from one frustrated user, furious about their experience: some features don't seem to work, or there is huge latency when performing some actions. And it doesn't end there: more and more users start sending emails about their bad experiences.

You're shocked, as you didn't notice any sign of performance degradation: your application seemed to work just fine. By investigating, you then realize that there are a lot of warnings and errors in the logs.

This experience shows the importance of monitoring the activity of your infrastructure, to anticipate issues and prevent your users/customers from having bad experiences.

It's crucial to understand which metrics can help you understand what's happening on your database, and whether you should receive alerts as soon as certain thresholds are met.

| Monitoring objective | Description | Core metrics to collect | Alerts and thresholds |
|----------------------|-------------|--------------------------|-----------------------|
| Availability & health | Is the database up and reachable? | Example metrics on PostgreSQL: pg_up, pg_connections, pg_stat_activity_count | Example with PostgreSQL: pg_up == 0 |
| Performance | Is there any latency to reach the database? Are the queries slow? | Total queries/sec and transactions/sec; 95th/99th percentile query latencies; slow queries count | connections >= 90% of max_connections (connections near max); query running > 5–15 min (long-running query); slow_queries / total_queries > 1% over 5m (slow query rate) |
| Resources & capacity | Is the database server capacity overloaded? What is the status of resource consumption (CPU, memory, disk usage, disk latency, network I/O)? | CPU; memory; disk usage, disk latency | CPU > 85% for 5m; memory > 85% for 5m; disk usage > 80% |
| Errors & faults | What are the errors in the logs? This should indicate failed queries and connection errors | Error logs; error rate (failed queries / total queries) | |
| Security and audits | Logins (successful & failed); privilege changes; suspicious exports | Failed logins; new user creations; privilege escalations; unusual export activity | Spike in failed login attempts (possible brute force) |
| Replication and backups | Replication lag; backup success; WAL/redo lag | Last successful backup time; backup duration; restore test results | last_backup_time older than expected retention window; lag > 10s (or > 60s depending on the app) |
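
As an example of the "connections near max" alert from the table above, here is a small check against PostgreSQL's pg_stat_activity, assuming psycopg2 and placeholder connection settings:

    import psycopg2

    conn = psycopg2.connect(host="db-primary.internal", dbname="app", user="monitor", password="change-me")

    with conn.cursor() as cur:
        cur.execute("SHOW max_connections")
        max_connections = int(cur.fetchone()[0])
        cur.execute("SELECT count(*) FROM pg_stat_activity")
        current = cur.fetchone()[0]

    usage = current / max_connections
    if usage >= 0.90:  # threshold used in the table above
        print(f"ALERT: {current}/{max_connections} connections in use ({usage:.0%})")

In practice these metrics are usually exposed by an exporter (for example postgres_exporter) and evaluated by a monitoring system, rather than by ad-hoc scripts.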


Learning from real world experiences

Database breaches and incidents happen all the time; 100% protection does not exist. Prestigious companies and firms have experienced security breaches despite having implemented tight security controls. It's important to learn from their experience to understand how you can prevent these scenarios, or at least lower the risk or the impact if you experience a breach.

The goal is not to implement the latest security tools and assume you're out of reach. It's better to implement progressive and continuous protection, re-assess your security posture continuously, and improve the protection of the weakest links in your infrastructure.

Here is a summary of recent database breaches, grouped by root cause :

Misconfiguration (Open Databases, Wrong ACLs,..)

Credentials (weak passwords, reused credentials, no password rotation, no MFA)

Zero-days /Supply Chain attacks

Provider/Partner attacks


| Root cause (ranked by frequency) | Example of breach | Year | Impact (records/people) | Why it happened | Sources |
|----------------------------------|-------------------|------|-------------------------|-----------------|---------|
| 1. Misconfiguration (Unprotected DBs / Poor Access Controls) | Chinese Surveillance DB | 2025 | ~4 billion | Database left exposed, no password protection | More details |
| 1. Misconfiguration (Unprotected DBs / Poor Access Controls) | Microsoft/Apple/Google/PayPal DB leak | 2025 | 184 million | Elasticsearch DB exposed without auth | More details |
| 1. Misconfiguration (Unprotected DBs / Poor Access Controls) | Texas General Land Office | 2025 | 44,485 people | Access control misconfig allowed cross-user data view | More details |
| 2. Credential Compromise / Weak Authentication | Snowflake | 2024 | ~5.5 million accounts | Credential stuffing, reused passwords, weak protection | More details |
| 2. Credential Compromise / Weak Authentication | 23andMe | 2023-25 | Dozens of orgs; AT&T, Ticketmaster, Santander, etc. | Weak MFA/credential hygiene; compromised access | More details |
| 2. Credential Compromise / Weak Authentication | Kering / Gucci–Balenciaga | 2025 | Millions of customers | Hacker group “Shiny Hunters” stole customer DB creds | More details |
| 3. Software Vulnerabilities / Supply Chain Exploits | MOVEit Transfer Zero-day | 2023 | ~93–100 million people, 2700+ customer orgs | Zero-day exploited in widely used file transfer system | More details |
| 3. Software Vulnerabilities / Supply Chain Exploits | Latitude Financial | 2023 | ~14 million records | Exploit in provider’s systems; details not fully disclosed | More details |
| 4. Direct Cyberattacks / Service Provider Compromise | Qantas Airways | 2025 | ~5.7 million customers | Hackers gained access to customer DB | More details |
| 4. Direct Cyberattacks / Service Provider Compromise | Miljodata | 2025 | ~1.5 million people | IT provider cyberattack leaked sensitive records | More details |