
Objectives:

Design a reservation system

Problem

Bob is passionate about events. He often uses major online platforms and SaaS applications that let him choose between different types of events (festivals, concerts, comedy shows, experiences, etc.). He is pretty satisfied with the user experience of these platforms.

Bob has been using these platforms for years, but he is looking for one that offers more local experiences: a platform that understands local cultures and offers new experiences focused on authentic local vibes. Bob would also like new ways to purchase tickets, such as paying in cryptocurrency.

A summary of what Bob is looking for:

A local online platform

A platform focused on the essentials and on authenticity

A platform offering a better representation of local cultures

Cryptocurrency payment

A company culture with local roots

Our objective now is to define an online platform that offers these services and options to Bob.


Minimum Viable Product

We understand Bob's problem; now we need to turn it into a minimum viable product (MVP). Recent analysis suggests that Bob is not the only one with this problem, so the MVP should help us test our solution and find a possible market:

How many people/customers are willing to pay for our product?

Who exactly are they?

Where exactly are they?

How do we reach them?

Our challenge will be to define a minimum viable solution and see whether it covers what Bob is looking for.


Functional and non-functional requirements

We need to define the features that will make up our product (functional requirements). At the same time, we need to make sure that Bob has a good user experience (non-functional requirements): fast responses when looking for events, low latency when making a reservation, and good service availability.

Functional requirements

Here is the minimal list of what users should be able to do with the product:

Users should be able to view all events

Users should be able to search for events

Users should be able to buy a ticket

Users should be able to view the order confirmation

Users should be able to receive an email with the PDF tickets attached

Users should be able to download the PDF tickets from a link in their email and from their user profile

We also need an admin role:

Admin should be able to create an event

Admin should be able to edit an event

Admin should be able to delete an event

Admin should be able to view basic analytics

Non-functional requirements

Even though the product is a minimum viable product, the user experience should still be good:

Availability: The application should meet industry-standard availability.

Latency: Users should experience the lowest possible latency when performing actions (searching for events, buying a ticket).

Scalability: As we are just starting, we don't need to focus on scalability, but the design can anticipate it.

Reliability: The product must be reliable, so we should prevent failures or at least have minimal recovery options (recovering efficiently from crashes, failures, and security incidents).

Consistency: When a user buys a ticket, we need to make sure that the order is safely saved in the database before displaying the confirmation.


High Level Design

Here is a brief high-level design:

[Figure: high-level design diagram]

Workflow

(1) => After DNS resolution, the user request is sent to the CDN. The CDN caches static files, such as images stored in blob storage, which makes the response to the user faster.

(2) => The application server hosts the backend (the routes and the logic of the application). The list of all events is sent back to the user, who then selects the event they want to buy a ticket for.

(3) => Once the event is selected, the user places an order, which creates a payment event with the external payment provider.

(4) => Once the payment is confirmed, the order is saved in the database.

(5) => So that the user does not have to wait, the order event, with the payment confirmation and the user details, is pushed to a message queue.

(6) => A serverless function is triggered once an event is pushed to the message queue.

(7) => The serverless function generates the ticket (PDF format) and uploads it to blob storage.

(8) => The serverless function then sends a confirmation email to the user.
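Steps (3) to (5) can be sketched as follows. This is a minimal illustration, not the actual implementation: `FakePaymentProvider`, `place_order`, the dict used as a database, and the in-process `queue.Queue` are all hypothetical stand-ins for the external payment provider, the relational database, and the message queue.

```python
import queue
import uuid
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    event_id: str
    user_email: str
    status: str


class FakePaymentProvider:
    """Hypothetical stand-in for the external payment provider."""

    def charge(self, user_email: str, amount_cents: int) -> bool:
        # A real provider would be called over the network;
        # here every positive amount is treated as a confirmed payment.
        return amount_cents > 0


def place_order(event_id, user_email, amount_cents, payments, database, order_queue):
    """Steps 3-5: pay, persist the order, then enqueue the background work."""
    # Step 3: the payment is delegated to the provider.
    if not payments.charge(user_email, amount_cents):
        raise ValueError("payment was not confirmed")

    order = Order(str(uuid.uuid4()), event_id, user_email, "confirmed")
    # Step 4: the order is saved before anything is shown to the user.
    database[order.order_id] = order
    # Step 5: ticket generation and email are deferred to the queue consumer.
    order_queue.put({"order_id": order.order_id, "user_email": user_email})
    return order
```

The function returns as soon as the order is stored and the message is queued, which is what lets the confirmation page appear without waiting for the PDF or the email.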


API Design

User actions

| User action | API routes | Login required | Notes |
|---|---|---|---|
| Users should be able to view all events | GET /api/events, GET / | No | |
| Users should be able to search for events | POST /api/search?q=query | No | |
| Users should be able to select an event | GET /api/events/:id/buy | Yes | |
| Users should be able to buy a ticket | POST /api/events/:id/buy | Yes | After the payment confirmation, the order is saved in the database and the order event is pushed to the message queue. |
| Users should be able to view the order confirmation | GET /api/orders/:id/confirmation | Yes | After the payment confirmation, the user is redirected directly to the confirmation page without waiting for the background tasks (PDF ticket pushed to blob storage and email sent with the attachment). |
| Users should be able to receive an email with the PDF tickets attached | GET /api/orders/:id/confirmation | Yes | Users have the option to download the PDF ticket from the confirmation page. |
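To make the login requirement concrete, here is a small sketch of how the user routes could be declared and checked. The routes and flags are copied from the user actions above; `USER_ROUTES`, `_pattern_to_regex`, and `requires_login` are hypothetical names, and a real web framework would normally provide its own routing and authentication hooks.

```python
import re

# (method, path pattern, login required), copied from the user actions above.
USER_ROUTES = [
    ("GET", "/api/events", False),
    ("GET", "/", False),
    ("POST", "/api/search", False),
    ("GET", "/api/events/:id/buy", True),
    ("POST", "/api/events/:id/buy", True),
    ("GET", "/api/orders/:id/confirmation", True),
]


def _pattern_to_regex(pattern: str) -> re.Pattern:
    # Turn ":id" style placeholders into a named group matching one path segment.
    parts = [
        f"(?P<{part[1:]}>[^/]+)" if part.startswith(":") else re.escape(part)
        for part in pattern.split("/")
    ]
    return re.compile("^" + "/".join(parts) + "$")


def requires_login(method: str, path: str):
    """Return True/False for a known route, or None when nothing matches."""
    path = path.split("?", 1)[0]  # the query string plays no role in matching
    for route_method, pattern, login in USER_ROUTES:
        if route_method == method and _pattern_to_regex(pattern).match(path):
            return login
    return None
```

A request handler could call `requires_login("POST", "/api/events/42/buy")` before dispatching and redirect anonymous users to the login page when it returns `True`.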


Admin actions

| Admin action | API routes | Login required | Notes |
|---|---|---|---|
| Admin should be able to create an event | POST /api/admin/events/create | Yes | |
| Admin should be able to edit an event | POST /api/admin/events/edit/:id | Yes | |
| Admin should be able to delete an event | POST /api/admin/events/delete/:id | Yes | |
| Admin should be able to view analytics (number of purchased tickets per event, total revenue, ...) | GET /api/admin/analytics | Yes | |


Design choices, constraints and tradeoffs

The architecture decisions will have a significant impact on our product, so it is important to understand the role of each component in our architecture. We first need to gather some product and business metrics to guide us through the constraints and tradeoffs.

Context:

Based on recent statistics and market research, around 12,000 event tickets are sold per year. On a peak day, we can estimate the number of users at around 250.
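A quick back-of-the-envelope calculation shows how small this load is. The two inputs come from the figures above; the number of requests per user and the length of the busy window are assumptions made only for this illustration.

```python
TICKETS_PER_YEAR = 12_000  # from the market research above
PEAK_DAY_USERS = 250       # from the market research above

# Assumptions, not taken from the research: how many requests one user sends
# during a visit, and how concentrated the peak-day traffic is.
REQUESTS_PER_USER = 20
PEAK_WINDOW_SECONDS = 2 * 60 * 60  # all peak traffic squeezed into two hours

avg_tickets_per_day = TICKETS_PER_YEAR / 365
peak_requests = PEAK_DAY_USERS * REQUESTS_PER_USER
peak_rps = peak_requests / PEAK_WINDOW_SECONDS

print(f"average tickets per day: {avg_tickets_per_day:.1f}")
print(f"peak-day requests: {peak_requests}")
print(f"peak requests per second: {peak_rps:.2f}")
```

Under these assumptions the platform sells roughly 33 tickets per day on average and serves less than one request per second at peak, which supports starting with a single, modest application server.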

Constraints and tradeoffs:

Our MVP is an assumption based on Bob's problem. There are no real constraints at this early stage; we just need to keep an eye on implementation costs and resist the temptation to over-engineer the solution. If our assumptions turn out to be true, we should have basic foundations in place that let the architecture evolve with the product and the number of users.

Architecture decisions :

These estimates (market analysis and statistics) are optimistic. We should treat these metrics as our objectives, but start with a small, flexible architecture that can scale easily if our assumptions become reality.

Delivering only the core product features will help us keep the architecture minimal and reduce the cost of our experiment.

Here are the key components of our first-stage architecture to validate our MVP:

CDN (Content Delivery Network):

In the first stage, we expect to host hundreds of events on our platform. Each event is represented as a card with a picture. When users load the web pages, it is important that they have a smooth, fast experience. The pictures will therefore be saved in blob storage (a bucket), and the CDN will use the bucket as its origin (a cache miss goes to the bucket). The images will be cached and served faster from multiple geographic locations.

Future constraints: If the number of hosted events grows exponentially, we may have to deal with the increasing cost of storing large images in the bucket. We need to keep this in mind and put in place a process that resizes large images into smaller ones and generates lightweight image formats.
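The resizing step mentioned above mostly comes down to choosing target dimensions that preserve the aspect ratio. The helper below is a sketch of that calculation only; `thumbnail_size` and the 640-pixel default are hypothetical, and the actual pixel work would be done by an image library or an image service.

```python
def thumbnail_size(width: int, height: int, max_side: int = 640) -> tuple[int, int]:
    """Scale (width, height) down so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are left alone, never upscaled
    scale = max_side / longest
    # Keep at least one pixel per side for extremely thin images.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4000 x 3000 upload would be stored as a 640 x 480 card image, while a 300 x 200 picture would be kept at its original size.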

Load Balancer:

To avoid exposing the application server publicly, and to prepare our infrastructure for scaling, we put a load balancer in front of it. Ideally, the load balancer is set as the origin in the CDN, and we can tighten security in two ways: restricting access with a prefix list (only the CloudFront IPs can send requests to the load balancer) and adding a custom header. The load balancer rejects requests that do not include the custom header.
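The custom header check can be pictured as follows. In practice the load balancer would enforce it through its own rule configuration; this function only illustrates the logic. The header name `x-origin-verify` and the secret value are hypothetical.

```python
import hmac

# Hypothetical header name and secret. A real secret would come from a
# secrets manager and be rotated, not hard-coded.
ORIGIN_HEADER = "x-origin-verify"
ORIGIN_SECRET = "replace-me"


def is_from_cdn(headers: dict) -> bool:
    """Accept a request only if it carries the shared secret header set by the CDN."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    supplied = normalised.get(ORIGIN_HEADER, "")
    # compare_digest avoids leaking the secret through timing differences.
    return hmac.compare_digest(supplied.encode(), ORIGIN_SECRET.encode())
```

Requests that reach the load balancer directly, without passing through the CDN, will not carry the header and are rejected.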

Future constraints: If the number of users on the platform increases, we need to keep track of the costs associated with data transfer volume. We may also need more resiliency, balancing traffic across multiple availability zones and even regions (which costs more).

Application Server (Backend):

The application server will run the logic of our application and host the API routes. At this stage, and for simplicity, we can use an EC2 instance (medium instance type) with a persistent disk, or we can containerize the application and run it on a managed container service.

Future constraints: As the number of users increases, vertical scaling will hit its limits. We may need to think about splitting the monolith into microservices to benefit from horizontal scaling.

External payment gateway:

At this stage we don't need to process the payment ourselves. We can offload the payment process to an external service provider. This should speed up the payment process, and we won't need to deal with the full PCI-DSS compliance.

Database:

We will store the events, the orders, the users, and so on in a relational database (for consistency). At this stage we aim for simplicity: we don't need replication yet, but we can perform regular backups so that we can restore our data after an incident.

Future constraints: As the number of users increases, so does the number of queries to the database. To reduce the load on the database, we may need to add a caching layer and, for very large workloads, implement sharding. To improve reliability and resiliency, we may also need read replicas, with replication across multiple availability zones.
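The caching layer mentioned above usually follows the cache-aside pattern: read from the cache first, and query the database only on a miss or an expired entry. The class below is a minimal in-process sketch of that pattern; `TTLCache` and `get_or_load` are hypothetical names, and a production setup would more likely use a dedicated cache service.

```python
import time


class TTLCache:
    """Tiny in-process cache-aside helper; a stand-in for a real cache service."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        value = loader(key)  # cache miss or expired entry: query the database
        self._entries[key] = (now + self._ttl, value)
        return value
```

A short time-to-live suits the event list well: it changes rarely, and a slightly stale listing is acceptable, whereas orders should always be read from the database.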

Queue System:

During the reservation/order process, we don't want users to wait once the order is confirmed (saved in the database). The order event is sent to a queue system, and the user gets a confirmation page right away. If many requests arrive at once, or if the application server crashes, the orders already purchased remain safely in the queue until the serverless function processes them.

Future constraints: As the number of users grows, the volume of messages sent to the SQS queue will increase. We may need to tune the queue system, monitor the processed events, and implement mechanisms that prevent messages from remaining in the queue too long (for example, a dead-letter queue).
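The dead-letter idea can be illustrated with a few lines of in-memory code: a message that keeps failing is retried a limited number of times and then set aside instead of blocking the queue. `drain` and `max_receives` are hypothetical names for this sketch; a managed queue service would provide the equivalent behaviour through configuration.

```python
from collections import deque


def drain(messages, handler, max_receives=3):
    """Process messages, retrying failures and parking poison messages in a DLQ."""
    main_queue = deque((message, 0) for message in messages)
    dead_letters = []
    while main_queue:
        message, receives = main_queue.popleft()
        try:
            handler(message)
        except Exception:
            receives += 1
            if receives >= max_receives:
                dead_letters.append(message)  # give up and keep it for inspection
            else:
                main_queue.append((message, receives))  # retry after the others
    return dead_letters
```

Messages that end up in the dead-letter list can then be inspected and replayed once the underlying problem is fixed, so a single bad order does not delay everyone else's tickets.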

Serverless function:

The serverless function is triggered when new events are pushed to the queue system. It processes the event, generates a PDF ticket, pushes the ticket to blob storage (a bucket), and sends an email to the user through an email service.
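A sketch of such a function is shown below. It assumes the trigger delivers a batch shaped like `{"Records": [{"body": "<json>"}]}`, which is how queue-triggered functions commonly receive messages; the `render_pdf`, `upload`, and `send_email` callables and the `tickets/<order_id>.pdf` key layout are hypothetical and are injected so the example stays self-contained.

```python
import json


def handle_queue_event(event, render_pdf, upload, send_email):
    """Process a batch of order messages: build the PDF, store it, email the user."""
    processed = []
    for record in event["Records"]:
        order = json.loads(record["body"])
        pdf_bytes = render_pdf(order)
        key = f"tickets/{order['order_id']}.pdf"  # hypothetical object key layout
        upload(key, pdf_bytes)                    # push the ticket to blob storage
        send_email(order["user_email"], key)      # email the user about the ticket
        processed.append(order["order_id"])
    return processed
```

Passing the storage and email clients in as parameters keeps the handler easy to test locally with simple stand-ins before wiring it to real services.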


Feedback & Improvements

Product/Market Fit

Your product roadmap and journey are a set of assumptions and risks. Over-building or over-engineering at an early stage is a high risk: it can create an imbalance between the value you are trying to create for your users and the resources needed to sustain that value creation.

Product/market fit is the point in your roadmap where your product finds a suitable market, where users are willing to use your product because they see real value in it.

[Figure: product/market fit]

Architecture and practices improvements

Before spending more time improving your architecture, you may want to carefully assess your product/market fit indicators. If you’re on track to achieve product/market fit, you should focus on essential practices and continue delivering good value to your users/customers.

DevOps/SRE practices: essentially, delivering a product and user experience with as few bugs and failures as possible. That means controlling the lifecycle of the changes in your product. Follow best practices as much as possible: test new developments, improve security, keep track of production issues, roll back as fast as possible if new changes cause incidents, and so on. The DORA metrics are useful indicators of the maturity of your product development lifecycle.

Keep latency as low as possible for your users: if your product gets traction, you will eventually face the common production challenges. The number of requests increases and the infrastructure needs to adapt to high demand. You will probably need to think about scaling the key components of your architecture.

Managing peak activity: Your product may experience spikes, where too many users try to perform actions on your platform at once. You may want to stress-test your MVP architecture to know in advance the maximum RPS (requests per second) your infrastructure can tolerate with acceptable latency before it degrades (high latency, slow responses). You then need basic observability to track the activity on your platform and to get alerts when you are about to reach that limit. Our sample architecture already includes a queue system; you may also need to add a caching layer to reduce the load on the database (as discussed earlier).

Analytics/ML recommendations: As the number of users increases and your product gets more traction, at some point you will need a better understanding of your users' choices. You will then need to collect more data on how users interact with your product (which types of events they like to attend, when the best time to offer a discount is, etc.). The goal is to offer the best recommendations for their next experiences based on the collected data, without being intrusive.
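For the peak activity point above, the alerting idea can be sketched with a sliding one-second window: count recent requests and raise a flag once the count approaches the maximum RPS found during stress testing. `RpsMonitor` and the 80% alert ratio are hypothetical choices for this illustration; a real setup would rely on the metrics and alerting of an observability stack.

```python
from collections import deque


class RpsMonitor:
    """Counts requests in a sliding one-second window and flags a threshold breach."""

    def __init__(self, max_rps: float, alert_ratio: float = 0.8):
        # Alert before the measured limit is reached, not at the limit itself.
        self._limit = max_rps * alert_ratio
        self._timestamps = deque()

    def record(self, now: float) -> bool:
        """Register one request at time `now` (seconds); return True if an alert is due."""
        self._timestamps.append(now)
        # Drop requests that fell out of the one-second window.
        while self._timestamps and self._timestamps[0] <= now - 1.0:
            self._timestamps.popleft()
        return len(self._timestamps) >= self._limit
```

With `RpsMonitor(max_rps=10)`, the eighth request arriving within the same second would trigger the alert, leaving some headroom before the measured limit.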

There are probably more architecture improvements, but first we need to make sure we have a product that creates a real value for our customers.

Product/Architecture Fit

The evolution of your architecture should as much as possible align with the evolution of your product, to prevent under-engineering or over-engineering.

[Figure: product/architecture fit]

Lessons learned from Twitter's evolution

Here is the evolution of the Twitter platform (2006 to present), from simple features (tweets and following users) to a more mature platform. Look at how the architecture components evolved in sync to sustain the value created and the experience provided to users.

[Figure: evolution of the Twitter architecture]

Sources for Twitter architecture

https://highscalability.com

https://blog.x.com