Serve custom domains with caching: Part 1 - Overview & Architecture
Understanding the why and how of the complete system
In one of my previous blog posts, I gave a brief overview of providing custom domain functionality for your SaaS product, along with the different ways of doing it and the pain involved.
Check it out before jumping into this one -
In this blog series, I will go one level deeper and show you the step-by-step process of building your own custom domain service.
Why are we doing this?
Well, the reason is obvious: you want to give your users a level of customization, like what Hashnode does. It lets users choose between a custom domain and a custom subdomain. Before finding this solution, a lot of people, including me, were curious about how to forward traffic from a domain to your server. The answer to that is simple: DNS resolution. The real challenge is providing on-demand SSL for whatever domain users want. Even that can be done by using something like Cloudflare edge SSL, but that's not the right user experience. A lot of your users won't want to spend the time or take extra DevOps work on their hands. If you are providing an out-of-the-box experience, then you really need to have everything in the box.
So, let's look at the key expectations from this setup -
- ✅ On-demand SSL for any subdomain and custom domain with good integrity.
- ✅ Automatic SSL renewal and ACME challenges - without any human intervention.
- ✅ Server-side caching - to reduce the number of origin requests and save on bandwidth with consistent caching.
- ✅ Distributed edge nodes for minimum latency
- ✅ Minimal steps for users
- ✅ Affordable to scale and minimum maintenance - ideal for solo DevOps, open source preferred.
- ✅ Show the right content on the application based on the domain being visited. (This one is the most important, otherwise what's the point of a custom domain?)
This is what the end-user experience will look like -
- Go to the product settings
- Add the desired domain / subdomain
- If they chose a custom domain, they get a CNAME or A record value to add to their domain's DNS
That's it, a simple 2-3 step process, the way everything in tech should be.
Of course, a lot of other shenanigans can be added, like custom selectable nodes, a built-in domain registrar, DNS management, or a unicorn flying on the settings page, but we won't do it here.
How are we doing this?
To begin with, let's look at the architecture which will set the basis for different layers.
How did you like that sexy looking Miro Diagram?
To understand it, let's look at the different layers of this setup from bottom to top -
Application Layer - You can technically do it with any server-side framework or language. We tried it in PHP first and later moved to Express.js + Next.js, and it worked well. For this series we will see code examples in Express.js + Next.js. This is where you will write the application code to do 2 things -
1. Domain check to avoid abuse and SSL rate limits - This would be an endpoint that our Caddy server calls with a check request whenever SSL is requested for a new domain, and you will return 200 OK if the domain exists in your DB. In this setup we created a dedicated microservice in Express.js just to fetch the record from MongoDB.
2. Read the domain like a parameter and then show the desired frontend - This would be in your server-side front-end code, Next.js in this setup. Here we will read the domain from the request headers and then show content using Next.js routing.
Part 2 will cover this part of the setup.
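Until Part 2 arrives, here is a minimal sketch of the two application-layer jobs in TypeScript, written as plain functions so it runs without Express.js or MongoDB. Everything named here is an assumption for illustration: the `registeredDomains` map stands in for the database lookup, the domain names and tenant ids are made up, and returning 404 for unknown domains is my choice (the only requirement stated above is 200 OK for known ones).

```typescript
// Hypothetical in-memory stand-in for the MongoDB collection of registered domains.
// Keys are domain names, values are made-up tenant ids.
const registeredDomains = new Map<string, string>([
  ["blog.example.com", "tenant-a"],
  ["docs.example.org", "tenant-b"],
]);

// Normalize a domain or Host header value: lowercase it and strip
// an optional port and an optional trailing dot.
function normalizeHost(host: string | undefined): string | null {
  if (!host) return null;
  const name = host
    .trim()
    .toLowerCase()
    .replace(/:\d+$/, "")
    .replace(/\.$/, "");
  return name.length > 0 ? name : null;
}

// Job 1: the domain check. Returns the HTTP status the check endpoint
// would send back: 200 for a known domain, 404 (an assumption) otherwise.
function domainCheckStatus(domain: string | undefined): number {
  const name = normalizeHost(domain);
  return name !== null && registeredDomains.has(name) ? 200 : 404;
}

// Job 2: read the domain like a parameter. Returns the tenant whose
// content should be rendered, or null if the domain is unknown.
function tenantForHost(hostHeader: string | undefined): string | null {
  const name = normalizeHost(hostHeader);
  if (name === null) return null;
  return registeredDomains.get(name) ?? null;
}

console.log(domainCheckStatus("Blog.Example.com:443")); // 200
console.log(tenantForHost("docs.example.org"));         // "tenant-b"
```

In a real service, the first function would sit behind an HTTP route and the second would run in the server-side rendering path, with the map replaced by an actual database query.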
Server Layer - This is where the real magic happens. We receive the request and choose whether to issue a certificate, direct the traffic to the origin, respond from cache, or deny the request. The setup can be distributed across nodes around the globe to minimize latency, depending on your budget. Each node runs 2 Docker images for easy deployment & scaling -
- Caddy Server - This is the heart of our setup. It issues SSL certificates, acts as a reverse proxy, handles Gzip compression, and does many other things. Caddy is Nginx on steroids with all batteries included. We don't have to do much here, just write a Caddy script that has the right instructions; Caddy handles everything else.
- Varnish Caching Server - We will use Varnish behind our Caddy server, where it automatically caches responses and reduces the number of origin requests. It does it all by itself; we just need to provide a Varnish config file with the desired settings.
The entire setup is in docker-compose and is a one-click process to deploy. I might make the image available as open source in the future.
Part 3 will cover this part of the setup.
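To make the four outcomes of the server layer concrete, here is a rough TypeScript model of the decision flow. This is only a mental model, not how Caddy or Varnish are configured: both tools make these decisions internally, and the `RequestState` fields and `planRequest` function are hypothetical names I made up for the sketch.

```typescript
// The four outcomes named in the server layer description.
type Action =
  | "deny"
  | "issue-certificate"
  | "respond-from-cache"
  | "proxy-to-origin";

// Hypothetical summary of what a node knows when a request arrives.
interface RequestState {
  domainAllowed: boolean;  // result of the application layer's domain check
  hasCertificate: boolean; // a valid certificate already exists for this domain
  cacheHit: boolean;       // the cache holds a usable copy of the response
}

// Returns the ordered list of steps a node would take for this request.
function planRequest(state: RequestState): Action[] {
  // Unknown domains are rejected before any certificate is requested,
  // which is what protects against abuse and SSL rate limits.
  if (!state.domainAllowed) return ["deny"];

  const actions: Action[] = [];
  if (!state.hasCertificate) actions.push("issue-certificate");
  actions.push(state.cacheHit ? "respond-from-cache" : "proxy-to-origin");
  return actions;
}

console.log(
  planRequest({ domainAllowed: true, hasCertificate: false, cacheHit: false })
); // [ "issue-certificate", "proxy-to-origin" ]
```

The ordering is the useful part: the domain check comes first, certificate issuance happens only once per domain, and the cache sits in front of the origin for everything after that.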
Routing Layer - This is the easiest of all. All nodes are hosted on different IP addresses. We will use AWS Route 53 to create a hosted zone. Here you can choose the custom domain you want your users to point to. Then, for each geolocation region, define which node should serve it so that requests travel the shortest distance.
Part 4 will cover this part of the setup and the DevOps of the whole setup.
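The geolocation idea boils down to a lookup table with a fallback. Below is a small TypeScript sketch of that mapping. It is not Route 53 configuration: the IP addresses come from the documentation-only range 203.0.113.0/24, and the choice of which continents get a dedicated node is an assumption for illustration.

```typescript
// Two-letter continent codes used for geolocation routing.
type Continent = "AF" | "AN" | "AS" | "EU" | "NA" | "OC" | "SA";

// Hypothetical node addresses (documentation-only IP range).
const primaryNode = "203.0.113.10";
const nodeByContinent: Partial<Record<Continent, string>> = {
  NA: "203.0.113.11",
  EU: "203.0.113.12",
  AS: "203.0.113.13",
  OC: "203.0.113.14",
};

// Pick the node for a visitor's continent, falling back to the primary
// node for regions that have no dedicated node.
function nodeFor(continent: Continent): string {
  return nodeByContinent[continent] ?? primaryNode;
}

console.log(nodeFor("EU")); // "203.0.113.12"
console.log(nodeFor("AF")); // "203.0.113.10" (falls back to the primary node)
```

The fallback matters: without a default entry, visitors from a region with no matching rule would get no answer at all.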
Result
If all goes according to plan, you have a sweet, minimal-maintenance service that issues SSL certs and runs with minimal latency, like Willy Wonka's chocolate factory. (Okay, that might not have been the right example.) Let's look at some of the key metrics -
SSL Certificate - Check
Latency - Check First time - Repeat request -
Happy user - Check
Effort, Cost and Maintenance
Cost
The overall setup is considerably affordable when you look at the benefits. But let's look at the cost of each component -
- Issuing Certificate - Free (Thanks to Let's Encrypt and AutoSSL)
- Routing - based on the number of requests to AWS Route 53 ~ 1.2 $/month for 1 million requests
- Per node cost - hosting on DigitalOcean ~ 5 $ per node x 4 nodes (1 for each continent is good enough) + Primary node = 25 $
Total Cost ~ 26.2 $ (Looks cheap to me)
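If you want to plug in your own numbers, the arithmetic above fits in a few lines of TypeScript. This sketch assumes the primary node costs the same as a regional node (which is what the 25 $ figure implies) and that the node price is monthly, like the Route 53 figure. Amounts are kept in cents to avoid floating-point rounding surprises; `CostInputs` and the function names are made up for the example.

```typescript
// All amounts are in US cents.
interface CostInputs {
  routingCents: number;  // routing cost, e.g. for 1 million requests
  nodeCents: number;     // price of one node
  regionalNodes: number; // nodes spread across regions
  primaryNodes: number;  // primary node(s), assumed to cost the same
}

// Total cost: routing plus the same price for every node.
function monthlyCostCents(c: CostInputs): number {
  return c.routingCents + c.nodeCents * (c.regionalNodes + c.primaryNodes);
}

// Format cents as a dollar string with two decimals.
function formatUsd(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

const total = monthlyCostCents({
  routingCents: 120,
  nodeCents: 500,
  regionalNodes: 4,
  primaryNodes: 1,
});
console.log(formatUsd(total)); // "$26.20"
```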
Effort
Time taken by a single DevOps engineer who knows what they are doing.
- Application Layer - 6 hrs.
- Server Layer - 6 hrs.
- Routing Layer - 1 hr.
- Setting Up Nodes and CI/CD - 5 hrs.
Total ~ 18 hrs.
Maintenance
There are not a lot of moving parts in the entire setup, but based on my experience over the last year, I have to do the following things.
- Server Updates (Linux + Caddy + Docker) = Once a month
- Latency lookup and reconfigure = Once a quarter
- SSL renewal check - It happens automatically, but I still check for integrity = Once a quarter
- (optional) Overthinking about life decisions = Everyday
The best DevOps is solo DevOps ~ Aman Sharma
Shortcut
If you don't want to go through all this, then there are existing solutions that are mentioned in my previous blog.
I have some great ideas on how I can turn this into an out-of-the-box product, but I'm still procrastinating on it like my other dream projects.
But if you really have an urgent need and no bandwidth to do it yourself, then you can DM me, and we will see what arrangement we can make. I accept professionally designed swag too, or 1000 Schrute bucks.
In the upcoming parts we will go step by step over each layer, so stay tuned.
Disclaimer: This blog is not clickbait; I really intend to write it unless procrastination kicks in. So please share some love / criticism / feedback.