Background
For the last 10 years, the team behind MoneyOnFIRE has been building different applications on AWS. We were early adopters of serverless and AWS Lambda and, to give AWS full credit, Lambda was a vast improvement over what came before.
The first version of MoneyOnFIRE was built on AWS as a serverless architecture. However, as our demands grew, we found we were spending too much time on AWS rather than on the product, and our developer experience was poor. After one too many AWS frustrations, we decided to try migrating everything to Vercel.
This post documents our experience, the tradeoffs we found, and whether it was worth it.
Our Architecture
The architecture of MoneyOnFIRE is pretty close to the serverless web app reference architecture, which is a very common architecture for SaaS products.
The only unusual requirement that MoneyOnFIRE may have is that the main Lambda function is very compute-intensive. Every time the user clicks we:
- 'Run' their main scenario, which is ~70 years simulated month by month and account by account, with tax calculations.
- Solve a few optimization problems within a scenario run, which may cause us to explore several paths (running the 70 years multiple times).
- Run ~12 different scenarios to show things like different expenses or market returns. Each of these scenarios is a completely fresh run of the above.
For reference, a spreadsheet implementation of a single plan took 5-10 seconds to run. Getting access to dependable (low spin-up, high FLOPS) Lambda compute was a key issue.
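To give a feel for the shape of that workload, here is a minimal sketch in TypeScript. It is not our production code: the `Account` and `ScenarioParams` types, the flat tax rate, and the withdrawal rule are all simplifying assumptions made up for illustration. What it does show is the loop structure: 70 years is 840 monthly steps per run, and the main scenario plus ~12 variants means roughly 13 fresh runs per click, before any optimization passes repeat a run.

```typescript
// Hypothetical sketch: the types, names, and rules below are illustrative
// assumptions, not the real MoneyOnFIRE model.
interface Account {
  name: string;
  balance: number; // starting balance
  monthlyContribution: number;
}

interface ScenarioParams {
  label: string;
  annualReturn: number; // e.g. 0.05 for 5% per year
  monthlyExpenses: number; // drawn from accounts in order
  taxRate: number; // flat tax on monthly gains (a deliberate oversimplification)
}

const YEARS = 70;
const MONTHS = YEARS * 12; // 840 monthly steps per run

// One full run: every month, every account. Returns the total final balance.
function runScenario(accounts: Account[], p: ScenarioParams): number {
  const state = accounts.map((a) => ({
    contribution: a.monthlyContribution,
    balance: a.balance,
  }));
  // Convert the annual return into an equivalent compounding monthly return.
  const monthlyReturn = Math.pow(1 + p.annualReturn, 1 / 12) - 1;

  for (let month = 0; month < MONTHS; month++) {
    let remainingExpenses = p.monthlyExpenses;
    for (const s of state) {
      const gain = s.balance * monthlyReturn;
      s.balance += gain * (1 - p.taxRate) + s.contribution;
      // Pay as much of this month's expenses as this account can cover.
      const withdrawal = Math.min(s.balance, remainingExpenses);
      s.balance -= withdrawal;
      remainingExpenses -= withdrawal;
    }
  }
  return state.reduce((sum, s) => sum + s.balance, 0);
}

// The base plan plus each variant, every one a completely fresh run.
function runAll(
  accounts: Account[],
  base: ScenarioParams,
  variants: ScenarioParams[],
): Map<string, number> {
  const results = new Map<string, number>();
  for (const p of [base, ...variants]) {
    results.set(p.label, runScenario(accounts, p));
  }
  return results;
}
```

Even in this toy form, the cost scales as runs × months × accounts, and all of it is CPU-bound work inside a single request. That is why cold starts and raw compute speed mattered more to us than they would for a typical CRUD endpoint.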
The Bakeoff
In essence, it is a straight contest: where is it 'better' to build the SaaS serverless architecture, AWS or Vercel? Which one is quicker, cheaper, more secure, and needs the least ongoing maintenance effort?
Category | Feature | AWS | Vercel | Winner |
---|---|---|---|---|
Time to Market | Initial Setup | A weekend to configure everything: repositories (one for code, one for CDK), the CDK deploy, IAM, certificates, GitHub tokens, and the GitHub deployment pipeline. | A morning (after some orientation). Next.js support was excellent, so everything worked out of the box, right down to dev vs production builds. | Vercel |
Time to Market | Subsequent Pushes | Any infrastructure change was a 10-15 minute push with CDK, with frustrating rollbacks. Application pushes took 5-10 minutes. | 57 seconds from push to live in production. | Vercel |
Developer Experience | Branch Deploys | Possible, but it requires a lot of setup that we had to do ourselves: either separate API Gateways or complex stage variables. No preview URLs for front-end changes. | Automatic preview deployments for every branch. Preview URLs for every PR out of the box. | Vercel |
Developer Experience | Multiple Environments | Manual setup. Separate API Gateways, Lambda aliases, and environment variables for dev/staging/prod. | Production, preview, and development environments work natively. No configuration needed. | Vercel |
Developer Experience | Dev/Prod Parity | API Gateway creates frustrating differences between local and production. Required custom shims to bridge the gap. | vercel dev mirrors production locally. No gateway layer. We have not found any parity issues and run our main webserver locally. | Vercel |
Developer Experience | Logging | CloudWatch is incredibly frustrating, and the CloudWatch Insights query syntax has a learning curve. | Built-in dashboard with real-time streaming and simple search. No setup required. | Vercel |
Developer Experience | Observability | Requires configuring X-Ray, CloudWatch dashboards, and alarms manually. | Built-in analytics and monitoring. Request timing, errors, and performance metrics out of the box. | Vercel |
Developer Experience | Environment Variables | Scattered across Lambda configs, Parameter Store, Secrets Manager, and CDK code. No single source of truth. | Centralized management in the dashboard. Scoped by environment (dev/preview/prod). Easy to update without code changes. | Vercel |
Security | Simplicity | Powerful but complex. IAM permissions, VPC configurations, security groups. Easy to misconfigure. | Simpler security model with good defaults. Less surface area for misconfiguration. | Vercel |
Security | Web Application Firewall | AWS WAF is available but requires separate setup and configuration. Additional cost per rule. | Integrated WAF and DDoS protection included. Automatic protection against common threats. (The dashboard also let us see more bad actors.) | Vercel |
Security | Authentication | We found Cognito overly complex and inflexible. | We used Clerk, which simplified auth but added another third party (we have it turned off at the moment to remove the login wall). | Vercel |
Compute | Spin-up time, capacity and determinism | Noticeable cold starts. We had to spend a lot of time optimizing provisioned concurrency, Lambda memory, and Docker images. Provisioned concurrency made our deployment more complex. | Our calculations ran ~30% faster with lower variance and minimal cold starts. | Vercel |
Cost | Baseline | Connecting Lambda to a VPC creates the need for NAT, which is $32 / month straight off the bat. | Simpler pricing model. Initial Pro cost is $20 / month. | Vercel |
Cost | Free Credits | AWS has strong free credits for startups (and via Clerky). | We didn't find any similar offering for Vercel. | AWS |
Flexibility & Breadth | Services | 200+ services. Can build virtually any architecture. Full control over every aspect. | Focused on web applications and serverless functions. More opinionated and constrained. | AWS |
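
For anyone wanting to reproduce the Compute row on their own workload, the comparison boils down to timing the same call repeatedly on each platform and comparing the average, the spread, and the tail. The sketch below is one way to do that in TypeScript. It is an illustration rather than the harness we used: `timeCalls` and `summarize` are hypothetical helper names, and the nearest-rank p95 is just one common percentile convention.

```typescript
// Hypothetical helpers for comparing latency between two deployments.
interface LatencySummary {
  count: number;
  mean: number; // average latency in ms
  stdDev: number; // population standard deviation in ms (the "variance" we care about)
  p95: number; // nearest-rank 95th percentile in ms
}

function summarize(samplesMs: number[]): LatencySummary {
  if (samplesMs.length === 0) {
    throw new Error("need at least one sample");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  const variance =
    sorted.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / sorted.length;
  // Nearest-rank percentile: the smallest sample with at least 95% of
  // samples at or below it.
  const rank = Math.min(Math.ceil(0.95 * sorted.length), sorted.length);
  const p95 = sorted[rank - 1]!;
  return { count: sorted.length, mean, stdDev: Math.sqrt(variance), p95 };
}

// Time `runs` sequential calls of `fn`. The first sample often includes a
// cold start, so it is worth looking at it separately from the rest.
async function timeCalls(
  fn: () => Promise<unknown>,
  runs: number,
): Promise<number[]> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await fn();
    samples.push(Date.now() - start);
  }
  return samples;
}
```

In use, `fn` would be something like a `fetch` against the endpoint under test on each platform, and the two summaries would be compared side by side. A lower `stdDev` and `p95` is what "more deterministic" means in practice for a compute-heavy request.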
Our Takeaways
Our architecture is very much in Vercel's wheelhouse, so YMMV depending on yours. However, we do wonder how many startups now begin with this architecture and whether Vercel is now capturing this market.
Dev experience really matters: finding issues in logs, issuing fixes, testing, deploying, and having confidence in that deploy.
It's not that it is impossible to build any of these features on AWS. Of course it is possible. The question is time. The default AWS dev experience is really quite poor (and Amplify is so bad that it's not up for discussion).
Our big concern was whether there was something we wanted to do that Vercel would not support, and whether we would then be unable to leave the ecosystem. However, we haven't found anything, perhaps because we conform to their target architecture, so YMMV. If we do, we might run a dedicated backend service back on AWS. So far, there hasn't really been any downside.
Conclusion
Ultimately, after doing a few tests to de-risk the move, we migrated our entire application to Vercel over one weekend and deleted a lot of excess code. When we look at our repo now, it's 99% our application and 1% infrastructure. That makes us happy.