Transactions and Serverless are Made for Each Other

I had a great time presenting DBOS at the SF Systems Meetup in San Francisco on November 20, 2024. My presentation was based on the CACM paper "Transactions and Serverless are Made for Each Other," which Peter Kraft and I recently published. The paper addresses the challenge of making serverless work effectively for stateful applications.

The Problem: A Fundamental Mismatch

Popular serverless platforms offer incredible scalability and simplicity, but only for stateless applications. Modern applications, however, are usually stateful: they are long-running and involve complex real-world interactions. To build stateful serverless applications, developers have to patch together many services and write custom state-handling code, which is time-consuming, error-prone, and hard to scale.

The Solution: Durable Serverless with DBOS

DBOS enables stateful serverless by storing the execution state of workflows in a database. That way, if a workflow is interrupted (due to scaling, failure, or migration), it can automatically recover from where it left off by looking up its execution state in the database.
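
To make the recovery idea concrete, here is a minimal sketch of step checkpointing. It is not the DBOS API or implementation: the `step_outputs` table, the `run_step` helper, and the checkout workflow are all hypothetical, and SQLite stands in for Postgres only so the example is self-contained. Each step's output is saved under its workflow ID and step number, so a restarted workflow replays saved outputs instead of re-running completed steps.

```python
import json
import sqlite3

# Assumption: in-memory SQLite stands in for the application's Postgres database.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE step_outputs ("
    " workflow_id TEXT, step_id INTEGER, output TEXT,"
    " PRIMARY KEY (workflow_id, step_id))"
)

calls = []  # records which steps actually executed (for illustration only)


def run_step(workflow_id, step_id, fn, *args):
    """Run fn at most once per (workflow_id, step_id); replay its saved output afterwards."""
    row = db.execute(
        "SELECT output FROM step_outputs WHERE workflow_id = ? AND step_id = ?",
        (workflow_id, step_id),
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # step already completed: skip re-execution
    result = fn(*args)
    db.execute(
        "INSERT INTO step_outputs VALUES (?, ?, ?)",
        (workflow_id, step_id, json.dumps(result)),
    )
    db.commit()
    return result


def reserve_inventory(item):
    calls.append("reserve")
    return {"item": item, "reserved": True}


def charge_card(amount, crash):
    if crash:
        raise RuntimeError("executor lost")  # simulated interruption
    calls.append("charge")
    return {"charged": amount}


def checkout_workflow(workflow_id, crash=False):
    reservation = run_step(workflow_id, 0, reserve_inventory, "book")
    payment = run_step(workflow_id, 1, charge_card, 20, crash)
    return {**reservation, **payment}


def recover_after_crash(workflow_id):
    """Interrupt the workflow once, then re-run it under the same workflow ID."""
    try:
        checkout_workflow(workflow_id, crash=True)
    except RuntimeError:
        pass
    return checkout_workflow(workflow_id)
```

Calling `recover_after_crash("wf-1")` runs the inventory step only once: the second attempt finds its output in the table and resumes at the payment step.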

Once stateful programs are durable, making them serverless becomes far easier. Workflows can now be deployed to ephemeral executors and autoscale or migrate between executors for load balancing or failover. This unlocks the best of both worlds: the reliability and statefulness of traditional systems combined with the scalability and operational simplicity of serverless platforms. Developers can build robust, stateful applications without compromising on performance or ease of deployment.

Here are the slides I presented at the SF Systems Meetup.

There were many good questions from the audience:

Q: What database do you support?

A: DBOS uses vanilla Postgres and works with any Postgres-compatible database, such as Neon, Supabase, RDS, or Aurora. DBOS leverages your application's database (Postgres) to store state, so there's no need for separate storage.

Q: Why use a database for everything?

A: As we discussed in our recent blog post (https://www.dbos.dev/blog/what-is-lightweight-durable-execution), using a database makes durable execution extremely lightweight, because you no longer need to set up a separate orchestration server. The database-oriented idea also extends to many other features, such as durable messaging, durable queues, and durable sleep.
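
As one illustration of how a feature like durable sleep can sit on top of a database, here is a rough sketch. It is an assumption about the general technique, not DBOS code: the `sleeps` table and the `remaining_sleep` helper are hypothetical, and SQLite stands in for Postgres. The wake-up time is persisted the first time the workflow sleeps, so a restarted workflow waits only for whatever time is left.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sleeps (workflow_id TEXT PRIMARY KEY, wake_at REAL)")


def remaining_sleep(workflow_id, seconds, now):
    """Return how many seconds the workflow still has to sleep at time `now`."""
    row = db.execute(
        "SELECT wake_at FROM sleeps WHERE workflow_id = ?", (workflow_id,)
    ).fetchone()
    if row is None:
        # First call: persist the wake-up time before sleeping.
        wake_at = now + seconds
        db.execute("INSERT INTO sleeps VALUES (?, ?)", (workflow_id, wake_at))
        db.commit()
    else:
        # Recovery: reuse the stored wake-up time instead of restarting the timer.
        wake_at = row[0]
    return max(0.0, wake_at - now)
```

A caller would pass the current time (for example `time.time()`) as `now` and then sleep for the returned duration. The clock is a parameter here only to keep the sketch easy to test.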

Q: How do you handle code changes in the middle of a workflow?

A: Each workflow is versioned on DBOS Cloud, so new requests go to microVMs hosting new application versions while old workflows can be processed in the background by a different microVM hosting an old version. This article explains exactly how code updates are handled by DBOS Cloud without disrupting workflows that are in flight.
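
The routing idea can be sketched in a few lines. This is a simplified illustration under my own assumptions, not how DBOS Cloud is implemented: the `Workflow` record, its `status` values, and the `recoverable_by` helper are hypothetical. Each workflow remembers the application version it started on, and an executor picks up only unfinished workflows that match its own version.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    workflow_id: str
    app_version: str  # version of the application code that started the workflow
    status: str       # hypothetical states: "PENDING" or "SUCCESS"


def recoverable_by(executor_version, workflows):
    """Return IDs of unfinished workflows that this executor's code version may resume."""
    return [
        w.workflow_id
        for w in workflows
        if w.status == "PENDING" and w.app_version == executor_version
    ]
```

Under this scheme an executor running the old version keeps draining old in-flight workflows, while an executor running the new version never resumes them with mismatched code.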

Q: Do you provide time travel debugging?

A: Yes, DBOS supports time travel debugging and time travel queries (experimental). I presented DBOS time travel queries at PGConf NYC 2024 (https://postgresql.us/events/pgconfnyc2024/schedule/session/1711-time-travel-queries-with-postgres/), and I'd love to give another presentation to share more about it!

Q: How do you prevent someone from trying to use shared memory (global variables)?

A: Our summer intern Caspian wrote a static analysis tool to check for global variables and other unsafe non-deterministic calls in TypeScript (https://www.dbos.dev/blog/static-analysis-eslint-dbos). We are examining similar tools for Python and other languages.
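
For a flavor of what such a check could look like in Python, here is a small sketch built on the standard `ast` module. It is not the TypeScript tool described in the linked post and it covers only one narrow case: functions that declare `global` variables. The `find_global_writes` name is hypothetical.

```python
import ast


def find_global_writes(source):
    """Return (function name, variable name) pairs for every `global` declaration."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for child in ast.walk(node):
                if isinstance(child, ast.Global):
                    findings.extend((node.name, name) for name in child.names)
    return findings


SAMPLE = """
counter = 0

def handler():
    global counter
    counter += 1

def pure(x):
    return x + 1
"""
```

Running `find_global_writes(SAMPLE)` flags `handler` and leaves `pure` alone. A real linter would also need to handle nested functions (which this sketch reports once per enclosing function), mutation of module-level objects, and non-deterministic calls.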

If you have questions about the slides or DBOS, feel free to ask them on the DBOS Community on Discord. I look forward to hearing from you.