Today I’m excited to announce the release of DBOS Cloud, a transactional serverless computing platform, made possible by a revolutionary new operating system, DBOS, that implements OS services on top of a distributed database. We’ve used this new architecture to build a novel Python and TypeScript transactional programming environment that enhances applications with automatic statefulness, transactionality, observability, and cyber-resilience. This makes fault-tolerant cloud-native application development much simpler and faster.
The origin of DBOS - run the OS on top of the database
The idea for DBOS (DataBase oriented Operating System) originated 3 years ago with my realization that the state an operating system must maintain (files, processes, threads, messages, etc.) has increased in size by about 6 orders of magnitude since I began using Unix on a PDP-11/40 in 1973. As such, storing OS state is a database problem. Also, Linux is by now legacy code and is having difficulty making forward progress. For example, there is no multi-node version of Linux, so people must run an orchestrator such as Kubernetes. When I heard a talk by Matei Zaharia in which he said Databricks could not use traditional OS scheduling technology at the scale they were running and had turned to a DBMS solution instead, it was clear that it was time to move the DBMS into the kernel and build a new operating system.
So Matei and I launched a joint MIT-Stanford open source R&D project to prototype DBOS. In a nutshell, DBOS implements operating system services in SQL running on top of a high-performance, distributed, transactional, partitioned, fault-tolerant DBMS. This is in contrast to the traditional method of running the DBMS in user space on top of an OS without DBMS services. Our prototype showed we could provide OS functionality (messaging, cluster scheduling, etc.) with comparable performance to Linux, but also add many new features, including:
- High availability, because all state is stored in a highly available DBMS.
- Time travel: the DBMS logs all events and DBOS persists the log for hours or even days, so it is straightforward to roll the OS back to a previous time.
- Transactionality and fault tolerance, because all state is managed by a transactional, fault-tolerant database.
- Built-in multi-node scaling, with no need for Kubernetes.
- SQL-accessible system state and observability data (metrics, logs, traces).
- Cyber resilience and disaster recovery: we use time travel to roll back the entire system to before a disruption.
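To make the idea of an OS service written in SQL concrete, here is a minimal sketch of a cluster scheduler whose entire state lives in database tables. It uses Python's built-in sqlite3 module as a stand-in for the distributed DBMS, and the schema, table names, and the schedule_next function are all hypothetical illustrations rather than DBOS internals.

```python
import sqlite3

# Hypothetical schema: every name here is illustrative, not taken from DBOS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workers (id INTEGER PRIMARY KEY, capacity INTEGER NOT NULL);
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        status TEXT NOT NULL DEFAULT 'pending',
        worker_id INTEGER REFERENCES workers(id)
    );
""")
conn.executemany("INSERT INTO workers (id, capacity) VALUES (?, ?)", [(1, 2), (2, 1)])
conn.executemany("INSERT INTO tasks (id) VALUES (?)", [(i,) for i in range(1, 5)])
conn.commit()


def schedule_next(conn):
    """Assign the oldest pending task to the least-loaded worker with free capacity."""
    with conn:  # commits on success, rolls back if an exception escapes
        task = conn.execute(
            "SELECT id FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        worker = conn.execute("""
            SELECT w.id
            FROM workers w
            LEFT JOIN tasks t ON t.worker_id = w.id AND t.status = 'running'
            GROUP BY w.id, w.capacity
            HAVING COUNT(t.id) < w.capacity
            ORDER BY COUNT(t.id), w.id
            LIMIT 1
        """).fetchone()
        if task is None or worker is None:
            return None  # nothing to run, or the cluster is full
        conn.execute(
            "UPDATE tasks SET status = 'running', worker_id = ? WHERE id = ?",
            (worker[0], task[0]),
        )
        return task[0], worker[0]


# Schedule until no task can be placed.
assignments = []
while (assignment := schedule_next(conn)) is not None:
    assignments.append(assignment)

# System state is ordinary data, so observing it is just another query.
load = conn.execute(
    "SELECT worker_id, COUNT(*) FROM tasks WHERE status = 'running' "
    "GROUP BY worker_id ORDER BY worker_id"
).fetchall()
print(assignments, load)
```

Because the scheduling decision is an ordinary database update, it inherits the database's transactional guarantees, and the current load is visible through a plain SQL query rather than a separate monitoring interface.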
Based on the success of the DBOS prototype and the positive feedback we received from potential users, we secured funding and launched DBOS, Inc. in April 2023.
Introducing Transactional Serverless Computing
Today, we're releasing DBOS Cloud, a transactional serverless platform built on DBOS, targeting stateful Python and TypeScript cloud backends. DBOS Cloud is no ordinary serverless platform. Because it’s built on the DBOS operating system, it offers powerful and unique features, including reliable execution and time travel.
Reliable execution: If a program running on DBOS is ever interrupted, it automatically resumes from where it left off without repeating any of the work already performed. Programs always run to completion, and their operations execute once and only once. To see why this is important, consider the “confirm order” button in a food delivery app. It needs to place an order at a restaurant, await confirmation, process payment, and request a delivery driver. If the program is interrupted during any of these steps, it needs to resume from where it left off (or the order is lost) without repeating any completed steps (or the customer may be charged twice). Providing such guarantees yourself is months of work, but in DBOS, they’re built into every program.
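The general technique behind this kind of guarantee can be sketched by checkpointing each step's result in a database and consulting the checkpoint before running a step again. The sketch below applies it to the food delivery example, using Python's built-in sqlite3 module. It is not the DBOS API: run_step, confirm_order, the step_results table, and the simulated crash are all hypothetical names invented for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE step_results (
        workflow_id TEXT,
        step_name TEXT,
        result TEXT,
        PRIMARY KEY (workflow_id, step_name)
    )
""")

calls = []  # records every time a step body really executes


def run_step(workflow_id, step_name, fn):
    """Run fn once per (workflow, step); later calls reuse the stored result."""
    row = conn.execute(
        "SELECT result FROM step_results WHERE workflow_id = ? AND step_name = ?",
        (workflow_id, step_name),
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # step already completed: skip the work
    result = fn()
    with conn:  # checkpoint the result durably before moving on
        conn.execute(
            "INSERT INTO step_results VALUES (?, ?, ?)",
            (workflow_id, step_name, json.dumps(result)),
        )
    return result


def make_step(name, fail=False):
    """Build a step body; fail=True simulates an interruption in that step."""
    def body():
        if fail:
            raise RuntimeError(f"crashed during {name}")
        calls.append(name)
        return f"{name}: ok"
    return body


STEPS = ["place_order", "await_confirmation", "process_payment", "request_driver"]


def confirm_order(workflow_id, crash_at=None):
    return [
        run_step(workflow_id, name, make_step(name, fail=(name == crash_at)))
        for name in STEPS
    ]


try:
    confirm_order("order-42", crash_at="request_driver")
except RuntimeError:
    pass  # simulated interruption after payment was processed

# Re-running the same workflow ID resumes at the first unfinished step.
results = confirm_order("order-42")
print(calls)
```

On the second call, the first three steps are answered from the checkpoint table, so payment is processed only once. Note the limit of this simplified version: if a crash lands between a step finishing and its checkpoint being written, the step would run again, so steps with external side effects still need to be idempotent or to commit their effect in the same transaction as the checkpoint.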
Time travel: DBOS lets you “rewind time” and restore the state of an application to what it was at any point in the past. In today’s release, we provide a time travel debugger, which lets you replay any DBOS Cloud trace locally on your laptop, exactly as it originally happened. You can step through past executions to reproduce rare bugs and even run new code against historical state. In the near future, we also plan to release time travel for disaster recovery, allowing you to roll back your application and its data to any past state.
Reliable execution and time travel are just the start of what DBOS can offer. We have many other current and planned features, including:
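One common way to get this kind of “rewind” is to keep an append-only log of every change and rebuild past state from a prefix of that log. The sketch below illustrates that idea with Python's built-in sqlite3 module. This post does not describe how DBOS implements time travel internally, so the change_log table, the write and state_as_of helpers, and the use of a sequence number in place of a timestamp are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE change_log (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,  -- stands in for a timestamp
        item_key TEXT NOT NULL,
        item_value TEXT                          -- NULL records a deletion
    );
""")


def write(item_key, item_value):
    """Append a change to the log and return its sequence number."""
    with conn:
        cur = conn.execute(
            "INSERT INTO change_log (item_key, item_value) VALUES (?, ?)",
            (item_key, item_value),
        )
    return cur.lastrowid


def state_as_of(seq):
    """Rebuild the key/value state as it stood right after change number seq."""
    rows = conn.execute("""
        SELECT c.item_key, c.item_value
        FROM change_log c
        JOIN (
            SELECT item_key, MAX(seq) AS last_seq
            FROM change_log
            WHERE seq <= ?
            GROUP BY item_key
        ) latest ON c.seq = latest.last_seq
        WHERE c.item_value IS NOT NULL
    """, (seq,)).fetchall()
    return dict(rows)


write("order:1", "placed")
t_paid = write("order:1", "paid")
write("order:2", "placed")
write("order:1", None)  # order 1 is deleted
t_now = write("order:2", "delivered")

before = state_as_of(t_paid)  # state at an earlier point in the log
now = state_as_of(t_now)      # state after the latest change
print(before, now)
```

Nothing is ever overwritten, so any earlier state remains recoverable for as long as the log is retained, which is also what makes it possible to run new code against historical data.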
- Automatic export of data provenance, metrics, logs, and traces to encrypted, SQL-accessible tables.
- Built-in dashboards and OpenTelemetry-compatible observability.
- Enhanced cyber attack self-detection and self-recovery.
Getting Started with DBOS
DBOS Cloud is easy and free for you to try. Here’s how to build and deploy a DBOS application in three simple steps:
- Create a free DBOS Cloud account.
- Create an application using the open-source DBOS Transact framework for TypeScript or Python.
- Deploy your application to DBOS Cloud.
After your first DBOS app is up and running, there’s plenty more to explore:
- Follow our programming guide to learn how to build your own reliable DBOS application. If you want to have some fun, try terminating and restarting your code in the middle of a program: it will always resume execution as if nothing happened.
- Try the Time Travel Debugger to replay your code execution, observe past application state, and test code changes.
- Explore idempotency, workflow execution, and other topics in the DBOS docs.
Give it a try, and let us know what you think! We look forward to meeting you and answering your questions in the DBOS Community on Discord.