A couple of years ago, I built an exploratory prototype of a web application to get my hands dirty with Clojure and Datomic. The frontend used Tailwind CSS and Alpine.js, and the backend used Pedestal and Datomic. I chose to hand-roll everything, including authentication and role-based access control for various user groups.
From the outside it looks like a typical webapp, but the application internals are unusual and worth discussing.
Brief Overview of Clojure
Clojure is a JVM programming language with some distinguishing features:
- a dynamic language designed around a long-running process that you can inspect and modify as it runs (similar to Python, but more interactive)
- emphasizes functional programming, but allows for mutation where needed
- uses Lisp syntax, i.e.
(println "hello")
instead of System.out.println("hello")
Clojure has a very solid selection of libraries for web development, lets you develop rapidly by incrementally defining functions without needing to recompile, and cuts away a lot of the boilerplate you find in older enterprise languages. As a result, web programming is one of Clojure's strongest use cases.
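As a small illustration of "functional, but with mutation where needed", here is a sketch in plain Clojure. The names (`user`, `visits`, `record-visit!`) are made up for this example and do not come from the prototype.

```clojure
;; Illustrative sketch; all names here are hypothetical.

;; Ordinary data is immutable: assoc returns a new map and leaves the old one alone.
(def user {:name "Ada" :roles #{:admin}})
(def renamed (assoc user :name "Grace"))

;; Mutation is opt-in, through a reference type such as an atom.
(def visits (atom 0))

(defn record-visit!
  "Atomically increments the visit counter and returns the new count."
  []
  (swap! visits inc))

(record-visit!)
(record-visit!)
```

Re-evaluating a `defn` like this one from a connected REPL replaces the function in the running process, which is what incrementally defining functions without recompiling looks like in practice.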
For these types of applications, deployment works pretty much the same as it does for Java: you can package a Clojure application into an archive that includes all dependencies, drop it onto a server, and run it. I spun up a $5/mo VPS, installed the JDK and Clojure, set up the Datomic peer library backed by Postgres, and had a functional starting point without much trouble.
Once deployed, Clojure applications can listen on a socket for connections. Developers can connect to the program from their editor and run commands to inspect or manipulate the program. In fact, it is common to develop applications while they are running, without the need for recompilation. For medium to large sized projects, this is significant. Java projects I worked on in the past could take fifteen minutes to compile, and larger projects took longer.
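One way to get such a socket is the socket server that ships with Clojure itself. This is only a sketch: the setup used for the prototype is not stated here and may well have been nREPL or another tool, and the server name and function names below are arbitrary choices for the example.

```clojure
(require '[clojure.core.server :as server])

(defn start-dev-repl!
  "Starts a socket REPL bound to localhost on `port`. Returns the server socket."
  [port]
  (server/start-server {:name    "dev-repl"    ; arbitrary name, used to stop it later
                        :address "127.0.0.1"   ; loopback only; do not expose this publicly
                        :port    port
                        :accept  'clojure.core.server/repl}))

(defn stop-dev-repl!
  "Stops the socket REPL started by start-dev-repl!."
  []
  (server/stop-server "dev-repl"))
```

After calling `(start-dev-repl! 5555)`, anything that can open a TCP connection to that port on the same machine (an editor integration, or a plain `nc localhost 5555`) gets a REPL inside the running application.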
Templating
In languages like Java, you tend to use HTML templating languages like Thymeleaf or JSPs to embed dynamic data from the server into pages. It generally looks something like this:
<!DOCTYPE html>
<html>
  <head>
    <title>Sample Page</title>
  </head>
  <body>
    <div th:text="'Hello, ' + ${user}">Hello, World!</div>
  </body>
</html>
The Java template above is stored in a file, parsed, and processed by Java code. These work well and are instantly familiar to all web developers, but it's easy to wind up with either heavy amounts of duplication or a rat's nest of nested page includes that are difficult to work with.
Clojure, by contrast, has a popular library named Hiccup to render with Clojure data structures directly. Using Hiccup looks like the following:
[:html
 [:head
  [:title "Sample Page"]]
 [:body
  [:div "Hello, " user]]]
The Clojure definition is source code in the syntax of the language. You can access and manipulate it by operating directly on the underlying data structure, a vector. With templating languages, you usually have to combine A) splitting templates into many pages and B) using a separate templating language that differs from the main programming language. In practice, that can get very hairy.
Using Hiccup makes it easy to maintain page logic as it gets increasingly complex. A lot of the noise gets boiled out and you can operate on HTML output as data, rather than relying on template mechanisms and string processing. The value proposition mirrors that of Clojure vs traditional imperative languages.
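To make "HTML as data" concrete, here is a minimal sketch that needs nothing beyond core Clojure. The function names (`layout`, `nav-item`, `user-page`) are invented for this example; turning the resulting vector into an HTML string is the job of Hiccup's `html` function, which is deliberately not called here.

```clojure
;; Hypothetical page helpers: each one is an ordinary function returning a vector.

(defn layout
  "Wraps page content in the shared document skeleton."
  [title & body]
  [:html
   [:head [:title title]]
   (into [:body] body)])

(defn nav-item
  "Builds one list item containing a link."
  [{:keys [href label]}]
  [:li [:a {:href href} label]])

(defn user-page
  "Composes a greeting and a navigation list inside the shared layout."
  [user links]
  (layout "Sample Page"
          [:div "Hello, " user]
          (into [:ul] (map nav-item links))))

(def page
  (user-page "Ada" [{:href "/" :label "Home"}
                    {:href "/settings" :label "Settings"}]))
```

Shared structure lives in `layout` as a normal function, so there is no include mechanism to learn, and the page can be inspected or tested like any other value before it is rendered.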
Datomic
Cognitect, the stewards of Clojure, created a database called Datomic that serves as an alternative to SQL. Datomic is built on the principles of immutability and stores data as a series of immutable events. More importantly (for me personally, at least) it provides a lot of extremely useful functionality out of the box that you don't get in SQL.
As the database schema changes, and when data is added or removed, Datomic keeps a complete accounting of events. This approach contrasts with traditional SQL databases, where updates or deletions overwrite the original data. In practice, between careful database migrations and database or server logs, you can sometimes stitch together a true historical record of what happened in a SQL database. But as anyone who has worked as an application developer knows, it is far from easy.
If you've maintained a traditional database for an extended period of time, you know that it sees incremental changes over time as enhancements are added. So even if you're able to look at past query results, it can be very difficult to keep track of the entire state of the database at a previous point in time: what the schema was at that time, what data was in there, and so on. In Datomic, all of this information is forever at your fingertips.
You can query the database for its transaction history, and you'll get back a traversable Clojure sequence of results the same way you get back any other database result. Here's one way you can do it:
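(require '[datomic.api :as d])

;; Helper not shown in the original post; this definition is an assumption.
;; It reads the :db/txInstant recorded on the transaction entity for basis t.
(defn get-tx-timestamp [db t]
  (:db/txInstant (d/entity db (d/t->tx t))))

(defn collect-transactions [conn]
  (let [tx-log (d/log conn)
        db     (d/db conn)] ; take one db value up front instead of one per transaction
    (map (fn [tx]
           {:transaction-id (:t tx)
            :timestamp      (get-tx-timestamp db (:t tx))
            :data           tx})
         (d/tx-range tx-log nil nil)))) ; nil nil = from the start to the end of the log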
And this is what results look like through the eyes of the Morse tool:
Another key feature of Datomic is its query language, Datalog. Queries are declarative and written as Clojure data structures, which flattens the learning curve for those already familiar with Clojure.
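Here is a sketch of what such a query looks like. The attributes (`:user/email`, `:user/role`, `:user/active`) and the helper `with-clause` are hypothetical and not taken from the prototype's schema. Since the query is just a quoted vector, this snippet runs without Datomic at all.

```clojure
;; Hypothetical attributes; not taken from the author's schema.
(def users-by-role-query
  '[:find ?email
    :in $ ?role
    :where
    [?u :user/role ?role]
    [?u :user/email ?email]])

;; Because the query is a plain vector, ordinary functions can extend it.
;; :where is the last section, so conj appends another where-clause.
(defn with-clause [query clause]
  (conj query clause))

(def active-users-query
  (with-clause users-by-role-query '[?u :user/active true]))

;; With the Datomic peer library loaded as `d` and a connection `conn`,
;; the query would be run roughly like this:
;; (d/q users-by-role-query (d/db conn) :admin)
```

Building queries out of data this way avoids the string concatenation that dynamic SQL tends to require.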
Database as a Value
In Datomic, the database is treated as a "value." This implies that the database, at any point in time, represents a specific state that does not change. When you query the database, you are essentially looking at a snapshot of the data as it existed at a particular moment.
There are a lot of cool consequences to this. One of my favorites is the ability to do speculative transactions: you apply changes to a database value in memory without committing them. What this means is that you can develop experimental features for your application, pass your production database to them, and iterate on your concept with no extra work required.
Another benefit is that it is easy to write tests on the fly using live data. It is easy to explore and manipulate your database without worrying about persisting unwanted data - you can simply create and discard branches from any point of your database as desired.
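The sketch below shows both ideas, the snapshot and the speculative transaction, using the peer API's `d/with`. It assumes a reasonably recent Datomic peer library on the classpath and uses an in-memory database; the URI, the `:user/email` attribute, and the `emails` helper are made up for the example.

```clojure
;; Requires the Datomic peer library (datomic.api) on the classpath.
;; The URI and the :user/email attribute are hypothetical.
(require '[datomic.api :as d])

(def uri "datomic:mem://speculative-demo")
(d/create-database uri)
(def conn (d/connect uri))

;; Install a one-attribute schema, then commit one real user.
@(d/transact conn [{:db/ident       :user/email
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one}])
@(d/transact conn [{:user/email "real@example.com"}])

(defn emails
  "Returns the set of all :user/email values visible in the given db value."
  [db]
  (set (map first (d/q '[:find ?email :where [_ :user/email ?email]] db))))

;; A db value is a snapshot: later transactions do not change it.
(def snapshot (d/db conn))

;; d/with applies tx-data to a db value in memory only;
;; nothing is sent to the transactor or written to storage.
(def speculative-db
  (:db-after (d/with snapshot [{:user/email "what-if@example.com"}])))

(emails speculative-db) ; sees both the real and the what-if user
(emails (d/db conn))    ; the real database still has only the real user
```

The same pattern works against a production connection: take `(d/db conn)`, layer hypothetical transactions on top with `d/with`, and throw the result away when you are done.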
Modifying Schemas
Datomic allows you to change your schema as needed over time. This means you can add new attributes, alter properties of existing attributes (such as cardinality or uniqueness), or rename attributes that are no longer necessary without disrupting the existing data. The value type of an existing attribute is the notable exception: it cannot be changed, so the usual route is to add a new attribute instead. These modifications can be done while the database is live and in use, enabling ongoing development and adjustment to evolving data requirements.
I think this is a big deal, because there are definitely projects where you know ahead of time that you'll be changing direction quite a lot. In a traditional database, this can result in tech debt piling up fairly quickly, or painting yourself into a corner.
As you might expect, there are constraints on this. For example, if your modified schema invalidates existing data, you will need to retract that data first. But overall, evolving your database is far less terrifying than it is in a SQL backed project.
Thoughts on Datomic
In a traditional web app + database deployment, the webapp and the database are two separate behemoths, and each requires a lot of work to spin up and maintain. In enterprise environments, this work is commonly spread across multiple people, often on different teams.
Another issue with this setup is that SQL has quite a lot of arcane knowledge you have to lug around in order to execute even relatively simple projects. You need to be able to create tables, understand queries, joins, indexing, migration, and best practices associated with all of them. And when you work with your database, it's generally separate from the application, either from the command line or with some kind of GUI tool.
Datomic has a bit of an initial learning curve because of its use of Datalog syntax and the fact that it operates quite differently from SQL, but I found it to be much simpler to use than SQL after that. You write your schema by passing in maps that lay out your data, update the schema by passing in new maps, seed the database by passing in maps that conform to the schema, and query the database by passing in Clojure vectors that describe what you're looking for. You don't need additional dependencies or ORMs, you just interface with the API directly from within your source code. If you've ever had to spend time bug fixing decades old ORMs, or migrating off database dependencies because they're abandoned and are lighting up security scanners, you'll understand this is a hugely welcome change.
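As a sketch of the "everything is maps" workflow described above, here is what a schema and seed data might look like. The `:user/*` attributes and the `unknown-attrs` helper are hypothetical, not the prototype's actual schema. The data itself is plain Clojure, so this snippet runs without Datomic; the `d/transact` calls are shown only as comments.

```clojure
;; Hypothetical :user/* attributes for illustration.
(def schema
  [{:db/ident       :user/email
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity
    :db/doc         "Login email; also identifies the user."}
   {:db/ident       :user/role
    :db/valueType   :db.type/keyword
    :db/cardinality :db.cardinality/many
    :db/doc         "Roles used for access control."}])

;; Seed data is a vector of maps keyed by the attributes defined above.
(def seed-data
  [{:user/email "admin@example.com"  :user/role [:admin :editor]}
   {:user/email "reader@example.com" :user/role [:reader]}])

;; Plain data can be checked with plain functions before it goes near the database.
(def known-attrs (set (map :db/ident schema)))

(defn unknown-attrs
  "Returns the keys of entity-map that are not defined in the schema."
  [entity-map]
  (remove known-attrs (keys entity-map)))

;; With the peer library loaded as `d` and a connection `conn`,
;; schema and data are submitted the same way:
;; @(d/transact conn schema)
;; @(d/transact conn seed-data)
```

Schema and data go through the same transaction call, which is why no separate migration tool or ORM layer is needed.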
When your application takes shape, it consists of pure functions in your application layer and pure functions in your UI layer through which you thread data you're entering or retrieving. It comes together quite beautifully.
In many ways, I think that Datomic is to the traditional database as Clojure is to traditional web applications. The value proposition and tradeoffs of both, relative to their traditional alternatives, are similar: you get the benefits of functional programming at the cost of added space and performance constraints. In Clojure, this means that you (and the folks who build your web servers and other tooling) avoid many gnarly situations that arise from mutable state. In Datomic, you get to treat the database as an immutable value as described above.
Datomic is a bold departure from SQL, and I have a lot of respect for its choices. You may have noticed a lot of its benefits are still somewhat doable in a traditional database, but that often requires quite a lot of extra work, or extra dependencies that bring additional points of failure. Overall, I think Datomic aligns better with the real-world persistence requirements of applications than SQL does.
It comes at a cost, of course. I've read that performance on Datomic Cloud can be quite bad. I personally used the on-prem "peer" library instead, which worked great for me, even when I threw a few stress tests at it.
Some other significant constraints are that Datomic uses a single process to write, and after roughly 10 billion datoms (individual facts) it apparently becomes unwieldy. My intuition is that a large number of small to mid sized applications would stay well within its performance budget. The auditing you get for free would already have been such a life saver for a number of projects I've worked on in the past.
I ran into a handful of quirks and frustrating moments while building my prototype, mainly stemming from Clojure's stack traces and error messages from Datomic that were hard to read. Since the language is not very widely used, it can also be much more difficult to get help with specific issues, because you're less likely to find online posts from other people who ran into them.
Another big drawback is that while the database tracks its "value" over time, you cannot go back and insert data into the past. Datomic only records facts as of the moment they pass through the transactor. This creates problems in situations that occur often enough in the real world that it bears mentioning, such as backfilling or correcting historical records. Tracking when a fact was true separately from when it was recorded is called "bitemporal" history; it is a feature of some similar database systems, but not of Datomic.
Overall, I really loved working with Datomic. If it fits your problem, it removes so many issues that arise with SQL. I'll be keeping an eye on the library and looking for opportunities to use it in the future.