Clojure and Datomic for Web Applications

A couple of years ago, I got an opportunity to pitch a mid-sized web application project. I built out a prototype, including the front and back end, with authentication and role-based access control for various user groups. I opted to write it from scratch using Clojure and Datomic, composing libraries rather than using a web framework, as is commonly done with Clojure web apps.

From the outside, it looked more or less like any other web application you'd expect.

The insides were much more interesting, however. The project didn't pan out, but the experience stuck with me and I wanted to do a post-mortem writeup to discuss my thoughts on using Clojure/Datomic for web applications.

Clojure, very briefly, for the uninitiated

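Clojure is a JVM programming language with some distinguishing features: it is a Lisp dialect, it favors functional programming with immutable data structures by default, and it is designed around interactive development at a REPL.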

Because Clojure has a very solid selection of libraries for web development, allows for rapid development, and cuts away a lot of the boilerplate you find in older enterprise languages, web programming is one of Clojure's strongest use cases. For these types of applications, deployment works pretty much the same as it does for Java; you can package up Clojure applications into an archive that includes all dependencies, drop it onto a server, and run it. I used a $5/mo VPS, installed the JDK and Clojure, set up the Datomic peer library backed by Postgres, and it was off to the races.

Once deployed, Clojure applications can listen on a socket for connections. Developers can connect to the program from their editor and run commands to inspect or manipulate the program. In fact, it is common to develop applications while they are running, without the need for recompilation. For medium to large sized projects, this is significant. Java projects I worked on in the past could take fifteen minutes to compile, and larger projects took longer.
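As a rough sketch of what that listening socket can look like, Clojure ships a socket REPL in `clojure.core.server`. The names `start-repl!`, `stop-repl!`, `greeting`, and "dev-repl" below are illustrative assumptions rather than the setup used in this project; many teams use nREPL through their editor instead.

```clojure
(require '[clojure.core.server :as server])

;; Hypothetical server name, chosen for this sketch.
(def repl-name "dev-repl")

(defn start-repl!
  "Starts Clojure's built-in socket REPL on `port` and returns the
  java.net.ServerSocket. Port 0 asks the OS for any free port."
  [port]
  (server/start-server {:name   repl-name
                        :port   port
                        :accept 'clojure.core.server/repl}))

(defn stop-repl!
  "Stops the socket REPL started by start-repl!."
  []
  (server/stop-server repl-name))

;; A function running inside the live process...
(defn greeting [user] (str "Hello, " user))

;; ...can be redefined from a connected REPL; later calls see the
;; new definition without restarting or recompiling the application.
(defn greeting [user] (str "Welcome back, " user "!"))
```

By default the server binds to the loopback address, so a remote editor would typically reach it through something like an SSH tunnel rather than an open port.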

Templating

In languages like Java, you tend to use HTML templating languages like Thymeleaf or JSPs to embed dynamic data from the server into pages. It generally looks something like this:

<!DOCTYPE html>
<html>
<head>
    <title>Sample Page</title>
</head>
<body>
    <div th:text="'Hello, ' + ${user}">Hello, World!</div>
</body>
</html>

The Java template above is stored in a file, parsed, and processed by Java code. These work well and are instantly familiar to all web developers, but it's easy to wind up with either heavy amounts of duplication or a rat's nest of nested page includes that are difficult to work with.

Clojure, by contrast, has a popular library named Hiccup that renders HTML directly from Clojure data structures. Using Hiccup looks like the following:

[:html
 [:head
  [:title "Sample Page"]]
 [:body
  [:div "Hello, " user]]]

The Clojure definition is just nested vectors of keywords and strings, plain data structures in the language. It doesn't need special treatment or processing, and it can be passed around or programmatically manipulated at will. Here is a chunk of code from my application that builds some of my HTML. It's a little noisy because I'm using Tailwind CSS, which needs lots of classes and markup.

[:div.mx-auto.max-w-7xl.px-4.sm:px-6.lg:px-8.py-4
  (if (empty? courses)
    [:p "Staff is not assigned to any courses"]
    (map (tailwind-table-registrations staff-id) courses))
  [:div.py-4
    [:a.mt-3.rounded-md.bg-indigo-600.px-3.py-2.text-sm.font-semibold.text-white.shadow-sm.hover:bg-blue-500.focus:outline
      {:href (str "/admin/assign/" staff-id)} (str "Assign " staff-name " to Courses")]]
  [:div
    [:a.mt-3.rounded-md.bg-indigo-600.px-3.py-2.text-sm.font-semibold.text-white.shadow-sm.hover:bg-blue-500.focus:outline
      {:href "/admin"} "Back to Portal"]]]

Even after working with HTML templates for years, I found Hiccup immediately much easier to use. My front-end code is separated into Hiccup-generating functions, and when I wanted to tweak anything, I only needed to change a particular function and re-evaluate it in my editor. Testing these functions as you write them also feels much easier with this setup, because you can just write your tests and run them on the spot without dealing with extra machinery.
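To illustrate why such functions are easy to manipulate and test, here is a minimal sketch that uses nothing but plain Clojure. The `nav-link` and `with-extra-class` functions are hypothetical and not taken from the application above; their return values are the kind of vectors you would hand to Hiccup for rendering.

```clojure
;; A Hiccup-generating function is an ordinary function returning vectors.
(defn nav-link
  "Builds a link element as Hiccup data: [tag attributes content]."
  [href label]
  [:a.text-sm.font-semibold {:href href} label])

(defn with-extra-class
  "Returns the element with `class-name` set in its attribute map.
  Assumes the attribute map sits at index 1, as nav-link produces it."
  [element class-name]
  (update element 1 assoc :class class-name))

(def back-link (nav-link "/admin" "Back to Portal"))

;; The result is plain data, so equality checks work as tests.
(assert (= [:a.text-sm.font-semibold {:href "/admin"} "Back to Portal"]
           back-link))

(assert (= {:href "/admin" :class "underline"}
           (second (with-extra-class back-link "underline"))))
```

Because no rendering step is involved, checks like these can be evaluated in the editor right next to the function they exercise.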

Datomic

Cognitect, the stewards of Clojure, created a database called Datomic that serves as an alternative to SQL. Datomic is built on the principles of immutability and stores data as an accumulating series of immutable facts. More importantly, it provides a lot of extremely useful functionality out of the box that you don't get from a typical SQL database.

As the database schema changes, and when data is added or removed, Datomic keeps a complete accounting of events. This approach contrasts with traditional SQL databases where updates or deletions overwrite the original data. In practice, between careful database migrations and database or server logs, you can sometimes stitch together a true historical record of what happened in a database. But as anyone who has worked as an application developer will know, it is far from easy.

If you've maintained a traditional database for an extended period of time, you know that it sees incremental changes over time as enhancements are added. So even if you're able to look at past query results, it can be very difficult to keep track of the entire state of the database at a previous point in time: what the schema was at that time, what data was in there, and so on. In Datomic, all of this information is forever at your fingertips.

You can query the database for its transaction history, and you'll get back a traversable Clojure sequence of results the same way you get back any other database result. Here's one way you can do it:

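(require '[datomic.api :as d])

;; Assumed helper (not shown in the post): read :db/txInstant for basis t.
(defn get-tx-timestamp [db t]
  (:db/txInstant (d/entity db (d/t->tx t))))

(defn collect-transactions [conn]
  (let [tx-log (d/log conn)
        db     (d/db conn)] ; take one database value up front
    (map (fn [tx]
           {:transaction-id (:t tx)
            :timestamp (get-tx-timestamp db (:t tx))
            :data tx})
         (d/tx-range tx-log nil nil)))) ; nil nil = from start to end of log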

And this is what results look like through the eyes of the Morse tool:

Another key feature of Datomic is its declarative query language, Datalog, which is written as Clojure data structures and so flattens the learning curve for those already familiar with Clojure.
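As a small sketch of what "written as Clojure data structures" means, the query below is an ordinary quoted vector that can be inspected and extended like any other value. The attribute names (`:staff/id`, `:course/staff`, `:course/title`, `:course/active?`) and the `with-clause` helper are hypothetical, made up for this example.

```clojure
;; A Datalog query is a plain vector of keywords, symbols, and clauses.
(def courses-for-staff-query
  '[:find ?title
    :in $ ?staff-id
    :where
    [?staff :staff/id ?staff-id]
    [?course :course/staff ?staff]
    [?course :course/title ?title]])

(defn with-clause
  "Appends one more :where clause. This works here because :where is
  the last section of the query vector."
  [query clause]
  (conj query clause))

;; Derive a narrower query from the first one with ordinary functions.
(def active-courses-query
  (with-clause courses-for-staff-query '[?course :course/active? true]))

;; With the peer library on the classpath, either query would be run as:
;; (d/q courses-for-staff-query (d/db conn) staff-id)
```

The same composability applies to the inputs: the database passed as `$` is just another argument to the query.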

Database as a Value

In Datomic, the database is treated as a "value." This implies that the database, at any point in time, represents a specific state that does not change. When you query the database, you are essentially looking at a snapshot of the data as it existed at a particular moment.

There are a lot of cool consequences to this. One of my favorites is the ability to run speculative transactions. What this means is that you can develop experimental features for your application, pass your production database to them, and iterate on your concept with no extra work required.

Another benefit is that it is easy to write tests on the fly using live data. It is easy to explore and manipulate your database without worrying about persisting unwanted data - you can simply create and discard branches from any point of your database as desired.
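A minimal sketch of this idea, assuming the Datomic peer library (`datomic.api`) is on the classpath and using an in-memory database so no storage is needed. The `:course/title` attribute, the database name, and the `course-titles` helper are hypothetical. The key call is `d/with`, which applies a transaction to a database value without persisting anything.

```clojure
(require '[datomic.api :as d])

;; In-memory database so the sketch needs no storage or transactor.
(def uri "datomic:mem://speculation-sketch")
(d/create-database uri)
(def conn (d/connect uri))

;; Hypothetical schema: a single string attribute.
@(d/transact conn [{:db/ident       :course/title
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one}])

;; One real, persisted fact.
@(d/transact conn [{:db/id "algebra" :course/title "Algebra"}])

(defn course-titles
  "Returns the set of course titles visible in the database value `db`."
  [db]
  (set (d/q '[:find [?title ...]
              :where [_ :course/title ?title]]
            db)))

;; A database value: an immutable snapshot of the current state.
(def db-now (d/db conn))

;; d/with applies the transaction to the value only; nothing is persisted.
(def db-speculative
  (:db-after (d/with db-now [{:db/id "geometry" :course/title "Geometry"}])))

(course-titles db-speculative) ; sees both the real and the speculative course
(course-titles (d/db conn))    ; the stored database is unchanged
```

Discarding the "branch" requires no cleanup: `db-speculative` is an ordinary value that goes away when nothing refers to it.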

Modifying Schemas

Datomic allows you to change your schema as needed over time. This means you can add new attributes, rename existing ones, or alter properties such as their cardinality and uniqueness without disrupting the existing data. An attribute's value type is fixed once it is installed, and attributes are never truly deleted; ones that are no longer necessary are simply renamed or left unused. These modifications can be done while the database is live and in use, enabling ongoing development and adjustment to evolving data requirements.

I think this is a big deal, because there are definitely projects where you know ahead of time that you'll be changing direction quite a lot. In a traditional database, this can result in tech debt piling up fairly quickly, or painting yourself into a corner.

As you might expect, there are constraints on this. For example, if your modified schema invalidates existing data, you will need to retract that data first. But overall, evolving your database is far less terrifying than it is in a SQL-backed project.
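As a sketch of what evolving a schema looks like, each step below is plain data that would be passed to `d/transact` in turn. The attribute names are hypothetical, and the short alteration form (without an explicit `:db.alter/_attribute`) assumes a reasonably recent Datomic version.

```clojure
;; Initial schema: each attribute is a plain map.
(def schema-v1
  [{:db/ident       :course/title
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}
   {:db/ident       :course/instructor
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/one}])

;; Later: a new attribute, added without touching existing data.
(def add-description
  [{:db/ident       :course/description
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}])

;; Later still: allow several instructors per course by altering the
;; cardinality of the existing attribute.
(def allow-many-instructors
  [{:db/id          :course/instructor
    :db/cardinality :db.cardinality/many}])

;; The steps in the order they would be applied to a live connection:
;; (doseq [tx migrations] @(d/transact conn tx))
(def migrations [schema-v1 add-description allow-many-instructors])
```

Widening cardinality from one to many never invalidates existing data; going the other way is the kind of change that may require retracting values first.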

Thoughts on Datomic

In a traditional web app + database deployment, the webapp and the database are two separate behemoths, and each requires a lot of work to spin up and maintain. In enterprise environments, this work is commonly spread across multiple people, often on different teams.

Another issue with this setup is that SQL has quite a lot of arcane knowledge you have to lug around in order to execute even relatively simple projects. You need to be able to create tables, understand queries, joins, indexing, migration, and best practices associated with all of them. And when you work with your database, it's generally separate from the application, either from the command line or with some kind of GUI tool.

Datomic has a bit of an initial learning curve because of its use of Datalog syntax and the fact that it operates quite differently from SQL, but I found it to be much simpler to use than SQL after that. You write your schema by passing in maps that lay out your data, update the schema by passing in new maps, seed the database by passing in maps that conform to the schema, and query the database by passing in Clojure vectors that describe what you're looking for. You don't need additional dependencies or ORMs; you just interface with the API directly from within your source code. If you've ever had to spend time fixing bugs in decades-old ORMs, or migrating off database dependencies because they're abandoned and are lighting up security scanners, you'll understand this is a hugely welcome change.

When your application takes shape, it consists of pure functions in your application layer and pure functions in your UI layer through which you thread data you're entering or retrieving. It comes together quite beautifully.

In many ways, I think that Datomic is to the traditional database as Clojure is to traditional web applications. The value proposition and tradeoffs for both of them with respect to their traditional alternatives are similar: you get the benefits of functional programming at the cost of added space and performance constraints. In Clojure, this means that you (and the folks who build your web servers and other tooling) avoid many gnarly situations that arise from mutable state. In Datomic, you get to treat the database as an immutable value as described above.

Datomic is a bold departure from SQL, and I have a lot of respect for its choices. You may have noticed a lot of its benefits are still somewhat doable in a traditional database, but it often requires quite a lot of extra work, or extra dependencies that bring additional points of failure. Overall, I think Datomic aligns better with the real-world persistence requirements of applications than SQL does.

It comes at a cost, of course. I've read that performance on Datomic Cloud can be quite bad. I personally used the on-prem "peer" library instead, which worked great for me, even when I threw a few stress tests at it.

Some other significant constraints are that Datomic funnels all writes through a single transactor process, and that after roughly 10 billion datoms a database apparently becomes unwieldy. My intuition is that a large number of small to mid-sized applications would stay well within its performance budget. The auditing you get for free would already have been such a life saver for a number of projects I've worked on in the past.

I ran into a handful of quirks and frustrating moments while building my prototype, mainly stemming from Clojure's stack traces and error messages from Datomic that were challenging to unpack, but nothing I couldn't work through. I really hope I get a chance to work more with it in the future.