Getting Started with Cosmopolitan Libc

I recently learned a little bit about Cosmopolitan Libc in order to make a portable executable of clp, a small program written for POSIX systems.

I had a little bit of trouble getting started at first, so I thought it might be useful to write up what I learned.

In this blog post I'll share some of my notes on the basics of Cosmopolitan and demonstrate a couple of approaches to building software with it.

Introduction to Cosmopolitan Libc

Cosmopolitan is a C library. When you compile your C program and link it against Cosmopolitan, the resulting executable has a few notable properties.

Cosmo's executables are tiny, far smaller than you'd see for comparable programs built with Rust or Go.

Binaries currently target only x86_64. They also run out of the box on Apple Silicon and Windows on ARM thanks to the x86_64 emulation built into those platforms, and you can use explicit x86_64 emulation elsewhere.

GUI programs are not yet in a workable state, and supporting them is not even an officially stated goal, although they are being explored.

Build-once run-anywhere is achieved by the Actually Portable Executable (APE) format. Executables built by Cosmopolitan are interpreted by Windows as Portable Executables, and on Unix they are interpreted as a shell script without a shebang line.
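
One way to get a feel for this dual nature is to look at the first two bytes of a file. Portable Executables begin with the ASCII characters MZ, which a Unix shell can just as well read as the start of ordinary text. The sketch below is only an illustration: has_mz_magic is a name I made up, and the stand-in file contents (MZqFpD) are my recollection of how APE files begin, so treat that as an assumption rather than a specification.

```shell
# Hypothetical helper: report whether a file starts with the two-byte "MZ"
# magic that marks a DOS/PE executable header.
has_mz_magic() {
    # head -c 2 reads only the first two bytes of the file.
    [ "$(head -c 2 "$1")" = "MZ" ]
}

# Stand-in file for demonstration; a real APE binary would go here instead.
demo=$(mktemp)
printf 'MZqFpD' > "$demo"

if has_mz_magic "$demo"; then
    echo "starts with MZ"
else
    echo "no MZ magic"
fi
rm -f "$demo"
```

Pointing has_mz_magic at a binary you built yourself should show whether it carries the same header.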

Now let's take a look at how we can use Cosmopolitan to build programs.

Building Programs with the Prebuilt Amalgamation

The simplest way to build APE programs is to use GCC and link the prebuilt library.

An example of this approach can be seen in this makefile project consisting of a "hello world" source file:

int main() {
    printf("hello world\n");
}

And a Makefile, which grabs the Cosmopolitan libs and compiles the program with them:

CC=gcc
OBJCOPY=objcopy
BASEURL=https://worker.jart.workers.dev
AMALGAMATION=cosmopolitan-amalgamation-2.0.zip
LIBCOSMO_SHA256_EXPECTED=\
2228cd5924c001b2d8c8efcc9ddacaab354ba4c99a3e0c8858098e2c3f2e3fdb

hello.com: libcosmo hello.c
	$(CC) -g -Os -static -fno-pie -no-pie -nostdlib -nostdinc                  \
		-fno-omit-frame-pointer -pg -mnop-mcount -mno-tls-direct-seg-refs -o   \
		hello.com.dbg hello.c -Wl,--gc-sections -fuse-ld=bfd -Wl,--gc-sections \
		-Wl,-T,libcosmo/ape.lds -include libcosmo/cosmopolitan.h         \
		libcosmo/crt.o libcosmo/ape-no-modify-self.o                     \
		libcosmo/cosmopolitan.a
	$(OBJCOPY) -S -O binary hello.com.dbg hello.com

libcosmo: $(AMALGAMATION)
	@libcosmo_sha256_actual=`sha256sum $(AMALGAMATION) | cut -d ' ' -f 1`; \
echo "expected sha256sum: $(LIBCOSMO_SHA256_EXPECTED)" && echo \
"actual   sha256sum: $$libcosmo_sha256_actual"; if	 \
[ "$$libcosmo_sha256_actual" = "$(LIBCOSMO_SHA256_EXPECTED)" ]; then echo \
"checksums match"; else echo "checksums don't match, aborting" && exit 1; fi;
	unzip -d libcosmo $(AMALGAMATION)

$(AMALGAMATION):
	wget "$(BASEURL)/$(AMALGAMATION)"

clean:
	rm -f hello.com.dbg hello.com

distclean: clean
	rm -rf cosmopolitan* libcosmo
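
The checksum step in the libcosmo rule is dense because it has to fit into a single recipe line. As a rough standalone sketch of the same logic, assuming sha256sum is available on your system (verify_sha256 is a name I made up for illustration):

```shell
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    echo "expected sha256sum: $expected"
    echo "actual   sha256sum: $actual"
    if [ "$actual" = "$expected" ]; then
        echo "checksums match"
    else
        echo "checksums don't match, aborting" >&2
        return 1
    fi
}

# Example call, using the file name and digest from the Makefile above:
# verify_sha256 cosmopolitan-amalgamation-2.0.zip \
#     2228cd5924c001b2d8c8efcc9ddacaab354ba4c99a3e0c8858098e2c3f2e3fdb
```

Returning a non-zero status on a mismatch is what lets make stop before unzipping an archive that isn't the one you expected.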

You might notice that hello.c does not contain a #include <stdio.h> line to provide the header for printf. Ordinarily, the C compiler would search the system headers for this file, which would provide the function declaration for printf, and then the linker would find the definition in the system standard library.

In this case, the compiler does not search the system headers because the Makefile passes the -nostdinc flag, and the system standard library is not linked because of the -nostdlib flag.

Instead, the Makefile specifies an include for cosmopolitan.h, which provides declarations for all libc functions including printf, while the function definitions are provided by cosmopolitan.a.

Because of the -nostdinc flag, if we added the include line to hello.c, we would get an error:

hello.c:1:19: error: no include path in which to search for stdio.h
    1 | #include <stdio.h>

As you might imagine, this causes issues when you're porting existing code written against the standard library. In some cases there are quite a few of these includes, too many to remove by hand. And we can't just remove the -nostdinc flag, because the compiler would then find conflicting function declarations.

A workaround is to create a folder in your project, fill it with empty files named after all the system headers your code is looking for, and add that folder to the include path.

---
 Makefile             | 2 +-
 header_stubs/stdio.h | 0
 hello.c              | 2 ++
 3 files changed, 3 insertions(+), 1 deletion(-)
 create mode 100644 header_stubs/stdio.h

diff --git a/Makefile b/Makefile
index 4bd867d..f7d3623 100644
--- a/Makefile
+++ b/Makefile
@@ -10,7 +10,7 @@ hello.com: libcosmo hello.c
 		hello.com.dbg hello.c -Wl,--gc-sections -fuse-ld=bfd -Wl,--gc-sections \
 		-Wl,-T,libcosmo/ape.lds -include libcosmo/cosmopolitan.h         \
 		libcosmo/crt.o libcosmo/ape-no-modify-self.o                     \
-		libcosmo/cosmopolitan.a
+		libcosmo/cosmopolitan.a -Iheader_stubs/
 	$(OBJCOPY) -S -O binary hello.com.dbg hello.com
 
 libcosmo: $(AMALGAMATION)
diff --git a/header_stubs/stdio.h b/header_stubs/stdio.h
new file mode 100644
index 0000000..e69de29
diff --git a/hello.c b/hello.c
index d07c467..11f7c99 100644
--- a/hello.c
+++ b/hello.c
@@ -1,3 +1,5 @@
+#include <stdio.h>
+
 int main() {
     printf("hello world\n");
 }
--

This way the includes specified by the source code are found, but they're replaced with empty content during preprocessing. This is what you want, since the declarations that the includes usually contain are already provided by cosmopolitan.h.
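
If the code you're porting includes many system headers, creating the stubs by hand gets tedious. Here is a rough sketch of how the stub folder could be generated instead; make_header_stubs is a hypothetical helper of my own, and it assumes the includes are written on one line in the usual #include <name.h> form.

```shell
# Hypothetical helper: create an empty stub file for every `#include <...>`
# found in the given source files.
make_header_stubs() {
    stub_dir=$1
    shift
    mkdir -p "$stub_dir"
    # Print just the header name from each angle-bracket include line.
    sed -n 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*<\([^>]*\)>.*/\1/p' "$@" |
        sort -u |
        while IFS= read -r header; do
            # Headers such as sys/stat.h need their subdirectory to exist.
            mkdir -p "$stub_dir/$(dirname "$header")"
            : > "$stub_dir/$header"
        done
}

# Example call (pass at least one source file):
# make_header_stubs header_stubs hello.c
```

Quoted includes such as #include "local.h" are deliberately left alone, since those refer to your project's own headers and still need their real content.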

I think building with the prebuilt amalgamation makes sense for small projects with few dependencies. It keeps the size of the source repository small and builds quickly.

I was able to build clp with this approach by forking another project which built Lua using the amalgamation prior to its inclusion in the hermetic monorepo. You can see the result here.

Without that fork existing, I wouldn't have been able to compile clp this way. If your projects require non-trivial external dependencies, you'll really need to know what you're doing to get it to build.

Luckily for me, there's a different option that provides additional batteries.

Building Programs Inside the Hermetic Monorepo

Another way to build APE programs is to use the hermetic monorepo. You'll only need make installed on your system to build it; it uses a vendored GCC to bootstrap itself.

The repository is managed with an interconnected system of makefiles which is documented within the main Makefile that you use for building.

In addition to providing core system functionality, the repository contains many programs, utilities, and libraries, including the redbean single-file web server, a command line debugger, an NES emulator that runs in a terminal, implementations of sed, awk, and make, and much more. The code in all these projects is tailored to Cosmopolitan, which is nice because it makes it easier to yoink from them compared to external code that might need some finagling.

For quick and dirty programs, you can drop a C file inside the examples folder and it'll get picked up and built inside o/examples. That can be nice for trying out ideas or getting up and running quickly.

For more involved projects, or for adding re-usable libraries, you will probably want to add a third_party module.

Third_party modules follow repository-wide conventions where they include Makefiles that declare their source files, header files, dependencies on other packages, and more.

To add a third_party module, you'll want to study the example library package and some of the programs packaged in the repository. You can also check out my fork, where I added a clp third_party.

I'd love to give a detailed breakdown of how to properly write a makefile for a third party, but I pretty much just copied sed's, changed stuff, and hacked in some extras.

Maintaining a personal fork of the monorepo and adding your own packages to it looks like a good way to manage projects. The repository appears designed to be modular and extensible, so upstream changes seem unlikely to break your code. If you want to change core parts of the monorepo for a single package, you could always do so on another branch.

Using the hermetic monorepo might seem like overkill for a tiny program. It's a pretty big repository, and the initial build can take a while. On the other hand, the executables it produces are tiny, iterative builds are quick, it gives nice options for tuning build output, and in general it makes maintenance easier.

There's another option for building software that I think is worth mentioning.

Shortcut for Portable Lua Programs

Here's a quick demonstration of how to package portable Lua programs. In this section we'll build the Lua interpreter, copy it to the base of the repository, zip a Lua file into the executable, and configure the executable to run that Lua file when it is launched.

First, grab the monorepo if you haven't already.

git clone https://github.com/jart/cosmopolitan.git
cd cosmopolitan

From this point on, all commands should be run from the base of the repository.

Build the Lua interpreter, and copy it to the base of the repo.

build/bootstrap/make.com -j$(nproc) o//third_party/lua
cp o//third_party/lua/lua.com .

Next, make a .lua folder and build the zip.com program.

mkdir -p .lua
build/bootstrap/make.com -j$(nproc) o//third_party/zip

Now create a file mymodule.lua inside .lua by pasting the following text into your shell.

printf %s 'local mymodule = {}

function mymodule.hello()
   print("Hello World!")
end

return mymodule' > .lua/mymodule.lua

Next create a file .args in the base of the repository by again pasting the following text into your shell.

printf %s '-l
mymodule
-e
require"mymodule".hello()' > .args

This file will be loaded by the modified interpreter, and the arguments in it will be presented to the interpreter as if they came from the command line.
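
To make the mapping concrete, here is a small illustration of how the .args file above reads as one argument per line. This only mimics the idea in shell; it is not the interpreter's actual loading code, and args_to_argv is a name I made up.

```shell
# Hypothetical illustration: print each line of an .args-style file as one
# numbered command line argument.
args_to_argv() {
    n=0
    # The extra test keeps the last line even without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        printf 'argv[%d] = %s\n' "$n" "$line"
    done < "$1"
}

# Recreate the .args content from above in a temporary file and list it.
demo=$(mktemp)
printf %s '-l
mymodule
-e
require"mymodule".hello()' > "$demo"
args_to_argv "$demo"
rm -f "$demo"
```

The listing shows four arguments, which matches what you would otherwise type after lua.com by hand.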

Finally, we'll use the zip.com program to add the .lua directory containing your Lua files, along with the .args file that launches your program, to the Lua interpreter binary.

./o/third_party/zip/zip.com -r lua.com .lua/
./o/third_party/zip/zip.com -r lua.com .args

Now if you run lua.com, it will run the Lua file and print "Hello World!" as if you had run it with lua.com -l mymodule -e 'require"mymodule".hello()'.

You can read more about these in the /.args and the /.lua/... sections of redbean's documentation; just be aware that not all of redbean's functionality is available to the Lua interpreter out of the box.

Wrap Up

That covers most of what I figured out on my own while playing around with Cosmopolitan.

If you want to see what other people have done, check out some of the projects here: https://github.com/shmup/awesome-cosmopolitan

Acknowledgements

Thanks to Justine Tunney for guidance on porting clp.