hyPiRion

(rationalize inlein)


Earlier today I announced Inlein, which is a program that runs Clojure scripts with dependencies. In short, you add a shebang at the top of your program, quote the relevant parts of a project.clj map, and make the file executable. If you want to learn more about Inlein and how to use it, you should take a look at inlein.org and the project pages on GitHub.
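As a rough sketch of those three steps, the shell snippet below writes a tiny script, quotes a dependency map at the top, and marks the file executable. The file name `hello.clj`, the printed message and the Clojure version are illustrative assumptions, and actually running the result assumes `inlein` is already installed and on your $PATH.

```shell
#!/bin/sh
# Write a minimal Inlein script. The file name and the Clojure version
# are illustrative choices, not requirements.
cat > hello.clj <<'EOF'
#!/usr/bin/env inlein

'{:dependencies [[org.clojure/clojure "1.8.0"]]}

(println "Hello from Inlein!")
EOF

# Make the file executable so the shebang line is used when it is run.
chmod +x hello.clj

# With inlein installed and on $PATH, the script could then be run as:
#   ./hello.clj
```

The quoted map at the top is where the relevant project.clj keys go; everything after it is ordinary Clojure code.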

I think it’s worth writing about the rationale for Inlein itself. Its use as a scripting tool should be evident, but it is also relevant to other projects as an experimental playground, and I believe it may make Clojure adoption easier as well.

[Image: black-and-white photo of half of a pylon on the right side, with straight power lines over the rest of the image. “Lines” by Konstantin Stepanov, CC-BY-NC-SA 2.0]

Speed and Repeatability

Inlein is a project I started for fun after seeing this thread on Clojure’s Google Group, which discusses startup times, Clojure scripting, and what the experience has been like for users.

The thread began with a link to a form, and one of the questions in it was “How did you start your program?”. The options were, perhaps not surprisingly:

  • lein run
  • lein repl
  • Boot via #!/usr/bin/env boot
  • boot repl
  • Direct Java command
  • Shell script wrapping Java command
  • (Other)

Another option which is worth mentioning is lein-exec, which is more or less equivalent to the shebang-boot option, but for Leiningen instead.

What’s interesting to see here is that there are no tools designed with the sole purpose of making scripting in Clojure easier. Obviously, Boot and Leiningen are present, but neither was originally or mainly designed to be used for standalone scripts. It’s really cool that you can use both for scripts, perhaps especially the shebang option Boot provides. However, if you want to use them for standalone scripts, there are a couple of gotchas:

  • It’s not possible to specify JVM options on a script-by-script basis
  • It’s not possible, or at least not evident, how to specify the Clojure version you’d like to use on a script-by-script basis
  • The classpath will already contain some libraries, which may cause havoc with dependencies you want to load
  • And finally, they both have slow startup times

The first three aren’t related to the original problem described in the thread, but rather to repeatability. Time and again I have found repeatability to be incredibly valuable, even a lifesaver, for projects, and it’s weird that we don’t treat our scripts the same way. When I share standalone Python or Bash scripts with coworkers on different setups, the scripts sometimes fail because a coworker has a different version of a library or program, or doesn’t have it installed at all [1]. Of course, no one likes nondeterministic and unrepeatable scripts, so part of the reason I made Inlein was to minimize that problem. But I also wanted to bring faster scripting capabilities to Clojure, which was the catalyst for starting this project in the first place.
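To make the repeatability problem concrete, here is a minimal sketch of the kind of guard the first footnote describes for Bash scripts that rely on 4.x features. The function names `bash_is_4_or_newer` and `check_bash_version` are made up for this illustration.

```shell
#!/usr/bin/env bash
# Succeeds when the version string passed in has a major version of 4 or more.
# (Function names here are hypothetical, chosen for this sketch.)
bash_is_4_or_newer() {
  major=${1%%.*}               # keep everything before the first dot
  case $major in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: not a usable version
  esac
  [ "$major" -ge 4 ]
}

# Prints an explanation and fails when the running shell is too old
# (or is not Bash at all, in which case BASH_VERSION is unset).
check_bash_version() {
  if ! bash_is_4_or_newer "${BASH_VERSION:-}"; then
    echo "This script needs Bash 4.x or newer (found: ${BASH_VERSION:-not Bash})." >&2
    return 1
  fi
}
```

A script that uses associative arrays could call `check_bash_version || exit 1` near the top. Every such script has to carry a guard like this by hand, which is the sort of per-machine fragility that declaring dependencies inside the script is meant to reduce.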

Providing both repeatability and good startup speed, while still being able to write Inlein in Clojure, is actually not as straightforward as you’d hope. Clojure, for better or worse, takes time to boot up [2], which you have probably noticed when you start your favourite Clojure project manager: Boot and Leiningen are both Clojure programs, and to ensure isolation, they start a separate Clojure runtime when you test or run your application. One of the obvious ways to speed up startup is to avoid starting two Clojure runtimes per invocation, so that’s what Inlein does. Per invocation, effectively only one Clojure runtime starts; the rest is just a small Java program on top.

Clojure is fast once it has loaded, though. And since I want as much of Inlein as possible to be a Clojure program, I wrote more or less the entire program in Clojure, as a daemon that the Java program sends requests to [3].

Adoption and Learning

Having a tool that makes it easy to distribute and share standalone Clojure scripts would hopefully also make it easier to adopt and learn Clojure itself.

I myself started to toy around with Clojure through Project Euler. The solutions were all in one big project, and to run one of them, you had to call it manually through the REPL. This was fine, but in the very beginning I wished I could just have each solution as a standalone file that I could run, like I could with Python [4].

Beginning with Clojure may feel more daunting if you must make a “project” out of things the first time you toy with it. You have to learn how to set up the project, how to separate code into namespaces, and so on. All of these are important to learn if you want to use Clojure, and none of them are incredibly hard, but they are not essential when learning the language is your current goal and you just want a small program up and running.

I also hope that Inlein may ease adoption of Clojure in companies somewhat: At some point, you’ll likely have to write some one-off migration scripts that you don’t need to maintain, or some longer-running scripts that won’t go into production. Such scripts are a smart way to introduce Clojure to coworkers in manageable chunks, while still being useful to the team. If you’re clever, you can slowly but surely put more and more script utilities in a utility library you depend on, and at some point, the team may realise that you have enough Clojure code to create a new service, or replace an old one, with a full Clojure project.

Inlein isn’t as great as I’d like it to be for beginners yet, though. The error reporting done by Inlein, or rather the lack thereof, should be much better than it is right now. Bruce Hauman’s recent work on exactly this is really impressive, so it would not be surprising if I end up using some of those ideas myself:

[Image: Figwheel configuration error recovery]

And Inlein cannot be a panacea for newcomers by itself. Ideally, people should start experiencing Clojure through the REPL. After that, they may head off and write small programs in Clojure – for which Inlein will be a good option – then work on bigger projects. For now, Lein and Boot have more “batteries included” with their provided REPLs, but perhaps this isn’t impossible for Inlein to provide as well.

This also depends on the ease of installation. For now, it’s alright for people familiar with putting things on their $PATH, but it would be better if one could just install it via sudo apt-get install inlein or brew install inlein. Which brings me to why I think Inlein is interesting and valuable for other reasons as well.

Experimental Playground

The sudo apt-get install annoyance is also present in Leiningen: Linux users are usually unable to install lein via their preferred package manager. That’s a minor problem compared to a lot of other issues that we all want to be fixed. But trying to fix the bigger issues – especially when you don’t know exactly how it should be done, or even if it IS possible – is not easy to do without some exploration and lots of testing. And most of the big remaining issues may cause breaking changes to some extent.

That’s a problem since Leiningen is a relatively important project in the Clojure community. And although we’re not extremely pedantic about semantic versioning, going from 2.x to 2.y should not intentionally cause any (big) breakage. As a result, many of these big changes are planned for 3.0. But we’d obviously like to try these things out and figure out whether the ideas are worth pursuing or not.

And this is where Inlein can help: Although it has a different purpose, it’s a small, new program which is structured very similarly to Leiningen itself. Testing out new things in Inlein is much easier and will cause fewer issues, if any at all. If the ideas work out and we can figure out the nitty-gritty details, we can then see whether they can be implemented in Leiningen too.

For example, some of the things I’ve experimented with in Inlein are

  • Executable jars (Avoid different startup scripts for Windows/OS X/Linux)
  • Java 7 (way better I/O redirection support)
  • Use fewer dependencies (Makes it easier to package it for Linux distros)
  • Client + Daemon (Better startup time)

but I’m also interested in better error reporting, as I mentioned in the previous section.

Whether any of these things are doable in Leiningen, let alone implemented, will to some extent depend on whether they work in Inlein, and how well. So don’t assume that any of the bullet points above will be implemented, but do be aware that development on Leiningen is happening, just not necessarily in the Leiningen project itself.

I also hope Inlein can be a place where other projects can get some ideas as well. And perhaps Inlein can be an entry point for people interested in contributing to either Leiningen or Boot, or just contributing to Clojure OSS in general.

  1. For example: The version of Bash that’s provided by default with OS X is 3.2, whereas I guess any Linux distribution these days ships with 4.3. If you want to use associative arrays or any other 4.x feature, you’d better tell your Mac developers, and do a small check in the script to make sure that the version is 4.x. 

  2. Adding JVM libraries, especially Clojure ones, makes it even slower. 

  3. For Inlein, this kind of split is fine, as the only question the client will ever ask is “what are the JVM parameters for this file?”. This would probably be harder for Boot and Leiningen, as they both give users the option to load and execute arbitrary code through plugins or tasks. 

  4. While Project Euler solutions don’t have to be runnable by executing a standalone file, there is, in fact, a similar area where this is super valuable in my experience: Programming competitions.

    In competitions like Facebook Hacker Cup and Google Code Jam, you solve programming problems and are judged on correctness, development time and running time. Having a single file you can modify, execute from the command line and play around with makes this much easier. See resolution.clj for an example of this. Since you also have to upload the files you used to produce the output, having it all in a single file while still being able to use dependencies saves you a lot of time.