Scripting an application means automating some of its functionality: bundling together basic actions that the application exposes and executing that bundle as if it were a feature built into the application itself. This is very powerful, as new features can be created (easily) by the users themselves. This post describes how we can give Elixir applications this capability.
Script and extension languages
A script language is a programming language dedicated to automating tasks within a given environment. Typical examples are operating systems, which come with various shells that we use to clean file systems, schedule backups, manage users, etc. Many other languages are tagged as “script languages”; think of Perl, JavaScript, Ruby, PHP, Lua, Python and Guile, to list a few. They all share some characteristics:
- they are interpreted even though some can be compiled into a byte code or even into native code
- they are most of the time general purpose languages, that is they are able to do computations on various data types, they have access directly or via libraries to the operating system (file system, network, sometimes graphics, …), etc.
- they often offer a REPL to the programmer, so they are quite easy to learn and to experiment with
- some of them are embeddable by design into other applications.
Embeddable means that the scripting language can be integrated into an application so that it will be able to execute some code that had not been defined when the application was itself implemented. Think for example about an application that reads its data files in only one format. If a new format is to be processed, it would be “just” a matter of defining a new read function that would transform the data from the new format into the internal format. When a script language is embeddable into an application, it is often called an extension language.
Many major applications have an extension language, and I believe that they are major because they can be extended by regular users and not only by developers. Conversely, an application that doesn’t provide such a capability is more likely to disappear and be replaced by one that does. To be convinced, imagine the internet without JavaScript (older readers may remember Mosaic), MS Office without VB, or AutoCAD without AutoLisp. On the open source side, Emacs, despite its Spartan interface, comes with thousands of extensions written in ELisp, GIMP is extended with Scheme, Sublime Text with Python, and Lua itself extends hundreds of applications. Extensibility is a future-proof feature and it is very important to take it into account in the early stages of application design.
The nature of the language itself is not a requirement per se: being object oriented, functional or whatever else is mainly a matter of taste. Yet we won’t be the only users of that language, and the easier it is to learn, the broader the adoption of the application will be.
How does this fit in Elixir’s landscape
Elixir compiles its code down to bytecode that is interpreted by a virtual machine called the beam (well, to be precise, it compiles and executes the code). The beam is actually the VM for the Erlang language and Elixir is just one of several languages that run on the beam.
There are several ways one can think of to integrate an extension language with an application running on the beam.
Integrate the interpreter as a NIF
In the Erlang and beam ecosystem, a NIF (Native Implemented Function) is a function, typically implemented in C, that is linked to the beam VM itself. This makes the function available to Erlang (and Elixir) just like a “regular” function. If we take a language that comes bundled as a C library, it should be possible to write some glue code so that the application and the language can interact. An example is esqlite, an Erlang NIF that encapsulates the sqlite3 database engine and its SQL interpreter.
Never forget, however, that NIFs are dangerous as they can crash the whole beam.
Integrate the interpreter via a port
There is a much more traditional solution in the beam ecosystem that enables communication between the beam and an external process: the port. Basically, a port is a pipe which is connected on one end to a beam process (called the port owner or the connected process) and on the other end to a process spawned by the OS. On the OS process side, communication goes through standard input and output. Because the connected process is a beam process, it can be managed and supervised the Erlang/Elixir way. As an example, the Apache CouchDB database, written in Erlang, uses the SpiderMonkey JavaScript engine for its query language.
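As a rough sketch of the mechanism, the snippet below opens a port to the Unix `cat` command, which simply echoes back whatever it receives. `cat` is only a stand-in for a real external interpreter and is an assumption of this example (it needs a Unix-like system with `cat` on the PATH).

```elixir
# Minimal port sketch: `cat` stands in for an external interpreter.
# Assumes a Unix-like OS where `cat` is available on the PATH.
port = Port.open({:spawn, "cat"}, [:binary])

# Send a "program" to the external process through its standard input.
Port.command(port, "1 + 2\n")

# Whatever the external process writes on its standard output comes back
# as a message to the port owner (here, the current process).
reply =
  receive do
    {^port, {:data, data}} -> data
  after
    1_000 -> :timeout
  end

Port.close(port)
IO.inspect(reply, label: "echoed by the external process")
```

With a real interpreter on the other end, the reply would be the result of the evaluation rather than an echo, and the port owner would typically be a supervised GenServer.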
Write the full interpreter in Elixir (or Erlang)
In this scenario, the interpreter engine is itself written in Elixir. To be executed, programs are transformed into some internal representation, typically an AST (Abstract Syntax Tree), possibly compiled further into a bytecode, and evaluated on the fly. An example is Lispex, a toy Lisp interpreter.
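To make the idea concrete, here is a deliberately tiny sketch of the evaluation step only: a hypothetical `TinyCalc` module that walks a hand-built AST for arithmetic expressions. A real interpreter would also need a parser to build the AST from source text; the module name and the AST shape are assumptions made for this illustration.

```elixir
defmodule TinyCalc do
  @moduledoc "Toy evaluator for arithmetic ASTs (hypothetical, for illustration only)."

  # An AST node is a number, a variable name (atom), or {operator, left, right}.
  def eval(n, _env) when is_number(n), do: n
  def eval(name, env) when is_atom(name), do: Map.fetch!(env, name)
  def eval({:add, l, r}, env), do: eval(l, env) + eval(r, env)
  def eval({:sub, l, r}, env), do: eval(l, env) - eval(r, env)
  def eval({:mul, l, r}, env), do: eval(l, env) * eval(r, env)
end

# (a + 2) * 3, evaluated in an environment where a = 4
ast = {:mul, {:add, :a, 2}, 3}
IO.inspect(TinyCalc.eval(ast, %{a: 4}))
```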
Compile the interpreter into the beam bytecode
Just as Elixir compiles to beam bytecode, we can take another language that also compiles to beam bytecode, integrate it as a library in the application and use it as the extension language.
There are about twenty languages that run on the beam (see the lists on GitHub and on the Erlang Ecosystem Foundation site). Most of them are functional because of the nature of the beam itself, which was designed in the first place for a functional language, Erlang. There are some flavors of Lisp, some languages from the ML family, a Haskell, a Prolog and some unclassifiable languages whose main common trait is to be statically typed. There are also a few actively maintained imperative languages: Lua and PHP. Published in November this year, this post explains how Lua can be used from within Erlang and Elixir to make configurations more flexible.
And in this zoo, there is a nice functional language called Elixir which can also be used to script applications written in… Elixir. Let’s give it a try.
Getting a taste of scripting with Elixir
Elixir’s core API contains a module called Code whose documentation says it contains “Utilities for managing code compilation, code evaluation, and code loading.”
Evaluating strings
eval_string(string, binding \\ [], opts \\ []) evaluates the Elixir code contained in the string. For example:
iex(39)> Code.eval_string("1+2")
{3, []}
The first element of the tuple is the result and the second is a keyword list of bindings which, in this example, is empty.
Let’s try with some variables:
iex(45)> Code.eval_string("c = a + b")
Warning: variable "a" does not exist and is being expanded to "a()", ...
...
** (CompileError) nofile:1: undefined function a/0
...
First, notice that the error is a CompileError: our code goes through the Elixir compiler rather than being merely interpreted, which matters for performance. The error says that the variables a and b don’t exist in the execution context of a+b.
To solve the issue, we need to define the variables and write:
iex(48)> Code.eval_string("""
...(48)> a=1
...(48)> b=2
...(48)> c=a+b
...(48)> """)
{3, [a: 1, b: 2, c: 3]}
or explicitly write the newline characters:
iex(50)> Code.eval_string("a=1\n b=2\n c=a+b")
{3, [a: 1, b: 2, c: 3]}
or use the binding parameter to define the values of a and b:
iex(54)> Code.eval_string("c=a+b", [a: 1, b: 2])
{3, [a: 1, b: 2, c: 3]}
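The binding parameter is also what lets an application hand its own data to a script and read values back. A small sketch of that round trip follows; the formula and the variable names are made up for the example.

```elixir
# Hypothetical user-supplied formula; the application provides the inputs.
formula = "total = price * quantity * (1 - discount)"

# The binding keyword list makes application data visible to the script.
{result, binding} =
  Code.eval_string(formula, price: 10, quantity: 3, discount: 0.5)

# The value of the last expression evaluated by the script.
IO.inspect(result)

# Variables set by the script are returned in the binding as well.
IO.inspect(binding[:total])
```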
The code contained in the string can be of any complexity. We can for example define a module with functions:
iex(80)> s="defmodule M do\n def add(a, b) do\n a+b\n end\n end"
"defmodule M do\n def add(a, b) do\n a+b\n end\n end"
iex(81)> Code.eval_string(s)
{{:module, M,
  <<70, 79, 82, 49, 0, 0, ...>>, {:add, 2}}, []}
iex(82)> M.add(3, 6)
9
Note that here the result of the evaluation is the compiled code, as there is no other calculation. The code above is not very readable for the human eye, but the value of the string can also be read from a file. Let’s define the file M.exs to be:
defmodule M do
  def add(a, b), do: a + b
  def sub(a, b), do: a - b
end
then:
iex(104)> {:ok, code} = File.read("M.exs")
{:ok,
 "defmodule M do\n  def add(a, b), do: a + b\n  def sub(a, b), do: a - b\nend\n"}
iex(105)> Code.eval_string(code)
{{:module, M,
  <<70, 79, 82, 49, ...>>, {:sub, 2}}, []}
iex(106)> M.add(M.sub(2, 1), M.sub(5, 4))
2
Evaluating files
Evaluating a file is a matter of loading the file content into a string and evaluating the string. There is a convenience function that encapsulates all this, eval_file(filename, relative_path \\ nil). As with eval_string, this function returns the result of the evaluation and the bindings.
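A quick way to try this out is to write a throwaway script to a temporary file and evaluate it. The file name below is hypothetical and only used for this sketch.

```elixir
# Write a small script to a temporary file (hypothetical file name).
path = Path.join(System.tmp_dir!(), "eval_file_demo.exs")

File.write!(path, """
greeting = "hello"
String.upcase(greeting)
""")

# eval_file/1 reads the file and evaluates its content, returning the
# value of the last expression and the variable bindings.
{result, binding} = Code.eval_file(path)
IO.inspect(result)
IO.inspect(binding[:greeting])

# Clean up the temporary file.
File.rm!(path)
```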
Other interesting functions
There are some other compilation functions:
- compile_string(string, file \\ "nofile") compiles the code contained in string and returns the generated bytecode. The optional file parameter is used for reporting warnings or errors, if any, as if the code was located in a file.
- compile_file(filename, relative_path \\ nil) compiles the content of the file.
- require_file(filename, relative_path \\ nil) compiles the content of the file. The difference with compile_file is that if the file is compiled by several processes concurrently, it will get compiled only once.
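As an illustration of the first of these functions, the sketch below compiles a module from a string and inspects what comes back. The module name `CompileDemo` and the file label are assumptions chosen for the example.

```elixir
source = """
defmodule CompileDemo do
  def double(x), do: x * 2
end
"""

# compile_string/2 returns a list of {module, bytecode} tuples; the compiled
# module is also loaded into the running VM. The second argument is only
# used as the file name in warnings and error messages.
[{module, bytecode}] = Code.compile_string(source, "compile_demo.exs")

IO.inspect(module)
IO.inspect(is_binary(bytecode))

# The freshly compiled module can be called right away.
IO.inspect(module.double(21))
```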
Finally, the module contains functions that make sure a module has been compiled or loaded, that specific compiler options are set, etc. Going through the documentation of the Code module also helps to understand how the whole Elixir system works.
Scripting an Elixir application with Elixir
A dummy application for testing
As we’ll be doing some experiments with code, let’s create a dummy command line application which we will modify to illustrate our tests. Each test sits on its own branch. The repository is hosted on GitLab.
To initialize the project, type:
mix new se --module SE
cd se
Then edit the mix.exs file and add the line escript: [main_module: SE] in the project function:
def project do
  [
    app: :se,
    version: "0.1.0",
    elixir: "~> 1.10",
    start_permanent: Mix.env() == :prod,
    escript: [main_module: SE], # <<<<< here
    deps: deps()
  ]
end
and finally replace the content of the lib/se.ex file with:
defmodule SE do

  def main(args) do
    IO.inspect(args, label: "Command Line Arguments")
  end

end
Finally, compile the application and run the generated executable, just to make sure everything works properly:
mix escript.build
Generated escript se with MIX_ENV=dev

./se 1 2 3
Command Line Arguments: ["1", "2", "3"]
Perfect! This basic code is on branch master.
Reading data from an external script file
Imagine a basic use case where we want to read some constants defined in a configuration file. We’ll first modify the se.ex file to make it load and evaluate a file as discussed in the previous section:
defmodule SE do

  def main([]), do: main(["-f", "init.cnf"])

  def main(["-f", filename]) do
    IO.puts("This is Elixir code. Config file is #{filename}")
    {_res, bind} = Code.eval_file(filename)
    IO.puts("Result=#{inspect(bind, pretty: true)}")

    server = bind[:server]
    port = Keyword.get(bind, :port, 80)
    url = "http://#{server}:#{port}/"
    IO.puts("url=#{url}")
  end

  def main(any) do
    IO.puts("Error in argument list #{inspect(any)}")
  end

end
Notice there are three main() functions: this is a quick way to parse command line parameters, thanks to pattern matching. The “interesting main” is the second one: it loads the file whose name is provided after the -f switch and interprets it. If we define a configuration file, init.cnf, with the following content:
server = "myserver.com"
port = 8080
Code.eval_file() would return a tuple where the first element is the result of the interpretation, that is the value of the last expression, 8080, and the second a keyword list of variable bindings: [port: 8080, server: "myserver.com"]. From within the main() function, we get access to the parameters with, for example, server = bind[:server], or port = Keyword.get(bind, :port, 80) to make the port parameter optional with a default value of 80. This code is on branch configuration-1.
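When several parameters have defaults and some are mandatory, the lookups can be gathered in one place. The following is a minimal sketch of that idea, not the code of the SE application: the `ConfigLoader` module, its default values and the temporary file name are all assumptions made for the illustration.

```elixir
defmodule ConfigLoader do
  # Hypothetical helper, not part of the SE application.
  @defaults [port: 80]

  def load(filename) do
    {_result, binding} = Code.eval_file(filename)

    # Values set by the script override the defaults.
    config = Keyword.merge(@defaults, binding)

    # The server variable is treated as mandatory in this sketch.
    case Keyword.fetch(config, :server) do
      {:ok, _server} -> {:ok, config}
      :error -> {:error, "missing mandatory variable: server"}
    end
  end
end

# Try it with a throwaway configuration file that does not set the port.
path = Path.join(System.tmp_dir!(), "init_demo.cnf")
File.write!(path, ~s(server = "myserver.com"\n))

{:ok, config} = ConfigLoader.load(path)
IO.inspect(config[:server])
IO.inspect(config[:port])

File.rm!(path)
```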
Calling application’s code from the script
One step further is to make the script file call code from the application’s core. Taking the example above, we can define the url variable in the script file rather than in the core application. We’ll also modify the definition of the variables to illustrate that, as the script file contains regular Elixir code, we can use any valid Elixir data structure:
endpoint = %{
  :server => "myserver.com",
  :port => 8080,
  :path => "/my/doc"
}

url = "http://#{endpoint[:server]}:#{endpoint[:port]}#{endpoint[:path]}"
The url is built during script evaluation, which calls string interpolation and concatenation functions.
We modify the content of se.ex accordingly:
  def main(["-f", filename]) do
    ## as before

    server = bind[:endpoint].server
    port = bind[:endpoint].port
    path = bind[:endpoint].path
    url = bind[:url]

    IO.puts("server=#{server}")
    IO.puts("port=#{port}")
    IO.puts("path=#{path}")
    IO.puts("url=#{url}")
  end
Check out the branch configuration-2 for the code.
Of course, the script code can call any function available to the application. Try, for example, adding the following at the beginning of the script file:
require Logger
Logger.info("Logging from the script")
This also holds for functions you may have defined yourself in your application. Let’s define the following function in se.ex:
  def pretty_print(msg) do
    IO.puts("Message: #{msg}")
  end
and call it the following way from the script: SE.pretty_print("Start loading configuration file").
Check out the branch configuration-3 for the code.
Calling code defined in the script file
Conversely, we can call code defined in the script file, as we have quickly seen above with the M module. Let’s reuse it and modify se.ex to make it load the M.exs file:
defmodule SE do

  def main(_args) do
    Code.eval_file("M.exs")

    x1 = M.add(1, 2)
    IO.puts("x1=#{inspect(x1)}")

    x2 = M.sub(10, M.add(1, 1))
    IO.puts("x2=#{inspect(x2)}")
  end

end
When we build the executable (with mix escript.build), the compilation succeeds but we get a bunch of warnings:
mix escript.build
Compiling 1 file (.ex)
warning: M.add/2 is undefined (module M is not available or is yet to be defined)
Found at 2 locations:
  lib/se.ex:6: SE.main/1
  lib/se.ex:9: SE.main/1

warning: M.sub/2 is undefined (module M is not available or is yet to be defined)
  lib/se.ex:9: SE.main/1

Generated se app
Generated escript se with MIX_ENV=dev
Obviously, the module M is not available at compile time, as we are going to load it later, when the script is executed. The execution of the program works fine:
./se
x1=3
x2=8
If we want an interactive session with iex, we’ll run into the same issue as during the compilation: the module M is not available. To make it available, we have to execute the main function so that it loads and compiles the source code of the module M:
iex -S mix
Erlang/OTP 22 [erts-10.6.4] ...

Interactive Elixir (1.10.1) ...
iex(1)> M.add(1, 2)
** (UndefinedFunctionError) function M.add/2 is undefined (module M is not available)
    M.add(1, 2)
iex(1)> SE.main([])
x1=3
x2=8
:ok
iex(2)> M.add(1, 2)
3
iex(3)>
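Since the script module only exists once the script has been evaluated, one possible approach is to check for it and call it dynamically at run time. The sketch below uses a hypothetical `LateBound` helper and a hypothetical `ScriptMath` script module. Because the call goes through apply/3, the application code contains no literal call to the script module, which should also avoid the compile-time warnings, though that is an assumption worth checking against your Elixir version.

```elixir
defmodule LateBound do
  # Hypothetical helper: call a function only if its module is loaded
  # and exports it, instead of crashing with UndefinedFunctionError.
  def call(module, function, args) do
    if Code.ensure_loaded?(module) and
         function_exported?(module, function, length(args)) do
      # apply/3 resolves the module and the function at run time.
      {:ok, apply(module, function, args)}
    else
      {:error, :undefined}
    end
  end
end

# The "script": in the application it would live in a file such as M.exs.
script = """
defmodule ScriptMath do
  def add(a, b), do: a + b
end
"""

# Before the script is evaluated, the module does not exist yet.
before_load = LateBound.call(ScriptMath, :add, [1, 2])

Code.eval_string(script)

# After evaluation, the same call succeeds.
after_load = LateBound.call(ScriptMath, :add, [1, 2])

IO.inspect(before_load)
IO.inspect(after_load)
```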
Conclusion
We have seen in this post a whole spectrum of possibilities for using Elixir as the scripting language of an Elixir application. At one end, it can be used to define basic configuration constants, just like an .ini file. At the other end, it can be used to write the full business logic of the application, as the ELisp files do for Emacs. Where to put the boundary is a matter of architectural taste, of the flexibility we want to provide to end users, etc. We also need to be aware that this provides a way to have arbitrary code executed by the beam VM, so security is an important factor in that choice.
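For the configuration end of the spectrum, one way to limit the risk is to inspect the script’s AST before evaluating it. The sketch below is only an illustration of that idea and by no means a real sandbox: the hypothetical `SafeConfig` module accepts scripts made exclusively of `name = literal` lines and rejects everything else without evaluating it.

```elixir
defmodule SafeConfig do
  # Hypothetical sketch: accept only `name = literal` lines, reject the rest.
  def load(source) do
    with {:ok, ast} <- Code.string_to_quoted(source),
         exprs = to_list(ast),
         true <- Enum.all?(exprs, &assignment?/1) do
      # Only evaluated once every expression has passed the check.
      {_result, binding} = Code.eval_quoted(ast)
      {:ok, binding}
    else
      _ -> {:error, :rejected}
    end
  end

  # Several expressions are wrapped in a :__block__ node, a single one is not.
  defp to_list({:__block__, _, exprs}), do: exprs
  defp to_list(expr), do: [expr]

  # A plain variable on the left, a literal on the right.
  defp assignment?({:=, _, [{name, _, ctx}, value]})
       when is_atom(name) and is_atom(ctx),
       do: literal?(value)

  defp assignment?(_), do: false

  # Strings, numbers and atoms (including true, false and nil).
  defp literal?(v), do: is_binary(v) or is_number(v) or is_atom(v)
end

IO.inspect(SafeConfig.load(~s(server = "myserver.com"\nport = 8080)))
IO.inspect(SafeConfig.load(~s(IO.puts("side effect"))))
```

Anything beyond plain constants (maps, interpolated strings, function calls) is rejected by this check, so the trade-off between flexibility and safety has to be decided case by case.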
One last important point is performance. Scripting languages are usually associated with poor performance. However, as we saw previously, the “scripted” part of the Elixir code is actually compiled in the same way as the rest of the application, so, except for the compilation phase that has to happen at application start-up, the “scripted” code should be as fast as the rest.
Happy Elixir scripting!