
Writing a DICT (RFC 2229) server


In the last few weeks, I’ve implemented a minimal, barely compliant[1] DICT server called ExTra (also stylized ex.tra). The server implements the protocol as described in the linked specification and reads the existing text databases used by other servers (dictd and GNU dico). I did this mainly as an exercise for learning Elixir, and it lacks many features compared with the two existing implementations.

Spec summary

DICT is a simple protocol for looking up dictionaries over TCP/IP. It includes a handful of commands (well, more than a handful, if you count the optional authentication), such as MATCH and DEFINE, to retrieve entries from a dictionary database. The database format is not defined, so technically one could make it work with, say, an SQL database, but it’s customary to use the format that the reference implementation (dictd) uses.

I largely built this based on Elixir’s guide for building a key-value server.


This diagram shows a very rough and simplified architecture overview of ExTra. Yeah, I know, it looks ugly, but organizing a diagram is hard, so I’ll describe it in detail below.

Architecture overview of ExTra

The processes[2] are supervised by the application supervisor, and are respawned once they crash. There are three main processes concerned here:

TCP server

The TCP server is implemented using Erlang’s :gen_tcp:

{:ok, client} = :gen_tcp.accept(socket)

Here, client is the socket for the connection to a client, and we send and receive data through it. To keep reading from it until the client disconnects, we would put it in an infinite loop (acceptor in the diagram). However, that also means the server is locked to that client and cannot accept another; you have probably learned this in a networking class while implementing an echo server in C. This is where the task supervisor comes in: instead of running the infinite loop directly, we spawn a Task that does it:

Task.Supervisor.start_child(ExTra.ConnSupervisor, fn -> serve_first(client, host) end)
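
Put together, the accept-and-hand-off pattern might look like the sketch below. It is not ExTra’s code: the module AcceptSketch, its functions, and the echo behaviour of serve/1 are all made up for illustration, and the socket options are just one reasonable choice.

```elixir
defmodule AcceptSketch do
  # Hypothetical module; names and options are illustrative, not ExTra's.

  # Open a passive, line-oriented listening socket. Port 0 lets the OS pick one.
  def listen(port) do
    :gen_tcp.listen(port, [:binary, packet: :line, active: false, reuseaddr: true])
  end

  # Accept clients forever; each one is served in its own supervised task,
  # so a slow client never blocks the acceptor.
  def accept_loop(listen_socket, task_sup) do
    {:ok, client} = :gen_tcp.accept(listen_socket)
    {:ok, pid} = Task.Supervisor.start_child(task_sup, fn -> serve(client) end)

    # Make the task the owner of the socket, so the connection does not
    # go down with the acceptor if the acceptor crashes.
    :ok = :gen_tcp.controlling_process(client, pid)

    accept_loop(listen_socket, task_sup)
  end

  # Placeholder per-client loop: echo every line back until disconnection.
  defp serve(client) do
    case :gen_tcp.recv(client, 0) do
      {:ok, line} ->
        :gen_tcp.send(client, line)
        serve(client)

      {:error, _reason} ->
        :gen_tcp.close(client)
    end
  end
end
```

A real server would replace the echo in serve/1 with parsing and running the command.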

In the loop, commands are parsed and run. Parsing looks something like ExTra.Command.parse(data); the ExTra.Command module then sends the parsed command to the ExTra.Dict GenServer for execution, something like ExTra.Dict.command(ExTra.Dict, command).
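
To give an idea of what the parsing step could produce, here is a minimal sketch of a parser for a few commands. CommandSketch and the return shapes are assumptions made for this example; the real ExTra.Command.parse/1 may look quite different. It also takes a shortcut: surrounding double quotes are stripped, but quoted arguments containing spaces are not handled.

```elixir
defmodule CommandSketch do
  # Hypothetical parser; the return shapes are assumptions for illustration.

  def parse(line) do
    # String.split/1 splits on whitespace and drops the trailing "\r\n".
    case String.split(line) do
      [] ->
        {:error, :empty}

      [verb | args] ->
        # Upcase the verb so that "define" and "DEFINE" are treated alike.
        build(String.upcase(verb), Enum.map(args, &strip_quotes/1))
    end
  end

  # The tuples mirror the messages the GenServer handles.
  defp build("DEFINE", [db, word]), do: {:ok, {:define, db, word}}
  defp build("MATCH", [db, strategy, word]), do: {:ok, {:match, db, strategy, word}}
  defp build("QUIT", []), do: {:ok, :quit}
  defp build(_verb, _args), do: {:error, :syntax}

  # Simplification: only removes quotes around a single-word argument.
  defp strip_quotes(arg), do: String.trim(arg, "\"")
end
```

For example, CommandSketch.parse("DEFINE wn hello\r\n") gives {:ok, {:define, "wn", "hello"}}, which can be passed to the GenServer as is.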


The commands are sent to the GenServer and handled by some other modules:

@impl true
def handle_call({:define, dictionary, word}, _from, state) do
  {:reply, ExTra.Dict.Define.define(dictionary, word), state}
end

def handle_call({:match, dictionary, strategy, word}, _from, state) do
  {:reply, ExTra.Dict.Match.match(dictionary, strategy, word), state}
end

This level of abstraction may seem a bit convoluted, but using a GenServer here allows for caching matches and definitions, and splitting matching and defining into separate modules allows for different search modules depending on the configuration. None of this is strictly necessary; it is just for educational purposes.
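
As an illustration of the caching idea, here is a sketch of a GenServer that remembers definitions in its state. ExTra does not necessarily do this: CacheSketch is a made-up module, and lookup is a stand-in function playing the role of ExTra.Dict.Define.define/2.

```elixir
defmodule CacheSketch do
  use GenServer

  # `lookup` is a 2-arity function standing in for the real definition search.
  def start_link(lookup), do: GenServer.start_link(__MODULE__, lookup)

  def define(server, dictionary, word) do
    GenServer.call(server, {:define, dictionary, word})
  end

  @impl true
  def init(lookup), do: {:ok, %{lookup: lookup, cache: %{}}}

  @impl true
  def handle_call({:define, dictionary, word}, _from, state) do
    key = {dictionary, word}

    case state.cache do
      # Cache hit: reply without searching again.
      %{^key => cached} ->
        {:reply, cached, state}

      # Cache miss: search once, then store the result in the state.
      _ ->
        result = state.lookup.(dictionary, word)
        {:reply, result, put_in(state.cache[key], result)}
    end
  end
end
```

The cache here grows without bound, which is fine for a sketch but would need an eviction policy in a long-running server.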

Matching definitions

The .dict file stores entries as well as metadata as plain text, while the .index file stores the positions of the entries as:

<entry> <start> <length>

Where <start> and <length> are written in tetrasexagesimal, or base 64. That is the base-64 numeral system, not the Base64 encoding implemented in the standard library. After some shortening, the conversion takes fewer than 20 lines:

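def base64num(num) do
  # Map each digit of the alphabet to its value, i.e. its index in the char list
  alphabet =
    ~c"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    |> Stream.with_index()
    |> Enum.into(%{})

  len = String.length(num)

  num
  |> String.to_charlist()
  |> Stream.with_index()
  # Pair each digit's value with its power, counted from the right
  |> Stream.map(fn {c, i} -> {alphabet[c], len - i - 1} end)
  # Left-shift 6 * power bits is equal to multiply by (2^6)^power, but faster
  |> Stream.map(fn {digit, power} -> Bitwise.bsl(digit, power * 6) end)
  |> Enum.sum()
end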

First, I map each digit in the alphabet to its value, which is also its index in the char list. Each digit in the input string is then mapped to its value through this map, while its index is turned into its power, counted from the right. Finally, each value is shifted left by 6 bits per power (that is, multiplied by 64 raised to that power) and the results are summed to give the answer.
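
As a worked example, the digits "Bc" decode to 1 × 64 + 28 = 92, since B has value 1 and c has value 28. The sketch below decodes a whole index line this way. IndexSketch is a hypothetical module, the decoder uses a running multiplication instead of the shifts above, and it assumes the three fields are separated by tabs, as dictd writes them.

```elixir
defmodule IndexSketch do
  # Hypothetical module; assumes tab-separated fields in the .index file.

  @alphabet ~c"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
  # Digit => value, where the value is the digit's position in the alphabet.
  @values @alphabet |> Enum.with_index() |> Map.new()

  # Decode a base-64 numeral, most significant digit first.
  def decode(digits) do
    digits
    |> String.to_charlist()
    |> Enum.reduce(0, fn c, acc -> acc * 64 + Map.fetch!(@values, c) end)
  end

  # Turn "<entry>\t<start>\t<length>" into {entry, start, length}.
  def parse_line(line) do
    [entry, start, len] =
      line
      |> String.trim_trailing("\n")
      |> String.split("\t")

    {entry, decode(start), decode(len)}
  end
end
```

Splitting on tabs rather than on any whitespace matters because an entry may itself contain spaces: IndexSketch.parse_line("hello world\tBc\tCx\n") keeps "hello world" in one piece.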

Once these values are read, the definition can be retrieved from the .dict file as simply as:

# read and discard the content that comes before what we need
_ = IO.binread(file, start)
# has to `binread`, since start and length count bytes, not UTF-8 encoded characters
IO.binread(file, length)
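
A possible alternative, not what ExTra does, is to let Erlang’s :file.pread/3 jump straight to the byte offset instead of reading and discarding everything before it. ReadSketch and read_entry/3 are names made up for this sketch.

```elixir
defmodule ReadSketch do
  # Hypothetical helper, not part of ExTra.

  # Returns {:ok, binary}, :eof, or {:error, reason}.
  def read_entry(path, start, len) do
    # :binary makes reads return binaries; offsets are counted in bytes.
    {:ok, file} = File.open(path, [:read, :binary])

    try do
      # Read `len` bytes at absolute byte offset `start`.
      :file.pread(file, start, len)
    after
      File.close(file)
    end
  end
end
```

Because both numbers are byte counts, a multi-byte character such as é takes two units of length, which is the same reason a character-oriented read would return the wrong slice.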


The full implementation can be found on SourceHut. As this is the first application I’ve written in Elixir, I’m sure a lot of what I’ve written here isn’t recommended practice, so if you have any suggestions for improvement, please send them to me.

As of writing, there are still several features I’d like to implement that I haven’t, such as:

  1. The OPTION MIME command is not yet compliant, actually. It is currently a no-op, while it should check for 00-database-mime-header in the file and respond with an empty line if it is not present. However, none of the clients (that is, only dict and dico, as far as I know) does anything with that field, so it doesn’t cause any trouble.

  2. To be understood as BEAM processes and not OS processes.

