Polymorphism in Erlang

26 Jun 2008, 14:59 PDT

Over the weekend, I wrote my first Erlang application of any size -- an XML-RPC server that supplies geocoding data from a PostGIS database running the tiger_geocoder.

With the desire to support arbitrary geocoder data sources (including a mock data source for unit tests), I set out to discover how best to implement polymorphism in Erlang. In Java terms, I wanted the equivalent of this interface:

public interface Geocoder {
    public Geometry geocode (final String address) throws GeocoderException;
}

Process Polymorphism

In Erlang, threads (processes, in Erlang parlance) communicate via message passing. In this sense, processes are polymorphic -- any message may be sent to any process, so disparate implementations can each handle the same message in their own way.

However, this approach has a limitation: an individual process handles its messages serially, not concurrently. If your geocoder implementation does not require serialized execution, then routing every request through a single process constrains the natural concurrency of your implementation.
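As a rough sketch of process polymorphism, the module below starts two different geocoder processes that understand the same message. The module name geocoder_proc_demo, the {geocode, From, Ref, Address} message shape, and the reply tuples are all assumptions made for illustration; they are not taken from the application described here.

```erlang
-module(geocoder_proc_demo).
-export([start_mock/0, start_failing/0, geocode/2]).

%% Hypothetical mock implementation: always replies with fixed coordinates.
start_mock() ->
    spawn(fun mock_loop/0).

mock_loop() ->
    receive
        {geocode, From, Ref, _Address} ->
            From ! {Ref, {ok, {"43.162523", "-87.915512"}}},
            mock_loop()
    end.

%% A second hypothetical implementation that accepts the same message
%% but never finds a match.
start_failing() ->
    spawn(fun failing_loop/0).

failing_loop() ->
    receive
        {geocode, From, Ref, Address} ->
            From ! {Ref, {error, {not_found, Address}}},
            failing_loop()
    end.

%% The caller only knows the message protocol, not which
%% implementation is running behind Pid.
geocode(Pid, Address) ->
    Ref = make_ref(),
    Pid ! {geocode, self(), Ref, Address},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        {error, timeout}
    end.
```

Calling geocoder_proc_demo:geocode(Pid, Address) works the same way whichever start function produced Pid. Each loop takes one message at a time from its mailbox, so concurrent callers of the same process are answered one after another, which is the serialization described above.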

Polymorphic Function Dispatch

In Erlang, code is organized into modules, with each module declaring a set of exported functions. While modules are analogous to objects, Erlang's modules differ in their inability to maintain any state (a small fib -- explained in the Parameterized Modules section below).

Modules themselves are first-class entities in Erlang -- a module name may be assigned to a variable and called through it, which introduces a limited form of stateless polymorphism:

Eshell V5.6.2  (abort with ^G)
1> Mod = lists.
lists
2> Mod:reverse([1, 2, 3, 4]).
[4,3,2,1]
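The same idea works inside a function that receives the module as an argument. The following is a minimal sketch; the module name module_dispatch_demo and the function unique_count/2 are hypothetical, and it relies only on the standard set modules exporting from_list/1 and size/1.

```erlang
-module(module_dispatch_demo).
-export([unique_count/2]).

%% Mod may be any module exporting from_list/1 and size/1,
%% for example sets, ordsets or gb_sets from the standard library.
unique_count(Mod, List) ->
    Set = Mod:from_list(List),
    Mod:size(Set).
```

For example, module_dispatch_demo:unique_count(ordsets, [1, 2, 2, 3]) and module_dispatch_demo:unique_count(gb_sets, [1, 2, 2, 3]) both return 3, even though the two modules use different internal representations.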

Like Java classes, Erlang modules may declare their implementation of a behavior (i.e., an interface). A behavior defines the functions that a module should implement, and the declaration is checked at compile time: the compiler warns about any callback the module fails to define. By combining behaviors and module polymorphism, we can achieve functionality analogous to Java interfaces.

First, let's define a geocoder behavior that dispatches function calls to the concrete implementation. A behavior may be defined by implementing a module which exports a behaviour_info function:

-module(geocoder).
-export([behaviour_info/1, create/2, geocode/2]).

%% A geocoder instance.
%%
%% @type geocoder() = #geocoder {
%%  module = term(),
%%  state = term()
%% }
-record(geocoder, {
    module,
    state
}).

%% Return the list of required callback functions and their arities.
behaviour_info(callbacks) -> [{geocode, 2}];
behaviour_info(_Other) -> undefined.

%% Create a new geocoder instance with the provided Module and State.
%% This function should not be called directly -- use the create function
%% of a concrete implementation instead.
create(Module, State) ->
    Geocoder = #geocoder { module = Module, state = State },
    {ok, Geocoder}.

%% Geocode an address string, returning the normalized geocode_address()
%% record and WGS84 geocode_coordinates().
geocode(Geocoder, AddressString) ->
    (Geocoder#geocoder.module):geocode(Geocoder#geocoder.state, AddressString).

Now we can define our concrete implementation that implements the geocoder behavior -- a mock geocoder used for unit testing:

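-module(geocoder_mock).
-behavior(geocoder).
-export([create/0, geocode/2]).

%% Assumed record definition; the real one presumably lives in a shared header.
-record(geocode_coordinates, {latitude, longitude}).

%% Create a new instance.
create() ->
    geocoder:create(?MODULE, undefined).

%% Ignore the state and the address; always return the same coordinates.
geocode(_State, _AddressString) ->
    {ok, #geocode_coordinates{latitude = "43.162523", longitude = "-87.915512"}}.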

To use our mock geocoder, we first construct an instance, and then dispatch all calls through the geocoder module:

{ok, Geocoder} = geocoder_mock:create(),
{ok, Coordinates} = geocoder:geocode(Geocoder, "565 N Clinton Drive, Milwaukee, WI 53217").

Experimental: Parameterized Modules

Parameterized modules are a new addition to Erlang, and remain undocumented and experimental. It's worth reading Richard Carlsson's paper, Parameterized modules in Erlang.

In short, using parameterized modules one can construct a module that maintains (immutable) instance state:

M = geocoder_mock:new("Your mock geocoder").

There are a few downsides to this functionality: