How to set up an ejabberd cluster on Amazon EC2 in 6 easy steps
Tenerife Skunkworks |
Boldly going where few have gone before |
This is the first in my series of posts on optimizing Erlang. I plan to tackle optimizing Mnesia, profiling and scalability.
You need to use arrays of up to 10,000 elements. Erlang offers you tuples as well as fixed-size and extensible pseudo-arrays. What is the fastest option?Let us start the death match by pitting arrays against tuples in a death match. Trees were an option before the array module became available, so lets throw in trees just for laughs.I'm running Mac OSX Leopard 10.5.4 on a Mac Pro 2x2.8Ghz Quad-Core Intel Xeon with 14Gb 800Mhz DDR2 FB-DIMM.Erlang (BEAM) emulator version 5.6.3 [source] [64-bit] [smp:8] [async-threads:0] [kernel-poll:false] 27> arr:test(). Fixed-size array: get: 2921us, set: 5902us Extensible array: get: 3336us, set: 8144us Tuple: get: 632us, set: 107467us Tree: get: 4321us, set: 45256us ok 30> arr:test(100000). Fixed-size array: get: 35314us, set: 74653us Extensible array: get: 35349us, set: 74059us Tuple: get: 6411us, set: 24304490us Tree: get: 53681us, set: 632795us okNote that timer:tc returns time in microseconds. I ran each test 3 times and the results above are from the third iteration.Trees in Erlang (gb_trees) are built on top of regular tuples and so is the array module. The array module is much more efficient about using tuples than a regular tree, though, and this is the reason why it's so much faster.The tuple test pre-allocates a tuple of 10k or 100k elements. There's no destructive assignment in Erlang and so the same large tuple needs to be allocated and discarded on every set operation. It's very inefficient to allocate and discard a large tuple on every set operation, thus naive tuple set is very slow.The array module uses an efficient tree-like internal representation:
%% A tree is either a leaf, with LEAFSIZE elements (the "base"), an %% internal node with LEAFSIZE+1 elements, or an unexpanded tree, %% represented by a single integer: the number of elements that may be %% stored in the tree when it is expanded. The last element of an %% internal node caches the number of elements that may be stored in %% each of its subtrees. %% %% Note that to update an entry in a tree of height h = log[b] n, the %% total number of written words is (b+1)+(h-1)*(b+2), since tuples use %% a header word on the heap. 4 is the optimal base for minimizing the %% number of words written, but causes higher trees, which takes time. %% The best compromise between speed and memory usage seems to lie %% around 8-10. Measurements indicate that the optimum base for speed is %% 24 - above that, it gets slower again due to the high memory usage. %% Base 10 is a good choice, giving 2/3 of the possible speedup from %% base 4, but only using 1/3 more memory. (Base 24 uses 65% more memory %% per write than base 10, but the speedup is only 21%.)It's far more efficient to allocate small tuples on every set and this is why the array module wins hands down. Use the code below to replicate my results on your hardware.
Mnesia is the Erlang embbedded distributed DBMS, that supports high scalability and fault tolerance through replication. Mnesia has been used to great success in all kinds of applications but it's not without limitations. These limitations mainly stem from the underlying ETS and DETS mechanisms used to implement Mnesia tables.
There have been attempts to hack around Mnesia limitations before, e.g I wrote a Mnesia backend for Amazon S3 just last summer. All these attempts to add external table functionality to Mnesia suffered from being hard to extend, or were closed source implementations. Thanks to the Dukes of Erl, I finally got a chance to do a proper Mnesia extension, the one to rule them all! The result is a series of about 50 careful git patches that support all the features of regular Mnesia tables such as indexing, replication, distribution, fragmentation and backup, as well as set or bag semantics. Best of all, this functionality is available to anyone using OTP release R11B5. The external table API is supplied as an Erlang behavior so all you need to do is supply a callback module that conforms to it. You can now use Mnesia as a front-end to disk-based hash tables like SleepyCat/BerkeleyDB or Tokyo Cabinet, a memory-mapped table or MySQL.You are no longer constrained by the scalability or size limitations of ETS and DETS so go wild and let me know what you come up with. Oh, and please petition the Erlang/OTP team to include this work in the official Erlang distribution!Mnesia 4.3.5 with the extension mechanism is available from Google Code and so is the Tokyo Cabinet external table.
-module(act).
-compile([export_all]).
parse(Filename) when is_list(Filename) ->
{ok, Bin} = file:read_file(Filename),
parse(Bin).
parse(Bin) when is_binary(Bin) ->
parse(Bin, [], [], []).Note that the empty lists are the initial values for our accumulators. Field is the list of characters that represents the current field we are processing. Line is the list of fields we have gathered while processing the current line. Finally, Acc is the list of lines we have gathered while processing our file.We also have two functions named parse here: one that takes a Filename string (a list of integers) and another that takes a binary. Keeping the function name the same and using guards is strictly a matter of taste. I could have just as well called the second parse function parse1 or do_parse. Erlang functions are distinguished based on their number of arguments and there's a special fun_name/num_args notation to reflect that. Our parse function above would be written as parse/1. When the number of arguments is the same, the functions are distinguished based on guards such as is_list or is_binary above.It's good coding practice to present a neat interface to the outside world. parse/1 above would serve that purpose and *parse/4* below would not. parse/1 takes just one argument whereas parse/4 clutters the interface with three extra arguments. As a user of the parsing module I would not know the purpose of the Field, Line and Acc arguments to parse/4 below.parse(<<$\,, Rest/binary>>, Field, Line, Acc) -> parse(Rest, [], [lists:reverse(Field)|Line], Acc);$\, matches the tab character and Rest matches the rest of the binary. It's quite fast to add to the beginning of the list but the accumulated list needs to be reversed when we are done. Also, we almost certainly have accumulated a field by the time we hit the tab. This is why we reverse the field accumulator, prepend it to the accumulated list of fields and start our next iteration (recurse) with an empty field accumulator.
parse(<<$\r, Rest/binary>>, Field, Line, Acc) -> parse(Rest, Field, Line, Acc);
parse(<<$\n, Rest/binary>>, Field, Line, Acc) -> Field1 = lists:reverse(Field), FieldList = [Field1|Line], parse(Rest, [], [], [FieldList|Acc]);A new line ($\n) means that we have hit the end of our current line and need to start processing a new one. We start this processing with empty field and line accumulators and prepend the list of fields to the lines accumulator Acc.
parse(<<Char, Rest/binary>>, Field, Line, Acc) -> parse(Rest, [Char|Field], Line, Acc);
parse(<<>>, [], [], Acc) ->
{ok, lists:reverse(Acc)};parse(<<>>, Field, Line, Acc) -> parse(<<$\n>>, Field, Line, Acc).What do we do if our accumulators are not empty by the time we are done processing? We can add custom processing code, of course, but wouldn't it be better to leverage code that we have already written? We already go through the proper motions when we find a new line so to make our job easier we make it look like we found one. We continue parsing by creating a small fake binary with a single new line character.Processing binary data the hard way You did not misread it! Yes, the customary way of processing Erlang binaries is the hard way. It's low level and involves lots of typing and a good deal of code duplication. It's also the fastest and most efficient way.I will show you a less efficient but more structured way to process binary data later in this chapter. We need to learn to walk before we learn to run so lets take a look at how you normally process binary data in Erlang.Here's a chunk of the binary protocol that my OpenPoker server uses. Packet format:
0 1 2 N +--+---+--- ... + | Size | Body | +------+--- ... +Body:
0 1 N +------+--- ... ---+ | Type | Arguments | +------+--- ... ---+Each packet starts with a 2-byte packet size then a 1-byte packet type and the data payload. The body of a NOTIFY_JOIN command will then look like this:
0 1 5 9 10 12 +----+-----+-----+-------+-----+ | 21 | GID | PID | Seat# | Seq | +----+-----+-----+-------+-----+GID and PID are 4-byte integers, the seat number is a byte and the sequence number a 2-byte integer. We need one function to read this command from a binary packet and return something easy to deal with, e.g. a tuple.
read(<<?PP_NOTIFY_JOIN, GID:32, PID:32, SeatNum, Seq:16>>) ->
{21, GID, PID, SeatNum, Seq};To send the command out through the socket, we first need to convert the tuple to a binary. write({21, GID, Player, SeatNum, Seq})
when is_number(GID),
is_pid(Player),
is_number(SeatNum),
is_number(Seq) ->
PID = gen_server:call(Player, 'ID'),
<<21, GID:32, PID:32, SeatNum, Seq:16>>;There are scores of commands in the OpenPoker protocol and the number will grow as new functionality is added. I did write code like the above for each and every command in the OpenPoker protocol and I wish I knew a way that enabled more code reuse.What you should walk away with here is that reading and writing binary data in Erlang is simple and straightforward. Pickler combinatorsThe OpenPoker protocol handling code is quite verbose. The protocol is also very much flat as it does not involve nested data structures. There is a way to describe reading and writing of structured data and generally save ourselves time and typing.Andrew Kennedy coined the term pickler combinator in his 2004 'Functional Pearl of the same name. He wrote that
The tedium of writing pickling and unpickling functions by hand is relieved using a combinator library similar in spirit to the well-known parser combinators. Picklers for primitive types are combined to support tupling, alternation, recursion, and structure sharing.
-module(pickle). -export([pickle/2, unpickle/2, test/0]). -export([byte/0, short/0, sshort/0, int/0, sint/0, long/0, slong/0]). -export([list/2, choice/2, optional/1, wrap/2, tuple/1, record/2, binary/1, wstring/0]). -compile([export_all]).Let's design and implement a pickling module that will save us a lot of typing down the road. The goal is to save us a lot of typing and enable us to describe our packet formats in terms of bytes, words and strings, as opposed to bits and binaries.
%%% Pickle and unpickle. We accumulate into a list.
pickle({Pickler, _}, Value) ->
lists:reverse(Pickler([], Value)).
unpickle({_, Pickler}, Bin) ->
element(1, Pickler(Bin)).pickle and unpickle are responsible for doing the work for us and like a good manager they delegate the bulk of the work to their underlings. Note that the pickler combinator is represented by a two-element tuple where the first element is the function used for pickling and the second element for unpickling.To pickle any Erlang term, we give pickle the tuple representing the pickler combinator as well as the value. Binaries, while convenient, don't lend themselves to accumulating values so we accumulate the pickled data into a list. Fortunately for us, the Erlang input/output system can take lists and convert them to binaries for us.To unpickle a binary we invoke the un-pickler on it and take the first element of the resulting tuple. Why does the unpickler return a tuple? We may have data left over from processing and it needs to be stored somewhere. The first element of the tuple stores the result of the unpickle operation and the second stores the remainder of the data.Without further ado lets implement a pickler combinator for serializing byte data. %%% Byte
byte() ->
{fun write_byte/2, fun read_byte/1}.
write_byte(Acc, Byte) ->
[<<Byte:8>>|Acc].
read_byte(Bin) ->
<<Byte:8, Rest/binary>> = Bin,
{Byte, Rest}. byte is the name of the combinator and byte/0 simply returns a tuple of the pickler and unpickler functions. This is the pattern that we will be using over and over again. Also, picklers take the list serving as accumulator as their fist argument and the Erlang value as their second argument. To pickle a byte we simply tell Erlang to prepend the binary representation of the byte <<Byte:8>> to the accumulator list. To unpickle a byte, read_byte/1 splits the binary into the byte itself and the remainder of the data and returns both as a tuple.Simple, isn't it?A pickler combinator for unsigned short values stored in little-endian format looks like this. %%% Unsigned short
short() ->
{fun write_short/2, fun read_short/1}.
write_short(Acc, Word) ->
[<<Word:16/little>>|Acc].
read_short(Bin) ->
<<Word:16/little, Rest/binary>> = Bin,
{Word, Rest}. Signed short is implemented similarly, the only difference is that we use little-signed instead of little. %%% Signed short
sshort() ->
{fun write_sshort/2, fun read_sshort/1}.
write_sshort(Acc, Word) ->
[<<Word:16/little-signed>>|Acc].
read_sshort(Bin) ->
<<Word:16/little-signed, Rest/binary>> = Bin,
{Word, Rest}. I will skip the implementation of signed and unsigned integers and long integers. You can find the full code at the end of this book. It's much more interesting to ponder the design of a list combinator. %%% List. We supply a pickler for list length
%%% as well as a pickler for list elements.
list(Len, Elem) ->
{fun(Acc, List) -> write_list(Len, Elem, Acc, List) end,
fun(Bin) -> read_list(Len, Elem, Bin) end }. To pickle and unpickle a list we need to pickle the length of the list and then pickle each element of the list in sequence. The only requirement for the picker and unpickler functions is to take certain arguments and return values in the format that the library expects.list/2 returns a tuple of anonymous functions that follow this convention. The functions that are ultimately invoked look a bit different. write_list({Len, _}, {Elem, _}, Acc, List) ->
Acc1 = Len(Acc, length(List)),
Fun = fun(A, Acc2) -> Elem(Acc2, A) end,
lists:foldr(Fun, Acc1, List). write_list/4 takes both the pickler for the length of the list and the pickler for each element. Remember that each pickler returns a list of pickled binary chunks. The length pickler itself is primed with the list of previously pickled chunks given to write_list/4 (Acc). Acc1 is the list of binary chunks resulting from the pickling of the list length (Len) together with the list of chunks accumulated before the call to write_list/4. I won't be explaining lists:foldr/3 but we are creating an anonymous function (Fun) that will be invoked for each element of the to-be-pickled list we are given (List). This anonymous function will then invoke the list element pickler and return the list of binary chunks we have pickled so far, to prime list:foldr/3 on its next iteration.Reading a pickled list aka unpickling is, again, a two step operation. read_list({_, Len}, {_, Elem}, Bin) ->
{N, Bin1} = Len(Bin),
read_list(N, [], Elem, Bin1). First we unpickle the list length. This gives us both the number of elements to read and the remainder of our binary data. We then proceed to unpickle the list itself using the data that was left over. read_list(0, Acc, _, Bin) -> {Acc, Bin};
read_list(N, Acc, Elem, Bin) ->
{E, Bin1} = Elem(Bin),
read_list(N - 1, [E|Acc], Elem, Bin1). There are no for loops in Erlang but they are easily emulated with recursion. We unpickle each element of the list, accumulate it and proceed to unpickle the next element after decrementing the number of elements read. We finish once we have reached 0.Often, we will need to store different data depending on some flag. This requires us to both store the value of the flag and store the data. The choice combinator implements this alternative selection. %%% Alternative selection. This could probably use some
%%% deeper thinking. Otherwise, we take a pickler for the tag
%%% as well as a tuple of two functions. The first one
%%% returns the tag value and a pickler based on the supplied
%%% value. The second one selects a pickler based on a tag value.
choice(Tag, Choice) ->
{fun(Acc, Value) -> write_choice(Tag, Choice, Acc, Value) end,
fun(Bin) -> read_choice(Tag, Choice, Bin) end }. There's not much to say about choice itself. It looks a lot like the list combinator that we have just implemented. write_choice({Tag, _}, {Choice, _}, Acc, Value)
when is_function(Tag),
is_function(Choice) ->
{T, {Pickler, _}} = Choice(Value),
Acc1 = Tag(Acc, T),
Pickler(Acc1, Value). read_choice({_, Tag}, {_, Choice}, Bin)
when is_function(Tag),
is_function(Choice) ->
{T, Bin1} = Tag(Bin),
{_, Pickler} = Choice(T),
Pickler(Bin1). We work in the opposite direction to unpickle. We first unpickle the tag and let the choice function (selector) give us the combinator to use for the data. We use this combinator to unpickle the data. %%% Optional value. Use 'none' to indicate no value.
optional(Pickler) ->
{fun(Acc, Value) -> write_optional(Pickler, Acc, Value) end,
fun(Bin) -> read_optional(Pickler, Bin) end}.
write_optional(_, Acc, none) ->
[<<0>>|Acc];
write_optional({Pickler, _}, Acc, Value) ->
Pickler([<<1>>|Acc], Value).
read_optional({_, Pickler}, Bin) ->
<<Opt:8, Bin1/binary>> = Bin,
case Opt of
0 -> {none, Bin1};
_ -> Pickler(Bin1)
end.
To serialize optional values (i.e. value or nothing) we pickle 0 when there's no value and 1 otherwise.The value of the wrapper combinator may not be readily apparent but bear with me, it will come in handy to implement serialization of enumerations! %%% Wrapper. Take a pickler and a wrapper tuple of two functions
%%% where the first one is used to convert the value before
%%% pickling and the second one after unpickling.
wrap(Wrap, Pickler) ->
{fun(Acc, Value) -> write_wrap(Wrap, Pickler, Acc, Value) end,
fun(Bin) -> read_wrap(Wrap, Pickler, Bin) end}.
write_wrap({Wrap, _}, {Pickler, _}, Acc, Value) ->
Pickler(Acc, Wrap(Value)).
read_wrap({_, Wrap}, {_, Pickler}, Bin) ->
{Value, Bin1} = Pickler(Bin),
{Wrap(Value), Bin1}.enum(Enum, Pickler) -> wrap(wrap_enum(Enum), Pickler).The enum/2 combinator takes both the format of the enumeration (list or tuple) as well as the pickler for the enumeration value. It's clear in its simplicity but it's also clear that it's hiding something!What is wrap_enum/1 and why do we need it?
wrap_enum(Enum) when is_tuple(Enum) -> wrap_enum_1(prep_enum_tuple(Enum)); wrap_enum(Enum) when is_list(Enum) -> wrap_enum_1(prep_enum_list(Enum)).Recall that the format of our enumeration can be given both as a tuple and a list. We will pre-process the enumeration format to convert it to a list when it's given to us as a tuple.
prep_enum_tuple(Enum)
when is_tuple(Enum) ->
prep_enum_tuple(Enum, size(Enum), [], []).
prep_enum_tuple(_, 0, Acc1, Acc2) ->
{Acc1, Acc2};
prep_enum_tuple(Enum, N, Acc1, Acc2) ->
prep_enum_tuple(Enum, N - 1,
[{element(N, Enum), N}|Acc1],
[{N, element(N, Enum)}|Acc2]). The above will convert {cow, sheep, horse} into a pair (tuple of two elements) of lists [{cow, 1}, {sheep, 2}, {horse, 3}] and [{1, cow}, {2, sheep}, {3, horse}]. We need the regular list to convert cow into 1 for pickling and the inverted list to convert 1 back into cow. prep_enum_list(Enum)
when is_list(Enum) ->
% expect a list of {tag, #value} pairs
Inv = fun({Key, Val}) -> {Val, Key} end,
InvEnum = lists:map(Inv, Enum),
{Enum, InvEnum}. The only thing we need to do when the format of the enumeration is a list is to create the inverted list of pairs. The anonymous function Inv swaps {cow, 100} for {100, cow}. We use lists:map/2 to apply our anonymous function to each element of the enumeration specification list, thus inverting each element. And here's the ultimate wrapper, the function that we spent so much time gearing up to! wrap_enum_1({List1, List2}) ->
F = fun(A, B) -> A < B end,
%% gb_trees needs an ordered list
Dict1 = lists:sort(F, List1),
Dict2 = lists:sort(F, List2),
Tree1 = gb_trees:from_orddict(Dict1),
Tree2 = gb_trees:from_orddict(Dict2),
{fun(Key) -> gb_trees:get(Key, Tree1) end,
fun(Key) -> gb_trees:get(Key, Tree2) end}.The gb_trees module implements a lightweight hash table using Erlang tuples. We order our enumeration lists and convert them to gb_trees to make lookups simple. We save two trees, one to look up cow using one and another to look one up using cow. This completes our processing of enumerations. There's one last hill to climb and then I promise you that we will be in combinator nirvana! But first we need to handle serialization of tuples and Erlang records. %%% Tuple. Uses a tuple of picklers of the same size.
tuple(Picklers)
when is_tuple(Picklers) ->
wrap({fun tuple_to_list/1,
fun list_to_tuple/1},
tuple_0(tuple_to_list(Picklers))). We reuse the wraping code we just develop to ensure that each tuple is pickled as a list and that each list is converted back to a tuple after we deserialize it. record(Tag, Picklers)
when is_tuple(Picklers) ->
wrap({fun(Record) -> record_to_list(Tag, Record) end,
fun(List) -> list_to_record(Tag, List) end},
tuple_0(tuple_to_list(Picklers))). We rely on Erlang records being tuples and just add the record tag as the first element when unpickling the record. write_tuple_0([], Acc, _) ->
Acc;
write_tuple_0([{Pickler, _}|Rest], Acc, [Value|Tuple]) ->
write_tuple_0(Rest, Pickler(Acc, Value), Tuple).
read_tuple_0(Picklers, Bin) ->
read_tuple_0(Picklers, Bin, []).
read_tuple_0([], Bin, Acc) ->
{lists:reverse(Acc), Bin};
read_tuple_0([{_, Pickler}|Rest], Bin, Acc) ->
{Value, Bin1} = Pickler(Bin),
read_tuple_0(Rest, Bin1, [Value|Acc]). We serialize a tuple by requiring a tuple of combinators of the same length.We then convert the tuple of combinators to a list to simplify processing. We pickle and unpickle recursively, using the accumulator idiom. %%% It's convenient to be able to convert the tuple
%%% to a list first as there's no erlang:prepend_element/2.
tuple_0(Picklers)
when is_list(Picklers) ->
{fun(Acc, Value) -> write_tuple_0(Picklers, Acc, Value) end,
fun(Bin) -> read_tuple_0(Picklers, Bin) end}.
record_to_list(Tag, Record)
when is_atom(Tag) ->
lists:nthtail(1, tuple_to_list(Record)).
list_to_record(Tag, List)
when is_atom(Tag),
is_list(List) ->
list_to_tuple([Tag|List]). Erlang records are regular tuples where the first element of the tuple is the record tag. When pickling a record we convert the record to a list and drop the first element (tag). When unpickling we prepend the tag. We never store the tag itself since the record combinator knows it.We'll skip the implementation of the binary combinator since it's nothing fancy. You can find the full code at the end of the book.Now is a good time to tackle testing.Erlang macro facilities could definitely be better but they are also nothing to sneeze at! Lets define a couple of macros to help us with writing our unit tests. -define(error1(Expr, Expected, Actual),
io:format("~s is ~w instead of ~w at ~w:~w~n",
[??Expr, Actual, Expected, ?MODULE, ?LINE])).
-define(match(Expected, Expr),
fun() ->
Actual = (catch (Expr)),
case Actual of
Expected ->
{success, Actual};
_ ->
?error1(Expr, Expected, Actual),
erlang:error("match failed", Actual)
end
end()). We will be using match/2 a lot to make sure that whatever we pickle matches whatever we get after unpickling.check(Pickler, Value) -> X = pickle(Pickler, Value), Bin = list_to_binary(X), unpickle(Pickler, Bin).This little function is our testing workhorse. It takes a pickler combinator and a value and proceed to pickle a value and unpicle it using the same combinator.
test1() -> X = 16#ff, ?match(X, check(byte(), X)). test3() -> X = -1, ?match(X, check(sshort(), X)).This is what our tests look like. We give check a combinator and a value and use our match macro to check results. I'll skip a few non-essential tests since you have the full source code at the end of the book. I do want to spend time on other tests, though, since these tests server as the manual and documentation for our pickler combinator library.
test8() -> X = "Wazzup!", ?match(X, check(list(int(), byte()), X)).We serialize the length of the list as an integer and each elment as a byte. Erlang strings are nothing but lists of integers but ASCII character values fit within a byte. There's nothing preventing you from using the int combinator to serialize elements of the list, though.
value2tag(Action)
when is_list(Action) ->
{0, list(byte(), byte())}; We want to serialize either a list of bytes where the length of the list is also stored as a byte or a long value. We use 0 as a tag for the list. value2tag(_) ->
{1, long()}. And store 1 when the following value is a long.tag2value(0) -> list(byte(), byte()); tag2value(1) -> long().We reverse things when unpickling. If a tag value of 0 is found then we return the list combinator and otherwise return the long one.
selector() ->
{fun value2tag/1, fun tag2value/1}. selector simply combines the above tag convertors together.test9() -> X1 = "Just testing", X2 = 16#ffff, ?match(X1, check(choice(byte(), selector()), X1)), ?match(X2, check(choice(byte(), selector()), X2)).We need to make sure that pickling and unpickling both the string (list) and the long value works. And it does!
test10() -> X1 = none, X2 = 55, ?match(X1, check(optional(byte()), X1)), ?match(X2, check(optional(byte()), X2)).We use none to stand for no value.
test11() ->
%% tuple enum
Enum1 = {cow, sheep, horse},
{FROM1, TO1} = wrap_enum(Enum1),
?match(1, FROM1(cow)),
?match(2, FROM1(sheep)),
?match(3, FROM1(horse)),
?match(cow, TO1(1)),
?match(sheep, TO1(2)),
?match(horse, TO1(3)),
%% list enum
Enum2 = [{cow, 20}, {sheep, 30}, {horse, 40}],
{FROM2, TO2} = wrap_enum(Enum2),
?match(20, FROM2(cow)),
?match(30, FROM2(sheep)),
?match(40, FROM2(horse)),
?match(cow, TO2(20)),
?match(sheep, TO2(30)),
?match(horse, TO2(40)). Personally, I found the enumeration combinator the most tedious to describe, probably because there's so much supporting code. That supporting code needs to be tested and we do it here. wrap_enum works like a charm! test12() ->
Enum1 = {cow, sheep, horse},
Enum2 = [{cow, 20}, {sheep, 30}, {horse, 40}],
?match(cow, check(enum(Enum1, byte()), cow)),
?match(sheep, check(enum(Enum2, byte()), sheep)). Once we know our supporting code works, we can proceed with the rest. Here we test enumerations given as a tuple and a list of key/value pairs. Farm animals galore! test13() ->
Tuple = {"Joel", 16#ff00, none},
Spec = {list(byte(),byte()), short(), optional(byte())},
?match(Tuple, check(tuple(Spec), Tuple)). Tuple combinators, remember those? We use a tuple of combinators to pickle a tuple of values. The size of both tuples must be the same but the values can be anything. Here we use a string, a short and an optional byte. -record(foo, { a, b }).
-record(bar, { c, d }).
-record(baz, { e, f }). Records are important to an Erlang programmer, even though they are syntactic sugar on top of tuples. I would be quickly left without hair to tear out, if I always had to use element/2 to access tuple elements by number.We define a we records for the purposes of testing. test14() ->
R = #baz{ e = 30, f = "Enough nesting!" },
R1 = #foo{ a = 10, b = #bar{ c = 20, d = R }},
Pickler = record(foo, {
byte(),
record(bar, {
int(),
record(baz, {
sshort(),
list(byte(), byte())
})
})
}),
?match(R1, check(Pickler, R1)). Here's the mother of all tests where we pickle a bunch of nested records with mixed values and restore them back to their original glory. This concludes our presentation of pickler combinators. Please feel free to use them as you see fit and don't forget to send me any ideas for improvement!The last thing to note is that the pickler combinator approach is definitely less efficient than processing binaries directly. Still, it is suitable for lots of applications where programming productivity is more important than ultimate efficiency.Full pickler combinator source code is available.This article describes how to upgrade an Erlang cluster in one fell swoop once you have deployed it on Amazon EC2.
Why not the Erlang/OTP upgrade procedureThe standard and sanctioned way of deploying and upgrading Erlang applications is described in chapters 10-12 of the OTP Design Principles. Calling the upgrade procedure complex is an understatement. Bowing to the OTP application packaging procedure, I wanted to have a way of upgrading applications with a "push of a button". More precisely, I wanted to be able to type make:all() to rebuild my application and then type sync:all() to push updated modules to all nodes in my cluster. These nodes were previously set up as diskless Amazon EC2 nodes that fetch their code from the boot server since I didn't want to reinvent the application packaging wheel.The sync applicationThe principal application deployed in the cluster is the sync app. This is a gen_server set up according to chapter 2 of the OTP Design Principles. The gen_server handles requests to restart the Erlang node without shutting down and set environment variables, as well as requests upgrade the code by application or by process. Each sync gen_server joins the 'SYNC' distributed named process group and this is what enables upgrade of the whole cluster in one fell swoop. The sync server will invoke init:restart/0 to restart the node without shutting down upon receiving a RESTART request. This is incredibly handy since the restart sequence takes the contents of the Erlang VM to the trash can and then repeats the same steps taken by the Erlang VM when it is started from the command line. Which is to say that the VM loads the boot file from the boot server, parses the boot file, downloads the applications and runs them. If we have upgraded the code on the boot server then the Erlang VM will run new code after a restart. Upgrading by application or by processThe above procedure is quite intrusive since all apps running in the Erlang VM are killed. Any Erlang node will normally be running a number of apps and you may want to upgrade just one or two of them. This is where the "upgrade by application" procedure comes in. application:get_application/1 will give you the name of the application that a module belongs to. I build a unique list of applications that my changed modules belong to and then stop each application with application:stop/1, re-load changed modules and start the application with application:start/1. The upgrade process by process procedure first grabs a list of all processes running in the same node as the sync gen_server. It does this by calling processes(). I check whether each process is running the code in one of the modified modules using erlang:check_process_code/2. Next, I suspend affected processes with erlang:suspend_process/1, re-load changed modules with erlang:resume_process/1 and I'm done.Reloading modules for fun and profitI'm still not absolutely sure if I got reloading of changed modules right but it looks like this load_modules([]) ->
ok;
load_modules([Mod|T]) ->
code:purge(Mod),
code:soft_purge(Mod),
{module, Mod} = code:load_file(Mod),
load_modules(T).The need to call code:soft_purge/1 after code:purge/1 was determined empirically.Everything I have described thus far is small bits of code. The biggest chunk of code in the sync server figures out what modules were modified. What to reload: Inspecting module versionsRemember my original intent to run make:all/0 followed by sync:all/0 to upgrade all nodes in the cluster at the same time? It's only possible because 1) it's possible to grab the module version from a module loaded into memory, 2) it's possible to grab the same from a module on disk and, crucially, modules are not reloaded when make:all/0 is run.The module version defaults to the MD5 checksum of the module if no -vsn(Vsn) attribute is given. For the life of me I can't remember where Module:module_info() is documented but this is what you use to grab the attributes of the module. It's a property list so you can use proplists:get_value/2 to grab the vsn property and thus the module version.To take advantage of local processing power, the API initiating the upgrade request does no work apart from inspecting the SYNC distributed named process group and telling each sync gen_server in the group to initiate the upgrade procedure. This means that each module loaded into the Erlang node hosting the sync server needs to be checked for changes.Grabbing the version of the BEAM file holding the code for a given module is done using beam_lib:version/1. This is complicated by the fact that all of the Erlang EC2 nodes in the cluster download their code from the boot server. Normally, beam_lib:version/1 takes either a module name, a file name or a binary. I haven't documented why I'm not using a module name or a file name in the boot server scenario but I must have found them not to work. I had to resort to fetching the module BEAM file from the boot server and inspecting that. Fortunately, traffic between EC2 instances is free and fast and the same applies to your LAN.To find out if a module is modified I grab the list of loaded modules with code:all_loaded/0 and inspect each module with code:is_loaded/1. I skip preloaded modules (see documentation for code:is_loaded) and use the path returned otherwise to instruct erl_prim_loader:get_file/1 to fetch the BEAM file. I then pass the file contents to beam_lib:version/1 and I have my disk version. After that it's a simple matter of comparing the two versions and reloading the module if they differ.
#!/bin/sh
COOKIE=RRFJBVGLSOUPFLWVEYJP
BOOT=/Users/joelr/work/erlang/sync/ebin/diskless
HOST=192.168.1.33
ID=diskless
erl -name $ID -boot $BOOT -setcookie $COOKIE -id $ID -loader inet -hosts $HOST -mode embedded ${1+"$@"}You will also need to create a boot file which must be created with full paths inside (local option to systools:make_script/2).To build a boot file I use these two lines of Erlang:code:add_path("./ebin").
systools:make_script("diskless", [local, {outdir, "./ebin"}]).
It started innocently enough with my wondering why High-performance Erlang (HiPE) was not available on my shiny new MacBook Pro. This turned into a long but awesome email exchange and culminated in victory, and my port of HiPE to Mac Intel, less than two weeks later.
I learned a fair number of things about the Mac OSX kernel, the FPU, floating-point exceptions, Intel 32-bit architecture and SSE2 along the way. I also had an awesome time with assembler code.I could not have done it without the mach_override library and numerous tips from Mikael Pettersson as well as other folks on the Erlang Questions mailing list. I will be shipping my patches to the HiPE team this week.