SE 765 - Distributed Software Development CS 610 - Introduction to Parallel
and Distributed Computing |
Erlang 3 – Distributed Programming I |
This
lecture uses materials and has been adapted from:
http://www.erlang.org/doc/pdf/otp-system-documentation.pdf
http://learnyousomeerlang.com/distribunomicon
Setting
up an Erlang
Cluster
·
Erlang
gives names to each of the nodes to be able to locate and contact them.
·
The names are of the form Name@Host,
where the host is based on available DNS entries, either over the network or in
your computer's host files (/etc/hosts on OSX, Linux and other Unix-likes, C:\Windows\system32\drivers\etc\hosts for
most Windows installs).
·
All names need to be unique to avoid
conflicts — if you try to start a node with the same name as another one on the
same exact hostname, you'll get a crash message.
There
are two types of names: short names and long names.
·
Long names are based on fully qualified
domain names (aaa.bbb.ccc),
and many DNS resolvers consider a domain name to be fully qualified if they
have a period (.) inside of it.
·
Short names will be based on host names
without a period, and are resolved going through your host file or through any
possible DNS entry.
Because
of this, it is generally easier to set up a bunch of Erlang
nodes together on a single computer using short names than long names.
One
last thing: because names need to be unique, nodes with short names cannot
communicate with nodes that have long names, and the opposite is also true.
To
pick between long and short names, you can start the Erlang
VM with two different options:
erl -sname short_name@domain
or
erl -name long_name@some.domain.
Note: that you can also
start nodes with only the names: erl -sname short_name or erl -name long_name.
·
Erlang
will automatically attribute a host name based on your operating system's
configuration.
·
Lastly, you also have the option of
starting a node with a name such as erl
-name name@127.0.0.1 to
give a direct IP.
Note: Windows users should
still use werl instead of erl.
However, in order to start distributed nodes and giving them
a name, the node should be started from the command line instead of clicking some shortcut or executable.
Let's
start two nodes:
erl -sname
ketchup ... (ketchup@ferdmbp)1> erl -sname
fries ... (fries@ferdmbp)1> |
To
connect fries with ketchup (and make a delicious cluster) go to the first shell
and enter the following function:
(ketchup@ferdmbp)1> net_kernel:connect(fries@ferdmbp). true |
The
net_kernel:connect(NodeName) function sets up a
connection with another Erlang node (some tutorials
use net_adm:ping(Node)
·
If you see true as the result from
the function call, congratulations, you're in distributed Erlang
mode now.
·
If you see false, then you're in for a
world of hurt trying to get your network to play nice. For a very quick fix,
edit your host files to accept whatever host you want. Try again and see if it
works.
You
can see your own node name by calling the BIF node()
and
see who you're connecting to by calling the BIF nodes():
(ketchup@ferdmbp)2> node(). ketchup@ferdmbp (ketchup@ferdmbp)3> nodes(). [fries@ferdmbp] |
To
get the nodes communicating together, register each shell's process as shell locally:
(ketchup@ferdmbp)4> register(shell, self()). True (fries@ferdmbp)1> register(shell, self()). true |
You
can call the process by name.
The
way to do it is to send a message to {Name, Node}.
Let's
try this on both shells:
(ketchup@ferdmbp)5> {shell, fries@ferdmbp} !
{hello,
from,
self()}. {hello,from,<0.52.0>} (fries@ferdmbp)2> receive
{hello, from, OtherShell} -> OtherShell ! <<"hey there!">> end. <<"hey
there!">> |
So
the message is apparently received, and we send something to the other shell,
which receives it:
(ketchup@ferdmbp)6> flush(). Shell got
<<"hey there!">> ok |
·
We transparently send tuples, atoms, pids, and binaries without a problem.
·
Any other Erlang
data structure is fine too. And that's it.
If
you recall the beginning of the chapter, I had mentioned the idea that all Erlang nodes are set up as meshes. If someone connects to a
node, it gets connected to all the other nodes. There are times where what you
want to do is run different Erlang node clusters on
the same piece of hardware. In these cases, you do not want to be accidentally
connecting two Erlang node clusters together.
Because
of this, the designers of Erlang added a little token
value called a cookie. While documents like the official Erlang documentation put cookies under the topic of
security, they're really not security at all. If it is, it has to be seen as a
joke, because there's no way anybody serious considers the cookie a safe thing.
Why? Simply because the cookie is a little unique value that
must be shared between nodes to allow them to connect together. They're
closer to the idea of user names than passwords and I'm pretty sure nobody
would consider having a username (and nothing else) as a security feature.
Cookies make way more sense as a mechanism used to divide clusters of nodes
than as an authentication mechanism.
To give a cookie to a node, just start it by adding a -setcookie Cookie argument to the
command line.
Let's try again with two new nodes:
$ erl -sname salad -setcookie 'myvoiceismypassword' ... (salad@ferdmbp)1> $ erl -sname mustard -setcookie 'opensesame' ... (mustard@ferdmbp)1> |
Now
both nodes have different cookies and they shouldn't be able to communicate
together:
(salad@ferdmbp)1> net_kernel:connect(mustard@ferdmbp). false |
This
one has been denied. Not many explanations. However, if we look at the mustard
node:
=ERROR REPORT====
10-Dec-2011::13:39:27 === ** Connection
attempt from disallowed node salad@ferdmbp ** |
Good.
Now what if we did really want salad and mustard to be together? There's a BIF
called erlang:set_cookie/2 to do what we need.
If you call erlang:set_cookie(OtherNode,
Cookie),
you will use that cookie only when connecting to that other node. If you
instead use erlang:set_cookie(node(), Cookie), you'll be changing
the node's current cookie for all future connections. To see the changes, use erlang:get_cookie():
(salad@ferdmbp)2> erlang:get_cookie(). myvoiceismypassword (salad@ferdmbp)3> erlang:set_cookie(mustard@ferdmbp,
opensesame). true (salad@ferdmbp)4> erlang:get_cookie(). myvoiceismypassword (salad@ferdmbp)5> net_kernel:connect(mustard@ferdmbp). true (salad@ferdmbp)6> erlang:set_cookie(node(), now_it_changes). true (salad@ferdmbp)7> erlang:get_cookie(). now_it_changes |
One
of the first things we've learned in Erlang was how
to interrupt running code using ^G (CTRL + G). In there, we had seen a menu for
distributed shells:
(salad@ferdmbp)1> User switch command --> h c [nn]
- connect to job i [nn]
- interrupt job k [nn]
- kill job j
- list all jobs s
[shell] - start local shell r [node
[shell]] - start remote shell q
- quit erlang ? |
h -
this message |
The
r [node [shell]]
option
is the one we're looking for. We can start a job on the mustard node by doing as
follows:
--> r mustard@ferdmbp --> j 1 {shell,start,[init]} 2* {mustard@ferdmbp,shell,start,[]} --> c Eshell V5.8.4 (abort with ^G) (mustard@ferdmbp)1> node(). mustard@ferdmbp |
Remote
shell can be used the same way you would with a local one.
By
using ^G again, you can go
back to your original node.
Be
careful when you stop your session though. If you call q() or init:stop(),
you'll
be terminating the remote node!
Ping pong example modified to run on two
separate nodes:
-module(tut17).
-export([start_ping/1, start_pong/0,
ping/2, pong/0]).
ping(0, Pong_Node) ->
{pong, Pong_Node} ! finished,
io:format("ping
finished~n", []);
ping(N, Pong_Node) ->
{pong, Pong_Node} ! {ping, self()},
receive
pong ->
io:format("Ping received pong~n",
[])
end,
ping(N - 1, Pong_Node).
pong() ->
receive
finished ->
io:format("Pong finished~n",
[]);
{ping, Ping_PID} ->
io:format("Pong received ping~n",
[]),
Ping_PID ! pong,
pong()
end.
start_pong() ->
register(pong,
spawn(tut17, pong, [])).
start_ping(Pong_Node) ->
spawn(tut17, ping,
[3, Pong_Node]).
Assume two
computers called gollum and kosken.
We will start
a node on kosken called ping and then a node on gollum called pong.
On kosken (on a Linux/Unix system):
kosken> erl
-sname ping
Erlang (BEAM) emulator version 5.2.3.7 [hipe]
[threads:0]
Eshell V5.2.3.7
(abort with ^G)
(ping@kosken)1>
On gollum:
gollum> erl
-sname pong
Erlang (BEAM) emulator version 5.2.3.7 [hipe]
[threads:0]
Eshell V5.2.3.7
(abort with ^G)
(pong@gollum)1>
Now start the
"pong" process on gollum:
(pong@gollum)1> tut17:start_pong().
true
and start the "ping"
process on kosken (from the code above you will see
that a parameter of the start_ping function is the node name
of the Erlang system where "pong" is
running):
(ping@kosken)1> tut17:start_ping(pong@gollum).
<0.37.0>
Ping received
pong
Ping received
pong
Ping received
pong
ping finished
Here we see
that the ping pong program has run, on the "pong" side we see:
(pong@gollum)2>
Pong received
ping
Pong received
ping
Pong received
ping
Pong
finished
(pong@gollum)2>
Looking at
the tut17 code we see that the pong function itself is unchanged, the lines:
{ping, Ping_PID}
->
io:format("Pong received ping~n", []),
Ping_PID ! pong,
works in the same way
irrespective of on which node the "ping" process is executing.
Erlang pids contain information about where the process executes
so if you know the pid of a
process, the "!" operator can be used to send it a message if the
process is on the same node or on a different node.
A difference
is how we send messages to a registered process on another node:
{pong, Pong_Node} ! {ping, self()},
We use a
tuple {registered_name,node_name} instead of just the registered_name.
· In the previous example, we started
"ping" and "pong" from the shells of two separate Erlang nodes.
·
spawn can also be used to start
processes in other nodes.
·
The next example is the
ping pong program, yet again, but this time we will start "ping" in
another node:
-module(tut18).
-export([start/1,
ping/2, pong/0]).
ping(0, Pong_Node) ->
{pong, Pong_Node} ! finished,
io:format("ping
finished~n", []);
ping(N, Pong_Node) ->
{pong, Pong_Node} ! {ping, self()},
receive
pong ->
io:format("Ping received pong~n",
[])
end,
ping(N - 1, Pong_Node).
pong() ->
receive
finished ->
io:format("Pong finished~n",
[]);
{ping, Ping_PID} ->
io:format("Pong received ping~n",
[]),
Ping_PID ! pong,
pong()
end.
start(Ping_Node) ->
register(pong,
spawn(tut18, pong, [])),
spawn(Ping_Node, tut18, ping, [3, node()]).
Assuming an Erlang system called ping (but not the "ping"
process) has already been started on kosken, then on gollum we do:
(pong@gollum)1> tut18:start(ping@kosken).
<3934.39.0>
Pong received
ping
Ping received
pong
Pong received
ping
Ping received
pong
Pong received
ping
Ping received pong
Pong finished
ping finished
Notice we get
all the output on gollum. This is because the io system finds out where the
process is spawned from and sends all output there.