\documentclass[11pt]{article}
\title{\sc MECCAH: The Mathematics Extreme Computation Cluster At Harvard}
\author{William A.~Stein}
\date{August 2002}
\begin{document}
\maketitle
In Spring 2002 I assembled and configured MECCAH, a rack-mounted
cluster of six fast computers for use by mathematicians doing
demanding computational work. This article describes my experience
building and maintaining MECCAH; it should be useful to anyone
considering undertaking or funding a similar project at their
institution.

As a graduate student at Berkeley and then as a faculty member at
Harvard, I found that the computational resources available to me at
my host universities consisted of scattered Sun workstations running
at about one-fourth the raw speed of then-current Pentium processors.
These machines spent much of their time running Netscape and
typesetting \TeX{} documents, so they were not suitable for demanding
computations that could easily consume all available resources. At
each institution a senior faculty member had a powerful computer
(McMullen at Berkeley, Elkies at Harvard), but it was for his own
personal use.

In 2001, the Harvard sysadmin, Arthur Gaer, mentioned that the
department was tentatively considering spending several tens of
thousands of dollars (which it did not yet have) on a single
multi-processor Sun workstation to support computation-intensive work.
My opinion was that such a workstation would be solid but of limited
use; its raw computational power would scarcely touch what two cheap
Intel-based Linux boxes could deliver, though the Linux boxes would
likely be less reliable.

I decided to build a cluster of dual-processor machines running Linux.
I did some research, discussed possible configurations with Berkeley
grad student Wayne Whitney and Harvard undergraduate Alex Healy, and
requested money. I eventually secured a grant of \$6000 from Harvard,
and Harvard alumnus William Randolph Hearst III gave me an additional
\$14000, bringing the budget to \$20000.

I decided to assemble an Athlon-based system. The Athlon 2000MP is a
multi-processor-ready Pentium-like CPU that AMD claims has performance
similar to that of a 2GHz Pentium 4. I selected the Athlon 2000MP
processor in March because it was the fastest budget-priced
multi-processor-capable CPU on the market. Intel's only fast
multi-processor-capable CPU was the Xeon, which was then much more
expensive (the Xeon might be a good choice today). Six months later,
AMD has only just announced the 2200MP, so I don't feel that the
Athlon 2000MPs are out of date.

In February 2002, I ordered six custom-built Athlon 2000MP machines in
2U rack-mount cases from {\tt www.pcsforeveryone.com}, a local
Cambridge ``chop shop''. They ordered the parts I wanted, assembled
them, tested them, found surprisingly often that parts were defective,
got replacements, and finally delivered the individual computers. I
still have occasional hardware reliability problems with one of the
nodes, even after returning it for service under warranty, and it is
currently off (a CPU fan had failed, so they replaced the fan but not
the CPU, a cheap ``solution'' that didn't work).

Unpacking the rack and mounting the computers in it took Alex Healy a
full afternoon. Once the cluster was assembled, I had to keep it in
my office, because the math department's server closet was tiny and
already full of equipment. It would be several months before we made
room in the server closet for the cluster; in the meantime, I kept a
rack of noisy, hot computers running in my office. When students came
to see me during office hours, they had to shout over MECCAH's 30
cooling fans.

And the circuit breakers kept tripping! My neighbor's office is on
the same circuit as mine, and when he returned from vacation and
turned his computer on, the breaker tripped, so I had to call the
electricians to reset it. I went back to running only four machines;
once, when I increased to five, the circuit blew again.

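Some back-of-envelope arithmetic suggests why. Assuming (this figure
is my estimate, not a measurement) that each dual-Athlon node draws
roughly 250W under load, five nodes need about
\[
\frac{5 \times 250\,\mbox{W}}{120\,\mbox{V}} \approx 10\,\mbox{A},
\]
which is most of a standard 15A office circuit even before a
neighbor's computer is switched on.
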
MECCAH's operating system is Red Hat 7.2 with Linux kernel 2.4.16 on
all six nodes. MECCAH also uses openMosix, which makes the rack of
six computers appear to the user as a single computer with 12
processors and 13GB of memory (though a single process cannot use
more memory than is available on any one node). Under openMosix, jobs
are automatically migrated from one node to another to dynamically
balance the overall system load. Users have accounts and login
privileges only on the master node, and never worry about logging
into the other nodes. I also configured MECCAH to use the ext3
journaling file system, so, e.g., I can pull the plug from the wall,
plug it back in, and have MECCAH back up in five minutes with no file
system corruption.

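Concretely, selecting ext3 amounts to one line per partition in
{\tt /etc/fstab}; the device names and mount points below are
illustrative assumptions, not MECCAH's actual disk layout:

\begin{verbatim}
# /etc/fstab (illustrative): the third field selects the
# journaling ext3 driver rather than plain ext2.
/dev/hda1   /      ext3   defaults   1 1
/dev/hda2   /home  ext3   defaults   1 2
\end{verbatim}

Because the journal is replayed on mount, a crash costs a short
replay instead of a full {\tt fsck} of every disk.
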
For computations, people mainly use MAGMA, PARI, Python, C++, and
Mathematica. Though Harvard has a Mathematica site license, I HATE
administering Mathematica because the licenses regularly expire and
limit the number of copies of Mathematica that can be run at once
(there should be a way around the latter problem). MAGMA for Linux,
on the other hand, requires no license and is free to me because I'm
a MAGMA developer. Evidently, Maple is expensive, so we have only a
limited Sun license for Maple in the math department.

Here is how I organize the computation of a basis for the space of
modular forms of level $N$ and weight 2 for $N$ between 1 and 1000.
I run 12 jobs simultaneously; each looks for the next level that
hasn't been computed, computes that level, and saves the result. If
a computation took 1 day on my 1GHz Pentium III last year, it takes
only about 1 hour on MECCAH. When I am in the throes of a big
computation, having this kind of computational resource available is
extremely exciting: instead of waiting a day, I wait only an hour to
generate more than enough data to stimulate theorem proving!

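The article doesn't include the actual scripts, but the
first-come-first-served scheme described above can be sketched as
follows, using one result file per level and atomic file creation as
the lock (the directory layout and function names are my own
illustration, not the author's code):

```python
import os

def run_worker(results_dir, compute_basis, max_level=1000):
    """Claim and compute levels until none remain unclaimed.

    Each of the 12 jobs scans for the next level with no result
    file, claims it by creating the file atomically, then computes
    and saves the basis for that level.
    """
    computed = []
    for N in range(1, max_level + 1):
        path = os.path.join(results_dir, "level_%d" % N)
        try:
            # O_CREAT | O_EXCL fails if the file already exists,
            # so exactly one job can claim each level.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # another job already claimed this level
        with os.fdopen(fd, "w") as f:
            f.write(repr(compute_basis(N)))  # save the result
        computed.append(N)
    return computed
```

Launching 12 copies of such a worker on the openMosix master node
lets the migration layer spread them across the nodes, with no
explicit scheduling needed.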
I've given MECCAH accounts to nearly 80 mathematicians all over the
world. Abuse of the system by users is rare but not unheard of.
Somewhat surprisingly, the usage pattern comes in bursts: there are
almost always at least two or three jobs running, but every so often
many mathematicians simultaneously become inspired to run lots of
computations all at once.

I am the only systems administrator of MECCAH, and I typically spend
under five hours a week on administrative responsibilities. I still
haven't upgraded the Linux kernel or openMosix since March, but I
probably should, since there have been a few unexplained problems
that an upgrade might fix. I use a 30GB OnStream ADRx2 tape drive to
make regular backups.

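The article doesn't say how the backups are driven; a plausible
sketch is a weekly cron job that writes a {\tt tar} archive to the
tape device (the device name, schedule, and paths here are my
assumptions):

\begin{verbatim}
# /etc/cron.d/meccah-backup (hypothetical): full dump of /home
# to the OnStream drive every Sunday at 3am, then rewind.
0 3 * * 0  root  tar -cf /dev/nst0 /home && mt -f /dev/nst0 rewind
\end{verbatim}
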
If I were to build a similar cluster from scratch again, I would
probably buy more expensive, better-warrantied, pre-configured
dual-processor rack-mount nodes instead of designing the nodes
myself. I definitely would not have kept the cluster in my office.
When first designing MECCAH, I thought long about whether to stack a
bunch of conventional cases on shelves or to buy a rack and
rack-mount cases. A rack costs nearly \$1000, and rack-mount cases
cost more than double what ordinary cases cost. In retrospect, it
would have been madness to buy conventional cases and shelves,
because I've had to move the cluster around many times, and it barely
fits in the tiny server closet. The \$1500 premium for a rack-mounted
system was well worth it. I also deliberated between a fancy serial
console and a KVM (keyboard, video, mouse) switch; I went with the
\$500 KVM, which turned out to be an excellent choice.

The six nodes are networked via switched 100Mbps ethernet. I wish
the network were faster, because it takes a few minutes to transfer
1GB from one computer to another. Since user programs migrate
between machines and frequently use in excess of 1GB of memory, this
transfer time is significant. I chose 100Mbps ethernet over 1Gbps
ethernet because I had read that gigabit ethernet under Linux is not
very reliable and can have significant latency problems. Since I
didn't have the resources to experiment with many configurations, I
opted for 100Mbps, which is very easy.

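The quoted transfer time is easy to check. At the theoretical peak
of 100Mbps,
\[
\frac{1\,\mbox{GB} \times 8\,\mbox{bits/byte}}{100\,\mbox{Mbps}}
  = \frac{8000\,\mbox{Mb}}{100\,\mbox{Mb/s}} = 80\,\mbox{s},
\]
so, once protocol overhead and disk I/O are included, a few minutes
per gigabyte in practice is about right.
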
\end{document}