\documentclass{article}

\usepackage{hyperref}

\title{Collaborative Calculation in Your Web Browser}
\author{William Stein}

\begin{document}
\maketitle

I greatly appreciate the opportunity to write about online resources for
mathematics in this theme of the Early Career Section, since such resources are
something that I have long enjoyed sharing with the community. I first started
creating websites for number theorists in the late 1990s when I was a graduate
student at UC Berkeley. I started by creating tables of modular forms, inspired
by a question from Ken Ribet, and was then encouraged by other people who told
me that my data inspired their research (e.g., \cite{emerton}). I later became
involved with the $L$-functions and Modular Forms Database (LMFDB) project
\cite{lmfdb}, which organized dozens of mathematicians to make amazing tables
of number-theoretic data.

I also created interactive online calculators, in which you could input the
parameters of some mathematical object (e.g., an elliptic curve) into a web
page, and the web server would compute and display extensive information. Later,
when teaching undergraduate number theory courses at Harvard, I made similar
online calculators that enabled my students to run calculations using PARI and
Magma (see \cite{magma}), so that they could explore nontrivial
computations in algebraic number theory. I wanted something similar to Magma
that was free and open source, so I started the Python-based software SageMath
in 2004 (see \cite{sage}).

SageMath was difficult to install and only had a command-line interface, so I
started working to make it possible to use in a web browser. At the University
of Washington (UW) in 2006, some students and I undertook the creation of a
web-based interactive notebook interface called the Sage Notebook, which was
inspired by Mathematica's desktop notebook. The Sage Notebook was the first
serious web-based computational notebook, and it was challenging for us to
implement because of the primitive JavaScript and HTML technology of the day,
and the dangers arising from running arbitrary user code on my server. I hosted
the notebook publicly for anybody to use starting in 2007, and my students and
others frequently used it in courses, summer student research programs, and
SageMath development workshops. A few years later the IPython project created a
new notebook called Jupyter that looked and felt similar, but with a more
modern underlying architecture that was built to work with a wide range of
programming environments. The arrival of the Jupyter notebook was fantastic
news, because by that time the Sage Notebook's design had begun to feel
antiquated; moreover, Jupyter benefited from a multimillion-dollar grant that
resulted in substantial open source software development and publicity.

To address more general needs related to my teaching, I started dreaming about
creating another web application called CoCalc in 2012 (see \cite{cocalc}).
CoCalc is an abbreviation for ``Collaborative Calculation'', and I hoped it
could provide a unified home for everything I was using for my teaching and
research. My favorite undergraduate course to teach at UW was called
``Mathematical Computation''; it covered \LaTeX{}, R, and Python, as well as
computational statistics, abstract algebra, graph theory, symbolic calculus,
and cryptography using SageMath, and today it would significantly overlap with
an introductory data science course. I wanted a web application that would be
ideal for hosting this course.

Prior to launching CoCalc in 2013, I taught Mathematical Computation many times
using the Sage Notebook. However, the Sage Notebook did not adequately support
important topics that I wanted to teach, including ``How to create \LaTeX{}
documents'', ``How to use a Linux terminal'', and ``How to develop code in files
instead of a notebook''. Furthermore, it lacked the crucial ability to support
simultaneous multiuser editing, which is a key feature of popular web
applications today. Over the years I noticed two things about my students: (1)
they appreciated working collaboratively with others, especially on final
projects, and (2) they wanted to work collaboratively with themselves via a
``time machine'', in the sense that they wanted easy access to exactly what they
had been doing 15 minutes earlier, before they messed everything up. Having
mulled over these issues, I decided I wanted to create a single web application
that would support collaborative editing of Jupyter notebooks and \LaTeX{}
documents, multiuser Linux terminals, integrated chat, and a detailed, browsable
time-travel history of each document's edits. The first release incorporated
some of these things, but with stripped-down functionality, and it was hosted
on a single desktop computer in my office.

Nevertheless, I made CoCalc public and free, just like the Sage Notebook, and
a few other professors were interested in trying it out. Several students in my
course were skilled at programming and began helping with its development.
However, the instructors who first tried the application reported being annoyed
by the tedium of uploading and downloading assignments. To address this, we
implemented a basic course management system, which automated the distribution
and collection of assignments.

With this course management system in place, we saw much more usage.
Initially, I bought a few dozen servers using a grant and hosted them in data
centers on the UW campus. As usage grew, I received free credits on Google
Cloud and moved hosting to Google, which made dynamic scaling in response to
variations in load much more efficient. We ended up facing a number of
challenges due to the nature of our users, who were mainly people in courses
making heavy use of Python and R. In one course, the students would regularly
create massive plots, so we had to come up with better ways of dealing with huge
output, especially in the context of simultaneous multiuser editing. Over a
period of several years, we reimplemented the Jupyter stack for our purposes,
while attempting to preserve the functionality, look, and feel of the official
Jupyter notebook.

We also found that instructors needed access to a broad software stack, and we
accumulated hundreds of gigabytes of installed software, along with automated
scripts to install and test it. It was a major technical challenge for us to
make this software quickly available for use across the cluster, periodically
upgrade it with automated testing, and provide stable images over time. Along
the way, some of our solutions to these problems failed at scale, which
resulted in lost data, difficult weeks of hard work, and many sleepless nights.
Fortunately, we eventually engineered robust solutions to these problems, which
have been working for several years.

In 2019, I officially resigned my Full Professor position at UW to work full
time on CoCalc as a business. Today this business is reasonably stable, with a
manageable number of issues and enough paying customers that our team can
proactively improve CoCalc, rather than reacting to crises. The company's daily
operations are funded entirely by customers. Much of our current development
effort on CoCalc is driven by the following question, which can be asked about
every relevant piece of software: ``What would $X$ look like if one were to
focus hard for 10 years on perfecting it?'' Guided by this question, we strive
to make CoCalc faster to load and navigate, and to improve its efficiency,
stability, and usability. As an example, we are currently working to provide
beginner-friendly graphical user interfaces for editing Jupyter notebooks and
other documents. Most CoCalc users are beginners at scientific computation, so
a graphical editor, with easily browsable libraries of code snippets, makes
CoCalc significantly more accessible to them.

During the last 20 years, over one thousand people have contributed to the
online tools I have described here. Today these tools are relatively mature and
powerful. I hope you find them as useful as I do, and that maybe you will help
improve them over the coming decade.

\begin{thebibliography}{2}

\bibitem{cocalc} \url{https://cocalc.com}
\bibitem{emerton} \url{https://link.springer.com/article/10.1007/s00208-003-0449-2}
\bibitem{lmfdb} \url{https://www.lmfdb.org/}
\bibitem{magma} \url{http://magma.maths.usyd.edu.au/calc/}
\bibitem{sage} \url{https://sagemath.org}

\end{thebibliography}

\end{document}