Authors: Harald Schilly, William A. Stein
Description: Draft of article for the Notices of the AMS called "Collaborative Calculation in Your Web Browser"

How to do Collaborative Calculation in your Web Browser by William Stein

I greatly appreciate the opportunity to write about online resources for mathematics in this special issue of the Notices, since such resources are something that I have long enjoyed sharing with the community. I first started creating online resources for mathematicians in the late 1990s when I was a graduate student at UC Berkeley. I started by creating tables of modular forms, elliptic curves, and other related data, inspired first by a question from Ken Ribet, and encouraged further when many other mathematicians reported that my data inspired their research (e.g., https://link.springer.com/article/10.1007/s00208-003-0449-2). I later became involved with the LMFDB project (https://www.lmfdb.org/), which organized dozens of mathematicians to make amazing tables of number theoretic data.

I also created interactive online calculators, in which you could input the parameters of some mathematical object (e.g., an elliptic curve) into a web page, and the web server would compute and display extensive information about it. Later, when teaching undergraduate number theory courses at Harvard, I made similar online calculators that enabled my students to run arbitrary PARI and Magma code, and in doing so, they could explore nontrivial computations in algebraic number theory. This worried the authors of the closed source software Magma, which motivated me to start the free open source software SageMath in 2004, whose goal was to provide software for doing mathematics built on Python and many open source mathematics packages. (Magma is incredibly powerful closed source software for pure mathematics, and you can play around with it for free online today at http://magma.maths.usyd.edu.au/calc/.)

To make SageMath easier to use online, some students and I at the University of Washington (UW) undertook the creation of a sophisticated web-based interactive notebook interface in 2006. We called it the Sage Notebook, and it was inspired by Mathematica's desktop notebook. The Sage Notebook, which was the first serious web-based computational notebook, was challenging for us to implement because of the primitive JavaScript and HTML technology of the day, and the challenges arising from running arbitrary user code on my server. I hosted the notebook publicly for anybody to use starting in 2007, and my students and others frequently used it in courses, summer student research programs, and SageMath development workshops. A few years later the IPython project created a new web-based notebook called Jupyter that looked and felt similar, but with a more modern underlying architecture that was built to work with a wide range of programming environments. The arrival of the Jupyter notebook was fantastic news, because by that time the Sage Notebook's underlying architecture had become fairly antiquated; moreover, Jupyter benefited from a multimillion dollar grant that resulted in substantial open source software development and publicity.

To address more general needs related to my teaching, I started dreaming about creating another web application called CoCalc in 2012. CoCalc is an abbreviation for "Collaborative Calculation", and I hoped it could provide a unified home for everything I was using for my teaching and research. My favorite undergraduate course to teach at UW was called "Mathematical Computation"; it covered LaTeX, R, Python, and computational statistics, abstract algebra, graph theory, symbolic calculus and crypto using SageMath, and today it would significantly overlap with an introductory data science course. I wanted a web application that would be ideal for hosting this course.

Prior to launching CoCalc in 2013, I taught Mathematical Computation many times using the Sage Notebook. However, the Sage Notebook did not adequately support important topics that I wanted to teach, including "How to create LaTeX documents", "How to use a Linux terminal", and "How to develop code in files instead of a notebook". Furthermore, it did not possess the crucial ability to do realtime multiuser editing, which is a key feature of popular web applications today. Over the years I noticed two things from my students: (1) they appreciated working collaboratively with others, especially on final projects, and (2) they wanted to work collaboratively with themselves, in the sense that they wanted easy access to exactly what they were working on 15 minutes ago. Having mulled over these issues, I decided I wanted to create a single web application that would support collaborative editing of Jupyter notebooks and LaTeX documents, multiuser Linux terminals, integrated chat, and a detailed browsable TimeTravel history of editing documents. The first release incorporated some of these things, but with stripped down functionality, and it was hosted on a single desktop in my office.

That being said, I made CoCalc public and free just like the Sage Notebook, and a few other professors were interested in trying it out. Several students in my course were skilled at programming and began helping with its development. However, the instructors who first tried the application reported being annoyed with the tedium involved with uploading and downloading assignments. To combat this, we implemented a basic course management system, which automated distribution and collection of assignments.
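At the file level, distributing and collecting assignments amounts to something like the following minimal Python sketch. It is only an illustration under simplifying assumptions: each student is represented by a plain directory, and the function names `distribute` and `collect` are hypothetical. It is not CoCalc's actual implementation.

```python
import shutil
from pathlib import Path
from typing import List


def distribute(assignment: Path, student_dirs: List[Path]) -> None:
    """Copy the instructor's assignment folder into each student's folder."""
    for student in student_dirs:
        # dirs_exist_ok lets the instructor redistribute an updated assignment.
        shutil.copytree(assignment, student / assignment.name, dirs_exist_ok=True)


def collect(assignment_name: str, student_dirs: List[Path], inbox: Path) -> List[Path]:
    """Copy each student's work on the assignment into the instructor's inbox."""
    collected = []
    for student in student_dirs:
        work = student / assignment_name
        if work.is_dir():  # skip students who never received the assignment
            target = inbox / student.name / assignment_name
            shutil.copytree(work, target, dirs_exist_ok=True)
            collected.append(target)
    return collected
```

An instructor would call `distribute` once per assignment and `collect` after the deadline; the collected copies end up in one subfolder per student, so they can be graded without touching the students' own files.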

With this course management system in place, and CoCalc running on a more powerful server, we received much more usage. Initially, I bought a few dozen servers using a grant, and hosted them in data centers on the UW campus. As usage grew, I received free credits on Google Cloud, and moved hosting to Google, which made dynamic scaling in response to variations in load much more efficient. We ended up facing a number of challenges due to the nature of our users, who were mainly people in courses making heavy use of Python and R. In one course, the students would regularly create massive plots, so we had to come up with better ways of dealing with huge output, especially in the context of real-time multiuser editing. Over a period of several years, we reimplemented the Jupyter stack for our purposes, while attempting to preserve the functionality, look and feel of the official Jupyter notebook.

We also found that instructors needed access to a broad software stack, and we accumulated hundreds of gigabytes of installed software, along with automated scripts to install and test it. It was a major technical challenge for us to make this software quickly available for use across the cluster, upgrade it with automated testing, and provide stable images over time. Along the way, some of our solutions to these problems failed at scale, which resulted in lost data, difficult weeks of hard work and many sleepless nights. Fortunately, we eventually engineered robust solutions to these problems, which have been working for several years.

In 2019, I officially resigned my Full Professor position at UW, and began working full time on CoCalc as a business. Today this business is reasonably stable, with a manageable number of issues, and enough paying customers so that our team can proactively improve CoCalc, rather than reacting to crises. The company's daily operations are funded entirely by customers. Much of our current development effort on CoCalc is driven by the following question: "What would X look like if the community could focus for many years on writing new code and facing difficult algorithmic challenges?" This question guides our development efforts, and we strive to make CoCalc faster to load and navigate, and to improve its efficiency, stability, and usability. An example is current work underway to provide beginner-friendly WYSIWYG user interfaces for editing Jupyter notebooks and other documents. Most CoCalc users are beginners at scientific computation, so a graphical editor, with an easily browsable library of code snippets, makes CoCalc significantly more accessible to them.