Private email with kiran: https://mail.google.com/mail/u/0/#search/kedlaya/161536fb6cd8944e
Realtime collaboration in Jupyter notebooks using CoCalc
I will spend about 15 minutes describing what CoCalc is and how it relates to Project Jupyter.
I will then spend 25 minutes outlining the story of how I implemented (and re-implemented, ...) realtime collaboration in CoCalc.
Brief overview: I will explain how CoCalc relates to the Jupyter project, then describe how I implemented realtime collaborative editing of Jupyter notebooks in CoCalc.
I launched the Sage Notebook in 2006. It was a web application initially motivated by a talk at Sage Days 1 about a GUI for IPython. CoCalc is a continuation of this project using more modern technology. CoCalc includes a distinct full-stack implementation of both the frontend and backend parts of Jupyter, built from scratch using React, Node and Kubernetes. Compatibility with Jupyter (both the published wire protocol and the general feel of the UI) is a fundamental design goal.
CoCalc tackles many of the same problems as JupyterLab and JupyterHub, but with very different design constraints, motivations, and results. For example, realtime collaboration has been a core feature of CoCalc since it launched in 2013, whereas classical Jupyter does not support realtime collaboration. On the other hand, drag and drop and a flexible plugin architecture are central features of JupyterLab, and both are mostly absent in CoCalc. Another subtle difference is that in CoCalc, if you close your browser while a computation is running and open it again later, all your output will be there, whereas classical Jupyter discards that output (though it displays output more quickly).
The core goal of CoCalc is to provide easy, safe and beginner-friendly access to all open source mathematics and data science software, including SageMath, LaTeX, R, Anaconda, and a large number of Jupyter kernels. The motivation for fully supporting multiple people editing the same Jupyter notebook simultaneously is that it makes it much easier for teachers to support their students, for students to support each other, and for the people who run CoCalc to support users.
The same technology that we use to implement realtime collaboration also provides a complete history of all modifications of the notebook. With a granularity of about 2 seconds, people can follow exactly how a document evolved over time. Hence there is no longer the fear of "messing everything up", because it is easy to go back in time. Additionally, professors can follow exactly how students worked their way to a solution of a given problem. People can collaborate without having to learn Git or fiddle around with nbdiff.
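To make the "go back in time" idea concrete, here is a minimal sketch in Python. It is not CoCalc's actual code or data format: the names Patch, History, record and version_at are hypothetical, and it assumes each edit is stored as a timestamped text replacement and that edits are recorded in time order. Any earlier version of the document can then be rebuilt by replaying the edits up to the chosen time.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Patch:
    """One recorded edit (hypothetical format): replace doc[start:end] with text."""
    time: float  # seconds since the document was created
    start: int
    end: int
    text: str


@dataclass
class History:
    """Append-only log of patches, kept in time order."""
    patches: List[Patch] = field(default_factory=list)

    def record(self, patch: Patch) -> None:
        # Assumption: patches arrive in non-decreasing time order.
        if self.patches and patch.time < self.patches[-1].time:
            raise ValueError("patches must be recorded in time order")
        self.patches.append(patch)

    def version_at(self, time: float) -> str:
        """Rebuild the document as it was at `time` by replaying earlier patches."""
        times = [p.time for p in self.patches]
        count = bisect_right(times, time)  # number of patches with p.time <= time
        doc = ""
        for p in self.patches[:count]:
            doc = doc[:p.start] + p.text + doc[p.end:]
        return doc


# Example: three edits, roughly 2 seconds apart.
history = History()
history.record(Patch(time=0.0, start=0, end=0, text="x = 1"))
history.record(Patch(time=2.0, start=4, end=5, text="2"))
history.record(Patch(time=4.0, start=5, end=5, text="\nprint(x)"))

print(repr(history.version_at(0.0)))
print(repr(history.version_at(3.0)))
print(repr(history.version_at(4.0)))
```

In this sketch nothing is ever overwritten, so "undoing a mistake" just means reading the document at an earlier time. A real system also has to merge patches from several people and store them durably, which this example leaves out.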
Another difference is that CoCalc is primarily a complete product, designed to make life easier for teachers who want to use open source data science software in teaching beginners with minimal extra effort. College professors are very busy, so CoCalc is built as a single centralized service that is hosted in Google's cloud and run by SageMath, Inc., rather than something professors or staff have to install and run themselves. Moreover, a huge amount of software is preinstalled in the standard image that CoCalc projects use; this includes everything that anybody has requested since 2013 that we could figure out how to install.
Implementing realtime collaboration in a web application involves many choices and tradeoffs. CoCalc has had realtime collaboration support for five years; I implemented it entirely from scratch, then rewrote and extended it many times over the years. The resulting implementation is open source and conceptually very easy to understand, but it was complicated for me to implement. It is optimized for what our users need, the sort of documents they edit, and the choices we've made for how to store data, both long term and short term. It's challenging to usefully describe how this all works in 30 minutes, so accept this talk proposal to see if I'm up to the challenge!
People teaching courses or building software to support teaching.
Understand technically what CoCalc is and how it relates to JupyterHub and JupyterLab; learn how realtime collaboration with Jupyter notebooks works in CoCalc.
Familiar with Jupyter notebooks.
Biography: William Stein is the founder of the SageMath open source math software project, and also came up with the name Cython and launched that project. He is a Full Professor of Mathematics at the University of Washington (currently on leave), and is the CEO of SageMath, Inc., whose main product is CoCalc. He has published 3 books and a few dozen papers in number theory.
Here's a video of me speaking at the RethinkDB meetup in San Francisco: https://youtu.be/WU6eSckPR7E
Anything you can possibly provide would be greatly appreciated. I'm on 100% leave from my academic job, and my small company is still losing a lot of money every month. I am in fact self-employed right now, and cost is a significant constraint on my travel. I could use some of the (very limited) Sage Foundation donations toward my travel expenses for this conference, but otherwise it is a significant direct cost to me.