jupytercon proposal

Project: SMI Business

See https://conferences.oreilly.com/jupyter/jup-ny/public/cfp/621

Private email with Kiran: https://mail.google.com/mail/u/0/#search/kedlaya/161536fb6cd8944e

Submitted: https://mail.google.com/mail/u/0/#inbox/161fcd2790dec047

Proposed title and abstract

Realtime collaboration in Jupyter notebooks using CoCalc

Description of the presentation

  1. I will spend about 15 minutes describing what CoCalc is and how it relates to Project Jupyter

  2. I will then spend 25 minutes outlining the story of how I implemented (and re-implemented, ...) realtime collaboration in CoCalc.

Brief overview: I will explain how CoCalc relates to the Jupyter project, then describe how I implemented realtime collaborative editing of Jupyter notebooks in CoCalc.

Here are more details on how I'll explain what CoCalc is:

I launched the Sage Notebook in 2006. It was a web application initially motivated by a talk at Sage Days 1 about a GUI for IPython. CoCalc is a continuation of this project using more modern technology. CoCalc includes a distinct full-stack implementation of both the frontend and backend parts of Jupyter, built from scratch using React, Node and Kubernetes. Compatibility with Jupyter, in both the published wire protocol and the general feel of the UI, is a fundamental design goal.

CoCalc tackles many of the same problems as JupyterLab and JupyterHub, but with very different design constraints, motivations, and results. For example, realtime collaboration has been a core feature of CoCalc since it launched in 2013, whereas classical Jupyter does not have realtime collaboration support; on the other hand, drag and drop and a flexible plugin architecture are central features of JupyterLab, which are mostly absent in CoCalc. Another subtle difference is that in CoCalc if you close your browser while a computation is running, then open it later, all your output will be there, whereas classical Jupyter discards that output (though it displays output more quickly).

The core goal of CoCalc is to provide easy, safe and beginner-friendly access to all open source mathematics and data science software, including SageMath, LaTeX, R, Anaconda, and a large number of Jupyter kernels. The motivation for fully supporting multiple people simultaneously editing the same Jupyter notebook is that it makes it much easier for teachers to support their students, for students to support each other, and for the people who run CoCalc to support users.

The same technology that we use to implement realtime collaboration also provides a complete history of all modifications of the notebook. With a granularity of about 2 seconds, people can follow exactly how a document evolved over time. Hence there is no longer any fear of "messing everything up", because it is easy to go back in time. Additionally, professors can follow exactly how students worked their way to a solution of a given problem. People can collaborate without having to learn Git or fiddle around with nbdiff.
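As a rough illustration of the "go back in time" idea only: the sketch below stores a document as a list of timestamped edits and rebuilds the text as it looked at any earlier moment by replaying the edits up to that time. The `Patch` format, the `version_at` helper, and the sample history are all hypothetical and chosen for brevity; they are not CoCalc's actual data model or algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Patch:
    """One recorded edit (hypothetical format, not CoCalc's actual schema)."""
    time: float     # seconds since the document was created
    start: int      # index where the edit begins
    deleted: int    # number of characters removed at `start`
    inserted: str   # text inserted at `start`


def apply_patch(text: str, patch: Patch) -> str:
    """Return `text` with a single patch applied."""
    end = patch.start + patch.deleted
    return text[:patch.start] + patch.inserted + text[end:]


def version_at(patches: List[Patch], when: float, initial: str = "") -> str:
    """Rebuild the document as it looked at time `when`."""
    text = initial
    # Replay edits in chronological order, stopping at the requested time.
    for patch in sorted(patches, key=lambda p: p.time):
        if patch.time > when:
            break
        text = apply_patch(text, patch)
    return text


# A made-up history with one edit roughly every 2 seconds.
history = [
    Patch(time=0.0, start=0, deleted=0, inserted="x = 1"),
    Patch(time=2.0, start=4, deleted=1, inserted="2"),
    Patch(time=4.0, start=5, deleted=0, inserted="\nprint(x)"),
]

print(repr(version_at(history, 0.0)))  # state right after the first edit
print(repr(version_at(history, 3.0)))  # state between the second and third edits
print(repr(version_at(history, 4.0)))  # latest state
```

Because every version is derived from the same list of edits, nothing is ever overwritten: viewing an old state is just a replay that stops earlier.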

Another difference is that CoCalc is primarily a whole product designed to make life easier for teachers who, with minimal extra effort, want to use open source data science software in teaching beginners. College professors are very busy, so CoCalc is built as a single centralized service that is hosted in Google's cloud and run by SageMath, Inc., rather than something professors or staff have to install and run themselves. Moreover, a huge amount of software is preinstalled in the standard image that CoCalc projects use; this includes everything that anybody has requested since 2013 and that we could figure out how to install.

Some remarks about explaining realtime collaboration

Implementing realtime collaboration in a web application involves many choices and tradeoffs. CoCalc has had realtime collaboration support for five years; I implemented it entirely from scratch, then rewrote and extended it many times over the years. The resulting implementation is open source and conceptually very easy to understand, but it was complicated for me to implement. It is optimized for what our users need, the sort of documents they edit, and the choices we've made for how to store data, both long term and short term. It's challenging to usefully describe how this all works in 30 minutes, so accept this talk proposal to see if I'm up to the challenge!

Suggested main topic and application area (i.e. science, education, industry)

Science, Education

Audience information

  • Who is the presentation for?

People teaching courses or building software to support teaching.

  • What will they be able to take away?

Understand technically what CoCalc is and how it relates to JupyterHub and JupyterLab; learn how realtime collaboration with Jupyter notebooks works in CoCalc.

  • What prerequisite knowledge do they need?

Familiarity with Jupyter notebooks.

For tutorial proposals: hardware installation, materials, and/or downloads attendees will need in advance

N/A

Speaker(s): biography and hi-res headshot (minimum 1400 pixels wide; required). Check out our guidelines for capturing a great portrait.

Biography: William Stein is the founder of the SageMath open source math software project, and also came up with the name Cython and launched that project. He is a Full Professor of Mathematics at the University of Washington (currently on leave), and is the CEO of SageMath, Inc., whose main product is CoCalc. He has published three books and a few dozen papers in number theory.

Prerequisite knowledge and/or requirements needed by attendees

Familiarity with Jupyter notebooks.

Here's a video of me speaking at the RethinkDB meetup in San Francisco: https://youtu.be/WU6eSckPR7E

Anything you can possibly provide would be greatly appreciated. I'm on 100% leave from my academic job, and my small company is still losing a lot of money every month. I am in fact self-employed right now, and cost is a significant constraint on my travel. I could use some (of the very limited) Sage Foundation donations toward my travel expenses for this conference, but otherwise it is a significant direct cost to me.