
# Revision control software

J.R. Johansson (jrjohansson at gmail.com)

The latest version of this IPython notebook lecture is available at http://github.com/jrjohansson/scientific-python-lectures.

The other notebooks in this lecture series are indexed at http://jrjohansson.github.io.

In [13]:
from IPython.display import Image


In any software development, one of the most important tools is revision control software (RCS).

RCSs are used in virtually all software development and in all environments, by everyone and everywhere (no kidding!)

An RCS can be used on almost any digital content, so it is not restricted to software development: it is also very useful for manuscript files, figures, data and notebooks!

## There are two main purposes of RCS systems:

1. Keep track of changes in the source code.
• Allow reverting back to an older revision if something goes wrong.
• Work on several "branches" of the software concurrently.
• Tag revisions to keep track of which version of the software was used for what (for example, "release-1.0", "paper-A-final", ...)
2. Make it possible for several people to collaboratively work on the same code base simultaneously.
• Allow many authors to make changes to the code.
• Clearly communicate and visualize changes in the code base to everyone involved.

## Basic principles and terminology for RCS systems

In an RCS, the source code or digital content is stored in a repository.

• The repository contains not only the latest version of all files, but also the complete history of all changes to the files since they were added to the repository.

• A user can check out the repository, and obtain a local working copy of the files. All changes are made to the files in the local working directory, where files can be added, removed and updated.

• When a task has been completed, the changes to the local files are committed (saved to the repository).

• If someone else has been making changes to the same files, a conflict can occur. In many cases conflicts can be resolved automatically by the system, but in some cases we might have to merge the different changes together manually.

• It is often useful to create a new branch in a repository, or a fork or clone of an entire repository, when we are doing larger experimental development. The main branch in a repository is often called master or trunk. When work on a branch or fork is completed, it can be merged into the master branch/repository.

• With distributed RCSs such as GIT or Mercurial, we can pull and push changesets between different repositories, for example between a local copy of the repository and a central online repository (for example on a community repository hosting site like github.com).
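To make the terminology above concrete, here is a minimal toy sketch in Python. It is not how git or Mercurial are implemented: the class `ToyRepository` and its methods are hypothetical names invented for this illustration. It only shows the basic idea that a repository keeps every committed snapshot, so an older revision can always be checked out again.

```python
import copy


class ToyRepository:
    """A toy model of a repository: it stores every committed snapshot."""

    def __init__(self):
        self.history = []  # list of (message, snapshot) tuples, oldest first

    def commit(self, working_copy, message):
        """Save a snapshot of the working copy and return its revision number."""
        self.history.append((message, copy.deepcopy(working_copy)))
        return len(self.history) - 1

    def checkout(self, revision=-1):
        """Return a fresh working copy of the given revision (default: latest)."""
        _, snapshot = self.history[revision]
        return copy.deepcopy(snapshot)

    def log(self):
        """Return the commit messages, oldest first."""
        return [message for message, _ in self.history]


repo = ToyRepository()
working_copy = {"README": "A file with information about the repository.\n"}
first = repo.commit(working_copy, "added README")

# Change the local working copy, then commit the change.
working_copy["README"] += "Experimental addition.\n"
second = repo.commit(working_copy, "added a line")

# The complete history is kept, so the older revision can be restored.
old_copy = repo.checkout(first)
print(repo.log())
print(old_copy["README"])
```

A real RCS stores changes far more efficiently and also tracks authors, dates and branches, but the commit/checkout cycle follows the same pattern.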

### Some good RCS software

1. GIT (git) : http://git-scm.com/
2. Mercurial (hg) : http://mercurial.selenic.com/

In the rest of this lecture we will look at git, although hg is just as good and works in almost exactly the same way.

## Installing git

On Linux:

    $ sudo apt-get install git

On Mac (with macports):

    $ sudo port install git


The first time you start to use git, you'll need to configure your author information:

    $ git config --global user.name 'Robert Johansson'
    $ git config --global user.email [email protected]
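The `--global` flag stores the setting for your whole user account. As a small sketch of a related option, the same command without `--global`, run inside a repository, stores the value for that repository only. The example below drives git from Python with `subprocess`; the scratch directory and the identity `Demo User` / `demo@example.com` are placeholders chosen for this illustration.

```python
import shutil
import subprocess
import tempfile


def git(*args, cwd):
    """Run a git command in directory `cwd` and return its stripped output."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def configure_local_author(name, email):
    """Create a scratch repository, set a repository-local author, read it back."""
    with tempfile.TemporaryDirectory() as repo_dir:
        git("init", "-q", cwd=repo_dir)
        # Without --global the values are written to this repository's .git/config.
        git("config", "user.name", name, cwd=repo_dir)
        git("config", "user.email", email, cwd=repo_dir)
        return (git("config", "--get", "user.name", cwd=repo_dir),
                git("config", "--get", "user.email", cwd=repo_dir))


if shutil.which("git"):  # only run the demo if git is installed
    print(configure_local_author("Demo User", "demo@example.com"))
```

This can be handy if you want to use a different author identity for one particular project.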

## Creating and cloning a repository

To create a brand new empty repository, we can use the command git init repository-name:

In [4]:
# create a new git repository called gitdemo:
!git init gitdemo

Reinitialized existing Git repository in /home/rob/Desktop/scientific-python-lectures/gitdemo/.git/
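The output above says "Reinitialized" because the gitdemo repository already existed when the cell was run. The following sketch, which assumes git is installed and uses a temporary scratch directory (the function name `init_twice` is made up for this example), runs `git init` twice to show that repeating the command is harmless and that the repository data lives in a `.git` subdirectory.

```python
import os
import shutil
import subprocess
import tempfile


def init_twice(parent_dir, name="gitdemo"):
    """Run `git init <name>` twice in parent_dir and return both messages."""
    messages = []
    for _ in range(2):
        result = subprocess.run(["git", "init", name], cwd=parent_dir,
                                check=True, capture_output=True, text=True)
        messages.append(result.stdout.strip())
    return messages


if shutil.which("git"):  # only run the demo if git is installed
    with tempfile.TemporaryDirectory() as parent_dir:
        first, second = init_twice(parent_dir)
        has_git_dir = os.path.isdir(os.path.join(parent_dir, "gitdemo", ".git"))
    print(first)   # message from creating the repository
    print(second)  # message from running init on the existing repository
    print(".git directory exists:", has_git_dir)
```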

If we want to fork or clone an existing repository, we can use the command git clone repository:

In [5]:
!git clone https://github.com/qutip/qutip

Cloning into 'qutip'...
remote: Counting objects: 7425, done.
remote: Compressing objects: 100% (2013/2013), done.
remote: Total 7425 (delta 5386), reused 7420 (delta 5381)
Receiving objects: 100% (7425/7425), 2.25 MiB | 696 KiB/s, done.
Resolving deltas: 100% (5386/5386), done.

Git clone can take a URL to a public repository, like above, or a path to a local directory:

In [6]:
!git clone gitdemo gitdemo2

Cloning into 'gitdemo2'...
warning: You appear to have cloned an empty repository.
done.
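A clone remembers where it came from. As a rough sketch (assuming git is installed; the scratch directory and the function name `clone_and_get_origin` are inventions for this example), the code below repeats the local clone in a temporary directory and reads back the address that git stored for the source repository.

```python
import os
import shutil
import subprocess
import tempfile


def git(*args, cwd):
    """Run a git command in directory `cwd` and return its stripped output."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def clone_and_get_origin(parent_dir):
    """Create an empty repository, clone it, and return the clone's origin URL."""
    git("init", "-q", "gitdemo", cwd=parent_dir)
    git("clone", "-q", "gitdemo", "gitdemo2", cwd=parent_dir)
    # The source address is stored in the clone's configuration.
    return git("config", "--get", "remote.origin.url",
               cwd=os.path.join(parent_dir, "gitdemo2"))


if shutil.which("git"):  # only run the demo if git is installed
    with tempfile.TemporaryDirectory() as parent_dir:
        print(clone_and_get_origin(parent_dir))
```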

We can also clone private repositories over secure protocols such as SSH:
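As an illustration only, the sketch below builds (but does not run) a clone command for an SSH URL of the form `ssh://user@host/path`. The user `git`, the host `example.com` and the repository name `myrepository.git` are placeholders, not a real server, and the helper `ssh_clone_command` is a made-up name.

```python
def ssh_clone_command(user, host, path, target=None):
    """Build (but do not run) a `git clone` command for an SSH URL."""
    url = f"ssh://{user}@{host}/{path}"
    command = ["git", "clone", url]
    if target is not None:
        command.append(target)  # optional name of the local directory
    return command


# Placeholder values: replace them with your own server and repository.
print(" ".join(ssh_clone_command("git", "example.com", "myrepository.git")))
```

Running the printed command requires that you have SSH access to the server in question.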

## Branches

With branches we can create diverging code bases in the same repository. They are useful, for example, for experimental development that requires many code changes that could break the functionality in the master branch. Once the development of a branch has reached a stable state, it can be merged back into the trunk. Branching, developing and merging is a good development strategy when several people are working on the same code base. But even in single-author repositories it is often useful to keep the master branch in a working state: always branch or fork before implementing a new feature, and merge it back into the main trunk later.

In GIT, we can create a new branch like this:

In [70]:
!git branch expr1


We can list the existing branches like this:

In [71]:
!git branch

  expr1
* master

And we can switch between branches using checkout:

In [81]:
!git checkout expr1

Switched to branch 'expr1'
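The three steps above (create, list, switch) can also be scripted. This is a hedged sketch that assumes git is installed and works in a temporary scratch repository: the helper names and the identity `Demo User` are placeholders, and because the name of the initial branch depends on the local git configuration, it is queried instead of being assumed to be master.

```python
import os
import shutil
import subprocess
import tempfile

# Placeholder identity so that the commit works without any global configuration.
OPTIONS = ["-c", "user.name=Demo User", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false"]


def git(*args, cwd):
    """Run a git command in directory `cwd` and return its stripped output."""
    result = subprocess.run(["git", *OPTIONS, *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def branch_demo(repo_dir):
    """Create a repository with one commit, add a branch and switch to it."""
    git("init", "-q", cwd=repo_dir)
    with open(os.path.join(repo_dir, "README"), "w") as f:
        f.write("A file with information about the gitdemo repository.\n")
    git("add", "README", cwd=repo_dir)
    git("commit", "-q", "-m", "added README", cwd=repo_dir)

    main_branch = git("symbolic-ref", "--short", "HEAD", cwd=repo_dir)
    git("branch", "expr1", cwd=repo_dir)          # create the branch
    branches = git("for-each-ref", "--format=%(refname:short)", "refs/heads",
                   cwd=repo_dir).split()          # list all local branches
    git("checkout", "-q", "expr1", cwd=repo_dir)  # switch to the new branch
    current = git("symbolic-ref", "--short", "HEAD", cwd=repo_dir)
    return main_branch, sorted(branches), current


if shutil.which("git"):  # only run the demo if git is installed
    with tempfile.TemporaryDirectory() as repo_dir:
        print(branch_demo(repo_dir))
```

Note that a branch can only be created once the repository has at least one commit, which is why the sketch commits a README first.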

Make a change in the new branch.

In [74]:
%%file README

A file with information about the gitdemo repository.

README files usually contains installation instructions, and information about how to get started using the software (for example).

Experimental addition.


In [76]:
!git commit -m "added a line in expr1 branch" README

[expr1 a6dc24f] added a line in expr1 branch
 1 file changed, 3 insertions(+), 1 deletion(-)
In [77]:
!git branch

* expr1
  master
In [78]:
!git checkout master

Switched to branch 'master'
In [79]:
!git branch

  expr1
* master

We can merge an existing branch and all its changesets into another branch (for example the master branch) like this:

First change to the target branch:

In [82]:
!git checkout master

Switched to branch 'master'
In [83]:
!git merge expr1

Updating a9dc0a4..a6dc24f
Fast-forward
 README | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
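The merge output reports "Fast-forward": master had no new commits of its own since expr1 branched off, so the merge only needs to move the master pointer forward to the tip of expr1. The toy sketch below illustrates that rule with a tiny commit graph. The functions and the dictionary layout are hypothetical and are not git's internals; the two commit ids are taken from the output above, with all earlier history left out.

```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` by following parents."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return False


def can_fast_forward(parents, target_tip, source_tip):
    """Merging source into target is a fast-forward if the target's tip
    is an ancestor of the source's tip."""
    return is_ancestor(parents, target_tip, source_tip)


# Toy history: each commit maps to the list of its parent commits.
parents = {"a9dc0a4": [], "a6dc24f": ["a9dc0a4"]}
branches = {"master": "a9dc0a4", "expr1": "a6dc24f"}

if can_fast_forward(parents, branches["master"], branches["expr1"]):
    branches["master"] = branches["expr1"]  # just move the branch pointer

print(branches["master"])
```

If both branches had received new commits, the check would fail and a real merge (possibly with conflicts to resolve) would be needed instead.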
In [84]:
!git branch

  expr1
* master

We can delete the branch expr1 now that it has been merged into master:

In [85]:
!git branch -d expr1

Deleted branch expr1 (was a6dc24f).
In [86]:
!git branch

* master
In [88]:
!cat README

A file with information about the gitdemo repository.

README files usually contains installation instructions, and information about how to get started using the software (for example).

Experimental addition.

## Pulling and pushing changesets between repositories

If the repository has been cloned from another repository, for example on github.com, it automatically remembers the address of the parent repository (called origin):

In [5]:
!git remote

origin
In [4]:
!git remote show origin

* remote origin
  Fetch URL: [email protected]:jrjohansson/scientific-python-lectures.git
  Push  URL: [email protected]:jrjohansson/scientific-python-lectures.git
  HEAD branch: master
  Remote branch:
    master tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)

### pull

We can retrieve updates from the origin repository by "pulling" changesets from "origin" to our repository:

In [6]:
!git pull origin


We can register addresses to many different repositories, and pull in different changesets from different sources, but the default source is the origin from where the repository was first cloned (so the word origin could have been omitted from the line above).
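To see what a pull does without needing a network connection, here is a small sketch that uses two local scratch repositories (assuming git is installed). The directory names `origin-repo` and `clone-repo`, the helper functions and the identity `Demo User` are all made up for this example. A new commit is made in the source repository after cloning, and the clone then pulls it in.

```python
import os
import shutil
import subprocess
import tempfile

# Placeholder identity so that commits work without any global configuration.
OPTIONS = ["-c", "user.name=Demo User", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false"]


def git(*args, cwd):
    """Run a git command in directory `cwd` and return its stripped output."""
    result = subprocess.run(["git", *OPTIONS, *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def commit_file(repo_dir, name, text, message):
    """Write a file in the repository and commit it."""
    with open(os.path.join(repo_dir, name), "w") as f:
        f.write(text)
    git("add", name, cwd=repo_dir)
    git("commit", "-q", "-m", message, cwd=repo_dir)


def pull_demo(parent_dir):
    """Clone a repository, change the source, and pull the change into the clone."""
    origin = os.path.join(parent_dir, "origin-repo")
    clone = os.path.join(parent_dir, "clone-repo")
    os.mkdir(origin)
    git("init", "-q", cwd=origin)
    commit_file(origin, "README", "first version\n", "added README")
    git("clone", "-q", origin, clone, cwd=parent_dir)

    # A new changeset appears in the source repository after we cloned it.
    commit_file(origin, "README", "second version\n", "updated README")

    before = git("rev-parse", "HEAD", cwd=clone)
    git("pull", "-q", "--ff-only", "origin", cwd=clone)
    after = git("rev-parse", "HEAD", cwd=clone)
    return before, after, git("rev-parse", "HEAD", cwd=origin)


if shutil.which("git"):  # only run the demo if git is installed
    with tempfile.TemporaryDirectory() as parent_dir:
        before, after, origin_head = pull_demo(parent_dir)
    print("clone updated:", before != after)
    print("clone matches origin:", after == origin_head)
```

The `--ff-only` option is used here so that the pull never creates a merge commit in the sketch; a plain `git pull` as in the cell above behaves the same way in this simple situation.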

### push

After making changes to our local repository, we can push changes to a remote repository using git push. Again, the default target repository is origin, so we can do:

In [7]:
!git status

# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	Lecture-7-Revision-Control-Software.ipynb
nothing added to commit but untracked files present (use "git add" to track)
In [8]:
!git add Lecture-7-Revision-Control-Software.ipynb

In [9]:
!git commit -m "added lecture notebook about RCS" Lecture-7-Revision-Control-Software.ipynb

[master d0d6a70] added lecture notebook about RCS
 1 file changed, 2114 insertions(+)
 create mode 100644 Lecture-7-Revision-Control-Software.ipynb
In [11]:
!git push

Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 118.94 KiB, done.
Total 3 (delta 1), reused 0 (delta 0)
To [email protected]:jrjohansson/scientific-python-lectures.git
   2495af4..d0d6a70  master -> master
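The add, commit and push cycle above can be tried out locally as well. In this sketch (assuming git is installed) a bare repository, created with `git init --bare`, stands in for the hosted repository; the names `hosted.git` and `local`, the helper functions and the identity `Demo User` are placeholders. The push is written as `git push origin HEAD`, which pushes the current branch to a branch of the same name, because a clone of an empty repository may not yet have a default push target configured.

```python
import os
import shutil
import subprocess
import tempfile

# Placeholder identity so that commits work without any global configuration.
OPTIONS = ["-c", "user.name=Demo User", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false"]


def git(*args, cwd):
    """Run a git command in directory `cwd` and return its stripped output."""
    result = subprocess.run(["git", *OPTIONS, *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def push_demo(parent_dir):
    """Commit a file in a clone and push it to a bare stand-in for a hosted repository."""
    hosted = os.path.join(parent_dir, "hosted.git")
    local = os.path.join(parent_dir, "local")
    git("init", "-q", "--bare", hosted, cwd=parent_dir)
    git("clone", "-q", hosted, local, cwd=parent_dir)

    with open(os.path.join(local, "notes.txt"), "w") as f:
        f.write("lecture notes\n")
    git("add", "notes.txt", cwd=local)
    git("commit", "-q", "-m", "added lecture notes", cwd=local)

    branch = git("symbolic-ref", "--short", "HEAD", cwd=local)
    git("push", "-q", "origin", "HEAD", cwd=local)

    local_head = git("rev-parse", "HEAD", cwd=local)
    hosted_head = git("rev-parse", "refs/heads/" + branch, cwd=hosted)
    return local_head, hosted_head


if shutil.which("git"):  # only run the demo if git is installed
    with tempfile.TemporaryDirectory() as parent_dir:
        local_head, hosted_head = push_demo(parent_dir)
    print("remote received the commit:", local_head == hosted_head)
```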

## Hosted repositories

Github.com is a git repository hosting site that is very popular with both open source projects (for which it is free) and private repositories (for which a subscription might be needed).

With a hosted repository it is easy to collaborate with colleagues on the same code base, and you get a graphical user interface where you can browse the code, look at commit logs, track issues etc.

One such hosted repository site is github.com, where a project page looks like this:

In [14]:
Image(filename='images/github-project-page.png')


## Graphical user interfaces

There are also a number of graphical user interfaces for GIT. The available options vary a little bit from platform to platform:

Image(filename='images/gitk.png')