Version Control with Mercurial

We use Mercurial (hg) for version control of code, documentation, and pretty much any other important computer files in the Salish Sea MEOPAR project.

The Mercurial site includes beginner’s guides, but Mercurial - The Definitive Guide (also known as “the redbean book”) is the go-to reference. If you are new to version control you should read at least Chapters 1 and 2. Users experienced with other version control tools (e.g. svn or git) can get up to speed with Mercurial by reading Chapter 2.

The central storage of repositories is in the SalishSea-MEOPAR team account on Bitbucket.org. If you haven’t done so already, you should:

  • Create a Bitbucket.org account. If you use an academic domain email address (like @eos.ubc.ca) you will get perks like unlimited private repo collaboration.

  • Send your Bitbucket user id to dlatornell@eos.ubc.ca so that you can be added to the SalishSea-MEOPAR team account.

  • Follow the Bitbucket ssh Set-up instructions to enable ssh key authentication.

    Note

    You only need to do the ssh-keygen part described in step 3 once. Once you have an ssh key-pair you can use it from all of your working environments.

Installing Mercurial

You need to have Mercurial installed on your computer. It is already installed on the Waterhole workstations, sable, and salish at UBC. It is also installed on orcinus on WestGrid, and cedar and graham on ComputeCanada. If you have administrator privileges on your workstation or laptop you can download and install Mercurial for your operating system from http://mercurial-scm.org/downloads. Otherwise, contact your IT support to have it installed for you.

Windows users may want to use TortoiseHg or SourceTree, GUI tools that integrate with Windows Explorer. However, this documentation focuses on command line use of Mercurial. The workflows described below should translate easily to those GUI tools. TortoiseHg also includes a command line interface.

Mercurial Configuration

Mercurial uses configuration settings in your $HOME/.hgrc file as global settings for everything you do with it. You need to set up this configuration on each machine that you use Mercurial on. You should create or edit your $HOME/.hgrc file to contain:

[extensions]
color =
graphlog =
pager =
rebase =
progress =
strip =

[pager]
pager = LESS='FRX' less

[ui]
username = Your Name <your_email_address>
ignore = $HOME/.hgignore
ssh = ssh -C

The [extensions] section enables several useful Mercurial extensions:

  • color shows log listing, diffs, etc. in colour
  • graphlog provides the hg glog command and the synonymous hg log -G command that formats the output as a graph representing the revision history using ASCII characters to the left of the log
  • pager sends output of Mercurial commands through the pager that you specify in the [pager] section so that long output is displayed one page at a time
  • rebase enables rebasing which is particularly useful when working in repositories to which several contributors are pushing changes. As described below, rebase allows changes that have been pushed by other contributors to be pulled into your cloned repo while you have committed changes that have not been pushed without having to do frivolous branch merges. See Pulling and Rebasing Changes from Upstream for more details.
  • progress provides progress bars in the output of commands that are going to take more than a second or two to complete
  • strip provides the strip command to remove changesets and their descendants from a repository. We very occasionally need to use this for repository maintenance.

The [ui] section configures the Mercurial user interface:

  • username defines the name and email address that will be used in your commits. You should use the same email address as the one you have registered on Bitbucket.
  • ignore is the path and name of an ignore file to be applied to all repositories (see Global Ignore File)
  • ssh specifies the ssh command to use when communicating with remote Mercurial instances like the one on Bitbucket. Setting it to ssh -C enables data compression.

See the Mercurial configuration file docs for more information about configuration options.

Global Ignore File

Mercurial uses the file specified by ignore in the [ui] configuration section to define a set of ignore patterns that will be applied to all repos. Like your Mercurial configuration, you need to set this up on each machine that you use Mercurial on. The recommended path and name for that file is $HOME/.hgignore.

You should create or edit your $HOME/.hgignore file to contain:

syntax: glob
*~
*.pyc
*.egg-info
.ipynb_checkpoints
.DS_Store
.coverage
.cache

syntax: regexp
(.*/)?\#[^/]*\#$
^docs/(.*)build/

The syntax: glob section uses shell wildcard expansion to define file patterns to be ignored.

The syntax: regexp section uses regular expressions to define ignore patterns. The ^docs/(.*)build/ pattern ignores the products of Sphinx documentation builds in docs/ directories.

Most repos have their own .hgignore file that defines patterns to ignore for that repo in addition to those specified globally.

See the ignore file syntax docs for more information.
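To see ignore patterns at work without touching a real repo, you can experiment in a scratch repo. The sketch below uses a repo-level .hgignore with two of the glob patterns from above; the directory and file names are made up for illustration.

```shell
# Build a scratch repo to watch ignore patterns take effect.
scratch="$(mktemp -d)"
if command -v hg >/dev/null 2>&1; then
    hg init "$scratch/repo"
    cd "$scratch/repo"
    printf 'syntax: glob\n*~\n*.pyc\n' > .hgignore
    touch notes.txt notes.txt~ module.pyc
    # Untracked files are listed with "?"; ignored files are not listed at all.
    ignore_demo="$(hg status)"
    echo "$ignore_demo"
    cd - >/dev/null
fi
```

notes.txt should appear in the listing, while the editor backup file and the .pyc file should not.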

Mercurial Workflows

Note

Mercurial commands may be shortened to the fewest number of letters that uniquely identifies them. For example, hg status can be spelled hg stat or even hg st. If you don’t provide enough letters Mercurial will show the possible command completions.

Pulling and Rebasing Changes from Upstream

The upstream Bitbucket repos from which you cloned your local working repos are the central repos to which everyone working on the project pushes their changes. This section describes workflows for pulling those changes into your repos, how to do so without having to do frivolous branch merges, and how to recover from the common mistakes.

Use hg incoming to see changes that are present in the upstream repo that have not yet been pulled into your local repo. Similarly, hg outgoing will show you the changes that are present in your local repo that have not been pushed upstream.

Ensure that you have committed all of your changes before you pull new changes from upstream; i.e. hg status should show nothing, or only a list of untracked files marked with the ? character.

hg pull --rebase will pull the changes from upstream and re-apply your locally committed changes on top of them. Using rebase avoids the creation of a new head (aka a branch) in your local repo and the unnecessary merge commit that would otherwise be needed after hg pull --update. That reserves branching and merging for the relatively rare occasions when temporarily divergent lines of development are actually required.

The rebase extension docs have more information and diagrams of what’s going on in this common rebase use case.

Rebasing an Accidental Branch

Sooner or later you will accidentally create a branch in your local repo. Using hg pull --rebase with uncommitted changes and then committing those changes is one way that an accidental branch can happen. hg glog is a variant of the hg log command that shows an ASCII-art graph of the commit tree to the left of the commit log, providing a way of visualizing branches.

hg rebase can be used to move the changes on an accidental branch to the tip of the repo. See the scenarios section of the rebase extension docs for diagrams and rebase command options for moving branches around in various ways.

Aborting a Merge

You may find yourself having followed Mercurial’s workflow suggestions and having merged changes from upstream, but then realizing that you really should have rebased. At that point if you try to do almost anything other than commit the merge Mercurial will stop you with a message like:

abort: outstanding uncommitted merges

You can use hg update --clean to discard the uncommitted changes, effectively aborting the merge (and any other uncommitted changes you might have). After that you should use hg glog or hg heads to examine your repo structure because you may well have an accidental branch that you will want to rebase.

Incidentally, hg update --clean can be used any time that you want to discard all uncommitted changes, but be warned: it does so without keeping a backup. See hg revert for a less destructive way of discarding changes on a file-by-file basis (but note that hg revert cannot be used to undo a merge).

Amending the Last Commit

hg commit --amend can be used to alter the last commit, provided that it has not yet been pushed upstream. This allows for correction or elaboration of the commit message, inclusion of additional changes in the commit, or addition of new files to the commit, etc.

Commit Message Style

Commit messages can be written on the command line with the hg commit -m option with the message enclosed in double-quotes ("); e.g.

hg commit -m"Add Salish Sea NEMO model quick-start section."

Assuming that you have the EDITOR environment variable set, hg commit without the -m option will open your editor for you to write your commit message, and the files to be committed will be shown in the editor. Using your editor for commit messages also makes it easy to write multi-line commit messages.

Here are recommendations for commit message style:

Short (70 chars or less) summary sentence.

More detailed explanatory text, if necessary.  Wrap it to about 72
characters or so. The blank line separating the summary from the body
is critical (unless you omit the body entirely).

Write your commit message in the imperative: "Fix bug" and not "Fixed bug"
or "Fixes bug."

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, followed by a
  single space, with blank lines in between

- Use a hanging indent
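The two mechanical parts of these recommendations, the summary length and the blank second line, can be checked with a small script before committing. check_commit_msg below is a hypothetical helper written for this illustration, not part of Mercurial; it assumes the message has been written to a file, which can then be passed to hg commit with the --logfile option.

```shell
# Hypothetical helper: check a commit message file against the style above.
check_commit_msg() {
    msg_file="$1"
    summary="$(sed -n '1p' "$msg_file")"
    second="$(sed -n '2p' "$msg_file")"
    if [ "${#summary}" -gt 70 ]; then
        echo "summary line is longer than 70 characters"
        return 1
    fi
    if [ -n "$second" ]; then
        echo "second line must be blank"
        return 1
    fi
    echo "ok"
}

# Example use with a throwaway message file.
msg="$(mktemp)"
printf 'Fix tide forcing file path.\n\nThe path was relative to the wrong directory.\n' > "$msg"
check_commit_msg "$msg"
# If the check passes, the file can be used as the commit message:
# hg commit --logfile "$msg"
```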