Git Book — 6. Workflows

6. Workflows

In software development, the term “workflow” usually describes strategies that define how a team works together (e.g. “agile software development”). We limit ourselves to literature references on this topic.^[88]

In Git, you can see “workflows” from two different perspectives: Workflows (command sequences) that affect individual users, and project-related workflows (e.g., release management). Both aspects are discussed below.

6.1. User

Below you will find a list of general development strategies (in no particular order):

Make commits as small and independent as possible: Divide your work into small, logical steps and make a commit for each step. The commits should be independent of future commits and should pass all tests (if any). This makes it easier for your colleagues or maintainers to keep track of what you have done. It also increases the efficiency of commands that examine the history, such as git bisect and git blame. Don’t be afraid of making commits too small: in hindsight it is easier to combine several small commits with git rebase --interactive than to split one big commit into several small ones.

Develop in topic branches: Branching is easy, fast and intuitive in Git. Subsequent merging works without problems, even repeatedly. Take advantage of Git’s flexibility: Don’t develop directly in master, but develop each feature in its own branch, a so-called topic branch.

This has several advantages: you can develop features independently; you get a well-defined point in time for integration (merge); you can rebase the development to be “streamlined” and clear before you publish it; you make it easier for other developers to test a new feature in isolation.

Use Namespaces: You can create different classes of branches by using / characters in the branch name. In a central repository you can create your own namespace using your initials (e.g. jp/refactor-base64) or store your features under experimental/ or pu/ (see below) depending on stability.

Rebase early, Rebase often: If you frequently work with Rebase on Topic Branches, you will create a much more readable version history. This is convenient for you and other developers and helps to split the actual programming process into logical units.

Combine small commits when they belong together. If necessary, take the time to split up large commits again in a sensible way (see Sec. 4.2.2, “Editing Commits Arbitrarily”).

However, only use Rebase for your own commits: do not modify already published commits or other developers' commits.

Make a conscious distinction between FF and regular merges: Always integrate changes from upstream via fast-forward (you simply fast-forward the local copy of the branch). In contrast, integrate new features through regular merges. The aliases presented in Sec. 3.3.2, “Fast Forward Merges: Fast Forwarding One Branch” also help to keep the two apart.

Note the merge direction: The command git merge pulls one or more branches into the current one. So always pay attention to the direction in which you perform a merge: Integrate topic branches into the mainline (the branch on which you are preparing the stable release), not the other way around.⁠^[89] This way you can isolate the history of a feature from the mainline even after the fact (git log topic lists only the relevant commits).

Criss-cross merges (crossed merges) should be avoided if possible: They occur when you integrate a branch A into a branch B and an older version of B into A.

Test the compatibility of features via Throw-Away Integration: Create a new (disposable) branch and merge the features whose compatibility you want to test. Run the test suite or test the interaction of the new components in another way. You can then delete the branch and continue developing the features separately. Such Throw-Away branches are usually not published.
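
The throw-away integration described above could look roughly like the following sketch. It runs in a scratch repository; the branch names pu/feature-a, pu/feature-b and tmp/integration as well as the file names are hypothetical, and the two test -f calls merely stand in for a real test suite.

```shell
set -e
# Scratch repository; every branch and file name below is hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

# Two independent topic branches, each adding its own file.
git checkout -q -b pu/feature-a master
echo "a" > a.txt
git add a.txt
git commit -q -m "add feature a"

git checkout -q -b pu/feature-b master
echo "b" > b.txt
git add b.txt
git commit -q -m "add feature b"

# Throw-away integration: merge both features into a disposable branch.
git checkout -q -b tmp/integration master
git merge -q --no-edit pu/feature-a
git merge -q --no-edit pu/feature-b

# Stand-in for the real test suite: both features must be present together.
test -f a.txt
test -f b.txt

# Discard the integration branch; the topic branches stay untouched.
git checkout -q master
git branch -q -D tmp/integration
```

Because the integration branch was never merged into master, it has to be deleted with -D rather than -d.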

Certain work steps appear again and again. Here are a few general solution strategies:

Fix a small bug: If you notice a small bug that you want to fix quickly, you can do this in two ways. The first is to stash existing changes (see Sec. 4.5, “Outsourcing Changes — Git Stash”), check out the corresponding branch, fix the bug, switch back to the previous branch, and apply the stash.

The other possibility is to fix the bug on the branch you are currently working on and to subsequently transfer the corresponding commit(s) via Cherry Pick or Rebase-Onto (see Sec. 3.5, “Taking over Individual Commits: Cherry Picking”) to the designated bugfix or topic branch.

Correcting a Commit: With git commit --amend you can modify the last commit. The --no-edit option keeps the existing commit message instead of opening it for editing again.

To fix deeper commits, either use interactive rebase and the edit keyword (see Sec. 4.2.2, “Editing Commits Arbitrarily”), or create a small commit for each fix, then arrange them accordingly in the interactive rebase, and apply the fixup action to them to correct the original commit.
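
The second variant (one small commit per fix, later folded into the original) can be partly automated. The sketch below is one possible way, not the only one: it assumes git commit --fixup to mark the fix and git rebase --interactive --autosquash to arrange the todo list; setting GIT_SEQUENCE_EDITOR=true simply accepts the proposed list without opening an editor. The repository and file contents are made up for illustration.

```shell
set -e
# Scratch repository with three commits; all names are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "initial commit"

echo "hello wrold" > greeting.txt      # typo that is noticed too late
git add greeting.txt
git commit -q -m "add greeting"

echo "other" > other.txt
git add other.txt
git commit -q -m "add other file"

# Record the correction as a fixup commit for the faulty commit.
target=$(git rev-parse HEAD~1)
echo "hello world" > greeting.txt
git commit -q -a --fixup="$target"

# --autosquash moves the fixup commit behind its target and marks it
# as "fixup"; the sequence editor "true" accepts that list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --oneline
```

After the rebase the history again consists of three commits, and the corrected text is part of the "add greeting" commit.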

Which branches are not yet in master?: Use git branch -vv --no-merged master to find out which branches are not yet included in master. Without an argument, --no-merged compares against the current branch.

Merge multiple changes from different sources: Use the index to combine several changes, e.g. changes that complement each other but are in different branches or as patches. The commands git apply, git cherry-pick --no-commit and git merge --squash apply the corresponding changes only to the working tree or index without creating a commit.
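
As a minimal sketch of combining changes via the index: one change is taken from a branch with git cherry-pick --no-commit, a second one arrives as a patch and is staged with git apply --index, and both end up in a single commit. The branches topic-a and topic-b and all file names are hypothetical.

```shell
set -e
# Scratch repository; branch and file names are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

git checkout -q -b topic-a master
echo "a" > a.txt
git add a.txt
git commit -q -m "add a"

git checkout -q -b topic-b master
echo "b" > b.txt
git add b.txt
git commit -q -m "add b"

git checkout -q master

# Change 1: take the commit from topic-a, but only into index and working tree.
git cherry-pick --no-commit topic-a

# Change 2: treat topic-b as a patch and apply it to index and working tree.
git diff master topic-b | git apply --index

# Both changes are now staged; record them as one commit.
git commit -q -m "add a and b together"
git show --stat HEAD
```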

6.2. A Branching Model

The following section introduces a branching model based on the model described in the gitworkflows(7) man page. The branching model determines which branch performs which functions, when and how commits are taken from a branch, which commits are to be tagged as releases, etc. It is flexible, scales well, and can be extended as needed (see below).

In its basic form the model consists of four branches: maint, master, next, and pu (Proposed Updates). The master branch is used to prepare the next release and to collect trivial changes. pu branches are used for feature development (topic branches). In the next branch, reasonably stable new features are collected, tested for compatibility, stability and correctness, and improved if necessary. Critical bug fixes for previous versions are collected in the maint branch and published as maintenance releases.

In principle, commits are always integrated into another branch by a merge (in Figure 41, “Branch model according to gitworkflows (7)” indicated by arrows). Unlike cherry picking, commits are not duplicated, and you can easily see whether a branch already contains a particular commit or not.
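
To illustrate why merging makes containment easy to check, here is a small sketch using git branch --contains and git merge-base --is-ancestor. The repository layout and the branch pu/feature are assumptions made for the example.

```shell
set -e
# Scratch repository; branch and file names are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"
git branch next

git checkout -q -b pu/feature master
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
feature_commit=$(git rev-parse HEAD)

# Feature graduation by merge: the commit keeps its ID in next.
git checkout -q next
git merge -q --no-ff --no-edit pu/feature

# Human-readable: which branches already contain the commit?
git branch --contains "$feature_commit"

# Scriptable: exit status 0 means the commit is reachable from the branch.
if git merge-base --is-ancestor "$feature_commit" master; then
  echo "already in master"
else
  echo "not in master yet"
fi
```

Had the commit been copied with git cherry-pick instead, it would have received a new ID and neither check would find it.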

The following diagram is a schematic representation of the ten-point workflow, which is explained in detail below.

Figure 41. Branch model according to gitworkflows (7)

1. New topic branches arise from well-defined points, e.g. tagged releases, on master.
```
$ git checkout -b pu/cmdline-refactor v0.1
```
2. Sufficiently stable features are taken from their respective pu branch to next (feature graduation).
```
$ git checkout next
$ git merge pu/cmdline-refactor
```
3. Release preparation: If enough new features have accumulated in next (feature-driven development), next is merged into master and, if necessary, a release candidate tag (RC tag) is created (suffix -rc<n>).
```
$ git checkout master
$ git merge next
$ git tag -a v0.2-rc1
```
4. From now on, only so-called release-critical bugs (RC bugs) are corrected directly in master. These are “show-stoppers”, i.e. bugs that significantly limit the functionality of the software or make new features unusable. If necessary, you can undo merges of problematic branches (see Sec. 3.2.2, “Rolling Back Commits”).

What happens to next during the release phase depends on the size of the project. If all developers are busy fixing the RC bugs, a development stop for next is a good idea. For larger projects, where development for the next release but one is already being pushed forward during the release phase, next can continue to serve as an integration branch for new features.
5. Once all RC bugs have been eliminated, master is tagged as a release and, if necessary, published as a source code archive, distribution package, etc. Furthermore, master is merged into next to transfer all fixes for RC bugs. If no further commits have been made to next in the meantime, this is a fast-forward merge. Now new topic branches can be opened again, based on the new release.
```
$ git tag -a v0.2
$ git checkout next
$ git merge master
```
6. Feature branches that didn’t make it into the release can now either be merged into next or, if they are not yet finished, be rebased onto a new, well-defined base.
```
$ git checkout pu/numeric-integration
$ git rebase next
```
7. In order to separate feature development from bug fixes and maintenance, bug fixes that affect a previous version are made in the branch maint. This maintenance branch, like the feature branches, branches off from master at well-defined points.
8. If enough bug fixes have accumulated or if a critical bug has been fixed, e.g. a security bug, the current commit on the maint branch is tagged as a maintenance release and can be published via the usual channels.
```
$ git checkout maint
$ git tag -a v0.1.1
```
Sometimes bug fixes made on master are also needed in maint. In this case it is okay to transfer them there using git cherry-pick. But this should be the exception rather than the rule.
9. To ensure that bug fixes are available in the future, the maint branch is merged into master after a maintenance release.
```
$ git checkout master
$ git merge maint
```
If the bug fixes are very urgent, they can be transferred to the appropriate branch (next or pu/*) using git cherry-pick. As with git cherry-pick to maint, this should only happen rarely.
10. When a new release is published, the maint branch is fast-forwarded to the state of master, so maint now contains all commits that make up the new release. If no fast-forward is possible here, this is an indication that there are still bug fixes in maint that are not in master (see point 9).
```
$ git checkout maint
$ git merge --ff-only master
```

You can extend the branching model as you wish. One approach that is often encountered is the use of namespaces (see Sec. 3.1, “References: Branches and Tags”) in addition to the pu/* branches. This has the advantage that each developer uses his own namespace, which is delimited by convention. Another very popular extension is to have a separate maint branch for each previous version. This makes it possible to maintain any number of older versions. For this purpose, before merging from maint to master, a corresponding branch for the version is created in point 9.

$ git branch maint-v0.1.2

But keep in mind that these additional maintenance branches mean an increased maintenance effort, because every new bug fix has to be checked. If it is also relevant for an older version, it must be added to the maintenance branch for that version using git cherry-pick. In addition, a new maintenance version may have to be tagged and published.
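
A backport to a per-version maintenance branch might look like the sketch below. The branch name maint-v0.1, the tags and the file contents are hypothetical; the -x option of git cherry-pick is used here so that the copied commit records where it came from.

```shell
set -e
# Scratch repository; version numbers, branch and file names are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "release 0.1"
git tag -a v0.1 -m "release 0.1"

# Maintenance branch for the 0.1 series, starting at the release tag.
git branch maint-v0.1 v0.1

# Development continues on master: a new feature, then a bug fix.
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "add new feature"
echo "v1 fixed" > app.txt
git commit -q -a -m "fix crash on empty input"
fix=$(git rev-parse HEAD)

# The fix is also relevant for 0.1: copy it and tag a maintenance release.
git checkout -q maint-v0.1
git cherry-pick -x "$fix"
git tag -a v0.1.1 -m "maintenance release 0.1.1"
git log --oneline
```

The maintenance branch receives the fix but not the new feature, which is exactly the extra bookkeeping the paragraph above warns about.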

6.3. Release Management

As soon as a project has more than one or two developers, it usually makes sense to assign a developer to manage the releases. This Integration Manager decides after consultation with the others (e.g. via the mailing list) which branches are integrated and when new releases are made.

Each project has its own requirements for the release process. Below are some general tips on how to monitor development and partially automate the release process.⁠^[90]

6.3.1. Exploring Tasks

The maintainer of a software project must have a good overview of the features that are actively being developed and will soon be integrated. In most development models, commits graduate from one branch to the next; in the model presented above, first from the pu branches to next and then to master.

First of all, you should always clean up your local branches in order not to lose the overview. The command git branch --merged master, which lists all branches that are already fully integrated into master (or another branch), is especially helpful here. You can usually delete these.
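
The cleanup can be scripted along the following lines. This is only a sketch with hypothetical branch names: it deletes every local branch reported by git branch --merged master except master itself, and uses the safe -d option so that Git refuses to delete anything that is not fully merged.

```shell
set -e
# Scratch repository; branch and file names are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

# A finished topic branch that has been merged into master ...
git checkout -q -b pu/done master
echo "done" > done.txt
git add done.txt
git commit -q -m "finished feature"
git checkout -q master
git merge -q --no-ff --no-edit pu/done

# ... and one that is still work in progress.
git checkout -q -b pu/wip master
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "unfinished feature"
git checkout -q master

# Delete all branches that are fully integrated into master.
git branch --merged master --format='%(refname:short)' |
while read -r branch; do
  [ "$branch" = "master" ] || git branch -d "$branch"
done

git branch
```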

To get a rough overview of the tasks that need to be done, it is recommended to use git show-branch. Without any further arguments, it lists all local branches, each with an exclamation mark (!) in its own color. The current branch gets a star (*). Below this header, all commits are listed; in each branch’s column, a plus (+) or, for the current branch, a star (*) marks the commit as part of that branch. A minus (-) indicates merge commits.

$ git show-branch
! [for-hjemli] initialize buf2 properly
 * [master] Merge branch 'stable'
  ! [z-custom] silently discard "error opening directory" messages
---
+   [for-hjemli] initialize buf2 properly
--  [master] Merge branch 'stable'
+*  [master^2] Add advice about scan-path in cgitrc.5.txt
+*  [master^2^] fix two encoding bugs
+*  [master^] make enable-log-linecount independent of -filecount
+*  [master~2] new_filter: correctly initialise ... for a new filter
+*  [master~3] source_filter: fix a memory leak
  + [z-custom] silently discard "error opening directory" messages
  + [z-custom^] Highlight odd rows
  + [z-custom~2] print upstream modification time
  + [z-custom~3] make latin1 default charset
+*+ [master~4] CGIT 0.9

Commits are shown only as far back as the common merge base of all branches (in the example: master~4). If you don’t want to examine all branches at once, but only the branches under pu/, for example, pass them explicitly as arguments. --topics <branch> defines <branch> as the integration branch, whose commits are not explicitly shown.

So the following command shows you all commits of all pu branches and their relation to master:

$ git show-branch --topics master "pu/*"

It is worth documenting the commands you use for release management (so that others can continue your tasks if necessary). You should also abbreviate common steps by using aliases.

You could convert the above command into an alias todo as follows:

$ git config --global alias.todo \
  "!git rev-parse --symbolic --branches | \
  xargs git show-branch --topics master"

However, git show-branch only recognizes identical commits, i.e. commits with the same SHA-1 ID. If you use git cherry-pick to copy a commit to another branch, the changes are practically the same, but git show-branch does not detect this because the SHA-1 sum of the commit changes.

The git cherry tool handles these cases. Internally it uses the small tool git patch-id, which reduces a commit to its changes, ignoring whitespace changes and the contextual position of the hunks (line numbers). The tool therefore returns the same ID for patches that introduce essentially the same change.

Usually, git cherry is used when the question arises: Which commits have already been transferred to the integration branch? The command git cherry -v <upstream> <topic> is used for this: It lists all commits from <topic>, and puts a minus (-) in front of them if they are already in <upstream>, otherwise a plus (+). This looks like this:

$ git cherry --abbrev=7 -v master z-custom
+ ae8538e guess default branch from HEAD
- 6f70c3d fix two encoding bugs
- 42a6061 Add advice about scan-path in cgitrc.5.txt
+ cd3cf53 make latin1 default charset
+ 95f7179 Highlight odd rows
+ bbaabe9 silently discard "error opening directory" messages

Two of the patches have already been applied to master. git cherry recognizes this even though the commit IDs have changed.
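
The relationship between commit IDs and patch IDs can be reproduced in a scratch repository. In the following sketch, the helper function patch_id and all branch and file names are made up for illustration; it pipes git show into git patch-id and keeps the first field of the output, which is the patch ID.

```shell
set -e
# Scratch repository; branch and file names are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

git checkout -q -b topic master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix two encoding bugs"
original=$(git rev-parse HEAD)

# master moves on, then the fix is copied over with cherry-pick.
git checkout -q master
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "unrelated change"
git cherry-pick "$original" > /dev/null
copy=$(git rev-parse HEAD)

# Hypothetical helper: patch ID of a single commit.
patch_id() {
  git show "$1" | git patch-id | cut -d' ' -f1
}

echo "commit IDs: $original $copy"
echo "patch IDs:  $(patch_id "$original") $(patch_id "$copy")"

# git cherry marks the topic commit with "-" because master has an equivalent.
git cherry -v master topic
```

The two commit IDs differ because the copied commit has a different parent, while the patch IDs are the same.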

6.3.2. Creating Releases

Git provides the following two useful tools to help you prepare for a release:

git shortlog: Summarizes the output of git log.
git archive: Automatically creates a source code archive.

A good release includes a so-called changelog, i.e. a summary of the most important changes including thanks to people who have contributed help. This is where git shortlog comes in. It shows the respective authors, how many commits each one has made, and the commit messages of each commit. This makes it easy to see who did what.

$ git shortlog HEAD~3..
Georges Khaznadar (1):
      bugfix: 3294518

Kai Dietrich (6):
      delete grammar tests in master
      updated changelog and makefile
      in-code version number updated
      version number in README
      version number in distutils setup.py
      Merge branch 'prepare-release-0.9.3'

Valentin Haenel (3):
      test: add trivial test for color transform
      test: expose bug with ID 3294518
      Merge branch 'fix-3294518'

The --numbered or -n option sorts the output by the number of commits instead of alphabetically. With --summary or -s the commit messages are omitted.

But if in doubt, don’t simply write the output of git log or git shortlog to the file CHANGELOG. Especially with many technical commits, the changelog is not helpful (if you’re interested in this information, you can always check the repository). But you can take the output as a basis, delete unimportant changes and combine the rest into meaningful groups.

The maintainer often needs to know what has changed since the last release. This is where git describe (see Sec. 3.1.3, “Tags — Marking Important Versions”) comes in handy. With --abbrev=0, it outputs only the most recent tag reachable from HEAD:

$ git describe
wiki2beamer-0.9.2-20-g181f09a
$ git describe --abbrev=0
wiki2beamer-0.9.2

In combination with git shortlog the question can be answered very easily:

$ git shortlog -sn $(git describe --abbrev=0)..
    15  Kai Dietrich
     4  Valentin Haenel
     1  Georges Khaznadar

The git archive command helps to create a source code archive. The command can handle both tar and zip format. Additionally, you can set a prefix for the files to be saved with the option --prefix=. The top level of the repository is then stored below this prefix, usually the name and version number of the software:

$ git archive --format=zip --prefix=wiki2beamer-0.9.3/ HEAD \
    > wiki2beamer-0.9.3.zip
$ git archive --format=tar --prefix=wiki2beamer-0.9.3/ HEAD \
    | gzip > wiki2beamer-0.9.3.tgz

As a mandatory argument the command expects a commit (or a tree) to be packed as an archive; in the example above this is HEAD. It could equally be a commit ID, a reference (branch or tag) or a tree object.^[91]

Again, you can use git describe after you have tagged a release commit. If you have a suitable tag scheme <name>-<X.Y.Z> as above, the following command is sufficient:

$ version=$(git describe)
$ git archive --format=zip --prefix=$version/ HEAD > $version.zip

Not all files you manage in your Git repository necessarily belong in the source code archive, e.g. the project website. You can therefore also specify paths. To limit the archive to the src directory and the LICENSE and README files, use:

$ version=$(git describe)
$ git archive --format=zip --prefix=$version/ HEAD src LICENSE README \
    > $version.zip

Git will store the SHA-1 sum in the archive if you specify a commit as an argument. In tar format, this is stored as a pax header entry, which Git can read again with the command git get-tar-commit-id:

$ zcat wiki2beamer-0.9.3.tgz | git get-tar-commit-id
181f09a469546b4ebdc6f565ac31b3f07a19cecb

In zip files, Git simply saves the SHA-1 sum in the comment field:

$ unzip -l wiki2beamer-0.9.3.zip | head -5
Archive:  wiki2beamer-0.9.3.zip
181f09a469546b4ebdc6f565ac31b3f07a19cecb
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  05-06-2011 20:45   wiki2beamer-0.9.3/

One problem to keep in mind is that files such as .gitignore are packed automatically. Since they have no meaning outside a Git repository, it is worth excluding them with the Git attribute export-ignore (see Sec. 8.1, “Git Attributes — Treating Files Separately”). This is done with an entry .gitignore export-ignore in .git/info/attributes.
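
The effect of export-ignore can be checked by listing the archive contents. The sketch below assumes a scratch repository with a hypothetical prefix project-0.1/ and uses tar -t to show what git archive actually packs.

```shell
set -e
# Scratch repository; file names and the archive prefix are hypothetical.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "*.o" > .gitignore
echo "readme" > README
git add .gitignore README
git commit -q -m "initial commit"

# Exclude .gitignore from all archives created from this repository.
mkdir -p .git/info
echo ".gitignore export-ignore" >> .git/info/attributes

# List the files that end up in the archive.
contents=$(git archive --format=tar --prefix=project-0.1/ HEAD | tar -tf -)
echo "$contents"
```

Since .git/info/attributes is local to the repository, every collaborator who creates archives would need the same entry; a committed .gitattributes file is a common alternative.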

You can also perform automatic keyword substitutions before packing the archive (see Sec. 8.1.2, “Keywords in Files”).

88. Among others, the third chapter of Open Source Projektmanagement by Michael Prokop (Open Source Press, Munich, 2010) is recommended. The Manifesto for Agile Software Development at http://agilemanifesto.org is also informative.

89. An exception is if you need a new development in the mainline in your topic branch, but in that case you can consider rebuilding the topic branch via rebase so that it already contains the required functionality.

90. You can find further suggestions in chapter 6 of the book Open Source Projektmanagement by Michael Prokop (Open Source Press, Munich, 2010).

91. Each commit references exactly one tree. However, git archive behaves differently depending on whether you specify a commit (which references a tree) or a tree directly: for a tree, the modification time recorded in the archive is the current system time, whereas for a commit it is the commit time.