Version control with multiple branches

Simple 1-branch dev-live version control

Git, Subversion (SVN) and Mercurial are class-leading examples of version control systems (VCS). They all work slightly differently, but the fundamental principle is the same. Two developers work on the same codebase. They commit their own changes, and update or pull the other's. When they both work on the same file, there may be conflicts, but the VCS will help guide them through a conflict resolution process to get to a master committed version.

Each developer's machine (dev) is typically independent of the other, and independent of the live server (live) where the production version of the project is hosted. In a simple dev-live system, each developer works on the same branch, often the master or main branch. Deployment is handled by logging into the live server and simply 'updating' everything. All the committed changes to all files are pulled to the server and the site then works from the updated codebase. It's important to recognise this works pretty well, but it's not perfect:

Pros

  • Changes are managed

Cons

  • Developers have to be careful not to commit untested or bug-ridden work
  • Testing can only be done either atomically (before the commit) or retrospectively, once the commit has been deployed to the live server

That means that if a commit breaks something, the breakage may not be exposed right away. Testing also happens publicly, on the live server, which means live data may be corrupted with test data.

1-branch dev-live-live

This simple variant introduces the idea of a staging server. In this setup it's really just a second 'live' server. Again both developers work on a single branch, but instead of pulling their changes straight to live, they pull them to a staging server where they're tested before the final deploy to live.

This addresses some of the cons above, notably some of the testing concerns, but there are still limitations.

Pros

  • Work can be integration tested before pushing live
  • Test failures are private as the staging server need only be accessible to the developers and the testers
  • Boundary-case test data can be used without interfering with live data
  • Any load impacts imposed by the execution of the tests are confined to the staging server. The live server's load isn't affected

Cons

  • If a developer's machine crashes, they lose all their uncommitted work
  • Each developer can only work on a small set of independent features. Where their work overlaps, they'll need to collaborate to keep the staging server in a consistent state.

If there are too many committed but dependent changes, the site might only rarely get to the point where everything's working. This reduces the time when the staging server can effectively be used to test and when the live server can be re-deployed.

N-branch dev-stage-live

The solution is to use multiple branches. In a nicely reasoned tutorial, this all seems to make sense, but when you're unravelling developers' versioning errors, it can seem like the worst decision ever made. You might ask: why is it so complicated?

The setup is simple enough. There's a live branch, which the live server uses, and a staging branch, which the staging server uses. Staging is often called 'master' in the Git world because the master branch always exists and if you're going to create branches, it makes sense to just create one additional one (live).

The dev process begins with a little complexity. A developer checks out the live branch:

Clone the repo

git clone <repository_name>

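Look at a list of available branches (the -a flag includes remote branches such as live, which a fresh clone has not checked out locally yet)

git branch -a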

Switch to the live branch, as generally clone gives you master by default, and get latest code

git checkout live; git pull

Branch from live to create a specific feature branch

git checkout -b <feature_branch_name>

Now this point is crucial. Live represents the most solid, grounded, tested version of your code. Ultimately all features should go to live, which means all feature branches should eventually get merged back into live, and that only works cleanly if they were branched from live in the first place. It is so easy to forget to branch from live (and to check that you did) and, alas, so consequential to get wrong.
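
A quick way to confirm the starting point, before any work is committed, is to compare the tip of the new branch with the tip of live. The following is only a sketch in a throwaway sandbox repo; the branch name my_feature and the commit messages are made up for illustration.

```shell
set -eu
# Throwaway sandbox repo; every name in here is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.email dev@example.com
git config user.name "Dev"

git commit -q --allow-empty -m "initial release"
git branch live                           # live starts level with master
git commit -q --allow-empty -m "staged work, not yet on live"   # master moves ahead

# Branch from live, as the workflow requires.
git checkout -q -b my_feature live

# A freshly created feature branch should point at exactly the same commit as live.
if [ "$(git rev-parse HEAD)" = "$(git rev-parse live)" ]; then
    base_check="ok"
else
    base_check="not based on live"
fi
echo "my_feature: $base_check"
```

In a real clone only the final if block is needed, run straight after creating the branch.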

Double-check which branch we're on, then commit our changes

git branch
git commit . -m 'explanation of my changes'

We may/may not decide to push the feature branch up to the repo. I suggest this is essential unless you've got some kind of live/near-live backup of your local git repo, or so much time that you don't mind doing things twice.

git push -u origin <feature_branch_name>

Once you're happy with your local/unit testing, merge the feature branch back into master (yes, it came from live and now it's going to master!)

git checkout master
git merge <feature_branch_name>
git push

We merge to master, so that the feature can then be checked out on staging and tested thoroughly.

Pull changes on to staging

git pull

Once testing on staging is done, we're ready to merge to live and repeat our deploy, this time to the live server.

git checkout live
git merge <feature_branch_name>
git push

Check and deploy on live

git status -u

git pull
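
To see the whole round trip in one place, the steps above can be rehearsed in a sandbox. This is a sketch under assumptions rather than a recipe: the central repo, the dev, staging and live_server clones, and the my_feature branch are all hypothetical stand-ins created under a temporary directory.

```shell
set -eu
root=$(mktemp -d)

# Seed a hypothetical central repo that has both master and live.
git init -q "$root/seed"
cd "$root/seed"
git symbolic-ref HEAD refs/heads/master
git config user.email dev@example.com
git config user.name "Dev"
echo "v1" > site.txt
git add site.txt
git commit -q -m "initial release"
git branch live
git clone -q --bare "$root/seed" "$root/central.git"

# One developer clone, plus stand-ins for the staging and live servers.
git clone -q "$root/central.git" "$root/dev"
git clone -q "$root/central.git" "$root/staging"                   # follows master
git clone -q --branch live "$root/central.git" "$root/live_server" # follows live

# Developer: branch from live, commit, push the feature branch.
cd "$root/dev"
git config user.email dev@example.com
git config user.name "Dev"
git checkout -q live
git pull -q
git checkout -q -b my_feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q -u origin my_feature

# Merge to master so the feature can be tested on staging.
git checkout -q master
git merge -q --no-edit my_feature
git push -q origin master
cd "$root/staging"
git pull -q

# After testing, merge the same feature branch to live and deploy.
cd "$root/dev"
git checkout -q live
git merge -q --no-edit my_feature
git push -q origin live
cd "$root/live_server"
git status -u --short
git pull -q
echo "staging and live both have the feature"
```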

Pros

  • Properly managed features
  • Developers can work collaboratively or entirely independently of one another

Cons

  • It's complicated and requires vigilance from everyone, not just the repo keeper
  • It's nasty to unpick when things go wrong

The nightmare of discovering you branched from master in the beginning

You may discover when merging back into live that your VCS thinks you've made a lot more changes than you actually did. Check your branch. If you realise you've branched from master, panic not, but you have got a bit of a mess to clean up. Why?

When you branch from master you get everyone else's non-live, staged changes. Many if not most of their feature branches will be sitting in the master branch, so your feature branch now contains your feature and a tonne of others. When you merge into master (to deploy to staging) you may get lucky and realise what you've done, but the error can go undetected at this point, because merging a branch that came from master back into master shows only your own changes. However, when you merge back into live, you'll find a tonne of changes (how many depends on how busy the rest of the team is) and get that sinking feeling. Stop! Don't push your erroneously merged branch and definitely don't pull onto live.
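
The difference is easy to measure before merging anything, by counting the commits your branch has that the target branch lacks. Here is a small sandbox sketch of the mistake; the three teammate commits and the my_feature branch are invented for the demonstration.

```shell
set -eu
# Throwaway sandbox repo; every name in here is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.email dev@example.com
git config user.name "Dev"

git commit -q --allow-empty -m "initial release"
git branch live
# Teammates' features that are staged on master but not yet on live.
for n in 1 2 3; do
    git commit -q --allow-empty -m "teammate feature $n (staged, not live)"
done

# The mistake: branching from master instead of live.
git checkout -q -b my_feature master
git commit -q --allow-empty -m "my one change"

# Commits on my_feature that each target branch does not have yet.
vs_master=$(git rev-list --count master..my_feature)
vs_live=$(git rev-list --count live..my_feature)
echo "merging into master would bring in $vs_master commit(s)"
echo "merging into live would bring in $vs_live commit(s)"
git log --oneline live..my_feature
```

Against master the branch looks innocent; against live it drags the teammates' commits along with it. In a real repo, git log --oneline live..<feature_branch_name> run before the merge gives the same warning.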

The good news is that you've not (so far) done any damage. You've just got one dirty branch (because it came from master and contains other people's changes). There are lots of potential solutions, but the one I like best is to create a new branch (from live this time) and consult the git log to see which files you worked on. Depending on how monumental the error is, you may need to do something semi-automatic, like diff with master and create a patch, then apply the patch to your new branch.
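
As a rough sketch of the patch route, assuming the dirty branch has not yet been merged into master: the three-dot form of git diff compares the branch with the point where it left master, so the patch holds only your own changes. The branch names dirty_feature and clean_feature and the file names are hypothetical.

```shell
set -eu
# Throwaway sandbox repo; every name in here is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.email dev@example.com
git config user.name "Dev"

echo "v1" > site.txt
git add site.txt
git commit -q -m "initial release"
git branch live
echo "teammate work" > teammate.txt
git add teammate.txt
git commit -q -m "teammate feature (staged only)"

# The dirty branch: created from master by mistake, with one real change.
git checkout -q -b dirty_feature master
echo "my work" > mine.txt
git add mine.txt
git commit -q -m "my change"

# Capture only what dirty_feature added since it left master.
patch_file=$(mktemp)
git diff --binary master...dirty_feature > "$patch_file"

# Rebuild the feature on a clean branch taken from live.
git checkout -q -b clean_feature live
git apply --index "$patch_file"
git commit -q -m "my change, rebuilt on live"

clean_extra=$(git rev-list --count live..clean_feature)
echo "clean_feature carries $clean_extra commit(s) that live lacks"
```

If the dirty branch has already been merged into master, the three-dot diff comes out empty, and you would need to diff against the commit you originally branched from instead.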

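If you did push the bad merge or, worse, pulled it onto live, git pull won't help, because pulling only ever moves forward. On the live server, use git reset --hard <previous_commit> to get live back onto the previous commit, then remove your erroneous merge commit from the central repo's live branch. Running git revert -m 1 <merge_commit> undoes the merge with a new commit, which is safer than deleting history that others may already have pulled. It's a horrible job, but you probably won't ever do it twice!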
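
One way to back out a merge that has already been shared, without rewriting history, is git revert with the -m option. The sketch below rehearses that in a sandbox; it is an illustration under assumptions rather than the only fix, and the branch and file names are made up.

```shell
set -eu
# Throwaway sandbox repo; every name in here is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.email dev@example.com
git config user.name "Dev"

echo "v1" > site.txt
git add site.txt
git commit -q -m "initial release"
git branch live
echo "teammate work" > teammate.txt
git add teammate.txt
git commit -q -m "teammate feature (staged only)"
git checkout -q -b dirty_feature master
echo "my work" > mine.txt
git add mine.txt
git commit -q -m "my change"

# The bad merge: the dirty branch lands on live and drags teammate.txt with it.
git checkout -q live
git merge -q --no-ff --no-edit dirty_feature

# Undo it with a new commit. -m 1 keeps the first parent, i.e. live as it was.
git revert -m 1 --no-edit HEAD > /dev/null

files_on_live=$(git ls-files | tr '\n' ' ')
echo "files on live after the revert: $files_on_live"
```

Be aware that Git still considers the reverted branch merged, so merging dirty_feature into live again later will not bring its changes back. That is one more reason to rebuild the feature on a fresh branch taken from live.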

Either way, you end up with a new branch, which can be merged back into master (test on staging again) and then to live.

In summary, version control is a fantastic mechanism for managing (and not losing) your work or the work of your peers. Devopera doesn't do version control or VCS hosting for now, but we do set up production-ready servers that embed version-controlled deployment to make all this easy. If you're interested in using our services, or just want some help getting started with an integrated development-operations workflow, why not download one of our virtual machines for free?

published 3 years 1 month ago
