A love of Git

25 August 2012 - Git

I started off intending this post to be about the open source Git-Tfs project. I've used Git at home for both personal and freelance projects for quite a few years, but at work we unfortunately use TFS 2008. To allow me to use some of the power of Git at work (at least locally on my computer), I've turned to Git-Tfs.

I quickly realised that in order to explain my reasons for wanting the extra functionality of Git over TFS, I would first have to explain how some core features differ in different Version Control Systems (VCSs). So I decided to split this into two posts - one about Git and VCSs in general, and the other about Git-Tfs. The second post about Git-Tfs can be found here.

What is a Version Control System?

Okay, I'm going to make this section very brief because, to be honest, if you don't know what one is, then this article is probably beyond what you're looking for. I intend this post to be for developers who have used a VCS before, but haven't used Git.

The short version is that a VCS allows you to make changes to your files without destroying the previous content. The history of each of these changes is kept. Each time you explicitly commit changes to the VCS, you write a message describing the changes you have made. You can then view, compare or restore previous commits at any stage. There are many benefits to this on top of the obvious. For example, imagine you're looking through code and you see something that you (or someone else) wrote years ago, and you're unsure why that change was made. If a VCS was used, then you can very quickly find out who made that change and when, see their description of why they made it, and also see what other changes they made as part of that commit.

Another major purpose of using a VCS is to allow multiple people to collaborate on the same project/codebase. A history of everyone's changes is stored, and the system intelligently manages combining the different changes when they are committed to the VCS.

Major differences between VCSs

Depending on which Version Control System you choose, your workflow can be quite different. Below, I've outlined some of the core differences that you may come across when using different systems.

Centralised vs Distributed (Git is distributed)

In a centralised source control system, you have one master repository for the source code (normally kept on a server). Everyone working on the project will update their local copy from this central code repository, and check in their changes back to this same repository. The revision history is stored in the master repository, and there is no revision history stored locally.

With a distributed VCS (DVCS), each developer working on the project has their own local repository which contains the full revision history. They can commit code locally to their own repository, and then synchronise their local repository with other repositories. Generally there is still a master repository that developers synchronise with, but this isn't a requirement. If you're working alone on a project, then there's not necessarily a reason to have anything more than your local repository (other than to back up or share across multiple computers). This makes it very easy to get started using source control with a new project. You can always create a remote repository later if required.
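To make the distributed idea a bit more concrete, here's a minimal Python sketch. It's a toy model rather than how Git actually stores anything: the `Repo` class, its methods and the commit messages are all hypothetical names I've made up for illustration, and a "commit" here is nothing more than a message in a list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Repo:
    """Hypothetical stand-in for a repository: an ordered list of commit messages."""
    history: List[str] = field(default_factory=list)

    def commit(self, message: str) -> None:
        # Committing only touches this repository; no server is involved.
        self.history.append(message)

    def clone(self) -> "Repo":
        # A clone carries the full history, not just the latest files.
        return Repo(list(self.history))

    def push(self, remote: "Repo") -> None:
        # Simplification: only succeeds when the remote has no commits we lack.
        if remote.history != self.history[:len(remote.history)]:
            raise RuntimeError("remote has diverged; synchronise first")
        remote.history.extend(self.history[len(remote.history):])


server = Repo()
server.commit("Initial commit")

laptop = server.clone()
laptop.commit("Add login form")      # works offline
laptop.commit("Fix validation bug")  # still offline

history_before_push = list(server.history)
laptop.push(server)
print(history_before_push)
print(server.history)
```

The point to notice is that both local commits exist, with their history, before the server knows anything about them. The server only catches up when `push` is called.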

Another huge advantage of a DVCS is that because you can create commits and branches locally before pushing to the main server, you can make temporary checkpoint commits (or even dedicated branches) to help manage your code whilst you're working on a feature. Then you can tidy up those commits (combine, split, rename, reorder etc) before pushing them to the master repository. You end up with far better-defined commits, rather than having many different changes dumped into the same commit.
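As a rough illustration of the "combine" part of that tidy-up, here's a small Python sketch. Again, this is a toy model under my own assumptions: the `Commit` class and the `squash` function are hypothetical, and a commit is reduced to a message plus the names of the files it touched. Git's real tooling for this works on actual content changes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commit:
    message: str
    files: Tuple[str, ...]  # names of the files touched by this commit


def squash(commits: List[Commit], message: str) -> Commit:
    """Combine several local commits into one, listing each touched file once."""
    touched: List[str] = []
    for commit in commits:
        for name in commit.files:
            if name not in touched:
                touched.append(name)
    return Commit(message, tuple(touched))


# Three messy checkpoint commits made whilst working on one feature.
local = [
    Commit("wip", ("search.py",)),
    Commit("wip 2", ("search.py", "index.py")),
    Commit("fix typo", ("search.py",)),
]

tidy = squash(local, "Add full-text search")
print(tidy)
```

Only the single, well-described commit would then be pushed, so nobody else ever has to read through the "wip" checkpoints.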

Lock vs merge model (Git is the merge model)

Some source control systems work by locking files that you don't have flagged for editing (checked out). This is done by making all files read-only until you check them out. The term 'checking out a file' means that the source control system both has knowledge that you've checked out that file, and it also makes the file writable. To help make this a bit more seamless, most IDEs will automatically check out a source-controlled read-only file when you try to edit it.

Version Control Systems that use the merge model allow you to freely edit any file. Files aren't set to read-only as part of the source control workflow. The source control system keeps track of which files have changed, and merges those changes back into the repository when you commit them. You don't need to worry about files being checked out when you edit them - you can just make the changes without the VCS getting in the way.
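If you're wondering how two people can safely edit the same file without locks, the usual idea is a three-way merge: compare each person's version against the common version they both started from. Here's a deliberately simplified Python sketch of that idea. The `merge_lines` function is hypothetical, and it assumes edits only change lines in place (no lines added or removed). Real merge tools, Git's included, are far more capable than this.

```python
from typing import List


def merge_lines(base: List[str], mine: List[str], theirs: List[str]) -> List[str]:
    """Three-way merge of equal-length files, line by line (toy model)."""
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:      # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:    # only they changed it
            merged.append(t)
        elif t == b:    # only I changed it
            merged.append(m)
        else:           # both changed the same line differently
            raise ValueError(f"conflict on line {number}")
    return merged


base = ["title = 'Home'", "items = 10", "debug = False"]
mine = ["title = 'Start'", "items = 10", "debug = False"]
theirs = ["title = 'Home'", "items = 10", "debug = True"]

print(merge_lines(base, mine, theirs))
```

Both edits survive because they touch different lines. Only when two people change the same line in different ways does the merge stop and ask a human to decide.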

Branches - Ease of creation / destruction

All Version Control Systems nowadays support branches. However, in a lot of VCSs, branches are expensive and generally only used for major milestone versions and releases. With Git, branches are very inexpensive and generally used all the time in the development workflow. You can easily create temporary branches for small features, modify commits in that branch as required, then combine it back into the main branch and delete the temporary branch. This is very useful if you've started a feature but then need to make other unrelated changes in the main branch whilst still working on that feature.
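Part of the reason Git branches are so cheap is that a branch is essentially just a movable pointer to a commit, rather than a copy of the files. The Python sketch below models that idea. It's a toy: the `Repo` and `Commit` classes, the branch names and the commit messages are all hypothetical, and it leaves out everything about file contents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None


class Repo:
    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit("Initial commit")}
        self.current = "main"

    def branch(self, name: str) -> None:
        # Creating a branch copies one reference, not the files or the history.
        self.branches[name] = self.branches[self.current]

    def checkout(self, name: str) -> None:
        self.current = name

    def commit(self, message: str) -> None:
        # A new commit moves only the current branch's pointer.
        self.branches[self.current] = Commit(message, self.branches[self.current])

    def delete_branch(self, name: str) -> None:
        del self.branches[name]

    def log(self, name: str) -> List[str]:
        # Walk from the branch tip back to the first commit, newest first.
        messages = []
        commit = self.branches[name]
        while commit is not None:
            messages.append(commit.message)
            commit = commit.parent
        return messages


repo = Repo()
repo.branch("spike")
repo.checkout("spike")
repo.commit("Try new cache")        # feature work on the temporary branch

repo.checkout("main")
repo.commit("Hotfix: null check")   # unrelated change on the main branch

print(repo.log("main"))
print(repo.log("spike"))
```

The two branches share the initial commit and then go their separate ways, and deleting `spike` afterwards just removes one entry from a dictionary.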

Learning Git

If you are using Git for the first time, then you will probably find it quite daunting. The negative side to having so much power and flexibility is the learning curve. It's not actually that complicated, and once you understand a few key principles, it's fairly simple. However, you will need to dedicate some reading time to learning these key principles. There's a free online book which I recommend reading through if you're serious about using Git. Don't skip over topics like branches and rebasing because you think they might be too complicated - it really is worth the learning investment. A lot of the headaches that you commonly see people complain about when using Git are normally due to not fully understanding it. Git is not initially intuitive, and there are some fundamental concepts that you will never properly understand without either reading up on them or having them explained to you. There are certainly a lot of "aha!" moments when learning Git, when a concept clicks into place - and it's then that you realise how powerful and flexible Git actually is.

Useful Git Tools

Whilst I would certainly recommend you try to use the command line to an extent, to ensure you understand more intrinsically how Git works, it's certainly more efficient to find a decent GUI. I'm a Windows user, so I can only really recommend Windows GUIs. There is a list of Git GUIs on this page which also includes GUIs for other operating systems: http://git-scm.com/downloads/guis.

For Windows, I've used Git Extensions for quite a while and have been quite happy with it. However, I've recently discovered SmartGit, and I must say that I'm very impressed with it. It is much more polished than Git Extensions, and feels much more like a VCS itself rather than just a thin Git GUI (whilst still harnessing the power of Git). There are a few features missing from SmartGit, for example it doesn't support interactive rebases, or the blame command. Hopefully these will be added in a future version.

Also, more recently released is the GitHub for Windows client. Whilst this is designed for users of GitHub, it can also be used as a standalone Git client. I haven't used it extensively, so won't comment on it at this stage, but it's certainly worth experimenting with if you're trying to find your GUI of choice.

When you do use the command prompt, there's Posh-Git, which will give you various extra functionality, including autocompletion, syntax highlighting, stats within the prompt (e.g. current branch, number of changed files, etc), and much more. This is PowerShell based, so will only work if you're using a PowerShell prompt (why would you not be?!).

Don't forget that because all these various clients just sit on top of Git itself, you are free to swap and change between the various GUIs and the command line, even within the same project. The state is stored in the Git meta files, which are independent of the client.

Summary

Hopefully this post has given you a bit of an overview of some of the core differences that can be found in different Version Control Systems, and why Git seems to use the best of all those features. As mentioned earlier though, the downside of Git is that it's more complicated. This puts a lot of people off, especially as they try to use it without fully understanding it, which can cause headaches very quickly! If you're serious about using Git, then I'd recommend reading the free ebook I mentioned earlier, and especially making sure you understand rebasing (which sounds much harder than it actually is). It's so incredibly powerful that you're missing out on a huge part of what Git is all about if you skip this.

In my next post, I discuss the open source project Git-Tfs.
