We had a very sad call in tech support the other day. A frantic (and I mean certifiably, going-out-of-his-mind frantic) customer wanted to know if there was any way he could restore his project’s AVR source from his deployed Web application. Alas, there isn’t. Someone had inadvertently deleted the wrong folder somewhere along the way, leaving this person with a large, deployed Web application that he would soon be rewriting from scratch.
Over the years, at Paloozas and in individual consultations, we have implored our customers to get, and learn how to use effectively, a version control system (VCS). Many have, and those customers won’t ever be making frantic phone calls to tech support hoping we can magically resurrect their source code from EXEs and DLLs. However, just as many haven’t yet embraced VCS and live their every programming day on the edge of disaster. For all but the most trivial of software projects, good version control is essential.
Why use version control
With a good version control system in place, not only will you always know where your source is, but, among other things, you’ll be able to:
- determine when a bug was introduced
- see who has made the most contributions to the project
- track progress towards a new feature
- roll back to a previous version
- identify who changed what code when
- maintain multiple versions of a project
- experiment with a new feature without breaking existing code
- snapshot projects as archival milestones
For larger shops, especially those with many programmers or those governed by strict regulations (such as publicly held companies or financial institutions), version control isn’t just a luxury or an insurance policy; it is a necessity. These shops must be able to track exactly who changed what, and when, to ensure the integrity of the system.
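The first item in the list above, determining when a bug was introduced, comes down to a binary search over the project’s history (Git offers this as its bisect command). Here is a minimal Python sketch of the idea only, not of how any particular VCS implements it. The revision names and the is_good check are hypothetical stand-ins for real commits and a real test run, and the sketch assumes the oldest revision is good, the newest is bad, and the bug stays in place once it is introduced.

```python
from typing import Callable, Sequence


def first_bad_revision(revisions: Sequence[str],
                       is_good: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_good() is False.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and every revision after the first bad one is bad.
    """
    low, high = 0, len(revisions) - 1   # low is known good, high is known bad
    while high - low > 1:
        middle = (low + high) // 2
        if is_good(revisions[middle]):
            low = middle                # bug was introduced after middle
        else:
            high = middle               # bug was introduced at or before middle
    return revisions[high]


if __name__ == "__main__":
    # Hypothetical history of ten revisions; the bug arrived in "r6".
    history = [f"r{n}" for n in range(1, 11)]
    broken = set(history[5:])
    print(first_bad_revision(history, lambda rev: rev not in broken))  # prints r6
```

With ten revisions the search needs only a handful of checks instead of ten, which is one reason small, frequent commits pay off.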
Centralized version control
The older model of version control is centralized version control. With centralized version control, source code is kept in a central repository. Programmers check out files to their PC (and generally lock the files they are changing), make changes, and then check the programs back into the version control repository. Usually with this model, while the source is checked out, other programmers cannot make changes to those source members. Subversion (SVN), Concurrent Versions System (CVS), and SourceGear’s Vault are examples of centralized version control software, although SVN and CVS can also be used without locks, merging concurrent changes instead. Microsoft’s Team Foundation Server (TFS) is also a centralized VCS, but it aims a little higher than the others listed and also includes application lifecycle management (ALM) facilities, such as continuous integration, project management, testing, and bug tracking. In the past, TFS has been aimed at large shops with sophisticated ALM requirements. Lately, Microsoft has tried to make TFS more appealing to smaller shops, but even so I think its features and facilities are probably best exploited by teams of 25 or more programmers.
As a side note, Microsoft used to have a VCS product called Visual SourceSafe. It was a centralized version control system that was generally known, especially in its earlier versions, for its complete lack of reliability. There are many horror stories about the earlier versions of SourceSafe and how not only did those versions provide flawed features and user experiences, they committed the ultimate VCS sin and destroyed or lost code! Read SourceSafe horror stories here or here. Later versions of SourceSafe fixed most of the early problems—but even at that SourceSafe is generally not considered a very good choice with the wide array of other options available today.
Distributed version control
The centralized version control model has been around for a long time and its workflows and practices are deeply ingrained in many shops today. However, in the last couple of years a new type of VCS has gotten very popular: distributed version control. With distributed version control, the shared repository is typically located remotely, just like a centralized VCS. However, with a distributed VCS, programmers don’t check out individual files; they copy the entire project (in Git terms, they clone the repository). This makes not only the entire project available to the programmer locally but also the entire VCS history. When the programmer is done working on the project, the changes are committed locally and then sent back (pushed, in Git terms) to the shared location. While copying an entire project and its history may seem cumbersome, it isn’t, even for very large projects. Distributed version control systems perform quite well.
Unlike most centralized version control implementations, when distributed projects are checked out they are not locked. This is heresy to most stalwarts of centralized version control. However, distributed version control systems provide sophisticated merge facilities, so if two programmers check out the same project, each programmer’s changes are merged into the shared location. This isn’t as crazy as it first seems. In a large project it’s unlikely that two programmers are working on the same files at the same time, so most merges are routine. Conflicting changes are a concern, but distributed version control systems provide very capable mechanisms to ensure you’re merging your code changes correctly.
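To see why unlocked checkouts are less scary than they sound, here is a toy Python sketch of a three-way merge at whole-file granularity. It is an illustration under simplifying assumptions, not how Git or any other product works internally: a project is modeled as a dictionary of file names to contents, and the file names are made up. Real systems go further and merge changes within a single file, line by line.

```python
from typing import Dict, List, Tuple

Files = Dict[str, str]  # file name -> file contents


def merge_projects(base: Files, mine: Files, theirs: Files) -> Tuple[Files, List[str]]:
    """Three-way merge at whole-file granularity.

    base is the version both programmers started from. Returns the merged
    files plus the names of files that both sides changed differently.
    """
    merged: Files = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        original = base.get(name)       # None means the file does not exist
        my_version = mine.get(name)
        their_version = theirs.get(name)
        if my_version == their_version:
            chosen = my_version         # same on both sides (or both deleted)
        elif my_version == original:
            chosen = their_version      # only they changed it
        elif their_version == original:
            chosen = my_version         # only I changed it
        else:
            conflicts.append(name)      # both changed it: a person must decide
            chosen = my_version         # keep mine as a placeholder for now
        if chosen is not None:
            merged[name] = chosen
    return merged, conflicts


if __name__ == "__main__":
    # Hypothetical project: each programmer touched a different file.
    base = {"orders.txt": "v1", "customers.txt": "v1"}
    mine = {"orders.txt": "v2", "customers.txt": "v1"}
    theirs = {"orders.txt": "v1", "customers.txt": "v2", "reports.txt": "v1"}
    merged, conflicts = merge_projects(base, mine, theirs)
    print(merged)
    print(conflicts)
```

Because the two programmers changed different files, the merge picks up both sets of changes and reports no conflicts. Only a file that both sides changed differently needs a human decision.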
Examples of distributed version control systems include Git, Mercurial, and SourceGear’s Veracity. All three are open source, with a variety of services from a variety of companies available to help use them in the enterprise. Each of these needs a centralized “home” repository. You can install the server component on your own servers or use one of a variety of hosted services on the Internet. Github.com is a Git hosting service that is free for open source projects but charges for private repositories. Bitbucket.org is a similar Git/Mercurial hosting service, but private repositories, for a limited number of team members, are free at Bitbucket.
All three distributed VCSs above are worthy, but Git is rapidly ascending as the de facto VCS for many shops and virtually all popular open source projects. Git has always been a popular VCS for Linux users (Git was written by Linux creator Linus Torvalds to be used as the VCS for Linux development), but in the last couple of years it has become much more Windows friendly. Git was originally intended to be used purely from the command line, and its commands and switches have a steep learning curve. However, a number of Git graphical front ends have appeared and they dramatically flatten that learning curve, especially for Git’s core facilities.
Github for Windows is a great Windows Git client, but it is limited in what Git features its UI offers. However, for the core check in/check out features, it excels. SourceTree is a new Windows GUI for Git that implements most of Git’s facilities—but at the cost of being more complex to use. I use them both; I use Github for Windows as my workhorse Git client—it does all I need 80% of the time. But for special cases where I need some of Git’s features that it doesn’t offer, I’ll use SourceTree (for example, SourceTree makes comparing versions of a file completely effortless—Github for Windows leaves that to you and the Git command line). Both Github for Windows and SourceTree are free downloads. There are also many other Git Windows clients, but I’ve not seen any other Windows Git client as well done as Github for Windows or SourceTree.
Figure 1. Github for Windows UI
Figure 2. SourceTree UI
Forging your path to version control
If yours is a shop without a version control system in place, where should you start? Here are a handful of guidelines to help you get started with a VCS:
Expect frustration at first. With no version control disciplines in your past, adding VCS to your development workflow will be frustrating at first. Since you’ve probably been coding for a long time, you’ve surely developed home-brewed strategies for protecting your source code (which probably include lots of folder renaming, zip files, and mapped network drives, none of which, by the way, is version control!). You need to realize, and accept, that learning to use VCS will take some upfront effort. It will cause you some pain at first, but I promise you it’s worth it. Once you’ve worked with a VCS for a bit and understand how things work, you’ll never go back.
Don’t worry about Visual Studio integration with version control products. When I first started getting familiar with VCS, I insisted on Visual Studio integration. There are VS add-ons for tools such as SVN and I tried several of them. Then I tried Vault’s VS integration. These VS integration components (and all of the others I used) provided no end of frustration for me. The add-ins were all almost effective, but there was always a snag somewhere. They also made Visual Studio, an already very busy application, an even busier one. I wasted a lot of time chasing the elusive Visual Studio/VCS integration before it dawned on me that I was being a complete and utter idiot. I found Visual Studio integration with VCS to add friction, not reduce it. I also realized that it wasn’t just Visual Studio projects that I wanted to put under the control of a VCS. I also wanted to put in Word docs, HTML/CSS/JS-only projects, Ruby and PHP hobby projects, and lots of other things that live outside the purview of VS. (You can put binary files such as Word docs in a VCS. Showing changes isn’t as graceful as it is with text files, but even so, using VCS with Office files protects you from your own silliness.) One quick note about putting binary files into a distributed VCS: there isn’t a good way to merge changes from two users back into a binary document, so be very careful using a distributed VCS with your binary files.
Folders define units of work, not Visual Studio projects. Rather than think of my units of work as being defined by Visual Studio projects, I started to think of my work as being defined by folders. It was folders (and their contents) that I really wanted to put under VCS. Visual Studio didn’t really have anything to do with the file system directly. (Note: Microsoft’s TFS may have solved some of these issues, but what I have now works very well for me, so I’m not interested in burning extra cycles on TFS just to see!) So, for any project (be it a large whitepaper that I’m writing in Word, a Visual Studio project, or a little HTML/JS test to learn Angular.js) I work at the folder level. Before I start, I commit the folder to VCS and then use the VCS to track changes and make commits. I might open an AVR project three or four times during the day to make and test changes. When those changes are done, I commit them to the VCS.
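Thinking in folders is easy to picture in code. The Python sketch below is a conceptual illustration only, not how any VCS product stores its data: it fingerprints every file under a folder, then compares two fingerprints to report what was added, modified, or removed between them. The function names and the sample file names are my own inventions for the example.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def snapshot(folder: Path) -> Dict[str, str]:
    """Map each file's path (relative to folder) to a SHA-256 of its contents."""
    return {
        path.relative_to(folder).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changes(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, List[str]]:
    """List files added, removed, or modified between two snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(name for name in set(before) & set(after)
                           if before[name] != after[name]),
    }


if __name__ == "__main__":
    # Work in a throwaway folder so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "notes.txt").write_text("first draft")
        before = snapshot(folder)
        (folder / "notes.txt").write_text("second draft")
        (folder / "index.html").write_text("<h1>test</h1>")
        print(changes(before, snapshot(folder)))
```

Notice that the new file shows up without anyone having to announce it. The folder, not the IDE project, decides what belongs to the unit of work.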
Don’t start learning VCS on a large, important project! It takes a little time to develop confidence in a VCS package. Take the time to establish a comfort level with some test projects before jumping all the way in. Start out humble with a very small test project so that you can put the VCS you’re using through its paces in a predictable fashion. Watch closely what happens when you commit or roll back. Compare your changes to ensure you understand what’s going on.
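If you have never looked at a comparison between two versions of a file, it helps to see one outside of any VCS first. This small Python sketch uses the standard library’s difflib module to produce a unified diff, the plus-and-minus line format that most VCS tools display. The file name and contents are hypothetical, and the labels “committed” and “working copy” are just strings chosen for the example.

```python
import difflib


def compare_versions(old_text: str, new_text: str, name: str = "file") -> str:
    """Return a unified diff between two versions of a text file."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"{name} (committed)",
        tofile=f"{name} (working copy)",
    )
    return "".join(diff)


if __name__ == "__main__":
    committed = "Hello\nWorld\n"
    working = "Hello\nVersion control\n"
    # Lines starting with "-" were removed; lines starting with "+" were added.
    print(compare_versions(committed, working, "greeting.txt"), end="")
```

Make a one-line change in your test project, compare, commit, and compare again. When the comparison after a commit comes back empty, you know the VCS and your working folder agree.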
Which product should you use? This is a subjective question. The right answer is the one that works best for you. I learned VCS using SourceGear’s Vault (that’s what we use internally at ASNA). Vault is a good product to use to get started. It is free for any single user, but those single users won’t be able to share each other’s projects (which defeats a major part of the purpose in a multi-programmer shop). If you have a single-developer shop, Vault is a good choice for a centralized VCS. Its downside is that it requires SQL Server as its repository host. It does, however, support SQL Server Express, so the single-user cost is zero, but you do have the hassle of SQL Server setup and configuration. And in a team environment you’d probably want to use a commercial version of SQL Server. Vault provides a good user experience overall and is very easy to use. For me, though, it has one critical flaw: you must remember to tell it manually to look for new files added to your project. It won’t discover them automatically.
I worked on a couple of projects with SVN (but never with CVS). I tried hard, but with SVN I never felt confident that I was doing the right thing. It’s open source, though, so it costs nothing to try, and it uses its own repository format rather than requiring a database server such as SQL Server.
Three years ago, before good out-of-the-box Windows experiences were available, I did a deep dive into Git and, despite its command line warts and hairs, after some study and trial and error, found it to be a good fit for my typical workflows. Back then, unless you were willing to really dig around, I don’t think it was very recommendable for Windows users. However, with the advent of the great, free Windows clients, Github for Windows and SourceTree, Git is a highly recommendable place to start. A way to get started is to simply download the Github for Windows client and get busy. It works great as a local single-user system. Play around with it locally for a while to get a feel for it; then you can get a free Bitbucket.org account and start pushing your work up to it (remember, private repositories for small teams are free there).
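The GUI clients hide the underlying commands, but it can help to know roughly what a first local commit and a first push look like underneath. The Python sketch below builds the Git command lines for both steps and, if you ask it to, runs them. It is a sketch under assumptions rather than a description of what any GUI client does: it assumes Git is installed, on the PATH, and configured with your name and email, and the folder name and remote URL are hypothetical placeholders.

```python
import subprocess
from typing import List


def first_commit_commands(folder: str, message: str) -> List[List[str]]:
    """Commands that put an existing folder under Git with one initial commit."""
    return [
        ["git", "-C", folder, "init"],
        ["git", "-C", folder, "add", "--all"],
        ["git", "-C", folder, "commit", "-m", message],
    ]


def publish_commands(folder: str, remote_url: str) -> List[List[str]]:
    """Commands that connect the local repository to a hosted one and push."""
    return [
        ["git", "-C", folder, "remote", "add", "origin", remote_url],
        ["git", "-C", folder, "push", "--set-upstream", "origin", "HEAD"],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical folder and remote URL. The commands are printed rather
    # than run, so this demo changes nothing on disk.
    plan = first_commit_commands("my-test-project", "Initial commit")
    plan += publish_commands("my-test-project",
                             "https://example.com/team/my-test-project.git")
    for command in plan:
        print(" ".join(command))
```

Everything in first_commit_commands works entirely on your own PC, which is why Git is comfortable as a local single-user system. Only the two publish commands need a hosted repository.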
I’ve never used Veracity or Mercurial. Veracity seems to be aimed at a sophisticated enterprise need. It might be quite capable; I’ve just not spent any time with it. Mercurial was once generally considered to be more Windows-friendly than Git. With the newer Git Windows clients, I’m not sure that’s still true, because they make Git easy to use on Windows.
Which VCS you select is entirely up to you. The important thing is that you pick a good one and run with it! What would you ever do without your source code?
Get a version control system today!
A version control system is a tough thing to understand quickly. As a programming discipline, it doesn’t seem to get the attention it deserves. However, if you poke around you can find some resources. Eric Sink, the CEO of SourceGear, has written a pretty good VCS introduction and it’s free. It covers both centralized and distributed systems, with both abstract concepts and some detail about several specific products. There is also a very good, free Git-specific book by Scott Chacon. Bitbucket.org has a good VCS primer, and there is another good VCS primer here. Although the latter is several years old, its primary concepts still apply and its pictures really help understanding. In addition to these resources, several podcasts often have good VCS discussions. Check out the archives at DotNetRocks, HerdingCode, or Hanselminutes. Also, Googling “version control primer” and “version control tutorial” turns up some interesting-looking content.
Regardless of how you start, it’s really important that you do start. The product that any programming team produces is intangible. You can’t touch or readily see it, so you simply assume it’s there. Without good version control, you are making an invalid assumption when you assume you can always get back to the correct version of your source code. Spare yourself an embarrassing, groveling call to tech support and get started with VCS in your shop today!