Thursday, June 7, 2007

Version Control

Everybody is blogging about SCM tools, and since I have this soapbox at my disposal I might as well too. I've been using git for over a year now, and I think I've earned the right not to have my preference dismissed as flavor of the week at this point. I feel that the time I've spent learning git has been more than recouped, and I'm sure anyone who sits down and learns git will feel the same way within a month or so. But OMG it has over 100 commands in /usr/bin, there's this thing called the index, sometimes you have to edit a text file, and hey I got this scary looking warning, bla bla... Still, I'll claim (and I don't do this lightly) that git is the single most useful software development tool I've learned in the past 10 years or so.

CVS is shit. Subversion is gold-plated shit. Some people claim that SCMs don't matter and that we should just keep using what "works". I hear this from people who typically don't do much development anymore, and certainly haven't tried anything other than CVS or SVN. If you truly don't care about SCMs, get out of the way and let the people who care and have investigated the options pick a system that actively assists us in software development.

Centralized SCM advocates often think that it's either-or: either it's the SVN/CVS model, or it's the Linux model where everybody has his own tree, everybody pulls from everybody and nobody knows where the latest official version is. The way it actually works, though, is that most distributed systems provide a superset of the functionality of centralized systems, and can be set up, through configuration and convention, to work with a centralized repository. This is how most of the freedesktop.org git repositories work.

People have been using cvsup, rsync or other mechanisms to copy and synchronize remote cvs repositories since long before distributed systems were available. git can clone an svn repository to a local git repository where you can commit and branch, and then eventually push those changes back to the upstream svn repository. However, if you're choosing an SCM for your new project, why would you choose a system that requires your developers to jump through these hoops to achieve that? Given that git can be set up with a central repository, ssh accessible to a group of developers just like cvs and svn, there is no reason to cripple your co-developers' workflow just because you feel that distributed SCMs have nothing to offer.

Another point often raised about distributed SCMs is that they encourage private development. No they don't. My CVS workflow when implementing a new feature used to be: hack on huge patch, break it, fix it, add more stuff, break it, fix it, etc. until the patch was done. That is private development. Now, with git, I add a branch in a private copy, work on the feature in a series of commits, and if the feature works out alright, merge it to the upstream repository. During the entire process the branch is visible and the repository can be cloned by people who want to test or contribute.

One tool I'd like to write if I ever get some spare time is an svn command-line-compatible wrapper for git. It's entirely doable, and shouldn't be too much work. svn checkout becomes git clone, svn update becomes git pull, svn commit becomes git commit -a; git push. I'm sure the devil is in the details, but having this wrapper would let you choose a repository format that preserves branching and merging history and is designed to support cloning, while still letting people use the CVS/SVN workflow.
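
To make the mapping concrete, here is a rough Python sketch of the translation step such a wrapper might start from. It is only an illustration: the names translate and run are made up for this example, only the three commands named above are handled, and arguments are passed through unchanged even though svn and git options do not line up one-to-one.

```python
import subprocess

def translate(svn_args):
    """Map an svn-style argument list to a list of git command lines.

    Hypothetical helper: covers only checkout, update and commit.
    """
    if not svn_args:
        raise ValueError("missing svn subcommand")
    sub, rest = svn_args[0], list(svn_args[1:])
    if sub in ("checkout", "co"):
        # svn checkout URL [PATH] -> git clone URL [PATH]
        return [["git", "clone", *rest]]
    if sub in ("update", "up"):
        # Per-path updates have no direct git pull equivalent, so refuse them.
        if rest:
            raise ValueError("update with arguments is not handled in this sketch")
        return [["git", "pull"]]
    if sub in ("commit", "ci"):
        # svn commit publishes immediately, so commit locally and then push.
        return [["git", "commit", "-a", *rest], ["git", "push"]]
    raise ValueError(f"unsupported svn subcommand: {sub}")

def run(svn_args):
    """Run the translated commands in order, stopping at the first failure."""
    for cmd in translate(svn_args):
        subprocess.run(cmd, check=True)
```

Calling run(["commit", "-m", "fix typo"]) would run git commit -a -m "fix typo" followed by git push, and the push is skipped if the commit fails. The devil-in-the-details part (revision numbers, per-file commits, svn properties) is deliberately left out.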

9 comments:

Havoc said...

So if git is cool, please get rid of all these other ones!

(I bet porting git to Windows would be a huge step in that direction, btw. I think it's why Mozilla didn't use git. I bet it's a blocker for tons of projects.)

Unknown said...

What Havoc said. I'm not really an open source developer, but for both personal & work use I've been investigating a lot of SCMs. I'm enamored with bzr right now because I really like the branch-hack-merge distributed workflow, and bzr is the only tool that works on both of the platforms I want to do development on.

From reading the git tutorial / documentation it seems as if they're farther along, but when I can't use the same tool on both platforms it's a no-go.

Frej said...

Havoc, You can't win ;)

People still use Fortran because that is what their coworkers use (in the scientific computing world... not really the same in FLOSS, I know).

It's the same reason SCM R is spreading: Hacker A works with Hacker B, who knows SCM R. Hacker A chooses SCM R, since he can be lazy and ask a human questions instead of reading man pages. (That was basically the reasoning for xorg using git?) Very, very few do like Mozilla and evaluate all the choices?

The choice is
A) Use what the guy next to me uses, then I can always ask. ;)

B.1) Learn new tool
B.2) Convince others to use it
B.3) Expect to be responsible when the new scm fails in some way.

Of course this is not 100% the same in the public/OSS internet world....

PS: Don't listen to me I never used any scm for anything complex ;)

Anonymous said...

Havoc: So if git is cool, please get rid of all these other ones!

The only way to get rid of all the other ones is for you yourself to pick one!

If this seems like a tough choice then I'll choose for you: git.

You're welcome :-)

Anonymous said...

git is probably the best one, but its user interface is ugly compared to subversion or mercurial.

It's like a Borg cube, full of sharp edges and bits sticking out.

Unknown said...

I think you just need to accept that some people consider version control BORING. This includes people who, while it may surprise you, still do development. I'm happy that there are people who care deeply about version control. I'm sure that they'll come to agreement some day. In the same sort of way that I'm sure that some day I'll have a volume control dialog without lots of sliders that do nothing on my hardware. Until then, I'll learn to use whatever vcs I need to get my job done, just as I eventually figure out which slider to set to get sound to come out. But don't expect me to spend my evenings fooling around with version control systems. I'm not going to think "oh, goody, project X is using vcs Y, I was hoping someone would give me a chance to try it out". IT'S BOOORING.

Kristian Høgsberg said...

havoc: yes, the Windows port is git's biggest problem as I see it. And the plethora of SCMs available today is a problem, but I don't see how it detracts from a given SCM. And we don't make the problem go away by continuing to use CVS ;)

owen: I'm past my SCM-of-the-week phase, I have no desire to try out darcs or bzr. I've used git on a daily basis for more than a year now, and it still excites me and I naively believe I'm more efficient for using it.

I respect that some people don't care about SCMs, but if high-profile community figures keep dismissing new-fangled SCMs, it's only going to get worse. Choose one and help establish a new de-facto SCM. If gnome.org had gone with git, there would be less incentive to pick your favorite distributed SCM and Havoc might not have had to deal with a different SCM for each layer in his stack.

Anonymous said...

Some people who consider version control BOOOOORING, are just plain wrong.

Linus Torvalds' greatest contribution to the world is not some gnarly C code, it's the social process he's developed to get others to write gnarly C code, and keep on writing it, at a faster rate, with more people, year after year.

That's why in this recent video he talks about the psychological effects of not having an explicit committers list, and how cheap and easy branching encourages branching (whereas we all know that the last thing you want to do with CVS is branch, lest some day you have to merge it).

http://www.youtube.com/watch?v=4XpnKHJAok8

There are multiples of productivity to be had by using a better process. And frankly, we need it! One of the single most important parts of our stack, GTK, is suffering from a lack of manpower, with dozens of unapplied patches sitting in bugzilla.

Distributed version control is probably one of the most exciting things happening in the free software community right now. It has the potential to dramatically increase productivity and participation, precisely because it targets and enhances the social processes which are unique in the free software world.

DBBD said...

Kristian, seems like you had your share of experience with SCM tools. So have I. I'm pretty impressed with git, and would like to switch to using it.

The main problem with git is actually not git, but git schemes.

I'd like to know how to manage a project that uses git as its source code management tool. How do I manage releases? How do I work with a main repository?

The Linux kernel model is based on people. Linus' tree is the Linux release. But in a small project, I need to double as both a developer and the release "integrator". How do I set that up with git?

The bottom line: more git how-to documentation is needed, NOT git commands, but schemes and recipes.

Dan