Wednesday, June 13, 2007

Setting up a cloned git repo

I was going to set up a clone of the xserver git repo in my people.freedesktop.org home directory and I figured I'd document the steps. If you want the cheat sheet, just skip to the last couple of paragraphs. First of all, to set up a repo I need ssh access to people.freedesktop.org, but since I was going to put the repo in my home directory there, I already have that. Now, the official git repos are mounted read-only on /git, so the obvious thing to do is to say

krh@annarchy:~$ git clone --bare /git/xorg/xserver.git xserver.git
Initialized empty Git repository in /home/krh/xserver.git/
...
krh@annarchy:~$ du -sh xserver.git/
24M     xserver.git/

Most of this space is in the objects representing the files, directories and commits of the project history. Compared to the repo we cloned from, it's pretty efficient, as git compresses all those objects into a pack as part of the cloning process:

krh@annarchy:~$ du -sh /git/xorg/xserver.git/
135M    /git/xorg/xserver.git/

But given that both repos are on the same filesystem, we can ask git clone to share the underlying objects. This is a pretty clever trick that uses the fact that git objects are immutable. Once you've stored a file or an entire revision of your project as part of a git commit, that object never changes. It would be nice if git could just detect this and do the right thing, but you have to pass a couple of options:

krh@annarchy:~$ git clone -s -l --bare /git/xorg/xserver.git xserver.git
Initialized empty Git repository in /home/krh/xserver.git/
krh@annarchy:~$ du -sh xserver.git/
1.3M    xserver.git/
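To see what the shared clone actually borrows, here is a minimal sketch you can run anywhere git is installed. The throwaway repos under a temp directory, and the names upstream and clone.git, are made up for the demo; they only stand in for /git/xorg/xserver.git and my copy. It assumes a reasonably recent git (for `git -C`).

```shell
#!/bin/sh
# Sketch: show how `git clone -s` borrows objects via an alternates file.
# All paths and names below are invented for the demo.
set -e

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the upstream repo (hypothetical, one small commit).
git init -q "$work/upstream"
echo "hello" > "$work/upstream/README"
git -C "$work/upstream" add README
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"

# Shared bare clone: -s records the source object store instead of copying it.
git clone -q -s -l --bare "$work/upstream" "$work/clone.git"

# The pointer back to upstream: one line naming the borrowed objects directory.
alternates=$(cat "$work/clone.git/objects/info/alternates")
echo "alternates: $alternates"

# The clone can still read history even though it holds no objects itself.
commits=$(git --git-dir="$work/clone.git" rev-list --count HEAD)
echo "commits visible in clone: $commits"
```

The flip side of borrowing is that the clone depends on the upstream object store staying intact, so this is best kept to repos on the same machine that you trust not to throw objects away.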

Sharing the objects is a lot better: we're down to 1.3M for my own little copy of the xserver repo. However, when you think about it, the clone is just a glorified symlink to /git/xorg/xserver.git, and 1.3M is a pretty heavy symlink. It's pretty easy to spot the problem:

krh@annarchy:~$ du -h xserver.git/
252K    xserver.git/refs/heads
932K    xserver.git/refs/tags
1.2M    xserver.git/refs
...

When we converted the xserver repo over, we of course imported all branches and tags, most of which are now just in the way. When was the last time anybody looked at xprint_packagertest_20041125? One solution is to just nuke the branches and tags you don't care about. But if you need to preserve this important historical information in your repo, or are just too lazy to weed it out, you can say

krh@annarchy:~$ GIT_DIR=xserver.git git-pack-refs --all
krh@annarchy:~$ du -sh xserver.git/
124K    xserver.git/
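The saving comes from replacing one small file per ref with a single packed-refs file. A rough sketch of that effect on a throwaway repo follows; the repo and the old-tag-N names are invented for the demo, it uses the modern `git pack-refs` spelling rather than the dashed `git-pack-refs` above, and it assumes the classic file-based ref storage.

```shell
#!/bin/sh
# Sketch: count loose ref files before and after packing.
# The repo and tag names are invented for the demo.
set -e

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

git init -q "$work/repo"
cd "$work/repo"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Create a pile of lightweight tags; each one is a small file under refs/tags.
for i in 1 2 3 4 5 6 7 8 9 10; do
    git tag "old-tag-$i"
done

loose_before=$(find .git/refs/tags -type f | wc -l | tr -d ' ')

# Pack every ref into .git/packed-refs and drop the loose files.
git pack-refs --all

loose_after=$(find .git/refs/tags -type f | wc -l | tr -d ' ')
packed=$(grep -c 'refs/tags/old-tag-' .git/packed-refs)

echo "loose tag files before: $loose_before"
echo "loose tag files after:  $loose_after"
echo "tags in packed-refs:    $packed"
```

Each loose ref holds only a short hash but still occupies at least a filesystem block, which is why hundreds of stale tags add up the way the du output shows.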

Again, git should just pack refs by default in the same way it compresses the objects when cloning. As a final step, I want the gitweb script and the git daemon to pick up my new repo so I can browse it on gitweb.freedesktop.org (or cgit) and clone it using the git protocol. To do that I need to touch a special file in the repo:

krh@annarchy:~$ touch xserver.git/git-daemon-export-ok

which is just a way to say that it's ok to export this repository to the world. It typically takes a little while before the gitweb script finds your repo.

I'm sure there's somebody shaking their head now thinking, "gah, git is useless, look at all those commands", so to address that let me first sum up what you have to do to create your own repo and make it visible to the world:

krh@annarchy:~$ git clone -s -l --bare /git/xorg/xserver.git xserver.git
Initialized empty Git repository in /home/krh/xserver.git/
krh@annarchy:~$ touch xserver.git/git-daemon-export-ok

Second, this is my own repo: I can create all the branches I want and commit crazy stuff without affecting the upstream repo. I set it up without bugging any of the freedesktop.org admins and I didn't even need xserver commit access. Third, if it turns out that I do useful work there, we can merge it back into the main repo without losing any history and without polluting the upstream branch namespace with branch names such as gah, doh and gahgah, which are among my favorite choices. Although they'd fit right in with the rest of the xserver branches.
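For the skeptical, here is a small sketch of that round trip on throwaway repos. Everything in it is hypothetical: upstream.git, personal.git, hack and maintainer are names I made up, and fetch plus merge is just one way somebody with commit access could bring the work in, not necessarily how it would be done for the real xserver repo.

```shell
#!/bin/sh
# Sketch: hack on a branch in a personal clone, then merge it upstream.
# All repo names and commit messages are invented for the demo.
set -e

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Identity for the demo commits.
GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Stand-in for the official repo: one commit, published as a bare repo.
git init -q "$work/seed"
git -C "$work/seed" commit -q --allow-empty -m "upstream work"
git clone -q --bare "$work/seed" "$work/upstream.git"

# My personal shared clone, as in the cheat sheet.
git clone -q -s -l --bare "$work/upstream.git" "$work/personal.git"

# Crazy stuff happens on a branch that only exists in my repo.
git clone -q "$work/personal.git" "$work/hack"
git -C "$work/hack" checkout -q -b gah
git -C "$work/hack" commit -q --allow-empty -m "crazy stuff 1"
git -C "$work/hack" commit -q --allow-empty -m "crazy stuff 2"
git -C "$work/hack" push -q origin gah

# Somebody with upstream access fetches the branch and merges it.
git clone -q "$work/upstream.git" "$work/maintainer"
git -C "$work/maintainer" fetch -q "$work/personal.git" gah
git -C "$work/maintainer" merge -q FETCH_HEAD
git -C "$work/maintainer" push -q origin HEAD

# Upstream now has every commit, but no branch called gah.
upstream_commits=$(git --git-dir="$work/upstream.git" rev-list --count HEAD)
gah_upstream=$(git --git-dir="$work/upstream.git" branch --list gah)
echo "commits upstream: $upstream_commits"
echo "gah branch upstream: ${gah_upstream:-none}"
```

The individual commits arrive upstream intact, while the silly branch name stays behind in the personal repo.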

1 comment:

Anonymous said...

"git gc" does several useful things you should run periodically, including git-pack-refs.