A little bit of Git-Fu

I’m working on a Git project where several different repositories are submodules of another. I’ve decided that submodules are way too annoying to maintain (damn detached heads), and they’re not supported by the Eclipse plugin, so instead I’m merging all the repos into one.

Googling around, I found several straightforward references for doing just that. The basic process is to rewrite each repo’s history so that everything lives in a sub-directory, using this command:

git filter-branch --index-filter 'git ls-files -s | sed "s-\t-&newdir/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
    git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD

and then merge the various sub-directories into one project. Perfect. Just what I needed. However, I spent a few hours trying to do so and ran into a few problems, which I thought I’d share.

First, I was using Mac OS X and, unbeknownst to me, its sed doesn’t interpret \t as a tab, so sed "s-\t-&newdir/-" in the above command was searching for a literal t instead of a tab. This messed up the new file paths in the git ls-files -s output, so I had to find a workaround. Apparently treating \t as a tab in a sed pattern is a GNU extension rather than something POSIX requires. I tried several variations without luck, so I moved to a Debian box that didn’t have this issue (I’d love to know how to fix it).
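One portable way around the \t problem might be to let printf produce the tab and splice it into the sed expression. The snippet below is only a sketch of that idea on a single made-up line in git ls-files -s format; the variable names (tab, sample, result) and the blob id are placeholders, not anything from my repos.

```shell
# Build a literal tab once; printf understands \t in any POSIX shell.
tab=$(printf '\t')

# One sample line shaped like `git ls-files -s` output:
# mode, blob id, stage number, a TAB, then the path.
# The blob id is a placeholder, not a real object.
sample="100644 0123456789abcdef0123456789abcdef01234567 0${tab}src/main.c"

# Same substitution as in the filter above, but with a real tab in the
# pattern, so it does not depend on sed understanding \t.
result=$(printf '%s\n' "$sample" | sed "s-${tab}-&newdir/-")

printf '%s\n' "$result"
```

Inside the single-quoted --index-filter string, the same trick should in principle be expressible as sed "s-$(printf "\t")-&newdir/-", since double quotes are still available there. Treat that as an untested suggestion for OS X rather than a confirmed fix.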

Second, the Git repositories I’m working from were created from a large SVN import. In SVN, you can commit an empty directory to the repository, so in several cases the commit in my repo didn’t have any files associated with it (this was always the case with the first commit that created the initial SVN repository). Git doesn’t allow you to commit empty directories (a directory must contain at least one file), so the above command’s mv portion failed when the new index file didn’t exist. I tried to remove or squash the offending commits several different ways but always ran into more problems with conflicts or strange trees. Sitting back and trying to see the big picture, I broke down the command and realized that I could just check whether the file existed before moving it. This made a lot more sense:

git filter-branch --index-filter 'git ls-files -s | sed "s-\t-&newdir/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
    git update-index --index-info &&
    if test -f "$GIT_INDEX_FILE.new"; then mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"; fi' HEAD

That did the trick and I could proceed with merging my repos as I pleased.
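
For completeness, here is a rough sketch of what the merge step itself could look like once each repo’s files live in their own sub-directory. It builds two throwaway repos in a temp directory instead of touching real ones; the names repo-a, repo-b, dir-a, dir-b and the demo identity are all made up for illustration, and it assumes a Git new enough (2.9 or later) to have --allow-unrelated-histories.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: create a tiny repo whose only file sits in a sub-directory,
# standing in for a repo that has already been through filter-branch.
make_repo() {
    git init -q "$1"
    mkdir -p "$1/$2"
    echo "from $1" > "$1/$2/README"
    git -C "$1" add .
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "initial commit of $1"
}

make_repo repo-a dir-a
make_repo repo-b dir-b

# Pull repo-b's history into repo-a. The two histories share no common
# ancestor, so the merge has to be told that this is intentional.
cd repo-a
git fetch -q ../repo-b HEAD
git -c user.name=demo -c user.email=demo@example.com \
    merge -q --allow-unrelated-histories -m "Merge repo-b into repo-a" FETCH_HEAD

# Both sub-directories now sit side by side in one working tree.
ls
```

Because each repo keeps its files under its own sub-directory, the merge has nothing to conflict on, and git log in the combined repo shows the commits from both histories.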