Toward a Better Discussion of Parrot's Future VCS Options

Jonathan Leto jaleto at gmail.com
Wed May 5 08:54:13 UTC 2010


Howdy,

> So far, the compelling reason I've been given for why we should move to git
> is branch merging. But, it's been demonstrated quite effectively that the
> git merging tools work on an svn data-store.

Branch merging is only one of the reasons. Speed, having the full
history at your fingertips when you are off-line and every clone being
a backup of full history are some of the others.

Using git to talk to an svn data-store is also painfully slow and an
inefficient use of resources. On top of that, I just don't trust that
svn is not corrupting my data, because it does not use any type of
hashing.
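
To illustrate the hashing point: git names every object by the SHA-1 of
its content plus a short header, so any change to the bytes changes the
ID. A minimal sketch of how a blob ID is derived (the function name
git_blob_id is mine, not part of git):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    # git hashes a header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID:
    # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    print(git_blob_id(b""))
    # Changing a single byte yields a completely different ID.
    print(git_blob_id(b"parrot"))
    print(git_blob_id(b"parrof"))
```

Commits and trees are hashed the same way over their own content, which
is what lets git notice when stored data no longer matches its name.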

> I totally understand that some people prefer working in git, and as far as I
> know most of them already are (pulling from the svn central data-store).
> What I don't understand is what advantage we gain from using git as the
> central data-store. As far as I can tell, in the age of distributed version
> control the format of the central data-store is entirely irrelevant. (And if
> it really is irrelevant, the weight is on keeping what you already have
> because of the cost of migration.)

The advantage of using git as our central data-store is that it allows
access from more VCSs than anything else. GitHub offers read-write
svn access, which is mostly what we need.

> The funny thing is, AFAIK no one wants to switch to distributed development
> patterns. The git advocates want to use git as a centralized repository. If
> we were planning to move to a more distributed development pattern, I would
> consider git a greater advantage.

I would argue that some distributed development patterns would
greatly help Parrot development. Changing to git makes these
possible in the future. If we stay in svn, they are forever unavailable.

[snip]


> This I would like to see a great deal more on. About the only way I'm likely
> to be comfortable with a move to git is if we do an entire mock run on the
> migration to a test server, including the svn-dump-and-git-import, the
> integration with Trac, email notifications, and our tools for access control
> management. We need to figure out hosting, if our current hosting can
> support git, if we'd use a service like github.

There is some confusion here. Hosting is a solved problem: there are
no fewer than three places that will freely host Git mirrors, and since
every developer's clone is a backup of the full history, we don't have
to pay extra for off-site backups. Many mirrors of Parrot in git already
exist and are publicly available. Integration with Trac is the biggest
open question right now, and many people seem to be working on that. I
can consult on access control management solutions, since I have now
helped two companies that I have worked for migrate from Subversion to
Git.


> We need to figure out if
> Trac integration will work if git is hosted on a different box (svn and Trac
> have to be hosted on the same server for the integration to work). I want to
> know if we can do bzr and svn mirrors from git.

I doubt this is an issue, but if Trac needs a local instance, SSHFS
can solve that problem.

Yes, we can do bzr (read-only) and svn (read-write) mirrors of git.

>
> I also want to see the equivalent of our docs/project/committer_guide.pod
> written for git. I want to see agreement among the git users on a project
> standard for how we setup our local repos, how we do checkouts, commits,
> branches, merges, where we publish our private branches, etc. I know git
> allows for dozens of different styles of development, but it makes it
> enormously difficult to collaborate with other developers when you have to
> work around a dozen different ways of doing things (I've already been bitten
> by it, and we're not even using git all that heavily at the moment.)

I have helped create these policies before, so I can help write that
document. Should that all go in a new git_committer_guide.pod, for
now?

> I want
> a plan for how we'll refer to revisions, and if there is a good way to save
> existing references to our 46286 (and counting) revisions,

By default, when you convert a repo with git-svn, it includes
references to the SVN revision numbers, for instance:

http://github.com/leto/parrot/commit/c5c01adaf92036fceb6753286581242451567e84

This is a good reason to import with git-svn, even though it is slower
than svn-all-fast-export. Good thing someone keeps a daily-updated
git-svn mirror (thanks jhelwig++):

http://technosorcery.net/system/parrot-git-svn.tbz
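
For the curious, the reference git-svn leaves behind is a "git-svn-id:"
trailer at the end of each commit message, of the form
"git-svn-id: <url>@<revision> <repository-uuid>". A rough sketch of how
old revision numbers could be mapped to commit IDs from that trailer
(the function names, the example URL and the UUID are all made up for
illustration):

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Trailer appended by git-svn: "git-svn-id: <url>@<revision> <uuid>"
GIT_SVN_ID = re.compile(
    r"^git-svn-id:\s+(?P<url>\S+)@(?P<rev>\d+)\s+(?P<uuid>\S+)\s*$",
    re.MULTILINE,
)


def svn_revision(commit_message: str) -> Optional[int]:
    """Return the SVN revision recorded in a commit message, if any."""
    match = None
    for match in GIT_SVN_ID.finditer(commit_message):
        pass  # keep the last trailer found in the message
    return int(match.group("rev")) if match else None


def build_revision_index(commits: Iterable[Tuple[str, str]]) -> Dict[int, str]:
    """Map SVN revision number -> git commit ID.

    `commits` yields (commit_id, commit_message) pairs, however obtained.
    """
    index = {}
    for commit_id, message in commits:
        revision = svn_revision(message)
        if revision is not None:
            index[revision] = commit_id
    return index


if __name__ == "__main__":
    # Hypothetical message; the URL and UUID are placeholders.
    message = (
        "Fix a typo in the docs\n\n"
        "git-svn-id: https://svn.example.org/parrot/trunk@46286 "
        "00000000-0000-0000-0000-000000000000\n"
    )
    print(svn_revision(message))
```

With an index like this, a reference such as "r46286" in an old ticket
or email can still be resolved to a commit after the migration.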


> a policy on how we'll tag releases

This is not something specific to git, but if specific instructions on
how to make "tagged releases in git" are needed, "git help tag" can
help with that. A tag in git is just a symbolic name that refers to a
SHA1 commit ID. It is not a copy of the history at a certain point as
it is in svn.
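
To make the "just a symbolic name" point concrete, here is a toy model
of a lightweight tag: a tiny file under refs/tags holding one commit ID.
This only imitates the on-disk layout in a scratch directory; it is not
how tags should be created (use "git tag" for that), real repositories
may also keep refs in a packed-refs file, and the tag name is an
invented example.

```python
import os
import tempfile


def write_lightweight_tag(git_dir: str, name: str, commit_id: str) -> str:
    """Model a lightweight tag: a small file that holds a commit ID."""
    path = os.path.join(git_dir, "refs", "tags", name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(commit_id + "\n")
    return path


def read_tag(git_dir: str, name: str) -> str:
    """Return the commit ID that the named tag points at."""
    with open(os.path.join(git_dir, "refs", "tags", name)) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Commit ID taken from the GitHub URL quoted earlier in this mail;
    # "EXAMPLE_RELEASE" is a hypothetical tag name.
    commit = "c5c01adaf92036fceb6753286581242451567e84"
    with tempfile.TemporaryDirectory() as scratch:
        tag_file = write_lightweight_tag(scratch, "EXAMPLE_RELEASE", commit)
        print(read_tag(scratch, "EXAMPLE_RELEASE"))
        # The whole "tag" is a 40-character ID plus a newline.
        print(os.path.getsize(tag_file), "bytes")
```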

> and details on how we'll restrict our git hosting to
> prevent the more destructive repository-changing commits that git allows.

It is trivial to have a server-side hook (pre-receive or update) that
disallows non-fast-forward (history-changing) pushes. This can be
configured per-branch, which is what the git team does. Certain branches
can be "rewound", while others must always march forward, so that people
can always base other work on them and know that the rug will not get
pulled out from under them. I have also worked at places where certain
git branches disallowed non-fast-forward changes but personal branches
allowed them, which is usually what people want.
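
As a sketch of the per-branch policy described above: a server-side
update hook is handed the ref name plus the old and new commit IDs, and
a push is a fast-forward when the old commit is an ancestor of the new
one. The code below models that decision over a plain dictionary of
parent links so it can run without a repository; in a real hook the
ancestry check would come from git itself. The branch patterns and
function names are assumptions, not an agreed Parrot policy.

```python
import fnmatch

ZERO_ID = "0" * 40  # git passes an all-zero ID when a ref is created or deleted

# Hypothetical policy: these refs must only ever move forward.
PROTECTED = ("refs/heads/master", "refs/heads/release/*")


def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant`."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def allow_update(parents, refname, old_id, new_id, protected=PROTECTED):
    """Decide whether a pushed ref update should be accepted."""
    if not any(fnmatch.fnmatchcase(refname, pattern) for pattern in protected):
        return True  # personal branches may be rewound
    if old_id == ZERO_ID:
        return True  # creating a new protected branch
    if new_id == ZERO_ID:
        return False  # deleting a protected branch
    return is_ancestor(parents, old_id, new_id)  # fast-forward only


if __name__ == "__main__":
    # Toy history: a <- b <- c on the main line, and x branching from a.
    history = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
    print(allow_update(history, "refs/heads/master", "b", "c"))
    print(allow_update(history, "refs/heads/master", "b", "x"))
    print(allow_update(history, "refs/heads/leto/experiment", "b", "x"))
```

The first call is a fast-forward and is accepted, the second would
rewind a protected branch and is rejected, and the third touches an
unprotected personal branch and is accepted.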

> Ideally, I'd like to spend some time talking to a sysadmin who has done svn
> migrations and maintains git repos. Specifically, one who is willing and
> open to talk about the disadvantages of git as well as advantages. (All
> software has disadvantages. I worked as a sysadmin for years, and I don't
> believe anyone who tells me a particular package has none.) These kind of
> infrastructure decisions are trade-offs, working out if the mix of
> advantages/disadvantages of one system are a better fit for the particular
> project's needs than the advantages/disadvantages of another system.

I started out as a sysadmin, so while I do not work as one now, I feel
very comfortable in saying that there is little work for a sysadmin to
do with a git repo. The developers set their development policies, and
off they go. The sysadmin has a lot less to worry about, since a bare
git repo takes up so much less space than an svn repo.

>
> Unfortunately, I won't be at YAPC::NA. Whatever parrot devs are there can
> certainly get together, but it wouldn't be effective at resolving this
> conversation.

It will be great to talk to other Parrot devs face-to-face about their
questions and concerns, and I am definitely willing to help any Parrot
developer with git questions, so if you will be at YAPC and you have
questions, please come find me.

> I also get the impression that the git advocates would like an answer soon,
> so the end of June may be too long anyway.

Rushing is rarely a good algorithm. I would much rather have a
well-thought-out migration plan that addresses everyone's worries and
allows enough time for each step to be done correctly and to the
satisfaction of everyone involved.

Duke

PS: I recently gave a presentation to my new $job about Git, which
some may find useful:

http://docs.google.com/present/view?id=dk89d5g_34c978zvhh


-- 
Jonathan "Duke" Leto
jonathan at leto.net
http://leto.net

