From: Fields, Christopher J <cjfields <at> illinois.edu>
Subject: Re: Google Summer of Code - BioPerl proposals
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Monday 1st April 2013 22:23:45 UTC
On Apr 1, 2013, at 12:17 PM, Carnë Draug wrote:

> On 1 April 2013 04:28, Fields, Christopher J wrote:
>> On Mar 31, 2013, at 9:05 PM, Carnë Draug wrote:
>>> On 1 April 2013 01:34, Fields, Christopher J wrote:
>>>> I agree.  Another approach might be to cleave off a section that you
>>>> could mould into your own; this could be done for bioperl-run,
>>>> bioperl-live, etc.
>>> Why did the project run out of time 2 years ago? The blog posts about
>>> it are very few and don't sound too bad. It mentions having prepared a
>>> couple of them, but none was actually ever released. Instead, the
>>> source was also kept in bioperl-live and seems to have already
>>> branched. Is there any reason for this? It was my understanding that
>>> splitting the project is still desirable, from a discussion back in
>>> February
>>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/26395
>>> it just happens that no one has picked it up yet.
>> The project actually made a lot of headway; the particular pieces moved
>> out (Bio::Root, Bio::Factory, etc) worked fine, but we never followed up on
>> exactly what to do next on master branch.  It's perfectly feasible for
>> someone to go ahead and finish the initial part of that (in fact, I believe
>> there were some branches that started along this path but never merged back
> Can I merge any branching between these and bioperl-live and set them
> up so you only have to run dzil on their repos?

I wouldn't worry about the branches; they are probably too stale.  Have it
so dzil works for the various repos from that project (it should already). 
We will likely need to think about having a stub Build.PL that can be used
for basic installation, but would be auto-generated based on the needs for
that repo (and so shouldn't be committed to).  This is mainly to help
git-savvy users, not devs; we don't necessarily want users to install dzil,
which has somewhere north of 40 or so dependencies IIRC.

>>> I think splitting bioperl-live into subdistributions and making a new
>>> 1.70 release of each of them is perfectly doable over a summer. And I
>>> say this after having split and released Bio-Biblio. This is one of my
>>> itches with BioPerl. I have been using it for almost 3 years, but have
>>> never seen a release. I would like to make new releases of everything,
>>> no changes at the start, but take them to the point that "dzil
>>> release" does everything. Make it really easy for anyone to come in
>>> and contribute and even easier for a maintainer to make a new release
>>> after receiving a contribution. Is this desirable for the project?
>> Hilmar's point is pretty valid, namely that a case would have to be made
>> as to why the initial run at it wasn't completed, or why it would work
>> better this time.  We're not suggesting that this can't be done, but the
>> above point would have to be answered.
> The only reason why I claim to be able to finish this is that I'm very
> familiar with both BioPerl and the tools to make the split. Plus,
> I already split one (and am trying to split another) to get a clear idea
> of what it involves.

Right, I do think it's feasible.  But see Hilmar's response on this point;
you don't have to convince us.

>> Frankly, the project has been pretty reliant on me for releases, so it's
>> perfectly valid to point out the modules haven't made it out yet b/c I
>> haven't made a release since then.  From that point of view, this would be
>> a continuation of that work, maybe with the intent/focus on making code
>> releases much easier.
> As a maintainer of another gigantic FOSS project that is also a
> collection of libraries, I can relate to this. Of course it can be
> much more interesting to write new sexy code and add it to the huge
> pile of modules already in bioperl-live but I want to make it easier
> for others to develop on BioPerl. Comparing with chemistry, I want
> this to be the equivalent of a catalyst for the development, rather
> than another reactant.
>> Regarding updating BioPerl to use Dist::Zilla amongst other modern Perl
>> tools (Moose included), yes, it is very much our wish/intent to have this,
>> in any way possible.  But I don't think we can call it BioPerl v1.7, simply
>> based on past release cycles; we're somewhat bound by deprecations, etc.
>> We really need a clean break.
>> So, my general feeling is that while we are cleaving out code and
>> releasing the independent dist and core, we should re-christen core as 1.9
>> (e.g. pre-v2).  We move to v2 when we feel we're at the right point.  Each
>> of the individual distributions would have to start with their own
>> versions, anything greater than the point where they left the core/live
>> distribution should work.  I agree with you in that I don't think it would
>> take a long time, but we also have bioperl-run in the mix (and in many
>> cases it would make sense to combine wrappers with the proper parsers), so
>> simply cleaving out from one repo may not be the best approach.
>> With that in mind, my point was meant to indicate we can also start
>> afresh with a section of the code that you would like to focus on, using
>> some of the same ideas (pulling out the relevant modules you want to work
>> on).  This might be an attainable goal in the minds of GSoC reviewers and
>> might suit your particular needs (for instance, if you had a research
>> project reliant on such code).  I'm supportive either way, and I don't
>> think you'll have a problem finding a mentor if you need one.
> I suggested 1.70 only because it has no change. And it won't be
> BioPerl 1.7. It would be Bio-Seq, Bio-Align, Bio-Popgen, etc v 1.70.

There may be a point where we will likely find it hard to split out more
w/o running into circular dependency issues.  This will likely center
around Bio::Seq, Bio::SeqFeature, and Bio::Annotation (with others thrown
in).  But let's see how far we can go with it.  If we get to a point where
division becomes problematic, we can deem that 'core'.  But I would like to
see Bio::Seq etc in their own space.

Re: versioning: I'm not hung up on any particular versioning
scheme, but the key point is support.  It's easy for me to say "as of
bioperl v2 the installation scheme will be something completely different"
as opposed to doing so with v1.7.  Will installation of v1.7 be the same as
it was for v1.6 (or even similar)?  Will it install the same modules by
default?  We would be changing a key step in using BioPerl (installation)
w/o much warning.

> These smaller distributions can then stay as they are or evolve into
> 2.0 if their maintainers are so interested. I saw biome and liked it,
> but is the plan to make a BioPerl 2.00 written in Moose?

Not necessarily, unless it can be demonstrated to help considerably.  I
think it can FWIW.  

> Won't that
> path take us to the same place we are now in a couple of years? Won't
> it be better to make the split now, and make the clean break on each
> smaller distribution?

Right.  Exactly. (the latter point :)

> Would you be available to talk about this on #bioperl? I'm online
> there most of the time.
> Carnë

I'll join in tomorrow, sure. I may be on and off channel due to meetings.  
