Fields, Christopher J | 2 Apr 00:23 2013

Re: Google Summer of Code - BioPerl proposals

On Apr 1, 2013, at 12:17 PM, Carnë Draug <carandraug+dev <at> gmail.com> wrote:

> On 1 April 2013 04:28, Fields, Christopher J <cjfields <at> illinois.edu> wrote:
>> On Mar 31, 2013, at 9:05 PM, Carnë Draug <carandraug+dev <at> gmail.com> wrote:
>> 
>>> On 1 April 2013 01:34, Fields, Christopher J <cjfields <at> illinois.edu> wrote:
>>>> I agree.  Another approach might be to cleave off a section that you
>>>> could mould into your own; this could be done for bioperl-run,
>>>> bioperl-live, etc.
>>> 
>>> Why did the project run out of time 2 years ago? The blog posts about
>>> it are very few and don't sound too bad. It mentions having prepared a
>>> couple of them, but none was actually ever released. Instead, the
>>> source was also kept in bioperl-live and seems to have already
>>> branched. Is there any reason for this? It was my understanding that
>>> splitting the project is still desirable, from a discussion back in
>>> February
>>> 
>>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/26395
>>> 
>>> it just happens that no one has picked it up yet.
>> 
>> The project actually made a lot of headway; the particular pieces moved
>> out (Bio::Root, Bio::Factory, etc) worked fine, but we never followed up
>> on exactly what to do next on master branch.  It's perfectly feasible for
>> someone to go ahead and finish the initial part of that (in fact, I
>> believe there were some branches that started along this path but never
>> merged back in).
> 
> Can I merge any branching between these and bioperl-live and set them
> up so you only have to run dzil on their repos?

I wouldn't worry about the branches, they are probably too stale.  Have it so dzil works for the various repos
from that project (it should already).  We will likely need to think about having a stub Build.PL that can be
used for basic installation, but would be auto-generated based on the needs for that repo (and so
shouldn't be committed to).  This is mainly to help git-savvy users, not devs; we don't necessarily want
users to install dzil, which had somewhere north of 40 or so dependencies IIRC.

>>> I think splitting bioperl-live into subdistributions and making a new
>>> 1.70 release of each of them is perfectly doable over a summer. And I
>>> say this after having split and released Bio-Biblio. This is one of my
>>> itches with BioPerl. I have been using it for almost 3 years, but have
>>> never seen a release. I would like to make new releases of everything,
>>> no changes at the start, but take them to the point that "dzil
>>> release" does everything. Make it really easy for anyone to come in
>>> and contribute and even easier for a maintainer to make a new release
>>> after receiving a contribution. Is this desirable for the project?
>>> 
>> 
>> Hilmar's point is pretty valid, namely that a case would have to be made
>> as to why the initial run at it wasn't completed, or why it would work
>> better this time.  We're not suggesting that this can't be done, but the
>> above point would have to be answered.
> 
> The only reason why I claim to be able to finish this is that I'm very
> familiar with both BioPerl and the tools to make the split. Plus,
> I have already split one (and am trying to split another) to get a clear
> idea of what it involves.

Right, I do think it's feasible.  But see Hilmar's response on this point; you don't have to convince us.

>> Frankly, the project has been pretty reliant on me for releases, so it's
>> perfectly valid to point out the modules haven't made it out yet b/c I
>> haven't made a release since then.  From that point of view, this would
>> be a continuation of that work, maybe with the intent/focus on making
>> code releases much easier.
> 
> As a maintainer of another gigantic FOSS project that is also a
> collection of libraries, I can relate to this. Of course it can be
> much more interesting to write new sexy code and add it to the huge
> pile of modules already in bioperl-live but I want to make it easier
> for others to develop on BioPerl. Comparing with chemistry, I want
> this to be the equivalent of a catalyst for the development, rather
> than another reactant.
> 
>> Regarding updating BioPerl to use Dist::Zilla amongst other modern Perl
>> tools (Moose included), yes, it is very much our wish/intent to have
>> this, in any way possible.  But I don't think we can call it BioPerl
>> v1.7, simply based on past release cycles; we're somewhat bound by
>> deprecations, etc.  We really need a clean break.
>> 
>> So, my general feeling is that while we are cleaving out code and
>> releasing the independent dist and core, we should re-christen core as
>> 1.9 (e.g. pre-v2).  We move to v2 when we feel we're at the right point.
>> Each of the individual distributions would have to start with their own
>> versions, anything greater than the point where they left the core/live
>> distribution should work.  I agree with you in that I don't think it
>> would take a long time, but we also have bioperl-run in the mix (and in
>> many cases it would make sense to combine wrappers with the proper
>> parsers), so simply cleaving out from one repo may not be the best
>> approach.
>> 
>> With that in mind, my point was meant to indicate we can also start
>> afresh with a section of the code that you would like to focus on, using
>> some of the same ideas (pulling out the relevant modules you want to work
>> on).  This might be an attainable goal in the minds of GSoC reviewers and
>> might suit your particular needs (for instance, if you had a research
>> project reliant on such code).  I'm supportive either way, and I don't
>> think you'll have a problem finding a mentor if you need one.
> 
> I suggested 1.70 only because it has no changes. And it won't be
> BioPerl 1.7. It would be Bio-Seq, Bio-Align, Bio-Popgen, etc v 1.70.

There may be a point where we will likely find it hard to split out more w/o running into circular dependency
issues.  This will likely center around Bio::Seq, Bio::SeqFeature, and Bio::Annotation (with others
thrown in).  But let's see how far we can go with it.  If we get to a point where division becomes problematic,
we can deem that 'core'.  But I would like to see Bio::Seq etc in their own space.
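
As a rough illustration of that stopping rule, here is a minimal sketch in Python (not part of BioPerl) that groups namespaces which depend on each other in a cycle; any such group could not be split further and would be a candidate for 'core'. The dependency map `hypothetical_deps` is invented purely for illustration and does not reflect the real module dependencies; the function name `find_cycle_groups` is likewise an assumption.

```python
from typing import Dict, List, Set


def find_cycle_groups(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Return groups of nodes that are mutually dependent (Tarjan's SCC).

    Only groups that contain an actual cycle are returned: more than one
    member, or a single node that depends on itself.
    """
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    groups: List[List[str]] = []
    counter = [0]

    def visit(node: str) -> None:
        index[node] = low[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for dep in sorted(deps.get(node, ())):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:
            # node is the root of a strongly connected component
            group = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.append(member)
                if member == node:
                    break
            if len(group) > 1 or node in deps.get(node, ()):
                groups.append(sorted(group))

    for node in sorted(deps):
        if node not in index:
            visit(node)
    return groups


# Hypothetical dependency map, for illustration only (NOT real BioPerl data).
hypothetical_deps = {
    "Bio::Seq": {"Bio::SeqFeature", "Bio::Root"},
    "Bio::SeqFeature": {"Bio::Annotation", "Bio::Root"},
    "Bio::Annotation": {"Bio::Seq", "Bio::Root"},
    "Bio::PopGen": {"Bio::Seq"},
    "Bio::Root": set(),
}

if __name__ == "__main__":
    for group in find_cycle_groups(hypothetical_deps):
        print("cannot be split apart:", ", ".join(group))
```

With a map like this, anything outside the reported groups could in principle be released as its own distribution, while each reported group would have to ship together.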

Re: versioning: I'm not particularly hung up on any particular versioning scheme, but the key point is
support.  It's easy for me to say "as of bioperl v2 the installation scheme will be something completely
different" as opposed to doing so with v1.7.  Will installation of v1.7 be the same as it was for v1.6 (or even
similar)?  Will it install the same modules by default?  We would be changing a key step in using BioPerl
(installation) w/o much warning.  

> These smaller distributions can then stay as they are or evolve into
> 2.0 if their maintainers are so interested. I saw biome and liked it,
> but is the plan to make a BioPerl 2.00 written in Moose?

Not necessarily, unless it can be demonstrated to help considerably.  I think it can FWIW.  

> Won't that
> path take us to the same place we are now in a couple of years? Won't
> it be better to make the split now, and make the clean break on each
> smaller distribution?

Right.  Exactly. (the latter point :)

> Would you be available to talk about this on #bioperl? I'm online
> there most of the time.
> 
> Carnë

I'll join in tomorrow, sure. I may be on and off channel due to meetings.  

chris
