[DRAFT] Parrot module ecosystem

Allison Randal allison at parrot.org
Tue Aug 11 22:03:11 UTC 2009


Geoffrey Broadwell wrote:
> 
> A friend suggested that rather than trying to form a module distribution
> network for every individual module, especially using CPAN-style
> tarballs-over-HTTP distribution, a much more likely scenario is a
> torrent service for full Power Packs (the top few of which are likely to
> be sufficiently popular to actually take advantage of traffic sharing).
> 
> I'm not saying *we* do this, just that it is a reasonable thing for some
> enterprising group to decide to do -- without having to do Google-scale
> work.

Yup, that's sensible. They wouldn't even have to create tarballs for the
Packs; they could set it up to create a BitTorrent download queue of
individual modules from the Pack spec JSON file (BitTorrent breaks
everything down into small chunks anyway, so it might as well start that
way).
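
A minimal sketch of turning a Pack spec into a download queue. The spec
layout here (a "modules" list whose entries carry "name" and "uri") and
the example.org URIs are assumptions for illustration only; no Pack spec
format has been defined in this thread.

```python
import json

# Hypothetical Pack spec. The keys "modules", "name", and "uri" are
# assumptions, not a defined Aviary format.
PACK_SPEC = """
{
  "name": "example-pack",
  "modules": [
    {"name": "foo", "uri": "http://example.org/foo-1.0.tar.gz"},
    {"name": "bar", "uri": "http://example.org/bar-2.1.tar.gz"}
  ]
}
"""

def download_queue(spec_text):
    """Return (module name, URI) pairs to fetch, in the order the spec lists them."""
    spec = json.loads(spec_text)
    return [(module["name"], module["uri"]) for module in spec["modules"]]

if __name__ == "__main__":
    for name, uri in download_queue(PACK_SPEC):
        print(name, uri)
```

Each queue entry could then be handed to whatever transport is in use
(torrent or plain HTTP); the queue itself doesn't care.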

>>>   * Tools must properly handle the difference between user-local,
>>>     site-local, and vendor-installed modules.
>> Not sure this really matters.
> 
> I seem to recall the Perl 5 toolchain people actively iterating to get
> this right, so I'm trusting they've thought about it enough to share
> their institutional knowledge here.

What I mean is, people need to be able to add directories to Parrot's
library/include/dynext search paths (which is already possible), but
there's no reason we need to specifically support user-local,
site-local, and vendor-installed. They're all just instances of a custom
install/search path. (And frankly, I never found those categories
particularly useful in Perl anyway, so I don't want to make them
canonical in Parrot.)
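
To illustrate the "they're all just search paths" point, here is a
sketch of a lookup over an ordered list of directories. The function
name, the directory names, and the "foo.pbc" file are hypothetical, and
this is not how Parrot's own path search is implemented.

```python
import os

def find_module(filename, search_path, exists=os.path.isfile):
    """Return the first match for filename along search_path, or None.

    search_path is an ordered list of directories. A "user-local",
    "site-local", or "vendor" directory is simply another entry in it.
    """
    for directory in search_path:
        candidate = os.path.join(directory, filename)
        if exists(candidate):
            return candidate
    return None

if __name__ == "__main__":
    # Hypothetical directories; earlier entries take precedence.
    path = [os.path.expanduser("~/.parrot/lib"), "/usr/local/lib/parrot"]
    print(find_module("foo.pbc", path))
```

Precedence falls out of list order, so no category needs special
handling in the tools.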

>>>   * Sufficient for automated programs to create system packages
>>>     (DEB, RPM, etc.).
>> That would be tough, but we can at least cover the simple cases, with 
>> guidance to point people toward how to extend the generated template to 
>> handle the harder cases.
> 
> Sure.  As long as we don't treat perfection as a hard requirement,
> thinking about the issues here can help flesh out weak points in the
> spec.

Aye, it's a good test case. I would put it as a lower priority than 
getting the main install process working.

>>>   * Separate static v. configure-discovered v. hand-edited metadata.
>>>     Separate files?
>> Sounds overly complex. Provide a field in the metadata for "data_source".
> 
> This came from a thread I noticed in #toolchain.  I had not had time to
> investigate it fully, but I'll ask in more depth for the next iteration.

Makes sense.

>>>   * Specifies rules for dependency string parsing/interpretation.
>> Not sure what you mean. More detail?
> 
> Just as examples, not a complete list, a module in the wild could
> reasonably declare dependencies against:
> 
> * Other modules from the same or other HLLs, or Parrot itself
> * Power Packs
> * System libraries, which vary between OSen in basename, version,
>   basic architecture ('libfoo' versus 'Foo Framework'), etc., and may or
>   may not require foo-dev packages to work with
> * System tools, which vary in name and versioning across platforms
>   ('make' v. 'nmake')
> * Operating system versions ('linux 2.6.30+' or 'Windows Vista+')
> * Alternatives ('iceweasel | firefox | safari | web-browser')
> * Virtual dependencies (depending on something that multiple packages
>   provide, such as 'web-browser' above)
> 
> Each of these is potentially in a different namespace, with its own
> rules about how version strings are parsed and compared (the rules for
> cpan shell are not the same as those used by apt-get, for instance).
> 
> The easy thing is to punt off most of the parsing to namespace-specific
> parsers; PHP should know how to parse and compare PHP module versions.
> But even if we do that, we need a meta-syntax that allows us to specify
> which namespace a dependency string should be interpreted in.
> 
> As a first try, we could do:
> 
>     namespace:'dependency_string'
> 
> Thus:
> 
>     perl5:'Foo::Bar 2.16'
>     perl6:':name<Baz::Biff>:auth<CPAN:QUUX>:ver<3.15.*>'
>     apt:'zlib1g (>= 1:1.2.3.3.dfsg)'
> 
> But that's a bit ugly ... plus we still have other questions to resolve,
> such as how to specify alternates across namespaces.
> 
> All of which is just to say: because we need to work with all of the
> existing dependency metadata of every other module ecosystem out there,
> there's a lot to think about even if we try to punt most of the details.

Ack, that's a total nightmare. Step one is to punt for anything 
installed by Aviary, and trust the metadata to provide the right version 
number.
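
For reference, the namespace:'dependency_string' form quoted above can
be split without interpreting the inner string at all, which is what
punting to namespace-specific parsers would need. This is only a sketch
under the assumption that the quoted part never needs escaping; the
function and pattern names are made up for the example.

```python
import re

# namespace:'dependency_string', with the quoted part captured verbatim.
# Assumption: the string is wrapped in single quotes with no escaping rules.
DEP_RE = re.compile(r"^(?P<namespace>[A-Za-z0-9_]+):'(?P<spec>.*)'$")

def split_dependency(dep):
    """Split a dependency into (namespace, raw spec); raise ValueError if malformed."""
    match = DEP_RE.match(dep.strip())
    if match is None:
        raise ValueError("not in namespace:'dependency_string' form: %r" % dep)
    return match.group("namespace"), match.group("spec")

if __name__ == "__main__":
    for dep in ("perl5:'Foo::Bar 2.16'",
                "perl6:':name<Baz::Biff>:auth<CPAN:QUUX>:ver<3.15.*>'",
                "apt:'zlib1g (>= 1:1.2.3.3.dfsg)'"):
        print(split_dependency(dep))
```

The raw spec would then be handed to whichever parser owns that
namespace; alternates across namespaces are not addressed here.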

>> The primary interface should be a web form where people can enter 
>> metadata about their module. They should also be able to *update* the 
>> information stored there, to mark an older version as deprecated, that a 
>> module is no longer maintained, change the owner(s), change the URI for 
>> download, or to remove a module entirely. (Look at Launchpad.net for 
>> inspiration.)
> 
> I can certainly see offering this as *an* interface.  But I don't buy it
> as the primary interface.  Manual labor that has to be replicated on the
> project's source hosting site (which may be Launchpad.net or github or
> what have you) and on Aviary as well breaks the main requirements of
> making things dead easy for the module authors and not adding any
> additional friction to their (probably volunteer) workload.

See, you're assuming that it's the module authors who will be submitting
this information (like CPAN). Aside from a few modules developed by
Parrot team members, most modules won't be submitted to Aviary by their
authors. They'll be submitted either by Parrot team members getting the
system started, or by users of a Python, Perl, PHP, etc. module who want
to use it with Parrot.

> Or they could register their repository *once*, and we can pull the info
> (perhaps polling, perhaps on request).  Or if the posting site can feed
> us release notices, we can use that.
> 
> We should work very hard to add as little extra work for authors as
> possible, and definitely avoid adding an extra manual publish step.
> Yes, we can offer the manual process as an option ... just not the only
> one.

They're not going to put Aviary information in Launchpad.net or github 
repositories. If the system requires module authors to do anything at 
all, it won't work. We can't depend on them. (Just ask the Linux
distribution packagers. Module authors know nothing about your 
distribution system and don't want to know.)

I completely agree on not duplicating the information. So we *only* keep 
it in the Aviary database. Updating it is two quick clicks through the 
web interface.

I admit I'm prejudiced against making people enter data through a plain 
text file. It seems horribly primitive and 1995 (META.yml blech!), 
especially when we know exactly what data we need. But, to look at it 
from a more practical perspective: presenting it as a web form means 
that any average Joe who has done an install of the module can walk 
through the guided web form and tell us how to do it. JSON text input 
means a) they have to know JSON, b) they have to know our JSON format, 
c) they have to look up what's required, what's optional, and what the 
various options are. All of which will mean we'll get far fewer helpers 
building up Aviary's module data set.

>> And honestly, I think it's backwards. Aviary should be 
>> a standalone "Pack" that has Parrot as its first dependency. (If you 
>> have Parrot installed, great, if not, it'll install it for you. It's 
>> pretty much what Rakudo does already.)
> 
> There's a chicken and egg problem here, and I see no obvious reason that
> one logically comes first.  I don't think the user cares either way, as
> long as they only have to perform one manual step.  More important I
> think is the branding -- do we want to brand ourselves as Parrot or as
> Aviary?  Whichever one the user installs manually is the one they will
> probably perceive as the primary brand.

I was thinking of it as the "Parrot Aviary", so no branding problem.

> My reason for choosing Parrot first was actually more technical -- we
> make binaries/system packages of Parrot for various operating systems,
> and Parrot once installed provides a nice abstract layer insulating us
> from a lot of operating system differences.  Thus building Aviary on top
> of Parrot makes use of Parrot's platform.  Building Aviary separately
> from Parrot, so that it can be used to install Parrot, just forces us to
> solve lots of portability issues all over again.

It's just a bootstrapping problem. I might be more daunted by it, but 
Rakudo has a pretty good working solution.

It will eventually become irrelevant, when Parrot is installed on more 
systems by default. But for now, we have to assume the user has nothing 
Parrot-related installed when they get started.

Bloating the core Parrot distribution with a pile of modules that are 
only used to install more modules really doesn't make sense when most 
users are likely to be installing via apt-get, yum, cygwin, etc.

Parrot core should be light, and a module installer should be light and
optional.

> And however we do it, we shouldn't trample the system's Parrot or Aviary
> if they were installed using the native package tools.

Yes, we shouldn't trample over a system-installed Parrot or Aviary. It's
a configuration test: "should we install Parrot?"
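
As a rough sketch, that test could be as small as checking whether a
Parrot executable is already on the PATH. Treating "parrot found on
PATH" as "Parrot is installed" is an assumption made for the example; a
real check would probably also look at the version.

```python
import shutil

def should_install_parrot(which=shutil.which):
    """Return True only when no "parrot" executable is found on the PATH.

    Assumption: an existing executable means a usable Parrot that we
    must leave alone, however it was installed.
    """
    return which("parrot") is None

if __name__ == "__main__":
    print("install Parrot" if should_install_parrot() else "use existing Parrot")
```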

For now, it doesn't really matter if Aviary has the ability to install 
Parrot. If you can just get it working as something that's installed as 
a module on a system Parrot, you'll be doing well.

> No, not all of them.  Or at least, Packs should be available in
> 'regular' as well as 'all-in-one'.  This goes back to my reasons for
> Parrot to be installed first.  People making Power Packs should have the
> option to create bootstrapping Packs, but it doesn't make a lot of sense
> to me that bootstrapping be the default.  I'm quite likely to want to
> install the Games, Education, Science, and Graphics Packs all at once.
> I see them as add-ons to a central system, not completely independent
> things.

Ditto to above: get it working first, then worry about making the
install process easier.

>>>   * Default to simple (CPAN-style) dependency resolution; upgrade to
>>>     full resolution and system package awareness in Basic Batteries.
>> Not sure what you mean. More detail?
> 
> The first part was about handling edge cases such as conflicts/provides
> graphs that can only be resolved by upgrading multiple modules in
> concert to particular versions and removing other modules at the same
> time.  The second part was about knowing that particular non-Parrot
> system packages were installed (and what versions).

Again, this is your biggest nightmare, the hardest problem to solve. 
It's also one that's already solved by Debian, *BSD, cygwin, MacPorts, 
etc., so be wary of reinventing wheels.

>> Skip the manifest, it's a pile of duplicated data that's only needed by 
>> the build process (by the time you start the build process, you have the 
>> tarball anyway).
> 
> This was conceptually to support script-free module installs and a few
> other similar ideas.  This feels like a 'play it by ear' sort of thing;
> I'm too tired to be sure whether a detailed manifest is really useful or
> not.

Aye, the devil is in the details. The general principle is to keep 
Aviary as light and simple as possible, get it working, and only add 
complexity where you absolutely can't avoid it.


>> I don't see anything here for "standard build instructions". As in, the 
>> specific command-line instructions for "configure", "build", "test", and 
>> "install". These could allow variable substitutions from parrot_config 
>> (or Aviary's collected configuration information), so Rakudo's "perl 
>> Configure.pl" could be "@perl@ Configure.pl", while Pynie's "python 
>> setup.py build" could be "@python@ setup.py build", and a general 'make 
>> test' could be "@make@ test" (to allow for nmake, etc).
> 
> The standard instructions should be 'aviary install foo-pack'.
> Everything else is either advanced usage or internal details that I
> think we'll work out as we come across them.

Oh, you misunderstood. I meant storing the install instructions that
Aviary will have to use to install the module. CPAN has the luxury that
all modules use one single standard set of installation commands, so it
can run them automatically. Aviary will be facing a host of different
techniques for installation, so the easiest way to let it do an
automatic build is to record the commands in Aviary's metadata. That's
really all it needs: "where do I get the tarball?" and "what commands do
I run on it?".
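
A sketch of the @name@ substitution for recorded build commands. The
config values and the four-step layout below are illustrative
assumptions; in practice the values would come from parrot_config or
Aviary's collected configuration.

```python
import re

def expand_command(template, config):
    """Replace each @name@ in template with config[name].

    Raises KeyError when the template names a variable that is missing
    from config, so a bad recipe fails before anything is run.
    """
    return re.sub(r"@(\w+)@", lambda match: config[match.group(1)], template)

# Hypothetical configuration values and recorded build steps.
CONFIG = {"perl": "/usr/bin/perl", "make": "nmake"}
BUILD_STEPS = [
    ("configure", "@perl@ Configure.pl"),
    ("build", "@make@"),
    ("test", "@make@ test"),
    ("install", "@make@ install"),
]

if __name__ == "__main__":
    for step, template in BUILD_STEPS:
        print(step, "->", expand_command(template, CONFIG))
```

With this, the metadata stays platform-neutral and only the config
values differ between, say, a make system and an nmake one.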

Allison

