[DRAFT] Parrot module ecosystem

Allison Randal allison at parrot.org
Sun Aug 9 23:29:05 UTC 2009

Geoffrey Broadwell wrote:
> This is a VERY ROUGH OUTLINE of my proposal for the design of the Parrot
> module ecosystem.  I'm looking for any and all comments, and I expect
> many changes before we reach even rough consensus.  Please feel free to
> circulate this to people who understand the problem space better than
> I do!  I can use all the input I can get.
> Note that there are some items explicitly marked 'undecided'.  I thought
> about these enough to realize I just don't know enough yet to make a
> decent recommendation, so input on these is especially appreciated.

A great proposal, thanks for taking the time to think through the issues!

> ************************************************************************
> Requirements
> ------------
> * General
>   * Distributions should be dead easy for module authors to create,
>     and for users to install.
>   * We can create a centralized metadata store, but do not want to
>     build and manage a module distribution network ...


>   * However it should be possible for another group to do so.

<shrug> I'm not sure it is possible. Keep in mind that we're not just 
talking about custom Parrot modules; we're talking about the entire
collected history of Perl, Python, PHP, Ruby, Lua, etc. modules. That's
just an insane amount of data; even Google isn't crazy enough to want to
take that on.

And, an additional requirement:
  * The API for extracting info from the metadata store, and the process
of installing modules based on that metadata, should be dead simple and
clearly specified so people can clone it. If we see implementations
popping up in Python, PHP, Ruby, Perl (5&6), etc., it'll be a mark of
success.

> * Toolchain
>   * Basic tools can assume Parrot and the core modules are working,
>     but require no other dependencies internally.
>   * All external tools needed to download/build/install modules will
>     be specified in the module metadata.
>   * Tools should be easy to configure.
>   * Tools should attempt to auto-configure as much as possible.


>   * Tools must properly handle the difference between user-local,
>     site-local, and vendor-installed modules.

Not sure this really matters.

> * Metadata
>   * Simple, extensible format.
>   * Unicode and case-retaining.
>   * Must include its own spec version.


>   * Sufficient for automated programs to create system packages
>     (DEB, RPM, etc.).

That would be tough, but we can at least cover the simple cases, with 
guidance to point people toward how to extend the generated template to 
handle the harder cases.

>   * Separate static v. configure-discovered v. hand-edited metadata.
>     Separate files?

Sounds overly complex. Provide a field in the metadata for "data_source".
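
One way to read this, as a sketch only: each metadata entry carries a
"data_source" tag instead of living in its own file. The three tag values
mirror the categories quoted above; the entry layout, the sample values,
and the helper name fields_from are assumptions for illustration, not a
settled format.

```python
import json

# Hypothetical layout: one file, each entry tagged with where its value came from.
METADATA = json.loads("""
{
  "name":    {"value": "example-module", "data_source": "static"},
  "version": {"value": "0.1.0",          "data_source": "hand-edited"},
  "make":    {"value": "nmake",          "data_source": "configure"}
}
""")

def fields_from(metadata, source):
    """Return {field: value} for entries whose data_source matches `source`."""
    return {
        key: entry["value"]
        for key, entry in metadata.items()
        if entry["data_source"] == source
    }

# Everything discovered at configure time, without needing a separate file.
print(fields_from(METADATA, "configure"))
```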

>   * Includes fetch, configure, build, test, install, and runtime
>     dependencies.
>   * Should be able to track author, mailing list, bug email/bug URI,
>     wiki, homepage, source repository, etc.


>   * Allows disambiguation as per Perl 6 module spec (authorities,
>     versions, authors, etc.).

It's just extra metadata, sure, why not. We should also make sure we can 
accommodate the metadata for RubyGems, PHP PEAR, etc.

>   * Specifies rules for dependency string parsing/interpretation.

Not sure what you mean. More detail?

> Proposal
> --------
> * Overview
>   * Parrot community builds a module metadata search system.
>   * This search system gathers metadata from various sources, and
>     allows users to query it via web browser or API, but does not
>     itself store the actual modules.
>   * Once found, modules can be fetched from many possible sources,
>     including VCS repositories, FTP mirrors, etc.
>   * Parrot team will need to standardize module metadata, provide
>     the libraries and tools necessary to use the search system,
>     provide guidelines for extending the toolchain, and mentor the
>     growth of the ecosystem until it stands on its own.


> * Metadata format
>   * Served metadata container is gzip'ed tarball (.tar.gz? .tgz?).
>   * Core metadata is in META.json at top level of container.
>   * Container includes copies of special files (e.g. README).
>   * Format for specifying non-metadata-only build scripts undecided.
>   * Integrity check / authentication methods undecided.
>     * Probably at least md5sum and sha1sum for source tarballs,
>       but what about when pulling from raw VCS repo?

You're still thinking CPAN (the best technology 1995 had to offer).

The primary interface should be a web form where people can enter 
metadata about their module. They should also be able to *update* the 
information stored there, to mark an older version as deprecated, that a 
module is no longer maintained, change the owner(s), change the URI for 
download, or to remove a module entirely. (Look at Launchpad.net for 

From the form, we can generate a JSON dump of any module's metadata. We 
can also accept a JSON block as an alternate input source, so someone 
can keep a copy of the .json file checked into their repository, make a 
few changes and paste it in the web form when releasing the next version.
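
A rough sketch of that round trip, under stated assumptions: the required
field names come from the "Metadata Proposal" list quoted further down,
while the function names (parse_submission, dump_record, next_release) and
the idea of rejecting incomplete submissions are hypothetical, not an
agreed design.

```python
import json

# Required top-level fields, taken from the quoted "Metadata Proposal" list.
REQUIRED_FIELDS = ("meta-spec", "name", "authority", "version",
                   "license", "copyright_holder", "abstract")

def parse_submission(json_text):
    """Parse a pasted JSON block; reject it if required fields are missing."""
    record = json.loads(json_text)
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return record

def dump_record(record):
    """Produce the JSON dump an author could keep checked into a repository."""
    return json.dumps(record, indent=2, sort_keys=True)

def next_release(json_text, new_version):
    """Load the checked-in JSON, bump the version, and return the new dump."""
    record = parse_submission(json_text)
    record["version"] = new_version
    return dump_record(record)
```

An author would keep the output of dump_record in their repository, then
run something like next_release(old_json, "0.2.0") and paste the result
into the form when releasing.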

The metadata shouldn't keep copies of any files from the module 
distribution, though it should have space for a description. (If 
someone's lazy they might paste in the entire README, but that's 
generally about building a module, and so not appropriate for someone 
who's looking for general information about it, a.k.a. "Do I want this 

> * Core modules
>   * parrot config  (already exists -- config.pir)
>   * HTTP client    (at least GET, with redirect and proxy support)
>   * zlib           (at least decompress)
>   * tar            (at least extract)
>   * JSON           (at least parse)
>   * version spec   (at least parse and compare)
>   * library probe  (shared library info: present? version? location?)
>   * file paths     (portability: File::Spec + File::Basename + ...) 
>   * file install   (portability: copy file, set file perms, etc.)
>   * query metadata (perform API calls to metadata/search server)
>   * installer lib  (all the real brains/glue for the module repo client)
>   * installer ui   (CLI and/or Readline, minimal brains, uses lib)

Too heavyweight. And honestly, I think it's backwards. Aviary should be
a standalone "Pack" that has Parrot as its first dependency. (If you
have Parrot installed, great; if not, it'll install it for you. It's
pretty much what Rakudo does already.)

> * Basic Batteries modules
>   * Full versions of any modules that are limited in Core
>   * Installer add-ons:   VCS fetch/use system pkgs/full depresolve/etc.
>   * Module author tools: create/register/update/upload/etc.
>   * PIR-level tools:     disassembler/debugger/profiler/data dumper
>   * NCI tools:           parse header/manage typemap/wrap C struct/etc.
>   * Standard interfaces: TAP, DBDI, logging, ?
>   * Standard libraries:  OpenSSL, DateTime, temp dir/file, ?
> * Possible Power Packs (NOTE: *EXAMPLES ONLY*, DON'T BIKESHED!)
>   * Database:     DBDs (drivers), SQL clients, per-HLL DBI variants
>   * Testing:      smoke/tinder/smolder clients, per-HLL Test::* variants
>   * Security:     SSH, GPG, libpcap, ...
>   * Unixen:       POSIX, Fcntl, Errno, ...
>   * Markup:       YAML, libxml2, Expat, DOM, SAX, ...
>   * VCS:          CVS, Subversion, git, Mercurial, ...
>   * Email:        POP, IMAP, SMTP, MIME, ...
>   * GUI:          Qt, GTK+, Wx, Tk, ...
>   * 2D Graphics:  libpng, GD, SDL, Cairo, ...
>   * 3D Graphics:  OpenGL, EGL, GLU, ...
>   * Sound:        OpenAL, Pulse Audio, JACK, ...
>   * Game Support: Require other Power Packs: Audio, 2D/3D Graphics

Details of what modules go where can come later. First step is to get 
Pack installs working. (Basic Batteries is just a small Pack.)

Aviary could include tools to make it very easy to create a Pack from a 
simple JSON file (a list of modules, and a title/description for the 
Pack). All Packs should be standalone, installing Parrot and Aviary if 

> * Misc recommendations
>   * Separate 'parrot-modules' mailing list for module creators/users.

Nothing kills a good idea faster than shoving it off on a separate 
mailing list that no one reads. If module-specific traffic ever seems to 
be overwhelming parrot-dev we can split it off. (Since we got the ticket 
traffic off parrot-dev, traffic is quite tolerable now.)

>   * Default to simple (CPAN-style) dependency resolution; upgrade to
>     full resolution and system package awareness in Basic Batteries.

Not sure what you mean. More detail?

>   * Names so far suggested for module repository network:
>     + Aviary

Love it. Very Parroty. aviary.parrot.org?

>     + CPAAN
>     + FPAN

Both would lead people to expect CPAN, which would be limiting to us, 
and frustrating to them when they find out it's not.

> Metadata Proposal
> -----------------
> * Required fields:
>   * meta-spec
>     + version
>     + uri
>   * name
>   * authority
>   * version
>   * license
>     + type
>     + uri
>   * copyright_holder
>   * abstract


> * Manifest fields:
>   * files
>     + configure
>     + build
>     + test
>     + install
>       - share
>       - docs
>       - bin
>       - lib
>       - runtime

Skip the manifest; it's a pile of duplicated data that's only needed by
the build process (by the time you start the build process, you have the
tarball anyway).

> * Dependency fields (as { [dep_name]: [version_spec], ... }):
>   * provides
>   * conflicts
>   * requires
>     + fetch
>     + configure
>     + build
>     + test
>     + install
>     + runtime


> * Optional features fields:
>   * optional_features
>     + [feature_name]
>       - description
>       - [any/all dependency fields as needed]


> * Other optional fields:
>   * description
>   * keywords
>   * generated_by
>   * contributors
>     + authors
>     + maintainers
>     + translators
>     + testers
>     + reviewers
>   * resources
>     + source
>     + homepage
>     + bugtracker
>     + wiki
>     + repository
>       - type
>       - checkout_uri
>       - browser_uri
>       - project_uri
>     + mailinglists
>       - [list_name]
>         . address
>         . uri


> * Undecided fields:
>   * dynamic_config
>   * no_index
>   * digests
>   * signatures

We should be prepared for the format to grow and change over time, 
possibly allowing custom fields.

I don't see anything here for "standard build instructions". As in, the 
specific command-line instructions for "configure", "build", "test", and 
"install". These could allow variable substitutions from parrot_config 
(or Aviary's collected configuration information), so Rakudo's "perl 
Configure.pl" could be "@perl@ Configure.pl", while Pynie's "python 
setup.py build" could be "@python@ setup.py build", and a general 'make 
test' could be "@make@ test" (to allow for nmake, etc).
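
As a minimal sketch of that substitution, assuming a flat dictionary of
configuration values: the @name@ templates are the ones from the paragraph
above, but the CONFIG values, the expand helper, and the choice to fail
loudly on an unknown name are all assumptions, not how parrot_config or
Aviary actually behaves.

```python
import re

# Hypothetical configuration values; in practice these would come from
# parrot_config or Aviary's collected configuration information.
CONFIG = {"perl": "/usr/bin/perl", "python": "/usr/bin/python", "make": "nmake"}

_PLACEHOLDER = re.compile(r"@(\w+)@")

def expand(command, config):
    """Replace each @name@ in `command` with config[name].

    Raises KeyError for an unknown name, so a typo in the metadata is
    caught before the command is run.
    """
    return _PLACEHOLDER.sub(lambda match: config[match.group(1)], command)

# The three example templates from the text, gathered here for illustration.
BUILD_STEPS = {
    "configure": "@perl@ Configure.pl",
    "build": "@python@ setup.py build",
    "test": "@make@ test",
}

for step, template in BUILD_STEPS.items():
    print(step, "->", expand(template, CONFIG))
```

With this CONFIG, "@make@ test" expands to "nmake test", which is the nmake
case mentioned above.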

