Rakudo needs from Parrot in 2011

Patrick R. Michaud pmichaud at pobox.com
Mon Jan 31 14:56:54 UTC 2011


At Saturday's Parrot Developer Summit [1], I agreed to write up a
post addressing what Rakudo needs from Parrot over the next 3/6/9/12
month period.  This is that posting.

   [1] http://irclog.perlgeek.de/parrotsketch/2011-01-29

But before I address what Rakudo needs from Parrot for the future,
I'd first like to acknowledge the many places where Parrot has
responded directly to Rakudo's needs in the past year or so:

  * Parrot now has a pluggable garbage collector (gc2), which is far
    more efficient and less buggy than the previous GC
  * Immutable strings
  * Faster string I/O and manipulation
  * A much improved packfile storage format, reducing Rakudo's memory
    footprint
  * Vastly improved load and startup times
  * A much cleaner and saner character set and encoding system, including
    wider support for Unicode
  * Fixes to the profiling runcore
  * Implementation of :nsentry and other key flags
  * Better introspection of Context PMCs and internal structures
  * Many, many incremental improvements to overall Parrot performance

The above list is certainly not exhaustive, but it's indicative of
places where Parrot continues to support Rakudo and we appreciate
Parrot's many efforts in this regard.

Because of the upcoming changes we are making to nqp and our ability
to prototype improvements there, our list of specific needs from
Parrot is not a very long one (at least in terms of the number
of items involved).  All of them tend to relate to speed and performance
in some manner, which is Rakudo's current focus of development.
In no specific order, our needs are:

0.  Anything that makes Parrot, nqp-rx, nqp, or Rakudo run faster overall.  :-)

1.  GC.  Although GC has much improved, it's still fairly slow
    in places, especially when mark/sweep occurs.  The effect can
    be observed by running the following program in Rakudo:

        my $time = now.x;
        for 1..300 -> $step {
          say $step => '#' x (50 * ((my $t2 = now.x) - $time));
          $time = $t2
        }

    This outputs a row of #'s representing the time elapsed between
    iterations.  On my system, most iterations complete in under
    0.06 sec, but when mark/sweep occurs -- approximately every 75
    iterations -- the iteration requires 0.75 sec or longer.  As
    another example of noticeably slow GC, see Larry Wall's "zigzag"
    presentation at YAPC::Asia
    (http://www.youtube.com/user/yapcasia#p/u/131/uzUTIffsc-M ,
    starting at 10:30 in the video).

    We recognize that Rakudo creates a lot of objects when it's
    running, and could potentially create far fewer.  We're working
    on that.  But Perl and other dynamic languages are also regularly
    used to manipulate millions of data values and objects in a single
    program, so Parrot GC still has to be efficient even when millions
    of objects exist.
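
The pauses described in item 1 can also be picked out numerically rather than by eye. The following is a minimal Python sketch, not Rakudo or Parrot code: the names find_pauses and bar, the factor of 5, and the synthetic timings (which simply echo the 0.06 sec and 0.75 sec figures above) are all assumptions made for illustration.

```python
import statistics


def find_pauses(durations, factor=5.0):
    """Return (iteration, seconds) pairs that are much slower than the median.

    'factor' is an arbitrary illustrative threshold, not a Parrot setting.
    """
    typical = statistics.median(durations)
    return [
        (i, d)
        for i, d in enumerate(durations, start=1)
        if d > factor * typical
    ]


def bar(seconds, scale=50):
    """Mimic the '#' x (50 * elapsed) display: the repeat count is truncated."""
    return "#" * int(scale * seconds)


# Synthetic timings, assumed for illustration only: 0.06 sec per iteration,
# with a 0.75 sec stall on every 75th iteration.
durations = [0.75 if i % 75 == 0 else 0.06 for i in range(1, 301)]

for iteration, seconds in find_pauses(durations):
    print(iteration, bar(seconds))
```

With real measurements in place of the synthetic list, the flagged iterations would show how often the stalls occur and how long each one lasts.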

2.  Profiling tools and documentation, especially at the Parrot sub level.
    Parrot's built-in profiling runcore was recently "fixed" to work again
    with nqp-rx and Rakudo; I'm glad for this but we haven't had a lot of
    tuits to play with it.  Building a suite of useful Rakudo, NQP, and
    Parrot benchmarks is on my personal "to do" list.

    But we still need some basic documentation and clear examples
    for using Parrot's profiling capabilities.  To me, the existing
    profiling runcore seems to produce results for nqp-rx programs that
    either don't make any sense or that I'm unable to interpret.
    As an example, I just ran the following command using version
    814a916 of parrot master:

        $ ./parrot --runcore profiling ops2c.pbc --dynamic \
              src/dynoplibs/math.ops --quiet

    This runs ops2c.pbc (an nqp-rx program) on the src/dynoplibs/math.ops
    file.  The profiling runcore indeed produces a parrot.pprof.###
    file, and running that file through pprof2cg.pl produces a
    parrot.out.### file that kcachegrind can apparently read.  However,
    the kcachegrind output seems to indicate that (e.g.) the "slurp"
    function used to read the math.ops input file is taking 83.51
    seconds out of the 102.44 seconds needed to run the program.
    I'm fairly certain that is not an accurate depiction of reality.
    So, either some improvements in the profiling system or some
    guides to understanding the output are definitely needed.

3.  Serialization.  The major item that makes Rakudo startup so slow
    is that we have to do so much initialization at startup to get
    Rakudo's type system and setting in place.  There's not a good
    way in Parrot to reliably serialize a set of language-defined types,
    nor to attach compile-time attributes to subroutines and other "static"
    objects in the bytecode itself.

    Another issue with Parrot serialization is that it often tends to
    be a "serialize the world" affair -- serializing a data structure
    ultimately ends up serializing the underlying class data types,
    their superclasses, and the like.  There needs to be a mechanism
    for placing boundaries around the serialization, so that only the
    unique pieces of a model are serialized, as opposed to everything
    it references.

    We're working on strategies to do better serialization from within
    nqp, but Parrot definitely needs to explore this area as well
    and devise some strategies for compile-time creation of
    language-specific data structures, instead of requiring them
    to always be built at program initialization.
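
To make the "boundaries" idea in item 3 concrete, here is a minimal sketch of how one other runtime handles it. It uses Python's pickle module, whose persistent_id and persistent_load hooks let a serializer write a short reference for objects that live outside the boundary instead of serializing them. This is only an analogy and says nothing about how Parrot or nqp would implement it; TypeObject, TYPE_REGISTRY, and the helper functions are hypothetical names invented for the example.

```python
import io
import pickle


class TypeObject:
    """Stand-in for a language-defined type that already exists at load time."""

    def __init__(self, name):
        self.name = name


# Hypothetical registry of shared types that sit outside the boundary.
TYPE_REGISTRY = {"Int": TypeObject("Int"), "Str": TypeObject("Str")}


class BoundedPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Registered types are written as a name only, never serialized.
        if isinstance(obj, TypeObject) and TYPE_REGISTRY.get(obj.name) is obj:
            return ("type", obj.name)
        return None  # everything else is serialized normally


class BoundedUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, name = pid
        if kind == "type":
            return TYPE_REGISTRY[name]  # re-link to the existing type
        raise pickle.UnpicklingError("unknown persistent id: %r" % (pid,))


def dump_bounded(obj):
    buf = io.BytesIO()
    BoundedPickler(buf).dump(obj)
    return buf.getvalue()


def load_bounded(data):
    return BoundedUnpickler(io.BytesIO(data)).load()


# The unique piece of the model: a value that merely refers to its type.
value = {"type": TYPE_REGISTRY["Int"], "payload": 42}
restored = load_bounded(dump_bounded(value))
print(restored["type"] is TYPE_REGISTRY["Int"])
```

The design point is that the serialized data holds only the value and the name of its type; the type itself is looked up again at load time, so its class data and superclasses never enter the output.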

4.  Create .pbc files directly from a Parrot program.
    I know this is being actively worked on, but it's an explicit
    need for Rakudo and NQP and thus belongs on this list.  Currently
    the only reliable mechanism available for creating .pbc files is
    parrot's command-line interface -- it's not possible for a Parrot
    compiler to generate a .pbc on its own directly.  This is why all
    of the compiler tools currently produce .pir files, which are then
    separately compiled by invoking parrot from a command line into
    .pbc (and eventually .exe files).

    This likely has some relation to #3 above regarding the need for
    a better serialization strategy.

That's the list.  There are other areas where we know improvements
are needed for Rakudo, such as faster lexical support, better context
handling, more efficient control exception handler setup (esp.
"return exceptions"), and the like.  But at the moment we're unable
to offer very specific details on what we need to see in these
areas, and we think it'll be more effective for everyone if we
prototype and test solutions in Rakudo and/or NQP first, then 
offer them to Parrot for potential adoption in its core.  This 
approach would be much the same as the one currently being taken
for a new Parrot object metamodel -- i.e., we've developed a new
one in NQP ("6model"), and the consensus expectation is that it
will migrate downward into the Parrot core as an alternate or
replacement for its current object system.  So, we'd hope that
Parrot can be "open" to migrating improvements in other areas
from Rakudo and NQP into the Parrot core as they become more
developed.  (NQP is designed to be a basis for many HLL translators,
not just Perl 6, so we feel that the improvements we offer would
be flexible enough to improve Parrot for languages beyond Perl 6.)
              
As far as timing needs for the above items go, Rakudo will be
glad to see them "whenever they can be made available".  We have
obvious priority towards those that offer speed improvements 
(e.g. GC and other internal speed improvements) or can be added
with little direct impact to the existing Rakudo codebase 
(e.g., profiling).  We know that any improvements to serialization 
will require a lot of design exploration and core changes, so we 
don't have any specific timeline expectations there, but we also
know we should get some huge speed wins when it does occur. 
    
I hope this outlines Rakudo's needs from Parrot in sufficient
detail to get started on planning and implementation goals; but
if any further detail is needed, please feel free to ask in the 
usual places (parrot-dev, perl6-compiler, #perl6, or #parrot). 
    
Thanks! 
    
Pm  

