GSoC 2011 Proposal: Bytecode Emitters for POST

Brian Gernhardt benji at silverinsanity.com
Sat Apr 2 20:48:43 UTC 2011


The following text is also online at https://gist.github.com/899867 and further revisions will be posted there.  Comments are welcome via e-mail, gist comments, or IRC.

--------------------

**Name:** Brian Gernhardt

**E-Mail:** benji at silverinsanity.com
**IRC:** benabik on irc.perl.org, irc.freenode.net

Bytecode Emitters for POST
==========================

Abstract
--------

Create an extension to PCT that generates bytecode directly from POST.

Benefits to the Parrot VM and Open Source Community
---------------------------------------------------

While this project would initially make little difference to the day-to-day
usage of Parrot, it should provide a minor speed improvement and lay the
groundwork for further projects.  With direct bytecode generation, Parrot
does not have to spend time parsing PIR that was generated by a language
using PCT.  Removing the PIR dependency from PCT also means that PIR could
be removed from Parrot or even implemented using PCT.

Deliverables
------------

The main deliverable would be a library that can convert POST trees to
Parrot bytecode.  This library should have an interface similar (but not
identical) to the existing POST::Compiler.  It will walk the POST tree and
use the Packfile and related PMCs to generate a PBC file.

Alongside this library should be both technical and user documentation.
The technical documentation will be written as POD comments among the code,
describing both the API and the algorithms involved in detail.  The user
documentation should be a step-by-step guide for HLL implementors to use
the library.

In addition, a test suite needs to be written to ensure that the library
performs as required.  The principles of Test-Driven Development will be
used so that the test suite is constantly up to date with the current
status of the project.

Project Details
---------------

The bytecode generator should be implemented as an external library that
takes in a POST tree and outputs a PBC file.  This should be possible
without any alterations to the Parrot source, since any changes can be
handled by either subclassing or creating custom PMCs.  The library itself
will be written in Winxed as it provides a good balance between power and
ease of use.  For testing, Rosella will be used for similar reasons.

This project will borrow heavily from the design of POST::Compiler and IMCC,
as to some extent it will need to perform tasks done by both.  It needs to
walk the tree and perform translation tasks similar to POST::Compiler's, but
instead of outputting text it will need to translate the opcode names to
raw binary as IMCC does.  If simple tree traversal is insufficient, then
the tree-optimization library will provide more complex pattern matching.
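
As a rough illustration of the tree walk described above, here is a minimal Python sketch (the library itself is planned in Winxed).  Everything in it is an assumption made for illustration: the `Op` and `Ops` classes only stand in for `POST::Op` and `POST::Ops`, the `OPCODES` table holds made-up numbers rather than real Parrot opcode numbers, and the output is a plain list of integers rather than a real bytecode segment.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Hypothetical opcode numbers; real values would come from Parrot's op libraries.
OPCODES = {"noop": 0, "set": 1, "add": 2, "say": 3}


@dataclass
class Op:
    """Stand-in for a POST::Op node: an opcode name plus integer operands."""
    name: str
    args: List[int] = field(default_factory=list)


@dataclass
class Ops:
    """Stand-in for a POST::Ops node: an ordered list of child nodes."""
    children: List[Union["Ops", Op]] = field(default_factory=list)


def emit(node, out=None):
    """Depth-first walk that appends opcode numbers and operands to `out`."""
    if out is None:
        out = []
    if isinstance(node, Ops):
        for child in node.children:
            emit(child, out)
    elif isinstance(node, Op):
        out.append(OPCODES[node.name])  # opcode name -> number, not PIR text
        out.extend(node.args)
    else:
        raise TypeError(f"unsupported node type: {type(node).__name__}")
    return out


tree = Ops([Op("set", [0, 40]), Ops([Op("add", [0, 0, 2])]), Op("say", [0])])
words = emit(tree)
```

The walk has the same shape as one that emits PIR text; only the leaf action changes, from printing an opcode name to appending its number.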

The week prior to each evaluation will be dedicated to bugfixes and other
issues.  This extra time should help both to produce a high-quality result
and to absorb any delays caused by unforeseen issues.  Also, while this
proposal describes the project as an external library, the intention is
that the code could be integrated into the main PCT library under
`compilers/pct`.  A week at the very end of GSoC will be allocated to code
cleanup to address any issues that could prevent such a merge.

There is one major gap in this proposal: `POST::Op` nodes of type `inline`.
Unfortunately, completely implementing such nodes requires a full PIR
parser.  Such handling is outside the scope of this project, but all such
nodes should route through a single place in the code for easy extension.
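
One way to read "a single place in the code" is a single replaceable handler that every inline node passes through.  The Python sketch below assumes that reading; `compile_op`, `handle_inline` and `InlineNotSupported` are hypothetical names, and the tuple returned for ordinary ops is only a placeholder for real emission.

```python
class InlineNotSupported(NotImplementedError):
    """Raised when an inline node is met and no handler has been supplied."""


def handle_inline(pir_text):
    """Default choke point for inline nodes: refuse, but in exactly one place."""
    raise InlineNotSupported(f"inline PIR is not supported: {pir_text!r}")


def compile_op(op_type, payload, inline_handler=handle_inline):
    """Compile one node; every inline node is routed through `inline_handler`."""
    if op_type == "inline":
        return inline_handler(payload)
    return [("op", op_type, payload)]  # placeholder for real emission
```

A later project that does include a PIR parser could then pass its own `inline_handler` without touching the rest of the compiler.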

Project Schedule
----------------

This schedule is written in terms of milestones, so the work listed on each
date will be done in the week(s) prior.

**May 16**: My course-load for this quarter is heavy, but I intend to
spend as much time as possible on IRC and reading existing sources and
documentation.  If time allows, I will create at least the skeleton of the
library with the outline of code, tests and documentation.

**May 16 - 20**: Finals week at RIT.  No GSoC work is likely during this
time.

**May 24** _GSoC Start Date_

**May 30** _Basic Library Structure_:  Functional build and test system.
Documentation describing the main API of the library should be included.

**June 6** _Empty Output_: Library can accept a (TBD) minimal POST tree and
output a PBC file that can be loaded and run but performs no real work.

**June 13** _Opcode Handling_: The framework to handle processing `POST::Op`
nodes will be in place, although it will only handle a small set of opcodes
such as "add" and "say".

**June 20** _Constant Loading_: The ability to load arbitrary constants into
the constants segment will be implemented.

**June 27** _Opcode Lookup_: Full access to core Parrot opcodes via the
information available in the OpLib PMC.

**July 4** _Label Handling_: Populate the Fixup segment and arrange for it to
be used whenever labels are needed.

**July 11** _Midterm Evaluation_: Bugfixes.  At this point the library
should be able to compile simple programs that lack subroutines.

**July 18** _Basic Subroutines_: Basic handling of `POST::Sub` nodes,
including subroutine names and simple parameters.

**July 25** _Subroutine Attributes_: Handling of subroutine attributes such
as name, init, method, etc.

**August 1** _Parameter Attributes_: Handling of :slurpy, :named, :optional,
etc.

**August 6** _Dynops_: Handle loading dynamic opcode libraries and parsing
their opcodes.

**August 16** _Suggested 'Pencils down'_: Bugfixes

**August 22** _Final 'Pencils down', evaluation_: Cleanup for possible merge
to parrot.git
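
Two of the milestones above, constant loading and label handling, rest on well-known assembler techniques: interning values in a constant table, and backpatching branch targets once label positions are known.  The Python sketch below shows both under loud assumptions: the `Assembler` class and its methods are hypothetical, opcode numbers are made up, branch targets are stored as absolute offsets for simplicity, and nothing here models the actual layout of Parrot's constants or Fixup segments.

```python
class Assembler:
    """Toy assembler: interned constants plus backpatched label references."""

    def __init__(self):
        self.code = []          # flat list of integer words
        self.constants = []     # constant table, in insertion order
        self._const_index = {}  # (type, value) -> index into self.constants
        self.labels = {}        # label name -> offset into self.code
        self.fixups = []        # (offset of placeholder word, label name)

    def constant(self, value):
        """Return the table index for `value`, adding it on first use."""
        key = (type(value), value)  # keeps 1 and 1.0 as separate constants
        if key not in self._const_index:
            self._const_index[key] = len(self.constants)
            self.constants.append(value)
        return self._const_index[key]

    def emit(self, opcode, *args):
        """Append an opcode number and its operand words."""
        self.code.append(opcode)
        self.code.extend(args)

    def label(self, name):
        """Record that `name` refers to the next word to be emitted."""
        if name in self.labels:
            raise ValueError(f"duplicate label: {name}")
        self.labels[name] = len(self.code)

    def branch(self, opcode, name):
        """Emit a branch whose target is filled in later by finish()."""
        self.code.append(opcode)
        self.fixups.append((len(self.code), name))
        self.code.append(0)  # placeholder for the target offset

    def finish(self):
        """Backpatch every recorded placeholder and return the code words."""
        for offset, name in self.fixups:
            if name not in self.labels:
                raise KeyError(f"undefined label: {name}")
            self.code[offset] = self.labels[name]
        return self.code


asm = Assembler()
asm.branch(10, "end")                # forward reference to a later label
asm.emit(3, asm.constant("hello"))   # hypothetical "say" with constant index
asm.label("end")
asm.emit(0)                          # hypothetical "noop"
program = asm.finish()
```

Recording fixups instead of resolving labels immediately is what lets a single pass over the tree handle forward branches.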


References and Likely Mentors
-----------------------------

Trac listed cotto and bacek as possible mentors for this project, but I
have not personally spoken to them about it.

* [PDD 13: Bytecode](http://docs.parrot.org/parrot/latest/html/docs/pdds/draft/pdd13_bytecode.pod.html)
* POST (parrot.git:compilers/pct/src/POST/)
* IMCC (parrot.git:compilers/imcc/)
* [tree-optimization](https://github.com/parrot/tree-optimization)
* [pir compiler using PCT](https://github.com/parrot/pir)
* [packfile.winxed](http://code.google.com/p/winxed/source/browse/trunk/examples/packfile.winxed)


License
-------

I whole-heartedly support open source licenses and will be more than happy to
use the Artistic 2.0 license suggested by Parrot.

Bio
---

My name is Brian C Gernhardt and I'm currently attending the Rochester
Institute of Technology to obtain my Master's in Computer Science.  I worked
as a contractor for 4 years after getting my bachelor's degree (in CS from
RIT).  My current focus has been language design, and I just completed a
project where I implemented a [compiler on the JVM][rit-cs].

[rit-cs]: http://cs.rit.edu/~bcg2784/ (My RIT webpage)

I've been following Parrot for a couple of years due to my interest in Perl 6
and Rakudo.  I try to regularly produce smoke reports, which has already
resulted in a couple of Trac tickets (#1544 & #2001).  For my compiler
class, I wrote an [introduction to the Parrot Compiler Toolkit][cish].
I've also contributed to other open source projects, mostly git but also
Ruby on Rails, Radiant and fink.

[cish]: http://github.com/benabik/cish (PCT Introduction)

I've also been interested in a bytecode generator since I read the entry
for it in the Parrot glossary:

> **bcg**  
> Bytecode Generation: bcg will be part of the Parrot Compiler tools. It will
> aid in converting POST to bytecode.

Since my initials and the initials of the library match, I thought it would
be an appropriate section of Parrot for me to work on.

Eligibility
-----------

I am 29 and currently attending the Rochester Institute of Technology to obtain
my Master's in Computer Science.  I have completed a quarter of graduate
classes and can produce a transcript to prove such.


