Storing Classes in Bytecode

Jonathan Worthington jonathan at jnthn.net
Sat Jul 31 03:08:51 UTC 2010


chromatic wrote:
> Rakudo and other HLLs could greatly benefit from Parrot sorting out the "How 
> and when do I declare a class?" mess.  The best way I can see for us to help 
> them is to ease the requirement that all classes need declaration in :load 
> :init subs when compiling PIR or loading PBC by adding a bytecode segment for 
> classes.
>
> This segment should store attribute information (size, name), methods (name, 
> pointer to appropriate sub in regular bytecode seg), parent information 
> (pointer to other class), and (optional) class name.
>
> Anonymous classes referred to in bytecode can refer to the appropriate class 
> in the segment.
>
> Named classes should also be available by name lookup.
>
> We'll need to add a freeze and thaw mechanism for classes and some mechanism 
> by which HLLs can specify which classes they want to freeze, but if we can 
> make this work (and it's not too much work), we can improve Rakudo's startup 
> immensely.
>
> After that, we can figure out some declarative syntax for classes in PIR and 
> avoid many other messes.
>   
First off, thanks for thinking about these kinds of things. It is indeed 
true that we should be doing a lot less work at startup and should be 
able to serialize much more into a PBC file, and it would be a big win 
all around - not just for Rakudo's own startup time, but for loading 
pre-compiled modules too.

However, the direction that is suggested here is, I'm afraid, not going 
to be helpful. I'd like to make some concrete suggestions of what would 
be helpful, but first of all I need to explain a bit about the direction 
I expect to take the objects implementation for Rakudo in, so that they 
actually have some context. In all of this, please keep in mind that:

* This is still something I'm in the process of designing and 
prototyping, so specifics are speculative.
* I'm in the process of writing a series of blog posts that will go 
into a LOT more detail.

Now that Rakudo Star is out, these two will become my focus for a while.

Rakudo today builds its object model in terms of Parrot's Class and 
Object PMCs. We actually don't expose Parrot classes directly to the 
user, nor Parrot roles. They're instead packaged up inside 
"meta-packages" (a word I am using to describe meta-objects that 
describe various types of package, such as roles or classes). We do 
various tricks like sub-classing the Object PMC. And we build up these 
various objects at runtime, in :load :init blocks for our-scoped stuff. 
During the "ng" branch, where we did many, many refactors, I managed to 
hide pretty much all of this beneath a meta-model API. This is needed 
for Perl 6 anyway, and it also means that I can later change out what's 
beneath it.

It's now time to do that, because we've kinda hit breaking point. I need 
to handle...

* Gradual typing, rather than just dynamic
* Natively typed attributes, including in roles where they are type 
parametrized
* Compact structs...and arrays of them...and arrays of native types
* Custom meta-packages
* Representation polymorphism
* All the OO stuff we do correctly today
* All the OO stuff we do incorrectly today, but correctly

And do it:

* More quickly than we can now
* In a way that's clean
* In a way that works on Parrot
* In a way that works on other backends

To achieve this, I plan to build an object implementation - from the 
ground up - that works just as we need for Perl 6. On Parrot, the 
various objects will, of course, live inside PMCs. For the Parrot 
implementation, there'll be a way to map v-table methods, and I'll work 
out ways to inter-op with any other Parrot object, just as, say, a JVM 
implementation of the model would have to work out ways to interop with 
Java objects. (I'm aware there's interest in Parrot in us having some 
deeper meta-model integration too, but I think it'll be easier to do 
that when we have concrete things to talk about unifying, rather than 
trying to come up with some scheme first.)

I've got quite a few bits of how this will look coming together and I'll 
blog much more detail in the next week or two, but here are the salient 
points for the purpose of this discussion.

1) Whatever I implement will not use the Class and Object PMCs in Parrot.

2) Whatever I implement will be intended to replace Rakudo's, NQP's and 
PCT's usage of P6object (yes, we got a deprecation note in on that :-)).

3) Whatever I design will not - at the level we should be worrying about 
storing things in the bytecode - have a concept of "classes". There'll 
just be objects. One of those will happen to be installed in the package 
as ClassHOW and be the meta-package that implements Perl 6 class 
semantics. (Almost certainly, ClassHOW itself will be written in NQP. 
ClassHOW objects themselves will probably be a KnowHOW, which would 
implement a pure prototype, and we'd tie the bootstrap knot there, where 
it's relatively simple. My current thinking is that NQP will have some 
kinda NQPClassHOW that will be far simpler than Rakudo's ClassHOW.)

4) When compiling something like:

class Foo {
    has $!x;
    method bar() { }
}

Then the actions method would be implemented to instantiate a new 
instance of ClassHOW, and associate it with a new type object, which 
will then be installed in the package (for lexical ones, it'd go into 
whatever the compile time representation of the lexical symbol table 
would be). After parsing an attribute declaration, a call to 
add_attribute would be made immediately. Similarly for methods. Of course, 
we didn't compile the method yet, so we can only install a "placeholder" 
that we'll later fill in with the real compiled method. Once the class 
declaration is done, it can be composed. This may seem rather "early" to 
be making meta-objects, but is in fact critical if we're going to be 
able to properly support gradual typing and various other optimizations 
based upon type information.
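
To make the ordering of those steps concrete, here is a minimal sketch
in Python (chosen only as a neutral modelling language; it is not Perl 6,
NQP or PIR). Only the names ClassHOW and add_attribute come from the
text above. Placeholder, TypeObject, add_method's signature, compose and
the package dictionary are assumptions made for illustration, and the
real design may look quite different.

```python
# Hypothetical model of the compile-time flow described above.
# This is an illustration, not Rakudo's actual implementation.

class Placeholder:
    """Stands in for a method whose body has not been compiled yet."""
    def __init__(self, name):
        self.name = name
        self.code = None              # filled in once the method is compiled

    def fill(self, code):
        self.code = code

    def __call__(self, *args):
        if self.code is None:
            raise RuntimeError(f"method {self.name!r} is not compiled yet")
        return self.code(*args)


class ClassHOW:
    """Toy meta-object: records attributes and methods for one class."""
    def __init__(self, name):
        self.name = name
        self.attributes = []
        self.methods = {}
        self.composed = False

    def add_attribute(self, name):
        self.attributes.append(name)

    def add_method(self, name, code):
        self.methods[name] = code

    def compose(self):
        self.composed = True


class TypeObject:
    """Toy type object that points at its meta-object."""
    def __init__(self, how):
        self.how = how


# What the actions for `class Foo { has $!x; method bar() { } }` might do:
package = {}                          # stand-in for the package symbol table
how = ClassHOW("Foo")                 # on seeing `class Foo {`
package["Foo"] = TypeObject(how)      # type object installed straight away
how.add_attribute("$!x")              # immediately after parsing `has $!x;`
bar = Placeholder("bar")
how.add_method("bar", bar)            # before the body of bar is compiled
how.compose()                         # at the closing `}`
bar.fill(lambda self: None)           # later, once bar has been compiled
```

The point of the placeholder is that the meta-object is complete enough
to answer questions about Foo during compilation, even though no method
body exists yet.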

So, coming back to the topic at hand, here's what I really, really would 
like to have Parrot provide in this area. I'd like an object that is a 
"container" for things to freeze. Let's call it SerializationContext. I 
would create it when I start compiling some chunk of Perl 6 code. I 
could then call an "add" method on it, and pass an object. It would 
return me a handle that I could later use in order to get hold of the 
object at runtime. When we then compile the PIR to PBC, this 
serialization context then gets all the stuff in it frozen.
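
As a rough sketch of the shape of that API, again in Python rather than
anything Parrot-specific: only the name SerializationContext and the
"add" method returning a handle come from the text. The freeze, thaw and
get names, the use of an integer index as the handle, and the use of
Python's pickle module as the freezing mechanism are all assumptions for
illustration.

```python
import pickle


class SerializationContext:
    """Hypothetical container for objects created at compile time."""
    def __init__(self):
        self._objects = []

    def add(self, obj):
        """Register obj and return a handle (here simply an index)."""
        self._objects.append(obj)
        return len(self._objects) - 1

    def get(self, handle):
        """Look an object up again by its handle."""
        return self._objects[handle]

    def freeze(self):
        """Serialize everything in the context in one go."""
        return pickle.dumps(self._objects)

    @classmethod
    def thaw(cls, data):
        """Rebuild a context from frozen data."""
        ctx = cls()
        ctx._objects = pickle.loads(data)
        return ctx


# Compile time: register objects, keep the handles in the generated code.
sc = SerializationContext()
h_meta = sc.add({"name": "Foo", "attributes": ["$!x"], "methods": ["bar"]})
h_lit = sc.add(42)
frozen = sc.freeze()                  # this is what would end up in the PBC

# Load time: thaw the context, then resolve handles to live objects.
loaded = SerializationContext.thaw(frozen)
foo_meta = loaded.get(h_meta)
```

Freezing the whole context at once, rather than object by object, means
that objects referring to each other inside one context keep those
relationships when they are thawed.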

While my immediate use case for this is meta-objects, this would be 
useful for many, many more things. We could also use it for storing any 
object we might want to make at compile time. That includes code-object 
wrappers, numeric and string objects from literals, and probably other 
stuff.

Here's one critical thing, however. This is _not_ about a constant 
segment. I could have a module...

class Lolspeak {
    method lol() { say "oh lol" }
}

That I pre-compile to a PBC. I then do in a script:

use Lolspeak;
augment class Lolspeak {
    method wtf() { say "omg you forgot a wtf method?" }
}

Which would call .add_method on the meta-class which would then have to 
change its internal state to know about the new method. Thus these 
things aren't constants once they're loaded into memory (they *may* be, 
but not by default). Same for subs, which we may end up calling .wrap 
on. Really, they're just objects we've serialized, and want to 
deserialize at startup.
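
The difference from a constant segment can be sketched like this, once
more in Python with hypothetical names. MetaClass, freeze and thaw are
made up for the example, method bodies are represented by plain strings,
and JSON stands in for whatever the real frozen format would be. Only
add_method corresponds to something named above.

```python
import json


class MetaClass:
    """Toy meta-object whose state can be frozen and thawed."""
    def __init__(self, name, methods=None):
        self.name = name
        self.methods = dict(methods or {})

    def add_method(self, name, body):
        self.methods[name] = body

    def freeze(self):
        return json.dumps({"name": self.name, "methods": self.methods})

    @classmethod
    def thaw(cls, data):
        state = json.loads(data)
        return cls(state["name"], state["methods"])


# Pre-compiling the module: build the meta-object and freeze it.
lolspeak = MetaClass("Lolspeak")
lolspeak.add_method("lol", "oh lol")
frozen = lolspeak.freeze()

# In the script: `use Lolspeak` thaws it, `augment` then mutates it.
thawed = MetaClass.thaw(frozen)
thawed.add_method("wtf", "omg you forgot a wtf method?")
```

The frozen image is only the starting state. What gets thawed is an
ordinary live object, and changing it does not touch what was stored.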

A tricky issue that will need some thought is if I then wrote:

use Lolspeak;
class MoreLolspeak is Lolspeak {
    method omg() { say "OMG you accidentally the WHOLE MOP!" }
}

Then the meta-object for MoreLolspeak is going to reference the one from 
Lolspeak. Somehow, there will need to be some "bounding" on what we 
freeze and a cross-PBC way to do referencing. That's going to be tricky, 
but important. Alas, I've probably introduced enough things to ponder in 
this email so far, and it'd be good to get some reactions. :-)
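
One possible way to do the bounding, offered only as a sketch since the
design question is open: when freezing a compilation unit, any object
owned by an already-loaded unit is written out as a (unit name, handle)
reference instead of being serialized inline. The Python below uses the
persistent_id and persistent_load hooks of the standard pickle module to
show the idea. Unit, freeze and thaw are hypothetical names, and
meta-objects are modelled as plain dictionaries.

```python
import io
import pickle


class Unit:
    """Hypothetical per-compilation-unit context (one per PBC)."""
    def __init__(self, name):
        self.name = name
        self.objects = []

    def add(self, obj):
        self.objects.append(obj)
        return len(self.objects) - 1


def freeze(unit, loaded_units):
    # Map id(obj) -> (unit name, handle) for objects owned by other units.
    foreign = {id(obj): (u.name, handle)
               for u in loaded_units
               for handle, obj in enumerate(u.objects)}

    class BoundedPickler(pickle.Pickler):
        def persistent_id(self, obj):
            # None means "serialize inline"; anything else is a reference.
            return foreign.get(id(obj))

    buf = io.BytesIO()
    BoundedPickler(buf).dump(unit.objects)
    return buf.getvalue()


def thaw(name, data, loaded_units):
    by_name = {u.name: u for u in loaded_units}

    class BoundedUnpickler(pickle.Unpickler):
        def persistent_load(self, pid):
            unit_name, handle = pid
            return by_name[unit_name].objects[handle]

    unit = Unit(name)
    unit.objects = BoundedUnpickler(io.BytesIO(data)).load()
    return unit


# The pre-compiled Lolspeak module.
lol_unit = Unit("Lolspeak")
lolspeak = {"name": "Lolspeak", "parents": [], "methods": ["lol"]}
lol_unit.add(lolspeak)

# Compiling MoreLolspeak, whose meta-object references Lolspeak's.
more_unit = Unit("MoreLolspeak")
more_unit.add({"name": "MoreLolspeak", "parents": [lolspeak],
               "methods": ["omg"]})
frozen = freeze(more_unit, [lol_unit])

# Loading MoreLolspeak after Lolspeak's unit is already in memory.
reloaded = thaw("MoreLolspeak", frozen, [lol_unit])
parent = reloaded.objects[0]["parents"][0]
```

Here the thawed MoreLolspeak ends up pointing at the one live Lolspeak
meta-object rather than at a private copy of it, which is what a later
augment of Lolspeak would need. It does assume that the referenced unit
is loaded first and that its handles are stable between compilations.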

I imagine that however other HLLs choose to represent their classes, 
they'll all benefit from being able to get them into the PBC. This is 
something we could integrate some support for into HLL::Compiler, 
perhaps. Not to mention it's a solution for other constant objects too.

At this point in time, we have a unique opportunity to get this in 
place. I'm about to embark on the re-building of the object model 
foundations used by NQP, PCT, Rakudo and whoever else wants to use this 
stuff, and that means that there's a chance to get this serialization 
stuff implemented and tested out with small cases, and gradually migrate 
the whole of NQP and Rakudo into using this. It'll be a big change, but 
I'd much, much rather do one sweeping refactor that introduces the new 
objects implementation _and_ the serialization stuff, rather than do one 
after the other, necessitating two long hard refactoring slogs instead 
of one.

Hope this all makes some kinda sense; I'm about on IRC for discussion, 
can take questions here on the mailing list and for anyone at YAPC::EU 
next week who wants to discuss this, I'll be there too. :-)

Thanks!

Jonathan


