constant_unfolding branch

Patrick R. Michaud pmichaud at pobox.com
Mon May 31 15:50:55 UTC 2010


On Sun, May 30, 2010 at 10:51:36PM -0700, Peter Lobsinger wrote:
> An idea that's been batted about a bit lately is that our op-bloat is
> at least partly caused by all the constant-argument variant forms all
> of our ops need to support.
> ...
> To address this, the constant_unfolding branch adds a step to IMCC
> instruction selection to check for non-const variants of an
> as-of-yet-unfound op. If found, this op is used and constant arguments
> are handled by assignments to temporary registers. ...

I may be missing an important component here, but reading the above
makes me want to scream "Slow down a bit, partner!"

In 2008 when I first proposed that we start reviewing and reducing our
opcode set, it was with the primary aim of regularizing the API a bit
and eliminating opcodes that are no longer used or operations that really
warrant being object methods instead of opcodes (e.g., I/O and some math
ops, where the opcodes really were just an opcode interface to underlying
method calls).  The point was to present a more uniform opcode API, not
simply to reduce the opcode set to the smallest possible number.

The above proposal looks to me as though it is reducing the number
of opcodes at the expense of increasing the runtime opcode dispatches
and the size of the resulting bytecode.  That feels very much like
a false optimization to me.  (Again, I may be totally misreading what
is proposed or missing some key component in all of this.)

> As a proof of concept, I have removed 30 const form find_cclass and
> find_not_cclass ops in r47192 with no ill effect.

So, if I understand correctly, the new approach takes a current PIR
instruction like

    $I0 = find_cclass .CCLASS_WHITESPACE, $S2, $I3, $I4

and generates it as:

    $I99 = .CCLASS_WHITESPACE
    $I0 = find_cclass $I99, $S2, $I3, $I4

Is this correct?  (I chose this example because it is extremely common 
in both PGE and NQP-rx.)  If so, we've replaced a single opcode and dispatch
in the bytecode with two, increasing runtime and memory costs.
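To make the cost concrete, here is a rough toy model of the rewrite as I
read it. It is only a sketch, not IMCC's actual code: the function names,
the "one word per opcode plus one word per operand" size estimate, and the
numeric value used for .CCLASS_WHITESPACE are all assumptions made for
illustration.

```python
# Toy model of the rewrite described above; not IMCC's actual implementation.
# An instruction is (opname, dest, [operands]); constants are modelled as
# Python ints, registers as strings such as "$I3".

def unfold_constants(instr, first_temp=99):
    """Replace each constant operand with a load into a temporary register."""
    op, dest, operands = instr
    out, new_operands, temp = [], [], first_temp
    for arg in operands:
        if isinstance(arg, int):              # constant operand
            reg = f"$I{temp}"
            temp += 1
            out.append(("set", reg, [arg]))   # extra op: load the constant
            new_operands.append(reg)
        else:                                 # already a register
            new_operands.append(arg)
    out.append((op, dest, new_operands))
    return out

def bytecode_words(instrs):
    """Assumed size model: one word for the opcode, one for the destination,
    and one for each remaining operand."""
    return sum(1 + 1 + len(operands) for _, _, operands in instrs)

CCLASS_WHITESPACE = 32  # placeholder value, for illustration only

before = [("find_cclass", "$I0", [CCLASS_WHITESPACE, "$S2", "$I3", "$I4"])]
after = unfold_constants(before[0])

print("dispatches:", len(before), "->", len(after))
print("words:     ", bytecode_words(before), "->", bytecode_words(after))
```

Under those assumptions the single instruction becomes two dispatches, and
the size estimate grows by the three words of the extra load.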

Now then, I grant that the other variants of find(_not)_cclass are quite rare
-- it's unlikely that any of the other operands will be constants.
So, we _could_ leave the two common constant forms in place and pessimize 
the other variants, which is what I suspect this patch is intended to do.
But (1) I don't know how many opcodes this pessimization will be useful for 
beyond the *cclass variants, and (2) doing this feels like it makes the 
opcode set less regular instead of more regular (which was my original
motivation for suggesting opcode reductions).

(Also, does the IMCC transformation above re-use the same set of temporary
registers, or does it create a new temporary register for each opcode instance?)
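The reason I ask is that the two strategies differ in how many registers a
sub ends up needing. A minimal sketch of the difference, with hypothetical
helper names and no claim about which one the branch actually uses:

```python
# Hypothetical sketch: two ways a compiler could pick temporaries for
# unfolded constants. Neither is claimed to be what the branch does.
from itertools import count

def fresh_temps(constants, start=99):
    """Allocate a new temporary register for every constant occurrence."""
    numbers = count(start)
    return [f"$I{next(numbers)}" for _ in constants]

def reused_temp(constants, start=99):
    """Use one shared temporary, reloaded before each op that needs it."""
    return [f"$I{start}" for _ in constants]

uses = [32, 32, 8]            # three constant operands in one sub
fresh = fresh_temps(uses)
shared = reused_temp(uses)

print("fresh: ", fresh, "distinct registers:", len(set(fresh)))
print("shared:", shared, "distinct registers:", len(set(shared)))
```

With fresh temporaries the register count grows with every constant use;
with a shared temporary it stays fixed, but each use still pays for its own
load.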

Anyway, if having a slightly smaller but irregular opcode set is seen as 
being more beneficial than a regular but slightly larger set, I can agree
to that.  Just be careful about removing constant-variant opcodes that
are in fact quite common at runtime, such as the find_cclass one above.

It may also be worth noting that PCT currently attempts to bias towards
the constant form of opcodes wherever it can, under the theory that this
leads to faster/smaller code.  (However, I can come up with quite a few
common cases where that bias turns out not to be optimal, so some
pieces there need to be rethought as well.)

Pm

