Euler Project Problem #4 Benchmark on Parrot 0.2.0 - trunk (r42903)

Geoffrey Broadwell geoff at broadwell.org
Mon Dec 7 00:08:31 UTC 2009


On Sun, 2009-12-06 at 13:58 -0800, chromatic wrote:
> On Sunday 06 December 2009 at 11:18, Patrick R wrote:
> 
> > Beyond that, I do not agree with any premise that HLLs will benefit 
> > from the optimizations this sort of analysis can achieve.

Ah, I fear I was dreadfully unclear about what I was trying to say.

> Benchmarks which point out bottlenecks in:
> 
> 	* intrinsic PMC performance (Hash, NameSpace, Sub)
> 	* garbage collector overhead, tunings, and algorithms
> 	* calling conventions overhead
> 
> ... are useful to every HLL I've seen implemented.

The above is close to what I was trying to say -- micro-benchmarks can
help point out individual places where performance is completely out of
line with what's reasonable.  Larger benchmarks tend to show some small
wins and some small losses over time, and it's often not clear *why*
performance is varying -- because from release to release some things
will inevitably be faster and some slower, and these effects partially
cancel each other out.  But if, for example, you write a benchmark just
for method calls and discover it has suddenly gotten 3x slower, then you
can much more easily say "Oops, we inadvertently screwed up method call
performance; let's fix that."

Mind you, pmichaud has shown some high-level benchmarks (such as the
Actions.pm compile) that reveal large problems, and I'm not in the
slightest discounting their considerable value.  But there were several
hypotheses as to why the slowness occurred, along with a lot of
guesswork and circumstantial evidence.  I believe that writing
micro-benchmarks to explicitly test each hypothesis on its own, without
interfering effects, would be valuable.

> I agree that benchmarking only op dispatch with primitive registers has little 
> bearing on HLL performance, but almost every one of the benchmarks I use 
> exercises at least one part of the HLL stack heavily.

And in fact the key point I was trying to make is that for the near
future, say pre-3.0 at least, we should *not* be worried about being
somewhat slower on this or that micro-benchmark.  We should only be
worried about performance outliers: cases where HEAD is *several times*
slower than previous versions, or where doing the same operation in
slightly different ways (core versus HLL-mapped PMCs, for instance)
makes one massively slower than the other.

Even when we find one of these cases, I don't think we should get all
Chicken Little about it.  But I do think such cases warrant discussion
and analysis, and I think it's valuable to run the benchmarks regularly
for the same reason that we run smokes regularly -- because we want to
know when trunk is on fire (in this case, performance-wise rather than
correctness-wise).
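
For concreteness, the "smoke for performance" idea could be as simple as
something like this -- again a Python sketch with hypothetical names and
numbers, not any existing Parrot tooling:

    THRESHOLD = 3.0   # "several times slower" before we sound an alarm

    def check(baseline, current):
        """Compare this run's timings (in seconds) against a stored
        baseline and report anything that regressed past the threshold."""
        for name, secs in current.items():
            old = baseline.get(name)
            if old and secs / old >= THRESHOLD:
                print("PERF REGRESSION: %s is %.1fx slower" % (name, secs / old))

    # Example: method calls took 0.12s per run last release, 0.40s now.
    check({"method_call": 0.12, "gc_sweep": 0.50},
          {"method_call": 0.40, "gc_sweep": 0.48})

The point is not the particular tool; it's that the comparison is
automatic and runs on a schedule, the same way the smokes do.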

A perhaps confusing part of my original email is that I predict we
eventually *will* care about trying to optimize every benchmark we can,
just as the JavaScript VMs have been doing for the last year or so --
but for us that time is not now, and not soon.


-'f



