Buffered output to a file is painfully slow
Patrick R. Michaud
pmichaud at pobox.com
Mon Feb 9 15:58:14 UTC 2009
While working on improving the speed of the pbc_to_exe utility
yesterday, I discovered that buffered I/O in Parrot seems to be
painfully slow.
Here's a PIR program that opens a file and outputs 2.5 million
comma-separated integers (0 to 255) to the file. On my system
it takes Parrot r36492 almost a minute to generate the file:
    $ cat x.pir
    .sub 'main'
        .local pmc ofh
        ofh = new ['FileHandle']
        ofh.'open'('test-pir.txt', 'w')
        ofh.'buffer_type'('full-buffered')
        $I0 = 2500000
      loop:
        unless $I0 > 0 goto done
        dec $I0
        $I1 = $I0 % 256
        print ofh, $I1
        print ofh, ','
        $I1 = $I0 % 32
        if $I1 != 0 goto loop
        print ofh, "\n"
        goto loop
      done:
        ofh.'close'()
    .end
    $ time ./parrot x.pir
    real    0m57.596s
    user    0m56.636s
    sys     0m0.108s
By way of comparison, an equivalent Perl 5 program takes
less than four seconds:
    $ cat x.pl
    #!/usr/bin/perl
    open OFH, ">test-pl.txt";
    $i = 2500000;
    while ($i > 0) {
        $i--;
        $j = $i % 256;
        print OFH $j;
        print OFH ",";
        print OFH "\n" if $i % 32 == 0;
    }
    close OFH;
    $ time perl x.pl
    real    0m3.996s
    user    0m3.968s
    sys     0m0.028s
For now pbc_to_exe is working around this issue by
concatenating values into large strings to reduce the
overall number of invocations of 'print'. But IMO an
application should not have to do its own buffering
to get reasonable I/O performance -- indeed, that's what
"buffered I/O" is supposed to provide in the first place. :-)
Comments and suggestions welcomed,
Pm
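P.S. For anyone wondering what the "concatenate into large strings" workaround looks like in practice, here is a minimal sketch in Perl 5, reusing the loop from x.pl. It is not the actual pbc_to_exe code. The output file name, the $chunk and $limit variables, and the 64 KB flush threshold are all assumptions made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical output file name, chosen to sit beside the files above.
my $path = 'test-pl-chunked.txt';
open(my $ofh, '>', $path) or die "cannot open $path: $!";

my $chunk = '';          # application-level buffer
my $limit = 64 * 1024;   # flush threshold in bytes (arbitrary assumption)

my $i = 2500000;
while ($i > 0) {
    $i--;
    # Append to the in-memory string instead of calling print each time.
    $chunk .= ($i % 256) . ',';
    $chunk .= "\n" if $i % 32 == 0;

    # Hand the accumulated text to print only once it is large enough.
    if (length($chunk) >= $limit) {
        print {$ofh} $chunk;
        $chunk = '';
    }
}

# Write whatever is left after the loop ends.
print {$ofh} $chunk if length $chunk;
close($ofh) or die "cannot close $path: $!";
```

The output should be byte-for-byte the same as what x.pl writes; only the number of calls to print changes. That is exactly the kind of bookkeeping a buffered filehandle is meant to make unnecessary.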