Buffered output to a file is painfully slow

Patrick R. Michaud pmichaud at pobox.com
Mon Feb 9 15:58:14 UTC 2009


While working on improving the speed of the pbc_to_exe utility
yesterday, I discovered that buffered I/O in Parrot seems to be
painfully slow.

Here's a PIR program that opens a file and outputs 2.5 million 
comma-separated integers (0 to 255) to the file.  On my system 
it takes Parrot r36492 almost a minute to generate the file:

    $ cat x.pir
    .sub 'main'
        .local pmc ofh
        ofh = new ['FileHandle']
        ofh.'open'('test-pir.txt', 'w')
        ofh.'buffer_type'('full-buffered')
    
        $I0 = 2500000
      loop:
        unless $I0 > 0 goto done
        dec $I0
        $I1 = $I0 % 256
        print ofh, $I1
        print ofh, ','
        $I1 = $I0 % 32
        if $I1 != 0 goto loop
        print ofh, "\n"
        goto loop
      done:
        ofh.'close'()
    .end

    $ time ./parrot x.pir
    
    real    0m57.596s
    user    0m56.636s
    sys     0m0.108s


By way of comparison, an equivalent Perl 5 program takes 
less than four seconds:

    $ cat x.pl
    #!/usr/bin/perl
    
    open OFH, '>', 'test-pl.txt' or die "Cannot open test-pl.txt: $!";
    
    $i = 2500000;
    
    while ($i > 0) {
        $i--;
        $j = $i % 256;
        print OFH $j;
        print OFH ",";
        print OFH "\n" if $i % 32 == 0;
    }
    
    close OFH;
    
    $ time perl x.pl
    
    real    0m3.996s
    user    0m3.968s
    sys     0m0.028s

For now, pbc_to_exe works around this issue by concatenating
values into large strings, which reduces the overall number of
'print' invocations.  But IMO an application should not have to
do its own buffering to get reasonable I/O performance; indeed,
that is what "buffered I/O" is supposed to provide in the first
place.  :-)

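To make the workaround concrete, here is a rough sketch of the "concatenate, then print in big pieces" idea, written in Perl 5 rather than PIR so it is easy to run.  It is not the actual pbc_to_exe code: the write_chunked helper, the 64 KB threshold, and the output file name are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from pbc_to_exe): emit the same data as x.pl,
# but collect it in a string and call print only once the string
# reaches $chunk bytes.
sub write_chunked {
    my ($path, $count, $chunk) = @_;
    open my $ofh, '>', $path or die "Cannot open $path: $!";

    my $buf = '';
    my $i   = $count;
    while ($i > 0) {
        $i--;
        $buf .= $i % 256;
        $buf .= ',';
        $buf .= "\n" if $i % 32 == 0;

        # Hand the accumulated text to print in one call.
        if (length($buf) >= $chunk) {
            print {$ofh} $buf;
            $buf = '';
        }
    }

    # Write whatever is left over after the loop.
    print {$ofh} $buf if length $buf;
    close $ofh or die "Cannot close $path: $!";
    return;
}

# Assumed values: same item count as x.pl, 64 KB chunks.
write_chunked('test-chunked.txt', 2_500_000, 65_536);
```

The output should be byte-for-byte what x.pl writes; only the number of 'print' calls changes.
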
Comments and suggestions welcomed,

Pm

