whiteknight/io_cleanup1 branch, ready to merge

Andrew Whitworth wknight8111 at gmail.com
Mon Aug 27 20:02:01 UTC 2012


I've heard back from ruban today, and done some more testing myself.
Everything is looking good and I am planning to merge the branch
tonight unless I hear otherwise.

On Sun, Aug 26, 2012 at 9:11 PM, James E Keenan <jkeen at verizon.net> wrote:
> How does this branch perform on Windows?

I've tested this today on win64, the only Windows platform I can
reliably test on, and all tests pass. A win32 smoke report would be
nice, but I can't generate one of those reliably and I don't expect
it to differ much from the win64 case.

> Note:  Since this branch has been under development for a long time, once it
> is merged it would be good to have a succinct -- no more than 3 paragraphs
> -- post on list as to
>
> * the problems the branch was meant to address
>
> * what problems it did address and how it did address them

This branch is a complete ground-up rewrite of the IO subsystem.
Originally I hadn't intended the scope to be so large, but as I
started cleaning and finding more messes, I just kept going. Nearly
100% of the code in src/io/*.c has been rewritten. Relatively small
changes were made to other parts of the system, such as the IO-related
PMCs (FileHandle, Socket, StringHandle) and the low-level platform
interface (src/platform/*/file.c, socket.c, io.c, etc). It's easier to
list changes made than to write them all up as prose:

1) Pipe logic has been separated from FileHandle logic internally, in
preparation for a new, separate Pipe PMC type

2) Buffering logic has been completely unified for all types.
Previously, FileHandle had a completely separate buffering mechanism
from Socket. Now all IO types share the same exact buffering logic and
the same utility routines. We're able to cut down on overall code
volume because we now have one implementation of buffers instead of
two, one implementation of readline instead of 3, etc.

2a) Buffers were previously set up as fields on the individual handle
type. Now Buffers are a new struct type with its own API (and
eventually its own PMC wrapper).

3) Likewise, various bits of functionality have been standardized.
Previously each individual type had its own interface and semantics
for some operations: getting/setting encoding information,
getting/setting buffer details, readline semantics,
open/close/is_open, etc.

4) Type-specific operations are now broken up into a new IO_VTABLE
structure, which encapsulates most of the details.
src/io/filehandle.c, socket.c, pipe.c and stringhandle.c implement
these new vtables. Previously, several API functions contained large
and clumsy switch/case blocks for different types, with capabilities
and semantics being different between types.

For one example of the kinds of semantic differences between different
PMC types, read
http://whiteknight.github.com/2012/06/13/io_readline.html. There are
at least a dozen other examples of varying degrees.

5) Several bugs have been fixed, including several that had not
previously been reported (but which were found in detailed testing and
comparison). This is especially true of Sockets and StringHandles,
both of which played fast and loose with string buffers and encodings
in some situations.

6) The C-level IO API has been completely rewritten, with old-style
functions kept around as thin compatibility wrappers (which can be
removed after some delay). Functions at the C level are all named and
implemented using a much cleaner, standard scheme. Also, the source
code is commented much more than it ever has been, so other people can
see what I've done and fix any mistakes I've made much more easily in
the future.
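
To illustrate points 2 and 4, here is a minimal sketch in C of the
general idea: a per-type vtable supplies the raw read, and one shared
readline runs over one shared buffer type. This is not the code in the
branch. Every name in it (io_vtable, io_buffer, io_handle,
string_read, io_readline) is hypothetical, and the tiny 8-byte buffer
is only there to force refills.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical, for illustration only. */
typedef struct io_handle io_handle;

/* Type-specific operations live behind function pointers, so shared
   code never needs a switch/case on the handle type. */
typedef struct io_vtable {
    const char *name;
    /* Read up to len bytes into dst; return bytes read, 0 at end of input. */
    size_t (*read)(io_handle *h, char *dst, size_t len);
} io_vtable;

/* One buffer type used by every handle type. */
typedef struct io_buffer {
    char   data[8];   /* deliberately tiny, to exercise refills */
    size_t start;     /* index of the next unread byte */
    size_t end;       /* one past the last valid byte */
} io_buffer;

struct io_handle {
    const io_vtable *vtable;
    io_buffer        buf;
    const char      *src;  /* backing data for the string-backed type */
    size_t           pos;  /* read position within src */
};

/* The string-backed type's implementation of read. */
static size_t string_read(io_handle *h, char *dst, size_t len) {
    size_t remaining = strlen(h->src) - h->pos;
    size_t n = remaining < len ? remaining : len;
    memcpy(dst, h->src + h->pos, n);
    h->pos += n;
    return n;
}

static const io_vtable string_vtable = { "stringhandle", string_read };

/* Single readline shared by all handle types. Copies one line,
   including its trailing newline if present, into out and returns its
   length. Returns 0 at end of input. */
static size_t io_readline(io_handle *h, char *out, size_t outsize) {
    size_t n = 0;
    assert(outsize > 0);
    while (n + 1 < outsize) {
        if (h->buf.start == h->buf.end) {
            /* Buffer is empty: refill it through the vtable. */
            h->buf.start = 0;
            h->buf.end = h->vtable->read(h, h->buf.data, sizeof h->buf.data);
            if (h->buf.end == 0)
                break;
        }
        char c = h->buf.data[h->buf.start++];
        out[n++] = c;
        if (c == '\n')
            break;
    }
    out[n] = '\0';
    return n;
}
```

In this sketch a file, socket or pipe type would only need to provide
its own io_vtable with a different read function; io_readline and
io_buffer would stay untouched.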

> * what problems it failed to address

This is only the first stage of what is, I hope, an ongoing effort to
make the IO system better. I cut off at the first reasonable stopping
point, where almost all features that had existed in the old system
had been recreated. This includes some semantics that I think are
lousy, but are needed to keep the tests happy. Some new features,
especially dealing with pipes and buffers, have not yet been
implemented. Creating and managing buffers, for instance, is still
extremely clumsy to do from PIR or above. Readline in particular has
had a major overhaul and there may be opportunities to improve
performance there, but performance of most of the rest of the system
should show no major regressions.

This is the first, and largest, of what I hope are many waves of
improvements to this subsystem, and it brings a very welcome measure
of sanity to a system that hasn't gotten much love in a long time.

--Andrew Whitworth

