New branch string_checks
Nick Wellnhofer
wellnhofer at aevum.de
Sun Oct 31 15:14:49 UTC 2010
I just created a new branch, string_checks, that adds more thorough checks
of the contents of strings in various encodings.
First of all, there have been many places where strings are created in
the default ASCII encoding, but filled with binary data afterwards. This
is fixed in the new branch by always checking the contents of ASCII
strings in Parrot_str_new_init, and changing the encoding to binary
where appropriate.
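
To illustrate the idea, here is a minimal sketch in C of such a content check. It is not the code from the branch: the names sketch_encoding, sketch_is_ascii and sketch_pick_encoding are hypothetical, and how Parrot_str_new_init actually reports or handles non-ASCII bytes is not shown here. The sketch only assumes that "ASCII" means every byte is in the range 0x00 to 0x7F.

```c
#include <stddef.h>

/* Hypothetical encoding tags for this sketch; not Parrot's real types. */
typedef enum { SKETCH_ENC_ASCII, SKETCH_ENC_BINARY } sketch_encoding;

/* Return 1 if every byte in buf is 7-bit ASCII, 0 otherwise. */
static int
sketch_is_ascii(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] > 0x7F)
            return 0;
    }
    return 1;
}

/* Pick an encoding for a buffer: ASCII if the contents allow it,
 * binary otherwise. An empty buffer counts as ASCII. */
static sketch_encoding
sketch_pick_encoding(const unsigned char *buf, size_t len)
{
    return sketch_is_ascii(buf, len) ? SKETCH_ENC_ASCII : SKETCH_ENC_BINARY;
}
```

A caller holding raw bytes could use sketch_pick_encoding to decide up front whether the data should be labelled binary instead of creating an ASCII string and filling it later.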
The checks for Unicode strings are also improved and moved to
Parrot_str_new_init. Along the way, I rewrote the UTF-16 support to work
without ICU.
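
As a rough illustration of what UTF-16 checking without ICU involves, the following C sketch validates surrogate pairing by hand. It is an assumption about the general technique, not the branch's implementation: the function name sketch_utf16_validate is hypothetical, and the input is assumed to be an array of 16-bit code units already in native byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Validate a sequence of UTF-16 code units.
 * Returns the number of code points, or -1 if the sequence contains
 * an unpaired surrogate. */
static long
sketch_utf16_validate(const uint16_t *units, size_t len)
{
    size_t i     = 0;
    long   count = 0;

    while (i < len) {
        uint16_t u = units[i];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            if (i + 1 >= len)
                return -1;
            if (units[i + 1] < 0xDC00 || units[i + 1] > 0xDFFF)
                return -1;
            i += 2;
        }
        else if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            return -1;
        }
        else {
            /* Ordinary BMP code unit. */
            i += 1;
        }
        count++;
    }
    return count;
}
```

The check needs only range comparisons on the code units, which is why no external library is required for this part.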
This branch breaks reading of UTF-8 data with Rakudo's IO::Socket, but
it's just a coincidence that this worked at all. Currently, Parrot
doesn't support different encodings for sockets like it does for file
handles. I'm not sure whether this is a desired feature.
Nick