FileHandle.read and multi-byte encodings
Nick Wellnhofer
wellnhofer at aevum.de
Fri Jan 7 19:18:44 UTC 2011
On 07/01/11 19:47, Andrew Whitworth wrote:
> Is that how Rakudo reads from all files, by reading in binary mode
> directly into an encoding-less byte buffer?
No, Rakudo also uses readline with utf8 and other encodings.
> I'm strongly in favor of some kind of read() method that respects
> encodings. If we can do it in-place I would prefer that, but adding a
> new method as necessary would be an acceptable second option.
read() does respect encodings. It's only hard to do when it's given a byte
count and is expected not to fail when that count ends in the middle of a
multi-byte char.
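
To illustrate the partial-character problem, here is a minimal sketch in Python rather than Parrot code. The helper names naive_read and buffered_read are made up for this example and are not part of any Parrot or Rakudo API. It assumes UTF-8 input and uses Python's standard incremental decoder to hold back an incomplete trailing character until the next read.

```python
import codecs
import io


def naive_read(stream, byte_count):
    # Decodes exactly the bytes that were read. Raises UnicodeDecodeError
    # if the chunk ends in the middle of a multi-byte character.
    return stream.read(byte_count).decode("utf-8")


def buffered_read(stream, byte_count, decoder):
    # Reads up to byte_count bytes and feeds them to an incremental decoder,
    # which keeps an incomplete trailing character for the next call.
    chunk = stream.read(byte_count)
    at_eof = len(chunk) < byte_count
    return decoder.decode(chunk, final=at_eof)


# 'a' is one byte in UTF-8, 'é' is two, so a 2-byte read splits 'é'.
data = "aé".encode("utf-8")

try:
    naive_read(io.BytesIO(data), 2)
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

stream = io.BytesIO(data)
decoder = codecs.getincrementaldecoder("utf-8")()
first = buffered_read(stream, 2, decoder)   # only the complete character
second = buffered_read(stream, 2, decoder)  # the character that was split
```

The cost of the buffered variant is that a read of N bytes may return fewer characters than those N bytes appear to hold, because the tail is deferred to the next call.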
If users like Rakudo want to read binary and string data from the same
handle, they could temporarily switch encodings. But it'd be much nicer
to have a 'read_bytes' method, especially if it can directly return a
ByteBuffer.
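
The two options can be compared with a rough sketch, again in Python and not Parrot code. The Handle class, its "binary" encoding name, and the read_bytes signature are all assumptions made for illustration, and a Python bytearray stands in for a ByteBuffer. The sample data is a 4-byte length header followed by UTF-8 text.

```python
import io
import struct


class Handle:
    """Hypothetical handle for illustration; not Parrot's FileHandle."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw
        self.encoding = encoding

    def read(self, byte_count):
        # Returns raw bytes under the assumed "binary" encoding,
        # otherwise a string decoded with the current encoding.
        data = self.raw.read(byte_count)
        if self.encoding == "binary":
            return data
        return data.decode(self.encoding)

    def read_bytes(self, byte_count):
        # Ignores the encoding and returns a mutable byte buffer directly.
        return bytearray(self.raw.read(byte_count))


text_bytes = "café".encode("utf-8")
payload = struct.pack(">I", len(text_bytes)) + text_bytes

# Option 1: temporarily switch encodings around the binary read.
h1 = Handle(io.BytesIO(payload))
saved = h1.encoding
h1.encoding = "binary"
header = h1.read(4)
h1.encoding = saved
(length_a,) = struct.unpack(">I", header)
text_a = h1.read(length_a)

# Option 2: a dedicated read_bytes call, no encoding switch needed.
h2 = Handle(io.BytesIO(payload))
(length_b,) = struct.unpack(">I", h2.read_bytes(4))
text_b = h2.read(length_b)
```

With a separate read_bytes the caller never has to save and restore the handle's encoding, so an error between the switch and the restore cannot leave the handle in the wrong mode.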
Nick