unicode question for tcl
Will Coleda
will at coleda.com
Tue Jun 9 03:54:11 UTC 2009
The tcl spec test "string.test" fails with:
Invalid character for UTF-8 encoding
This is due to the following tcl test:
test string-24.5 {string reverse command - shared unicode string} {
set x abcde\udead
string reverse $x
} \udeadedcba
I can't see a way to generate this string (\udead) in Parrot; my
attempts fail in different ways with utf16, utf8, and ucs2.
According to the tcl spec:
\uhhhh
The hexadecimal digits hhhh (one, two, three, or four of them) give a
sixteen-bit hexadecimal value for the Unicode character that will be
inserted.
parrot mentions \u in docs/book/ch03_pir.pod and
docs/pdds/pdd19_pir.pod, which I assume should correspond directly to
the \u used in tclsh.
.sub main :main
$S1 = utf16:unicode:"abcd\udead" # Illegal escape sequence in uxxx escape - too short
$S1 = utf8:unicode:"abcd\udead" # Malformed UTF-8 string
$S1 = ucs2:unicode:"abcd\udead" # ucs2 unimpl
.end
Using \Ud800dead instead of \udead fails in the same way for each encoding.
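For what it's worth, 0xDEAD lies in the low-surrogate range (0xDC00 to 0xDFFF), so on its own it is not a valid Unicode scalar value; that may well be why the strict encoders reject it. Below is a minimal sketch in Python 3 (used purely as an illustration, it says nothing about Parrot's internals) showing the same split: the string can be built and reversed as a sequence of code points, but a strict UTF-8 encode refuses it. The helper names are hypothetical.

```python
# Illustration only: Python 3 stands in for a runtime that stores
# strings as code points. Nothing here reflects Parrot's implementation.

LOW_SURROGATES = range(0xDC00, 0xE000)  # 0xDC00..0xDFFF inclusive


def is_low_surrogate(ch):
    """Return True if the single character ch is a low surrogate."""
    return ord(ch) in LOW_SURROGATES


def strict_utf8_ok(s):
    """Return True if s can be encoded as strict UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Same data as tcl test string-24.5.
x = "abcde\udead"

# Reversal works on code points, so the lone surrogate simply moves
# to the front, matching the expected result of the tcl test.
reversed_x = x[::-1]

# A lenient encode ("surrogatepass") writes the surrogate as a
# three-byte sequence that strict UTF-8 decoders treat as malformed.
raw = x.encode("utf-8", "surrogatepass")
```

If the tcl test is to pass, the string type presumably needs some representation that tolerates unpaired surrogates in the way the lenient encode above does, rather than validating on construction.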
Thanks in advance for any help.
--
Will "Coke" Coleda