Need a string_unescape function returning c-string

Simon Cozens simon at simon-cozens.org
Wed Jan 14 21:20:19 UTC 2009


kjstol wrote:
> Nothing special. Strings are stored as char * in the abstract syntax
> tree. Changing that would mean changing *a lot* of code.

Implementing charset-agnostic and encoding-agnostic string access in
Parrot does mean changing *a lot* of code, yes. And not just in PIRC, in
the whole thing. An awful lot of code. But it's going to happen.

We said from the very, very start of Parrot (and I'm talking seven years
ago now) that string access should consider encoding and charset issues
and not handle blobs of data without carrying around their
interpretations. But people hacked on regardless using C strings with
the belief (which I normally share) that it's better to get something
working now than something working perfectly later. There was also the
argument (which I *do* agree with) that we couldn't get the string
access API nailed down until we knew what access was needed.
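
To make "carrying around their interpretations" concrete, here is a minimal
sketch in C of a string that travels with its encoding tag. Every name in it
(tagged_string, encoding_t, tagged_from_cstr, tagged_char_count) is
hypothetical and chosen for illustration only; it is not Parrot's strings
API, and it leaves out the charset tag for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not Parrot's actual string API. */
typedef enum { ENC_FIXED8, ENC_UTF8 } encoding_t;

typedef struct {
    const unsigned char *bytes;    /* raw storage; length is tracked separately */
    size_t               byte_len; /* number of bytes, not characters */
    encoding_t           encoding; /* how the bytes are to be interpreted */
} tagged_string;

/* Wrap a NUL-terminated C string without copying it. The caller states the
 * encoding explicitly, because a bare char * cannot say what it contains. */
tagged_string tagged_from_cstr(const char *s, encoding_t enc)
{
    assert(s != NULL);
    tagged_string t = { (const unsigned char *)s, strlen(s), enc };
    return t;
}

/* Count characters rather than bytes. For a fixed 8-bit encoding the two are
 * equal; for UTF-8 every byte that is not a continuation byte (10xxxxxx)
 * starts a new code point. Input is assumed to be well-formed UTF-8. */
size_t tagged_char_count(const tagged_string *s)
{
    assert(s != NULL);
    if (s->encoding == ENC_FIXED8)
        return s->byte_len;

    size_t count = 0;
    for (size_t i = 0; i < s->byte_len; i++) {
        if ((s->bytes[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

The same five bytes "caf\xC3\xA9" count as four characters when tagged
ENC_UTF8 and as five when tagged ENC_FIXED8, which is exactly the ambiguity a
bare char * cannot resolve.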

So a day of reckoning was always going to come, but I would like to see
it come sooner rather than later.

The good news for you, though, is that the majority of the problem is
going to be mine, as I'm committed to spending a lot of time making the
strings transition happen. I don't mind that, though - I really want to
see it happen, and I think it'll be very good for Parrot when it does.

In fact, I'm hammering on the new strings API at the moment,
going through the existing and proposed functions, making sure they work
well together, and pseudocoding and testing their use. This is all going
on in the "strings" branch. Until that's stable, I think it's actually
best if you keep on going the way you are. I will convert stuff over in
the branch as I get to it.

Simon

