Access the internal pointer to parrot strings

NotFound julian.notfound at gmail.com
Mon Jun 4 22:52:08 UTC 2012


On 04/06/2012 22:25, "Bart Wiegmans" <bartwiegmans at gmail.com> wrote:
>
> Hi everybody,
>
> I want to access the pointer to the internal buffer of a parrot
> string, so I can pass it to relevant IO functions outside of parrot.

You can use ByteBuffer. Assign a string to one and you get read access to
its raw content. If you write, BB does a copy and modifies the copy.

BB provides the get_pointer vtable function, so it can be used to pass
content to NCI. Note that in the string case you shouldn't pass it to
functions that write to it.
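
As a rough illustration of the pointer-plus-length calling pattern (this is
not Parrot or NCI code), here is a small Python sketch using ctypes. The
payload and all variable names are made up for the example, and the
destination buffer merely stands in for an external function that is handed
a pointer and a byte count, which is the shape the quoted message describes
for ap_rwrite. It also shows why a NUL-scanning call is a poor fit for raw
content.

```python
import ctypes

# Made-up payload with an embedded zero byte, to show the difference
# between length-based and NUL-terminated access.
payload = bytearray(b"Hello\x00world\n")

# View the bytearray's memory as a C char array without copying it.
view = (ctypes.c_char * len(payload)).from_buffer(payload)
address = ctypes.addressof(view)

# Stand-in for an external function that takes (pointer, length):
# copy exactly len(payload) bytes out of the raw buffer.
sink = ctypes.create_string_buffer(len(payload))
ctypes.memmove(sink, address, len(payload))
by_length = sink.raw

# A NUL-scanning read stops at the first zero byte instead.
by_scan = ctypes.string_at(address)
```

With an explicit length the whole payload arrives, embedded zero included;
the scanning read returns only the bytes before the first zero.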

Also, the string subsystem can play games with the buffer content for its
own reasons, so to be safe it is better to make sure that BB does a copy by
writing to it, and then not touching it again until the external function
returns.
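
The copy-on-write behaviour can be modelled roughly as follows. This is a
toy Python sketch, not Parrot's ByteBuffer implementation: the class name
CowBuffer and its methods are hypothetical and only show the idea that reads
share the original content while the first write makes a private copy.

```python
class CowBuffer:
    """Toy copy-on-write byte buffer; a model, not Parrot's ByteBuffer."""

    def __init__(self, source: bytes):
        self._shared = source   # original content, only ever read
        self._private = None    # private copy, made on the first write

    @property
    def owns_copy(self) -> bool:
        return self._private is not None

    def __getitem__(self, index: int) -> int:
        data = self._private if self.owns_copy else self._shared
        return data[index]

    def __setitem__(self, index: int, value: int) -> None:
        if not self.owns_copy:
            self._private = bytearray(self._shared)  # the single copy
        self._private[index] = value

    def tobytes(self) -> bytes:
        return bytes(self._private) if self.owns_copy else self._shared


original = b"Hello world\n"
buf = CowBuffer(original)
shared_before = buf.tobytes() is original  # reading hands back the original

buf[0] = buf[0]                 # rewrite one byte unchanged to force the copy
private_after = buf.owns_copy   # the buffer now has its own storage
```

In this model, writing a byte back unchanged is enough to detach the buffer
from the original string, which is the trick suggested above.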

> (The 'relevant IO function' in question is ap_rwrite, which takes a
> void pointer, a size_t of bytes, and a pointer to a request_rec
> structure. I wanted to paste the link to its definition but it is
> frightfully long. Its signature is size_t (void*, size_t, request_rec
> *)).
>
> My reason for wanting this is that I want to minimise the amount of
> copying and string scanning in general. Consider what happens from the
> inside of mod_parrot to apache:
>
> * A script calls say("Hello world"), because it likes newlines
> * This string is copied (or not, I'm not sure how stringhandle works)
> into the stringhandle buffer
> * Then the script ends, and I read the stringhandle for its contents,
> which are copied (or not, again no idea how this works) into a
> Parrot_String,
> * And I export this Parrot_String to ASCII, which adds another copy
> and a terminating zero.
> * And then finally I give it to ap_rwrite, which copies the string
> again into a buffer of its own.
> * If I were to use ap_rputs, which is easier for me to use, apache
> needs to scan the /entire/ string for the zero to know how many bytes
> to send.
>
> I count at least 3 and at most 5 copies, never minding the useless
> adding of a zero.
> Only one is ever needed, and that is from parrot to the apache buffer,
> which may or may not directly copy it to the kernel buffer. I do not
> know nor care much about that, but I believe the developers of Apache
> thought Really Hard about that function and that It Works.
>
> Note that, as this is HTTP, and I can (and should) send the correct
> character encoding alongside the data, there is really not even much
> reason to worry about 'converting' to the correct format. Nor do I
> need a zero-terminated string, because I know the length. In short,
> the raw pointer to the buffer is perfect for my needs.
>
> Now, I see the following possible issues. First of all, I don't
> suppose that a string buffer, once allocated, will remain there
> forever when the pointer is exported. The pointer is not really 'safe'
> to use outside of a very limited scope, because it may be garbage
> collected (or concurrently accessed, although that might not be so
> serious. I don't know.). In my example it might be safe to use
> because, while calling, it will never go out of scope and the copying
> will be safe. Unless ap_rwrite is implemented asynchronously, or the
> garbage collector is of the concurrently copying type.
>
> The next issue is that it is not guaranteed - nor should it be - that
> the buffer containing the string is contiguous. It might be now, but
> there are reasons for it not to be. If it isn't, get_pointer might
> have to do compaction, and that might be problematic in itself because
> the buffer might be used by another string (e.g. a substring).
>
> Anyway, I was wondering what your thoughts were, before I go off on a
> wild goose chase and implement something that might be a really bad
> idea. Especially people who are rewriting the IO subsystem might
> chime in with good advice :-).
>
> Regards,
> Bart Wiegmans
> _______________________________________________
> http://lists.parrot.org/mailman/listinfo/parrot-dev