Bytes from a string

Carl Mäsak cmasak at gmail.com
Mon Mar 23 16:08:58 UTC 2009


François (>), Carl (>>):
>> Wanting to escape URIs in our Web.pm project, I would like to
>> implement unpack("H*", $str) in Rakudo:
>>
>>  say unpack("H*", "☺"); # should print 'e298ba'
>>
>> What facilities exist in Parrot (or, more precisely, PIR), to extract
>> the list of bytes making up a string? I'd need something like
>> to_bytes.
>>
>> Bonus points for answering with working PIR code, of course.
>>
>
> My first attempt was:
>
> .sub 'unpack'
>    $S0 = unicode:"\t20 \u20ac"
>    say $S0
>    $I0 = find_charset 'binary'
>    $S1 = trans_charset $S0, $I0
>    $S1 = $S0
>    $P0 = split '', $S1
>    $P1 = new 'FixedIntegerArray'
>    set $P1, 1
>    $S0 = ''
>  L0:
>    unless $P0 goto L1
>    $S2 = shift $P0
>    $I2 = ord $S2
>    $P1[0] = $I2
>    $S3 = sprintf '%02x', $P1
>    $S0 .= $S3
>    goto L0
>  L1:
>    say $S0
> .end
>
> which fails with:
>        20 €
> to_charset for binary not implemented
>
> but I succeeded with:
>
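> .sub 'unpack'
>    $S0 = unicode:"\t20 \u20ac"
>    say $S0
>    # split on the empty string: one element per character
>    $P0 = split '', $S0
>    # one-slot array used as the argument list for sprintf
>    $P1 = new 'FixedIntegerArray'
>    set $P1, 1
>    $S0 = ''
>  L0:
>    unless $P0 goto L1
>    $S2 = shift $P0
>    # ord yields the character's code point
>    $I2 = ord $S2
>    $P1[0] = $I2
>    $S3 = sprintf '%08x', $P1
>  L3:
>    # strip leading '00' pairs, keeping at least two hex digits
>    $I0 = length $S3
>    if $I0 == 2 goto L4
>    $S4 = substr $S3, 0, 2
>    unless $S4 == '00' goto L4
>    $S3 = substr $S3, 2
>    goto L3
>  L4:
>    $S0 .= $S3
>    goto L0
>  L1:
>    say $S0
> .end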
>
> which gives:
>
>        20 €
> 0932302020ac

Thank you! This is exactly what I needed to proceed.
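
One caveat about the output above, for anyone reusing the loop: it formats
each character's code point (the result of ord), so € comes out as 20ac,
whereas the UTF-8 bytes of € are e2 82 ac, by analogy with the 'e298ba'
expected for "☺". A minimal Python sketch of the difference, purely as an
illustration and not Parrot code; the helper names utf8_hex and
codepoint_hex are made up for this example:

```python
def utf8_hex(text: str) -> str:
    """Hex digits of the UTF-8 encoding of text, one pair per byte."""
    return text.encode("utf-8").hex()


def codepoint_hex(text: str) -> str:
    """Mimic the PIR loop: format each code point as %08x,
    then strip leading '00' pairs, keeping at least two digits."""
    parts = []
    for ch in text:
        digits = format(ord(ch), "08x")
        while len(digits) > 2 and digits.startswith("00"):
            digits = digits[2:]
        parts.append(digits)
    return "".join(parts)


if __name__ == "__main__":
    sample = "\t20 \u20ac"
    print(codepoint_hex(sample))  # same digits as the PIR output above
    print(utf8_hex(sample))       # what a byte-level dump would show
    print(utf8_hex("\u263a"))     # the smiley from the original question
```

The two functions agree for ASCII input and diverge as soon as a character
needs more than one byte in UTF-8, so a byte-level unpack would still need
an encoding step before the hex formatting.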

// Carl

