Bytes from a string
Carl Mäsak
cmasak at gmail.com
Mon Mar 23 16:08:58 UTC 2009
François (>), Carl (>>):
>> Wanting to escape URIs in our Web.pm project, I would like to
>> implement unpack("H*", $str) in Rakudo:
>>
>> say unpack("H*", "☺"); # should print 'e298ba'
>>
>> What facilities exist in Parrot (or, more precisely, PIR), to extract
>> the list of bytes making up a string? I'd need something like
>> to_bytes.
>>
>> Bonus points for answering with working PIR code, of course.
>>
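For readers unfamiliar with the "H*" template, here is a minimal sketch of the result being asked for, written in Python rather than PIR. It assumes the string is encoded as UTF-8, which is what the expected 'e298ba' for "☺" implies. The function name hex_of_utf8_bytes is hypothetical and does not come from Perl, Rakudo, or Parrot.

```python
def hex_of_utf8_bytes(text: str) -> str:
    """Return two lowercase hex digits per byte of text's UTF-8 encoding.

    Assumption: UTF-8 is the intended byte encoding; the name of this
    helper is illustrative only.
    """
    return text.encode("utf-8").hex()


if __name__ == "__main__":
    # U+263A encodes to the three bytes E2 98 BA in UTF-8.
    print(hex_of_utf8_bytes("☺"))  # e298ba
```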
>
> My first attempt was:
>
> .sub 'unpack'
>     $S0 = unicode:"\t20 \u20ac"
>     say $S0
>     $I0 = find_charset 'binary'
>     $S1 = trans_charset $S0, $I0
>     $S1 = $S0
>     $P0 = split '', $S1
>     $P1 = new 'FixedIntegerArray'
>     set $P1, 1
>     $S0 = ''
>   L0:
>     unless $P0 goto L1
>     $S2 = shift $P0
>     $I2 = ord $S2
>     $P1[0] = $I2
>     $S3 = sprintf '%02x', $P1
>     $S0 .= $S3
>     goto L0
>   L1:
>     say $S0
> .end
>
> which fails with:
> 20 €
> to_charset for binary not implemented
>
> but I succeeded with:
>
> .sub 'unpack'
>     $S0 = unicode:"\t20 \u20ac"
>     say $S0
>     # split the string into one-character strings
>     $P0 = split '', $S0
>     # one-element array used as the argument list for sprintf
>     $P1 = new 'FixedIntegerArray'
>     set $P1, 1
>     $S0 = ''
>   L0:
>     unless $P0 goto L1
>     $S2 = shift $P0
>     $I2 = ord $S2
>     $P1[0] = $I2
>     # format as 8 hex digits, then strip leading '00' pairs
>     $S3 = sprintf '%08x', $P1
>   L3:
>     # always keep at least one pair of digits
>     $I0 = length $S3
>     if $I0 == 2 goto L4
>     $S4 = substr $S3, 0, 2
>     unless $S4 == '00' goto L4
>     $S3 = substr $S3, 2
>     goto L3
>   L4:
>     $S0 .= $S3
>     goto L0
>   L1:
>     say $S0
> .end
>
> which gives:
>
> 20 €
> 0932302020ac
Thank you! This is exactly what I needed to proceed.
// Carl
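
The logic of the second routine quoted above can be sketched in Python as follows (Python rather than PIR, as an illustration only). It takes each character's code point, formats it as eight hex digits, and strips leading "00" pairs while keeping at least one pair. The name hex_of_code_points is hypothetical. Note that this yields code points rather than UTF-8 bytes: for "€" (U+20AC) it gives 20ac, while the UTF-8 encoding of that character is e282ac.

```python
def hex_of_code_points(text: str) -> str:
    """Hex of each character's code point, with leading '00' pairs removed.

    Illustrative helper (hypothetical name) mirroring the loop structure
    of the quoted PIR routine; it is not a Parrot or Rakudo API.
    """
    pieces = []
    for ch in text:
        digits = format(ord(ch), "08x")
        # Drop leading zero pairs, but always keep at least two digits.
        while len(digits) > 2 and digits.startswith("00"):
            digits = digits[2:]
        pieces.append(digits)
    return "".join(pieces)


if __name__ == "__main__":
    sample = "\t20 \u20ac"
    print(hex_of_code_points(sample))    # 0932302020ac
    # For comparison, the UTF-8 bytes of the same string:
    print(sample.encode("utf-8").hex())  # 09323020e282ac
```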