Bytes from a string

François Perrad francois.perrad at gadz.org
Mon Mar 23 15:35:32 UTC 2009


2009/3/23 Carl Mäsak <cmasak at gmail.com>:
> Hello,
>
> Wanting to escape URIs in our Web.pm project, I would like to
> implement unpack("H*", $str) in Rakudo:
>
>  say unpack("H*", "☺"); # should print 'e298ba'
>
> What facilities exist in Parrot (or, more precisely, PIR), to extract
> the list of bytes making up a string? I'd need something like
> to_bytes.
>
> Bonus points for answering with working PIR code, of course.
>

My first attempt was:

.sub 'unpack'
    $S0 = unicode:"\t20 \u20ac"
    say $S0
    $I0 = find_charset 'binary'
    $S1 = trans_charset $S0, $I0
    $S1 = $S0
    $P0 = split '', $S1
    $P1 = new 'FixedIntegerArray'
    set $P1, 1
    $S0 = ''
  L0:
    unless $P0 goto L1
    $S2 = shift $P0
    $I2 = ord $S2
    $P1[0] = $I2
    $S3 = sprintf '%02x', $P1
    $S0 .= $S3
    goto L0
  L1:
    say $S0
.end

which fails with:
        20 €
to_charset for binary not implemented

but I succeeded with:

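.sub 'unpack'
    # test string: TAB, "20 ", then EURO SIGN (U+20AC)
    $S0 = unicode:"\t20 \u20ac"
    say $S0
    # split into one-character strings
    $P0 = split '', $S0
    # one-element array holding the argument for sprintf
    $P1 = new 'FixedIntegerArray'
    set $P1, 1
    # $S0 is reused as the output accumulator
    $S0 = ''
  L0:
    # loop until every character has been consumed
    unless $P0 goto L1
    $S2 = shift $P0
    $I2 = ord $S2
    $P1[0] = $I2
    # format the code point as 8 hex digits
    $S3 = sprintf '%08x', $P1
  L3:
    # strip leading '00' pairs, keeping at least two digits
    $I0 = length $S3
    if $I0 == 2 goto L4
    $S4 = substr $S3, 0, 2
    unless $S4 == '00' goto L4
    $S3 = substr $S3, 2
    goto L3
  L4:
    $S0 .= $S3
    goto L0
  L1:
    say $S0
.end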

which gives:

        20 €
0932302020ac
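
Note that this output is the hex of each character's code point (20ac for
the euro sign), not of its UTF-8 bytes, so it only matches the requested
'e298ba' style of result for characters below U+0080. For comparison, here
is a minimal sketch of both behaviours. It is written in Python rather than
PIR, purely as an illustration; the function names are made up for this
example and are not part of Parrot or Rakudo.

```python
def hex_of_code_points(text: str) -> str:
    """Mimic the PIR loop above: %08x per code point, leading '00' pairs stripped."""
    out = []
    for ch in text:
        digits = format(ord(ch), "08x")
        # Drop leading '00' pairs but always keep at least two digits.
        while len(digits) > 2 and digits.startswith("00"):
            digits = digits[2:]
        out.append(digits)
    return "".join(out)


def hex_of_utf8_bytes(text: str) -> str:
    """Hex dump of the UTF-8 encoding of text."""
    return text.encode("utf-8").hex()


sample = "\t20 \u20ac"
print(hex_of_code_points(sample))   # 0932302020ac
print(hex_of_utf8_bytes(sample))    # 09323020e282ac
print(hex_of_utf8_bytes("\u263a"))  # e298ba
```

The first line reproduces the output above; the last two show what a
byte-oriented result would look like for the same string and for the
smiley from the original question.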


François

> (I asked this question on IRC last week but didn't get a conclusive
> answer: <http://irclog.perlgeek.de/parrot/2009-03-17#i_994363>)
>
> // Carl
> _______________________________________________
> http://lists.parrot.org/mailman/listinfo/parrot-dev
>

