Parrot Bytecode Debug Segment

Kevin Polulak kpolulak at gmail.com
Wed Aug 10 21:50:02 UTC 2011


Howdy,

If any of you have been following my GSoC blog, then you already know that
I've faced many frustrations this summer. As much as I hate to admit it,
HBDB is more than likely going to end up rotting in the bit bucket alongside
the old Parrot debugger. This is very upsetting to me. However, I'm mature
enough to realize that this is not necessarily a reflection of me but more
of a reflection of Parrot. Still, this does not mean total failure.

When I first picked up this project, I searched far and wide across the
Internet for any literature regarding symbolic debugging. I ordered books,
I downloaded essays, I chatted with GDB developers, and I even emailed a
professor at the University of Southern Maine (I read an essay of his).
None were very helpful. I pretty much just winged it, playing through
the pain this summer. As GSoC comes to an end, I feel that I've learned a
significant amount about how symbolic debugging works through my endless
battle with Parrot. It's maybe not the best approach to education, but it
was effective nonetheless. The greatest thing that I've learned is that
Parrot is simply not a platform designed for debugging. This is mostly
because Parrot bytecode does not have a real debug segment for storing
source-level information about HLLs.

It's only natural for areas like this to become neglected in large-scale
projects like Parrot. It happens. I want to do something about this. I
really feel that Parrot is part of the future of dynamic languages and I'm
very happy to be a part of it. However, many things are holding us back
and preventing Parrot from being seen as a suitable alternative to other
mainstream virtual machines. As I discovered this summer, one of those
things is the lack of a real debug segment.

A symbolic debugger really is one of the user-facing applications that we've
been talking about recently. Parrot and PCT themselves are attractive to
only a very small and largely unknown group of people: language/compiler
developers. If we offer a great debugger to users, the scope of Parrot will
significantly increase. Without it, I really don't see people (especially
hackers interested in Perl 6) switching to Parrot-based languages when they
learn that we don't even offer a competent debugger they can use. That being
said, I think development of a debug data format is critical to the success
of Parrot. (As an aside, a debug segment would be useful not only for
debugging but also for other analysis tools like profilers, benchmarks, and
instrumentation, which others have also struggled to develop.)

Some of the more seasoned Parrot developers are probably thinking, "What the
heck are you whining about? We already have a debug segment." Well, if you
haven't used it then you probably don't realize how utterly pathetic it is:
it merely contains a line-number-to-opcode mapping that is incredibly
unreliable. Additionally, our current system of file and line annotations
simply isn't enough. What we need is a standardized debug data format
embedded in the debug segment for describing symbolic HLL constructs such as
variables, data types, subroutines, etc.

That is why I've started designing a debug data format that we can use. The
specification is in a Gist located here <https://gist.github.com/1133182>.
For now, I'm calling it SOD: the Symbolic Opcode Description format. This is
certainly subject to change though. I've decided to model it after the DWARF
format since DWARF is widely considered one of the best debug formats out
there.

Here's how it works in a nutshell. Since most modern languages are block
structured, the format itself is also block structured. That is, each entity
is contained or "owned" by another entity forming a tree-like structure.
This makes it much easier to describe the static structure of a source file,
as most compilers already use a tree-like structure called an abstract
syntax tree as an intermediate representation. Only the information that is
needed to describe a program object is provided. This makes the format
extensible enough to describe nearly any procedural or object-oriented
language. A debugger like HBDB could recognize or ignore certain extensions
created by various HLLs.

The most basic entity in SOD is called a "Data Description Entity" or DDE. A
DDE consists of an "element" that indicates what it describes and a list of
"properties" that further describe the specific characteristics of the
entity.
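
To make the idea a bit more concrete, here is a rough sketch in Python of
how a tree of DDEs might be modeled and how a debugger could skip
extensions it doesn't understand. This is only an illustration and not
part of the specification: the class layout, the element names
("compile_unit", "subroutine", "variable", "x_closure_info"), the property
keys, and the file name are all made up for this example, and the
properties are kept in a dict purely for convenience.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DDE:
    """One Data Description Entity: an element plus its properties."""
    element: str                                   # what the entity describes
    properties: Dict[str, object] = field(default_factory=dict)
    children: List["DDE"] = field(default_factory=list)  # entities it owns

    def add(self, child: "DDE") -> "DDE":
        """Attach a child entity and return it so calls can be nested."""
        self.children.append(child)
        return child


# Hypothetical set of elements that a given debugger knows how to handle.
KNOWN_ELEMENTS = {"compile_unit", "subroutine", "variable"}


def describe(dde: DDE, depth: int = 0) -> List[str]:
    """Render the tree as indented lines, ignoring unknown extensions."""
    if dde.element not in KNOWN_ELEMENTS:
        return []  # skip an unrecognized entity and everything it owns
    props = ", ".join(f"{key}={value}" for key, value in dde.properties.items())
    lines = ["  " * depth + f"{dde.element}({props})"]
    for child in dde.children:
        lines.extend(describe(child, depth + 1))
    return lines


# A source file owns a subroutine, which owns a variable and an
# HLL-specific extension entity (all names here are invented).
unit = DDE("compile_unit", {"name": "hello.pir"})
sub = unit.add(DDE("subroutine", {"name": "main", "line": 1}))
sub.add(DDE("variable", {"name": "count", "type": "Integer"}))
sub.add(DDE("x_closure_info", {"captures": 2}))

for line in describe(unit):
    print(line)
```

The extension entity is still stored in the tree; the debugger in this
sketch simply doesn't report it, which is the "recognize or ignore"
behavior described above.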

More information on the specification can be found here
<https://gist.github.com/1133182>.

I can offer my skills in designing the format itself but a large effort like
this would require the help of many other developers for the actual
implementation. That is why I'm writing this. I would really like to see
some people helping out on this (that includes you, whiteknight). With the
time and effort of others, I think this can really help get Parrot out into
user land.


-- 
- Kevin Polulak (soh_cah_toa)