[svn:parrot] r41123 - trunk/tools/dev

cotto at svn.parrot.org cotto at svn.parrot.org
Mon Sep 7 18:19:00 UTC 2009


Author: cotto
Date: Mon Sep  7 18:18:59 2009
New Revision: 41123
URL: https://trac.parrot.org/parrot/changeset/41123

Log:
[pprof2cg] add documentation to pprof2cg

Modified:
   trunk/tools/dev/pprof2cg.pl

Modified: trunk/tools/dev/pprof2cg.pl
==============================================================================
--- trunk/tools/dev/pprof2cg.pl	Mon Sep  7 18:06:45 2009	(r41122)
+++ trunk/tools/dev/pprof2cg.pl	Mon Sep  7 18:18:59 2009	(r41123)
@@ -8,34 +8,116 @@
 
 use Data::Dumper;
 
-=head1 NAME
+=head1 Name
 
 tools/dev/pprof2cg.pl
 
-=head1 DESCRIPTION
+=head1 Description
 
 Convert the output of Parrot's profiling runcore to a Callgrind-compatible
 format.
 
-=head1 USAGE
+=head1 Synopsis
+
+perl tools/dev/pprof2cg.pl parrot.pprof.1234
+
+=head1 Usage
 
 Generate a profile by passing C<-Rprofiling> to parrot, for example C<./parrot
--Rprofiling perl6.pbc hello.p6>.  Once execution completes, parrot will print a
-message specifying the location of profile.  The profile will usually be named
-parrot.pprof.XXXX, where XXXX is the PID of the parrot process.
+-Rprofiling perl6.pbc hello.p6>.  Once execution completes, C<parrot> will
+print a message specifying the location of the parrot profile (pprof).  The
+profile will be named parrot.pprof.XXXX, where XXXX is the PID of the parrot
+process unless another name is specified by the B<PARROT_PROFILING_OUTPUT>
+environment variable.
 
 To generate a Callgrind-compatible profile, run this script with the pprof
-filename as the first argument.  The output file will be in parrot.out.XXXX,
-where XXXX again is the PID of the original parrot process.
+filename as the first argument.  The output file usable by kcachegrind will be
+in parrot.out.XXXX, where XXXX again is the PID of the original parrot process.
 
-XXX: document $stats format
+=head1 Environment Variables
 
-=cut
+=head2 PARROT_PROFILING_OUTPUT
 
+If the environment variable PARROT_PROFILING_OUTPUT is set, the profiling
+runcore will attempt to use its value as the profile filename.  Note that it
+does not check whether the file already exists and will happily overwrite
+existing files.
 
+=cut
 
 main(\@ARGV);
 
+=head1 Internal Data Structures
+
+=over 4
+
+=item notes
+
+Parrot's execution model is built on continuation-passing style and does not
+precisely fit the straightforward function-based format that
+Callgrind-compatible tools expect.  For this reason, the profiling runcore
+captures information about context switches (CS lines in the pprof file) and
+pprof2cg.pl maintains a context stack that functions similarly to a typical
+call stack.  pprof2cg.pl then maps these context switches as if they were
+function calls and returns.  See C<$ctx_stack> for more information.
+
+=item C<$ctx_stack>
+
+Variables which are named C<$ctx_stack> hold a reference to an array of hashes
+which contain information about the currently active contexts.  When collecting
+timing information about an op, it is necessary to add that information to all
+function calls on the stack because Callgrind-compatible tools expect the cost
+of a function call to include the cost of all calls made by that function, etc.
+
+When a context switch is detected, C<process_line> looks at the context stack
+to determine if the context switch looks like a function call (if the context
+hasn't been seen before) or a return (if the context is somewhere on the
+stack).  There are some other cases that the code handles, but these can be
+ignored for now in the interest of simplicity.  If the context has been seen,
+C<process_line> shifts contexts off the stack until it finds the context that
+has been switched to.  When C<process_line> detects a new context, it adds a
+fake op representing a function call to C<$stats> and unshifts a new context
+onto the stack.
+
+Each element of C<@$ctx_stack> contains the information needed to uniquely
+identify the site of the original context switch.
+
+=item C<$stats>
+
+Variables which are named C<$stats> contain a reference to a deeply nested
+HoHoH.. which contains all information gathered about a profiled PIR program.
+The nested hashes and arrays refer to the file, namespace, line of source code
+and op number, respectively.   The op number is used to allow multiple
+instructions per line because PIR instructions often represent multiple
+low-level instructions.  This also makes it easy to inject pseudo-ops to
+represent function calls.
+
+Each op always has a time value representing the total amount of time spent in
+that op.  Ops may also have an op_name value that gives the name of the op.
+When control flow similar to a function call is detected, a pseudo-op
+representing a function call is injected.  These pseudo-ops have zero cost when
+initialized and are used to determine the total time spent between when the
+context becomes active and when control flow returns to or past the context.
+Although they're not exactly like function calls, they're close enough that it
+may help to think of them as such.
+
+Uncomment the print_stats line in main to see a representation of the data
+contained in C<$stats>.
+
+=back
+
+=head1 Functions
+
+=over 4
+
+=item C<main>
+
+This function is a minimal driver for the other functions in this file, taking
+the name of a Parrot profile and writing a Callgrind-compatible profile to a
+similarly-named file.
+
+=cut
+
 sub main {
     my $argv      = shift;
     my $stats     = {};
@@ -51,7 +133,7 @@
 
     #print_stats($stats);
 
-    unless ($filename =~ s/\.pprof\./.out./) {
+    unless ($filename =~ s/pprof/out/) {
         $filename = "$filename.out";
     }
 
@@ -59,8 +141,17 @@
     my $cg_profile = get_cg_profile($stats);
     print $out_fh $cg_profile;
     close($out_fh) or die "couldn't close $filename: $!";
+    print "$filename can now be used with kcachegrind or other callgrind-compatible tools.\n";
 }
 
+=item C<process_line>
+
+This function takes a string containing a single line from a Parrot profile, a
+reference to a hash of fine-grained statistics about the current PIR program
+and a reference to the current context stack.  It modifies the statistics and
+context stack according to the information from the Parrot profile.
+
+=cut
 
 sub process_line {
 
@@ -148,6 +239,15 @@
     }
 }
 
+=item C<print_stats>
+
+This function prints a complete, human-readable representation of the
+statistical data that have been collected into the C<$stats> argument to
+stdout.  It is primarily intended to ease debugging and is not necessary to
+create a Callgrind-compatible profile.
+
+=cut
+
 sub print_stats {
     my $stats = shift;
 
@@ -169,6 +269,14 @@
     }
 }
 
+=item C<split_vars>
+
+This function takes a string specifying 1 or more key/value mappings and
+returns a reference to a hash containing those keys and values.  The string
+must be in the format C<{x{key1:value1}x}{x{key2:value2}x}>.
+
+=cut
+
 sub split_vars {
     my $href;
     my $str = shift;
@@ -180,6 +288,17 @@
     return $href;
 }
 
+=item C<store_stats>
+
+This function adds statistical data to the C<$stats> hash reference.  The
+C<$locator> argument specifies information such as the namespace, file, line
+and op number where the data should go.  C<$time> is an integer representing
+the amount of time spent at the specified location.  C<$extra> contains any
+ancillary data that should be stored in the hash.  This includes data on
+(faked) subroutine calls and op names.
+
+=cut
+
 sub store_stats {
     my $stats   = shift;
     my $locator = shift;
@@ -212,6 +331,15 @@
     }
 }
 
+=item C<get_cg_profile>
+
+This function takes a reference to a hash of statistical information about a
+PIR program and returns a string containing a Callgrind-compatible profile.
+Although some information in the profile may not be accurate (namely PID and
+creator), tools such as kcachegrind are able to consume files generated by this
+function.
+
+=cut
 
 sub get_cg_profile {
 
@@ -281,6 +409,10 @@
     return join("\n", @output);
 }
 
+=back
+
+=cut
+
 # Local Variables:
 #   mode: cperl
 #   cperl-indent-level: 4
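
The C<$ctx_stack> documentation in the diff above describes two ideas: a context switch is treated as a call when the context is new and as a return when it is already on the stack, and each op's time is charged to every active context. The following is a minimal sketch of that model, not the script's actual code. The names C<switch_context>, C<record_op> and C<%inclusive_time>, and the context names, are hypothetical, and contexts are reduced to plain strings.

```perl
use strict;
use warnings;

# Hypothetical model of the context stack described in the POD.
# The newest context sits at index 0, matching the shift/unshift wording.
my @ctx_stack;
my %inclusive_time;    # context name => time including everything it called

# Treat a context switch as a call (unseen context) or as a return
# (context already somewhere on the stack).
sub switch_context {
    my ($ctx) = @_;
    if ( grep { $_ eq $ctx } @ctx_stack ) {
        # Return: drop contexts until the target is on top again.
        shift @ctx_stack while $ctx_stack[0] ne $ctx;
        return 'return';
    }
    unshift @ctx_stack, $ctx;
    return 'call';
}

# Charge an op's time to every context that is currently active.
sub record_op {
    my ($time) = @_;
    $inclusive_time{$_} += $time for @ctx_stack;
    return;
}

switch_context('main'); record_op(5);
switch_context('foo');  record_op(3);
switch_context('bar');  record_op(2);
switch_context('main'); record_op(1);    # returns past foo and bar

printf "%s: %d\n", $_, $inclusive_time{$_} for sort keys %inclusive_time;
```

With this input, main accumulates 11, foo 5 and bar 2, which is the inclusive cost that Callgrind-compatible tools expect. The other cases that the POD says C<process_line> handles are left out here.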

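The C<split_vars> documentation gives the input format as C<{x{key1:value1}x}{x{key2:value2}x}>, but the function body is not part of this diff. One way such a string could be parsed is sketched below. C<split_vars_sketch> is a hypothetical stand-in, and the keys C<file> and C<line> are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical re-implementation; the real split_vars body is not shown
# in this commit.
sub split_vars_sketch {
    my ($str) = @_;
    my %vars;

    # Each pair is wrapped as {x{key:value}x}.  The key ends at the first
    # colon, so a value may itself contain colons.
    while ( $str =~ /\{x\{([^:]+):(.*?)\}x\}/g ) {
        $vars{$1} = $2;
    }
    return \%vars;
}

my $vars = split_vars_sketch('{x{file:hello.pir}x}{x{line:42}x}');
print "$_ => $vars->{$_}\n" for sort keys %$vars;
```

The sketch returns a hash reference, as the documented function does. It silently skips anything that does not match the wrapper pattern, which may differ from what the real function does.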

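The hunk in C<main> loosens the output-name substitution from C<s/\.pprof\./.out./> to C<s/pprof/out/>. That matters once C<PARROT_PROFILING_OUTPUT> allows arbitrary profile names. The sketch below wraps the new logic in a hypothetical helper, C<output_name>, to show how a few made-up filenames are mapped.

```perl
use strict;
use warnings;

# Mirrors the substitution shown in the hunk above; the sub name and the
# sample filenames are hypothetical.
sub output_name {
    my ($filename) = @_;

    # Replace the first occurrence of "pprof"; if there is none, fall back
    # to appending ".out".
    unless ( $filename =~ s/pprof/out/ ) {
        $filename = "$filename.out";
    }
    return $filename;
}

print output_name($_), "\n" for qw(parrot.pprof.1234 hello.pprof myprofile);
```

This prints parrot.out.1234, hello.out and myprofile.out. Because the pattern is unanchored, it rewrites the first C<pprof> anywhere in the argument, including in a directory component of a path.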