Strange GC problem on the threads branch.

Stefan Seifert nine at detonation.org
Tue Dec 6 20:32:47 UTC 2011


Hi,

I've spent several days trying to track down a GC problem on the threads 
branch. The situation is as follows:
In the attached test program, the main thread creates a new thread and 
immediately afterwards goes to sleep. I confirmed using gdb and debug 
statements that it is indeed completely inactive. The new thread goes into a 
loop where it only calls the pass op. This causes the task PMC's code member 
to be recreated on every iteration, which in turn triggers a GC run from time 
to time. This works a few times until it ends in a segfault.

Over the weekend a segfault occurred after the second GC run when trying to 
access task->code, which suddenly was 0x0. I just merged current master and 
now the pattern has changed: the GC runs nine times until a segfault occurs. I 
attached the debug output and stacktrace as thread_gc_output.txt, along with 
the patch for producing this debug output.

In the threads branch every thread runs its own GC. Access to other threads' 
data goes through proxies, which should prevent any GC concurrency issues.

From what I could discover, the main problem is that Parrot_Task_mark only 
gets called once. After that, the task's code and data are collected by the GC. 
The debug output and single-stepping in gdb show that 
gc_gms_process_work_list finds the task and adds it to self->objects[gen]. 
Immediately afterwards gc_gms_sweep_pools runs, but the task is missing from 
the list. So the task never gets swept and its live flag does not get reset. 
The next time the GC runs, the task is therefore still marked and 
Parrot_Task_mark is not called anymore.
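
To make the suspected failure mode concrete, here is a minimal toy model in 
Python. It is not Parrot code and does not reproduce gc_gms; the names Obj, 
ToyGC and run_cycles are made up for illustration. It only assumes a 
mark-and-sweep collector where mark skips objects whose live flag is already 
set, and where the sweep is the only place that clears the flag.

```python
class Obj:
    """A toy GC-managed object with a live flag and child references."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.live = False
        self.freed = False


class ToyGC:
    """Hypothetical mark-and-sweep collector, for illustration only."""

    def __init__(self):
        self.objects = []  # objects the sweep phase will visit

    def track(self, obj):
        self.objects.append(obj)

    def mark(self, obj):
        if obj.live:
            # Already marked: the children are NOT visited again.
            return
        obj.live = True
        for child in obj.children:
            self.mark(child)

    def collect(self, roots):
        for root in roots:
            self.mark(root)
        survivors = []
        for obj in self.objects:
            if obj.live:
                obj.live = False  # the sweep resets the flag for the next run
                survivors.append(obj)
            else:
                obj.freed = True
        self.objects = survivors


def run_cycles(task_is_swept, cycles=3):
    """Return the GC cycle in which the task's current code is freed, or None."""
    gc = ToyGC()
    task = Obj("task")
    if task_is_swept:
        gc.track(task)  # healthy case: the sweep sees the task
    for cycle in range(1, cycles + 1):
        code = Obj(f"code{cycle}")  # the code member is recreated each time
        gc.track(code)
        task.children = [code]
        gc.collect(roots=[task])
        if code.freed:
            return cycle
    return None
```

With task_is_swept=True the code member survives every cycle. With 
task_is_swept=False the task keeps its live flag after the first run, so the 
second run never marks the new code member and it is freed while still 
referenced. In this toy that happens in the second cycle, which matches what 
I saw over the weekend, but that match may well be a coincidence.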

I'm at a loss as to what the problem might be. I learned much more about the 
GC than I ever wanted to know :) But it's still not enough to fully understand 
what's going on.

Any help would be greatly appreciated,
Stefan
-------------- next part --------------
#!./parrot
# Copyright (C) 2011, Parrot Foundation.

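.sub main :main
    .local pmc task, sayer, starter, number, interp, tasks
    .local int i
    interp = getinterp
    sayer = get_global 'passing_sayer'
    starter = new ['Integer']
    # Create a task that runs passing_sayer and schedule it.
    task = new ['Task']
    setattribute task, 'code', sayer
    # Note: number is never assigned before being used as the task's data.
    setattribute task, 'data', number
    schedule task
    task = null
    # The main thread stays idle from here on while the new thread loops.
    sleep 1000
    starter = 1
.end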

.sub passing_sayer
    .param pmc name
    .local pmc interp, task, starter
    .local int i
    interp = getinterp
    task = interp.'current_task'()
start:
    # Only call the pass op, forever; each call recreates the task's code member.
    pass
    goto start
.end
-------------- next part --------------
A non-text attachment was scrubbed...
Name: thread_gc_debug_output.diff
Type: text/x-patch
Size: 5714 bytes
Desc: not available
URL: <http://lists.parrot.org/pipermail/parrot-dev/attachments/20111206/a00d0744/attachment-0001.bin>
-------------- next part --------------
[New Thread 0x7ffff491f700 (LWP 10893)]
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
Parrot_Task_mark 0x7b6700
gc_gms_process_work_list(0x70b1e0) cought item 0x7b66f8 cur_task 0x7b6700 in gen 0
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700
mark and sweep 0x6fd670
interp 0x6fd670 task alive 0x7b6700
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 1
gc_gms_sweep_pools(0x70b1e0) cur_task 0x7b6700, gen 0
interp 0x6fd670 task alive 0x7b6700

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff491f700 (LWP 10893)]
0x00007ffff78bd66c in mark_code_segment (interp=0x6fd670) at src/gc/mark_sweep.c:292
292             for (i = 0; i < ct->pmc.const_count; i++) {
(gdb) bt
#0  0x00007ffff78bd66c in mark_code_segment (interp=0x6fd670) at src/gc/mark_sweep.c:292
#1  0x00007ffff78bd581 in mark_interp (interp=0x6fd670) at src/gc/mark_sweep.c:269
#2  0x00007ffff78bd0ee in Parrot_gc_trace_root (interp=0x6fd670, mem_pools=0x0, trace=GC_TRACE_FULL) at src/gc/mark_sweep.c:190
#3  0x00007ffff78bcbd8 in gc_gms_validate_objects (interp=0x6fd670) at src/gc/gc_gms.c:2289
#4  0x00007ffff78b9265 in gc_gms_mark_and_sweep (interp=0x6fd670, flags=0) at src/gc/gc_gms.c:855
#5  0x00007ffff78bae18 in gc_gms_allocate_pmc_header (interp=0x6fd670, flags=0) at src/gc/gc_gms.c:1440
#6  0x00007ffff78b13a3 in Parrot_gc_new_pmc_header (interp=0x6fd670, flags=0) at src/gc/api.c:312
#7  0x00007ffff78f876f in get_new_pmc_header (interp=0x6fd670, base_type=16, flags=0) at src/pmc.c:510
#8  0x00007ffff78f7f2b in Parrot_pmc_new (interp=0x6fd670, base_type=16) at src/pmc.c:159
#9  0x00007ffff7901a44 in Parrot_cx_stop_task (interp=0x6fd670, next=0x70af90) at src/scheduler.c:339
#10 0x00007ffff7901bbd in Parrot_cx_preempt_task (interp=0x6fd670, scheduler=0x7b24c0, next=0x70af90) at src/scheduler.c:369
#11 0x00007ffff789689a in Parrot_pass (cur_opcode=0x70af88, interp=0x6fd670) at src/ops/core_ops.c:24080
#12 0x00007ffff78fa82a in runops_fast_core (interp=0x6fd670, runcore_unused=0x7afde0, pc=0x70af88) at src/runcore/cores.c:503
#13 0x00007ffff78f9c97 in runops_int (interp=0x6fd670, offset=46) at src/runcore/main.c:220
#14 0x00007ffff78cf380 in runops (interp=0x6fd670, offs=46) at src/call/ops.c:126
#15 0x00007ffff78c80a4 in Parrot_pcc_invoke_from_sig_object (interp=0x6fd670, sub_obj=0x625dd80, call_object=0x625de48) at src/call/pcc.c:338
#16 0x00007ffff78abd52 in Parrot_ext_call (interp=0x6fd670, sub_pmc=0x625dd80, signature=0x7ffff7adcba6 "P->") at src/extend.c:160
#17 0x00007ffff7a23dc4 in Parrot_Task_invoke (interp=0x6fd670, _self=0x7b6700, next=0x0) at src/pmc/task.c:166
#18 0x00007ffff78c803f in Parrot_pcc_invoke_from_sig_object (interp=0x6fd670, sub_obj=0x7b6700, call_object=0x625dda8) at src/call/pcc.c:330
#19 0x00007ffff78abd52 in Parrot_ext_call (interp=0x6fd670, sub_pmc=0x7b6700, signature=0x7ffff7a8afb5 "->") at src/extend.c:160
#20 0x00007ffff7901710 in Parrot_cx_next_task (interp=0x6fd670, scheduler=0x7b24c0) at src/scheduler.c:229
#21 0x00007ffff7902b91 in Parrot_thread_outer_runloop (arg=0x709ab0) at src/thread.c:204
#22 0x00007ffff74f7a3f in start_thread () from /lib64/libpthread.so.0
#23 0x00007ffff516766d in clone () from /lib64/libc.so.6
#24 0x0000000000000000 in ?? ()
(gdb) p ct
$1 = (PackFile_ConstTable *) 0x0

