cross-thread data sharing

Nat Tuck nat at ferrus.net
Wed Jun 2 18:07:40 UTC 2010


STM, inter-thread COW, and message-passing-only are all great solutions for
specific programs and even specific languages, but it's important to
remember that some programs cannot be written without relatively low-level
access to shared memory. The Perl6 guys, for example, probably won't let
Parrot get away with preventing those programs from being written.

My favorite example of an algorithm that requires shared memory and probably
doesn't even want STM is this:
http://pandion.ferrus.net/gsoc/Parallel%20randomized%20best-first%20minimax%20search.pdf

-- Nat "Chandon" Tuck

On Wed, Jun 2, 2010 at 10:39 AM, François Perrad
<francois.perrad at gadz.org> wrote:

>
>
> 2010/6/2 Andrew Whitworth <wknight8111 at gmail.com>
>
>> Chandon's GSoC project is already starting to highlight some
>> unresolved related issues we have in Parrot. Perhaps the most
>> important is how we control cross-thread data corruption. We used to
>> have an STM system though it was non-functional. We've recently also
>> removed a "_sync" member of the PMC structure which ostensibly would
>> have been used to perform fine-grained locking of shared PMCs. Both of
>> those things were unused and nonfunctional at the time they were
>> removed, but we are going to need to replace them with something
>> eventually, especially if we ever want to have proper thread support.
>> Throughout this email I'm going to be using the term "threads" to mean
>> OS-level threads, not the new "Green Threads" that Chandon is working
>> on (in Green Threads, data corruption is a much, much smaller problem).
>>
>> An obvious choice would be to create a new STM implementation. Done
>> right, we wouldn't need to add new fields to the PMC structure and we
>> could avoid almost all locking. Plus, there are several libraries out
>> there that we could tap into to get STM "for free". I think there are
>> some STM libraries affiliated with the LLVM project as well, so we
>> might be able to tap into those at the same time we're adding an
>> LLVM-based JIT backend. Implementing simple STM shouldn't be too big a
>> project. However, doing it correctly and robustly, and keeping up with
>> the current research on optimization, is much harder. If we
>> want to go the route of using STM, we should seriously evaluate some
>> existing libraries.
>>
>> With our shiny new immutable strings implementation we already don't
>> have to worry about locking strings because they can't be written to
>> and therefore can't be corrupted. We may need to make some changes to
>> the implementation to make sure there are no exceptions and that a
>> reference to a STRING cannot escape into PIR land before it has been
>> completely constructed and write-protected. We also obviously don't
>> need to worry about locking INTVALs and FLOATVALs, since those aren't
>> passed internally by reference. So a better question than "how do we
>> safely share PMCs" might be "How do we stop sharing PMCs entirely?".
>> If PMCs were not shared, or if we created clones when passing a PMC
>> from one thread to another, we wouldn't need to worry about locks or
>> safe sharing. Thread-based COW on PMCs would do the same job.
>>
>> If PMCs can only be written from the thread that they originated from,
>> other threads could schedule method/vtable calls as "messages" on the
>> originating thread when updates need to be made. This either raises
>> performance issues, where for every method or vtable call we send a
>> message and yield to allow the message to complete processing, or
>> requires threads to be aware of the shared state of PMCs and to
>> manually wait on some kind of flag until a batch of messages is
>> processed.
>>
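
The owner-thread messaging idea above can be sketched roughly as follows.
This is only an illustration in Python, not Parrot code: OwnedCounter, send,
increment, and run_demo are hypothetical names, and a plain queue plus an
event stands in for whatever scheduler and "flag" Parrot might actually use.
Only the owning thread ever writes the value; other threads queue updates
and then wait once for a whole batch.

```python
import queue
import threading


class OwnedCounter:
    """Hypothetical stand-in for a PMC that only its owner thread mutates."""

    def __init__(self):
        self.value = 0
        self._inbox = queue.Queue()
        self._owner = threading.Thread(target=self._run, daemon=True)
        self._owner.start()

    def _run(self):
        # Owner loop: apply queued updates in FIFO order, then signal the sender.
        while True:
            msg = self._inbox.get()
            if msg is None:  # shutdown sentinel
                break
            func, done = msg
            func(self)
            done.set()

    def send(self, func):
        # Schedule func on the owner thread; the returned event is the "flag".
        done = threading.Event()
        self._inbox.put((func, done))
        return done

    def close(self):
        self._inbox.put(None)
        self._owner.join()


def increment(counter):
    counter.value += 1


def run_demo(n_threads=4, n_updates=1000):
    counter = OwnedCounter()

    def worker():
        last = None
        for _ in range(n_updates):
            last = counter.send(increment)
        # Batch wait: the inbox is FIFO with a single consumer, so once the
        # last message is done, every earlier one from this worker is too.
        if last is not None:
            last.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    counter.close()
    return counter.value
```

Waiting only on the last event is the "batch" variant; calling wait() after
every send would be the yield-per-call variant, with the performance cost
mentioned above.
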
>> We really need to consider whether we want PMCs to be transparently
>> modifiable by reference across multiple threads. If they are, we need
>> a system for managing either locks or atomic transactions, up to and
>> maybe including some kind of GIL. If they are not, we need to consider
>> a system for messaging.
>>
>> I don't think we're going to need to have any kind of system in place
>> for Chandon to continue his work and even reach a successful
>> conclusion. However, without a mechanism for data sharing any uses of
>> threads will need to either explicitly avoid data sharing entirely or
>> take the risk of crashing with fire.
>>
>> --Andrew Whitworth
>>
>
> For another example, see
> http://lists.parrot.org/pipermail/parrot-dev/2010-May/004238.html
>
> François
>
>
>> _______________________________________________
>> http://lists.parrot.org/mailman/listinfo/parrot-dev
>>
>