"Blocking Buffered Stream" threading primitive

Jonathan "Duke" Leto jonathan at leto.net
Tue Dec 27 06:42:27 UTC 2011


Howdy,

In general, I am +1 to this and think you should fork parrot.git,
create a branch,
and start implementing something which we can give more specific feedback on.

I assume that you will need to create a new PMC, possibly something that
inherits from FileHandle or Socket. I would wager that whiteknight++ will have
something to say about this soon...
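
As a very rough sketch of the shape I mean (written from memory, so the
exact pmclass syntax, the parent class, and every name below are guesses
that need checking against src/pmc/filehandle.pmc -- this is not working
code), such a PMC might start out like:

    pmclass BlockingBufferedStream extends FileHandle auto_attrs {
        /* placeholder attributes: a fixed-size buffer plus how much
           of it is currently filled */
        ATTR STRING *buffer;
        ATTR INTVAL  max_size;
        ATTR INTVAL  used;

        VTABLE void init() {
            /* set up the empty buffer */
        }

        METHOD read(INTVAL bytes) {
            /* park the current task (not the OS thread) until data
               is available, then return up to `bytes` bytes */
        }

        METHOD write(STRING *data) {
            /* park the current task while the buffer is full */
        }
    }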

Duke

On Mon, Dec 26, 2011 at 8:42 AM, Daniel Ruoso <daniel at ruoso.com> wrote:
> [Nicholas, sorry for the duplicate, I assumed parrot-dev was in the reply-to]
> 2011/12/26 Nicholas Clark <nick at ccl4.org>:
>> On Mon, Dec 26, 2011 at 08:30:29AM -0500, Daniel Ruoso wrote:
>>> use the input in map and grep it will run completely in parallel, but if
>>> you keep writing to outside values, it will be implicitly synchronized
>>> (only the access to that value, though; the order in which these accesses
>>> happen is still undefined).
>> That works OK for write-only values, and read-only values, I think.
>> But surely either performance or consistency is going to be impossible if
>> the closures both read and write from the same thing?
>> (eg something *like* a counter that's auto-incremented to provide a unique ID
>> for each item processed, but more complex in that the closure reads from it,
>> does something and then writes back. Or does Perl 6 specify that the auto-
>> threading of grep and map is such that if you "do" side effects, it's your
>> responsibility to lock shared resources to avoid race conditions)
>
> The understanding I have is that any expectation of global consistency
> in such concurrent operations is your responsibility. The
> example of an auto-incremented value is perfect in that respect. The
> auto-increment would have to serialize access to the global
> counter itself; it's not the language that's going to do it.
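>
> Just to illustrate the kind of serialization I mean, here is a plain C
> sketch with pthreads (nothing Parrot-specific, and all the names are
> made up):
>
>     #include <pthread.h>
>
>     /* unique-ID counter shared by closures that may run concurrently:
>        the counter serializes itself; the language makes no promise
>        about the order in which the IDs are handed out */
>     static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
>     static long            next_id = 0;
>
>     long get_unique_id(void) {
>         long id;
>         pthread_mutex_lock(&id_lock);   /* only this access is guarded */
>         id = next_id++;
>         pthread_mutex_unlock(&id_lock);
>         return id;                      /* some unique id, in no particular order */
>     }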
>
>>> That is not the case: map and grep will not run in order, and it is
>>> generally accepted that such constructs will run asynchronously. So,
>>> while it is not forbidden to have side effects, the user will know
>>> that the result is not guaranteed to have any ordering consistency.
>> Yes. But I'm wondering about what the rules are for side effects that the
>> programmer coded that are fine with things happening one at a time in an
>> arbitrary order, but break when more than one thing happens at once.
>> (The same bugs that become exposed when a "multi threaded" program is run
>> for the first time on a multicore machine)
>> Is it explicit that "the programmer has to assume that grep and map can
>> run the closure *concurrently*", and take responsibility for avoiding
>> the consequences? I'm thinking that this *is* the case, from what you
>> write below:
>
> Consuming items in map concurrently is a bit beyond what I addressed
> here. My point was about a chained map and grep, so the map closure
> would run concurrently with the grep closure. But, as far as I
> understand, and from all the times Larry spoke about it in #perl6, the
> user should expect even the case where the map closure is run
> concurrently to consume items.
>
> From what I understand, every list operation, unless explicitly stated
> otherwise, is a possible candidate for concurrent evaluation.
>
>>> I think the parameters on how the tasks will be spread on different OS
>>> threads is something to be fine-tuned (ghc does that, for instance).
>> How straightforward is it to steal stuff from GHC? In that, Haskell is a
>> functional language, so (my understanding is that) unlike Perl 6, variables
>> don't vary. And knowing that things don't vary will let you make assumptions
>> about what is cheap, and where various trade offs lie. Which may not be the
>> same trade offs that one should make if one is coping with variables and
>> side effects. Or is that not really relevant for this topic?
>
> That will probably become relevant once this is actually implemented
> and we need to decide on the best approach. At this point, I think the
> abstract idea is to not expose such controls as high-level language
> constructs and to leave that as a runtime decision (even if via
> command-line arguments). Of course there will be ways for the user to
> enforce any setup, but the implicit threading should not presume that.
>
>>> That is why I'm considering the idea of a "blocking buffered stream"
>>> VM primitive (and why this thread is on this list), since basically
>>> this will provide a non-OS way to implement blocking reads and writes,
>>> which will affect how the scheduler chooses which task to run. As I
>>> said in the original post, I think the only way to make it efficient
>>> is doing it at the VM level.
>> Yes, my hunch is that it's going to be hard to do it well at a higher
>> level than the fabric of the VM. But as I'm not intimately familiar with
>> Parrot, reality may well prove me wrong on this.
>
> I could even try to implement a Proof of Concept if someone gave me
> some pointers on where to start and what to look for...
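>
> To make that a bit more concrete, the core of it is basically a bounded
> blocking buffer. Here is a minimal user-space sketch in plain C with
> pthreads (the real primitive would park the green task inside the VM
> scheduler instead of blocking the OS thread, and none of these names
> are existing Parrot APIs):
>
>     #include <pthread.h>
>     #include <string.h>
>
>     #define BUF_SIZE 4096
>
>     typedef struct {
>         char            data[BUF_SIZE];
>         size_t          used;
>         pthread_mutex_t lock;
>         pthread_cond_t  readable, writable;
>     } bb_stream;
>
>     void bb_init(bb_stream *s) {
>         s->used = 0;
>         pthread_mutex_init(&s->lock, NULL);
>         pthread_cond_init(&s->readable, NULL);
>         pthread_cond_init(&s->writable, NULL);
>     }
>
>     /* write blocks while the buffer is full (assumes len <= BUF_SIZE) */
>     void bb_write(bb_stream *s, const char *src, size_t len) {
>         pthread_mutex_lock(&s->lock);
>         while (s->used + len > BUF_SIZE)
>             pthread_cond_wait(&s->writable, &s->lock);
>         memcpy(s->data + s->used, src, len);
>         s->used += len;
>         pthread_cond_signal(&s->readable);
>         pthread_mutex_unlock(&s->lock);
>     }
>
>     /* read blocks while the buffer is empty -- exactly the point where
>        the scheduler would pick another task to run */
>     size_t bb_read(bb_stream *s, char *dst, size_t max) {
>         size_t n;
>         pthread_mutex_lock(&s->lock);
>         while (s->used == 0)
>             pthread_cond_wait(&s->readable, &s->lock);
>         n = (s->used < max) ? s->used : max;
>         memcpy(dst, s->data, n);
>         memmove(s->data, s->data + n, s->used - n);
>         s->used -= n;
>         pthread_cond_signal(&s->writable);
>         pthread_mutex_unlock(&s->lock);
>         return n;
>     }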
>
> daniel



-- 
Jonathan "Duke" Leto <jonathan at leto.net>
Leto Labs LLC
209.691.DUKE // http://labs.leto.net
NOTE: Personal email is only checked twice a day at 10am/2pm PST,
please call/text for time-sensitive matters.

