Adding a serial collection-based execution mode

Stephan Ewen
Hi everyone!

I have started with a patch that introduces *"local collection-based
execution"* of Flink programs.

Since Flink is a layered system, programs written against the APIs can be
executed in a variety of ways. In this case, we just run the functions
single-threaded directly on Java collections, instead of firing up
memory management, IPC, parallel workers, data movement, etc. That gives
programs a minimal execution footprint (like in the Java 8 Streams API) for
small data. The idea is to enable users to use the same program in all
sorts of different contexts.

The collection execution sits below the common API, so both the Java and
Scala APIs can use it. I have implemented the basic framework in
https://github.com/StephanEwen/incubator-flink/commits/collections

I am looking for people who would be interested in implementing a
collection-based version of one operator. This is very simple compared to
Flink's distributed runtime.

An example of how to do this is in the MapOperator (
https://github.com/StephanEwen/incubator-flink/blob/a806609bc85c56b87fea16a985c4df3152c3b955/flink-core/src/main/java/org/apache/flink/api/common/operators/base/MapOperatorBase.java
)
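
For readers without access to the linked branch, the idea can be illustrated with a minimal sketch. It makes no claim about the actual MapOperatorBase code: the class `CollectionMapSketch` and the method `runOnCollection` are hypothetical names, and `java.util.function.Function` stands in for the user's map function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not Flink code: a map operator that runs directly
// on a Java collection, single-threaded, with no runtime started.
final class CollectionMapSketch {

    // Applies the user function to every element, in input order.
    static <IN, OUT> List<OUT> runOnCollection(Function<IN, OUT> fn, List<IN> input) {
        List<OUT> result = new ArrayList<>(input.size());
        for (IN element : input) {
            result.add(fn.apply(element));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> doubled = runOnCollection(x -> x * 2, Arrays.asList(1, 2, 3));
        System.out.println(doubled); // prints [2, 4, 6]
    }
}
```

The whole "execution" is one loop over a list, which is why the footprint is so small for small inputs.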

Let me know what operators you would be interested in. I'll work on
supporting Broadcast variables and Iterations.

Just look where the code has compilation errors; that is where the operator
implementations are still missing.

Greetings,
Stephan

Re: Adding a serial collection-based execution mode

Ufuk Celebi-2

On 22 Sep 2014, at 15:37, Stephan Ewen <[hidden email]> wrote:

> Hi everyone!
>
> I have started with a patch that introduces *"local collection-based
> execution"* of Flink programs.

Very nice!

To have a list of missing operators:

- flatMap <= I'll take this
- filter <= I'll take this
- reduce <= I'll take this
- groupreduce
- cogroup
- join
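
As a rough illustration of what the three simplest operators in the list above amount to on plain collections, here is a sketch. All names (`SimpleOperatorsSketch`, `flatMap`, `filter`, `reduce`) are hypothetical, and the functional interfaces from `java.util.function` stand in for Flink's user-function types. The non-grouped reduce is assumed to yield at most one element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch, not Flink code.
final class SimpleOperatorsSketch {

    // flatMap: each input element may produce zero or more output elements.
    static <IN, OUT> List<OUT> flatMap(Function<IN, List<OUT>> fn, List<IN> input) {
        List<OUT> result = new ArrayList<>();
        for (IN element : input) {
            result.addAll(fn.apply(element));
        }
        return result;
    }

    // filter: keep only the elements for which the predicate holds.
    static <T> List<T> filter(Predicate<T> predicate, List<T> input) {
        List<T> result = new ArrayList<>();
        for (T element : input) {
            if (predicate.test(element)) {
                result.add(element);
            }
        }
        return result;
    }

    // reduce (non-grouped): fold all elements pairwise into at most one result.
    static <T> List<T> reduce(BinaryOperator<T> fn, List<T> input) {
        List<T> result = new ArrayList<>();
        Iterator<T> it = input.iterator();
        if (!it.hasNext()) {
            return result; // empty input gives empty output
        }
        T accumulator = it.next();
        while (it.hasNext()) {
            accumulator = fn.apply(accumulator, it.next());
        }
        result.add(accumulator);
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be", "or not");
        List<String> words = flatMap(line -> Arrays.asList(line.split(" ")), lines);
        List<String> shortWords = filter(w -> w.length() == 2, words);
        List<Integer> sum = reduce((a, b) -> a + b, Arrays.asList(1, 2, 3));
        System.out.println(words);
        System.out.println(shortWords);
        System.out.println(sum);
    }
}
```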

Re: Adding a serial collection-based execution mode

till.rohrmann
Then I'll take the join operation.
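
For context, a join over two in-memory collections can be sketched as a simple hash join. This is only one possible approach under stated assumptions and is not taken from the branch: `CollectionJoinSketch` and its `join` method are hypothetical, keys are assumed to implement `equals` and `hashCode`, and only an inner equi-join is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch, not Flink code: single-threaded inner equi-join.
final class CollectionJoinSketch {

    static <L, R, K, OUT> List<OUT> join(
            List<L> left, List<R> right,
            Function<L, K> leftKey, Function<R, K> rightKey,
            BiFunction<L, R, OUT> joinFn) {
        // Build phase: index the right input by its key.
        Map<K, List<R>> index = new HashMap<>();
        for (R r : right) {
            index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
        }
        // Probe phase: look up every left element and emit one result per match.
        List<OUT> result = new ArrayList<>();
        for (L l : left) {
            List<R> matches = index.get(leftKey.apply(l));
            if (matches != null) {
                for (R r : matches) {
                    result.add(joinFn.apply(l, r));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> joined = join(
                Arrays.asList(1, 2, 3), Arrays.asList("a", "bb", "cc"),
                x -> x, s -> s.length(), (x, s) -> x + s);
        System.out.println(joined); // prints [1a, 2bb, 2cc]
    }
}
```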

On Mon, Sep 22, 2014 at 3:44 PM, Ufuk Celebi <[hidden email]> wrote:

>
> Very nice!
>
> To have a list of missing operators:
>
> - flatMap <= I'll take this
> - filter <= I'll take this
> - reduce <= I'll take this
> - groupreduce
> - cogroup
> - join
>

Re: Adding a serial collection-based execution mode

Aljoscha Krettek-2
I'll do groupReduce
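
A group-wise reduce on a collection can be sketched as "group by key, then call the function once per group". The sketch below is an assumption-laden simplification, not the implementation from the branch: `CollectionGroupReduceSketch` is a hypothetical name, and the group function returns exactly one value per group, whereas a real group-reduce function may emit any number of results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not Flink code.
final class CollectionGroupReduceSketch {

    static <T, K, OUT> List<OUT> groupReduce(
            List<T> input, Function<T, K> keyFn, Function<List<T>, OUT> groupFn) {
        // LinkedHashMap keeps groups in first-seen key order, so output is deterministic.
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T element : input) {
            groups.computeIfAbsent(keyFn.apply(element), k -> new ArrayList<>()).add(element);
        }
        // Call the user function once per complete group.
        List<OUT> result = new ArrayList<>();
        for (List<T> group : groups.values()) {
            result.add(groupFn.apply(group));
        }
        return result;
    }

    public static void main(String[] args) {
        // Count words per word length.
        List<Integer> counts = groupReduce(
                Arrays.asList("a", "bb", "c", "dd", "eee"),
                s -> s.length(), group -> group.size());
        System.out.println(counts); // prints [2, 2, 1]
    }
}
```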

On Mon, Sep 22, 2014 at 4:13 PM, Till Rohrmann <[hidden email]> wrote:

> Then I'll take the join operation.

Re: Adding a serial collection-based execution mode

Stephan Ewen
Hi!

I added a new branch (collections2) that includes the commit moving the
typeutils to "flink-core" and including the code for the runtime context.

Those who need any of that, please rebase your branch...

Stephan


On Mon, Sep 22, 2014 at 4:33 PM, Aljoscha Krettek <[hidden email]>
wrote:

> I'll do groupReduce

Re: Adding a serial collection-based execution mode

Stephan Ewen
I'll take "cross". Need it for tests...

On Mon, Sep 22, 2014 at 5:48 PM, Stephan Ewen <[hidden email]> wrote:

> Hi!
>
> I added a new branch (collections2) that includes the commit moving the
> typeutils to "flink-core" and including the code for the runtime context.
>
> Those who need any of that, please rebase your branch...
>
> Stephan

Re: Adding a serial collection-based execution mode

Ufuk Celebi-2

On 22 Sep 2014, at 19:08, Stephan Ewen <[hidden email]> wrote:

> I'll take "cross". Need it for tests...

Aljoscha did the reduce. I will do CoGroup.

Re: Adding a serial collection-based execution mode

Stephan Ewen
Will do mapPartition...

On Mon, Sep 22, 2014 at 7:34 PM, Ufuk Celebi <[hidden email]> wrote:

> Aljoscha did the reduce. I will do CoGroup.
>

Re: Adding a serial collection-based execution mode

Stephan Ewen
What do you think of changing the TestBase such that all new test programs
are not only executed with the LocalExecutor, but also with the Collection
execution environment? Would give us massive coverage for the collection
execution for free...
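
The proposal can be pictured as a test helper that runs one program under several execution modes and checks that every mode yields the same result. The sketch below only illustrates that pattern: `DualModeCheckSketch` and `firstMismatch` are hypothetical names, each mode is modeled as a plain `Supplier`, and nothing here reflects the real TestBase or LocalExecutor API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, not Flink test code.
final class DualModeCheckSketch {

    // Runs the program once per execution mode and returns the name of the first
    // mode whose result differs from the expected one, or null if all modes agree.
    // Results are compared after sorting, because a parallel run need not preserve order.
    static <T extends Comparable<T>> String firstMismatch(
            List<T> expected, Map<String, Supplier<List<T>>> modes) {
        List<T> sortedExpected = new ArrayList<>(expected);
        Collections.sort(sortedExpected);
        for (Map.Entry<String, Supplier<List<T>>> mode : modes.entrySet()) {
            List<T> actual = new ArrayList<>(mode.getValue().get());
            Collections.sort(actual);
            if (!sortedExpected.equals(actual)) {
                return mode.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Supplier<List<Integer>>> modes = new LinkedHashMap<>();
        modes.put("local", () -> Arrays.asList(3, 1, 2));      // stand-in for a parallel run
        modes.put("collection", () -> Arrays.asList(1, 2, 3)); // stand-in for the serial run
        System.out.println(firstMismatch(Arrays.asList(1, 2, 3), modes)); // prints null
    }
}
```

With a helper of this shape, every existing test program would exercise the collection execution without any per-test changes.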

On Mon, Sep 22, 2014 at 11:09 PM, Stephan Ewen <[hidden email]> wrote:

> Will do mapPartition...

Re: Adding a serial collection-based execution mode

Aljoscha Krettek-2
Sounds like a good idea. But why only "new test programs"? The
existing ones would also run on Collections, right?

On Tue, Sep 23, 2014 at 1:28 AM, Stephan Ewen <[hidden email]> wrote:

> What do you think of changing the TestBase such that all new test programs
> are not only executed with the LocalExecutor, but also with the Collection
> execution environment? Would give us massive coverage for the collection
> execution for free...

Re: Adding a serial collection-based execution mode

Fabian Hueske
+1

2014-09-23 7:03 GMT+02:00 Aljoscha Krettek <[hidden email]>:

> Sounds like a good idea. But why only "new test programs"? The
> existing ones would also run on Collections, right?

Re: Adding a serial collection-based execution mode

till.rohrmann
Great idea. +1
On Sep 23, 2014 9:28 AM, "Fabian Hueske" <[hidden email]> wrote:

> +1

Re: Adding a serial collection-based execution mode

Gyula Fóra
In reply to this post by Aljoscha Krettek-2
+1
On 2014.09.23 at 7:04, "Aljoscha Krettek" <[hidden email]> wrote:

> Sounds like a good idea. But why only "new test programs"? The
> existing ones would also run on Collections, right?