[DISCUSS] FLIP-2 Extending Window Function Metadata

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
Hi,
this is a proposal to introduce a new interface for the window function to
make it more extensible for the future where we might want to provide
additional information about why a window fired to the user function:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Window+Function+Metadata

I'd appreciate your thoughts!

Cheers,
Aljoscha
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Stephan Ewen
Thanks for opening this.

I see the need for having an extensible context object for window function
invocations, but i think hiding every parameter in the context is a bit
unnatural.

How about having a function "apply(Key, Values, WindowContext, Collector)" ?
It should be possible to write the straightforward use cases without
accessing the context object.



On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek <[hidden email]>
wrote:

> Hi,
> this is a proposal to introduce a new interface for the window function to
> make it more extensible for the future where we might want to provide
> additional information about why a window fired to the user function:
>
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Window+Function+Metadata
>
> I'd appreciate your thoughts!
>
> Cheers,
> Aljoscha
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
Sure, I also thought about this but went for the "extreme" initially. If
no-one objects I'll update the doc in a bit.

On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:

> Thanks for opening this.
>
> I see the need for having an extensible context object for window function
> invocations, but i think hiding every parameter in the context is a bit
> unnatural.
>
> How about having a function "apply(Key, Values, WindowContext, Collector)"
> ?
> It should be possible to write the straightforward use cases without
> accessing the context object.
>
>
>
> On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek <[hidden email]>
> wrote:
>
> > Hi,
> > this is a proposal to introduce a new interface for the window function
> to
> > make it more extensible for the future where we might want to provide
> > additional information about why a window fired to the user function:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Window+Function+Metadata
> >
> > I'd appreciate your thoughts!
> >
> > Cheers,
> > Aljoscha
> >
>
Reply | Threaded
Open this post in threaded view
|

RE: [DISCUSS] FLIP-2 Extending Window Function Metadata

Radu Tudoran
Hi,

If it is to extend the Context to pass more information between the stages of processing a window (triggering -> process -> eviction), why not adding also a "EvictionInfo"? I think this might actually help with the issues discussed in the tread related to the eviction policy. I could imagine using this parameter to pass the conditions, from the processing stage to the evictor, about what events to be eliminated.




public abstract class Context {

   public abstract EvictionInfo evictionInfo();

...


   public abstract KEY key();

   public abstract W window();

   public abstract int id();

   public abstract FiringInfo firingInfo();

   public abstract Iterable<IN> elements();

   public abstract void output(OUT value);

}


Also on a slightly unrelated issue - how hard it would be to introduce different types of buffers for the windows. Currently the existing one is behaving (when under processing) similar with a FIFO queue (in the sense that you need to start from beginning, from the oldest element). How about enabling for example also LIFO behavior (start iterating through the list from the most recent element). As in the source queues or stacks are not actually used, perhaps we can just pass policies to the iterator - or have custom itrators






Dr. Radu Tudoran
Research Engineer - Big Data Expert
IT R&D Division


HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München

E-mail: [hidden email]
Mobile: +49 15209084330
Telephone: +49 891588344173

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!


-----Original Message-----
From: Aljoscha Krettek [mailto:[hidden email]]
Sent: Wednesday, July 13, 2016 2:24 PM
To: [hidden email]
Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Sure, I also thought about this but went for the "extreme" initially. If no-one objects I'll update the doc in a bit.

On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:

> Thanks for opening this.
>
> I see the need for having an extensible context object for window
> function invocations, but i think hiding every parameter in the
> context is a bit unnatural.
>
> How about having a function "apply(Key, Values, WindowContext, Collector)"
> ?
> It should be possible to write the straightforward use cases without
> accessing the context object.
>
>
>
> On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> <[hidden email]>
> wrote:
>
> > Hi,
> > this is a proposal to introduce a new interface for the window
> > function
> to
> > make it more extensible for the future where we might want to
> > provide additional information about why a window fired to the user function:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Win
> dow+Function+Metadata
> >
> > I'd appreciate your thoughts!
> >
> > Cheers,
> > Aljoscha
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
I incorporated the changes. The proposed interface of ProcessWindowFunction
is now this:

public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
Window> implements Function {

    public abstract void process(KEY key, Iterable<IN> elements, Context
ctx) throws Exception;

    public abstract class Context {
        public abstract W window();
        public abstract void output(OUT value);
    }
}

I'm proposing to not expose Collector anymore because it has the close()
method that should not be called by users. Having the output() call
directly on the context should work just as well.

Also, I marked the "adding a firing reason" and "adding firing counter" as
future work that are only examples of stuff that can be implemented on top
of the new interface. Initially, this will provide exactly the same
features as the old API but be extensible. I did this to not make the scope
of this proposal to big because Radu also suggested more changes and each
of them should be covered in a separate design doc or FLIP.

@Radu: On the different buffer types. I think this would be very tricky.
Right now, people should also not rely on the fact that elements are
"FIFO". Some state backends might keep the elements in a different order
and when you have merging windows/session windows the order of the elements
will also not be preserved.

Cheers,
Aljoscha

On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]> wrote:

> Hi,
>
> If it is to extend the Context to pass more information between the stages
> of processing a window (triggering -> process -> eviction), why not adding
> also a "EvictionInfo"? I think this might actually help with the issues
> discussed in the tread related to the eviction policy. I could imagine
> using this parameter to pass the conditions, from the processing stage to
> the evictor, about what events to be eliminated.
>
>
>
>
> public abstract class Context {
>
>    public abstract EvictionInfo evictionInfo();
>
> ...
>
>
>    public abstract KEY key();
>
>    public abstract W window();
>
>    public abstract int id();
>
>    public abstract FiringInfo firingInfo();
>
>    public abstract Iterable<IN> elements();
>
>    public abstract void output(OUT value);
>
> }
>
>
> Also on a slightly unrelated issue - how hard it would be to introduce
> different types of buffers for the windows. Currently the existing one is
> behaving (when under processing) similar with a FIFO queue (in the sense
> that you need to start from beginning, from the oldest element). How about
> enabling for example also LIFO behavior (start iterating through the list
> from the most recent element). As in the source queues or stacks are not
> actually used, perhaps we can just pass policies to the iterator - or have
> custom itrators
>
>
>
>
>
>
> Dr. Radu Tudoran
> Research Engineer - Big Data Expert
> IT R&D Division
>
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> European Research Center
> Riesstrasse 25, 80992 München
>
> E-mail: [hidden email]
> Mobile: +49 15209084330
> Telephone: +49 891588344173
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
> Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
> Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
> This e-mail and its attachments contain confidential information from
> HUAWEI, which is intended only for the person or entity whose address is
> listed above. Any use of the information contained herein in any way
> (including, but not limited to, total or partial disclosure, reproduction,
> or dissemination) by persons other than the intended recipient(s) is
> prohibited. If you receive this e-mail in error, please notify the sender
> by phone or email immediately and delete it!
>
>
> -----Original Message-----
> From: Aljoscha Krettek [mailto:[hidden email]]
> Sent: Wednesday, July 13, 2016 2:24 PM
> To: [hidden email]
> Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>
> Sure, I also thought about this but went for the "extreme" initially. If
> no-one objects I'll update the doc in a bit.
>
> On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
>
> > Thanks for opening this.
> >
> > I see the need for having an extensible context object for window
> > function invocations, but i think hiding every parameter in the
> > context is a bit unnatural.
> >
> > How about having a function "apply(Key, Values, WindowContext,
> Collector)"
> > ?
> > It should be possible to write the straightforward use cases without
> > accessing the context object.
> >
> >
> >
> > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > <[hidden email]>
> > wrote:
> >
> > > Hi,
> > > this is a proposal to introduce a new interface for the window
> > > function
> > to
> > > make it more extensible for the future where we might want to
> > > provide additional information about why a window fired to the user
> function:
> > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+Win
> > dow+Function+Metadata
> > >
> > > I'd appreciate your thoughts!
> > >
> > > Cheers,
> > > Aljoscha
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

RE: [DISCUSS] FLIP-2 Extending Window Function Metadata

Radu Tudoran
Hi,

I think it looks good and most importantly is that we can extend it in the directions discussed so far.

One question though regarding the Collector - are we going to be able to delete random elements from the list if this is not exposed as a collection, at least to the evictor? If not, how are we going to extend in the future to cover this case?

Regarding the ordering - I also observed that there are situations where elements do not have a logical order. One example is if you have high rates of the events. Nevertheless, even if now is not the time for this, I think in the future we can imagine having also some data structures that offer some ordering. It can save some computation efforts later in the functions for some use cases.


Best regards,


-----Original Message-----
From: Aljoscha Krettek [mailto:[hidden email]]
Sent: Monday, July 18, 2016 3:45 PM
To: [hidden email]
Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

I incorporated the changes. The proposed interface of ProcessWindowFunction is now this:

public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
Window> implements Function {

    public abstract void process(KEY key, Iterable<IN> elements, Context
ctx) throws Exception;

    public abstract class Context {
        public abstract W window();
        public abstract void output(OUT value);
    }
}

I'm proposing to not expose Collector anymore because it has the close() method that should not be called by users. Having the output() call directly on the context should work just as well.

Also, I marked the "adding a firing reason" and "adding firing counter" as future work that are only examples of stuff that can be implemented on top of the new interface. Initially, this will provide exactly the same features as the old API but be extensible. I did this to not make the scope of this proposal to big because Radu also suggested more changes and each of them should be covered in a separate design doc or FLIP.

@Radu: On the different buffer types. I think this would be very tricky.
Right now, people should also not rely on the fact that elements are "FIFO". Some state backends might keep the elements in a different order and when you have merging windows/session windows the order of the elements will also not be preserved.

Cheers,
Aljoscha

On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]> wrote:

> Hi,
>
> If it is to extend the Context to pass more information between the
> stages of processing a window (triggering -> process -> eviction), why
> not adding also a "EvictionInfo"? I think this might actually help
> with the issues discussed in the tread related to the eviction policy.
> I could imagine using this parameter to pass the conditions, from the
> processing stage to the evictor, about what events to be eliminated.
>
>
>
>
> public abstract class Context {
>
>    public abstract EvictionInfo evictionInfo();
>
> ...
>
>
>    public abstract KEY key();
>
>    public abstract W window();
>
>    public abstract int id();
>
>    public abstract FiringInfo firingInfo();
>
>    public abstract Iterable<IN> elements();
>
>    public abstract void output(OUT value);
>
> }
>
>
> Also on a slightly unrelated issue - how hard it would be to introduce
> different types of buffers for the windows. Currently the existing one
> is behaving (when under processing) similar with a FIFO queue (in the
> sense that you need to start from beginning, from the oldest element).
> How about enabling for example also LIFO behavior (start iterating
> through the list from the most recent element). As in the source
> queues or stacks are not actually used, perhaps we can just pass
> policies to the iterator - or have custom itrators
>
>
>
>
>
>
> Dr. Radu Tudoran
> Research Engineer - Big Data Expert
> IT R&D Division
>
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> European Research Center
> Riesstrasse 25, 80992 München
>
> E-mail: [hidden email]
> Mobile: +49 15209084330
> Telephone: +49 891588344173
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com Registered
> Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
> Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
> Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
> its attachments contain confidential information from HUAWEI, which is
> intended only for the person or entity whose address is listed above.
> Any use of the information contained herein in any way (including, but
> not limited to, total or partial disclosure, reproduction, or
> dissemination) by persons other than the intended recipient(s) is
> prohibited. If you receive this e-mail in error, please notify the
> sender by phone or email immediately and delete it!
>
>
> -----Original Message-----
> From: Aljoscha Krettek [mailto:[hidden email]]
> Sent: Wednesday, July 13, 2016 2:24 PM
> To: [hidden email]
> Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>
> Sure, I also thought about this but went for the "extreme" initially.
> If no-one objects I'll update the doc in a bit.
>
> On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
>
> > Thanks for opening this.
> >
> > I see the need for having an extensible context object for window
> > function invocations, but i think hiding every parameter in the
> > context is a bit unnatural.
> >
> > How about having a function "apply(Key, Values, WindowContext,
> Collector)"
> > ?
> > It should be possible to write the straightforward use cases without
> > accessing the context object.
> >
> >
> >
> > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > <[hidden email]>
> > wrote:
> >
> > > Hi,
> > > this is a proposal to introduce a new interface for the window
> > > function
> > to
> > > make it more extensible for the future where we might want to
> > > provide additional information about why a window fired to the
> > > user
> function:
> > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+W
> > in
> > dow+Function+Metadata
> > >
> > > I'd appreciate your thoughts!
> > >
> > > Cheers,
> > > Aljoscha
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
What about the collector? This is only used for emitting elements to the
downstream operation.

On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]> wrote:

> Hi,
>
> I think it looks good and most importantly is that we can extend it in the
> directions discussed so far.
>
> One question though regarding the Collector - are we going to be able to
> delete random elements from the list if this is not exposed as a
> collection, at least to the evictor? If not, how are we going to extend in
> the future to cover this case?
>
> Regarding the ordering - I also observed that there are situations where
> elements do not have a logical order. One example is if you have high rates
> of the events. Nevertheless, even if now is not the time for this, I think
> in the future we can imagine having also some data structures that offer
> some ordering. It can save some computation efforts later in the functions
> for some use cases.
>
>
> Best regards,
>
>
> -----Original Message-----
> From: Aljoscha Krettek [mailto:[hidden email]]
> Sent: Monday, July 18, 2016 3:45 PM
> To: [hidden email]
> Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>
> I incorporated the changes. The proposed interface of
> ProcessWindowFunction is now this:
>
> public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
> Window> implements Function {
>
>     public abstract void process(KEY key, Iterable<IN> elements, Context
> ctx) throws Exception;
>
>     public abstract class Context {
>         public abstract W window();
>         public abstract void output(OUT value);
>     }
> }
>
> I'm proposing to not expose Collector anymore because it has the close()
> method that should not be called by users. Having the output() call
> directly on the context should work just as well.
>
> Also, I marked the "adding a firing reason" and "adding firing counter" as
> future work that are only examples of stuff that can be implemented on top
> of the new interface. Initially, this will provide exactly the same
> features as the old API but be extensible. I did this to not make the scope
> of this proposal to big because Radu also suggested more changes and each
> of them should be covered in a separate design doc or FLIP.
>
> @Radu: On the different buffer types. I think this would be very tricky.
> Right now, people should also not rely on the fact that elements are
> "FIFO". Some state backends might keep the elements in a different order
> and when you have merging windows/session windows the order of the elements
> will also not be preserved.
>
> Cheers,
> Aljoscha
>
> On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]> wrote:
>
> > Hi,
> >
> > If it is to extend the Context to pass more information between the
> > stages of processing a window (triggering -> process -> eviction), why
> > not adding also a "EvictionInfo"? I think this might actually help
> > with the issues discussed in the tread related to the eviction policy.
> > I could imagine using this parameter to pass the conditions, from the
> > processing stage to the evictor, about what events to be eliminated.
> >
> >
> >
> >
> > public abstract class Context {
> >
> >    public abstract EvictionInfo evictionInfo();
> >
> > ...
> >
> >
> >    public abstract KEY key();
> >
> >    public abstract W window();
> >
> >    public abstract int id();
> >
> >    public abstract FiringInfo firingInfo();
> >
> >    public abstract Iterable<IN> elements();
> >
> >    public abstract void output(OUT value);
> >
> > }
> >
> >
> > Also on a slightly unrelated issue - how hard it would be to introduce
> > different types of buffers for the windows. Currently the existing one
> > is behaving (when under processing) similar with a FIFO queue (in the
> > sense that you need to start from beginning, from the oldest element).
> > How about enabling for example also LIFO behavior (start iterating
> > through the list from the most recent element). As in the source
> > queues or stacks are not actually used, perhaps we can just pass
> > policies to the iterator - or have custom itrators
> >
> >
> >
> >
> >
> >
> > Dr. Radu Tudoran
> > Research Engineer - Big Data Expert
> > IT R&D Division
> >
> >
> > HUAWEI TECHNOLOGIES Duesseldorf GmbH
> > European Research Center
> > Riesstrasse 25, 80992 München
> >
> > E-mail: [hidden email]
> > Mobile: +49 15209084330
> > Telephone: +49 891588344173
> >
> > HUAWEI TECHNOLOGIES Duesseldorf GmbH
> > Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com Registered
> > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
> > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
> > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
> > its attachments contain confidential information from HUAWEI, which is
> > intended only for the person or entity whose address is listed above.
> > Any use of the information contained herein in any way (including, but
> > not limited to, total or partial disclosure, reproduction, or
> > dissemination) by persons other than the intended recipient(s) is
> > prohibited. If you receive this e-mail in error, please notify the
> > sender by phone or email immediately and delete it!
> >
> >
> > -----Original Message-----
> > From: Aljoscha Krettek [mailto:[hidden email]]
> > Sent: Wednesday, July 13, 2016 2:24 PM
> > To: [hidden email]
> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >
> > Sure, I also thought about this but went for the "extreme" initially.
> > If no-one objects I'll update the doc in a bit.
> >
> > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
> >
> > > Thanks for opening this.
> > >
> > > I see the need for having an extensible context object for window
> > > function invocations, but i think hiding every parameter in the
> > > context is a bit unnatural.
> > >
> > > How about having a function "apply(Key, Values, WindowContext,
> > Collector)"
> > > ?
> > > It should be possible to write the straightforward use cases without
> > > accessing the context object.
> > >
> > >
> > >
> > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > <[hidden email]>
> > > wrote:
> > >
> > > > Hi,
> > > > this is a proposal to introduce a new interface for the window
> > > > function
> > > to
> > > > make it more extensible for the future where we might want to
> > > > provide additional information about why a window fired to the
> > > > user
> > function:
> > > >
> > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending+W
> > > in
> > > dow+Function+Metadata
> > > >
> > > > I'd appreciate your thoughts!
> > > >
> > > > Cheers,
> > > > Aljoscha
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

RE: [DISCUSS] FLIP-2 Extending Window Function Metadata

Radu Tudoran
Hi,

Sorry - I made a mistake - I was thinking of getting access to the collection (mist-read :) collector) of events in the window buffer  in order to be able to delete/evict some of them which are not necessary the last ones.


Radu

-----Original Message-----
From: Aljoscha Krettek [mailto:[hidden email]]
Sent: Monday, July 18, 2016 5:54 PM
To: [hidden email]
Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

What about the collector? This is only used for emitting elements to the downstream operation.

On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]> wrote:

> Hi,
>
> I think it looks good and most importantly is that we can extend it in
> the directions discussed so far.
>
> One question though regarding the Collector - are we going to be able
> to delete random elements from the list if this is not exposed as a
> collection, at least to the evictor? If not, how are we going to
> extend in the future to cover this case?
>
> Regarding the ordering - I also observed that there are situations
> where elements do not have a logical order. One example is if you have
> high rates of the events. Nevertheless, even if now is not the time
> for this, I think in the future we can imagine having also some data
> structures that offer some ordering. It can save some computation
> efforts later in the functions for some use cases.
>
>
> Best regards,
>
>
> -----Original Message-----
> From: Aljoscha Krettek [mailto:[hidden email]]
> Sent: Monday, July 18, 2016 3:45 PM
> To: [hidden email]
> Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>
> I incorporated the changes. The proposed interface of
> ProcessWindowFunction is now this:
>
> public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
> Window> implements Function {
>
>     public abstract void process(KEY key, Iterable<IN> elements,
> Context
> ctx) throws Exception;
>
>     public abstract class Context {
>         public abstract W window();
>         public abstract void output(OUT value);
>     }
> }
>
> I'm proposing to not expose Collector anymore because it has the
> close() method that should not be called by users. Having the output()
> call directly on the context should work just as well.
>
> Also, I marked the "adding a firing reason" and "adding firing
> counter" as future work that are only examples of stuff that can be
> implemented on top of the new interface. Initially, this will provide
> exactly the same features as the old API but be extensible. I did this
> to not make the scope of this proposal to big because Radu also
> suggested more changes and each of them should be covered in a separate design doc or FLIP.
>
> @Radu: On the different buffer types. I think this would be very tricky.
> Right now, people should also not rely on the fact that elements are
> "FIFO". Some state backends might keep the elements in a different
> order and when you have merging windows/session windows the order of
> the elements will also not be preserved.
>
> Cheers,
> Aljoscha
>
> On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]> wrote:
>
> > Hi,
> >
> > If it is to extend the Context to pass more information between the
> > stages of processing a window (triggering -> process -> eviction),
> > why not adding also a "EvictionInfo"? I think this might actually
> > help with the issues discussed in the tread related to the eviction policy.
> > I could imagine using this parameter to pass the conditions, from
> > the processing stage to the evictor, about what events to be eliminated.
> >
> >
> >
> >
> > public abstract class Context {
> >
> >    public abstract EvictionInfo evictionInfo();
> >
> > ...
> >
> >
> >    public abstract KEY key();
> >
> >    public abstract W window();
> >
> >    public abstract int id();
> >
> >    public abstract FiringInfo firingInfo();
> >
> >    public abstract Iterable<IN> elements();
> >
> >    public abstract void output(OUT value);
> >
> > }
> >
> >
> > Also on a slightly unrelated issue - how hard it would be to
> > introduce different types of buffers for the windows. Currently the
> > existing one is behaving (when under processing) similar with a FIFO
> > queue (in the sense that you need to start from beginning, from the oldest element).
> > How about enabling for example also LIFO behavior (start iterating
> > through the list from the most recent element). As in the source
> > queues or stacks are not actually used, perhaps we can just pass
> > policies to the iterator - or have custom itrators
> >
> >
> >
> >
> >
> >
> > Dr. Radu Tudoran
> > Research Engineer - Big Data Expert
> > IT R&D Division
> >
> >
> > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> > Riesstrasse 25, 80992 München
> >
> > E-mail: [hidden email]
> > Mobile: +49 15209084330
> > Telephone: +49 891588344173
> >
> > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > Düsseldorf, Germany, www.huawei.com Registered
> > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
> > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
> > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
> > its attachments contain confidential information from HUAWEI, which
> > is intended only for the person or entity whose address is listed above.
> > Any use of the information contained herein in any way (including,
> > but not limited to, total or partial disclosure, reproduction, or
> > dissemination) by persons other than the intended recipient(s) is
> > prohibited. If you receive this e-mail in error, please notify the
> > sender by phone or email immediately and delete it!
> >
> >
> > -----Original Message-----
> > From: Aljoscha Krettek [mailto:[hidden email]]
> > Sent: Wednesday, July 13, 2016 2:24 PM
> > To: [hidden email]
> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >
> > Sure, I also thought about this but went for the "extreme" initially.
> > If no-one objects I'll update the doc in a bit.
> >
> > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
> >
> > > Thanks for opening this.
> > >
> > > I see the need for having an extensible context object for window
> > > function invocations, but i think hiding every parameter in the
> > > context is a bit unnatural.
> > >
> > > How about having a function "apply(Key, Values, WindowContext,
> > Collector)"
> > > ?
> > > It should be possible to write the straightforward use cases
> > > without accessing the context object.
> > >
> > >
> > >
> > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > <[hidden email]>
> > > wrote:
> > >
> > > > Hi,
> > > > this is a proposal to introduce a new interface for the window
> > > > function
> > > to
> > > > make it more extensible for the future where we might want to
> > > > provide additional information about why a window fired to the
> > > > user
> > function:
> > > >
> > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > +W
> > > in
> > > dow+Function+Metadata
> > > >
> > > > I'd appreciate your thoughts!
> > > >
> > > > Cheers,
> > > > Aljoscha
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Stephan Ewen
I would actually make the output a separate parameter as well. Pretty much
like the old variant, only replacing the "Window" parameter by the context
(which contains everything about the window).
It could also be called "WindowInvocationContext" or so.

The current variant looks too Hadoop to me ;-) Everything done on the
context object, and messy mocking when creating tests.

On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <[hidden email]>
wrote:

> Hi,
>
> Sorry - I made a mistake - I was thinking of getting access to the
> collection (mist-read :) collector) of events in the window buffer  in
> order to be able to delete/evict some of them which are not necessary the
> last ones.
>
>
> Radu
>
> -----Original Message-----
> From: Aljoscha Krettek [mailto:[hidden email]]
> Sent: Monday, July 18, 2016 5:54 PM
> To: [hidden email]
> Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>
> What about the collector? This is only used for emitting elements to the
> downstream operation.
>
> On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]> wrote:
>
> > Hi,
> >
> > I think it looks good and most importantly is that we can extend it in
> > the directions discussed so far.
> >
> > One question though regarding the Collector - are we going to be able
> > to delete random elements from the list if this is not exposed as a
> > collection, at least to the evictor? If not, how are we going to
> > extend in the future to cover this case?
> >
> > Regarding the ordering - I also observed that there are situations
> > where elements do not have a logical order. One example is if you have
> > high rates of the events. Nevertheless, even if now is not the time
> > for this, I think in the future we can imagine having also some data
> > structures that offer some ordering. It can save some computation
> > efforts later in the functions for some use cases.
> >
> >
> > Best regards,
> >
> >
> > -----Original Message-----
> > From: Aljoscha Krettek [mailto:[hidden email]]
> > Sent: Monday, July 18, 2016 3:45 PM
> > To: [hidden email]
> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >
> > I incorporated the changes. The proposed interface of
> > ProcessWindowFunction is now this:
> >
> > public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
> > Window> implements Function {
> >
> >     public abstract void process(KEY key, Iterable<IN> elements,
> > Context
> > ctx) throws Exception;
> >
> >     public abstract class Context {
> >         public abstract W window();
> >         public abstract void output(OUT value);
> >     }
> > }
> >
> > I'm proposing to not expose Collector anymore because it has the
> > close() method that should not be called by users. Having the output()
> > call directly on the context should work just as well.
> >
> > Also, I marked the "adding a firing reason" and "adding firing
> > counter" as future work that are only examples of stuff that can be
> > implemented on top of the new interface. Initially, this will provide
> > exactly the same features as the old API but be extensible. I did this
> > to not make the scope of this proposal to big because Radu also
> > suggested more changes and each of them should be covered in a separate
> design doc or FLIP.
> >
> > @Radu: On the different buffer types. I think this would be very tricky.
> > Right now, people should also not rely on the fact that elements are
> > "FIFO". Some state backends might keep the elements in a different
> > order and when you have merging windows/session windows the order of
> > the elements will also not be preserved.
> >
> > Cheers,
> > Aljoscha
> >
> > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]>
> wrote:
> >
> > > Hi,
> > >
> > > If it is to extend the Context to pass more information between the
> > > stages of processing a window (triggering -> process -> eviction),
> > > why not adding also a "EvictionInfo"? I think this might actually
> > > help with the issues discussed in the tread related to the eviction
> policy.
> > > I could imagine using this parameter to pass the conditions, from
> > > the processing stage to the evictor, about what events to be
> eliminated.
> > >
> > >
> > >
> > >
> > > public abstract class Context {
> > >
> > >    public abstract EvictionInfo evictionInfo();
> > >
> > > ...
> > >
> > >
> > >    public abstract KEY key();
> > >
> > >    public abstract W window();
> > >
> > >    public abstract int id();
> > >
> > >    public abstract FiringInfo firingInfo();
> > >
> > >    public abstract Iterable<IN> elements();
> > >
> > >    public abstract void output(OUT value);
> > >
> > > }
> > >
> > >
> > > Also on a slightly unrelated issue - how hard it would be to
> > > introduce different types of buffers for the windows. Currently the
> > > existing one is behaving (when under processing) similar with a FIFO
> > > queue (in the sense that you need to start from beginning, from the
> oldest element).
> > > How about enabling for example also LIFO behavior (start iterating
> > > through the list from the most recent element). As in the source
> > > queues or stacks are not actually used, perhaps we can just pass
> > > policies to the iterator - or have custom itrators
> > >
> > >
> > >
> > >
> > >
> > >
> > > Dr. Radu Tudoran
> > > Research Engineer - Big Data Expert
> > > IT R&D Division
> > >
> > >
> > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> > > Riesstrasse 25, 80992 München
> > >
> > > E-mail: [hidden email]
> > > Mobile: +49 15209084330
> > > Telephone: +49 891588344173
> > >
> > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > > Düsseldorf, Germany, www.huawei.com Registered
> > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
> > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
> > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
> > > its attachments contain confidential information from HUAWEI, which
> > > is intended only for the person or entity whose address is listed
> above.
> > > Any use of the information contained herein in any way (including,
> > > but not limited to, total or partial disclosure, reproduction, or
> > > dissemination) by persons other than the intended recipient(s) is
> > > prohibited. If you receive this e-mail in error, please notify the
> > > sender by phone or email immediately and delete it!
> > >
> > >
> > > -----Original Message-----
> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > To: [hidden email]
> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >
> > > Sure, I also thought about this but went for the "extreme" initially.
> > > If no-one objects I'll update the doc in a bit.
> > >
> > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
> > >
> > > > Thanks for opening this.
> > > >
> > > > I see the need for having an extensible context object for window
> > > > function invocations, but i think hiding every parameter in the
> > > > context is a bit unnatural.
> > > >
> > > > How about having a function "apply(Key, Values, WindowContext,
> > > Collector)"
> > > > ?
> > > > It should be possible to write the straightforward use cases
> > > > without accessing the context object.
> > > >
> > > >
> > > >
> > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > > <[hidden email]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > > this is a proposal to introduce a new interface for the window
> > > > > function
> > > > to
> > > > > make it more extensible for the future where we might want to
> > > > > provide additional information about why a window fired to the
> > > > > user
> > > function:
> > > > >
> > > > >
> > > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > > +W
> > > > in
> > > > dow+Function+Metadata
> > > > >
> > > > > I'd appreciate your thoughts!
> > > > >
> > > > > Cheers,
> > > > > Aljoscha
> > > > >
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
You mean keep the Collector? I don't like that one because it has the
close() method that should never be called by the user.

We can keep it, though, because all the other user function interfaces also
expose it to the user.

On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]> wrote:

> I would actually make the output a separate parameter as well. Pretty much
> like the old variant, only replacing the "Window" parameter by the context
> (which contains everything about the window).
> It could also be called "WindowInvocationContext" or so.
>
> The current variant looks too Hadoop to me ;-) Everything done on the
> context object, and messy mocking when creating tests.
>
> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <[hidden email]>
> wrote:
>
> > Hi,
> >
> > Sorry - I made a mistake - I was thinking of getting access to the
> > collection (mist-read :) collector) of events in the window buffer  in
> > order to be able to delete/evict some of them which are not necessary the
> > last ones.
> >
> >
> > Radu
> >
> > -----Original Message-----
> > From: Aljoscha Krettek [mailto:[hidden email]]
> > Sent: Monday, July 18, 2016 5:54 PM
> > To: [hidden email]
> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >
> > What about the collector? This is only used for emitting elements to the
> > downstream operation.
> >
> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]>
> wrote:
> >
> > > Hi,
> > >
> > > I think it looks good and most importantly is that we can extend it in
> > > the directions discussed so far.
> > >
> > > One question though regarding the Collector - are we going to be able
> > > to delete random elements from the list if this is not exposed as a
> > > collection, at least to the evictor? If not, how are we going to
> > > extend in the future to cover this case?
> > >
> > > Regarding the ordering - I also observed that there are situations
> > > where elements do not have a logical order. One example is if you have
> > > high rates of the events. Nevertheless, even if now is not the time
> > > for this, I think in the future we can imagine having also some data
> > > structures that offer some ordering. It can save some computation
> > > efforts later in the functions for some use cases.
> > >
> > >
> > > Best regards,
> > >
> > >
> > > -----Original Message-----
> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > Sent: Monday, July 18, 2016 3:45 PM
> > > To: [hidden email]
> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >
> > > I incorporated the changes. The proposed interface of
> > > ProcessWindowFunction is now this:
> > >
> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
> > > Window> implements Function {
> > >
> > >     public abstract void process(KEY key, Iterable<IN> elements,
> > > Context
> > > ctx) throws Exception;
> > >
> > >     public abstract class Context {
> > >         public abstract W window();
> > >         public abstract void output(OUT value);
> > >     }
> > > }
> > >
> > > I'm proposing to not expose Collector anymore because it has the
> > > close() method that should not be called by users. Having the output()
> > > call directly on the context should work just as well.
> > >
> > > Also, I marked the "adding a firing reason" and "adding firing
> > > counter" as future work that are only examples of stuff that can be
> > > implemented on top of the new interface. Initially, this will provide
> > > exactly the same features as the old API but be extensible. I did this
> > > to not make the scope of this proposal to big because Radu also
> > > suggested more changes and each of them should be covered in a separate
> > design doc or FLIP.
> > >
> > > @Radu: On the different buffer types. I think this would be very
> tricky.
> > > Right now, people should also not rely on the fact that elements are
> > > "FIFO". Some state backends might keep the elements in a different
> > > order and when you have merging windows/session windows the order of
> > > the elements will also not be preserved.
> > >
> > > Cheers,
> > > Aljoscha
> > >
> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]>
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > If it is to extend the Context to pass more information between the
> > > > stages of processing a window (triggering -> process -> eviction),
> > > > why not adding also a "EvictionInfo"? I think this might actually
> > > > help with the issues discussed in the tread related to the eviction
> > policy.
> > > > I could imagine using this parameter to pass the conditions, from
> > > > the processing stage to the evictor, about what events to be
> > eliminated.
> > > >
> > > >
> > > >
> > > >
> > > > public abstract class Context {
> > > >
> > > >    public abstract EvictionInfo evictionInfo();
> > > >
> > > > ...
> > > >
> > > >
> > > >    public abstract KEY key();
> > > >
> > > >    public abstract W window();
> > > >
> > > >    public abstract int id();
> > > >
> > > >    public abstract FiringInfo firingInfo();
> > > >
> > > >    public abstract Iterable<IN> elements();
> > > >
> > > >    public abstract void output(OUT value);
> > > >
> > > > }
> > > >
> > > >
> > > > Also on a slightly unrelated issue - how hard it would be to
> > > > introduce different types of buffers for the windows. Currently the
> > > > existing one is behaving (when under processing) similar with a FIFO
> > > > queue (in the sense that you need to start from beginning, from the
> > oldest element).
> > > > How about enabling for example also LIFO behavior (start iterating
> > > > through the list from the most recent element). As in the source
> > > > queues or stacks are not actually used, perhaps we can just pass
> > > > policies to the iterator - or have custom itrators
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Dr. Radu Tudoran
> > > > Research Engineer - Big Data Expert
> > > > IT R&D Division
> > > >
> > > >
> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> > > > Riesstrasse 25, 80992 München
> > > >
> > > > E-mail: [hidden email]
> > > > Mobile: +49 15209084330
> > > > Telephone: +49 891588344173
> > > >
> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > > > Düsseldorf, Germany, www.huawei.com Registered
> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
> > > > its attachments contain confidential information from HUAWEI, which
> > > > is intended only for the person or entity whose address is listed
> > above.
> > > > Any use of the information contained herein in any way (including,
> > > > but not limited to, total or partial disclosure, reproduction, or
> > > > dissemination) by persons other than the intended recipient(s) is
> > > > prohibited. If you receive this e-mail in error, please notify the
> > > > sender by phone or email immediately and delete it!
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > > To: [hidden email]
> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >
> > > > Sure, I also thought about this but went for the "extreme" initially.
> > > > If no-one objects I'll update the doc in a bit.
> > > >
> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
> > > >
> > > > > Thanks for opening this.
> > > > >
> > > > > I see the need for having an extensible context object for window
> > > > > function invocations, but i think hiding every parameter in the
> > > > > context is a bit unnatural.
> > > > >
> > > > > How about having a function "apply(Key, Values, WindowContext,
> > > > Collector)"
> > > > > ?
> > > > > It should be possible to write the straightforward use cases
> > > > > without accessing the context object.
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > > > <[hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > > this is a proposal to introduce a new interface for the window
> > > > > > function
> > > > > to
> > > > > > make it more extensible for the future where we might want to
> > > > > > provide additional information about why a window fired to the
> > > > > > user
> > > > function:
> > > > > >
> > > > > >
> > > > > >
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > > > +W
> > > > > in
> > > > > dow+Function+Metadata
> > > > > >
> > > > > > I'd appreciate your thoughts!
> > > > > >
> > > > > > Cheers,
> > > > > > Aljoscha
> > > > > >
> > > > >
> > > >
> > >
> >
>
mxm
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

mxm
I think it looks like Beam rather than Hadoop :)

What Stephan meant was that he wanted a dedicated output method in the
ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
the collector.

On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <[hidden email]> wrote:

> You mean keep the Collector? I don't like that one because it has the
> close() method that should never be called by the user.
>
> We can keep it, though, because all the other user function interfaces also
> expose it to the user.
>
> On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]> wrote:
>
>> I would actually make the output a separate parameter as well. Pretty much
>> like the old variant, only replacing the "Window" parameter by the context
>> (which contains everything about the window).
>> It could also be called "WindowInvocationContext" or so.
>>
>> The current variant looks too Hadoop to me ;-) Everything done on the
>> context object, and messy mocking when creating tests.
>>
>> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <[hidden email]>
>> wrote:
>>
>> > Hi,
>> >
>> > Sorry - I made a mistake - I was thinking of getting access to the
>> > collection (mist-read :) collector) of events in the window buffer  in
>> > order to be able to delete/evict some of them which are not necessary the
>> > last ones.
>> >
>> >
>> > Radu
>> >
>> > -----Original Message-----
>> > From: Aljoscha Krettek [mailto:[hidden email]]
>> > Sent: Monday, July 18, 2016 5:54 PM
>> > To: [hidden email]
>> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>> >
>> > What about the collector? This is only used for emitting elements to the
>> > downstream operation.
>> >
>> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]>
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > I think it looks good and most importantly is that we can extend it in
>> > > the directions discussed so far.
>> > >
>> > > One question though regarding the Collector - are we going to be able
>> > > to delete random elements from the list if this is not exposed as a
>> > > collection, at least to the evictor? If not, how are we going to
>> > > extend in the future to cover this case?
>> > >
>> > > Regarding the ordering - I also observed that there are situations
>> > > where elements do not have a logical order. One example is if you have
>> > > high rates of the events. Nevertheless, even if now is not the time
>> > > for this, I think in the future we can imagine having also some data
>> > > structures that offer some ordering. It can save some computation
>> > > efforts later in the functions for some use cases.
>> > >
>> > >
>> > > Best regards,
>> > >
>> > >
>> > > -----Original Message-----
>> > > From: Aljoscha Krettek [mailto:[hidden email]]
>> > > Sent: Monday, July 18, 2016 3:45 PM
>> > > To: [hidden email]
>> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>> > >
>> > > I incorporated the changes. The proposed interface of
>> > > ProcessWindowFunction is now this:
>> > >
>> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
>> > > Window> implements Function {
>> > >
>> > >     public abstract void process(KEY key, Iterable<IN> elements,
>> > > Context
>> > > ctx) throws Exception;
>> > >
>> > >     public abstract class Context {
>> > >         public abstract W window();
>> > >         public abstract void output(OUT value);
>> > >     }
>> > > }
>> > >
>> > > I'm proposing to not expose Collector anymore because it has the
>> > > close() method that should not be called by users. Having the output()
>> > > call directly on the context should work just as well.
>> > >
>> > > Also, I marked the "adding a firing reason" and "adding firing
>> > > counter" as future work that are only examples of stuff that can be
>> > > implemented on top of the new interface. Initially, this will provide
>> > > exactly the same features as the old API but be extensible. I did this
>> > > to not make the scope of this proposal to big because Radu also
>> > > suggested more changes and each of them should be covered in a separate
>> > design doc or FLIP.
>> > >
>> > > @Radu: On the different buffer types. I think this would be very
>> tricky.
>> > > Right now, people should also not rely on the fact that elements are
>> > > "FIFO". Some state backends might keep the elements in a different
>> > > order and when you have merging windows/session windows the order of
>> > > the elements will also not be preserved.
>> > >
>> > > Cheers,
>> > > Aljoscha
>> > >
>> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]>
>> > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > If it is to extend the Context to pass more information between the
>> > > > stages of processing a window (triggering -> process -> eviction),
>> > > > why not adding also a "EvictionInfo"? I think this might actually
>> > > > help with the issues discussed in the tread related to the eviction
>> > policy.
>> > > > I could imagine using this parameter to pass the conditions, from
>> > > > the processing stage to the evictor, about what events to be
>> > eliminated.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > public abstract class Context {
>> > > >
>> > > >    public abstract EvictionInfo evictionInfo();
>> > > >
>> > > > ...
>> > > >
>> > > >
>> > > >    public abstract KEY key();
>> > > >
>> > > >    public abstract W window();
>> > > >
>> > > >    public abstract int id();
>> > > >
>> > > >    public abstract FiringInfo firingInfo();
>> > > >
>> > > >    public abstract Iterable<IN> elements();
>> > > >
>> > > >    public abstract void output(OUT value);
>> > > >
>> > > > }
>> > > >
>> > > >
>> > > > Also on a slightly unrelated issue - how hard it would be to
>> > > > introduce different types of buffers for the windows. Currently the
>> > > > existing one is behaving (when under processing) similar with a FIFO
>> > > > queue (in the sense that you need to start from beginning, from the
>> > oldest element).
>> > > > How about enabling for example also LIFO behavior (start iterating
>> > > > through the list from the most recent element). As in the source
>> > > > queues or stacks are not actually used, perhaps we can just pass
>> > > > policies to the iterator - or have custom itrators
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > Dr. Radu Tudoran
>> > > > Research Engineer - Big Data Expert
>> > > > IT R&D Division
>> > > >
>> > > >
>> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
>> > > > Riesstrasse 25, 80992 München
>> > > >
>> > > > E-mail: [hidden email]
>> > > > Mobile: +49 15209084330
>> > > > Telephone: +49 891588344173
>> > > >
>> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
>> > > > Düsseldorf, Germany, www.huawei.com Registered
>> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
>> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
>> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
>> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
>> > > > its attachments contain confidential information from HUAWEI, which
>> > > > is intended only for the person or entity whose address is listed
>> > above.
>> > > > Any use of the information contained herein in any way (including,
>> > > > but not limited to, total or partial disclosure, reproduction, or
>> > > > dissemination) by persons other than the intended recipient(s) is
>> > > > prohibited. If you receive this e-mail in error, please notify the
>> > > > sender by phone or email immediately and delete it!
>> > > >
>> > > >
>> > > > -----Original Message-----
>> > > > From: Aljoscha Krettek [mailto:[hidden email]]
>> > > > Sent: Wednesday, July 13, 2016 2:24 PM
>> > > > To: [hidden email]
>> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>> > > >
>> > > > Sure, I also thought about this but went for the "extreme" initially.
>> > > > If no-one objects I'll update the doc in a bit.
>> > > >
>> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]> wrote:
>> > > >
>> > > > > Thanks for opening this.
>> > > > >
>> > > > > I see the need for having an extensible context object for window
>> > > > > function invocations, but i think hiding every parameter in the
>> > > > > context is a bit unnatural.
>> > > > >
>> > > > > How about having a function "apply(Key, Values, WindowContext,
>> > > > Collector)"
>> > > > > ?
>> > > > > It should be possible to write the straightforward use cases
>> > > > > without accessing the context object.
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
>> > > > > <[hidden email]>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi,
>> > > > > > this is a proposal to introduce a new interface for the window
>> > > > > > function
>> > > > > to
>> > > > > > make it more extensible for the future where we might want to
>> > > > > > provide additional information about why a window fired to the
>> > > > > > user
>> > > > function:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
>> > > > > +W
>> > > > > in
>> > > > > dow+Function+Metadata
>> > > > > >
>> > > > > > I'd appreciate your thoughts!
>> > > > > >
>> > > > > > Cheers,
>> > > > > > Aljoscha
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
@Stephan: For the Output, should we keep using a Collector (which exposes)
the close() method which should never be called by users or create a new
Output type that only has an "output" method. Collector can also be used
but with a close() method that doesn't do anything. In the long run, I
thought it might be better to switch the type away from Collector.

Cheers,
Aljoscha

On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]> wrote:

> I think it looks like Beam rather than Hadoop :)
>
> What Stephan meant was that he wanted a dedicated output method in the
> ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> the collector.
>
> On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <[hidden email]>
> wrote:
> > You mean keep the Collector? I don't like that one because it has the
> > close() method that should never be called by the user.
> >
> > We can keep it, though, because all the other user function interfaces
> also
> > expose it to the user.
> >
> > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]> wrote:
> >
> >> I would actually make the output a separate parameter as well. Pretty
> much
> >> like the old variant, only replacing the "Window" parameter by the
> context
> >> (which contains everything about the window).
> >> It could also be called "WindowInvocationContext" or so.
> >>
> >> The current variant looks too Hadoop to me ;-) Everything done on the
> >> context object, and messy mocking when creating tests.
> >>
> >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <[hidden email]>
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > Sorry - I made a mistake - I was thinking of getting access to the
> >> > collection (mist-read :) collector) of events in the window buffer  in
> >> > order to be able to delete/evict some of them which are not necessary
> the
> >> > last ones.
> >> >
> >> >
> >> > Radu
> >> >
> >> > -----Original Message-----
> >> > From: Aljoscha Krettek [mailto:[hidden email]]
> >> > Sent: Monday, July 18, 2016 5:54 PM
> >> > To: [hidden email]
> >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >> >
> >> > What about the collector? This is only used for emitting elements to
> the
> >> > downstream operation.
> >> >
> >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]>
> >> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > I think it looks good and most importantly is that we can extend it
> in
> >> > > the directions discussed so far.
> >> > >
> >> > > One question though regarding the Collector - are we going to be
> able
> >> > > to delete random elements from the list if this is not exposed as a
> >> > > collection, at least to the evictor? If not, how are we going to
> >> > > extend in the future to cover this case?
> >> > >
> >> > > Regarding the ordering - I also observed that there are situations
> >> > > where elements do not have a logical order. One example is if you
> have
> >> > > high rates of the events. Nevertheless, even if now is not the time
> >> > > for this, I think in the future we can imagine having also some data
> >> > > structures that offer some ordering. It can save some computation
> >> > > efforts later in the functions for some use cases.
> >> > >
> >> > >
> >> > > Best regards,
> >> > >
> >> > >
> >> > > -----Original Message-----
> >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> >> > > Sent: Monday, July 18, 2016 3:45 PM
> >> > > To: [hidden email]
> >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >> > >
> >> > > I incorporated the changes. The proposed interface of
> >> > > ProcessWindowFunction is now this:
> >> > >
> >> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
> >> > > Window> implements Function {
> >> > >
> >> > >     public abstract void process(KEY key, Iterable<IN> elements,
> >> > > Context
> >> > > ctx) throws Exception;
> >> > >
> >> > >     public abstract class Context {
> >> > >         public abstract W window();
> >> > >         public abstract void output(OUT value);
> >> > >     }
> >> > > }
> >> > >
> >> > > I'm proposing to not expose Collector anymore because it has the
> >> > > close() method that should not be called by users. Having the
> output()
> >> > > call directly on the context should work just as well.
> >> > >
> >> > > Also, I marked the "adding a firing reason" and "adding firing
> >> > > counter" as future work that are only examples of stuff that can be
> >> > > implemented on top of the new interface. Initially, this will
> provide
> >> > > exactly the same features as the old API but be extensible. I did
> this
> >> > > to not make the scope of this proposal to big because Radu also
> >> > > suggested more changes and each of them should be covered in a
> separate
> >> > design doc or FLIP.
> >> > >
> >> > > @Radu: On the different buffer types. I think this would be very
> >> tricky.
> >> > > Right now, people should also not rely on the fact that elements are
> >> > > "FIFO". Some state backends might keep the elements in a different
> >> > > order and when you have merging windows/session windows the order of
> >> > > the elements will also not be preserved.
> >> > >
> >> > > Cheers,
> >> > > Aljoscha
> >> > >
> >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <[hidden email]>
> >> > wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > If it is to extend the Context to pass more information between
> the
> >> > > > stages of processing a window (triggering -> process -> eviction),
> >> > > > why not adding also a "EvictionInfo"? I think this might actually
> >> > > > help with the issues discussed in the tread related to the
> eviction
> >> > policy.
> >> > > > I could imagine using this parameter to pass the conditions, from
> >> > > > the processing stage to the evictor, about what events to be
> >> > eliminated.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > public abstract class Context {
> >> > > >
> >> > > >    public abstract EvictionInfo evictionInfo();
> >> > > >
> >> > > > ...
> >> > > >
> >> > > >
> >> > > >    public abstract KEY key();
> >> > > >
> >> > > >    public abstract W window();
> >> > > >
> >> > > >    public abstract int id();
> >> > > >
> >> > > >    public abstract FiringInfo firingInfo();
> >> > > >
> >> > > >    public abstract Iterable<IN> elements();
> >> > > >
> >> > > >    public abstract void output(OUT value);
> >> > > >
> >> > > > }
> >> > > >
> >> > > >
> >> > > > Also on a slightly unrelated issue - how hard it would be to
> >> > > > introduce different types of buffers for the windows. Currently
> the
> >> > > > existing one is behaving (when under processing) similar with a
> FIFO
> >> > > > queue (in the sense that you need to start from beginning, from
> the
> >> > oldest element).
> >> > > > How about enabling for example also LIFO behavior (start iterating
> >> > > > through the list from the most recent element). As in the source
> >> > > > queues or stacks are not actually used, perhaps we can just pass
> >> > > > policies to the iterator - or have custom itrators
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > Dr. Radu Tudoran
> >> > > > Research Engineer - Big Data Expert
> >> > > > IT R&D Division
> >> > > >
> >> > > >
> >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> >> > > > Riesstrasse 25, 80992 München
> >> > > >
> >> > > > E-mail: [hidden email]
> >> > > > Mobile: +49 15209084330
> >> > > > Telephone: +49 891588344173
> >> > > >
> >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> >> > > > Düsseldorf, Germany, www.huawei.com Registered
> >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
> >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> Gesellschaft:
> >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail
> and
> >> > > > its attachments contain confidential information from HUAWEI,
> which
> >> > > > is intended only for the person or entity whose address is listed
> >> > above.
> >> > > > Any use of the information contained herein in any way (including,
> >> > > > but not limited to, total or partial disclosure, reproduction, or
> >> > > > dissemination) by persons other than the intended recipient(s) is
> >> > > > prohibited. If you receive this e-mail in error, please notify the
> >> > > > sender by phone or email immediately and delete it!
> >> > > >
> >> > > >
> >> > > > -----Original Message-----
> >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> >> > > > To: [hidden email]
> >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> >> > > >
> >> > > > Sure, I also thought about this but went for the "extreme"
> initially.
> >> > > > If no-one objects I'll update the doc in a bit.
> >> > > >
> >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]>
> wrote:
> >> > > >
> >> > > > > Thanks for opening this.
> >> > > > >
> >> > > > > I see the need for having an extensible context object for
> window
> >> > > > > function invocations, but i think hiding every parameter in the
> >> > > > > context is a bit unnatural.
> >> > > > >
> >> > > > > How about having a function "apply(Key, Values, WindowContext,
> >> > > > Collector)"
> >> > > > > ?
> >> > > > > It should be possible to write the straightforward use cases
> >> > > > > without accessing the context object.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> >> > > > > <[hidden email]>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi,
> >> > > > > > this is a proposal to introduce a new interface for the window
> >> > > > > > function
> >> > > > > to
> >> > > > > > make it more extensible for the future where we might want to
> >> > > > > > provide additional information about why a window fired to the
> >> > > > > > user
> >> > > > function:
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> >> > > > > +W
> >> > > > > in
> >> > > > > dow+Function+Metadata
> >> > > > > >
> >> > > > > > I'd appreciate your thoughts!
> >> > > > > >
> >> > > > > > Cheers,
> >> > > > > > Aljoscha
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Stephan Ewen
The Collector is pretty integral in all the other functions that return
multiple elements. I honestly don't see us switching away from it, given
that it is such a core part of the API.

The close() method has, to my best knowledge, not caused issues, yet. I
cannot recall anyone mentioning that the close() method confused them, they
accidentally called it, etc.
I am wondering whether this is more of a theoretical than practical issue.

If we move one function away from Collector to be on the safe side for a
"maybe change in the future", while keeping Collector in all other
functions - I think that fragments the API concept wise more than it
improves anything.

On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek <[hidden email]>
wrote:

> @Stephan: For the Output, should we keep using a Collector (which exposes)
> the close() method which should never be called by users or create a new
> Output type that only has an "output" method. Collector can also be used
> but with a close() method that doesn't do anything. In the long run, I
> thought it might be better to switch the type away from Collector.
>
> Cheers,
> Aljoscha
>
> On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]> wrote:
>
> > I think it looks like Beam rather than Hadoop :)
> >
> > What Stephan meant was that he wanted a dedicated output method in the
> > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> > the collector.
> >
> > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <[hidden email]>
> > wrote:
> > > You mean keep the Collector? I don't like that one because it has the
> > > close() method that should never be called by the user.
> > >
> > > We can keep it, though, because all the other user function interfaces
> > also
> > > expose it to the user.
> > >
> > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]> wrote:
> > >
> > >> I would actually make the output a separate parameter as well. Pretty
> > much
> > >> like the old variant, only replacing the "Window" parameter by the
> > context
> > >> (which contains everything about the window).
> > >> It could also be called "WindowInvocationContext" or so.
> > >>
> > >> The current variant looks too Hadoop to me ;-) Everything done on the
> > >> context object, and messy mocking when creating tests.
> > >>
> > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> [hidden email]>
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > Sorry - I made a mistake - I was thinking of getting access to the
> > >> > collection (mist-read :) collector) of events in the window buffer
> in
> > >> > order to be able to delete/evict some of them which are not
> necessary
> > the
> > >> > last ones.
> > >> >
> > >> >
> > >> > Radu
> > >> >
> > >> > -----Original Message-----
> > >> > From: Aljoscha Krettek [mailto:[hidden email]]
> > >> > Sent: Monday, July 18, 2016 5:54 PM
> > >> > To: [hidden email]
> > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >> >
> > >> > What about the collector? This is only used for emitting elements to
> > the
> > >> > downstream operation.
> > >> >
> > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <[hidden email]>
> > >> wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > I think it looks good and most importantly is that we can extend
> it
> > in
> > >> > > the directions discussed so far.
> > >> > >
> > >> > > One question though regarding the Collector - are we going to be
> > able
> > >> > > to delete random elements from the list if this is not exposed as
> a
> > >> > > collection, at least to the evictor? If not, how are we going to
> > >> > > extend in the future to cover this case?
> > >> > >
> > >> > > Regarding the ordering - I also observed that there are situations
> > >> > > where elements do not have a logical order. One example is if you
> > have
> > >> > > high rates of the events. Nevertheless, even if now is not the
> time
> > >> > > for this, I think in the future we can imagine having also some
> data
> > >> > > structures that offer some ordering. It can save some computation
> > >> > > efforts later in the functions for some use cases.
> > >> > >
> > >> > >
> > >> > > Best regards,
> > >> > >
> > >> > >
> > >> > > -----Original Message-----
> > >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > >> > > To: [hidden email]
> > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >> > >
> > >> > > I incorporated the changes. The proposed interface of
> > >> > > ProcessWindowFunction is now this:
> > >> > >
> > >> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W
> extends
> > >> > > Window> implements Function {
> > >> > >
> > >> > >     public abstract void process(KEY key, Iterable<IN> elements,
> > >> > > Context
> > >> > > ctx) throws Exception;
> > >> > >
> > >> > >     public abstract class Context {
> > >> > >         public abstract W window();
> > >> > >         public abstract void output(OUT value);
> > >> > >     }
> > >> > > }
> > >> > >
> > >> > > I'm proposing to not expose Collector anymore because it has the
> > >> > > close() method that should not be called by users. Having the
> > output()
> > >> > > call directly on the context should work just as well.
> > >> > >
> > >> > > Also, I marked the "adding a firing reason" and "adding firing
> > >> > > counter" as future work that are only examples of stuff that can
> be
> > >> > > implemented on top of the new interface. Initially, this will
> > provide
> > >> > > exactly the same features as the old API but be extensible. I did
> > this
> > >> > > to not make the scope of this proposal to big because Radu also
> > >> > > suggested more changes and each of them should be covered in a
> > separate
> > >> > design doc or FLIP.
> > >> > >
> > >> > > @Radu: On the different buffer types. I think this would be very
> > >> tricky.
> > >> > > Right now, people should also not rely on the fact that elements
> are
> > >> > > "FIFO". Some state backends might keep the elements in a different
> > >> > > order and when you have merging windows/session windows the order
> of
> > >> > > the elements will also not be preserved.
> > >> > >
> > >> > > Cheers,
> > >> > > Aljoscha
> > >> > >
> > >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <
> [hidden email]>
> > >> > wrote:
> > >> > >
> > >> > > > Hi,
> > >> > > >
> > >> > > > If it is to extend the Context to pass more information between
> > the
> > >> > > > stages of processing a window (triggering -> process ->
> eviction),
> > >> > > > why not adding also a "EvictionInfo"? I think this might
> actually
> > >> > > > help with the issues discussed in the tread related to the
> > eviction
> > >> > policy.
> > >> > > > I could imagine using this parameter to pass the conditions,
> from
> > >> > > > the processing stage to the evictor, about what events to be
> > >> > eliminated.
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > public abstract class Context {
> > >> > > >
> > >> > > >    public abstract EvictionInfo evictionInfo();
> > >> > > >
> > >> > > > ...
> > >> > > >
> > >> > > >
> > >> > > >    public abstract KEY key();
> > >> > > >
> > >> > > >    public abstract W window();
> > >> > > >
> > >> > > >    public abstract int id();
> > >> > > >
> > >> > > >    public abstract FiringInfo firingInfo();
> > >> > > >
> > >> > > >    public abstract Iterable<IN> elements();
> > >> > > >
> > >> > > >    public abstract void output(OUT value);
> > >> > > >
> > >> > > > }
> > >> > > >
> > >> > > >
> > >> > > > Also on a slightly unrelated issue - how hard it would be to
> > >> > > > introduce different types of buffers for the windows. Currently
> > the
> > >> > > > existing one is behaving (when under processing) similar with a
> > FIFO
> > >> > > > queue (in the sense that you need to start from beginning, from
> > the
> > >> > oldest element).
> > >> > > > How about enabling for example also LIFO behavior (start
> iterating
> > >> > > > through the list from the most recent element). As in the source
> > >> > > > queues or stacks are not actually used, perhaps we can just pass
> > >> > > > policies to the iterator - or have custom itrators
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Dr. Radu Tudoran
> > >> > > > Research Engineer - Big Data Expert
> > >> > > > IT R&D Division
> > >> > > >
> > >> > > >
> > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> > >> > > > Riesstrasse 25, 80992 München
> > >> > > >
> > >> > > > E-mail: [hidden email]
> > >> > > > Mobile: +49 15209084330
> > >> > > > Telephone: +49 891588344173
> > >> > > >
> > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > >> > > > Düsseldorf, Germany, www.huawei.com Registered
> > >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> Managing
> > >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> > Gesellschaft:
> > >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail
> > and
> > >> > > > its attachments contain confidential information from HUAWEI,
> > which
> > >> > > > is intended only for the person or entity whose address is
> listed
> > >> > above.
> > >> > > > Any use of the information contained herein in any way
> (including,
> > >> > > > but not limited to, total or partial disclosure, reproduction,
> or
> > >> > > > dissemination) by persons other than the intended recipient(s)
> is
> > >> > > > prohibited. If you receive this e-mail in error, please notify
> the
> > >> > > > sender by phone or email immediately and delete it!
> > >> > > >
> > >> > > >
> > >> > > > -----Original Message-----
> > >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > >> > > > To: [hidden email]
> > >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >> > > >
> > >> > > > Sure, I also thought about this but went for the "extreme"
> > initially.
> > >> > > > If no-one objects I'll update the doc in a bit.
> > >> > > >
> > >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]>
> > wrote:
> > >> > > >
> > >> > > > > Thanks for opening this.
> > >> > > > >
> > >> > > > > I see the need for having an extensible context object for
> > window
> > >> > > > > function invocations, but i think hiding every parameter in
> the
> > >> > > > > context is a bit unnatural.
> > >> > > > >
> > >> > > > > How about having a function "apply(Key, Values, WindowContext,
> > >> > > > Collector)"
> > >> > > > > ?
> > >> > > > > It should be possible to write the straightforward use cases
> > >> > > > > without accessing the context object.
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > >> > > > > <[hidden email]>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hi,
> > >> > > > > > this is a proposal to introduce a new interface for the
> window
> > >> > > > > > function
> > >> > > > > to
> > >> > > > > > make it more extensible for the future where we might want
> to
> > >> > > > > > provide additional information about why a window fired to
> the
> > >> > > > > > user
> > >> > > > function:
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > >> > > > > +W
> > >> > > > > in
> > >> > > > > dow+Function+Metadata
> > >> > > > > >
> > >> > > > > > I'd appreciate your thoughts!
> > >> > > > > >
> > >> > > > > > Cheers,
> > >> > > > > > Aljoscha
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
Alright, that seems reasonable. I updated the doc to add the Collector to
the method signature again.

On Mon, 1 Aug 2016 at 00:59 Stephan Ewen <[hidden email]> wrote:

> The Collector is pretty integral in all the other functions that return
> multiple elements. I honestly don't see us switching away from it, given
> that it is such a core part of the API.
>
> The close() method has, to my best knowledge, not caused issues, yet. I
> cannot recall anyone mentioning that the close() method confused them, they
> accidentally called it, etc.
> I am wondering whether this is more of a theoretical than practical issue.
>
> If we move one function away from Collector to be on the safe side for a
> "maybe change in the future", while keeping Collector in all other
> functions - I think that fragments the API concept wise more than it
> improves anything.
>
> On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek <[hidden email]>
> wrote:
>
> > @Stephan: For the Output, should we keep using a Collector (which
> exposes)
> > the close() method which should never be called by users or create a new
> > Output type that only has an "output" method. Collector can also be used
> > but with a close() method that doesn't do anything. In the long run, I
> > thought it might be better to switch the type away from Collector.
> >
> > Cheers,
> > Aljoscha
> >
> > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]> wrote:
> >
> > > I think it looks like Beam rather than Hadoop :)
> > >
> > > What Stephan meant was that he wanted a dedicated output method in the
> > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> > > the collector.
> > >
> > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <
> [hidden email]>
> > > wrote:
> > > > You mean keep the Collector? I don't like that one because it has the
> > > > close() method that should never be called by the user.
> > > >
> > > > We can keep it, though, because all the other user function
> interfaces
> > > also
> > > > expose it to the user.
> > > >
> > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]> wrote:
> > > >
> > > >> I would actually make the output a separate parameter as well.
> Pretty
> > > much
> > > >> like the old variant, only replacing the "Window" parameter by the
> > > context
> > > >> (which contains everything about the window).
> > > >> It could also be called "WindowInvocationContext" or so.
> > > >>
> > > >> The current variant looks too Hadoop to me ;-) Everything done on
> the
> > > >> context object, and messy mocking when creating tests.
> > > >>
> > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> > [hidden email]>
> > > >> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > Sorry - I made a mistake - I was thinking of getting access to the
> > > >> > collection (mist-read :) collector) of events in the window buffer
> > in
> > > >> > order to be able to delete/evict some of them which are not
> > necessary
> > > the
> > > >> > last ones.
> > > >> >
> > > >> >
> > > >> > Radu
> > > >> >
> > > >> > -----Original Message-----
> > > >> > From: Aljoscha Krettek [mailto:[hidden email]]
> > > >> > Sent: Monday, July 18, 2016 5:54 PM
> > > >> > To: [hidden email]
> > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >> >
> > > >> > What about the collector? This is only used for emitting elements
> to
> > > the
> > > >> > downstream operation.
> > > >> >
> > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <
> [hidden email]>
> > > >> wrote:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > > I think it looks good and most importantly is that we can extend
> > it
> > > in
> > > >> > > the directions discussed so far.
> > > >> > >
> > > >> > > One question though regarding the Collector - are we going to be
> > > able
> > > >> > > to delete random elements from the list if this is not exposed
> as
> > a
> > > >> > > collection, at least to the evictor? If not, how are we going to
> > > >> > > extend in the future to cover this case?
> > > >> > >
> > > >> > > Regarding the ordering - I also observed that there are
> situations
> > > >> > > where elements do not have a logical order. One example is if
> you
> > > have
> > > >> > > high rates of the events. Nevertheless, even if now is not the
> > time
> > > >> > > for this, I think in the future we can imagine having also some
> > data
> > > >> > > structures that offer some ordering. It can save some
> computation
> > > >> > > efforts later in the functions for some use cases.
> > > >> > >
> > > >> > >
> > > >> > > Best regards,
> > > >> > >
> > > >> > >
> > > >> > > -----Original Message-----
> > > >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > > >> > > To: [hidden email]
> > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >> > >
> > > >> > > I incorporated the changes. The proposed interface of
> > > >> > > ProcessWindowFunction is now this:
> > > >> > >
> > > >> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W
> > extends
> > > >> > > Window> implements Function {
> > > >> > >
> > > >> > >     public abstract void process(KEY key, Iterable<IN> elements,
> > > >> > > Context
> > > >> > > ctx) throws Exception;
> > > >> > >
> > > >> > >     public abstract class Context {
> > > >> > >         public abstract W window();
> > > >> > >         public abstract void output(OUT value);
> > > >> > >     }
> > > >> > > }
> > > >> > >
> > > >> > > I'm proposing to not expose Collector anymore because it has the
> > > >> > > close() method that should not be called by users. Having the
> > > output()
> > > >> > > call directly on the context should work just as well.
> > > >> > >
> > > >> > > Also, I marked the "adding a firing reason" and "adding firing
> > > >> > > counter" as future work that are only examples of stuff that can
> > be
> > > >> > > implemented on top of the new interface. Initially, this will
> > > provide
> > > >> > > exactly the same features as the old API but be extensible. I
> did
> > > this
> > > >> > > to not make the scope of this proposal to big because Radu also
> > > >> > > suggested more changes and each of them should be covered in a
> > > separate
> > > >> > design doc or FLIP.
> > > >> > >
> > > >> > > @Radu: On the different buffer types. I think this would be very
> > > >> tricky.
> > > >> > > Right now, people should also not rely on the fact that elements
> > are
> > > >> > > "FIFO". Some state backends might keep the elements in a
> different
> > > >> > > order and when you have merging windows/session windows the
> order
> > of
> > > >> > > the elements will also not be preserved.
> > > >> > >
> > > >> > > Cheers,
> > > >> > > Aljoscha
> > > >> > >
> > > >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <
> > [hidden email]>
> > > >> > wrote:
> > > >> > >
> > > >> > > > Hi,
> > > >> > > >
> > > >> > > > If it is to extend the Context to pass more information
> between
> > > the
> > > >> > > > stages of processing a window (triggering -> process ->
> > eviction),
> > > >> > > > why not adding also a "EvictionInfo"? I think this might
> > actually
> > > >> > > > help with the issues discussed in the tread related to the
> > > eviction
> > > >> > policy.
> > > >> > > > I could imagine using this parameter to pass the conditions,
> > from
> > > >> > > > the processing stage to the evictor, about what events to be
> > > >> > eliminated.
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > public abstract class Context {
> > > >> > > >
> > > >> > > >    public abstract EvictionInfo evictionInfo();
> > > >> > > >
> > > >> > > > ...
> > > >> > > >
> > > >> > > >
> > > >> > > >    public abstract KEY key();
> > > >> > > >
> > > >> > > >    public abstract W window();
> > > >> > > >
> > > >> > > >    public abstract int id();
> > > >> > > >
> > > >> > > >    public abstract FiringInfo firingInfo();
> > > >> > > >
> > > >> > > >    public abstract Iterable<IN> elements();
> > > >> > > >
> > > >> > > >    public abstract void output(OUT value);
> > > >> > > >
> > > >> > > > }
> > > >> > > >
> > > >> > > >
> > > >> > > > Also on a slightly unrelated issue - how hard it would be to
> > > >> > > > introduce different types of buffers for the windows.
> Currently
> > > the
> > > >> > > > existing one is behaving (when under processing) similar with
> a
> > > FIFO
> > > >> > > > queue (in the sense that you need to start from beginning,
> from
> > > the
> > > >> > oldest element).
> > > >> > > > How about enabling for example also LIFO behavior (start
> > iterating
> > > >> > > > through the list from the most recent element). As in the
> source
> > > >> > > > queues or stacks are not actually used, perhaps we can just
> pass
> > > >> > > > policies to the iterator - or have custom itrators
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > Dr. Radu Tudoran
> > > >> > > > Research Engineer - Big Data Expert
> > > >> > > > IT R&D Division
> > > >> > > >
> > > >> > > >
> > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> > > >> > > > Riesstrasse 25, 80992 München
> > > >> > > >
> > > >> > > > E-mail: [hidden email]
> > > >> > > > Mobile: +49 15209084330
> > > >> > > > Telephone: +49 891588344173
> > > >> > > >
> > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > > >> > > > Düsseldorf, Germany, www.huawei.com Registered
> > > >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> > Managing
> > > >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> > > Gesellschaft:
> > > >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This
> e-mail
> > > and
> > > >> > > > its attachments contain confidential information from HUAWEI,
> > > which
> > > >> > > > is intended only for the person or entity whose address is
> > listed
> > > >> > above.
> > > >> > > > Any use of the information contained herein in any way
> > (including,
> > > >> > > > but not limited to, total or partial disclosure, reproduction,
> > or
> > > >> > > > dissemination) by persons other than the intended recipient(s)
> > is
> > > >> > > > prohibited. If you receive this e-mail in error, please notify
> > the
> > > >> > > > sender by phone or email immediately and delete it!
> > > >> > > >
> > > >> > > >
> > > >> > > > -----Original Message-----
> > > >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > >> > > > To: [hidden email]
> > > >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> Metadata
> > > >> > > >
> > > >> > > > Sure, I also thought about this but went for the "extreme"
> > > initially.
> > > >> > > > If no-one objects I'll update the doc in a bit.
> > > >> > > >
> > > >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]>
> > > wrote:
> > > >> > > >
> > > >> > > > > Thanks for opening this.
> > > >> > > > >
> > > >> > > > > I see the need for having an extensible context object for
> > > window
> > > >> > > > > function invocations, but i think hiding every parameter in
> > the
> > > >> > > > > context is a bit unnatural.
> > > >> > > > >
> > > >> > > > > How about having a function "apply(Key, Values,
> WindowContext,
> > > >> > > > Collector)"
> > > >> > > > > ?
> > > >> > > > > It should be possible to write the straightforward use cases
> > > >> > > > > without accessing the context object.
> > > >> > > > >
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > >> > > > > <[hidden email]>
> > > >> > > > > wrote:
> > > >> > > > >
> > > >> > > > > > Hi,
> > > >> > > > > > this is a proposal to introduce a new interface for the
> > window
> > > >> > > > > > function
> > > >> > > > > to
> > > >> > > > > > make it more extensible for the future where we might want
> > to
> > > >> > > > > > provide additional information about why a window fired to
> > the
> > > >> > > > > > user
> > > >> > > > function:
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > >> > > > > +W
> > > >> > > > > in
> > > >> > > > > dow+Function+Metadata
> > > >> > > > > >
> > > >> > > > > > I'd appreciate your thoughts!
> > > >> > > > > >
> > > >> > > > > > Cheers,
> > > >> > > > > > Aljoscha
> > > >> > > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
I think we reached consensus here so I would like to mark this FLIP as
accepted. We will now process with implementing the first step, i.e. adding
the new ProcessWindowFunction.

On Mon, 1 Aug 2016 at 18:08 Aljoscha Krettek <[hidden email]> wrote:

> Alright, that seems reasonable. I updated the doc to add the Collector to
> the method signature again.
>
> On Mon, 1 Aug 2016 at 00:59 Stephan Ewen <[hidden email]> wrote:
>
> The Collector is pretty integral in all the other functions that return
> multiple elements. I honestly don't see us switching away from it, given
> that it is such a core part of the API.
>
> The close() method has, to my best knowledge, not caused issues, yet. I
> cannot recall anyone mentioning that the close() method confused them, they
> accidentally called it, etc.
> I am wondering whether this is more of a theoretical than practical issue.
>
> If we move one function away from Collector to be on the safe side for a
> "maybe change in the future", while keeping Collector in all other
> functions - I think that fragments the API concept wise more than it
> improves anything.
>
> On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek <[hidden email]>
> wrote:
>
> > @Stephan: For the Output, should we keep using a Collector (which
> exposes)
> > the close() method which should never be called by users or create a new
> > Output type that only has an "output" method. Collector can also be used
> > but with a close() method that doesn't do anything. In the long run, I
> > thought it might be better to switch the type away from Collector.
> >
> > Cheers,
> > Aljoscha
> >
> > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]> wrote:
> >
> > > I think it looks like Beam rather than Hadoop :)
> > >
> > > What Stephan meant was that he wanted a dedicated output method in the
> > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> > > the collector.
> > >
> > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <
> [hidden email]>
> > > wrote:
> > > > You mean keep the Collector? I don't like that one because it has the
> > > > close() method that should never be called by the user.
> > > >
> > > > We can keep it, though, because all the other user function
> interfaces
> > > also
> > > > expose it to the user.
> > > >
> > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]> wrote:
> > > >
> > > >> I would actually make the output a separate parameter as well.
> Pretty
> > > much
> > > >> like the old variant, only replacing the "Window" parameter by the
> > > context
> > > >> (which contains everything about the window).
> > > >> It could also be called "WindowInvocationContext" or so.
> > > >>
> > > >> The current variant looks too Hadoop to me ;-) Everything done on
> the
> > > >> context object, and messy mocking when creating tests.
> > > >>
> > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> > [hidden email]>
> > > >> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > Sorry - I made a mistake - I was thinking of getting access to the
> > > >> > collection (mist-read :) collector) of events in the window buffer
> > in
> > > >> > order to be able to delete/evict some of them which are not
> > necessary
> > > the
> > > >> > last ones.
> > > >> >
> > > >> >
> > > >> > Radu
> > > >> >
> > > >> > -----Original Message-----
> > > >> > From: Aljoscha Krettek [mailto:[hidden email]]
> > > >> > Sent: Monday, July 18, 2016 5:54 PM
> > > >> > To: [hidden email]
> > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >> >
> > > >> > What about the collector? This is only used for emitting elements
> to
> > > the
> > > >> > downstream operation.
> > > >> >
> > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <
> [hidden email]>
> > > >> wrote:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > > I think it looks good and most importantly is that we can extend
> > it
> > > in
> > > >> > > the directions discussed so far.
> > > >> > >
> > > >> > > One question though regarding the Collector - are we going to be
> > > able
> > > >> > > to delete random elements from the list if this is not exposed
> as
> > a
> > > >> > > collection, at least to the evictor? If not, how are we going to
> > > >> > > extend in the future to cover this case?
> > > >> > >
> > > >> > > Regarding the ordering - I also observed that there are
> situations
> > > >> > > where elements do not have a logical order. One example is if
> you
> > > have
> > > >> > > high rates of the events. Nevertheless, even if now is not the
> > time
> > > >> > > for this, I think in the future we can imagine having also some
> > data
> > > >> > > structures that offer some ordering. It can save some
> computation
> > > >> > > efforts later in the functions for some use cases.
> > > >> > >
> > > >> > >
> > > >> > > Best regards,
> > > >> > >
> > > >> > >
> > > >> > > -----Original Message-----
> > > >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > > >> > > To: [hidden email]
> > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >> > >
> > > >> > > I incorporated the changes. The proposed interface of
> > > >> > > ProcessWindowFunction is now this:
> > > >> > >
> > > >> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W
> > extends
> > > >> > > Window> implements Function {
> > > >> > >
> > > >> > >     public abstract void process(KEY key, Iterable<IN> elements,
> > > >> > > Context
> > > >> > > ctx) throws Exception;
> > > >> > >
> > > >> > >     public abstract class Context {
> > > >> > >         public abstract W window();
> > > >> > >         public abstract void output(OUT value);
> > > >> > >     }
> > > >> > > }
> > > >> > >
> > > >> > > I'm proposing to not expose Collector anymore because it has the
> > > >> > > close() method that should not be called by users. Having the
> > > output()
> > > >> > > call directly on the context should work just as well.
> > > >> > >
> > > >> > > Also, I marked the "adding a firing reason" and "adding firing
> > > >> > > counter" as future work that are only examples of stuff that can
> > be
> > > >> > > implemented on top of the new interface. Initially, this will
> > > provide
> > > >> > > exactly the same features as the old API but be extensible. I
> did
> > > this
> > > >> > > to not make the scope of this proposal to big because Radu also
> > > >> > > suggested more changes and each of them should be covered in a
> > > separate
> > > >> > design doc or FLIP.
> > > >> > >
> > > >> > > @Radu: On the different buffer types. I think this would be very
> > > >> tricky.
> > > >> > > Right now, people should also not rely on the fact that elements
> > are
> > > >> > > "FIFO". Some state backends might keep the elements in a
> different
> > > >> > > order and when you have merging windows/session windows the
> order
> > of
> > > >> > > the elements will also not be preserved.
> > > >> > >
> > > >> > > Cheers,
> > > >> > > Aljoscha
> > > >> > >
> > > >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <
> > [hidden email]>
> > > >> > wrote:
> > > >> > >
> > > >> > > > Hi,
> > > >> > > >
> > > >> > > > If it is to extend the Context to pass more information
> between
> > > the
> > > >> > > > stages of processing a window (triggering -> process ->
> > eviction),
> > > >> > > > why not adding also a "EvictionInfo"? I think this might
> > actually
> > > >> > > > help with the issues discussed in the tread related to the
> > > eviction
> > > >> > policy.
> > > >> > > > I could imagine using this parameter to pass the conditions,
> > from
> > > >> > > > the processing stage to the evictor, about what events to be
> > > >> > eliminated.
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > public abstract class Context {
> > > >> > > >
> > > >> > > >    public abstract EvictionInfo evictionInfo();
> > > >> > > >
> > > >> > > > ...
> > > >> > > >
> > > >> > > >
> > > >> > > >    public abstract KEY key();
> > > >> > > >
> > > >> > > >    public abstract W window();
> > > >> > > >
> > > >> > > >    public abstract int id();
> > > >> > > >
> > > >> > > >    public abstract FiringInfo firingInfo();
> > > >> > > >
> > > >> > > >    public abstract Iterable<IN> elements();
> > > >> > > >
> > > >> > > >    public abstract void output(OUT value);
> > > >> > > >
> > > >> > > > }
> > > >> > > >
> > > >> > > >
> > > >> > > > Also on a slightly unrelated issue - how hard it would be to
> > > >> > > > introduce different types of buffers for the windows.
> Currently
> > > the
> > > >> > > > existing one is behaving (when under processing) similar with
> a
> > > FIFO
> > > >> > > > queue (in the sense that you need to start from beginning,
> from
> > > the
> > > >> > oldest element).
> > > >> > > > How about enabling for example also LIFO behavior (start
> > iterating
> > > >> > > > through the list from the most recent element). As in the
> source
> > > >> > > > queues or stacks are not actually used, perhaps we can just
> pass
> > > >> > > > policies to the iterator - or have custom itrators
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > Dr. Radu Tudoran
> > > >> > > > Research Engineer - Big Data Expert
> > > >> > > > IT R&D Division
> > > >> > > >
> > > >> > > >
> > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
> > > >> > > > Riesstrasse 25, 80992 München
> > > >> > > >
> > > >> > > > E-mail: [hidden email]
> > > >> > > > Mobile: +49 15209084330 <01520%209084330>
> > > >> > > > Telephone: +49 891588344173 <089%201588344173>
> > > >> > > >
> > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > > >> > > > Düsseldorf, Germany, www.huawei.com Registered
> > > >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> > Managing
> > > >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> > > Gesellschaft:
> > > >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This
> e-mail
> > > and
> > > >> > > > its attachments contain confidential information from HUAWEI,
> > > which
> > > >> > > > is intended only for the person or entity whose address is
> > listed
> > > >> > above.
> > > >> > > > Any use of the information contained herein in any way
> > (including,
> > > >> > > > but not limited to, total or partial disclosure, reproduction,
> > or
> > > >> > > > dissemination) by persons other than the intended recipient(s)
> > is
> > > >> > > > prohibited. If you receive this e-mail in error, please notify
> > the
> > > >> > > > sender by phone or email immediately and delete it!
> > > >> > > >
> > > >> > > >
> > > >> > > > -----Original Message-----
> > > >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > >> > > > To: [hidden email]
> > > >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> Metadata
> > > >> > > >
> > > >> > > > Sure, I also thought about this but went for the "extreme"
> > > initially.
> > > >> > > > If no-one objects I'll update the doc in a bit.
> > > >> > > >
> > > >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]>
> > > wrote:
> > > >> > > >
> > > >> > > > > Thanks for opening this.
> > > >> > > > >
> > > >> > > > > I see the need for having an extensible context object for
> > > window
> > > >> > > > > function invocations, but i think hiding every parameter in
> > the
> > > >> > > > > context is a bit unnatural.
> > > >> > > > >
> > > >> > > > > How about having a function "apply(Key, Values,
> WindowContext,
> > > >> > > > Collector)"
> > > >> > > > > ?
> > > >> > > > > It should be possible to write the straightforward use cases
> > > >> > > > > without accessing the context object.
> > > >> > > > >
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > >> > > > > <[hidden email]>
> > > >> > > > > wrote:
> > > >> > > > >
> > > >> > > > > > Hi,
> > > >> > > > > > this is a proposal to introduce a new interface for the
> > window
> > > >> > > > > > function
> > > >> > > > > to
> > > >> > > > > > make it more extensible for the future where we might want
> > to
> > > >> > > > > > provide additional information about why a window fired to
> > the
> > > >> > > > > > user
> > > >> > > > function:
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > >> > > > > +W
> > > >> > > > > in
> > > >> > > > > dow+Function+Metadata
> > > >> > > > > >
> > > >> > > > > > I'd appreciate your thoughts!
> > > >> > > > > >
> > > >> > > > > > Cheers,
> > > >> > > > > > Aljoscha
> > > >> > > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Manu Zhang
Hi Aljoscha,

Have you started working on ProcessWindowFunction ? If not, may I take this
task ?

Thanks,
Manu


On Wed, Nov 2, 2016 at 5:16 PM Aljoscha Krettek <[hidden email]> wrote:

> I think we reached consensus here so I would like to mark this FLIP as
> accepted. We will now process with implementing the first step, i.e. adding
> the new ProcessWindowFunction.
>
> On Mon, 1 Aug 2016 at 18:08 Aljoscha Krettek <[hidden email]> wrote:
>
> > Alright, that seems reasonable. I updated the doc to add the Collector to
> > the method signature again.
> >
> > On Mon, 1 Aug 2016 at 00:59 Stephan Ewen <[hidden email]> wrote:
> >
> > The Collector is pretty integral in all the other functions that return
> > multiple elements. I honestly don't see us switching away from it, given
> > that it is such a core part of the API.
> >
> > The close() method has, to my best knowledge, not caused issues, yet. I
> > cannot recall anyone mentioning that the close() method confused them,
> they
> > accidentally called it, etc.
> > I am wondering whether this is more of a theoretical than practical
> issue.
> >
> > If we move one function away from Collector to be on the safe side for a
> > "maybe change in the future", while keeping Collector in all other
> > functions - I think that fragments the API concept wise more than it
> > improves anything.
> >
> > On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek <[hidden email]>
> > wrote:
> >
> > > @Stephan: For the Output, should we keep using a Collector (which
> > exposes)
> > > the close() method which should never be called by users or create a
> new
> > > Output type that only has an "output" method. Collector can also be
> used
> > > but with a close() method that doesn't do anything. In the long run, I
> > > thought it might be better to switch the type away from Collector.
> > >
> > > Cheers,
> > > Aljoscha
> > >
> > > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]>
> wrote:
> > >
> > > > I think it looks like Beam rather than Hadoop :)
> > > >
> > > > What Stephan meant was that he wanted a dedicated output method in
> the
> > > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> > > > the collector.
> > > >
> > > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <
> > [hidden email]>
> > > > wrote:
> > > > > You mean keep the Collector? I don't like that one because it has
> the
> > > > > close() method that should never be called by the user.
> > > > >
> > > > > We can keep it, though, because all the other user function
> > interfaces
> > > > also
> > > > > expose it to the user.
> > > > >
> > > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]>
> wrote:
> > > > >
> > > > >> I would actually make the output a separate parameter as well.
> > Pretty
> > > > much
> > > > >> like the old variant, only replacing the "Window" parameter by the
> > > > context
> > > > >> (which contains everything about the window).
> > > > >> It could also be called "WindowInvocationContext" or so.
> > > > >>
> > > > >> The current variant looks too Hadoop to me ;-) Everything done on
> > the
> > > > >> context object, and messy mocking when creating tests.
> > > > >>
> > > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> > > [hidden email]>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi,
> > > > >> >
> > > > >> > Sorry - I made a mistake - I was thinking of getting access to
> the
> > > > >> > collection (mist-read :) collector) of events in the window
> buffer
> > > in
> > > > >> > order to be able to delete/evict some of them which are not
> > > necessary
> > > > the
> > > > >> > last ones.
> > > > >> >
> > > > >> >
> > > > >> > Radu
> > > > >> >
> > > > >> > -----Original Message-----
> > > > >> > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > >> > Sent: Monday, July 18, 2016 5:54 PM
> > > > >> > To: [hidden email]
> > > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > > >> >
> > > > >> > What about the collector? This is only used for emitting
> elements
> > to
> > > > the
> > > > >> > downstream operation.
> > > > >> >
> > > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <
> > [hidden email]>
> > > > >> wrote:
> > > > >> >
> > > > >> > > Hi,
> > > > >> > >
> > > > >> > > I think it looks good and most importantly is that we can
> extend
> > > it
> > > > in
> > > > >> > > the directions discussed so far.
> > > > >> > >
> > > > >> > > One question though regarding the Collector - are we going to
> be
> > > > able
> > > > >> > > to delete random elements from the list if this is not exposed
> > as
> > > a
> > > > >> > > collection, at least to the evictor? If not, how are we going
> to
> > > > >> > > extend in the future to cover this case?
> > > > >> > >
> > > > >> > > Regarding the ordering - I also observed that there are
> > situations
> > > > >> > > where elements do not have a logical order. One example is if
> > you
> > > > have
> > > > >> > > high rates of the events. Nevertheless, even if now is not the
> > > time
> > > > >> > > for this, I think in the future we can imagine having also
> some
> > > data
> > > > >> > > structures that offer some ordering. It can save some
> > computation
> > > > >> > > efforts later in the functions for some use cases.
> > > > >> > >
> > > > >> > >
> > > > >> > > Best regards,
> > > > >> > >
> > > > >> > >
> > > > >> > > -----Original Message-----
> > > > >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > > > >> > > To: [hidden email]
> > > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> Metadata
> > > > >> > >
> > > > >> > > I incorporated the changes. The proposed interface of
> > > > >> > > ProcessWindowFunction is now this:
> > > > >> > >
> > > > >> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W
> > > extends
> > > > >> > > Window> implements Function {
> > > > >> > >
> > > > >> > >     public abstract void process(KEY key, Iterable<IN>
> elements,
> > > > >> > > Context
> > > > >> > > ctx) throws Exception;
> > > > >> > >
> > > > >> > >     public abstract class Context {
> > > > >> > >         public abstract W window();
> > > > >> > >         public abstract void output(OUT value);
> > > > >> > >     }
> > > > >> > > }
> > > > >> > >
> > > > >> > > I'm proposing to not expose Collector anymore because it has
> the
> > > > >> > > close() method that should not be called by users. Having the
> > > > output()
> > > > >> > > call directly on the context should work just as well.
> > > > >> > >
> > > > >> > > Also, I marked the "adding a firing reason" and "adding firing
> > > > >> > > counter" as future work that are only examples of stuff that
> can
> > > be
> > > > >> > > implemented on top of the new interface. Initially, this will
> > > > provide
> > > > >> > > exactly the same features as the old API but be extensible. I
> > did
> > > > this
> > > > >> > > to not make the scope of this proposal to big because Radu
> also
> > > > >> > > suggested more changes and each of them should be covered in a
> > > > separate
> > > > >> > design doc or FLIP.
> > > > >> > >
> > > > >> > > @Radu: On the different buffer types. I think this would be
> very
> > > > >> tricky.
> > > > >> > > Right now, people should also not rely on the fact that
> elements
> > > are
> > > > >> > > "FIFO". Some state backends might keep the elements in a
> > different
> > > > >> > > order and when you have merging windows/session windows the
> > order
> > > of
> > > > >> > > the elements will also not be preserved.
> > > > >> > >
> > > > >> > > Cheers,
> > > > >> > > Aljoscha
> > > > >> > >
> > > > >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <
> > > [hidden email]>
> > > > >> > wrote:
> > > > >> > >
> > > > >> > > > Hi,
> > > > >> > > >
> > > > >> > > > If it is to extend the Context to pass more information
> > between
> > > > the
> > > > >> > > > stages of processing a window (triggering -> process ->
> > > eviction),
> > > > >> > > > why not adding also a "EvictionInfo"? I think this might
> > > actually
> > > > >> > > > help with the issues discussed in the tread related to the
> > > > eviction
> > > > >> > policy.
> > > > >> > > > I could imagine using this parameter to pass the conditions,
> > > from
> > > > >> > > > the processing stage to the evictor, about what events to be
> > > > >> > eliminated.
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > public abstract class Context {
> > > > >> > > >
> > > > >> > > >    public abstract EvictionInfo evictionInfo();
> > > > >> > > >
> > > > >> > > > ...
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >    public abstract KEY key();
> > > > >> > > >
> > > > >> > > >    public abstract W window();
> > > > >> > > >
> > > > >> > > >    public abstract int id();
> > > > >> > > >
> > > > >> > > >    public abstract FiringInfo firingInfo();
> > > > >> > > >
> > > > >> > > >    public abstract Iterable<IN> elements();
> > > > >> > > >
> > > > >> > > >    public abstract void output(OUT value);
> > > > >> > > >
> > > > >> > > > }
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > Also on a slightly unrelated issue - how hard it would be to
> > > > >> > > > introduce different types of buffers for the windows.
> > Currently
> > > > the
> > > > >> > > > existing one is behaving (when under processing) similar
> with
> > a
> > > > FIFO
> > > > >> > > > queue (in the sense that you need to start from beginning,
> > from
> > > > the
> > > > >> > oldest element).
> > > > >> > > > How about enabling for example also LIFO behavior (start
> > > iterating
> > > > >> > > > through the list from the most recent element). As in the
> > source
> > > > >> > > > queues or stacks are not actually used, perhaps we can just
> > pass
> > > > >> > > > policies to the iterator - or have custom itrators
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > Dr. Radu Tudoran
> > > > >> > > > Research Engineer - Big Data Expert
> > > > >> > > > IT R&D Division
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research
> Center
> > > > >> > > > Riesstrasse 25, 80992 München
> > > > >> > > >
> > > > >> > > > E-mail: [hidden email]
> > > > >> > > > Mobile: +49 15209084330 <+49%201520%209084330>
> <01520%209084330>
> > > > >> > > > Telephone: +49 891588344173 <+49%2089%201588344173>
> <089%201588344173>
> > > > >> > > >
> > > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > > > >> > > > Düsseldorf, Germany, www.huawei.com Registered
> > > > >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> > > Managing
> > > > >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> > > > Gesellschaft:
> > > > >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > > >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This
> > e-mail
> > > > and
> > > > >> > > > its attachments contain confidential information from
> HUAWEI,
> > > > which
> > > > >> > > > is intended only for the person or entity whose address is
> > > listed
> > > > >> > above.
> > > > >> > > > Any use of the information contained herein in any way
> > > (including,
> > > > >> > > > but not limited to, total or partial disclosure,
> reproduction,
> > > or
> > > > >> > > > dissemination) by persons other than the intended
> recipient(s)
> > > is
> > > > >> > > > prohibited. If you receive this e-mail in error, please
> notify
> > > the
> > > > >> > > > sender by phone or email immediately and delete it!
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > -----Original Message-----
> > > > >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > > >> > > > To: [hidden email]
> > > > >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> > Metadata
> > > > >> > > >
> > > > >> > > > Sure, I also thought about this but went for the "extreme"
> > > > initially.
> > > > >> > > > If no-one objects I'll update the doc in a bit.
> > > > >> > > >
> > > > >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <[hidden email]
> >
> > > > wrote:
> > > > >> > > >
> > > > >> > > > > Thanks for opening this.
> > > > >> > > > >
> > > > >> > > > > I see the need for having an extensible context object for
> > > > window
> > > > >> > > > > function invocations, but i think hiding every parameter
> in
> > > the
> > > > >> > > > > context is a bit unnatural.
> > > > >> > > > >
> > > > >> > > > > How about having a function "apply(Key, Values,
> > WindowContext,
> > > > >> > > > Collector)"
> > > > >> > > > > ?
> > > > >> > > > > It should be possible to write the straightforward use
> cases
> > > > >> > > > > without accessing the context object.
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > > >> > > > > <[hidden email]>
> > > > >> > > > > wrote:
> > > > >> > > > >
> > > > >> > > > > > Hi,
> > > > >> > > > > > this is a proposal to introduce a new interface for the
> > > window
> > > > >> > > > > > function
> > > > >> > > > > to
> > > > >> > > > > > make it more extensible for the future where we might
> want
> > > to
> > > > >> > > > > > provide additional information about why a window fired
> to
> > > the
> > > > >> > > > > > user
> > > > >> > > > function:
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > > >> > > > > +W
> > > > >> > > > > in
> > > > >> > > > > dow+Function+Metadata
> > > > >> > > > > >
> > > > >> > > > > > I'd appreciate your thoughts!
> > > > >> > > > > >
> > > > >> > > > > > Cheers,
> > > > >> > > > > > Aljoscha
> > > > >> > > > > >
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Aljoscha Krettek-2
Hi Manu,
it's great that you want to work on this but another contributor has also
started looking into this and has some code already. It's unfortunate that
there is no Jira issue yet for this but I think he'll open one this week.

Cheers,
Aljoscha

On Wed, 2 Nov 2016 at 13:00 Manu Zhang <[hidden email]> wrote:

> Hi Aljoscha,
>
> Have you started working on ProcessWindowFunction ? If not, may I take this
> task ?
>
> Thanks,
> Manu
>
>
> On Wed, Nov 2, 2016 at 5:16 PM Aljoscha Krettek <[hidden email]>
> wrote:
>
> > I think we reached consensus here so I would like to mark this FLIP as
> > accepted. We will now process with implementing the first step, i.e.
> adding
> > the new ProcessWindowFunction.
> >
> > On Mon, 1 Aug 2016 at 18:08 Aljoscha Krettek <[hidden email]>
> wrote:
> >
> > > Alright, that seems reasonable. I updated the doc to add the Collector
> to
> > > the method signature again.
> > >
> > > On Mon, 1 Aug 2016 at 00:59 Stephan Ewen <[hidden email]> wrote:
> > >
> > > The Collector is pretty integral in all the other functions that return
> > > multiple elements. I honestly don't see us switching away from it,
> given
> > > that it is such a core part of the API.
> > >
> > > The close() method has, to my best knowledge, not caused issues, yet. I
> > > cannot recall anyone mentioning that the close() method confused them,
> > they
> > > accidentally called it, etc.
> > > I am wondering whether this is more of a theoretical than practical
> > issue.
> > >
> > > If we move one function away from Collector to be on the safe side for
> a
> > > "maybe change in the future", while keeping Collector in all other
> > > functions - I think that fragments the API concept wise more than it
> > > improves anything.
> > >
> > > On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek <[hidden email]
> >
> > > wrote:
> > >
> > > > @Stephan: For the Output, should we keep using a Collector (which
> > > exposes)
> > > > the close() method which should never be called by users or create a
> > new
> > > > Output type that only has an "output" method. Collector can also be
> > used
> > > > but with a close() method that doesn't do anything. In the long run,
> I
> > > > thought it might be better to switch the type away from Collector.
> > > >
> > > > Cheers,
> > > > Aljoscha
> > > >
> > > > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]>
> > wrote:
> > > >
> > > > > I think it looks like Beam rather than Hadoop :)
> > > > >
> > > > > What Stephan meant was that he wanted a dedicated output method in
> > the
> > > > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't
> expose
> > > > > the collector.
> > > > >
> > > > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <
> > > [hidden email]>
> > > > > wrote:
> > > > > > You mean keep the Collector? I don't like that one because it has
> > the
> > > > > > close() method that should never be called by the user.
> > > > > >
> > > > > > We can keep it, though, because all the other user function
> > > interfaces
> > > > > also
> > > > > > expose it to the user.
> > > > > >
> > > > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]>
> > wrote:
> > > > > >
> > > > > >> I would actually make the output a separate parameter as well.
> > > Pretty
> > > > > much
> > > > > >> like the old variant, only replacing the "Window" parameter by
> the
> > > > > context
> > > > > >> (which contains everything about the window).
> > > > > >> It could also be called "WindowInvocationContext" or so.
> > > > > >>
> > > > > >> The current variant looks too Hadoop to me ;-) Everything done
> on
> > > the
> > > > > >> context object, and messy mocking when creating tests.
> > > > > >>
> > > > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> > > > [hidden email]>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Hi,
> > > > > >> >
> > > > > >> > Sorry - I made a mistake - I was thinking of getting access to
> > the
> > > > > >> > collection (mist-read :) collector) of events in the window
> > buffer
> > > > in
> > > > > >> > order to be able to delete/evict some of them which are not
> > > > necessary
> > > > > the
> > > > > >> > last ones.
> > > > > >> >
> > > > > >> >
> > > > > >> > Radu
> > > > > >> >
> > > > > >> > -----Original Message-----
> > > > > >> > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > > >> > Sent: Monday, July 18, 2016 5:54 PM
> > > > > >> > To: [hidden email]
> > > > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> Metadata
> > > > > >> >
> > > > > >> > What about the collector? This is only used for emitting
> > elements
> > > to
> > > > > the
> > > > > >> > downstream operation.
> > > > > >> >
> > > > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <
> > > [hidden email]>
> > > > > >> wrote:
> > > > > >> >
> > > > > >> > > Hi,
> > > > > >> > >
> > > > > >> > > I think it looks good and most importantly is that we can
> > extend
> > > > it
> > > > > in
> > > > > >> > > the directions discussed so far.
> > > > > >> > >
> > > > > >> > > One question though regarding the Collector - are we going
> to
> > be
> > > > > able
> > > > > >> > > to delete random elements from the list if this is not
> exposed
> > > as
> > > > a
> > > > > >> > > collection, at least to the evictor? If not, how are we
> going
> > to
> > > > > >> > > extend in the future to cover this case?
> > > > > >> > >
> > > > > >> > > Regarding the ordering - I also observed that there are
> > > situations
> > > > > >> > > where elements do not have a logical order. One example is
> if
> > > you
> > > > > have
> > > > > >> > > high rates of the events. Nevertheless, even if now is not
> the
> > > > time
> > > > > >> > > for this, I think in the future we can imagine having also
> > some
> > > > data
> > > > > >> > > structures that offer some ordering. It can save some
> > > computation
> > > > > >> > > efforts later in the functions for some use cases.
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > Best regards,
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > -----Original Message-----
> > > > > >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > > > > >> > > To: [hidden email]
> > > > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> > Metadata
> > > > > >> > >
> > > > > >> > > I incorporated the changes. The proposed interface of
> > > > > >> > > ProcessWindowFunction is now this:
> > > > > >> > >
> > > > > >> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W
> > > > extends
> > > > > >> > > Window> implements Function {
> > > > > >> > >
> > > > > >> > >     public abstract void process(KEY key, Iterable<IN>
> > elements,
> > > > > >> > > Context
> > > > > >> > > ctx) throws Exception;
> > > > > >> > >
> > > > > >> > >     public abstract class Context {
> > > > > >> > >         public abstract W window();
> > > > > >> > >         public abstract void output(OUT value);
> > > > > >> > >     }
> > > > > >> > > }
> > > > > >> > >
> > > > > >> > > I'm proposing to not expose Collector anymore because it has
> > the
> > > > > >> > > close() method that should not be called by users. Having
> the
> > > > > output()
> > > > > >> > > call directly on the context should work just as well.
> > > > > >> > >
> > > > > >> > > Also, I marked the "adding a firing reason" and "adding
> firing
> > > > > >> > > counter" as future work that are only examples of stuff that
> > can
> > > > be
> > > > > >> > > implemented on top of the new interface. Initially, this
> will
> > > > > provide
> > > > > >> > > exactly the same features as the old API but be extensible.
> I
> > > did
> > > > > this
> > > > > >> > > to not make the scope of this proposal to big because Radu
> > also
> > > > > >> > > suggested more changes and each of them should be covered
> in a
> > > > > separate
> > > > > >> > design doc or FLIP.
> > > > > >> > >
> > > > > >> > > @Radu: On the different buffer types. I think this would be
> > very
> > > > > >> tricky.
> > > > > >> > > Right now, people should also not rely on the fact that
> > elements
> > > > are
> > > > > >> > > "FIFO". Some state backends might keep the elements in a
> > > different
> > > > > >> > > order and when you have merging windows/session windows the
> > > order
> > > > of
> > > > > >> > > the elements will also not be preserved.
> > > > > >> > >
> > > > > >> > > Cheers,
> > > > > >> > > Aljoscha
> > > > > >> > >
> > > > > >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <
> > > > [hidden email]>
> > > > > >> > wrote:
> > > > > >> > >
> > > > > >> > > > Hi,
> > > > > >> > > >
> > > > > >> > > > If it is to extend the Context to pass more information
> > > between
> > > > > the
> > > > > >> > > > stages of processing a window (triggering -> process ->
> > > > eviction),
> > > > > >> > > > why not adding also a "EvictionInfo"? I think this might
> > > > actually
> > > > > >> > > > help with the issues discussed in the tread related to the
> > > > > eviction
> > > > > >> > policy.
> > > > > >> > > > I could imagine using this parameter to pass the
> conditions,
> > > > from
> > > > > >> > > > the processing stage to the evictor, about what events to
> be
> > > > > >> > eliminated.
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > public abstract class Context {
> > > > > >> > > >
> > > > > >> > > >    public abstract EvictionInfo evictionInfo();
> > > > > >> > > >
> > > > > >> > > > ...
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >    public abstract KEY key();
> > > > > >> > > >
> > > > > >> > > >    public abstract W window();
> > > > > >> > > >
> > > > > >> > > >    public abstract int id();
> > > > > >> > > >
> > > > > >> > > >    public abstract FiringInfo firingInfo();
> > > > > >> > > >
> > > > > >> > > >    public abstract Iterable<IN> elements();
> > > > > >> > > >
> > > > > >> > > >    public abstract void output(OUT value);
> > > > > >> > > >
> > > > > >> > > > }
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > Also on a slightly unrelated issue - how hard it would be
> to
> > > > > >> > > > introduce different types of buffers for the windows.
> > > Currently
> > > > > the
> > > > > >> > > > existing one is behaving (when under processing) similar
> > with
> > > a
> > > > > FIFO
> > > > > >> > > > queue (in the sense that you need to start from beginning,
> > > from
> > > > > the
> > > > > >> > oldest element).
> > > > > >> > > > How about enabling for example also LIFO behavior (start
> > > > iterating
> > > > > >> > > > through the list from the most recent element). As in the
> > > source
> > > > > >> > > > queues or stacks are not actually used, perhaps we can
> just
> > > pass
> > > > > >> > > > policies to the iterator - or have custom itrators
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > Dr. Radu Tudoran
> > > > > >> > > > Research Engineer - Big Data Expert
> > > > > >> > > > IT R&D Division
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research
> > Center
> > > > > >> > > > Riesstrasse 25, 80992 München
> > > > > >> > > >
> > > > > >> > > > E-mail: [hidden email]
> > > > > >> > > > Mobile: +49 15209084330 <01520%209084330>
> <+49%201520%209084330>
> > <01520%209084330>
> > > > > >> > > > Telephone: +49 891588344173 <089%201588344173>
> <+49%2089%201588344173>
> > <089%201588344173>
> > > > > >> > > >
> > > > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
> > > > > >> > > > Düsseldorf, Germany, www.huawei.com Registered
> > > > > >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> > > > Managing
> > > > > >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> > > > > Gesellschaft:
> > > > > >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > > > >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This
> > > e-mail
> > > > > and
> > > > > >> > > > its attachments contain confidential information from
> > HUAWEI,
> > > > > which
> > > > > >> > > > is intended only for the person or entity whose address is
> > > > listed
> > > > > >> > above.
> > > > > >> > > > Any use of the information contained herein in any way
> > > > (including,
> > > > > >> > > > but not limited to, total or partial disclosure,
> > reproduction,
> > > > or
> > > > > >> > > > dissemination) by persons other than the intended
> > recipient(s)
> > > > is
> > > > > >> > > > prohibited. If you receive this e-mail in error, please
> > notify
> > > > the
> > > > > >> > > > sender by phone or email immediately and delete it!
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > -----Original Message-----
> > > > > >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > > >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > > > >> > > > To: [hidden email]
> > > > > >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> > > Metadata
> > > > > >> > > >
> > > > > >> > > > Sure, I also thought about this but went for the "extreme"
> > > > > initially.
> > > > > >> > > > If no-one objects I'll update the doc in a bit.
> > > > > >> > > >
> > > > > >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <
> [hidden email]
> > >
> > > > > wrote:
> > > > > >> > > >
> > > > > >> > > > > Thanks for opening this.
> > > > > >> > > > >
> > > > > >> > > > > I see the need for having an extensible context object
> for
> > > > > window
> > > > > >> > > > > function invocations, but i think hiding every parameter
> > in
> > > > the
> > > > > >> > > > > context is a bit unnatural.
> > > > > >> > > > >
> > > > > >> > > > > How about having a function "apply(Key, Values,
> > > WindowContext,
> > > > > >> > > > Collector)"
> > > > > >> > > > > ?
> > > > > >> > > > > It should be possible to write the straightforward use
> > cases
> > > > > >> > > > > without accessing the context object.
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > > > >> > > > > <[hidden email]>
> > > > > >> > > > > wrote:
> > > > > >> > > > >
> > > > > >> > > > > > Hi,
> > > > > >> > > > > > this is a proposal to introduce a new interface for
> the
> > > > window
> > > > > >> > > > > > function
> > > > > >> > > > > to
> > > > > >> > > > > > make it more extensible for the future where we might
> > want
> > > > to
> > > > > >> > > > > > provide additional information about why a window
> fired
> > to
> > > > the
> > > > > >> > > > > > user
> > > > > >> > > > function:
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > >
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > > > >> > > > > +W
> > > > > >> > > > > in
> > > > > >> > > > > dow+Function+Metadata
> > > > > >> > > > > >
> > > > > >> > > > > > I'd appreciate your thoughts!
> > > > > >> > > > > >
> > > > > >> > > > > > Cheers,
> > > > > >> > > > > > Aljoscha
> > > > > >> > > > > >
> > > > > >> > > > >
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

Manu Zhang
No problem, Aljoscha. I'll follow up with my use cases.

Thanks,
Manu

On Wed, Nov 2, 2016 at 9:15 PM Aljoscha Krettek <[hidden email]> wrote:

> Hi Manu,
> it's great that you want to work on this but another contributor has also
> started looking into this and has some code already. It's unfortunate that
> there is no Jira issue yet for this but I think he'll open one this week.
>
> Cheers,
> Aljoscha
>
> On Wed, 2 Nov 2016 at 13:00 Manu Zhang <[hidden email]> wrote:
>
> > Hi Aljoscha,
> >
> > Have you started working on ProcessWindowFunction ? If not, may I take
> this
> > task ?
> >
> > Thanks,
> > Manu
> >
> >
> > On Wed, Nov 2, 2016 at 5:16 PM Aljoscha Krettek <[hidden email]>
> > wrote:
> >
> > > I think we reached consensus here so I would like to mark this FLIP as
> > > accepted. We will now process with implementing the first step, i.e.
> > adding
> > > the new ProcessWindowFunction.
> > >
> > > On Mon, 1 Aug 2016 at 18:08 Aljoscha Krettek <[hidden email]>
> > wrote:
> > >
> > > > Alright, that seems reasonable. I updated the doc to add the
> Collector
> > to
> > > > the method signature again.
> > > >
> > > > On Mon, 1 Aug 2016 at 00:59 Stephan Ewen <[hidden email]> wrote:
> > > >
> > > > The Collector is pretty integral in all the other functions that
> return
> > > > multiple elements. I honestly don't see us switching away from it,
> > given
> > > > that it is such a core part of the API.
> > > >
> > > > The close() method has, to my best knowledge, not caused issues,
> yet. I
> > > > cannot recall anyone mentioning that the close() method confused
> them,
> > > they
> > > > accidentally called it, etc.
> > > > I am wondering whether this is more of a theoretical than practical
> > > issue.
> > > >
> > > > If we move one function away from Collector to be on the safe side
> for
> > a
> > > > "maybe change in the future", while keeping Collector in all other
> > > > functions - I think that fragments the API concept wise more than it
> > > > improves anything.
> > > >
> > > > On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek <
> [hidden email]
> > >
> > > > wrote:
> > > >
> > > > > @Stephan: For the Output, should we keep using a Collector (which
> > > > exposes)
> > > > > the close() method which should never be called by users or create
> a
> > > new
> > > > > Output type that only has an "output" method. Collector can also be
> > > used
> > > > > but with a close() method that doesn't do anything. In the long
> run,
> > I
> > > > > thought it might be better to switch the type away from Collector.
> > > > >
> > > > > Cheers,
> > > > > Aljoscha
> > > > >
> > > > > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels <[hidden email]>
> > > wrote:
> > > > >
> > > > > > I think it looks like Beam rather than Hadoop :)
> > > > > >
> > > > > > What Stephan meant was that he wanted a dedicated output method
> in
> > > the
> > > > > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't
> > expose
> > > > > > the collector.
> > > > > >
> > > > > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <
> > > > [hidden email]>
> > > > > > wrote:
> > > > > > > You mean keep the Collector? I don't like that one because it
> has
> > > the
> > > > > > > close() method that should never be called by the user.
> > > > > > >
> > > > > > > We can keep it, though, because all the other user function
> > > > interfaces
> > > > > > also
> > > > > > > expose it to the user.
> > > > > > >
> > > > > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <[hidden email]>
> > > wrote:
> > > > > > >
> > > > > > >> I would actually make the output a separate parameter as well.
> > > > Pretty
> > > > > > much
> > > > > > >> like the old variant, only replacing the "Window" parameter by
> > the
> > > > > > context
> > > > > > >> (which contains everything about the window).
> > > > > > >> It could also be called "WindowInvocationContext" or so.
> > > > > > >>
> > > > > > >> The current variant looks too Hadoop to me ;-) Everything done
> > on
> > > > the
> > > > > > >> context object, and messy mocking when creating tests.
> > > > > > >>
> > > > > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> > > > > [hidden email]>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Hi,
> > > > > > >> >
> > > > > > >> > Sorry - I made a mistake - I was thinking of getting access
> to
> > > the
> > > > > > >> > collection (mist-read :) collector) of events in the window
> > > buffer
> > > > > in
> > > > > > >> > order to be able to delete/evict some of them which are not
> > > > > necessary
> > > > > > the
> > > > > > >> > last ones.
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > Radu
> > > > > > >> >
> > > > > > >> > -----Original Message-----
> > > > > > >> > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > > > >> > Sent: Monday, July 18, 2016 5:54 PM
> > > > > > >> > To: [hidden email]
> > > > > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> > Metadata
> > > > > > >> >
> > > > > > >> > What about the collector? This is only used for emitting
> > > elements
> > > > to
> > > > > > the
> > > > > > >> > downstream operation.
> > > > > > >> >
> > > > > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <
> > > > [hidden email]>
> > > > > > >> wrote:
> > > > > > >> >
> > > > > > >> > > Hi,
> > > > > > >> > >
> > > > > > >> > > I think it looks good and most importantly is that we can
> > > extend
> > > > > it
> > > > > > in
> > > > > > >> > > the directions discussed so far.
> > > > > > >> > >
> > > > > > >> > > One question though regarding the Collector - are we going
> > to
> > > be
> > > > > > able
> > > > > > >> > > to delete random elements from the list if this is not
> > exposed
> > > > as
> > > > > a
> > > > > > >> > > collection, at least to the evictor? If not, how are we
> > going
> > > to
> > > > > > >> > > extend in the future to cover this case?
> > > > > > >> > >
> > > > > > >> > > Regarding the ordering - I also observed that there are
> > > > situations
> > > > > > >> > > where elements do not have a logical order. One example is
> > if
> > > > you
> > > > > > have
> > > > > > >> > > high rates of the events. Nevertheless, even if now is not
> > the
> > > > > time
> > > > > > >> > > for this, I think in the future we can imagine having also
> > > some
> > > > > data
> > > > > > >> > > structures that offer some ordering. It can save some
> > > > computation
> > > > > > >> > > efforts later in the functions for some use cases.
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > Best regards,
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > -----Original Message-----
> > > > > > >> > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > > > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > > > > > >> > > To: [hidden email]
> > > > > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> > > Metadata
> > > > > > >> > >
> > > > > > >> > > I incorporated the changes. The proposed interface of
> > > > > > >> > > ProcessWindowFunction is now this:
> > > > > > >> > >
> > > > > > >> > > public abstract class ProcessWindowFunction <IN, OUT,
> KEY, W
> > > > > extends
> > > > > > >> > > Window> implements Function {
> > > > > > >> > >
> > > > > > >> > >     public abstract void process(KEY key, Iterable<IN>
> > > elements,
> > > > > > >> > > Context
> > > > > > >> > > ctx) throws Exception;
> > > > > > >> > >
> > > > > > >> > >     public abstract class Context {
> > > > > > >> > >         public abstract W window();
> > > > > > >> > >         public abstract void output(OUT value);
> > > > > > >> > >     }
> > > > > > >> > > }
> > > > > > >> > >
> > > > > > >> > > I'm proposing to not expose Collector anymore because it
> has
> > > the
> > > > > > >> > > close() method that should not be called by users. Having
> > the
> > > > > > output()
> > > > > > >> > > call directly on the context should work just as well.
> > > > > > >> > >
> > > > > > >> > > Also, I marked the "adding a firing reason" and "adding
> > firing
> > > > > > >> > > counter" as future work that are only examples of stuff
> that
> > > can
> > > > > be
> > > > > > >> > > implemented on top of the new interface. Initially, this
> > will
> > > > > > provide
> > > > > > >> > > exactly the same features as the old API but be
> extensible.
> > I
> > > > did
> > > > > > this
> > > > > > >> > > to not make the scope of this proposal to big because Radu
> > > also
> > > > > > >> > > suggested more changes and each of them should be covered
> > in a
> > > > > > separate
> > > > > > >> > design doc or FLIP.
> > > > > > >> > >
> > > > > > >> > > @Radu: On the different buffer types. I think this would
> be
> > > very
> > > > > > >> tricky.
> > > > > > >> > > Right now, people should also not rely on the fact that
> > > elements
> > > > > are
> > > > > > >> > > "FIFO". Some state backends might keep the elements in a
> > > > different
> > > > > > >> > > order and when you have merging windows/session windows
> the
> > > > order
> > > > > of
> > > > > > >> > > the elements will also not be preserved.
> > > > > > >> > >
> > > > > > >> > > Cheers,
> > > > > > >> > > Aljoscha
> > > > > > >> > >
> > > > > > >> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <
> > > > > [hidden email]>
> > > > > > >> > wrote:
> > > > > > >> > >
> > > > > > >> > > > Hi,
> > > > > > >> > > >
> > > > > > >> > > > If it is to extend the Context to pass more information
> > > > between
> > > > > > the
> > > > > > >> > > > stages of processing a window (triggering -> process ->
> > > > > eviction),
> > > > > > >> > > > why not adding also a "EvictionInfo"? I think this might
> > > > > actually
> > > > > > >> > > > help with the issues discussed in the tread related to
> the
> > > > > > eviction
> > > > > > >> > policy.
> > > > > > >> > > > I could imagine using this parameter to pass the
> > conditions,
> > > > > from
> > > > > > >> > > > the processing stage to the evictor, about what events
> to
> > be
> > > > > > >> > eliminated.
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > public abstract class Context {
> > > > > > >> > > >
> > > > > > >> > > >    public abstract EvictionInfo evictionInfo();
> > > > > > >> > > >
> > > > > > >> > > > ...
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >    public abstract KEY key();
> > > > > > >> > > >
> > > > > > >> > > >    public abstract W window();
> > > > > > >> > > >
> > > > > > >> > > >    public abstract int id();
> > > > > > >> > > >
> > > > > > >> > > >    public abstract FiringInfo firingInfo();
> > > > > > >> > > >
> > > > > > >> > > >    public abstract Iterable<IN> elements();
> > > > > > >> > > >
> > > > > > >> > > >    public abstract void output(OUT value);
> > > > > > >> > > >
> > > > > > >> > > > }
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > Also on a slightly unrelated issue - how hard it would
> be
> > to
> > > > > > >> > > > introduce different types of buffers for the windows.
> > > > Currently
> > > > > > the
> > > > > > >> > > > existing one is behaving (when under processing) similar
> > > with
> > > > a
> > > > > > FIFO
> > > > > > >> > > > queue (in the sense that you need to start from
> beginning,
> > > > from
> > > > > > the
> > > > > > >> > oldest element).
> > > > > > >> > > > How about enabling for example also LIFO behavior (start
> > > > > iterating
> > > > > > >> > > > through the list from the most recent element). As in
> the
> > > > source
> > > > > > >> > > > queues or stacks are not actually used, perhaps we can
> > just
> > > > pass
> > > > > > >> > > > policies to the iterator - or have custom itrators
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > Dr. Radu Tudoran
> > > > > > >> > > > Research Engineer - Big Data Expert
> > > > > > >> > > > IT R&D Division
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research
> > > Center
> > > > > > >> > > > Riesstrasse 25, 80992 München
> > > > > > >> > > >
> > > > > > >> > > > E-mail: [hidden email]
> > > > > > >> > > > Mobile: +49 15209084330 <+49%201520%209084330>
> <01520%209084330>
> > <+49%201520%209084330>
> > > <01520%209084330>
> > > > > > >> > > > Telephone: +49 891588344173 <+49%2089%201588344173>
> <089%201588344173>
> > <+49%2089%201588344173>
> > > <089%201588344173>
> > > > > > >> > > >
> > > > > > >> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205,
> 40549
> > > > > > >> > > > Düsseldorf, Germany, www.huawei.com Registered
> > > > > > >> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB
> 56063,
> > > > > Managing
> > > > > > >> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der
> > > > > > Gesellschaft:
> > > > > > >> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> > > > > > >> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This
> > > > e-mail
> > > > > > and
> > > > > > >> > > > its attachments contain confidential information from
> > > HUAWEI,
> > > > > > which
> > > > > > >> > > > is intended only for the person or entity whose address
> is
> > > > > listed
> > > > > > >> > above.
> > > > > > >> > > > Any use of the information contained herein in any way
> > > > > (including,
> > > > > > >> > > > but not limited to, total or partial disclosure,
> > > reproduction,
> > > > > or
> > > > > > >> > > > dissemination) by persons other than the intended
> > > recipient(s)
> > > > > is
> > > > > > >> > > > prohibited. If you receive this e-mail in error, please
> > > notify
> > > > > the
> > > > > > >> > > > sender by phone or email immediately and delete it!
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > -----Original Message-----
> > > > > > >> > > > From: Aljoscha Krettek [mailto:[hidden email]]
> > > > > > >> > > > Sent: Wednesday, July 13, 2016 2:24 PM
> > > > > > >> > > > To: [hidden email]
> > > > > > >> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function
> > > > Metadata
> > > > > > >> > > >
> > > > > > >> > > > Sure, I also thought about this but went for the
> "extreme"
> > > > > > initially.
> > > > > > >> > > > If no-one objects I'll update the doc in a bit.
> > > > > > >> > > >
> > > > > > >> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <
> > [hidden email]
> > > >
> > > > > > wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Thanks for opening this.
> > > > > > >> > > > >
> > > > > > >> > > > > I see the need for having an extensible context object
> > for
> > > > > > window
> > > > > > >> > > > > function invocations, but i think hiding every
> parameter
> > > in
> > > > > the
> > > > > > >> > > > > context is a bit unnatural.
> > > > > > >> > > > >
> > > > > > >> > > > > How about having a function "apply(Key, Values,
> > > > WindowContext,
> > > > > > >> > > > Collector)"
> > > > > > >> > > > > ?
> > > > > > >> > > > > It should be possible to write the straightforward use
> > > cases
> > > > > > >> > > > > without accessing the context object.
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
> > > > > > >> > > > > <[hidden email]>
> > > > > > >> > > > > wrote:
> > > > > > >> > > > >
> > > > > > >> > > > > > Hi,
> > > > > > >> > > > > > this is a proposal to introduce a new interface for
> > the
> > > > > window
> > > > > > >> > > > > > function
> > > > > > >> > > > > to
> > > > > > >> > > > > > make it more extensible for the future where we
> might
> > > want
> > > > > to
> > > > > > >> > > > > > provide additional information about why a window
> > fired
> > > to
> > > > > the
> > > > > > >> > > > > > user
> > > > > > >> > > > function:
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > >
> > > > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
> > > > > > >> > > > > +W
> > > > > > >> > > > > in
> > > > > > >> > > > > dow+Function+Metadata
> > > > > > >> > > > > >
> > > > > > >> > > > > > I'd appreciate your thoughts!
> > > > > > >> > > > > >
> > > > > > >> > > > > > Cheers,
> > > > > > >> > > > > > Aljoscha
> > > > > > >> > > > > >
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > > >
> > >
> >
>