currentLowWatermark metric not reported for all tasks?

dan bress
I am trying to measure the currentLowWatermark throughout my dataflow, but
I am not seeing it for tasks with sources.  For those tasks I see these
metrics:

lastCheckpointSize
numBytesInLocal
numBytesInRemote
numBytesOut

Why am I not seeing currentLowWatermark on these tasks?

Thanks!

Dan

Re: currentLowWatermark metric not reported for all tasks?

Chesnay Schepler-3
Hello Dan,

The technical reason is that this metric is only collected in the
*InputProcessor classes, which aren't used for source tasks.

I do recall that there were discussions about source watermarks, but
frankly I don't remember why we didn't add them.

In order to add them one would only have to modify the SourceContext
classes to a) store the last emitted watermark and b) expose it through
a metric.
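
As a rough illustration of those two steps, here is a minimal, self-contained
sketch. It is not Flink code: SourceWatermarkSketch, WatermarkTrackingContext
and the nested Gauge interface are hypothetical stand-ins made up for this
example, and forwarding the watermark downstream is left out. In Flink itself
the gauge would presumably be registered with the operator's metric group
rather than returned like this.

```java
// Hypothetical sketch, NOT Flink's actual code: every name here is made up.
final class SourceWatermarkSketch {

    /** Stand-in for a metric gauge: a read-only view of a single value. */
    interface Gauge<T> {
        T getValue();
    }

    /** Stand-in for a source context that remembers the last watermark it emitted. */
    static final class WatermarkTrackingContext {
        // volatile: written by the source thread, read by a metrics reporter thread.
        private volatile long lastEmittedWatermark = Long.MIN_VALUE;

        /** a) Store the last emitted watermark (forwarding it downstream is omitted). */
        void emitWatermark(long timestamp) {
            // Watermarks should not move backwards, so ignore regressions.
            if (timestamp > lastEmittedWatermark) {
                lastEmittedWatermark = timestamp;
            }
        }

        /** b) Expose it through a metric. */
        Gauge<Long> currentLowWatermarkGauge() {
            return () -> lastEmittedWatermark;
        }
    }

    public static void main(String[] args) {
        WatermarkTrackingContext ctx = new WatermarkTrackingContext();
        Gauge<Long> gauge = ctx.currentLowWatermarkGauge();

        ctx.emitWatermark(1_000L);
        ctx.emitWatermark(2_500L);

        // The gauge always reflects the most recent watermark.
        System.out.println("currentLowWatermark = " + gauge.getValue()); // prints 2500
    }
}
```

The gauge reads the field lazily, so a reporter polling it sees the latest
value without the source having to push updates.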

Regards,
Chesnay

On 29.09.2016 23:03, dan bress wrote:

> why am I not seeing currentLowWatermark on these tasks?

Re: currentLowWatermark metric not reported for all tasks?

Stephan Ewen
I think what you describe, Chesnay, is exactly what we should do...

On Fri, Sep 30, 2016 at 1:15 PM, Chesnay Schepler <[hidden email]>
wrote:

> In order to add them one would only have to modify the SourceContext
> classes to a) store the last emitted watermark and b) expose it through a
> metric.

Re: currentLowWatermark metric not reported for all tasks?

dan bress
Awesome!  It would definitely help me troubleshoot lagging watermarks if I
can see what watermark all my sources have seen.  Thanks for looking into
this!
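
To make concrete what I would do with such a metric, here is a small sketch of
the check I have in mind. Everything in it is an assumption for illustration:
WatermarkLagSketch, the source names and the watermark values are invented, and
the readings are simply passed in as a map rather than fetched from any real
metrics reporter. The idea is just to find the source with the smallest
watermark, since that is the one I would look at first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: names and values are illustrative only.
final class WatermarkLagSketch {

    /** Returns the source with the smallest watermark, or null if there are none. */
    static String slowestSource(Map<String, Long> watermarkBySource) {
        String slowest = null;
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : watermarkBySource.entrySet()) {
            if (slowest == null || e.getValue() < min) {
                slowest = e.getKey();
                min = e.getValue();
            }
        }
        return slowest;
    }

    /** How far a watermark trails a reference timestamp, in milliseconds. */
    static long lagMillis(long watermark, long referenceMillis) {
        return referenceMillis - watermark;
    }

    public static void main(String[] args) {
        // Made-up readings, as if collected from one watermark gauge per source.
        Map<String, Long> readings = new LinkedHashMap<>();
        readings.put("source-0", 5_000L);
        readings.put("source-1", 3_200L);
        readings.put("source-2", 4_800L);

        String slowest = slowestSource(readings);
        long lag = lagMillis(readings.get(slowest), 6_000L);
        System.out.println(slowest + " lags by " + lag + " ms"); // prints source-1 lags by 2800 ms
    }
}
```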

Dan

On Fri, Sep 30, 2016 at 5:48 AM Stephan Ewen <[hidden email]> wrote:

> I think what you describe, Chesnay, is exactly what we should do...

Re: currentLowWatermark metric not reported for all tasks?

Robert Metzger
I added a JIRA for this feature request:
https://issues.apache.org/jira/browse/FLINK-4812

On Fri, Sep 30, 2016 at 6:13 PM, dan bress <[hidden email]> wrote:

> Awesome!  It would definitely help me troubleshoot lagging watermarks if i
> can see what watermark all my sources have seen.  Thanks for looking into
> this!

Re: currentLowWatermark metric not reported for all tasks?

dan bress
Thank you Robert!

On Wed, Oct 12, 2016 at 2:55 AM Robert Metzger <[hidden email]> wrote:

> I added a JIRA for this feature request:
> https://issues.apache.org/jira/browse/FLINK-4812