[DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert
Hi everyone, 

Disclaimer: I read the contribution guide about improvement requests (i.e. I should actually just open a Jira ticket), but I thought it would make sense to run this through the mailing list first. After collecting some input, I would then create the Jira ticket.

When accessing the Flink Web Dashboard (which I do almost every day to check the status of a job), I recently felt that the information given in the top portion of the start page could be improved considerably. I created a first mock by moving HTML elements around and wanted to share it:

[image: image.png]

With the exception of the metrics (see below), none of this information should be new; it is only re-organized to speed up investigation and monitoring:
  • A complete overview of the cluster status and health, without clicking through a lot of pages.
    • Active and standby Job Managers. Their health is depicted as a color (as a first suggestion: healthy if the last heartbeat is within heartbeat.timeout).
    • Currently registered Task Managers
      • The little bar on the side indicates task slot usage. I did not color it, since a fully utilised Task Manager is not necessarily something bad.
      • The color indicates the health of the Task Manager (as a first suggestion: healthy if the last heartbeat is within heartbeat.timeout).
  • An overview of some cluster metrics
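
As a rough illustration of the two indicators described above, here is a minimal TypeScript sketch. It assumes a hypothetical `ManagerStatus` record with a last-heartbeat timestamp and slot counts (these field names are made up for the example and are not taken from Flink's REST API), and it treats a manager as healthy when the last heartbeat falls within the configured heartbeat.timeout. Slot usage is returned as a plain fraction and is intentionally not mapped to a color.

```typescript
// Hypothetical shape of the data a dashboard component would receive;
// the field names are illustrative and not Flink's actual REST API.
interface ManagerStatus {
  id: string;
  lastHeartbeatMs: number; // epoch millis of the last heartbeat seen
  slotsTotal: number;      // task slots offered (0 if not applicable)
  slotsFree: number;       // task slots currently unused
}

type Health = "healthy" | "unhealthy";

// Healthy if the last heartbeat arrived within the timeout window.
function healthOf(m: ManagerStatus, nowMs: number, heartbeatTimeoutMs: number): Health {
  return nowMs - m.lastHeartbeatMs <= heartbeatTimeoutMs ? "healthy" : "unhealthy";
}

// Fraction of task slots in use, in [0, 1]; deliberately not turned into a color.
function slotUsage(m: ManagerStatus): number {
  if (m.slotsTotal <= 0) return 0;
  return (m.slotsTotal - m.slotsFree) / m.slotsTotal;
}

const now = 1000000;
const timeoutMs = 50000; // assumed value for heartbeat.timeout, in milliseconds
const tm: ManagerStatus = { id: "tm-1", lastHeartbeatMs: now - 10000, slotsTotal: 4, slotsFree: 1 };
const stale: ManagerStatus = { id: "tm-2", lastHeartbeatMs: now - 60000, slotsTotal: 4, slotsFree: 4 };

console.log(healthOf(tm, now, timeoutMs), slotUsage(tm));       // healthy 0.75
console.log(healthOf(stale, now, timeoutMs), slotUsage(stale)); // unhealthy 0
```

Keeping the health check and the usage fraction as separate functions mirrors the proposal: the color carries only liveness, while the bar carries only utilisation.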
Some points to note:
  • All data in the screenshot is mock data; the numbers are not consistent with one another. The colors, however, should already match the numbers they indicate.
  • All of this could also be done with whatever monitoring solution a company already has, by reading out the JMX metrics and plotting them there (e.g. Grafana). But this out-of-the-box solution would save everyone from building it on their own, and they could trust the metrics shown here.
  • Some of the metrics can only be implemented once FLINK-7286 is done. So I would split the implementation into two parts (cluster overview and metrics) and do them separately.
  • This first mock-up targets what we here at Zalando would like to see at first glance, so it fits our use case very well. We mostly use long-running session clusters.
  • I'm more of a backend guy with some frontend expertise (mostly in React; I have no experience with Angular 1, which the Flink Web Dashboard is currently built with), and not at all a designer.
What do you think? I would be glad to get some feedback on this, especially on whether it makes sense for the broader community. I will implement this one way or another: if not in the Flink master branch, then as an open-source project that anyone can deploy next to their Flink clusters. But I first wanted to run it through here to see if it sparks any interest.

Please also let me know if you already see difficulties in implementing this; maybe I have overlooked something.

Can't wait for your input.

Cheers

--

Fabian Wollert
Zalando SE

Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert
Argh, I think the screenshot is missing (at least Nabble is not showing anything). Here is a link to the mockup:

https://drive.google.com/file/d/1p3wVP028_AFFLZ6fjPb41yAI8zUhgDTO/view?usp=sharing

Cheers

--

Fabian Wollert
Zalando SE


On Tue, Oct 9, 2018 at 12:46 PM Fabian Wollert <[hidden email]> wrote:

Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Till Rohrmann
Hi Fabian,

thanks for starting this discussion. I agree with you that Flink's web
dashboard lacks a bit of general cluster overview information on the front
page. Your mock looks really promising to me since it shows some basic
metrics and cluster information at a glance. Apart from the source
input and sink output metrics, all other required information should
already be available for display in the dashboard. Thus, your proposal
should only affect flink-runtime-web, which should make it easier to realize.

I'm in favour of adding this feature to Flink's dashboard to make it
available to the whole community.

Cheers,
Till

On Tue, Oct 9, 2018 at 12:54 PM Fabian Wollert <[hidden email]> wrote:

> argh, i think the screenshot is missing (at least nabble is not showing
> anything). here is a link to the mockup:
>
>
> https://drive.google.com/file/d/1p3wVP028_AFFLZ6fjPb41yAI8zUhgDTO/view?usp=sharing

Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

isunjin
Great job! That would be very helpful for debugging.

  • I would suggest using small icons for the Job Manager/Managers when there are too many instances (like a thousand).
  • Maybe we can also introduce locality, so that task managers belonging to the same rack are shown together?
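
A small TypeScript sketch of both suggestions, under clearly stated assumptions: the `rack` label on each task manager is hypothetical (nothing in this thread establishes that Flink exposes rack information), and the threshold at which the view switches to small icons is an arbitrary example value.

```typescript
// Hypothetical task manager record; `rack` is an assumed label and not
// something Flink is known to provide.
interface TaskManagerInfo {
  id: string;
  rack: string;
}

// Group task managers by rack so that each rack can be rendered together.
function groupByRack(tms: TaskManagerInfo[]): Map<string, TaskManagerInfo[]> {
  const groups = new Map<string, TaskManagerInfo[]>();
  for (const tm of tms) {
    const group = groups.get(tm.rack) ?? [];
    group.push(tm);
    groups.set(tm.rack, group);
  }
  return groups;
}

// Switch to small icons once the instance count exceeds an assumed threshold.
function iconSize(instanceCount: number, compactThreshold = 100): "large" | "small" {
  return instanceCount > compactThreshold ? "small" : "large";
}

const taskManagers: TaskManagerInfo[] = [
  { id: "tm-1", rack: "rack-a" },
  { id: "tm-2", rack: "rack-b" },
  { id: "tm-3", rack: "rack-a" },
];
const byRack = groupByRack(taskManagers);

console.log(byRack.get("rack-a")?.map((t) => t.id)); // [ 'tm-1', 'tm-3' ]
console.log(iconSize(taskManagers.length), iconSize(1000)); // large small
```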




Small icons can be like this:




On Oct 9, 2018, at 8:49 PM, Till Rohrmann <[hidden email]> wrote:


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Zhijiang(wangzhijiang999)
Thanks Fabian for proposing this topic.

It is well worth improving the web dashboard to show more useful information, which can benefit Flink users a lot.

Just two small personal concerns:
1. The start time and end time are already given, so it is easy to estimate the rough duration. Is it necessary to spend space on showing the duration as well?
2. The job name given by users can be used for identification, whereas the job ID is randomly generated. I am not sure whether this ID is useful for further debugging. If not, maybe we can omit the job ID from the dashboard?

Best,
Zhijiang
------------------------------------------------------------------
From: Jin Sun <[hidden email]>
Sent: Wednesday, October 10, 2018, 01:10
To: dev <[hidden email]>
Subject: Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Robert Metzger
Hey Fabian,
thanks a lot for reaching out to the Flink community with this proposal!
(Posting to the ML instead of creating a JIRA is a good idea for such
questions -- you can create a ticket/tickets once the discussion here has
come to a conclusion)

I have two comments:
- You are listing Records/Kb in and Records/Kb out as cluster-wide metrics.
I wonder whether we should rather show these metrics for each job, instead
of the entire cluster? (or maybe both). My concern is that the cluster-wide
metric is not really relevant as soon as you have jobs with different
characteristics running on one cluster
- You mention that the Flink UI is based on Angular 1. I've been thinking
for quite a while now whether we should actually rewrite / migrate the
Flink UI to React.
Do you think we can re-use most of the work you'd be doing for this change
when we migrate to React?
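
To illustrate the first comment above, here is a minimal TypeScript sketch with made-up numbers. The `JobThroughput` type and the job names are hypothetical; the point is only that a cluster-wide sum is dominated by the busiest job and says little about the others.

```typescript
// Hypothetical per-job throughput sample; names and numbers are illustrative only.
interface JobThroughput {
  jobName: string;
  recordsInPerSec: number;
  recordsOutPerSec: number;
}

// Cluster-wide totals: one pair of numbers that hides per-job differences.
function clusterTotals(jobs: JobThroughput[]): { recordsIn: number; recordsOut: number } {
  return jobs.reduce(
    (acc, j) => ({
      recordsIn: acc.recordsIn + j.recordsInPerSec,
      recordsOut: acc.recordsOut + j.recordsOutPerSec,
    }),
    { recordsIn: 0, recordsOut: 0 },
  );
}

const jobs: JobThroughput[] = [
  { jobName: "clickstream", recordsInPerSec: 90000, recordsOutPerSec: 90000 },
  { jobName: "hourly-report", recordsInPerSec: 10, recordsOutPerSec: 1 },
];

const totals = clusterTotals(jobs);
console.log(totals); // { recordsIn: 90010, recordsOut: 90001 }

// The per-job view keeps the two very different workloads distinguishable.
for (const j of jobs) {
  console.log(j.jobName, j.recordsInPerSec, j.recordsOutPerSec);
}
```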

Best,
Robert



On Wed, Oct 10, 2018 at 8:24 AM Zhijiang(wangzhijiang999)
<[hidden email]> wrote:


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert
Hi everyone, thanks for all the comments and feedback. Let me address
everything individually:

@Till: yes, to start with, my plan is to touch only the
flink-runtime-web/web-dashboard folder.

@Jin Sun:

   - Smaller icons for increasing server counts: yes, that's also something I
   already thought about. I will keep it in mind when building the first
   version!
   - About locality: I searched quickly through the docs, but could not
   find anything about Flink supporting rack awareness. Is this something
   that is already implemented? If not, I think it would bloat this
   initial proposal. If it is already included somewhere, we could
   certainly implement it.

@Zhijiang: the focus of this redesign did not yet include the job list in
the lower half of the overview. As part of the redesign we can also think
about optimising this list, though, and removing unnecessary columns is
usually the easiest thing to do. We could create a separate ticket
for this and discuss it there, so as not to bloat the initial
discussion with too many topics.

@Robert:

   - Agreed that it might make sense to also show this at the job level. Since
   these metrics will probably only be introduced later anyway, we can
   discuss this separately after FLINK-9050
   <https://issues.apache.org/jira/browse/FLINK-9050> (I linked the wrong
   ticket in my initial mail) is done.
   - Rewriting the whole thing while doing this also crossed my mind. What
   I would like to do anyway (even if we stick with Angular 1 for now) is to
   remove Bower as the package manager (since it is deprecated) and update
   Bootstrap to v4. I will check how much additional effort it would be to
   move to React/Redux. We work with this stack here as well, so implementing
   at least a first MVP might be feasible before getting too deep into
   Angular 1 specifics. But that basically means that you are open to
   changing the underlying web/JS technology, yeah?

Cheers

--


Fabian Wollert
Zalando SE

E-Mail: [hidden email]


On Wed, Oct 10, 2018 at 8:41 AM Robert Metzger <[hidden email]> wrote:


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Till Rohrmann
Hi Fabian,

yes, the community is very much open to and thankful for contributions to
the web UI, including changes to the technology used. What it could use is
a person who would really like to drive this, since so far it has been, if
at all, someone's side project.

Cheers,
Till

On Wed, Oct 10, 2018 at 11:28 AM Fabian Wollert <[hidden email]> wrote:


Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Robert Metzger
In reply to this post by Fabian Wollert
Hey,
Sorry for the delay.

Yes -- I would be open to revisiting the underlying technologies.

Best,
Robert

On Wed, Oct 10, 2018 at 11:28 AM Fabian Wollert <[hidden email]> wrote:

> Hi everyone, thx for all the comments and feedback. Let me address
> everything individually:
>
> @Till: yes, for the start my plan would be to just touch the
> flink-runtime-web/web-dashboard repo/folder.
>
> @Jin Sun:
>
>    - smaller icons on increasing server counts: yes, thats also something i
>    already thought about. will keep it in mind when realizing the first
>    version!
>    - about locality: i searched quickly through the docs, but i could not
>    find anything regarding flink featuring rack awareness. Is this
> something
>    already implemented? If not, i think this will bloat the size of this
>    initial proposal. If its somewhere already included, we could implement
> it
>    for sure.
>
> @Zhijiang:the focus of this redesign was not yet including the job list in
> the lower half of the overview. as part of the redesign we can also think
> about optimising this list though, and removing unnecessary columns is
> usually the most easy thing to do. we can maybe create a separate ticket
> for this as well and discuss this issue there, to not bloat the initial
> discussion with too much topics.
>
> @Robert:
>
>    - Agreed that it might make sense to also show this on job level. Since
>    these metrics are probably gonna be introduced later only anyways, we
> can
>    discuss this maybe then separately after FLINK-9050
>    <https://issues.apache.org/jira/browse/FLINK-9050> (linked the wrong
>    ticket in my initial mail) is done.
>    - Rewriting the whole thing while doing this also came to my mind. What
>    i would like to do anyways (even if we stick for now to A1) is to remove
>    bower as a package manager (since its deprecated) and update bootstrap
> to
>    V4. I will check what the additional effort is to move to React/Redux.
>    We're working with this here at work as well, so implementing at least a
>    first MVP might be feasible as well, before getting to deep into A1
>    specifics. But that basically means that you guys are open to change the
>    underlying web/JS technology, yeah?
>

Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert
Hi everyone,

Thanks for all the feedback. I have now created
https://issues.apache.org/jira/browse/FLINK-10705 with sub-tickets to
tackle this. I also found some time this weekend and implemented a first
draft, which I will post in the ticket (I am not sure whether I can get the
pictures to work here on the mailing list :-D).

Let's continue the discussion in the tickets then.

Since this is my first bigger contribution to Flink, please advise on how
to handle tickets and structure the work. For now I will just continue
to work on this whenever I find free time.

Cheers

--


*Fabian Wollert*
*Zalando SE*

E-Mail: [hidden email]



Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert
Hi again,

Chesnay correctly commented in the tickets that we should first discuss
here whether changing the underlying technology of the Flink Web Dashboard
is a valid option at all. What are your thoughts on this?

Personally, I agree with Till's comments in the ticket: Angular 1 is
basically outdated and no longer has a large following. In my experience,
the choice between Angular 2-7 and React is subjective; you can get things
done with both. I only have experience with React, so I would be faster
developing with it. I currently have no plans to learn Angular as well
(being a more backend-focused developer in general), so if the decision is
to go with Angular, I would unfortunately most certainly be out of this
rework of the Flink Dashboard.

Cheers
Fabian

--


*Fabian Wollert*
*Zalando SE*

E-Mail: [hidden email]



Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Chesnay Schepler-3
Please start the discussion in an entirely new thread; people may
discard this thread immediately since the first page is purely about the
layout of the WebUI.



Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert
sure, will do.

--


*Fabian Wollert*
*Zalando SE*

E-Mail: [hidden email]


Am Mo., 29. Okt. 2018 um 12:57 Uhr schrieb Chesnay Schepler <
[hidden email]>:

> Please start the discussion in an entirely new thread; people may
> discard this thread immediately since the first page is purely about the
> layout of the WebUI.
>
> On 29.10.2018 12:39, Fabian Wollert wrote:
> > Hi again,
> >
> > Chesnay correctly commented in the tickets that we first should discuss
> > here, if changing the underlying technology for the Flink Web Dashboard
> is
> > a valid option at all. What are your thoughts about this?
> >
> > personally I agree with Till's comments in the ticket, Angular 1 being
> > basically outdated and is not having a large following anymore. From my
> > experience the choice between Angular 2-7 or React is subjective, you can
> > get things done with both. I personally only have experience with React,
> so
> > i personally would be faster to develop with this one. I currently have
> not
> > planned to learn Angular as well (being a more backend focused developer
> in
> > general) so if the decision would be to go with Angular, i would be
> > unfortunately out of this rework of the Flink Dashboard most certainly.
> >
> > Cheers
> > Fabian
> >
> > --
> >
> >
> > *Fabian WollertZalando SE*
> >
> > E-Mail: [hidden email]
> >
> >
> > Am Mo., 29. Okt. 2018 um 09:21 Uhr schrieb Fabian Wollert <
> [hidden email]
> >> :
> >> Hi everyone,
> >>
> >> thx for all the feedback. I created now
> >> https://issues.apache.org/jira/browse/FLINK-10705 with sub tickets to
> >> tackle this. i also found some time this weekend and implemented the
> first
> >> draft, which i will post in the ticket (not sure if i get the pictures
> to
> >> work here in the mailing list :-D).
> >>
> >> Lets continue discussion in the tickets then.
> >>
> >> Since this is my first bigger contribution to Flink, please advise on
> how
> >> to handle tickets, and structure the work. But for now i will just
> continue
> >> to work on this, whenever i find free time.
> >>
> >> Cheers
> >>
> >> --
> >>
> >>
> >> *Fabian WollertZalando SE*
> >>
> >> E-Mail: [hidden email]
> >>
> >>
> >> Am Sa., 27. Okt. 2018 um 17:15 Uhr schrieb Robert Metzger <
> >> [hidden email]>:
> >>
> >>> Hey,
> >>> Sorry for the delay.
> >>>
> >>> Yes -- I would be open to revisit the underlying technologies.
> >>>
> >>> Best,
> >>> Robert
> >>>
> >>> On Wed, Oct 10, 2018 at 11:28 AM Fabian Wollert <[hidden email]>
> >>> wrote:
> >>>
> >>>> Hi everyone, thx for all the comments and feedback. Let me address
> >>>> everything individually:
> >>>>
> >>>> @Till: yes, for the start my plan would be to just touch the
> >>>> flink-runtime-web/web-dashboard repo/folder.
> >>>>
> >>>> @Jin Sun:
> >>>>
> >>>>     - smaller icons on increasing server counts: yes, thats also
> >>> something i
> >>>>     already thought about. will keep it in mind when realizing the
> first
> >>>>     version!
> >>>>     - about locality: i searched quickly through the docs, but i could
> >>> not
> >>>>     find anything regarding flink featuring rack awareness. Is this
> >>>> something
> >>>>     already implemented? If not, i think this will bloat the size of
> this
> >>>>     initial proposal. If its somewhere already included, we could
> >>> implement
> >>>> it
> >>>>     for sure.
> >>>>
> >>>> @Zhijiang:the focus of this redesign was not yet including the job
> list
> >>> in
> >>>> the lower half of the overview. as part of the redesign we can also
> >>> think
> >>>> about optimising this list though, and removing unnecessary columns is
> >>>> usually the most easy thing to do. we can maybe create a separate
> ticket
> >>>> for this as well and discuss this issue there, to not bloat the
> initial
> >>>> discussion with too much topics.
> >>>>
> >>>> @Robert:
> >>>>
> >>>>     - Agreed that it might make sense to also show this on job level.
> >>> Since
> >>>>     these metrics are probably gonna be introduced later only
> anyways, we
> >>>> can
> >>>>     discuss this maybe then separately after FLINK-9050
> >>>>     <https://issues.apache.org/jira/browse/FLINK-9050> (linked the
> wrong
> >>>>     ticket in my initial mail) is done.
> >>>>     - Rewriting the whole thing while doing this also came to my mind.
> >>> What
> >>>>     i would like to do anyways (even if we stick for now to A1) is to
> >>> remove
> >>>>     bower as a package manager (since its deprecated) and update
> >>> bootstrap
> >>>> to
> >>>>     V4. I will check what the additional effort is to move to
> >>> React/Redux.
> >>>>     We're working with this here at work as well, so implementing at
> >>> least a
> >>>>     first MVP might be feasible as well, before getting to deep into
> A1
> >>>>     specifics. But that basically means that you guys are open to
> change
> >>> the
> >>>>     underlying web/JS technology, yeah?
> >>>>
> >>>> Cheers
> >>>>
> >>>> --
> >>>>
> >>>>
> >>>> *Fabian WollertZalando SE*
> >>>>
> >>>> E-Mail: [hidden email]
> >>>>
> >>>>
> >>>> Am Mi., 10. Okt. 2018 um 08:41 Uhr schrieb Robert Metzger <
> >>>> [hidden email]>:
> >>>>
> >>>>> Hey Fabian,
> >>>>> thanks a lot for reaching out to the Flink community with this
> >>> proposal!
> >>>>> (Posting to the ML instead of creating a JIRA is a good idea for such
> >>>>> questions -- you can create a ticket/tickets once the discussion here
> >>> has
> >>>>> come to a conclusion)
> >>>>>
> >>>>> I have two comments:
> >>>>> - You are listing Records/Kb in and Records/Kb out as cluster-wide
> >>>> metrics.
> >>>>> I wonder whether we should rather show these metrics for each job,
> >>>> instead
> >>>>> of the entire cluster? (or maybe both). My concern is that the
> >>>> cluster-wide
> >>>>> metric is not really relevant as soon as you have jobs with different
> >>>>> characteristics running on one cluster
> >>>>> - You mention that the Flink UI is based on Angular 1. I've been
> >>> thinking
> >>>>> for quite a while now whether we should actually rewrite / migrate
> the
> >>>>> Flink UI to React.
> >>>>> Do you think we can re-use most of the work you'd be doing for this
> >>>> change
> >>>>> when we migrate to React?
> >>>>>
> >>>>> Best,
> >>>>> Robert
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Oct 10, 2018 at 8:24 AM Zhijiang(wangzhijiang999)
> >>>>> <[hidden email]> wrote:
> >>>>>
> >>>>>> Thanks Fabian for proposing this topic.
> >>>>>>
> >>>>>> It is very worth improving the web dashborad for showing more useful
> >>>>>> informations which can benefit flink users a lot.
> >>>>>>
> >>>>>> Just two small personal concerns:
> >>>>>> 1. The start time and end time are already given, so it is easy to
> >>>>>> estimate the rough duration time. Is it necessary to show the
> >>> duration
> >>>>>> information to occupy the space?
> >>>>>> 2. The job name given by users can be used for identification, and
> >>> the
> >>>>>> job id is automatically generated in random. I am not sure whether
> >>> this
> >>>>> id
> >>>>>> is useful for further debugging. If not maybe we can ignore the job
> >>> id
> >>>>> from
> >>>>>> the dashboard?
> >>>>>>
> >>>>>> Best,
> >>>>>> Zhijiang
> >>>>>>
> >>>>>> ------------------------------------------------------------------
> >>>>>> 发件人:Jin Sun <[hidden email]>
> >>>>>> 发送时间:2018年10月10日(星期三) 01:10
> >>>>>> 收件人:dev <[hidden email]>
> >>>>>> 主 题:Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement
> >>> Proposal
> >>>>>> Great job! That would very helpful for debug.
> >>>>>>
> >>>>>>
> >>>>>>     - I would suggest to use small icons for this Job
> >>> Manager/Managers
> >>>>>>     when there are too many instances (like a thousand)
> >>>>>>     - May be we can also introduce locality,  that task managers
> >>> belongs
> >>>>>>     to same rack shows together?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Small icons can be like this:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Oct 9, 2018, at 8:49 PM, Till Rohrmann <[hidden email]> wrote:
> >>>>>> mation on the front
> >>>>>> page. Your mock looks really promising to me since it shows some basic
> >>>>>> metrics and cluster information at a glance. Apart from the source
> >>>>>> input and sink output metrics, all other required information should be
> >>>>>> available to display in the dashboard. Thus, your proposal should only
> >>>>>> affect flink-runtime-web, which should make it easier to realize.
> >>>>>>
> >>>>>> I'm in favour of adding this feature to Flink's dashboard to make it
> >>>>>> available to the whole community.
> >>>>>>
> >>>>>>
> >>>>>>
>
>

Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

jing
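We also need to refactor the single-job view. Right now it only shows vertex metrics, and if you want to see other information you have to go to another page. First, we need to update the vertex and operator views.

- vertex view: see https://issues.apache.org/jira/browse/FLINK-10802.
[image: image.png]
- operator view, like this:
[image: image.png]
- add a job dashboard that shows failover, TM, and operator overviews.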


On Mon, Oct 29, 2018 at 7:59 PM Fabian Wollert <[hidden email]> wrote:
sure, will do.

--


*Fabian Wollert*
*Zalando SE*

E-Mail: [hidden email]



Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Shaoxuan Wang
Lining,
Thanks for the proposal.
There is another ongoing ML thread
(http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Change-underlying-Frontend-Architecture-for-Flink-Web-Dashboard-td24902.html),
where YaDong has shared some sample code
(https://github.com/vthinkxie/flink-runtime-web) which upgrades the Flink
web UI to Angular 7.0. I would suggest we add all the new web UI features
on top of that underlying framework. What do you think?

Regards,
Shaoxuan



On Tue, Nov 6, 2018 at 6:37 PM lining jing <[hidden email]> wrote:

> We also need to refactor the single-job view. Right now it only shows vertex
> metrics, and if you want to see other information you have to go to another
> page. First, we need to update the vertex and operator views.
>
> - vertex view: see https://issues.apache.org/jira/browse/FLINK-10802.
> [image: image.png]
> - operator view, like this:
> [image: image.png]
> - add a job dashboard that shows failover, TM, and operator overviews.

Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

jing
Ok

On Tue, Nov 6, 2018 at 7:26 PM Shaoxuan Wang <[hidden email]> wrote:

> Lining,
> Thanks for the proposal.
> There is another ongoing ML thread
> (http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Change-underlying-Frontend-Architecture-for-Flink-Web-Dashboard-td24902.html),
> where YaDong has shared some sample code
> (https://github.com/vthinkxie/flink-runtime-web) which upgrades the Flink
> web UI to Angular 7.0. I would suggest we add all the new web UI features
> on top of that underlying framework. What do you think?
>
> Regards,
> Shaoxuan
>