Got exception: unable to create new native thread when use BucketingSink writing to HDFS

Mu Kong
Hi all,

I am new to Flink. I set up a cluster across 3 servers, and the
processing speed was great.
However, after several hours of streaming, I got this error:

*java.lang.OutOfMemoryError: unable to create new native thread*


From:

at org.apache.flink.streaming.connectors.fs.StreamWriterBase.open(StreamWriterBase.java:120)
at org.apache.flink.streaming.connectors.fs.StringWriter.open(StringWriter.java:62)

It seems that while writing to HDFS, Flink opened too many files,
which exhausted the available native threads.
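
One way to check whether the thread count really grows along with the number of open files is to log it periodically from inside the affected JVM. The sketch below uses only the JDK's ThreadMXBean; the class name ThreadCountProbe is made up for illustration, and how it would be wired into a Flink job (for example from a user function) is left open here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Flink or Hadoop.
final class ThreadCountProbe {

    /** Returns a one-line summary of the live, daemon and peak thread counts of this JVM. */
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return String.format("live=%d daemon=%d peak=%d",
                threads.getThreadCount(),
                threads.getDaemonThreadCount(),
                threads.getPeakThreadCount());
    }

    public static void main(String[] args) {
        // Print one snapshot; in a long-running job this would be logged at intervals.
        System.out.println(snapshot());
    }
}
```

If the live count keeps climbing as new bucket files are opened, that supports the guess that each open stream holds on to threads until it is closed.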

Is there a way to solve this with BucketingSink?
Or how can I optimize my settings in general?

Thanks in advance.

FYI,
I also posted this question on Stack Overflow:
https://stackoverflow.com/questions/44425709/flink-unable-to-create-new-native-thread-when-use-bucketingsink

Re: Got exception: unable to create new native thread when use BucketingSink writing to HDFS

Till Rohrmann
Hi Mu Kong,

I think this is a problem with how the BucketingSink works. For each file,
it keeps a writer that holds an open file stream. I think we should limit
the number of open writers so that we don't run into the problem of too
many open file handles/open HDFS connections.
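
To illustrate what limiting the number of open writers could look like, here is a minimal sketch of a least-recently-used cache that closes the oldest writer once a cap is exceeded. This is not how BucketingSink is implemented; BoundedWriterCache, writerFor and openCount are hypothetical names, and the sketch ignores checkpointing and the fact that a closed part file may need to be replaced by a new part file rather than re-opened.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Flink: keeps at most maxOpen writers open.
final class BoundedWriterCache<W extends Closeable> {

    private final Function<String, W> opener;
    private final LinkedHashMap<String, W> open;

    BoundedWriterCache(int maxOpen, Function<String, W> opener) {
        this.opener = opener;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.open = new LinkedHashMap<String, W>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, W> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                try {
                    // Close the least recently used writer before it is dropped.
                    eldest.getValue().close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return true;
            }
        };
    }

    /** Returns the open writer for a bucket path, opening one if none is cached. */
    W writerFor(String bucketPath) {
        W writer = open.get(bucketPath); // get() also marks the entry as recently used
        if (writer == null) {
            writer = opener.apply(bucketPath);
            open.put(bucketPath, writer); // may evict and close the eldest writer
        }
        return writer;
    }

    int openCount() {
        return open.size();
    }
}
```

With a cap of 2, requesting writers for buckets "a", "b", "a" and then "c" closes the writer for "b", since "a" was used more recently. The trade-off is that a bucket which keeps receiving data after eviction pays the cost of opening a new file each time.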

Could you open a JIRA issue for that?

Cheers,
Till

On Thu, Jun 8, 2017 at 9:28 AM, Mu Kong <[hidden email]> wrote:

> Hi all,
>
> I am new to flink. I tried to start a cluster over 3 servers, and the
> process speed was GREAT.
> However, after several hours' streaming, I got this error:
>
> *java.lang.OutOfMemoryError: unable to create new native thread*
>
>
> From:
>
> at org.apache.flink.streaming.connectors.fs.StreamWriterBase.open(StreamWriterBase.java:120)
> at org.apache.flink.streaming.connectors.fs.StringWriter.open(StringWriter.java:62)
>
> It seems that when flink was writing to HDFS, flink opened too many files
> which run out all the native threads.
>
> I wonder is there a way for BucketingSink to solve this.
> Or how can I optimize my setting in general?
>
> Thanks in advance.
>
> FYI,
> I also posted this question to stack overflow:
> https://stackoverflow.com/questions/44425709/flink-unable-to-create-new-native-thread-when-use-bucketingsink
>

Re: Got exception: unable to create new native thread when use BucketingSink writing to HDFS

Mu Kong
Hi Till Rohrmann,

Thanks for the prompt response!
Limiting the number of open connections to HDFS would be great.

I created an issue here in JIRA:
https://issues.apache.org/jira/browse/FLINK-6873

Best regards,
Mu

On Thu, Jun 8, 2017 at 7:06 PM, Till Rohrmann <[hidden email]> wrote:

> Hi Mu Kong,
>
> I think this is a problem with how the BucketingSink works. It keeps for
> each file a writer which has an open file stream. I think we should limit
> the number of open writers such that we don't run into the problem of too
> many open file handles/open HDFS connections.
>
> Could you open a JIRA issue for that?
>
> Cheers,
> Till