nifi is running out of memory

Gop Krr

Hi All,

I have a very simple data flow where I need to move S3 data from a bucket in one account to a bucket in another account. I have attached my processor configuration.


2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2] org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space

I am very new to NiFi and trying to get a few of the use cases going. I need help from the community.

Thanks again

Rai




[Attachment: Screen Shot 2016-10-27 at 2.37.42 PM.png (155K)]

Re: nifi is running out of memory

Bryan Bende
Hello,

Are you running with all of the default settings?

If so, you would probably want to try increasing the memory settings in conf/bootstrap.conf.

They default to 512 MB; you may want to try bumping that up to 1024 MB.
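
For reference, the relevant lines in conf/bootstrap.conf look like the snippet below (the java.arg indexes may differ between NiFi versions):

    # conf/bootstrap.conf -- default JVM heap settings
    java.arg.2=-Xms512m
    java.arg.3=-Xmx512m
    # to try a larger heap, raise both, e.g.:
    # java.arg.2=-Xms1024m
    # java.arg.3=-Xmx1024m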

-Bryan

Re: nifi is running out of memory

Joe Witt
moving dev to bcc

Yes, I believe the issue here is that FetchS3 doesn't do chunked transfers and so is loading everything into memory. I've not verified this in the code yet, but it seems quite likely. Krish, if you can verify that going with a larger heap gets you in the game, can you please file a JIRA?

Thanks
Joe

Re: nifi is running out of memory

Joe Witt
Looking at this line [1] makes me think the FetchS3 processor is properly streaming the bytes directly to the content repository.

Looking at the screenshot showing nothing coming out of the ListS3 processor makes me think the bucket has so many objects in it that the processor or the associated library isn't handling it well and is just listing everything with no mechanism for a maximum buffer size. Krish, please try with the largest heap you can and let us know what you see.

[1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
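
For reference, the streaming pattern at [1] looks roughly like this sketch (simplified, with assumed variable names; not the exact NiFi source):

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.GetObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;

    // The S3 object's InputStream is handed straight to the session, which
    // streams it into NiFi's content repository, so heap usage stays bounded
    // no matter how large the object is.
    FlowFile fetch(ProcessSession session, AmazonS3 client,
                   FlowFile flowFile, String bucket, String key) throws java.io.IOException {
        try (S3Object s3Object = client.getObject(new GetObjectRequest(bucket, key))) {
            return session.importFrom(s3Object.getObjectContent(), flowFile);
        }
    }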

Re: nifi is running out of memory

Adam Lamar
Hey All,

I believe OP is running into a bug fixed here:
https://issues.apache.org/jira/browse/NIFI-2631

Basically, ListS3 attempts to commit all the files it finds (potentially 100k+) at once, rather than in batches. NIFI-2631 addresses the issue. It looks like the fix is out in 0.7.1 but not yet in a 1.x release.
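
A rough sketch of the batching idea (names and the batch size are illustrative assumptions, not the exact NIFI-2631 patch):

    import java.util.List;
    import com.amazonaws.services.s3.model.S3ObjectSummary;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.Relationship;

    static final Relationship REL_SUCCESS = new Relationship.Builder().name("success").build();
    static final int BATCH_SIZE = 1000; // assumed value

    // Commit the session every BATCH_SIZE listed objects so the processor
    // never holds the entire listing (potentially 100k+ flowfiles) in one
    // uncommitted session.
    void emitListing(ProcessSession session, List<S3ObjectSummary> summaries) {
        int uncommitted = 0;
        for (S3ObjectSummary summary : summaries) {
            FlowFile flowFile = session.create();
            flowFile = session.putAttribute(flowFile, "filename", summary.getKey());
            session.transfer(flowFile, REL_SUCCESS);
            if (++uncommitted >= BATCH_SIZE) {
                session.commit();
                uncommitted = 0;
            }
        }
        if (uncommitted > 0) {
            session.commit();
        }
    }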

Cheers,
Adam


Re: nifi is running out of memory

Gop Krr
Thanks, Adam. I will try 0.7.1 and update the community on the outcome. If it works, I can create a patch for 1.x.
Thanks
Rai

Re: nifi is running out of memory

Pierre Villard
Quick remark: the fix has also been merged in master and will be in release 1.1.0.

Pierre

Re: nifi is running out of memory

Gop Krr
Thanks Bryan, Joe, Adam, and Pierre. I got past this issue by switching to 0.7.1. Now it is able to list the files from the source bucket and create those files in the other bucket, but the writes are not happening and I am getting a permission issue (attached below for reference). Could this be the bucket settings, or does it have more to do with the access key? All the files created in the new bucket are 0 bytes.
Thanks
Rai

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=xxxxx] Failed to retrieve S3 Object for StandardFlowFileRecord[uuid=yyyyy,claim=,offset=0,name=xxxxx.gz,size=0]; routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxxx), S3 Extended Request ID: lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.aws.s3.FetchS3Object
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 0F34E71C0697B1D8)
        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595) ~[aws-java-sdk-s3-1.10.32.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116) ~[aws-java-sdk-s3-1.10.32.jar:na]
        at org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106) ~[nifi-aws-processors-0.7.1.jar:0.7.1]
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.1.jar:0.7.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]


Re: nifi is running out of memory

Gop Krr
This is what my NiFi flow looks like.

[Attachment: Screen Shot 2016-10-28 at 10.00.44 AM.png (209K)]

Re: nifi is running out of memory

James Wing
From the screenshot and the error message, I interpret the sequence of events to be something like this:

1.) ListS3 succeeds and generates flowfiles with attributes referencing S3 objects, but no content (0 bytes)
2.) FetchS3Object fails to pull the S3 object content with an Access Denied error, but the failed flowfiles are routed on to PutS3Object (35,179 files / 0 bytes in the "putconnector" queue)
3.) PutS3Object is succeeding, writing the 0 byte content from ListS3

I recommend a couple of things for FetchS3Object:

* Only allow the "success" relationship to continue to PutS3Object. Separate the "failure" relationship to either loop back to FetchS3Object, go to a LogAttribute processor, or take another handling path.
* It looks like the permissions aren't working; you might want to double-check the access keys or try fetching a sample file with the AWS CLI (see the example command below).
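
For example, a quick permissions check with the AWS CLI (the bucket and key below are placeholders; use an object from the source bucket and the same credentials NiFi is configured with):

    aws s3api get-object --bucket source-bucket --key path/to/sample.gz /tmp/sample.gz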

Thanks,

James


Re: nifi is running out of memory

Gop Krr
Thanks, James. I am looking into the permission issue and will update the thread. I will also make the changes per your recommendation.

Re: nifi is running out of memory

Gop Krr
James, the permission issue got resolved. I still don't see any writes.

[Attachment: Screen Shot 2016-10-28 at 11.34.22 AM.png (149K)]

Re: nifi is running out of memory

Joe Witt
Krish,

Did you ever get past this?

Thanks
Joe

On Fri, Oct 28, 2016 at 2:36 PM, Gop Krr <[hidden email]> wrote:

> James, permission issue got resolved. I still don't see any write.
>
> On Fri, Oct 28, 2016 at 10:34 AM, Gop Krr <[hidden email]> wrote:
>>
>> Thanks James.. I am looking into permission issue and update the thread. I
>> will also make the changes as you per your recommendation.
>>
>> On Fri, Oct 28, 2016 at 10:23 AM, James Wing <[hidden email]> wrote:
>>>
>>> From the screenshot and the error message, I interpret the sequence of
>>> events to be something like this:
>>>
>>> 1.) ListS3 succeeds and generates flowfiles with attributes referencing
>>> S3 objects, but no content (0 bytes)
>>> 2.) FetchS3Object fails to pull the S3 object content with an Access
>>> Denied error, but the failed flowfiles are routed on to PutS3Object (35,179
>>> files / 0 bytes in the "putconnector" queue)
>>> 3.) PutS3Object is succeeding, writing the 0 byte content from ListS3
>>>
>>> I recommend a couple thing for FetchS3Object:
>>>
>>> * Only allow the "success" relationship to continue to PutS3Object.
>>> Separate the "failure" relationship to either loop back to FetchS3Object or
>>> go to a LogAttibute processor, or other handling path.
>>> * It looks like the permissions aren't working, you might want to
>>> double-check the access keys or try a sample file with the AWS CLI.
>>>
>>> Thanks,
>>>
>>> James
>>>
>>>
>>> On Fri, Oct 28, 2016 at 10:01 AM, Gop Krr <[hidden email]> wrote:
>>>>
>>>> This is how my nifi flow looks like.
>>>>
>>>> On Fri, Oct 28, 2016 at 9:57 AM, Gop Krr <[hidden email]> wrote:
>>>>>
>>>>> Thanks Bryan, Joe, Adam and Pierre. I went past this issue by switching
>>>>> to 0.71.  Now it is able to list the files from buckets and create those
>>>>> files in the another bucket. But write is not happening and I am getting the
>>>>> permission issue ( I have attached below for the reference) Could this be
>>>>> the setting of the buckets or it has more to do with the access key. All the
>>>>> files which are creaetd in the new bucket are of 0 byte.
>>>>> Thanks
>>>>> Rai
>>>>>
>>>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
>>>>> o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=xxxxx] Failed to
>>>>> retrieve S3 Object for
>>>>> StandardFlowFileRecord[uuid=yyyyy,claim=,offset=0,name=xxxxx.gz,size=0];
>>>>> routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception:
>>>>> Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
>>>>> AccessDenied; Request ID: xxxxxxx), S3 Extended Request ID:
>>>>> lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=
>>>>>
>>>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
>>>>> o.a.nifi.processors.aws.s3.FetchS3Object
>>>>>
>>>>> com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied
>>>>> (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID:
>>>>> 0F34E71C0697B1D8)
>>>>>
>>>>>         at
>>>>> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219)
>>>>> ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>
>>>>>         at
>>>>> com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803)
>>>>> ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>
>>>>>         at
>>>>> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505)
>>>>> ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>
>>>>>         at
>>>>> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317)
>>>>> ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>
>>>>>         at
>>>>> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595)
>>>>> ~[aws-java-sdk-s3-1.10.32.jar:na]
>>>>>
>>>>>         at
>>>>> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116)
>>>>> ~[aws-java-sdk-s3-1.10.32.jar:na]
>>>>>
>>>>>         at
>>>>> org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106)
>>>>> ~[nifi-aws-processors-0.7.1.jar:0.7.1]
>>>>>
>>>>>         at
>>>>> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>>>> [nifi-api-0.7.1.jar:0.7.1]
>>>>>
>>>>>         at
>>>>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054)
>>>>> [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>
>>>>>         at
>>>>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
>>>>> [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>
>>>>>         at
>>>>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>>>>> [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>
>>>>>         at
>>>>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127)
>>>>> [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>
>>>>>         at
>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>> [na:1.8.0_101]
>>>>>
>>>>>         at
>>>>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>>>> [na:1.8.0_101]
>>>>>
>>>>>         at
>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>>>> [na:1.8.0_101]
>>>>>
>>>>>         at
>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>>>> [na:1.8.0_101]
>>>>>
>>>>>         at
>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>> [na:1.8.0_101]
>>>>>
>>>>>         at
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>> [na:1.8.0_101]
>>>>>
>>>>>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
>>>>>
>>>>>
>>>>> On Fri, Oct 28, 2016 at 6:31 AM, Pierre Villard
>>>>> <[hidden email]> wrote:
>>>>>>
>>>>>> Quick remark: the fix has also been merged in master and will be in
>>>>>> release 1.1.0.
>>>>>>
>>>>>> Pierre
>>>>>>
>>>>>> 2016-10-28 15:22 GMT+02:00 Gop Krr <[hidden email]>:
>>>>>>>
>>>>>>> Thanks Adam. I will try 0.7.1 and update the community on the
>>>>>>> outcome. If it works then I can create a patch for 1.x
>>>>>>> Thanks
>>>>>>> Rai
>>>>>>>
>>>>>>> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar <[hidden email]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hey All,
>>>>>>>>
>>>>>>>> I believe OP is running into a bug fixed here:
>>>>>>>> https://issues.apache.org/jira/browse/NIFI-2631
>>>>>>>>
>>>>>>>> Basically, ListS3 attempts to commit all the files it finds
>>>>>>>> (potentially 100k+) at once, rather than in batches. NIFI-2631
>>>>>>>> addresses the issue. Looks like the fix is out in 0.7.1 but not yet
>>>>>>>> in
>>>>>>>> a 1.x release.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Adam
>>>>>>>>
>>>>>>>>
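For reference, the NIFI-2631 change Adam describes amounts to committing the session in batches rather than once at the end of the full listing. A rough sketch of the idea, assuming the usual processor context; the method, variable, and attribute names here are illustrative and not the actual patch:

    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.ListObjectsRequest;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    // Called from onTrigger(); REL_SUCCESS is the processor's success
    // relationship, and client/request come from the processor configuration.
    private void listInBatches(AmazonS3 client, ListObjectsRequest request,
                               ProcessSession session) {
        ObjectListing page = client.listObjects(request);
        while (true) {
            for (S3ObjectSummary summary : page.getObjectSummaries()) {
                FlowFile flowFile = session.create(); // zero-byte flowfile, attributes only
                flowFile = session.putAttribute(flowFile, "filename", summary.getKey());
                session.transfer(flowFile, REL_SUCCESS);
            }
            // Commit after each listing page so the heap holds one page of
            // flowfiles at a time instead of the entire bucket listing.
            session.commit();
            if (!page.isTruncated()) {
                break;
            }
            page = client.listNextBatchOfObjects(page);
        }
    }

Without the per-page commit, every key in the bucket becomes an uncommitted flowfile in memory at once, which matches the heap-space error reported at the start of this thread.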
Reply | Threaded
Open this post in threaded view
|

Re: nifi is running out of memory

Gop Krr
Thanks Joe for checking. Yes, I got past it and was able to successfully demo it to the team :) Now the next challenge is to get high-throughput performance out of NiFi.

On Mon, Oct 31, 2016 at 7:08 PM, Joe Witt <[hidden email]> wrote:
Krish,

Did you ever get past this?

Thanks
Joe

On Fri, Oct 28, 2016 at 2:36 PM, Gop Krr <[hidden email]> wrote:
> James, the permission issue got resolved. I still don't see any writes.
>
> On Fri, Oct 28, 2016 at 10:34 AM, Gop Krr <[hidden email]> wrote:
>>
>> Thanks James. I am looking into the permission issue and will update the
>> thread. I will also make the changes per your recommendation.
>>
>> On Fri, Oct 28, 2016 at 10:23 AM, James Wing <[hidden email]> wrote:
>>>
>>> From the screenshot and the error message, I interpret the sequence of
>>> events to be something like this:
>>>
>>> 1.) ListS3 succeeds and generates flowfiles with attributes referencing
>>> S3 objects, but no content (0 bytes)
>>> 2.) FetchS3Object fails to pull the S3 object content with an Access
>>> Denied error, but the failed flowfiles are routed on to PutS3Object (35,179
>>> files / 0 bytes in the "putconnector" queue)
>>> 3.) PutS3Object is succeeding, writing the 0 byte content from ListS3
>>>
>>> I recommend a couple of things for FetchS3Object:
>>>
>>> * Only allow the "success" relationship to continue on to PutS3Object.
>>> Route the "failure" relationship separately, either looping back to
>>> FetchS3Object or going to a LogAttribute processor or another handling path.
>>> * It looks like the permissions aren't working; you might want to
>>> double-check the access keys or try fetching a sample file with the AWS
>>> CLI (see the sketch below).
>>>
>>> Thanks,
>>>
>>> James
>>>
>>>
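To make James's permission check concrete, here is a minimal standalone sketch, assuming placeholder credential, bucket, and key values, that exercises the same SDK call the stack trace below shows FetchS3Object making (AmazonS3Client.getObject). A 403 here reproduces the problem with NiFi out of the picture:

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.S3Object;

    public class S3AccessCheck {
        public static void main(String[] args) throws java.io.IOException {
            // Placeholder values -- substitute the keys and object the
            // FetchS3Object processor is configured with.
            AmazonS3Client s3 = new AmazonS3Client(
                    new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));
            // Same call FetchS3Object makes; an AmazonS3Exception with status
            // 403 here means the keys lack s3:GetObject on the source bucket,
            // independent of anything NiFi does.
            S3Object object = s3.getObject("source-bucket", "path/to/sample.gz");
            System.out.println("Readable, content length: "
                    + object.getObjectMetadata().getContentLength());
            object.close();
        }
    }

The rough AWS CLI equivalent would be downloading the same object with aws s3 cp using the same access keys.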
>>> On Fri, Oct 28, 2016 at 10:01 AM, Gop Krr <[hidden email]> wrote:
>>>>
>>>> This is what my NiFi flow looks like.
>>>>
>>>> On Fri, Oct 28, 2016 at 9:57 AM, Gop Krr <[hidden email]> wrote:
>>>>>
>>>>> Thanks Bryan, Joe, Adam and Pierre. I got past this issue by switching
>>>>> to 0.7.1. Now it is able to list the files from the source bucket and
>>>>> create those files in the other bucket. But the writes are not happening
>>>>> and I am getting a permission error (attached below for reference). Could
>>>>> this be a setting on the buckets, or does it have more to do with the
>>>>> access keys? All the files created in the new bucket are 0 bytes.
>>>>> Thanks
>>>>> Rai
>>>>>
>>>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=xxxxx] Failed to retrieve S3 Object for StandardFlowFileRecord[uuid=yyyyy,claim=,offset=0,name=xxxxx.gz,size=0]; routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxxx), S3 Extended Request ID: lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=
>>>>>
>>>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.aws.s3.FetchS3Object
>>>>> com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 0F34E71C0697B1D8)
>>>>>         at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219) ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>         at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803) ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>         at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505) ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>         at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317) ~[aws-java-sdk-core-1.10.32.jar:na]
>>>>>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595) ~[aws-java-sdk-s3-1.10.32.jar:na]
>>>>>         at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116) ~[aws-java-sdk-s3-1.10.32.jar:na]
>>>>>         at org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106) ~[nifi-aws-processors-0.7.1.jar:0.7.1]
>>>>>         at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.1.jar:0.7.1]
>>>>>         at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>         at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>         at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>         at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.1.jar:0.7.1]
>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
>>>>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
>>>>>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]