Host disconnected due to different workflow configuration


Alessio Palma-2

Hello all,
I experienced "host out of the cluster which was no longer able to
join". The log reports that the workflow configuration has been changed
and is different from the one running in the cluster.
Because of this there is no way to rejoin the cluster.
To resolve it I stopped the whole cluster and copied the same
configuration to every host. After the restart everything worked well.
Is there a good way to prevent flow changes when not all the hosts in
the cluster are connected?
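
Before copying a configuration to every host, it can help to check which hosts actually differ. Here is a minimal sketch, assuming the flow is stored as a gzip-compressed file (NiFi 1.x keeps it in conf/flow.xml.gz by default) and that each host's copy has already been fetched to a local path. The function names and paths are made up for illustration. Comparing digests is likely stricter than the comparison the cluster itself performs, so a mismatch here is only a hint.

```python
import gzip
import hashlib
from collections import defaultdict


def flow_digest(path):
    """Return the SHA-256 hex digest of the decompressed flow file at `path`."""
    # Hash the decompressed bytes: two gzip files with identical content
    # can still differ in their gzip headers.
    with gzip.open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def group_by_digest(paths):
    """Map each digest to the list of flow files that share it."""
    groups = defaultdict(list)
    for path in paths:
        groups[flow_digest(path)].append(path)
    return dict(groups)
```

For example, `group_by_digest(["node1/flow.xml.gz", "node2/flow.xml.gz", "node3/flow.xml.gz"])` (hypothetical paths) returns a single key when all copies match, and more than one key when some host has a different flow.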

Re: Host disconnected due to different workflow configuration

Bryan Bende
Hello,

In general, NiFi does its best to prevent changes from being made to the flow when one of the cluster nodes is down. For example, if you have a 3-node cluster and only 2 nodes are up, you can't change the flow.
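
As a rough illustration of that rule (a toy model, not NiFi's actual code; `Node` and `can_change_flow` are made-up names), a flow change is only accepted when every node in the cluster is connected:

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node."""
    name: str
    connected: bool = True


def can_change_flow(nodes):
    """Allow a flow change only if every node in the cluster is connected."""
    return all(node.connected for node in nodes)


cluster = [Node("node1"), Node("node2"), Node("node3", connected=False)]
print(can_change_flow(cluster))  # False: only 2 of the 3 nodes are up
```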

When a request comes in to change the flow, let's say you drag a new processor onto the graph, it is sent to one of the nodes, which then does a two-phase commit with the rest of the nodes in the cluster.

The error message you got means that all the nodes responded successfully to the first phase, but during the second phase of the commit one of the nodes encountered an error.
At that point the change had already been applied to the other nodes, so the node with the error was purposely disconnected from the cluster because it was in an inconsistent state.
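
To make the sequence concrete, here is a simplified sketch of a two-phase commit in which a node fails during the second phase. It is not NiFi's real implementation: the `Node` class, `change_flow`, `FlowChangeRejected` and the `fail_on_commit` flag are all hypothetical names used only to model the behaviour described above.

```python
class FlowChangeRejected(Exception):
    """Raised when phase 1 fails, so no node is modified."""


class Node:
    """Hypothetical cluster node holding its own copy of the flow."""

    def __init__(self, name, fail_on_commit=False):
        self.name = name
        self.flow = []          # components currently in this node's flow
        self.connected = True
        self.fail_on_commit = fail_on_commit  # simulates a node-local error

    def verify(self, change):
        # Phase 1: the node only reports whether it can take part.
        return self.connected

    def commit(self, change):
        # Phase 2: the node actually applies the change.
        if self.fail_on_commit:
            raise RuntimeError(f"{self.name}: could not apply {change!r}")
        self.flow.append(change)


def change_flow(nodes, change):
    """Apply `change` with a two-phase commit; return names of nodes dropped."""
    # Phase 1: every node must agree, otherwise nothing changes anywhere.
    if not all(node.verify(change) for node in nodes):
        raise FlowChangeRejected("not all nodes are connected")
    # Phase 2: a node that fails here no longer matches the rest of the
    # cluster, so it is disconnected while the others keep the change.
    dropped = []
    for node in nodes:
        try:
            node.commit(change)
        except RuntimeError:
            node.connected = False
            dropped.append(node.name)
    return dropped


nodes = [Node("node1"), Node("node2"), Node("node3", fail_on_commit=True)]
print(change_flow(nodes, "new processor"))  # ['node3']
print([n.flow for n in nodes])  # [['new processor'], ['new processor'], []]
```

In this model, calling `change_flow` again raises `FlowChangeRejected`, because node3 is now disconnected and phase 1 can no longer succeed.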

If possible, can you check what other errors appear in that node's log before the "host out of cluster..." message? The real problem is that some other issue on that node caused it to fail.

-Bryan

In reply to Alessio Palma's message of Wed, Nov 9, 2016, 5:45 AM.


Re: Host disconnected due to different workflow configuration

Alessio Palma-2

OK, understood.
It's still a bit fragile.
Is there a way to force a node to adopt a configuration from the UI, using the mouse, without having to stop and start the server?



In reply to Bryan Bende's message of Thursday, November 10, 2016, 3:19:16 PM.