Provenance Query Improvements

Eric Secules
Hello everyone,

I am working on a tool that tells whether all processing has completed for a given input filename. Because the flow can change the "filename" attribute, filtering on it is not a reliable way to find every event related to an input file. My current solution recursively calls the API for each child flowfile of a flowfile. This is acceptable for small lineages, but it does not scale when a flowfile can be split into hundreds of children and have several thousand descendant flowfiles.
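
For illustration, the traversal described above can be sketched roughly as follows. This is a minimal sketch and not the actual tool: `fetch_children` is a hypothetical callable standing in for whatever provenance API request returns the child flowfile UUIDs of a given flowfile, and the UUIDs in the demo are made up. It uses an explicit queue instead of recursion so that deep lineages do not hit Python's recursion limit, and a `seen` set so that a flowfile reached through more than one parent (for example after a merge) is only looked up once.

```python
from collections import deque
from typing import Callable, Iterable, Set


def collect_descendants(
    root_uuid: str,
    fetch_children: Callable[[str], Iterable[str]],
) -> Set[str]:
    """Return the UUIDs of every flowfile descended from root_uuid.

    fetch_children is a hypothetical helper (an assumption of this sketch):
    given a flowfile UUID it returns the UUIDs of its direct children,
    e.g. by issuing one provenance API request per call.
    """
    seen = {root_uuid}
    queue = deque([root_uuid])
    while queue:
        current = queue.popleft()
        for child in fetch_children(current):
            if child not in seen:  # skip flowfiles already visited
                seen.add(child)
                queue.append(child)
    seen.discard(root_uuid)  # the root is not its own descendant
    return seen


# In-memory stand-in for the API: parent UUID -> child UUIDs (made-up data).
LINEAGE = {
    "input": ["split-1", "split-2"],
    "split-1": ["merged"],
    "split-2": ["merged"],
}

descendants = collect_descendants("input", lambda uuid: LINEAGE.get(uuid, []))
```

Even with de-duplication, this still costs one request per flowfile in the lineage, which is the scaling problem described above.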

There are a few things that could be done to the provenance API to make it friendlier for this use case:
  • The ability to query for a list of provenance events
  • Pagination for dealing with large responses
  • The ability to query for all descendants of a flowfile
  • The ability to filter by active flowfiles only
Thanks,
Eric