Split a field in a CSV record to an array of strings

momo adc
Hi, I have a CSV file whose records look like this:
field,another field,"aaa,bbb,ccc",yet another field

I want to ingest "aaa,bbb,ccc" as an array of strings, ["aaa","bbb","ccc"].
I tried ConvertRecord with a CSV reader and an Avro writer, but the CSV
reader doesn't seem to have a concept of arrays (I didn't see a case for
handling an array type in the code).

I also considered QueryRecord, but it seems it can only handle regexes and
explicitly defined fields. Since I don't know the size of the array in
advance (the record can also be field,aaa,another field or
field,,another field), I don't think I can do it with SQL either.

Thanks,
momo.
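
To make the desired result concrete, here is a minimal Python sketch of the transformation described above, outside of NiFi. The function name split_list_field and the choice to map an empty field to an empty list are assumptions for illustration; the original post does not say what an empty field should become.

```python
import csv
import io


def split_list_field(line, index, sep=","):
    """Parse one CSV line and turn the field at `index` into a list of strings.

    Hypothetical helper for illustration only; not part of NiFi.
    """
    # csv.reader handles the quoting, so "aaa,bbb,ccc" arrives as one field.
    row = next(csv.reader(io.StringIO(line)))
    value = row[index]
    # Assumption: an empty field becomes an empty list rather than [""].
    row[index] = value.split(sep) if value else []
    return row


print(split_list_field('field,another field,"aaa,bbb,ccc",yet another field', 2))
print(split_list_field("field,aaa,another field", 1))
print(split_list_field("field,,another field", 1))
```

The three calls cover the cases mentioned in the post: several values, a single value, and an empty field.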



--
Sent from: http://apache-nifi-users-list.2361937.n4.nabble.com/
Re: Split a field in a CSV record to an array of strings

Mark Payne
I would recommend taking a look at JoltTransformRecord. Jolt has some capabilities for splitting strings into arrays.
You can see [1] as an example.

Thanks
-Mark

[1] https://stackoverflow.com/questions/58726817/conversion-of-string-to-array-in-jolt-transformation
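
As a rough sketch of the kind of spec the suggestion above points to, the Python snippet below builds a Jolt chain spec that uses the split function of the modify-overwrite-beta operation and prints it as JSON, ready to paste into the processor's Jolt specification property. The field name "tags" is hypothetical (the thread does not name the column), and the spec has not been tested against NiFi here. How it behaves for an empty field is not verified, and the record writer's schema would presumably need to declare the field as an array of strings.

```python
import json

# Hypothetical field name; replace it with the actual column name from the CSV schema.
LIST_FIELD = "tags"

# A Jolt chain spec with a single modify-overwrite-beta operation.
# split(<separator>, <value>) turns a string into a list of strings;
# @(1,tags) reads the current value of the field from the enclosing record.
spec = [
    {
        "operation": "modify-overwrite-beta",
        "spec": {
            LIST_FIELD: f"=split(',', @(1,{LIST_FIELD}))",
        },
    }
]

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Building the spec in code rather than by hand mainly guards against JSON syntax slips; the printed text is what would go into the processor.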

In reply to momo adc's message of Jul 14, 2020, 11:38 AM (shown above).