logstash: Import comma separated string from MySQL into elastic search as an array


logstash: Import comma separated string from MySQL into elastic search as an array



I'm trying to insert a comma separated string (GROUP_CONCAT) into elasticsearch as an array datatype. As input I use JDBC and the output of the SQL query is like:


GROUP_CONCAT


+---------+-----------+------------+--------------------------+-------------+---------------------+---------+------------+----------+---------------------+-------------+---------+----------------------------------------+
| network | post_dbid | host_dbid | host_netid | post_netid | published | n_likes | n_comments | language | indexed | n_harvested | country | vrt |
+---------+-----------+------------+--------------------------+-------------+---------------------+---------+------------+----------+---------------------+-------------+---------+----------------------------------------+
| xxx | 2_xxx | 60480_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2017-12-28 08:11:58 | 5 | 0 | en | 2018-05-30 00:00:00 | 0 | ID | Fitness,Well-being |
| xxx | 5_xxx | 98458_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2016-09-01 11:59:14 | 2275 | 242 | ar | 2018-05-30 00:00:00 | 0 | SA | SmartPhones_Gadgets |
| xxx | 15_xxx | 50884_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2018-04-23 16:36:10 | 0 | 0 | en | 2018-05-30 00:00:00 | 0 | EG | Fashion_Beauty |
| xxx | 21_xxx | 64118_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2015-07-01 22:50:54 | 295 | 8 | pt | 2018-05-30 00:00:00 | 0 | BR | Nutrition |
| xxx | 24_xxx | 9767_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2017-05-30 02:35:29 | 10 | 1 | en | 2018-06-18 15:32:57 | 0 | US | Health |
| xxx | 87_xxx | 44473_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2017-01-08 23:02:52 | 7 | 0 | en | 2018-05-30 00:00:00 | 0 | US | Beverages |
| xxx | 99_xxx | 120198_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2018-02-17 02:57:58 | 8 | 0 | en | 2018-05-30 00:00:00 | 0 | US | Food |
| xxx | 126_xxx | 50258_xxx | xxxxxxxxxxxxxxxxxxxxxxxx | xxxxxxxxxxx | 2018-03-22 09:16:25 | 1 | 0 | en | 2018-05-30 00:00:00 | 0 | IN | Health |
+---------+-----------+------------+--------------------------+-------------+---------------------+---------+------------+----------+---------------------+-------------+---------+----------------------------------------+



I used split from mutate plugin:


split


filter {
mutate {
split => { "vrt" => "," }
}
}



although, field was inserted as a comma separated string:


GET xxx/_search
{
"query": {
"terms": {
"_id": ["2_xxx"]
}
}
}



responce:


{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "xxx",
"_type": "doc",
"_id": "2_xxx",
"_score": 1,
"_source": {
"post_dbid": "2_xxx",
"host_dbid": "60480_xxx",
"host_netid": "xxxxxxxxxxxxxxxxxxxxxxxx",
"n_likes": 5,
"n_comments": 0,
"country": "ID",
"network": "xxx",
"indexed": "2018-05-30T00:00:00.000Z",
"n_harvested": 0,
"vrt": "Fitness,Well-being",
"@version": "1",
"post_netid": "xxxxxxxxxxx",
"@timestamp": "2018-06-27T15:47:24.370Z",
"language": "en",
"published": "2017-12-28T08:11:58.000Z"
}
}
]
}
}



My final goal is to insert vrt as array field and using kibana, to create visualizations. For example I want to create a counter on kibana and count how many document has the "Fitness" on vrt field.


vrt


vrt



ELK Version: 6.2.4




1 Answer
1



You could use ruby filter. Here's how I do it. I created a ruby method that splits comma separated string, trims, rejects empty elements and removes duplicates. You can then use the method for all comma separated string as below:


filter {
ruby{
code =>"

# method to split the supplied string by comma, trim whitespace and return an array
def mapStringToArray(strFieldValue)

#if string is not null, return array
if (strFieldValue != nil)
fieldArr = strFieldValue.split(',').map(&:strip).reject(&:empty?).uniq
return fieldArr
end

return #return empty array if string is nil
end

vrtArr = mapStringToArray(event.get('vrt'))
if vrtArr.length > 0
event.set('vrt', vrtArr)
end
"
}
}






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

PHP contact form sending but not receiving emails

PHP parse/syntax errors; and how to solve them?

iOS Top Alignment constraint based on screen (superview) height