How to extract a substring in Hive

I am having trouble extracting a substring in Hive. The table I am working on has a column called referee_dict, which holds each rank and the corresponding players' IDs. For example, a record could look like this:
[('Bronze1', [2738653, 2738652, 2738655]), ('Bronze2', [2738653, 2738652]), ('Bronze3', ), ('Silver1', ), ('Silver2', ), ('Silver3', )
I am trying to find the players who have achieved Bronze2, so I want to extract [2738653, 2738652] from the list. I know this is pretty easy in Python, but I looked through Hive's documentation and still do not know how to do it in SQL/Hive. Any help would be appreciated!
Is this an array column?
– saravanatn
Jul 3 at 6:49
1 Answer
Well, I think I figured out a way, though I don't know if it is the easiest one. Since the column is a string, I am going to use a regex to capture the substring after "'Bronze2', [" and before the next "]". The function I am going to use is
regexp_extract(string subject, string pattern, int index). Hope this helps anyone with a similar question.
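As a minimal sketch of that idea (not tested against a real cluster): the query below assumes the column is a plain string and that the data lives in a table called player_ranks, a name made up here for illustration. The pattern is written in a double-quoted literal so the single quotes around Bronze2 need no escaping, and each regex backslash is doubled because Hive string literals treat a single backslash as an escape character.

```sql
-- player_ranks is a hypothetical table name; referee_dict is the string column from the question.
SELECT
  -- Group 1 captures the bracketed list that directly follows 'Bronze2', in the record.
  regexp_extract(referee_dict, "'Bronze2', (\\[[^\\]]*\\])", 1) AS bronze2_players
FROM player_ranks;
```

For the sample record above, this should give [2738653, 2738652]. If the IDs are needed without the brackets, moving the parentheses inside the brackets, as in \\[([^\\]]*)\\], captures only 2738653, 2738652.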
You can use regexp_extract to pull a matching pattern out of your data. In this case, search for Bronze2 to extract its tuple.
– Piyush P
Jul 3 at 3:44