Convert string into json string and parsing in R


Convert string into json string and parsing in R



I have a data with a column as json string:


reservation reasons
1592 [{"name"=>"jorge", "value"=>"MX"}, {"name"=>"Billing phone number", "value"=>"1123"}, {"name"=>"BillingCountry", "value"=>"USA"}]
1597 [{"name"=>"BillingAddress_Country", "value"=>"IN"}, {"name"=>"Billing phone number country code", "value"=>"IN"}, {"name"=>"Latest amount", "value"=>"583000000"}]



I want to parse the column as follows:


reservation name value
1592 jorge mx
1592 Billing phone number 1123
1592 BillingCountry USA
1597 BillingAddress_Country IN
1597 Billing phone number country code IN
1597 Latest amount 583000000



I am us
ing jsonlite in R. I am getting following error in my code:


data<-read.csv("data.csv")
json<-data$reasons
mydf <- fromJSON(json)
Error: Argument 'txt' must be a JSON string, URL or file.



Can anyone tell me where am I making mistakes? What modifications do I need to do? Many thanks in advance!





can you dput(head(dat))?
– Onyambu
Jul 2 at 20:57


dput(head(dat))





Are you sure this is JSON? It doesn't look that much like JSON. What's this =>? I think you need to reformat it to JSON first. I can show you how in an answer.
– Hack-R
Jul 2 at 21:05



=>





Sorry but I am not getting it. How dput(head(dat)) can solve the problem?
– user3642360
Jul 2 at 21:05





That is not a solution to the problem, that's how to reproducibly share your data. Please see the r tag description for more info on that.
– Hack-R
Jul 2 at 21:06


r





This doesn't appear to be valid JSON. Where did it come from? It looks like maybe a Ruby hash
– camille
Jul 2 at 21:50




2 Answers
2


dat%>%
mutate(reasons=str_split(gsub("[^=A-Za-z,0-9{} ]+","",reasons),"(?<=}),s*"))%>%
unnest()%>%
mutate(names=str_extract(reasons,"(?<=name=)[^,}]+"),
values=str_extract(reasons,"(?<=value=)[^,}]+"),
reasons=NULL)

reservation names values
1 1592 jorge MX
2 1592 Billing phone number 1123
3 1592 BillingCountry USA
4 1597 BillingAddressCountry IN
5 1597 Billing phone number country code IN
6 1597 Latest amount 583000000



with this code, if you need the email, just add email=str_extract.. etc etc


email=str_extract..





Thanks. Your code works for this subset of data. But I am getting error while trying to run the code on the whole data: Error: Duplicate identifiers for rows (333, 334) In addition: Warning messages: 1: Expected 2 pieces. Additional pieces discarded in 49 rows [121, 124, 129, 136, 141, 146, 155, 158, 163, 166, 169, 172, 177, 207, 215, 224, 236, 239, 251, 265, ...]. 2: Expected 2 pieces. Missing pieces filled with NA in 57 rows [206, 208, 213, 214, 216, 217, 225, 226, 227, 240, 252, 253, 254, 268, 270, 286, 287, 290, 292, 305, ...]. Can you please help me to identify the error?
– user3642360
Jul 2 at 22:30


NA





what do you have befor you spread?
– Onyambu
Jul 2 at 22:33


spread





data<-read.csv("sample_reason1.csv") data%>% mutate(reasons=strsplit(gsub("[^=A-Za-z,0-9 ]+","",reasons),"s*,s*"))%>% unnest()%>% separate(reasons,c("name","value"),sep="s*=s*")%>% mutate(id=rep(1:(n()/2),each=2))%>% spread(name,value) This is the code.
– user3642360
Jul 2 at 22:36





I have done the edit:
– Onyambu
Jul 2 at 22:41





Thanks. But I am not getting following error: Error in mutate_impl(.data, dots) : Column id must be length 516 (the number of rows) or one, not 56 In addition: Warning messages: 1: Expected 2 pieces. Additional pieces discarded in 49 rows [121, 124, 129, 136, 141, 146, 155, 158, 163, 166, 169, 172, 177, 207, 215, 224, 236, 239, 251, 265, ...]. 2: Expected 2 pieces. Missing pieces filled with NA in 57 rows [206, 208, 213, 214, 216, 217, 225, 226, 227, 240, 252, 253, 254, 268, 270, 286, 287, 290, 292, 305, ...].
– user3642360
Jul 2 at 22:45


id


NA



This doesn't look like normal JSON to me (or to fromJSON, which makes me feel a little better). Maybe it's some special case of it or something (?). Update: @camille identified it as Ruby Hash.


fromJSON



In any event, we can fix it:


reasons <- '{"name"=>"jorge", "value"=>"MX"}, {"name"=>"Billing phone number", "value"=>"1123"}, {"name"=>"BillingCountry", "value"=>"USA"}'

reasons <- gsub("=>", ":", reasons)
reasons <- gsub("[{}]", "", reasons)
reasons <- paste0("{",reasons,"}")

fromJSON(reasons)


$`name`
[1] "jorge"

$value
[1] "MX"

$name
[1] "Billing phone number"

$value
[1] "1123"

$name
[1] "BillingCountry"

$value
[1] "USA"





Thanks. I wrote the following code as you have mentioned: library(jsonlite) data<-read.csv("/Users/susmitaghosh/Downloads/sample_sift_reason.csv") dim(data) reasons<-data$sift_reasons reasons <- gsub("=>", ":", reasons) reasons <- gsub("[{}]", "", reasons) reasons <- paste0("{",reasons,"}") fromJSON(reasons). But I am still getting following error: Error: parse error: invalid object key (must be a string) {["name":"BillingAddress_Country (right here) ------^. Sorry for bad formatting. I don't know how to fix it in comment.
– user3642360
Jul 2 at 21:23






@user3642360 if it doesn't fit neatly in a comment, you can edit your original post and put additional code formatted in the question
– camille
Jul 2 at 21:45





@user3642360 I agree with camille's suggestion but also it sounds like my solution works for the data you provided, but not some other part of the data that we don't have. So we need the that data too. Further, since what you thought was JSON was JSON-like but not real JSON, we should look into where you got this data from. A better solution is to find out what format it's in and convert it directly from that instead of making into JSON.
– Hack-R
Jul 2 at 21:46






@Hack-R. Thank you so much. Your code is working. I did the modification like this: reasons<-as.character(data$reasons)
– user3642360
Jul 2 at 21:51





@user3642360 Perfect! If this works for you could you please click the green checkmark to mark it as the answer?
– Hack-R
Jul 2 at 21:52






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

PHP contact form sending but not receiving emails

Do graphics cards have individual ID by which single devices can be distinguished?

iOS Top Alignment constraint based on screen (superview) height