How is as.numeric used here?





I've been trying to figure out how to mimic a piecewise linear regression model developed in the pricing software Emblem, using R. I did that using @Roland's answer in the below post.



https://stats.stackexchange.com/questions/61805/standard-error-of-slopes-in-piecewise-linear-regression-with-known-breakpoints



So, following @Roland's answer, I used as.numeric((variable < X)) in the predictor terms to get the slope of the second segment.



What is going on here? Why does the "as.numeric" give me the correct answer? I can't find documentation on it and I would like to understand why this works.




1 Answer



It converts a boolean (TRUE / FALSE) value to numeric (1 / 0).





(The R-y name for boolean is "logical": is.logical(TRUE) returns TRUE.)





x < 10              # TRUE if x is less than 10, FALSE if x is 10 or more
as.numeric(x < 10)  # 1 if x is less than 10, 0 if x is 10 or more
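To see the conversion on actual values, here is a minimal sketch. The vector x and the name is_below are made up for illustration and are not from the question.

```r
# Hypothetical example values (not from the original question)
x <- c(5, 9.99, 10, 12)

# The comparison yields a logical vector
is_below <- x < 10
print(is_below)
print(is.logical(is_below))

# as.numeric maps TRUE to 1 and FALSE to 0
print(as.numeric(is_below))

# Arithmetic on logicals coerces them the same way,
# so sum() counts how many values are below 10
print(sum(is_below))
```

The resulting 0/1 vector is an indicator (dummy) variable, which is what lets a regression term switch on for one segment and off for the other.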



This being said, you don't really need an as.numeric there. What you could do instead is:




# will also work:
mod2 <- lm(y~I((x<9.6)*x)+(x<9.6)+I((x>=9.6)*x)+(x>=9.6)-1)



This version uses the logical values directly. In a formula they are treated like two-level factors, and lm typically expands a factor with k levels into k-1 dichotomous (dummy) variables. That is why, if you use the code above, you'll see variable names like x < 9.6TRUE in the lm output.
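A rough way to check this yourself is to fit both versions on simulated data and compare them. The data below, and the exact form of the as.numeric model (mod1), are assumptions made for illustration; only the mod2 formula comes from this answer.

```r
# Simulated piecewise-linear data with a break at 9.6 (assumed for illustration)
set.seed(1)
x <- seq(1, 20, length.out = 60)
y <- ifelse(x < 9.6, 2 + 0.5 * x, 10 - 0.3 * x) + rnorm(length(x), sd = 0.2)

# Assumed shape of the as.numeric version: explicit 0/1 indicator columns
mod1 <- lm(y ~ I((x < 9.6) * x) + as.numeric(x < 9.6) +
             I((x >= 9.6) * x) + as.numeric(x >= 9.6) - 1)

# Version from this answer: logicals used directly in the formula
mod2 <- lm(y ~ I((x < 9.6) * x) + (x < 9.6) +
             I((x >= 9.6) * x) + (x >= 9.6) - 1)

# Coefficient names built from the logical terms, e.g. "x < 9.6TRUE";
# a redundant indicator may be reported with an NA coefficient
print(names(coef(mod2)))

# Both parameterisations describe the same two line segments,
# so the fitted values should agree up to numerical tolerance
print(all.equal(fitted(mod1), fitted(mod2)))
```

The coefficients are labelled and arranged differently between the two fits, so compare fitted values rather than the coefficient tables line by line.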





Then again, technically, as.numeric is a hack, and a more transparent way to do it may be something like ifelse(x<9.6,1,0). But hacks are not necessarily bad, so you might also prefer a hackier hack such as (x<9.6)*1. That won't work as-is within a formula, though, because * has a special meaning in formulas, so you'd have to wrap it in I(): I((x<9.6)*1). I'd say as.numeric looks cleaner.
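Outside of a formula, the three spellings give the same 0/1 vector. A small sketch, with made-up values for x and hypothetical variable names:

```r
# Hypothetical example values (not from the original question)
x <- c(2, 9.5, 9.6, 15)

via_as_numeric <- as.numeric(x < 9.6)    # explicit type conversion
via_ifelse     <- ifelse(x < 9.6, 1, 0)  # spelled-out mapping
via_multiply   <- (x < 9.6) * 1          # implicit coercion through arithmetic

print(via_as_numeric)
print(all.equal(via_as_numeric, via_ifelse))
print(all.equal(via_as_numeric, via_multiply))
```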







Makes sense now. Thanks.
– Jordan
Jul 2 at 19:41





