generate column with capped values in data.table




I have a data.table similar to the following one:


| ID | C1 | C2 | C3 |
| 1 | 1 | 20 | 400 |
| 2 | 0 | 15 | 500 |
| 3 | 0 | 2 | 350 |
| 4 | 1 | 4 | 402 |
| 5 | 1 | 8 | 333 |



And I want to generate a new column "CALC" applying a function like this one:


func <- function(a, b, c){
  pol <- a*b-0.01*c
  value <- min(max(pol, 0), 5)
}



I expected this to compute pol for each row and cap the result between 0 and 5. So, instead of


| ID | C1 | C2 | C3 | CALC |
| 1 | 1 | 20 | 400 | 16 |
| 2 | 0 | 15 | 500 | -5 |
| 3 | 0 | 2 | 350 | -3.5 |
| 4 | 1 | 4 | 402 | -0.02 |
| 5 | 1 | 8 | 333 | 4.67 |



it should return


| ID | C1 | C2 | C3 | CALC |
| 1 | 1 | 20 | 400 | 5 |
| 2 | 0 | 15 | 500 | 0 |
| 3 | 0 | 2 | 350 | 0 |
| 4 | 1 | 4 | 402 | 0 |
| 5 | 1 | 8 | 333 | 4.67 |



But instead, it gives the following dt:


| ID | C1 | C2 | C3 | CALC |
| 1 | 1 | 20 | 400 | 5 |
| 2 | 0 | 15 | 500 | 5 |
| 3 | 0 | 2 | 350 | 5 |
| 4 | 1 | 4 | 402 | 5 |
| 5 | 1 | 8 | 333 | 5 |



So, if I'm not wrong, it takes the max of the values over the whole column (capped at 5) and repeats it down the column. That's not what I intended, which was to get the capped value for each row on its own.
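The behaviour described above can be reproduced in base R, without data.table. The following is a minimal sketch using the five rows from the table; the variable names `inner`, `capped` and `recycled` are illustrative and not part of the original post.

```r
# Columns from the example table
C1 <- c(1, 0, 0, 1, 1)
C2 <- c(20, 15, 2, 4, 8)
C3 <- c(400, 500, 350, 402, 333)

pol <- C1 * C2 - 0.01 * C3    # one value per row

# max() and min() reduce all of their arguments to a single number
inner  <- max(pol, 0)         # largest of all pol values and 0
capped <- min(inner, 5)       # smaller of that single number and 5

# A length-one result is recycled to fill a five-row column
recycled <- rep(capped, length(pol))
print(recycled)
```

Because `capped` has length one, every row ends up with the same value, which matches the column of 5s shown above.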



Is there an "easy" way to fix it? The original dt is more complex, so it isn't practical to simply do dt[, CALC := func(C1, C2, C3), by="ID"]: it would need something like by=.(GROUP1, GROUP2, ID), and I'm pretty sure there has to be a simpler way. Maybe there's an easy way through the function itself?





Thanks a lot for your time.





maybe by=seq_len(dt[,.N])?
– chinsoon12
Jul 3 at 8:59








It works for the reduced model I'm working with, but I can see that it takes a lot of time for so few rows (100, versus something like 10 million in the real one). Is there any other possibility?
– sneaky_lobster
Jul 3 at 9:10





check out Onyambu's vectorized func
– chinsoon12
Jul 3 at 9:12







1 Answer



You need to use pmin and pmax, which work element-wise, instead of min and max, which reduce their arguments to a single value.




func <- function(a, b, c){
  pol <- a*b-0.01*c
  pmin(pmax(pol, 0), 5)
}


transform(dat,s=func(C1,C2,C3))
ID C1 C2 C3 s
1 1 1 20 400 5.00
2 2 0 15 500 0.00
3 3 0 2 350 0.00
4 4 1 4 402 0.00
5 5 1 8 333 4.67

setDT(dat)[,CALC:=func(C1,C2,C3)]
dat
ID C1 C2 C3 CALC
1: 1 1 20 400 5.00
2: 2 0 15 500 0.00
3: 3 0 2 350 0.00
4: 4 1 4 402 0.00
5: 5 1 8 333 4.67
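Since the question mentions extra grouping columns, here is a hedged sketch showing that the vectorized function needs no `by` at all, and that it agrees with the slower row-by-row grouping suggested in the comments. The `GROUP1` column and its values are hypothetical, added only for illustration, and the example assumes the data.table package is installed.

```r
library(data.table)

func <- function(a, b, c) {
  pol <- a * b - 0.01 * c
  pmin(pmax(pol, 0), 5)   # element-wise cap to the range [0, 5]
}

dt <- data.table(
  GROUP1 = c("x", "x", "y", "y", "y"),  # hypothetical grouping column
  ID = 1:5,
  C1 = c(1, 0, 0, 1, 1),
  C2 = c(20, 15, 2, 4, 8),
  C3 = c(400, 500, 350, 402, 333)
)

# Vectorized: one call over the whole column, no grouping needed
dt[, CALC := func(C1, C2, C3)]

# Row-by-row grouping, for comparison only (one call per row)
dt[, CALC_BY_ROW := func(C1, C2, C3), by = seq_len(nrow(dt))]

print(dt)
```

Both columns hold the same values. The first form calls `func` once for the whole table, while the second calls it once per row, which is the likely reason the per-row approach felt slow.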





Thanks so much! That's the answer I was looking for.
– sneaky_lobster
Jul 3 at 9:14





