Create new column depending on values from other column





I have a DataFrame that looks something like this:


import numpy as np
import pandas as pd

df=pd.DataFrame([['vt 40462',5,6],[5,6,6],[5,5,8],[4,3,1],['vl 6450',5,6],[5,6,7],
[1,2,3],['vt 40462',5,6],[5,5,8],['vl 658',6,7],[5,5,8],[4,3,1],['vt 40461',5,6],[5,5,8],
[7,8,5]],columns=['A','B','C'])



df


           A  B  C
0   vt 40462  5  6
1          5  6  6
2          5  5  8
3          4  3  1
4    vl 6450  5  6
5          5  6  7
6          1  2  3
7   vt 40462  5  6
8          5  5  8
9     vl 658  6  7
10         5  5  8
11         4  3  1
12  vt 40461  5  6
13         5  5  8
14         7  8  5



I want to label every row with the most recent vt or vl value that appears at or above it in column A, and store that label in a new column D:


           A  B  C         D
0   vt 40462  5  6  vt 40462
1          5  6  6  vt 40462
2          5  5  8  vt 40462
3          4  3  1  vt 40462
4    vl 6450  5  6   vl 6450
5          5  6  7   vl 6450
6          1  2  3   vl 6450
7   vt 40462  5  6  vt 40462
8          5  5  8  vt 40462
9     vl 658  6  7    vl 658
10         5  5  8    vl 658
11         4  3  1    vl 658
12  vt 40461  5  6  vt 40461
13         5  5  8  vt 40461
14         7  8  5  vt 40461




2 Answers



Another way would be to assign to column D only the values of A that start with a letter, leaving NaN everywhere else, and then use df.ffill() to replace each NaN with the last label above it:




df.assign(D=df.loc[df.A.str.contains('^[A-Za-z]', na=False), 'A']).ffill()


           A  B  C         D
0   vt 40462  5  6  vt 40462
1          5  6  6  vt 40462
2          5  5  8  vt 40462
3          4  3  1  vt 40462
4    vl 6450  5  6   vl 6450
5          5  6  7   vl 6450
6          1  2  3   vl 6450
7   vt 40462  5  6  vt 40462
8          5  5  8  vt 40462
9     vl 658  6  7    vl 658
10         5  5  8    vl 658
11         4  3  1    vl 658
12  vt 40461  5  6  vt 40461
13         5  5  8  vt 40461
14         7  8  5  vt 40461



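Or, more or less equivalently, but in two steps (df.ffill() returns a new DataFrame rather than modifying df, so assign the result if you want to keep it):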


df.loc[df.A.astype(str).str.contains('^[A-Za-z]'), 'D'] = df.A

df.ffill()
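
Calling ffill() on the whole DataFrame also fills any NaNs that happen to exist in B or C. If that matters, a minimal sketch of the same idea restricted to column D might look like the following. The shortened sample data and the name is_label are illustrative assumptions, not part of the question.

```python
import pandas as pd

# Shortened, illustrative version of the question's data.
df = pd.DataFrame(
    [['vt 40462', 5, 6], [5, 6, 6], [5, 5, 8], ['vl 6450', 5, 6], [1, 2, 3]],
    columns=['A', 'B', 'C'],
)

# True where A holds a string starting with a letter;
# non-string entries come back as NaN and are mapped to False by na=False.
is_label = df['A'].str.contains(r'^[A-Za-z]', na=False)

# Keep A only on label rows (NaN elsewhere), then carry each label forward.
# Only the new column is forward-filled; A, B and C are left untouched.
df['D'] = df['A'].where(is_label).ffill()

print(df)
```

Because the fill is applied to a single Series, the result is stored in df directly and no other column can change as a side effect.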





thanks it's working
– sym
Jul 3 at 6:49



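Use str.split: it turns each string in A into a list of its parts and returns NaN for the entries that are not strings (the plain numbers). Then use ffill to fill those NaNs with the previous list, join the parts back together with ' '.join, and assign the result to 'D':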




#Thanks @user3483203 for the upgrade in syntax
df['D'] = df['A'].str.split().ffill().apply(' '.join)
print(df)



Output:


           A  B  C         D
0   vt 40462  5  6  vt 40462
1          5  6  6  vt 40462
2          5  5  8  vt 40462
3          4  3  1  vt 40462
4    vl 6450  5  6   vl 6450
5          5  6  7   vl 6450
6          1  2  3   vl 6450
7   vt 40462  5  6  vt 40462
8          5  5  8  vt 40462
9     vl 658  6  7    vl 658
10         5  5  8    vl 658
11         4  3  1    vl 658
12  vt 40461  5  6  vt 40461
13         5  5  8  vt 40461
14         7  8  5  vt 40461
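
To see why the chain works, it can help to look at each intermediate step on its own. The sketch below uses a shortened, illustrative Series; the names parts, filled and labels are assumptions made for the example. It relies on the numbers in A being real integers: if they were stored as strings such as '5', str.split would return ['5'] instead of NaN and nothing would be forward-filled.

```python
import pandas as pd

# Shortened, illustrative version of column A: strings mixed with integers.
s = pd.Series(['vt 40462', 5, 4, 'vl 6450', 1], name='A')

# Strings become lists of their whitespace-separated parts;
# the integers are not strings, so the .str accessor yields NaN for them.
parts = s.str.split()

# Each NaN takes the most recent list above it.
filled = parts.ffill()

# Join every list back into a single string.
labels = filled.apply(' '.join)

print(parts)
print(labels)
```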





I think you can simplify to df.A.str.split().ffill().apply(' '.join) (Although using a lambda for join may be clearer as to what it's doing)
– user3483203
Jul 2 at 20:37








