create subtracted dataframe using multiple values from the bottom rows

I have an example df1 below. I need to create a new df_sub by subtracting values at the bottom of each column from the rest of the column.
df_sub = (every value in each column) - (a_squared * Temp + b * Temp + c)
Example df1.head():
Temperature A1 A2 A3 A4
25.0 681.51 147.40 409.26 680.83
25.2 615.89 124.34 362.39 618.37
25.4 568.72 95.22 310.37 567.22
25.6 522.08 89.74 272.69 516.53
25.8 480.04 68.20 229.03 477.30
Example df1.tail():
Temperature A1 A2 A3 A4
95.0 -102.14 6348.77 2276.56 -2545.60
a 15.26 10.67 -1.87 13.25
b -1016.94 -623.29 29.40 -902.77
c 16557.63 9044.62 715.07 14941.87
a_squared 232.95 113.95 3.53 175.65
This is what I had tried, and the error I get...
df_sub = df1.iloc[:-4] - (Temp * df1.iloc[-1, :] + (Temp * df1.iloc[-3, :]) + df1.iloc[-2, :])
Temp is a list like: np.arange(25, 95.2, 0.2)
ValueError: operands could not be broadcast together with shapes (96,) (351,)
Any help would be appreciated!
@heltonbiker They are measurements from a temperature heating experiment...there are more columns and rows in reality.
– schnick
Jun 20 at 19:26
So the first item of your problem seems weird. If you want to calculate np.polyfit(x, y, 2), you need x values and y values. But you mention "rows 0, 1, 2 for each column", which would give three sequences, not two. So where would x come from, and where would y come from?
– heltonbiker
Jun 20 at 19:38
Any thoughts now @heltonbiker?
– schnick
Jul 3 at 5:46
Sorry, I still find this confusing. Specifically, the idea of applying a different regression for each temperature value doesn't seem to make sense. Supposing you get this working, how do you expect to use the resulting formulas? To extrapolate temperature values? Or to extrapolate sensor readings given a temperature value? I think it would be much easier to analyze your difficulties with a better understanding of that.
– heltonbiker
Jul 3 at 13:50
1 Answer
If Temp is a scalar, the first step is to turn the Temperature column into the index with set_index:
df1 = df1.set_index('Temperature')
print (df1)
A1 A2 A3 A4
Temperature
25.0 681.51 147.40 409.26 680.83
25.2 615.89 124.34 362.39 618.37
25.4 568.72 95.22 310.37 567.22
25.6 522.08 89.74 272.69 516.53
25.8 480.04 68.20 229.03 477.30
95.0 -102.14 6348.77 2276.56 -2545.60
a 15.26 10.67 -1.87 13.25
b -1016.94 -623.29 29.40 -902.77
c 16557.63 9044.62 715.07 14941.87
a_squared 232.95 113.95 3.53 175.65
Then multiply the coefficient rows by the index values, converted to a NumPy column array so that broadcasting applies:
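idx = df1.index
# if necessary, convert the index to numeric
# idx = pd.to_numeric(df1.index, errors='coerce')
# a_squared row (last) times the temperatures as a column vector
a = df1.iloc[-1].values * idx[:-4].values[:, None]
# b row (third from last) times the temperatures as a column vector
b = df1.iloc[-3].values * idx[:-4].values[:, None]
# subtract a_squared * Temp + b * Temp + c from the measurement rows
df_sub = df1.iloc[:-4] - (a + b + df1.iloc[-2].values)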
print (df_sub)
A1 A2 A3 A4
Temperature
25.0 3723.630 3836.280 -1129.060 3916.960
25.2 3814.808 3915.088 -1182.516 3999.924
25.4 3924.436 3987.836 -1241.122 4094.198
25.6 4034.594 4084.224 -1285.388 4188.932
25.8 4149.352 4164.552 -1335.634 4295.126
95.0 57819.280 45691.450 -1566.860 51588.930
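For reference, here is a minimal self-contained sketch of the same broadcasting idea, selecting the coefficient rows by label instead of by position. The toy frame (three temperatures, two columns) and the names coef_labels, coefs, data, temp and correction are only illustrative and not part of the original data. The ValueError in the question presumably arises because a 1-D Temp of length 351 is multiplied element by element with a 1-D row holding one value per column; reshaping the temperatures to a column of shape (n_rows, 1) lets NumPy broadcast them against each coefficient row instead.

```python
import numpy as np
import pandas as pd

# Toy frame in the same layout as df1: measurement rows on top,
# coefficient rows (a, b, c, a_squared) at the bottom.
df1 = pd.DataFrame(
    {
        "Temperature": [25.0, 25.2, 25.4, "a", "b", "c", "a_squared"],
        "A1": [681.51, 615.89, 568.72, 15.26, -1016.94, 16557.63, 232.95],
        "A2": [147.40, 124.34, 95.22, 10.67, -623.29, 9044.62, 113.95],
    }
).set_index("Temperature")

# Split coefficients from measurements by label (illustrative names).
coef_labels = ["a", "b", "c", "a_squared"]
coefs = df1.loc[coef_labels]
data = df1.drop(index=coef_labels)
data.index = data.index.astype(float)

# Column vector of temperatures, shape (n_rows, 1).
temp = data.index.to_numpy()[:, None]

# (n_rows, 1) * (n_cols,) broadcasts to (n_rows, n_cols).
correction = (
    coefs.loc["a_squared"].to_numpy() * temp
    + coefs.loc["b"].to_numpy() * temp
    + coefs.loc["c"].to_numpy()
)

df_sub = data - correction
print(df_sub)
```

Selecting by label avoids depending on the exact order of the last four rows, at the cost of assuming the coefficient rows are always named a, b, c and a_squared.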
Temp is the Temperature column (index) not a scalar, I just wanted to show it goes from 25-95 in 0.2 steps. So there are 351 rows.
– schnick
yesterday
@Schnick - Please check solution now.
– jezrael
yesterday
Just drop the '-4' from both instances of idx[:-4].values[:, None]. Thanks!
– schnick
yesterday
What do the A1 to A4 columns mean? Are these measurements? Is this a physical phenomenon you are modeling?
– heltonbiker
Jun 20 at 19:13