Write Array and Variable to Dataframe - python-3.x

I have an array in the format [27.214 27.566] - there can be several numbers. Additionally I have a Datetime variable.
now=datetime.now()
datetime=now.strftime('%Y-%m-%d %H:%M:%S')
time.sleep(0.5)
agilent.write("MEAS:TEMP? (#101:102)")
values = np.fromstring(agilent.read(), dtype=float, sep=',')
The output from the array is [27.214 27.566]
Now I would like to write this into a dataframe with the following structure:
Datetime, FirstValueArray, SecondValueArray, ....
How can I do this? A new array is added to the dataframe every minute.

I will assume you want to append a row to an existing dataframe df with the appropriate columns: value1, value2, ..., lastvalue, datetime
We can easily convert the array to a series:
s = pd.Series(array)
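What you want to do next is append the datetime value to the series. Series.append only ever accepted other Series and was removed in pandas 2.0, so wrap the value in a Series and use pd.concat:
s = pd.concat([s, pd.Series([datetime])], ignore_index=True)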
Now you have a series whose length matches df.columns. You want to convert that series to a dataframe to be able to use pd.concat :
df_to_append = s.to_frame().T
We transpose the result because Series.to_frame() returns a dataframe with the series as a single column, and we want a single row with multiple columns.
Before you concatenate, however, you need to make sure the column names of both dataframes match, or the concatenation will create additional columns:
df_to_append.columns = df.columns
Now we can concatenate our two dataframes. pd.concat returns a new dataframe, so assign the result back:
df = pd.concat([df, df_to_append], ignore_index=True)
For further details, see the pandas.concat documentation.
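Put together, the steps above might look like the sketch below. It assumes pandas 2.x (so pd.concat is used in place of Series.append), a dataframe df that already holds one earlier reading, and hard-coded stand-ins for the instrument values and the timestamp. The column names value1, value2 and datetime are placeholders, not names from the question.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Existing dataframe with one earlier reading (made-up placeholder data).
df = pd.DataFrame(
    [[27.0, 27.5, "2021-01-01 11:59:00"]],
    columns=["value1", "value2", "datetime"],
)

# Stand-ins for the instrument reading and the timestamp from the question.
values = np.array([27.214, 27.566])
stamp = datetime(2021, 1, 1, 12, 0, 0).strftime("%Y-%m-%d %H:%M:%S")

# Array -> Series, then add the timestamp as the last element.
s = pd.Series(values)
s = pd.concat([s, pd.Series([stamp])], ignore_index=True)

# One-column frame -> one-row frame, with the same column names as df.
df_to_append = s.to_frame().T
df_to_append.columns = df.columns

# pd.concat returns a new dataframe, so the result is assigned back.
df = pd.concat([df, df_to_append], ignore_index=True)
print(df)
```

Because the row mixes floats and a string, the value columns end up with object dtype; converting them with pd.to_numeric afterwards is one way to restore a numeric dtype if that matters.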

Related

How to ungroup Column groups and convert them into rows using pandas?

I have the following table from downloading stock data for multiple stocks. I used the following code
i = ['NTPC.NS', 'GAIL.NS']
stock = yf.download(tickers=i, start='2021-01-11', end='2021-03-10', interval = '5m', group_by = 'tickers')
The output dataframe looks like this
But I want the output to be like this
Use DataFrame.stack by first level, then rename index names and convert last level of MultiIndex to column by DataFrame.reset_index:
df = stock.stack(level=0).rename_axis(['Datetime','stockname']).reset_index(level=-1)
# if necessary, move the stockname column to the end
df = df[df.columns.tolist()[1:] + df.columns.tolist()[:1]]
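The same reshaping can be tried without downloading anything. The sketch below builds a small made-up frame with the ticker as the first column level, which is the layout the answer assumes yf.download produces with group_by='tickers'; the prices are invented. On pandas 2.1 and later, stack may emit a FutureWarning about its new implementation, which does not affect this example.

```python
import pandas as pd

# Made-up frame: first column level is the ticker, second is the price field.
cols = pd.MultiIndex.from_product([["GAIL.NS", "NTPC.NS"], ["Open", "Close"]])
idx = pd.to_datetime(["2021-01-11 09:15", "2021-01-11 09:20"])
stock = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    index=idx,
    columns=cols,
)

# Move the ticker level from the columns into the index, name both index
# levels, then turn the ticker level into an ordinary column.
df = (
    stock.stack(level=0)
    .rename_axis(["Datetime", "stockname"])
    .reset_index(level=-1)
)
print(df)
```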

Get n rows based on column filter in a Dataframe pandas

I have a dataframe df as below.
I want the final dataframe to look as follows, i.e. for each unique Name only the last 2 rows must be present in the final output.
I tried the following snippet but it's not working.
df = df[df['Name']].tail(2)
Use GroupBy.tail:
df1 = df.groupby('Name').tail(2)
Just one more way to solve this using GroupBy.nth:
df1 = df.groupby('Name').nth([-1, -2])  # picks the last 2 rows of each group
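A small illustration of GroupBy.tail, using made-up data because the original table is not shown. The Value column is a hypothetical placeholder.

```python
import pandas as pd

# Made-up data: three rows for "a", two for "b", one for "c".
df = pd.DataFrame(
    {
        "Name": ["a", "a", "a", "b", "b", "c"],
        "Value": [1, 2, 3, 4, 5, 6],
    }
)

# Keep at most the last 2 rows of every Name; the original row order
# and the original index labels are preserved.
df1 = df.groupby("Name").tail(2)
print(df1)
```

Groups with fewer than 2 rows (here "c") are kept as they are.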

Python: DataFrame Index shifting

I have several dataframes that I have concatenated with pandas in the line:
xspc = pd.concat([df1,df2,df3], axis = 1, join_axes = [df3.index])
In df2 the index values read one day later than those of df1 and df3. For instance, when the most current date is 7/1/19, the index values for df1 and df3 read '7/1/19' while df2 reads '7/2/19'. I would like to concatenate the dataframes so that each is joined on its most recent date; in other words, the values at df1 index '7/1/19' should be concatenated with those at df2 index '7/2/19' and df3 index '7/1/19'. What methods can I use to shift the data around to join on these non-matching index values?
You can reset the index of the data frame and then concat the dataframes
df1=df1.reset_index()
df2=df2.reset_index()
df3=df3.reset_index()
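df_final = pd.concat([df1, df2, df3], axis=1).reindex(df3.index)
This should work since you mentioned that the date in df2 will always be one day after df1 or df3, as long as the three dataframes have the same number of rows in the same order, because the rows are then matched by position. Note that join_axes was removed in pandas 1.0, which is why reindex is used here.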
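The sketch below shows the positional idea on made-up data. It is a variant of the answer rather than a copy: it drops the date index of each frame with reset_index(drop=True) instead of keeping it as a column, and it relabels the result with df3's dates afterwards. The column names a, b and c are placeholders.

```python
import pandas as pd

# Made-up frames: df2's dates run one day ahead of df1 and df3.
df1 = pd.DataFrame({"a": [1, 2]}, index=pd.to_datetime(["2019-06-30", "2019-07-01"]))
df2 = pd.DataFrame({"b": [3, 4]}, index=pd.to_datetime(["2019-07-01", "2019-07-02"]))
df3 = pd.DataFrame({"c": [5, 6]}, index=pd.to_datetime(["2019-06-30", "2019-07-01"]))

# Discard the date labels so that rows are matched purely by position.
parts = [frame.reset_index(drop=True) for frame in (df1, df2, df3)]
df_final = pd.concat(parts, axis=1)

# Label the combined rows with df3's dates.
df_final.index = df3.index
print(df_final)
```

This only gives a meaningful result when all frames have the same number of rows in the same order.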

Merge multiple dataframes using multiindex in python

I have 3 series which are generated by the code shown below. I have shown the code for one series below.
I would like to merge 3 such series/dataframes using columns (subject_id,hadm_id,icustay_id) but unfortunately these headings don't appear as column names. How do I convert them as columns and use them for merging with another series/dataframe of similar datatype
I am generating series from another dataframe (df) based on the condition given below. Though I already tried converting this series to dataframe, still it doesn't display the indices, instead it displays the column name as index. I have shown the output below. I would like to see the values 'Subject_id','hadm_id','icustay_id' as column names in dataframe along with other column 'val_bw_80_110' so that I can join with other dataframes using these 3 ids ('Subject_id','hadm_id','icustay_id')
s1 = df.groupby(['subject_id','hadm_id','icustay_id'])['val_bw_80_110'].mean()
I expect an output where the ids (subject_id,hadm_id,icustay_id) are converted to column names and can be used for joining/merging with other dataframes.
You can add parameter as_index=False to DataFrame.groupby or use Series.reset_index:
df = df.groupby(['subject_id','hadm_id','icustay_id'], as_index=False)['val_bw_80_110'].mean()
Or:
df = df.groupby(['subject_id','hadm_id','icustay_id'])['val_bw_80_110'].mean().reset_index()
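Once the ids are ordinary columns, the results can be merged on them. The following is a minimal sketch on made-up data; the second value column, val_gt_110, is a hypothetical name standing in for one of the other series mentioned in the question.

```python
import pandas as pd

keys = ["subject_id", "hadm_id", "icustay_id"]

# Made-up data; val_gt_110 is a hypothetical second measurement column.
df = pd.DataFrame(
    {
        "subject_id": [1, 1, 2],
        "hadm_id": [10, 10, 20],
        "icustay_id": [100, 100, 200],
        "val_bw_80_110": [1.0, 3.0, 5.0],
        "val_gt_110": [2.0, 4.0, 6.0],
    }
)

# Either form turns the group keys into regular columns.
s1 = df.groupby(keys)["val_bw_80_110"].mean().reset_index()
s2 = df.groupby(keys, as_index=False)["val_gt_110"].mean()

# The three ids can now be used as merge keys.
merged = s1.merge(s2, on=keys)
print(merged)
```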

pandas read_csv create new column and usecols at the same time

I'm trying to load multiple csv files into a single dataframe df while:
adding column names
adding and populating a new column (Station)
excluding one of the columns (QD)
All of this works fine until I attempt to exclude a column with usecols, which throws the error Too many columns specified: expected 5 and found 4.
Is it possible to create a new column and pass usecols at the same time?
The reason I'm creating & populating a new 'Station' column during read_csv is my dataframe will contain data from multiple stations. I can work around the error by doing read_csv in one statement and dropping the QD column in the next with df.drop('QD', axis=1, inplace=True) but want to make sure I understand how to do this the most pandas way possible.
Here's the code that throws the error:
df = pd.concat(pd.read_csv("http://lgdc.uml.edu/common/DIDBGetValues?ursiCode=" + row['StationCode'] + "&charName=MUFD&DMUF=3000",
skiprows=17,
delim_whitespace=True,
parse_dates=[0],
usecols=['Time','CS','MUFD','Station'],
names=['Time','CS','MUFD','QD','Station']
).fillna(row['StationCode']
).set_index(['Time', 'Station'])
for index, row in stationdf.iterrows())
Example StationCode from stationdf: BC840.
Data sample: 2016-09-19T00:00:05.000Z 100 19.34 //
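read_csv can only name and select columns that actually exist in the file, so leave Station out of both names and usecols and create the new column afterwards using operator chaining with assign: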
df = pd.read_csv(...).assign(StationCode=row['StationCode'])
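A fuller sketch of that idea, reading from an in-memory string instead of the URL so it can be run offline. The helper name read_station is hypothetical, the single data line is the sample from the question, and sep=r"\s+" is used as the whitespace separator in place of delim_whitespace. The new column is called Station here so that it matches the set_index call in the question.

```python
import io

import pandas as pd

# The single sample line from the question, standing in for the downloaded file.
sample = "2016-09-19T00:00:05.000Z 100 19.34 //\n"


def read_station(source, station_code):
    """Read one station's data, drop QD, and tag rows with the station code."""
    return (
        pd.read_csv(
            source,
            sep=r"\s+",
            header=None,
            names=["Time", "CS", "MUFD", "QD"],  # only columns present in the file
            usecols=["Time", "CS", "MUFD"],      # QD is excluded at read time
            parse_dates=["Time"],
        )
        .assign(Station=station_code)            # new column added after reading
        .set_index(["Time", "Station"])
    )


# In the question this would loop over stationdf.iterrows() and pass a URL.
df = pd.concat([read_station(io.StringIO(sample), code) for code in ["BC840"]])
print(df)
```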
