I am having trouble plotting a graph in matplotlib because I cannot get my data into the right form to pass to matplotlib.
Here is my data
I have converted it to the following dataframe:
2016 0.333945
2017 0.330923
2018 0.321857
2019 0.312790
<class 'pandas.core.frame.DataFrame'>
by using following code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("portfolio.txt")
companyname = "AAPL"
frames = df.loc[:, df.columns.str.startswith(companyname)]
l1 = frames.loc['2015-6-1':'2019-6-10']
plt.plot(l1, label="Company Past Information")
plt.xlabel('Risk Aversion')
plt.ylabel('Optimal Investment Portfolio')
plt.title('Optimal Investment Portfolio For Low, Medium & High')
After plotting with matplotlib, I get the correct output where data exists.
But where data is not available, the graph is plotted incorrectly.
2016 NaN
2017 NaN
2018 NaN
2019 NaN
Because of this, I am unable to plot the graph correctly.
Please help me resolve this.
Thanks in advance
If you're reading your data in from a .csv using pandas, you can:
import pandas as pd
df = pd.read_csv(your_csv, parse_dates=[0]) # 0 means your dates are in the first column
Otherwise, you can convert your date column to datetime using:
import pandas as pd
df['date'] = pd.to_datetime(df['date'])
Then, with matplotlib, you can:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(df.iloc[:, 0], df.loc[:, some_column])
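Putting the two steps together, here is a minimal sketch. The inline CSV text and the column names `date` and `AAPL` are assumptions standing in for the real file, and one value is left empty on purpose to show how a missing entry behaves.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the real file; the column names are assumptions.
csv_text = """date,AAPL
2016-01-01,0.333945
2017-01-01,0.330923
2018-01-01,
2019-01-01,0.312790
"""

# parse_dates=[0] converts the first column to datetime64 while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=[0])

fig, ax = plt.subplots()
# The empty 2018 value is read as NaN; matplotlib leaves a gap in the line there.
lines = ax.plot(df["date"], df["AAPL"], marker="o", label="AAPL")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

If the gaps are not wanted, calling `dropna()` on the plotted column before plotting is one option, though whether that is appropriate depends on what the missing values mean.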
I have a dataframe in Pandas in which the rows are observations at different times and each column is a size bin where the values represent the number of particles observed for that size bin. So it looks like the following:
       bin1  bin2  bin3  bin4  bin5
Time1    50   200    30    40     5
Time2    60    60    40   420   700
Time3    34   200    30    67    43
I would like to use plotly/cufflinks to create a scatterplot in which the x axis will be each size bin, and the y axis will be the values in each size bin. There will be three colors, one for each observation.
As I'm more experienced in Matlab, I tried indexing the values using iloc (note the example below is just trying to plot one observation):
df.iplot(kind="scatter",theme="white",x=df.columns, y=df.iloc[1,:])
But I just get a "KeyError: 0" message.
Is it possible to use indexing when choosing x and y values in Pandas?
Rather than indexing, I think you need to better understand how pandas and matplotlib interact with each other.
Let's go step by step for your case:
As the pandas.DataFrame.plot documentation says, the plotted series is a column. You have the series in the row, so you need to transpose your dataframe.
To create a scatterplot, you need both x and y coordinates in different columns, but you are missing the x column, so you also need to create a column with the x values in the transposed dataframe.
Apparently pandas does not change color by default with consecutive calls to plot (matplotlib does it), so you need to pick a color map and pass a color argument, otherwise all points will have the same color.
Here is a working example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Here I copied your data into a data.txt text file and imported it into pandas as a CSV.
# You may have a different way to get your data.
df = pd.read_csv('data.txt', sep=r'\s+', engine='python')
# I assume there is a column named 'time' holding the row labels shown in your post.
tdf = df.transpose()  # transpose the dataframe
# Drop the time row from the transposed dataframe; time is not data to be plotted.
# Transposing mixed columns yields object dtype, so convert back to numbers.
tdf = tdf.drop('time').astype(float)
# Create the x values. I go from 1 to 5, but they can be different.
tdf['xval'] = np.arange(1, len(tdf) + 1)
# Choose a colormap and make a list of colors to be used.
colormap = plt.cm.rainbow
colors = [colormap(i) for i in np.linspace(0, 1, len(tdf))]
# Make an empty plot; the columns will be added to the axes in the loop.
fig, axes = plt.subplots(1, 1)
for i, cl in enumerate([datacol for datacol in tdf.columns if datacol != 'xval']):
    tdf.plot(x='xval', y=cl, kind="scatter", ax=axes, color=colors[i])
This plots the following image:
Here is a tutorial on picking colors in matplotlib.
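As for the original question about indexing: it does work if you hand the row to matplotlib directly instead of going through the DataFrame plotting wrapper. The following is only a sketch of that alternative, not the plotly/cufflinks call from the question. The frame is rebuilt by hand from the table in the post, and the axis labels are my own choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame(
    {
        "bin1": [50, 60, 34],
        "bin2": [200, 60, 200],
        "bin3": [30, 40, 30],
        "bin4": [40, 420, 67],
        "bin5": [5, 700, 43],
    },
    index=["Time1", "Time2", "Time3"],
)

fig, ax = plt.subplots()
for label, row in df.iterrows():
    # x is the list of bin names, y is one observation (one row).
    # Each ax.scatter call takes the next color from the default cycle.
    ax.scatter(list(df.columns), row.values, label=label)
ax.set_xlabel("size bin")
ax.set_ylabel("particle count")
ax.legend()
```

Here no transpose and no extra x column are needed, because matplotlib accepts the bin names as categorical x values.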
I am having difficulty getting plot.bar to group the bars together the way I have them grouped in the dataframe. The dataframe returns the grouped data correctly; however, the bar graph shows a separate bar for every line in the dataframe. Ideally, my code below should group 3-6 bars together for each department (Dept X should have bars grouped together for each type, with the count of true/false on the Y axis).
dname   Type  purchased
Dept X  0     False         141
              True          270
        1     False        2020
              True         2604
        2     False        2023
              True         1047
import psycopg2
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
##connection and query data removed
df = pd.merge(df_departments[["id", "dname"]], df_widgets[["department", "widgetid", "purchased","Type"]], how='inner', left_on='id', right_on='department')
df.set_index(['dname'], inplace=True)
dx=df.groupby(['dname', 'Type','purchased'])['widgetid'].size()
dx.plot.bar(x='dname', y='widgetid', rot=90)
I can't be sure without a more reproducible example, but try unstacking the innermost level of the MultiIndex of dx before plotting:
dx.unstack().plot.bar(rot=90)
I expect this to work because when plotting a DataFrame, each column becomes a legend entry and each row becomes a category on the horizontal axis.
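To illustrate, here is a small sketch using only the six counts shown in the question. The Series is built by hand, so the exact index layout is an assumption about what the groupby returns.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the question; the hand-built MultiIndex is an assumption.
index = pd.MultiIndex.from_product(
    [["Dept X"], [0, 1, 2], [False, True]],
    names=["dname", "Type", "purchased"],
)
dx = pd.Series([141, 270, 2020, 2604, 2023, 1047], index=index)

# unstack() moves the innermost level ('purchased') into the columns,
# leaving one row per (dname, Type) pair.
wide = dx.unstack()

# Each row becomes a group on the x axis, each column a bar within the group.
ax = wide.plot.bar(rot=90)
```

No `x` or `y` arguments are passed: after unstacking, `dname` and `Type` live in the index and the counts are spread over the `False` and `True` columns.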
I want to explicitly set the order of the stacks in a Matplotlib stackplot. Here is an example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.DataFrame(np.random.randint(0,100,size=(100,4)),columns=list('ABCD'))
This produces the following image:
The last row of the dataframe is:
     A   B   C   D
99  16  30  84  57
Here is what I want to achieve:
I want to re-order the plot of the stacks such that the stacks are plotted from the bottom up A, B, D, C i.e. the columns ordered from the bottom up, by the order of their increasing values in the last row of the df.
So far, I have tried re-ordering explicitly the columns in the df before plotting:
but this produces exactly the same chart as above.
Thank you for any help here!
The graphs are not the same. Look at the areas beneath the red graph for a particular x. The shapes of the green and blue shaded areas differ between the two graphs.
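For reference, one way to make the ordering explicit is to sort the column labels by the last row and pass the columns to stackplot in that order. This is a sketch under assumptions: the random data is seeded for repeatability, and the last row is overwritten with the values from the question so the expected order A, B, D, C can be checked.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list("ABCD"))
df.iloc[-1] = [16, 30, 84, 57]  # last row as given in the question

# Column labels sorted by increasing value in the last row.
order = df.iloc[-1].sort_values().index.tolist()

fig, ax = plt.subplots()
# stackplot draws its rows from the bottom up, in the order given.
polys = ax.stackplot(df.index, df[order].to_numpy().T, labels=order)
ax.legend(loc="upper left")
```

Because the legend labels are passed in the same order as the data, the legend shows which band ended up where.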
And now,