My data set is:
var data = [
[2010,"Internet",19],
[2010,"Water",20],
[2011,"Internet",36],
[2011,"Water",44],
[2012,"Internet",51],
[2012,"Water",53],
[2013,"Internet",39],
[2013,"Water",30]
]
var allHeights = [-250,-215,-185,-140, -85, 40, 85, 140, 185, 215, 250];
var allColors = [
d3.rgb(203,23,23),d3.rgb(108,74,246),d3.rgb(9,235,175),d3.rgb(159,172,38),d3.rgb(236,105,18),
d3.rgb(137,195,95),d3.rgb(241,137,108),d3.rgb(253,233,127),d3.rgb(109,151,216),d3.rgb(15,199,58)
];
I want to draw a y-axis on the extreme right of the svg. The axis should be divided into points corresponding to "allHeights", and each point should be labelled with the year (i.e. data[i][0]).
I also want to draw a legend for data[i][2] and fill each legend entry with allColors[i].
Please help.
I have a DataFrame that contains two features namely LotFrontage and LotArea.
I want to plot a bar graph to show the relation between them.
My code is:
import pandas as pd
import matplotlib.pyplot as plt
visual_df=pd.DataFrame()
visual_df['area']=df_encoded['LotArea']
visual_df['frontage']=df_encoded['LotFrontage']
visual_df.dropna(inplace=True)
plt.figure(figsize=(15,10))
plt.bar(visual_df['area'],visual_df['frontage'])
plt.show()
The column LotFrontage is in Float datatype.
What is wrong with my code, and how can I correct it?
To see a relationship between two features, a scatter plot is usually much more informative than a bar plot. To draw a scatter plot via matplotlib: plt.scatter(visual_df['area'], visual_df['frontage']). You can also invoke pandas scatter plot, which automatically adds axis labels: df.plot(kind='scatter', x='area', y='frontage').
For a lot of statistical purposes, seaborn can be handy. sns.regplot not only creates the scatter plot but automatically also tries to fit the data with a linear regression and shows a confidence interval.
from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns
area = [8450, 9600, 11250, 9550, 14260, 14115, 10084, 6120, 7420, 11200, 11924, 10652, 6120, 10791, 13695, 7560, 14215, 7449, 9742, 4224, 14230, 7200]
frontage = [65, 80, 68, 60, 84, 85, 75, 51, 50, 70, 85, 91, 51, 72, 68, 70, 101, 57, 75, 44, 110, 60]
df = pd.DataFrame({'area': area, 'frontage': frontage})
sns.regplot(x='area', y='frontage', data=df)
plt.show()
PS: The main problem with the intended bar plot is that the x-values lie very far apart. Moreover, the default bar width is 0.8 in data units, so on an x-axis spanning thousands of units the bars can get too narrow to see. Adding an explicit edge color can make them visible:
plt.bar(visual_df['area'], visual_df['frontage'], ec='blue')
You could set a larger width, but then some bars would start to overlap.
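To get a feel for how little room there is, here is a minimal sketch that computes the largest width at which bars at distinct x-values would not yet overlap. The helper name widest_non_overlapping_width is made up for this illustration; the area values are the sample data from above.

```python
def widest_non_overlapping_width(xs):
    """Smallest gap between distinct x-values (hypothetical helper).

    Bars are centered on their x-value by default, so two neighbouring bars
    of this width just touch. Duplicate x-values still coincide exactly.
    """
    distinct = sorted(set(xs))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct x-values")
    return min(b - a for a, b in zip(distinct, distinct[1:]))


area = [8450, 9600, 11250, 9550, 14260, 14115, 10084, 6120, 7420, 11200, 11924,
        10652, 6120, 10791, 13695, 7560, 14215, 7449, 9742, 4224, 14230, 7200]

width = widest_non_overlapping_width(area)
print(width, max(area) - min(area))  # 15 10036
```

Passing width=width to plt.bar would make the two closest bars just touch, but each bar would still be only 15 units wide on an axis spanning about 10,000 units, so the bars remain thin.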
Alternatively, pandas barplot would treat the x-axis as categorical, showing all x-values next to each other, as if they were strings. The bars are drawn in the order of the dataframe, so you might want to sort first:
df.sort_values('area').plot(kind='bar', x='area', y='frontage')
plt.tight_layout()
I need to plot two features of a dataframe, where df['DEPTH'] should be on the y-axis (inverted) and df['SPECIES'] on the x-axis. Imagining the plot as a varying line, I would like to fill with color the area between the line and the y-axis (the left side of the line). So I wrote some code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'DEPTH': [100, 150, 200, 250, 300, 350, 400, 450, 500, 550],
                   'SPECIES': [12, 8, 9, 6, 10, 7, 4, 3, 1, 2]})
plt.plot(df['SPECIES'], df['DEPTH'])
plt.fill_between(df['SPECIES'], df['DEPTH'])
plt.ylabel('DEPTH')
plt.xlabel('SPECIES')
plt.ylim(np.max(df['DEPTH']), np.min(df['DEPTH']))
I tried 'plt.fill_between', but then the left part of the plot doesn't get completely filled.
Does anyone know how the filled part (blue color) can reach the y-axis?
Instead of fill_between, you can use fill_betweenx. It will start filling from 0 by default, thus you need to set your x limit to be 0 too.
plt.plot(df['SPECIES'], df['DEPTH'])
# changing fill_between to fill_betweenx -- the order also changes
plt.fill_betweenx(df['DEPTH'], df['SPECIES'])
plt.ylabel('DEPTH')
plt.xlabel('SPECIES')
plt.ylim(np.max(df['DEPTH']), np.min(df['DEPTH']))
# setting the lower limit to 0 for the filled area to reach y axis.
plt.xlim(0,np.max(df['SPECIES']))
plt.show()
The result is below.
I would like to remove the lines that are currently shown in the picture and also put a number next to each bar, showing the value that belongs to it. How can I do it?
The values are from a data set taken from Kaggle.
Here is some code to help you get the requested layout.
The states and the numbers are from Wikipedia.
import matplotlib.pyplot as plt
states = ['Acre', 'Alagoas', 'Amazonas', 'Amapá', 'Bahia', 'Ceará', 'Federal District',
'Espírito Santo', 'Goiás', 'Maranhão', 'Minas Gerais', 'Mato Grosso do Sul',
'Mato Grosso', 'Pará', 'Paraíba', 'Pernambuco', 'Piauí', 'Paraná', 'Rio de Janeiro',
'Rio Grande do Norte', 'Rondônia', 'Roraima', 'Rio Grande do Sul', 'Santa Catarina',
'Sergipe', 'São Paulo', 'Tocantins']
fires = [2918, 73, 7625, 24, 2383, 327, 68, 229, 1786, 5596, 2919, 451, 15476, 10747, 81, 132,
2818, 181, 396, 68, 6441, 4608, 2029, 1107, 62, 1616, 6436]
fires, states = zip(*sorted(zip(fires, states))) #sort both arrays on number of fires
fires = fires[-15:] # limit to the 15 highest numbers
states = states[-15:]
fig, ax = plt.subplots(figsize=(8, 6))
ax.barh(states, fires, color="#08519c")
plt.box(False) # remove the complete box around the plot
plt.xticks([]) # remove all the ticks on the x-axis
ax.yaxis.set_ticks_position('none') # removes the tick marks on the y-axis but leaves the text
for i, v in enumerate(fires):
    ax.text(v + 180, i, f'{v:,}'.replace(',', '.'), color='#08519c', fontweight='normal', ha='left', va='center')
plt.subplots_adjust(left=0.22) # more space to read the names
plt.title('Wildfires Brazil 2019', fontsize=20, y=0.98) # title larger and a bit lower
plt.show()
PS: about
for i, v in enumerate(fires):
    ax.text(v + 180, i, f'{v:,}'.replace(',', '.'), color='#08519c', fontweight='normal', ha='left', va='center')
Here v goes through each element of fires, one by one, and i is its index, so fires[i] == v. ax.text(x, y, 'some text') puts a text at position (x, y), measured in the same units as the ticks marked on the axes (that's why the axes are shown by default). When an axis carries text labels instead of numbers, the labels are numbered internally 0, 1, 2, 3, ... . So, x=v + 180 is the x-position where a number of fires of v+180 would be, and y=i is just the position of label number i.
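To make the positions concrete, here is a small sketch that only computes the (x, y, text) triples the loop would hand to ax.text, without drawing anything. It uses three of the fire counts from above; the names offset and labels are introduced just for this illustration.

```python
fires = [2918, 7625, 15476]  # three of the values above, already sorted
offset = 180                 # horizontal gap between bar end and label

# x is in "number of fires" units, y is the internal index of the bar,
# and the text swaps the thousands separator from ',' to '.'
labels = [(v + offset, i, f'{v:,}'.replace(',', '.')) for i, v in enumerate(fires)]

for x, y, text in labels:
    print(x, y, text)
# 3098 0 2.918
# 7805 1 7.625
# 15656 2 15.476
```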
So I've been trying to plot some data. I have got the data to fetch from a database and placed it all correctly into the variable text_. This is the snippet of the code:
import sqlite3
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from dateutil.parser import parse
fig, ax = plt.subplots()
# Twin the x-axis twice to make independent y-axes.
axes = [ax, ax.twinx(), ax.twinx()]
# Make some space on the right side for the extra y-axis.
fig.subplots_adjust(right=0.75)
# Move the last y-axis spine over to the right by 20% of the width of the axes
axes[-1].spines['right'].set_position(('axes', 1.2))
# To make the border of the right-most axis visible, we need to turn the frame on. This hides the other plots, however, so we need to turn its fill off.
axes[-1].set_frame_on(True)
axes[-1].patch.set_visible(False)
# And finally we get to plot things...
text_ = [('01/08/2017', 6.5, 143, 88, 60.2, 3), ('02/08/2017', 7.0, 146, 90, 60.2, 4),
('03/08/2017', 6.7, 142, 85, 60.2, 5), ('04/08/2017', 6.9, 144, 86, 60.1, 6),
('05/08/2017', 6.8, 144, 88, 60.2, 7), ('06/08/2017', 6.7, 147, 89, 60.2, 8)]
colors = ('Green', 'Red', 'Blue')
label = ('Blood Sugar Level (mmol/L)', 'Systolic Blood Pressure (mm Hg)', 'Diastolic Blood Pressure (mm Hg)')
y_axisG = [text_[0][1], text_[1][1], text_[2][1], text_[3][1], text_[4][1], text_[5][1]] #Glucose data
y_axisS = [text_[0][2], text_[1][2], text_[2][2], text_[3][2], text_[4][2], text_[5][2]] # Systolic Blood Pressure data
y_axisD = [text_[0][3], text_[1][3], text_[2][3], text_[3][3], text_[4][3], text_[5][3]] # Diastolic Blood Pressure data
AllyData = [y_axisG, y_axisS, y_axisD] #list of the lists of data
dates = [text_[0][0], text_[1][0], text_[2][0], text_[3][0], text_[4][0], text_[5][0]] # the dates as strings
x_axis = [(parse(x, dayfirst=True)) for x in dates] #converting the dates to datetime format for the graph
Blimits = [5.5, 130, 70] #lower limits of the axis
Tlimits = [8, 160, 100] #upper limits of the axis
for ax, color, label, AllyData, Blimits, Tlimits in zip(axes, colors, label, AllyData, Blimits, Tlimits):
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y')) # formats the dates
    plt.gca().xaxis.set_major_locator(mdates.DayLocator())
    data = AllyData
    ax.plot(data, color=color) # plots the data on each y-axis
    ax.set_ylim([Blimits, Tlimits]) # limits
    ax.set_ylabel(label, color=color) # y-axis labels
    ax.tick_params(axis='y', colors=color)
axes[0].set_xlabel('Date', labelpad=20)
plt.gca().set_title("Last 6 Month's Readings",weight='bold',fontsize=15)
plt.show()
The code currently makes this graph:
Graph with no x-values
I understand the problem is probably in the ax.plot part, but I'm not sure what exactly. I tried changing that line to ax.plot(data, x_axis, color=color); however, this messed up the whole graph and the dates didn't show up on the x-axis like I wanted them to.
Is there something I've missed?
If this has been answered elsewhere, please can you show me how to implement that into my code by editing my code?
Thanks a ton
Apparently x_axis is never actually used in the code. Instead of
ax.plot(data, color=color)
which plots the data against its indices, you would want to plot the data against the dates stored in x_axis.
ax.plot(x_axis, data, color=color)
Finally, adding plt.gcf().autofmt_xdate() just before plt.show will rotate the dates nicely, such that they don't overlap.
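As a side note, the x and y lists can be built without indexing every row by hand. The following is only a sketch on the first three rows of text_: it transposes the rows with zip and, as an assumption to keep it free of third-party imports, uses datetime.strptime with a day-first format in place of dateutil's parse.

```python
from datetime import datetime

text_ = [('01/08/2017', 6.5, 143, 88, 60.2, 3), ('02/08/2017', 7.0, 146, 90, 60.2, 4),
         ('03/08/2017', 6.7, 142, 85, 60.2, 5)]

# Transpose rows into columns; the last two columns are not plotted
dates, glucose, systolic, diastolic, *_ = zip(*text_)

# Day-first date strings converted to datetime objects for the x-axis
x_axis = [datetime.strptime(d, '%d/%m/%Y') for d in dates]

# Each y column must have one value per date before it is plotted against x_axis
for column in (glucose, systolic, diastolic):
    assert len(column) == len(x_axis)

print(x_axis[0].date(), glucose[0])  # 2017-08-01 6.5
```

Each column can then be passed as the second argument, as in ax.plot(x_axis, glucose, color=color).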
How do I plot an area around a set of points on a map in R? E.g.
library(maps)
map('world')
map.axes()
p <- matrix(c(50, 50, 80, 100, 70, 40, 25, 60), ncol=2) # make some points
points(p, pch=19, col="red")
polygon(p, col="blue")
... which gives me a polygon with a vertex at each of the points, but it looks rather crappy. Is there any way to "smooth" the polygon into some sort of curve?
One option is to make a polygon bounded by a Bézier curve, using the bezier function in the Hmisc package. However I cannot get the start/end point to join up neatly. For example:
## make some points
p <- matrix(c(50, 50, 80, 100, 70, 40, 25, 60), ncol=2)
## add the starting point to the end
p2 <- cbind(1:5,p[c(1:4,1),])
## linear interpolation between these points
t.coarse <- seq(1,5,0.05)
x.coarse <- approx(p2[,1],p2[,2],xout=t.coarse)$y
y.coarse <- approx(p2[,1],p2[,3],xout=t.coarse)$y
## create a Bezier curve
library(Hmisc)
bz <- bezier(x.coarse,y.coarse)
library(maps)
map('world')
map.axes()
polygon(bz$x,bz$y, col=rgb(0,0,1,0.5),border=NA)
Here's one way: draw the polygon by hand and make it as pretty as you like. This really has nothing to do with areas on maps; it is more about how you generate the vertices of your polygon.
library(maps)
p <- matrix(c(50, 50, 80, 100, 70, 40, 25, 60), ncol=2)
plot(p, pch = 16, col = "red", cex = 3, xlim = range(p[,1]) + c(-10,10), ylim = range(p[,2]) + c(-5, 5))
map(add = TRUE)
#click until happy, right-click "stop" to finish
p <- locator(type = "l")
map()
polygon(cbind(p$x, p$y), col = "blue")
Otherwise you could interpolate intermediate vertices and smooth them somehow, and in the context of a lon/lat map maybe use reprojection to get more realistic line segments, but it depends on your purpose.
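As one possible reading of "interpolate and smooth", here is a rough sketch of Chaikin's corner-cutting scheme applied to the four points from the question. It is written in Python rather than R purely for illustration, and the function name chaikin_closed is made up; this is a swapped-in technique, not something the maps package provides.

```python
def chaikin_closed(points, iterations=3):
    """Smooth a closed polygon by repeatedly cutting its corners.

    Every edge (including the one from the last vertex back to the first) is
    replaced by two points at 1/4 and 3/4 of its length, so the vertex count
    doubles on each pass and the outline rounds off.
    """
    pts = list(points)
    for _ in range(iterations):
        smoothed = []
        n = len(pts)
        for i in range(n):
            (x0, y0), (x1, y1) = pts[i], pts[(i + 1) % n]
            smoothed.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            smoothed.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        pts = smoothed
    return pts


# The rows of the matrix p from the question, as (lon, lat) pairs
p = [(50, 70), (50, 40), (80, 25), (100, 60)]
smooth = chaikin_closed(p, iterations=3)
print(len(smooth))  # 32
```

The resulting vertices stay inside the original polygon and form a closed ring, so there is no start/end seam to join up; they could be handed to polygon() in the same way as the hand-drawn ones.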