I have a pandas dataset with measurements acquired at a varying sample time.
I'm trying to plot the data against elapsed time, but the time axis gets messed up.
I've read that .autofmt_xdate() should solve the issue, but it only works for data with a fixed sampling frequency. With my data, the x axis has no labels at all.
A simple example of both cases:
import pandas as pd
import matplotlib.pyplot as plt
idx1 = pd.to_timedelta(['00:00:00', '00:00:30', '00:01:00', '00:01:30', '00:02:00'])
idx2 = pd.to_timedelta(['00:00:01', '00:00:30', '00:01:00', '00:01:30', '00:02:00'])
vals = range(5)
s1= pd.Series(vals, idx1)
s2= pd.Series(vals, idx2)
# Labels on x are ok
plt.figure()
plt.gca().set_title('fixed frequency f=30s')
s1.plot()
plt.gcf().autofmt_xdate()
plt.show()
# Labels on x are messed up
plt.figure()
plt.gca().set_title('variable frequency')
s2.plot()
plt.gcf().autofmt_xdate()
plt.show()
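One possible workaround, offered only as a sketch: convert the timedelta index to elapsed seconds and format the tick labels yourself, so the labels no longer depend on the index having a regular frequency. The helper name fmt_mmss and the mm:ss label format are assumptions made for illustration; they are not part of the original example.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

idx2 = pd.to_timedelta(['00:00:01', '00:00:30', '00:01:00', '00:01:30', '00:02:00'])
s2 = pd.Series(range(5), idx2)

# Elapsed time as plain floats (seconds), which matplotlib handles like any number.
seconds = s2.index.total_seconds()

def fmt_mmss(value, pos=None):
    """Hypothetical helper: format a number of seconds as mm:ss."""
    sign = "-" if value < 0 else ""
    minutes, secs = divmod(int(round(abs(value))), 60)
    return f"{sign}{minutes:02d}:{secs:02d}"

fig, ax = plt.subplots()
ax.plot(seconds, s2.values)
ax.xaxis.set_major_formatter(FuncFormatter(fmt_mmss))
ax.set_title('variable frequency')
fig.autofmt_xdate()  # only rotates the labels here
```

Call plt.show() afterwards to display the figure.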
Good day everyone, I'm new here and hope someone can guide me and help with my query.
Is there a way to plot the wave of a signal using Python? I have 9 points of frequency and power, and I want to plot them using Python 3.6.
I found some resources here and here and here and here, and I have tried the code below, but I want the graph to look like a wave rather than the way it does now. Any suggestions?
The code is:
# importing the required module
import matplotlib.pyplot as plt
# x axis values
x = [54,58,61,62,64,65,66,69,72] # frequency
# corresponding y axis values
y = [2,2.5,4,3,2.5,3.5,4.5,3,2] # Power
# plotting the points
plt.plot(x, y)
# naming the x axis
plt.xlabel('x - axis')
# naming the y axis
plt.ylabel('y - axis')
# giving a title to my graph
plt.title('My first graph!')
# function to show the plot
plt.show()
Below is code for a sine wave. How do I modify it to use these values for frequency and power: freq = [54,58,61,62,64,65,66,69,72] # frequency and Power = [2,2.5,4,3,2.5,3.5,4.5,3,2] # Power
import numpy as np
import matplotlib
matplotlib.use('TKAgg') #use matplotlib backend TkAgg (optional)
import matplotlib.pyplot as plt
sample_rate = 200 # sampling frequency in Hz (atleast 2 times f)
t = np.linspace(0,5,sample_rate) #time axis
f = 100 #Signal frequency in Hz
sig = np.sin(2*np.pi*f*(t/sample_rate))
plt.plot(t,sig)
plt.xlabel("Time")
plt.ylabel("Amplitude")
plt.tight_layout()
plt.show()
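One way to read the question is to build a single signal as a sum of sine waves, one per frequency, with each power value used as that component's amplitude. That interpretation is an assumption, as are the 1000 Hz sample rate and the 1 second duration chosen below. A minimal sketch:

```python
import numpy as np

freq = np.array([54, 58, 61, 62, 64, 65, 66, 69, 72], dtype=float)   # Hz
power = np.array([2, 2.5, 4, 3, 2.5, 3.5, 4.5, 3, 2], dtype=float)   # used as amplitude (assumption)

sample_rate = 1000   # Hz; assumed, well above 2 * max(freq)
duration = 1.0       # seconds; assumed
t = np.arange(int(duration * sample_rate)) / sample_rate  # time axis in seconds

# One row per component: amplitude * sin(2*pi*f*t); broadcasting gives shape (9, len(t)).
components = power[:, None] * np.sin(2 * np.pi * freq[:, None] * t)

# The combined signal is the sum over all components.
sig = components.sum(axis=0)
```

The result can be drawn with plt.plot(t, sig), using the same axis labels as in the code above.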
I am plotting a Matplotlib chart with 10000 x axis data points. To avoid the X axis labels overlapping, I have used a Major MultipleLocator of 40 and a minor MultipleLocator of 10. This code works for 1000 data points.
from matplotlib import pyplot as plt
import numpy as np
import matplotlib.ticker as mticker
## generating the data points (9999 here; the code works with 1000)
years = [i for i in range(1,10000)]
data = np.random.rand(len(years))
fig, ax = plt.subplots(figsize = (18,6))
ind = np.arange(len(data))
bars1 = ax.bar(ind, data,
label='Data')
ax.set_title("Data vs Year")
#Format Y Axis
ax.set_ylabel("Data")
ax.set_ylim((0,1))
#Format X Axis
ax.set_xticks(range(0,len(ind)))
ax.set_xticklabels(years)
ax.set_xlabel("Years")
ax.xaxis.set_major_locator(mticker.MultipleLocator(40))
ax.xaxis.set_major_formatter(mticker.FormatStrFormatter('%d'))
ax.xaxis.set_minor_locator(mticker.MultipleLocator(10))
fig.autofmt_xdate()
ax.xaxis_date()
plt.tight_layout()
plt.show()
The chart above produces the following error:
RuntimeError: Locator attempting to generate 1102 ticks from -510.0 to 10500.0: exceeds Locator.MAXTICKS
Can you please tell me the error in this chart?
First of all, you should remove these two lines:
ax.set_xticks(range(0,len(ind)))
ax.set_xticklabels(years)
These lines set 10000 ticks up front. Since you use ax.xaxis.set_major_locator() and ax.xaxis.set_minor_locator(), they are not needed. After that, the line ax.xaxis.set_minor_locator(mticker.MultipleLocator(10)) generates 1102 ticks, which exceeds the limit (mticker.Locator.MAXTICKS == 1000). In my testing, the argument has to be at least 12.
Passing a larger argument to mticker.MultipleLocator() produces fewer ticks.
If, for whatever reason, you really do need 277 major ticks (base 40) and 1102 minor ticks (base 10), you can raise the limit with mticker.Locator.MAXTICKS = 2000
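To see where these numbers come from, here is a rough estimate that simply counts the multiples of the locator base inside the range quoted in the error message. The helper count_multiples is hypothetical and is not matplotlib's own algorithm. Depending on the version, the real locator may add a tick or two beyond each end, so actual counts can differ slightly (for example 277 rather than 275 for a base of 40).

```python
import math

def count_multiples(vmin, vmax, base):
    """Hypothetical helper: count the multiples of `base` lying in [vmin, vmax]."""
    return math.floor(vmax / base) - math.ceil(vmin / base) + 1

VMIN, VMAX = -510.0, 10500.0  # range quoted in the error message
MAXTICKS = 1000               # limit quoted in the answer

for base in (10, 11, 12, 40):
    n = count_multiples(VMIN, VMAX, base)
    status = "over the limit" if n > MAXTICKS else "ok"
    print(f"base={base}: about {n} ticks ({status})")
```

By this estimate a base of 10 gives 1102 ticks and a base of 11 is still just over the limit, which fits the observation that 12 is the smallest argument that works.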
I'm currently learning how to import data and work with it in matplotlib, and I'm having trouble even though I have the exact code from the book.
This is what the plot looks like. My question is: how can I get rid of the white space at the start and the end of the x-axis?
Here is the code:
import csv
from matplotlib import pyplot as plt
from datetime import datetime
# Get dates and high temperatures from file.
filename = 'sitka_weather_07-2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    #for index, column_header in enumerate(header_row):
    #    print(index, column_header)
    dates, highs = [], []
    for row in reader:
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(current_date)
        high = int(row[1])
        highs.append(high)
# Plot data.
fig = plt.figure(dpi=128, figsize=(10,6))
plt.plot(dates, highs, c='red')
# Format plot.
plt.title("Daily high temperatures, July 2014", fontsize=24)
plt.xlabel('', fontsize=16)
fig.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=16)
plt.show()
An automatic margin is set at the edges so that the data fits nicely within the axis spines. In this case such a margin is probably desired on the y axis. By default it is set to 0.05 in units of the data span.
To set the margin to 0 on the x axis, use
plt.margins(x=0)
or
ax.margins(x=0)
depending on the context. Also see the documentation.
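To make the effect of the margin concrete, the following sketch plots the same made-up data (x running from 0 to 100) on two axes and reads back the resulting x limits. It assumes the default rcParams. With the default margin of 0.05, the first axis should come out padded by about 5 units on each side, while the second should end exactly at the data.

```python
import matplotlib.pyplot as plt

# Made-up illustrative data spanning x = 0 .. 100.
x = [0, 50, 100]
y = [1, 3, 2]

fig, (ax_default, ax_tight) = plt.subplots(1, 2, figsize=(10, 4))
ax_default.plot(x, y)   # keeps the default 5 % margin
ax_tight.plot(x, y)
ax_tight.margins(x=0)   # no padding on the x axis

default_xlim = ax_default.get_xlim()
tight_xlim = ax_tight.get_xlim()
print("default:", default_xlim)
print("margins(x=0):", tight_xlim)
```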
In case you want to get rid of the margin in the whole script, you can use
plt.rcParams['axes.xmargin'] = 0
at the beginning of your script (same for y of course). If you want to get rid of the margin entirely and forever, you might want to change the corresponding lines in the matplotlib rc file:
axes.xmargin : 0
axes.ymargin : 0
Example
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset('tips')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
tips.plot(ax=ax1, title='Default Margin')
tips.plot(ax=ax2, title='Margins: x=0')
ax2.margins(x=0)
Alternatively, use plt.xlim(..) or ax.set_xlim(..) to manually set the limits of the axes such that there is no white space left.
If you only want to remove the margin on one side but not the other, e.g. remove the margin from the right but not from the left, you can use set_xlim() on a matplotlib axes object.
import seaborn as sns
import matplotlib.pyplot as plt
import math
max_x_value = 100
x_values = [i for i in range (1, max_x_value + 1)]
y_values = [math.log(i) for i in x_values]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
sns.lineplot(ax=ax1, x=x_values, y=y_values)
sns.lineplot(ax=ax2, x=x_values, y=y_values)
ax2.set_xlim(-5, max_x_value) # tune the -5 to your needs
Consider the following pandas DataFrame:
labels values_a values_b values_x values_y
0 date1 1 3 150 170
1 date2 2 6 200 180
It is easy to plot this with Seaborn (see the example code below). However, because of the big difference between values_a/values_b and values_x/values_y, the bars for values_a and values_b are barely visible (the dataset above is just a sample; in my real dataset the difference is even bigger). Therefore, I would like to use two y-axes: one for values_a/values_b and one for values_x/values_y. I tried plt.twinx() to get a second axis, but the plot shows only two bars, for values_x and values_y, although at least there are now two y-axes with the right scaling. :) Do you have an idea how to fix that and get four bars for each label, with the values_a/values_b bars relating to the left y-axis and the values_x/values_y bars relating to the right y-axis?
Thanks in advance!
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records([("date1", 1, 3, 150, 170),\
("date2", 2, 6, 200, 180)],\
columns=columns)
# working example but with unreadable values_a and values_b
test_data_melted = pd.melt(test_data, id_vars=columns[0],\
var_name="source", value_name="value_numbers")
g = sns.barplot(x=columns[0], y="value_numbers", hue="source",\
data=test_data_melted)
plt.show()
# values_a and values_b are not displayed
values1_melted = pd.melt(test_data, id_vars=columns[0],\
value_vars=["values_a", "values_b"],\
var_name="source1", value_name="value_numbers1")
values2_melted = pd.melt(test_data, id_vars=columns[0],\
value_vars=["values_x", "values_y"],\
var_name="source2", value_name="value_numbers2")
g1 = sns.barplot(x=columns[0], y="value_numbers1", hue="source1",\
data=values1_melted)
ax2 = plt.twinx()
g2 = sns.barplot(x=columns[0], y="value_numbers2", hue="source2",\
data=values2_melted, ax=ax2)
plt.show()
This is probably best suited for multiple sub-plots, but if you are truly set on a single plot, you can scale the data before plotting, create another axis and then modify the tick values.
Sample Data
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records([("date1", 1, 3, 150, 170),\
("date2", 2, 6, 200, 180)],\
columns=columns)
test_data_melted = pd.melt(test_data, id_vars=columns[0],\
var_name="source", value_name="value_numbers")
Code:
# Scale the data, just a simple example of how you might determine the scaling
mask = test_data_melted.source.isin(['values_a', 'values_b'])
scale = int(test_data_melted[~mask].value_numbers.mean()
/test_data_melted[mask].value_numbers.mean())
test_data_melted.loc[mask, 'value_numbers'] = test_data_melted.loc[mask, 'value_numbers']*scale
# Plot
fig, ax1 = plt.subplots()
g = sns.barplot(x=columns[0], y="value_numbers", hue="source",\
data=test_data_melted, ax=ax1)
# Create a second y-axis with the scaled ticks
ax1.set_ylabel('X and Y')
ax2 = ax1.twinx()
# Put ticks at the same positions and match the limits, then relabel with unscaled values
ax2.set_yticks(ax1.get_yticks())
ax2.set_ylim(ax1.get_ylim())
ax2.set_yticklabels(np.round(ax1.get_yticks()/scale,1))
ax2.set_ylabel('A and B')
plt.show()
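The sub-plot alternative mentioned at the start of this answer could look roughly like the sketch below. It uses pandas' own bar plotting rather than Seaborn, purely to keep the example short. The variable names indexed, ax_small and ax_large are chosen here for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records(
    [("date1", 1, 3, 150, 170), ("date2", 2, 6, 200, 180)], columns=columns
)
indexed = test_data.set_index("labels")

# One subplot per value range, so each group of bars gets its own y scale.
fig, (ax_small, ax_large) = plt.subplots(1, 2, figsize=(10, 4))
indexed[["values_a", "values_b"]].plot.bar(ax=ax_small, rot=0, title="A and B")
indexed[["values_x", "values_y"]].plot.bar(ax=ax_large, rot=0, title="X and Y")
fig.tight_layout()
```

Each subplot then shows two bars per label on a y-axis that fits its own values. Call plt.show() to display the figure.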
I'm following this linear regression tutorial. Here's my code:
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt
dataframe = pd.read_fwf('brain_body.txt')
x_values = dataframe[['Brain']]
y_values = dataframe[['Body']]
body_reg = linear_model.LinearRegression()
body_reg.fit(x_values, y_values)
plt.scatter(x_values, y_values)
plt.plot(x_values, body_reg.predict(x_values))
plt.show()
When I run the script, I get no errors, but the graph doesn't seem to account for the y-values. I reduced the data points to three so it's easier to see:
I tried to manually change the y-axis with plt.ylim([-1000,7000]) but no luck.
Thanks for any suggestions!
There's nothing wrong with the code, it's just that you have a few very extreme values in relation to the rest of your data. Matplotlib expands the graph to show the extreme values, but that ends up in bunching all the others. Broadening your ylim will only increase the effect - try a much smaller ylim and xlim instead:
plt.ylim([0, 20])
plt.xlim([0, 2])
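If you would rather not pick the limits by hand, one option is to derive them from a percentile of the data so that a few extreme points do not dictate the view. This is only a sketch: the values below are made up for illustration (they are not the tutorial's data), and the helper zoom_limits, the lower bound of 0 and the 80th percentile cut-off are all assumptions.

```python
import numpy as np

# Made-up illustrative values: mostly small, with one extreme point at the end.
x = np.array([0.1, 0.2, 0.4, 0.5, 0.8, 1.0, 1.2, 1.5, 1.9, 5000.0])
y = np.array([1.0, 2.0, 3.5, 5.0, 7.0, 9.0, 12.0, 15.0, 19.0, 6000.0])

def zoom_limits(values, upper_pct=80):
    """Hypothetical helper: return (0, upper), with upper the given percentile of values."""
    return 0.0, float(np.percentile(values, upper_pct))

x_lim = zoom_limits(x)
y_lim = zoom_limits(y)
print(x_lim, y_lim)
```

The resulting tuples can be passed straight to plt.xlim(x_lim) and plt.ylim(y_lim) before plt.show().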