How to save matplotlib figure as pickle file after calling show? - python-3.x

I am trying to save figures created with matplotlib as pickle files (I am not sure whether other file formats would work) so that I can reopen a figure and modify it further as needed. However, I would like to see the plot before saving. I found similar questions:
1. Save python matplotlib figure as pickle,
2. Save plot to image file instead of displaying it using Matplotlib,
3. python matplotlib save graph as data file,
4. Saving interactive Matplotlib figures.
But none of them calls show() before saving or discusses that case.
Here is the code:
import matplotlib.pyplot as plt
import pickle
fig, ax = plt.subplots()
ax.plot(list(range(1, 100, 10)), 'bo-')
plt.show() # enabling this does not save anything with pickle
# save the mpl figure as pickle format
with open('fig1.pkl', 'wb') as fs:
    pickle.dump(ax, fs)
plt.close("all")
# load the saved figure as mpl figure
with open('fig1.pkl', 'rb') as fl:
    fig1 = pickle.load(fl)
plt.show()
It works just fine if I do not call show() first, but saving with pickle does not seem to work if I do call it. Saving in other formats (.png/.pdf) works fine. Any pointers to what I am doing wrong would be much appreciated.
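One likely explanation, offered as an assumption rather than a confirmed diagnosis: a blocking show() returns only after the window has been closed, and by then the figure is no longer registered with pyplot, so the unpickled figure is not re-attached and the final plt.show() has nothing to display. A minimal sketch of a workaround is to serialize the figure before calling show() and write the bytes to disk afterwards. It pickles fig rather than ax, and the Agg backend line is only there so the sketch runs without a display.

```python
import pickle

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line on a desktop to get a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(list(range(1, 100, 10)), "bo-")

# Serialize while the figure is still registered with pyplot,
# i.e. before show() has had a chance to close it.
payload = pickle.dumps(fig)

plt.show()  # look at the plot first

# Write the bytes captured above.
with open("fig1.pkl", "wb") as fs:
    fs.write(payload)

plt.close("all")

# Load the saved figure; it is registered with pyplot again on load.
with open("fig1.pkl", "rb") as fl:
    fig1 = pickle.load(fl)
plt.show()
```

The reloaded fig1 is an ordinary Figure, so fig1.axes[0] can be modified further before saving again.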

Related

How to save images using matplotlib without displaying them?

I have a very large number (millions) of 2-D numpy arrays which need to be saved as images. One can save an individual image like this:
import numpy as np
import matplotlib.pyplot as plt
surface_profile = np.empty((50,50)) #numpy array to be saved
plt.figure()
plt.imshow(surface_profile)
save_filename='filename.png'
plt.savefig(save_filename)
However, this process also displays the image, which I don't need. If I keep saving millions of images like this, I should somehow avoid the imshow() function of matplotlib.
Any help???
PS: I forgot to mention that I am using Spyder.
The problem is that plt.imshow(surface_profile) creates the image through pyplot, which in an interactive environment such as Spyder also displays it.
You can write the array directly with PIL instead; try the following:
from PIL import Image
import numpy as np
surface_profile = np.empty((50,50)) #numpy array to be saved
im = Image.fromarray(surface_profile)
im = im.convert('RGB')
save_filename="filename.png"
im.save(save_filename, "PNG")
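If you would rather stay within matplotlib, here is a minimal sketch using matplotlib.image.imsave, which writes a 2-D array straight to an image file without creating a figure. The linspace array and the viridis colormap are stand-ins chosen for illustration; they are not part of the original question.

```python
import numpy as np
from matplotlib import image as mpimg

# Deterministic stand-in for the array to be saved (the question uses np.empty).
surface_profile = np.linspace(0.0, 1.0, 50 * 50).reshape(50, 50)

save_filename = "filename.png"

# imsave maps the 2-D array through a colormap and writes the file directly.
# No Figure or Axes is created, so nothing is displayed.
mpimg.imsave(save_filename, surface_profile, cmap="viridis")
```

Because no pyplot state is involved, this can be called in a loop over many arrays without accumulating open figures.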

Why will Seaborn function 'regplot' not run in Jupyter?

I am having trouble with the Seaborn regplot function in Jupyter notebooks on Watson Studio.
Using Python 3.6, the code appears to hang while processing until I interrupt it.
When I run it using IDLE on my Mac, the code runs perfectly and the plot shows.
It seems to happen with lmplot and regplot; boxplots etc. show as normal.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df = pd.read_csv('csv.csv')
sns.regplot(x = 'independent', y = 'dependent', data = df)
The expected result is a graph of the linear relationship between the two variables, but I just get a loading bar.
When I stop the kernel, the graph appears as a scatterplot with no line of best fit, and the notebook reports the error 'Keyboard Interrupted'.
Could this possibly be a bug? Thanks for your help.
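Set the ci parameter to None and it should solve your problem. regplot estimates the confidence interval by bootstrapping, which can be very slow on large datasets; ci=None skips that computation.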
sns.regplot(x = 'independent', y = 'dependent', data = df, ci = None)

How do I save each plot in individual pdf files while batch processing the x and y for the plot?

I have come back to Python after not practicing it for a while. I suspect my issue is a simple one, but I'm not sure how to address it. If you could share your insights, it would be a huge help. My current code is as follows:
import numpy as np
import matplotlib.pyplot as plt
import csv
import os
myfiles = os.listdir('input') # all my input files are accessed
Then I do a whole bunch of math, generate my output files, and save them in the 'output' folder. Each output file contains x and y columns, as the following code suggests.
with open('output/'+file,'w') as f:
    for a,b in zip(my_x,my_y):
        f.write('{0:f},{1:f}\n'.format(a,b))
My main question is about the following part, where I want to plot each output file and save the plot as a PDF.
with open('output/'+file,'w') as f:
    for a in zip(my_x,my_y):
        fig, ax = plt.subplots()
        ax.plot(my_x,my_y)
        plt.xlim(3500,5700)
        plt.show()
        ax.set_ylabel('y_value')
        ax.set_title('x')
        fig.savefig()
The error message I get is
savefig() missing 1 required positional argument: 'fname'
Without the figure part of the code, the code runs fine (reads the input and generates the output files fine).
Any suggestions as to how to save figure for each of the output file?
If my code provided here is not sufficient enough to understand what's going on, let me know. I can provide more!
Would this do what you want?
with open('output/'+file,'w') as f:
    for a in zip(my_x,my_y):
        fig, ax = plt.subplots()
        ax.plot(my_x,my_y)
        plt.xlim(3500,5700)
        plt.show()
        ax.set_ylabel('y_value')
        ax.set_title('x')
        fig.savefig('output/' + file + '.pdf')
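Note that opening an output file with mode 'w' truncates it, and the loop over zip(my_x, my_y) redraws the full curve once per data point. As a rough sketch of an alternative, the helper below reads each already-written two-column file back and saves one PDF per file. The function name save_plots, the use of the file name as the title, and the assumption that every file in the folder is comma-separated x,y data are illustrative choices, not part of the original code.

```python
import os

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, since nothing needs to be displayed
import matplotlib.pyplot as plt
import numpy as np


def save_plots(output_dir="output"):
    """Plot every two-column CSV file in output_dir and save each plot as a PDF."""
    saved = []
    for name in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, name)
        # Skip sub-folders and PDFs produced by an earlier run.
        if name.endswith(".pdf") or not os.path.isfile(path):
            continue
        # ndmin=2 keeps the shape consistent even for single-row files.
        x, y = np.loadtxt(path, delimiter=",", unpack=True, ndmin=2)
        fig, ax = plt.subplots()
        ax.plot(x, y)
        ax.set_xlim(3500, 5700)
        ax.set_ylabel("y_value")
        ax.set_title(name)
        pdf_path = path + ".pdf"
        fig.savefig(pdf_path)
        plt.close(fig)  # free the figure so memory does not grow with the file count
        saved.append(pdf_path)
    return saved
```

Calling save_plots('output') after the output files have been written returns the list of PDF paths it created.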

pdf not responsive when getting multiple pages in it using savefig

I am using matplotlib pyplot to generate plots. I want all my plots in a single PDF, so I am using PdfPages. I am able to generate a single PDF with multiple plots, but as the number of pages increases the PDF becomes slow to respond, i.e., it opens and displays the first page, but when I scroll to other pages I have to wait for some time before they appear. The code I used is
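import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib.ticker import FormatStrFormatter
pdf = PdfPages('plots.pdf') # file name assumed; the original snippet omits this line
plt.figure(figsize=(16, 9))
ax = plt.subplot(121)
ax.yaxis.set_major_formatter(FormatStrFormatter('%.0f'))
# dataframe, numericColumn1 and numericColumn2 come from the asker's own data
dataframe.plot.scatter(x=numericColumn2, y=numericColumn1, ax = ax)
pdf.savefig() # save the current figure before closing it
plt.close()
pdf.close()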
Is there some parameter or something else I can add so that the PDF does not take so long to respond?
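One thing that may help, assuming the slowdown comes from each page holding many individual vector markers, is to rasterize the scatter points so the viewer draws a single embedded image per axes while text and axes stay vector. The sketch below uses random data and a hypothetical helper name, write_scatter_pdf, purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, the pages are only written to disk
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages


def write_scatter_pdf(path, n_pages=3, n_points=20000, seed=0):
    """Write n_pages scatter plots to one PDF, rasterizing the markers."""
    rng = np.random.default_rng(seed)
    with PdfPages(path) as pdf:
        for _ in range(n_pages):
            fig, ax = plt.subplots(figsize=(16, 9))
            x, y = rng.random(n_points), rng.random(n_points)
            # rasterized=True embeds the markers as a bitmap instead of
            # thousands of separate vector objects.
            ax.scatter(x, y, s=4, rasterized=True)
            # dpi sets the resolution of the rasterized layer.
            pdf.savefig(fig, dpi=150)
            plt.close(fig)
    return path
```

The trade-off is that rasterized markers no longer stay sharp at extreme zoom, so the dpi value is worth tuning against file size and viewer speed.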

Python3x + MatPlotLib - Updating a chart?

I am new to both Python and matplotlib and am working on something for my husband.
I hope you guys can help me out.
I would like to pull in a file using open, read it, and update a graph with its values.
Sounds easy enough right? Not so much in practice.
Here is what I have so far to open and chart the file. This works fine as it is to chart the file 1 time.
import matplotlib.pyplot as plt
fileopen = open('.../plotresults.txt', 'r').read()
fileopen = eval(fileopen) ##because the file contains a dict and security is not an issue.
print(fileopen) ## So I can see it working
for key,value in fileopen.items():
    plot1 = value
    plt.plot(plot1, label=str(key))
plt.legend()
plt.show()
Now I would like to animate the chart or update it so that I can see changes to the data. I have tried to use matplotlib's animation feature but it is advanced beyond my current knowledge.
Is there a simple way to update this chart, say every 5 minutes?
Note:
I tried using Schedule but it breaks the program (maybe a conflict between schedule and having matplotlib figures open??).
Any help would be deeply appreciated.
Unfortunately you will just waste time trying to get a clean solution without either using matplotlib's animation feature or using the matplotlib OO interface.
As a dirty hack you can use the following:
from threading import Timer
from matplotlib import pyplot as plt
import numpy
# Your data generating code here
def get_data():
    data = numpy.random.random(100)
    label = str(data[0]) # dummy label
    return data, label
def update():
    print('update')
    plt.clf()
    data, label = get_data()
    plt.plot(data, label=label)
    plt.legend()
    plt.draw()
    t = Timer(0.5, update) # restart update in 0.5 seconds
    t.start()
update()
plt.show()
However, the Timer spins off a second thread, so to kill the script you have to hit Ctrl-C twice in the console.
I myself would be interested if there is a cleaner way to do this in this simple manner in the confines of the pyplot machinery.
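For comparison, here is a minimal sketch of the same periodic refresh using matplotlib's own FuncAnimation, which runs the update on the GUI event loop rather than in a second thread. The get_data function is the dummy generator from above; the 500 ms interval mirrors the Timer example and would become 5 * 60 * 1000 for a five-minute refresh. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line to see a live window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation


def get_data():
    # Dummy data; replace with the code that re-reads the results file.
    data = np.random.random(100)
    label = str(data[0])  # dummy label
    return data, label


fig, ax = plt.subplots()


def update(frame):
    # Redraw the axes from scratch with fresh data on every tick.
    ax.clear()
    data, label = get_data()
    ax.plot(data, label=label)
    ax.legend()


# interval is in milliseconds; keep a reference to anim so it is not garbage collected.
anim = FuncAnimation(fig, update, interval=500, cache_frame_data=False)
plt.show()
```

With an interactive backend, closing the window stops the timer, so no extra Ctrl-C is needed.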
