Plotting glitch with matplotlib [python3]? - EDITED - python-3.x

I have some issues plotting "large" datasets of timeseries data in python, where the time jumps across a few decades in erroneous samples. We aim to visualise only the timestamp (unixtime + custom microseconds) vs index. In this example there are roughly 40k samples.
Basically, I am assuming it is some issue with the rendering of the plot by matplotlib, because when I move the axes, both the scatter points and also the lineplot seem to glitch all over the place. A further bit of evidence for this is that the line in the lineplot is not actually going through the markers, when I zoom in or pan the plot.
The timestamps are continuous and increase by 40ms between steps.
Overview of errors (timestamp is zero -> default date 1.1.1970)
Zoomed in on y axis
More zoomed in
Example of how the plot should look
Timestamp raw data (ignore ms fraction 2)
Code used to plot (using google colab, re-created in Visual Studio Code)
import plotly.express as px  # `px` in the call below is Plotly Express, not matplotlib

if single_file_or_multiple == "multiple":
    fig = px.line(df_trace, x=df_trace.index, y="time", markers=True,
                  color="rec_id")
    fig.show()

Related

XV-11 Lidar read data with Raspberry pi 3B+

I have an XV-11 Lidar sensor from an old vacuum cleaner and I want to use it for a robot project.
During my research, I saw a very interesting and simple approach that uses Matplotlib to display all the distances as scatter points, e.g. (https://udayankumar.com/2018/08/01/working-with-lidar/). When I run this Python code on the RP3, a Matplotlib window does pop up with all the distances, but the data refreshes too slowly to view in real time: the Matplotlib display falls a few dozen seconds behind the sensor readings.
My next idea was to write something myself with the following display lines, but I get the same result: good readings, but heavily delayed.
points = []
plt.ion()
x = dist_mm*np.cos(angle_rad)
y = dist_mm*np.sin(angle_rad)
points.append([x, y])
points = np.array(points)
plt.scatter(points[:, 0], points[:, 1])
if angle == 356:
    plt.plot()
    plt.draw()
    plt.pause(0.0001)
    plt.clf()
    print("-----------")
What am I doing wrong, or what can I improve in this case? I am expecting something like this:
Lidar animation, source: https://github.com/Hyun-je/pyrplidar (this example uses a different Lidar sensor)
You are clearing and re-creating the axes, background etc. every time. At the very least you can limit this drawing/re-drawing to only the relevant plot points for a degree of improvement.
If you're not familiar with this, I'd start with the animation guidance at https://matplotlib.org/stable/api/animation_api.html, which introduces some basics like updating only parts of the figure.
If you're still churning out too much data to update, then limiting how often you read your data, or more specifically the rate at which you redraw, might result in more stability too.
It is probably worth hunting down more general guidance on realtime plotting, e.g. "update frame in matplotlib with live camera preview".
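As a rough illustration of "updating only the relevant plot points", here is a minimal sketch that creates the figure and one scatter artist once, then only swaps the point coordinates for each full revolution. The helper names `polar_to_xy` and `show_revolution`, the fixed axis limits of ±6000 mm, and the fake constant-distance scan are assumptions made for this example; they are not taken from the question's code or from any Lidar library.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for a live window
import matplotlib.pyplot as plt
import numpy as np


def polar_to_xy(angles_deg, dist_mm):
    """Convert one revolution of (angle, distance) readings to an (N, 2) array."""
    angle_rad = np.deg2rad(angles_deg)
    return np.column_stack((dist_mm * np.cos(angle_rad),
                            dist_mm * np.sin(angle_rad)))


# Build the figure, axes and scatter artist a single time.
fig, ax = plt.subplots()
ax.set_xlim(-6000, 6000)   # assumed sensor range in mm
ax.set_ylim(-6000, 6000)
ax.set_aspect("equal")
scat = ax.scatter([], [], s=4)


def show_revolution(angles_deg, dist_mm):
    """Replace the scatter points in place instead of clearing the figure."""
    scat.set_offsets(polar_to_xy(angles_deg, dist_mm))
    fig.canvas.draw_idle()
    fig.canvas.flush_events()


# Fake scan: every angle reports 1000 mm (hypothetical data).
angles = np.arange(360)
show_revolution(angles, np.full(360, 1000.0))
```

With an interactive backend, `show_revolution` would be called once per completed revolution rather than once per reading, which also limits the redraw rate as suggested above.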

pyplot - plot with a lot of arrow annotations is very slow

I have code that generates a plot using Python's pyplot. The plot is very fast with a large number of points, but when I also add a large number of arrow annotations it becomes very slow, and each pan or zoom action takes a long time.
This is the line in the code where I add the annotations:
arrow = ax.annotate('', pointA, pointB, annotation_clip=False,
                    arrowprops=dict(arrowstyle='simple'))
Any suggestions on how I can speed up the plot?
Thanks

Stripplot and boxplot outliers do not overlap

I have been combining boxplots and stripplots with seaborn and I noticed that the boxplot outliers often have larger values than the largest values displayed by the stripplot. How can this be? The boxplot outliers as well as the stripplot points are supposed to be real data points, right?
This is the code I used to generate the graph:
data_long = pd.melt(data, id_vars=['var'])
sns.boxplot(x='value', y='var', data=data_long, hue='variable', orient='h',
            order=sorted(values), palette='Set3')
sns.stripplot(x='value', y='var', data=data_long, hue='variable', orient='h', dodge=True, palette='Set3',
              edgecolor='black', linewidth=1, jitter=True)
plt.semilogx(basex=2)
Here is the example:
Does anybody have any idea what is going on?
Highest regards.
As I was making this question nicer, trying to get rid of that -1, I noticed that I pass order=sorted(values) only to the boxplot, which makes the data differ between the boxplot and the stripplot. Adding the order parameter to the stripplot as well solves the problem.
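The fix described above can be sketched as follows: compute the category order once and pass the same `order` to both calls, so each row of boxes lines up with its strip points. The small DataFrame, the column names `m1`/`m2`, and the `shared` dict are made up for illustration; `ax.set_xscale("log", base=2)` is used here as an assumed modern equivalent of the question's `plt.semilogx(basex=2)`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data in the same wide layout as the question's `data`.
data = pd.DataFrame({
    "var": ["b", "a", "c", "a", "b", "c"],
    "m1": [1.0, 2.0, 4.0, 8.0, 16.0, 32.0],
    "m2": [2.0, 3.0, 5.0, 9.0, 17.0, 33.0],
})
data_long = pd.melt(data, id_vars=["var"])

# One explicit category order, shared by both plot calls.
order = sorted(data_long["var"].unique())
shared = dict(x="value", y="var", data=data_long, hue="variable",
              orient="h", order=order, palette="Set3")

fig, ax = plt.subplots()
sns.boxplot(ax=ax, **shared)
sns.stripplot(ax=ax, dodge=True, edgecolor="black", linewidth=1,
              jitter=True, **shared)
ax.set_xscale("log", base=2)
```

Keeping the common keyword arguments in one dict makes it harder for the two calls to drift apart again.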

matplotlib.pyplot.imshow awkwardly not plotting all of the data when array is transposed

I am trying to plot an array full of ones and zeros and most of the time it works well and looks like this.
However, when my array becomes too big (I need to plot 60,000x70) the plot only draws part of the data.
At first I thought that this might be some sort of memory issue, but the arrays actually are not that big after all and when looking into memory usage there also was no sign of too heavy lifting.
It becomes really weird, however, when I plot the transposed array, because then it works like a breeze.
I looked around in forums quite a lot but apparently nobody else has had such an issue. Might this be a bug? I really need to plot it in the original orientation. So, any help is highly appreciated. Thanks in advance!
UPDATE
This exactly reproduces my problem:
import numpy as np
import matplotlib.pyplot as plt
# generate fake data
a = np.random.random((60000, 70))
for x in np.nditer(a, op_flags=['readwrite']):
    if x > 0.9:
        x[...] = 1
    else:
        x[...] = 0
# plot fake data
fig, axes = plt.subplots(2, 2)
axes[0][0].imshow(a, interpolation='none', cmap='binary', aspect='auto')
axes[0][1].imshow(a.T, interpolation='none', cmap='binary', aspect='auto')
axes[1][0].imshow(a[:30000], interpolation='none', cmap='binary', aspect='auto')
axes[1][1].imshow(a[:30000].T, interpolation='none', cmap='binary', aspect='auto')
plt.show()
The code yields this. In the upper left subplot everything is plotted. In the plot showing the transposed array (upper right), however, matplotlib only draws the first ~10000 columns. The lower two plots just show the first half of the array (left normal, right transposed) and as you can see, with smaller arrays there is no issue.
SOLVED
The problem only occurs with outdated versions of matplotlib; it does not occur with matplotlib 2.x.

Histogram in logarithmic scale in gnuplot

I have to plot a histogram in logarithmic scale on both axes using gnuplot. I need the bins to be equally spaced in log10. Using a logarithmic scale on the y axis isn't a problem. The main problem is creating the bins on the x axis. For example, using 10 bins in log10, the first bins will be [1],[2],[3]....[10 - 19][20 - 29].....[100 190] and so on. I've searched the net but I couldn't find any practical solution. If doing it in gnuplot is too complicated, could you suggest some other software/language to do it?
As someone asked I will explain more specifically what I need to do. I have a (huge) list like this:
1 14000000
2 7000000
3 6500000
.
.
.
.
6600 1
8900 1
15000 1
19000 1
It shows, for example, that 14 million IP addresses have sent 1 packet, 7 million have sent 2 packets, ..., 1 IP address has sent 6600 packets, ..., 1 IP address has sent 19000 packets. As you can see, the values on both axes are pretty high, so I cannot plot it without a logarithmic scale.
The first thing I tried, because I needed to do it fast, was plotting this list as it is with gnuplot, setting logscale on both axes and using boxes. The result is understandable but not really appropriate. In fact, the boxes become thinner and thinner going right on the x axis because, obviously, there are more points in 10-100 than in 1-10! So it becomes a real mess after the second decade.
I tried plotting a histogram with both axes logarithmically scaled, and gnuplot threw the error
Log scale on X is incompatible with histogram plots.
So it appears that gnuplot does not support a log scale on the x axis with histograms.
Plotting in log-log scale in GnuPlot is perfectly doable contrary to the other post in this thread.
One can set the log-log scale in GnuPlot with the command set logscale.
Then, the assumption is that we have a file with positive (strictly non-zero) values both in the x-axis, as well as the y-axis. For example, the following file is a valid file:
1 0.5
2 0.2
3 0.15
4 0.05
After setting the log-log scale one can plot the file with the command:
plot "file.txt" w p
where of course file.txt is the name of the file. This command will generate the output with points.
Note also that plotting boxes is tricky and is probably not recommended. One first has to restrict the x-range with a command of the form set xrange [1:4] and only then plot with boxes. Otherwise, when the x-range is undefined an error is returned. I am assuming that in this case plot requires (for appropriate x-values) some boxes to have size log(0), which of course is undefined and hence the error is returned.
Hope it is clear and it will also help others.
Have you tried Matplotlib with Python? Matplotlib is a really nice plotting library and when used with Python's simple syntax, you can plot things quite easily:
import matplotlib.pyplot as plot
figure = plot.figure()
axis = figure.add_subplot(1 ,1, 1)
axis.set_yscale('log')
# Rest of plotting code
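To make the "rest of plotting code" concrete for the question's data, here is a minimal sketch that rebins the (packets, number of IP addresses) list into bins of equal width in log10 and draws them on log-log axes. The arrays hold only the rows visible in the question (the elided middle of the list is left out), and `bins_per_decade = 10` is one reading of "10 bins in log10"; both are assumptions of this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Excerpt of the list from the question: packets sent, number of IP addresses.
packets = np.array([1, 2, 3, 6600, 8900, 15000, 19000])
n_ips = np.array([14_000_000, 7_000_000, 6_500_000, 1, 1, 1, 1])

# Bin edges equally spaced in log10, covering whole decades from 1 upward.
bins_per_decade = 10
n_decades = int(np.ceil(np.log10(packets.max() + 1)))
edges = np.logspace(0, n_decades, n_decades * bins_per_decade + 1)

# Each row already is a count, so it enters the histogram as a weight.
counts, _ = np.histogram(packets, bins=edges, weights=n_ips)

fig, ax = plt.subplots()
# Bars span from each left edge to the next edge, so they look
# equally wide once the x axis is logarithmic.
ax.bar(edges[:-1], counts, width=np.diff(edges), align="edge",
       edgecolor="black")
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("packets sent")
ax.set_ylabel("number of IP addresses")
```

Because the bin widths differ on a linear scale, dividing `counts` by `np.diff(edges)` would give a density instead of raw totals; which of the two is wanted depends on the analysis.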

Resources