I am trying to graph roughly 15k different data points. I have tried every variation I can think of to reduce the number of xtics, but I can't seem to change the frequency at which they are drawn.
Here's my gnuplot file:
# need to call with two variables -e "filename='...'" -e "machine='...'"
set title machine." Activity ".filename
set datafile separator ","
set autoscale x
set autoscale y
set autoscale y2
set yrange [0:*]
set y2range [0:*]
set y2tics
set style data lines
set ylabel "% CPU"
set xlabel "Time"
set y2label "Memory (MB)"
set bmargin 7 # room for the xtic label
set term pngcairo size 960,720
set output filename.".png"
# I've also tried autofreq and explicit labels
set xtics axis out rotate 90 scale 0.5 (20101939, 1000000, 25102219)
plot filename \
using 2:xtic(1) title "CPU" with points pt 1 axes x1y1, \
"" using ($3 / 1024 / 1024):xtic(1) title "Memory" with points pt 1 axes x1y2
The format of my data is:
datetime(ddHHMMss), %cpu, mem-in-bytes, pid, process-alias, process-name
My data looks like the following (roughly sampled every 30 seconds for 15k records):
Despite my xtics command with explicit start, interval, and end, my graph always ends up with the xtic labels being overlapped. Here's what it looks like:
To filter the tic labels, replace xtic(1) with a criterion for printing a non-blank label. For example, this will print every 25th label.
plot filename \
using 2:xtic( int($0)%25 ? "" : strcol(1) ) title "CPU" with points pt 1
Here int($0) is the zero-based record (line) number, and strcol(1) is the content of column 1 read as a string.
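The selection rule can be checked outside gnuplot. The following Python sketch mirrors the ternary expression in the plot command; the function name `tic_label` and the placeholder labels are made up for illustration and are not part of the original data.

```python
def tic_label(record_index, first_column, every=25):
    """Mirror of gnuplot's  int($0)%25 ? "" : strcol(1)."""
    # $0 is zero-based, so records 0, 25, 50, ... keep their label
    return first_column if record_index % every == 0 else ""

# placeholder labels standing in for column 1 of the data file
first_column = ["t%03d" % i for i in range(60)]
labels = [tic_label(i, text) for i, text in enumerate(first_column)]

# indices of the records whose label is actually printed
labelled = [i for i, lab in enumerate(labels) if lab]
print(labelled)
```

With 60 records, only records 0, 25 and 50 keep a label; all others get the empty string, which gnuplot draws as a blank tic label.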
What I have:
CSV data with timestamps in the first column, followed by the columns I want to plot selectively.
The data points are roughly ten minutes apart and cover 24 hours. Everything else is set up nicely; my script is below.
What I want:
To map the formatted time data onto the x-axis (xrange?), for example xtics every n hours in a given format (like "%T, %A"), ideally configurable per column I want to plot (I am thinking about multiplot).
set title "Battery Log"
set datafile separator ','
set key center bottom outside
set border lw 0.5 lc '#959595'
set terminal svg dynamic rounded mouse lw 1 background '#272822'
set grid ytics
set ytics nomirror in
set yrange [0:100]
set xtics nomirror
set xtics rotate
set xdata time
set timefmt "%s"
set format x "%T, %A"
plot 'stats.csv' \
u 0:2 w l lc '#f92783' t columnheader, '' \
u 0:8 w l lc '#a6e22a' t columnheader
What about this?
### set time xtics
N = 3 # every n-th hour
set samples 100
set xdata time
set format x "%a, %H:%M"
set xtics rotate
set xtics N*3600
plot '+' u ($0*1200):(3*sin(x)+rand(0)) w lp pt 7 not
### end of code
This should give something like the following, with tics every 3rd hour.
Set N depending on the column you want to plot.
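As a cross-check of the arithmetic: `set xtics N*3600` places a tic every N*3600 seconds, and the format string is applied to each tic position. Here is a rough Python sketch of the same computation. It assumes that time zero is the Unix epoch in UTC (as in gnuplot 5) and that the plot spans 24 hours; the names `tic_positions` and `tic_labels` are introduced here for illustration.

```python
import time

N = 3                      # every N-th hour, as in the gnuplot script
step = N * 3600            # tic spacing in seconds
span = 24 * 3600           # assumed plotting range: one day

# tic positions in seconds since time zero
tic_positions = list(range(0, span + 1, step))

# "%a, %H:%M" is the same strftime-style format passed to `set format x`
tic_labels = [time.strftime("%a, %H:%M", time.gmtime(t)) for t in tic_positions]

for pos, label in zip(tic_positions, tic_labels):
    print(pos, label)
```

Changing N changes only the spacing; the label format stays whatever `set format x` specifies.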
I have the following data:
N Computed Value of pi
y, x
1, 8.0
10, 3.6
100, 3.36
1000, 3.212
10000, 3.152
100000, 3.14316
1000000, 3.14266
10000000, 3.1420448
100000000, 3.14190876
1000000000, 3.141573084
I am trying to format the y axis in powers of ten (10^x).
I used the following code:
set terminal pngcairo size 1280,800 enhanced font 'Helvetica,24'
set output "fig.png"
# Title, axis label, range and ticks
set title "Simulations"
set xlabel "Number of Iterations(n)"
set ylabel "Computed values"
# Legend location and grid
set key top left
set grid
set ytics out nomirror
set xtics out nomirror
set format y "10^{%L}"
# Plot the data
plot "data.dat" using 2:1 title "" with linesp lw 2 pt 7 ps 1.5
But I am getting the following output:
Please help.
I had to guess what you really want because there were some inconsistencies. Here is what I believe gets you where you want to be; the changes are commented in the script.
# Title, axis label, range and ticks
set title "Simulations"
set xlabel "Number of Iterations(n)"
set ylabel "Computed values"
# Legend location and grid
set datafile separator comma # gnuplot looks for spaces
# you must tell it about the comma
unset key # same as title "" as you have in your plot command
set grid
set ytics out nomirror
set xtics out nomirror
set logscale x # I guess that's what you want and how it should be
set format x "10^{%L}" # your x axis is labeled iterations, so I guess
# that's what you want
# Plot the data
plot [][2:10] "data.dat" using 1:2 with linesp lw 2 pt 7 ps 1.5
# swapped 2:1 so that the iterations are on the x axis
# introduced a range for y so that it is better to see
# 'title ""' removed, see 'unset key'
This gives you
It may not be exactly what you want, but I hope it moves you a step forward.
I am using gnuplot to plot data from two separate csv files (found in this link: https://drive.google.com/open?id=0B2Iv8dfU4fTUZGV6X1Bvb3c4TWs) with a different number of rows which generates the following graph.
These data seem to have no common timestamp (the first column) in both csv files and yet gnuplot seems to fit the plotting as shown above.
Here is the gnuplot script that I use to generate my plot.
# ###### GNU Plot
set style data lines
set terminal postscript eps enhanced color "Times" 20
set output "output.eps"
set title "Actual vs. Estimated Comparison"
set style line 99 linetype 1 linecolor rgb "#999999" lw 2
#set border 1 back ls 11
set key right top
set key box linestyle 50
set key width -2
set xrange [0:10]
set key spacing 1.2
#set nokey
set grid xtics ytics mytics
#set size 2
#set size ratio 0.4
#show timestamp
set xlabel "Time [Seconds]"
set ylabel "Segments"
set style line 1 lc rgb "#ff0000" lt 1 pi 0 pt 4 lw 4 ps 0
plot "estimated.csv" using ($1):2 with lines title "Estimated", "actual.csv" using ($1):2 with lines title "Actual";
Is there any way to print out (write to a file) the values at the intersections of these plots while ignoring the peaks above the green plot? I have also tried an SQL join query, but it prints nothing, for the reason I explained above.
PS: If the blue line doesn't touch the green line (i.e. if it is way below the green line), I want to take the values of the closest green line so that there is a one-to-one (or very close) correspondence with the actual dataset.
Perhaps one could somehow force Gnuplot to reinterpolate both data sets on a fine grid, save this auxiliary data and then compare it row by row. However, I think that it's indeed much more practical to delegate this task to an external tool.
It's certainly not the most efficient way to do it, nevertheless a "lazy approach" could be to read the data points, interpret each dataset as a LineString (collection of line segments, essentially equivalent to assuming a linear interpolation between data points) and then calculate the intersection points. In Python, the script to do this might look like this:
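#!/usr/bin/env python
import sys
import numpy as np
from shapely.geometry import LineString
def load_data(fname):
    # each CSV row is one (x, y) vertex of the polyline
    return LineString(np.genfromtxt(fname, delimiter=','))
lines = list(map(load_data, sys.argv[1:]))
result = lines[0].intersection(lines[1])
# a multi-part result exposes its parts via .geoms; a single geometry does not
for g in getattr(result, 'geoms', [result]):
    if g.geom_type != 'Point':
        continue  # skip overlapping segments, keep only crossing points
    print('%f,%f' % (g.x, g.y))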
Then in Gnuplot, one can invoke it directly:
set terminal pngcairo
set output 'fig.png'
set datafile separator comma
set yr [0:700]
set xr [0:10]
set xtics 0,2,10
set ytics 0,100,700
set grid
set xlabel "Time [seconds]"
set ylabel "Segments"
plot \
'estimated.csv' w l lc rgb 'dark-blue' t 'Estimated', \
'actual.csv' w l lc rgb 'green' t 'Actual', \
'<python filter.py estimated.csv actual.csv' w p lc rgb 'red' ps 0.5 pt 7 t ''
which gives:
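The regridding idea mentioned at the start of this answer can also be sketched outside gnuplot. The version below is only a rough illustration in plain Python, under several assumptions: both series are resampled by linear interpolation onto a common grid, and at each grid point the lower of the two values is kept, which clips estimated peaks that rise above the actual curve. That is one possible reading of the question, not something confirmed by the asker. The function names `interp` and `regrid_min`, the grid step, and the sample numbers are all made up for illustration.

```python
from bisect import bisect_right

def interp(x, xs, ys):
    """Linear interpolation of (xs, ys) at x; xs must be sorted ascending."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)          # now xs[i-1] <= x < xs[i]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

def regrid_min(est, act, step):
    """Resample both series on a common grid and keep the lower value."""
    (ex, ey), (ax, ay) = est, act
    lo, hi = max(ex[0], ax[0]), min(ex[-1], ax[-1])   # overlapping x range
    n = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n + 1)]
    return [(x, min(interp(x, ex, ey), interp(x, ax, ay))) for x in grid]

# made-up sample data, not taken from the linked CSV files
estimated = ([0.0, 1.0, 2.0, 3.0], [0.0, 300.0, 100.0, 400.0])
actual = ([0.0, 1.5, 3.0], [0.0, 150.0, 300.0])

rows = regrid_min(estimated, actual, 0.5)
for x, y in rows:
    print("%f,%f" % (x, y))
```

The printed rows use the same comma-separated layout as the intersection script, so they could be plotted from gnuplot in the same way. A finer step gives a closer approximation at the cost of more rows.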
I have a CSV file of malware data collected over 5 years. It has 2 columns, the date and the IP, and every date has one or more IPs. Example:
1/5/2013 12.234.123
1/5/2013 45.123.566
1/5/2013 100.546.12
1/6/2013 42.153.756
3/4/2014 75.356.258 etc... (every day for 5 years)
Now I am trying to get the percentage difference between consecutive months, for example:
November 2014 - 10%
December 2014 - 15%
I tried to put the percentage on the y axis and on the x2 axis, but I'm getting some crazy results. I am new to gnuplot and still learning it. Here is the code I have right now:
set title 'Results Per Month'
set xlabel 'Date'
set ylabel 'Percentage'
set terminal png size 2800,900
set datafile sep ','
set xdata time
set timefmt '%Y/%m/%d'
set xrange['2009/3/22':'2014/12/02']
set xtics 30*24*60*60
set format x '%Y/%m'
set autoscale x2fix
set x2tics
set x2range[0:*]
set format x2 "%g %%"
set xtics nomirror rotate by -90
set grid ytics xtics
set ytics 10
set yrange [0:*]
set term png
set output 'file.png'
plot 'export.csv' using (timecolumn(1) - (tm_mday(timecolumn(1))-1)*24*60*60):(1) smooth frequency w lp pt 7 ps 2 notitle, \
'' using (($1-$2)/$1*100):x2ticlabels(2) axes x2y1 with points ps 2 lw 2
I would suggest using an external script for this kind of preprocessing (it can also run on the fly). You can do it in gnuplot in two steps, but that gets quite complicated and requires deeper knowledge of gnuplot.
Here is a working script, but I won't go into detail about the many different aspects of the actual implementation:
set xdata time
set timefmt '%Y/%m/%d'
set datafile separator ','
set table 'temporaryfile.dat'
set format x '%Y/%m/%d'
plot 'export.csv' using (timecolumn(1) - (tm_mday(timecolumn(1))-1)*24*60*60):(1) smooth frequency
unset table
set y2tics
set ytics nomirror
set timefmt '"%Y/%m/%d"'
set format x '%b %Y'
set xtics rotate by 90 right
set datafile separator white
set yrange[0:*]
plot 'temporaryfile.dat' using 1:(strcol(3) eq "i" ? $2 : 1/0) w lp pt 7 ps 2 title 'IP count', \
'' using 1:(x1=x0, x0=$2, strcol(3) eq "i" ? ($0 == 0 || x0 == 0 ? 0 : (x0-x1)/x0 * 100.0) : 1/0) axes x1y2 w lp title 'percentual change'
Basically, you first write the result of smooth frequency to a temporary data file. Then you plot that file and do the calculations for the percentages.
Please note that I used a time format that corresponds to your test data (and the data of your previous question), not to the one in your script. Please pay attention to this.
Also note that the timefmt before the actual plot must include the quotation marks that are written around the dates in temporaryfile.dat.
Finally, the strcol(3) eq "i" test is necessary to circumvent a gnuplot bug that causes a last line with invalid data to be written.
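For the external-script route suggested at the top of this answer, here is a minimal Python sketch. It makes several assumptions: the file has two comma-separated columns (date, IP), dates are in the `%Y/%m/%d` format used in the gnuplot script above, and the percentage is computed with the same formula as the gnuplot expression, (current - previous) / current * 100. The function names `monthly_counts` and `percent_change` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime

def monthly_counts(rows, fmt="%Y/%m/%d"):
    """Count records per (year, month); rows are (date_string, ip) pairs."""
    counts = Counter()
    for date_string, _ip in rows:
        d = datetime.strptime(date_string.strip(), fmt)
        counts[(d.year, d.month)] += 1
    return sorted(counts.items())

def percent_change(counts):
    """Change relative to the current month, as in the gnuplot expression."""
    out, prev = [], None
    for month, cur in counts:
        # first month (no predecessor) and empty months get 0, as in the script
        pct = 0.0 if prev is None or cur == 0 else (cur - prev) / cur * 100.0
        out.append((month, cur, pct))
        prev = cur
    return out

# made-up sample in the assumed date,ip layout
sample = io.StringIO(
    "2014/10/01,10.0.0.1\n2014/10/05,10.0.0.2\n"
    "2014/11/02,10.0.0.3\n2014/11/09,10.0.0.4\n"
    "2014/11/20,10.0.0.5\n2014/11/21,10.0.0.6\n"
)
result = percent_change(monthly_counts(csv.reader(sample)))
for (year, month), count, pct in result:
    print("%d/%02d,%d,%.1f" % (year, month, count, pct))
```

For a real file, `io.StringIO(...)` would be replaced by `open('export.csv', newline='')`. Months without any records do not appear in the output at all, which matches what smooth frequency produces.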
I have the following graph:
The first data set displays searches.
The second data set displays clicks.
y1 shows the searches scale, y2 shows the clicks scale.
On x1 I have the time values displayed.
I wish to display the click values (for each hour) on x2 (the upper axis).
When I add the command set x2tics, it displays the searches data and not the clicks as I wished.
How do I change it so that it displays the clicks?
Gnuplot script:
set xlabel "Time"
set ylabel "Times"
set y2range [0:55000]
set y2tics 0, 1000
set ytics nomirror
set datafile separator "|"
set title "History of searches"
set xdata time # The x axis data is time
set timefmt "%Y-%m-%d %H:%M" # The dates in the file are formatted as YYYY-MM-DD HH:MM
set format x "%d/%m\n%H:%M"
set grid
set terminal png size 1024,768 # gnuplot recommends setting terminal before output
set output "outputFILE.png" # The output filename; to be set after setting
# terminal
load "labelsFILE"
plot 'goodFILE' using 1:3 lt 2 with lines t 'Success' , 'clicksFILE' using 1:2 lt 5 with lines t 'Clicks right Y' axis x1y2
graph http://img42.imageshack.us/img42/1269/wu0b.png
OK, to get started, here is how you can set a label with the number of clicks (using your data file names):
plot 'goodFILE' using 1:3 lt 2 with lines t 'Success',\
'clicksFILE' using 1:2 lt 5 with lines t 'Clicks right Y' axis x1y2,\
'' using 1:2:(sprintf("%dk", int($2/1000.0))) with labels axis x1y2 offset 0,1 t ''
Just use this as the plot command, and it should work fine.
To illustrate what the labels might look like, here is an example with some dummy data:
set terminal pngcairo
set output 'blubb.png'
set xlabel "Time"
set ylabel "Times"
set y2label "Clicks per hour"
set y2range [0:10000]
set yrange [0:1]
set ytics nomirror
set y2tics
set key left
set samples 11
set xrange[0:10000]
plot '+' using 1:1:(sprintf("%dk", int($1/1000.0))) every ::1::9 with labels axis x1y2 offset 0,1 t '',\
'' using 1:1 with linespoints axis x1y2 pt 7 t 'Clicks per hour'
Which gives you: