Gnuplot percentage difference between 2 months - gnuplot

I have a CSV file with five years of collected malware data. There are two columns, the date and the IP; every date has one or more IPs. Example:
1/5/2013 12.234.123
1/5/2013 45.123.566
1/5/2013 100.546.12
1/6/2013 42.153.756
3/4/2014 75.356.258 etc... (every day for 5 years)
Now I am trying to get the percentage difference between consecutive months, for example:
November 2014 - 10%
December 2014 - 15%
I tried to put the percentage on the y axis and on the x2 axis, but I'm getting some crazy results. I am new to gnuplot and still learning it. Here is the code I have right now:
set title 'Results Per Month'
set xlabel 'Date'
set ylabel 'Percentage'
set terminal png size 2800,900
set datafile sep ','
set xdata time
set timefmt '%Y/%m/%d'
set xrange['2009/3/22':'2014/12/02']
set xtics 30*24*60*60
set format x '%Y/%m'
set autoscale x2fix
set x2tics
set x2range[0:*]
set format x2 "%g %%"
set xtics nomirror rotate by -90
set grid ytics xtics
set ytics 10
set yrange [0:*]
set term png
set output 'file.png'
plot 'export.csv' using (timecolumn(1) - (tm_mday(timecolumn(1))-1)*24*60*60):(1) smooth frequency w lp pt 7 ps 2 notitle, \
'' using (($1-$2)/$1*100):x2ticlabels(2) axes x2y1 with points ps 2 lw 2

I would suggest using an external script for this kind of preprocessing (you can also do this on the fly). Yes, you can do this in gnuplot in two steps, but it can become quite complicated and requires more profound knowledge of gnuplot.
Here is a working script, but I won't go into detail about the many different aspects of the actual implementation:
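# Step 1: write the monthly counts produced by 'smooth frequency' to a temporary file
set xdata time
set timefmt '%Y/%m/%d'
set datafile separator ','
set table 'temporaryfile.dat'
set format x '%Y/%m/%d'
# the x expression maps every date to the first day of its month
plot 'export.csv' using (timecolumn(1) - (tm_mday(timecolumn(1))-1)*24*60*60):(1) smooth frequency
unset table
# Step 2: plot the counts (y axis) and the month-to-month change in percent (y2 axis)
set y2tics
set ytics nomirror
# the dates in the temporary file are enclosed in double quotes
set timefmt '"%Y/%m/%d"'
set format x '%b %Y'
set xtics rotate by 90 right
set datafile separator white
set yrange[0:*]
# x0 holds the current count, x1 the previous one
x0=x1=0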
plot 'temporaryfile.dat' using 1:(strcol(3) eq "i" ? $2 : 1/0) w lp pt 7 ps 2 title 'IP count', \
'' using 1:(x1=x0, x0=$2, strcol(3) eq "i" ? ($0 == 0 || x0 == 0 ? 0 : (x0-x1)/x0 * 100.0) : 1/0) axes x1y2 w lp title 'percentual change'
Basically, you first write the result of smooth frequency to a second data file. Then you can plot that file and do the calculations for the percentages.
Please note that I used a time format which corresponds to your test data (and the data of your previous question), which doesn't correspond to what you have in your script! Please pay attention to this.
Also note that the timefmt before the actual plot must be extended by the quote signs which are written around the dates in temporaryfile.dat.
Finally, the strcol(3) eq "i" check is necessary to circumvent a gnuplot bug which causes a last line with invalid data to be written.
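The external preprocessing suggested at the start of this answer could look roughly like the following Python sketch. It is only an illustration under stated assumptions: input lines have the form YYYY/MM/DD,ip (the format the gnuplot script above reads), and the function names monthly_counts and percent_changes as well as the sample addresses are made up. The percentage uses the same expression as the gnuplot script, (current - previous) / current * 100; dividing by the previous count instead would give the more conventional relative change. Months without any data do not appear in the result, so each change is relative to the previous month that has data.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(lines, date_format="%Y/%m/%d", separator=","):
    """Count data rows (one IP per row) for each calendar month."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        date_text = line.split(separator)[0].strip()
        day = datetime.strptime(date_text, date_format)
        counts[(day.year, day.month)] += 1
    # sort chronologically by (year, month)
    return dict(sorted(counts.items()))


def percent_changes(counts):
    """Return (year, month, count, change) tuples in chronological order.

    The change mirrors the gnuplot script: (current - previous) / current * 100.
    The first month gets 0 because it has no predecessor.
    """
    rows = []
    previous = None
    for (year, month), current in counts.items():
        if previous is None or current == 0:
            change = 0.0
        else:
            change = (current - previous) / current * 100.0
        rows.append((year, month, current, change))
        previous = current
    return rows


if __name__ == "__main__":
    # Made-up sample rows; in practice read the lines from export.csv.
    sample = [
        "2013/01/05,192.0.2.1",
        "2013/01/05,192.0.2.2",
        "2013/02/01,192.0.2.3",
        "2013/02/02,192.0.2.4",
        "2013/02/03,192.0.2.5",
        "2013/02/04,192.0.2.6",
    ]
    for year, month, count, change in percent_changes(monthly_counts(sample)):
        # one whitespace-separated row per month: date, count, change in percent
        print(f"{year}/{month:02d}/01 {count} {change:.1f}")
```

The printed rows are whitespace-separated, so gnuplot could plot column 2 on the y axis and column 3 on the y2 axis without the temporary table step.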

Related

Xtics too close together with GnuPlot and lots of datapoints

I am trying to graph roughly 15k different data points. I have tried all of the different variations that I can think of to reduce the number of xtics, but I can't seem to change the frequency that they are drawn.
Here's my gnuplot file:
reset
# need to call with two variables -e "filename='...'" -e "machine='...'"
set title machine." Activity ".filename
set datafile separator ","
set autoscale x
set autoscale y
set autoscale y2
set yrange [0:*]
set y2range [0:*]
set y2tics
set style data lines
set ylabel "% CPU"
set xlabel "Time"
set y2label "Memory (MB)"
set bmargin 7 # room for the xtic label
set term pngcairo size 960,720
set output filename.".png"
# I've also tried autofreq and explicit labels
set xtics axis out rotate 90 scale 0.5 (20101939, 1000000, 25102219)
plot filename \
using 2:xtic(1) title "CPU" with points pt 1 axes x1y1, \
"" using ($3 / 1024 / 1024):xtic(1) title "Memory" with points pt 1 axes x1y2
The format of my data is:
datetime(ddHHMMss), %cpu, mem-in-bytes, pid, process-alias, process-name
My data looks like the following (roughly sampled every 30 seconds for 15k records):
20101939,0,137932800,6172,process-alias,process-name
20102009,0.15623667978147077,139509760,6172,process-alias,process-name
20102039,0.15623669540380838,139866112,6172,process-alias,process-name
20102109,0.41663098777764329,141488128,6172,process-alias,process-name
20102139,0.052078915131769939,141455360,6172,process-alias,process-name
Despite my xtics command with explicit start, interval, and end, my graph always ends up with the xtic labels being overlapped. Here's what it looks like:
To filter the tic labels, replace xtic(1) with a criterion for printing a non-blank label. For example, this will print every 25th label.
plot filename \
using 2:xtic( int($0)%25 ? "" : strcol(1) ) title "CPU" with points pt 1
int($0) is the zero-based record (line) number, so labels are printed for records 0, 25, 50, and so on; strcol(1) is the content of column 1 read as a string.

set xtics by <start>, <incr> in gnuplot does not work

I plot with gnuplot the following:
$Data <<EOD
time_,value
1-23:59:58,1
1-23:59:59,2
2-00:00:00,3
2-00:00:01,4
2-00:00:02,5
EOD
set term png size 800,600
set output "ask.png"
set datafile separator comma
set grid
set xdata time
set timefmt "%d-%H:%M:%S"
set format x "%H:%M:%S"
set xtics nomirror
set autoscale xfix
set autoscale x2fix
startnumber=1
xticdata=2
mxticdata=2
set xtics xticdata rotate
set mxtics mxticdata
set x2data
set x2tics startnumber, xticdata rotate
set mx2tics mxticdata
set link x2 via x+1 inverse x-1
plot $Data using 1:2 title columnheader(2)
set output
Column 2, which contains nearly 50,000 records, holds the value of a parameter. set link has to be used to align the x axis and the x2 axis. I want the x2 tic labels to show a counter/index that is tied to the time in column 1.
The output is alright, but you can see from the attached figure that the labels on the x2 axis are big numbers, which is not what I want. I want labels like "1, 3, 5, ...".
So what's wrong with my code, and how can I correct it? Thanks.
If the idea is that the x2 axis should be labeled with the content of column 2 regardless of its numerical relationship to column 1, then you can use:
set xdata time
set timefmt "%d-%H:%M:%S"
set format x "%H:%M:%S"
set xtics nomirror
set x2tics nomirror
set link x2
plot $Data using 1:2:x2ticlabels( int($0+1)%2 ? strcol(2) : "" ) title columnheader(2)
This creates one x2 axis tick label for every data point. The even-numbered ones are set to blank and the odd-numbered ones are set to whatever is in column 2.
The quickest fix would be:
set link x2 via x-86397 inverse x+86397
But it depends on what time steps you have and what numbers you have in column 2. If your time step is strictly regular at 1 second and column 2 just counts up, then column 2 is redundant.
Timedata is handled internally as seconds from 01.01.1970 00:00:00.
One day has 86400 seconds. Check help time/date.
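The offset 86397 can be checked with a little arithmetic. The Python sketch below is not part of the original answer; it assumes, as the offset implies, that gnuplot fills in the missing month and year of "%d-%H:%M:%S" as January 1970, so that the first sample 1-23:59:58 lies 86398 seconds after the epoch. The helper names seconds_since_epoch and link_offset are made up for illustration.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_since_epoch(text, fmt="%d-%H:%M:%S"):
    """Parse a day-and-time string; the month and year are assumed to be January 1970."""
    parsed = datetime.strptime(text, fmt)
    moment = parsed.replace(year=1970, month=1, tzinfo=timezone.utc)
    return int((moment - EPOCH).total_seconds())


def link_offset(first_time_text, first_index):
    """Offset to subtract from x so that the first sample lands on first_index."""
    return seconds_since_epoch(first_time_text) - first_index


if __name__ == "__main__":
    offset = link_offset("1-23:59:58", 1)
    print("offset:", offset)
    # Check where a later sample ends up on the x2 axis.
    print("x2 for 2-00:00:00:", seconds_since_epoch("2-00:00:00") - offset)
```

With the data from the question the offset comes out as 86397, and the sample at 2-00:00:00 maps to 3, which matches its value in column 2.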

Problem with gnuplot - timestamp data mapping to xrange

What I have:
CSV data with timestamps in the first column, followed by columns I want to plot selectively.
The data points are roughly ten minutes apart and cover 24 hours. Everything else is set up nicely; examples below.
What I want:
To map the formatted time data onto the x axis (xrange?), for example xtics every n hours in a given format (like "%T, %A"), ideally configurable per column I want to plot (I am thinking about multiplot).
Data:
1545389400,39,0,0,1,664,2493,31.7
1545390000,37,0,0,1,736,3093,32.5
1545391200,33,0,0,1,664,4293,32.6
1545392400,28,0,0,1,704,5493,31.3
1545393000,26,0,0,0,649,6093,30.8
1545393600,24,0,0,0,632,6693,30.5
Code:
set title "Battery Log"
set datafile separator ','
set key center bottom outside
set border lw 0.5 lc '#959595'
set terminal svg dynamic rounded mouse lw 1 background '#272822'
set grid ytics
set ytics nomirror in
set yrange [0:100]
set xtics nomirror
set xtics rotate
set xdata time
set timefmt "%s"
set format x "%T, %A"
plot 'stats.csv' \
u 0:2 w l lc '#f92783' t columnheader, '' \
u 0:8 w l lc '#a6e22a' t columnheader
What about this?
### set time xtics
N = 3 # every n-th hour
set samples 100
set xdata time
set format x "%a, %H:%M"
set xtics rotate
set xtics N*3600
plot '+' u ($0*1200):(3*sin(x)+rand(0)) w lp pt 7 not
### end of code
which should give something like this, ticks every 3rd hour.
Set your N depending on the column you want to plot.
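To see which tick positions and labels a step of N hours produces for the timestamps in the question, here is a small Python sketch. It is only an illustration: it assumes the first column holds Unix epoch seconds interpreted as UTC, and that ticks fall on whole multiples of the step, which is how gnuplot places tics when only an increment is given. The function names hourly_ticks and label are made up. Note that the plot command in the question reads column 0 (the record number) as x; the sketch assumes the timestamps in column 1 are what should end up on the axis.

```python
from datetime import datetime, timezone


def hourly_ticks(first, last, every_hours):
    """Tick positions in epoch seconds on multiples of every_hours hours within [first, last]."""
    step = every_hours * 3600
    start = -(-first // step) * step  # round first up to the next multiple of step
    return list(range(start, last + 1, step))


def label(timestamp, fmt="%a, %H:%M"):
    """Format an epoch timestamp as UTC, similar to gnuplot's 'set format x'."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(fmt)


if __name__ == "__main__":
    # First and last timestamp of the sample rows shown in the question.
    first, last = 1545389400, 1545393600
    for n in (1, 3):
        ticks = hourly_ticks(first, last, n)
        print(f"every {n} h:", [label(t) for t in ticks])
```

The step passed to set xtics is the same N*3600 seconds used here, so changing N in one place changes the tick spacing for the whole plot.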

Remove weekend gaps in gnuplot for candlestick chart

I am trying to plot some financial candlestick charts with gnuplot. The problem is that there is no data during the weekends, and I don't want these gaps to be shown. Picture and code included below.
set datafile separator ","
set xdata time
set timefmt "%Y-%m-%d"
set xrange ["2015-10-22":"2016-02-06"]
set yrange [*:*]
set format x
plot 'head.dat' using 1:2:4:3:5 notitle with candlesticks
As you have one entry per working day, instead of using the dates as abscissae you can use the line number:
plot 'head.dat' using 0:2:4:3:5 notitle with candlesticks
Then I guess you'll ask how to restore the dates on the x-axis. You can use xticlabels:
set xtics rotate 90
plot "head.dat" u 0:2:4:3:5:xticlabels(1) notitle with candlesticks
If you want to avoid having every label shown, use this everyNth function posted by dir, e.g. for every fifth label:
set datafile separator ","
everyNth(countColumn, labelColumnNum, N) = \
( (int(column(countColumn)) % N == 0) ? stringcolumn(labelColumnNum) : "" )
set xtics rotate 90
plot "head.dat" using 0:2:4:3:5:xticlabels(everyNth(0, 1, 5)) notitle with candlesticks
Results in:

gnuplot display y2 values on x2

I have the following graph:
The first data set displays searches.
The second data set displays clicks.
y1 shows the searches scale, y2 shows the clicks scale.
On x1 I have time values displayed.
I wish to display the click values (each hour) on x2 (the upper axis).
When I add the command set x2tics, it displays the searches data and not the clicks as I wished.
How do I change it so that it displays the clicks?
Gnuplot script:
set xlabel "Time"
set ylabel "Times"
set y2range [0:55000]
set y2tics 0, 1000
set ytics nomirror
set datafile separator "|"
set title "History of searches"
set xdata time # The x axis data is time
set timefmt "%Y-%m-%d %H:%M" # Dates in the file look like YYYY-MM-DD HH:MM
set format x "%d/%m\n%H:%M"
set grid
set terminal png size 1024,768 # gnuplot recommends setting terminal before output
set output "outputFILE.png" # The output filename; to be set after setting
# terminal
load "labelsFILE"
plot 'goodFILE' using 1:3 lt 2 with lines t 'Success' , 'clicksFILE' using 1:2 lt 5 with lines t 'Clicks right Y' axis x1y2
replot
Graph:
graph http://img42.imageshack.us/img42/1269/wu0b.png
OK, to get started, here is how you can set labels with the number of clicks (using your data file names):
plot 'goodFILE' using 1:3 lt 2 with lines t 'Success',\
'clicksFILE' using 1:2 lt 5 with lines t 'Clicks right Y' axis x1y2,\
'' using 1:2:(sprintf("%dk", int($2/1000.0))) with labels axis x1y2 offset 0,1 t ''
Just add this as the plotting command and it should work fine.
To illustrate what the labels might look like, here is an example with some dummy data:
set terminal pngcairo
set output 'blubb.png'
set xlabel "Time"
set ylabel "Times"
set y2label "Clicks per hour"
set y2range [0:10000]
set yrange [0:1]
set ytics nomirror
set y2tics
set key left
set samples 11
set xrange[0:10000]
plot '+' using 1:1:(sprintf("%dk", int($1/1000.0))) every ::1::9 with labels axis x1y2 offset 0,1 t '',\
'' using 1:1 with linespoints axis x1y2 pt 7 t 'Clicks per hour'
Which gives you:
