Plotting datafile with abbreviated values with gnuplot


I have a gnuplot datafile that looks like this:
1 4810 582 573 587
2 99k 67k 56k 40k
3 119k 82k 68k 49k
4 119k 81k 68k 49k
5 120k 81k 65k 45k
6 121k 82k 65k 44k
7 124k 106k 97k 86k
8 128k 134k 131k 131k
9 128k 130k 131k 135k
10 129k 133k 130k 132k
The first column goes on the X-axis, labeled "Time"; the remaining columns are the interrupt counts over time (i.e. IRQ1, IRQ2, IRQ3, IRQ4).
The problem is that gnuplot does not interpret the abbreviated values with the k suffix as thousands, but reads them as raw values such as 99, 67, 119, etc. The lines therefore start at around 5000 at time 1 and then drop to around 100 in the graph.
Is there an option to tell gnuplot to interpret abbreviated values automatically and plot them accordingly?

I think there is no direct way of telling gnuplot how to interpret the input in this case.
You can, however, write your own function that converts the string input to numbers:
check(x)=(pos=strstrt(x,"k"),\
pos > 0 ? real(substr(x,1,pos-1))*1000 : real(x))
The function check first determines the position of the letter 'k' in the input. (The function strstrt returns 0 if the input x does not contain the letter 'k'.)
If the input contains the letter 'k', the function takes the part before it, converts that part to a number and multiplies it by 1000.
If the input does not contain 'k', the function converts the whole input to a number.
Now you can plot the data file (assuming its name is test):
plot 'test' u 1:(check(stringcolumn(2))) w l
This should do the job!

A solution that is not pure gnuplot, on Unix, pipes the file through sed:
plot "<(sed 's/k/000/g' datafile.dat)" u 1:2 w lp
The sed 's/k/000/g' command replaces every occurrence of the character k with 000 in datafile.dat, e.g. 99k becomes 99000. The leading < in the file name makes gnuplot read the output of the shell command instead of a file.
The output is similar to the plot posted by @Knorr.
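The sed one-liner above rewrites every k in the file, including any that might appear in a header or label. A slightly more defensive variant is sketched below as a shell function. The name expand_k is made up for this illustration, and it assumes whitespace-separated fields where only a number directly followed by k or K should be expanded:

```shell
# Hypothetical helper (not part of gnuplot): expand a trailing k/K suffix
# in every field after the first, leaving all other text untouched.
expand_k() {
  awk '{
    for (i = 2; i <= NF; i++)
      if ($i ~ /^[0-9.]+[kK]$/)
        # Drop the suffix, then scale by 1000 (so 1.5k becomes 1500).
        $i = substr($i, 1, length($i) - 1) * 1000
    print
  }' "$@"
}
```

One possible use is expand_k datafile.dat > expanded.dat, followed by plot 'expanded.dat' u 1:2 w lp in gnuplot.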

Related

Gnuplot: operations with data in different blocks

I have a data file consisting of two blocks (separated by a single blank line) and would like to plot the difference between data from block 1 and block 2, i.e., something like
plot 'a.dat' using 1:($2_1-$2_2)
where $2_1 is supposed to mean "data from block 1, col.2" and $2_2 "data from block 2, col.2". Is that possible within Gnuplot, and if so, how?
Thanks,
Tom

This is most likely not possible directly in gnuplot. However, one can preprocess the data file first with, e.g., gawk and then plot the modified output. For example:
dataFile="a.dat"
plotCmd(fname)=sprintf("<gawk '\
BEGIN{mode=0;l=0;} \
mode==0{if(NF==0){mode=1;}else{x[NR]=$1;y[NR]=$2;}} \
mode==1{if(NF>0){mode=2;l=NR;}} \
mode==2{print $1,y[NR-l+1],$2}' %s", fname)
plot plotCmd(dataFile) u 1:($2-$3) w l
The gawk script reads the file and saves the first and second column into arrays x and y until it reaches a blank line (zero number of fields). Then it skips all consecutive blank lines until it reaches a non-empty line (NF>0). It remembers the position of this line in the input file and then outputs for each line in the second block the x-coordinate together with the corresponding y-coordinate from the first block, i.e., a data file such as
1 2
2 4
3 6

1 4
2 8
3 12
would be transformed into
1 2 4
2 4 8
3 6 12
This assumes that the x-coordinates in both blocks match...
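The same pairing can be tried outside gnuplot first, which makes it easier to inspect the intermediate file. The following is only a sketch: the function name pair_blocks is invented here, and it assumes exactly two blocks of equal length with matching x-coordinates, as stated above:

```shell
# Hypothetical helper: join column 2 of block 1 with column 2 of block 2.
pair_blocks() {
  awk '
    NF == 0 { if (n > 0) second = 1; next }  # blank line ends block 1
    !second { y[++n] = $2; next }            # block 1: remember the y values
    { print $1, y[++m], $2 }                 # block 2: x, y from block 1, y from block 2
  ' "$@"
}
```

Saving the result with pair_blocks a.dat > paired.dat would then allow plot 'paired.dat' u 1:($2-$3) w l.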

Remove whitespace padding in matlab fprintf file output

I have large matrices of data that look something like this:
DataOut' = [34 1 0.0 -4.75343000000000 0.0291776000000000 5.32835000000000 1.23598000000000 0.890008000000000;
7 1 0.0902364000000000 -4.74065000000000 0.0 1.97133000000000 9.49706000000000 16.1658000000000]
The first two columns are IDs and always integers and the remaining 6 columns are 2 pairs of (X,Y,Z) coordinates (Floats) for each respective ID.
I'm writing the data to a file using the following syntax:
fprintf(' %u %u %-6.12g %-6.12g %-6.12g %-6.12g %-6.12g %-6.12g \r\n', DataOut)
>> 34 1 0      -4.75343 0.0291776 5.32835 1.23598 0.890008
7 1 0.0902364 -4.74065 0      1.97133 9.49706 16.1658
This format is fine in almost all cases except the one shown above, where the insignificant trailing zeros are replaced with spaces, leaving a big gap between some columns instead of a single space. The software reading this data really doesn't like all these spaces and breaks when it finds more than the one it expects.
My desired output is to only have a single space between each column:
>> 34 1 0 -4.75343 0.0291776 5.32835 1.23598 0.890008
7 1 0.0902364 -4.74065 0 1.97133 9.49706 16.1658
Does anyone know how to get fprintf to leave just one space after removing the insignificant trailing zeros? Using fprintf is nice because I don't need any loops, and with several thousand of these matrices to write out I guess it would be quite slow if I had to do some checking in a loop.

The format spec that you have used for floating point numbers (%-6.12g) specifies that you want to remove trailing zeros that are non-significant (with a maximum of 12 significant digits). However, the 6 specifies that you want each field to be at least 6 characters wide, and the - flag left-justifies the value within that field. In the case of your 0, it has a width of 1, so fprintf pads it to 6 characters (hence all the whitespace). If you simply remove the -6 from the beginning of each of your format specifiers you will get the output you desire.
fprintf(' %u %u %.12g %.12g %.12g %.12g %.12g %.12g \r\n', DataOut)
% 34 1 0 -4.75343 0.0291776 5.32835 1.23598 0.890008
% 7 1 0.0902364 -4.74065 0 1.97133 9.49706 16.1658
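The effect of the minimum field width can be checked quickly outside MATLAB, since the shell's printf accepts the same C-style flags, width and precision for %g. This is only an illustrative sketch; the function name fmt_demo is made up, and it assumes a locale that uses a dot as the decimal separator:

```shell
# Hypothetical helper: print a value with and without the minimum field width.
# The brackets make the padding visible.
fmt_demo() {
  printf '[%-6.12g] [%.12g]\n' "$1" "$1"
}

fmt_demo 0          # short value: the first form is padded to 6 characters
fmt_demo 0.0902364  # already wider than 6 characters: both forms look the same
```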

Gnuplot CCDF plotting and log-log scale

My data file is a single sorted column of values:
1
1
2
2
2
3
...
999
1000
1000
I am able to plot the CDF successfully using a command like this (assuming 10000 lines in the file):
plot "file" using 1:(1/10000.) smooth cumulative title "CDF"
I am also able to use a log scale on the x axis with:
set logscale x
My problem is: how can I plot a CCDF with gnuplot?
In addition, the CDF with a log-log scale (set logscale xy) does not give me any output. What if I would like a log-log CCDF plot?
Many thanks!

I found a workaround for this problem, because I do not think you can plot a CCDF using only gnuplot.
Briefly, I parsed my data in bash to create a dataset where the cumulative data is explicit; gnuplot can then simply plot the new dataset. As an example, assuming that your file contains the (numerical) values you want to accumulate, I would run in a bash environment:
cat data | sort -n | uniq --count | awk 'BEGIN{sum=0}{sum=sum+$1; print $2,$1,sum}' > parsed.dat
This command reads the dataset (cat data), sorts the data numerically (sort -n), counts the occurrences of each sample (uniq --count) and creates a new dataset that also contains the cumulative sum for each data value (the awk command).
This new dataset contains 3 columns: the first column ($1 in gnuplot) contains the unique values of your dataset, the second ($2) contains the number of occurrences of each value, and the third ($3) contains the cumulative sum up to and including that value.
Finally, in gnuplot, you can do this:
stats "parsed.dat" using 3;
plot "parsed.dat" using 1:($3/STATS_max) with lines title "CDF",\
"" using 1:(1-$3/STATS_max) with lines title "CCDF",\
"" using 1:($2/STATS_max) with boxes title "PDF"
The stats command of gnuplot analyzes the third column (the one with the cumulative sum) and stores the values to some variables. STATS_max is the max value of this column (so it is the final cumulative sum). Now you have all the data you need to plot not only the CDF, but also the CCDF (which is 1 - CDF) and also the PDF (or the normalized histogram, for discrete values).
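As a variation, the normalization can be done in the preprocessing step as well, so that gnuplot needs no stats call. The sketch below is an assumption-laden illustration: the function name ccdf_table is invented, the input is taken to be one numeric value per line, and the CCDF is defined as 1 - CDF as in the text above:

```shell
# Hypothetical helper: print "value CDF CCDF" for one numeric value per line.
ccdf_table() {
  sort -n "$@" | uniq -c | awk '
    { val[NR] = $2; cnt[NR] = $1; total += $1 }  # collect counts per unique value
    END {
      for (i = 1; i <= NR; i++) {
        cum += cnt[i]                            # running count up to this value
        print val[i], cum / total, (total - cum) / total
      }
    }'
}
```

With the output saved as ccdf.dat, plot 'ccdf.dat' using 1:3 with lines would draw the CCDF. Note that the last row always has a CCDF of 0, which cannot be drawn on a logarithmic y axis.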

Plotting multiple graphs depending on column value with gnuplot

I have the following data, which I want to plot using gnuplot:
#TIME #VALUE #SOURCE
1 100 A
1 88 B
2 115 A
2 100 B
3 130 A
3 210 B
I want to have two lines drawn, depending on the value of column #SOURCE: one line for A and one line for B. Is this possible with gnuplot, and if so, how?
Is it also possible to draw a summation of column #VALUE over column #TIME? That is, for all rows with the same #TIME, the entries in #VALUE are summed up.
Thanks in advance,
Frank

One way to do it would be to use grep to select the lines ending with A or B and plot the result. You can do this in a single plot command with a for loop if you know the characters the lines will end in:
plot for [s in 'A B'] sprintf("<(grep '%s$' data.dat)", s) u 1:2 w l
This plots the data you provided (saved in data.dat) as two different lines.
You could also change the for part to [s in 'word1 word2 word3'] or any other string you like. If you don't know which character or word the lines end with, you would probably need to read the file twice: first to determine the string for the for loop, and a second time to do the plotting.
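The answer above does not address the second part of the question, the sum of #VALUE per #TIME. One possible approach, sketched here with an invented function name sum_by_time, is to aggregate with awk before plotting. It assumes whitespace-separated columns and that lines starting with # are comments:

```shell
# Hypothetical helper: sum column 2 for each distinct value of column 1,
# keeping the order in which the time values first appear.
sum_by_time() {
  awk '
    /^#/ || NF < 2 { next }                    # skip the header and blank lines
    { if (!($1 in sum)) order[++n] = $1; sum[$1] += $2 }
    END { for (i = 1; i <= n; i++) print order[i], sum[order[i]] }
  ' "$@"
}
```

Running sum_by_time data.dat > sums.dat and then plot 'sums.dat' u 1:2 w l would draw the summed line.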

How to plot piecewise function using data plot in Gnuplot?

The figure above was generated from data points in a text file. My question is: how can I remove the line between two points where the graph jumps? (In my picture the graph jumps at about x~260.)
Note that I just want the graph to look like a piecewise function, meaning the line in the middle of the graph should not be connected across the jump.

In gnuplot you can split a line into several parts either with an invalid data value somewhere or with an empty line.
For the first option, you could check inside the using statement whether the difference to the previous point is too large, and invalidate the current point. But that would make you lose not only the connecting line, but also the first point after the jump:
lim=3
y2=y1=0
plot 'test.dat' using (y2=y1,y1=$2,$1):($0 > 0 && abs(y2-y1) > lim ? 1/0 : y1) with linespoints
The test data file I used is
1 1
2 1.1
3 0.95
4 1
5 5
6 6
7 5.5
8 5.8
9 -2
10 -2.5
11 -4
As you see, the points at x=5 and x=9 are lost.
Alternatively, you can pipe your data through an external tool like awk for the filtering. In this case you can insert an empty line when the difference between two consecutive y-values exceeds some limit:
filter(lim) = 'awk ''{if(NR > 1 && sqrt((y-$2)**2) > '.lim.') print ""; print; y=$2}'' test.dat'
plot '< '.filter(3) using 1:2 with lines
Note that I used sqrt((..)**2) only to simulate an abs function, which awk doesn't have.
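The same filter can also be written as a standalone shell function, which avoids the nested quoting and passes the limit with awk's -v option. This is a sketch under the same assumptions as above (two columns, jump measured between consecutive y values); the name split_on_jump is made up:

```shell
# Hypothetical helper: insert an empty line wherever consecutive y values
# differ by more than the given limit. Usage: split_on_jump LIMIT [FILE...]
split_on_jump() {
  lim=$1
  shift
  awk -v lim="$lim" '
    NR > 1 {
      d = $2 - prev
      if (d < 0) d = -d          # absolute difference
      if (d > lim) print ""      # an empty line makes gnuplot break the line
    }
    { print; prev = $2 }
  ' "$@"
}
```

Saving the result with split_on_jump 3 test.dat > split.dat would allow plot 'split.dat' using 1:2 with lines.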
