Creating and modifying arbitrary-length arrays while plotting in gnuplot

I would like to count the number of occurrences of an event (for example, x data value equals some number) and store these occurrences in order, while plotting a file in gnuplot. Say I have the following file:
1
0
0
0
1
1
0
Now I want to count how many times I have a 1 and store that number in variable N. Then I want to know the positions where that happens and store that information in an array pos, all of this while plotting the file. The result, for the example above, should be:
print N
3
print pos
1 5 6
I know how to achieve the counting:
N = 0
plot "data" u ($0):($1 == 1 ? (N = N+1, $1) : $1)
print N
3
Then to achieve the position recording, it would be schematically something like this:
N = 0 ; pos = ""
plot "data" u ($0):($1 == 1 ? (N = N+1, pos = pos." ".$0, $1) : $1) # This doesn't work!
print N
3
print pos
1 5 6
How can this be done in gnuplot without resorting to external bash commands?

As sometimes happens, writing down the question triggered an idea for an answer. I'll leave it here in case somebody finds it useful:
N=0 ; pos=""
plot "data" u ($0):($1 == 1 ? (N = N+1, pos = sprintf("%s %g", pos, $0+1), $1) : $1)
print N
3
print pos
1 5 6
Note that I had to use $0+1 because gnuplot's pseudocolumn $0 (the record index) starts at zero, so the first line of the file has index 0 rather than 1.
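For cross-checking the gnuplot result outside gnuplot, the same bookkeeping can be sketched in Python. This is only an illustration, assuming the data is a single column of numbers; the helper name count_positions is hypothetical and has nothing to do with gnuplot itself.

```python
def count_positions(values, target=1):
    """Return (N, pos): how often target occurs and its 1-based positions."""
    # enumerate(..., start=1) plays the role of $0+1 in the gnuplot version
    pos = [i for i, v in enumerate(values, start=1) if v == target]
    return len(pos), pos


data = [1, 0, 0, 0, 1, 1, 0]  # the sample file from the question
N, pos = count_positions(data)
print(N)                                # 3
print(" ".join(str(p) for p in pos))    # 1 5 6
```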


Find H Index of the nodes of a graph using NetworkX

Definition of H Index used in this algorithm
Suppose a relational expression is written as y = F(x_1, x_2, ..., x_n), where F returns an integer greater than 0: the maximum value y such that at least y of the elements have values not less than y. The H-index of any node i is then defined as
H(i) = F(k_{j_1}, k_{j_2}, ..., k_{j_{k_i}})
where k_{j_1}, k_{j_2}, ..., k_{j_{k_i}} are the degrees of the neighboring nodes of node i.
Now I want to find the H Index of the nodes of the following graphs using the algorithm given below :
Graph :
Code (Written in Python and NetworkX) :
def hindex(g, n):
    nd = {}
    h = 0
    # print(len(list(g.neighbors(n))))
    for v in g.neighbors(n):
        # nd[v] = len(list(g.neighbors(v)))
        nd[v] = g.degree(v)
        snd = sorted(nd.values(), reverse=True)
        for i in range(0, len(snd)):
            h = i
            if snd[i] < i:
                break
    # print("H index of " + str(n) + " : " + str(h))
    return h
Problem :
This algorithm is returning the wrong values of nodes 1, 5, 8 and 9
Actual Values :
Node 1 - 6 : H Index = 2
Node 7 - 9 : H Index = 1
But for Node 1 and 5 I am getting 1, and for Node 8 and 9 I am getting 0.
Any leads on where I am going wrong will be highly appreciated!
Try this:
def hindex(g, n):
    sorted_neighbor_degrees = sorted((g.degree(v) for v in g.neighbors(n)), reverse=True)
    h = 0
    for i in range(1, len(sorted_neighbor_degrees)+1):
        if sorted_neighbor_degrees[i-1] < i:
            break
        h = i
    return h
There's no need for a nested loop; just make a decreasing list, and calculate the h-index like normal.
The reason for 'i - 1' is just that our arrays are 0-indexed, while h-index is based on rankings (i.e. the k largest values) which are 1-indexed.
From the definition of h-index: For a non-increasing function f, h(f) is max i >= 0 such that f(i) >= i. This is, equivalently, the min i >= 1 such that f(i) < i, minus 1. Here, f(i) is equal to sorted_neighbor_degrees[i - 1]. There are of course many other ways (with different time and space requirements) to calculate h.
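As one of those other ways, here is a minimal sketch that works on a plain list of neighbor degrees, so it can be tried without NetworkX. The function name h_from_values is hypothetical. It relies on the fact that, once the values are sorted in decreasing order, the condition "value >= rank" holds for a prefix of the list and fails afterwards, so counting the ranks that satisfy it gives h.

```python
def h_from_values(values):
    """h-index of a list of non-negative integers (e.g. neighbor degrees)."""
    ranked = sorted(values, reverse=True)
    # rank is 1-indexed; the condition is true exactly for ranks 1..h
    return sum(1 for rank, v in enumerate(ranked, start=1) if v >= rank)


print(h_from_values([3, 3, 2]))  # 2: two neighbors have degree >= 2
print(h_from_values([1]))        # 1
print(h_from_values([]))         # 0: an isolated node
```

With a NetworkX graph g, the input list would be something like [g.degree(v) for v in g.neighbors(n)], as in the answer above.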

Taxicab geometry task

I've been struggling with this task for some time. It goes like this: given N points (X, Y) with integer coordinates and M queries of the form P(A, B), find the total distance from point P(A, B) to all N given points. The distance from A(x1, y1) to B(x2, y2) is max(|x1-x2|, |y1-y2|). It may sound weird; I'm not a native English speaker, so sorry for the mistakes. Here are the input and output:
IN.txt (N = 4, M = 3; the first 4 coordinate pairs are the given points,
the next 3 coordinate pairs are the points from which I have to compute the total length)
4 3
3 5
-3 -2
1 4
-4 -3
2 -4
1 4
4 2
OUT.txt
28
15
21
Here's some Python that should do the trick for you. Be sure to pay attention to which directory you're in when you're writing so you don't overwrite things.
I've tested it on the input information you presented in the question, and it works, providing the formatted output file as desired.
# Assuming you're in the directory containing IN.txt -- otherwise, insert the filepath.
input_file = open("IN.txt", "r")
# Read the input file and split it by new lines
input_lines_raw = input_file.read().split('\n')
input_file.close()
# Split the input lines and eliminate the spaces/create the vector int lists
input_lines_split = []
for element in input_lines_raw:
    input_lines_split.append(element.split(' '))
input_lines = []
for sub in input_lines_split:
    inserter = []
    for elem in sub:
        if (len(elem) > 0):
            inserter.append(elem)
    input_lines.append(inserter)
input_lines = [[int(j) for j in i] for i in input_lines]
# Build the original and final vector arrays
origin_vectors = []
dest_vectors = []
for i in range(1, input_lines[0][0] + 1):
    dest_vectors.append(input_lines[i])
for i in range(input_lines[0][0] + 1, input_lines[0][0] + input_lines[0][1] + 1):
    origin_vectors.append(input_lines[i])
# "Distance" operations on the lists of vectors themselves/generate results array
results_arr = []
for original in origin_vectors:
    counter = 0
    for final in dest_vectors:
        counter = counter + max(abs(original[0] - final[0]), abs(original[1] - final[1]))
    results_arr.append(counter)
print(results_arr)
for element in results_arr:
    print(str(element))
# Open the output file and write to it, creating a new one if it doesn't exist.
# NOTE: This will overwrite any existing "OUT.txt" file in the current directory.
output_file = open("OUT.txt", "w")
for element in results_arr:
    output_file.write(str(element) + '\n')
output_file.close()
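The distance step can also be isolated from the file handling, which makes it easy to check against the sample. The sketch below is only an illustration; the function name total_distances is hypothetical. Note that max(|x1-x2|, |y1-y2|) is usually called the Chebyshev distance, while "taxicab" normally refers to |x1-x2| + |y1-y2|; the code follows the formula given in the question.

```python
def total_distances(points, queries):
    """For each query (a, b), sum max(|a - x|, |b - y|) over all points."""
    return [
        sum(max(abs(a - x), abs(b - y)) for x, y in points)
        for a, b in queries
    ]


points = [(3, 5), (-3, -2), (1, 4), (-4, -3)]   # the N = 4 given points
queries = [(2, -4), (1, 4), (4, 2)]             # the M = 3 query points
print(total_distances(points, queries))         # [28, 15, 21]
```

This is still O(N * M), the same as the script above, so it is a readability change rather than a speed-up.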

The running sequential average of a list of numbers in J

I'm trying to generate the Sierpinski triangle (chaos game version) in J. The general iterative algorithm to generate it, given 3 vertices, is:
point = (0, 0)
loop:
v = randomly pick one of the 3 vertices
point = (point + v) / 2
draw point
I'm trying to create the idiomatic version in J. So far this is what I have:
load 'plot'
numpoints =: 200000
verticesx =: 0 0.5 1
verticesy =: 0 , (2 o. 0.5) , 0
rolls =: ?. numpoints$3
pointsx =: -:#+ /\. rolls { verticesx
pointsy =: -:#+ /\. rolls { verticesy
'point' plot pointsx ; pointsy
This works, but I'm not sure I understand what's going on with -:#+ /\.. I think it's only working because of a mathematical quirk. I was trying to make a dyadic average function that would run as an accumulation through the list of points in the same way that + does in +/ \ i. 10, but I couldn't get anything like that to work. How would I do that?
Update:
To be clear, I'm trying to create a binary function avg that I could use in this way:
avg /\ randompoints
avg =: -:#+ doesn't work with this, for some reason. So I think what I'm having trouble with is properly defining an avg function with the proper variadicity.
To be as close to the algorithm as possible, I would probably do something like this:
v =: 3 2$ 0 0 0.5, (2 o. 0.5), 1 0
ps =: 1 2 $ (?3) { v
next =: 4 :'y,((?x){v) -:#+ ({: y)'
ps =: (3&next)^:20000 ps
'point' plot ({.;{:) |: ps
but your version is much more efficient.
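For comparison, the iterative algorithm from the question can be written out step by step in Python. This is a sketch, not a translation of the J code: the function name chaos_game, the seed parameter and the vertex coordinates (an equilateral triangle with height sqrt(3)/2) are assumptions made for the example.

```python
import math
import random


def chaos_game(vertices, n, seed=0):
    """Generate n chaos-game points, starting from (0, 0)."""
    rng = random.Random(seed)      # seeded so that runs are repeatable
    x, y = 0.0, 0.0
    points = []
    for _ in range(n):
        vx, vy = rng.choice(vertices)       # randomly pick one vertex
        x, y = (x + vx) / 2, (y + vy) / 2   # point = (point + v) / 2
        points.append((x, y))
    return points


vertices = [(0.0, 0.0), (0.5, math.sqrt(3) / 2), (1.0, 0.0)]
points = chaos_game(vertices, 1000)
print(len(points))  # 1000
```

Each new point depends on the previous point, which is why a plain pairwise average applied as a running reduction over the list of chosen vertices does not reproduce the loop directly.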

search for specific word in text file in Matlab

I want to search for a specific word in a text file and return its position. This code reads the text fine...
fid = fopen('jojo-1 .txt','r');
while 1
    tline = fgetl(fid);
    if ~ischar(tline)
        break
    end
end
but when I add this code
U = strfind(tline, 'Term');
it returns [] although the string 'Term' exists in the file.
Can you please help me?
For me, it works fine:
strfind(' ertret Term ewrwerewr', 'Term')
ans =
9
Are you sure that 'Term' is really in your line?
I believe the ~ischar(tline) test is the source of the trouble: the loop only exits once tline is no longer a character array (fgetl returns -1 at the end of the file), so a strfind call placed after the loop has no text left to search and returns [].
The main change I made is to search for the string inside the loop, on each line that was actually read as text.
I tried a slightly modified version of your code on my text file:
yyyy/mmdd(or -ddd)/hh.h):2011/-201/10.0UT geog Lat/Long/Alt= 50.0/ 210.0/2000.0
NeQuick is used for topside Ne profile
URSI maps are used for the F2 peak density (NmF2)
CCIR maps are used for the F2 peak height (hmF2)
IRI-95 option is used for D-region
ABT-2009 option is used for the bottomside thickness parameter B0
The foF2 STORM model is turned on
Scotto-97 no L option is used for the F1 occurrence probability
TBT-2011 option is used for the electron temperature
RBY10+TTS03 option is used for ion composition
Peak Densities/cm-3: NmF2= 281323.9 NmF1= 0.0 NmE= 2403.3
Peak Heights/km: hmF2= 312.47 hmF1= 0.00 hmE= 110.00
Solar Zenith Angle/degree 109.6
Dip (Magnetic Inclination)/degree 65.76
Modip (Modified Dip)/degree 55.06
Solar Sunspot Number (12-months running mean) Rz12 57.5
Ionospheric-Effective Solar Index IG12 63.3
TEC [1.E16 m-2] is obtained by numerical integration in 1km steps
from 50 to 2000.0 km. t is the percentage of TEC above the F peak.
-
H ELECTRON DENSITY TEMPERATURES ION PERCENTAGES/% 1E16m-2
km Ne/cm-3 Ne/NmF2 Tn/K Ti/K Te/K O+ N+ H+ He+ O2+ NO+ Clust TEC t/%
0.0 -1 -1.000 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 7.7 75
5.0 -1 -1.000 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 7.7 75
10.0 -1 -1.000 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 7.7 75
It is the output of an ionospheric model, but that is not important here :)
I used the following Matlab code to find where the string TEMPERATURES occurs:
out = fopen('fort.7');                       % open the file
counter = 0;                                 % line counter (sloppy but works)
while 1                                      % loop until found or end of file
    tline = fgetl(out);                      % read a line
    if ~ischar(tline)                        % fgetl returns -1 at end of file
        break                                % string not found, stop reading
    end
    counter = counter + 1;                   % we are one line further
    U = strfind(tline, 'TEMPERATURES');      % where the string starts (if at all)
    if ~isempty(U)                           % strfind returns [] when there is no match
        break                                % we found it, let's go home
    end
end
fclose(out);                                 % release the file handle
results:
counter = 26
U = 27
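The same "search each line as it is read" idea can be sketched in Python for comparison. The function name find_word is hypothetical, and the example searches a list of strings rather than a file so that it is self-contained. It returns a 1-based line number and column, matching the conventions of the Matlab counter and strfind.

```python
def find_word(lines, word):
    """Return (line_number, column) of the first match, both 1-based, or None."""
    for line_no, line in enumerate(lines, start=1):
        col = line.find(word)          # str.find returns -1 when there is no match
        if col != -1:
            return line_no, col + 1    # convert the 0-based index to 1-based
    return None


lines = ["no match here", " ertret Term ewrwerewr"]
print(find_word(lines, "Term"))     # (2, 9)
print(find_word(lines, "absent"))   # None
```

To search a real file, the lines argument could be an open file object, since iterating over one yields its lines.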

Can gnuplot compute and plot the delta between consecutive data points

For instance, given the following data file (x^2 for this example):
0
1
4
9
16
25
Can gnuplot plot the points along with the differences between the points as if it were:
0 0
1 1 # ( 1 - 0 = 1)
4 3 # ( 4 - 1 = 3)
9 5 # ( 9 - 4 = 5)
16 7 # (16 - 9 = 7)
25 9 # (25 -16 = 9)
The actual file has more than just the column I'm interested in and I would like to avoid pre-processing in order to add the deltas, if possible.
dtop's solution didn't work for me, but this works and is purely gnuplot (not calling awk):
delta_v(x) = ( vD = x - old_v, old_v = x, vD)
old_v = NaN
set title "Compute Deltas"
set style data lines
plot 'data.dat' using 0:($1), '' using 0:(delta_v($1)) title 'Delta'
Sample data file named 'data.dat':
0
1
4
9
16
25
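To check the expected output of the gnuplot delta_v trick independently, the same "remember the previous value" logic can be sketched in Python. The function name deltas is hypothetical. As in the gnuplot version, the previous value starts as NaN, so the first difference is NaN and that point is simply not drawn.

```python
import math


def deltas(values):
    """Differences between consecutive values; the first entry is NaN."""
    out = []
    prev = math.nan            # plays the role of old_v = NaN
    for v in values:
        out.append(v - prev)   # vD = x - old_v
        prev = v               # old_v = x
    return out


print(deltas([0, 1, 4, 9, 16, 25]))  # [nan, 1, 3, 5, 7, 9]
```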
Here's how to do this without pre-processing:
Script for gnuplot:
# runtime_delta.dem script
# run with
# gnuplot> load 'runtime_delta.dem'
#
reset
delta_v(x) = ( vD = x - old_v, old_v = x, vD)
old_v = NaN
set title "Compute Deltas"
set style data lines
plot 'runtime_delta.dat' using 0:(column('Data')), '' using 0:(delta_v(column('Data'))) title 'Delta'
Sample data file 'runtime_delta.dat':
Data
0
1
4
9
16
25
How about using awk?
plot "< awk '{print $1,$1-prev; prev=$1}' <datafilename>"
Below is a version that uses arrays, available from gnuplot 5.1. Using arrays allows multiple diffs to be calculated in a single gnuplot instance.
array Z[128]
do for [i=1:128] { Z[i] = NaN }
diff(i, x) = (y = x - Z[i], Z[i] = x, y)
Here i is the instance index; use a different index for each data stream. For example:
plot "file1.csv" using 1:(diff(1,$2)) with lines, \
"file2.csv" using 1:(diff(2,$2)) with lines
