My data is formatted this way (columns: x, y, t, e); a sample is given below.
I can't get an animated heatmap (in GIF format) out of it. Here is my code:
set view map scale 1
set size square
set xlabel("x (m)")
set ylabel("y (m)")
set zlabel("V/m")
set title "E_z evolution"
set cblabel "E (V/m)"
set xrange [0:200]
set yrange [0:200]
set zrange [0:200]
set pm3d implicit at s
set pm3d corners2color max
set term gif animate delay 100
set output "electric_field_evo.gif"
DATA = "result.dat"
stats DATA using 4
do for [i=1:int(STATS_blocks)]{
splot DATA index (i-1) every 1::1::1 using (t=$3):(NaN):(NaN) notitle, \
DATA index (i-1) using 1:2:4 with pm3d title sprintf("time = %g",t)
}
# do for [i=1:int(STATS_blocks)]{
# splot DATA index (i-1) using 1:2:(t=$3,$4) with pm3d title sprintf("time = %g",t)
#
# }
And here is the result (a sample frame from the GIF at t = 190, though the whole GIF looks like this): the heatmap is empty, and the colorbox limits on the right keep changing frame after frame (they should be fixed to the overall min/max).
EDIT: added sample data:
1 1 10 0.0000000E+00
1 2 10 0.0000000E+00
1 3 10 0.0000000E+00
1 4 10 0.0000000E+00
1 5 10 0.0000000E+00
1 6 10 0.0000000E+00
1 7 10 0.0000000E+00
1 8 10 0.0000000E+00
1 9 10 0.0000000E+00
1 10 10 0.0000000E+00
2 1 10 0.0000000E+00
2 2 10 -2.4976539E-06
2 3 10 -5.3808012E-04
2 4 10 -2.6733600E-02
2 5 10 1.610434
2 6 10 -2.6733600E-02
2 7 10 -5.3808012E-04
2 8 10 -2.4976580E-06
2 9 10 -4.8060720E-09
2 10 10 0.0000000E+00
3 1 10 0.0000000E+00
3 2 10 -4.0286686E-06
3 3 10 -8.8682520E-04
3 4 10 -6.8456993E-02
3 5 10 0.8074945
3 6 10 -6.8456993E-02
3 7 10 -8.8682520E-04
3 8 10 -4.0286805E-06
3 9 10 -9.2283745E-09
3 10 10 0.0000000E+00
4 1 10 0.0000000E+00
4 2 10 -5.1066436E-06
4 3 10 -1.0590287E-03
4 4 10 -8.2134798E-02
4 5 10 2.1518860E-02
4 6 10 -8.2134798E-02
4 7 10 -1.0590287E-03
4 8 10 -5.1066618E-06
4 9 10 -1.2548365E-08
4 10 10 0.0000000E+00
5 1 10 0.0000000E+00
5 2 10 -5.3812091E-06
5 3 10 -1.0855671E-03
5 4 10 -8.8276833E-02
5 5 10 -0.3647908
5 6 10 -8.8276833E-02
5 7 10 -1.0855671E-03
5 8 10 -5.3812300E-06
5 9 10 -1.4017410E-08
5 10 10 0.0000000E+00
6 1 10 0.0000000E+00
6 2 10 -5.1066436E-06
6 3 10 -1.0618623E-03
6 4 10 -8.5113689E-02
6 5 10 -0.6305632
6 6 10 -8.5113689E-02
6 7 10 -1.0618623E-03
6 8 10 -5.1066618E-06
6 9 10 -1.2548365E-08
6 10 10 0.0000000E+00
7 1 10 0.0000000E+00
7 2 10 -4.0534715E-06
7 3 10 -9.2593604E-04
7 4 10 -8.5873455E-02
7 5 10 -1.141478
7 6 10 -8.5873455E-02
7 7 10 -9.2593604E-04
7 8 10 -4.0534824E-06
7 9 10 -9.2283745E-09
7 10 10 0.0000000E+00
8 1 10 0.0000000E+00
8 2 10 -2.7910814E-06
8 3 10 -7.3620438E-04
8 4 10 -7.2833136E-02
8 5 10 -0.6117640
8 6 10 -7.2833136E-02
8 7 10 -7.3620438E-04
8 8 10 -2.7910855E-06
8 9 10 -4.9456297E-09
8 10 10 0.0000000E+00
9 1 10 0.0000000E+00
9 2 10 -1.2563974E-06
9 3 10 -4.6025845E-04
9 4 10 -5.4116480E-02
9 5 10 0.2660810
9 6 10 -5.4116480E-02
9 7 10 -4.6025845E-04
9 8 10 -1.2563978E-06
9 9 10 -1.3758922E-09
9 10 10 0.0000000E+00
10 1 10 0.0000000E+00
10 2 10 0.0000000E+00
10 3 10 0.0000000E+00
10 4 10 0.0000000E+00
10 5 10 0.0000000E+00
10 6 10 0.0000000E+00
10 7 10 0.0000000E+00
10 8 10 0.0000000E+00
10 9 10 0.0000000E+00
10 10 10 0.0000000E+00
1 1 20 0.0000000E+00
1 2 20 0.0000000E+00
1 3 20 0.0000000E+00
1 4 20 0.0000000E+00
1 5 20 0.0000000E+00
1 6 20 0.0000000E+00
1 7 20 0.0000000E+00
1 8 20 0.0000000E+00
1 9 20 0.0000000E+00
1 10 20 0.0000000E+00
2 1 20 0.0000000E+00
2 2 20 -6.1337640E-05
2 3 20 -1.4902533E-03
2 4 20 3.2542488E-03
2 5 20 0.3296941
2 6 20 3.2542488E-03
2 7 20 -1.4902533E-03
2 8 20 -6.1346022E-05
2 9 20 -9.6680935E-07
2 10 20 0.0000000E+00
3 1 20 0.0000000E+00
3 2 20 -1.1673706E-04
3 3 20 -2.5404668E-03
3 4 20 4.2203847E-02
3 5 20 1.418503
3 6 20 4.2203847E-02
3 7 20 -2.5404671E-03
3 8 20 -1.1675285E-04
3 9 20 -1.8424856E-06
3 10 20 0.0000000E+00
4 1 20 0.0000000E+00
4 2 20 -1.5823430E-04
4 3 20 -2.6485780E-03
4 4 20 9.9604763E-02
4 5 20 1.320845
4 6 20 9.9604763E-02
4 7 20 -2.6485780E-03
4 8 20 -1.5825577E-04
4 9 20 -2.5378417E-06
4 10 20 0.0000000E+00
5 1 20 0.0000000E+00
5 2 20 -1.7999914E-04
5 3 20 -2.5499274E-03
5 4 20 0.1251577
5 5 20 2.412011
5 6 20 0.1251577
5 7 20 -2.5499277E-03
5 8 20 -1.8002378E-04
5 9 20 -2.9330913E-06
5 10 20 0.0000000E+00
6 1 20 0.0000000E+00
6 2 20 -1.7958798E-04
6 3 20 -2.3236359E-03
6 4 20 0.1080537
6 5 20 0.6282932
6 6 20 0.1080537
6 7 20 -2.3236363E-03
6 8 20 -1.7961276E-04
6 9 20 -2.9684843E-06
6 10 20 0.0000000E+00
7 1 20 0.0000000E+00
7 2 20 -1.6216180E-04
7 3 20 -2.0380928E-03
7 4 20 0.1280525
7 5 20 -0.2143435
7 6 20 0.1280525
7 7 20 -2.0380928E-03
7 8 20 -1.6218355E-04
7 9 20 -2.6321391E-06
7 10 20 0.0000000E+00
8 1 20 0.0000000E+00
8 2 20 -1.2343809E-04
8 3 20 -1.6553155E-03
8 4 20 0.1108039
8 5 20 -0.6842759
8 6 20 0.1108039
8 7 20 -1.6553155E-03
8 8 20 -1.2345419E-04
8 9 20 -1.9603046E-06
8 10 20 0.0000000E+00
9 1 20 0.0000000E+00
9 2 20 -6.7093235E-05
9 3 20 -1.1316845E-03
9 4 20 6.1340898E-02
9 5 20 0.1410470
9 6 20 6.1340898E-02
9 7 20 -1.1316845E-03
9 8 20 -6.7101741E-05
9 9 20 -1.0376655E-06
9 10 20 0.0000000E+00
10 1 20 0.0000000E+00
10 2 20 0.0000000E+00
10 3 20 0.0000000E+00
10 4 20 0.0000000E+00
10 5 20 0.0000000E+00
10 6 20 0.0000000E+00
10 7 20 0.0000000E+00
10 8 20 0.0000000E+00
10 9 20 0.0000000E+00
10 10 20 0.0000000E+00
As mentioned in the comments, gnuplot expects a certain data structure for splot ... with pm3d: one blank line between scan lines and two blank lines between data blocks.
See the following minimized example. If you can change your data to that format, that is the easiest fix. Otherwise, if you cannot or don't want to change your original datafiles, you can insert the empty lines with a small workaround; see the sketch after the example.
Required data format:
1 1 1.1   # 1st block start, index=0
1 2 1.2
1 3 1.3   # one blank line follows

2 1 1.4
2 2 1.5
2 3 1.6   # one blank line follows

3 1 1.7
3 2 1.8
3 3 1.9   # 1st block end, followed by two blank lines


1 1 2.1   # 2nd block start, index=1
1 2 2.2
1 3 2.3   # one blank line follows

2 1 2.4
2 2 2.5
2 3 2.6   # one blank line follows

3 1 2.7
3 2 2.8
3 3 2.9   # 2nd block end
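If you go the workaround route, a small preprocessing step can insert the blank lines for you. Here is a minimal Python sketch (the filenames result.dat and result_blocked.dat are placeholders): it writes one blank line whenever the value in column 1 changes and two blank lines whenever the time in column 3 changes.

# Minimal sketch: insert the blank lines pm3d needs into a 4-column "x y t e" file.
# Filenames are placeholders; adapt them to your data.
prev_x = prev_t = None
with open("result.dat") as fin, open("result_blocked.dat", "w") as fout:
    for line in fin:
        fields = line.split()
        if not fields:
            continue  # skip lines that are already blank
        x, t = fields[0], fields[2]
        if prev_t is not None and t != prev_t:
            fout.write("\n\n")  # two blank lines: new data block (new time step)
        elif prev_x is not None and x != prev_x:
            fout.write("\n")    # one blank line: new pm3d scan line (new x value)
        fout.write(line)
        prev_x, prev_t = x, t

Unrelated to the empty plot: since the script already runs stats DATA using 4, the changing colorbox limits can be fixed with set cbrange [STATS_min:STATS_max] before the do for loop.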
Related
I have a CSV file (space-delimited) whose contents look like this:
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
I need to convert these columns to rows, like this:
1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4 4
5 5 5 5 5 5
6 6 6 6 6 6
7 7 7 7 7 7
8 8 8 8 8 8
How can I do that, please?
Try:
import csv

with open("input.csv", "r", newline="") as f_in, open("output.csv", "w", newline="") as f_out:
    reader = csv.reader(f_in, delimiter=" ")
    writer = csv.writer(f_out, delimiter=" ")
    writer.writerows(zip(*reader))  # zip(*rows) yields the transposed rows
Note that zip stops at the shortest row, so every input row should have the same number of fields.
Contents of input.csv:
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
Contents of output.csv after the script run:
1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4 4
5 5 5 5 5 5
6 6 6 6 6 6
7 7 7 7 7 7
8 8 8 8 8 8
You are looking for a table pivot method.
If you are using pandas, pivot_table will do the trick: https://pandas.pydata.org/docs/reference/api/pandas.pivot_table.html
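For this particular task, a plain transpose may be all you need; here is a minimal pandas sketch, assuming the same space-delimited input.csv as above:

import pandas as pd

# Read the space-delimited file, transpose it with .T, and write it back out.
df = pd.read_csv("input.csv", sep=" ", header=None)
df.T.to_csv("output.csv", sep=" ", header=False, index=False)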
I want to extract the rows where the value of column x stays the same for more than five consecutive rows.
x x2
0 5 5
1 4 5
2 10 6
3 10 5
4 10 6
5 10 78
6 10 89
7 10 78
8 10 98
9 10 8
10 10 56
11 60 45
12 10 65
Desired_output:
x x2
0 10 6
1 10 5
2 10 6
3 10 78
4 10 89
5 10 78
6 10 98
7 10 8
8 10 56
You can use .shift + .cumsum to identify the blocks of consecutive rows where the value of column x stays the same, then group the dataframe on those blocks and transform with count to flag the groups that have more than 5 consecutive identical values in x:
b = df['x'].ne(df['x'].shift()).cumsum()  # block id: increments each time the value of x changes
df_out = df[df['x'].groupby(b).transform('count').gt(5)]  # keep rows whose block has more than 5 members
Details:
>>> b
0 1
1 2
2 3
3 3
4 3
5 3
6 3
7 3
8 3
9 3
10 3
11 4
12 5
Name: x, dtype: int64
>>> df_out
x x2
2 10 6
3 10 5
4 10 6
5 10 78
6 10 89
7 10 78
8 10 98
9 10 8
10 10 56
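One detail: df_out keeps the original row labels (2 through 10). If you want the 0-based index shown in the desired output, reset it:

df_out = df_out.reset_index(drop=True)  # renumber rows 0..n-1 as in the desired output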
You can use shift to compare each row with the previous one, take the cumulative sum to test whether the repeats exceed 5, then group on x and transform with any, and finally mask with the condition to drop the rows that do not match:
c = df['x'].eq(df['x'].shift())  # True where x repeats the previous row's value
out = df[c.cumsum().gt(5).groupby(df['x']).transform('any') & (c | c.shift(-1))]  # keep x values whose repeats exceed 5, limited to rows inside a run
print(out)
x x2
2 10 6
3 10 5
4 10 6
5 10 78
6 10 89
7 10 78
8 10 98
9 10 8
10 10 56
I have a list of numbers from 1 to 53. I am trying to calculate (1) the quarter of a week and (2) the number of that week within its quarter, from the numeric week number (week 53 needs to be qtr 4, wk 14; week 27 needs to be qtr 3, wk 1). I got this working in Excel, but not in Python. Any thoughts?
I tried the following, but every attempt has an issue with weeks like 13 or 27, depending on the method:
13 -> should be qtr 1, 27 -> should be qtr 3.
df['qtr1'] = df['wk']//13
df['qtr2']=(np.maximum((df['wk']-1),1)/13)+1
df['qtr3']=((df1['wk']-1)//13)
df['qtr4'] = df['qtr2'].astype(int)
The results are awkward:
wk    qtr1       qtr2      qtr3  qtr4
1.0   0          1.076923  -1.0  1
13.0  1 (wrong)  1.923077   0.0  1
14.0  1          2.000000   1.0  2
27.0  2          3.000000   1.0  2 (wrong)
28.0  2          3.076923   2.0  3
You can convert your weeks to integers by using astype:
df['wk'] = df['wk'].astype(int)
You should subtract one first, like:
df['qtr'] = ((df['wk'] - 1) // 13) + 1       # quarter, 1-based
df['weekinqtr'] = (df['wk'] - 1) % 13 + 1    # week within the quarter, 1-based
since 13 // 13 is 1, not zero. This gives us:
>>> df
wk qtr weekinqtr
0 1 1 1
1 13 1 13
2 14 2 1
3 26 2 13
4 27 3 1
5 28 3 2
If you want extra columns per quarter, you can use get_dummies to obtain a one-hot encoding per quarter:
>>> df.join(pd.get_dummies(df['qtr'], prefix='qtr'))
wk qtr weekinqtr qtr_1 qtr_2 qtr_3
0 1 1 1 1 0 0
1 13 1 13 1 0 0
2 14 2 1 0 1 0
3 26 2 13 0 1 0
4 27 3 1 0 0 1
5 28 3 2 0 0 1
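A remaining corner case: the question wants week 53 mapped to quarter 4, week 14, but the formula above yields quarter 5 for it. A minimal sketch of one way to handle that, capping the quarter at 4 and recomputing the week within the quarter:

df['qtr'] = df['qtr'].clip(upper=4)                # week 53: quarter 5 -> 4
df['weekinqtr'] = df['wk'] - (df['qtr'] - 1) * 13  # week 53 -> 14; all other weeks unchanged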
Integer division // and modulo % do what you want, I think, as long as you subtract one before the division (otherwise boundary weeks like 13 or 26 land in the wrong quarter):
In [254]: df = pd.DataFrame({'week': range(1, 53)})
In [255]: df['qtr'] = ((df['week'] - 1) // 13) + 1
In [256]: df['qtr_week'] = df['week'] % 13
In [257]: df.loc[df['qtr_week'] == 0, 'qtr_week'] = 13
In [258]: df
Out[258]:
    week  qtr  qtr_week
0      1    1         1
1      2    1         2
2      3    1         3
3      4    1         4
4      5    1         5
5      6    1         6
6      7    1         7
7      8    1         8
8      9    1         9
9     10    1        10
10    11    1        11
11    12    1        12
12    13    1        13
13    14    2         1
14    15    2         2
15    16    2         3
16    17    2         4
17    18    2         5
18    19    2         6
19    20    2         7
20    21    2         8
21    22    2         9
22    23    2        10
23    24    2        11
24    25    2        12
25    26    2        13
26    27    3         1
27    28    3         2
28    29    3         3
29    30    3         4
30    31    3         5
31    32    3         6
32    33    3         7
33    34    3         8
34    35    3         9
35    36    3        10
36    37    3        11
37    38    3        12
38    39    3        13
39    40    4         1
40    41    4         2
41    42    4         3
42    43    4         4
43    44    4         5
44    45    4         6
45    46    4         7
46    47    4         8
47    48    4         9
48    49    4        10
49    50    4        11
50    51    4        12
51    52    4        13
I have the following data:
at_score atp_1 atp_2 atp_3 g_date g_id g_time ht_diff ht_score htp_1 htp_2 htp_3
0 0 6 7 8 11/16/18 1 0 0 0 1 2 3
1 13 6 7 9 11/16/18 1 15 2 15 1 2 3
2 20 7 8 10 11/16/18 1 18 2 22 3 4 5
3 40 7 8 6 11/16/18 1 33 5 45 4 1 2
4 65 8 7 6 11/16/18 1 60 -3 62 1 2 3
5 0 6 7 8 11/20/18 2 0 0 0 1 2 3
6 10 9 7 8 11/20/18 2 7 -4 6 4 2 3
7 26 6 10 7 11/20/18 2 24 -1 25 1 5 4
8 40 9 7 8 11/20/18 2 42 5 45 1 2 5
9 65 6 7 10 11/20/18 2 60 5 70 1 5 2
where at_score, ht_score are the away & home team's score on a particular date (g_date), in a particular game (g_id), & at a particular time in the game (g_time). ht_diff represents the home team's score differential (ht_score - at_score). Finally, and for my purposes most importantly, atp_1, atp_2, atp_3 are the 3 away players who are playing at that point. htp_1, htp_2, htp_3 are their home team counterparts.
What I'd like to calculate is the variance-covariance matrix for each of the home & away team players based on how the ht_diff, ht_score & at_score changed while they were playing and the players they were playing with. For example away player 6 played with players 7 & 8 for the first 13 minutes of g_id 1 (ht_diff = 2 for this period) & the last 27 minutes (ht_diff = -3).
In the end I have about 2.5 million observations (with 10 players on the floor at a time), so an 'easy' way to calculate this would be extremely helpful.
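A minimal pandas sketch of one possible starting point, assuming the table above is already loaded in a DataFrame df (the column names come from the question; everything else is illustrative):

import pandas as pd

# Score change over each stint, per game.
df = df.sort_values(['g_id', 'g_time'])
df['diff_change'] = df.groupby('g_id')['ht_diff'].diff().fillna(df['ht_diff'])

# One row per (stint, player); away players see the differential negated.
home = df.melt(id_vars=['g_id', 'g_time', 'diff_change'],
               value_vars=['htp_1', 'htp_2', 'htp_3'], value_name='player')
away = df.melt(id_vars=['g_id', 'g_time', 'diff_change'],
               value_vars=['atp_1', 'atp_2', 'atp_3'], value_name='player')
away['diff_change'] *= -1
stints = pd.concat([home, away], ignore_index=True)

# Player-by-stint matrix of score changes (0 when off the floor),
# then the player-by-player variance-covariance matrix.
mat = stints.pivot_table(index='player', columns=['g_id', 'g_time'],
                         values='diff_change', fill_value=0)
cov = mat.T.cov()

With 2.5 million rows the dense pivot may be memory-hungry; a sparse matrix (e.g. scipy.sparse) built from the same long format is the usual escape hatch.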
Suppose I have a file like this...
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2
4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2
2 3 10 3 12 2 16 2 18 3 20 2 24 2 26 3
1 3 3 3 17 3 19 3 26 2 28 2 30 2 32 2
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2
The first and the last line of the input are the same.
I want the output to be like this:
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2 2
4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2 1
2 3 10 3 12 2 16 2 18 3 20 2 24 2 26 3 1
1 3 3 3 17 3 19 3 26 2 28 2 30 2 32 2 1
The extra last column in the output simply gives the number of times that line occurs.
How can I do this in bash?
I know the sort command, but it only works with one number per line...
Coming from sehe's suggestion, what about this? (uniq -c prefixes each distinct line with its occurrence count; the awk loop then moves that count from the first field to the end of the line.)
sort your_file | uniq -c | awk '{for(i=2;i<=NF;i++) printf $i"\t"; printf $1"\n"}'
Output:
1 3 3 3 17 3 19 3 26 2 28 2 30 2 32 2 1
2 3 10 3 12 2 16 2 18 3 20 2 24 2 26 3 1
4 2 8 2 12 3 18 2 22 2 26 2 28 3 30 2 2
4 3 10 2 14 2 18 2 20 3 22 2 28 2 32 2 1
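For comparison, a minimal Python sketch of the same counting, assuming a file named input.txt (a placeholder):

from collections import Counter

# Count identical lines, then print each unique line with its count appended.
with open("input.txt") as f:
    counts = Counter(line.rstrip("\n") for line in f if line.strip())

for line, n in sorted(counts.items()):
    print(line + "\t" + str(n))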