Flot break line chart - why? - flot

For the following data, I get a discontinuous line chart.
Lines are drawn for
07:10..07:50,
08:10..08:50,
09:10..09:50,
but Flot ignores the values at 08:00 and 09:00 (why?)
and does not connect the line from 07:50 to 08:10 (why?).
In what cases does Flot decide to "break" a line chart?
07:20 10/03/2016 25.4 24.2 24.7
07:30 10/03/2016 25.2 23.9 24.3
07:40 10/03/2016 25.1 23.8 24.3
07:50 10/03/2016 25.1 23.8 24.3
08:00 10/03/2016 25.1 23.8 24.3
08:10 10/03/2016 25.1 23.9 24.3
08:20 10/03/2016 24.9 24.2 24.3
08:30 10/03/2016 24.9 24.2 24.3
08:40 10/03/2016 24.9 24.2 24.3
08:50 10/03/2016 25 24.5 24.6
09:00 10/03/2016 25.1 24.6 24.7
09:10 10/03/2016 25.2 24.6 24.8
09:20 10/03/2016 25.2 24.6 24.8
09:30 10/03/2016 25.2 24.6 24.7
09:40 10/03/2016 25.2 24.6 24.7

Found it: it happens when the hours or minutes in the CSV are written as "0" while my code was expecting "00":
-> no UTC timestamp is generated,
-> the value becomes null,
-> the chart line breaks.
I only saw it when I opened the CSV in a text editor.
When the file is opened in Excel, Excel seems to display the values in the correct format!
v 7:50, 10/3/2016,25.1,23.8,24.3,52.9,0.0,24.0
x 8:0, 10/3/2016,25.1,23.8,24.3,54.2,0.0,24.0 <- the chart breaks.
v 8:10, 10/3/2016,25.1,23.9,24.3,52.8,0.0,24.1
v 8:50, 10/3/2016,25.0,24.5,24.6,49.6,0.0,24.5
x 9:0, 10/3/2016,25.1,24.6,24.7,48.3,0.0,24.6 <- the chart breaks.
v 9:10, 10/3/2016,25.2,24.6,24.8,47.6,0.0,24.7
v 9:20, 10/3/2016,25.2,24.6,24.8,47.1,0.0,24.7
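
If the CSV is pre-processed before it reaches Flot, one way to avoid the null values is to parse the time and date fields tolerantly instead of expecting zero-padded text. The following is only a minimal sketch: the helper name to_flot_timestamp is hypothetical, and it assumes the dates are day/month/year and already in UTC. Flot's time mode expects timestamps in milliseconds since the Unix epoch.

```python
from datetime import datetime, timezone

def to_flot_timestamp(time_str, date_str):
    """Return milliseconds since the Unix epoch (UTC) for one CSV row.

    Assumption: day/month/year order; swap %d and %m if the file is month-first.
    """
    # strptime accepts "8:0" as well as "08:00", and "10/3/2016" as well as "10/03/2016"
    text = f"{time_str.strip()} {date_str.strip()}"
    parsed = datetime.strptime(text, "%H:%M %d/%m/%Y")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)

rows = ["7:50, 10/3/2016,25.1", "8:0, 10/3/2016,25.1", "8:10, 10/3/2016,25.1"]
for row in rows:
    time_str, date_str, value = row.split(",")
    # Each printed pair has the [x, y] shape that a Flot series uses
    print([to_flot_timestamp(time_str, date_str), float(value)])
```

With this kind of parsing the "8:0" row yields a valid timestamp like its neighbours, so no null point is produced and the line is not broken there.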


Fitting a sinc function with gnuplot

I am trying to fit a sinc function with gnuplot but it fails with the message:
'Undefined value during function evaluation'.
First my data:
27 9.3
27.2 9.3
27.8 9.3
29 9.4
32 9.5
34 9.6
34.2 9.7
34.4 9.7
34.6 9.8
34.8 10.1
35 10.9
35.2 12.9
35.4 16.1
35.6 21.1
35.8 26.5
36 31.8
36.2 34.7
36.4 36.6
36.6 36.3
36.8 32.3
37 26.4
37.2 20.6
37.4 15.4
37.6 11.6
37.8 9.9
38 9.6
38.5 10
39 9.5
39.5 9.5
40 9.6
What I am trying to do in Gnuplot:
sinc(x)=sin(pi*x)/pi/x
f(x)=a*(sinc((b*(x-c))))**2+d
fit f(x) '4_temp.txt' via a,b,c,d
I set a, b, c and d close to the values that are needed (see picture), but it won't fit.
Can somebody help?
Thanks in advance.
I can reproduce your error message. You are trying to fit a function of the form sin(x)/x, which evaluates to 0/0 at x=0. gnuplot has no problem plotting sin(x)/x, but fitting apparently does.
If you add a small offset, e.g. 1e-9, it seems to work and finds reasonable parameters.
As @Ethan says, you need to choose starting values that are not too far from the final values.
You will get the fitted values:
Final set of parameters Asymptotic Standard Error
======================= ==========================
a = 27.5271 +/- 0.2822 (1.025%)
b = 0.608263 +/- 0.006576 (1.081%)
c = 36.3954 +/- 0.00657 (0.01805%)
d = 9.21346 +/- 0.127 (1.379%)
Code:
### fitting type of sin(x)/x function
reset session
$Data <<EOD
27 9.3
27.2 9.3
27.8 9.3
29 9.4
32 9.5
34 9.6
34.2 9.7
34.4 9.7
34.6 9.8
34.8 10.1
35 10.9
35.2 12.9
35.4 16.1
35.6 21.1
35.8 26.5
36 31.8
36.2 34.7
36.4 36.6
36.6 36.3
36.8 32.3
37 26.4
37.2 20.6
37.4 15.4
37.6 11.6
37.8 9.9
38 9.6
38.5 10
39 9.5
39.5 9.5
40 9.6
EOD
a=25
b=1
c=36
d=10
sinc(x)=sin(pi*x)/pi/(x)
f(x)=a*(sinc((b*(x-c+1e-9))))**2+d
set fit nolog
fit f(x) $Data via a,b,c,d
plot $Data u 1:2 w p pt 7, f(x) w l lc rgb "red"
### end of code
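
The 0/0 problem can also be illustrated outside gnuplot. The sketch below is not the answer's method: it replaces the 1e-9 offset with an explicit guard at x = 0, where sin(pi*x)/(pi*x) has the limit 1. The start values are the ones used in the code above; note that x = 36 is one of the data points, so with c = 36 the argument of sinc is exactly zero, which is a likely trigger for the undefined value.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi*x)/(pi*x), defined as 1 at x == 0 (its limit)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def f(x, a, b, c, d):
    """Model from the question: a * sinc(b*(x - c))**2 + d."""
    return a * sinc(b * (x - c)) ** 2 + d

# x = 36 is a data point and c = 36 is the start value, so x - c == 0 here.
# Without the guard this would be a division by zero; with it the result is a + d.
print(f(36.0, a=25.0, b=1.0, c=36.0, d=10.0))
```

In gnuplot the same guard could presumably be written with the ternary operator, for example sinc(x) = (x == 0) ? 1 : sin(pi*x)/(pi*x), instead of shifting x by a small offset.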

BeautifulSoup and urlopen aren't fetching the right table

I'm trying to practice BeautifulSoup and urlopen using the Basketball-Reference datasets. When I fetch an individual player's stats, everything works fine, but when I use the same code for team stats, urlopen apparently isn't finding the right table.
The following code gets the "headers" from the page.
from urllib.request import urlopen
from bs4 import BeautifulSoup

def fetch_years():
    # Determine the URL
    url = "https://www.basketball-reference.com/leagues/NBA_2000.html?sr&utm_source=direct&utm_medium=Share&utm_campaign=ShareTool#team-stats-per_game::none"
    html = urlopen(url)
    soup = BeautifulSoup(html, "html.parser")
    # Take the header cells of the first <tr> found anywhere in the page
    headers = [th.get_text() for th in soup.find_all('tr')[0].find_all('th')]
    # Drop the first header cell
    headers = headers[1:]
    print(headers)
I'm trying to get the Team's stats per game data, in a format like:
['Tm', 'G', 'MP', 'FG', ...]
Instead, the header data I'm getting is:
['W', 'L', 'W/L%', ...]
which is the very first table in the 1999-2000 season information about the teams (under the name 'Division Standings').
If you use that same code for a player's data such as this one, you get the result I'm looking for:
Age Tm Lg Pos G GS MP FG ... DRB TRB AST STL BLK TOV PF PTS
0 20 OKC NBA PG 82 65 32.5 5.3 ... 2.7 4.9 5.3 1.3 0.2 3.3 2.3 15.3
1 21 OKC NBA PG 82 82 34.3 5.9 ... 3.1 4.9 8.0 1.3 0.4 3.3 2.5 16.1
2 22 OKC NBA PG 82 82 34.7 7.5 ... 3.1 4.6 8.2 1.9 0.4 3.9 2.5 21.9
3 23 OKC NBA PG 66 66 35.3 8.8 ... 3.1 4.6 5.5 1.7 0.3 3.6 2.2 23.6
4 24 OKC NBA PG 82 82 34.9 8.2 ... 3.9 5.2 7.4 1.8 0.3 3.3 2.3 23.2
The code to webscrape came originally from here.
The sports-reference.com sites are trickier than standard ones. Most tables are rendered after the page loads (with the exception of a few tables on each page), so you'd need to use Selenium to let the page render first and then pull the HTML source.
However, there is another option: if you look at the HTML source, you'll see that those tables sit inside comments. You can use BeautifulSoup to pull out the comments, then search those for table tags.
This will return a list of dataframes, and the Team Per Game stats are the table at index position 1:
from io import StringIO

import requests
from bs4 import BeautifulSoup
from bs4 import Comment
import pandas as pd

def fetch_years():
    # Determine the URL
    url = "https://www.basketball-reference.com/leagues/NBA_2000.html?sr&utm_source=direct&utm_medium=Share&utm_campaign=ShareTool#team-stats-per_game::none"
    html = requests.get(url)
    soup = BeautifulSoup(html.text, "html.parser")
    # The stats tables are wrapped in HTML comments, so collect every comment node
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    tables = []
    for each in comments:
        if 'table' in each:
            try:
                # Parse the commented-out markup as HTML and keep its first table
                tables.append(pd.read_html(StringIO(each))[0])
            except Exception:
                # Skip comments whose markup pandas cannot parse into a table
                continue
    return tables

tables = fetch_years()
Output:
print (tables[1].to_string())
Rk Team G MP FG FGA FG% 3P 3PA 3P% 2P 2PA 2P% FT FTA FT% ORB DRB TRB AST STL BLK TOV PF PTS
0 1.0 Sacramento Kings* 82 241.5 40.0 88.9 0.450 6.5 20.2 0.322 33.4 68.7 0.487 18.5 24.6 0.754 12.9 32.1 45.0 23.8 9.6 4.6 16.2 21.1 105.0
1 2.0 Detroit Pistons* 82 241.8 37.1 80.9 0.459 5.4 14.9 0.359 31.8 66.0 0.481 23.9 30.6 0.781 11.2 30.0 41.2 20.8 8.1 3.3 15.7 24.5 103.5
2 3.0 Dallas Mavericks 82 240.6 39.0 85.9 0.453 6.3 16.2 0.391 32.6 69.8 0.468 17.2 21.4 0.804 11.4 29.8 41.2 22.1 7.2 5.1 13.7 21.6 101.4
3 4.0 Indiana Pacers* 82 240.6 37.2 81.0 0.459 7.1 18.1 0.392 30.0 62.8 0.478 19.9 24.5 0.811 10.3 31.9 42.1 22.6 6.8 5.1 14.1 21.8 101.3
4 5.0 Milwaukee Bucks* 82 242.1 38.7 83.3 0.465 4.8 13.0 0.369 33.9 70.2 0.483 19.0 24.2 0.786 12.4 28.9 41.3 22.6 8.2 4.6 15.0 24.6 101.2
5 6.0 Los Angeles Lakers* 82 241.5 38.3 83.4 0.459 4.2 12.8 0.329 34.1 70.6 0.482 20.1 28.9 0.696 13.6 33.4 47.0 23.4 7.5 6.5 13.9 22.5 100.8
6 7.0 Orlando Magic 82 240.9 38.6 85.5 0.452 3.6 10.6 0.338 35.1 74.9 0.468 19.2 26.1 0.735 14.0 31.0 44.9 20.8 9.1 5.7 17.6 24.0 100.1
7 8.0 Houston Rockets 82 241.8 36.6 81.3 0.450 7.1 19.8 0.358 29.5 61.5 0.480 19.2 26.2 0.733 12.3 31.5 43.8 21.6 7.5 5.3 17.4 20.3 99.5
8 9.0 Boston Celtics 82 240.6 37.2 83.9 0.444 5.1 15.4 0.331 32.2 68.5 0.469 19.8 26.5 0.745 13.5 29.5 43.0 21.2 9.7 3.5 15.4 27.1 99.3
9 10.0 Seattle SuperSonics* 82 241.2 37.9 84.7 0.447 6.7 19.6 0.339 31.2 65.1 0.480 16.6 23.9 0.695 12.7 30.3 43.0 22.9 8.0 4.2 14.0 21.7 99.1
10 11.0 Denver Nuggets 82 242.1 37.3 84.3 0.442 5.7 17.0 0.336 31.5 67.2 0.469 18.7 25.8 0.724 13.1 31.6 44.7 23.3 6.8 7.5 15.6 23.9 99.0
11 12.0 Phoenix Suns* 82 241.5 37.7 82.6 0.457 5.6 15.2 0.368 32.1 67.4 0.477 17.9 23.6 0.759 12.5 31.2 43.7 25.6 9.1 5.3 16.7 24.1 98.9
12 13.0 Minnesota Timberwolves* 82 242.7 39.3 84.3 0.467 3.0 8.7 0.346 36.3 75.5 0.481 16.8 21.6 0.780 12.4 30.1 42.5 26.9 7.6 5.4 13.9 23.3 98.5
13 14.0 Charlotte Hornets* 82 241.2 35.8 79.7 0.449 4.1 12.2 0.339 31.7 67.5 0.469 22.7 30.0 0.758 10.8 32.1 42.9 24.7 8.9 5.9 14.7 20.4 98.4
14 15.0 New Jersey Nets 82 241.8 36.3 83.9 0.433 5.8 16.8 0.347 30.5 67.2 0.454 19.5 24.9 0.784 12.7 28.2 40.9 20.6 8.8 4.8 13.6 23.3 98.0
15 16.0 Portland Trail Blazers* 82 241.2 36.8 78.4 0.470 5.0 13.8 0.361 31.9 64.7 0.493 18.8 24.7 0.760 11.8 31.2 43.0 23.5 7.7 4.8 15.2 22.7 97.5
16 17.0 Toronto Raptors* 82 240.9 36.3 83.9 0.433 5.2 14.3 0.363 31.2 69.6 0.447 19.3 25.2 0.765 13.4 29.9 43.3 23.7 8.1 6.6 13.9 24.3 97.2
17 18.0 Cleveland Cavaliers 82 242.1 36.3 82.1 0.442 4.2 11.2 0.373 32.1 70.9 0.453 20.2 26.9 0.750 12.3 30.5 42.8 23.7 8.7 4.4 17.4 27.1 97.0
18 19.0 Washington Wizards 82 241.5 36.7 81.5 0.451 4.1 10.9 0.376 32.6 70.6 0.462 19.1 25.7 0.743 13.0 29.7 42.7 21.6 7.2 4.7 16.1 26.2 96.6
19 20.0 Utah Jazz* 82 240.9 36.1 77.8 0.464 4.0 10.4 0.385 32.1 67.4 0.476 20.3 26.2 0.773 11.4 29.6 41.0 24.9 7.7 5.4 14.9 24.5 96.5
20 21.0 San Antonio Spurs* 82 242.1 36.0 78.0 0.462 4.0 10.8 0.374 32.0 67.2 0.476 20.1 27.0 0.746 11.3 32.5 43.8 22.2 7.5 6.7 15.0 20.9 96.2
21 22.0 Golden State Warriors 82 240.9 36.5 87.1 0.420 4.2 13.0 0.323 32.3 74.0 0.437 18.3 26.2 0.697 15.9 29.7 45.6 22.6 8.9 4.3 15.9 24.9 95.5
22 23.0 Philadelphia 76ers* 82 241.8 36.5 82.6 0.442 2.5 7.8 0.323 34.0 74.8 0.454 19.2 27.1 0.708 14.0 30.1 44.1 22.2 9.6 4.7 15.7 23.6 94.8
23 24.0 Miami Heat* 82 241.8 36.3 78.8 0.460 5.4 14.7 0.371 30.8 64.1 0.481 16.4 22.3 0.736 11.2 31.9 43.2 23.5 7.1 6.4 15.0 23.7 94.4
24 25.0 Atlanta Hawks 82 241.8 36.6 83.0 0.441 3.1 9.9 0.317 33.4 73.1 0.458 18.0 24.2 0.743 14.0 31.3 45.3 18.9 6.1 5.6 15.4 21.0 94.3
25 26.0 Vancouver Grizzlies 82 242.1 35.3 78.5 0.449 4.0 11.0 0.361 31.3 67.6 0.463 19.4 25.1 0.774 12.3 28.3 40.6 20.7 7.4 4.2 16.8 22.9 93.9
26 27.0 New York Knicks* 82 241.8 35.3 77.7 0.455 4.3 11.4 0.375 31.0 66.3 0.468 17.2 22.0 0.781 9.8 30.7 40.5 19.4 6.3 4.3 14.6 24.2 92.1
27 28.0 Los Angeles Clippers 82 240.3 35.1 82.4 0.426 5.2 15.5 0.339 29.9 67.0 0.446 16.6 22.3 0.746 11.6 29.0 40.6 18.0 7.0 6.0 16.2 22.2 92.0
28 29.0 Chicago Bulls 82 241.5 31.3 75.4 0.415 4.1 12.6 0.329 27.1 62.8 0.432 18.1 25.5 0.709 12.6 28.3 40.9 20.1 7.9 4.7 19.0 23.3 84.8
29 NaN League Average 82 241.5 36.8 82.1 0.449 4.8 13.7 0.353 32.0 68.4 0.468 19.0 25.3 0.750 12.4 30.5 42.9 22.3 7.9 5.2 15.5 23.3 97.5
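
Picking the table by list position (index 1) depends on the order of the comments in the page. As a hedged alternative, the sketch below lists the id attributes of tables hidden inside comments, so that a table could be selected by id instead. It uses only the standard-library html.parser rather than BeautifulSoup, the sample HTML is made up for illustration, and the id "team-stats-per_game" is an assumption taken from the fragment of the URL in the question.

```python
from html.parser import HTMLParser

class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a document."""
    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data)

class TableIdCollector(HTMLParser):
    """Collects the id attribute of every <table> start tag."""
    def __init__(self):
        super().__init__()
        self.table_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_ids.append(dict(attrs).get("id"))

def hidden_table_ids(html):
    """Return the ids of tables that only exist inside HTML comments."""
    collector = CommentCollector()
    collector.feed(html)
    ids = []
    for comment in collector.comments:
        if "<table" in comment:
            finder = TableIdCollector()
            finder.feed(comment)  # parse the comment body as HTML
            ids.extend(finder.table_ids)
    return ids

# Hypothetical sample page: one visible table, one table wrapped in a comment
SAMPLE = """
<div><table id="visible_standings"><tr><th>W</th></tr></table></div>
<div><!-- <table id="team-stats-per_game"><tr><th>Team</th></tr></table> --></div>
"""
print(hidden_table_ids(SAMPLE))
```

Only the commented-out table is reported, which mirrors why the question's code saw the standings table first: the per-game table is not part of the parsed document tree until the comment is parsed separately.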

Excel 2010 Find (dots)-replace(commas)

I am trying to clean data for calculations in Excel. The data is taken from a PDF file, and it contains numbers where I need to change the dots into commas; otherwise Excel interprets the numbers split by text-to-columns as dates and converts them accordingly, which is not what I need.
But after the find-and-replace, the data within each cell gets a line break inserted automatically, as if I had pressed Alt+Enter.
I noticed that the cell is changed this way after each comma. For instance, if there are two commas in one cell, the cell ends up with two lines, which makes it a problem to use text-to-columns to separate the cells.
So here is the data:
+++
Year CPI WPI Year CPI WPI
1960 29.8 31.7 1980 86.3 93.8
1961 30.0 31.6 1981 94.0 98.8
1962 30.4 31.6 1982 97.6 100.5
1963 30.9 31.6 1983 101.3 102.3
1964 31.2 31.7 1984 105.3 103.5
1965 31.8 32.8 1985 109.3 103.6
1966 32.9 33.3 1986 110.5 99.70
1967 33.9 33.7 1987 115.4 104.2
1968 35.5 34.6 1988 120.5 109.0
1969 37.7 36.3 1989 126.1 113.0
1970 39.8 37.1 1990 133.8 118.7
1971 41.1 38.6 1991 137.9 115.9
1972 42.5 41.1 1992 141.9 117.6
1973 46.2 47.4 1993 145.8 118.6
1974 51.9 57.3 1994 149.7 121.9
1975 55.5 59.7 1995 153.5 125.7
1976 58.2 62.5 1996 158.6 128.8
1977 62.1 66.2 1997 161.3 126.7
1978 67.7 72.7 1998 163.9 122.7
1979 76.7 83.4 1999 168.3 128.0
+++
I would like to have this data separated by spaces, with the decimals written with a comma instead of a dot.
Why don't you use
=SUBSTITUTE(A1,".",",")
That's exactly what you want, isn't it?
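
If the text is cleaned before it is pasted into Excel, the same substitution can be sketched in a few lines of Python. This is an illustration under assumptions, not part of the answer: the helper name dots_to_commas is hypothetical, and the pattern only replaces a dot that sits between two digits, so the space separators are left untouched.

```python
import re

def dots_to_commas(line):
    """Replace a dot that sits between two digits with a comma."""
    return re.sub(r"(?<=\d)\.(?=\d)", ",", line)

# One row of the data from the question
print(dots_to_commas("1960 29.8 31.7 1980 86.3 93.8"))
```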

Colouring a pm3d surface using a column values

I am trying to colour a splot surface using pm3d, and I want to colour it using the values from another column instead of the z-axis.
The input file (test.file, tab separated) is :
atom_num residue_name X Y Z
288 1 45.3 36.6 79.3
301 1 38.9 197.4 72.5
314 1 118.2 53.8 76.5
327 1 58.2 139.1 78.5
353 1 1.9 14.4 71.9
366 1 156.9 180.0 72.1
379 1 183.2 5.4 69.5
392 1 71.7 155.4 75.8
457 1 83.4 11.8 74.8
613 1 97.1 180.7 77.5
626 1 145.2 160.3 71.7
678 2 73.1 76.3 81.0
704 3 30.3 46.5 79.3
717 2 216.0 130.7 85.5
743 2 55.0 137.2 74.4
756 2 23.4 67.3 78.3
769 2 46.9 156.1 77.3
821 2 145.4 143.9 80.7
990 2 7.8 119.3 79.8
1016 3 44.3 67.3 76.7
1042 3 12.8 44.4 74.3
1055 3 149.1 79.9 78.2
1068 3 100.8 35.8 76.1
1081 3 57.6 196.8 76.8
1094 3 214.7 122.8 79.5
1107 3 82.0 190.0 74.4
1120 3 150.9 39.4 71.3
1133 3 50.4 143.7 75.3
1146 1 42.9 104.7 74.3
1159 1 139.0 48.8 73.4
1172 1 66.8 165.3 71.5
1198 1 190.7 150.1 84.2
1211 1 92.1 5.1 75.8
1224 1 211.8 177.7 74.1
1237 1 131.6 0.2 73.6
1250 2 103.8 104.2 76.6
1276 2 132.4 5.0 70.0
1289 2 94.4 9.4 73.0
1302 2 72.6 33.7 74.3
1315 2 14.4 162.6 74.7
1406 2 171.4 143.6 86.1
1419 2 209.5 52.9 77.4
1445 2 11.6 14.7 72.3
1458 1 115.5 165.0 73.0
1549 1 147.1 45.5 76.1
1575 1 115.8 36.6 74.5
1588 1 35.8 37.3 76.2
1601 1 65.4 28.2 76.9
1614 1 13.4 199.9 76.5
The commands I am using are:
set dgrid3d 30,30
set hidden3d
set palette rgbformulae 33,13,10
splot "test.file" u 3:4:5 w pm3d
The image is appearing like this:
By default the plot is coloured by the Z-axis value (column 5). I am stuck on colouring the plot by the values of Residue Name (column 2), which range from 1 to 3. Is there an option to define which column is used for colouring? Ideally I would like the same plot, but coloured according to column 2, so that I can see which "residue types" lie in which contours.
Any help would be hugely appreciated.
As your residue is an integer, it is unclear whether you want it interpolated onto the grid.
However, if that's what you want, you can use the solution from "Plotting 3D surface from scatter points and a png on the same 3D graph", but don't use "with pm3d" when writing the tables. Here's a solution with a quick and somewhat dirty Unix trick to merge the tables:
set terminal push #Save current terminal settings
set terminal unknown #dummy terminal
set table "surface.dat"
set dgrid3d
splot 'test.dat' using 3:4:5
set table "residue.dat"
splot 'test.dat' using 3:4:2
unset dgrid3d
unset table
set term pop #reset current terminal settings
!paste surface.dat residue.dat > test_grid.dat
splot "test_grid.dat" u 1:2:3:7 w pm3d
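
The !paste line relies on a Unix tool. Where paste is not available, the merge could be sketched in Python as below. This is a hypothetical stand-in, not part of the answer: it assumes both table files have the same number of lines in the same order, and that each data row has four whitespace-separated fields, as the answer's "u 1:2:3:7" implies. Unlike paste, it keeps blank lines completely empty, since gnuplot uses blank lines to separate the scans of a grid.

```python
def paste_tables(surface_lines, residue_lines):
    """Join two gnuplot table dumps line by line, like the Unix paste command."""
    merged = []
    for left, right in zip(surface_lines, residue_lines):
        left, right = left.rstrip("\n"), right.rstrip("\n")
        if not left.strip() and not right.strip():
            merged.append("")  # keep scan-separating blank lines blank
        else:
            merged.append(left + "\t" + right)
    return merged

def paste_files(surface_path, residue_path, out_path):
    """Write the merged table to out_path (file names are up to the caller)."""
    with open(surface_path) as a, open(residue_path) as b:
        merged = paste_tables(a.readlines(), b.readlines())
    with open(out_path, "w") as out:
        out.write("\n".join(merged) + "\n")

# Tiny made-up example: the 7th field of a merged row is the residue value
demo = paste_tables(["1 2 75.0 i\n", "\n"], ["1 2 1.5 i\n", "\n"])
print(demo)
```

Calling paste_files("surface.dat", "residue.dat", "test_grid.dat") would then take the place of the !paste line.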

Do files have to be csv to differentiate between columns to plot a graph with gnuplot

I am trying to plot a line graph using data from a file that has several columns in it (16 in fact). I have been trying to use the command
plot 'snr.dat' using 2:16 with lines
but I do not seem to be getting the result I would like.
I have attached an extract from the file I am using.
2014/10/30 0:00:28.847 00000 159.9 71.6 -12.51 .40 64.1 217.1 3 23.1 15 1 3511. .055 -9.99 11.4
2014/10/30 0:00:28.847 00000 229.9 103.9 -12.51 .40 64.1 217.1 3 23.1 15 1 3511. .055 -9.99 11.4
2014/10/30 0:00:28.847 00000 159.9 81.7 -12.51 .40 59.9 92.6 3 29.4 23 1 3511. .055 -9.99 11.4
2014/10/30 0:00:28.847 00001 159.9 71.6 -12.51 .40 64.0 217.1 3 23.4 25 1 3508. .055 -9.99 11.3
2014/10/30 0:00:28.847 00001 229.9 103.9 -12.51 .40 64.0 217.1 3 23.4 25 1 3508. .055 -9.99 11.3
2014/10/30 0:00:28.847 00001 159.9 81.7 -12.51 .40 59.9 92.6 3 29.6 14 1 3508. .055 -9.99 11.3
2014/10/30 0:01:30.114 00002 229.9 92.3 1.02 1.62 67.3 138.7 2 27.2 25 1 1746. .138 -9.99 5.7
2014/10/30 0:01:30.114 00002 159.9 89.9 1.02 1.62 56.4 97.4 2 26.5 35 1 1746. .138 -9.99 5.7
2014/10/30 0:02:30.504 00005 96.0 90.1 -25.64 1.18 20.3 120.5 1 17.2 45 1 2553. .165 -9.99 8.7
2014/10/30 0:02:52.896 00007 102.0 91.5 2.23 .03 26.4 140.8 1 11.8 35 1 19393. .098 -9.99 23.6
2014/10/30 0:02:52.890 00008 100.0 89.6 3.52 .57 26.5 139.9 1 10.9 35 1 4394. .214 -9.99 13.0
2014/10/30 0:02:52.894 00009 104.0 93.3 2.39 .52 26.4 141.0 1 10.1 13 1 4376. .110 -9.99 12.5
2014/10/30 0:03:20.093 0000B 106.0 84.5 5.30 2.01 37.4 202.2 1 25.8 45 1 2306. .095 -9.99 7.8
2014/10/30 0:04:08.515 0000D 102.0 88.1 13.20 1.92 30.5 180.6 3 28.4 15 1 3200. .061 -9.99 9.9
2014/10/30 0:04:08.515 0000D 102.0 99.4 13.20 1.92 12.9 68.6 3 26.1 45 1 3200. .061 -9.99 9.9
2014/10/30 0:04:08.515 0000D 102.0 88.2 13.20 1.92 30.3 128.4 3 38.2 13 1 3200. .061 -9.99 9.9
2014/10/30 0:04:12.642 0000E 108.0 91.9 -38.85 .20 31.9 222.0 1 23.8 15 1 9636. .084 -9.99 20.2
2014/10/30 0:04:12.640 0000F 110.0 93.6 -38.17 .51 31.9 221.9 1 23.6 25 1 4974. .086 -9.99 14.7
2014/10/30 0:04:40.580 0000G 201.9 93.0 -20.01 .41 63.4 38.1 1 24.7 15 1 2716. .244 -9.99 9.3
I would like to have the time (the second column) on the x axis and the SNR values (the 16th column) on the y axis, with a line joining them.
Thanks for any help, and if you need any more info, please just ask.
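gnuplot splits columns on whitespace by default, so the file does not have to be CSV. You must, however, tell gnuplot that you want to plot time data on the x-axis with
set xdata time
and in which format the time should be parsed
set timefmt '%H:%M:%S'
Note that the extract has 17 whitespace-separated columns, so the last column is number 17, not 16. A complete minimal script could be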
set timefmt '%H:%M:%S'
set xdata time
plot 'snr.dat' using 2:17 with lines title 'SNR'
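
The column count can be checked quickly outside gnuplot. The sketch below splits one line of the extract on whitespace, the same way gnuplot does by default, and parses the time field. It assumes that the last field is the SNR value, as the plot title in the script suggests; the helper name parse_row is hypothetical.

```python
from datetime import datetime

# First line of the extract from the question
LINE = ("2014/10/30 0:00:28.847 00000 159.9 71.6 -12.51 .40 64.1 217.1 "
        "3 23.1 15 1 3511. .055 -9.99 11.4")

def parse_row(line):
    """Return (number of columns, time of day, last column as float)."""
    fields = line.split()  # split on any run of whitespace
    # %H accepts the unpadded hour "0"; %f reads the fractional seconds
    time_of_day = datetime.strptime(fields[1], "%H:%M:%S.%f").time()
    snr = float(fields[-1])  # assumption: the last column is the SNR
    return len(fields), time_of_day, snr

count, time_of_day, snr = parse_row(LINE)
print(count, time_of_day, snr)
```

The count comes out as 17, which is why column 16 (the constant -9.99) produced an unhelpful plot and column 17 is the one to use.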
