Extending a macro from 1 row to 56 rows: application-defined error - Excel

I have never written Excel VBA macros before.
The data I'm trying to organize into a single column is in Excel rows 22-78.
0 0.04 0.08 0.12 0.16 0.2 0.24 0.28 0.32 0.36 0.4 0.44 0.48 0.52 0.56 0.6 0.64 0.68 0.72 0.76 0.8 0.84 0.88 0.92 0.96 1 1.04 1.08 1.12 1.16 1.2 1.24 1.28 1.32 1.36 1.4 1.44 1.48 1.52 1.56 1.6 1.64 1.68 1.72 1.76 1.8 1.84 1.88 1.92 1.96 2 2.04 2.08 2.12 2.16 2.2 2.24 2.28 2.32 2.36 2.4 2.44 2.48 2.52 2.56 2.6 2.64 2.68 2.72 2.76 2.8 2.84 2.88 2.92 2.96 3 3.04 3.08 3.12 3.16 3.2 3.24 3.28 3.32 3.36 3.4 3.44 3.48 3.52 3.56 3.6 3.64 3.68 3.72 3.76 3.8 3.84 3.88 3.92 3.96 4 4.04 4.08 4.12 4.16 4.2 4.24 4.28 4.32 4.36 4.4 4.44 4.48 4.52 4.56 4.6 4.64 4.68 4.72 4.76 4.8 4.84 4.88 4.92 4.96 5 5.04 5.08 5.12 5.16 5.2 5.24 5.28 5.32 5.36 5.4 5.44 5.48 5.52 5.56 5.6 5.64 5.68 5.72 5.76 5.8 5.84 5.88 5.92 5.96 6 6.04 6.08 6.12 6.16 6.2 6.24 6.28 6.32 6.36 6.4 6.44 6.48 6.52 6.56 6.6 6.64 6.68 6.72 6.76 6.8 6.84 6.88 6.92 6.96 7 7.04 7.08 7.12 7.16 7.2 7.24 7.28 7.32 7.36 7.4 7.44 7.48 7.52 7.56 7.6 7.64 7.68 7.72 7.76 7.8 7.84 7.88 7.92 7.96 8 8.04 8.08 8.12 8.16 8.2 8.24 8.28 8.32 8.36 8.4 8.44 8.48 8.52 8.56 8.6 8.64 8.68 8.72 8.76 8.8 8.84 8.88 8.92 8.96 9 9.04 9.08 9.12 9.16 9.2 9.24 9.28 9.32 9.36 9.4 9.44 9.48 9.52 9.56 9.6 9.64 9.68 9.72 9.76 9.8 9.84 9.88 9.92 9.96 10 10.04 10.08 10.12 10.16 10.2 10.24 10.28 10.32 10.36 10.4 10.44 10.48 10.52 10.56 10.6 10.64 10.68 10.72 10.76 10.8 10.84 10.88 10.92 10.96 11 11.04 11.08 11.12 11.16 11.2 11.24 11.28 11.32 11.36 11.4 11.44 11.48 11.52 11.56 11.6 11.64 11.68 11.72 11.76 11.8 11.84 11.88 11.92 11.96 12 12.04 12.08 12.12 12.16 12.2 12.24 12.28 12.32 12.36 12.4 12.44 12.48 12.52 12.56 12.6 12.64 12.68 12.72 12.76 12.8 12.84 12.88 12.92 12.96 13 13.04 13.08 13.12 13.16 13.2 13.24 13.28 13.32 13.36 13.4 13.44 13.48 13.52 13.56 13.6 13.64 13.68 13.72 13.76 13.8 13.84 13.88 13.92 13.96 14 14.04 14.08 14.12 14.16 14.2 14.24 14.28 14.32 14.36 14.4 14.44 14.48 14.52 14.56 14.6 14.64 14.68 14.72 14.76 14.8 14.84 14.88 14.92 14.96 15 15.04 15.08 15.12 15.16 15.2 15.24 15.28 15.32
This is the data in one row; rows 22-78 each look like this. The final files have a similar number of columns but many more rows.
I am not sure what would be a good way to organize this into a single column in Excel.
I got this working for one row. Here's the code:
Sub RowsToColumn()
    Dim RN As Range
    Dim RI As Range
    Dim r As Long
    Dim LR As Long
    Application.ScreenUpdating = False
    Columns(1).Insert                         ' insert an empty column A for the output
    r = 0
    LR = Range("A" & Rows.Count).End(xlUp).Row
    For Each RN In Range("A1:A" & LR)
        r = r + 1                             ' r advances in both loops, leaving blanks
        For Each RI In Range(RN, Range("XFD" & RN.Row).End(xlToLeft))
            r = r + 1
            Cells(r, 1) = RI                  ' copy each cell of the row into column A
            RI.Clear
        Next RI
    Next RN
    Columns("A:A").SpecialCells(xlCellTypeBlanks).Delete Shift:=xlUp   ' remove the blanks
End Sub
But when I try to extend this to rows 22-78:
Sub RowsToColumn_Second()
    Dim RN As Range
    Dim RI As Range
    Dim r As Long
    Dim LR As Long
    Dim row As Range
    Dim rng As Range
    Dim cell As Range
    Application.ScreenUpdating = False
    Set rng = Range("A22:A78")
    For Each row In rng.Rows
        Columns(1, rng).Insert                ' <-- the line that raises error 1004
        r = 0
        LR = Range("A" & Rows.Count).End(xlUp).row
        For Each RN In Range("A1:A" & LR)
            r = r + 1
            For Each RI In Range(RN, Range("XFD" & RN.row).End(xlToLeft))
                r = r + 1
                Cells(r, 1) = RI
                RI.Clear
            Next RI
        Next RN
    Next row
    Columns("A:A").SpecialCells(xlCellTypeBlanks).Delete Shift:=xlUp
End Sub
This is where it says "Application-defined error 1004". It doesn't like Columns(1, rng).Insert.

Copy the data and use Paste Special -> Transpose; this converts rows to columns, or vice versa.
As for the 1004 error: Columns takes a single argument (an index or a letter like "A"), so Columns(1, rng).Insert is not valid syntax. Columns(1).Insert, as in the first macro, is the valid form.
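If the real files are large, a short script may beat repeated transposes. Here is a minimal pandas sketch; the file name data.xlsx, and the assumption that the numbers occupy rows 22-78 of the first sheet, are placeholders for your actual layout:
import pandas as pd

# Skip the first 21 rows, then read the 57 rows of interest (22-78), no header.
df = pd.read_excel("data.xlsx", header=None, skiprows=21, nrows=57)

# stack() flattens row by row and drops empty cells, yielding one long column.
col = df.stack().reset_index(drop=True)
col.to_excel("single_column.xlsx", index=False, header=False)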

Can we replace outliers with the predicted values in pyspark?

I have a df in Spark. (I am actually working on this dataset; it is not possible to paste the whole data, so here is the link:)
https://www.kaggle.com/schirmerchad/bostonhoustingmlnd?select=housing.csv
Now I found the outliers as below (22 rows in total):
def IQR(df, column):
    # relativeError=0 requests exact quantiles
    quantiles = df.approxQuantile(column, [0.25, 0.75], 0)
    q1 = quantiles[0]
    q3 = quantiles[1]
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return (lower, upper)

lower, upper = IQR(df, 'RM')
# lower, upper = 4.8374999999999995, 7.617500000000001
outliers = df.filter((df['RM'] > upper) | (df['RM'] < lower))
Below are the detected outliers:
RM LSTAT PTRATIO MEDV
8.069 4.21 18 812700
7.82 3.57 18 919800
7.765 7.56 17.8 835800
7.853 3.81 14.7 1018500
8.266 4.14 17.4 940800
8.04 3.13 17.4 789600
7.686 3.92 17.4 980700
8.337 2.47 17.4 875700
8.247 3.95 17.4 1014300
8.259 3.54 19.1 898800
8.398 5.91 13 1024800
7.691 6.58 18.6 739200
7.82 3.76 14.9 953400
7.645 3.01 14.9 966000
3.561 7.12 20.2 577500
3.863 13.33 20.2 485100
4.138 37.97 20.2 289800
4.368 30.63 20.2 184800
4.652 28.28 20.2 220500
4.138 23.34 20.2 249900
4.628 34.37 20.2 375900
4.519 36.98 20.2 147000
Now I want to replace the outliers with the ML-predicted values. After the ML process, I got the predicted values below:
RM LSTAT PTRATIO MEDV column_assem column prediction
8.069 4.21 18 812700 {"vectorType":"dense","length":3,"values":[4.21,18,812700]} {"vectorType":"dense","length":3,"values":[812699.9991344779,32.9872628621034,25.697942748362507]} 7.138307692307692
7.82 3.57 18 919800 {"vectorType":"dense","length":3,"values":[3.57,18,919800]} {"vectorType":"dense","length":3,"values":[919799.999082192,36.25675952004636,26.656936598060938]} 7.138307692307692
7.765 7.56 17.8 835800 {"vectorType":"dense","length":3,"values":[7.56,17.8,835800]} {"vectorType":"dense","length":3,"values":[835799.9989959698,37.18609141885786,25.87518521779868]} 7.138307692307692
7.853 3.81 14.7 1018500 {"vectorType":"dense","length":3,"values":[3.81,14.7,1018500]} {"vectorType":"dense","length":3,"values":[1018499.9990279829,40.25963007114179,24.285126110831364]} 7.138307692307692
8.266 4.14 17.4 940800 {"vectorType":"dense","length":3,"values":[4.14,17.4,940800]} {"vectorType":"dense","length":3,"values":[940799.9990507461,37.621770135316275,26.279618209844216]} 7.138307692307692
8.04 3.13 17.4 789600 {"vectorType":"dense","length":3,"values":[3.13,17.4,789600]} {"vectorType":"dense","length":3,"values":[789599.999195178,31.094759131505864,24.832393813608636]} 7.138307692307692
7.686 3.92 17.4 980700 {"vectorType":"dense","length":3,"values":[3.92,17.4,980700]} {"vectorType":"dense","length":3,"values":[980699.9990305867,38.858227336579965,26.637789595102927]} 7.138307692307692
8.337 2.47 17.4 875700 {"vectorType":"dense","length":3,"values":[2.47,17.4,875700]} {"vectorType":"dense","length":3,"values":[875699.9991585133,33.577861049146954,25.59625197564997]} 7.138307692307692
8.247 3.95 17.4 1014300 {"vectorType":"dense","length":3,"values":[3.95,17.4,1014300]} {"vectorType":"dense","length":3,"values":[1014299.9990056665,40.11446130241714,26.949909126197]} 7.138307692307692
8.259 3.54 19.1 898800 {"vectorType":"dense","length":3,"values":[3.54,19.1,898800]} {"vectorType":"dense","length":3,"values":[898799.9990899825,35.406713649671325,27.56000332051734]} 7.138307692307692
8.398 5.91 13 1024800 {"vectorType":"dense","length":3,"values":[5.91,13,1024800]} {"vectorType":"dense","length":3,"values":[1024799.9989586923,42.669988999612016,22.74784587477886]} 7.138307692307692
7.691 6.58 18.6 739200 {"vectorType":"dense","length":3,"values":[6.58,18.6,739200]} {"vectorType":"dense","length":3,"values":[739199.9990946348,32.64270527156902,25.73328780757773]} 7.138307692307692
7.82 3.76 14.9 953400 {"vectorType":"dense","length":3,"values":[3.76,14.9,953400]} {"vectorType":"dense","length":3,"values":[953399.9990744753,37.82403517229104,23.880552758747136]} 7.138307692307692
7.645 3.01 14.9 966000 {"vectorType":"dense","length":3,"values":[3.01,14.9,966000]} {"vectorType":"dense","length":3,"values":[965999.9990932231,37.53477931241747,23.960460322415766]} 7.138307692307692
3.561 7.12 20.2 577500 {"vectorType":"dense","length":3,"values":[7.12,20.2,577500]} {"vectorType":"dense","length":3,"values":[577499.9991773808,27.20258411502299,25.862694427868608]} 6.376732394366198
3.863 13.33 20.2 485100 {"vectorType":"dense","length":3,"values":[13.33,20.2,485100]} {"vectorType":"dense","length":3,"values":[485099.999013695,30.032948373359417,25.311342678468208]} 6.043858108108108
4.138 37.97 20.2 289800 {"vectorType":"dense","length":3,"values":[37.97,20.2,289800]} {"vectorType":"dense","length":3,"values":[289799.99824280146,47.51591753902686,24.707706732637366]} 5.2370714285714275
4.368 30.63 20.2 184800 {"vectorType":"dense","length":3,"values":[30.63,20.2,184800]} {"vectorType":"dense","length":3,"values":[184799.99858809082,36.35256433967503,23.378827944979733]} 5.2370714285714275
4.652 28.28 20.2 220500 {"vectorType":"dense","length":3,"values":[28.28,20.2,220500]} {"vectorType":"dense","length":3,"values":[220499.9986495131,35.3082739723793,23.59425617851294]} 5.2370714285714275
4.138 23.34 20.2 249900 {"vectorType":"dense","length":3,"values":[23.34,20.2,249900]} {"vectorType":"dense","length":3,"values":[249899.99881098093,31.44714189260281,23.625084354536643]} 6.043858108108108
4.628 34.37 20.2 375900 {"vectorType":"dense","length":3,"values":[34.37,20.2,375900]} {"vectorType":"dense","length":3,"values":[375899.9983146336,47.06252004732307,25.328138233469573]} 5.2370714285714275
4.519 36.98 20.2 147000 {"vectorType":"dense","length":3,"values":[36.98,20.2,147000]} {"vectorType":"dense","length":3,"values":[146999.99838054206,41.31545014321207,23.33912202640834]} 5.2370714285714275
If it were a single value, I know lit() could replace it, but when there are multiple values, how do we replace the original ones?
Assuming that the original dataframe is called df and the machine-learning transformed dataframe is called ml, you can do a join and replace the RM column with the prediction value if the row satisfies the outlier condition:
import pyspark.sql.functions as F

df2 = df.join(ml, df.columns, 'left').withColumn(
    'RM',
    F.when(
        (F.col('RM') > upper) | (F.col('RM') < lower),
        F.col('prediction')
    ).otherwise(F.col('RM'))
).select(df.columns)
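A minimal, self-contained sketch of the same pattern on toy data (the values, including the non-outlier 6.2/6.5 row, are made up; only the RM and prediction columns are kept):
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()
lower, upper = 4.8375, 7.6175   # IQR bounds from above, rounded

df = spark.createDataFrame([(8.069,), (6.2,), (3.561,)], ['RM'])
ml = spark.createDataFrame([(8.069, 7.138), (6.2, 6.5), (3.561, 6.377)],
                           ['RM', 'prediction'])

df2 = df.join(ml, df.columns, 'left').withColumn(
    'RM',
    F.when((F.col('RM') > upper) | (F.col('RM') < lower),
           F.col('prediction')).otherwise(F.col('RM'))
).select(df.columns)

df2.show()   # 8.069 -> 7.138, 6.2 unchanged, 3.561 -> 6.377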

BeautifulSoup and urlopen aren't fetching the right table

I'm trying to practice BeautifulSoup and urlopen using Basketball-Reference datasets. When I get an individual player's stats, everything works fine, but when I use the same code for team stats, it doesn't fetch the right table.
The following code is meant to get the "headers" from the page:
from urllib.request import urlopen
from bs4 import BeautifulSoup

def fetch_years():
    # Determine the url
    url = "https://www.basketball-reference.com/leagues/NBA_2000.html?sr&utm_source=direct&utm_medium=Share&utm_campaign=ShareTool#team-stats-per_game::none"
    html = urlopen(url)
    soup = BeautifulSoup(html, 'html.parser')
    headers = [th.get_text() for th in soup.find_all('tr')[0].find_all('th')]
    headers = headers[1:]
    print(headers)
I'm trying to get the Team's stats per game data, in a format like:
['Tm', 'G', 'MP', 'FG', ...]
Instead, the header data I'm getting is:
['W', 'L', 'W/L%', ...]
which comes from the very first table on the 1999-2000 season page, the team information under the name 'Division Standings'.
If you use that same code for a player's data such as this one, you get the result I'm looking for:
Age Tm Lg Pos G GS MP FG ... DRB TRB AST STL BLK TOV PF PTS
0 20 OKC NBA PG 82 65 32.5 5.3 ... 2.7 4.9 5.3 1.3 0.2 3.3 2.3 15.3
1 21 OKC NBA PG 82 82 34.3 5.9 ... 3.1 4.9 8.0 1.3 0.4 3.3 2.5 16.1
2 22 OKC NBA PG 82 82 34.7 7.5 ... 3.1 4.6 8.2 1.9 0.4 3.9 2.5 21.9
3 23 OKC NBA PG 66 66 35.3 8.8 ... 3.1 4.6 5.5 1.7 0.3 3.6 2.2 23.6
4 24 OKC NBA PG 82 82 34.9 8.2 ... 3.9 5.2 7.4 1.8 0.3 3.3 2.3 23.2
The web-scraping code originally came from here.
The sports-reference.com sites are trickier than your standard ones. Most tables are rendered after the page loads (with the exception of a few tables per page), so you'd need to use Selenium to let the page render first, then pull the HTML source.
The other option: if you look at the HTML source, you'll see those tables are inside comments. You can use BeautifulSoup to pull out the comment tags, then search through those for the table tags.
This will return a list of dataframes, and the Team Per Game stats are the table in index position 1:
import requests
from bs4 import BeautifulSoup
from bs4 import Comment
import pandas as pd
def fetch_years():
    # Determine the url
    url = "https://www.basketball-reference.com/leagues/NBA_2000.html?sr&utm_source=direct&utm_medium=Share&utm_campaign=ShareTool#team-stats-per_game::none"
    html = requests.get(url)
    soup = BeautifulSoup(html.text, 'html.parser')
    # The tables live inside HTML comments; collect the comment nodes
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    tables = []
    for each in comments:
        if 'table' in each:
            try:
                tables.append(pd.read_html(each)[0])
            except:
                continue
    return tables
tables = fetch_years()
Output:
print (tables[1].to_string())
Rk Team G MP FG FGA FG% 3P 3PA 3P% 2P 2PA 2P% FT FTA FT% ORB DRB TRB AST STL BLK TOV PF PTS
0 1.0 Sacramento Kings* 82 241.5 40.0 88.9 0.450 6.5 20.2 0.322 33.4 68.7 0.487 18.5 24.6 0.754 12.9 32.1 45.0 23.8 9.6 4.6 16.2 21.1 105.0
1 2.0 Detroit Pistons* 82 241.8 37.1 80.9 0.459 5.4 14.9 0.359 31.8 66.0 0.481 23.9 30.6 0.781 11.2 30.0 41.2 20.8 8.1 3.3 15.7 24.5 103.5
2 3.0 Dallas Mavericks 82 240.6 39.0 85.9 0.453 6.3 16.2 0.391 32.6 69.8 0.468 17.2 21.4 0.804 11.4 29.8 41.2 22.1 7.2 5.1 13.7 21.6 101.4
3 4.0 Indiana Pacers* 82 240.6 37.2 81.0 0.459 7.1 18.1 0.392 30.0 62.8 0.478 19.9 24.5 0.811 10.3 31.9 42.1 22.6 6.8 5.1 14.1 21.8 101.3
4 5.0 Milwaukee Bucks* 82 242.1 38.7 83.3 0.465 4.8 13.0 0.369 33.9 70.2 0.483 19.0 24.2 0.786 12.4 28.9 41.3 22.6 8.2 4.6 15.0 24.6 101.2
5 6.0 Los Angeles Lakers* 82 241.5 38.3 83.4 0.459 4.2 12.8 0.329 34.1 70.6 0.482 20.1 28.9 0.696 13.6 33.4 47.0 23.4 7.5 6.5 13.9 22.5 100.8
6 7.0 Orlando Magic 82 240.9 38.6 85.5 0.452 3.6 10.6 0.338 35.1 74.9 0.468 19.2 26.1 0.735 14.0 31.0 44.9 20.8 9.1 5.7 17.6 24.0 100.1
7 8.0 Houston Rockets 82 241.8 36.6 81.3 0.450 7.1 19.8 0.358 29.5 61.5 0.480 19.2 26.2 0.733 12.3 31.5 43.8 21.6 7.5 5.3 17.4 20.3 99.5
8 9.0 Boston Celtics 82 240.6 37.2 83.9 0.444 5.1 15.4 0.331 32.2 68.5 0.469 19.8 26.5 0.745 13.5 29.5 43.0 21.2 9.7 3.5 15.4 27.1 99.3
9 10.0 Seattle SuperSonics* 82 241.2 37.9 84.7 0.447 6.7 19.6 0.339 31.2 65.1 0.480 16.6 23.9 0.695 12.7 30.3 43.0 22.9 8.0 4.2 14.0 21.7 99.1
10 11.0 Denver Nuggets 82 242.1 37.3 84.3 0.442 5.7 17.0 0.336 31.5 67.2 0.469 18.7 25.8 0.724 13.1 31.6 44.7 23.3 6.8 7.5 15.6 23.9 99.0
11 12.0 Phoenix Suns* 82 241.5 37.7 82.6 0.457 5.6 15.2 0.368 32.1 67.4 0.477 17.9 23.6 0.759 12.5 31.2 43.7 25.6 9.1 5.3 16.7 24.1 98.9
12 13.0 Minnesota Timberwolves* 82 242.7 39.3 84.3 0.467 3.0 8.7 0.346 36.3 75.5 0.481 16.8 21.6 0.780 12.4 30.1 42.5 26.9 7.6 5.4 13.9 23.3 98.5
13 14.0 Charlotte Hornets* 82 241.2 35.8 79.7 0.449 4.1 12.2 0.339 31.7 67.5 0.469 22.7 30.0 0.758 10.8 32.1 42.9 24.7 8.9 5.9 14.7 20.4 98.4
14 15.0 New Jersey Nets 82 241.8 36.3 83.9 0.433 5.8 16.8 0.347 30.5 67.2 0.454 19.5 24.9 0.784 12.7 28.2 40.9 20.6 8.8 4.8 13.6 23.3 98.0
15 16.0 Portland Trail Blazers* 82 241.2 36.8 78.4 0.470 5.0 13.8 0.361 31.9 64.7 0.493 18.8 24.7 0.760 11.8 31.2 43.0 23.5 7.7 4.8 15.2 22.7 97.5
16 17.0 Toronto Raptors* 82 240.9 36.3 83.9 0.433 5.2 14.3 0.363 31.2 69.6 0.447 19.3 25.2 0.765 13.4 29.9 43.3 23.7 8.1 6.6 13.9 24.3 97.2
17 18.0 Cleveland Cavaliers 82 242.1 36.3 82.1 0.442 4.2 11.2 0.373 32.1 70.9 0.453 20.2 26.9 0.750 12.3 30.5 42.8 23.7 8.7 4.4 17.4 27.1 97.0
18 19.0 Washington Wizards 82 241.5 36.7 81.5 0.451 4.1 10.9 0.376 32.6 70.6 0.462 19.1 25.7 0.743 13.0 29.7 42.7 21.6 7.2 4.7 16.1 26.2 96.6
19 20.0 Utah Jazz* 82 240.9 36.1 77.8 0.464 4.0 10.4 0.385 32.1 67.4 0.476 20.3 26.2 0.773 11.4 29.6 41.0 24.9 7.7 5.4 14.9 24.5 96.5
20 21.0 San Antonio Spurs* 82 242.1 36.0 78.0 0.462 4.0 10.8 0.374 32.0 67.2 0.476 20.1 27.0 0.746 11.3 32.5 43.8 22.2 7.5 6.7 15.0 20.9 96.2
21 22.0 Golden State Warriors 82 240.9 36.5 87.1 0.420 4.2 13.0 0.323 32.3 74.0 0.437 18.3 26.2 0.697 15.9 29.7 45.6 22.6 8.9 4.3 15.9 24.9 95.5
22 23.0 Philadelphia 76ers* 82 241.8 36.5 82.6 0.442 2.5 7.8 0.323 34.0 74.8 0.454 19.2 27.1 0.708 14.0 30.1 44.1 22.2 9.6 4.7 15.7 23.6 94.8
23 24.0 Miami Heat* 82 241.8 36.3 78.8 0.460 5.4 14.7 0.371 30.8 64.1 0.481 16.4 22.3 0.736 11.2 31.9 43.2 23.5 7.1 6.4 15.0 23.7 94.4
24 25.0 Atlanta Hawks 82 241.8 36.6 83.0 0.441 3.1 9.9 0.317 33.4 73.1 0.458 18.0 24.2 0.743 14.0 31.3 45.3 18.9 6.1 5.6 15.4 21.0 94.3
25 26.0 Vancouver Grizzlies 82 242.1 35.3 78.5 0.449 4.0 11.0 0.361 31.3 67.6 0.463 19.4 25.1 0.774 12.3 28.3 40.6 20.7 7.4 4.2 16.8 22.9 93.9
26 27.0 New York Knicks* 82 241.8 35.3 77.7 0.455 4.3 11.4 0.375 31.0 66.3 0.468 17.2 22.0 0.781 9.8 30.7 40.5 19.4 6.3 4.3 14.6 24.2 92.1
27 28.0 Los Angeles Clippers 82 240.3 35.1 82.4 0.426 5.2 15.5 0.339 29.9 67.0 0.446 16.6 22.3 0.746 11.6 29.0 40.6 18.0 7.0 6.0 16.2 22.2 92.0
28 29.0 Chicago Bulls 82 241.5 31.3 75.4 0.415 4.1 12.6 0.329 27.1 62.8 0.432 18.1 25.5 0.709 12.6 28.3 40.9 20.1 7.9 4.7 19.0 23.3 84.8
29 NaN League Average 82 241.5 36.8 82.1 0.449 4.8 13.7 0.353 32.0 68.4 0.468 19.0 25.3 0.750 12.4 30.5 42.9 22.3 7.9 5.2 15.5 23.3 97.5
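To get just the header list the question was after, read it off that dataframe's columns. Note this table labels the team column 'Team' rather than 'Tm' and carries an extra 'Rk' rank column, which is dropped here the way the original code dropped its first header:
team_per_game = tables[1]
headers = list(team_per_game.columns)[1:]   # drop 'Rk'
print(headers)   # ['Team', 'G', 'MP', 'FG', ...]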

How to correct Python number presentation and/or precision

Floating-point numbers with finite precision are displayed with different precision under seemingly identical conditions.
I detected and tested this on Python 3.x under both Linux and Windows, and it has a negative effect on subsequent calculations.
for i in range(100):
    k = 1 + i / 100
    print(k)
1.0
1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.1
1.11
1.12
1.13
1.1400000000000001
1.15
1.16
1.17
1.18
1.19
1.2
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
1.3
1.31
1.32
1.33
1.34
1.35
1.3599999999999999
1.37
1.38
1.3900000000000001
1.4
1.41
1.42
1.43
1.44
1.45
1.46
1.47
1.48
1.49
1.5
1.51
1.52
1.53
1.54
1.55
1.56
1.5699999999999998
1.58
1.5899999999999999
1.6
1.6099999999999999
1.62
1.63
1.6400000000000001
1.65
1.6600000000000001
1.67
1.6800000000000002
1.69
1.7
1.71
1.72
1.73
1.74
1.75
1.76
1.77
1.78
1.79
1.8
1.81
1.8199999999999998
1.83
1.8399999999999999
1.85
1.8599999999999999
1.87
1.88
1.8900000000000001
1.9
1.9100000000000001
1.92
1.9300000000000002
1.94
1.95
1.96
1.97
1.98
1.99
It is possible to control the displayed precision in the following way:
for i in range(100):
    k = 1 + i / 100
    print("%.Nf" % k)
where N is the number of decimal places. Keep in mind that you regularly don't need many of them, even though the underlying number can carry far more digits. The values themselves are fine: fractions like 14/100 have no exact binary representation, and Python prints the shortest decimal string that round-trips to the stored double, which is sometimes longer than two digits.
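For instance, with N = 2 (this changes only the display; the stored binary value is unchanged):
k = 1 + 14 / 100
print(k)            # 1.1400000000000001 (shortest round-tripping repr)
print("%.2f" % k)   # 1.14
print(f"{k:.2f}")   # 1.14 (the equivalent f-string form)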

Pandas Concat new column

Why do I get NaN in the 'ACTION' column?
It seems strange to me that I am getting that result. I have tried using ignore_index=True, but it raises a freq error.
C H L O OI V WAP ACTION
datetime
2017-03-14 00:52:00 8.25 8.25 8.19 8.21 302.0 1769.0 8.22 NaN
2017-03-13 23:54:00 8.09 8.10 8.09 8.10 6.0 65.0 8.10 NaN
2017-03-14 01:03:00 8.29 8.32 8.28 8.29 175.0 1084.0 8.30 NaN
2017-03-14 00:03:00 8.15 8.15 8.14 8.15 13.0 50.0 8.15 NaN
2017-03-13 23:57:00 8.13 8.13 8.12 8.12 3.0 6.0 8.12 NaN
I want to get:
C H L O OI V WAP ACTION
datetime
2017-03-14 00:52:00 8.25 8.25 8.19 8.21 302.0 1769.0 8.22 100
2017-03-13 23:54:00 8.09 8.10 8.09 8.10 6.0 65.0 8.10 200
2017-03-14 01:03:00 8.29 8.32 8.28 8.29 175.0 1084.0 8.30 300
2017-03-14 00:03:00 8.15 8.15 8.14 8.15 13.0 50.0 8.15 400
2017-03-13 23:57:00 8.13 8.13 8.12 8.12 3.0 6.0 8.12 500
buy_stp = pd.Series([100,200,300,400,500],name= 'ACTION')
print(buy_stp)
df10 = pd.concat([df_concat_results,
buy_stp],
axis=1,
join_axes=[df_concat_results.index])
print(df10)
You need the same indexes in the Series as in the DataFrame for alignment, otherwise you get NaNs:
buy_stp.index = df.index
df['ACTION'] = buy_stp
print (df)
C H L O OI V WAP ACTION
datetime
2017-03-14 00:52:00 8.25 8.25 8.19 8.21 302.0 1769.0 8.22 100
2017-03-13 23:54:00 8.09 8.10 8.09 8.10 6.0 65.0 8.10 200
2017-03-14 01:03:00 8.29 8.32 8.28 8.29 175.0 1084.0 8.30 300
2017-03-14 00:03:00 8.15 8.15 8.14 8.15 13.0 50.0 8.15 400
2017-03-13 23:57:00 8.13 8.13 8.12 8.12 3.0 6.0 8.12 500
Or:
buy_stp = pd.Series([100,200,300,400,500],name= 'ACTION', index=df.index)
print(buy_stp)
datetime
2017-03-14 00:52:00 100
2017-03-13 23:54:00 200
2017-03-14 01:03:00 300
2017-03-14 00:03:00 400
2017-03-13 23:57:00 500
Name: ACTION, dtype: int64
df['ACTION'] = buy_stp
print (df)
C H L O OI V WAP ACTION
datetime
2017-03-14 00:52:00 8.25 8.25 8.19 8.21 302.0 1769.0 8.22 100
2017-03-13 23:54:00 8.09 8.10 8.09 8.10 6.0 65.0 8.10 200
2017-03-14 01:03:00 8.29 8.32 8.28 8.29 175.0 1084.0 8.30 300
2017-03-14 00:03:00 8.15 8.15 8.14 8.15 13.0 50.0 8.15 400
2017-03-13 23:57:00 8.13 8.13 8.12 8.12 3.0 6.0 8.12 500
It also works if you convert to a numpy array via .values or to a list; the only requirement is that df and buy_stp have the same length:
df['ACTION'] = buy_stp.values
print (df)
C H L O OI V WAP ACTION
datetime
2017-03-14 00:52:00 8.25 8.25 8.19 8.21 302.0 1769.0 8.22 100
2017-03-13 23:54:00 8.09 8.10 8.09 8.10 6.0 65.0 8.10 200
2017-03-14 01:03:00 8.29 8.32 8.28 8.29 175.0 1084.0 8.30 300
2017-03-14 00:03:00 8.15 8.15 8.14 8.15 13.0 50.0 8.15 400
2017-03-13 23:57:00 8.13 8.13 8.12 8.12 3.0 6.0 8.12 500
df['ACTION'] = buy_stp.tolist()
print (df)
C H L O OI V WAP ACTION
datetime
2017-03-14 00:52:00 8.25 8.25 8.19 8.21 302.0 1769.0 8.22 100
2017-03-13 23:54:00 8.09 8.10 8.09 8.10 6.0 65.0 8.10 200
2017-03-14 01:03:00 8.29 8.32 8.28 8.29 175.0 1084.0 8.30 300
2017-03-14 00:03:00 8.15 8.15 8.14 8.15 13.0 50.0 8.15 400
2017-03-13 23:57:00 8.13 8.13 8.12 8.12 3.0 6.0 8.12 500
If I understand you correctly, you just want to add a column to a DataFrame. If so, this is the easiest way to do it:
df['Action'] = buy_stp
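Note, though, that this line hits the same alignment pitfall as the question. A tiny self-contained sketch (only the WAP column of the question's data is reproduced):
import pandas as pd

idx = pd.to_datetime(['2017-03-14 00:52:00', '2017-03-13 23:54:00',
                      '2017-03-14 01:03:00', '2017-03-14 00:03:00',
                      '2017-03-13 23:57:00'])
df = pd.DataFrame({'WAP': [8.22, 8.10, 8.30, 8.15, 8.12]}, index=idx)
buy_stp = pd.Series([100, 200, 300, 400, 500], name='ACTION')

df['MISALIGNED'] = buy_stp      # all NaN: RangeIndex 0..4 vs. datetime index
df['ACTION'] = buy_stp.values   # positional assignment works
print(df)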

Gnuplot maximum of an sm b smoothed curve

I have a data file:
0.4 -0.97
0.41 -0.96
0.42 -0.95
0.43 -0.93
0.44 -0.92
0.45 -0.91
0.46 -0.90
0.47 -0.88
0.48 -0.87
0.49 -0.86
0.5 -0.84
0.51 -0.83
0.52 -0.82
0.53 -0.81
0.54 -0.80
0.55 -0.78
0.56 -0.77
0.57 -0.76
0.58 -0.74
0.59 -0.73
0.6 -0.72
0.61 -0.71
0.62 -0.70
0.63 -0.69
0.64 -0.67
0.65 -0.66
0.66 -0.65
0.67 -0.64
0.68 -0.62
0.69 -0.61
0.7 -0.60
0.71 -0.59
0.72 -0.58
0.73 -0.56
0.74 -0.55
0.75 -0.54
0.76 -0.53
0.77 -0.52
0.78 -0.51
0.79 -0.50
0.8 -0.49
0.81 -0.47
0.82 -0.47
0.83 -0.46
0.84 -0.44
0.85 -0.43
0.86 -0.42
0.87 -0.41
0.88 -0.40
0.89 -0.39
0.9 -0.38
0.91 -0.49
0.92 -0.48
0.93 -0.47
0.94 -0.46
0.95 -0.44
0.96 -0.43
0.97 -0.42
0.98 -0.41
0.99 -0.40
1.0 -0.39
1.01 -0.38
1.02 -0.37
1.03 -0.36
1.04 -0.35
1.05 -0.34
1.06 -0.33
1.07 -0.32
1.08 -0.31
1.09 -0.30
1.1 -0.30
1.11 -0.29
1.12 -0.28
1.13 -0.27
1.14 -0.26
1.15 -0.25
1.16 -0.24
1.17 -0.24
1.18 -0.23
1.19 -0.22
1.2 -0.21
1.21 -0.20
1.22 -0.20
1.23 -0.19
1.24 -0.18
1.25 -0.17
1.26 -0.17
1.27 -0.16
1.28 -0.15
1.29 -0.14
1.3 -0.13
1.31 -0.12
1.32 -0.11
1.33 -0.11
1.34 -0.10
1.35 -0.09
1.36 -0.08
1.37 -0.08
1.38 -0.07
1.39 -0.06
1.4 -0.05
1.41 -0.04
1.42 -0.03
1.43 -0.03
1.44 -0.02
1.45 -0.01
1.46 -0.01
1.47 -0.00
1.48 0.00
1.49 0.01
1.5 0.02
1.51 0.03
1.52 0.04
1.53 0.04
1.54 0.05
1.55 0.06
1.56 0.06
1.57 0.07
1.58 0.08
1.59 0.08
1.6 0.09
1.61 0.09
1.62 0.10
1.63 0.10
1.64 0.10
1.65 0.11
1.66 0.11
1.67 0.12
1.68 0.12
1.69 0.13
1.7 0.14
1.71 0.14
1.72 0.14
1.73 0.15
1.74 0.15
1.75 0.16
1.76 0.16
1.77 0.17
1.78 0.17
1.79 0.18
1.8 0.19
1.81 0.20
1.82 0.20
1.83 0.21
1.84 0.21
1.85 0.22
1.86 0.22
1.87 0.23
1.88 0.24
1.89 0.24
1.9 0.25
1.91 0.25
1.92 0.26
1.93 0.26
1.94 0.26
1.95 0.27
1.96 0.28
1.97 0.28
1.98 0.28
1.99 0.29
2.0 0.29
2.01 0.29
2.02 0.29
2.03 0.30
2.04 0.30
2.05 0.30
2.06 0.31
2.07 0.32
2.08 0.32
2.09 0.33
2.1 0.33
2.11 0.33
2.12 0.34
2.13 0.34
2.14 0.34
2.15 0.35
2.16 0.35
2.17 0.36
2.18 0.36
2.19 0.36
2.2 0.37
2.21 0.37
2.22 0.37
2.23 0.38
2.24 0.38
2.25 0.38
2.26 0.38
2.27 0.39
2.28 0.39
2.29 0.39
2.3 0.40
2.31 0.40
2.32 0.40
2.33 0.40
2.34 0.41
2.35 0.41
2.36 0.42
2.37 0.42
2.38 0.43
2.39 0.43
2.4 0.43
2.41 0.43
2.42 0.44
2.43 0.44
2.44 0.44
2.45 0.44
2.46 0.45
2.47 0.45
2.48 0.45
2.49 0.45
2.5 0.46
2.51 0.46
2.52 0.46
2.53 0.47
2.54 0.47
2.55 0.47
2.56 0.48
2.57 0.48
2.58 0.49
2.59 0.36
2.6 0.36
2.61 0.36
2.62 0.36
2.63 0.37
2.64 0.37
2.65 0.37
2.66 0.37
2.67 0.38
2.68 0.38
2.69 0.38
2.7 0.38
2.71 0.38
2.72 0.38
2.73 0.38
2.74 0.38
2.75 0.38
2.76 0.38
2.77 0.38
2.78 0.38
2.79 0.39
2.8 0.39
2.81 0.39
2.82 0.39
2.83 0.39
2.84 0.39
2.85 0.28
2.86 0.28
2.87 0.28
2.88 0.28
2.89 0.28
2.9 0.28
2.91 0.28
2.92 0.28
2.93 0.29
2.94 0.29
2.95 0.29
2.96 0.29
2.97 0.29
2.98 0.29
2.99 0.29
3.0 0.19
3.01 0.19
3.02 0.19
3.03 0.19
3.04 0.19
3.05 0.19
3.06 0.19
3.07 0.19
3.08 0.20
3.09 0.20
3.1 0.20
3.11 0.20
3.12 0.20
3.13 0.20
3.14 0.20
3.15 0.20
3.16 0.20
3.17 0.20
3.18 0.21
3.19 0.21
3.2 0.21
3.21 0.21
3.22 0.21
3.23 0.21
3.24 0.21
3.25 0.21
3.26 0.21
3.27 0.21
3.28 0.21
3.29 0.21
3.3 0.21
3.31 0.21
3.32 0.21
3.33 0.21
3.34 0.21
3.35 0.21
3.36 0.21
3.37 0.22
3.38 0.22
3.39 0.22
3.4 0.22
3.41 0.22
3.42 0.22
3.43 0.22
3.44 0.22
3.45 0.22
3.46 0.22
3.47 0.22
3.48 0.22
3.49 0.22
3.5 0.22
3.51 0.23
3.52 0.23
3.53 0.23
3.54 0.23
3.55 0.23
3.56 0.13
3.57 0.13
3.58 0.13
3.59 0.13
3.6 0.13
3.61 0.13
3.62 0.13
3.63 0.13
3.64 0.13
3.65 0.13
3.66 0.13
3.67 0.13
3.68 0.13
3.69 0.13
3.7 0.13
3.71 0.13
3.72 0.14
3.73 0.14
3.74 0.14
3.75 0.14
3.76 0.05
3.77 0.05
3.78 0.05
3.79 0.05
3.8 0.05
3.81 -0.04
3.82 -0.04
3.83 -0.04
3.84 -0.04
3.85 -0.04
3.86 -0.04
3.87 -0.04
3.88 -0.04
3.89 -0.04
3.9 -0.04
3.91 -0.04
3.92 -0.04
3.93 -0.04
3.94 -0.04
3.95 -0.12
3.96 -0.12
3.97 -0.12
3.98 -0.12
3.99 -0.12
4.0 -0.12
4.01 -0.12
4.02 -0.12
4.03 -0.12
4.04 -0.12
4.05 -0.19
4.06 -0.19
4.07 -0.19
4.08 -0.19
4.09 -0.19
4.1 -0.19
4.11 -0.19
4.12 -0.41
4.13 -0.41
4.14 -0.41
4.15 -0.47
4.16 -0.47
4.17 -0.47
4.18 -0.47
4.19 -0.47
4.2 -0.47
4.21 -0.47
4.22 -0.54
4.23 -0.54
4.24 -0.60
4.25 -0.65
4.26 -0.65
4.27 -0.65
4.28 -0.65
4.29 -0.65
4.3 -0.65
4.31 -0.65
4.32 -0.65
4.33 -0.65
4.34 -0.65
4.35 -0.65
4.36 -0.65
4.37 -0.65
4.38 -0.71
4.39 -0.71
4.4 -0.71
4.41 -0.71
4.42 -0.71
4.43 -0.71
4.44 -0.71
4.45 -0.71
4.46 -0.71
4.47 -0.71
4.48 -0.71
4.49 -0.71
4.5 -0.71
4.51 -0.71
4.52 -0.71
4.53 -0.76
4.54 -0.76
4.55 -0.82
4.56 -0.82
4.57 -0.87
4.58 -0.87
4.59 -0.87
4.6 -0.87
4.61 -0.92
4.62 -0.97
4.63 -1.06
4.64 -1.06
4.65 -1.06
4.66 -1.06
4.67 -1.06
4.68 -1.06
4.69 -1.06
4.7 -1.06
4.71 -1.06
4.72 -1.06
4.73 -1.06
4.74 -1.11
4.75 -1.11
4.76 -1.11
4.77 -1.11
4.78 -1.11
4.79 -1.11
4.8 -1.11
4.81 -1.11
4.82 -1.11
4.83 -1.11
4.84 -1.11
4.85 -1.15
4.86 -1.15
4.87 -1.15
4.88 -1.15
I wish to create a nicely smoothed curve, so I use
plot "for_gnuplot" lw 3 w l sm b title ""
I get the following image:
This is very nice, but I wish to mark the maximum in some way. I know that with sm b the maximum is not the real maximum of the data, but I don't know how to mark this new maximum value.
Thanks
You can write the (x,y) data of the smoothed plot to a temporary file, do some statistics on this file, and plot the results:
# Generate the data for the smooth plot
set samples 1000
set table "temp.dat"
plot "for_gnuplot" lw 3 w l sm b title "1"
unset table
# Get maximum values and indices of maximum values:
# A_max_y, A_index_max_y, B_max_y, B_index_max_y
stats "for_gnuplot" prefix "A"
stats "temp.dat" using 1:2 prefix "B"
# Calculate positions from indices.
# We need the x-value (first column) at B_index_max_y. We know that the first
# column of "temp.dat" consists of equidistant x-values. So we just fit a
# linear function to map from index to position. (Could be done analytically.)
pos_from_index(x) = a*x + b
fit pos_from_index(x) "for_gnuplot" using 0:1 via a, b
A_xvalue_max_y = pos_from_index(A_index_max_y)
fit pos_from_index(x) "temp.dat" using 0:1 via a, b
B_xvalue_max_y = pos_from_index(B_index_max_y)
# Make some arrows to indicate maximal values
set arrow 1 from A_xvalue_max_y, graph 0.99 to A_xvalue_max_y, A_max_y fill lw 2
set arrow 2 from B_xvalue_max_y, graph 0.8 to B_xvalue_max_y, B_max_y fill lw 2
set label 1 at A_xvalue_max_y, graph 0.99 "max raw" offset 0.2, -0.3
set label 2 at B_xvalue_max_y, graph 0.8 "max smooth" center offset 0, -0.4
# Finally plot the graphs
set terminal png
set output "graph.png"
plot "for_gnuplot" lw 2 w l title "raw" ,\
"for_gnuplot" lw 2 w l sm b title "smooth"
This produces the following output:
PS: I would be interested if there is a more direct way to access a value from a file at a specific index.
Here is a link: http://www.phyast.pitt.edu/~zov1/gnuplot/html/statistics.html
Scroll to "Determining the position of the minimum and maximum". (In recent gnuplot versions, stats on two columns also stores the x coordinate of the maximum directly, e.g. B_pos_max_y with the prefix used above, which makes the index-to-position fit unnecessary.)
