Multiple boxplots in SAS


I have this data set and I would like the boxplots of all 9 input variables to appear on the same plot, even though the variables are on different scales. Is there an easy way to accomplish this?
I am a novice SAS user, so I would appreciate some advice. Thank you.
data raw;
input ID$ Family DistRd Cotton Maize Sorg Millet Bull Cattle Goats;
datalines;
FARM1 12 80 1.5 1 3 0.25 2 0 1
FARM2 54 8 6 4 0 1 6 32 5
FARM3 11 13 0.5 1 0 0 0 0 0
FARM4 21 13 2 2.5 1 0 1 0 5
FARM5 61 30 3 5 0 0 4 21 0
FARM6 20 70 0 2 3 0 2 0 3
FARM7 29 35 1.5 2 0 0 0 0 0
FARM8 29 35 2 3 2 0 0 0 0
FARM9 57 9 5 5 0 0 4 5 2
FARM10 23 33 2 2 1 0 2 1 7
FARM11 13 9 0.5 2 2 0 0 0 0
FARM12 15 9 2 2 2 0 0 0 0
FARM13 27 3 1.5 0 2 1 0 0 1
FARM14 28 5 2 0.5 2 2 2 0 5
FARM15 52 5 7 1 7 0 4 11 3
FARM16 12 10 2 2.5 3 0 0 0 0
FARM17 25 30 1 1 4 0 2 0 5
FARM18 5 3 1 0 1 0.5 0 0 3
FARM19 45 30 4.5 1 1 0 6 13 20
FARM20 6 7 1 1 1 1 2 0 5
FARM21 17 8 1.5 0.5 1.5 0.25 0 0 2
FARM22 22 6 3 2 3 1 3 0 2
FARM23 43 40 7 3 3 0.5 6 2 3
FARM24 66 36 0 0.5 5 5 0 0 0
FARM25 15 3 1 0 1.5 0.5 1 0 1
FARM26 26 5 2 1.5 2 2 1 0 0
FARM27 31 5 1.5 1 3 2 2 0 0
FARM28 37 2 3 2 3 5 3 0 5
FARM29 81 2 8 4 4 12 7 8 13
FARM30 14 10 0 0.5 3 1 0 0 0
FARM31 20 7 2 1 4 3 2 0 5
FARM32 26 7 2 1 2 2 2 0 2
FARM33 12 10 0.5 1 3 1 0 0 0
FARM34 18 35 4 3 3 3 4 0 0
FARM35 11 29 1 0.5 3 2 2 0 2
FARM36 50 29 5 3 5 4 4 8 4
FARM37 7 9 0 1 1 0 0 0 0
FARM38 26 9 2 1 3 0 0 0 0
FARM39 19 33 1 1.5 0 4 2 0 0
FARM40 43 33 3 3 4 7 4 3 0
FARM41 18 12 3 0 1 1 2 1 1
FARM42 64 20 3 5 2 2 4 0 6
FARM43 61 25 9 7 3 8 4 17 0
FARM44 18 3 0.5 0.5 2 2 0 0 4
FARM45 11 2 0.5 0 1.5 1.5 1 1 0
FARM46 30 3 4 2 4 0 4 2 0
FARM47 16 1.5 2 0.5 2 2 2 2 0
FARM48 46 1 0.75 1 3 2 0 0 2
FARM49 18 2 1.5 0.5 2 2 2 0 2
FARM50 81 3 12 1.5 10 8 11 14 15
FARM51 15 0 1.5 1.5 2.5 0 1 0 0
FARM52 26 11 3.5 2 4 0 2 2 2
FARM53 10 11 0 0 1.5 0 0 0 0
FARM54 40 12 5 3 6 1 8 17 10
FARM55 82 4 11 7 5 0.5 8 5 0
FARM56 40 5.5 6 4 2.5 1 3 0 2
FARM57 29 8 3 2 4 2 0 0 2
FARM58 23 5 5 4 3 1 1 0 0
FARM59 53 4 0 3 0 3 6 0 0
FARM60 57 3.5 9 8 0 0 10 23 0
FARM61 23 4 2 2 0.5 4 2 0 0
FARM62 9 31 2 2 0 2 1 0 0
FARM63 22 35 3 2 3 0 5 6 1
FARM64 25 35 3 1 2.5 0 4 8 10
FARM65 20 0 1.5 1 3 0 1 6 0
FARM66 27 41 1.1 0.25 1.5 1.5 0 3 1
FARM67 30 19 2 2 4 1 2 0 5
FARM68 77 18 8 4 6 4 6 8 6
FARM69 13 100 0.5 0.5 0 1 0 0 4
FARM70 24 100 2 3 0 0.5 3 14 10
FARM71 29 90 2 1.5 1.5 1.5 2 0 2
FARM72 57 90 10 7 0 1.5 7 8 7
;
run;

You need to transpose the values into a long format (one row per farm and variable) and use the group= option on the VBOX statement.
Steps
1 Sort by ID
2 Transpose the data
3 Adjust the labels for display
4 Plot with PROC SGPLOT
proc sort data=raw;
by id;
run;
proc transpose data=raw out=raw_t;
by id;
run;
data raw_t;
set raw_t;
label _name_ = "Variable";
label col1 = "Value";
run;
ods html;
title "My Box Plot";
proc sgplot data=raw_t;
vbox col1 / group=_name_ ;
run;
ods html close;
This produces a single plot with one box per variable (image not preserved).
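To make the reshaping step concrete, here is a minimal Python sketch of what the transpose does to the data. It is not SAS code and not part of the original answer; the function name wide_to_long is hypothetical, and the _NAME_ and COL1 keys mirror the default names PROC TRANSPOSE gives its output columns. Only the first two farms are included.

```python
# Illustrative only: mimics the wide-to-long reshape done by PROC TRANSPOSE BY id.
# wide_to_long is a hypothetical helper, not a SAS or library function.

def wide_to_long(rows, id_key="ID"):
    """Turn one dict per farm into one dict per (farm, variable) pair."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name == id_key:
                continue  # the ID identifies the row; it is not a measured variable
            long_rows.append({id_key: row[id_key], "_NAME_": name, "COL1": value})
    return long_rows

# First two observations from the data step above
raw = [
    {"ID": "FARM1", "Family": 12, "DistRd": 80, "Cotton": 1.5, "Maize": 1,
     "Sorg": 3, "Millet": 0.25, "Bull": 2, "Cattle": 0, "Goats": 1},
    {"ID": "FARM2", "Family": 54, "DistRd": 8, "Cotton": 6, "Maize": 4,
     "Sorg": 0, "Millet": 1, "Bull": 6, "Cattle": 32, "Goats": 5},
]

raw_t = wide_to_long(raw)
for record in raw_t[:3]:
    print(record)
```

Each farm contributes nine rows to the long table, and grouping the value column by the variable name is what lets a single plot statement draw one box per variable.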


How to write a LAMBDA function in Excel for this recursive calculation

I'm trying to come up with a LAMBDA formula that captures the following recursive calculation:
Column A has 40 rows with the integers from 1 to 40. Column B divides each integer in column A by 6 and rounds it up. Column C divides each integer in column B by 6 and rounds it up. This continues until the result is 1 or less, and then I want the sum of the derived columns (B onward) for each row. For example, for the number 25 in column A, I get 6 (5 from column B and 1 from column C). For the number 40 in column A, I get 10 (7 from column B, 2 from column C, 1 from column D).
Is it possible to write a LAMBDA function that returns the correct output for a given number in column A? I don't want to use VBA, only the LAMBDA function.
Data  Column 1  Column 2  Column 3  Column 4  Sum
1     0         0         0         0         1
2     1         0         0         0         1
3     1         0         0         0         1
4     1         0         0         0         1
5     1         0         0         0         1
6     1         0         0         0         1
7     2         1         0         0         3
8     2         1         0         0         3
9     2         1         0         0         3
10    2         1         0         0         3
11    2         1         0         0         3
12    2         1         0         0         3
13    3         1         0         0         4
14    3         1         0         0         4
15    3         1         0         0         4
16    3         1         0         0         4
17    3         1         0         0         4
18    3         1         0         0         4
19    4         1         0         0         5
20    4         1         0         0         5
21    4         1         0         0         5
22    4         1         0         0         5
23    4         1         0         0         5
24    4         1         0         0         5
25    5         1         0         0         6
26    5         1         0         0         6
27    5         1         0         0         6
28    5         1         0         0         6
29    5         1         0         0         6
30    5         1         0         0         6
31    6         1         0         0         7
32    6         1         0         0         7
33    6         1         0         0         7
34    6         1         0         0         7
35    6         1         0         0         7
36    6         1         0         0         7
37    7         2         1         0         10

Use BYROW and SCAN:
=BYROW(A1:A40,LAMBDA(c,SUM(SCAN(c,SEQUENCE(,4,6,0),LAMBDA(a,b,IF(a=1,0,ROUNDUP(a/b,0)))))))
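In the formula, SCAN starts from the cell value and steps through four copies of the divisor 6, replacing the running value with ROUNDUP(a/6,0) at each step, or with 0 once the running value has reached 1; SUM then adds up the intermediate results. The same logic can be sketched in Python to check expected values. This is an illustration rather than part of the original answer, and the function name ceil_chain_sum is hypothetical. Unlike the formula it loops until the value reaches 1 instead of stopping after four steps, which makes no difference for inputs up to 40. Note that it returns 0 for an input of 1, as the formula does, while the first row of the question's table shows 1.

```python
# Hypothetical helper mirroring the BYROW/SCAN formula; not an Excel API.

def ceil_chain_sum(n, divisor=6):
    """Repeatedly divide by `divisor`, rounding up, and sum the results
    until the running value is 1 or less."""
    total = 0
    while n > 1:
        n = -(-n // divisor)  # integer ceiling division, avoids float rounding
        total += n
    return total

for value in (2, 7, 25, 37, 40):
    print(value, ceil_chain_sum(value))
```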

Trying to webscrape table data from rotogrinders. Any help would be greatly appreciated

So far I have:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import pandas as pd
driver = webdriver.Firefox()
url = r"https://rotogrinders.com/game-stats/nba-player?site=fanduel&range=yesterday"
driver.get(url)
cookies = driver.find_element_by_xpath(r'//*[@id="bc-close-cookie"]').click()
select = driver.find_element_by_xpath(r'/html/body/div[1]/div/section/div/section/div[2]/div[2]/a[2]').click()
I need to scrape the table data to a .csv file.
Any suggestions?

The click fails because of the subscription pop-up.
When this error occurs, click "No thank you" on the pop-up before clicking the "All" button.

Selenium is overkill here because the data is embedded in the page's <script> tags. You can fetch the page with a simple request using requests, then either a) use BeautifulSoup to pull out the <script> tag and parse it, or b) pull the data out directly with a regex. I chose the latter.
The data comes in JSON format, so turning it into a table is relatively easy with pandas. You'll also use pandas to write the CSV file.
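import json
import re
import requests
import pandas as pd
url = 'https://rotogrinders.com/game-stats/nba-player?site=fanduel&range=yesterday'
response = requests.get(url)
# The table is embedded in the page as a JavaScript array: var data = [...]
# (raw string so the backslashes reach the regex engine unchanged)
p = re.compile(r"var data = (\[.*\])")
result = p.search(response.text)
jsonStr = result.group(1)
# Parse the array as JSON and load it into a DataFrame
jsonData = json.loads(jsonStr)
df = pd.DataFrame(jsonData)
# Write the table to CSV without the index column
df.to_csv('nba.csv', index=False)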
Output:
print(df.to_string())
id name fpts gp fgm fga ftm fta 3pm 3pa 2pm 2pa reb ast stl blk to pts oreb pfoul tfoul ffoul min fouls dd td usg pace fg% ft% 3p% 2p% team pos player
0 947 Lance Stephenson 6.70 1 1 1 2 2 0 0 1 1 1 3 0 0 3 4 0 1 0 0 17.80 0 0 0 13.19 5 1 1 0 1 IND SCW Lance Stephenson
1 1079 Stephen Curry 58.00 1 12 27 9 9 6 16 6 11 5 8 1 0 2 39 0 3 0 0 43.95 0 0 0 32.40 33.50 0.44 1 0.38 0.55 GSW CG Stephen Curry
2 1087 Chris Paul 51.50 1 8 14 2 2 2 6 6 8 5 11 2 1 0 20 0 3 0 0 36.68 0 1 0 20.19 15 0.57 1 0.33 0.75 PHO DIS Chris Paul
3 1277 Taj Gibson 12.50 1 1 3 0 0 0 0 1 3 5 1 0 1 0 2 1 4 0 0 17.95 0 0 0 7.42 2 0.33 0 0 0.33 NYK PB Taj Gibson
4 1334 Andre Iguodala 24.00 1 1 2 2 2 0 0 1 2 5 4 0 4 4 4 2 2 0 0 31.32 0 0 0 10.47 5 0.50 1 0 0.50 GSW 3DW Andre Iguodala
5 1485 JaVale McGee 10.80 1 3 5 2 2 0 0 3 5 4 0 0 0 2 8 1 3 0 0 17.82 0 0 0 17.69 7 0.60 1 0 0.60 PHO PB JaVale McGee
6 13301 Jonas Valanciunas 39.00 1 8 11 1 2 1 4 7 7 10 2 1 2 3 18 1 5 0 0 33.03 0 1 0 18.82 14 0.73 0.50 0.25 1 NOP PB Jonas Valanciunas
7 13312 Bismack Biyombo 19.30 1 2 3 5 7 0 0 2 3 4 1 2 0 2 9 2 3 0 0 27.90 0 0 0 12.06 6.50 0.67 0.71 0 0.67 PHO PB Bismack Biyombo
8 13315 Klay Thompson 13.90 1 6 17 0 0 0 7 6 10 2 1 0 0 2 12 1 1 0 0 23.10 0 0 0 33.47 18 0.35 0 0 0.60 GSW 3DW Klay Thompson
9 13335 Kemba Walker 11.40 1 1 5 3 4 0 3 1 2 2 2 1 0 2 5 0 2 1 0 21.17 0 0 0 17.80 9 0.20 0.75 0 0.50 NYK CG Kemba Walker
10 13353 Alec Burks 19.60 1 3 9 5 5 2 6 1 3 3 2 1 0 3 13 1 3 0 0 22.58 0 0 0 26.32 13.50 0.33 1 0.33 0.33 NYK CG Alec Burks
11 13913 Evan Fournier 10.10 1 2 8 1 2 1 6 1 2 3 1 0 0 1 6 1 1 0 0 25.15 0 0 0 16.24 9 0.25 0.50 0.17 0.50 NYK SHW Evan Fournier
12 13942 Jae Crowder 21.60 1 4 9 2 2 3 6 1 3 3 0 2 0 1 13 0 3 0 0 32.65 0 0 0 13.33 11 0.44 1 0.50 0.33 PHO 3DW Jae Crowder
13 13955 Jeremy Lamb 27.40 1 2 5 8 10 2 3 0 2 2 2 2 1 1 14 1 2 0 0 18.88 0 0 0 23.43 10 0.40 0.80 0.67 0 IND SCW Jeremy Lamb
14 14077 Garrett Temple 0.20 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 10.88 0 0 0 3.68 1 0 0 0 0 NOP SHW Garrett Temple
15 14559 Justin Holiday 23.50 1 6 13 0 0 4 9 2 4 5 3 0 0 3 16 1 2 0 0 34.22 0 0 0 19.86 15 0.46 0 0.44 0.50 IND SHW Justin Holiday
16 16879 Tim Hardaway Jr. 18.50 1 4 10 0 0 3 7 1 3 5 1 0 0 0 11 0 3 0 0 28.28 0 0 0 14.61 10 0.40 0 0.43 0.33 DAL SHW Tim Hardaway Jr.
17 16943 Reggie Bullock 11.70 1 1 3 0 0 1 3 0 0 6 1 0 0 0 3 1 1 0 0 20.18 0 0 0 6.60 2 0.33 0 0.33 0 DAL 3DW Reggie Bullock
18 18566 Dwight Powell 8.60 1 2 3 2 3 0 0 2 3 3 0 0 0 1 6 2 1 0 0 14.35 0 0 0 14.83 3.50 0.67 0.67 0 0.67 DAL VB Dwight Powell
19 18620 Andrew Wiggins 22.80 1 5 15 0 1 1 6 4 9 4 2 1 1 2 11 2 5 0 0 38.03 0 0 0 19.04 15.50 0.33 0 0.17 0.44 GSW SCW Andrew Wiggins
20 18632 Julius Randle 24.40 1 1 9 2 4 0 2 1 7 7 6 1 1 3 4 2 1 1 0 29.48 0 0 0 21.36 12 0.11 0.50 0 0.14 NYK VB Julius Randle
21 18899 Kristaps Porzingis 37.70 1 7 15 2 2 2 4 5 11 11 1 0 2 1 18 2 5 0 0 32.28 0 1 0 21.33 15 0.47 1 0.50 0.45 DAL VB Kristaps Porzingis
22 18941 Cameron Payne 18.60 1 4 10 3 3 1 4 3 6 3 0 1 0 0 12 0 1 1 0 18.93 0 0 0 23.92 11.50 0.40 1 0.25 0.50 PHO DIS Cameron Payne
23 18945 Kevon Looney 36.50 1 5 6 3 4 0 0 5 6 15 3 2 0 5 13 6 3 0 1 28.17 0 1 0 19.52 7 0.83 0.75 0 0.83 GSW PB Kevon Looney
24 18949 Devin Booker 41.00 1 11 25 5 5 1 8 10 17 5 6 0 0 2 28 0 2 1 0 38.30 0 0 0 32.56 29.50 0.44 1 0.13 0.59 PHO SCW Devin Booker
25 31814 Nemanja Bjelica 22.20 1 4 5 0 0 0 1 4 4 6 2 1 1 2 8 1 1 0 0 14.13 0 0 0 21.68 6 0.80 0 0 1 GSW VF Nemanja Bjelica
26 35227 Brandon Ingram 30.00 1 4 10 6 8 1 1 3 9 5 6 1 0 3 15 0 2 0 0 26.88 0 0 0 27.53 17 0.40 0.75 1 0.33 NOP SCW Brandon Ingram
27 35995 Damion Lee 14.00 1 3 5 2 2 1 2 2 3 0 0 2 0 1 9 0 3 0 0 14 0 0 0 19.66 7 0.60 1 0.50 0.67 GSW CG Damion Lee
28 36032 Dorian Finney-Smith 14.20 1 2 6 0 0 1 5 1 1 6 2 0 0 1 5 1 4 0 0 32 0 0 0 9.57 6 0.33 0 0.20 1 DAL 3DW Dorian Finney-Smith
29 36041 Gary Payton II 18.00 1 3 5 0 1 0 2 3 3 5 0 2 0 0 6 1 1 1 0 17.40 0 0 0 12.51 4.50 0.60 0 0 1 GSW CG Gary Payton II
30 37796 Frank Ntilikina 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1.70 0 0 0 0.00 0 0 0 0 0 DAL DIS Frank Ntilikina
31 37845 Maxi Kleber 22.00 1 3 9 2 2 1 7 2 2 5 4 0 1 2 9 1 2 0 0 27.13 0 0 0 19.46 11 0.33 1 0.14 1 DAL VB Maxi Kleber
32 37861 Torrey Craig 32.90 1 5 10 0 0 2 5 3 5 7 1 3 1 1 12 0 3 0 0 34.22 0 0 0 13.24 11 0.50 0 0.40 0.60 IND 3DW Torrey Craig
33 37889 Josh Hart 34.40 1 5 10 5 9 2 4 3 6 7 4 1 0 0 17 1 4 0 0 35.28 0 0 0 17.32 13.50 0.50 0.56 0.50 0.50 NOP SHW Josh Hart
34 408821 Mitchell Robinson 37.00 1 6 7 5 10 0 0 6 7 15 0 0 1 1 17 7 3 0 0 30.05 0 1 0 16.51 6 0.86 0.50 0 0.86 NYK PB Mitchell Robinson
35 408841 Landry Shamet 4.20 1 0 2 0 0 0 2 0 0 1 0 1 0 0 0 0 3 0 0 10.07 0 0 0 7.94 2 0 0 0 0 PHO SHW Landry Shamet
36 408842 Mikal Bridges 42.60 1 5 11 2 2 0 4 5 7 8 6 4 0 0 12 2 1 0 0 37.18 0 0 0 14.91 10 0.45 1 0 0.71 PHO 3DW Mikal Bridges
37 408968 Devonte' Graham 19.10 1 5 16 1 1 4 10 1 6 3 1 0 0 1 15 0 1 0 0 28.03 0 0 0 25.36 17.50 0.31 1 0.40 0.17 NOP CG Devonte' Graham
38 408971 Luka Doncic 44.60 1 9 23 8 11 2 9 7 14 8 8 1 0 8 28 1 1 0 0 38.13 0 0 0 40.37 35.50 0.39 0.73 0.22 0.50 DAL CG Luka Doncic
39 409062 Jalen Brunson 20.50 1 8 13 2 2 1 2 7 11 5 1 0 0 6 19 2 3 0 0 34.75 0 0 0 23.26 18 0.62 1 0.50 0.64 DAL DIS Jalen Brunson
40 570141 Gary Clark 3.50 1 1 5 0 0 1 4 0 1 0 1 0 0 1 3 0 0 0 0 7.95 0 0 0 31.85 6 0.20 0 0.25 0 NOP VF Gary Clark
41 1115150 RJ Barrett 30.20 1 6 13 4 7 1 5 5 8 6 2 0 1 0 17 1 1 0 0 27.98 0 0 0 23.93 15.50 0.46 0.57 0.20 0.63 NYK SCW RJ Barrett
42 1115261 Goga Bitadze 32.30 1 5 12 2 5 1 4 4 8 9 5 0 1 2 13 4 2 2 0 31.35 0 0 0 22.78 12.50 0.42 0.40 0.25 0.50 IND VB Goga Bitadze
43 1115443 Jaxson Hayes 8.90 1 3 4 0 0 0 0 3 4 2 1 0 0 1 6 0 5 0 0 16.47 0 0 0 12.94 5 0.75 0 0 0.75 NOP PB Jaxson Hayes
44 1115445 Nickeil Alexander-Walker 13.60 1 1 4 2 2 0 0 1 4 3 4 0 0 0 4 0 3 0 0 24.42 0 0 0 10.16 5 0.25 1 0 0.25 NOP CG Nickeil Alexander-Walker
45 1115568 Jordan Poole 13.40 1 1 7 3 4 0 5 1 2 2 4 0 0 0 5 0 2 0 0 24.67 0 0 0 16.34 9 0.14 0.75 0 0.50 GSW CG Jordan Poole
46 1115625 Cam Johnson 19.90 1 2 7 2 2 1 6 1 1 7 1 1 0 0 7 0 2 0 0 20.47 0 0 0 16.04 8 0.29 1 0.17 1 PHO SHW Cam Johnson
47 1311751 Oshae Brissett 12.70 1 0 7 2 4 0 3 0 4 6 1 1 0 1 2 0 1 0 0 21.98 0 0 0 18.36 10 0 0.50 0 0 IND SHW Oshae Brissett
48 1333886 Juan Toscano-Anderson 15.30 1 2 5 0 0 1 3 1 2 4 1 2 0 2 5 1 4 0 0 14.87 0 0 0 19.72 6 0.40 0 0.33 0.50 GSW SHW Juan Toscano-Anderson
49 2439136 Obi Toppin 5.40 1 0 1 0 0 0 1 0 0 2 0 0 1 0 0 0 1 0 0 18.52 0 0 0 2.16 1 0 0 0 0 NYK VB Obi Toppin
50 2439366 Josh Green 4.70 1 1 3 0 0 0 0 1 3 1 1 0 0 0 2 0 1 0 0 11.18 0 0 0 11.91 3 0.33 0 0 0.33 DAL 3DW Josh Green
51 2439514 Immanuel Quickley 31.30 1 4 13 4 4 2 9 2 4 4 5 3 0 4 14 0 1 0 0 23.28 0 0 0 35.07 19 0.31 1 0.22 0.50 NYK SHW Immanuel Quickley
52 3005116 Quentin Grimes 16.20 1 5 9 0 0 3 6 2 3 1 2 0 0 1 13 1 4 0 0 23.83 0 0 0 17.89 9 0.56 0 0.50 0.67 NYK 3DW Quentin Grimes
53 3005227 Isaiah Jackson 33.90 1 5 12 5 8 0 1 5 11 7 1 3 0 0 15 3 5 0 0 18.63 0 0 0 34.03 13 0.42 0.63 0 0.45 IND VF Isaiah Jackson
54 3005228 Chris Duarte 48.90 1 10 16 5 5 2 3 8 13 7 3 3 0 0 27 2 3 0 0 38.53 0 0 0 19.92 16.50 0.63 1 0.67 0.62 IND SCW Chris Duarte
55 3005417 Herb Jones 28.80 1 5 7 0 0 1 3 4 4 4 4 3 0 2 11 1 3 0 0 37.08 0 0 0 11.13 8 0.71 0 0.33 1 NOP 3DW Herb Jones
56 3007111 Jonathan Kuminga 14.50 1 0 0 5 6 0 0 0 0 5 3 0 0 1 5 1 3 1 0 15.37 0 0 0 12.05 3 0 0.83 0 0 GSW VF Jonathan Kuminga
57 3014026 Keifer Sykes 17.30 1 4 12 0 0 2 5 2 7 4 3 0 0 2 10 2 3 0 0 31.02 0 0 0 19.33 12 0.33 0 0.40 0.29 IND DIS Keifer Sykes
58 3015595 Duane Washington 8.50 1 3 7 0 0 2 2 1 5 0 1 0 0 1 8 0 1 0 0 18.37 0 0 0 18.14 8 0.43 0 1 0.20 IND SHW Duane Washington
59 3015773 Jose Alvarado 31.00 1 6 9 0 0 1 2 5 7 0 4 4 0 0 13 0 3 0 0 19.97 0 0 0 20.67 9 0.67 0 0.50 0.71 NOP DIS Jose Alvarado
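The extraction step can be tried offline. The sketch below applies the same regex to a small stand-in string instead of the live page and guards against the pattern not being found, which the snippet above does not do. SAMPLE_HTML, extract_rows, and rows_to_csv are hypothetical names for illustration, the sample keeps only three of the real columns, and the standard csv module stands in for pandas so the example has no third-party dependencies.

```python
import csv
import io
import json
import re

# Hypothetical stand-in for response.text; the real page is much larger.
SAMPLE_HTML = """
<html><body><script>
var data = [{"id": 947, "name": "Lance Stephenson", "fpts": 6.7}, {"id": 1079, "name": "Stephen Curry", "fpts": 58.0}];
</script></body></html>
"""

def extract_rows(html):
    """Return the list embedded as `var data = [...]`, or [] if it is absent."""
    match = re.search(r"var data = (\[.*\])", html)
    if match is None:
        return []
    return json.loads(match.group(1))

def rows_to_csv(rows):
    """Serialise a list of dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

rows = extract_rows(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Returning an empty list when the pattern is missing keeps the script from failing with an AttributeError if the site changes how it embeds the data.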

How to return first item when the items in the pandas dataframe window are the same?

I am a Python beginner.
I have the following pandas DataFrame with only two columns, "Time" and "Input".
I want to loop over the "Input" column using a window size of w = 3 (three consecutive values). For every window, if all the elements in that window are 1s, I want to keep the first element as 1 and change the remaining values to 0s.
index Time Input
0 11 0
1 22 0
2 33 0
3 44 1
4 55 1
5 66 1
6 77 0
7 88 0
8 99 0
9 1010 0
10 1111 1
11 1212 1
12 1313 1
13 1414 0
14 1515 0
My intended output is as follows
index Time Input What_I_got What_I_Want
0 11 0 0 0
1 22 0 0 0
2 33 0 0 0
3 44 1 1 1
4 55 1 1 0
5 66 1 1 0
6 77 1 1 1
7 88 1 0 0
8 99 1 0 0
9 1010 0 0 0
10 1111 1 1 1
11 1212 1 0 0
12 1313 1 0 0
13 1414 0 0 0
14 1515 0 0 0
What should I do to get the desired output? Am I missing something in my code?

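import re
import pandas as pd
# Join the 0/1 flags into one string, rewrite each non-overlapping "111" as "100",
# then split the string back into single digits and convert them to integers.
pd.Series(list(re.sub('111', '100', ''.join(df.Input.astype(str))))).astype(int)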
Out[23]:
0 0
1 0
2 0
3 1
4 0
5 0
6 1
7 0
8 0
9 0
10 1
11 0
12 0
13 0
14 0
dtype: int32
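The regex trick above is tied to a window of exactly three single-digit values. As a sketch of the same idea for any window size, the plain-Python function below scans left to right and collapses each non-overlapping run of w ones, the same way re.sub handles matches. The function name collapse_windows is hypothetical, and the input list is the "Input" column from the intended-output table.

```python
# Hypothetical helper; pure Python so it runs without pandas.

def collapse_windows(values, w=3):
    """For each non-overlapping window of w consecutive 1s (scanning left to
    right), keep the first 1 and set the rest of the window to 0."""
    out = list(values)
    i = 0
    while i + w <= len(out):
        if all(v == 1 for v in out[i:i + w]):
            for j in range(i + 1, i + w):
                out[j] = 0
            i += w  # jump past the window so windows never overlap
        else:
            i += 1
    return out

inputs = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
result = collapse_windows(inputs)
print(result)
```

With a DataFrame, the result could be assigned back with something like df["Output"] = collapse_windows(df["Input"].tolist()), where the column name "Output" is only an example.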

how to shift column labels to left python

I have a dataframe and I want to shift the column names to the left, starting from a specific column. The original dataframe has many columns, so I cannot do this by renaming the columns manually.
df=pd.DataFrame({'A':[1,3,4,7,8,11,1,15,20,15,16,87],
'H':[1,3,4,7,8,11,1,15,78,15,16,87],
'N':[1,3,4,98,8,11,1,15,20,15,16,87],
'p':[1,3,4,9,8,11,1,15,20,15,16,87],
'B':[1,3,4,6,8,11,1,19,20,15,16,87],
'y':[0,0,0,0,1,1,1,0,0,0,0,0]})
print((df))
A H N p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
Here I want to remove the label N. The dataframe after removing label N:
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
Required output:
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
The last column can be ignored.
Note: the original dataframe has many columns, so I cannot rename them by hand; I need an automatic way to shift the column names left.

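You can blank out the label and use a sort to push the empty label to the end (Python's sort is stable, so the other labels keep their order):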
df.columns=sorted(df.columns.str.replace('N',''),key=lambda x : x=='')
df
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0

Replace the columns with your own custom list.
>>> cols = list(df.columns)
>>> cols.remove('N')
>>> df.columns = cols + ['']
Output
>>> df
A H p B y
0 1 1 1 1 1 0
1 3 3 3 3 3 0
2 4 4 4 4 4 0
3 7 7 98 9 6 0
4 8 8 8 8 8 1
5 11 11 11 11 11 1
6 1 1 1 1 1 1
7 15 15 15 15 19 0
8 20 78 20 20 20 0
9 15 15 15 15 15 0
10 16 16 16 16 16 0
11 87 87 87 87 87 0
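Both answers come down to the same list manipulation: drop one label and pad the end with an empty one so the number of labels still matches the number of columns. A minimal sketch of that step as a reusable function follows; shift_labels_left and its filler parameter are hypothetical names, and it works on a plain list so it runs without pandas.

```python
# Hypothetical helper; operates on the list of labels only.

def shift_labels_left(labels, drop, filler=""):
    """Remove `drop` from the labels and append `filler` so the length is unchanged.
    Raises ValueError if `drop` is not one of the labels."""
    shifted = list(labels)
    shifted.remove(drop)
    return shifted + [filler]

new_labels = shift_labels_left(["A", "H", "N", "p", "B", "y"], "N")
print(new_labels)
```

With a DataFrame this would be applied as df.columns = shift_labels_left(df.columns, 'N'). Unlike the str.replace approach, it only touches a label that matches exactly, so other labels that merely contain the letter N are left alone.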

Sum the values from the days in a specific week in Excel

I have some rows of data and some columns with dates, as shown below.
I want the sum per week for each row. The tricky part is that not every week has 5 days; some weeks may have only 3. So I would like to work from the week number and sum on that.
Can anyone help me with a formula (or a VBA macro)?
I am completely lost after trying several approaches.
18-May-15 19-May-15 20-May-15 21-May-15 22-May-15 25-May-15 26-May-15 27-May-15 28-May-15 29-May-15 1-Jun-15 2-Jun-15 3-Jun-15 4-Jun-15 WEEK 1 TOTAL WEEK 2 TOTAL
33 15 10 19 18 8 10 15 10 29 16 24 8 26 74
18 11 8 17 0 6 16 9 16 16 36 9 6 4 55
0 0 1 0 0 1 0 0 1 0 0 3 3 2 8
30 7 4 8 8 11 10 3 0 11 3 4 5 6 18
0 0 0 11 0 0 0 1 0 7 8 1 1 2 12
1 1 4 0 5 1 6 2 1 4 2 4 5 4 15
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
52 27 22 36 23 15 32 26 27 49 54 37 19 34 144
30 50 25 21 34 12 33 32 26 43 54 43 18 32 147
0 0 1 0 3 0 0 0 0 0 0 0 0 0 0
29 5 3 4 4 1 1 2 4 4 3 4 2 3 12
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 4 1 10 9 0 0 0 0 0 1 1 2
1 2 0 0 0 0 0 1 3 0 0 0 2 2 4
15 29 5 17 16 4 18 20 12 28 25 22 4 23 74
11 15 11 3 15 7 11 9 5 12 18 10 5 7 40
1 0 2 1 1 0 0 1 8 1 4 3 2 0 9
3 6 7 0 2 1 4 2 1 2 7 8 7 2 24
21 21 21 21 21 22 22 22 22 22 23 23 23 23

Using SUMIF is one way, but you need to get your absolute and relative references straight so the formula is easy to fill across and down.
For example:
=SUMIF(Weeknums,M$1,$B2:$K2)
where Weeknums is the row of calculated week numbers.
The column headers showing the week number to be summed could also be made more explanatory with custom formatting.
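To check what the per-week totals should be, the same grouping can be sketched in Python. This is an illustration rather than part of the answer: weekly_totals is a hypothetical name, and ISO week numbers are assumed, which agree with the 21, 22, 23 row in the question for these dates. The values are the first data row of the question.

```python
from collections import defaultdict
from datetime import date, timedelta

def weekly_totals(dates, values):
    """Sum values by week number, the way SUMIF matches on a row of week numbers."""
    totals = defaultdict(int)
    for day, value in zip(dates, values):
        week = day.isocalendar()[1]  # ISO week number (an assumption, see above)
        totals[week] += value
    return dict(totals)

# Weekdays from 18-May-15 to 4-Jun-15, matching the column headers
start = date(2015, 5, 18)
all_days = [start + timedelta(days=n) for n in range(18)]
dates = [d for d in all_days if d.weekday() < 5]

# First data row from the question
values = [33, 15, 10, 19, 18, 8, 10, 15, 10, 29, 16, 24, 8, 26]

totals = weekly_totals(dates, values)
print(totals)
```

Because the grouping is driven by the week number rather than by position, a short week such as the four-day week 23 needs no special handling.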

I know you've already accepted an answer, but just to show you an alternative:
If you transposed your data, you could use pivot tables.
You could set up a calculated field to calculate exactly what you want, and depending on how you sort or group the dates, you could summarise by weeks, months, quarters, or even years.
You would then get all of your final values displayed in an easy-to-read format, grouped by whatever you want. In my opinion this is a much more powerful solution in the long run.
