Excel get data from web from dynamic table - excel

I am trying to pull the full list of player data from https://stats.nba.com/players/traditional/?sort=PLAYER_NAME&dir=-1&Season=2019-20&SeasonType=Regular%20Season. However, the table is dynamic (URL doesn't change) so when I set up the connection Excel only scrapes the first 50 rows. It does not recognize that there are 6 other pages within the table that I need to scrape as well.
Does anyone know how to use the "Get Data" -> "From Web" capability in Excel to import data from a dynamic table like the one shown above?

Instead of referencing the hosting page, why not use the endpoint that returns the JSON data populating the table? You just need to match the resultSets headers to the array positions within the rowSet entries.
Edit: I found resources that explain the NBA REST API here: http://nbasense.com/nba-api/Stats/Stats/Players/AllPlayers . Take some time to review what's available. Any of those endpoints can be consumed by Excel the way you are trying to.
Example:
https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=&Weight=
Sample of a rowSet entry showing Aaron Gordon. Per the resultSets headers, the 2nd field is the name and the 5th is the age, which matches the table, and the response includes all players, not just page 1.
"rowSet": [ [ 203932, "Aaron Gordon", 1610612753, "ORL", 24.0, 1, 1, ...
For brevity, that's just a sample and far from all of the info it returns. You can click that link and see the JSON data it's returning within your browser.
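Matching headers to row positions can be sketched in JavaScript as below. The header names and the trimmed response object are assumptions for illustration; only the Aaron Gordon values come from the sample above, and the real response carries many more fields.

```javascript
// Trimmed, hypothetical response: the header names are assumed, and only the
// first seven fields of the Aaron Gordon row from the sample are included.
const sampleResponse = {
  resultSets: [
    {
      headers: ["PLAYER_ID", "PLAYER_NAME", "TEAM_ID", "TEAM_ABBREVIATION", "AGE", "GP", "W"],
      rowSet: [[203932, "Aaron Gordon", 1610612753, "ORL", 24.0, 1, 1]],
    },
  ],
};

// Turn each positional row into an object keyed by the matching header.
function rowsToObjects(resultSet) {
  return resultSet.rowSet.map((row) =>
    Object.fromEntries(resultSet.headers.map((header, i) => [header, row[i]]))
  );
}

const players = rowsToObjects(sampleResponse.resultSets[0]);
console.log(players[0].PLAYER_NAME, players[0].AGE);
```

The same header-to-position mapping is what Excel would need to apply when expanding the rowSet lists into columns.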

Related

Excel - VLookup to return values in dropdown list

I am currently trying to display some manager/employee names based on business unit.
Each Business Unit could have multiple managers and multiple employees.
My question is can VLookup or any other method return a drop down list to select a certain manager/employee based on the Business unit selected?
Please see image below to see the layout and expected output.
I am hoping to use three drop-down menus: when the business unit is selected, the first manager and employee in the list should populate automatically, but both should also have a drop-down menu so that other managers or employees can be selected.
Thank you.
screenshots
Please find my Excel sheet, in which I modelled a possible solution, via the following link (shared on OneDrive):
https://cronos-my.sharepoint.com/:f:/g/personal/oortsja_cronos_be/EuUIF6pW95xGtcA0gQjwtIkB_x4LCc8oWks9VwoVTfrhJA?e=7fO6Dz
To summarize how I got to this solution:
I made different tables based on the data you provided (Business Unit > Manager > Employee). Using Name Manager (see example), I gave those tables specific names that relate to their respective Business Unit > Manager > Employee.
Using =INDIRECT(), I reference those tables based on the names I gave them. E.g. table Ireland (Business Unit) contains the values "John" and "Keith". Based on that output, =INDIRECT() with "John", for example, references table John (Manager), which contains the value "Mary" (Employee).
In short, the key to my solution is using =INDIRECT() to reference multiple named tables; VLOOKUP doesn't suit your needs in this specific case.

Get filtered data from Google Sheets API

I am using Node.js to fetch data from a Google Sheet, and the URL looks like this:
var url = `https://sheets.googleapis.com/v4/spreadsheets/${sheet_key}/values/Sheet2!A1:J20?key=${google_API_key}`
With this, I only get the data between A1 and J20. So now I have two questions:
How do I get all the rows from the sheet, or the last 10?
How do I apply a structured query filter like: where name == "Himanshu"?
Edited :
For question 2: I am now using the Query Language Reference (Version 0.7) / structured queries, as pointed out by @Tanaike. This is how my URL looks now, and it works.
https://docs.google.com/a/google.com/spreadsheets/d/${sheet_key}/gviz/tq?tq=select%20*%20where%20B%20%3D%20'Himanshu'&key=${google_API_key}
But the issue is that it returns a string like the following, which I am not able to parse.
google.visualization.Query.setResponse({"version":"0.6","reqId":"0","status":"ok","sig":"509770406","table":{"cols":[{"id":"A","label":"Response Path","type":"string"},{"id":"B","label":"Name","type":"string"}]}})
How about this answer?
A1
You can retrieve all rows using sheet name as the range as follows.
var url = `https://sheets.googleapis.com/v4/spreadsheets/${sheet_key}/values/Sheet2?key=${google_API_key}`
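If only the last 10 rows are needed, one option is to fetch the whole sheet as above and slice the result in code. The sketch below assumes the response body has the shape `{ range, majorDimension, values }` with `values` as an array of rows, and that the first row is a header row; `lastRows` and the sample data are hypothetical.

```javascript
// Return the header row plus the last `count` data rows of a values response.
function lastRows(valueRange, count) {
  const rows = valueRange.values || []; // guard against a missing `values` field
  if (rows.length === 0) return [];
  const [header, ...data] = rows;
  // slice(-0) would return every row, so handle non-positive counts explicitly.
  const tail = count > 0 ? data.slice(-count) : [];
  return [header, ...tail];
}

// Hypothetical response body for illustration.
const sample = {
  range: "Sheet2!A1:B4",
  majorDimension: "ROWS",
  values: [
    ["Response Path", "Name"],
    ["/a", "Himanshu"],
    ["/b", "Asha"],
    ["/c", "Ravi"],
  ],
};

console.log(lastRows(sample, 2));
```

Calling `lastRows(body, 10)` on the parsed response body would keep the header and the final 10 data rows.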
A2
You can run a query like where name == "Himanshu" using the Query Language. In order to use it, please share the Spreadsheet as follows.
On Google Drive
On the Spreadsheet file
right-click -> Share -> Advanced -> Click "change" at "Private - Only you can access"
Check "On Anyone with the link"
Click "Save"
At "Link to share", copy URL.
Retrieve file ID from https://docs.google.com/spreadsheets/d/### file ID ###/edit?usp=sharing
For details about the Query Language, see the Query Language Reference.
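The query text has to be URL-encoded before it goes into the tq parameter. A minimal sketch, assuming the gviz/tq URL form used in the question and that column B holds the name; `buildQueryUrl` and the `SHEET_KEY` placeholder are hypothetical.

```javascript
// Build a query URL for a shared spreadsheet from a Query Language string.
function buildQueryUrl(sheetKey, query) {
  const base = `https://docs.google.com/spreadsheets/d/${sheetKey}/gviz/tq`;
  // encodeURIComponent escapes spaces and "=" but leaves "*" and "'" as they are.
  return `${base}?tq=${encodeURIComponent(query)}`;
}

const queryUrl = buildQueryUrl("SHEET_KEY", "select * where B = 'Himanshu'");
console.log(queryUrl);
```

Note that the Query Language compares with a single `=` and refers to columns by letter, so `where name == "Himanshu"` becomes `where B = 'Himanshu'`.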
If I misunderstand your question, I'm sorry.
Edit 1 :
When you want to retrieve the data in a specific format, please use tqx=out:. In your case, tqx=out:json is used, which seems to be the default. For example, if you want CSV, use tqx=out:csv. You can also use tqx=out:html. I think that tqx=out:csv may be useful for your situation.
https://docs.google.com/a/google.com/spreadsheets/d/${sheet_key}/gviz/tq?tqx=out:csv&tq=select%20*%20where%20B%20%3D%20'Himanshu'&key=${google_API_key}
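If the default response is kept instead of CSV, the `google.visualization.Query.setResponse(...)` wrapper can be stripped before parsing. A minimal sketch using the response string from the question; `parseGvizResponse` is a hypothetical helper, and it assumes the text between the outermost parentheses is valid JSON.

```javascript
// Strip the JSONP-style wrapper and parse the JSON payload inside it.
function parseGvizResponse(text) {
  const start = text.indexOf("(");
  const end = text.lastIndexOf(")");
  if (start === -1 || end <= start) {
    throw new Error("Unexpected response format");
  }
  return JSON.parse(text.slice(start + 1, end));
}

// Response string copied from the question.
const raw =
  'google.visualization.Query.setResponse({"version":"0.6","reqId":"0","status":"ok","sig":"509770406","table":{"cols":[{"id":"A","label":"Response Path","type":"string"},{"id":"B","label":"Name","type":"string"}]}})';

const parsed = parseGvizResponse(raw);
console.log(parsed.table.cols.map((col) => col.label));
```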
Edit 2 :
In order to retrieve JSON data of spreadsheet, please do as follows.
On Spreadsheet
Click File -> Publish to the web
Publish as web page.
URL 1
You can retrieve the values of spreadsheet as JSON using the following URL.
https://spreadsheets.google.com/feeds/cells/${sheet_key}/od6/public/values?alt=json
od6 means 1st page of spreadsheet.
URL 2
If you want to retrieve other pages, please check the list of worksheets using the following URL.
https://spreadsheets.google.com/feeds/worksheets/${sheet_key}/public/basic?alt=json
Note :
If an error occurs when you access the URLs, please check again whether the spreadsheet is published.

Spotfire - Adding dynamic date restriction to data table which is based on an information link. Spotfire 6.0.1

I have an information link that I want to restrict when I add it as a data table in Spotfire, so that certain data is excluded. I want to restrict the column 'DAY' to the past 91 days.
These are the steps I have tried that haven't worked:
Added data table and clicked 'load on demand' (in the 'Add Data Tables' window) and then 'settings'
On the 'DAY' column, clicked 'Define Input'
Chose 'Range(fixed/properties/expression)' as the 'Input' for the selected parameter
Then as the 'expression' for the 'Min', used: DateAdd('dd',-91,DateTimeNow())
It returns an error when I try to add a transformation to the data, or just returns no data when I add the data table. If I restrict the data with a fixed value it works as expected, but clearly this would mean changing the restriction every day. I have also been able to restrict the data to a static date directly on the information link under the 'Filters' heading. What I really need is a dynamic restriction that is applied in Spotfire rather than directly on the data source (Oracle).
Would be grateful for any help! Thanks!
Spotfire couldn't treat DateTimeNow(), which returns a DateTime, as a Date. Spotfire gets kind of picky about that sort of thing.
Replace the Expression used for Min with
DateAdd("dd",-91,Date(DateTimeNow()))
and it should work.

excel lookup that alters URL for web based data pull

I've been trying to figure this out for 5 days, searched this site, watched YouTube tutorials, and it's just not coming together for me. I know very little Excel and no Visual Basic.
I need to be able to pull specific info from a website and populate an excel sheet with that info.
User-entered data (the variable?) is an email address. I have a long list of email addresses. From this list I want to generate the web data pull.
My Excel sheet is currently set up as a form: column 1 = email, which is the info I have. What I want to pull goes in columns 2-8: 2 = ID number, 3 = first name, 4 = last name, etc.
The site I am pulling from is an internal API, looks like: http://blah.web.blah.com/blah/blah/blah/emailAddress. This site displays each value that I seek as:
<id>12345</id>, <firstName>Joe</firstName>
The site has over 25 lines of info tagged like this. I am only interested in pulling 7 specific lines into the appropriate column and row, based on the email value in column 1.
I can easily capture all 25+ lines of info, one at a time.
=HYPERLINK("blah.web.blah.com/blah/blah/blah/"&A3)
I would prefer to do this as a batch where I paste the email addresses into Column 1 and walk away while the magic computer executes the batch.
...this seems like it should be easy, but I don't know how to do it and haven't found a solution that starts to function. And like a seasoned Russian once said, "What is hard? Everything you do not know."
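If you have Excel 2013 or later, you can use the FILTERXML function, typically combined with WEBSERVICE to fetch the XML. For example, assuming the email is in A3, =FILTERXML(WEBSERVICE("http://blah.web.blah.com/blah/blah/blah/"&A3),"//id") would return the id value, and "//firstName" the first name.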

Filling rows of a repeating table on opening the form

In my InfoPath form I use a repeating table. When the form is opened on SharePoint, I would like some rows of the repeating table to be filled in using information from another list. I use content types.
What I am working on is a timesheet system where the user can register how many hours a week he worked on different projects.
When a timesheet is created, I would like some predefined projects to be inserted already, meaning that the repeating table will have, for example, 5 rows with the 5 favourite or most used projects selected based on a separate PetProject list.
When I looked at the workflow in the list where the timesheet is created, I couldn't find the column projectname in the dropdown, so I can't give it a value. When I looked in the Form settings of TimeSheets, I saw that projectname can't be selected or edited; it's in plain black, whereas the other columns are blue and clickable. I thought that's probably because the value of projectname is merged from the different rows in the repeating table.
Is there any way I can work around this problem and assign a value to projectname when the timesheet is created?
Thank you so much!
I think that you will need to write some code to query the data that you are after and add it to new rows in the repeating table.
There is a loading event that you can hook into to query a secondary data source and then add the rows to the repeating table.
Will this run within InfoPath as a thick client, or will it run as a browser-based form using InfoPath Forms Services?
