REST API pagination and multiple loops in Power Query / Power BI

Background:
I want to extract all data from a REST API, but the issue is that the URL returns only 50 rows per ID.
So far I've been able to create a function in Power Query which lets me pull data for each AGENT-ID.
URL:
https://vcc-na8.8x8.com/api/stats/agents/{AGENT-ID}/activities?n=1
Please find the function below:
(id as text) as table =>
let
    // Request one page (50 rows) of this agent's activities
    Source = Xml.Tables(
        Web.Contents(
            "https://vcc-na8.8x8.com",
            [
                RelativePath = "/api/stats/agents/" & id & "/activities",
                Query = [n = "1"]
            ]
        )
    )
in
    Source
Then I invoked this function against the Agent ID column in a new table and received all the IDs, each with 50 rows.
Issue:
It returns only 50 rows per ID. I need to add pagination to the above function so that it returns all of the rows for all of these IDs.
Please find the API documentation below.
URL: https://support.8x8.com/cloud-contact-center/virtual-contact-center/developers/8x8-contact-center-statistics-reporting-api#Testing_Using_A_Browser
Let me know if someone has a way to solve this.
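For reference, here is a hedged sketch of the paging loop in Python. It assumes, from the ?n=1 URL and the 50-row pages, that n is a 1-based page index; verify that against the linked docs. The activity element name and the auth mechanism are placeholders. The same stop-when-the-page-is-short loop translates to List.Generate in M.
import requests
import xml.etree.ElementTree as ET

def all_activities(agent_id, auth):
    """Collect every page of one agent's activities, 50 rows at a time."""
    page, activities = 1, []
    while True:
        resp = requests.get(
            f"https://vcc-na8.8x8.com/api/stats/agents/{agent_id}/activities",
            params={"n": page},   # assumed: n is a 1-based page index
            auth=auth,            # placeholder; use whatever 8x8 expects
            timeout=30,
        )
        resp.raise_for_status()
        # "activity" is an assumed element name; adjust to the real XML.
        batch = ET.fromstring(resp.content).findall(".//activity")
        activities.extend(batch)
        if len(batch) < 50:       # a short or empty page is the last one
            return activities
        page += 1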

Related

Problem with OData query in Microsoft Flow for GetItems action on a SharePoint list having 20000+ rows

I have created a Microsoft Flow (Power Automate) to retrieve records from a SharePoint list. The list has 22,300 records in it. I have written the below OData filter query. The column (Email) is of Person type.
Email/EMail eq 'me.someone@company.com'
But none of the records are retrieved. However, if I give the below query, the same record is retrieved.
ID eq 22102
The list contains the below records:
ID      Email
22102   Me Someone
2       Another Person
However, if I query for the email ID of the 2nd record (another.person@company.com), it is retrieved fine.
Email/EMail eq 'another.person@company.com'
I am wondering: is there any limitation on retrieving records beyond 20,000? I am really surprised that the second record is retrieved but not the first. Or do I have to tweak any pagination settings?
When retrieving by ID it works fine, while for the Person column it does not. Is there any indexing I need to enable for the query to work beyond 20,000 items?
If you have more than 5000 items matching this person/email "me.someone@company.com", the filter condition will fail. This is a limitation of SharePoint and the SharePoint REST APIs.
If you have fewer than 5000 items for this particular email/person, you can try the below suggestions:
Add an index on this person or group column
Set Top Count to 5000 in the Get items action
Enable pagination in the settings of the Get items action and provide the number of items you want to fetch
You can fetch a maximum of 100,000 items using the Get items action; under the hood it pages through the list, roughly as in the sketch below.
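For reference, the pagination setting makes Flow page through the list much like this hedged Python sketch against the SharePoint REST API; the site URL and list name are placeholders, and authentication (which Flow otherwise handles for you) is omitted.
import requests

# Placeholder site and list; a real call also needs an auth token.
url = ("https://contoso.sharepoint.com/sites/hr/_api/web/lists/"
       "getbytitle('Employees')/items")
params = {
    "$select": "ID,Title,Email/EMail",
    "$expand": "Email",             # Person fields live behind $expand
    "$filter": "Email/EMail eq 'me.someone@company.com'",
    "$top": "5000",                 # the per-page ceiling noted above
}

session = requests.Session()
session.headers["Accept"] = "application/json;odata=nometadata"

items = []
while url:
    data = session.get(url, params=params).json()
    items.extend(data["value"])
    url = data.get("odata.nextLink")   # absent on the last page
    params = None                      # the next link carries the query
print(len(items))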

Extracting data from API with pagination offset using Azure Data Factory

I have the below API:
http://example.com/?module=API&method=Live.getLastVisitsDetails&filter_limit=2000&filter_offset=0
I have to extract more data by increasing the offset by 2000 each time:
http://example.com/?module=API&method=Live.getLastVisitsDetails&filter_limit=2000&filter_offset=2000
http://example.com/?module=API&method=Live.getLastVisitsDetails&filter_limit=2000&filter_offset=4000
http://example.com/?module=API&method=Live.getLastVisitsDetails&filter_limit=2000&filter_offset=6000
I have a maximum of 6000 records. I don't know how to pass the offset value for every 2000 records in Data Factory.
I can extract each page individually with the above links, but I want to do it automatically for all 6000 records.
Can anyone point me to documentation or advise how to do this in Data Factory?
I saw the pagination documentation, but had no success.
You can extract data from a REST API using the Copy activity, as follows.
Step 1: Create a new pipeline and add a Copy Data activity.
Step 2: Configure the source of the Copy activity, adding a pagination rule configured as below.
Use a sample URL of this form, and make sure it ends with ={offset}.
For the pagination rule, select either option 1 or option 2. In my case I selected Range:0:8:2.
In your scenario you can set the range as below:
Option 1: QueryParameters.{offset}: Range:0:6000:2000
Option 2: AbsoluteUrl.{offset}: Range:0:6000:2000
The Range option uses 0 as the start value, 6000 as the maximum value, and increases the offset by 2000 each time, as the sketch below illustrates.
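To sanity-check what that rule expands to before running the pipeline, here is a small Python sketch (not Data Factory itself) that walks the same offsets against the placeholder URL from the question:
# Range:0:6000:2000 -> offsets 0, 2000, 4000, 6000 (the end value is included)
base = ("http://example.com/?module=API&method=Live.getLastVisitsDetails"
        "&filter_limit=2000&filter_offset={offset}")

for offset in range(0, 6000 + 1, 2000):
    print(base.format(offset=offset))   # the four page URLs the rule produces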
Step 3: Configure the sink; in my case the sink was a storage account.
For more detailed information, refer to the official Data Factory documentation.

Get google sheet filter view data with api call

There's really very little code I can show here. I'm calling a Google Sheets spreadsheet using the API from Node.js.
Within the spreadsheet, I have created a bunch of filter views. Is there a way to tell the API to return the filtered data instead of just getting all the data in the sheet?
const dataFromGoogleSheets =
  `https://sheets.googleapis.com/v4/spreadsheets/${config.spreadsheetId}` +
  `/values:batchGet?ranges=Sheet1&majorDimension=ROWS&key=${config.apiKey}`;
UPDATE
sheets.spreadsheets.values.get({
  spreadsheetId: "mySpreadSheetID",
  filterViewId: 121321321,
  title: "Filters"
});
You cannot fetch data from the Sheets API or Google Apps Script with an active FilterView applied.
As @Tanaike mentioned in the comments, you can fetch the criteria from the FilterView and reprocess it on top of the data, as sketched below.
You can also file a Feature Request for this on the Public Issue Tracker.
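A hedged Python sketch of that reprocessing, assuming an API key, a readable sheet, and a filter view whose criteria only use hiddenValues (condition-based criteria would need more work):
from googleapiclient.discovery import build

service = build("sheets", "v4", developerKey="YOUR_API_KEY")
SHEET_ID = "mySpreadSheetID"   # from the question

# Read the filter view definitions (title, range, criteria) per sheet.
meta = service.spreadsheets().get(
    spreadsheetId=SHEET_ID,
    fields="sheets(properties(title),filterViews)",
).execute()
view = meta["sheets"][0]["filterViews"][0]   # e.g. the "Filters" view
criteria = view.get("criteria", {})          # keyed by column index (string)

# Read the raw values the usual way...
rows = service.spreadsheets().values().get(
    spreadsheetId=SHEET_ID, range="Sheet1",
).execute().get("values", [])

# ...then drop any row a hiddenValues criterion would hide.
def visible(row):
    for col, crit in criteria.items():
        c = int(col)
        if c < len(row) and row[c] in crit.get("hiddenValues", []):
            return False
    return True

print([row for row in rows if visible(row)])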

Excel get data from web from dynamic table

I am trying to pull the full list of player data from https://stats.nba.com/players/traditional/?sort=PLAYER_NAME&dir=-1&Season=2019-20&SeasonType=Regular%20Season. However, the table is dynamic (the URL doesn't change), so when I set up the connection Excel only scrapes the first 50 rows. It does not recognize that there are 6 other pages within the table that I need to scrape as well.
Does anyone know how to use the "Get Data" -> "From Web" capability in Excel to import data from a dynamic table like the one shown above?
Instead of referencing the hosting page, why not use the endpoint that returns the JSON data that populates the table? You just need to marry up resultSets.headers to the array positions within the rowSet entries.
Edit: I found resources that explain the NBA REST API here: http://nbasense.com/nba-api/Stats/Stats/Players/AllPlayers . Take some time to review what's available. Any of those endpoints can be consumed by Excel the way you are trying to.
Example:
https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=&Weight=
Sample of a rowSet showing Aaron Gordon. Per resultSets, the 2nd field is the name and the 5th is the age... it matches the table, and it gives all players, not just page 1.
"rowSet": [ [ 203932, "Aaron Gordon", 1610612753, "ORL", 24.0, 1, 1, ...
For brevity, that's just a sample and far from all of the info it returns. You can click that link and see the JSON data it returns within your browser.
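Outside Excel, the same marrying of headers to row positions looks like this in Python; the browser-style request headers are an assumption (stats.nba.com tends to reject clients without them), and the field names come from the resultSets headers:
import requests

# Same endpoint and parameters as the link above.
params = {
    "College": "", "Conference": "", "Country": "", "DateFrom": "",
    "DateTo": "", "Division": "", "DraftPick": "", "DraftYear": "",
    "GameScope": "", "GameSegment": "", "Height": "", "LastNGames": "0",
    "LeagueID": "00", "Location": "", "MeasureType": "Base", "Month": "0",
    "OpponentTeamID": "0", "Outcome": "", "PORound": "0", "PaceAdjust": "N",
    "PerMode": "PerGame", "Period": "0", "PlayerExperience": "",
    "PlayerPosition": "", "PlusMinus": "N", "Rank": "N", "Season": "2019-20",
    "SeasonSegment": "", "SeasonType": "Regular Season", "ShotClockRange": "",
    "StarterBench": "", "TeamID": "0", "TwoWay": "0", "VsConference": "",
    "VsDivision": "", "Weight": "",
}
resp = requests.get(
    "https://stats.nba.com/stats/leaguedashplayerstats",
    params=params,
    headers={"User-Agent": "Mozilla/5.0", "Referer": "https://stats.nba.com/"},
    timeout=30,
)
result = resp.json()["resultSets"][0]

# Marry the headers to the rowSet positions: one dict per player.
players = [dict(zip(result["headers"], row)) for row in result["rowSet"]]
print(players[0]["PLAYER_NAME"], players[0]["AGE"])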

Cassandra - join two tables and save result to new table

I am working on a self-service BI application where users can upload their own datasets, which are stored in dynamically created Cassandra tables. The data is extracted from files that the user uploads, so each dataset is written into its own Cassandra table modeled on the column headers in the uploaded file, with the dimensions indexed.
Once the data is uploaded, users can build reports, analyze, etc., from within the application. I need a way to let users merge/join data from two or more datasets/tables on matching keys and write the result into a new Cassandra table. Once a dataset/table is created, it stays immutable and data is only read from it.
user table 1: username, email, employee id
user table 2: employee id, manager
I need to merge the data in user table 1 and user table 2 on matching employee id and write it to a new table that is created dynamically:
new table: username, email, employee id, manager
What would be the best way to do this?
The only option that you have is to do the join in your application code. There are too few details here to suggest a complete solution.
Please add details about the table keys, usage patterns, etc. In general, in Cassandra you model from the usage point of view, i.e., starting with the queries that you'll execute on the data.
To merge two tables in this pattern, you have to do it in the application: create the third (target) table and fill it with data from both tables. Make sure that you read the data in pages so you don't run out of memory; it really depends on the size of the data.
Another alternative is to do the join in Spark, but that may be over-engineering in your case.
You can give the merge table a primary key of the user, so that the merged data for a user goes into one row; that should be unique, since this is a one-time action.
Then, when the user clicks, you can go through one table in batches with a fetch size (for Java, check the query options; that gives you a fixed window that is loaded, and when it is exhausted the driver moves on to the next fetch-size worth of elements). Let's say you have a fetch size of 1000 items: iterate over them from one table, find the matches in the second table, and once 1000 are reached, execute a batch of 1000 inserts into the new table (see the sketch below).
If that is too time-consuming you can, as suggested, use another tool like Apache Spark or Spring Batch and do it in the background, informing the user that it is running.
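A minimal sketch of that paged merge, using the Python driver (the answer mentions Java, but the pattern is identical). The keyspace name is made up, it assumes user_table_2 is keyed by employee_id so the lookup is a single-partition read, and per-row inserts stand in for the 1000-row batches described above:
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("selfbi")   # keyspace name is made up

session.execute("""
    CREATE TABLE IF NOT EXISTS merged_users (
        employee_id text PRIMARY KEY,
        username text,
        email text,
        manager text)
""")

# Assumes user_table_2 is keyed by employee_id.
lookup = session.prepare(
    "SELECT manager FROM user_table_2 WHERE employee_id = ?")
insert = session.prepare(
    "INSERT INTO merged_users (employee_id, username, email, manager) "
    "VALUES (?, ?, ?, ?)")

# fetch_size is the fixed window: the driver loads 1000 rows at a time
# and pages transparently as the loop drains each window.
scan = SimpleStatement(
    "SELECT username, email, employee_id FROM user_table_1",
    fetch_size=1000)
for row in session.execute(scan):
    match = session.execute(lookup, [row.employee_id]).one()
    session.execute(insert, [row.employee_id, row.username, row.email,
                             match.manager if match else None])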
