How do you paginate the results of GitHub's compare commit API - github-api

I am not able to paginate the results of the GitHub compare commits REST API, as defined here:
https://docs.github.com/en/github-ae@latest/rest/commits/commits#compare-two-commits
I perform a GET operation in the following format:
url = https://api.github.com/repos/${owner}/${repo}/compare/${base}...${head}?per_page=50
I always get 300 files back, with no link telling me how to get to the next page (the next list of 50 items).
Note:
I have read the GitHub pagination guide (https://docs.github.com/en/rest/guides/traversing-with-pagination), and it has not provided much insight.
Ideally, I am looking for a JS implementation of how to page the results of the compare API.

What I observed when working with this API is that pagination applies to commits, not files. I think this makes some sense; I'm not sure what it would even do if it paginated files as well.
I wrote a basic script using Octokit.js to traverse the commits, which I'll adjust here to collect files for you. It's just some rough code, but hopefully it gives you an idea.
import { Octokit } from '@octokit/rest';

const octokit = new Octokit({ auth: 'my personal access token' });

const owner = 'put repo owner here';
const repo = 'put repo name here';
const base = 'put base ref here';
const head = 'put head ref here';

const perPage = 50;

// First page: also gives us the total commit count.
const { data: { total_commits, files } } = await octokit.repos.compareCommitsWithBasehead({
  owner,
  repo,
  basehead: `${base}...${head}`,
  per_page: perPage,
});

// Total commits / page size, rounded up.
const pages = Math.ceil(total_commits / perPage);

let allFiles = [];

// Add page 1 files.
allFiles = allFiles.concat(files);

// Fetch the remaining pages and collect their files.
for (let i = 2; i <= pages; i++) {
  const {
    data: { files: pagedFiles },
  } = await octokit.repos.compareCommitsWithBasehead({
    owner,
    repo,
    basehead: `${base}...${head}`,
    per_page: perPage,
    page: i,
  });
  allFiles = allFiles.concat(pagedFiles);
}

console.log(allFiles.length);
console.log(allFiles);
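If you would rather not derive the page count from total_commits, a variation is to keep requesting pages until a page returns fewer commits than the page size. This is only a sketch under the same assumption as above (the compare endpoint paginates by commits, and each page carries the files for its commits):
async function getAllCompareFiles(octokit, owner, repo, base, head) {
  const perPage = 100;
  const allFiles = [];
  for (let page = 1; ; page++) {
    const { data } = await octokit.repos.compareCommitsWithBasehead({
      owner,
      repo,
      basehead: `${base}...${head}`,
      per_page: perPage,
      page,
    });
    // Collect this page's files (the field can be absent on trailing pages).
    allFiles.push(...(data.files ?? []));
    // Fewer commits than the page size means this was the last page.
    if (data.commits.length < perPage) break;
  }
  return allFiles;
}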

Related

contentful getEntries published data only

Is there a way to query the results to show only data that has been published and is not in draft state? I looked in the documentation and didn't quite find it.
This is what I currently have:
export const getAllPages = async (context?) => {
  const client = createClient({
    space: process.env.CONTENTFUL_SPACE_ID,
    accessToken: process.env.CONTENTFUL_ACCESS_TOKEN,
  });
  const pages = await client.getEntries({
    content_type: "page",
    include: 10,
    "fields.slug[in]": `/${context.join().replace(",", "/")}`,
  });
  return pages?.items?.map((item) => {
    const fields = item.fields;
    return {
      title: fields["title"],
    };
  });
};
You can detect that the entries you get are in Published state:
function isPublished(entity) {
  return !!entity.sys.publishedVersion &&
    entity.sys.version == entity.sys.publishedVersion + 1;
}
In your case, I would look for both Published and Changed:
function isPublishedChanged(entity) {
  return !!entity.sys.publishedVersion &&
    entity.sys.version >= entity.sys.publishedVersion + 1;
}
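Applied inside getAllPages above, filtering is then a one-liner. Note this is a sketch: it assumes the entries expose sys.version and sys.publishedVersion, which is the case with the Preview or Management APIs rather than the Delivery API:
const publishedOrChangedPages = pages.items.filter(isPublishedChanged);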
Check the documentation:
https://www.contentful.com/developers/docs/tutorials/general/determine-entry-asset-state/
To get only the published data you will need to use the Content Delivery API token. If you use the Content Preview API Token, you will receive both the published and draft entries.
You can read more about it here: https://www.contentful.com/developers/docs/references/content-delivery-api/
If using the Content Delivery API you need to filter on the sys.revision attribute for each item. A published item should have its revision attribute set to greater than 0.
const publishedItems = data.items.filter(item => item.sys.revision > 0)

discord.js v13 What code would I use to collect the first attachment (image or video) from a MessageCollector?

I've looked everywhere and tried all I could think of but found nothing, everything seemed to fail.
One bit of code I've used before that failed:
Message.author.send({ embeds: [AttachmentEmbed] }).then(Msg => {
  var Collector = Msg.channel.createMessageCollector({ MessageFilter, max: 1, time: 300000 });
  Collector.on(`collect`, Collected => {
    if (Collected.content.toLowerCase() !== `cancel`) {
      console.log([Collected.attachments.values()].length);
      if ([Collected.attachments.values()].length > 0) {
        var Attachment = [Collected.attachments.values()];
        var AttachmentType = `Image`;
        PostApproval(false, Mode, Title, Description, Pricing, Contact, Attachment[0], AttachmentType);
      } else if (Collected.content.startsWith(`https://` || `http://`) && !Collected.content.startsWith(`https://cdn.discordapp.com/attachments/`)) {
        var Attachment = Collected.content.split(/[ ]+/)[0];
        var AttachmentType = `Link`;
        PostApproval(false, Mode, Title, Description, Pricing, Contact, Attachment, AttachmentType);
        console.log(Attachment);
      } else if (Collected.content.startsWith(`https://cdn.discordapp.com/attachments/`)) {
        var Attachment = Collected.content.split(/[ ]+/)[0];
        var AttachmentType = `ImageLink`;
        PostApproval(false, Mode, Title, Description, Pricing, Contact, Attachment, AttachmentType);
        console.log(Attachment);
      }
    }
  });
});
[Collected.attachments.values()].length will always be 1. Why? Well you have these 2 possibilities:
[ [] ] //length 1
[ [someMessageAttachment] ] //length 1
The proper way to count them is to use the spread operator (...):
[...(Collected.attachments.values())].length // returns the number of attachments in the message
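Putting that together in the collector, a minimal sketch (assuming the same Collector as in the question; since Collected.attachments is a discord.js Collection, its size property would work as well):
Collector.on('collect', Collected => {
  const attachments = [...Collected.attachments.values()];
  if (attachments.length > 0) {
    // First attachment (image or video) sent with the message.
    const first = attachments[0];
    console.log(first.url);
  }
});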

Array list with 2 values, turned into a top 10 list

I'm working on a Discord bot that has a reputation system using fs (npm package), saving people's reps in files named after their Discord IDs.
Now I'm working on a top 10 command and need some help. I currently have this code:
let users = [];
let reps = [];
fs.readdirSync('./data/reps/').forEach(obj => {
  users.push(obj.replace('.json', ''))
  let file = fs.readFileSync(`./data/reps/${obj}`)
  let data = JSON.parse(file)
  reps.push(data.reps)
})
let top = [...users, ...reps]
top.sort((a,b) => {a - b})
console.log(top)
The files for the users look like this:
{
  "users": [
    "437762415275278337"
  ],
  "reps": 1
}
users holds the current users that can't rep the person anymore; it isn't needed for this command.
I want to get the top 10 by reps, so that I can get each user's ID and how many reps they have. How could I do that with the code above?
You could try this (note the b.reps - a.reps comparator, so the highest counts come first):
const topTen = fs.readdirSync('./data/reps/').map(obj => {
  const file = fs.readFileSync(`./data/reps/${obj}`);
  const data = JSON.parse(file);
  return { ...data, name: obj.replace('.json', '') };
}).sort((a, b) => b.reps - a.reps).slice(0, 10);

console.log(topTen);
I would change how you push the data:
const users = [];
fs.readdirSync('./data/reps/').forEach(obj => {
  let file = fs.readFileSync(`./data/reps/${obj}`)
  let data = JSON.parse(file)
  users.push({ reps: data.reps, id: obj.replace(".json", "") });
})
That way, when you sort the array, the id goes along with the rep count.
// define this after the fs.readdirSync(...).forEach call
const top = users.sort((a, b) => b.reps - a.reps).slice(0, 10);
If you want an array of top ids
const topIds = top.map(e => e.id);
If you want a quick string of it:
const str = top.map(e => `${e.id}: ${e.reps}`).join("\n");
Also, you should probably just have one or two JSON files: one with the user IDs and their reps, and the other with the user IDs and who they can't rep anymore. A sketch of that layout follows.
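For illustration, here is a sketch of the single-file approach; the reps.json layout is hypothetical:
const fs = require('fs');

// Hypothetical layout: { "437762415275278337": { "reps": 1, "users": [...] }, ... }
const data = JSON.parse(fs.readFileSync('./data/reps.json', 'utf8'));

const topTen = Object.entries(data)
  .map(([id, entry]) => ({ id, reps: entry.reps }))
  .sort((a, b) => b.reps - a.reps)
  .slice(0, 10);

console.log(topTen.map(e => `${e.id}: ${e.reps}`).join('\n'));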

ES6: Object restructuring for the Mailchimp API

I want to construct an object based on an array and another object.
The goal is to send my users' interests to the Mailchimp API. For that, I have:
// Array of skills for one user
const skillsUser1 = ["SKILL1", "SKILL3"]

// List of all my skills, mapped to Mailchimp interest group IDs
const skillsMailchimpId = {
  'SKILL1': 'list_id_1',
  'SKILL2': 'list_id_2',
  'SKILL3': 'list_id_3',
}

// Mapping of user skills to all skills
const outputSkills = skillsUser1.map((skill) => skillsMailchimpId[skill]);
console.log(outputSkills);
The problem is that outputSkills gets me an array:
["list_id_1", "list_id_3"]
But what the Mailchimp API needs, and so what I need, is:
{ "list_id_1": true,
"list_id_2": false, //or empty
"list_id_3" : true
}
A simple way would be this (see comments in code for explanation):
// Array of skills for one user
const skillsUser1 = ["SKILL1", "SKILL3"]

// List of all my skills, mapped to Mailchimp interest group IDs
const skillsMailchimpId = {
  'SKILL1': 'list_id_1',
  'SKILL2': 'list_id_2',
  'SKILL3': 'list_id_3',
}

// Create an output object
const outputSkills = {};

// Use `Object.entries` to transform `skillsMailchimpId` to an array
Object.entries(skillsMailchimpId)
  // Use `.forEach` to add properties to `outputSkills`
  .forEach(keyValuePair => {
    const [key, val] = keyValuePair;
    outputSkills[val] = skillsUser1.includes(key);
  });

console.log(outputSkills);
The basic idea is to loop over skillsMailchimpId instead of skillsUser.
But that is not very dynamic. For your production code, you probably want to refactor it to be more flexible.
// Array of skills for one user
const skillsUser1 = ["SKILL1", "SKILL3"]

// List of all my skills, mapped to Mailchimp interest group IDs
const skillsMailchimpId = {
  'SKILL1': 'list_id_1',
  'SKILL2': 'list_id_2',
  'SKILL3': 'list_id_3',
}

// Use `Object.entries` to transform `skillsMailchimpId` to an array
const skillsMailchimpIdEntries = Object.entries(skillsMailchimpId);

const parseUserSkills = userSkills => {
  // Create an output object
  const outputSkills = {};
  // Use `.forEach` to add properties to `outputSkills`
  skillsMailchimpIdEntries.forEach(([key, val]) => {
    outputSkills[val] = userSkills.includes(key);
  });
  return outputSkills;
}

// Now you can use the function with any user
console.log(parseUserSkills(skillsUser1));
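If you prefer a more compact form, the same transformation can be written with Object.fromEntries (ES2019); a sketch of an equivalent version:
const parseUserSkills = userSkills =>
  Object.fromEntries(
    Object.entries(skillsMailchimpId).map(([skill, id]) => [id, userSkills.includes(skill)])
  );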

Github API: Retrieve all commits for all branches for a repo

According to the V2 documentation, you can list all commits for a branch with:
commits/list/:user_id/:repository/:branch
I am not seeing the same functionality in the V3 documentation.
I would like to collect all branches using something like:
https://api.github.com/repos/:user/:repo/branches
And then iterate through them, pulling all commits for each. Alternatively, if there's a way to pull all commits for all branches for a repo directly, that would work just as well if not better. Any ideas?
UPDATE: I tried passing the branch :sha as a param as follows:
params = {:page => 1, :per_page => 100, :sha => b}
The problem is that when I do this, it doesn't page the results properly. I feel like we're approaching this incorrectly. Any thoughts?
I have encountered the exact same problem. I did manage to acquire all the commits for all branches within a repository (probably not that efficient due to the API).
Approach to retrieve all commits for all branches in a repository
As you mentioned, first you gather all the branches:
# https://api.github.com/repos/:user/:repo/branches
https://api.github.com/repos/twitter/bootstrap/branches
The key that you are missing is that APIv3 for getting commits operates on a reference commit (the sha parameter of the call that lists commits on a repository). So you need to make sure that when you collect the branches you also pick up their latest sha:
Trimmed result of branch API call for twitter/bootstrap
[
  {
    "commit": {
      "url": "https://api.github.com/repos/twitter/bootstrap/commits/8b19016c3bec59acb74d95a50efce70af2117382",
      "sha": "8b19016c3bec59acb74d95a50efce70af2117382"
    },
    "name": "gh-pages"
  },
  {
    "commit": {
      "url": "https://api.github.com/repos/twitter/bootstrap/commits/d335adf644b213a5ebc9cee3f37f781ad55194ef",
      "sha": "d335adf644b213a5ebc9cee3f37f781ad55194ef"
    },
    "name": "master"
  }
]
Working with the last commit's sha
So as we can see, the two branches here have different shas; these are the latest commit shas on those branches. What you can do now is iterate through each branch, starting from its latest sha:
# With sha parameter of the branch's lastest sha
# https://api.github.com/repos/:user/:repo/commits
https://api.github.com/repos/twitter/bootstrap/commits?per_page=100&sha=d335adf644b213a5ebc9cee3f37f781ad55194ef
So the above API call will list the last 100 commits of the master branch of twitter/bootstrap. Working with the API, you have to specify the next commit's sha to get the next 100 commits. We can use the last commit's sha of the previous response (which is 7a8d6b19767a92b1c4ea45d88d4eedc2b29bf1fa in the current example) as input for the next API call:
# Next API call for commits (use the last commit's sha)
# https://api.github.com/repos/:user/:repo/commits
https://api.github.com/repos/twitter/bootstrap/commits?per_page=100&sha=7a8d6b19767a92b1c4ea45d88d4eedc2b29bf1fa
This process is repeated until the last commit's sha is the same as the API call's sha parameter.
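In code, the walk could look like this; just a sketch using a runtime with global fetch (Node 18+ or a browser), starting from the branch's latest sha collected above:
// Collect every commit reachable from a branch by walking pages of 100 via `sha`.
async function walkCommitsBySha(owner, repo, headSha) {
  const all = [];
  let sha = headSha;
  while (true) {
    const res = await fetch(
      `https://api.github.com/repos/${owner}/${repo}/commits?per_page=100&sha=${sha}`
    );
    const page = await res.json();
    // After the first call, the first item repeats the previous page's last
    // commit, because that sha was used as the parameter.
    all.push(...(all.length ? page.slice(1) : page));
    const last = page[page.length - 1].sha;
    if (page.length < 100 || last === sha) break;
    sha = last;
  }
  return all;
}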
Next branch
That is it for one branch. Now you apply the same approach to the other branch (work from its latest sha).
There is a large issue with this approach: since branches share some identical commits, you will see the same commits over and over again as you move to another branch.
I can imagine that there is a much more efficient way to accomplish this, but this worked for me.
I asked this same question for GitHub support, and they answered me this:
GETing /repos/:owner/:repo/commits should do the trick. You can pass the branch name in the sha parameter. For example, to get the first page of commits from the '3.0.0-wip' branch of the twitter/bootstrap repository, you would use the following curl request:
curl https://api.github.com/repos/twitter/bootstrap/commits?sha=3.0.0-wip
The docs also describe how to use pagination to get the remaining commits for this branch.
As long as you are making authenticated requests, you can make up to 5,000 requests per hour.
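For illustration, the page-based variant they describe might look like this (a sketch with global fetch; unauthenticated here, so the lower rate limit applies):
async function commitsForBranch(owner, repo, branch) {
  const all = [];
  for (let page = 1; ; page++) {
    const res = await fetch(
      `https://api.github.com/repos/${owner}/${repo}/commits?sha=${branch}&per_page=100&page=${page}`
    );
    const commits = await res.json();
    all.push(...commits);
    if (commits.length < 100) break; // last page reached
  }
  return all;
}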
I used the github gem in my Rails app as follows (using the https://github.com/peter-murach/github gem):
github_connection = Github.new :client_id => 'your_id', :client_secret => 'your_secret', :oauth_token => 'your_oauth_token'

branches_info = {}
commits_list = []

all_branches = github_connection.repos.list_branches owner, repo_name
all_branches.body.each do |branch|
  branches_info[branch.name] = branch.commit.url
end

branches_info.keys.each do |branch|
  commits_list.push(github_connection.repos.commits.list owner, repo_name, :sha => branch, :since => start_date, :until => end_date)
end
Using GraphQL API v4
You can use GraphQL API v4 to optimize commit downloads per branch. In the following method, I've managed to download 1,900 commits in a single request (100 commits per branch across 19 different branches), which drastically reduces the number of requests (compared to using the REST API).
1 - Get all branches
You will have to get all branches, going through pagination if you have more than 100 branches:
Query:
query($owner: String!, $name: String!, $branchCursor: String!) {
  repository(owner: $owner, name: $name) {
    refs(first: 100, refPrefix: "refs/heads/", after: $branchCursor) {
      totalCount
      edges {
        node {
          name
          target {
            ... on Commit {
              history(first: 0) {
                totalCount
              }
            }
          }
        }
      }
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
Variables:
{
  "owner": "google",
  "name": "gson",
  "branchCursor": ""
}
Try it in the explorer
Note that the branchCursor variable is used when you have more than 100 branches; in that case it takes the value of pageInfo.endCursor from the previous request.
2 - Split the branches array into arrays of at most 19 branches
There is a limit on the number of nodes per request that prevents us from querying too many branches at once. Some testing I've performed showed that we can't go over 19*100 commits in a single query.
Note that for repos with fewer than 19 branches, you don't need to bother about this; a helper sketch follows.
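A small helper for this splitting step might look like the following sketch (19 being the empirical limit from above; branches stands for the array gathered in step 1):
// Split an array into chunks of at most `size` elements.
const chunk = (arr, size) =>
  Array.from({ length: Math.ceil(arr.length / size) }, (_, i) =>
    arr.slice(i * size, (i + 1) * size)
  );

const branchChunks = chunk(branches, 19);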
3 - Query commits in chunks of 100 for each branch
You can then build your query dynamically to get the next 100 commits on all branches. An example with 2 branches:
query ($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    branch0: ref(qualifiedName: "JsonArrayImplementsList") {
      target {
        ... on Commit {
          history(first: 100) {
            ...CommitFragment
          }
        }
      }
    }
    branch1: ref(qualifiedName: "master") {
      target {
        ... on Commit {
          history(first: 100) {
            ...CommitFragment
          }
        }
      }
    }
  }
}
fragment CommitFragment on CommitHistoryConnection {
  totalCount
  nodes {
    oid
    message
    committedDate
    author {
      name
      email
    }
  }
  pageInfo {
    hasNextPage
    endCursor
  }
}
Try it in the explorer
The variables used are owner for the repo's owner and name for the name of the repo.
A fragment is used to avoid duplicating the commit history field definition.
You can see that pageInfo.hasNextPage & pageInfo.endCursor are used to go through pagination for each branch. The pagination takes place in history(first: 100), with the last cursor encountered passed as after. For instance, the next request will have history(first: 100, after: "6e2fcdcaf252c54a151ce6a4441280e4c54153ae 99"). For each branch, we have to update the request with the last endCursor value to query the next 100 commits.
When pageInfo.hasNextPage is false, there are no more pages for this branch, so we won't include it in the next request.
When the last branch has pageInfo.hasNextPage set to false, we have retrieved all commits.
Sample implementation
Here is a sample implementation in NodeJS using github-graphql-client. The same method could be implemented in any other language. The following will also store each chunk of commits in a file commitsX.json:
var client = require('github-graphql-client');
var fs = require("fs");

const owner = "google";
const repo = "gson";
const accessToken = "YOUR_ACCESS_TOKEN";

const branchQuery = `
query($owner: String!, $name: String!, $branchCursor: String!) {
  repository(owner: $owner, name: $name) {
    refs(first: 100, refPrefix: "refs/heads/", after: $branchCursor) {
      totalCount
      edges {
        node {
          name
          target {
            ... on Commit {
              history(first: 0) {
                totalCount
              }
            }
          }
        }
      }
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}`;

// Build a commit query covering every branch that still has pages left.
function buildCommitQuery(branches) {
  var query = `
query ($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {`;
  for (var key in branches) {
    if (branches.hasOwnProperty(key) && branches[key].hasNextPage) {
      query += `
    ${key}: ref(qualifiedName: "${branches[key].name}") {
      target {
        ... on Commit {
          history(first: 100, after: ${branches[key].cursor ? '"' + branches[key].cursor + '"' : null}) {
            ...CommitFragment
          }
        }
      }
    }`;
    }
  }
  query += `
  }
}`;
  query += commitFragment;
  return query;
}

const commitFragment = `
fragment CommitFragment on CommitHistoryConnection {
  totalCount
  nodes {
    oid
    message
    committedDate
    author {
      name
      email
    }
  }
  pageInfo {
    hasNextPage
    endCursor
  }
}`;

// Promisified wrapper around the github-graphql-client callback API.
function doRequest(query, variables) {
  return new Promise(function (resolve, reject) {
    client({
      token: accessToken,
      query: query,
      variables: variables
    }, function (err, res) {
      if (!err) {
        resolve(res);
      } else {
        console.log(JSON.stringify(err, null, 2));
        reject(err);
      }
    });
  });
}

function buildBranchObject(branch) {
  var refs = {};
  for (var i = 0; i < branch.length; i++) {
    console.log("branch " + branch[i].node.name);
    refs["branch" + i] = {
      name: branch[i].node.name,
      totalCount: branch[i].node.target.history.totalCount,
      cursor: null,
      hasNextPage: true,
      commits: []
    };
  }
  return refs;
}

async function requestGraphql() {
  var iterateBranch = true;
  var branches = [];
  var cursor = "";

  // Get all branches, following pagination.
  while (iterateBranch) {
    let res = await doRequest(branchQuery, {
      "owner": owner,
      "name": repo,
      "branchCursor": cursor
    });
    iterateBranch = res.data.repository.refs.pageInfo.hasNextPage;
    cursor = res.data.repository.refs.pageInfo.endCursor;
    branches = branches.concat(res.data.repository.refs.edges);
  }

  // Split the branch array into smaller arrays of 19 items.
  var refChunk = [], size = 19;
  while (branches.length > 0) {
    refChunk.push(branches.splice(0, size));
  }

  for (var j = 0; j < refChunk.length; j++) {
    // 1) Store branches in a format that makes it easy to concat commits
    // when receiving the query results.
    var refs = buildBranchObject(refChunk[j]);

    // 2) Query commits while some pages remain. Branches that have no more
    // pages are not added to subsequent requests. When no pages remain at
    // all, the loop exits.
    var hasNextPage = true;
    var count = 0;
    while (hasNextPage) {
      var commitQuery = buildCommitQuery(refs);
      console.log("request : " + count);
      let commitResult = await doRequest(commitQuery, {
        "owner": owner,
        "name": repo
      });
      hasNextPage = false;
      for (var key in refs) {
        if (refs.hasOwnProperty(key) && commitResult.data.repository[key]) {
          let history = commitResult.data.repository[key].target.history;
          refs[key].commits = refs[key].commits.concat(history.nodes);
          refs[key].cursor = (history.pageInfo.hasNextPage) ? history.pageInfo.endCursor : '';
          refs[key].hasNextPage = history.pageInfo.hasNextPage;
          console.log(key + " : " + refs[key].commits.length + "/" + refs[key].totalCount + " : " + refs[key].hasNextPage + " : " + refs[key].cursor + " : " + refs[key].name);
          if (refs[key].hasNextPage) {
            hasNextPage = true;
          }
        }
      }
      count++;
      console.log("------------------------------------");
    }

    for (var key in refs) {
      if (refs.hasOwnProperty(key)) {
        console.log(refs[key].totalCount + " : " + refs[key].commits.length + " : " + refs[key].name);
      }
    }

    // 3) Write this chunk of commits (up to 19 branches) to a single JSON file.
    fs.writeFile("commits" + j + ".json", JSON.stringify(refs, null, 4), "utf8", function (err) {
      if (err) {
        console.log(err);
      }
      console.log("done");
    });
  }
}

requestGraphql();
This also works for repos with a lot of branches, for instance this one, which has more than 700 branches.
Rate Limit
Note that while it is true that with GraphQL you can perform a reduced number of requests, it won't necessarily improve your rate limit, as the rate limit is based on points, not a fixed number of requests: check the GraphQL API rate limit documentation.
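If you want to keep an eye on that budget, the rateLimit field can be queried alongside your requests; a sketch reusing the doRequest helper from the sample above:
const rateLimitQuery = `
query {
  rateLimit {
    limit
    cost
    remaining
    resetAt
  }
}`;

doRequest(rateLimitQuery, {}).then(res => console.log(res.data.rateLimit));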
Pure JS Implementation without Access Token (Unauthorised Usage)
const base_url = 'https://api.github.com';

function httpGet(theUrl, return_headers) {
  var xmlHttp = new XMLHttpRequest();
  xmlHttp.open("GET", theUrl, false); // false for synchronous request
  xmlHttp.send(null);
  if (return_headers) {
    return xmlHttp;
  }
  return xmlHttp.responseText;
}

function get_all_commits_count(owner, repo, sha) {
  let first_commit = get_first_commit(owner, repo);
  let compare_url = base_url + '/repos/' + owner + '/' + repo + '/compare/' + first_commit + '...' + sha;
  let commit_req = httpGet(compare_url);
  // total_commits excludes the base commit itself, hence the + 1.
  let commit_count = JSON.parse(commit_req)['total_commits'] + 1;
  console.log('Commit Count: ', commit_count);
  return commit_count;
}

function get_first_commit(owner, repo) {
  let url = base_url + '/repos/' + owner + '/' + repo + '/commits';
  let req = httpGet(url, true);
  let first_commit_hash = '';
  if (req.getResponseHeader('Link')) {
    // Follow the last page from the Link header; its final entry is the repo's first commit.
    let page_url = req.getResponseHeader('Link').split(',')[1].split(';')[0].split('<')[1].split('>')[0];
    let req_last_commit = httpGet(page_url);
    let first_commit = JSON.parse(req_last_commit);
    first_commit_hash = first_commit[first_commit.length - 1]['sha'];
  } else {
    // Single page of commits: the last entry is the first commit.
    let first_commit = JSON.parse(req.responseText);
    first_commit_hash = first_commit[first_commit.length - 1]['sha'];
  }
  return first_commit_hash;
}

let owner = 'getredash';
let repo = 'redash';
let sha = 'master';
get_all_commits_count(owner, repo, sha);
Credits - https://gist.github.com/yershalom/a7c08f9441d1aadb13777bce4c7cdc3b
