Twitter API: count more than 100 using the Twitter search API - python-3.x

I want to search for tweets related to 'data' and retrieve more than 100 of them.
This is my Python code:
from twython import Twython

twitter = Twython(app_key=APP_KEY, app_secret=APP_SECRET)
f = open('tweets.txt', 'w')  # file the results are written to
for status in twitter.search(q='"data"', count=10000)["statuses"]:
    user = status["user"]["screen_name"].encode('utf-8')
    text = status["text"]
    data = "{0} {1} {2}".format(user, text, '\n\n')
    print(data)
    f.writelines(data)

What you're trying to do uses the Twitter API, specifically the GET search/tweets endpoint.
In the docs for this endpoint:
https://dev.twitter.com/rest/reference/get/search/tweets
we can see that count has a maximum value of 100.
So even though you specify 10000, it only returns 100 because that's the maximum.
I've not tried them myself, but you can likely use the until or max_id parameters, also mentioned in the docs, to get more results / the next 100 results.
Keep in mind "that the search index has a 7-day limit. In other words, no tweets will be found for a date older than one week" (the docs).
Hope this helps!

You can use the next_token field of the response to get more tweets.
Refer to these articles:
https://lixinjack.com/how-to-collect-more-than-100-tweets-when-using-twitter-api-v2/
https://developer.twitter.com/en/docs/twitter-api/tweets/search/integrate/paginate
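For illustration, a minimal sketch of that pagination against the v2 recent-search endpoint (untested; it assumes a v2 bearer token, and the token and query string are placeholders):

import requests

headers = {'Authorization': 'Bearer YOUR_BEARER_TOKEN'}  # placeholder token
params = {'query': 'data', 'max_results': 100}
url = 'https://api.twitter.com/2/tweets/search/recent'

tweets = []
while True:
    resp = requests.get(url, headers=headers, params=params).json()
    tweets.extend(resp.get('data', []))
    # meta.next_token is present only while more pages remain
    next_token = resp.get('meta', {}).get('next_token')
    if not next_token:
        break
    params['next_token'] = next_token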

The max_id parameter is the key and it is further explained here:
To use max_id correctly, an application’s first request to a timeline
endpoint should only specify a count. When processing this and
subsequent responses, keep track of the lowest ID received. This ID
should be passed as the value of the max_id parameter for the next
request, which will only return Tweets with IDs lower than or equal to
the value of the max_id parameter.
https://developer.twitter.com/en/docs/tweets/timelines/guides/working-with-timelines
In other words, using the lowest id retrieved from a search, you can access the older tweets. As mentioned by Tyler, the non-commercial version is limited to 7 days, but the commercial version can search up to 30 days.
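Putting that together, a minimal sketch of max_id paging with Twython (untested; it reuses the twitter client and query from the question, and the 1000-tweet cap is just an example):

all_statuses = []
max_id = None
while len(all_statuses) < 1000:
    kwargs = {'q': '"data"', 'count': 100}
    if max_id is not None:
        kwargs['max_id'] = max_id
    statuses = twitter.search(**kwargs)["statuses"]
    if not statuses:
        break  # no more results
    all_statuses.extend(statuses)
    # max_id is inclusive, so ask for IDs strictly below the lowest seen
    max_id = min(s["id"] for s in statuses) - 1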

Related

Google Photos API mediaItems list/search methods ignore pageSize param

I am attempting to retrieve all media items that a given Google Photos user has, irrespective of any album(s) they are in. However, when I attempt to use either the mediaItems.list or the mediaItems.search method, the pageSize param I include in the request is either being ignored or not fully fulfilled.
Details of mediaItems.list request
GET https://photoslibrary.googleapis.com/v1/mediaItems?pageSize=<###>
Details of mediaItems.search request
POST https://photoslibrary.googleapis.com/v1/mediaItems:search
BODY { 'pageSize': <###> }
I have made a simple implementation of these two requests here as an example for this question; it just requires a valid accessToken to use:
https://jsfiddle.net/zb2htog1/
Running this script with the following pageSize values against a Google Photos account with hundreds of photos and tens of albums consistently returns the same unexpected number of results for both methods:

Request pageSize    Returned media items count
1                   1
25                  9
50                  17
100                 34
I know that Google states the following for the pageSize parameter for both of these methods:
“Maximum number of media items to return in the response. Fewer media
items might be returned than the specified number. The default
pageSize is 25, the maximum is 100.”
I originally assumed that fewer media items might be returned because an account has fewer media items in total than the requested pageSize, or because a request with a pageToken has reached the end of a set of paged results. However, I am now wondering if this just means that results may vary in general?
Can anyone else confirm if they have the same experience when using these methods without an album ID for an account with a suitable amount of photos to test this? Or am I perhaps constructing my requests in an incorrect fashion?
I experience something similar: I get back half of what I expect.
If I don't set pageSize, I get back just 13; if I set it to 100, I get back 50.
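Whatever each page actually contains, the full library can still be collected by following nextPageToken. A minimal sketch (untested), assuming a valid OAuth access token for the Photos Library API:

import requests

ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN'  # placeholder
url = 'https://photoslibrary.googleapis.com/v1/mediaItems'
headers = {'Authorization': 'Bearer ' + ACCESS_TOKEN}

items = []
params = {'pageSize': 100}
while True:
    resp = requests.get(url, headers=headers, params=params).json()
    items.extend(resp.get('mediaItems', []))
    token = resp.get('nextPageToken')
    if not token:
        break  # no more pages
    params['pageToken'] = token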

How to get Salesforce REST API to paginate?

I'm using the simple_salesforce python wrapper for the Salesforce REST API. We have hundreds of thousands of records, and I'd like to split up the pull of the salesforce data so all records are not pulled at the same time.
I've tried passing a query like:
results = salesforce_connection.query_all("SELECT my_field FROM my_model limit 2000 offset 50000")
to see records 50K through 52K but receive an error that offset can only be used for the first 2000 records. How can I use pagination so I don't need to pull all records at once?
You're looking to use salesforce_connection.query(query=SOQL) and then .query_more(nextRecordsUrl, True).
Since .query() only returns up to 2000 records, you need to use .query_more to get the next page of results.
From the simple-salesforce docs
SOQL queries are done via:
sf.query("SELECT Id, Email FROM Contact WHERE LastName = 'Jones'")
If, due to an especially large result, Salesforce adds a nextRecordsUrl to your query result, such as "nextRecordsUrl" : "/services/data/v26.0/query/01gD0000002HU6KIAW-2000", you can pull the additional results with either the ID or the full URL (if using the full URL, you must pass ‘True’ as your second argument)
sf.query_more("01gD0000002HU6KIAW-2000")
sf.query_more("/services/data/v26.0/query/01gD0000002HU6KIAW-2000", True)
Here is an example of using this:
data = []  # list to hold all the records
SOQL = "SELECT my_field FROM my_model"
results = sf.query(query=SOQL)  # initial API call

## loop through the results and add the records
for rec in results['records']:
    rec.pop('attributes', None)  # remove extra metadata
    data.append(rec)  # add the record to the list

## check the 'done' attribute in the response to see if there are more records;
## while 'done' == False (more records to fetch) get the next page of records
while not results['done']:
    ## attribute 'nextRecordsUrl' holds the url to the next page of records
    results = sf.query_more(results['nextRecordsUrl'], True)
    ## repeat the loop of adding the records
    for rec in results['records']:
        rec.pop('attributes', None)
        data.append(rec)
Looping through the records and using the data:
## loop through the records and get their attribute values
for rec in data:
    # the attribute name will always be the same as the salesforce api name for that value
    print(rec['my_field'])
Like the other answer says, though, this can start to use up a lot of resources. But it's what you're looking for if you want to achieve pagination.
Maybe create a more focused SOQL statement to get only the records needed for your use case at that specific moment.
LIMIT and OFFSET aren't really meant to be used like that; what if somebody inserts or deletes a record at an earlier position (not to mention you don't have an ORDER BY in there)? Salesforce will open a proper cursor for you; use it.
https://pypi.org/project/simple-salesforce/ docs for "Queries" say that you can either call query and then query_more or you can go query_all. query_all will loop and keep calling query_more until you exhaust the cursor - but this can easily eat your RAM.
Alternatively, look into the bulk query stuff; there's some magic in the API, but I don't know if it fits your use case. It'd be asynchronous calls and might not be implemented in the library. It's called PK Chunking. I wouldn't bother unless you have millions of records.
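If query_all's memory use is the concern, a small generator over query/query_more keeps only one page in RAM at a time. A minimal sketch (untested) using only the documented simple_salesforce calls; soql_iter is a hypothetical helper name:

def soql_iter(sf, soql):
    """Yield records one at a time, fetching pages lazily."""  # hypothetical helper
    results = sf.query(soql)
    while True:
        for rec in results['records']:
            rec.pop('attributes', None)  # drop metadata, as in the example above
            yield rec
        if results['done']:
            break
        results = sf.query_more(results['nextRecordsUrl'], True)

for rec in soql_iter(sf, "SELECT my_field FROM my_model"):
    print(rec['my_field'])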

Get data having a maximum attribute from firebase in nodejs

So I am working with Firebase in Node.js. There is a "number" attribute in each document of a specific table (whose name is generated at runtime), and I want to get the record whose "number" attribute has the maximum value.
Here is my sample data:
-L1GIb7Vyn6Yhd5gghH0
    correct: blah
    number: 9
    question: A sample question
    wrong1: blekh
    wrong2: blahhh
I have seen answers suggesting child_added and the like, but all in vain; I also can't use .endAt() or .startAt() because I don't know the "number" value at any time.
My sample code till now is:-
queRef.child(req.session.quiztopicname+req.session.quiztopictype).
orderByChild("number").endAt(9).once("value",function(snapshot){
console.log(snapshot.val());
});
Use limitToLast(1) on your sorted reference/query to only retrieve the greatest value. Bear in mind that if there are multiple children with the same greatest value, you'll still only get one of them. There's more documentation on sorting and filtering here.
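For illustration, the same query via the firebase_admin Python SDK (untested; the question uses the Node.js SDK, but the query chain is identical, and the credentials file, database URL, and 'questions' path are placeholders):

import firebase_admin
from firebase_admin import credentials, db

cred = credentials.Certificate('serviceAccount.json')  # placeholder credentials
firebase_admin.initialize_app(cred, {'databaseURL': 'https://your-db.firebaseio.com'})

# order children by "number", then keep only the last (i.e. greatest) one
ref = db.reference('questions')  # placeholder path to the runtime-named table
top = ref.order_by_child('number').limit_to_last(1).get()
for key, value in top.items():
    print(key, value['number'])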

Obtaining video duration

A search is made to obtain videos. How do I obtain the duration of each video?
for search_result in search_response.get("items", []):
    if search_result["id"]["kind"] == "youtube#video":
        VideoID.append(search_result["id"]["videoId"])
        ChannelID.append(search_result["snippet"]["channelId"])
        VideoName.append(search_result["snippet"]["title"])
        ChannelName.append(search_result["snippet"]["channelTitle"])
        videoDuration.append(search_result["contentDetails"]["duration"])
The last line raises a KeyError. Judging by the API reference on the website, this is how it should be done, but the documentation is pretty weird to be honest.
https://developers.google.com/youtube/v3/docs/videos
Cheers
The search.list endpoint does not accept/provide contentDetails. You will need to take the id results from search and make another call to the videos.list endpoint for that.
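For illustration, a minimal sketch of that second call (untested; it assumes a youtube client built with googleapiclient.discovery.build, and reuses search_response from the question):

# collect the video IDs from the search results
video_ids = [item["id"]["videoId"]
             for item in search_response.get("items", [])
             if item["id"]["kind"] == "youtube#video"]

# videos.list accepts up to 50 comma-separated IDs per call
videos_response = youtube.videos().list(
    part="contentDetails,snippet",
    id=",".join(video_ids)
).execute()

for video in videos_response.get("items", []):
    # durations come back in ISO 8601 form, e.g. "PT4M13S"
    print(video["snippet"]["title"], video["contentDetails"]["duration"])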

Is there any way to show more than 20 photos with the Instagram API?

I'm trying to display a feed of more than 20 photos on a website like this one:
http://snap20.com.br/instagram/
Is there any way to do this?
Simple. Just append &count=-1 to the end of your API call.
For instance:
https://api.instagram.com/v1/tags/YOURTAG/media/recent?access_token=YOURACCESSTOKEN&count=-1
* Update April 2014 (credits: @user1406691): count=-1 is no longer available. Response:
{"meta":{"error_type":"APIInvalidParametersError","code":400,
"error_message":"Count must be larger than zero."}}
You may wish to use this instead:
https://api.instagram.com/v1/tags/YOURTAG/media/recent?access_token=YOURACCESSTOKEN&count=35
There's also another method via RSS + a database, but it's longer, although it's not limited to 30 calls/hour.
Actually, there is a way to get the next 20 pictures, and after that the next 20, and so on...
In the JSON response there is a "pagination" object:
"pagination":{
"next_max_tag_id":"1411892342253728",
"deprecation_warning":"next_max_id and min_id are deprecated for this endpoint; use min_tag_id and max_tag_id instead",
"next_max_id":"1411892342253728",
"next_min_id":"1414849145899763",
"min_tag_id":"1414849145899763",
"next_url":"https:\/\/api.instagram.com\/v1\/tags\/lemonbarclub\/media\/recent?client_id=xxxxxxxxxxxxxxxxxx\u0026max_tag_id=1411892342253728"
}
This is the pagination information for a specific API call; the "next_url" property holds the URL for the next 20 pictures, so just take that URL and call it to get them.
For more information about the Instagram API, check this out: https://medium.com/@KevinMcAlear/getting-friendly-with-instagrams-api-abe3b929bc52
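For illustration, a minimal sketch of following next_url until enough photos are collected (untested, and the v1 endpoint shown here has since been retired; the tag, access token, and 100-photo cap are placeholders):

import requests

url = ('https://api.instagram.com/v1/tags/YOURTAG/media/recent'
       '?access_token=YOURACCESSTOKEN')
photos = []
while url and len(photos) < 100:
    payload = requests.get(url).json()
    photos.extend(payload.get('data', []))
    # pagination.next_url is absent on the last page
    url = payload.get('pagination', {}).get('next_url')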
Instagram has a 20 image limit on their API, check out this thread and my answer:
What is the maximum number of requests for Instagram?
Also, have a look at this link to bypass the pagination and display all results:
http://thegregthompson.com/displaying-instagram-images-ignoring-page-pagination/
