YouTube Data API: getting total comments, likes, dislikes - python-3.x

I have a list of YouTube channels that I poll daily from the API to get some stats: total comments, likes, and dislikes (all time, across all videos).
I have implemented the code below. It works, but it loops through every single video one at a time, hitting the API on each iteration.
Is there a way to make one API call with several video IDs? Or is there a better way to get these stats?
# find stats for all channel videos - how will this scale?
def video_stats(row):
    videoid = row['video_id']
    query = yt.get_video_metadata(videoid)
    vids = pd.DataFrame(query, index=[0])
    df['views'] = vids['video_view_count'].sum()
    df['comments'] = vids['video_comment_count'].sum()
    df['likes'] = vids['video_like_count'].sum()
    df['dislikes'] = vids['video_dislike_count'].sum()
    return 'no'

df['stats'] = df.apply(video_stats, axis=1)
channel['views'] = df['views'].sum()
channel['comments'] = df['comments'].sum()
channel['likes'] = df['likes'].sum()
channel['dislikes'] = df['dislikes'].sum()

According to the docs, you may combine the IDs of several different videos into one Videos.list API endpoint call:
id: string
The id parameter specifies a comma-separated list of the YouTube video ID(s) for the resource(s) that are being retrieved. In a video resource, the id property specifies the video's ID.
However, the code you have shown is too terse to figure out a way of adapting it to that kind of batch endpoint call.
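For what it's worth, here is a minimal sketch of what a batched call could look like with google-api-python-client. YT_API_KEY and the plain list of video IDs are placeholders, not something taken from the question:
from googleapiclient.discovery import build

# Placeholder credentials and input; adapt to your own setup.
youtube = build('youtube', 'v3', developerKey=YT_API_KEY)

def batched_stats(video_ids):
    totals = {'views': 0, 'comments': 0, 'likes': 0, 'dislikes': 0}
    # The id parameter accepts a comma-separated list of up to 50 IDs,
    # so one request covers 50 videos instead of one.
    for i in range(0, len(video_ids), 50):
        chunk = video_ids[i:i + 50]
        response = youtube.videos().list(part='statistics',
                                         id=','.join(chunk)).execute()
        for item in response.get('items', []):
            stats = item['statistics']
            totals['views'] += int(stats.get('viewCount', 0))
            totals['comments'] += int(stats.get('commentCount', 0))
            totals['likes'] += int(stats.get('likeCount', 0))
            totals['dislikes'] += int(stats.get('dislikeCount', 0))
    return totals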

Related

Google Cloud Speech-to-Text "long_running_recognize" response object not iterable

When running a speech-to-text API request against Google Cloud (the audio is over 60s, so I need to use the long_running_recognize function and retrieve the audio from a Cloud Storage bucket), I do get a text response, but I cannot iterate through the LongRunningResponse object that is returned, which makes the info inside it semi-useless.
When using just the "client.recognize()" function, I get a similar response to the long-running one, except that I can iterate through the short-form results just fine, contrary to the long response.
I run nearly identical parameters through each recognize function (a 1m40s audio file for long-running, and a 30s file for the short recognize, both from my Cloud Storage bucket).
short_response = client.recognize(config=config, audio=audio_uri)
subs_list = []
for result in short_response.results:
    for alternative in result.alternatives:
        for word in alternative.words:
            if not word.start_time:
                start = 0
            else:
                start = word.start_time.total_seconds()
            end = word.end_time.total_seconds()
            t = word.word
            subs_list.append(((float(start), float(end)), t))
print(subs_list)
The function above works fine; the ".results" attribute correctly returns objects whose attributes I can access and iterate through. I use the for loops to create subtitles for a video.
I then try a similar thing with long_running_recognize, and get this:
long_response = client.long_running_recognize(config=config, audio=audio_uri)
#1
print(long_response.results)
#2
print(long_response.result())
Output from #1 returns an error:
AttributeError: 'Operation' object has no attribute 'results'. Did you mean: 'result'?
Output from #2 returns the info I need, but when checking "type(long_response.result())" I get:
<class 'google.cloud.speech_v1.types.cloud_speech.LongRunningRecognizeResponse'>
which I suppose is not an iterable object, and I cannot figure out how to apply a process similar to the one I use for the recognize function to get subtitles the way I need.
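Not a definitive answer, but the error message points at the fix: long_running_recognize returns an Operation, and calling .result() blocks until the job finishes and returns a LongRunningRecognizeResponse, whose .results field should iterate just like the synchronous response. A minimal sketch along those lines (the timeout value is an assumption):
operation = client.long_running_recognize(config=config, audio=audio_uri)
# .result() blocks until the operation completes, then returns the
# LongRunningRecognizeResponse; iterate over its .results field.
long_response = operation.result(timeout=300)  # timeout in seconds, assumed

subs_list = []
for result in long_response.results:
    for alternative in result.alternatives:
        for word in alternative.words:
            start = word.start_time.total_seconds() if word.start_time else 0
            end = word.end_time.total_seconds()
            subs_list.append(((float(start), float(end)), word.word))
print(subs_list)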

flickrAPI upload photos

I'm trying to use the Python Flickr API to upload photos to my Flickr account. I already got the API key and secret and used them to get information about my albums and photos, but I get errors when trying to upload new photos. Here is my code:
import flickrapi
api_key = u'xxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = u'xxxxxxxxxxxxxxxxxxxx'
flickr = flickrapi.FlickrAPI(api_key, api_secret)
filename = 'd:/downloads/_D4_6263-Enhanced.png'
title = 'Fach Halcones'
description = 'Posting image using API'
tags = 'fidae'+','+'aviation'+','+'extra'+','+'air shows'
flickr.upload(filename, title, description, tags)
When I run the script, I get the following error:
File "uploadPhotos.py", line 15, in module
flickr.upload(filename, title, description, tags)
TypeError: upload() takes from 2 to 4 positional arguments but 5 were given
Looking at the Flickr API documentation, upload seems to accept up to five arguments (filename, fileobj, title, description, tags), and I'm passing only four, since fileobj is optional.
I have googled for examples, but I was unable to find something that does the trick, so any help would be awesome.
Regards,
Marcio
I found the solution, and I'm sharing it here. There were two issues with my code.
First, the arguments must be passed as keyword arguments; second, tags must be separated by spaces, not commas.
Here is the final version:
import flickrapi
api_key = u'xxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = u'xxxxxxxxxxxxxxxxxxxx'
flickr = flickrapi.FlickrAPI(api_key, api_secret)
params = {}
params['filename'] = 'd:/downloads/_D4_6263-Enhanced.png'
params['title'] = 'Fach Halcones'
params['description'] = 'Posting image using API'
params['tags'] = '''fidae aviation extra "air shows" '''
flickr.upload(**params)
That's it...

Premium search API, use of retweets_of and from function

I am using TwitterAPI in Python 3 with premium search to find archived tweets from user2 with specific keywords that were retweeted by user1. Following some suggestions, I used https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/operators-by-product and https://github.com/geduldig/TwitterAPI to write this code, but when I run it I get no output and no error message.
The code works fine when I am not using the retweets_of and from operators, but those are exactly the rules I want to use to get my data.
I know my code shows a premium sandbox search; I will upgrade it to premium full-archive search once I have the right code.
from TwitterAPI import TwitterAPI

# Keys and tokens from the Twitter developer portal
consumer_key = "xxxxxxxxxxxxx"
consumer_secret = "xxxxxxxxxxxxxxxxxxx"
access_token = "xxxxxxxxxxxxxxxxxxx"
access_token_secret = "xxxxxxxxxxxxxxxxx"

PRODUCT = '30day'
LABEL = 'MyLABELname'

api = TwitterAPI(consumer_key, consumer_secret, access_token, access_token_secret)
r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
                {'query': 'retweets_of:user.Tesla from:user.elonmusk Supercharger battery'})
for item in r:
    print(item['text'] if 'text' in item else item)
Does someone know what the problem is with my code, or is there another way to use the retweets_of and from operators in a premium search? Is it also possible to add a count operator to my code so that it outputs numbers instead of the full text of all the tweets?
You should omit "user." in your query.
Also, specifying "Supercharger battery" is perfectly fine, but it requires both words to be present in the search results. If you want only either word to be present, use "Supercharger OR battery".
Finally, to request more results per call, use the maxResults parameter (10 to 100).
Here is your example with all of the above:
r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
                {'query': 'retweets_of:Tesla from:elonmusk Supercharger OR battery',
                 'maxResults': 100})
Twitter's Premium Search doc may be helpful: https://developer.twitter.com/en/docs/tweets/search/api-reference/premium-search.html
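As for the count question: the paid premium tiers (not the sandbox) expose a separate counts endpoint that returns tweet volumes per time bucket instead of the tweets themselves. A hedged sketch, reusing the api object from above:
# Counts endpoint: available on paid premium tiers only, not in sandbox.
r = api.request('tweets/search/%s/:%s/counts' % (PRODUCT, LABEL),
                {'query': 'retweets_of:Tesla from:elonmusk Supercharger OR battery',
                 'bucket': 'day'})  # bucket can be 'day', 'hour' or 'minute'
print(r.json())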

How to retrieve all historical public tweets with Twitter Premium Search API in Sandbox version (using next token)

I want to download all historical tweets with certain hashtags and/or keywords for a research project, and I got the premium Twitter API for that. I'm using the amazing TwitterAPI package to take care of auth and so on.
My problem is that I'm not an expert developer, and I have trouble understanding how the next token works and how to get all the tweets into a CSV file.
What I want is all the tweets in one single CSV, without having to manually change the fromDate and toDate values. Right now I don't know how to get the next token and how to use it to chain requests.
This is how far I've got:
from TwitterAPI import TwitterAPI
import csv

SEARCH_TERM = 'my-search-term-here'
PRODUCT = 'fullarchive'
LABEL = 'here-goes-my-dev-env'

api = TwitterAPI("consumer_key",
                 "consumer_secret",
                 "access_token_key",
                 "access_token_secret")

r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
                {'query': SEARCH_TERM,
                 'fromDate': '200603220000',
                 'toDate': '201806020000'})

csvFile = open('2006-2018.csv', 'a')
csvWriter = csv.writer(csvFile)
for item in r:
    csvWriter.writerow([item['created_at'], item['user']['screen_name'],
                        item['text'] if 'text' in item else item])
I would be really thankful for any help!
Cheers!
First of all, TwitterAPI includes a helper class that will take care of this for you. TwitterPager works with many types of Twitter endpoints, not just premium search; there is a sketch of it after the code below. Here is an example to get you started: https://github.com/geduldig/TwitterAPI/blob/master/examples/page_tweets.py
But to answer your question directly, the strategy is to put the request you currently have inside a while loop. Then:
1. Each request will return a next field, which you can get with r.json()['next'].
2. When you are done processing the current batch of tweets and are ready for the next request, include the next parameter set to the value above.
3. Eventually a request will not include a next field in the returned JSON. At that point, break out of the while loop.
Something like the following.
Something like the following.
next = ''
while True:
r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
{'query':SEARCH_TERM,
'fromDate':'200603220000',
'toDate':'201806020000',
'next':next})
if r.status_code != 200:
break
for item in r:
csvWriter.writerow([item['created_at'],item['user']['screen_name'], item['text'] if 'text' in item else item])
json = r.json()
if 'next' not in json:
break
next = json['next']
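And for completeness, a minimal sketch of the TwitterPager route mentioned above; it handles the next token internally (the wait interval is an assumed value, tune it to your rate limits):
from TwitterAPI import TwitterPager

pager = TwitterPager(api,
                     'tweets/search/%s/:%s' % (PRODUCT, LABEL),
                     {'query': SEARCH_TERM,
                      'fromDate': '200603220000',
                      'toDate': '201806020000'})
# wait = seconds to pause between successive requests (assumed value).
for item in pager.get_iterator(wait=2):
    csvWriter.writerow([item['created_at'], item['user']['screen_name'],
                        item['text'] if 'text' in item else item])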

Twitter API: How to get the IDs of users who favorited a specific tweet?

I'm trying to get info about the users who added a specific tweet to their favorites, but I can't find it in the documentation.
It is unfair that Twitter can do that but doesn't expose this as an API method.
Apparently, the only way to do this is to scrape Twitter's website:
import urllib2
from lxml.html import parse

# Returns (list of retweet user IDs, list of favorite user IDs)
# for a given screen_name and status_id.
def get_twitter_user_rts_and_favs(screen_name, status_id):
    url = urllib2.urlopen('https://twitter.com/' + screen_name + '/status/' + status_id)
    root = parse(url).getroot()
    num_rts = 0
    num_favs = 0
    rt_users = []
    fav_users = []
    for ul in root.find_class('stats'):
        for li in ul.cssselect('li'):
            cls_name = li.attrib['class']
            if cls_name.find('retweet') >= 0:
                num_rts = int(li.cssselect('a')[0].attrib['data-tweet-stat-count'])
            elif cls_name.find('favorit') >= 0:
                num_favs = int(li.cssselect('a')[0].attrib['data-tweet-stat-count'])
            elif cls_name.find('avatar') >= 0 or cls_name.find('face-pile') >= 0:
                for users in li.cssselect('a'):
                    # Apparently, favs are listed before retweets, but the
                    # retweet summary is listed before the fav summary.
                    # If in doubt, diff the returned user IDs against the
                    # retweet user IDs from the official API.
                    if num_favs > 0:
                        num_favs -= 1
                        fav_users.append(users.attrib['data-user-id'])
                    else:
                        rt_users.append(users.attrib['data-user-id'])
    return rt_users, fav_users

# Example
if __name__ == '__main__':
    print get_twitter_user_rts_and_favs('alien_merchant', '674104400013578240')
Short answer: you can't do this perfectly.
Long answer: you can do it with some effort, but it isn't going to be even close to perfect. You can use the Twitter API to monitor the activity of up to 4,000 user IDs. If a tweet is created by one of the 4,000 people you monitor, you can get all of its information, including the people who favorited it. This also requires pushing all the information about the monitored people into a database (I use MongoDB). You can then query the database for information about your tweet.
Twitter API v2 has new likes functionality:
https://twittercommunity.com/t/announcing-twitter-api-v2-likes-lookup-and-blocks-lookup/154353
To get users who have liked a Tweet, use the GET /2/tweets/:id/liking_users endpoint.
They've also provided example code in their GitHub repo.
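A minimal sketch of calling that endpoint with requests and a bearer token (BEARER_TOKEN and the tweet ID are placeholders):
import requests

TWEET_ID = '674104400013578240'  # placeholder tweet ID
url = 'https://api.twitter.com/2/tweets/%s/liking_users' % TWEET_ID
headers = {'Authorization': 'Bearer %s' % BEARER_TOKEN}

resp = requests.get(url, headers=headers)
resp.raise_for_status()
for user in resp.json().get('data', []):
    print(user['id'], user['username'])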
Use the endpoint favorites/list with max_id set to the tweet you're looking for.
https://dev.twitter.com/rest/reference/get/favorites/list