Twitter API, Searching with dollar signs - python-3.x

This code opens a Twitter stream listener, and the search terms are in the variable upgrades_str. Some searches work, and some don't. I added AMZN to the upgrades list just to be sure there's a frequently used term, since this uses an open Twitter stream rather than searching existing tweets.
Below, I think only searches 2 and 4 need review.
I'm using Python 3.5.2 :: Anaconda 4.0.0 (64-bit) on Windows 10.
Variable searches
1. Searching with upgrades_str: ['AMZN', 'SWK', 'AIQUY', 'SFUN', 'DOOR'] returns tweets such as 'i'm tired of people'.
2. Searching with upgrades_str: ['$AMZN', '$SWK', '$AIQUY', '$SFUN', '$DOOR'] returns tweets such as 'Chicago to south Florida. Hiphop lives'. This search is the one I wish worked.
Explicit searches
3. Searching by replacing the variable 'upgrades_str' with the explicit string ['AMZN', 'SWK', 'AIQUY', 'SFUN', 'DOOR'] returns 'After being walked in on twice, I have finally figured out how to lock the door here in Sweden'. This one at least has the search term 'door'.
4. Searching by replacing the variable 'upgrades_str' with the explicit string ['$AMZN', '$SWK', '$AIQUY', '$SFUN', '$DOOR'] returns '$AMZN $WFM $KR $REG $KIM: Amazon’s Whole Foods buy puts shopping centers at risk as real'. So the explicit call works, but not the identical variable.
Explicitly searching for ['$AMZN'] = returns a good tweet: 'FANG setting up really good for next week! Added $googl jun23 970c avg at 4.36. $FB $AMZN'.
Explicitly searching for ['cool'] returns 'I can’t believe I got such a cool Pillow!'
import tweepy
import dataset
from textblob import TextBlob
from sqlalchemy.exc import ProgrammingError
import json
db = dataset.connect('sqlite:///tweets.db')
class StreamListener(tweepy.StreamListener):
    def on_status(self, status):
        if status.retweeted:
            return

        description = status.user.description
        loc = status.user.location
        text = status.text
        coords = status.coordinates
        geo = status.geo
        name = status.user.screen_name
        user_created = status.user.created_at
        followers = status.user.followers_count
        id_str = status.id_str
        created = status.created_at
        retweets = status.retweet_count
        bg_color = status.user.profile_background_color
        blob = TextBlob(text)
        sent = blob.sentiment

        if geo is not None:
            geo = json.dumps(geo)
        if coords is not None:
            coords = json.dumps(coords)

        table = db['tweets']
        try:
            table.insert(dict(
                user_description=description,
                user_location=loc,
                coordinates=coords,
                text=text,
                geo=geo,
                user_name=name,
                user_created=user_created,
                user_followers=followers,
                id_str=id_str,
                created=created,
                retweet_count=retweets,
                user_bg_color=bg_color,
                polarity=sent.polarity,
                subjectivity=sent.subjectivity,
            ))
        except ProgrammingError as err:
            print(err)

    def on_error(self, status_code):
        if status_code == 420:
            return False

access_token = 'token'
access_token_secret = 'tokensecret'
consumer_key = 'consumerkey'
consumer_secret = 'consumersecret'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

stream_listener = StreamListener()
stream = tweepy.Stream(auth=api.auth, listener=stream_listener)
stream.filter(track=upgrades_str, languages=['en'])

Here's the answer, in case someone has the problem in the future: "Note that punctuation is not considered to be part of a #hashtag or #mention, so a track term containing punctuation will not match either #hashtags or #mentions." From: https://dev.twitter.com/streaming/overview/request-parameters#track
And for multiple terms, the string (which was converted from a list) needs to be changed to the form ['term1,term2']. Just strip out the apostrophes, spaces, and brackets, then rebuild the string:
upgrades_str = re.sub('[\' \[\]]', '', upgrades_str)
upgrades_str = '[\''+format(upgrades_str)+'\']'
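A small illustrative sketch (not code from the question or answer): since the stream API drops the punctuation, another approach is to strip the leading '$' from each ticker and pass the bare symbols as the track list. This assumes the stream object from the code above.

upgrades = ['$AMZN', '$SWK', '$AIQUY', '$SFUN', '$DOOR']
# Hypothetical illustration: track bare ticker symbols instead of cashtags.
track_terms = [ticker.lstrip('$') for ticker in upgrades]  # ['AMZN', 'SWK', ...]
stream.filter(track=track_terms, languages=['en'])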

Related

Some beginner's confusion with flask-sqlalchemy

I'm using SQLAlchemy and trying to understand the finer workings of the way the objects work. I've got some test code that just seems to work differently than the tutorials, and I am getting confused. I think this is a case of version-mismatch confusion between docs, like the TensorFlow 1 vs 2 situation.
The code I posted (the last code block below) is reported to work for other people, and its last line is database.session.commit().
This is a brand new Debian Stable VM with only VSCode, terminator and bpython installed beyond what is necessary for the application. THIS SHOULD WORK.
The database file is created but not populated: the tables are created, but not the columns or rows.
The sqlite3 shell shows that nothing is in the database, but the tables are there.
database.session.flush()
does not add the data either.
database.session.query(User).all()
also returns an empty thingamabob,
and
database.session.query(User).filter(User.user_id)
returns: SQLAlchemy object has no attribute "query"
I'm trying to make a function that returns the user object based on its user_id and then change a variable in that user object using
object.field = "blorp"
I can make an object and access it like
user = User()
user.user_id
but for some strange reason, even though I have defaults set, those fields aren't populated with the defaults. I have to explicitly define the fields during the declaration.
But I can assign stuff like:
user.user_id = 2
This query object returns ALL of the users, right?
>>> asdf = database.session.query(User)
>>> asdf
<flask_sqlalchemy.BaseQuery object at 0x7f49aacff780>
the "all()" method returns an empty array for both class.query and database.session.query
>>> users = User.query.all()
>>> users
[]
>>> users = database.session.query(User).all()
>>> users
[]
Using the following test code:
from flask.config import Config
from flask import Flask, render_template, Response, Request ,Config
from flask_sqlalchemy import SQLAlchemy

HTTP_HOST = "gamebiscuits"
ADMIN_NAME = "Emperor of Sol"
ADMIN_PASSWORD = "password"
ADMIN_EMAIL = "game_admin"
DANGER_STRING= "TACOCAT"

class Config(object):
    SQLALCHEMY_DATABASE_URI = 'sqlite:///' + HTTP_HOST + '.db'
    SQLALCHEMY_TRACK_MODIFICATIONS = True

solar_empire_server = Flask(__name__ , template_folder="templates" )
solar_empire_server.config.from_object(Config)
database = SQLAlchemy(solar_empire_server)

def update_database(thing):
    database.session.add(thing)
    database.commit()

def user_by_id(id_of_user):
    return User.query.filter_by(user_id = id_of_user).first()

#without the flask migrate module, you need to instantiate
# databases with default values. That module wont be loaded
# yet during the creation of a NEW game
class User(database.Model):
    user_id = database.Column(database.Integer, default = 0, primary_key = True)
    username = database.Column(database.String(64), default = "tourist", index=True, unique=True)
    email = database.Column(database.String(120), default = DANGER_STRING , index=True, unique=True)
    password_hash = database.Column(database.String(128), default = DANGER_STRING)
    turns_run = database.Column(database.Integer, default = 0)
    cash = database.Column(database.Integer, default = 1000)

    def __repr__(self):
        return '<User id:{} name: {} >'.format(self.user_id , self.username)

class UserShip(User):
    ship_id = database.Column(database.String(128), default = "1")
    ship_name = database.Column(database.String(128), default = "goodship moop")

    def __repr__(self):
        return '<User id:{} name: {} >'.format(self.ship_id , self.ship_name)

admin = User(username=ADMIN_NAME, user_id = 1, email=ADMIN_EMAIL , password_hash = ADMIN_PASSWORD)
guest = User(username='guest', user_id = 2, email='test#game.net' , password_hash = 'password')
user = User()
usership = UserShip()
adminship = UserShip()
guestship = UserShip()

database.create_all()
database.session.add(admin)
database.session.add(guest)
database.session.add(user)
database.session.add(usership)
database.session.add(adminship)
database.session.add(guestship)
database.session.commit()
Okay, so when you see default here, it is applied on INSERT (see the SQLAlchemy documentation on column defaults).
To quote the key part here:
Column INSERT and UPDATE defaults refer to functions that create a default value for a particular column in a row as an INSERT or UPDATE statement is proceeding against that row, in the case where no value was provided to the INSERT or UPDATE statement for that column.
So these defaults will only be initialized when the object is added to the database; flushing won't have any effect on an object that hasn't been added (session.add). So User() is only going to create an object; it will not create a DB row until you add it (and flush / commit as necessary).
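To illustrate that behaviour, here is a minimal sketch using plain SQLAlchemy with an in-memory SQLite database and a hypothetical Player model (not the game code above):

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Player(Base):
    __tablename__ = 'players'
    id = Column(Integer, primary_key=True)
    cash = Column(Integer, default=1000)  # Python-side default, applied at INSERT

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

p = Player()
print(p.cash)      # None -- the default has not fired yet
session.add(p)
session.flush()    # the INSERT happens here and the default is applied
print(p.cash)      # 1000
session.commit()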

When I run a tweepy python script nothing happens on the associated twitter account

I've created a Python script with Tweepy that replies to suicidal tweets with a link to a support website. However, nothing happens when I run the code and tweet any of the code words from a different account. I'm opening and running the .py file in a command prompt.
Like I said, I've tried tweeting the specific words that should trigger it, but it does not reply.
import tweepy

#the following module is a file with the specific keys set in
#a dictionary to the given variable, don't want to show them due to
#privacy/security
from keys import keys

CONSUMER_KEY = keys['consumer_key']
CONSUMER_SECRET = keys['consumer_secret']
ACCESS_TOKEN = keys['access_token']
ACCESS_TOKEN_SECRET = keys['access_token_secret']

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

twts = api.search(q="suicide")
t = ['suicide',
     'kill myself',
     'hate myself',
     'Suicidal',
     'self-harm',
     'self harm']

for s in twts:
    for i in t:
        if i == s.text:
            sn = s.user.screen_name
            m = "#%s You are loved! For help, visit https://suicidepreventionlifeline.org/" % (sn)
            s = api.update_status(m, s.id)
It should reply with a help link, but it doesn't and I don't know what I did wrong in my code. Any help?
Replace:
if i == s.text:
with:
if i in s.text:
Or, better still, to match words regardless of case:
if i.lower() in s.text.lower():
Because 'Suicidal' (and the other words from the t array) will never be equal to the whole tweet text.
I guess you want to check whether the text contains the word.
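For reference, a minimal sketch of the corrected reply loop under that change (same twts, t, and api variables as above; the '@' mention and keyword-argument form are illustrative):

for s in twts:
    for i in t:
        # substring match, case-insensitive, instead of comparing the whole tweet
        if i.lower() in s.text.lower():
            sn = s.user.screen_name
            m = "@%s You are loved! For help, visit https://suicidepreventionlifeline.org/" % sn
            api.update_status(m, in_reply_to_status_id=s.id)
            break  # reply once per tweet, then move on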

How to retrieve all historical public tweets with Twitter Premium Search API in Sandbox version (using next token)

I want to download all historical tweets with certain hashtags and/or keywords for a research project. I got the Premium Twitter API for that. I'm using the amazing TwitterAPI to take care of auth and so on.
My problem now is that I'm not an expert developer and I have some issues understanding how the next token works, and how to get all the tweets in a csv.
What I want to achieve is to have all the tweets in one single csv, without having to manually change the dates of the fromDate and toDate values. Right now I don't know how to get the next token and how to use it to concatenate requests.
So far I got here:
from TwitterAPI import TwitterAPI
import csv
SEARCH_TERM = 'my-search-term-here'
PRODUCT = 'fullarchive'
LABEL = 'here-goes-my-dev-env'
api = TwitterAPI("consumer_key",
"consumer_secret",
"access_token_key",
"access_token_secret")
r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
                {'query':SEARCH_TERM,
                 'fromDate':'200603220000',
                 'toDate':'201806020000'
                 }
                )

csvFile = open('2006-2018.csv', 'a')
csvWriter = csv.writer(csvFile)

for item in r:
    csvWriter.writerow([item['created_at'], item['user']['screen_name'], item['text'] if 'text' in item else item])
I would be really thankful for any help!
Cheers!
First of all, TwitterAPI includes a helper class that will take care of this for you. TwitterPager works with many types of Twitter endpoints, not just Premium Search. Here is an example to get you started: https://github.com/geduldig/TwitterAPI/blob/master/examples/page_tweets.py
But to answer your question, the strategy you should take is to put the request you currently have inside a while loop. Then,
1. Each request will return a next field which you can get with r.json()['next'].
2. When you are done processing the current batch of tweets and ready for your next request, you would include the next parameter set to the value above.
3. Finally, a request will eventually not include a next field in the returned JSON. At that point, break out of the while loop.
Something like the following.
next = ''
while True:
    r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
                    {'query':SEARCH_TERM,
                     'fromDate':'200603220000',
                     'toDate':'201806020000',
                     'next':next})
    if r.status_code != 200:
        break
    for item in r:
        csvWriter.writerow([item['created_at'], item['user']['screen_name'], item['text'] if 'text' in item else item])
    json = r.json()
    if 'next' not in json:
        break
    next = json['next']
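For completeness, a minimal sketch of the TwitterPager approach mentioned at the top of this answer, which handles the next token internally (assuming the same api, SEARCH_TERM, PRODUCT, LABEL, and csvWriter as above; the constructor and iterator names follow the linked TwitterPager example):

from TwitterAPI import TwitterPager

pager = TwitterPager(api,
                     'tweets/search/%s/:%s' % (PRODUCT, LABEL),
                     {'query': SEARCH_TERM,
                      'fromDate': '200603220000',
                      'toDate': '201806020000'})

for item in pager.get_iterator():
    # the pager yields tweets across pages, requesting the next page as needed
    if 'text' in item:
        csvWriter.writerow([item['created_at'], item['user']['screen_name'], item['text']])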

Getting the correct information from differently formulated queries

Howdo people,
I'm trying to put together a limited Q&A program that will allow the user to query Wikidata using SPARQL with very specific/limited query structures.
I've got the program going, but I'm running into issues when entering queries that are formulated differently.
def sparql_query(line):
    m = re.search('What is the (.*) of (.*)?', line)
    relation = m.group(1)
    entity = m.group(2)
    wdparams['search'] = entity
    json = requests.get(wdapi, wdparams).json()
    for result in json['search']:
        entity_id = result['id']
    wdparams['search'] = relation
    wdparams['type'] = 'property'
    json = requests.get(wdapi, wdparams).json()
    for result in json['search']:
        relation_id = result['id']
    fire_sparql(entity_id, relation_id)
    return fire_sparql
As you can see, this only works with queries following that specific structure, for example "What is the color of night?" Queries along the lines of 'What are the ingredients of pizza?' would simply cause the program to crash because they don't follow the 'correct' structure as set in the code. As such, I would like it to be able to differentiate between different types of query structures ("What is.." and "What are.." for example) and still collect the needed information (relation/property and entity).
This setup is required, insofar as I can determine, seeing as the property and entity need to be extracted from the query in order to get the proper results from Wikidata. This is unfortunately also what I'm running into problems with; I can't seem to use 'if' or 'while-or' statements without the code returning all sorts of issues.
So the question being: How can I make the code accept differently formulated queries whilst still retrieving the needed information from them?
Many thanks in advance.
The entirety of the code in case required:
#!/usr/bin/python3
import sys
import requests
import re

def main():
    example_queries()
    for line in sys.stdin:
        line = line.rstrip()
        answer = sparql_query(line)
        print(answer)

def example_queries():
    print("Example query?\n\n Ask your question.\n")

wdapi = 'https://www.wikidata.org/w/api.php'
wdparams = {'action': 'wbsearchentities', 'language': 'en', 'format': 'json'}

def sparql_query(line):
    m = re.search('What is the (.*) of (.*)', line)
    relation = m.group(1)
    entity = m.group(2)
    wdparams['search'] = entity
    json = requests.get(wdapi, wdparams).json()
    for result in json['search']:
        entity_id = result['id']
    wdparams['search'] = relation
    wdparams['type'] = 'property'
    json = requests.get(wdapi, wdparams).json()
    for result in json['search']:
        relation_id = result['id']
    fire_sparql(entity_id, relation_id)
    return fire_sparql

url = 'https://query.wikidata.org/sparql'

def fire_sparql(ent, rel):
    query = 'SELECT * WHERE { wd:' + ent + ' wdt:' + rel + ' ?answer.}'
    print(query)
    data = requests.get(url, params={'query': query, 'format': 'json'}).json()
    for item in data['results']['bindings']:
        for key in item:
            if item[key]['type'] == 'literal':
                print('{} {}'.format(key, item[key]['value']))
            else:
                print('{} {}'.format(key, item[key]))

if __name__ == "__main__":
    main()
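One possible direction for the question above (a sketch, not code from the original thread): try a small list of regex patterns in order and use the first one that matches, so that both "What is the X of Y?" and "What are the X of Y?" yield a relation and an entity. The pattern list and parse_question helper below are hypothetical.

import re

# Hypothetical pattern list for illustration; extend with more phrasings as needed.
PATTERNS = [
    r'What is the (.*) of (.*)\?',
    r'What are the (.*) of (.*)\?',
]

def parse_question(line):
    """Return (relation, entity) from the first matching pattern, or None."""
    for pattern in PATTERNS:
        m = re.search(pattern, line)
        if m:
            return m.group(1), m.group(2)
    return None

parsed = parse_question('What are the ingredients of pizza?')
if parsed:
    relation, entity = parsed   # ('ingredients', 'pizza')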

Using if elif statements with Tweepy in Python

I'm trying to create a bit of code to listen for certain keywords on Twitter and am struggling to get the results I want.
I'm using Python 3.4.3.
Here's what I have that's working so far...
import tweepy
from time import sleep

CONSUMER_KEY = 'abcabcabc'
CONSUMER_SECRET = 'abcabcabc'
ACCESS_KEY = 'abcabcabc'
ACCESS_SECRET = 'abcabcabc'

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
auth.secure = True
api = tweepy.API(auth)

class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            print(status.text)
        except:
            print("false")

myStreamListener = MyStreamListener
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener())
myStream.filter(track=['Cycling', 'Running'])
I'm trying to use if and elif statements to print different results depending on whether the tweet is about Cycling or Running. I've used this code, but can't get it to work...
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        if 'Cycling' in myStreamListener:
            print('Cyclist' + status.text)
        elif 'Running' in myStreamListener:
            print('Runner' + status.text)
        else:
            print('false' + status.text)

myStreamListener = MyStreamListener
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener())
myStream.filter(track=['Cycling', 'Running'])
I can get the if & elif to work offline when not adding the complexity of Tweepy into the equation, but am confused about exactly how to use the in operator.
I'm new to Python so will more than likely be making some simple mistakes!
Any help would be greatly appreciated,
Many thanks,
Matt
You are using in with the wrong variable (myStreamListener points to the listener class, not to the tweet). Use the status variable instead:
if 'cycling' in status.text.lower():
    print('Cyclist' + status.text)
Note that I added .lower() to the end of status.text. That way it will match the word 'cycling' regardless of whether it is capitalised.
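Putting that together, a minimal sketch of the corrected listener (the print labels are illustrative; it assumes the same api setup as above):

class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        text = status.text.lower()
        if 'cycling' in text:
            print('Cyclist: ' + status.text)
        elif 'running' in text:
            print('Runner: ' + status.text)
        else:
            print('Other: ' + status.text)

myStream = tweepy.Stream(auth=api.auth, listener=MyStreamListener())
myStream.filter(track=['Cycling', 'Running'])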
