Searching a lot of keywords on twitter via tweepy - python-3.x

I am trying to write Python code with tweepy that collects all tweets from a specific country, posted since a given date, that contain any of a chosen set of keywords. I have chosen quite a lot of keywords, about 24-25.
My keywords are: vigilance anticipation interesting ecstacy joy serenity admiration trust acceptance terror fear apprehensive amazement surprize distraction grief sadness pensiveness loathing disgust boredom rage anger annoyance.
For context, my code so far is:
places = api.geo_search(query="Canada", granularity="country")
place_id = places[0].id
public_tweets = tweepy.Cursor(api.search,
                              q="place:" + place_id + " since:2020-03-01",
                              lang="en",
                              ).items(num_tweets)
Please help me with this question as soon as possible.
Thank You
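One possible approach, sketched here under assumptions rather than taken from the question: fold the keywords into parenthesised OR groups inside the same q string, and split them over several queries if a single query would get too long. The helper name build_queries and the max_len default of 500 characters are assumptions for illustration, so check the query-length limit of the search endpoint you actually use.

```python
def build_queries(keywords, place_id, since, max_len=500):
    """Group keywords into OR queries that each stay within max_len characters.

    max_len=500 is an assumed cap, not a value taken from the question.
    A single keyword that is too long on its own is still emitted as-is.
    """
    suffix = f" place:{place_id} since:{since}"
    queries, group = [], []
    for keyword in keywords:
        candidate = group + [keyword]
        query = "(" + " OR ".join(candidate) + ")" + suffix
        if len(query) > max_len and group:
            # Adding this keyword would overflow: close the current group.
            queries.append("(" + " OR ".join(group) + ")" + suffix)
            group = [keyword]
        else:
            group = candidate
    if group:
        queries.append("(" + " OR ".join(group) + ")" + suffix)
    return queries


if __name__ == "__main__":
    demo = build_queries(["joy", "fear", "rage"], "abc", "2020-03-01", max_len=40)
    for q in demo:
        print(q)
```

Each returned string could then be passed as q to the same tweepy.Cursor(api.search, q=..., lang="en") call shown in the question, one query at a time.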

Related

Fine Tuning GPT-3 for Consistent Output Format

I am trying to use the OpenAI API to create quiz questions with three incorrect answers and one correct answer. The prompt I use is:
`Write a quiz on ${quiz_topic} with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer. Write each answer on a separate line`
I would like to parse ChatGPT's answer, but the format is inconsistent. For instance, if quiz_topic is "literature", I get the following response:
Q1. Who wrote the novel "The Catcher in the Rye"?
A. J.D. Salinger
B. Ernest Hemingway
C. F. Scott Fitzgerald
D. Mark Twain
This is the correct format. But if I use "U.S History", I get the following output:
Q1: Who was the first president of the United States?
A1: George Washington Jr., Thomas Jefferson, John Adams, George Washington
This is a different format. So my goal is to fine-tune it using the article here.
I have created a training set of 200 lines. Here is a sample:
{"prompt":"Write a quiz on general knowledge with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer.Write each answer on a separate line \n\n---\n\n","completion":" Q. What is the Italian word for \"tomato\"?\nA. Aglio\nA. Cipolla\nA. Peperoncino\nA. Pomodoro\n\nQ. Which restaurant's mascot is a clown?\nA. Whataburger\nA. Burger King\nA. Sonic\nA. McDonald's\n\nQ. Which of these colours is NOT featured in the logo for Google?\nA. Yellow\nA. Blue\nA. Green\nA. Pink\n\nQ. In 2013 how much money was lost by Nigerian scams?\nA. $95 Million\nA. $956 Million\nA. $2.7 Billion\nA. $12.7 Billion\n\nQ. What is the name of Poland in Polish?\nA. Pupcia\nA. Polszka\nA. Póland\nA. Polska\n\n \n\n###\n\n"}
{"prompt":"Write a quiz on books with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer.Write each answer on a separate line \n\n---\n\n","completion":" Q. What is the name of Eragon's dragon in \"Eragon\"?\nA. Glaedr\nA. Thorn\nA. Arya\nA. Saphira\n\nQ. In the \"The Hobbit\", who kills Smaug?\nA. Bilbo Baggins\nA. Gandalf the Grey\nA. Frodo\nA. Bard\n\nQ. What is Hermione Granger's middle name?\nA. Jane\nA. Emma\nA. Jo\nA. Jean\n\nQ. According to The Hitchhiker's Guide to the Galaxy book, the answer to life, the universe and everything else is...\nA. Loving everyone around you\nA. Chocolate\nA. Death\nA. 42\n\nQ. What is the name of the three headed dog in Harry Potter and the Sorcerer's Stone?\nA. Spike\nA. Poofy\nA. Spot\nA. Fluffy\n\n \n\n###\n\n"}
When I run the validation tool with the command
openai tools fine_tunes.prepare_data -f training.jsonl
I get the following message
- All prompts start with prefix `Write a quiz on `. Fine-tuning doesn't require the instruction specifying the task, or a few-shot example scenario. Most of the time you should only add the input data into the prompt, and the desired output into the completion
I don't understand why I must remove "Write a quiz on", so I may have misunderstood how to fine-tune a model for consistent formatting.
Can anybody shed light on how to make sure I get the same formatting with the same prompt?
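One way to read the validator's message is that the prompt should carry only the input that varies (here, the topic), while the completion carries the fixed format. The sketch below builds one training line under that reading. The helper name make_record and the tuple layout of the questions are hypothetical; the separator and end marker are copied from the sample above. Using json.dumps also takes care of escaping quotes inside questions.

```python
import json

SEPARATOR = "\n\n---\n\n"  # prompt separator, as in the sample training lines
END_MARKER = "\n\n###\n\n"  # completion end marker, as in the sample training lines


def make_record(topic, questions):
    """Return one JSONL line for a quiz.

    questions is a list of (question, wrong_answers, correct_answer) tuples;
    the correct answer is always written last, matching the original prompt.
    """
    blocks = []
    for question, wrong_answers, correct_answer in questions:
        lines = [f"Q. {question}"]
        lines += [f"A. {answer}" for answer in list(wrong_answers) + [correct_answer]]
        blocks.append("\n".join(lines))
    completion = " " + "\n\n".join(blocks) + END_MARKER
    # The prompt holds only the topic, not the full instruction sentence.
    record = {"prompt": topic + SEPARATOR, "completion": completion}
    return json.dumps(record, ensure_ascii=False)


if __name__ == "__main__":
    print(make_record("books", [
        ('What is the name of Eragon\'s dragon in "Eragon"?',
         ["Glaedr", "Thorn", "Arya"], "Saphira"),
    ]))
```

At inference time the prompt would then need the same shape (topic followed by the separator), with the end marker used as the stop sequence.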

400 Error when trying to use jsonlines file with OpenAI answers

I uploaded a file to OpenAI with the purpose "answers". I can't figure out how to access it without getting an error.
This is the contents of my file:
{"text":"We are focusing on improving communication to the customers."}
{"text":"The emphasis on the customer is working well."}
{"text":"Services initiatives"}
{"text":"CMM and focus on customer service."}
{"text":"We are working on CMM to focus on the customer."}
{"text":"Our emphasis on making our customers number one."}
{"text":"Customer relationships are continuing to improve."}
This is the information about the file when I query openai:
{"object":"file","id":"file-02fdY5PuZ1aO4cnd2r1PxaB7","purpose":"answers","filename":"f3f1c105-5217-4132-8950-a040b6183ed7","bytes":458,"created_at":1648396701,"status":"processed","status_details":null}
Now I ask a question at https://api.openai.com/v1/answers with this POST:
{"file":"file-02fdY5PuZ1aO4cnd2r1PxaB7","question":"What is the main theme?","search_model": "davinci","model": "davinci","examples_context": "In 2017, U.S. life expectancy was 78.6 years.","examples": [["What is human life expectancy in the United States?","The life expectancy in the United States is 78 years."]],"max_tokens": 100,"stop": ["\n", "<|endoftext|>"]}
The server returns a 400 error. If I replace the "file" field in my request with a "documents" array containing the same data, the request is successful.
I don't know what else to look at. If someone can suggest something, or if you see an issue in what I am trying to do, I would deeply appreciate hearing from you.
Thanks
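Since the "documents" variant is reported to work, a possible stopgap is to build that array from the same JSON Lines content. This is only a sketch of the workaround described above, not an explanation of the 400 error. The helper names jsonl_to_documents and build_answers_body are hypothetical, and the request fields are copied from the POST body in the question.

```python
import json


def jsonl_to_documents(jsonl_text):
    """Collect the "text" field of every non-empty JSON Lines row."""
    documents = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if line:
            documents.append(json.loads(line)["text"])
    return documents


def build_answers_body(documents, question):
    """Build the request body from the question, with "documents" instead of "file"."""
    return {
        "documents": documents,
        "question": question,
        "search_model": "davinci",
        "model": "davinci",
        "examples_context": "In 2017, U.S. life expectancy was 78.6 years.",
        "examples": [["What is human life expectancy in the United States?",
                      "The life expectancy in the United States is 78 years."]],
        "max_tokens": 100,
        "stop": ["\n", "<|endoftext|>"],
    }


if __name__ == "__main__":
    sample = '{"text":"Services initiatives"}\n{"text":"CMM and focus on customer service."}\n'
    print(json.dumps(build_answers_body(jsonl_to_documents(sample), "What is the main theme?")))
```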

Python/Selenium - How to process extracted data from App store

I am using Selenium/Python to parse reviews from the Apple App Store. I used the following code to extract the data for the first five reviews:
URL: https://apps.apple.com/us/app/lemonade-insurance/id1055653645#see-all/reviews
wait = WebDriverWait(driver, 5)
response_ratings = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".we-customer-review")))
response_container = []
for e in response_ratings[:5]:
    response_container.append(e.get_attribute('innerHTML'))
print(response_container[0])
Then I print the first output.
For the first review, I expect to get the rating (5 out of 5), the date (July 6, 2019), the title (Convenient and Affordable!!!!), the review text (The Lemonade app is so easy to use as well as having affordable rates!...), and the developer response (Thanks so much for your awesome review!! We're so happy to have you in...).
How do I get the above info? Thank you in advance for the help.
You can use BeautifulSoup to parse the innerHTML and get what you're looking for.
One way of doing it would be:
import re
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

link = 'https://apps.apple.com/us/app/lemonade-insurance/id1055653645#see-all/reviews'
stars = re.compile(r"\d out of \d")

with webdriver.Chrome() as driver:
    wait = WebDriverWait(driver, 10)
    driver.get(link)
    elements = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".we-customer-review")))
    for elem in elements:
        s = BeautifulSoup(elem.get_attribute("innerHTML"), "html.parser")
        review_date = s.find("time").text
        review_body = s.find("p").text
        review_title = s.find("h3", {"data-test-customer-review-title": ""}).text.strip()
        review_stars = ''.join(re.findall(stars, str(s.find("figure"))))
        dev_response = s.find_all("p", {"data-test-bidi": ""})
        print(f"{review_title} | {review_date} | {review_stars}")
        print(review_body)
        print(dev_response[1].text if len(dev_response) > 1 else "")
        print("-" * 80)
This prints:
Convenient and Affordable!!!! | 07/06/2019 | 5 out of 5
The Lemonade app is so easy to use as well as having affordable rates! It took me all of 15 minutes to sign-up, pick a coverage and a deductible. Very nice customer service as well as very informative. At anytime of any day of the week I can log onto my account and check everything as well as make any necessary changes that I may need or want. The chat option feature works fantastic! Whenever I have a question I just go to the chat feature and within seconds someone is there to help and answer all my questions regarding their services and my plan coverage. I wished they had a referral feature cause I’ve already set up a couple of family members with the company as well. They were amazed that it only took about 10-15 minutes to setup and just how affordable it is!! I’ve gotten quotes from so many other companies but the monthly payments and deductible were too expensive, I was a little hesitant at first but I said hey I should at least give it a try and so far so good!!! I’m hoping that it’ll never come to the point where I’ll actually need to file a claim but if so I feel confident that the process will be easy and stress free considering how much stress I’m going to actually have due to a burglary or theft. I have faith that we’re going to have a very long relationship. Thanks to all the developers of the Lemonade App, the name of the company is nice too!!!!
Thanks so much for your awesome review!! We're so happy to have you in our Lemonade community!
--------------------------------------------------------------------------------
Made the best lemonade I’ve ever had! | 07/01/2018 | 5 out of 5
I have been telling EVERYONE about Lemonade. I don’t know how, but, getting insurance through your company is actually FUN! I have never had so much FUN doing a chore that typically involves a boring Q & A. The app really made me feel like I was getting insurance through a friend. I smiled with the shout out to my horoscope sign after entering my birthday, I loved the “making lemonade” process when getting the quote, and (because I have a future date for the policy to start) I absolutely adore the countdown. I look at it almost daily and become even more excited for my move (and I really don’t like having to move so this is really helping). The use of unclaimed funds going to charity actually makes my heart melt. By providing freedom to choose what organization you would like to contribute to truly makes me feel like I am giving back in some way, and it is beyond noble and inspiring for you to use that money to help others turn lemons into lemonade. Also, your customer service has been impeccable. Every question I have had has been answered quickly and by a friendly representative of the company. I don’t know much about the insurance world, other than we need to have it, but, you make me want to work for you!!! Where do I sign up?
--------------------------------------------------------------------------------
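If the extracted strings need further processing, for example to get the date in the "July 6, 2019" form the question asks for, a small post-processing step could look like the sketch below. The function name normalize_review is hypothetical, and it assumes the page keeps returning dates as MM/DD/YYYY and ratings as "N out of M", as in the output above.

```python
import re
from datetime import datetime

STARS = re.compile(r"(\d) out of (\d)")


def normalize_review(title, date_text, stars_text):
    """Turn the scraped strings into a small dict with typed values."""
    date = datetime.strptime(date_text.strip(), "%m/%d/%Y")  # assumed MM/DD/YYYY
    match = STARS.search(stars_text)
    return {
        "title": title.strip(),
        # Month name comes from strftime; day is written without zero padding.
        "date": f"{date.strftime('%B')} {date.day}, {date.year}",
        "rating": int(match.group(1)) if match else None,
        "max_rating": int(match.group(2)) if match else None,
    }


if __name__ == "__main__":
    print(normalize_review("Convenient and Affordable!!!!", "07/06/2019", "5 out of 5"))
```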

Finding Related Topics using Google Knowledge Graph API

I'm currently working on a behavioral targeting application, and I need a considerably large keyword database/tool/provider that lets an application find similar keywords for a given keyword. I recently found that Freebase had been providing a similar service before Google acquired it and integrated it into their Knowledge Graph. I was wondering if it's possible to get a list of related topics/keywords for a given entity.
import json
import urllib.parse
import urllib.request

api_key = 'API_KEY_HERE'
query = 'Yoga'
service_url = 'https://kgsearch.googleapis.com/v1/entities:search'
params = {
    'query': query,
    'limit': 10,
    'indent': True,
    'key': api_key,
}
# Python 3: urlencode and urlopen live in urllib.parse and urllib.request
url = service_url + '?' + urllib.parse.urlencode(params)
response = json.loads(urllib.request.urlopen(url).read())
for element in response['itemListElement']:
    print(element['result']['name'] + ' (' + str(element['resultScore']) + ')')
The script above returns the results below, though I'd like to receive topics related to yoga, such as health, fitness, gym and so on, rather than things that have the word "Yoga" in their name.
Yoga Sutras of Patanjali (71.245544)
Yōga, Tokyo (28.808222)
Sri Aurobindo (28.727333)
Yoga Vasistha (28.637642)
Yoga Hosers (28.253984)
Yoga Lin (27.524054)
Patanjali (27.061115)
Yoga Journal (26.635073)
Kripalu Center (26.074436)
Yōga Station (25.10318)
I'd really appreciate any suggestions, and I'm also open to using any other API if there is any that I could make use of. Cheers.
I see your point :) Here's the script I use for that, built on Serpstat's API. Here's how it works:
The script collects keywords from Serpstat's database.
Then it collects search suggestions from Serpstat's database.
Finally, it collects search suggestions from Google's suggestions.
Note that for the script to work correctly, it's preferable to fill in all input boxes, although not all of them are required.
Keyword: the required keyword.
Search Engine: the search engine for which the analysis will be carried out. For example, for US Google, set g_us. The entire list of available search engines can be found here.
Limit: the maximum number of phrases from the organic search results that will take part in the analysis. You cannot set more than 1000 here.
Default keys: a list of one- or two-word keywords. Give each of them some "weight" so that you still receive some kind of result if something goes wrong.
Format: type; keyword; weight. Every keyword should be written on a new line.
Types:
w: one word
p: two words
Examples:
"w; bottle; 50": the initial weight of the word bottle is 50.
"p; plastic bottle; 30": the initial weight of the phrase plastic bottle is 30.
"w; plastic bottle; 20": incorrect. You cannot use a two-word phrase for the "w" type.
Bad words: a comma-separated list of words you want the script to exclude from the results.
Token: your token for API access. It can be found on your profile page.
You can download the source code for the script here.
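To make the "Default keys" format concrete, here is a minimal validation sketch. It is not taken from the Serpstat script; parse_default_key is a hypothetical helper that only mirrors the rules listed above (semicolon-separated fields, "w" for one word, "p" for two words).

```python
def parse_default_key(line):
    """Parse one 'type; keyword; weight' line and check it against its type."""
    parts = [part.strip() for part in line.split(";")]
    if len(parts) != 3:
        raise ValueError(f"expected 'type; keyword; weight', got: {line!r}")
    kind, keyword, weight = parts
    expected_words = {"w": 1, "p": 2}  # w: one word, p: two words
    if kind not in expected_words:
        raise ValueError(f"unknown type: {kind!r}")
    if len(keyword.split()) != expected_words[kind]:
        raise ValueError(f"type {kind!r} needs {expected_words[kind]} word(s): {keyword!r}")
    return kind, keyword, int(weight)


if __name__ == "__main__":
    print(parse_default_key("w; bottle; 50"))
    print(parse_default_key("p; plastic bottle; 30"))
```

The third example from the list, "w; plastic bottle; 20", raises a ValueError here because a two-word phrase is given for the "w" type.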

Searching closest venues using ll and radius not working properly

I know there are a lot of questions about this issue, but I've reached a point where I can't do anything else but ask whether somebody has a solution for it...
Using the Foursquare API explorer to test my query, I can't seem to get an accurate, or even a good, fit for the data I need.
It is quite simple: I need the closest venue to a set of coordinates. I don't mind getting no results if nothing is found nearby.
So, reading the API documentation (https://developer.foursquare.com/docs/venues/venues), I concluded that I need a search and not an explore, because I don't want suggestions of recommended venues (and the results when I tested it confirmed that it wasn't what I was expecting).
So, using the search API, I want to find places (ideally the place, but places would do...) close to these coordinates:
ll=37.424782,-122.162989
considering that I want places close by, I add
radius=51
and I don't really want many results
limit=2
from the documentation I see that radius is
Only valid for requests with intent=browse, or requests with
intent=checkin and categoryId or query
so, I use
intent=browse
which brings my query to:
venues/search?intent=browse&ll=37.424782,-122.162989&radius=51&limit=2
Query Result:
https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dbrowse%26ll%3D37.424782,-122.162989%26radius%3D51%26limit%3D2
Here we can see that the first result is well outside the radius (distance: 135),
while the second result is fine (distance: 50).
What am I doing wrong to get these results? If I increase the limit, all I get is more results that are also outside the radius. I could iterate through them and find the one with the smallest distance, but I have no guarantee that the closest result will be among the top X that I limit to, and even if I had that guarantee, it would be a tiresome solution to an apparently simple question...
Thanks for the help...
Marc
EDIT:
I managed to make the query perform as I intended, but I had to add all of the parent categories from:
https://developer.foursquare.com/categorytree
categoryId=
4d4b7104d754a06370d81259, Arts & Entertainment
4d4b7105d754a06372d81259, College & University
4d4b7105d754a06373d81259, Event
4d4b7105d754a06374d81259, Food
4d4b7105d754a06376d81259, Nightlife Spot
4d4b7105d754a06377d81259, Outdoors & Recreation
4d4b7105d754a06375d81259, Professional & Other Places
4e67e38e036454776db1fb3a, Residence
4d4b7105d754a06378d81259, Shop & Service
4d4b7105d754a06379d81259, Travel & Transport
making my query into:
venues/search?intent=checkin&ll=37.424782,-122.162989&radius=60&categoryId=4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
Query Result:
https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dcheckin%26ll%3D37.424782,-122.162989%26radius%3D60%26categoryId%3D4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
It still returns results outside of my radius, but within an acceptable error margin. It is weird, however.
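Because each result reports its own distance, the stray venues can also be dropped on the client side after the request. The sketch below assumes the venues have already been flattened into dicts with "name" and "distance" keys; that flat shape and the helper name closest_within are assumptions, so adapt the lookup to wherever the distance sits in the real response.

```python
def closest_within(venues, radius_m):
    """Return the nearest venue whose reported distance is within radius_m, or None."""
    in_range = [venue for venue in venues if venue["distance"] <= radius_m]
    return min(in_range, key=lambda venue: venue["distance"], default=None)


if __name__ == "__main__":
    # Distances taken from the two results described in the question.
    results = [{"name": "first result", "distance": 135},
               {"name": "second result", "distance": 50}]
    print(closest_within(results, 51))
```

This does not address the concern raised earlier that the closest venue might not be among the returned results at all; it only filters what the API did return.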
Although this question is old, I'm responding for others. I was working on something similar recently, and what I learned was that in order to use radius you also need to use the 'query' parameter. I used the star character '*' and it worked for me. I have to say, though, that the limit of 50 is something I haven't solved yet; I'm working on it at the moment.
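As a rough sketch of that suggestion, the request path could be assembled like this. The function name build_search_path is hypothetical, the parameter names are the ones used in the question, and defaulting intent to browse is an assumption, since the answer does not say which intent was used.

```python
from urllib.parse import urlencode


def build_search_path(lat, lng, radius_m, limit, query="*", intent="browse"):
    """Build the venues/search path with a wildcard query alongside radius."""
    params = {
        "intent": intent,
        "ll": f"{lat},{lng}",
        "radius": radius_m,
        "limit": limit,
        "query": query,
    }
    # safe=",*" keeps the comma in ll and the star in query readable.
    return "venues/search?" + urlencode(params, safe=",*")


if __name__ == "__main__":
    print(build_search_path(37.424782, -122.162989, 51, 2))
```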
