Python/Selenium - How to process extracted data from App store - python-3.x

I am using Selenium/Python to parse reviews from the Apple App Store. I used the following code to extract the data for the first five reviews:
URL: https://apps.apple.com/us/app/lemonade-insurance/id1055653645#see-all/reviews
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# driver is an already-initialized webdriver that has opened the URL above
wait = WebDriverWait(driver, 5)
response_ratings = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".we-customer-review")))
response_container = []
for e in response_ratings[:5]:
    response_container.append(e.get_attribute('innerHTML'))
print(response_container[0])
Then I print the first item of the output, which is raw innerHTML.
For the first review I expect to get: stars 5 out of 5, date July 6, 2019, title "Convenient and Affordable!!!!", review "The Lemonade app is so easy to use as well as having affordable rates!...", and Developer Response "Thanks so much for your awesome review!! We're so happy to have you in...".
How do I get the above info? Thank you in advance for the help

You can use BeautifulSoup to parse the innerHTML and get what you're looking for.
One way of doing it would be:
import re
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

link = 'https://apps.apple.com/us/app/lemonade-insurance/id1055653645#see-all/reviews'
stars = re.compile(r"\d out of \d")

with webdriver.Chrome() as driver:
    wait = WebDriverWait(driver, 10)
    driver.get(link)
    elements = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".we-customer-review")))
    for elem in elements:
        # Parse each review's innerHTML with BeautifulSoup
        s = BeautifulSoup(elem.get_attribute("innerHTML"), "html.parser")
        review_date = s.find("time").text
        review_body = s.find("p").text
        review_title = s.find("h3", {"data-test-customer-review-title": ""}).text.strip()
        # The star rating is embedded in the <figure> markup as "N out of 5"
        review_stars = ''.join(re.findall(stars, str(s.find("figure"))))
        # All <p data-test-bidi> paragraphs; the second one, if present, is the developer response
        dev_response = s.find_all("p", {"data-test-bidi": ""})
        print(f"{review_title} | {review_date} | {review_stars}")
        print(review_body)
        print(dev_response[1].text if len(dev_response) > 1 else "")
        print("-" * 80)
This prints:
Convenient and Affordable!!!! | 07/06/2019 | 5 out of 5
The Lemonade app is so easy to use as well as having affordable rates! It took me all of 15 minutes to sign-up, pick a coverage and a deductible. Very nice customer service as well as very informative. At anytime of any day of the week I can log onto my account and check everything as well as make any necessary changes that I may need or want. The chat option feature works fantastic! Whenever I have a question I just go to the chat feature and within seconds someone is there to help and answer all my questions regarding their services and my plan coverage. I wished they had a referral feature cause I’ve already set up a couple of family members with the company as well. They were amazed that it only took about 10-15 minutes to setup and just how affordable it is!! I’ve gotten quotes from so many other companies but the monthly payments and deductible were too expensive, I was a little hesitant at first but I said hey I should at least give it a try and so far so good!!! I’m hoping that it’ll never come to the point where I’ll actually need to file a claim but if so I feel confident that the process will be easy and stress free considering how much stress I’m going to actually have due to a burglary or theft. I have faith that we’re going to have a very long relationship. Thanks to all the developers of the Lemonade App, the name of the company is nice too!!!!
Thanks so much for your awesome review!! We're so happy to have you in our Lemonade community!
--------------------------------------------------------------------------------
Made the best lemonade I’ve ever had! | 07/01/2018 | 5 out of 5
I have been telling EVERYONE about Lemonade. I don’t know how, but, getting insurance through your company is actually FUN! I have never had so much FUN doing a chore that typically involves a boring Q & A. The app really made me feel like I was getting insurance through a friend. I smiled with the shout out to my horoscope sign after entering my birthday, I loved the “making lemonade” process when getting the quote, and (because I have a future date for the policy to start) I absolutely adore the countdown. I look at it almost daily and become even more excited for my move (and I really don’t like having to move so this is really helping). The use of unclaimed funds going to charity actually makes my heart melt. By providing freedom to choose what organization you would like to contribute to truly makes me feel like I am giving back in some way, and it is beyond noble and inspiring for you to use that money to help others turn lemons into lemonade. Also, your customer service has been impeccable. Every question I have had has been answered quickly and by a friendly representative of the company. I don’t know much about the insurance world, other than we need to have it, but, you make me want to work for you!!! Where do I sign up?
--------------------------------------------------------------------------------
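A side note on the approach: you can often pull the same fields with Selenium alone, without round-tripping through BeautifulSoup. The sketch below reuses the elements list from the answer above; the bare h3, time, p, and figure selectors are assumptions based on the markup parsed there, and Apple may change the page structure at any time.
import re
from selenium.webdriver.common.by import By

stars_re = re.compile(r"\d out of \d")

# elements is the list of ".we-customer-review" WebElements from the answer above
for e in elements:
    title = e.find_element(By.CSS_SELECTOR, "h3").text.strip()
    date = e.find_element(By.CSS_SELECTOR, "time").text
    body = e.find_element(By.CSS_SELECTOR, "p").text
    # Same trick as above: the rating string is embedded in the <figure> markup
    figure_html = e.find_element(By.CSS_SELECTOR, "figure").get_attribute("outerHTML")
    rating = ''.join(stars_re.findall(figure_html))
    print(f"{title} | {date} | {rating}")
    print(body)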

Related

Kite autocomplete is not giving me free suggestions - only 30 per day - no pro suggestions

I installed the Kite autocomplete for Python (Anaconda-Spyder is my main IDE, though I also use SublimeText).
And I get this pic:
Kite Pro ripping me off
(Even when I do not get any starred autocompletions of any sort...)
Isn't it supposed to say 3 starred completions left today??
Even when I sign out and back in, uninstall Kite, clear my registry and remove temp and program files, I notice it somehow knows that I have used up some of its 30 completions and gives me only 30 per day.
Even if I sign out of Kite (no way for it to know I have an account with a Pro free trial, right??), I still get only 30 completions per day, and after that Kite is "locked" - null completions today...
And I don't even get the starred completions - the Pro completions.
So is it a mistake in my installation? Will adding Kite to my PATH help?? (I seriously doubt it.)
Or is it a bug on the devs' side??
Any help regarding this matter would be appreciated...
I simply cannot stand the fact that my bro gets FREE autocompletions per day, and unlimited ones at that (which I guess is the free tier of Kite, since he gets only 3 starred ones).
I think there might be a bug on Kite's side.
However, I do not face the problem mentioned above:
https://i.stack.imgur.com/259T8.png
I receive unlimited free-tier auto-completions and only 3 Pro completions. Hope this answer helps you!

Searching a lot of keywords on twitter via tweepy

I am trying to write Python code with Tweepy that will track all tweets from a specific country, posted since a given date, that contain any of my chosen keywords. I have chosen quite a lot of keywords, around 24-25.
My keywords are vigilance anticipation interesting ecstacy joy serenity admiration trust acceptance terror fear apprehensive amazement surprize distraction grief sadness pensiveness loathing disgust boredom rage anger annoyance.
For more context, my code so far is:
places = api.geo_search(query="Canada", granularity="country")
place_id = places[0].id
public_tweets = tweepy.Cursor(api.search,
                              q="place:" + place_id + " since:2020-03-01",
                              lang="en",
                              ).items(num_tweets)
Please help me with this question as soon as possible.
Thank You
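For reference, the v1.1 search endpoint that api.search wraps lets you combine multiple keywords in the query string with OR operators, so one way to fold the keyword list into the code above might look like this. This is only a sketch: it assumes the same api object, place_id, and num_tweets as in the question, and it trims the keyword list for brevity.
import tweepy

keywords = ["vigilance", "anticipation", "joy", "serenity", "admiration",
            "trust", "terror", "fear", "grief", "sadness", "disgust",
            "rage", "anger", "annoyance"]  # trimmed from the list above

# Join the keywords with OR so a tweet matching any of them is returned,
# and keep the place and date filters from the original query
query = "(" + " OR ".join(keywords) + ") place:" + place_id + " since:2020-03-01"

public_tweets = tweepy.Cursor(api.search,
                              q=query,
                              lang="en",
                              ).items(num_tweets)

for tweet in public_tweets:
    print(tweet.created_at, tweet.text)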

How to combine multiple columns in CSV file using pandas? [closed]

I have a CSV file of song lyrics that I took from Genius. Right now I'm preparing my data. I have two columns, "songs" and "artist". The "songs" column holds a lot of information: title, album, year, lyrics and URL. I need to separate the "songs" column into 5 columns.
I tried to split the data on commas like this:
df = pd.read_csv('output.csv', header=None)
df = pd.DataFrame(df[0].str.split(',').tolist())
But with this code I got 122 columns, because every comma inside the lyrics created another column.
I guess I have to keep all my lyrics inside double quotes, so that when I split on commas the full lyric remains in a single column.
Does someone know how I can do that?
Adding 1 sample of the data:
songs,artist
"{'title': 'Berzerk', 'album': 'The Marshall Mathers LP 2', 'year': '2013-08-27', 'lyrics': '[Verse 1]\nNow this shit\'s about to kick off, this party looks wack\nLet\'s take it back to straight hip-hop and start it from scratch\nI\'m \'bout to bloody this track up, everybody get back\nThat\'s why my pen needs a pad, \'cause my rhymes on the rag\nJust like I did with addiction, I\'m \'bout to kick it\nLike a magician, critics I turn to crickets\nGot \'em still on the fence whether to picket\nBut quick to get it impaled when I tell \'em, ""Stick it!""\nSo sick I\'m looking pale, wait, that\'s my pigment\n\'Bout to go ham, ya bish, shout out to Kendrick\nLet\'s bring it back to that vintage Slim, bitch!\nThe art of MCing mixed with da Vinci and MC Ren\nAnd I don\'t mean Stimpy\'s friend, bitch\nBeen Public Enemy since you thought PE was gym, bitch\n\n[Pre-Chorus]\nKick your shoes off, let your hair down\n(And go berserk) all night long\nGrow your beard out, just weird out\n(And go berserk) all night long\n\n[Chorus 1]\nWe\'re gonna rock this house until we knock it down\nSo turn the volume loud\n\'Cause it\'s mayhem \'til the A.M.\nSo, baby, make just like K-Fed\nAnd let yourself go, let yourself go\nSay ""Fuck it!"" before we kick the bucket\nLife\'s too short to not go for broke\nSo everybody, everybody, go berserk, grab your vial, yeah\n\n[Verse 2]\nGuess it\'s just the way that I\'m dressed, ain\'t it?\nKhakis pressed, Nike shoes crispy and fresh laced\nSo I guess it ain\'t that aftershave\nOr cologne that made \'em just faint\nPlus I showed up with a coat fresher than wet paint\nSo if love is a chess game, check mate\nBut girl, your body\'s bangin\', jump me in, dang, bang-bang\nYes siree \'Bob\', I was thinking the same thang\nSo come get on this Kid\'s rock, baw with da baw, dang-dang\nPow-p-p-p-pow, chica, pow, chica, wow-wow\nGot your gal blowin\' up a valve, valve-valve\nAin\'t slowin\' down, throw in the towel, towel-towel\nDumb it down, I don\'t know how, huh-huh, how-how\nAt least I know that I don\'t know\nQuestion is, are you bozos smart enough to feel stupid?\nHope so, now ho…\n\n[Pre-Chorus]\nKick your shoes off, let your hair down\n(And go berserk) all night long\nGrow your beard out, just weird out\n(And go berserk) all night long\n\n[Chorus 2]\nWe\'re gonna rock this house until we knock it down\nSo turn the volume loud\n\'Cause it\'s mayhem \'til the A.M.\nSo crank the bass up like crazy\nAnd let yourself go, let yourself go\nSay ""Fuck it!"" before we kick the bucket\nLife\'s too short to not go for broke\nSo everybody, everybody, go berzerk, get your vinyls!\n\n[Scratch]\n\n[Verse 3]\nThey say that love is powerful as cough syrup in styrofoam\nAll I know is I fell asleep and woke up in that Monte Carlo\nWith the ugly Kardashian, Lamar, oh\nSorry yo, we done both set the bar low\nFar as hard drugs are though, that\'s the past\nBut I done did enough codeine to knock Future into tomorrow\nAnd girl, I ain\'t got no money to borrow\nBut I am tryin\' to find a way to get you alone: car note\nOh, Marshall Mathers\nShithead with a potty mouth, get the bar of soap lathered\nKangol\'s and Carheartless Cargos\nGirl, you\'re fixin\' to get your heart broke\nDon\'t be absurd, ma\'am, you birdbrain, baby\nI ain\'t called anybody baby since Birdman, unless you\'re a swallow\nWord, Rick? 
(Word, man, you heard)\nBut don\'t get discouraged, girl\nThis is your jam, unless you got toe jam\n\n[Pre-Chorus]\nKick your shoes off, let your hair down\n(And go berserk) all night long\nGrow your beard out, just weird out\n(And go berserk) all night long\n\n[Chorus 1]\nWe\'re gonna rock this house until we knock it down\nSo turn the volume loud\n\'Cause it\'s mayhem \'til the A.M.\nSo, baby, make just like K-Fed\nAnd let yourself go, let yourself go\nSay ""Fuck it!"" before we kick the bucket\nLife\'s too short to not go for broke\nSo everybody, everybody, go berserk, grab your vial, yeah', 'image': 'https://images.genius.com/a47bb228d28fd8a0e6e73abfabef7832.1000x1000x1.jpg'}",Eminem
Try this.
import ast
import pandas as pd

raw = pd.read_csv("output.csv")
# Each "songs" cell is a Python dict literal stored as a string,
# so parse it back into a real dict instead of splitting on commas
raw["songs"] = raw["songs"].apply(lambda x: ast.literal_eval(x))
# Expand each dict into its own columns (title, album, year, lyrics, image)
songs = raw["songs"].apply(pd.Series)
# Stitch the artist column back on next to the expanded song fields
result = pd.concat([raw[["artist"]], songs], axis=1)
result.head()

Finding Related Topics using Google Knowledge Graph API

I'm currently working on a behavioral targeting application and I need a considerably large keyword database/tool/provider that lets my app find keywords similar to a given keyword. I recently found Freebase, which provided a similar service before Google acquired it and integrated it into the Knowledge Graph. I was wondering if it's possible to get a list of related topics/keywords for a given entity.
import json
import urllib
api_key = 'API_KEY_HERE'
query = 'Yoga'
service_url = 'https://kgsearch.googleapis.com/v1/entities:search'
params = {
    'query': query,
    'limit': 10,
    'indent': True,
    'key': api_key,
}
url = service_url + '?' + urllib.urlencode(params)
response = json.loads(urllib.urlopen(url).read())
for element in response['itemListElement']:
print element['result']['name'] + ' (' + str(element['resultScore']) + ')'
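(Aside: the snippet above is Python 2; a rough Python 3 equivalent of the same request, for anyone running it today, would swap in urllib.parse and urllib.request, roughly like this.)
import json
from urllib.parse import urlencode
from urllib.request import urlopen

api_key = 'API_KEY_HERE'
query = 'Yoga'
service_url = 'https://kgsearch.googleapis.com/v1/entities:search'
params = {
    'query': query,
    'limit': 10,
    'indent': True,
    'key': api_key,
}
url = service_url + '?' + urlencode(params)
response = json.loads(urlopen(url).read())
for element in response['itemListElement']:
    print(element['result']['name'], '(' + str(element['resultScore']) + ')')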
In either version, the script returns the entities below, though I'd like to receive topics related to yoga, such as health, fitness, gym and so on, rather than things that have the word "Yoga" in their name.
Yoga Sutras of Patanjali (71.245544)
Yōga, Tokyo (28.808222)
Sri Aurobindo (28.727333)
Yoga Vasistha (28.637642)
Yoga Hosers (28.253984)
Yoga Lin (27.524054)
Patanjali (27.061115)
Yoga Journal (26.635073)
Kripalu Center (26.074436)
Yōga Station (25.10318)
I'd really appreciate any suggestions, and I'm also open to using any other API if there is any that I could make use of. Cheers.
I see your point :) So here's the script I use for that, built on Serpstat's API. Here's how it works:
Script collects the keywords from Serpstat's database
Then, collects search suggestions from Serpstat's database
Finally, collects search suggestions from Google's suggestions
Note that for the script to work correctly it's preferable to fill in all input boxes, though not all of them are required.
Keyword — required keyword
Search Engine — a search engine for which the analysis will be carried out. For example, for the US Google, you need to set the g_us. The entire list of available search engines can be found here.
Limit — the maximum number of phrases from organic results that will take part in the analysis. You cannot set more than 1000 here.
Default keys — a list of default keywords (one- or two-word). You should give each of them some "weight" so you still receive some kind of result if something goes wrong.
Format: type; keyword; "weight". Every keyword should be written on a new line (see the parsing sketch after this list).
Types:
w — one word
p — two words
Examples:
"w; bottle; 50" — initial weight of word bottle is 50.
"p; plastic bottle; 30" — initial weight of phrase plastic bottle is 30.
"w; plastic bottle; 20" — incorrect. You cannot use a two-word phrase for the "w" type.
Bad words — comma-separated list of words you want the script to exclude from the results.
Token — here you need to enter your token for API access. It can be found on your profile page.
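A minimal, hypothetical sketch of how a default-keys list in the format above could be parsed; the entries and weights are the illustrative ones from the examples, not part of the original script.
def parse_default_keys(text):
    """Parse lines of the form 'type; keyword; weight' into tuples."""
    entries = []
    for line in text.strip().splitlines():
        kind, keyword, weight = [part.strip() for part in line.split(";")]
        # 'w' entries must be a single word, 'p' entries exactly two words
        expected_words = 1 if kind == "w" else 2
        if len(keyword.split()) != expected_words:
            raise ValueError(f"'{keyword}' does not match type '{kind}'")
        entries.append((kind, keyword, int(weight)))
    return entries

default_keys = parse_default_keys("""
w; bottle; 50
p; plastic bottle; 30
""")
print(default_keys)  # [('w', 'bottle', 50), ('p', 'plastic bottle', 30)]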
You can download the source code for the script here

Searching closest venues using ll and radius not working properly

I know there are a lot of questions about this issue, but I've reached a point where I can't really do anything else but ask if somebody else has a solution...
Using the Foursquare API explorer to test my query, I can't seem to obtain an accurate, or even good, fit for the data I need.
It is quite simple: I need to obtain the closest venue to a set of coordinates. I don't mind having no results if nothing is found nearby.
So, reading the API documentation (https://developer.foursquare.com/docs/venues/venues), I conclude that I need a search and not an explore, because I don't want suggestions of recommended venues (and the results when I tested it proved it wasn't what I was expecting).
So, using search api I want to find places (the place, but places would do...) close to these coordinates
ll=37.424782,-122.162989
considering that I want places close by, I add
radius=51
and I don't really want many results
limit=2
from the documentation I see that radius is
Only valid for requests with intent=browse, or requests with
intent=checkin and categoryId or query
so, I use
intent=browse
which brings my query to:
venues/search?intent=browse&ll=37.424782,-122.162989&radius=51&limit=2
Query Result:
https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dbrowse%26ll%3D37.424782,-122.162989%26radius%3D51%26limit%3D2
Here we can see that the first result is well outside of the radius ... distance: 135
The second result, however, is fine ... distance: 50
What am I doing wrong to get these results? If I increase the limit, all I get is more results that are also outside the radius. I could iterate through them and find the one with the smallest distance, but I have no guarantee that the closest result will be in the top X that I limit to, and even if I had that guarantee, it would be a tiresome solution to an apparently simple question...
Thanks for the help...
Marc
EDIT:
I managed to make the query perform as I intended ... but I had to add all of the parent categories from:
https://developer.foursquare.com/categorytree
categoryId=
4d4b7104d754a06370d81259, Arts & Entertainment
4d4b7105d754a06372d81259, College & University
4d4b7105d754a06373d81259, Event
4d4b7105d754a06374d81259, Food
4d4b7105d754a06376d81259, Nightlife Spot
4d4b7105d754a06377d81259, Outdoors & Recreation
4d4b7105d754a06375d81259, Professional & Other Places
4e67e38e036454776db1fb3a, Residence
4d4b7105d754a06378d81259, Shop & Service
4d4b7105d754a06379d81259 Travel & Transport
making my query into:
venues/search?
intent=checkin&ll=37.424782,-122.162989&radius=60&categoryId=4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
Query Result:
https://developer.foursquare.com/docs/explore#req=venues/search%3Fintent%3Dcheckin%26ll%3D37.424782,-122.162989%26radius%3D60%26categoryId%3D4d4b7104d754a06370d81259,4d4b7105d754a06372d81259,4d4b7105d754a06373d81259,4d4b7105d754a06374d81259,4d4b7105d754a06376d81259,4d4b7105d754a06377d81259,4d4b7105d754a06375d81259,4e67e38e036454776db1fb3a,4d4b7105d754a06378d81259,4d4b7105d754a06379d81259
It still has results outside of my radius ... but it's an acceptable error margin ... it is weird, however.
Although this question is old, I'm responding for others. I was working on something similar recently, and what I learned was that in order to use radius you also need to use the 'query' parameter. What I did was use the star character '*' and it worked for me. I have to say, though, that the limit of 50 is something I haven't solved yet and am still working on.
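To make that concrete, here is a minimal Python sketch of the kind of request described above. It assumes the old v2 venues/search endpoint with userless credentials (the client_id/client_secret values are placeholders), uses the '*' query workaround from this answer, and adds a client-side distance filter as a safety net for results that still come back outside the radius.
import requests

params = {
    "client_id": "CLIENT_ID",          # placeholder credentials
    "client_secret": "CLIENT_SECRET",
    "v": "20180323",                   # API version date
    "ll": "37.424782,-122.162989",
    "intent": "browse",
    "radius": 51,
    "query": "*",                      # the workaround described above
    "limit": 10,
}
resp = requests.get("https://api.foursquare.com/v2/venues/search", params=params).json()
venues = resp["response"]["venues"]

# The API may still return venues slightly outside the radius,
# so filter on the reported distance and keep the closest one
in_radius = [v for v in venues if v["location"].get("distance", 10**9) <= 51]
closest = min(in_radius, key=lambda v: v["location"]["distance"], default=None)
if closest:
    print(closest["name"], closest["location"]["distance"])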
