I am a newbie working with the Facebook Python SDK. I found this script, which lets me obtain event data from my Facebook business page.
However, I am unable to fetch all of the fields I need. The missing field is the event videos: I cannot retrieve the source or video_url of the video that was posted with the event.
I have tried many different variations of fields= to no avail. Thank you.
import facebook
import requests

def some_action(post):
    """ Here you might want to do something with each post. E.g. grab the
    post's message (post['message']) or the post's picture (post['picture']).
    In this implementation we just print the post's id.
    """
    print(post["id"])

# You'll need an access token here to do anything. You can get a temporary one
# here: https://developers.facebook.com/tools/explorer/
access_token = "My_Access_Token"
# The business page's Facebook id.
user = "397894520722082"

graph = facebook.GraphAPI(access_token)
profile = graph.get_object(user)
posts = graph.get_connections(profile['id'], 'posts')

print(graph.get_object(
    id=397894520722082,
    fields='events{videos,id,category,name,description,cover,place,start_time,'
           'end_time,interested_count,attending_count,declined_count,'
           'event_times,comments}'))

# Wrap this block in a while loop so we can keep paginating requests until
# finished.
while True:
    try:
        # Perform some action on each post in the collection we receive from
        # Facebook.
        for post in posts["data"]:
            some_action(post=post)
        # Attempt to make a request to the next page of data, if it exists.
        posts = requests.get(posts["paging"]["next"]).json()
    except KeyError:
        # When there are no more pages (['paging']['next']), break from the
        # loop and end the script.
        break
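One thing worth trying: videos is itself a connection of Video nodes, so its subfields have to be requested with a nested expansion, and the playable URL lives in the Video node's source field (permalink_url gives the on-Facebook link). A sketch, where the helper name and the Graph API version are my own choices, and the token still needs permission to read the page's events:

```python
GRAPH_BASE = "https://graph.facebook.com/v2.12"  # version is an assumption

def build_event_videos_url(page_id, access_token):
    """Build a Graph API URL asking each event for its videos' subfields;
    `source` is the raw playable file, `permalink_url` the on-Facebook link."""
    fields = "events{videos{source,permalink_url,description,created_time}}"
    return "{}/{}?fields={}&access_token={}".format(
        GRAPH_BASE, page_id, fields, access_token)

# usage (needs a token with access to the page's events):
# import requests
# print(requests.get(build_event_videos_url("397894520722082", access_token)).json())
```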
I'm a very junior developer, tasked with automating the creation, download, and transformation of a query from Stripe Sigma.
I've been able to get the bulk of my job done: I have daily scheduled queries that generate a report for the prior 24 hours, linked to a dummy account purely for those reports, and I've got the transformation and reporting done on the back half of this problem.
The roadblock I've run into, though, is getting this code to pull the csv that manually clicking the link generates.
import re
import traceback

import requests
from bs4 import BeautifulSoup
from imbox import Imbox  # pip install imbox
from requests.auth import HTTPDigestAuth  # was missing; used below

# host, username and password are defined earlier in the notebook
mail = Imbox(host, username=username, password=password, ssl=True,
             ssl_context=None, starttls=False)
messages = mail.messages(unread=True)

message_list = []
for (uid, message) in messages:
    body = str(message.body.get('html'))
    message_list.append(body)
mail.logout()

def get_download_link(message):
    print(message[0])  # debug: first character of the html body
    soup = BeautifulSoup(message, 'html.parser')
    urls = []
    for link in soup.find_all('a'):
        print(link.get('href'))
        urls.append(link.get('href'))
    return urls[1]  # the second link in the mail is the download link

dl_urls = []
for m in message_list:
    dl_urls.append(get_download_link(m))

for url in dl_urls:
    print(url)
    try:
        s = requests.Session()
        s.auth = (username, password)
        response = s.get(url, allow_redirects=True, auth=(username, password))
        # print(response.headers)
        if response.status_code == requests.codes.ok:
            print('response headers', response.headers['content-type'])
            response = requests.get(url, allow_redirects=True,
                                    auth=HTTPDigestAuth(username, password))
            # print(response.text)
            print(response.content)
            # open(filename, 'wb').write(response.content)
        else:
            print("invalid status code", response.status_code)
    except Exception:
        traceback.print_exc()
        print('problem with url', url)
I'm working on this in Jupyter notebooks. I've tried to include just the relevant code detailing how I got into the email, how I extracted URLs from said email, and which one, upon being clicked, would download the csv.
All the way until the last step I've had remarkably good luck, but now the URL that I manually click downloads the csv as expected, while that same URL is treated as the HTML for a Stripe page by python/requests.
I've tried poking around in the headers; the one header that was suggested in another post ('Content-Disposition') wasn't present, and printing the headers that are present takes up a good 20-25 lines.
Any suggestions, on either headers that could contain the csv or other approaches I could take, would be appreciated.
I've included an (intentionally broken) URL to show the rough format of what works for manual download but does not work when kept entirely in python.
https://59.email.stripe.com/CL0/https:%2F%2Fdashboard.stripe.com%2Fscheduled_query_runs%xxxxxxxxxxxxxxxx%2Fdownload/1/xxxxxxxxxxxx-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx-000000/xxxxxxxxx-xxxxx_xxxxxxxxxxxxxxxx=233
If you're using scheduled queries, you can receive notifications about completion/availability via webhooks, and then access the files programmatically using the url included in the event payload. You can also list/retrieve scheduled queries via the API and check their status before accessing the data at the file link.
There should be no need to parse these links out of an email.
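To make the API route concrete, here's a sketch of pulling the newest completed run. It assumes the official stripe Python package; pick_completed and download_latest_sigma_csv are my own hypothetical names, and the detail that the result file's url is fetched with the secret key as a bearer token is worth verifying against the current Stripe docs:

```python
def pick_completed(runs):
    """Pure helper: keep only runs that finished successfully."""
    return [r for r in runs if r.get("status") == "completed"]

def download_latest_sigma_csv(api_key, dest_path):
    # Imports kept inside the function so the helper above stays
    # dependency-free; `pip install stripe requests` first.
    import stripe
    import requests

    stripe.api_key = api_key
    runs = stripe.sigma.ScheduledQueryRun.list(limit=10)
    completed = pick_completed(runs.data)
    if not completed:
        return None

    # Results live on the run's `file`; downloading its url needs the
    # same secret key as a bearer token (assumption: check Stripe's docs).
    file_url = completed[0].file.url
    resp = requests.get(file_url,
                        headers={"Authorization": "Bearer " + api_key})
    resp.raise_for_status()
    with open(dest_path, "wb") as fh:
        fh.write(resp.content)
    return dest_path
```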
You can also set up a Stripe webhook for the Stripe Sigma query run, pointing that webhook notification at a Zap webhook trigger. Then:
- Catch the hook in Zapier
- Filter by the Sigma query name
- Use another Zapier webhook custom request to authenticate and grab the data object's file URL
- Use the Formatter utilities by Zapier to transform the results into a CSV file
I'm trying to automate filling a car insurance quote form on a site
(following the same format as the site URL, let's call it "https://secure.examplesite.com/css/car/step1#noBack").
I'm stuck on the rego section: once the rego has been entered, a button needs to be clicked to perform the search, and this is heavy JavaScript, which I know Mechanize can't handle. I'm not versed in JavaScript, but I can see that when clicking the button a POST request is made to this URL: "https://secure.examplesite.com/css/car/step1/searchVehicleByRegNo" (please see the image as well).
How can I emulate this POST request in Mechanize, so I can see and interact with the response? Or is this not possible, and should I consider bs4/requests/RoboBrowser instead? I'm only ~4 months into learning! Thanks
# Mechanize test
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)    # ignore robots.txt
br.set_handle_refresh(False)   # can sometimes hang without this

res = br.open("https://secure.examplesite.com/css/car/step1#noBack")
br.select_form(id="quoteCollectForm")
br.set_all_readonly(False)     # allow everything to be written to

controlDict = {}
# List all form controls
for control in br.form.controls:
    controlDict[control.name] = control.value
    print("type = %s, name = %s, value = %s" % (control.type, control.name, control.value))

# Enter rego, e.g. "example"
br.form["vehicle.searchRegNo"] = "example"
# Now control name = vehicle.searchRegNo has value = example.
# BUT how do I click the button?? Simulate the POST? The POST url is:
# https://secure.examplesite.com/css/car/step1/searchVehicleByRegNo
Javascript POST
Solved my own problem.
Steps:
1. Open dev tools in the browser
2. Go to the Network tab and clear it
3. Interact with the form element (in my case the car rego finder)
4. Click on the request that the interaction fires
5. Copy the exact URL, request header data, and payload
6. Use Postman to quickly test that the request and responses are correct / the same as the web form, and find the relevant headers
7. In Postman, convert the request to Python requests code
Now I can interact completely with the form.
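For anyone following the same steps, here's roughly what the Postman-generated requests code ends up looking like. Everything here is a placeholder (the real URL, header set, and field names come from your own dev-tools capture); build_payload is just a hypothetical helper so the rego field name lives in one place:

```python
import requests

SEARCH_URL = "https://secure.examplesite.com/css/car/step1/searchVehicleByRegNo"

def build_payload(rego):
    """Form fields for the vehicle search; the field name comes from the
    quoteCollectForm control listing (vehicle.searchRegNo)."""
    return {"vehicle.searchRegNo": rego}

def search_vehicle(rego):
    # A Session keeps any cookies the site set on the step1 page,
    # which JS-driven endpoints often require.
    s = requests.Session()
    s.get("https://secure.examplesite.com/css/car/step1#noBack")
    headers = {"X-Requested-With": "XMLHttpRequest"}  # typical for AJAX posts
    return s.post(SEARCH_URL, data=build_payload(rego), headers=headers)

# resp = search_vehicle("ABC123")
# print(resp.status_code, resp.text)
```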
I have an Angular 8 web app. It needs to send some data for analysis to a Python Flask app. The Flask app sends the data to a 3rd-party service and gets the response through a webhook.
My need is to provide a clean interface to the client, so that the client does not have to provide a webhook.
Hence I am trying to initiate a request from the Flask app, wait until the data arrives through the webhook, and then return.
Here is the code.
#In an autoscaled micro-service this will not work. Right now the scaling
#is manual and set to 1 instance, hence keeping this sliding-window list
#in RAM is okay.
reqSlidingWindow = []

@app.route('/handlerequest', methods=['POST'])
def Process_client_request_and_respond(param1, param2, param3):
    #payload code removed
    #CORS header part removed
    querystring = {APIKey, 'https://mysvc.mycloud.com/postthirdpartyresponsehook'}
    response = requests.post(thirdpartyurl, json=payload, headers=headers,
                             params=querystring)
    if response.status_code == SUCCESS:
        respdata = response.json()
        requestId = respdata["request_id"]
        requestobject = {}
        requestobject['requestId'] = requestId
        reqSlidingWindow.append(requestobject)

    #Now wait for the response to arrive through the webhook.
    #Keep checking the list until reqRecord["AllAnalysisDoneFlag"] is set
    #to True; when set, read reqRecord["AnalysisResult"].
    jsondata = None
    while jsondata is None:
        time.sleep(2)
        for reqRecord in reqSlidingWindow:
            if reqRecord["requestId"] == requestId:
                print("Found matching req id in req list. Checking if analysis is done.")
                print(reqRecord)
                if reqRecord["AllAnalysisDoneFlag"] == True:
                    jsondata = reqRecord["AnalysisResult"]
    return jsonify({"AnalysisResult": "Success", "AnalysisData": jsondata})

@app.route('/webhookforresponse', methods=['POST'])
def process_thirdparty_svc_response():
    print(request.data)
    print("response received at")
    data = request.data.decode('utf-8')
    jsondata = json.loads(data)
    reqIdinResp = jsondata["request_id"]  # id of the original request, from the webhook body
    for reqRecord in reqSlidingWindow:
        #print("In loop. current analysis type" + reqRecord["AnalysisType"])
        if reqRecord["requestId"] == reqIdinResp:
            reqRecord["AllAnalysisDoneFlag"] = True
            reqRecord["AnalysisResult"] = jsondata
    return ''
I'm trying to maintain a sliding window of requests in the list.
Observations so far:
First, this does not work: the 'webhookforresponse' function does not seem to run until my request function returns, i.e. my while loop appears to block everything even though I have a time.sleep(). My assumption was that the Flask framework would ensure the callback is called, since the sleep in another 'route' gives it adequate time, and that internally Flask is multithreaded.
I tried running Python threads for the sliding-window data structure and also used RLocks. This does not alter the behavior. I have not tried Flask-specific threading.
Questions:
What is the right architecture for the above need with Flask? I need a clean REST interface, without callbacks, for the Angular client. Everything else can change.
If the above code is to be used, what changes should I make? Is threading required at all?
Though I could use a database and then pick results from there, that still requires polling the sliding window.
Should I use multithreading specific to Flask? Is there any specific example with a skeletal design, and which threading library should I choose?
Please suggest the right way to proceed to achieve an abstract REST API for the Angular client which hides the backend callbacks and other complexities.
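On the first observation: the blocking is usually a server configuration issue, not a Flask limitation. The development server handles one request at a time unless you enable threading (e.g. app.run(threaded=True)) or run a multi-worker WSGI server, so the webhook route cannot execute while the while loop holds the only worker. Once the server is threaded, the sleep-and-poll loop can be replaced with a threading.Event per request, so the handler wakes the moment the webhook lands. A minimal single-process sketch (class and method names are my own invention; it has the same one-instance limitation noted in the code comment):

```python
import threading

class PendingRequests:
    """request_id -> {event, result}: the request handler registers and
    waits; the webhook handler resolves, waking the waiter immediately."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}

    def register(self, request_id):
        with self._lock:
            self._pending[request_id] = {"event": threading.Event(),
                                         "result": None}

    def resolve(self, request_id, result):
        # Called from the webhook route with the parsed JSON body.
        with self._lock:
            record = self._pending.get(request_id)
        if record is not None:
            record["result"] = result
            record["event"].set()

    def wait(self, request_id, timeout=30):
        # Called from the client-facing route after posting to the 3rd party.
        with self._lock:
            record = self._pending.get(request_id)
        if record is None or not record["event"].wait(timeout):
            return None  # unknown id, or the third party never called back
        with self._lock:
            self._pending.pop(request_id, None)
        return record["result"]
```

The client-facing route would call register() right after getting request_id from the third party, then return jsonify(pending.wait(request_id)); the webhook route just calls resolve().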
I have a Python program which is doing millions of comparisons across records. Occasionally a comparison fails and I need to have a user (me) step in and update a record. Today I do this by calling a function which:
- creates a Flask 'app'
- creates and populates a wtforms form to collect the necessary information
- runs the Flask app (an app.run() and a webbrowser.open() call to pull up the form)
I update the data in the form; when the form is submitted, the handler puts the updated data into a variable and then shuts down the Flask app, returning the data to the caller.
This seems kludgy. Is there a cleaner way of doing this recognizing that this is not a typical client-driven web application?
The minimal problem is how best to programmatically launch an 'as needed' web-based UI which presents a form, and then return back the submitted data to the caller.
My method works, but seems a poor design to meet the goal.
As I have not yet found a better way - in case someone else has a similar need, here is how I'm meeting the goal:
...
if somedata_needs_review:
    review_status = user_review(somedata)
    update_data(review_status)
...

def user_review(data_to_review):
    """ Present web form to user and receive their input """
    returnstruct = {}
    app = Flask(__name__)

    @app.route('/', methods=['GET'])
    def show_review_form():
        form = create_review_form(data_to_review)
        return render_template('reviewtemplate.tpl', form=form)

    # TODO - I currently split the handling into a different route because
    # when using the same route previously Safari would warn of resubmitting
    # data. This has the slightly unfortunate effect of creating multiple tabs.
    @app.route('/process_compare', methods=['POST'])
    def process_review_form():
        # this form object will be populated with the submitted information
        form = create_review_form(request.form, matchrecord=matchrecord)
        # handle submitted updates however necessary
        returnstruct['matched'] = form.process_changes.data
        shutdown_flask_server()
        return "Changes submitted, you can close this tab"

    webbrowser.open('http://localhost:5000/', autoraise=True)
    app.run(debug=True, use_reloader=False)

    # The following will execute after the app is shut down.
    print('Finished manual review, returning {}'.format(returnstruct))
    return returnstruct

def shutdown_flask_server():
    func = request.environ.get('werkzeug.server.shutdown')
    if func is None:
        raise RuntimeError('Not running with the Werkzeug Server')
    func()
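One caveat for anyone adapting this: the werkzeug.server.shutdown environ hook was deprecated and removed in newer Werkzeug versions, so shutdown_flask_server() stops working there. An alternative is to own the server object yourself and call its shutdown() method. Here's the idea in a stdlib-only sketch (wsgiref stands in for Flask here; with Flask you would get the same kind of server object from werkzeug.serving.make_server and serve it the same way):

```python
import threading
from wsgiref.simple_server import make_server

def collect_one_submission(port=5000):
    """Serve until one request arrives, then return what it carried."""
    result = {}

    def app(environ, start_response):
        # The real version would parse request.form; here we just record
        # the query string, then stop the server from another thread
        # (shutdown() deadlocks if called from the serving thread itself).
        result["query"] = environ.get("QUERY_STRING", "")
        threading.Thread(target=server.shutdown).start()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Changes submitted, you can close this tab"]

    server = make_server("localhost", port, app)
    server.serve_forever()  # blocks until shutdown() is called
    server.server_close()
    return result
```

The caller regains control as soon as the form handler runs, without poking at request.environ.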
So the title is a little confusing, I guess...
I have a script that I've been writing that displays some random data and other non-essentials when I open my shell. I'm using grequests to make my API calls, since I'm using more than one URL. For my weather data I use WeatherUnderground's API, since it offers active alerts. The alerts and conditions data are on separate pages. What I can't figure out is how to insert the appropriate page name into the grequests object when it makes requests. Here is the code that I have:
URLS = ['http://api.wunderground.com/api/'+api_id+'/conditions/q/autoip.json',
        'http://www.ourmanna.com/verses/api/get/?format=json',
        'http://quotes.rest/qod.json',
        'http://httpbin.org/ip']

requests = (grequests.get(url) for url in URLS)
responses = grequests.map(requests)
data = [response.json() for response in responses]
#json parsing from here
In the URL 'http://api.wunderground.com/api/'+api_id+'/conditions/q/autoip.json' I need to make an API request to conditions and alerts to retrieve the data I need. How do I do this without rewriting a fourth URLS string?
I've tried
pages = ['conditions', 'alerts']
URL = ['http://api.wunderground.com/api/'+api_id+([p for p in pages])/q/autoip.json']
but, as I'm sure some of you more seasoned programmers know, that threw an exception. So how can I iterate through these pages, or will I have to write out both complete URLs?
Thanks!
Ok, I was actually able to figure out how to call each individual page within the grequests object by using a simple for loop. Here is the code that produced the expected results:

import grequests

pages = ['conditions', 'alerts']
api_id = 'myapikeyhere'

for p in pages:
    URLS = ['http://api.wunderground.com/api/'+api_id+'/'+p+'/q/autoip.json',
            'http://www.ourmanna.com/verses/api/get/?format=json',
            'http://quotes.rest/qod.json',
            'http://httpbin.org/ip']

    #create grequests object and retrieve results
    requests = (grequests.get(url) for url in URLS)
    responses = grequests.map(requests)
    data = [response.json() for response in responses]
    #json parsing from here
I'm still not sure why I couldn't figure this out before.
Documentation for the grequests library here
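One refinement to the loop above: because the loop rebuilds and refetches the three fixed URLs once per page, every non-weather endpoint gets hit twice. Building one flat list first keeps everything in a single concurrent grequests.map batch (the helper name below is my own):

```python
def build_urls(api_id, pages):
    """One WeatherUnderground URL per page, plus the fixed endpoints."""
    wu_urls = ['http://api.wunderground.com/api/' + api_id + '/' + p + '/q/autoip.json'
               for p in pages]
    other_urls = ['http://www.ourmanna.com/verses/api/get/?format=json',
                  'http://quotes.rest/qod.json',
                  'http://httpbin.org/ip']
    return wu_urls + other_urls

# usage:
# import grequests
# urls = build_urls('myapikeyhere', ['conditions', 'alerts'])
# responses = grequests.map(grequests.get(u) for u in urls)
# data = [r.json() for r in responses if r is not None]
```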