Automate the export of Facebook Insights data - excel

I'm looking for a way of programmatically exporting Facebook insights data for my pages, in a way that I can automate it. Specifically, I'd like to create a scheduled task that runs daily, and that can save a CSV or Excel file of a page's insights data using a Facebook API. I would then have an ETL job that puts that data into a database.
I checked out the OData service for Excel, which appears to be broken. Does anyone know of a way to programmatically automate the export of insights data for Facebook pages?

It's possible and not too complicated once you know how to access the insights.
Here is how I proceed:
Log the user in with the offline_access and read_insights permissions.
read_insights lets me access the insights for all the pages and applications the user is an admin of.
offline_access gives me a permanent token that I can use to update the insights without having to wait for the user to log in.
Retrieve the list of pages and applications the user is an admin of, and store those in the database.
When I want to get the insights for a page or application, I don't query FQL, I query the Graph API: First I calculate how many queries to graph.facebook.com/[object_id]/insights are necessary, according to the date range chosen. Then I generate a query to use with the Batch API (http://developers.facebook.com/docs/reference/api/batch/). That allows me to get all the data for all the available insights, for all the days in the date range, in only one query.
I parse the rather large JSON object that comes back (it weighs a few MB, so be aware of that) and store everything in the database.
Now that you have all the insights parsed and stored in the database, you're just a few SQL queries away from manipulating the data the way you want, like displaying charts, or exporting in CSV or Excel format.
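
The "calculate how many queries, then build one batch" step above could be sketched roughly like this. It is only an illustration: the page ID is made up, the 30-day window size is an assumption (check the current Insights documentation for the real limit per query), and the actual HTTP call is left out.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Split [since, until) into consecutive windows of at most maxDays days.
// Window boundaries are returned as Unix timestamps in seconds.
function splitDateRange(since, until, maxDays) {
  const windows = [];
  let start = since.getTime();
  const end = until.getTime();
  while (start < end) {
    const stop = Math.min(start + maxDays * DAY_MS, end);
    windows.push({
      since: Math.floor(start / 1000),
      until: Math.floor(stop / 1000),
    });
    start = stop;
  }
  return windows;
}

// Build the array that would be JSON-encoded into the Batch API `batch` field:
// one GET on [object_id]/insights per date window.
function buildInsightsBatch(objectId, since, until, maxDays) {
  return splitDateRange(since, until, maxDays).map((w) => ({
    method: 'GET',
    relative_url: `${objectId}/insights?since=${w.since}&until=${w.until}`,
  }));
}

const batch = buildInsightsBatch(
  '1234567890', // hypothetical page ID
  new Date('2024-01-01T00:00:00Z'),
  new Date('2024-03-01T00:00:00Z'),
  30 // assumed maximum window size, in days
);
console.log(JSON.stringify(batch, null, 2));
```

The resulting array would then be POSTed to graph.facebook.com as the `batch` parameter together with the access token. The Batch API limits how many requests fit in one call, so a very long date range may need more than one batch.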
I already have the code written (and published as a temporarily free tool on www.social-insights.net), so exporting to Excel would be quite fast and easy.
Let me know if I can help you with that.
It can be done before the week-end.

You would need to write something that uses the Insights part of the Facebook Graph API. I haven't seen anything already written for this.
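
As a rough idea of what that "something" might contain, here is a sketch that flattens an Insights-style response into CSV text. It assumes the response has a `data` array of metrics, each with `name`, `period` and a `values` list of `{ value, end_time }` entries; the sample numbers are made up, and fetching the data from the API is not shown.

```javascript
// Quote a CSV field when it contains a comma, a double quote, or a newline.
// Object values (some metrics return breakdowns) are serialised as JSON.
function csvField(value) {
  const text =
    typeof value === 'object' && value !== null
      ? JSON.stringify(value)
      : String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn one Insights-style response into CSV text, one row per metric per day.
function insightsToCsv(response) {
  const rows = [['metric', 'period', 'end_time', 'value']];
  for (const metric of response.data) {
    for (const point of metric.values) {
      rows.push([metric.name, metric.period, point.end_time, point.value]);
    }
  }
  return rows.map((row) => row.map(csvField).join(',')).join('\n');
}

// Made-up sample in the assumed response shape.
const sampleResponse = {
  data: [
    {
      name: 'page_impressions',
      period: 'day',
      values: [
        { value: 120, end_time: '2024-01-02T08:00:00+0000' },
        { value: 95, end_time: '2024-01-03T08:00:00+0000' },
      ],
    },
  ],
};
console.log(insightsToCsv(sampleResponse));
```

The CSV string can be written to a dated file by the scheduled task, and the ETL job can pick it up from there.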

Check out http://megalytic.com. This is a service that exports FB Insights (along with Google Analytics, Twitter, and some others) to Excel.

A new tool is available: the Analytics Edge add-ins now have a Facebook connector that makes downloads a snap.
http://www.analyticsedge.com/facebook-connector/

There are a number of ways that you could do this. I would suggest your choice depends on two factors:
What is your level of coding skill?
How much data are you looking to move?
I can't answer the first question for you, but in your case you aren't moving that much data (in relative terms). I will still share three options out of many.
HARD CODE IT
This would require a script that accesses Facebook's Graph API and a computer or server to run that script automatically.
I primarily use AWS, so I would suggest launching an EC2 instance and scheduling it to run your script at set times. I haven't used AWS Pipeline, but I understand it is designed to run a script automatically as well, supposedly with a little less server know-how.
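
On a server, cron or a similar scheduler is the usual way to run the script daily. If you would rather keep the scheduling inside a long-running Node process, a minimal sketch could look like the following. The function names, the file-name pattern and the 03:00 UTC run time are all assumptions chosen for illustration.

```javascript
// Milliseconds from `now` until the next occurrence of hourUtc:00 UTC.
function msUntilNextRun(now, hourUtc) {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc)
  );
  if (next <= now) {
    // Today's slot has already passed, so move to tomorrow.
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next - now;
}

// Hypothetical naming scheme for the daily export file.
function exportFileName(pageId, day) {
  return `insights_${pageId}_${day.toISOString().slice(0, 10)}.csv`;
}

// Run `job` once a day at hourUtc:00 UTC, re-arming the timer after each run.
function scheduleDaily(hourUtc, job) {
  setTimeout(() => {
    job(new Date());
    scheduleDaily(hourUtc, job);
  }, msUntilNextRun(new Date(), hourUtc));
}

// Example (not started here, because it would keep the process alive):
// scheduleDaily(3, (now) => console.log('would write', exportFileName('1234567890', now)));
console.log(exportFileName('1234567890', new Date('2024-01-05T12:00:00Z')));
```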
USE THIRD PARTY ADD-ON
There are a lot of people who have similar data needs, which has led to a number of easy-to-use tools. I use Supermetrics Free to run occasional audits and make sure that our tools are running properly. Supermetrics is fast and has a really easy interface for accessing Facebook's APIs and several others. I believe that you can also schedule refreshes and updates with it.
USE THIRD PARTY FULL-SERVICE ETL
There are also several services and freelancers that can set this up for you with little to no work on your part, depending on where you want the data. Stitch is a service I have worked with for FB ads. There might be better services, but it has fulfilled our needs for now.
MY SUGGESTION
You would probably be best served by using a third-party add-on like Supermetrics. It's fast and easy to use. The other methods might be more worth looking into if you had a lot more data to move, or needed it to be refreshed more often than daily.

Related

Gather browser details used to browse webpage

I am tasked with the development of a web page.
As a part of this, I also need to collect the details of browser used by the users to browse the web page (along with version, timestamp & IP).
I tried searching the internet, but maybe my search was not properly directed.
I am not sure how to go about this.
Any pointer to get me started here would go a long way.
I would also like to know how to store the information thus collected.
Not sure if this is a good idea - but just thinking out loud - are there any online service/channel available where the data can be uploaded in real time - like thingspeak.
Thanks
Google Analytics is great, but it doesn't give you the raw data (for free). So if you want your data in e.g. SQL format, I suggest you use a tool that collects the data for you and then sends it on to Google Analytics.
We use Segment (segment.io, but there are probably other tools out there too) for this, they have a free plan. You will need to create a Warehouse in AWS and all your data will be stored there, incl. details of the browser (version, timestamp, IP).
Disclaimer: I do not work for Segment, but I am a customer.
Enable Google Analytics in your website, then after 1 week, take a look at the Google Analytics page to see data that was collected.
Follow the guide here to configure Google Analytics on your website: https://support.google.com/analytics/answer/1008080?hl=en
You should go for alternatives like Segment (https://segment.com/), Mixpanel (https://mixpanel.com/) and similar tools. They can guarantee consistency for your data and also integrate with many different analytics tools and platforms.
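
If you would rather collect the details yourself instead of relying on a hosted service, the browser, version, timestamp and IP can all be derived on the server from the incoming request. Below is a rough sketch assuming a Node-style request object (`req.headers`, `req.socket.remoteAddress`). The User-Agent patterns are deliberately simple and will misidentify some browsers, and the `x-forwarded-for` header should only be trusted when your own proxy sets it.

```javascript
// Order matters: Chrome-based browsers also say "Safari",
// and Edge and Opera also say "Chrome".
const BROWSER_PATTERNS = [
  ['Edge', /Edg(?:e|A|iOS)?\/([\d.]+)/],
  ['Opera', /OPR\/([\d.]+)/],
  ['Firefox', /Firefox\/([\d.]+)/],
  ['Chrome', /Chrome\/([\d.]+)/],
  ['Safari', /Version\/([\d.]+).*Safari\//],
];

// Best-effort browser name and version from a User-Agent string.
function parseBrowser(userAgent) {
  for (const [name, pattern] of BROWSER_PATTERNS) {
    const match = pattern.exec(userAgent || '');
    if (match) return { name, version: match[1] };
  }
  return { name: 'Unknown', version: null };
}

// Build one log record (browser, version, IP, timestamp) for a request.
function describeVisit(req, now = new Date()) {
  const forwarded = req.headers['x-forwarded-for'];
  const ip = forwarded
    ? forwarded.split(',')[0].trim()
    : req.socket.remoteAddress;
  const browser = parseBrowser(req.headers['user-agent']);
  return {
    browser: browser.name,
    version: browser.version,
    ip,
    timestamp: now.toISOString(),
  };
}

// Stand-in request object for demonstration.
const exampleRequest = {
  headers: {
    'user-agent':
      'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0',
  },
  socket: { remoteAddress: '198.51.100.4' },
};
console.log(describeVisit(exampleRequest));
```

Each record can then be inserted into whatever store you choose, for example one row per visit in a SQL table.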

How do I test restful APIs with constantly changing and random user defined data

I'm developing a clean-up API (github.com/Shadowys/btapi) for a MediaWiki application, Baka-Tsuki, to pull meaningful data from the novel project pages, such as author, volume lists and cover images. The pages are user-defined and formatted in various ways decided by the translator (user). The pages are also updated and created daily, with new formats appearing sporadically. However, the API parser is able to handle most, if not all, of the current pages, no matter their format.
Baka-Tsuki is not going to change into a database-based application in the near future, since the wiki is currently the most user friendly and cost-effective way to share translations, and we don't have enough developers to constantly work on a new application.
I'm looking into using mocha to automate testing of the API, but as the input data constantly changes, testing is nearly impossible without checking every page available. I've looked at Twitter's and Facebook's testing methods, but they have consistently formatted user input.
In this case, which testing method should I refer to? Should I run the tests based simply on the types returned and the availability of the values returned, or do I have to make a copy of a test page to simulate testing?
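
To make the first option concrete, a check based only on types and presence of values might look something like the sketch below. The field names (`title`, `author`, `volumes`, `cover`) are hypothetical and not taken from the actual btapi output.

```javascript
// Return a list of shape problems for one parsed page result.
// Only types and presence are checked, never the exact values,
// so the check keeps passing when wiki pages are edited.
function shapeErrors(result) {
  const errors = [];
  if (typeof result.title !== 'string' || result.title === '') {
    errors.push('title must be a non-empty string');
  }
  if (result.author !== null && typeof result.author !== 'string') {
    errors.push('author must be a string or null');
  }
  if (!Array.isArray(result.volumes)) {
    errors.push('volumes must be an array');
  } else {
    result.volumes.forEach((volume, i) => {
      if (!volume || typeof volume.title !== 'string') {
        errors.push(`volumes[${i}].title must be a string`);
      }
    });
  }
  if (result.cover !== null && !/^https?:\/\//.test(String(result.cover))) {
    errors.push('cover must be an http(s) URL or null');
  }
  return errors;
}

// Made-up example of a parsed page.
const parsedPage = {
  title: 'Example Novel',
  author: null,
  volumes: [{ title: 'Volume 1' }],
  cover: 'https://example.org/cover.jpg',
};
console.log(shapeErrors(parsedPage));
```

In a mocha test, each case would assert that `shapeErrors(...)` returns an empty array. Running it against a handful of saved copies of real pages (one per known format) rather than the live wiki would keep the tests repeatable.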

storing quick analytics using redis and node.js

I am new to Redis and would like to store web analytics for a website, both globally and per user activity.
Below is what I am stuck with.
// to get all unique ips
client.sadd('visitors', ip);
// to record hits per ip
client.hincrby('hits', ip, 1);
The above works fine so far, and I do get the number of different IPs and a hit counter per IP.
The problem comes when storing the activities made by each IP, i.e. storing the links they clicked and the searches they did, with a datetime.
Can someone please shed some light on how best to manage this?
Thanks
the problem comes to store the activities made by each
You will need a separate structure for storing these.
The simplest rational structure is to have a "list of actions by session". Take a look at the sorted set commands, which provide a basic framework for creating a list of actions within a session.
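
A minimal sketch of that idea, assuming the same callback-style client as in the question: one sorted set per IP, with the timestamp as the score. The `actions:` key prefix and the action fields are assumptions, and `fakeClient` is only an in-memory stand-in so the example runs without a Redis server.

```javascript
// Record one action for a visitor. The score is the timestamp, so a range
// query by score returns the visitor's actions in chronological order.
function recordAction(client, ip, action, now = Date.now()) {
  // The timestamp is stored in the member too, which keeps members unique
  // (sorted set members must be distinct) unless two identical actions
  // land in the same millisecond.
  const member = JSON.stringify({ ...action, at: now });
  client.zadd('actions:' + ip, now, member);
  return member;
}

// Fetch the actions of one visitor between two timestamps (inclusive).
function actionsBetween(client, ip, from, to, callback) {
  client.zrangebyscore('actions:' + ip, from, to, (err, members) => {
    if (err) return callback(err);
    callback(null, members.map((m) => JSON.parse(m)));
  });
}

// In-memory stand-in for the Redis client, for demonstration only.
const fakeClient = {
  sets: {},
  zadd(key, score, member) {
    (this.sets[key] = this.sets[key] || []).push({ score, member });
  },
  zrangebyscore(key, min, max, cb) {
    const hits = (this.sets[key] || [])
      .filter((e) => e.score >= min && e.score <= max)
      .sort((a, b) => a.score - b.score)
      .map((e) => e.member);
    cb(null, hits);
  },
};

recordAction(fakeClient, '203.0.113.7', { type: 'click', url: '/pricing' }, 1000);
recordAction(fakeClient, '203.0.113.7', { type: 'search', query: 'redis' }, 2000);
actionsBetween(fakeClient, '203.0.113.7', 0, 1500, (err, actions) => {
  console.log(actions);
});
```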
This will get you something quickly. However, this is probably not what you really want. In fact, Redis is probably not useful for this at all.
If you want to re-trace an entire site visit you really want to connect to some sort of true analytics framework. There are dozens of website tracking tools that provide this type of functionality, so it's not really clear that building one is very efficient.

Use Google Analytics for data to display on our webpage?

On some of our pages, we display some statistics like number of times that page has been viewed today, number of times it's been viewed the past week, etc. Additionally, we have an overall statistics page where we list the pages, in order, that have been viewed the most.
Today, we just insert these pageviews and event counts into our database as they happen. We also send them to Google Analytics via normal page tracking and their API. Ideally, instead of querying our database for these stats to display on our webpages, we just query Google Analytics' API. Google Analytics does a FAR better job figuring out who the real uniques are and avoids counting people who artificially inflate their pageview counts (we allow people to create pages on our site).
So the question is if it's possible to use Google Analytics' API for updating the statistics on our webpages? If I cache the results is it more feasible? Or just occasionally update our stats? I absolutely love Google Analytics for our site metrics, but maybe there's a better solution for this particular need?
So the question is if it's possible to use Google Analytics' API for updating the statistics on our webpages?
Yes, it is. But the authentication process and XML response may slow things down. You can speed it up by limiting the rows/columns returned. Also, authentication for the way you want to display the data (if I understood you correctly) would require you to use the client authentication method, where you send the username and password, so security is an issue.
I have done exactly what you described but had to put a loading graphic on the page for the stats.
If I cache the results is it more feasible? Or just occasionally update our stats?
Either one, but caching seems like it would work, especially since GA data is not real-time data anyway. You could make the API call and store (or process, then store) the returned XML for display later.
I haven't done this but I think I might give it a go. Could even run as a scheduled job.
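
The caching idea could be sketched as a small time-based wrapper around whatever function fetches the stats. Everything here is an assumption for illustration: the fetch function is a stand-in that returns made-up numbers instead of calling the Google Analytics API, and the one-hour lifetime is arbitrary.

```javascript
// Wrap a fetch function so its result is reused for ttlMs milliseconds.
// `clock` is injectable so the behaviour can be demonstrated without waiting.
function cached(fetchFn, ttlMs, clock = Date.now) {
  let value;
  let fetchedAt = -Infinity;
  return function get() {
    const now = clock();
    if (now - fetchedAt >= ttlMs) {
      value = fetchFn(); // with an async fetcher, this caches the promise
      fetchedAt = now;
    }
    return value;
  };
}

// Stand-in for the real API call; counts how often it is invoked.
let apiCalls = 0;
let fakeNow = 0;
const getPageviews = cached(
  () => {
    apiCalls += 1;
    return { '/home': 1200, '/about': 300 };
  },
  60 * 60 * 1000, // assumed lifetime: one hour
  () => fakeNow
);

getPageviews();
getPageviews();
console.log('API calls so far:', apiCalls);
```

Page requests read from the cache and never wait on the API, which avoids the loading graphic mentioned earlier. A scheduled job that refreshes a database table would achieve the same thing across several web servers.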
I absolutely love Google Analytics for our site metrics, but maybe there's a better solution for this particular need?
There are some third-party solutions (googling should root them out) but money and feasibility should be considered.

Is it possible to create an app for a site without an API?

I would like to create an app for a myBB forum, so that the forum will look nicer and much cleaner on an iPhone or Android.
Is it possible without an API? It isn't my site either.
everything is possible, it's just a matter of resources...
technically, you can write an app for everything on the web, but:
an API will tell you how you can do things with the site, without having to reverse engineer all pages/posts/..., and the format of every output resulting from post/get operations. reverse engineering may take a long time, and you will surely not come across all possible results (error pages, bad authentication...);
an API is quite stable and is always updated with great care by the developers so as not to break existing applications. without an API, there is no guarantee that your app will not break with the next release of the forum when it is upgraded;
a web API generally defines an output format which is easily parseable: many APIs output XML or JSON, which can be processed with standard libraries. without an API, the output format is plain HTML, which may be difficult to reorganize in order to show the results in a different format.
so, yes, you can definitely write an app for a myBB forum, but it may require a fair amount of work.
You can; it's called screen scraping, and it is what was done before XML, the semantic web, SOAP, web services and then JSON APIs tried to solve the problem better.
In screen scraping, you grab the site's HTML, parse it, get the data you want out of it, then do what you need with that data. It's more work, and breaks each time the site's layout changes, hence the history of improvements to it.
You mention the site in question is not yours. Many sites do not regard screen scraping as fair use, so check with the site's terms and conditions that you can legally create an app from the data posted there.
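
To give an idea of the parse-and-extract part of screen scraping, here is a small sketch that pulls thread links out of a page. The markup is hypothetical (a real myBB theme will look different), the page is a hard-coded sample rather than a downloaded one, and a regular expression is used only to keep the example dependency-free; a proper HTML parser is more robust.

```javascript
// Decode the handful of HTML entities likely to appear in a title.
// &amp; is handled last so that "&amp;lt;" is not decoded twice.
function decodeEntities(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&');
}

// Extract { url, title } pairs from links marked with class="thread".
function extractThreads(html) {
  const pattern = /<a\s+class="thread"\s+href="([^"]+)"\s*>([^<]*)<\/a>/g;
  const threads = [];
  for (const match of html.matchAll(pattern)) {
    threads.push({ url: match[1], title: decodeEntities(match[2].trim()) });
  }
  return threads;
}

// Hypothetical forum markup standing in for a downloaded page.
const sampleHtml = `
<table>
  <tr><td><a class="thread" href="showthread.php?tid=1">Welcome &amp; rules</a></td></tr>
  <tr><td><a class="thread" href="showthread.php?tid=2">Release notes</a></td></tr>
  <tr><td><a class="other" href="member.php">Profile</a></td></tr>
</table>`;
console.log(extractThreads(sampleHtml));
```

Any change to the forum's template changes the markup, which is exactly why this approach breaks so easily.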
You can consider using HTML5... do you think that would be doable for your app?

Resources