Using quantopian for data analysis - python-3.x

Where does Quantopian get its data from?
If I want to analyze a stock market other than the NYSE, will the data be available? If not, can I manually upload my own data so that I can run my algorithms on it?

1.) Quantopian gets its data from several sources and makes most of it available online, although some datasets are premium and require a subscription.
2.) Yes, you can get standard stock market data. If you have data from a Bloomberg terminal, another subscription, or something you've built yourself and want to pull it in, you can use Fetcher.
The basic signature is:
fetch_csv(url, pre_func=None, post_func=None, date_column='date',
date_format='%m/%d/%y', timezone='UTC', symbol=None, **kwargs)
Here is an example for something like Dropbox:
def initialize(context):
    # fetch data from a CSV file somewhere on the web.
    # Note that one of the columns must be named 'symbol' for
    # the data to be matched to the stock symbol
    fetch_csv('https://dl.dropboxusercontent.com/u/169032081/fetcher_sample_file.csv',
              date_column='Settlement Date',
              date_format='%m/%d/%y')
    context.stock = symbol('NFLX')

def handle_data(context, data):
    record(Short_Interest=data.current(context.stock, 'Days To Cover'))
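
Before pointing fetch_csv at a file, it can help to check locally that the CSV has the shape the call above expects. The following is a minimal sketch that uses pandas rather than Quantopian itself. The two data rows are invented for illustration, and check_fetcher_csv is a hypothetical helper name; only the column names and the date format come from the example above.

```python
import io

import pandas as pd

# Hypothetical file contents: the rows are made up, only the column
# names and the '%m/%d/%y' date format mirror the fetch_csv call above.
SAMPLE_CSV = """Settlement Date,symbol,Days To Cover
01/15/14,NFLX,2.5
01/31/14,NFLX,3.0
"""


def check_fetcher_csv(text, date_column, date_format):
    """Parse CSV text and verify it has a 'symbol' column and parseable dates."""
    frame = pd.read_csv(io.StringIO(text))
    if "symbol" not in frame.columns:
        raise ValueError("one column must be named 'symbol'")
    # Raises ValueError if any date does not match the given format.
    frame[date_column] = pd.to_datetime(frame[date_column], format=date_format)
    return frame


frame = check_fetcher_csv(SAMPLE_CSV, "Settlement Date", "%m/%d/%y")
print(frame)
```

If the check raises, the file would most likely need a renamed column or a different date_format before Fetcher could match it to a security.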

You can also get data for non-NYSE stocks, such as Nasdaq securities. Screens based on fundamentals (market, exchange, market cap) are also available; they can narrow the broad universe down to the stocks you want to analyze.

You can get stock data from Yahoo or other quant sites.

Related

How to use companies-house 0.1.2 python API wrapper to get company filing history?

I recently learned that Companies House has an API that allows access to a company's filing history, and I want to get data from the API and load it into a pandas DataFrame.
I have set up an API account, but I am having difficulties with the Python wrapper companies-house 0.1.2 https://pypi.org/project/companies-house/
from companies_house.api import CompaniesHouseAPI
ch = CompaniesHouseAPI('my_api_key')
This works, but when I try to get the data with get_company or get_company_filing_history I seem to pass incorrect parameters. I tried CompaniesHouseAPI.get_company('02627406') but got KeyError: 'company_number'. I am quite puzzled, as the documentation provides no example. Please help me figure out what I should pass as parameters to both functions.
# this call raises KeyError: 'company_number'
CompaniesHouseAPI.get_company('02627406')
I am not a Python expert but want to learn by doing interesting projects. Please help. If you know how to get financial history from the Companies House API using another Python wrapper, that solution is also welcome.
I recently wrote a blog post describing how to make your own wrapper and then use that to create an application that loads the data into a pandas dataframe as you described. You can find it here.
By creating your own wrapper class, you avoid the limitations of whichever library you have chosen. You may also learn a lot about calling an API from python and working with the response.
Here is a code example that does not need a Companies House-specific library.
import requests
import json
url = "https://api.companieshouse.gov.uk/search/companies?q={}"
query = "tesco"
api_key = "vLmk-4YxYS-QH8nMi8767zJSlcPlo3MKn41-d" #Fake key - insert your key here
response = requests.get(url.format(query),auth=(api_key,''))
json_search_result = response.text
search_result = json.JSONDecoder().decode(json_search_result)
for company in search_result['items']:
    print(company['title'])
Running this should give you the top 20 matches for the keyword "tesco" from the Companies House search function. Check out the blog post to see how you could adapt this to perform any function from the API.
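
The "write your own wrapper" idea can be sketched roughly as follows. This is an illustrative outline, not the code from the blog post and not the companies-house package: the class name CompaniesHouseClient and its methods are hypothetical, the base URL is the one used in the search example above, and the /company/{number} and /company/{number}/filing-history paths are assumed from the public API documentation. It uses only the standard library so that building a request can be inspected without sending it.

```python
import base64
import json
import urllib.request

# Base URL taken from the search example above.
BASE_URL = "https://api.companieshouse.gov.uk"


class CompaniesHouseClient:
    """Hypothetical minimal wrapper; not the companies-house package."""

    def __init__(self, api_key):
        # HTTP Basic auth: the API key is the user name, the password is empty.
        token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
        self._headers = {"Authorization": f"Basic {token}"}

    def build_request(self, path):
        """Create the request object without sending it."""
        return urllib.request.Request(BASE_URL + path, headers=self._headers)

    def get(self, path):
        """Send the request and decode the JSON body."""
        with urllib.request.urlopen(self.build_request(path)) as resp:
            return json.loads(resp.read().decode("utf-8"))

    def company(self, company_number):
        # Assumed endpoint path for a company profile.
        return self.get(f"/company/{company_number}")

    def filing_history(self, company_number):
        # Assumed endpoint path for the filing history list.
        return self.get(f"/company/{company_number}/filing-history")


client = CompaniesHouseClient("my_api_key")
request = client.build_request("/company/02627406")
print(request.full_url)
```

With a valid key, client.filing_history('02627406') would return the decoded JSON. If that response contains a list of filings under a key such as "items" (an assumption worth checking against a real response), passing that list to pandas.DataFrame should give one row per filing.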

Extract entities from text using Knowledge Bases in Python

I have an entity extraction task which needs KBs like Wikidata, Freebase, and DBpedia. Given their huge size, it is hard to download them and extract entities from them. Is there a Python client that can make API calls to get the extractions from them, with unstructured text as input?
For DBpedia at least, you can use DBpedia Spotlight, something like this:
import requests
spotlight_url = 'http://api.dbpedia-spotlight.org/en/annotate?'
params = dict(text="Barack Obama was a president", confidence='0.2', support='10')
headers = {'Accept': 'application/json'}
# plain requests.get stands in for the undefined requests_retry_session() helper
resp = requests.get(url=spotlight_url, params=params, headers=headers)
results = resp.json()
If you plan to run a large number of queries, you would want a local install of the knowledge base in a triplestore, and a local install of Spotlight too.
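
Once the JSON is back, pulling out the entities is a matter of walking the response. The sketch below assumes the response carries a "Resources" list whose entries have "@URI" and "@surfaceForm" fields, which is my understanding of Spotlight's JSON output and should be verified against a real response. The sample dictionary is hand-written and extract_entities is a hypothetical helper name.

```python
def extract_entities(spotlight_json):
    """Return (surface form, DBpedia URI) pairs from a Spotlight-style response."""
    # The "Resources" key may be absent when nothing was annotated.
    resources = spotlight_json.get("Resources", [])
    return [(r["@surfaceForm"], r["@URI"]) for r in resources]


# Hand-written sample, trimmed to the two fields used above.
sample = {
    "@text": "Barack Obama was a president",
    "Resources": [
        {
            "@URI": "http://dbpedia.org/resource/Barack_Obama",
            "@surfaceForm": "Barack Obama",
        }
    ],
}

entities = extract_entities(sample)
print(entities)
```

In practice you would pass the results dictionary from the request instead of the sample.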

Pandas Column Names Not Lining Up When .dat File Read

I'm going through Wes McKinney's Python for Data Analysis, 2nd Edition, and in Chapter 2 he has several examples based on merging three .dat files about movie reviews.
I can get two of the three data files to work (users and reviews), but I cannot get the third one (movie titles) to work and can't figure out what to do.
Here's the code:
mnames = ['movie_id', 'title', 'genres']
movies = pd.read_table('movies.dat', sep = '::', header = None, engine = 'python', names = mnames)
print(movies[:5])
And here is what the output/problem looks like. It seems the separator is not being applied correctly. I've tried recreating the file and comparing it to the other two files, which work, but they look exactly the same.
Here's some sample data, taken from here:
1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale (1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance
8::Tom and Huck (1995)::Adventure|Children's
9::Sudden Death (1995)::Action
10::GoldenEye (1995)::Action|Adventure|Thriller
11::American President, The (1995)::Comedy|Drama|Romance
12::Dracula: Dead and Loving It (1995)::Comedy|Horror
13::Balto (1995)::Animation|Children's
14::Nixon (1995)::Drama
I'd like to be able to read this file properly so I can join it to the other two example files and keep learning Pandas :)
Try adding encoding='UTF-16' to pd.read_table().
(Sorry, not enough reputation to add a comment.)
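
To see what that suggestion does, here is a small sketch under the assumption that the movies file really is saved as UTF-16, which the question does not confirm. It writes three of the sample rows to a temporary UTF-16 file and reads them back. pd.read_csv is used here; pd.read_table accepts the same arguments.

```python
import os
import tempfile

import pandas as pd

rows = [
    "1::Toy Story (1995)::Animation|Children's|Comedy",
    "2::Jumanji (1995)::Adventure|Children's|Fantasy",
    "3::Grumpier Old Men (1995)::Comedy|Romance",
]

# Write a throwaway copy of the sample in UTF-16 to mimic the suspected problem.
with tempfile.NamedTemporaryFile("w", suffix=".dat", encoding="utf-16",
                                 delete=False) as handle:
    handle.write("\n".join(rows) + "\n")
    path = handle.name

mnames = ["movie_id", "title", "genres"]
# The multi-character separator needs the Python parsing engine.
movies = pd.read_csv(path, sep="::", header=None, engine="python",
                     names=mnames, encoding="utf-16")
os.remove(path)
print(movies)
```

If the real file uses a different encoding, the same call with that encoding name should line the columns up in the same way.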

Python 3: Saving API Results into CSV

I'm writing a script that requires a daily updated CSV source file listing many movie details, and I have decided to use Python 3 to create and update it, even though I don't know much about it.
I believe I have the code to pull the information I need via TheMovieDB.org's API, but currently I can only get it to echo the results, not save them to a CSV. Below are a couple of questions I have, the code that I currently have, and an example of its current output.
Questions:
1. What do I need to add to get the resulting data into a CSV? I've tried many things, but so far nothing has worked.
2. What would I need to add so that rerunning the script completely overwrites the CSV produced by the last run (rather than appending or erroring out)?
3. Optional: Unless it is tedious or a pain, it would be nice to have a column in the CSV for each of the values provided per title.
Thanks!!
Current Code
import http.client
import requests
import csv
conn = http.client.HTTPSConnection("api.themoviedb.org")
payload = "{}"
conn.request("GET", "/3/discover/movie?page=20&include_video=false&include_adult=false&sort_by=primary_release_date.desc&language=en-US&api_key=XXXXXXXXXXXXXXXXXXXXXXXXXXX", payload)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
Result That's Echoed from the above Current Code
{"page":20,"total_results":360846,"total_pages":18043,"results":[{"vote_count":0,"id":521662,"video":false,"vote_average":0,"title":"森のかたみ","popularity":1.098018,"poster_path":"/qmj1gJ33lF7BhEOWAvK0mt6hRGH.jpg","original_language":"ja","original_title":"森のかたみ","genre_ids":[],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":518636,"video":false,"vote_average":0,"title":"Stadtkomödie:
Geschenkt","popularity":1.189812,"poster_path":null,"original_language":"de","original_title":"Stadtkomödie:
Geschenkt","genre_ids":[35],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":520720,"video":false,"vote_average":0,"title":"Kim
Possible","popularity":1.188148,"poster_path":"/3QGHTLgNKRphu3bLvGpoTZ1Ce9U.jpg","original_language":"en","original_title":"Kim
Possible","genre_ids":[10751,28,12],"backdrop_path":null,"adult":false,"overview":"Live-action
film adaptation of the Disney Channel original series Kim
Possible.","release_date":"2019-01-01"},{"vote_count":0,"id":521660,"video":false,"vote_average":0,"title":"Speak
Low","popularity":1.098125,"poster_path":"/qYQQlizCTfD5km7GIrTWrBb4E9b.jpg","original_language":"ja","original_title":"小さな声で囁いて","genre_ids":[],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":497834,"video":false,"vote_average":0,"title":"Saturday Fiction","popularity":1.148142,"poster_path":null,"original_language":"zh","original_title":"兰心大剧院","genre_ids":[],"backdrop_path":null,"adult":false,"overview":"An
actress working undercover for the Allies in 1941 Shanghai discovers
the Japanese plan to attack Pearl
Harbor.","release_date":"2019-01-01"},{"vote_count":0,"id":523461,"video":false,"vote_average":0,"title":"Wie
gut ist deine
Beziehung?","popularity":1.188171,"poster_path":null,"original_language":"de","original_title":"Wie
gut ist deine
Beziehung?","genre_ids":[35],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":507118,"video":false,"vote_average":0,"title":"Schwartz &
Schwartz","popularity":1.345715,"poster_path":null,"original_language":"de","original_title":"Schwartz
&
Schwartz","genre_ids":[80],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":505916,"video":false,"vote_average":0,"title":"Kuru","popularity":1.107158,"poster_path":null,"original_language":"ja","original_title":"来る","genre_ids":[],"backdrop_path":null,"adult":false,"overview":"After
the inexplicable message, at his workplace, of a mysterious death, a
man is introduced to a freelance writer and his
girlfriend.","release_date":"2019-01-01"},{"vote_count":0,"id":521028,"video":false,"vote_average":0,"title":"Tsokos:
Zersetzt","popularity":1.115739,"poster_path":null,"original_language":"de","original_title":"Tsokos:
Zersetzt","genre_ids":[53],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":516910,"video":false,"vote_average":0,"title":"Rufmord","popularity":1.658291,"poster_path":null,"original_language":"de","original_title":"Rufmord","genre_ids":[18],"backdrop_path":null,"adult":false,"overview":"","release_date":"2019-01-01"},{"vote_count":0,"id":514224,"video":false,"vote_average":0,"title":"Shadows","popularity":1.289124,"poster_path":null,"original_language":"en","original_title":"Shadows","genre_ids":[16],"backdrop_path":null,"adult":false,"overview":"Plot
kept under
wraps.","release_date":"2019-01-01"},{"vote_count":0,"id":483202,"video":false,"vote_average":0,"title":"Eli","popularity":1.118757,"poster_path":null,"original_language":"en","original_title":"Eli","genre_ids":[27],"backdrop_path":null,"adult":false,"overview":"A
boy receiving treatment for his auto-immune disorder discovers that
the house he's living isn't as safe as he
thought.","release_date":"2019-01-01"},{"vote_count":0,"id":491287,"video":false,"vote_average":0,"title":"Untitled Lani Pixels
Project","popularity":1.951231,"poster_path":null,"original_language":"en","original_title":"Untitled
Lani Pixels
Project","genre_ids":[10751,16,12,35],"backdrop_path":null,"adult":false,"overview":"Evil
forces have invaded an isolated island and have targeted Patrick and
Susan's grandfather, Mr. Campbell. Guided by Jack, a charming Irish
rogue, the siblings end up on a dangerous journey filled with magic
and
mystery.","release_date":"2019-01-01"},{"vote_count":2,"id":49046,"video":false,"vote_average":0,"title":"All
Quiet on the Western
Front","popularity":6.197559,"poster_path":"/jZWVtbxyztDTSM0LXDcE6vdVTVC.jpg","original_language":"en","original_title":"All
Quiet on the Western
Front","genre_ids":[28,12,18,10752],"backdrop_path":null,"adult":false,"overview":"A
young German soldier's terrifying experiences and distress on the
western front during World War
I.","release_date":"2018-12-31"},{"vote_count":1,"id":299782,"video":false,"vote_average":0,"title":"The
Other Side of the
Wind","popularity":4.561363,"poster_path":"/vnfNbuyPqo5zJavqlgI3J50xJSi.jpg","original_language":"en","original_title":"The
Other Side of the
Wind","genre_ids":[35,18],"backdrop_path":null,"adult":false,"overview":"Orson
Welles' unfinished masterpiece, restored and assembled based on
Welles' own notes. During the last 15 years of his life, Welles, who
died in 1985, worked obsessively on the film, which chronicles a
temperamental film director—much like him—who is battling with the
Hollywood establishment to finish an iconoclastic
work.","release_date":"2018-12-31"},{"vote_count":0,"id":289600,"video":false,"vote_average":0,"title":"The
Sandman","popularity":3.329464,"poster_path":"/eju4vLNx9sSvscowmnKNLi3sFVe.jpg","original_language":"en","original_title":"The
Sandman","genre_ids":[27],"backdrop_path":"/zo67d5klQiFR3PCyvER39IMwZ73.jpg","adult":false,"overview":"THE
SANDMAN tells the story of Nathan, a young student in the city who
struggles to forget his childhood trauma at the hands of the serial
killer dubbed \"The Sandman.\" Nathan killed The Sandman years ago, on
Christmas Eve, after he witnessed the murder of his mother... until he
sees the beautiful woman who lives in the apartment across the way
dying at the hands of that same masked killer. This brutal murder
plunges Nathan into an odyssey into the night country of his past, his
dreams... and the buried secrets of The
Sandman.","release_date":"2018-12-31"},{"vote_count":0,"id":378177,"video":false,"vote_average":0,"title":"Luxembourg","popularity":1.179703,"poster_path":null,"original_language":"en","original_title":"Luxembourg","genre_ids":[],"backdrop_path":null,"adult":false,"overview":"The
story of a group of people living in a permanent nuclear winter in the
ruins of the old civilisation destroyed by an atomic
war.","release_date":"2018-12-31"},{"vote_count":0,"id":347392,"video":false,"vote_average":0,"title":"Slice","popularity":3.248065,"poster_path":"/ySWPZihd5ynCc1aNLQUXmiw5H2V.jpg","original_language":"en","original_title":"Slice","genre_ids":[35],"backdrop_path":"/rtL9nzXtSvo1MW05kho9oeimCdb.jpg","adult":false,"overview":"When
a pizza delivery driver is murdered on the job, the city searches for
someone to blame: ghosts? drug dealers? a disgraced
werewolf?","release_date":"2018-12-31"},{"vote_count":0,"id":438674,"video":false,"vote_average":0,"title":"Dragged
Across
Concrete","popularity":3.659627,"poster_path":"/p4tpV4nGeocuOKhp0enuiQNDvhi.jpg","original_language":"en","original_title":"Dragged
Across
Concrete","genre_ids":[18,80,53,9648],"backdrop_path":null,"adult":false,"overview":"Two
policemen, one an old-timer (Gibson), the other his volatile younger
partner (Vaughn), find themselves suspended when a video of their
strong-arm tactics becomes the media's cause du jour. Low on cash and
with no other options, these two embittered soldiers descend into the
criminal underworld to gain their just due, but instead find far more
than they wanted awaiting them in the
shadows.","release_date":"2018-12-31"},{"vote_count":0,"id":437518,"video":false,"vote_average":0,"title":"Friend
of the
World","popularity":4.189267,"poster_path":"/hf3LucIg7t7DUvgGJ9DjQyHcI4J.jpg","original_language":"en","original_title":"Friend
of the
World","genre_ids":[35,18,27,878,53,10752],"backdrop_path":null,"adult":false,"overview":"After
a catastrophic war, an eccentric general guides a filmmaker through a
ravaged bunker.","release_date":"2018-12-31"}]}
import json
import http.client
import requests
import csv
conn = http.client.HTTPSConnection("api.themoviedb.org")
payload = "{}"
conn.request("GET", "/3/discover/movie?page=20&include_video=false&include_adult=false&sort_by=primary_release_date.desc&language=en-US&api_key=XXXXXXXXXXXXXXXXXXXXXXXXXXX", payload)
res = conn.getresponse()
data = res.read()
json_data = json.loads(data)
results = json_data["results"]
for item in results:
    print(item['vote_count'])
    # write code here to pick out the fields you want to write to the CSV
This is one way to do it. Leave a comment if you have any questions.
That looks like a JSON object, so you can parse it into a python dictionary using:
import json
mydict = json.loads(data)
The values you want are probably in mydict["results"], which is a list of dictionaries of key:value pairs, one per movie. Depending on how you want them, you could use the csv library or just iterate through them and print the contents with a tab between them.
for item in mydict["results"]:
    for k in item:
        print("{}\t{}".format(k, item.get(k)))
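
Neither answer shows the CSV step itself, so here is one possible sketch using the standard csv module. The data bytes are a hand-shortened stand-in for the API response in the question (two movies, four fields each), and the output file name movies.csv is an assumption. Opening the file in "w" mode truncates it, so each run replaces the previous CSV, and every key becomes its own column.

```python
import csv
import json

# Trimmed, hand-shortened stand-in for the API response shown in the question.
data = b'''{"page": 20, "results": [
  {"id": 520720, "title": "Kim Possible", "release_date": "2019-01-01", "genre_ids": [10751, 28, 12]},
  {"id": 514224, "title": "Shadows", "release_date": "2019-01-01", "genre_ids": [16]}
]}'''

results = json.loads(data)["results"]

# Use the keys of the first movie as the column names.
fieldnames = list(results[0].keys())

# Mode "w" truncates the file, so each run overwrites the previous CSV.
with open("movies.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for movie in results:
        # Join list values such as genre_ids into one cell.
        row = {k: "|".join(map(str, v)) if isinstance(v, list) else v
               for k, v in movie.items()}
        writer.writerow(row)
```

In the real script, data would be the bytes returned by res.read(). Taking the column names from the first movie assumes every result has the same keys, which holds for the output shown in the question.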

Random tweet collecting without keyword

I want a script that collects random tweets from Chicago without any keyword, runs automatically every 30 minutes, and collects tweets for a fixed duration each time (for example, 20 milliseconds).
All the code I have found requires keywords, and most of it doesn't let me define a geographic location.
Thanks for your help.
See these pages: "An Introduction to Text Mining using Twitter Streaming API and Python" and "run a python script every hour".
This is very doable. With Twitter's REST API a keyword is required; however, Twitter also provides a Streaming API which can use either a keyword or a location to filter tweets. In your case, you would need to define the bounding box of Chicago in longitudes and latitudes, then supply it to Twitter's statuses/filter endpoint, documented here: https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter.html. This endpoint has a locations parameter that you would use. It returns tweets as they are posted. No timer required.
You can use tweepy for this. Or, with TwitterAPI you would simply do something like this:
from TwitterAPI import TwitterAPI
api = TwitterAPI(CONSUMERKEY,CONSUMERSECRET,ACCESSTOKENKEY,ACCESSTOKENSECRET)
r = api.request('statuses/filter', {'locations':'-87.9,41.6,-87.5,42.0'})
for item in r:
    print(item)
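
The locations value is a bounding box given as the southwest corner followed by the northeast corner, each as longitude then latitude. A small sketch of building and sanity-checking such a box is below. The helper names locations_param and in_box are hypothetical, and the Chicago numbers are the rough ones used in the request above.

```python
def locations_param(west, south, east, north):
    """Format a bounding box as 'sw_lon,sw_lat,ne_lon,ne_lat'."""
    if not (west < east and south < north):
        raise ValueError("expected west < east and south < north")
    return ",".join(str(v) for v in (west, south, east, north))


def in_box(lon, lat, west, south, east, north):
    """Return True if the point lies inside the bounding box."""
    return west <= lon <= east and south <= lat <= north


# Rough box around Chicago, same numbers as in the request above.
CHICAGO = (-87.9, 41.6, -87.5, 42.0)

print(locations_param(*CHICAGO))
print(in_box(-87.63, 41.88, *CHICAGO))
```

in_box can also be used on the client side to double-check the coordinates of tweets that carry an exact point.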
