Data To Munge: Stock Trading, Exchange Trading - core-data

I know that a lot of this information is probably entirely privatized, but does anyone know of a good source of real time information on what kind of trading activity is where in the market? It doesn't need to be fast enough to actually make informed trading decisions based on it, I'm more looking to aggregate it into some beautiful graphics. For fun. Because I have personal problems.
I'd be grateful for any help!

The best I'm aware of is the Yahoo Finance API. It'll give you delayed prices and some bid/ask stuff. There's a description of how it works here:
http://www.gummy-stuff.org/Yahoo-data.htm
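For what it's worth, a minimal Python sketch of the CSV interface described on that page; the f= format codes (s = symbol, l1 = last trade, b = bid, a = ask) come from the gummy-stuff write-up, and the endpoint itself may no longer respond in this form:

```python
import csv
import urllib.request

# Delayed-quote CSV endpoint as documented at gummy-stuff.org.
# f=sl1ba requests: symbol, last trade price, bid, ask.
url = "http://finance.yahoo.com/d/quotes.csv?s=AAPL+MSFT+IBM&f=sl1ba"

with urllib.request.urlopen(url) as response:
    text = response.read().decode("utf-8")

for symbol, last, bid, ask in csv.reader(text.splitlines()):
    print(f"{symbol}: last={last} bid={bid} ask={ask}")
```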

Not sure, but I was of the opinion that the Google Finance API was better than Yahoo's:
http://code.google.com/apis/finance/

There was a project called OpenTick that planned on giving access to data from the exchanges themselves (e.g., the Chicago Board of Trade), provided you paid the exchanges whatever fees were required. That project quietly died.
You can get some market benchmark data from the St Louis Fed. Aside from that, I haven't found anything better than Yahoo! Finance or Google Finance. Both the NASD and the NYSE give access to historical data on their websites, but I don't see any kind of web service interface.

The Bloomberg Open API (http://www.openbloomberg.com/open-api/), which was recently made free, can be used to get historical market data as well as real-time data. If you are looking for historical stock prices, there is also a nice API at http://www.quandl.com/ where you can get more than ten years of price history for many companies, in a variety of formats.
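Out of curiosity, a hedged Python sketch against Quandl's v3 REST endpoint; the dataset code WIKI/AAPL is only an illustrative example, the JSON field names should be checked against their documentation, and you will need your own API key for anything beyond a few anonymous calls:

```python
import requests

DATASET = "WIKI/AAPL"        # example dataset code; look up the one you need on quandl.com
API_KEY = "YOUR_API_KEY"     # placeholder

url = f"https://www.quandl.com/api/v3/datasets/{DATASET}/data.json"
params = {"start_date": "2005-01-01", "api_key": API_KEY}

payload = requests.get(url, params=params, timeout=30).json()
data = payload["dataset_data"]

# The first column is the date; the remaining columns depend on the dataset.
print(data["column_names"])
for row in data["data"][:5]:
    print(row)
```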

I would have seconded the suggestion of the Google Finance API, but it is not available anymore.
This post offers the best list of Financial Data accessible from R I've encountered online: http://www.r-bloggers.com/financial-data-accessible-from-r-part-iv/.
Yet this is not an R post. Beyond those sources, I would wholeheartedly recommend TD Ameritrade's Thinkorswim platform (www.thinkorswim.com). It is a trading platform with free real time data to US financial markets. You can open an account and keep just one cent on it if not needed for actual investing/trading.
Furthermore, I would recommend the Ninja Trader platform (http://ninjatrader.com), which offers free end of day historical data for US financial markets. You can export data from Ninja Trader to txt format and then import it into R or Python if so desired.
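If you go that route, loading the exported file into Python is a one-liner with pandas. A minimal sketch, assuming a simple delimited layout along the lines of date;open;high;low;close;volume (the file name and exact format are placeholders; check a few lines of the real export and adjust the separator and column names accordingly):

```python
import pandas as pd

# Assumed layout of the NinjaTrader text export; adjust sep/names to match reality.
bars = pd.read_csv(
    "ES_exported.txt",          # hypothetical file name
    sep=";",
    names=["date", "open", "high", "low", "close", "volume"],
    parse_dates=["date"],
)

print(bars.tail())
print("Average daily volume:", bars["volume"].mean())
```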

Related

How to get unsampled data from the Google Analytics API for a specific day

I am building a package that uses the Google Analytics API for Python.
But in several cases, when I have multiple dimensions, the extraction by day is sampled.
I know that if I use sampling_level = LARGE it will use a more accurate sample.
But does anybody know if there is a way to reduce a request so that you can extract one day without sampling?
Grateful for any help.
Setting sampling_level to LARGE is the only method we have to influence the amount of sampling, but as you already know this doesn't prevent it.
The only way to reduce the chances of sampling is to request less data. A reduced number of dimensions and metrics, as well as a shorter date range, are the best ways to ensure that you don't get sampled data.
This is probably not the answer you want to hear, but one way of getting unsampled data from Google Analytics is to use unsampled reports. However, this requires that you sign up for Google Marketing Platform. With these you can create an unsampled report request using the API or the UI.
There is also a way to export the data to BigQuery, but you lose the analysis that Google provides and will have to do that yourself. This too requires that you sign up for Google Marketing Platform.
There are several tactics for building unsampled reports; the most popular is splitting your report into shorter time ranges, down to hours if necessary. Mark Edmondson did great work on anti-sampling in his R package, so you might find it useful. You may start with this blog post: https://code.markedmondson.me/anti-sampling-google-analytics-api/ (the sketch below shows the basic idea).
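The core of the splitting tactic is just issuing one request per small date window and concatenating the results. A rough Python sketch, where fetch_report(start, end) is a hypothetical placeholder for whatever Google Analytics client call you already make; it is not part of any GA library:

```python
from datetime import date, timedelta

def fetch_report(start: date, end: date) -> list:
    """Placeholder for your existing Google Analytics API request."""
    raise NotImplementedError

def fetch_unsampled(start: date, end: date, window_days: int = 1) -> list:
    """Split [start, end] into small windows so each request stays under the
    sampling threshold, then stitch the per-window results together."""
    rows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=window_days - 1), end)
        rows.extend(fetch_report(cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return rows

# Example: pull January one day at a time.
# rows = fetch_unsampled(date(2019, 1, 1), date(2019, 1, 31))
```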

Is there an API for past NOAA weather forecasts (forecast archive)?

I'm looking for a source of old weather forecasts (yesterday's, last month's, last year's) for major cities in the US.
It seems easy to find future forecasts and historical observed data, but not historical forecasts.
The product you're probably looking for is the National Digital Forecast Database, the gridded system into which the NWS inputs most of its forecasts. There's no API that I know of, but there are archived data files in places like here. This NWS page on degrib also offers some potential hints on what you may need.
The NWS does still issue some specific point forecasts for certain locations, specialized forecasts for events like fires, plus forecast discussions, warning text, etc. If those are the types of things you are looking for, it may be a bit more of a slog to dig through and piece together the product identifiers and archive resources you want. Iowa State offers a tool for accessing some of the past data, but only by office. You may also want to dig into some of the text products on their MTArchive site, particularly perhaps the Public files; the specific data is less organized, but the simple layout may make it more straightforward to find what you need. This StormTrack thread may offer one final rabbit trail towards finding archives of NWS text products.
As mentioned in comments, you may also find there are additional users with useful input on the Earth Science Stack Exchange Beta community.

FourSquare vs. Google Places vs. Yelp API

I am trying to create an app that will help users find restaurants/movie theaters/malls/etc. to hang out at, based on ratings and distance. Other than just the place itself, I would also like to know more detailed information about it. For example, if I were to look for parks, I would also like to know if there's a basketball or tennis court there. Ratings and popularity would also be an important aspect to prioritize suggestions.
After looking through all three of the APIs, I could not really find any substantial differences other than their search limits. Could anyone really differentiate each API for me? Maybe even recommend one based on my specific need?
Thanks!
The Foursquare API would fit this use case perfectly because you can supply very specific filters through the API. Also, they have extensive coverage around the world, unlike Google or Yelp.
I would check out the venues/explore endpoint and use a categoryId of Parks. You can use a query parameter of "basketball" or "tennis" to find parks that have courts for these.
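For illustration, a hedged Python sketch of that call against the v2 venues/explore endpoint; the credentials are placeholders and PARKS_CATEGORY_ID is a hypothetical stand-in for the real category id, which you would look up in Foursquare's category list:

```python
import requests

PARKS_CATEGORY_ID = "REPLACE_WITH_PARKS_CATEGORY_ID"  # hypothetical placeholder

params = {
    "client_id": "YOUR_CLIENT_ID",          # placeholder credentials
    "client_secret": "YOUR_CLIENT_SECRET",
    "v": "20180323",                        # API version date
    "ll": "40.7484,-73.9857",               # latitude,longitude to search around
    "categoryId": PARKS_CATEGORY_ID,
    "query": "basketball",                  # look for parks with basketball courts
    "limit": 20,
}

resp = requests.get("https://api.foursquare.com/v2/venues/explore",
                    params=params, timeout=30)
for item in resp.json()["response"]["groups"][0]["items"]:
    venue = item["venue"]
    print(venue["name"], venue["location"].get("address", ""))
```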

How to export specific price and volume data from the LMAX level 2 widget to excel

Background -
I am not a programmer.
I do trade spot forex on an intraday basis.
I am willing to learn programming
Specific Query -
I would like to know how to export into Excel in real time 'top of book' price and volume data as displayed on the LMAX level 2 widget/frame on -
https://s3-eu-west-1.amazonaws.com/lmax-widget/website-widget-quote-prof-flex.html?a=rTWcS34L5WRQkHtC
In essence I am looking to export:
1) price and volume data where the coloured flashes occur;
2) price and volume data for when the coloured flashes do not occur.
I understand that 1) and 2) will encompass all the top-of-book prices and volume. However, I would like to keep 1) and 2) separate/distinguished as far as data collection is concerned.
Time period for which the collected data is intended to be stored: 2-3 hours.
What kind of languages do I need to know to do the above?
I understand that I need to be an advanced excel user too.
Long term goals -
I intend to use the above information to make discretionary intraday trading decisions.
In the long run I will get more involved with creating an algo or indicator to help with the decision making process, which would include the information above.
I have understood that one needs to know coding to get involved in activities such as the above. Hence I have started learning C++, mostly to get a hang/feel for coding.
I have been searching all over the web as to where to start in this endeavor. However I am quite confused and overwhelmed with all the information.
Hence apart from the specific data export query, any additional guidelines would also be helpful.
As of now I use MT4 to trade. Hence I believe that to do the above I will need more than just MT4.
Any help would be highly appreciated.
Yes, MetaTrader4 is still not able (in spite of all the white-labelled Terminals' OrderBook Add-On marketing and PR efforts) to provide OrderBook-L2/DoM data to your MQL4 / New-MQL4 algorithm for any decision making. Third-party software integration is needed to make MQL4 code aware of the real-time L2/DoM data.
The LMAX widget has an impressive look & feel; however, for your Excel export it would require a lot of programming effort to re-use it as an automated scanner producing the data for 1) and 2), and there may be further, non-technical, trouble with legal / operational restrictions on operating an automated scanner on such a data source. To give an example, one data publisher's policy restricts automated options-pricing scanners for options on { FTSE | CAC | AMS | DAX }: they may re-visit the online published data sources no more than once a quarter of an hour, and get blocked / black-listed otherwise. So care and proper data-source engineering are needed here.
The size of the data collection is another issue. Excel has restrictions on the number of rows/columns that can be imported, and the larger the data files, the more likely CSV imports are to hit those limits. L2/DoM data collected for 2-3 hours for just one FX Major may go beyond such a limit, as there are many records per second (tens, if not hundreds, with just a few milliseconds between them). A static file of collected data records typically takes several minutes just to be written to disk, so a properly distributed processing data-flow design and non-blocking file I/O engineering are a must.
Real-time system design is the right angle from which to view the solution, rather than treating it as just a programming-language exercise. Having mastered a programming language is a great move; nevertheless, robust real-time system design (and trading software is such a domain) requires, with all respect, a lot more insight and hands-on experience than making MQL4 code run multi-threaded and multi-processed with a few DLL services for a cloud/grid-based distributed processing system.
How much real-time traffic is expected to be there?
For just a raw idea of what the market can produce per second, per millisecond, per microsecond, the original answer included a NYNEX traffic analysis for one instrument: one chart showing the wild relief within a single second, and another at 5-msec sampling.
How to export (a toy sketch of steps 2-3 follows the list):
1) Check whether the data-source owner legally permits your automated processing.
2) Create your own real-time DataPump software, independent of the HTML-wrapped widget.
3) Create your own DB-store to efficiently off-load scanned data records from the real-time DataPump.
4) Test the live data-source >> DataPump >> DB-store pipeline for performance and robustness, so it can serve error-free on 24/6 duty for several FX Majors in parallel.
5) Integrate your DataPump-fed DB-store as a local data source for on-line/off-line interactions with your preferred { MT4 | Excel | quantitative-analytics } package.
6) Integrate monitoring of any production-environment irregularity in your real-time processing pipeline, which may range from network issues, VPN / hosting issues, and data-source availability issues to an unexpected change in the scanned data source's format or access conditions.
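To make the DataPump >> DB-store idea concrete, here is a deliberately simplified Python sketch: it polls a quote source on a fixed interval and appends records to a local SQLite store. The fetch_top_of_book() function is a hypothetical placeholder for however you actually (and legally) obtain the data; a production pipeline would add the non-blocking I/O, buffering, and monitoring described above:

```python
import sqlite3
import time
from datetime import datetime, timezone

def fetch_top_of_book(symbol: str) -> dict:
    """Hypothetical placeholder: return e.g.
    {'bid': ..., 'bid_qty': ..., 'ask': ..., 'ask_qty': ...}
    from whatever real-time source you are permitted to scan."""
    raise NotImplementedError

def run_datapump(symbol: str, db_path: str = "ticks.sqlite",
                 interval_s: float = 0.5) -> None:
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS top_of_book ("
        " ts TEXT, symbol TEXT, bid REAL, bid_qty REAL, ask REAL, ask_qty REAL)"
    )
    try:
        while True:
            quote = fetch_top_of_book(symbol)
            db.execute(
                "INSERT INTO top_of_book VALUES (?, ?, ?, ?, ?, ?)",
                (datetime.now(timezone.utc).isoformat(), symbol,
                 quote["bid"], quote["bid_qty"], quote["ask"], quote["ask_qty"]),
            )
            db.commit()
            time.sleep(interval_s)  # crude pacing; a real pump reacts to the feed instead
    finally:
        db.close()

# The resulting SQLite file can then be queried from Excel, R, or Python
# instead of streaming raw ticks straight into a spreadsheet.
```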

Converting data into information: where to start?

We (my company) run a website which has lots of data recorded, like user registrations, visits, clicks, what stuff people post, etc., but so far we don't have a tool to monitor the whole thing or to find patterns in it, so that we can understand what kind of information we can get out of it and management can take decisions based on it. In short, we want something similar to what people at Amazon or Google do with the data they collect.
Now, after the intro, I would like to know what this technology is called; is it data mining, machine learning, or what? Where should we start to convert meaningless data into useful information?
I think what you need falls into the realm of parsing data, creating graphs, showing statistics about some elements, etc.
There is no "easy" answer, I can only answer parts of your question.
There are no premade magical analytical tools; big companies have their own backend tools, tuned to parse large amounts of data and spit out summaries that are then used to build graphs or for statistical analysis.
I think the domain you are searching for is statistical data analysis. But there are many parts that go together here.
The best advice I can give you is to set up specific goals for your analysis and then try to see what the best solution is; your question is too open.
E.g., if you are interested in visits/clicks/website-related statistics, Google Analytics is a great tool, and very easy to use.
