Crunchbase - how to export investment data for a number of companies? - statistics

I'm currently working on a research project for which I need to extract the investment data for a large number of companies on CrunchBase. We'd like to extract the total amount raised at Seed, Series A, Series B, Series C, etc., but cannot work out how to do this other than by manually going into every company one by one. Does anyone know a quick way to extract this?
We looked into saving the companies to a list, but found no way to extract the individual funding round data.
Cheers!

Related

How to best store time series data in Elasticsearch?

I regularly have to conduct chemical experiments which result in a huge set of time series data. For example, 100 lists of measured fluid concentrations, with a timestamp in microseconds assigned to each measurement.
I would like to track and model each experiment and assign to it multiple lists with (measurement, timestamp) pairs. The measurement lists do not have to be of equal length and can greatly vary. For example, one measurement list could be of length 100, the next one 4000, always depending on the conducted experiment. When being at the university lab, I also take notes for different timestamps, which I would also like to track for the timestamps in the DB (tagged timestamps).
Later on, the full analysis text of the experiment should also be stored.
Is Elasticsearch capable of storing such time series data or measurement lists? Because this is mostly numbers rather than text, I'm a bit hesitant.
Even though I have searched the net for a while, I have not yet found a proper way to set up measurement lists as described.
Any help, ideas and maybe helpful links are highly appreciated!
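One possible way to model this, offered only as a sketch: store one document per measurement point rather than one document per list, so lists of different lengths simply become different numbers of documents. The field names below (experiment_id, series_id, timestamp_us, value, note) are hypothetical, and the code only builds plain Python dictionaries; it does not call Elasticsearch or assume any particular client API.

```python
from typing import Dict, List, Optional, Tuple


def to_documents(
    experiment_id: str,
    series: Dict[str, List[Tuple[int, float]]],
    notes: Optional[Dict[int, str]] = None,
) -> List[dict]:
    """Flatten measurement lists into one document per (timestamp, value) pair."""
    notes = notes or {}
    docs = []
    for series_id, points in series.items():
        for timestamp_us, value in points:
            doc = {
                "experiment_id": experiment_id,  # hypothetical field name
                "series_id": series_id,          # which measurement list
                "timestamp_us": timestamp_us,    # microseconds, kept as an integer
                "value": value,
            }
            # Attach a lab note to every point recorded at a tagged timestamp.
            if timestamp_us in notes:
                doc["note"] = notes[timestamp_us]
            docs.append(doc)
    return docs


# Made-up illustration data: two lists of different lengths and one tagged timestamp.
example = to_documents(
    "exp-001",
    {"fluid_a": [(10, 0.51), (20, 0.49)], "fluid_b": [(10, 1.20)]},
    notes={20: "added reagent"},
)
```

The full analysis text could live in a separate document per experiment that shares the same experiment_id; that split is also an assumption, not something the question specifies.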

Using Sumproduct to calculate two tables using horizontal (table headers) and vertical references

Hopefully the title makes some sense, because I'm trying to wrap my head around the logic and I'm not quite sure how to phrase the question. I'll try to give a brief explanation of the end goal without overcomplicating it with unnecessary details.
I have a table of survey score averages for every month per person and a corresponding table with the number of surveys each person received for each month. The logic is essentially: multiply the score for each month by the number of surveys, add the results together, and divide by the total number of surveys within that time period to get the true average. Where things get a little complicated is that I have to include the ability to set a custom date range and return the value. So sometimes I might be looking at the average for Jan - Apr, other times I might just be looking at Feb - Mar, etc.
I think sumproduct is going to get what I need done but I'm running into issues trying to write it out. I've written it several different ways and none of them worked so here's one that best conveys what I'm trying to do,
=SUMPRODUCT(--(F7:I7,L7:O7>=C2),--(F7:I7,L7:O7<=C3),--(E8:E12,K8:K12=B9),tbl_average[[Jan-20]:[Apr-20]],tbl_surveys[[Jan-20]:[Apr-20]])
I super appreciate any assistance I can get on this. I'm hoping the end result is not nearly as difficult as I'm making it out to be.
Some additional information:
I'm going to be using this same process to calculate multiple metrics across multiple worksheets. In the test example each of the tables will most likely be on different sheets. The dashboard with the calculated results will contain everyone's names and will be filtered and rearranged frequently, so I need to make sure we're always matching directly to their names and not just the relative rows. Basically, in my example I show that Agent 1 is always lined up on row 8, but that's not always going to be the case. Agent 1 could be in Row 8 on Sheet 1, Row 10 on Sheet 2, and Row 12 on Sheet 3, and I need all the correct values to multiply and sum against one another.
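The calculation described above can be sketched outside Excel to pin down the logic. This is a minimal Python illustration, not a SUMPRODUCT formula: the function name, the dictionary layout, and the sample numbers are all made up, and values are looked up by agent name rather than by row position.

```python
from typing import Dict, List


def weighted_average(
    averages: Dict[str, Dict[str, float]],
    surveys: Dict[str, Dict[str, int]],
    agent: str,
    months: List[str],
) -> float:
    """Survey-weighted average score for one agent over the chosen months."""
    total_score = 0.0
    total_surveys = 0
    for month in months:
        count = surveys[agent].get(month, 0)
        # Monthly average times survey count gives that month's score total.
        total_score += averages[agent].get(month, 0.0) * count
        total_surveys += count
    if total_surveys == 0:
        raise ValueError("no surveys in the selected range")
    return total_score / total_surveys


# Made-up illustration data, keyed by agent name and month label.
averages = {"Agent 1": {"Jan-20": 4.0, "Feb-20": 5.0, "Mar-20": 3.0}}
surveys = {"Agent 1": {"Jan-20": 10, "Feb-20": 30, "Mar-20": 20}}

# Custom range Feb - Mar: (5.0 * 30 + 3.0 * 20) / (30 + 20)
result = weighted_average(averages, surveys, "Agent 1", ["Feb-20", "Mar-20"])
```

Because both tables are keyed by name, it does not matter which row an agent occupies on each sheet.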

turning rows to columns in Excel

I have an Excel datasheet with more than 1,000,000 rows and 80 columns. The datasheet contains sales information for a chain with more than 1700 stores nationwide. Each store is repeated 52 (weeks in a year) * about 30 (products sold in that given week) * 2 (two years) times. I want to convert the rows corresponding to products to columns. I can't do that using transverse because the products sold in one week might not be exactly the same as those sold the next week. Do you have any solutions?
Thanks
I just made a very simplified version of that Excel file. The problem is that the products sold are not the same each week. There is a limited set of products, but only some of the items are sold each week.
https://drive.google.com/open?id=1B2vjIL2hemfQNrCz0X6u_pzi7Euy6IWa3Lj0_HzDXDE
This isn't much of an answer yet - but it either will become one or I'll delete it, depending on the OP's response.
I'm thinking that transverse/transpose is the wrong term for what you're trying to do.
Perhaps you're just trying to better organize/visualize this data, something similar to one of these Pivot Tables:
[image: first Pivot Table example]
or
[image: second Pivot Table example]
These are just two of the infinite ways you can organize data in a Pivot Table.
Is that similar to what you're trying to do? If so I'll share some more info.
If this quantity of data is going to keep coming your way, what you really need is to start using an Access database to get this under control and be able to report on it properly (and easily, once it's set up).
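For reference, the reshaping a Pivot Table performs here can be sketched in a few lines of Python. This is only an illustration under assumed column names (store, week, product, sales) with made-up data; the real sheet has 80 columns and its layout is not known.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pivot_products(rows: List[Tuple[str, int, str, float]]):
    """Turn (store, week, product, sales) rows into one row per (store, week)."""
    # Every product seen anywhere becomes a column, even if not sold every week.
    products = sorted({product for _, _, product, _ in rows})
    table: Dict[Tuple[str, int], Dict[str, float]] = defaultdict(dict)
    for store, week, product, sales in rows:
        # Sum in case the same product appears twice for a store and week.
        cells = table[(store, week)]
        cells[product] = cells.get(product, 0.0) + sales
    header = ["store", "week"] + products
    wide = [
        [store, week] + [cells.get(p) for p in products]  # None where not sold
        for (store, week), cells in sorted(table.items())
    ]
    return header, wide


# Made-up illustration data: pears are sold in week 1 but not in week 2.
rows = [
    ("S1", 1, "apples", 5.0),
    ("S1", 1, "pears", 2.0),
    ("S1", 2, "apples", 4.0),
]
header, wide = pivot_products(rows)
```

Products missing in a given week simply end up as empty cells, which is why a plain transpose does not fit this data.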

Excel Index Function

I just want to thank you guys in advance. I think you guys are doing a great job in helping people out with programming stuff. Pats on the back for all of you.
Here is what I've been working on: I have daily stock price return data on about 4000 stocks. I want to add them to my portfolio after observing their performance for 12 months. I will choose the top 10% best performers and bottom 10% worst performers. I will create multiple portfolios over a period of time. I have done that with no problem.
I want to use the INDEX function to calculate the daily return of my portfolio. Not all 4000 stocks are in my portfolio, about 300 stocks are in my portfolio at any given time. The daily portfolio returns will be calculated by multiplying the weights (they are equal weighted, so 1/300) to that stock's return on the specific date. I assume it has to do with a combination of INDEX, SUMPRODUCT, and IF or MATCH functions.
I have been thinking about this for a long time and I just can't get to the bottom of it. I have attached pictures of a portion of what I was working on. I think they will give you a good picture of what I'm trying to do. I bet this is such an easy thing for you guys. I hope you can help me out! Thanks again!
PICTURES: IN or OUT portfolio & Stock's individual returns
Charles
Not sure I understood your problem, but here is a trial suggestion:
You get data for 4000 stocks while you are monitoring 300. So, you need to find the correct one within your sheet (there will be 3700 that will not match anything).
If you have your stocks listed in, say, column "A", you could use the function LOOKUP (well explained on the web). If you need to get the row of your stock, you can use the function MATCH.
If this is not what you are looking for, it means that I (at least) did not understand you, so you would need to add details to your question.
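To make the target calculation concrete, here is a minimal Python sketch of an equal-weighted daily portfolio return. It is not an INDEX/MATCH formula; the tickers, returns, and function name are made up, and it assumes the portfolio membership for the date is already known.

```python
from typing import Dict, Set


def portfolio_return(returns: Dict[str, float], members: Set[str]) -> float:
    """Equal-weighted return for one day over the stocks currently held."""
    # Keep only the held stocks; the other tickers in `returns` are ignored.
    held = [returns[ticker] for ticker in members if ticker in returns]
    if not held:
        raise ValueError("no held stocks have a return for this date")
    # Equal weights of 1/n reduce the weighted sum to a simple mean.
    return sum(held) / len(held)


# Made-up illustration data: four stocks with returns, three of them held.
day_returns = {"AAA": 0.02, "BBB": -0.01, "CCC": 0.03, "DDD": 0.10}
in_portfolio = {"AAA", "BBB", "CCC"}
daily = portfolio_return(day_returns, in_portfolio)
```

With 300 stocks held, each weight is 1/300, so the daily portfolio return is just the average return of the held stocks on that date.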

Dynamic Excel 2007 Dashboard Without VBA

Morning guys,
I'm hoping that one (or more) of you can help me.
I have been tasked with creating a dashboard which needs to display trends and have a dynamic frontsheet, preferably with drop-down or data forms so as to update a chart / graph.
The information itself is incredibly limited - the scope of the document is tracking a value (0-4) assigned to a staff member's ability to fulfill a task, e.g. 'Quotes - 4', 'Cancellation - 2' and so on. So the metrics are limited to:
Month (a worksheet for each month of the year and one front for the dashboard)
Team (Presently 6 teams, but this is likely to increase over time, so hopefully the solution facilitates relatively easy incorporation of new teams)
Employee (Self explanatory)
Task (Presently 25, but as above - subject to change)
Score (the 0-4 value referred to above)
So as you can see, it's a very simple dataset. The sheets are presently set out with six grids with data validation lists for determining Team and Score (dropdowns for easy data input), with the Task being pre-written and the employee entered manually by the user.
What I'm hoping to do is have a frontsheet with dynamic tables that update accordingly when a dropdown and/or data form is changed. The key focus is on getting the staff members up to 4s for all tasks, so ultimately, the charts will display trends for the individual teams (one chart for each team - 6 charts) on a month-on-month basis and also a dynamic table which can reflect specific information (e.g. employee performance on a specific month, or number of '3s' achieved by a specific team to date).
I've read a reasonable amount on this, but seem to have overwhelmed myself with the sheer number of options. However, the options can be narrowed given that I'm working on a large corporate network that doesn't really facilitate downloads (so add-ins or anything extraneous to Excel 2007 'out-the-box' isn't an option) and preferably without the use of VBA (1. I'm quite a novice with VBA, 2. Easy distribution and maintenance of the document might be marred by VBA?), though I appreciate that my requirements may make VBA essential.
Does anyone have any suggestions on how best to proceed with creating this dashboard?
Any and all help is appreciated and I apologise as a newbie if I've contravened any conventions around forum etiquette.
Thank you all for your time,
Rob
There are a couple of things that you need to consider in a task such as this:
a) what sort of output do you require?
b) how are you going to manage the data?
For a) I'd separate it further into the basics of what's required (time series charts of employee and/or team performances [how will team performance be measured? average, % achieving 4, or ?]) and then the bells and whistles of drop-downs. Focus on the basics first; the whizzy stuff can come later. Getting b) right is vital - you are going to be extracting subsets of the data to build the charts you want to display. Get b) wrong and you'll just create a horrible task for yourself.
In your position I would consider re-organising the data into the form of a table. Excel's help defines what is meant by a table, but in essence it is a list of your observations where each observation simply comprises the score for a particular month/team/employee/task combination (so each observation comprises 5 values). The observations are arranged as successive rows of the table, with the first row being the header row, which will contain suitable labels such as "Month", "Team", "Employee", "Task", "Score". The real advantage of using a table such as this is that Excel provides a heap of in-built facilities for manipulating them - look up the help for Sort and Filter on the Data tab. In your case there is an even more compelling reason for using a table - you can use the Pivot Table and Pivot Chart facilities for analysing and displaying the data. If you have not used these before, some time and effort spent learning about them will pay dividends. Once your data is organised and you know how to use Pivot Tables and Charts you should be able to prototype some output very quickly.
If you do decide to organise your data as a table you can still keep a nice friendly looking grid of 6 team "tables" (different from Excel's use of the word) as a data entry facility to enter each month's scores by employee and task. You will need to find a way of getting each month's data from the data entry "tables" to the main data table. (The easiest way would be to use a bit of spare worksheet under the data entry tables to reproduce the entered data as a series of observation rows and then use Paste Special Values to append these rows to the end of the main table of observations. You can use VBA to automate the copy/paste operation if you want; you just need to figure out a way of identifying how many observations are currently in the main table and precisely where you want the paste to end up - COUNT() or COUNTA() is a useful friend here.) The main problem to avoid (whether automated or not) is appending the same entered data to the main data table more than once.
Have a look at http://www.mediafire.com/download/x64swkp689k10a1/DataEntrytoTable.xlsx for a simple example of some of the above thoughts
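The two ideas above (one observation per row, and never appending the same entry twice) can be sketched in Python to show the logic independently of Excel. Everything here is illustrative: the function names, the employee name, and the team label are hypothetical, and the duplicate check assumes a month/team/employee/task combination identifies an observation.

```python
from typing import Dict, List, Tuple

# One observation: month, team, employee, task, score
Observation = Tuple[str, str, str, str, int]


def grid_to_observations(
    month: str, team: str, grid: Dict[str, Dict[str, int]]
) -> List[Observation]:
    """Unpivot an employee -> task -> score entry grid into observation rows."""
    return [
        (month, team, employee, task, score)
        for employee, tasks in grid.items()
        for task, score in tasks.items()
    ]


def append_once(table: List[Observation], new_rows: List[Observation]) -> int:
    """Append rows whose month/team/employee/task key is not already present."""
    seen = {row[:4] for row in table}
    added = 0
    for row in new_rows:
        if row[:4] not in seen:
            table.append(row)
            seen.add(row[:4])
            added += 1
    return added


# Made-up illustration data for one team's entry grid in one month.
main_table: List[Observation] = []
entry = {"Ann": {"Quotes": 4, "Cancellation": 2}}
rows = grid_to_observations("Jan", "Team A", entry)
first = append_once(main_table, rows)   # rows are new, so they are appended
second = append_once(main_table, rows)  # same rows again, so nothing is added
```

Keying the check on the first four values means a corrected score for an existing combination would be skipped rather than updated; handling corrections is left out of this sketch.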
