Tableau Multiple Sheets Excel Visualisation - excel

I am working through a dummy data set, and my goals are:
To find a pattern for one of the measurements over time or across different regions.
To find a pattern between two or three measurements over time or across different regions.
Following is the link to the dataset.
https://docs.google.com/spreadsheets/d/1Cp94KetMACXpNie4qV8XnkTJfddvZTP6/edit?usp=sharing&ouid=112792924141687891126&rtpof=true&sd=true
As a beginner, I do not understand how I am supposed to build relationships in this data. The Excel file has 3 sheets, and I am not sure how to combine them.
For example, take diabetes and raised blood pressure. It's understood that diabetes comes with an increased risk of high blood pressure, so how am I supposed to show this in Tableau?
I have tried using joins, but I am unable to make sense of the result, and I cannot relate the increased blood pressure levels to diabetes.
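To make the goal concrete outside Tableau, here is a minimal Python sketch of the underlying idea: line up two sheets on the fields they share, then check how the two measurements move together. It assumes each sheet can be reduced to one value per (country, year) pair, which may not match the actual layout of the linked file. The names `join_on_keys` and `pearson` and all the numbers are made up for illustration.

```python
from math import sqrt


def join_on_keys(left, right):
    """Inner-join two {(country, year): value} mappings on their shared keys."""
    shared = sorted(left.keys() & right.keys())
    return [(key, left[key], right[key]) for key in shared]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up numbers for illustration only; they are NOT taken from the linked file.
diabetes = {("A", 2000): 5.0, ("A", 2010): 6.0, ("B", 2000): 7.0, ("B", 2010): 9.0}
blood_pressure = {("A", 2000): 20.0, ("A", 2010): 22.0, ("B", 2000): 25.0, ("C", 2000): 30.0}

# Only keys present in both sheets survive the join.
joined = join_on_keys(diabetes, blood_pressure)
r = pearson([d for _, d, _ in joined], [b for _, _, b in joined])
print(len(joined), round(r, 3))
```

In Tableau terms this would roughly correspond to relating the sheets on the shared fields and putting one measure on each axis of a scatter plot, but that mapping is a guess without knowing the real column names.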

Related

Binomial Options Pricing Calculation in PowerQuery

I'm trying to build an Excel sheet that calculates synthetic option prices and Greeks for time series data, in order to model intraday options pricing. The input is simply intraday price data, anywhere from tick level to 5-minute intervals. I found this https://www.thebiccountant.com/2021/12/28/black-scholes-option-pricing-with-power-query-in-power-bi/ which covers Power BI and Black-Scholes, but possibly not very accurately. I prefer the binomial method. I have used this excellent tutorial to build a manual version for a large number of strikes, but it takes a long time to calculate, is very complex, and is also inaccurate because Excel tops out before I can calculate many steps: https://www.macroption.com/binomial-option-pricing-excel/
Does anyone know whether it is possible to create an entire column in Power Query that calculates binomially derived option prices using more than 100, or even up to 1000, steps? The reason is that intraday pricing on high-resolution data (5-minute, 1-minute, seconds and tick) needs, I think, a large number of steps to converge properly. This is just about building a good-enough model that can be used for visualising the progress of a trade on a given day.
Any pointers on how this could be done and calculated using M Language would be much appreciated and useful!
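This is not M, but as a reference for what would have to be ported, here is a minimal Python sketch of a Cox-Ross-Rubinstein binomial tree. It keeps only one list of node values and overwrites it at each step, so 1000 steps need a list of 1001 numbers rather than a full grid. The function name `crr_price` and its parameters are my own naming, and a constant rate and volatility with no dividends is assumed.

```python
import math


def crr_price(spot, strike, rate, vol, years, steps, is_call=True, american=False):
    """Price an option on a Cox-Ross-Rubinstein tree with `steps` time steps."""
    dt = years / steps
    up = math.exp(vol * math.sqrt(dt))
    down = 1.0 / up
    discount = math.exp(-rate * dt)
    prob_up = (math.exp(rate * dt) - down) / (up - down)
    sign = 1.0 if is_call else -1.0

    # Payoffs at expiry; index j is the number of up moves.
    values = [
        max(sign * (spot * up ** j * down ** (steps - j) - strike), 0.0)
        for j in range(steps + 1)
    ]

    # Roll back through the tree, overwriting the list in place.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            value = discount * (prob_up * values[j + 1] + (1.0 - prob_up) * values[j])
            if american:
                # Allow early exercise at this node.
                exercise = sign * (spot * up ** j * down ** (i - j) - strike)
                value = max(value, exercise)
            values[j] = value
    return values[0]


call = crr_price(100.0, 100.0, 0.05, 0.2, 1.0, 1000)
put = crr_price(100.0, 100.0, 0.05, 0.2, 1.0, 1000, is_call=False)
print(round(call, 2), round(put, 2))
```

The same backward pass could in principle be expressed per row in Power Query with a list-folding function, but I have not tested how that performs over a large table.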

Descriptive Stock Analysis using Python

I want to conduct a descriptive analysis of stock markets. I have loaded three CSV files which contain historical data for three different companies. What code should I use so that the program outputs information such as the mean and standard deviation for all three companies? At the moment I can only do them individually. However, moving forward, I want to add more companies to my dataset.
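One possible approach, sketched with the standard library only: put every company's CSV in one folder and loop over the files, so adding a company just means adding a file. The sketch assumes each file has a header row with a `Close` column; that column name and the function names `describe_csv` and `describe_folder` are assumptions, not something given in the question.

```python
import csv
import statistics
from pathlib import Path


def describe_csv(path, column="Close"):
    """Return basic descriptive statistics for one numeric column of a CSV file."""
    with open(path, newline="") as handle:
        # Skip rows where the column is empty.
        values = [float(row[column]) for row in csv.DictReader(handle) if row[column]]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        # Sample standard deviation (n - 1); needs at least two values.
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


def describe_folder(folder, column="Close"):
    """Describe every *.csv file in `folder`, keyed by file name without extension."""
    return {
        path.stem: describe_csv(path, column)
        for path in sorted(Path(folder).glob("*.csv"))
    }
```

Calling `describe_folder("data")` on a hypothetical `data` folder would return one dictionary of statistics per company file.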

How Can I Model Many Short Time Series Samples?

How Can I Model Multiple Short Time Series Samples?
For example, let's say I have a new subject each month, and I measure each subject every day for the entire month. I then want to model these multiple strings of independent time series because I assume that there is an underlying pattern that applies to all 12 subjects. However, a time series with an n of 30 is too short to model, so is there some way to group these 12 time series together for a parallel analysis?
I imagine the way to handle this is similar to how one might handle a time series with multiple breaks of unknown length. Unfortunately, I am unaware of how to deal with this type of data structure.
Any thoughts on where to even begin? What terms I should research?
Well. Depends on what you're interested in. Makes it a lot easier if we know what kind of data you have, and what you're trying to analyse.
Trying to answer your question: if you assume that there is some underlying structure which is homogeneous for, say, 6 of the subjects, and different for the other half, you can just pool the two data sets and do some kind of group-mean analysis. If you're interested in a temporal change over the 12 months, then you need to assume that the subjects are homogeneous with respect to whatever variable you're measuring.
Normally, e.g. for time series in economics, what you're describing is called "censored" or "truncated" data.
Suppose we want to measure the income of everyone in a country, and we do this by checking electronic paychecks or something similar. Some people at either tail may not have a visible income: poor people may be earning income in other ways, and rich people may want to hide some of their income. This is censored data, and any advanced time series statistics book will have something on that.
Truncated data is similar. Just imagine income again. If we drop everyone who makes less than $10,000 a year, this will "cut off the end" of the distribution. There are also remedies for this. Again, check an advanced time series book.
Hope this helped a bit.
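As a rough illustration of the pooling idea in the answer above, here is a minimal Python sketch that lines up the subjects by day of measurement and averages across them. It assumes every subject has the same number of daily observations and that subtracting each subject's own mean is an acceptable way to make them comparable; both are my assumptions, and `pooled_profile` is a made-up name.

```python
from statistics import mean, pstdev


def pooled_profile(series_by_subject):
    """Average several short series day by day after centring each subject."""
    centred = []
    for values in series_by_subject.values():
        subject_mean = mean(values)
        # Remove each subject's own level so only the within-month shape is pooled.
        centred.append([value - subject_mean for value in values])
    if len({len(series) for series in centred}) != 1:
        raise ValueError("all subjects need the same number of observations")
    # One (mean, spread) pair per day index, computed across subjects.
    return [(mean(day), pstdev(day)) for day in zip(*centred)]
```

With 12 subjects of 30 days each, this yields 30 pairs, each based on 12 values rather than one.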

Dynamic Excel 2007 Dashboard Without VBA

Morning guys,
I'm hoping that one (or more) of you can help me.
I have been tasked with creating a dashboard which needs to display trends and have a dynamic frontsheet, preferably with drop-down or data forms so as to update a chart / graph.
The information itself is incredibly limited - the scope of the document is tracking a value (0-4) assigned to a staff member's ability to fulfill a task, e.g. 'Quotes - 4', 'Cancellation - 2' and so on. So the metrics are limited to:
Month (a worksheet for each month of the year and one front for the dashboard)
Team (Presently 6 teams, but this is likely to increase over time, so hopefully the solution facilitates relatively easy incorporation of new teams)
Employee (Self explanatory)
Task (Presently 25, but as above - subject to change)
Score (the 0-4 value referred to above)
So as you can see, it's a very simple dataset. The sheets are presently set out with six grids with data validation lists for determining Team and Score (dropdowns for easy data input), with the Task being pre-written and the employee entered manually by the user.
What I'm hoping to do is have a frontsheet with dynamic tables that update accordingly when a dropdown and/or data form is changed. The key focus is on getting the staff members up to 4s for all tasks, so ultimately, the charts will display trends for the individual teams (one chart for each team - 6 charts) on a month-on-month basis and also a dynamic table which can reflect specific information (e.g. employee performance on a specific month, or number of '3s' achieved by a specific team to date).
I've read a reasonable amount on this, but seem to have overwhelmed myself with the sheer number of options. However, the options can be narrowed given that I'm working on a large corporate network that doesn't really facilitate downloads (so add-ins or anything extraneous to Excel 2007 'out-the-box' isn't an option), and I would prefer to avoid VBA (1. I'm quite a novice with VBA, 2. easy distribution and maintenance of the document might be marred by VBA?), though I appreciate that my requirements may make VBA essential.
Does anyone have any suggestions around how best to proceed creation of this dashboard?
Any and all help is appreciated and I apologise as a newbie if I've contravened any conventions around forum etiquette.
Thank you all for your time,
Rob
There are a couple of things that you need to consider in a task such as this:
a) what sort of output do you require?
b) how are you going to manage the data?
For a) I'd separate it further into the basics of what's required (time series charts of employee and/or team performances [how will team performance be measured? average, % achieving 4, or ?]) and then the bells and whistles of drop-downs. Focus on the basics first; the whizzy stuff can come later. Getting b) right is vital - you are going to be extracting subsets of the data to build the charts you want to display. Get b) wrong and you'll just create a horrible task for yourself.
In your position I would consider re-organising the data into the form of a table. Excel's help defines what is meant by a table, but in essence it is a list of your observations where each observation simply comprises the score for a particular month/team/employee/task combination (so each observation comprises 5 values). The observations are arranged as successive rows of the table, with the first row being the header row which will contain suitable labels such as "Month", "Team", "Employee", "Task", "Score". The real advantage of using a table such as this is that Excel provides a heap of in-built facilities for manipulating them - look up the help for Sort and Filter on the Data tab. In your case there is an even more compelling reason for using a table - you can use the Pivot Table and Pivot Chart facilities for analysing and displaying the data. If you have not used these before, some time and effort spent learning about them will pay dividends. Once your data is organised and you know how to use Pivot Tables and Charts you should be able to prototype some output very quickly.
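To show why the five-column layout pays off, here is a small Python sketch of the kind of summary a Pivot Table would produce from it: average score and share of 4s per month and team. It is only an illustration of the logic, not something to run inside Excel 2007, and the name `summarise` is made up.

```python
from collections import defaultdict


def summarise(rows):
    """Group (month, team, employee, task, score) rows by month and team."""
    scores = defaultdict(list)
    for month, team, _employee, _task, score in rows:
        scores[(month, team)].append(score)
    return {
        key: {
            "average": sum(values) / len(values),
            # Fraction of observations in the group that reached the top score.
            "share_at_4": sum(1 for value in values if value == 4) / len(values),
        }
        for key, values in scores.items()
    }
```

Because every observation is one row, adding a team or a task needs no change to the summary logic, which is the same property that makes Pivot Tables easy to maintain here.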
If you do decide to organise your data as a table you can still keep a nice friendly looking grid of 6 team "tables" (different from Excel's use of the word) as a data entry facility to enter each month's scores by employee and task. You will need to find a way of getting each month's data from the data entry "tables" to the main data table. (The easiest way would be to use a bit of spare worksheet under the data entry tables to reproduce the entered data as a series of observation rows, and then use Paste Special Values to append these rows to the end of the main table of observations. You can use VBA to automate the copy/paste operation if you want; you just need to figure out a way of identifying how many observations are currently in the main table and precisely where you want the paste to end up - COUNT() or COUNTA() is a useful friend here.) The main problem to avoid (whether automated or not) is appending the same entered data more than once to the main data table.
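The entry-grid-to-table step and the "don't append twice" rule can be sketched as follows. This is Python used purely to illustrate the logic, under the assumption that a month/team/employee/task combination identifies an observation uniquely; `grid_to_rows` and `append_new` are hypothetical names.

```python
def grid_to_rows(month, team, grid):
    """Flatten one entry grid {employee: {task: score}} into observation rows."""
    return [
        (month, team, employee, task, score)
        for employee, tasks in grid.items()
        for task, score in tasks.items()
    ]


def append_new(table, rows):
    """Append rows whose (month, team, employee, task) key is not yet in table."""
    seen = {row[:4] for row in table}
    added = 0
    for row in rows:
        key = row[:4]
        if key not in seen:
            table.append(row)
            seen.add(key)
            added += 1
    return added
```

Running `append_new` a second time with the same month's rows adds nothing, which is the behaviour a manual Paste Special routine has to reproduce by some other check.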
Have a look at http://www.mediafire.com/download/x64swkp689k10a1/DataEntrytoTable.xlsx for a simple example of some of the above thoughts

What kind of stats and info can I get (mine) from time series data?

I have a database with time series data of different solar power plants: how strong was the sun and how much power that plant created / harvested. This data is in 15 min increments.
I would like to use data mining to get new insights and to then visualize the findings to the users.
I know this falls into the domain of data mining, but my problem is maybe more specific (dealing with time series data). So what can I extract from this kind of data or where can I read about this?
Time Series Analysis is a whole field in itself. That said, you can always start with a few basics and keep adding more to your analysis.
Here are a few things to try for starters from your solar power data:
First, profile your solar power data. That is, calculate Min, Max, daily averages, hourly peaks and lows etc. to get a feel for the data. Plotting with x-axis as time will give you visual information.
Time Series data can be decomposed into "Trend" & "Seasonality" (can be for any repeating time interval)
Look for outliers, abnormalities in your data stream. Missing values, repeats etc.
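The starter steps above could be sketched like this in Python with only the standard library. It assumes the readings are available as time-sorted `(datetime, power)` pairs; the names `profile` and `find_irregular_steps` are made up, and the mean by hour of day is only a crude stand-in for a proper seasonal decomposition.

```python
from collections import defaultdict
from datetime import timedelta
from statistics import mean


def profile(readings):
    """Summarise time-sorted (datetime, power) readings."""
    by_day, by_hour = defaultdict(list), defaultdict(list)
    for stamp, power in readings:
        by_day[stamp.date()].append(power)
        by_hour[stamp.hour].append(power)
    powers = [power for _, power in readings]
    return {
        "min": min(powers),
        "max": max(powers),
        "daily_mean": {day: mean(values) for day, values in by_day.items()},
        # Average by hour of day: a rough picture of the daily cycle.
        "hourly_mean": {hour: mean(values) for hour, values in by_hour.items()},
    }


def find_irregular_steps(readings, step=timedelta(minutes=15)):
    """Return consecutive timestamp pairs that are not exactly one step apart."""
    return [
        (earlier, later)
        for (earlier, _), (later, _) in zip(readings, readings[1:])
        if later - earlier != step
    ]
```

A pair returned by `find_irregular_steps` with a gap larger than 15 minutes points to missing values, and a gap of zero points to a repeated timestamp.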
If you want to learn more about time series (and if you know R), then the forecast package is a good way to get started. (Especially this free e-book)
Any search on Time Series will take you to Prof. Hyndman's pages, and I have found the free chapters of his forecasting book very useful.
Hope that helps you get started.
