What test should I use to see whether different measures vary in the same way for all subjects? I’m working on glucose data and have repeated measures of blood glucose levels throughout the day. I want to see whether fasting glucose and post-meal glucose levels vary in the same way for all of my subjects (i.e., whether within-subject variation is the same across subjects).
Related
So I am trying to work with a dummy data set and my goals are:
To find a pattern for one of the measurements over time or across different regions.
To find a pattern between two or three measurements over time or across different regions.
Following is the link to the dataset.
https://docs.google.com/spreadsheets/d/1Cp94KetMACXpNie4qV8XnkTJfddvZTP6/edit?usp=sharing&ouid=112792924141687891126&rtpof=true&sd=true
As a beginner, I do not understand how I am supposed to find relationships in this data. There are 3 sheets in the Excel file, and I am not sure how to combine them.
For example, take diabetes and raised blood pressure. It's understandable that diabetes comes with an increased risk of high blood pressure, but how am I supposed to show this in Tableau?
I have tried using joins, but I am unable to make sense of the data or to relate the increased blood pressure levels to diabetes.
I have a statistics course assignment regarding the "pure" effect of the mileage on second hand cars' sales price.
The dataset contains several factors which may affect the sales price of cars on an exchange website, including:
Year manufactured
Mileage
Make
Type (Sedan, Wagon, SUV, etc)
Color
Complete logbook service (Y/N)
Fuel efficiency
Seller Zip Code
My understanding is that analysing the "pure" effect of one independent variable on the dependent variable requires holding all the other variables constant, i.e. the same make, manufacturing year range, type, color, etc. However, if I do that for just a single combination of cars with the same characteristics, I give up many data points.
So what's the best approach to tackle this kind of problem? Should I do many sets of single-variable linear regressions between mileage and sales price on many combinations of similar cars and average the effect?
Sorry there isn't any data here. I just want to have a road map of solving the problem. Thanks very much.
I know how many customers of each customer-size group each sales representative handles on an annual basis. Is it possible to calculate the likely time/effort required by each customer size based on this data set? Or said differently, I'm trying to find out if larger customers require more or less effort than smaller customers.
Is there a function or formula in Excel that will allow me to answer the above based on the data set below?
To add some context, in case it is helpful: there are 2080 work hours a year, and I'm assuming the representatives spend all their time with the customers under their responsibility. I also expect that the largest customers require more time than small customers, but I don't know how much more. That is what I'm trying to figure out. Some employees handle a lot more customers than others, so it's probably best to look at the relative difference between the customer sizes for each employee.
Customer size is rated from 0 (very small) to 7 (the largest).
Below is a small data extract of a large Data table
During my preparation for an exam in software engineering, I came across the following task in an old exam:
For a client, you create a new financial software whose task is, among other things, to perform tax calculations. The following requirements have been communicated to you by the Client:
The system must be able to:
calculate and display VAT for different countries and tax rates (Germany 19%, Austria 20%, Switzerland 8%).
calculate and display the income tax according to country-specific tax tables (separate table for Germany, Austria, Switzerland).
The system must allow the user to:
enter the tax relevant data (gross amount for VAT, annual income for income tax)
print the result of the tax calculation on a network printer.
send the result of the tax calculation to the appropriate tax office.
Task 1: Capture the requirements communicated by the client in a domain model (class diagram) with the following information: classes, attributes, methods, relationships, multiplicities, relationship name.
Solution:
I am not sure how to define the right classes, relationships and multiplicities. But I tried it and came to the following incomplete solution:
First Update:
Second Update:
Could someone help me with this? Thanks :)
Review of your diagram
I suggest you read your first diagram as follows, and I leave it as an exercise to cross-check whether it really meets the requirements:
"A tax rate is composed of a country" (top composition). So countries do not exist independently of tax codes. Is this really what you meant? And does anything in the requirements tell that there is only one tax rate per country?
"A tax rate is composed of an (optional) income tax rate, and an (optional) VAT rate" (double composition in the middle). Ouh!?
"Every income tax rate has its own tax category(ies)" (bottom composition). Isn't the idea of categories to group similar income tax rates?
"A tax rate aggregates tax administrations, and a tax administration may appear in several aggregates" (aggregation). Why should an administration be aggregated in tax codes?
First recommendation: read in your course material about the difference between association, aggregation and composition. The use of aggregation and composition is in principle exceptional, and there must be strong reasons to use them.
Some more questions:
Where are the names of the relations?
What requirement justifies the tax administration? If it is justified, shouldn't it be related to a country?
Is printing some elements really part of the domain model or does it already belong to some user-interface?
Second recommendation: only show elements that you can reasonably derive from the requirements, and avoid any user-interface related behaviors.
Edit: your final diagram, following our exchanges in the comment section, represents much better what you wanted to represent initially. You could add the multiplicity 1..* rate for 1 category. You could also add a separator, in order to show classes consistently with property and operation sections, even if one of the two is empty. The design is still basic, since all properties/attributes are public, which is not recommended (but I suppose you did this to avoid a lot of extra getters/setters in your design).
Alternate approach:
Your narrative describes one single use-case, which is to perform a tax calculation and consists of entering the calculation data, printing the result and sending it. The actors are probably some clerk of your customer and perhaps tax offices.
I find the following candidates for classes chronologically, when reading the narrative: VAT, country, tax rate, income tax, "country-specific tax tables", gross amount, annual income, tax calculation, tax office. Let's have a closer look:
Tax office is very unclear: is there a network printer per tax office? How is the relevant tax office determined? Is there one office per country, or can the organisation be more complex?
VAT and income tax are very different:
For VAT there are different rates per country. The applicable rate is always known, and the calculation is based on the applicable rate and the gross value.
For income tax the narrative speaks of country-specific tax tables: this means that the rate might not be known in advance, but depend on the taxable income level. (e.g. in Austria there is a minimum, and beyond it's flat rate; but in France, there is a normal rate, and a reduced rate for the first 500K€). In reality, income tax is much more complicated, since it may also depend on the legal form of the enterprise, or what is done with the income (re-invested vs. distributed), but let's keep it simple for the exercise. The wording leaves an ambiguity whether there is one table per country or several.
You could nevertheless generalize the concept of tax, if you'd want, considering in this exercise, that its amount is calculated for a base amount (gross amount or annual income).
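The generalization above could be sketched in code roughly as follows. This is only an illustration under stated assumptions: the class names `Tax`, `VAT` and `IncomeTax` are hypothetical, the income tax table is modelled as marginal brackets (the requirements do not say how the tables work), and the bracket figures are invented. Only the three VAT rates come from the exercise text.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Tax(ABC):
    """Generalized tax: an amount calculated from a base amount."""

    @abstractmethod
    def calculate(self, base_amount: float) -> float:
        ...


@dataclass
class VAT(Tax):
    rate: float  # flat rate, known in advance (e.g. 0.19)

    def calculate(self, base_amount: float) -> float:
        # base_amount is the gross amount entered by the user
        return base_amount * self.rate


@dataclass
class IncomeTax(Tax):
    # Assumption: a tax table is a list of (lower_bound, marginal_rate)
    # pairs sorted by lower_bound.
    brackets: list

    def calculate(self, base_amount: float) -> float:
        # base_amount is the annual income entered by the user
        tax = 0.0
        for i, (lower, rate) in enumerate(self.brackets):
            if i + 1 < len(self.brackets):
                upper = self.brackets[i + 1][0]
            else:
                upper = float("inf")
            if base_amount > lower:
                # Tax only the part of the income that falls in this bracket
                tax += (min(base_amount, upper) - lower) * rate
        return tax


# VAT rates taken from the requirements
VAT_BY_COUNTRY = {
    "Germany": VAT(0.19),
    "Austria": VAT(0.20),
    "Switzerland": VAT(0.08),
}

# Invented table, for illustration only (not a real tax table)
EXAMPLE_INCOME_TAX = IncomeTax(brackets=[(0, 0.0), (10_000, 0.20), (50_000, 0.40)])
```

The point of the sketch is only that both kinds of tax can share one operation on a base amount, while the way the rate is determined differs.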
The tax calculation is not fully clear: is it just the user interface, or is the calculation actually some domain object? This would give us:
This would lead to a diagram like:
I have a large excel file that has monthly sales per customer for January - December 2016. I want to predict what their sales will be in January 2017.
You could average each client's data and ignore the zeros with a formula like
=AVERAGEIF(D2:D12,"<>0")
D2:D12 would be the range of a single client's sales variable and it would give you a monthly average for that client that you could use for January Predicted Sales.
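For readers who work outside Excel, the same idea (average a client's monthly values while skipping the zeros) might look like this in Python. The function name `average_nonzero` and the sample figures are made up for illustration.

```python
def average_nonzero(values):
    """Mean of the non-zero entries, or None if every entry is zero."""
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        return None  # nothing to average
    return sum(nonzero) / len(nonzero)


# Invented monthly sales for one client, January to December
monthly_sales = [120, 0, 95, 110, 0, 130, 105, 0, 98, 115, 125, 102]
january_prediction = average_nonzero(monthly_sales)
```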
You have several problems to solve:
1. Determining (a) candidate forecasting model(s) to use.
2. Organising your existing data to test whether such model(s) are actually suitable, performing such tests and selecting (a) suitable model(s). [There may be more than one model to be used, dependent on whether your data are homogeneous or not.]
3. Organising your existing data to apply your chosen model(s) for the purposes of making your prediction. (A different organisation to 2. may be required.)
Your description talks about "sales" but the data sample you provided mentions "claims". These are very different entities - sales (dependent on what type of sales) may well be as frequent as monthly, but claims are likely to be a lot less frequent. If this is the case and claims are highly infrequent, then there is little sense in trying to predict an individual customer's claim. In such a case it would make more sense to predict the aggregate level of claims across a group of customers.
With all modelling, and particularly with forecasting models, context is highly important in steering towards which particular types of model are likely to be suitable. As it is, you have provided no context about what your data really represents, so are unlikely (beyond random chance) to find that any solution offered to you is actually going to be suitable. A solution might compute but, in the context in which you are operating, will it provide anything like a sensible or justifiable set of forecasts?
The "AverageIf" solution may be sufficient; however, you may be able to do better if there are in fact trends or seasonality in the data that could be exploited in the model. For each customer, I would check for autocorrelation in the data. "Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations as a function of the time lag between them." (https://en.wikipedia.org/wiki/Autocorrelation) For instance, if there is significant autocorrelation at lag = 12, this would suggest yearly seasonality in the data (maybe every January is similar). There is a nice tutorial on analyzing autocorrelation in Excel at:
http://www.real-statistics.com/time-series-analysis/stochastic-processes/autocorrelation-function/
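As a rough sketch of what such a check computes, here is the usual sample autocorrelation at a given lag in plain Python. The function name `autocorrelation` is hypothetical, and the estimator shown (lagged cross-products divided by the total sum of squares) is one common convention, not necessarily the one used in the tutorial above. Note that the lag must be smaller than the number of observations, so a lag of 12 needs more than twelve months of data.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    # Total sum of squared deviations (lag-0 term)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of products of deviations that are `lag` steps apart
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom
```

Values close to 1 or -1 at some lag indicate that observations that far apart move together (or in opposition); values near 0 indicate little linear relationship at that lag.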
If autocorrelation does exist, it would likely then be useful to perform a regression that includes the corresponding time component(s). If there is a trend over time in addition to a cyclical component, that should also be factored into the regression (e.g., as a "Year" variable); or a more sophisticated time series method that accommodates trend and autocorrelation could be applied, such as an Autoregressive Integrated Moving Average (ARIMA) model:
https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average
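To make the regression idea concrete, here is a deliberately simplified sketch that fits only a straight-line trend by ordinary least squares and extrapolates one month ahead. It is not ARIMA and has no seasonal term; the function name `linear_trend_forecast` is made up, and months are assumed to be equally spaced and numbered 1, 2, ..., n.

```python
def linear_trend_forecast(values):
    """Fit y = intercept + slope * month and predict month n + 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    months = range(1, n + 1)
    mean_x = (n + 1) / 2
    mean_y = sum(values) / n
    # Ordinary least squares estimates for a single predictor
    sxx = sum((x - mean_x) ** 2 for x in months)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n + 1)
```

With twelve monthly values for 2016, the return value would be the trend-only estimate for January 2017; whether a linear trend is appropriate at all depends on the context discussed above.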
Excel has a forecasting function that might help:
FORECAST.ETS function
Calculates or predicts a future value based on existing (historical) values by using the AAA version of the Exponential Smoothing (ETS) algorithm. The predicted value is a continuation of the historical values in the specified target date, which should be a continuation of the timeline. You can use this function to predict future sales, inventory requirements, or consumer trends.
This function requires the timeline to be organized with a constant step between the different points. For example, that could be a monthly timeline with values on the 1st of every month, a yearly timeline, or a timeline of numerical indices. For this type of timeline, it’s very useful to aggregate raw detailed data before you apply the forecast, which produces more accurate forecast results as well.
Syntax
FORECAST.ETS(target_date, values, timeline, [seasonality], [data_completion], [aggregation])
And you can see it in action in a workbook from the FORECAST.ETS.SEASONALITY page:
Download a sample workbook