I have a date range from 7/5/2019 to 7/19/2019. The values for each date are:
7/5 19
7/8 35
7/9 29
7/10 40
7/11 33
7/12 19
7/15 35
7/16 37
7/17 30
7/18 60
7/19 41
(Note that not all dates are shown, because some have no value assigned.)
Essentially, I am wondering why the AVERAGE and AVERAGEIF formulas return different values.
AverageIF formula: =AVERAGEIF($A$1:$K$1,">="&TODAY()-14,$A2:$K2) returns 37
Average formula: =AVERAGE(A2:K2) returns 34
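The difference comes from the criterion: AVERAGEIF only averages the cells whose date in row 1 is on or after TODAY()-14, while AVERAGE uses all eleven cells. A minimal Python sketch of the same arithmetic follows. The evaluation date of 7/24/2019 is an assumption (the question does not say when the formula was run); it was chosen because it gives a cutoff of 7/10/2019, which reproduces the reported 37.

```python
from datetime import date, timedelta

# Dates and values from the question (rows 1 and 2 of the sheet).
values = {
    date(2019, 7, 5): 19, date(2019, 7, 8): 35, date(2019, 7, 9): 29,
    date(2019, 7, 10): 40, date(2019, 7, 11): 33, date(2019, 7, 12): 19,
    date(2019, 7, 15): 35, date(2019, 7, 16): 37, date(2019, 7, 17): 30,
    date(2019, 7, 18): 60, date(2019, 7, 19): 41,
}

def average(nums):
    nums = list(nums)
    return sum(nums) / len(nums)

def average_if_recent(data, today, days=14):
    """Mimic AVERAGEIF(dates, ">=" & TODAY()-days, values)."""
    cutoff = today - timedelta(days=days)
    return average(v for d, v in data.items() if d >= cutoff)

# Assumed evaluation date; not stated in the question.
assumed_today = date(2019, 7, 24)

avg_all = average(values.values())                      # all 11 values
avg_recent = average_if_recent(values, assumed_today)   # dates from 7/10 onward
print(round(avg_all), round(avg_recent))                # prints 34 37
```

With this cutoff the values for 7/5, 7/8 and 7/9 drop out of the conditional average, which is why it comes out higher.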
Let's say I have below array of dates (not necessarily sorted):
import numpy as np
np.array(["2000Q1", "2000Q2", "2000Q3", "2000Q4", "2001Q1", "2001Q2", "2001Q3", "2001Q4", "2002Q1",
"2002Q2", "2002Q3", "2002Q4", "2003Q1", "2003Q2", "2003Q3", "2003Q4", "2004Q1", "2004Q2", "2004Q3",
"2004Q4", "2005Q1", "2005Q2", "2005Q3", "2005Q4", "2006Q1", "2006Q2", "2006Q3", "2006Q4", "2007Q1",
"2007Q2", "2007Q3", "2007Q4", "2008Q1", "2008Q2", "2008Q3", "2008Q4", "2009Q1", "2009Q2", "2009Q3",
"2009Q4"])
From this I want to create a DataFrame with 2 columns, start-date and end-date, where each row holds the starting date and the ending date of a date range spanning 4 years. There should be one row for each element of the above array, through the last element. For example, the first 3 rows of this new DataFrame would look like below
Is there a direct function or method to achieve this in Python?
Here's one way using PeriodIndex and DateOffset in pandas. Note that I named your array arr below:
import pandas as pd

# Convert each quarter to its start timestamp, shift it, then convert back to a quarter.
df = pd.DataFrame({'start-date': arr,
                   'end-date': (pd.PeriodIndex(arr, freq='Q').to_timestamp() +
                                pd.DateOffset(years=4, months=10)).to_period('Q')})
Output:
start-date end-date
0 2000Q1 2004Q4
1 2000Q2 2005Q1
2 2000Q3 2005Q2
3 2000Q4 2005Q3
4 2001Q1 2005Q4
5 2001Q2 2006Q1
6 2001Q3 2006Q2
7 2001Q4 2006Q3
8 2002Q1 2006Q4
9 2002Q2 2007Q1
10 2002Q3 2007Q2
11 2002Q4 2007Q3
12 2003Q1 2007Q4
13 2003Q2 2008Q1
14 2003Q3 2008Q2
15 2003Q4 2008Q3
16 2004Q1 2008Q4
17 2004Q2 2009Q1
18 2004Q3 2009Q2
19 2004Q4 2009Q3
20 2005Q1 2009Q4
21 2005Q2 2010Q1
22 2005Q3 2010Q2
23 2005Q4 2010Q3
24 2006Q1 2010Q4
25 2006Q2 2011Q1
26 2006Q3 2011Q2
27 2006Q4 2011Q3
28 2007Q1 2011Q4
29 2007Q2 2012Q1
30 2007Q3 2012Q2
31 2007Q4 2012Q3
32 2008Q1 2012Q4
33 2008Q2 2013Q1
34 2008Q3 2013Q2
35 2008Q4 2013Q3
36 2009Q1 2013Q4
37 2009Q2 2014Q1
38 2009Q3 2014Q2
39 2009Q4 2014Q3
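An offset of 4 years and 10 months applied to the first day of a quarter always lands inside the quarter that is 19 quarters later, which is what the output above shows. As a rough cross-check, here is a pandas-free sketch of that arithmetic; the helper name shift_quarter is hypothetical and is not part of pandas.

```python
def shift_quarter(label, quarters):
    """Shift a label such as '2000Q1' by a whole number of quarters."""
    year, quarter = label.split("Q")
    # Count quarters since year 0 so the shift is plain integer addition.
    index = int(year) * 4 + (int(quarter) - 1) + quarters
    return f"{index // 4}Q{index % 4 + 1}"

# 19 quarters ahead reproduces the end-date column for these sample rows.
starts = ["2000Q1", "2000Q2", "2009Q4"]
ends = [shift_quarter(s, 19) for s in starts]
print(list(zip(starts, ends)))
```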
I have several multiple linear regressions to carry out, and I am wondering whether there is a VBA solution for getting the VIF (variance inflation factor) of the regression outputs for the different equations.
My current data format:
i=1
Year DependantVariable Variable2 Variable3 Variable4 Variable5 ....
2009 100 10 20 -
2010 110 15 25 -
2011 115 20 30 -
2012 125 25 35 -
2013 130 25 40 -
I have the above table, with the value of i determining the values of the variables (essentially, there is a different regression input table for every value of i).
I am looking for a VBA routine that will go through every value of i (stored in a column), calculate the VIFs for that value of i, and output something like below
ivalue variable1VIF variable2VIF ...
1 1.1 1.3
2 1.2 10.1
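For reference, the VIF of predictor j equals 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on the other predictors; equivalently, it is the j-th diagonal entry of the inverse of the predictors' correlation matrix. The following is only a sketch of that calculation in Python rather than VBA, using the Variable2 and Variable3 columns from the table above; the function names are hypothetical, and the same steps would have to be repeated for each value of i.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    centered = []
    for col in columns:
        m = sum(col) / len(col)
        centered.append([x - m for x in col])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    norms = [dot(c, c) ** 0.5 for c in centered]
    return [[dot(a, b) / (na * nb) for b, nb in zip(centered, norms)]
            for a, na in zip(centered, norms)]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only).

    Fails with a division by zero if the predictors are perfectly collinear.
    """
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(predictors):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(predictors))
    return [inverse[j][j] for j in range(len(predictors))]

# Predictor columns for i = 1 (the dependent variable is not included).
variable2 = [10, 15, 20, 25, 25]
variable3 = [20, 25, 30, 35, 40]
vif_values = vifs([variable2, variable3])
print(vif_values)
```

With only two predictors both VIFs are necessarily equal, since each is regressed on the other.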
I'm trying to build a cohort analysis in an Excel pivot table with a dataset containing:
the aggregated number of monthly sign-ups (month by month), the aggregated number of users who completed the next step, and the number of months between sign-up and the next action taken.
What I can't figure out when I build the pivot for the cohort is what to put into the value field. Normally I would use the customer IDs as the value, but since I only have data at an aggregated monthly level, I'm not sure whether to use the number of sign-ups or the number of next steps completed.
Also, how do I get the sum of each cohort so I can calculate the retention rate?
I hope this makes sense.
Signup month, Action completed month, Months between sign up and action completed, signups, conversion to Action completed
Jan-17 Sep-18 20 95 71
Jan-17 Jan-18 12 95 77
Jan-17 Jun-18 17 96 72
Jan-17 Jan-18 12 92 78
Jan-17 Dec-18 23 91 78
Jan-17 Jul-18 18 100 73
Jan-17 Oct-18 21 92 79
Jan-17 Feb-18 13 95 70
Jan-17 Jan-18 12 91 79
Jan-17 May-18 16 93 71
Jan-17 Jun-18 17 95 72
Is this what you are looking to achieve?
REVISION #1
This layout shows the total number of signups, by the month in which the signup occurred, distributed by the number of months between the signup and the completed action. The action completed month may be omitted without changing the result; it is there FYI only.
REVISION #2
This is an example of the average months between the signup and action. Is this what you are looking for?
I have a DAX formula for my Power Pivot model that I cannot get to work and was hoping for help.
I already have two connected pivot tables:
A cohort of actions taken within Month 1, ..., X of the sign-up month
Total sign-ups on a monthly basis
I tried to attach the sheet here but could not, so I have added a screenshot of the sheet.
What I have so far is:
=DIVIDE(
SUM(Range[conversion to KYC completed]),
SUM('Range 1'[Sum of signups]))
But this does not give me what I want as I think I’m missing the monthly grouping somehow.
Question 1:
What I want is the share of actions completed within 1, ..., X months out of the total sign-ups in that given month (e.g. Jan), i.e. the data from Table 2.
Question 2:
Ideally I would also like to show the total sign-ups at the beginning of the cohort to make it easier to understand, i.e. the monthly total sign-ups on which the cohort is based. But so far I cannot get just the totals month by month. Is there any way to add a monthly total column to the pivot without applying these numbers as a value across all columns?
Something like this is the ultimate outcome for me.
UPDATED WITH SAMPLE DATA
Signup month, KYC completed month, Age by month, signups, conversion to KYC completed
Jan-17 Jul-18 18 97 75
Jan-17 Jul-18 18 99 79
Jan-17 Dec-18 23 95 80
Feb-17 May-18 15 99 74
Feb-17 Jul-18 17 90 75
Feb-17 Jul-18 17 95 76
Feb-17 Aug-18 18 92 71
Mar-17 May-18 14 94 73
Apr-17 Jul-18 15 93 75
May-17 Sep-18 16 94 70
May-17 Oct-18 17 98 72
Jun-17 May-18 11 95 79
Jul-17 Oct-18 15 97 74
Jul-17 Jul-18 12 94 78
Aug-17 Sep-18 13 96 74
Sep-17 Nov-18 14 95 80
Sep-17 Oct-18 13 94 79
DESIRED OUTCOME
The % for Month 1, ..., X is calculated as KYC completed / monthly sign-ups.
OUTPUT WITH THIS CODE
=VAR SignUpMonth = IF(HASONEVALUE('Range 1'[Row Labels]), BLANK())
RETURN
DIVIDE(CALCULATE(SUM([conversion to KYC completed])),
CALCULATE(SUM('Range 1'[Sum of signups]),
FILTER(ALL(Range), Range[Signup month (Month Index)] = SignUpMonth)))
Thanks for the sample data Franzi. Still not too clear what you're asking for, but perhaps this will help a little.
Signed Up to Signed In Ratio =
VAR SignUpMonth = SELECTEDVALUE(Table1[Signup month], BLANK())
RETURN
DIVIDE(CALCULATE(SUM([conversion to KYC completed])),
CALCULATE(SUM(Table1[ signups]),
FILTER(ALL(Table1), Table1[Signup month] = SignUpMonth)))
So. Let's break it down.
If I understand correctly, you want the number of sign-ins for a given combination of month (x axis) and signup month (y axis), divided by the total signups for that signup month (y axis).
Number of sign-ins for a given month (x axis) and signup month (y axis) combination:
CALCULATE(SUM([conversion to KYC completed]))
Total signups (y axis) per signup month:
CALCULATE(SUM(Table1[ signups]),
FILTER(ALL(Table1), Table1[Signup month] = SignUpMonth))
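The same numerator and denominator can be sketched outside DAX to check the numbers. The Python below is only an illustration of the ratio described above, using the first seven rows of the sample data; the variable names are hypothetical, and summing signups across all rows of a signup month simply mirrors what the measure does.

```python
from collections import defaultdict

# (signup month, age in months, signups, KYC completed) from the sample data.
rows = [
    ("Jan-17", 18, 97, 75),
    ("Jan-17", 18, 99, 79),
    ("Jan-17", 23, 95, 80),
    ("Feb-17", 15, 99, 74),
    ("Feb-17", 17, 90, 75),
    ("Feb-17", 17, 95, 76),
    ("Feb-17", 18, 92, 71),
]

completed = defaultdict(int)  # numerator: per (signup month, age) cell
signups = defaultdict(int)    # denominator: per signup month, across all ages

for month, age, n_signups, n_completed in rows:
    completed[(month, age)] += n_completed
    signups[month] += n_signups

# Each pivot cell divided by the total for its signup month.
ratios = {cell: done / signups[cell[0]] for cell, done in completed.items()}

for (month, age), ratio in sorted(ratios.items()):
    print(month, age, f"{ratio:.1%}")
```

The key point is that the denominator ignores the age column, which is what FILTER(ALL(Table1), ...) achieves in the measure.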
IE:
23 HL*3*2*23*0
24 PAT*19
25 NM1*QC*1*CUSTOMER*COLE
26 N3*228 PINEAPPLE CIRCLE
27 N4*CORA*PA*15108
28 DMG*D8*19940921*M
29 CLM*945405*5332.54***12>B>1*Y*A*Y*Y*P
30 HI*BK>2533
31 LX*1
32 SV1*HC>J2941*5332.54*UN*84***1
33 DTP*472*RD8*20110511-20110511
34 REF*6R*1099999731
35 NTE*ADD*GENERIC 12MG CARTRIDGE
36 LIN**N4*00013264681
37 CTP****7*UN
I want to populate column C with the claim number taken from the CLM row (row 29 here, "945405"), starting at that row and continuing down to row 37 (the row containing "CTP"). I cannot do this in VBA due to permissions. Is there a formula that will grab this value (the row always has the form CLM*xxxxxx*...) and assign it to column C, using the CLM row as the first row and the CTP row as the last row, for every such block throughout the spreadsheet? IE:
23 HL*3*2*23*0
24 PAT*19
25 NM1*QC*1*CUSTOMER*COLE
26 N3*228 PINEAPPLE CIRCLE
27 N4*CORA*PA*15108
28 DMG*D8*19940921*M
29 CLM*945405*5332.54***12>B>1*Y*A*Y*Y*P 945405
30 HI*BK>2533 945405
31 LX*1 945405
32 SV1*HC>J2941*5332.54*UN*84***1 945405
33 DTP*472*RD8*20110511-20110511 945405
34 REF*6R*1099999731 945405
35 NTE*ADD*GENERIC 12MG CARTRIDGE 945405
36 LIN**N4*00013264681 945405
37 CTP****7*UN 945405
38 NM1*DK*1*PATIENT*DEBORAH****XX*1
39 N3*123 MAIN ST*APT B
****Update*****
I was given permissions in VBA. How would I loop this?
Here is a clearer picture of what I am trying to accomplish
You can use the =MID(Source_Cell, Start_Position, Desired_Length) function to pull the substring. In your case it would be:
=MID(B29, 5, 6)
You can then put this formula in all of the cells you'd like it to be in.
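Note that MID(B29, 5, 6) assumes the claim number is always six characters long and only reads the CLM row itself. For the looping part of the question, the fill-down logic can be sketched as follows. This is Python rather than VBA and is only an illustration: it splits on "*" instead of using fixed positions, and the function name claim_ids is hypothetical.

```python
def claim_ids(segments):
    """Return one claim ID (or "") per segment, filled from CLM through CTP."""
    result = []
    current = ""
    for segment in segments:
        fields = segment.split("*")
        if fields[0] == "CLM":
            current = fields[1]   # second field holds the claim number
        result.append(current)
        if fields[0] == "CTP":
            current = ""          # the block ends on the CTP row
    return result

# A shortened version of the rows in the question.
segments = [
    "DMG*D8*19940921*M",
    "CLM*945405*5332.54***12>B>1*Y*A*Y*Y*P",
    "HI*BK>2533",
    "LX*1",
    "CTP****7*UN",
    "NM1*DK*1*PATIENT*DEBORAH****XX*1",
]
ids = claim_ids(segments)
for segment, claim in zip(segments, ids):
    print(f"{segment:45} {claim}")
```

Rows before the first CLM and after each CTP get an empty value, matching the desired layout in the question.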