Unit Root Test for Panel Data (strongly balanced with gaps) - statistics

I have a data set of 17 countries for the years 2000 to 2019.
Country Year LNFPI LNFDILAG INFL POL CRD GDP KAOPEN EXC Y1 C1 ny1 t1
In this data set LNFPI is the dependent variable, and I want to test whether it is stationary. Y1 was converted from a string to a year, and C1 is encoded. When the panel was first created, it was reported as (unbalanced with 140 gaps). I created ny1 and t1 to remove the gaps.
Now the panel shows (Strongly balanced with gaps).
First, I ran
xtunitroot fisher LNFPI, dfuller lag(1)
It returned error r(2000).
Second, I ran the Levin-Lin-Chu test on LNFPI.
It returned:
r(498)
Levin-Lin--Chiu test cannot have gaps in data

Related

Survival rates >1 when using mrOpen in the package FSA

I am currently doing some population analyses with the package "FSA" in R.
By using the mrOpen command, I want to get the survival rate.
My raw data is a simple table with one row per individual, one column per sample date, and values of 0 and 1 (for not captured or captured during the respective sampling).
id total.captures date1 date2 date3 etc
1 3 1 1 1 ...
2 1 1 0 0 ...
The first two columns contain the individual id and the aggregated number of captures which is why I excluded them in the analysis.
This is the exact code:
hold.data<-capHistSum(data, cols2use = c(3:13))
est.data<-mrOpen(hold.data)
summary(est.data)
confint(est.data)
It seems to work out, as I get the tables and summaries with all the parameters. See here as an example:
Screenshot_Results
However, there's a problem with the survival estimate phi.
The phi value is not between 0 and 1, but in some cases, exceeds 1.
Any idea what went wrong here?
Thanks,
Pia

Cuberankedmember getting wrong order

I'm trying to create a top-3 ranking from a data table across varying metrics, but each time I get the wrong order from CUBERANKEDMEMBER, usually with ranks 2 and 3 swapped.
The data I'm mostly focused on is sales revenue. Power Pivot sums all sales by store, which is quite straightforward.
From this I use a CUBESET formula that captures the store name, filtered by month and year (the user types in any day of the month), and sets the measure to sort by (NTS) (code 1).
CUBERANKEDMEMBER takes the CUBESET and defines the position (code 2).
Then CUBEVALUE takes the CUBERANKEDMEMBER as its member, filters once again by month and year, and pulls in the measure (code 3).
E4 is the date
Code1 (cell C21):
=CUBESET("ThisWorkbookDataModel";
"NONEMPTY([Store_Dict].[Nome_DSR].children,
([Calendar].[Year].[All].["&YEAR($E$4)&"],
[Calendar].[Month Number].[All].["&MONTH($E$4)&"]))";
"Ranking";
2;
"[Measures].[NTS]")
Code2 (cell D22):
=CUBERANKEDMEMBER("ThisWorkbookDataModel";$C$21;1;"a")
C21 is the CUBESET formula
Code3:
=CUBEVALUE("ThisWorkbookDataModel";
$D22;
"[Calendar].[Month Number].["&MONTH($E$4)&"]";
"[Calendar].[Year].["&YEAR($E$4)&"]";
"[Measures].[NTS]")
Actual Result:
Ranking Store NTS
1 a 606
2 c 425
3 b 428
Expected result:
Ranking Store NTS
1 a 606
2 b 428
3 c 425

Spotfire: Show and calculate difference of two values from selected dates in the plot

I am plotting pressure data by date, and the date range can be selected with a filter (days, months, years).
I would like to calculate the difference between the two extreme values in the plot [last value - first value]. When the user changes a filter, the calculation should update along with the graph.
PropertyName AverageReading Date
LevelPressure 1 1/1/2018
LevelPressure 5 1/3/2018
LevelPressure 24 1/2/2018
LevelPressure 4 1/5/2018
LevelPressure 3 2/2/2018
LevelPressure 2 2/3/2018
LevelPressure 1 2/4/2018
LevelPressure 77 2/1/2018
LevelPressure 33 2/2/2018
Here is my custom expression but it's not working properly (date is X axis values, level pressure Y axis):
Abs(if([Property Name]="LevelPressure",[Average Reading]))
- sum(if([Property Name]="LevelPressure",[Average Reading]))
over (PreviousPeriod([Date]))
If you are inserting a calculated column, it will always take into account the entire data set. It will not take filtering into account. You can create a calculated value and apply data limiting or filtering OR write an expression on the axis of a visualization. Based on the expression you gave, it seems like you are inserting a calculated column. This will not work.
Here is a solution that may or may not work for your use case. Your explanation did not specify what type of visualization you are working with. I assumed a scatter plot. This solution will work with any visualization type.
Go to Properties > Lines and Curves > Add a Horizontal Line configured with a custom expression > Abs(Max([Y]) - Min([Y])). This puts a horizontal line on the chart at the absolute difference between the maximum and minimum average reading (the average reading is your Y-axis value). It updates with filtering.
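To make the difference between the two quantities concrete, here is a minimal Python sketch (not Spotfire code) run on the sample rows from the question. It assumes the dates are M/D/YYYY and uses hypothetical helper names. It applies a January-only filter and compares what Abs(Max([Y]) - Min([Y])) yields with the [last value - first value] the question asks for.

```python
from datetime import date

# Sample rows from the question: (average reading, date), assuming M/D/YYYY dates.
readings = [
    (1, date(2018, 1, 1)),
    (5, date(2018, 1, 3)),
    (24, date(2018, 1, 2)),
    (4, date(2018, 1, 5)),
    (3, date(2018, 2, 2)),
    (2, date(2018, 2, 3)),
    (1, date(2018, 2, 4)),
    (77, date(2018, 2, 1)),
    (33, date(2018, 2, 2)),
]


def filter_by_range(rows, start, end):
    """Keep only rows whose date falls in [start, end], like a date filter."""
    return [(value, day) for value, day in rows if start <= day <= end]


def max_minus_min(rows):
    """Absolute difference between the largest and smallest reading."""
    values = [value for value, _ in rows]
    return abs(max(values) - min(values))


def last_minus_first(rows):
    """Reading on the latest date minus reading on the earliest date."""
    ordered = sorted(rows, key=lambda row: row[1])
    return ordered[-1][0] - ordered[0][0]


january = filter_by_range(readings, date(2018, 1, 1), date(2018, 1, 31))
print(max_minus_min(january))     # 23, i.e. 24 - 1
print(last_minus_first(january))  # 3, i.e. 4 - 1
```

The two results differ whenever the extreme readings do not fall on the first and last dates. Note also that 2/2/2018 appears twice in the sample, so a first/last calculation needs a rule for duplicate dates.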

Find a growth rate that creates values adding to a determined total

I am trying to create a forecast tool that shows a smooth growth rate over a determined number of steps while adding up to a determined value. We have variables tied to certain sales values and want to illustrate different growth patterns. I am looking for a formula that would help us to determine the values of each individual step.
as an example: say we wanted to illustrate 100 units sold, starting with sales of 19 units, over 4 months with an even growth rate we would need to have individual month sales of 19, 23, 27 and 31. We can find these values with a lot of trial and error, but I am hoping that there is a formula that I could use to automatically calculate the values.
We will have a starting value (current or last month sales), a total amount of sales that we want to illustrate, and a period of time that we want to evaluate -- so all I am missing is a way to determine the change needed between individual values.
This basically is a problem in sequences and series. If the starting sales number is a, the difference in sales numbers between consecutive months is d, and the number of months is n, then the total sales is
S = n/2 * [2*a + (n-1) * d]
In your example, a=19, n=4, and S=100, with d unknown. That equation is easy to solve for d, and we get
d = 2 * (S - a * n) / (n * (n - 1))
There are other ways to write that, of course. If you substitute your example values into that expression, you get d=4, so the sales values increase by 4 each month.
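As a quick check of the algebra above, here is a minimal Python sketch that solves for d and lists the monthly values. The function names are illustrative, not from any library.

```python
def common_difference(total, start, periods):
    """Solve S = n/2 * (2*a + (n-1)*d) for the step d."""
    if periods < 2:
        raise ValueError("need at least two periods to have a step")
    return 2 * (total - start * periods) / (periods * (periods - 1))


def sales_schedule(total, start, periods):
    """Per-period values a, a+d, a+2d, ... that add up to `total`."""
    d = common_difference(total, start, periods)
    return [start + k * d for k in range(periods)]


schedule = sales_schedule(total=100, start=19, periods=4)
print(schedule)       # [19.0, 23.0, 27.0, 31.0]
print(sum(schedule))  # 100.0
```

If the total is smaller than start * periods, d comes out negative and the schedule declines instead of growing.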
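For Excel you can use this formula (with the total in B1, the starting value in B2, the number of periods in B3, and the period number in D1):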
=IF(D1<>"",(D1-1)*($B$1-$B$2*$B$3)/SUMPRODUCT(ROW($A$1:INDEX(A:A,$B$3-1)))+$B$2,"")
I would recommend using Excel.
This is simply a Y=mX+b equation.
Assuming you want a steady growth rate over x periods, you can use this formula to determine the slope of your line (the growth rate, designated 'm'). As long as you have your two data points (starting sales value and ending sales value), you can find 'm' using
m = (y2-y1) / (x2-x1)
That will calculate the slope. y2 represents your final sales goal, y1 your current sales level, and x2 the number of periods in the period of performance (how many months you are giving yourself to achieve the goal). x1 = 0, since it represents today, which is time period 0.
Once you solve for 'm', plug it into the formula y=mX+b. Your 'b' in this scenario is always equal to your current sales level (the y-intercept).
Then all you have to do to calculate the new 'Y', which represents the sales level at any period, is plug in any X value you choose. In the first month, X=1; in the second month, X=2. The 'm' and 'b' stay the same.
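The slope-intercept steps above can be sketched in Python as follows. The function name and the example inputs (current sales of 19, a goal of 31, three periods) are assumptions chosen to line up with the question's example. Unlike the series approach, this one needs the ending sales level rather than the total.

```python
def linear_sales(current, goal, periods):
    """Sales level at each period x = 0..periods on the line y = m*x + b."""
    if periods < 1:
        raise ValueError("need at least one period")
    m = (goal - current) / (periods - 0)  # slope: (y2 - y1) / (x2 - x1), with x1 = 0
    b = current                           # y-intercept: the current sales level
    return [m * x + b for x in range(periods + 1)]


levels = linear_sales(current=19, goal=31, periods=3)
print(levels)  # [19.0, 23.0, 27.0, 31.0]
```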
See the Excel template below which serves as a rudimentary model. The yellow boxes can be filled in by the user and the white boxes should be left as formulas.

Conditional Interpolation Excel

My actual Excel workbook is very large, but here is a simplified version of the setup:
Site1 X-Given Y-Given Site2 X-New-Given Y-Interpolated
A 10 400 A 25 550
A 20 500 A 25 550
A 30 600 A 26 560
A 40 700 B 27 570
A 50 800 B 30 600
B 10 400 B 15 450
B 20 500 B 25 550
B 30 600 B 30 600
What I'm trying to accomplish is to have each Y-Interpolated only interpolate based upon its specific site and not have any cross over. So site A would only interpolate with site A, and same with site B... so on and so forth.
I'm using the interpolate excel addin which has the following syntax:
=interpolate(x_array,y_array,x_given)
Thanks for the help!
You could try this worksheet function alternative... with data in A1:E9, enter this in F2 and fill down:
=FORECAST(E2,IF(MMULT(ROW(B$2:B$9)-LOOKUP(0,(B$2:B$9>=E2)/(A$2:A$9=D2),ROW(B$2:B$9))-0.5,1)^2<1,C$2:C$9),B$2:B$9)
Update: Here's a slightly shorter alternative entered with CTRL+SHIFT+ENTER
=PERCENTILE(IF(A$2:A$9=D2,C$2:C$9),PERCENTRANK(IF(A$2:A$9=D2,B$2:B$9),E2,20))
This assumes a positive relationship between variables and returns values at both boundaries.
Background
If you're going to use worksheet functions for this, the obvious approach is to find the neighboring two points to X: (X1,Y1) and (X2,Y2). Then calculate Y using:
Y = Y1 + (X - X1) * (Y2 - Y1) / (X2 - X1)
The problem is that this leads to a lengthy formula involving six INDEX/MATCH combinations and six more conditions for restricting data to the specified site. This leads one to look for other options...
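For comparison, the neighboring-points approach is short outside of worksheet functions. Here is a minimal Python sketch using the known points from the question's simplified table. The dictionary layout and function name are illustrative assumptions.

```python
from bisect import bisect_left

# Known points per site from the question's simplified table: (x, y).
known = {
    "A": [(10, 400), (20, 500), (30, 600), (40, 700), (50, 800)],
    "B": [(10, 400), (20, 500), (30, 600)],
}


def interpolate_for_site(site, x):
    """Linear interpolation using only the points that belong to `site`."""
    points = sorted(known[site])
    xs = [px for px, _ in points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the known range for this site")
    i = bisect_left(xs, x)
    if xs[i] == x:  # exact hit on a known point
        return points[i][1]
    (x1, y1), (x2, y2) = points[i - 1], points[i]
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)


print(interpolate_for_site("A", 26))  # 560.0
print(interpolate_for_site("B", 15))  # 450.0
```

Because each lookup only sees its own site's points, there is no crossover between sites. This sketch raises an error rather than extrapolating when x falls outside a site's range.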
1. The first formula looks complicated but all it's doing is applying a straight line fit based on the two neighboring points for the same site. Evaluating the formula for the third row above - by highlighting each part of the formula and pressing F9 - gives:
=FORECAST(26,{FALSE;500;600;FALSE;...},{10;20;30;40;...})
FORECAST ignores non-numeric data so the result is the same as just using {500,600} and {20,30} for the 2nd and 3rd arguments. You can use F9 on other parts of the formula to break it down further - I'll leave details to you. (The MMULT(...,1) part just changes the argument to an array so you can enter the formula without array-entry.)
2. The second formula is easier to follow. First note that in Excel percentiles are calculated by linear interpolation and the IF part is just restricting the numeric data to the specified site. Assuming data is increasing it follows that we can find the k-value in the PERCENTILE formula that matches the lookup value in the x-range and return the y-range value with that k-value. For the example in question:
26 =PERCENTILE({10,20,30,40,50},0.4)
560 =PERCENTILE({400,500,600,700,800},0.4)
To calculate the value of 0.4 the PERCENTRANK can be used which is inverse to PERCENTILE:
0.4 =PERCENTRANK({10,20,30,40,50},26)
0.4 =PERCENTRANK({400,500,600,700,800},560)
The formula above follows by combining these two functions, the last argument is set to 20 for full precision (Excel stores values internally to around 15-17 digits of precision).
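The PERCENTILE/PERCENTRANK round trip can be illustrated with a small Python sketch. The two helpers below are hand-written stand-ins that assume Excel's linear-interpolation behavior and strictly increasing data. They are not Excel's implementation.

```python
def percentile_inc(values, k):
    """Linear-interpolation percentile for 0 <= k <= 1 (assumed PERCENTILE behavior)."""
    data = sorted(values)
    pos = k * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])


def percentrank_inc(values, x):
    """Inverse of percentile_inc for x inside the range of strictly increasing data."""
    data = sorted(values)
    if len(data) < 2 or not data[0] <= x <= data[-1]:
        raise ValueError("x must lie inside the range of at least two values")
    for i in range(len(data) - 1):
        if data[i] <= x <= data[i + 1]:
            frac = (x - data[i]) / (data[i + 1] - data[i])
            return (i + frac) / (len(data) - 1)


xs = [10, 20, 30, 40, 50]
ys = [400, 500, 600, 700, 800]

k = percentrank_inc(xs, 26)                # rank of the lookup value in the x-range
print(round(k, 6))                         # 0.4
print(round(percentile_inc(ys, k), 6))     # 560.0
```

The y-value is recovered at the same k only because both ranges are increasing and have the same number of points, which is the assumption stated above.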
Because the tool you're using is an .xll add-in for Excel, you (or we) cannot modify the code or create a custom version of interpolate that accepts conditions.
Instead, you'll have to split your data by site and then run the custom function on each filtered data set.
