Linear calculation of time in excel - excel

I have a column containing time (s) in Excel. The problem is that there are duplicate time values, and a given time can be repeated n times. What I'm trying to achieve is to divide the time step linearly. As you can see below, 0.02 is repeated 3 times (i.e. n = 3), so ideally I would find the difference between 0.02 and 0.01 and divide it by n. The first time value after 0.01 would then be 0.01333, worked out as 0.01 + (0.02 - 0.01)/n.
The problem is n is not constant and could have any value between 2 and 10.
Please find a sample of the data below.
time (s)
0.00
0.01
0.02
0.02
0.02
0.03
0.03
0.03
0.03
0.03
0.03
0.04
0.04
0.04
0.04

Assuming your list starts in cell A1, put this in cell B2 and fill it down:
=IF(COUNTIF(A:A,A2)=1,A2,B1+(A2-AGGREGATE(14,6,($A$2:A2)/($A$2:A2<>A2),1))/COUNTIF(A:A,A2))
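For readers who want to check the formula's output outside Excel, the same linear spreading can be sketched in pandas. This is only an illustration, not part of the original answer: the function name `spread_repeats` is hypothetical, and it assumes the times are sorted so that repeated values sit next to each other.

```python
import pandas as pd

def spread_repeats(times):
    """Spread each run of repeated times linearly over the preceding step.

    Hypothetical helper; assumes `times` is sorted in non-decreasing order.
    """
    s = pd.Series(times, dtype=float)
    # Give every run of identical consecutive values its own id.
    run = (s != s.shift()).cumsum()
    n = s.groupby(run).transform("count")   # run length (n in the question)
    k = s.groupby(run).cumcount() + 1       # position within the run: 1..n
    # Value of the previous run; the first run has none, so it keeps its own value.
    prev = run.map(s.groupby(run).first().shift()).fillna(s)
    return prev + k * (s - prev) / n

sample = [0.00, 0.01, 0.02, 0.02, 0.02,
          0.03, 0.03, 0.03, 0.03, 0.03, 0.03,
          0.04, 0.04, 0.04, 0.04]
out = spread_repeats(sample)
print(out.round(5).tolist())
```

The last value of each run lands exactly on the original time, so the three 0.02 rows become roughly 0.01333, 0.01667 and 0.02.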

Related

How to get the column name of a dataframe from values in a numpy array

I have a df with 15 columns:
df.columns:
0 class
1 name
2 location
3 income
4 edu_level
--
14 marital_status
After some transformations I got a numpy.ndarray with shape (15, 3) named loads:
0.52 0.33 0.09
0.20 0.53 0.23
0.60 0.28 0.23
0.13 0.45 0.41
0.49 0.9
so on so on so on
So, 3 columns with 15 values.
What I need to do:
I want to get the df column names for the values in the first column of loads that are greater than 0.50.
For this example, the df columns related to the values higher than 0.5 in the first column of loads should be returned:
0 class
2 location
Same for the second column of loads, should return:
1 name
3 income
4 edu_level
and the same logic to the 3rd column of loads.
I managed to get the numpy array loads the way I need it, but I am having a hard time with this last part. I know I can simply pick the columns manually, but this will be a hard task when df has more than 15 features.
Can anyone help me, please?
Given your threshold, you can create a boolean array and use it to filter df.columns:
threshold = .5
for j in range(loads.shape[1]):
    # loads[:, j] > threshold is a boolean mask with one entry per df column
    print(df.columns[loads[:, j] > threshold])
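A self-contained sketch of the same masking idea, collecting the names into a dictionary instead of printing them. The five-column frame and the `loads` values here are illustrative only: they follow the rows shown in the question, with the missing third value of the last row made up, and the name `selected` is hypothetical.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the real df: only the column names matter here.
df = pd.DataFrame(columns=["class", "name", "location", "income", "edu_level"])

# One row of loads per df column, one column per component.
# The last entry (0.10) is invented for the example.
loads = np.array([
    [0.52, 0.33, 0.09],
    [0.20, 0.53, 0.23],
    [0.60, 0.28, 0.23],
    [0.13, 0.45, 0.41],
    [0.49, 0.90, 0.10],
])

threshold = 0.5
# For each column j of loads, keep the df column names whose value exceeds the threshold.
selected = {
    j: df.columns[loads[:, j] > threshold].tolist()
    for j in range(loads.shape[1])
}
print(selected)
```

This relies on `loads` having exactly one row per column of `df`, in the same order; if the orders differ, the mask will pick the wrong names.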

Create unique list from 2 columns and sum values per row based on that unique list from 2 value columns

Having scoured numerous posts I am still struggling to find a solution for a report I am trying to transition over to PowerBI, from MS Excel.
Problem
Create a table in the report section of PowerBI which has a unique list of currencies (based on 2 columns) and their corresponding FX exposure, defined from each currency leg in the 2 columns. Below I have shown the source data and the workings I use in Excel, which I am trying to replicate.
Source data (from database table)
a            | b          | c          | d            | e            | f              | g
Instrument   | Currency 1 | Currency 2 | FX nominal 1 | FX nominal 2 | FXNom1 - Gross | FXNom2 - Gross
FWD EUR/USD  | EUR        | USD        | -7.965264529 | 7.90296523   | 7.97           | 7.90
FWD USD/JPY  | USD        | JPY        | 1.030513307  | -1.070305687 | 1.03           | 1.07
Instrument 1 | USD        |            | 1.75862819   |              | 1.76           | 0.00
Instrument 2 | USD        | TRY        | 0            | 3.45E-04     | 0.00           | 0.00
Instrument 3 | JPY        |            | 1.121782037  |              | 1.12           | 0.00
Instrument 4 | EUR        |            | 6.2505079    |              | 6.25           | 0.00
FWD EUR/CNH  | EUR        | CNH        | 0.007591392  | 3.00E-09     | 0.01           | 0.00
Instrument 5 | RUB        |            | 6.209882675  |              | 6.21           | 0.00
F2 = ABS(FX nominal 1)
G2 = ABS(FX nominal 2)
Report output in excel
a   | b     | c     | d     | e
FX  | Long  | Short | Net   | Gross
0   | 0.00  | 0.00  | 0.00  | 0.00
RUB | 6.21  | 0.00  | 6.21  | 6.21
EUR | 6.26  | -7.97 | -1.71 | 14.22
JPY | 1.12  | -1.07 | 0.05  | 2.19
USD | 10.69 | 0.00  | 10.69 | 10.69
CNH | 0.00  | 0.00  | 0.00  | 0.00
TRY | 0.00  | 0.00  | 0.00  | 0.00
My Excel formulas to recreate what I am looking for are below.
A2: =IFERROR(LOOKUP(2, 1/(COUNTIF(Report!$A$1:A1,Data!$B$2:$B$553)=0), Data!$B$2:$B$553), LOOKUP(2, 1/(COUNTIF(Report!$A$1:A1, Data!$C$2:$C$553)=0), Data!$C$2:$C$553))
B2: =((SUMIFS(Data!$D$2:$D$553, Data!$B$2:$B$553, Report!$A2, Data!$D$2:$D$553, ">0"))+(SUMIFS(Data!$E$2:$E$553, Data!$C$2:$C$553, Report!$A2, Data!$E$2:$E$553, ">0")))
C2: =((SUMIFS(Data!$D$2:$D$553, Data!$B$2:$B$553, Report!$A3, Data!$D$2:$D$553, "<0"))+(SUMIFS(Data!$E$2:$E$553, Data!$C$2:$C$553, Report!$A3, Data!$E$2:$E$553, "<0")))
D2: =(SUMIF(Data!$B$1:$B$553,Report!$A3,Data!$D$1:$D$553)+SUMIF(Data!$C$1:$C$553,Report!$A3,Data!$E$1:$E$553))
E2: =(SUMIF(Data!$B$1:$B$554,Report!$A3,Data!$F$1:$F$554)+SUMIF(Data!$C$1:$C$554,Report!$A3,Data!$G$1:$G$554))
Now I believe I've managed to find a hack by using the UNION/SELECTCOLUMNS functions, but when you try to graph the output it is very small (as if there is other data it is trying to find behind the scenes). Note that I tend to filter on date to get the output I need (this is mapped using relationships across other data tables).
FX =
DISTINCT (
    UNION (
        SELECTCOLUMNS ( DATA, "Date", [DATE], "Currency", [CURRENCY1], "FXNom", [FXNOMINAL1] ),
        SELECTCOLUMNS ( DATA, "Date", [DATE], "Currency", [CURRENCY2], "FXNom", [FXNOMINAL2] )
    )
)
If anyone has any ideas I would be very grateful as I still feel my workaround is more of a lucky hack.
Thanks!
The approach that you're using looks nearly ideal. From a dimensional model perspective, you want one column for values and one column for currency labels. So selecting those pairs as different tables and appending them with UNION is the right way to go. Generally, I think it's better to do all the transformation you can in Power Query; using DAX this way can lead to some limitations.
But if we're going with DAX, I do think you want to get rid of DISTINCT. This could cause identical positions to be collapsed into a single row and you'd lose data this way.
FX =
UNION (
    SELECTCOLUMNS ( FX_Raw, "Date", "FakeDate", "Currency", [CURRENCY 1], "FXNom", [FX nominal 1] ),
    SELECTCOLUMNS ( FX_Raw, "Date", "FakeDate", "Currency", [CURRENCY 2], "FXNom", [FX nominal 2] )
)
And then a few measures:
Long =
    CALCULATE(sum(FX[FXNom]), FX[FXNom] >= 0)
Short =
    CALCULATE(sum(FX[FXNom]), FX[FXNom] < 0)
Gross =
    SUMX( FX, if(FX[FXNom] > 0, FX[FXNom], 0-FX[FXNom]))
Net =
    SUM(FX[FXNom])
This seems to produce the desired result.
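As a cross-check outside Power BI, the same unpivot-and-aggregate logic can be sketched in pandas using the sample rows from the question. This is not DAX and not part of the original answer; the variable names (`raw`, `fx`, `report`) are hypothetical, and rows without a second currency leg are simply dropped.

```python
import pandas as pd

# Sample rows from the question (Currency 2 / FX nominal 2 are blank for some instruments).
raw = pd.DataFrame({
    "Currency 1": ["EUR", "USD", "USD", "USD", "JPY", "EUR", "EUR", "RUB"],
    "Currency 2": ["USD", "JPY", None, "TRY", None, None, "CNH", None],
    "FX nominal 1": [-7.965264529, 1.030513307, 1.75862819, 0.0,
                     1.121782037, 6.2505079, 0.007591392, 6.209882675],
    "FX nominal 2": [7.90296523, -1.070305687, None, 3.45e-04,
                     None, None, 3.00e-09, None],
})

# Stack the two currency legs into one (Currency, FXNom) table, like UNION in DAX.
leg1 = raw[["Currency 1", "FX nominal 1"]].rename(
    columns={"Currency 1": "Currency", "FX nominal 1": "FXNom"})
leg2 = raw[["Currency 2", "FX nominal 2"]].rename(
    columns={"Currency 2": "Currency", "FX nominal 2": "FXNom"})
fx = pd.concat([leg1, leg2], ignore_index=True).dropna(subset=["Currency"])

# Split each position into long / short parts, then total per currency.
fx = fx.assign(
    Long=fx["FXNom"].clip(lower=0),
    Short=fx["FXNom"].clip(upper=0),
    Net=fx["FXNom"],
    Gross=fx["FXNom"].abs(),
)
report = fx.groupby("Currency")[["Long", "Short", "Net", "Gross"]].sum()
print(report.round(2))
```

No de-duplication is applied to the stacked table, which matches the point about DISTINCT: two identical positions should both count towards the totals.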

How to add number from file to variable?

I have a file:
0 3 0.071 0.082 0.002
0 4 144 145.5 0.2
0 6 0.36 0.46 0.02
and I would like to load a single number from it into a variable. How can I do that? I know how to load a column into a variable. Is it possible by using a table?
This is the code to load a column into a variable, but I don't know how to load a single number into a variable.
x = np.loadtxt('file', unpack=True, usecols=[1])
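One possible approach, sketched here under the assumption that the file is small enough to load whole: read it as a 2-D array and index a single element by row and column. The in-memory `text` stands in for the real file so that the example is self-contained; the names `data` and `value` are illustrative.

```python
import io
import numpy as np

# Stand-in for the file contents shown above.
text = """0 3 0.071 0.082 0.002
0 4 144 145.5 0.2
0 6 0.36 0.46 0.02
"""

# Without unpack/usecols, loadtxt returns a 2-D array: rows x columns.
data = np.loadtxt(io.StringIO(text))

value = data[0, 2]    # single number: first row, third column
column = data[:, 1]   # a whole column, equivalent to usecols=[1]
print(value, column)
```

With a real file, `np.loadtxt('file')` in place of the `io.StringIO` object should give the same array.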

Excel and selecting variables conditionally

I have a data set which contains information by country. For example, Australia_F is the observation for Australia and Australia_Weight is the weight of Australia. Each period represents a specific year.
Period Australia_F Canada_F Denmark_F Japan_F Australia_Weight Canada_Weight Denmark_Weight Japan_weight
1985 0.05 -0.02 0.02 0.03 0.10 0.30 0.45 0.15
1986 -0.04 -0.03 0.02 0.01 0.15 0.30 0.30 0.25
The user can input any value to the following cell. For example I have inserted 3
Weight_Modification = 3
The goal is to include only countries where the variable XXXXX_F is positive,
and to use those with the highest values such that the total weight of the countries selected is not greater than 1.
The problem is complicated by the fact that the Weight_Modification variable multiplies each individual country weight by whatever its value is. For example, the weight for Australia in 1985 would be 0.10 * 3 = 0.3.
Total weights can be less than 1.00 but can't be greater than 1.00.
So taking the above data as an example and for 1985 the results would be
Australia_weight  Canada_weight  Denmark_weight  Japan_weight  Total_weight
0.3                                              0.45          0.75
This is because in 1985 Australia has the highest value (Australia_F = 0.05), followed by Japan (Japan_F = 0.03).
Each country's weight is multiplied by 3.
Denmark is not selected even though Denmark_F is positive, because including Denmark would push the total weight above 1.
In the actual file there are many more countries (12 in total) and many years.
Any help with how to put this together in excel is greatly appreciated.
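The selection rule described above can be sketched in Python to pin down the logic before building it in Excel. This is not an Excel solution, and it rests on one assumption the question leaves open: selection stops at the first country whose scaled weight would push the total above 1, rather than skipping it and trying smaller ones. The function name `select_weights` is hypothetical.

```python
def select_weights(scores, weights, modification, cap=1.0):
    """Pick positive-score countries, highest score first, until the cap is hit.

    Assumption: stop at the first country that would exceed the cap.
    """
    # Scale the weights of countries with a positive score.
    scaled = {c: weights[c] * modification for c in scores if scores[c] > 0}
    selected, total = {}, 0.0
    for country in sorted(scaled, key=lambda c: scores[c], reverse=True):
        # Small tolerance so a total of exactly 1.00 is not rejected by rounding.
        if total + scaled[country] > cap + 1e-12:
            break
        selected[country] = scaled[country]
        total += scaled[country]
    return selected, total

scores_1985 = {"Australia": 0.05, "Canada": -0.02, "Denmark": 0.02, "Japan": 0.03}
weights_1985 = {"Australia": 0.10, "Canada": 0.30, "Denmark": 0.45, "Japan": 0.15}
selected, total = select_weights(scores_1985, weights_1985, modification=3)
print(selected, round(total, 2))
```

For 1985 this selects Australia and Japan with a total weight of about 0.75, matching the expected result in the question; Denmark is rejected because its scaled weight of 1.35 would exceed the cap.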

pandas, how to get close price from returns?

I'm trying to convert from returns to a price index to simulate close prices for the ffn library, but without success.
import pandas as pd
times = pd.to_datetime(pd.Series(['2014-07-4',
'2014-07-15','2014-08-25','2014-08-25','2014-09-10','2014-09-15']))
strategypercentage = [0.01, 0.02, -0.03, 0.04,0.5,-0.3]
df = pd.DataFrame({'llt_return': strategypercentage}, index=times)
df['llt_close']=1
df['llt_close']=df['llt_close'].shift(1)*(1+df['llt_return'])
df.head(10)
llt_return llt_close
2014-07-04 0.01 NaN
2014-07-15 0.02 1.02
2014-08-25 -0.03 0.97
2014-08-25 0.04 1.04
2014-09-10 0.50 1.50
2014-09-15 -0.30 0.70
How can I make this correct?
You can use the cumulative product of return-relatives.
A return-relative is one-plus that day's return.
>>> start = 1.0
>>> df['llt_close'] = start * (1 + df['llt_return']).cumprod()
>>> df
llt_return llt_close
2014-07-04 0.01 1.0100
2014-07-15 0.02 1.0302
2014-08-25 -0.03 0.9993
2014-08-25 0.04 1.0393
2014-09-10 0.50 1.5589
2014-09-15 -0.30 1.0912
This assumes the price index starts at start on the close of the trading day prior to 2014-07-04.
On 7-04, you have a 1% return and the price index closes at 1 * (1 + .01) = 1.01.
On 7-15, return was 2%; close price will be 1.01 * (1 + .02) = 1.0302.
Granted, this is not completely realistic given that you're forming a price index from irregular-frequency data (missing dates), but hopefully this answers your question.
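A quick way to sanity-check the cumulative-product approach is to invert it: dividing each close by the previous close (with `start` as the level before the first row) should give back the original returns. The sketch below uses a plain integer index instead of the dates from the question, and the name `implied` is illustrative.

```python
import pandas as pd

returns = pd.Series([0.01, 0.02, -0.03, 0.04, 0.5, -0.3])
start = 1.0

# Price index: starting level times the running product of (1 + return).
close = start * (1 + returns).cumprod()

# Round trip: recover the returns from consecutive closes.
# fill_value=start supplies the level before the first observation.
implied = close / close.shift(1, fill_value=start) - 1

print(close.round(4).tolist())
print(implied.round(4).tolist())
```

The shift is positional, so the duplicate 2014-08-25 date in the original index would not affect this check.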

Resources