I have a dataset (in Excel) consisting of quarterly observations.
For each of these Observations, I also have an exact date and time, like:
Obs.value, 20000101, 07:00.
Also I created an eViews Workfile with 10 Minute frequency.
Now I want to import that quarterly data to the eViews workfile.
For all (10min frequency) observations where there is no corresponding value in the data, the observation value should be zero. That means in the end I need to have a 10 Min frequency vector with one observation value per quarter and zeros otherwise:
.
.
.
.
20000101, 06:50, 0
20000101, 07:00, Obs. Value (one per quarter)
20000101, 07:10, 0
.
.
.
Does anyone know how to program that in eViews?
For those finding this...looks like it was resolved over on the Eviews forums.
http://forums.eviews.com/viewtopic.php?f=3&t=9068
I am trying to find the total duration during which the wave height is under 3 m and the time of day is between 5:00 am and 6:00 pm. I need this duration for a month of tidal data.
I have raw data for wave height and timestamps when it is high and low.
eg.
Timestamp Wave_Height
1/01/2022 3:16 0.68
1/01/2022 9:37 6.62
1/01/2022 16:14 1.07
1/01/2022 21:54 5.37
2/01/2022 4:06 0.59
etc…
So far I have got linear interpolation to find points where wave height=3. I am struggling to get a function to find the durations for my limits on time.
Included a picture to explain
Graph of wave data over time
The timestamps span different days in the month, so the difference between times must account for the date changing in some cases (see the rev 2 errors, shown as #######, which occur where the date changes).
rev 2 error
The following should work. I have added some columns to avoid complicated formulas.
interpolate when the wave_height = 3 (column G)
add column H which is True when wave_height increases and False if it decreases (at the time in column G):
so cell H6 = F7<3 gives TRUE
add column E to limit the time window to 5:00-18:00.
E7 is =IF(D7<$G$2;$G$2;IF(D7>$H$2;$H$2;D7))
Added column I to calculate the time during which wave_height < 3. The sum of that column is what you need.
I8 is =H8*(G8-E7)+NOT(H8)*(D8-G8)
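For readers working outside Excel, the same quantity can be approximated in pandas. This is only a sketch and not the method used in the answer above: it interpolates linearly onto a one-minute grid and counts the grid points below the threshold inside the daily window, so the result is accurate to roughly a minute. The function name minutes_below and its parameters are hypothetical; the sample readings are the five rows from the question, read as day/month/year.

```python
import pandas as pd

def minutes_below(heights, threshold=3.0, start="05:00", end="18:00"):
    """Approximate minutes with height < threshold between start and end each day.

    heights: Series of wave heights with a DatetimeIndex (assumed input layout).
    """
    heights = heights.sort_index()
    # One-minute grid spanning the readings.
    grid = pd.date_range(heights.index[0].ceil("min"),
                         heights.index[-1].floor("min"), freq="min")
    # Linear interpolation in time between consecutive readings.
    dense = (heights.reindex(heights.index.union(grid))
                    .interpolate(method="time")
                    .reindex(grid))
    # Keep only the daily window, then count grid points under the threshold.
    in_window = dense.between_time(start, end)
    return int((in_window < threshold).sum())

# The five sample readings from the question (day/month/year).
stamps = pd.to_datetime(
    ["1/01/2022 3:16", "1/01/2022 9:37", "1/01/2022 16:14",
     "1/01/2022 21:54", "2/01/2022 4:06"],
    format="%d/%m/%Y %H:%M",
)
sample = pd.Series([0.68, 6.62, 1.07, 5.37, 0.59], index=stamps)
print(minutes_below(sample), "minutes below 3 m between 05:00 and 18:00")
```

Because the window is applied per time of day, readings that span several dates need no special handling for the date change.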
I'm looking to perform data analysis on 100-years of climatological data for select U.S. locations (8 in particular), for each day spanning the 100-years. I have a pandas dataFrame set up with columns for Max temperature, Min temperature, Avg temperature, Snowfall, Precip Total, and then Day, Year, and Month values (then, I have an index also based on a date-time value). Right now, I want to set up a for loop to print the first Maximum temperature of 90 degrees F or greater from each year, but ONLY the first. Eventually, I want to narrow this down to each of my 8 locations, but first I just want to get the for loop to work.
Experimented with various iterations of a for loop.
for year in range(len(climate['Year'])):
    if (climate['Max'][year] >=90).all():
        print (climate.index[year])
        break
Unsurprisingly, the output of the loop I provided prints the first 90 degree day period (from the year 1919, the beginning of my data frame) and breaks.
1919-06-12 00:00:00
That's fine. If I take out the break statement, all of the 90 degree days print, including multiple in the same year. I just want the first value from each year to print. Do I need to set up a second for loop to increment through the year? If I explicitly state the year, ala below, while trying to loop through a counter, the loop still begins in 1919 and eventually reaches an out of bounds index. I know this logic is incorrect.
count = 1919
while count < 2019:
    for year in range(len(climate['Year'])):
        if (climate[climate['Year']==count]['Max'][year] >=90).all():
            print (climate.index[year])
    count = count+1
Any input is sincerely appreciated.
You can achieve this without having a second for loop. Assuming the climate dataframe is ordered chronologically, this should do what you want:
current_year = None
for i in range(climate.shape[0]):
    # .iloc selects by position; the index itself holds dates, not row numbers
    if climate['Max'].iloc[i] >= 90 and climate['Year'].iloc[i] != current_year:
        print(climate.index[i])
        current_year = climate['Year'].iloc[i]
Notice that we're using the current_year variable to keep track of the latest year that we've already printed the result for. Then, in the if check, we're checking if we've already printed a result for the year of the current row in the loop.
That's one way to do it, but I would suggest taking a look at pandas.DataFrame.groupby because I think it fits your use case well. You could get a dataframe that contains the first >=90 max days per year with the following (again assuming climate is ordered chronologically):
climate[climate.Max >= 90].groupby('Year').first()
This just filters the dataframe to only contain the >=90 max days, groups rows from the same year together, and retains only the first row from each group. If you had an additional column Location, you could extend this to get the same except per location per year:
climate[climate.Max >= 90].groupby(['Location', 'Year']).first()
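One thing to be aware of: .first() returns a frame indexed by the group keys, so the date index is not part of the result. A small variation, sketched below, uses groupby(...).head(1), which keeps the original rows together with their date index. The helper name first_hot_days and the tiny sample frame are made up for illustration; only the Max and Year columns come from the question.

```python
import pandas as pd

def first_hot_days(climate, threshold=90, keys=("Year",)):
    """Return the first row per group whose Max is at or above the threshold."""
    # Filter to hot days and make sure rows are in chronological order.
    hot = climate[climate["Max"] >= threshold].sort_index()
    # head(1) keeps the first row of each group and preserves the date index.
    return hot.groupby(list(keys)).head(1)

# Made-up sample data with a date index, mirroring the question's layout.
dates = pd.to_datetime(
    ["1919-06-11", "1919-06-12", "1919-07-01", "1920-05-30", "1920-06-02"]
)
climate = pd.DataFrame(
    {"Max": [88, 91, 95, 92, 89], "Year": [d.year for d in dates]},
    index=dates,
)
print(first_hot_days(climate))
```

With a Location column present, passing keys=("Location", "Year") would give the first such day per location per year.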
I am trying to create a forecast tool that shows a smooth growth rate over a determined number of steps while adding up to a determined value. We have variables tied to certain sales values and want to illustrate different growth patterns. I am looking for a formula that would help us to determine the values of each individual step.
As an example: say we wanted to illustrate 100 units sold over 4 months with an even growth rate, starting with sales of 19 units. We would need individual month sales of 19, 23, 27 and 31. We can find these values with a lot of trial and error, but I am hoping that there is a formula that I could use to calculate the values automatically.
We will have a starting value (current or last month sales), a total amount of sales that we want to illustrate, and a period of time that we want to evaluate -- so all I am missing is a way to determine the change needed between individual values.
This basically is a problem in sequences and series. If the starting sales number is a, the difference in sales numbers between consecutive months is d, and the number of months is n, then the total sales is
S = n/2 * [2*a + (n-1) * d]
In your example, a=19, n=4, and S=100, with d unknown. That equation is easy to solve for d, and we get
d = 2 * (S - a * n) / (n * (n - 1))
There are other ways to write that, of course. If you substitute your example values into that expression, you get d=4, so the sales values increase by 4 each month.
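As a quick check of the expression for d, here is a minimal Python sketch that computes the step and the resulting monthly values. The function name even_growth_steps is hypothetical; a, S and n follow the symbols used above.

```python
def even_growth_steps(a, S, n):
    """Monthly values of an arithmetic series starting at a, with n terms summing to S."""
    if n < 2:
        raise ValueError("need at least two periods to have a step between them")
    # Solve S = n/2 * [2*a + (n-1)*d] for d.
    d = 2 * (S - a * n) / (n * (n - 1))
    return [a + k * d for k in range(n)]

# Example from the question: start at 19, total 100, 4 months.
print(even_growth_steps(19, 100, 4))  # [19.0, 23.0, 27.0, 31.0]
```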
For excel you can use this formula:
=IF(D1<>"",(D1-1)*($B$1-$B$2*$B$3)/SUMPRODUCT(ROW($A$1:INDEX(A:A,$B$3-1)))+$B$2,"")
I would recommend using Excel.
This is simply a Y=mX+b equation.
Assuming you want a steady growth rate over a time with x periods you can use this formula to determine the slope of your line (growth rate - designated as 'm'). As long as you have your two data points (starting sales value & ending sales value) you can find 'm' using
m = (y2-y1) / (x2-x1)
That will calculate the slope. Y2 represents your final sales goal. Y1 represents your current sales level. X2 is your number of periods in the period of performance (so how many months are you giving to achieve the goal). X1 = 0 since it represents today which is time period 0.
Once you solve for 'm' this will plug into the formula y=mX+b. Your 'b' in this scenario will always be equal to your current sales level (this represents the y intercept).
Then all you have to do to calculate the new 'Y', which represents the sales level at any period, is plug in any X value you choose. So if you are in the first month, then X=1. If you are in the second month, X=2. The 'm' and 'b' stay the same.
See the Excel template below which serves as a rudimentary model. The yellow boxes can be filled in by the user and the white boxes should be left as formulas.
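Outside Excel, the same line can be sketched in a few lines of Python. Note that this approach is pinned to a final sales level rather than to a total over the period. The function name linear_sales_levels is hypothetical, and the example numbers are only illustrative.

```python
def linear_sales_levels(current, goal, periods):
    """Sales level at each period x = 0..periods along the line y = m*x + b."""
    # Slope between (x1, y1) = (0, current) and (x2, y2) = (periods, goal).
    m = (goal - current) / (periods - 0)
    b = current  # y-intercept: the sales level today (period 0)
    return [m * x + b for x in range(periods + 1)]

# Illustrative values: from 19 today to 31 three periods from now.
print(linear_sales_levels(19, 31, 3))  # [19.0, 23.0, 27.0, 31.0]
```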
Sorry for the ambiguous title, I have a query which is stumping me in Excel:
I have a range of temperature data, recordings from every minute of every day for 3 months.
I want to find out how many times the average temperature from 20:30-21:30 on each day is lower than the average temperature from 01:00-02:00 the following morning (about 5 hours difference).
If that is difficult to understand here is a "logic formula":
count(averageTemp(dateX(timeA-timeA+1))<averageTemp(dateY(timeB-timeB+1)))
Here's a sample of the data as a screenshot:
Please help me out, this one has me scratching my head.
Enter this as an array formula (ctrl+shift+enter) and change "122401" to the last row number of your data range:
=SUM(IFERROR(--(AVERAGEIFS(C2:C122401,B2:B122401,"<="&TIMEVALUE("21:30"),B2:B122401,">="&TIMEVALUE("20:30"),A2:A122401,ROW(INDIRECT(A2&":"&A122401)))<AVERAGEIFS(C2:C122401,B2:B122401,"<="&TIMEVALUE("02:00"),B2:B122401,">="&TIMEVALUE("01:00"),A2:A122401,ROW(INDIRECT(A2+1&":"&A122401)))),0))
This assumes that the first set of temperatures from 01:00-02:00 does not have a matching set from 20:30-21:30.
I would input a flag in column D that takes value 1/0 whether the time is in the frame you are interested in.
So input in D2 =IF(OR(AND(B2>=TIMEVALUE("20:30"),B2<=TIMEVALUE("21:30")),AND(B2>=TIMEVALUE("01:00"),B2<=TIMEVALUE("02:00"))),1,0).
Then I would go in column C and check if in D I got 1, input a simple IF statement to check for the temperature.
Let me know if it works!
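If the data can be loaded into pandas, the comparison can also be sketched there. This is a hedged sketch and not a translation of either formula above: it assumes a Series of temperatures with a DatetimeIndex (one reading per minute), and the function name count_cooler_evenings is hypothetical. Evenings without a following morning in the data are skipped.

```python
import pandas as pd

def count_cooler_evenings(temps):
    """Count days whose 20:30-21:30 average is below the next day's 01:00-02:00 average."""
    evening = temps.between_time("20:30", "21:30")
    morning = temps.between_time("01:00", "02:00")
    # Average each window per calendar day.
    evening_avg = evening.groupby(evening.index.normalize()).mean()
    morning_avg = morning.groupby(morning.index.normalize()).mean()
    # Shift morning dates back one day so each evening lines up
    # with the following morning.
    morning_avg.index = morning_avg.index - pd.Timedelta(days=1)
    paired = pd.concat({"evening": evening_avg, "morning": morning_avg}, axis=1).dropna()
    return int((paired["evening"] < paired["morning"]).sum())

# Made-up data: three days at a constant 10 degrees, with a cooler first evening.
idx = pd.date_range("2022-01-01", "2022-01-03 23:59", freq="min")
temps = pd.Series(10.0, index=idx)
temps.loc["2022-01-01 20:30":"2022-01-01 21:30"] = 5.0
print(count_cooler_evenings(temps))
```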
I'm trying to find the maximum value from the 15 minute interval data that has dates associated with each row seen below:
DATE UOM 00:01-00:15 kW 00:16-00:30 kW 00:31-00:45 kW 00:46-01:00 kW
7/1/2010 KW 907.2 892.8 883.2 883.2
7/2/2010 KW 907.2 849.6 859.2 825.6
7/3/2010 KW 811.2 806.4 806.4 801.6
7/4/2010 KW 763.2 768 758.4 772.8
This data is the electrical demand for my school's campus, and I'm trying to find peak, partial peak, and off peak maximum demands. There are approximately 4 years of data, with each row covering a single day.
Peak hours occur during 12:00 - 18:00 hours
Partial Peak occurs during 08:31 - 11:59 & 18:00 - 21:30
Off Peak occurs during 21:30 - 08:30
I'd like to be able to get those values for each month of each year. But so far the logic isn't coming to me, and everything I'm looking up just shows me index-match tutorials. Any help would be greatly appreciated.
Simply use MAX or a combination of two MAX functions in order to determine maximums for any given timespan.
In my screenshot, you can see how the ranges are defined by the columns. Therefore you may have to adjust the ranges to correspond to your actual spreadsheet.
For example, for cell CW1 it uses the formula =MAX($AY2:$BV2). This returns the maximum over all 15-minute time spans within that range. Because the 12:01 interval starts in column AY, and the interval ending at 18:00 is in column BV, it's possible to find the maximum between 12:01 - 18:00 by using the MAX function.
For time spans that are not continuous, we can split them into multiple ranges. For CX and CY we do this by using two MAX functions. So a maximum value is retrieved for each continuous time span, and then the outer MAX determines the maximum of the two local maximums.
Therefore, for CX:
=MAX(MAX($AK2:$AX2),MAX($BW2:$CJ2))
For CY:
=MAX(MAX($C2:$AJ2),MAX($CK2:$CT2))
Note that I don't have your full data set, so these values are garbage.
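The MAX formulas above give one value per row (per day). To roll that up to each month of each year, one option outside Excel is a pandas sketch like the following. It assumes the wide layout from the question (DATE, UOM, then one column per interval labelled like "12:01-12:15 kW") and dates in month/day/year form; the names period_of and monthly_max_demand are hypothetical, and the demo values are made up.

```python
import pandas as pd

def period_of(label):
    """Classify an interval column such as '12:01-12:15 kW' by its start time."""
    start = label[:5]  # zero-padded "HH:MM" strings compare correctly as text
    if "12:01" <= start < "18:01":
        return "peak"
    if "08:31" <= start < "12:01" or "18:01" <= start < "21:31":
        return "partial"
    return "off"

def monthly_max_demand(wide):
    """Maximum kW per month for each period (peak, partial, off)."""
    # One row per (date, interval) instead of one column per interval.
    long = wide.melt(id_vars=["DATE", "UOM"], var_name="interval", value_name="kW")
    long["period"] = long["interval"].map(period_of)
    long["month"] = pd.to_datetime(long["DATE"], format="%m/%d/%Y").dt.to_period("M")
    return long.groupby(["month", "period"])["kW"].max().unstack("period")

# Made-up demo with one interval from each period.
demo = pd.DataFrame({
    "DATE": ["7/1/2010", "7/2/2010", "8/1/2010"],
    "UOM": ["KW", "KW", "KW"],
    "00:01-00:15 kW": [1.0, 2.0, 3.0],
    "12:01-12:15 kW": [10.0, 20.0, 30.0],
    "18:01-18:15 kW": [5.0, 4.0, 6.0],
})
print(monthly_max_demand(demo))
```

The period boundaries mirror the column ranges used in the formulas above (12:01 to 18:00 as peak, 08:31 to 12:00 and 18:01 to 21:30 as partial peak, everything else as off peak).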