Hi guys, I have a DataFrame where I want to first group rows by one column, then find the rows in each group whose values sum to a given value in another column.
**A** **B** **C**
XCD 1 5
FFF 12 2
VB 3 6
XCD 8 5
AAA 2 7
AAA 5 7
XCD 4 5
VB 6 6
VB 3 6
FFF 2 2
For each unique entry in column A, say XCD, the value in column C is always the same and represents the total sum needed for that entry. To illustrate what I need, see the final DataFrame below.
**A** **B** **C**
XCD 1 5
XCD 4 5
FFF 2 2
VB 6 6
AAA 2 7
AAA 5 7
The algorithm should select the rows whose values in column B sum to the value in column C. It can select a single row as long as that row's value equals the number in column C, but we only take the first occurrence that sums to column C and leave out the rest, then build a new DataFrame from the selected rows.
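The selection rule above can be sketched in pandas as a brute-force subset search per group. This is only a sketch under assumptions: the helper name `first_subset_matching_total` is hypothetical, and "first occurrence" is interpreted as "smallest subset first, then original row order", which is what the expected output suggests (VB keeps the single row 6 rather than 3 + 3).

```python
from itertools import combinations

import pandas as pd


def first_subset_matching_total(group: pd.DataFrame) -> pd.DataFrame:
    """Return the first subset of rows whose B values sum to the group's C value."""
    target = group["C"].iloc[0]  # C is constant within a group
    # Try single rows first, then pairs, and so on (assumed ordering).
    for size in range(1, len(group) + 1):
        for idx in combinations(group.index, size):
            rows = group.loc[list(idx)]
            if rows["B"].sum() == target:
                return rows
    return group.iloc[0:0]  # no subset matches: return an empty frame


df = pd.DataFrame({
    "A": ["XCD", "FFF", "VB", "XCD", "AAA", "AAA", "XCD", "VB", "VB", "FFF"],
    "B": [1, 12, 3, 8, 2, 5, 4, 6, 3, 2],
    "C": [5, 2, 6, 5, 7, 7, 5, 6, 6, 2],
})

# sort=False keeps the groups in order of first appearance.
result = pd.concat(
    first_subset_matching_total(g) for _, g in df.groupby("A", sort=False)
)
print(result)
```

The search is exponential in the group size, so it is only reasonable for small groups like the ones shown.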
I have two tables with same column names, but different data.
Table1:
A B C D
1 3 4 5 OK
2 6 7 8
3 9 8 7
Table2:
A B C D
1 9 8 7
2 1 2 8
3 3 4 5
I want to write a formula in column D of Table2 that copies the D-column value from the matching row in Table1 (that is, find the same row in the other table and use its D value).
I could use SUMIFS(Table1!$D:$D, Table1!$A:$A, "="&A1, Table1!$B:$B, "="&B1, Table1!$C:$C, "="&C1), but all the rows are unique and I have string values in $D:$D, so I don't need a sum; I need a single string value.
Is there any function to find column value by row condition?
The result I want:
A B C D
1 9 8 7
2 1 2 8
3 3 4 5 OK
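For comparison outside Excel, the row-matching lookup described above can be sketched in Python with a dictionary keyed on the (A, B, C) values. The variable names are hypothetical, and the sketch assumes, as the question states, that every (A, B, C) combination in Table1 is unique.

```python
# Rows of Table1 as (A, B, C, D); D is a string and may be empty.
table1 = [(3, 4, 5, "OK"), (6, 7, 8, ""), (9, 8, 7, "")]
# Rows of Table2 as (A, B, C); D is what we want to fill in.
table2 = [(9, 8, 7), (1, 2, 8), (3, 4, 5)]

# Map each unique (A, B, C) combination to its D value.
lookup = {(a, b, c): d for a, b, c, d in table1}

# Rows with no match in Table1 get an empty string.
table2_d = [lookup.get(row, "") for row in table2]
print(table2_d)
```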
I want to backtest a trading strategy.
The data I have is OHLC (open, high, low, close) for a financial product, formatted into a dataframe with 300 rows (each row is 1 day) like so:
datetime O H L C
2020-03-24 1 2 3 4
2020-03-23 5 6 7 8
2020-03-22 9 1 2 3
2020-03-21 9 2 2 3
2020-03-20 9 3 2 3
2020-03-19 9 4 2 3
2020-03-18 9 5 2 3
What I want to do is, starting on the date closest to the current date, in this case the row with 2020-03-24:
1. take the number in column `L`
2. Check whether that number is greater than any of the values in column `L` for the previous two days.
3. Create and fill in a new column if the value from step 1 is greater than a value in the iteration.
4. Repeat steps 1, 2, and 3, starting from the next row whose column `L` value was not included in the iteration.
Example:
1. Starting on row `2020-03-24`, take value `3`
2. Is `3` at any point greater than `7` or `2` for rows starting with `2020-03-23` and `2020-03-22`?
3. YES, assign `TRUE` to column `comparison` in df for row starting with `2020-03-24`
4. Repeat, starting on row `2020-03-21`, take value `2` in column `L`
4a. Is `2` at any point greater than values in rows `2020-03-20` or `2020-03-19`?
4b. NO, assign `FALSE` to column `comparison` in df for row starting with `2020-03-21`.
New df looks like this:
datetime O H L C Comparison
2020-03-24 1 2 3 4 TRUE
2020-03-23 5 6 7 8
2020-03-22 9 1 2 3
2020-03-21 9 2 2 3 FALSE
2020-03-20 9 3 2 3
2020-03-19 9 4 2 3
2020-03-18 9 5 2 3
The only way I know how to do this is with a FOR loop, but that doesn't work for iterating over and comparing only certain subsets, like so:
for i in df['L']:
if df['L'] >
You need a combination of rolling() and shift():
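import pandas as pd
# Assumes the dates are the index of df; sort so the newest day comes first.
df.index = pd.to_datetime(df.index)
df.sort_index(inplace=True, ascending=False)
# Minimum L of the two rows below (the two previous days), aligned to the current row.
prev_two_min = df['L'].rolling(window=2).min().shift(-2)
df['Comparison'] = df['L'] > prev_two_min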
rolling() gives the minimum of two consecutive days, and shift(-2) moves that value up to the row it should be compared with.
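As a check, here is the same rolling() and shift() approach run on the `L` values from the question. This is a sketch under assumptions: the dates are placed in the index, and the `iloc[::3]` step at the end is an added guess at how to keep only every third row, as the question's step 4 describes.

```python
import pandas as pd

df = pd.DataFrame(
    {"L": [3, 7, 2, 2, 2, 2, 2]},
    index=pd.to_datetime([
        "2020-03-24", "2020-03-23", "2020-03-22", "2020-03-21",
        "2020-03-20", "2020-03-19", "2020-03-18",
    ]),
)
df = df.sort_index(ascending=False)  # newest day first

# For each row: the minimum L over the two rows below it (the two previous days).
prev_two_min = df["L"].rolling(window=2).min().shift(-2)

# Comparing against NaN yields False, so the last two rows come out False.
df["Comparison"] = df["L"] > prev_two_min

# Assumed extra step: keep the result only for every third row (03-24, 03-21, 03-18).
every_third = df["Comparison"].iloc[::3]
print(df)
print(every_third)
```

Note that the comparison is computed for every row, not only for the rows the question singles out; the last slice narrows it down.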
My input data is in column A:
1
2
3
4
5
6
7
8
9
I want the above in columns B, C and D, like this:
1 2 3
4 5 6
7 8 9
Use INDEX with some math:
=INDEX($A:$A,(ROW($A1)-1)*3+COLUMN(A$1))
Put it in B1, then copy it across 3 columns and down 3 rows.
The *3 is the number of columns desired.
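The position arithmetic in that formula can be illustrated with a short Python sketch. The function name `reshape_column` is hypothetical; `row` and `col` are 1-based to mirror ROW() and COLUMN().

```python
def reshape_column(values, n_cols):
    """Lay a single column out row by row into n_cols columns."""
    n_rows = -(-len(values) // n_cols)  # ceiling division
    grid = []
    for row in range(1, n_rows + 1):
        grid_row = []
        for col in range(1, n_cols + 1):
            # Same arithmetic as (ROW($A1)-1)*3+COLUMN(A$1), with 3 = n_cols.
            pos = (row - 1) * n_cols + col
            grid_row.append(values[pos - 1] if pos <= len(values) else None)
        grid.append(grid_row)
    return grid


grid = reshape_column([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)
print(grid)
```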
How to join multiple columns into one column
TABLE 1 TABLE 2 TABLE 3
1 2 5
2 4 3
3 5 3
4 5 1
I want:
1
2
3
4
2
4
5
5
5
3
3
1
If your data is in columns A, B and C, enter the formula in the first row of any empty column and drag down until there are no values left:
=IF(ROW()<=COUNTA(A:A),INDEX(A:A,ROW()),IF(ROW()<=COUNTA(A:B),INDEX(B:B,ROW()-COUNTA(A:A)),IF(ROW()>COUNTA(A:C),"",INDEX(C:C,ROW()-COUNTA(A:B)))))
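The nested IF picks the source column from the row number. A Python sketch of the same branching, assuming no header row and no blank cells (COUNTA counts every non-empty cell); the function name `stacked_value` is hypothetical:

```python
def stacked_value(row, col_a, col_b, col_c):
    """Value for 1-based output row `row`, mirroring the nested IF formula."""
    count_a = len(col_a)               # COUNTA(A:A)
    count_ab = count_a + len(col_b)    # COUNTA(A:B)
    count_abc = count_ab + len(col_c)  # COUNTA(A:C)
    if row <= count_a:
        return col_a[row - 1]
    if row <= count_ab:
        return col_b[row - count_a - 1]
    if row > count_abc:
        return ""                      # past the end of the data
    return col_c[row - count_ab - 1]


col_a, col_b, col_c = [1, 2, 3, 4], [2, 4, 5, 5], [5, 3, 3, 1]
stacked = [stacked_value(r, col_a, col_b, col_c) for r in range(1, 13)]
print(stacked)
```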
I have an Excel spreadsheet with data laid out like so:
Column A Column B Column C Column D
Row 1 Sector Rail
Row 2 A B C
Row 3 Type 1 1 5 1
Row 4 Type 2 2 3 0
Row 5 Type 3 1 1 6
Row 6 Total 4 9 7
In row 1 you can see in column A I have sector, and in column B I have the type of sector, i.e. Rail. My spreadsheet has several occurrences of different data following this structure, like so:
Column A Column B Column C Column D
Row 1 Sector Rail
Row 2 A B C
Row 3 Type 1 1 5 1
Row 4 Type 2 2 3 0
Row 5 Type 3 1 1 6
Row 6 Total 4 9 7
Row 7
Row 8 Sector aerospace
Row 9 A B C
Row 10 Type 1 0 9 9
Row 11 Type 2 3 3 1
Row 12 Type 3 4 5 6
Row 13 Total 0 1 8
Row 14
Row 15 Sector Rail
Row 16 A B C
Row 17 Type 1 8 9 9
Row 18 Type 2 3 3 1
Row 19 Type 3 4 5 6
Row 20 Total 9 1 8
Note that there are two sets of data with the sector Rail and one with Aerospace in the middle.
Now I need a formula that adds the total in column B, row 6 to the total in column B, row 20, but only for Rail.
Column B, row 1 has the sector Rail, and so does column B, row 15.
So I need one formula that finds both occurrences of the word Rail and then gives me the sum of the totals in row 6 and row 20, which would be 4 + 9 = 13.
Can anyone please show me if this is possible?
Please try deleting Row2 and:
=SUMIF(B1:B98,"Rail",B2:B99)