pandas selecting rows whose sum equals to a value in another column

Hi guys, I have a DataFrame where I want to first group rows by a column, then find any rows within each group that sum to a given value in another column.
**A** **B** **C**
XCD 1 5
FFF 12 2
VB 3 6
XCD 8 5
AAA 2 7
AAA 5 7
XCD 4 5
VB 6 6
VB 3 6
FFF 2 2
For each unique entry in column A, say XCD, the value in column C is always the same and represents the total sum needed for that entry. To illustrate what I need, see the final DataFrame below.
**A** **B** **C**
XCD 1 5
XCD 4 5
FFF 2 2
VB 6 6
AAA 2 7
AAA 5 7
The algorithm should select the rows whose column B values sum to the value in column C. It can select a single row as long as its value equals the number in column C, but we only take the first occurrence that sums to column C and leave out the rest, then build a new DataFrame from the selected rows.
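One way to read the requirement above is as a small subset-sum search per group. The following is a minimal sketch, not a definitive answer: it assumes the columns are named `A`, `B`, and `C`, that "first occurrence" means the smallest subset found first (single rows before pairs, in original row order), and the helper name `first_matching_rows` is hypothetical. The brute-force search is exponential in group size, so it only suits small groups.

```python
from itertools import combinations

import pandas as pd

# Sample data from the question (column names assumed to be A, B, C).
df = pd.DataFrame({
    "A": ["XCD", "FFF", "VB", "XCD", "AAA", "AAA", "XCD", "VB", "VB", "FFF"],
    "B": [1, 12, 3, 8, 2, 5, 4, 6, 3, 2],
    "C": [5, 2, 6, 5, 7, 7, 5, 6, 6, 2],
})


def first_matching_rows(group):
    """Return index labels of the first subset of rows whose B values sum to C.

    Hypothetical helper: tries subsets of increasing size, in row order.
    """
    target = group["C"].iloc[0]  # C is constant within a group
    for size in range(1, len(group) + 1):
        for idx in combinations(group.index, size):
            if group.loc[list(idx), "B"].sum() == target:
                return list(idx)
    return []  # no subset of this group reaches the target


selected = []
# sort=False keeps groups in order of first appearance (XCD, FFF, VB, AAA).
for _, group in df.groupby("A", sort=False):
    selected.extend(first_matching_rows(group))

result = df.loc[selected].reset_index(drop=True)
print(result)
```

On the sample data this selects the same six rows as the desired output shown in the question.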

Related

Excel: find row in a table by condition

I have two tables with the same column names but different data.
Table1:
A B C D
1 3 4 5 OK
2 6 7 8
3 9 8 7
Table2:
A B C D
1 9 8 7
2 1 2 8
3 3 4 5
I want to write a formula in column D of Table2 that copies the D value from the row in Table1 with the same A, B, and C values (that is, find the matching row in the other table and take its D value).
I could use SUMIFS(Table1!$D:$D, Table1!$A:$A, "="&A1, Table1!$B:$B, "="&B1, Table1!$C:$C, "="&C1), but all the rows are unique and $D:$D holds string values, so I don't need a sum; I need only the single string value.
Is there a function that finds a column value by a row condition?
The result I want:
A B C D
1 9 8 7
2 1 2 8
3 3 4 5 OK
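The lookup described above (match a row on A, B, and C, then copy D) is not answered with an Excel formula here. As an illustration of the same idea in pandas, here is a minimal sketch; it assumes the leading 1, 2, 3 in the tables are row numbers, and the frame names `table1` and `table2` are made up for the example.

```python
import pandas as pd

# Table1 holds the D values; blanks are stored as empty strings.
table1 = pd.DataFrame({
    "A": [3, 6, 9],
    "B": [4, 7, 8],
    "C": [5, 8, 7],
    "D": ["OK", "", ""],
})

# Table2 has the same key columns in a different order and no D yet.
table2 = pd.DataFrame({
    "A": [9, 1, 3],
    "B": [8, 2, 4],
    "C": [7, 8, 5],
})

# A left merge keeps every Table2 row, in Table2's order, and pulls in D
# wherever A, B, and C all match a Table1 row.
result = table2.merge(table1, on=["A", "B", "C"], how="left")

# Rows with no match in Table1 get NaN; show them as blank instead.
result["D"] = result["D"].fillna("")
print(result)
```

Because the rows are unique, each Table2 row matches at most one Table1 row, so the merge does not duplicate rows.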

How to compare and iterate over certain rows in column while creating output as new column in dataframe?

I want to backtest a trading strategy.
The data I have is OHLC (open, high, low, close) for a financial product, formatted into a DataFrame with 300 rows (each row is one day), like so:
datetime O H L C
2020-03-24 1 2 3 4
2020-03-23 5 6 7 8
2020-03-22 9 1 2 3
2020-03-21 9 2 2 3
2020-03-20 9 3 2 3
2020-03-19 9 4 2 3
2020-03-18 9 5 2 3
What I want to do is, starting on the date closest to the current date, in this case the row with 2020-03-24:
1. Take the number in column `L`.
2. Check whether the number in column `L` is at any point greater than the values in column `L` for the previous two days.
3. Create and fill in a new column if the value from step 1 is greater than a value in the iteration.
4. Repeat steps 1, 2, and 3, but take the next number in column `L` that was not included in the iteration.
Example:
1. Starting on row `2020-03-24`, take value `3`
2. Is `3` at any point greater than `7` or `2` for rows starting with `2020-03-23` and `2020-03-22`?
3. YES, assign `TRUE` to column `comparison` in df for row starting with `2020-03-24`.
4. Repeat, starting on row `2020-03-21`, take value `2` in column `L`
4a. Is `2` at any point greater than values in rows `2020-03-20` or `2020-03-19`?
4b. NO, assign `FALSE` to column `comparison` in df for row starting with `2020-03-21`.
New df looks like this:
datetime O H L C Comparison
2020-03-24 1 2 3 4 TRUE
2020-03-23 5 6 7 8
2020-03-22 9 1 2 3
2020-03-21 9 2 2 3 FALSE
2020-03-20 9 3 2 3
2020-03-19 9 4 2 3
2020-03-18 9 5 2 3
The only way I know how to do this is with a for loop, but I can't get it to iterate over and compare only certain subsets of rows. This is as far as I got:
for i in df['L']:
if df['L'] >

You need a combination of rolling() and shift():
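import pandas as pd

# Assumes the datetime column is the index; sort newest row first.
df.index = pd.to_datetime(df.index)
df.sort_index(inplace=True, ascending=False)
# Minimum of L over the two previous days, aligned with the current row.
df['Comparison'] = df['L'] > df['L'].rolling(window=2).min().shift(-2)
With rolling() you get the minimum of L over a two-day window, and shift(-2) moves that minimum up to the row it should be compared against.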
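To see what the rolling/shift approach produces, here is a small self-contained sketch run on the sample rows from the question. It assumes the datetime column is used as the index. The approach fills `Comparison` for every row, not only every third row as the question describes; the `iloc[::3]` step at the end is one possible way to pick out the rows the question labels, and is an assumption about the intent.

```python
import pandas as pd

# Sample OHLC rows from the question, with the date as the index.
df = pd.DataFrame(
    {
        "O": [1, 5, 9, 9, 9, 9, 9],
        "H": [2, 6, 1, 2, 3, 4, 5],
        "L": [3, 7, 2, 2, 2, 2, 2],
        "C": [4, 8, 3, 3, 3, 3, 3],
    },
    index=pd.to_datetime([
        "2020-03-24", "2020-03-23", "2020-03-22", "2020-03-21",
        "2020-03-20", "2020-03-19", "2020-03-18",
    ]),
)
df = df.sort_index(ascending=False)  # newest day first

# rolling(2).min() at row i covers rows i-1 and i; shift(-2) moves it up so
# each row sees the minimum L of the two earlier days listed below it.
prev_two_min = df["L"].rolling(window=2).min().shift(-2)

# Comparisons against NaN (rows with fewer than two earlier days) are False.
df["Comparison"] = df["L"] > prev_two_min

# Assumed reading of the question: keep only every third row's flag.
every_third = df["Comparison"].iloc[::3]
print(df)
print(every_third)
```

For 2020-03-24 the flag is True (3 is greater than 2) and for 2020-03-21 it is False, which agrees with the worked example in the question.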

Formula to transpose horizontal numeric data vertical in excel

My input data is in column A:
1
2
3
4
5
6
7
8
9
I want the above in columns B, C, and D, like this:
1 2 3
4 5 6
7 8 9

Use INDEX with some math:
=INDEX($A:$A,(ROW($A1)-1)*3+COLUMN(A$1))
Put it in B1, then copy it across 3 columns and down 3 rows.
The *3 is the number of columns desired.
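The arithmetic inside the INDEX formula can be checked outside Excel. This is a minimal sketch in Python, assuming 1-based row and column positions as ROW() and COLUMN() return them; the function name `source_row` is hypothetical.

```python
# Values in column A, rows 1..9, as in the question.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n_cols = 3  # the "*3" in the formula


def source_row(row, col, n_cols=3):
    """Row of column A that the cell at (row, col) of the output reads.

    Mirrors (ROW($A1)-1)*3+COLUMN(A$1) with 1-based row and col.
    """
    return (row - 1) * n_cols + col


# Build the 3x3 output grid the same way the copied formula would.
grid = [
    [values[source_row(r, c, n_cols) - 1] for c in range(1, n_cols + 1)]
    for r in range(1, len(values) // n_cols + 1)
]

for line in grid:
    print(line)
```

Changing `n_cols` changes the width of the output in the same way that changing the *3 does in the formula.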

How to join multiple columns into one column in Excel (I want a formula)

How to join multiple columns into one column:
TABLE 1 TABLE 2 TABLE 3
1 2 5
2 4 3
3 5 3
4 5 1
I want:
1
2
3
4
2
4
5
5
5
3
3
1

If your data is laid out like below, enter the formula in the first row of any column and drag it down until there are no values left:
=IF(ROW()<=COUNTA(A:A),INDEX(A:A,ROW()),IF(ROW()<=COUNTA(A:B),INDEX(B:B,ROW()-COUNTA(A:A)),IF(ROW()>COUNTA(A:C),"",INDEX(C:C,ROW()-COUNTA(A:B)))))
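The nested IF in the formula picks a source column based on how far down the output row is. Here is a minimal sketch of the same branching in Python, assuming the three columns contain only the data values (no header cells) so that the list lengths play the role of COUNTA; the function name `stacked_value` is hypothetical.

```python
# The three source columns from the question.
col_a = [1, 2, 3, 4]
col_b = [2, 4, 5, 5]
col_c = [5, 3, 3, 1]


def stacked_value(row, a, b, c):
    """Value the formula would return in output row `row` (1-based)."""
    if row <= len(a):                      # ROW() <= COUNTA(A:A)
        return a[row - 1]
    if row <= len(a) + len(b):             # ROW() <= COUNTA(A:B)
        return b[row - len(a) - 1]
    if row > len(a) + len(b) + len(c):     # ROW() > COUNTA(A:C)
        return ""
    return c[row - len(a) - len(b) - 1]


total_rows = len(col_a) + len(col_b) + len(col_c)
# One extra row shows the blank returned once all values are used up.
stacked = [stacked_value(r, col_a, col_b, col_c) for r in range(1, total_rows + 2)]
print(stacked)
```

The first twelve entries match the single column the question asks for, and the last entry is the empty string the formula shows past the end of the data.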

Sum numbers if multiple criteria is met in other cells

I have an Excel spreadsheet with data laid out like so:
Column A Column B Column C Column D
Row 1 Sector Rail
Row 2 A B C
Row 3 Type 1 1 5 1
Row 4 Type 2 2 3 0
Row 5 Type 3 1 1 6
Row 6 Total 4 9 7
In row 1, column A holds the label Sector and column B holds the type of sector, e.g. Rail. My spreadsheet has several occurrences of different data following this structure, like so:
Column A Column B Column C Column D
Row 1 Sector Rail
Row 2 A B C
Row 3 Type 1 1 5 1
Row 4 Type 2 2 3 0
Row 5 Type 3 1 1 6
Row 6 Total 4 9 7
Row 7
Row 8 Sector aerospace
Row 9 A B C
Row 10 Type 1 0 9 9
Row 11 Type 2 3 3 1
Row 12 Type 3 4 5 6
Row 13 Total 0 1 8
Row 14
Row 15 Sector Rail
Row 16 A B C
Row 17 Type 1 8 9 9
Row 18 Type 2 3 3 1
Row 19 Type 3 4 5 6
Row 20 Total 9 1 8
Note that there are two sets of data with the sector Rail and one with Aerospace in the middle.
Now I need a formula that adds the total from column B row 6 to the total from column B row 20, but only for Rail.
So column B row 1 has the sector Rail, and so does column B row 15.
So I need one formula which will find both occurrences of the word Rail and then give me the total numbers of row 6 + row 20, which would be 1 + 8 = Total of 9
Can anyone please show me if this is possible?

Please try deleting Row2 and:
=SUMIF(B1:B98,"Rail",B2:B99)
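SUMIF pairs the criteria range and the sum range cell by cell, so with B1:B98 and B2:B99 each cell that matches "Rail" is paired with the cell directly below it. The sketch below illustrates that pairing in Python. It rests on assumptions: that every "A B C" header row has been removed so the first value row follows each Sector row, that blank rows are represented as None, and that the function name `sumif_next_row` is hypothetical.

```python
def sumif_next_row(column, criterion):
    """Sum the cell below each cell that matches `criterion`.

    Mimics SUMIF(B1:B98, criterion, B2:B99): text matching is
    case-insensitive and non-numeric cells in the sum range are skipped.
    """
    total = 0
    for cell, below in zip(column, column[1:]):
        matches = isinstance(cell, str) and cell.lower() == criterion.lower()
        if matches and isinstance(below, (int, float)):
            total += below
    return total


# Column B from the question with the "A B C" header rows removed (assumed).
column_b = [
    "Rail", 1, 2, 1, 4, None,
    "aerospace", 0, 3, 4, 0, None,
    "Rail", 8, 3, 4, 9,
]

rail_sum = sumif_next_row(column_b, "Rail")
print(rail_sum)
```

Under these assumptions the result is 1 + 8 = 9, the figure the question gives. Note that this adds the value directly below each Rail cell, not the Total rows themselves.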
