I have a problem in Excel like this:
A
B
C
D
123
02/03/2021
02/03/2021
02/03/2021
02/03/2021
456
03/03/2021
03/03/2021
03/03/2021
03/03/2021
03/03/2021
03/03/2021
789
04/03/2021
04/03/2021
I want to add the values to column B like this:
A
B
C
D
123
123
02/03/2021
123
02/03/2021
123
02/03/2021
123
02/03/2021
456
456
03/03/2021
456
03/03/2021
456
03/03/2021
456
03/03/2021
456
03/03/2021
456
03/03/2021
789
789
04/03/2021
789
04/03/2021
Can anyone help me? Thanks in advance.
Assuming there are no Excel version constraints, as the tags listed in the question suggest, you can try the following in cell A2:
=LET(B, B2:B19, idx, SCAN(0, B, LAMBDA(ac,bb, IF(bb<>"", ac+1,0))),
seq, SEQUENCE(ROWS(B)), MAP(seq, idx, LAMBDA(s,i, IF(i>1, INDEX(B, s-i+1),""))))
Update: As @VBasic2008 pointed out in the comment section, the MAP call can be removed by taking advantage of how IF handles arrays, so the formula can be simplified as follows:
=LET(B, B2:B19, idx, SCAN(0, B, LAMBDA(ac,bb, IF(bb<>"", ac+1,0))),
IF(idx>1, INDEX(B, SEQUENCE(ROWS(B))-idx+1),""))
The name idx counts the non-blank elements within each group of values, starting from 1; empty cells get the value 0. Filling with the group's first element starts when i>1 (the second element of the group). The index position s-i+1 always points to the first element of the group.
We use seq in MAP to avoid OFFSET, which is a volatile function; with seq we can use INDEX instead.
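The running-count logic behind idx can be sketched outside Excel. The following Python snippet is only an illustration, not part of the original answer: the function name fill_group_heads and the sample list are hypothetical, and it assumes groups are separated by blank cells, as the formula does.

```python
def fill_group_heads(values):
    """Mimic the SCAN/INDEX formula: for every non-blank cell after the
    first one in a group, return the first cell of that group."""
    out = []
    count = 0  # plays the role of idx: position inside the current group
    for pos, value in enumerate(values):
        # A blank cell resets the counter, like IF(bb<>"", ac+1, 0)
        count = count + 1 if value != "" else 0
        if count > 1:
            # pos - count + 1 is the 0-based equivalent of s-i+1
            out.append(values[pos - count + 1])
        else:
            out.append("")
    return out


# Hypothetical sample: two groups separated by a blank cell
sample = ["123", "02/03/2021", "02/03/2021", "", "456", "03/03/2021"]
filled = fill_group_heads(sample)
```

The group header rows and the blank separators stay empty; every other row receives its group's first value.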
Related
colA colB
A 125
B 546
C 4586
D 547
A 869
B 789
A 258
E 123
I want to create two new dataframes: the first should contain the first occurrence of each unique value in 'colA', and the second should contain the rows whose 'colA' value is a repeat. colB has no repeated values. The first output is like this:
colA colB
A 125
B 546
C 4586
D 547
E 123
The second output is like this:
colA colB
A 869
B 789
A 258
For the first group, use drop_duplicates. For the second group, use duplicated:
print(df.drop_duplicates("colA"))
colA colB
0 A 125
1 B 546
2 C 4586
3 D 547
7 E 123
print(df[df.duplicated("colA")])
colA colB
4 A 869
5 B 789
6 A 258
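Both outputs can also be derived from a single boolean mask, which makes it clear that the two frames partition the original rows. This is a minimal sketch that rebuilds the question's data; the variable names mask, first_rows and repeat_rows are just illustrative.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "colA": ["A", "B", "C", "D", "A", "B", "A", "E"],
    "colB": [125, 546, 4586, 547, 869, 789, 258, 123],
})

# True for every row whose colA value has already appeared above it
mask = df.duplicated("colA")

first_rows = df[~mask]   # same rows as df.drop_duplicates("colA")
repeat_rows = df[mask]   # rows with a repeated colA value
```

Because the default is keep="first", the first occurrence of each colA value lands in first_rows and every later occurrence in repeat_rows.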
I have a table with an ID column and 6 other value columns:
A B C D E F G
ID Col1 Col2 Col3 Col4 Col5 Col6
001 123 456 789
002 901 234 567 890 123 456
I'm looking for a formula that will concatenate the ID with whatever columns have values, separated by dashes (in this example).
i.e.
=CONCATENATE(A2,"-",B2,"-",C2,"-",D2,"-",E2,"-",F2,"-",G2)
However, I don't want dashes next to cells that have no value in them.
The desired output should look like this:
001-123-456-789
002-901-234-567-890-123-456
With the formula I used, it looks like this:
001-123-456-789---
002-901-234-567-890-123-456
For example:
=A2&IF(B2<>"","-"&B2,"")&IF(C2<>"","-"&C2,"")&IF(D2<>"","-"&D2,"")&IF(E2<>"","-"&E2,"")&IF(F2<>"","-"&F2,"")&IF(G2<>"","-"&G2,"")
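The rule the formula implements (add a dash only in front of cells that actually hold a value) can be sketched in Python for comparison. The helper name join_non_empty and the row lists are hypothetical, and the cells are assumed to be text so that the leading zeros of the ID survive.

```python
def join_non_empty(cells, sep="-"):
    """Join only the cells that hold a value, so no trailing
    or doubled separators appear for empty cells."""
    return sep.join(str(c) for c in cells if c not in ("", None))


# Rows from the question, with empty cells as empty strings
row_1 = ["001", "123", "456", "789", "", "", ""]
row_2 = ["002", "901", "234", "567", "890", "123", "456"]

joined_1 = join_non_empty(row_1)
joined_2 = join_non_empty(row_2)
```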
If I have a data frame with nested headers like this:
John Joan
Smith, Jones, Smith,
Index1 234 432 324
Index2 2987 234 4354
...how do I create a new column that sums the values of each row?
I tried df['sum']=df['John']+df['Joan'] but that resulted in this error:
ValueError: Wrong number of items passed 3, placement implies 1
If I understand you correctly:
...how do I create a new column that sums the values of each row?
Solution
The sum of each row is just
df.sum(axis=1)
The trick is getting it to be a new column. You need to ensure the column you add has two levels of column headings.
df.loc[:, ('sum', 'sum')] = df.sum(axis=1)
I'm not happy with it, but it works.
Joan John sum
Smith, Jones, Smith, sum
Index1 324 432 234 990
Index2 4354 234 2987 7575
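For a self-contained version of this approach, the sketch below rebuilds the frame from the question and adds the two-level 'sum' column. The frame construction is an assumption about how the data was created; the question only shows the printed table.

```python
import pandas as pd

# Tuple keys in the dict produce two-level (MultiIndex) columns
df = pd.DataFrame(
    {
        ("John", "Smith,"): [234, 2987],
        ("John", "Jones,"): [432, 234],
        ("Joan", "Smith,"): [324, 4354],
    },
    index=["Index1", "Index2"],
)

# Compute the row totals first, then store them under a two-level key
row_totals = df.sum(axis=1)
df[("sum", "sum")] = row_totals
```

Computing row_totals before the assignment keeps the new column out of its own sum.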
Dance Party, haven't heard from you in a while.
You want to groupby, but specify a level and axis. axis=1 means you want to sum the rows instead of the columns. level=0 is the top row of the columns.
df = pd.DataFrame({
('John', 'Smith,'): [234, 2987],
('John', 'Jones,'): [432, 234],
('Joan', 'Smith,'): [324, 4354]}, index=['Index1', 'Index2'])
>>> df.groupby(level=0, axis=1).sum()
Joan John
Index1 324 666
Index2 4354 3221
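Recent pandas releases deprecate the axis=1 argument of groupby, so the call above may warn or fail depending on the installed version. A possible alternative, sketched here as an assumption rather than as the original answer's method, is to transpose, group on the top column level, and transpose back.

```python
import pandas as pd

df = pd.DataFrame(
    {
        ("John", "Smith,"): [234, 2987],
        ("John", "Jones,"): [432, 234],
        ("Joan", "Smith,"): [324, 4354],
    },
    index=["Index1", "Index2"],
)

# After transposing, the column levels become row index levels,
# so a plain groupby on level 0 sums the columns per first name
per_name = df.T.groupby(level=0).sum().T
```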
I need a code to copy/paste based on duplicate cell values in a column. Please see example below. I want to take values from A2:B2 and paste to C1:D1 if there is duplication in E1:E2. Then loop through the rest of the spreadsheet. Any ideas? Thanks :-)
Sample
A B C D E
111 222 ABC
333 444 ABC
555 666 DEF
777 888 DEF
Result
A B C D E
111 222 333 444 ABC
333 444 ABC
555 666 777 888 DEF
777 888 DEF
In cell C2 (assuming a header row in row 1):
=IF(E2=E3,A3,"")
Copy this to every cell in C and D.
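The same row-to-next-row comparison can be sketched in Python to check the expected result against the sample. The tuple layout (A, B, E) and the function name pull_up_duplicates are hypothetical and used only for this illustration.

```python
def pull_up_duplicates(rows):
    """For each (a, b, e) row, return (a, b, c, d, e) where c and d take
    the next row's a and b when both rows share the same e value."""
    result = []
    for i, (a, b, e) in enumerate(rows):
        has_next = i + 1 < len(rows)
        if has_next and rows[i + 1][2] == e:
            c, d = rows[i + 1][0], rows[i + 1][1]
        else:
            c, d = "", ""  # same as the formula's "" branch
        result.append((a, b, c, d, e))
    return result


# Sample from the question
sample_rows = [
    (111, 222, "ABC"),
    (333, 444, "ABC"),
    (555, 666, "DEF"),
    (777, 888, "DEF"),
]
combined = pull_up_duplicates(sample_rows)
```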
I have a long list of ID numbers in Column C with important information in Columns D-Q.
I need to sort it according to a specific set of ID numbers in Column B, along with the matching information in Columns D-Q, like this:
I have this:
B C D E . . .
123 234 male 12
234 345 female 13
345 555 male 12
444 123 male 11
I need this:
B C D E . . .
123 123 male 11
234 234 male 12
345 345 female 13
444 N/A N/A N/A
Essentially, I need the information from C (and the adjacent info) to match with B or get sorted by the ID numbers in B. The file is huge and I just need to pull/sort it just by a specific set of ID numbers.
Thank you!
EDIT: As suggested, I tried the following in a new column. However, I receive #N/A and #REF! errors.
=Index(D:Q,Match(B:B, C:C,0)
Provides error: #N/A and #REF!
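You can pull all the matching data from D-Q using INDEX(array, MATCH()). Match each single ID in B against column C rather than passing the whole column B:B, for example =INDEX(D:D,MATCH($B2,$C:$C,0)) in row 2, then copy the formula across (for E to Q) and down. IDs in B that do not appear in C return #N/A.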
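The lookup that INDEX/MATCH performs here amounts to a keyed lookup with a fallback for missing IDs. Below is a minimal Python sketch using the question's sample; the names records, wanted_ids and lookup_rows are hypothetical, and only two of the D-Q columns are shown.

```python
def lookup_rows(wanted_ids, records, missing=("N/A", "N/A")):
    """Return one row per wanted ID, in the wanted order, pulling the
    matching data from records or a placeholder when the ID is absent."""
    return [(key, *records.get(key, missing)) for key in wanted_ids]


# Column C IDs mapped to their adjacent data (columns D and E)
records = {
    "234": ("male", 12),
    "345": ("female", 13),
    "555": ("male", 12),
    "123": ("male", 11),
}

# Column B: the specific IDs, in the order they should appear
wanted_ids = ["123", "234", "345", "444"]

matched = lookup_rows(wanted_ids, records)
```

ID 444 has no entry in records, so it receives the placeholder values, just as the formula would return #N/A for it.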