Microsoft Excel If Statements - excel

I have altered a statement I got from a previous answer a bit and it now looks like this:
=IF(C6=$R$3,IF(D6<=0.99,$U$2,IF(AND(D6>0.99,D6<=4.99),$U$3,IF(AND(D6>4.99,D6<=14.99),$U$4,IF(AND(D6>14.99,D3<=29.99),$U$5,IF(AND(D6>29.99,D6<99.99),$U$6,""))))),$S$8)
It all works fine until you change the value in cell D6 to say £45 when it still picks up the figure in cell U5.
Can you or anyone else help me tweak this so that it works? I need a statement to do the following:
If C2=R2 and D2 is < T2 then U2, if D2 is >T but T3 but < T4 then U4 if D2 is > T4 but < T5 then U5, if D2 is > T5 but < T6 then U6 BUT if C2 does not equal R2 then S8

Take all your problems and rip them apart:
If C2=R2 and D2 is < T2 then U2, if D2 is >T but T3 but < T4 then U4 if D2 is > T4 but < T5 then U5, if D2 is > T5 but < T6 then U6 BUT if C2 does not equal R2 then S8
Start with this using NA() to represent parts which haven't been completed yet (this will show the #N/A value in the cell):
=IF(C2=R2,NA(),S8)
Add the lookup based on D2:
=IF(C2=R2,IF(D2<T2,U2,NA()),S8)
Assuming that the next part is D2 > T2 and D2 < T3 (although strictly this formula says D2 >= T2) and the result is U3:
=IF(C2=R2,IF(D2<T2,U2,IF(D2<T3,U3,NA())),S8)
Now add between T3 and T4:
=IF(C2=R2,IF(D2<T2,U2,IF(D2<T3,U3,IF(D2<T4,U4,NA()))),S8)
Between T4 and T5:
=IF(C2=R2,IF(D2<T2,U2,IF(D2<T3,U3,IF(D2<T4,U4,IF(D2<T5,U5,NA())))),S8)
Finally between T5 and T6:
=IF(C2=R2,IF(D2<T2,U2,IF(D2<T3,U3,IF(D2<T4,U4,IF(D2<T5,U5,IF(D2<T6,U6,NA()))))),S8)
We still have NA() because you haven't defined the behaviour for C2=R2 and D2 >= T6.
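The same banding logic can be sketched outside Excel to check the boundaries. This is only an illustration: the threshold values and the "U2".."U6" / "S8" strings are hypothetical stand-ins for whatever is really in cells T2:T6, U2:U6 and S8.

```python
from bisect import bisect_right

# Hypothetical stand-ins for cells T2:T6 (band upper bounds) and U2:U6 (results).
THRESHOLDS = [1.0, 5.0, 15.0, 30.0, 100.0]
RESULTS = ["U2", "U3", "U4", "U5", "U6"]


def band_lookup(c, r, d, fallback="S8"):
    """Mirror IF(C2=R2, IF(D2<T2,U2, IF(D2<T3,U3, ...)), S8)."""
    if c != r:
        return fallback
    # Count of thresholds <= d, which is also the index of the first
    # threshold that d is strictly below.
    i = bisect_right(THRESHOLDS, d)
    # None plays the role of NA(): d is at or above the last threshold.
    return RESULTS[i] if i < len(RESULTS) else None
```

With these assumed thresholds, a value of 45 lands in the last band ("U6"), and anything at or above the final threshold returns None, which is the undefined case noted above.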
As Stobor said in the comment to your original question, using VLOOKUP would be much better - see http://office.microsoft.com/en-us/excel/HP052093351033.aspx for details
Your current structure in the T and U columns won't work directly with VLOOKUP because, in approximate-match mode, "the next largest value that is less than lookup value is returned".
Since the T column holds the upper bound of each band, VLOOKUP would return U2 when you wanted U3, U3 instead of U4 and so on, and #N/A for anything below T2. To solve this you would need to move all of the entries in the T column down by one row, put 0 (or the lowest value D can take) into T2, and put a dummy value or =NA() into U7 to cover values at or above the old T6.


Excel comparing value from row to different columns

I have a Table like this in Sheet1
A B
1234.jpg | c1
1234.jpg | c2
1234.jpg | c3
3456.jpg | c8
3456.jpg | c9
3456.jpg | c10
haha.jpg | c2
haha.jpg | c5
haha.jpg | c9
I need to match the data against the column headers in Sheet2, and the result should look something like this:
.        c1 c2 c3 c4 c5
1234.jpg Y  Y  Y  N  N
3456.jpg N  N  N  N  N
haha.jpg N  Y  N  N  Y
I am currently only able to make out this
=IF(ISERROR(MATCH(A2,Sheet1!$A$1:$B$9,0)),"Y","N")
Which returns Y as long as A2 matches something from the array. How do I go about matching it as the Column in Sheet2? I'm open to using functions or VBA
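Use the following formula in cell D3. It assumes the source data is in A2:B10, the file names to look up start in C3, and the column headers start in D2.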
=IF(SUMPRODUCT(($A$2:$A$10=$C3)*($B$2:$B$10=D$2))=1,"Y","N")
You can also use this array formula:
=IF(ISNUMBER(MATCH($C3&D$2,$A$2:$A$10&$B$2:$B$10,0)),"Y","N")
Press CTRL+SHIFT+ENTER to confirm the formula, as it is an array formula.
After entering it as an array formula, drag it right and down as far as you need.
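For anyone who wants to sanity-check the expected grid, here is a minimal Python sketch of the same pair-matching idea. The function name presence_matrix is made up for illustration; the data is copied from the question.

```python
# Sheet1 rows as (file name, code) pairs, copied from the question.
pairs = [
    ("1234.jpg", "c1"), ("1234.jpg", "c2"), ("1234.jpg", "c3"),
    ("3456.jpg", "c8"), ("3456.jpg", "c9"), ("3456.jpg", "c10"),
    ("haha.jpg", "c2"), ("haha.jpg", "c5"), ("haha.jpg", "c9"),
]


def presence_matrix(pairs, files, columns):
    """Return {file: ["Y"/"N" per column]}; "Y" when the (file, column) pair exists."""
    seen = set(pairs)  # set membership replaces the per-cell MATCH/SUMPRODUCT test
    return {f: ["Y" if (f, c) in seen else "N" for c in columns] for f in files}


matrix = presence_matrix(
    pairs,
    files=["1234.jpg", "3456.jpg", "haha.jpg"],
    columns=["c1", "c2", "c3", "c4", "c5"],
)
```

Like the array formula, this tests the file name and the code together, not the file name alone.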

pyspark - Read files with custom delimiter to RDD?

I am a newbie in pyspark, and I'm trying to read and merge RDD rows into one row.
Assuming that I have the following text file:
A1 B1 C1
A2 B2 C2 D3
A3 X1 YY1
DELIMITER_ROW
Z1 B1 C1 Z4
X2 V2 XC2 D3
DELIMITER_ROW
T1 R1
M2 MB2 NC2
S3 BB1
AQ3 Q1 P1
Now, I want to combine all the rows that appear in each section (between DELIMITER_ROW lines) into one row, and return a list of these merged rows.
I want to create this kind of list:
[[A1 B1 C1 A2 B2 C2 D3 A3 X1 YY1]
[Z1 B1 C1 Z4 X2 V2 XC2 D3]
[T1 R1 M2 MB2 NC2 S3 BB1 AQ3 Q1 P1]]
How can it be done in pyspark using RDDs?
For now I know how to read the file and filter out the delimiter rows:
sc.textFile(pathToFile).filter(lambda line: DELIMITER_ROW not in line).collect()
but I don't know how to reduce/merge/combine/group the rows in each section into one row.
Thanks.
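Rather than reading line by line and filtering, you can set the Hadoop record delimiter so that each section between DELIMITER_ROW lines is read as a single record, and then split each record on whitespace. The call below is written in the Scala style; from PySpark the same Hadoop configuration is usually reached through sc._jsc.hadoopConfiguration().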
spark.sparkContext.hadoopConfiguration.set("textinputformat.record.delimiter", "DELIMITER_ROW")
Hope this helps!
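To make the effect of a custom record delimiter concrete, here is a plain-Python sketch with no Spark involved. It only illustrates the two steps (cut the input into records at the delimiter, then split each record on whitespace); the function name merge_sections is hypothetical. In Spark the second step would presumably be a map over the records RDD.

```python
DELIMITER = "DELIMITER_ROW"

# Sample input copied from the question.
text = """A1 B1 C1
A2 B2 C2 D3
A3 X1 YY1
DELIMITER_ROW
Z1 B1 C1 Z4
X2 V2 XC2 D3
DELIMITER_ROW
T1 R1
M2 MB2 NC2
S3 BB1
AQ3 Q1 P1
"""


def merge_sections(text, delimiter=DELIMITER):
    """Return one list of tokens per section between delimiter lines."""
    # Each chunk between delimiters is one record; split() with no arguments
    # breaks on any whitespace, including the newlines inside the chunk.
    records = text.split(delimiter)
    return [chunk.split() for chunk in records if chunk.strip()]


merged = merge_sections(text)
```

Each entry of merged is one section flattened into a single row, which is the shape asked for in the question.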

Excel: Most frequent value/word in a non-contiguous range

I need to find the most frequent word (categorical text, e.g. "T2") in a row, but not across all columns. If the range were contiguous I would attempt something like:
=INDEX(B3:M3,MODE(MATCH(B3:M3,B3:M3,0)))
However, I'm doing this for multiple variables and don't want to make a separate subset sheet/file for each one, so I hope this is possible. I'm attempting to use the following formula but get an error message that highlights the MODE function:
=INDEX((B3;F3;J3),MODE(MATCH(B3;F3;J3,B3;F3;J3,0)))
My data looks something like this:
person A person B person C
ID Var1 Var2 Var3 Var4 Var1 Var2 Var3 Var4 Var1 Var2 Var3 Var4
1 T2 C1 N f T2 C1 N f T4 C9 Y e
2 T4 C5 Y b T4 C1 Y b T2 C1 N e
3 T2 C2 N g T4 C5 Y d T2 C1 N f
4 T4 C9 Y e T4 C1 Y b T2 C1 N e
5 T1 C2 N b T2 C2 N h T2 C2 N g
6 T4 C9 Y b T4 C1 Y b T4 C9 Y f
7 V2 C1 Y c V6 C2 N c T2 C2 N h
And the result I want is to add a column to the end that gives me the most common value/name, for example for Var1:
ID Mode_Var1
1 T2
2 T4
3 T2
4 T4
5 T2
6 T4
7 NA
Am I on the right track? Is this possible using Index, Mode and Match? Is there another way if this doesn't work? Thanks for any help!
EDIT: Added table (same as in image), made range in first example correspond to example data
You would use a countif in an array form of INDEX/MATCH:
=INDEX(B2:M2,MATCH(MAX(IF(MOD(COLUMN(B2:M2),4)=2,COUNTIF(B2:M2,B2:M2))),IF(MOD(COLUMN(B2:M2),4)=2,COUNTIF(B2:M2,B2:M2)),0))
Being an array formula it needs to be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly Excel will put {} around the formula.
Put this formula in N2, press Ctrl-Shift-Enter, then copy/drag down.
Given a setup as shown, use this array formula* in cell B13 and copy over and down:
=INDEX(INDEX($B$3:$M$9,MATCH($A13,$A$3:$A$9,),0),MODE(IFERROR(MATCH(INDEX($B$3:$M$9,MATCH($A13,$A$3:$A$9,),0),IF($B$2:$M$2=B$12,INDEX($B$3:$M$9,MATCH($A13,$A$3:$A$9,),0)),0),-COLUMN($B$3:$M$9))))
The #N/A results mean that there was no most frequent entry for that ID and Var (all three had different entries). If you want to put something else there, wrap the formula in an IFERROR.
*Array formulas must be confirmed with Ctrl+Shift+Enter instead of just Enter. When done correctly, the formula will be surrounded by curly braces {=formula}, those are added automatically so don't try to add them manually.
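As a cross-check on the expected column, the "every fourth column" mode can be sketched in Python. The function name mode_every_nth is made up for illustration, and returning None when no value repeats is an assumption chosen to mirror the #N/A behaviour of MODE described above.

```python
from collections import Counter


def mode_every_nth(row, offset, step=4):
    """Most frequent value among row[offset], row[offset+step], ...

    Returns None when no value occurs more than once, mirroring #N/A.
    """
    values = row[offset::step]
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count > 1 else None


# Rows 1 and 7 of the sample data, without the ID column.
row1 = ["T2", "C1", "N", "f", "T2", "C1", "N", "f", "T4", "C9", "Y", "e"]
row7 = ["V2", "C1", "Y", "c", "V6", "C2", "N", "c", "T2", "C2", "N", "h"]
```

Here offset 0 selects Var1 for persons A, B and C, offset 1 selects Var2, and so on.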

How can I get my aggregated exposure by identifiers across a hierarchy?

Let's say I have the following data :
Trade Data :
TradeId,CptyID,Exposure
T1 , C3, 100
T2 , C2, 50
T3 , C6, 200
Business Hierarchy Data :
CptyID,L1-Acronym,L2-Acronym,L3-Acronym
C3, H1, H2, H3
C2, H4, H5, H2
C6, H4, H5, H6
ID Mapping :
Acronym,CptyID,Identifier
H1 , C1, B1
H2 , C2, B2
H3 , C3, B3
H4 , C4, B4
H5 , C5, B5
H6 , C6, B6
i.e. having hierarchies like:
level Acronym(Identifier)
L1 H1(B1) H4(B4)
L2 H2(B2) H5(B5)
L3 H3(B3) H2(B2) H6(B6)
Trade T1 T2 T3
I would like to get the exposure by identifiers (B1, B2, B3, B4, B5, B6) where Exp(B1) = Exp(T1), Exp(B2) = Exp(T1)+Exp(T2)...
Joining them together doesn't work. It would give me 3 facts :
TradeID, CptyID, Exposure, L1-Acronym, L2-Acronym, L3-Acronym, Identifier
T1 , C3 , 100, H1, H2, H3, B3
T2 , C2 , 50, H4, H5, H2, B2
T3 , C6 , 200, H4, H5, H6, B6
and give me the wrong results as I would only get the exposures for the identifiers at Level 3 :
Identifier,ResultInLive,ExpectedResult
B1 , Null, 100 (Null because I have no facts associated directly to B1)
B2 , 50, 150
B3 , 100, 100
B4 , Null, 250
B5 , Null, 250
B6 , 200, 200
Another difficulty is that those dimensions can have a lot of members (>300K).
Kind regards,
Christophe
Thanks for your answer !
Each level of my Business Hierarchy data are "entities" which have identifiers.
For instance, let's only consider trade T1, which has an exposure of 100. I have a hierarchy of 3 levels:
the first level is H1, which has an identifier = B1
the second level is H2, which has an identifier = B2
the third and lowest level is H3, which has an identifier = B3
What we are trying to achieve is an identifier dimension with members B1, B2, B3... carrying the right exposure.
Hence, in this case :
B3 would have an exposure of 100 coming from T1 => Exposure(B3) = Exposure(T1)
B2, which is B3's parent, would also have an exposure of 100 coming from T1 => Exposure(B2) = Exposure(T1)
B1, which is B2's parent, would also have an exposure of 100 coming from T1 => Exposure(B1) = Exposure(T1)
Joining on the CptyID doesn't give us the expected result, as the underlying fact would be:
TradeID, CptyID, Exposure, L1-Acronym, L2-Acronym, L3-Acronym, Identifier
T1 , C3 , 100, H1, H2, H3, B3
Therefore, in ActivePivot Live, we would see :
Identifier,ResultIn AP Live,ExpectedResult
B1 , Null, 100 (Null because there are no facts associated directly with B1)
B2 , Null, 100 (Null because there are no facts associated directly with B2)
B3 , 100, 100 (given by the trade fact)
In the first post, I also wanted to illustrate the fact that the same identifier can be in 2 different hierarchies.
For instance :
L1 H1(B1) H4(B4)
L2 H2(B2) H5(B5)
L3 H3(B3) H2(B2) H6(B6)
Trade T1 T2 T3
We can see that B2 is present at L2 of the first hierarchy and at L3 of the second hierarchy.
Therefore, we would expect to have Exposure(B2) = Exposure (T1) + Exposure (T2) = 150.
Kind regards
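The roll-up described above can be written down as a small Python sketch to pin the expected numbers. This is not ActivePivot configuration, only the aggregation rule: each trade contributes its exposure once to every identifier on its counterparty's hierarchy path. The function name exposure_by_identifier is hypothetical; the data is copied from the tables in the question.

```python
from collections import defaultdict

# (TradeId, CptyID, Exposure) rows from the trade data.
trades = [("T1", "C3", 100), ("T2", "C2", 50), ("T3", "C6", 200)]

# CptyID -> [L1, L2, L3] acronyms from the business hierarchy data.
hierarchy = {
    "C3": ["H1", "H2", "H3"],
    "C2": ["H4", "H5", "H2"],
    "C6": ["H4", "H5", "H6"],
}

# Acronym -> Identifier from the ID mapping.
identifier_of = {"H1": "B1", "H2": "B2", "H3": "B3",
                 "H4": "B4", "H5": "B5", "H6": "B6"}


def exposure_by_identifier(trades, hierarchy, identifier_of):
    """Sum each trade's exposure onto every identifier along its hierarchy path."""
    totals = defaultdict(int)
    for _trade_id, cpty_id, exposure in trades:
        # set() avoids double counting if an acronym appears twice in one path.
        for acronym in set(hierarchy[cpty_id]):
            totals[identifier_of[acronym]] += exposure
    return dict(totals)


totals = exposure_by_identifier(trades, hierarchy, identifier_of)
```

The totals match the ExpectedResult column, including B2 = 150, where the same identifier sits at L2 of one path and L3 of another.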

Logical calculation in Excel

I need advice/help. I am working on a calculation in Excel where I have data like that shown below.
.    A     B     C     D     E     F      G     H
1|   A275  A277  A273  A777  A777  TOTAL  A222  GRAND TOTAL
2|   5     7     4     3     4     7      7
Now, I want to count row 2 based on the header.
Here is the condition.
If A1 <> B1 then take A1, if B1 <> C1 then take B1, if C1 <> D1 then C1, so on.
But tricky part is...
If D1<>E1 then D1 else (if E1<>F1 then E1 else (if F1 = "TOTAL" then F1 else(if F1<>G1 then F1)))
In short H2 should have 30 and not 37.
Added comments:
So, basically, if A1<>B1 then take A1, but if A1=B1 then take B1. B1 follows the same rule (if B1<>C1 then take B1, but if B1=C1 then take C1), and so does C1. The stopping point is "TOTAL". Along with this logic I need to check whether any cell in row 1 is "TOTAL" and, if so, take the value from that column. "TOTAL" can be in any cell in row 1.
So from above table my calculation will be 5(A2) + 7(B2) + 4(C2) + 7(F2) + 7(G2) = 30
In this calculation I have not included D2 and E2. D1=E1, so I skipped D2; E1<>F1, so I would normally have taken E2, but as F1="TOTAL" I took F2 instead of D2 and E2.
I hope this makes sense. (Sorry, I know it's confusing.)
I have data in more than 100 columns.
Can this be achieved using a macro?
Another pain point is that the data and headers are dynamic, so I can't have a fixed format. The logic should be able to handle dynamic data and headers.
Any help or suggestion will be greatly appreciated.
I achieved the results you want with this.
Add a helper row. In cell A3 write this formula and drag it to the right:
=IF(OR(A$1=B$1,B$1="TOTAL"),0,1)
Calculate the sum in, say, cell H4 (not H2, because the formula refers to the entire row 2 and that would create a circular reference):
=SUMIF($3:$3,1,$2:$2)
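The helper-row rule can be checked against the sample data with a short Python sketch. The function name selective_sum is made up for illustration; the rule is the one in the helper formula: a column counts only if its header differs from the next header and the next header is not "TOTAL".

```python
def selective_sum(headers, values):
    """Sum values[i] where headers[i] != headers[i+1] and headers[i+1] != "TOTAL"."""
    total = 0
    for i, value in enumerate(values):
        # Past the last header there is nothing to compare with (an empty cell in Excel).
        nxt = headers[i + 1] if i + 1 < len(headers) else None
        if headers[i] != nxt and nxt != "TOTAL":
            total += value
    return total


# Row 1 (A1:H1) and row 2 (A2:G2) from the question; H2 is the cell to be filled.
headers = ["A275", "A277", "A273", "A777", "A777", "TOTAL", "A222", "GRAND TOTAL"]
values = [5, 7, 4, 3, 4, 7, 7]
```

With the sample row this skips D and E and returns 30, the figure asked for in the question.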
