IF / Then / And / Increment - excel

Logically, I just can't figure out how I want to write this.
Given three locations, each holds a value that is either a 1 or a 0: Location 1, Location 2, Location 3. I want to see whether something is in stock somewhere and, if not, get it from this location and put it in a position.
I was thinking of something like =IF(SUM(A1-A3)>1, increment?, ) but Excel doesn't like this.
Ideally, I'm trying to write something that produces:
Location 1 | Location 2 | Location 3 | Get it here
0 | 0 | 0 | 1 <- increment, first in the list where all locations show 0
1 | 0 | 0 | <- this would be left blank (meaning it's available elsewhere)
0 | 0 | 0 | 2 <- this would increment from the previous 1
I'm just trying to figure it out.

Do you mean something like this?
For your information, this is the formula:
=IF(AND(B3=0,C3=0,D3=0),MAX(E$2:E2)+1,"")
The second part is easy: if any location column is not zero, the item is available elsewhere, so put an empty string.
The first part is a bit more difficult: when all location columns are zero, take the maximum from the fixed cell "E$2" down to "E2" (while being located in "E3", which means down to the cell above). So, take the maximum of all cells above, starting at "E$2". Once you have that maximum, add one.
I admit, the formula, with the MAX(E$2:E2) looks confusing, but drag it down, see how "E2" becomes "Ex" and how "E$2" stays "E$2" and you'll understand.
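For readers who find the expanding range easier to follow as a loop, here is a minimal Python sketch of the running counter the question asks for. The function name `get_it_here` and the sample rows are made up for illustration; the logic is only an analogy for the Excel column, not a replacement for the formula.

```python
def get_it_here(rows):
    """Return the 'Get it here' column for a list of (loc1, loc2, loc3) rows."""
    counter = 0          # plays the role of MAX over all cells above
    column = []
    for row in rows:
        if all(value == 0 for value in row):
            counter += 1             # nothing in stock anywhere: next number
            column.append(counter)
        else:
            column.append("")        # available elsewhere: leave blank
    return column


# Sample rows taken from the question's example table.
stock = [(0, 0, 0), (1, 0, 0), (0, 0, 0)]
result = get_it_here(stock)
print(result)
```

The counter only moves on rows where every location is 0, which is the same effect as taking the maximum of the cells above and adding one.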

Related

How to color max. 2 consecutive values in Excel without using VBA?

I'm out of ideas on how I could format consecutive equal (or rather, only even) values in Excel tables without using VBA.
The conditional formatting should color only consecutive values, and only all 0s or all even values, when there are no more than 2 of them.
A: ID | B: binary | C: counting
1 | 1 | 1
2 | 0 | 2
3 | 0 | 2
4 | 1 | 3
5 | 0 | 4
6 | 0 | 4
7 | 0 | 4
8 | 1 | 5
9 | 1 | 5
I tried to format with: =COUNTIF(C1:C9, C1) < 3, but then it also colors the 1s and C6:C7, even though there are more than 2.
I also tried =AND( COUNTIF(C1:C9,C1) < 3, ISEVEN(C1:C9) ) but then it colors nothing.
I could replace the 0s with empty cells so I could check ISEMPTY(B1:B9), but again it colors nothing. Using $ to make the references absolute changes nothing either.
Formatting duplicates also colors triplets, which doesn't work for me either.
=OR(COUNTIF($C$1:$C$9,C1) = 1, COUNTIF($C$1:$C$6,C1) = 2) works so far, but also colors the 1s (uneven).
=AND(OR(COUNTIF($C$1:$C$9,C1) = 1, COUNTIF($C$1:$C$6,C1) = 2), ISEVEN($C$1:$C$9)) doesn't work.
=AND(OR(COUNTIF($C$1:$C$9,C1) = 1, COUNTIF($C$1:$C$6,C1) = 2), $B$1:$B$9 <> 1) doesn't work either.
My only solution so far is using 2 formatting rules:
color =OR(COUNTIF($C$1:$C$9,C1) = 1, COUNTIF($C$1:$C$6,C1) = 2)
do not color =$B$1:$B$9 = 1
but I think it is terrible.
I worked on it for some hours; maybe I'm missing something really obvious.
I'm not allowed to use VBA, so that is not an option.
EDIT: My two-rule solution can be simplified to:
color =COUNTIF($C$1:$C$9,C1) < 3
do not color =$B$1:$B$9 = 1
I'm still confused about why combining both doesn't work:
AND(COUNTIF($C$1:$C$9,C1) < 3; $B$1:$B$9 <> 1)
EDIT2: I know why it didn't work. Don't check <> 1 against the absolute range $B$1:$B$9.
Solution: use the relative reference B1 <> 1, so the check is evaluated for each row in turn.
Now combining both works:
=AND( COUNTIF($C$1:$C$9, C1) < 3, B1 <> 1)
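To see which rows that combined rule selects, here is a small Python sketch that evaluates the same two conditions row by row on the sample table. The function name `rows_to_color` is hypothetical; `list.count` stands in for COUNTIF over the whole column.

```python
def rows_to_color(binary, counting):
    """Mimic =AND(COUNTIF($C$1:$C$9, C1) < 3, B1 <> 1) for every row."""
    flags = []
    for b, c in zip(binary, counting):
        occurrences = counting.count(c)   # COUNTIF over the full column
        flags.append(occurrences < 3 and b != 1)
    return flags


# Columns B and C from the question's table.
binary = [1, 0, 0, 1, 0, 0, 0, 1, 1]
counting = [1, 2, 2, 3, 4, 4, 4, 5, 5]
colored = rows_to_color(binary, counting)
print(colored)
```

Only rows 2 and 3 come out as TRUE: the three 4s fail the count test, and every row with a 1 in column B fails the second test.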
I can't see an easy answer for the binary numbers. You have two cases:
(1) Current cell is zero, previous cell is 1, next cell is zero and next cell but one is 1.
(2) Current cell is zero, previous cell is zero, previous cell but one is 1, next cell is 1.
But then the first pair of numbers is a special case because there is no previous cell.
Strictly speaking the last pair of numbers is a special case as well because there is no following cell.
=OR(AND(ROW()=1,B$1=0,B$2=0,B$3=1),AND(ROW()=2,B$1=0,B$2=0,B$3=1),AND(B1=0,B1048576=1,B2=0,B3=1),AND(B1=0,B1048576=0,B1048575=1,B2=1))
where I have used the fact that you are allowed to wrap ranges to the end of the sheet (B1048576) in conditional formatting.
Adding the condition for the case where there are two zeroes at the end of the range:
=OR(AND(ROW()=1,B$1=0,B$2=0,B$3=1),
AND(ROW()=2,B$1=0,B$2=0,B$3=1),
AND(B1=0,B1048576=1,B2=0,OR(B3=1,B3="")),
AND(B1=0,B1048576=0,B1048575=1,OR(B2=1,B2="")))
Even this could go wrong if there was something in the very last couple of rows of the sheet, so I suppose to be absolutely safe:
=OR(AND(ROW()=1,B$1=0,B$2=0,B$3=1),
AND(ROW()=2,B$1=0,B$2=0,B$3=1),
AND(Row()>1,B1=0,B1048576=1,B2=0,OR(B3=1,B3="")),
AND(Row()>2,B1=0,B1048576=0,B1048575=1,OR(B2=1,B2="")))
Shorter:
=OR(AND(ROW()<=2,B$1+B$2=0,B$3=1),
AND(B1+B2=0,B1048576=1,OR(B3=1,B3="")),
AND(B1+B1048576=0,B1048575=1,OR(B2=1,B2="")))
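Outside of Excel, the same question ("is this zero part of a run of at most two zeros?") is usually answered with run-length grouping rather than neighbor checks. The sketch below uses `itertools.groupby` from the Python standard library; it is a different technique from the formulas above, shown only to make the intended result explicit, and the function name `short_zero_runs` is made up.

```python
from itertools import groupby


def short_zero_runs(values, max_len=2):
    """Flag every 0 that belongs to a run of at most max_len consecutive 0s."""
    flags = []
    for value, run in groupby(values):
        length = len(list(run))                  # size of this run of equal values
        keep = value == 0 and length <= max_len
        flags.extend([keep] * length)            # same flag for the whole run
    return flags


# Column B from the question's table.
binary = [1, 0, 0, 1, 0, 0, 0, 1, 1]
flags = short_zero_runs(binary)
print(flags)
```

Because whole runs are measured at once, the first and last rows need no special handling.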
Not the cleanest way, but it works. You only need to move your data 1 row down, so that the headers are in row 2 and the data starts in row 3, for this formula to work:
=IF(AND(B3=B4,B3<>B5),IF(AND(B4=B3,B4<>B2),TRUE,FALSE),IF(AND(B3=B2,B3<>B1),IF(AND(B3=B4,B3<>B5),FALSE,TRUE),FALSE))
How about this approach (Office 365):
=LET(range,B$1:B$9,
s,IFERROR(TRANSPOSE(INDEX(range,ROW()+SEQUENCE(5,,-2))),1),
t,TEXTJOIN("",,(s=INDEX(range,ROW()))*ISEVEN(s)),
IFERROR(SEARCH("0110",t)<4,IFERROR(SEARCH("010",t)=2,FALSE)))
It creates an array s of 5 values: the value in the current row of the range plus the 2 values above and the 2 values below it. If a value is out of range, the error is replaced with a 1.
The array s is checked for being even (TRUE/FALSE; the values created by IFERROR are odd) and for being equal to the value in the current row of the range (TRUE/FALSE).
These two booleans are multiplied, giving 1 when both are TRUE and 0 otherwise.
These values are joined, and the result is checked for 2 consecutive 1s (surrounded by 0s), with the pair of 1s starting in the 2nd or 3rd position (this is the case if two consecutive equal even numbers are found);
if that errors, it checks whether a unique even number is found (the pattern 010 starting in the 2nd position, so the 1 is the current row).
PS I'm unable to test whether conditional formatting allows you to type the range as B:B instead of B$1:B$9 (I'm working from a mobile), but that would make it more dynamic, because that way you can easily expand the conditional range.
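The window logic described above can be traced in Python. This is only a sketch under the assumption stated in the answer, namely that out-of-range neighbors are treated as 1 (an odd value); the function name `highlight` is hypothetical and the code does not reproduce every detail of how INDEX behaves at the edges of the range.

```python
def highlight(values, row):
    """Return True if values[row] is an even value in a run of one or two."""
    current = values[row]
    # Five-value window: two above, the current row, two below (padded with 1).
    window = []
    for offset in range(-2, 3):
        i = row + offset
        window.append(values[i] if 0 <= i < len(values) else 1)
    # 1 where the neighbor equals the current value AND is even, else 0.
    t = "".join("1" if (v == current and v % 2 == 0) else "0" for v in window)
    pair = t.find("0110")            # str.find is 0-based, SEARCH is 1-based
    if pair != -1:
        return pair + 1 < 4
    return t.find("010") + 1 == 2    # a single even value in the middle


# Column C from the question's table.
counting = [1, 2, 2, 3, 4, 4, 4, 5, 5]
flags = [highlight(counting, r) for r in range(len(counting))]
print(flags)
```

A run of three equal even values produces "111" somewhere in the joined string, so neither pattern matches and the row is left uncolored.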

Counting if part of string is within interval

I am currently trying to check if a number in a comma-separated string is within a number interval. What I am trying to do is to check if an area code (from the comma-separated string) is within the interval of an area.
The data:
AREAS
Area interval | Name | Number of locations
1000-1499 | Area 1 | ?
1500-1799 | Area 2 | ?
1800-1999 | Area 3 | ?
GEOLOCATIONS
Name | Areas List
Location A | 1200, 1400
Location B | 1020, 1720
Location C | 1700, 1920
Location D | 1940, 1950, 1730
The result I want here is the number of unique locations in the "Areas list" within the area interval. So Location D should only count ONCE in the 1800-1999 "area", and the Location A the same in the 1000-1499 location. But location B should count as one in both 1000-1499 and one in 1500-1799 (because a number from each interval is in the comma-separated string in "Areas list"):
Area interval | Name | Number of locations
1000-1499 | Area 1 | 2
1500-1799 | Area 2 | 3
1800-1999 | Area 3 | 2
How is this possible?
I have tried with a COUNTIFS, but it doesn't seem to do the job.
Here is one option using FILTERXML():
Formula in C2:
=SUM(FILTERXML("<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>","//t[count(.//*[.>="&SUBSTITUTE(A2,"-","][.<=")&"])>0]"))
Where:
"<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>" - Is the part where we construct a valid piece of XML. The theory is that we use three axes here. Each t-node starts with a literal 1 to make sure that once we return the t-nodes with xpath we can sum the result. The outer x-node is there to make sure Excel will handle the inner axes correctly. If you are curious to know how this xml-syntax looks at the end, it's best to step through using the 'Evaluate Formula' function on the Formulas tab;
"//t[count(.//*[.>="&SUBSTITUTE(A2,"-","][.<=")&"])>0]" - Basically means that we collect all t-nodes where the count of child s-nodes that are >= the leftmost number and <= the rightmost number is larger than zero. For A2 the xpath would look like //t[count(.//*[.>=1000][.<=1499])>0] after substitution. In short: //t selects t-nodes, .//* selects all their child nodes, and the count of child nodes that fulfill both requirements [.>=1000][.<=1499] has to be larger than zero;
Since all t-nodes equal the number 1, the SUM() of these t-nodes equals the number of unique locations that have at least one matching area in their Areas List;
It is important to note that FILTERXML() will result in an error if no t-nodes could be found. That means we need to wrap the FILTERXML() in an IFERROR(...., 0) to counter that and make the SUM() still work correctly.
Or, wrap the above in BYROW():
Formula in C2:
=BYROW(A2:A4,LAMBDA(a,SUM(FILTERXML("<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>","//t[count(.//*[.>="&SUBSTITUTE(a,"-","][.<=")&"])>0]"))))
Using MMULT and TEXTSPLIT:
=LET(rng,TEXTSPLIT(D2,"-"),
tarr,IFERROR(--TRIM(TEXTSPLIT(TEXTJOIN(";",,$B$2:$B$5),",",";")),0),
SUM(--(MMULT((tarr>=--TAKE(rng,,1))*(tarr<=--TAKE(rng,,-1)),SEQUENCE(COLUMNS(tarr),,1,0))>0)))
I am in very distinguished company but will add my version anyway, as BYROW is probably a slightly different approach:
=LET(range,B$2:B$5,
lowerLimit,--@TEXTSPLIT(E2,"-"),
upperLimit,--INDEX(TEXTSPLIT(E2,"-"),2),
counts,BYROW(range,LAMBDA(r,SUM((--TEXTSPLIT(r,",")>=lowerLimit)*(--TEXTSPLIT(r,",")<=upperLimit)))),
SUM(--(counts>0))
)
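The per-row idea (split each Areas List, test whether any code falls inside the interval, then count the rows that passed) can be checked against the expected numbers with a short Python sketch. The function name `count_locations` and the variable names are made up for illustration; the data is the sample from the question.

```python
def count_locations(interval, area_lists):
    """Count rows whose comma-separated list has at least one code in interval."""
    low, high = (int(part) for part in interval.split("-"))
    total = 0
    for areas in area_lists:
        codes = [int(code) for code in areas.split(",")]  # int() ignores spaces
        if any(low <= code <= high for code in codes):
            total += 1        # a location counts once, however many codes match
    return total


# 'Areas List' column from the question (Locations A to D).
area_lists = ["1200, 1400", "1020, 1720", "1700, 1920", "1940, 1950, 1730"]
intervals = ["1000-1499", "1500-1799", "1800-1999"]
counts = [count_locations(interval, area_lists) for interval in intervals]
print(counts)
```

Using `any` per row is what keeps Location D from being counted twice in 1800-1999.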
Here is the ugly way to do it, with A LOT of helper columns. But it's not so complicated 🙂
F4= =TRANSPOSE(FILTERXML("<m><r>"&SUBSTITUTE(B4;",";"</r><r>")&"</r></m>";"//r"))
F11= =TRANSPOSE(FILTERXML("<m><r>"&SUBSTITUTE(A11;"-";"</r><r>")&"</r></m>";"//r"))
F16= =SUM(F18:F21)
F18= =IF(SUM(($F4:$O4>=$F$11)*($F4:$O4<=$G$11))>0;1;"")
G18= =IF(SUM(($F4:$O4>=$F$12)*($F4:$O4<=$G$12))>0;1;"")
H18= =IF(SUM(($F4:$O4>=$F$13)*($F4:$O4<=$G$13))>0;1;"")

how to vlookup if prefix found in the list?

Hi.
How can I return the "company name" (column H) in column B if any of the "Prefix" values (column G) is found in the "con no" (column A)?
A sample of the outcome needed is in column B.
Sample:
620011113 = DD
CN1234 = BB
Thanks
=INDEX($H:$H,AGGREGATE(15,6,ROW($G$1:$G$7)/(--(FIND($G$1:$G$7,$A2)=1)*--(LEN($G$1:$G$7)>0)),1),1)
Breaking this down, the INDEX retrieves the Nth item from Column H (Company name). To find the value of N, we are using the AGGREGATE function
AGGREGATE is a weird function - it lets us use things like MAX or LARGE or SUM while ignoring any error values. In this case, we will be using it for SMALL (first argument, 15), while Ignoring Error Values (second argument, 6). We will want the very smallest value, so the fourth argument will be 1. (If we wanted the second smallest, it would be 2, and so on)
=INDEX($H:$H,AGGREGATE(15,6, <SOMETHING> ,1),1)
So, all we need now is a list of values to compare! To make things slightly simpler, I'll break that bit of the code out for you here:
ROW($G$1:$G$7) / (--(FIND($G$1:$G$7,$A2)=1) * --(LEN($G$1:$G$7)>0))
There are 3 parts to this. The first, ROW($G$1:$G$7), is the actual value we want to retrieve - these will be the Row Numbers for each Prefix that matches your value. On its own, however, it will be all the row numbers. Since we are skipping errors, we want any Rows that don't match the prefix to throw an error. The easiest way to do this is to Divide by Zero.
At the start of --(FIND($G$1:$G$7,$A2)=1) and --(LEN($G$1:$G$7)>0) we have a double negative. This is a quick way to convert True and False to 1 and 0. Only when both tests are True will we not divide by 0, as this table shows:
A | B | A*B
1 | 1 | 1
1 | 0 | 0
0 | 1 | 0
0 | 0 | 0
Starting with the second test first (it's easier), we have LEN($G$1:$G$7)>0 - basically "don't look at blank cells".
The other test (FIND($G$1:$G$7,$A2)=1) will search for the Prefix in the Con No, and return where it is found (or a #VALUE! error if it isn't). We then check "is this at position 1" - in other words, "Is this at the start of the Con No, rather than in the middle". We don't want to say Con No CNQ6060 is part of Company AA instead of Company BB by mistake!
So, if the Prefix is at the Start of the Con No, AND it isn't Blank (because there is an infinite amount of Nothing Before, After, and Between every number and letter), then we get it added to our list of Rows. We then take the smallest row (i.e. closest to the top - change AGGREGATE(15 to AGGREGATE(14 if you want the closest to the bottom!), and use that to get the Company Name
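The selection rule described above (skip blank prefixes, require the prefix at position 1, take the first matching row) reads naturally as a loop. Here is a hedged Python sketch; the prefix table is invented to fit the two sample lookups in the question, and `company_for` is a hypothetical name.

```python
def company_for(con_no, prefix_table):
    """Return the company of the first non-blank prefix that starts con_no."""
    for prefix, company in prefix_table:
        # 'prefix and ...' skips blank cells, like the LEN(...) > 0 test.
        # startswith is the equivalent of FIND(...) = 1.
        if prefix and con_no.startswith(prefix):
            return company
    return None   # no prefix matched (the Excel formula would return an error)


# Hypothetical columns G and H; only CN -> BB and 6200 -> DD come from the question.
prefix_table = [("CN", "BB"), ("", ""), ("6200", "DD")]
print(company_for("620011113", prefix_table))
print(company_for("CN1234", prefix_table))
```

Iterating from the top and returning on the first hit corresponds to AGGREGATE picking the smallest matching row number.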
You could try the formula below:
=VLOOKUP(IF(LEFT(A3,1)="6",LEFT(A3,4),IF(LEFT(A3,1)="C",LEFT(A3,2),IF(LEFT(A3,1)="E",LEFT(A3,7)))),$G$3:$H$7,2,0)
Keep in mind that you have to type ' before the cell values in columns A and G in order to convert them to text, so that VLOOKUP returns the correct results.

Identifying first cluster of 1s of a certain size in a binary column

I have a dataframe in which one of the columns is a binary column with 1s and 0s. I want to identify the first cluster of 1s of size 5 in that column (i.e., the first time 5 consecutive 1s occur), and then delete all subsequent rows after the first 1 in that cluster.
I tried writing a loop that would count the 1s, and "continue" (i.e., break and start again) when it encountered a zero. However, I have not been able to write it correctly, because I'm unsure of the syntax. I'm new to Python, so apologies if the following is completely wrong:
for i in randomstring["random"]:
    i = i+1
    if i%5 == 0:
        i.remove(i)
    elif i == 0:
        continue
The loop above ran without error but I'm not sure what it achieved, there was no output.
This is roughly what the dataframe looks like (without the other columns) :
1
0
1
0
1
1
1
1
1
I want this -
1
0
1
0
1
If I rephrase your problem, it seems like you want to find an index.
I will propose a way to do it with numpy (just for personal reasons).
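import numpy as np
# Random 0/1 data, just for the purpose of the test
X = np.random.randint(0, 2, 100)
# Append a 0 so that a run of 1s at the very end is also closed by a zero
P = np.append(X, 0)
# Index of every position
Z = np.arange(len(P))
# What follows only works because the column contains nothing but 0s and 1s
C = P.cumsum()
M = np.diff(C) == 0   # True wherever P[1:] is 0
U = C[1:][M]          # number of 1s seen so far at each zero
Z = Z[1:][M]          # index of each zero
COUNT = np.zeros(len(U))
COUNT[1:] = np.diff(U)
COUNT[0] = U[0]
# COUNT now holds the number of consecutive 1s directly before each zero
Z = Z - COUNT
# Start index of every run of exactly five consecutive 1s
ANSWER = np.array(Z[COUNT == 5], dtype=np.int32)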
This is way too long :D. I will try to find a better solution and will edit when I do.
First Edit: Changed to use numpy diff.
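For comparison, here is a plain-Python sketch of the counting loop the question was attempting, followed by the truncation step. It uses no libraries; the function name `first_run_start` is made up, and it treats "a cluster of size 5" as the first time five 1s appear in a row. With a DataFrame, the resulting index could be used as `df.iloc[:start + 1]`.

```python
def first_run_start(values, size=5):
    """Return the index where the first run of `size` consecutive 1s begins."""
    run = 0
    for i, value in enumerate(values):
        run = run + 1 if value == 1 else 0   # a zero resets the counter
        if run == size:
            return i - size + 1              # step back to the first 1 of the run
    return None                              # no such run exists


# Sample column from the question.
column = [1, 0, 1, 0, 1, 1, 1, 1, 1]
start = first_run_start(column)
# Keep everything up to and including the first 1 of the cluster.
trimmed = column if start is None else column[:start + 1]
print(start, trimmed)
```

The reset on every zero is the part the original loop was missing: the counter has to track the current run, not the value itself.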

Extracting a substring from a string of arbitrary length

I have just a hair over 30,000 tweets. I have one column that has the actual tweet. There are two things that I would like to accomplish with this column.
First here is a snippet of sample data:
RT #Just_Sports: Cool page for fans of early pro #baseball. https://t.co/QCMYFQNSq8 #mlb #vintage #Chicago #Detroit #Boston #Brooklyn #Phil…
#brettjuliano you already know #unity #newengland #hiphop #boston #watertown #network
I have a column that uses the following formula to see if the message starts out with RT meaning a re-tweet. It returns 1 for yes and 0 for no.
What I would like to accomplish is to create a formula in two columns. One that will get the username if the RT column has a value of 1 and in the second column the username if the RT column has a value of 0. Since usernames are of arbitrary length I am unsure of how to go about this.
Example
RT #Just_Sports: | 1 | #Just_Sports | 0
#brettjuliano | 0 | | #brettjuliano
Take a look at Excel's FIND function. You can use this to identify the position of the #, then using a specified delimiter, match the end of the user name:
=MID(A1, FIND("#",A1), FIND(":",A1,FIND("#",A1)) - FIND("#",A1))
Where A1 is the cell containing the tweet, and ":" is your delimiter.
You can use the same function to check for the existence of the "RT" identifier. FIND returns a #VALUE! error when the text is not found, so wrap it in ISNUMBER:
=ISNUMBER(FIND("RT",A1))
Which returns TRUE if "RT" is found and FALSE otherwise. You may want to consider a search for " RT " (spaces), or some other variation, since there is no standard for using this in a tweet:
=OR(ISNUMBER(FIND("RT",A1)),ISNUMBER(FIND(" RT",A1)),ISNUMBER(FIND("RT ",A1)),ISNUMBER(FIND(" RT ",A1)))
But beware of false positives: ART, START, ARTOO, etc...
Additionally, your "RT" may be lower/upper/mixed case, in which case you'll want to normalize that search:
=OR(ISNUMBER(FIND("RT",UPPER(A1))),ISNUMBER(FIND(" RT",UPPER(A1))),ISNUMBER(FIND("RT ",UPPER(A1))),ISNUMBER(FIND(" RT ",UPPER(A1))))
My OR check is different from the 0/1 check you say you already have, so you can just add IF to it to convert to 0/1 as needed:
=IF(OR(ISNUMBER(FIND("RT",A1)),ISNUMBER(FIND(" RT",A1)),ISNUMBER(FIND("RT ",A1)),ISNUMBER(FIND(" RT ",A1))),1,0)
Once you know you have the RT check correct, and your second column is filled properly, you can add to my original formula:
Case for 1 in 2nd column:
=IF(B1=1,MID(A1, FIND("#",A1), FIND(":",A1,FIND("#",A1)) - FIND("#",A1)),"")
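Case for 0 in 2nd column (a tweet that is not a re-tweet has no ":" after the user name, so the first space is used as the delimiter instead):
=IF(B1=0,MID(A1, FIND("#",A1), FIND(" ",A1&" ",FIND("#",A1)) - FIND("#",A1)),"")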
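The same split into "RT flag, re-tweeted user, original user" can be sketched in Python to sanity-check the expected columns. This assumes, as in the sample data, that user names are marked with "#" and that a re-tweet starts with "RT " followed by the user name; the function name `split_tweet` is hypothetical.

```python
import re


def split_tweet(text):
    """Return (rt_flag, retweeted_user, original_user) for one tweet."""
    is_rt = 1 if text.startswith("RT ") else 0
    # First '#name' token: letters, digits and underscores after the marker.
    match = re.search(r"#\w+", text)
    user = match.group(0) if match else ""
    # Fill only the column that corresponds to the RT flag.
    return (is_rt, user, "") if is_rt else (is_rt, "", user)


# Shortened versions of the two sample tweets from the question.
first = split_tweet("RT #Just_Sports: Cool page for fans of early pro #baseball.")
second = split_tweet("#brettjuliano you already know #unity #newengland")
print(first)
print(second)
```

Matching on `\w+` stops at the colon or the space without needing to know which delimiter follows the name.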

Resources