I have a column with some info displayed like this:
Product Info
I am the 3rd product from 2020
I was created in 1995 and I went public in 2021
I am a not sure if I'm from 2019 2020 2021
I have a formula to extract the year in the above column that is:
=IFERROR(FILTERXML("<k><m>"&SUBSTITUTE([#[Product Name]]," ","</m><m>")&"</m></k>","//m[.=number() and string-length()=4]"),"")
The problem with this formula is that it works fine with the first case, but it gives me a #SPILL! Error on the other two cases. My ideal output would be:
Product Info | Year
-----
I am the 3rd product from 2020 | 2020
I was created in 1995 and I went public in 2021 | 2021
I am a not sure if I'm from 2019 2020 2021 | (blank)
Basically, for the first case, just return the 4 digits. EVERY time that I only have one sequence of 4 digits, I want to return that sequence.
For the second case, I want to return ONLY the second year. EVERY time I have 2 sequences of 4 digits, I want to return ONLY the second year.
For the third case, I want to return nothing. EVERY time I have more than 2 sequences of 4 digits, I want to return blank.
The last thing I tried to add was position()>5 and that would cut off the 1995 in the second example, but I would continue having the Error on the third example. Also, my list is quite huge, and I am not sure if the position()>5 thing would work for ALL products that fall in the same second example.
I am not very good with XPATH, so any help would be greatly appreciated.
Thank you!
Disclaimer: Below solution is written on the assumption that when 'count of years < 3', return the last given year. If 'count >= 3' then only return the last year if years come in pairs of two. Hence the use of 'modulus 2 == 0'.‡
You can expand the xpath for sure if you so desire. However, I'd rewrite it a little bit. Each predicate, the structure between the opening and closing square brackets, is a filter of a given nodelist. To write multiple of these structures is in fact anding such predicates. To get a better understanding of what most common xpath 1.0 functions can do within FILTERXML(), I'd like to redirect you to this post.
So to write a consecutive pattern of predicates I'd opt for:
[.*0=0] - First return a filtered nodelist of all numbers where a node multiplied by zero equals zero;
[string-length()=4] - Then return only those that are 4 characters long‡‡;
[position() = last() and (position() = 1 or position() mod 2 = 0)] - The 3rd and last predicate is the trickiest for your query. This is done with a first check that position() = last() meaning the node needs to be the last node in the filtered nodelist of step 2 and (position() = 1 or position() mod 2 = 0) means we want to check that this node is also at the 1st index or the modulus 2 of the indexed position equals 0‡‡‡.
Formula in B2:
=IFERROR(FILTERXML("<t><s>"&SUBSTITUTE(A2," ","</s><s>")&"</s></t>","//s[.*0=0][string-length()=4][position() = last() and (position() = 1 or position() mod 2 = 0)]"),"")
Whilst the above would work for Excel 2013 and higher‡‡‡‡, you do talk about spilled behaviour. If you happen to work with the current channel in ms365 you could also try:
=LET(x,TEXTSPLIT(A2," "),y,--FILTER(x,ISNUMBER(-(x&"**0"))*(LEN(x)=4),{1,2,3}),z,COUNT(y),IF(OR(z=1,MOD(z,2)=0),TAKE(y,,-1),""))
‡ If you need to simply return the last year if 'count < 3' then you can use xpath "//s[.*0=0][string-length()=4][position()<3 and position() = last()]" or ms365 formula =LET(x,TEXTSPLIT(A2," "),y,FILTER(x,ISNUMBER(-(x&"**0"))*(LEN(x)=4),""),IF(COUNTA(y)>2,"",TAKE(y,,-1))).
‡‡ Note that you can be more strict about this if you'd wish to validate that a year is between say 1900-2050 or so. One could replace the 1st and 2nd predicate with [.*1>1899][.*1<2051].
‡‡‡ Note that the order in which you write your and/or statements in xpath does matter. We need to use explicit parentheses to control the precedence. See this
‡‡‡‡ This is not true for Excel Online or Excel for Mac
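For anyone who wants to check the predicate chain outside Excel, here is a minimal Python sketch of the same three filtering steps. It is not a translation of FILTERXML itself: the function name last_year_by_predicates is hypothetical, and the digits-only numeric test is an assumption that is stricter than the xpath check .*0=0 (which would also accept values such as 12.5).

```python
def last_year_by_predicates(text: str) -> str:
    """Rough mirror of the three xpath predicates; a sketch, not FILTERXML."""
    tokens = text.split(" ")
    # Predicate 1: keep tokens that look numeric (assumption: ASCII digits only).
    numeric = [t for t in tokens if t.isascii() and t.isdigit()]
    # Predicate 2: keep only the tokens that are 4 characters long.
    four = [t for t in numeric if len(t) == 4]
    # Predicate 3: take the last node, but only when its 1-based
    # position is 1 or an even number.
    count = len(four)
    if count == 1 or (count > 0 and count % 2 == 0):
        return four[-1]
    return ""


samples = [
    "I am the 3rd product from 2020",
    "I was created in 1995 and I went public in 2021",
    "I am a not sure if I'm from 2019 2020 2021",
]
for s in samples:
    print(repr(last_year_by_predicates(s)))
```

As in the disclaimer, four years in one string would return the fourth, because the last position is even.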
Just add a simple clause to determine the number of returns, for example using ROWS (since by default FILTERXML returns a vertical array):
=LET(
ζ, FILTERXML(
"<k><m>" &
SUBSTITUTE(
[#[Product Name]],
" ",
"</m><m>"
) & "</m></k>",
"//m[.=number() and string-length()=4]"
),
ξ, ROWS(ζ),
IF(ξ > 2, "", INDEX(ζ, ξ))
)
Edit: I might prefer to avoid FILTERXML here:
=LET(
ζ, TEXTSPLIT([#[Product Name]], " "),
ξ, -(ζ & "**0"),
IF(COUNT(ξ) > 2, "", IFERROR(-LOOKUP(1, FILTER(ξ, LEN(ζ) = 4)), ""))
)
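The count-then-pick idea can be sketched in Python as below. This is only an illustration of the logic: the name last_year_if_at_most_two is hypothetical, and returning an empty string when no year is found is a choice made for this sketch.

```python
import re


def last_year_if_at_most_two(text: str) -> str:
    """Return the last 4-digit token when there are one or two of them."""
    # Split on spaces and keep tokens made of exactly four digits.
    years = [t for t in text.split(" ") if re.fullmatch(r"[0-9]{4}", t)]
    # More than two matches (or none at all) gives a blank result.
    if not years or len(years) > 2:
        return ""
    # Otherwise pick the last match, like INDEX(ζ, ξ).
    return years[-1]


print(last_year_if_at_most_two("I was created in 1995 and I went public in 2021"))
```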
You can try the following using the TEXTAFTER function, assuming the years sit at the end of the string, delimited by spaces. If that is not the case, the formula can be adapted with additional checks (that the token is a number and has four digits, although strictly speaking a year can have fewer or more than 4 digits). Let me know if that assumption doesn't apply so I can try to adapt it. The following is an array version, so you can pass the entire table column if you are using Excel tables:
=LET(in,A2:A4,last,TEXTAFTER(in," ",-1),
IF(ISNUMBER(1*TEXTAFTER(SUBSTITUTE(in," "&last,"")," ",-1)),"",last))
To handle more than one year, it removes the last year found and searches again: if the new last token is a number, it returns an empty string; otherwise it returns the year found in the first search.
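The same remove-and-search-again idea might look like this in Python. It is a sketch under assumptions: the name last_year_textafter_style is hypothetical, a digits-only check stands in for ISNUMBER(1*...), and a string without any space is returned as is rather than raising an error.

```python
def last_year_textafter_style(text: str) -> str:
    """Mirror of the TEXTAFTER/SUBSTITUTE approach for a single string."""
    # Last space-delimited token, like TEXTAFTER(in, " ", -1).
    last = text.rsplit(" ", 1)[-1]
    # Remove every " <last>" occurrence, like SUBSTITUTE(in, " " & last, "").
    remainder = text.replace(" " + last, "")
    # Token that now sits at the end of the shortened string.
    previous = remainder.rsplit(" ", 1)[-1]
    # If that token is numeric too, return blank; otherwise return the
    # token found by the first search.
    if previous.isascii() and previous.isdigit():
        return ""
    return last


for s in (
    "I am the 3rd product from 2020",
    "I was created in 1995 and I went public in 2021",
    "I am a not sure if I'm from 2019 2020 2021",
):
    print(repr(last_year_textafter_style(s)))
```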
Here's the case: I have a column with a number of text strings. Each string contains either a single or double-digit number followed by either an "x" or the words " sets" or " rounds". I'm trying to extract the numbers preceding the "x" or the words. Here's an example:
string | Desired Outcome
-----
jump 3x10 | 3
push 10x3 | 10
pull 3 sets 10 times | 3
pull 3 rounds 8 times | 3
push 10 times 3 sets | 3
I've tried FIND, SEARCH, {1,2,3,4,5,6,7,8,9} only to over-complicate this. There has to be a simple way to locate these combinations (##&"x", "## sets" or "## rounds") and extract the related numbers.
Assume "String" data housed in Column A1:A6 with header.
In "Outcome" B2, formula copied down :
=LOOKUP(9^9,0+RIGHT(LEFT(A2,MIN(SEARCH({"x"," sets"," rounds"},A2&"x sets rounds"))-1),ROW(A$1:A$250)))
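To see what the formula is doing step by step, here is a small Python sketch of the same idea: find the earliest marker, keep the text to its left, then read the number at the end of that text. The names MARKERS and count_before_marker are hypothetical, and the regular expression for the trailing digits is an assumption standing in for the LOOKUP/RIGHT coercion trick.

```python
import re

MARKERS = ("x", " sets", " rounds")


def count_before_marker(text: str):
    """Return the integer that directly precedes the earliest marker."""
    # Append the markers so every search succeeds, as A2&"x sets rounds" does.
    padded = (text + "x sets rounds").lower()
    # Earliest marker position, like MIN(SEARCH({...}, ...)).
    cut = min(padded.find(m) for m in MARKERS)
    # Everything to the left of that marker, like LEFT(A2, pos - 1).
    left = text[:cut]
    # Trailing run of digits in the left part; None if there is none.
    match = re.search(r"([0-9]+)$", left)
    return int(match.group(1)) if match else None


for s in ("jump 3x10", "push 10x3", "pull 3 sets 10 times",
          "pull 3 rounds 8 times", "push 10 times 3 sets"):
    print(s, "->", count_before_marker(s))
```

Note that, as in the formula, an "x" inside an ordinary word earlier in the string would be treated as a marker.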
I have a 5 character code that needs to be converted to a 4 character code. Additionally, the 5th character is either a 1, 2 or 5, and I need to convert them to 1, 5 or 9. As an example, if my query returns '20155', I need to translate that to '2159'. So far I have:
select substr(fieldname,1,1) || substr(fieldname,3,2) || substr(fieldname,-1,1) as newfieldname
This converts it from 5 to 4 characters. What I don't know how to do is also change the last character to the new value as described above.
A sample of what I want it to achieve is:
20141 becomes 2141
20142 becomes 2145
20145 becomes 2149
20151 becomes 2151
20152 becomes 2155
20155 becomes 2159
Any assistance would be appreciated. I am not a computer programmer - I am a functional analyst that has to validate over 500,000 rows of data, each row containing the fieldname above.
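You can use a CASE expression on the last character:
select substr(fieldname,1,1) || substr(fieldname,3,2) || case substr(fieldname,-1,1) when '1' then '1' when '2' then '5' when '5' then '9' end as newfieldname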
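If it helps to validate the expected values independently of the database, the same mapping can be sketched in Python. The names LAST_DIGIT_MAP and convert_code are hypothetical, and raising an error on unexpected input is a choice made for this sketch.

```python
# Mapping for the 5th character, taken from the question: 1->1, 2->5, 5->9.
LAST_DIGIT_MAP = {"1": "1", "2": "5", "5": "9"}


def convert_code(code: str) -> str:
    """Turn a 5-character code into the 4-character form."""
    if len(code) != 5 or code[-1] not in LAST_DIGIT_MAP:
        raise ValueError(f"unexpected code: {code!r}")
    # 1st character + 3rd and 4th characters + translated 5th character.
    return code[0] + code[2:4] + LAST_DIGIT_MAP[code[-1]]


for sample in ("20141", "20142", "20145", "20151", "20152", "20155"):
    print(sample, "becomes", convert_code(sample))
```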
I have a table with 2 colums filled with numbers, like this:
A | B
-----
1 | 2
3 | 1
4 | 3
5 | 2
1 | 2
I would like to know how to count the rows that have a '1' in A and, out of those, how many have a '2' in the corresponding row in B.
So for the example the result would be 2, because there is a 1&2 in the first row and a 1&2 in the last row.
The equivalent in code would be something like:
% MATLAB syntax
A = [1; 3; 4; 5; 1];
B = [2; 1; 3; 2; 2];
count = 0;  % avoid shadowing the built-in sum()
for i = 1:length(A)
    if A(i) == 1 && B(i) == 2
        count = count + 1;
    end
end
In this case, count is the result that I want.
I was hoping to do something like SUM(IF(AND(A1:A5=1,B1:B5=2),1,0))
Notes: This is for an assignment, the rules are simply no macros, just one formula without partial results in other cells.
Thank you for your answers.
There are so many ways and, as the comments state, COUNTIFS() would be the simplest and most effective, for example =COUNTIFS(A1:A5,1,B1:B5,2).
As you provided a coded example I thought I would try to formulate your logic as closely as I can with an array formula like this: (Ctrl+Shift+Enter while still in the formula bar)
=IFERROR(SUM(IF(IF(A1:A5=1,B1:B5)=2,1)),0)
The inner IF (IF(A1:A5=1,B1:B5)) builds an array holding either FALSE or the corresponding B cell's content. The outer IF([innerIf]=2,1) then compares that array against 2 to give an array of FALSE or 1, which we sum to get the result. I think it will handle errors as is, treating FALSE as 0, but as I wrote this as pseudo-code I wrapped it in an IFERROR() just in case (if errors still occur, supply 0 as the value_if_false argument of the IF() statements).
The issue with AND() is that it doesn't perform in array constructs, or at least I have never got it to produce an array result.
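The nested-IF logic can be traced in a few lines of Python, using the sample data from the question. This is only a sketch: the function name count_pairs is hypothetical, and None stands in for Excel's FALSE so that it can never compare equal to a number.

```python
A = [1, 3, 4, 5, 1]
B = [2, 1, 3, 2, 2]


def count_pairs(a_values, b_values, a_target=1, b_target=2):
    """Count rows where column A equals a_target and column B equals b_target."""
    # Inner IF: keep B's value where A matches, otherwise a placeholder.
    inner = [b if a == a_target else None for a, b in zip(a_values, b_values)]
    # Outer IF: 1 where the kept value equals the B target, otherwise 0.
    outer = [1 if v == b_target else 0 for v in inner]
    # SUM over the resulting array.
    return sum(outer)


print(count_pairs(A, B))
```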
This is crazy.
I have =FIND("Model=",A3)+6 which produces 36.
I have =FIND("|",A3,FIND("Model=",A3)+6) which produces 40.
What does it take to get the results of 4?
=FIND("|",A3,FIND("Model=",A3)+6)-FIND("Model=",A3)+6 produces 16.
I am using Office 2007 with all current updates.
Logic says 40-36 should equal 4, but that is not what excel is producing.
This is my test string in A3
Year=1999|Make=Mercedes-Benz|Model=C230|Trim=Kompressor Sport Sedan 4-Door|Engine=2.3L 2295CC l4 GAS DOHC Supercharged
The formula that I am aiming for looks kinda like this:
=MID(A3,FIND("Model=",A3)+6,FIND("|",A3,FIND("Model=",A3)+6)-FIND("Model=",A3)+6)
This should return the results of C230 from above text.
You need to parenthesize the '+6' so that it is added before the subtraction takes place, so change:
=FIND("|",A3,FIND("Model=",A3)+6)-FIND("Model=",A3)+6
to:
=FIND("|",A3,FIND("Model=",A3)+6)-(FIND("Model=",A3)+6)
so it subtracts the entire sum, not just the first part of the sum.
It's clearer looking at a trivial example - you wanted something like:
3 - (1 + 2) = 0
but instead were doing:
3 - 1 + 2 = 4
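The same arithmetic can be checked against the test string from the question with a short Python sketch. The helper excel_find is hypothetical; it only imitates the 1-based positions that FIND returns and assumes the search text is present.

```python
text = ("Year=1999|Make=Mercedes-Benz|Model=C230|"
        "Trim=Kompressor Sport Sedan 4-Door|"
        "Engine=2.3L 2295CC l4 GAS DOHC Supercharged")


def excel_find(needle: str, haystack: str, start: int = 1) -> int:
    """1-based position of needle, searching from 1-based position start."""
    return haystack.index(needle, start - 1) + 1


model_start = excel_find("Model=", text) + 6   # first character after "Model="
bar_pos = excel_find("|", text, model_start)   # next "|" after the model name

# Without parentheses the +6 is added back instead of subtracted.
wrong_length = bar_pos - excel_find("Model=", text) + 6
# With parentheses the whole start position is subtracted.
right_length = bar_pos - (excel_find("Model=", text) + 6)

# MID(text, start, length) with a 1-based start position.
model = text[model_start - 1 : model_start - 1 + right_length]
print(wrong_length, right_length, model)  # 16 4 C230
```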