I have a list of file names. There are about 1 million lines.
In these lines I have a value starting with 58. This value. Starting with 58 and all following numbers I want to read out. Usually it is 8 characters. However, it can also be 9.
The names of the files are very different.
I was thinking about search/find functions or with part but unfortunately I'm not very familiar with Excel.
010 AHKG100 58098085 70085-DIS-0082.pdf
002 AHMQ32X32 58098524.pdf
AHSG160-58098564(=A-3129)_01.dwg
003 MVTA_78_ 58098861.pdf
These are some variant of file names
The execpted result is as followed, in a new column:
58098085
58098524
58098564
58098861
Suppose your data starts from cell A1, in cell B1 enter the following array formula:
=LARGE(IFERROR(--MID(A1,FIND("58",A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1)))),{8,9}),0),1)
You MUST press Ctrl+Shift+Enter upon finishing the formula in the formula bar otherwise they will not function correctly. Then you can simply drag the formula down to apply across.
The logic is to use FIND function to find the starting point of 58 in the string, and then use MID function to return the 8-Character or 9-Character from the string and use -- to check if they are numerical values, if not use IFERROR function to return 0 instead. Lastly use LARGE function to return the largest numeric value from the range, which should only contain the value number and 0 (if there is any).
EDIT #2
You can also use the following two formulas which does not require to be entered by Ctrl+Shift+Enter:
=AGGREGATE(14,6,--MID(A1,FIND("58",A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1)))),{8,9}),1)
This one is essentially the same as the first one but replaced LARGE+IFERROR with AGGREGATE which is able to ignore error values in the range.
or
=AGGREGATE(14,6,--MID(A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1))),{8,9}),1)
This one directly extracting every 8-character and 9-character sub-strings from the main string, then use -- to check if the sub-string is a numerical value, and lastly use AGGREGATE to return the largest value from the range.
Let me know if you have any questions. Cheers :)
Related
I'm trying to create auto numbering for Agents that are currently present and has numbers including zeroes 0 in 3rd or 4th column(zero meaning they don't get any stats but they are present)
Agents who has TEXT Value in the 3rd or 4th column are those who are not present (Ex: A = Absent, SL = Sick Leave, VL = Vacation Leave). Meaning, they should not be counted, therefore their value on 1st column should be blank, and therefore this should not stop the auto numbering for the rest of the agents below and should continue the count in sequence.
Can anyone help create formula that would fill the numbers on the 1st column automatically for those agents that are present and has value including 0 on column 3 or 4 (stats 1 or stats 2)?
To give more idea, I'm trying to show the current total number of agents who are currently present in this situation and will count their stats, and exclude all other agents who are not present and should not be counted.
Thank you!
Sequence Two Numeric Columns
Single Cell
In cell A3, a basic not spilling formula would be...
=IF(AND(ISNUMBER(C3),ISNUMBER(D3)),MAX(A$2:A2)+1,"")
... with the condition of a string in cell A2.
Without any conditions, you could try an improved version, similar to one of David Leal's suggestions:
=IF(AND(ISNUMBER(C3),ISNUMBER(D3)),
SUM(ISNUMBER(C$3:C3)*ISNUMBER(D$3:D3)),"")
Spill
In cell A3 you could use the following:
=LET(Data1,C3:C13,Data2,D3:D13,
Data,ISNUMBER(Data1)*ISNUMBER(Data2),
IFERROR(SCAN(0,Data,LAMBDA(a,b,a+b))/Data,""))
Line1: the inputs ('constants'), the same-sized single-column ranges
Line2: the zeros and ones, where the ones present the data of interest
Line3: the formula to replace the ones with the sequence and the zeros (errors due to division by zero) with an empty string
Converted to a LAMBDA, it could look like the following:
=LAMBDA(Data1,Data2,LET(
Data,ISNUMBER(Data1)*ISNUMBER(Data2),
IFERROR(SCAN(0,Data,LAMBDA(a,b,a+b))/Data,"")))(C3:C13,D3:D13)
Since it's such a long formula, you could create your own Lambda function by using this part...
=LAMBDA(Data1,Data2,LET(
Data,ISNUMBER(Data1)*ISNUMBER(Data2),
IFERROR(SCAN(0,Data,LAMBDA(a,b,a+b))/Data,"")))
... to define a name, e.g. SeqNumeric, when in the same cell, you could use
it simply with...
=SeqNumeric(C3:C13,D3:D13)
... instead.
Now you can use the function like any other Excel function anywhere in the workbook.
The Path
F3 =ISNUMBER(C3:C13)*ISNUMBER(D3:D13) - multiply: zeros-no, ones-yes
G3 =SCAN(0,F3#,LAMBDA(a,b,a+b)) - use the 'LAMBDA' helper function 'SCAN'
H3 =G3#/F3# - divide the 'scan' result by the zeros and ones
I3 =IFERROR(H3#,"") - convert the '#DIV/0!' errors to empty strings
The translation of the SCAN part could be something like the following:
Set the initial result a to 0.
Create a new array of the size of the initial array in F3#.
Loop through each element of the initial array, write its value to b, and replace a with the sum of a+b.
Write (the accumulated) a to the current element of the new array and repeat for the remaining elements of either array.
Return the new array.
Combine all of it in a LET.
J3 =LET(Data1,C3:C13,Data2,D3:D13,
Data,ISNUMBER(Data1)*ISNUMBER(Data2),
IFERROR(SCAN(0,Data,LAMBDA(a,b,a+b))/Data,""))
Convert to LAMBDA.
K3 =LAMBDA(Data1,Data2,LET(
Data,ISNUMBER(Data1)*ISNUMBER(Data2),
IFERROR(SCAN(0,Data,LAMBDA(a,b,a+b))/Data,"")))(C3:C13,D3:D13)
Copy the first part of the LAMBDA (note how it results in a #CALC! error since no parameters are supplied)...
L3 =LAMBDA(Data1,Data2,LET(
Data,ISNUMBER(Data1)*ISNUMBER(Data2),
IFERROR(SCAN(0,Data,LAMBDA(a,b,a+b))/Data,"")))
... and select Formulas -> Name Manager -> New to create your own function and finally use it with the following:
A3 =SeqNumeric(C3:C13,D3:D13)
You can try the following in cell A1:
=LET(B, B2:B12, C, C2:C12, f, 1*ISNUMBER(B*C),seq, SEQUENCE(ROWS(B)),
MAP(seq, LAMBDA(s, IF(INDEX(f,s)=0, "",SUM(FILTER(f, (seq<=s),0))))))
Here is the output:
A non-array version, expanding down the formula would be:
=IF(ISNUMBER(B2*C2), SUM(1*ISNUMBER(B$2:B2*C$2:C2)),"")
For the array version, it counts only if both columns Stat1 and Stat2 are numeric. The name f, has a value of 1 if the condition is TRUE, otherwise is 0. The MAP does the count if the index position of the f array is not zero, otherwise returns an empty string.
I think I got it.
This is the formula that I made
=IF(COUNTIFS(D2:BE2,"*",$D$1:$BE$1,TODAY())>0,"",MAX(A1:A$4)+1)
Countif criteria 1 = if the cell contains a letter and is counted > 0 then it will return blank, otherwise it will start the count using max function. The countif criteria 2 will will return the correct value according to the date today since the excel sheet has several data daily.
I am currently trying to extract the prefix of a store ID to be able to then generate a list of stores with only that prefix.
Cell D1 has that formula to extract the unique prefix :
=TRANSPOSE(UNIQUE(LEFT(C2#, MIN(FIND({0,1,2,3,4,5,6,7,8,9},C2#&"0123456789"))-1)))
Cell C2 has that formula to extract the unique store ids from another sheet :
=UNIQUE(INDEX('Male Shoes'!A1#,,6))
The problem is that the formula in D1 only returns the first two characters from all the unique prefixes instead of using the correct value for each prefixes.
I have setup in column I the same formula as in D1 without the TRANSPOSE() and UNIQUE() functions and remove the # to see if that would return the correct value. I dragged it down the length of the C column.
=LEFT(C2, MIN(FIND({0,1,2,3,4,5,6,7,8,9},C2&"0123456789"))-1)
In Cell J2 I put the same formula has I2 but kept the # as a control.
=LEFT(C2#, MIN(FIND({0,1,2,3,4,5,6,7,8,9},C2#&"0123456789"))-1)
I believe the MIN() function is returning the minimum for the entire array and not for each row. I haven't found how to mitigated that problem anywhere online.
In my sample data that is not a problem since all the columns in D through G gave me the lists I was expecting but as more countries get added I might end up with duplicate country prefix. (i.e.: If the prefixes get shortened to 2 characters - Germany=GE and Georgia=GE)
If it is always 2 or three a simple IF instead of FIND will work:
=TRANSPOSE(UNIQUE(LEFT(C2#,IF(ISNUMBER(--MID(C2#,3,1)),2,3))))
Edit:
If there is only one time that it switches from alpha to numeric we can use:
=TRANSPOSE(UNIQUE(LEFT(C2#,MMULT(ISNUMBER(--MID(C2#,SEQUENCE(,MAX(LEN(C2#))),1))*ISERROR(--MID("A"&C2#,SEQUENCE(,MAX(LEN(C2#))),1)),SEQUENCE(MAX(LEN(C2#))))-1)))
This does not care how many characters are in the string, only that there is only 1 time that it switches from alpha to numeric. So ABDEFGHTEV4567 will work but A3D4 will not.
I was wondering if anyone could enlighten me on a way to condense/shorten this formula:
=IF(ISNUMBER(SEARCH("Kirkintilloch",B2)),"BRN01",IF(ISNUMBER(SEARCH("Tweacher",B2)),"BRN01",IF(ISNUMBER(SEARCH("Lenzie",B2)),"BRN01",IF(ISNUMBER(SEARCH("Bishopbrigg",B2)),"BRN03",IF(ISNUMBER(SEARCH("Torrance",B2)),"BRN03",IF(ISNUMBER(SEARCH("Bearsden",B2)),"BRN04",IF(ISNUMBER(SEARCH("Milngavie",B2)),"BRN04")))))))
Column B will contain an address which will be from one of these seven towns.
The reason I didn't do an IF then Lookup was that the result I want to return is not unique, I.e. Kirkintilloch & Torrance both need to return a result of BRN01.
If it's not possible to simplify this then no worries. It would just save me a lot of work in a larger piece of work with many more possible outcomes.
This is an array formula - confirm it with Ctrl+Shift+Enter while still in the formula bar:
=INDEX(Towns,SMALL(IF(ISNUMBER(SEARCH(INDEX(Towns,0,2),B2)),INDEX(Towns,0,1)),1),3)
Breaking this down:
First you need to create a named range (I called mine "Towns") that contains the array of an index, the town name and the desired return "BRN**" (you could make this a simple range and just reference that but I entered the actual array by selecting the range in formula, highlighting it and using F9 to calculate)
Now that you have this array, I use Index providing 0 for the row argument to return all rows so that I can equate against just a single column at a time.
IF(ISNUMBER(SEARCH(INDEX(Towns,0,2),B2)),INDEX(Towns,0,1))
As this is an array formula, each row is assessed individually and the results (first column when a match is found - index of the array) returned as an array, like so: {FALSE;FALSE;3;FALSE;FALSE;FALSE;FALSE}
I then use SMALL() to catch the lowest number or the first match to feed another index for the third column.
I know how to use index and match formulas to get the value or location of a matching cell. But what I don't know how to do is get that information when the cell I'm looking for isn't going to be the first match.
Take the image below for example. I want to get the location of the cell that says "Successful Deliveries". In this example there's a cell that matches that in rows 11 and 30. These locations can vary in the future so I need a formula that's smart enough to handle that.
How would I get the location of the second instance of "Successful Deliveries"? I figured I could use the "Combination 2 Stats" value from row 24 as a starting point.
I tried using this formula:
=MATCH("Successful Deliveries:",A24:A1000,0)
But it returns a row number of 7 which is just relative to the A24 cell I started my match at.
My end goal here is to get the value from the cell directly to the right of the second match of "Successful Deliveries".
In your formula, with no further intelligence, you can simply add 23 to adjust 7 to the result:
=MATCH("Successful Deliveries:",A24:A1000,0) + 23
You know that 23 is the number to add because you started your search on row 24.
The full answer is here:
https://exceljet.net/formula/get-nth-match-with-index-match
You use this formula:
=INDEX(B1:B100,SMALL(if(A1:A100 = "Successful Deliveries:",ROW(A1:A100) - ROW(INDEX(A1:A100,1,1))+1),2))
...where 2 is the instance you want.
Make sure to finish typing the formula by hitting ctrl-shift-enter. (You know you did this right because the formula gets curly brackets {})
HOW IT WORKS
Normally, we use INDEX / MATCH to find a value. The Index function gives you the nth value in a range, and the Match function determines which "n" is a match for our criteria.
Here we use INDEX the same way, but we need more intelligence to find that "n", since it's the second one that matches the criteria. That's where SMALL comes in. The Small function "gets the nth smallest value in an array". So we give Small the number of the desired instance (2 in this case) and we give it an array of blanks and the rows numbers of the rows we like.
We obtained the array of blanks and row numbers using the If function, asking it to check for our criterion (="Successful...") and making it return the row number where the criterion passes (=Row(A1:A100)). By using the If function as an array function (by giving it arrays and using ctrl-shift-enter) it can deliver a whole list of values.
Our final value is just one number because the Small function used the array from the IF to return just one thing: the second-smallest row we gave to it.
I'm using Excel 2010 and I'm looking for a way to return the first negative number of a column. For instance, I have the following numbers distributed in a column:
1
4
6
-3
4
-1
-10
8
Which function could I use to return -3?
Thanks!
This could be interpreted two ways... If all the numbers are in a single cell (one column) as a string, the MID function can be used. If the numbers are in A1, a formula that could work is this:
=VALUE(MID(A1,SEARCH("-",A1),SEARCH(" ",A1,SEARCH("-",A1))-SEARCH("-",A1)))
If the numbers are each in their own columns (in my example, A3:H3), a different technique must be used:
{=INDEX(A3:H3,1,MATCH(TRUE,A3:H3<0,0))}
Don't type the { } - enter the equation using CTRL+SHIFT+ENTER.
In each case, the formula will return the number -3, which is the first negative number in the series.
Another possibility, avoiding the array formula (which are a big source of performance issues):
=LOOKUP(1;1/(M2:M15<0);M2:M15)
(I assume your numbers are in the M2:M15 range).
This will return the first number matching the "<0" condition. You may use any other condition, including text comparisons.
You may also extract the value of another array corresponding to the matching cell:
=LOOKUP(1;1/(M2:M15<>"OK");T2:T15)
In this example, the first cell containing another string than "OK" will be searched for in the m2:m15 array and the corresponding value in array t2:t15 will be returned.
Please note that the usage of the lookup function should be avoided whenever possible (but in this case, it's very handy !)
(I got the original inspiration for this answer from this post)