Excel - find exact value in a text

So I have the following text:
192.1.2.30,192.1.2.39
192.1.2.32,192.1.2.3
using this formula
=COUNTIF(A:A,"*"&D1&"*")
This checks whether the IP address appears anywhere in the text, which is the problem: it is a wildcard search.
D1 - D4
192.1.2.30 >> result >> 1 >> CORRECT
192.1.2.39 >> result >> 1 >> CORRECT
192.1.2.3 >> result >> 2 >> **INCORRECT** >> should be 1
192.1.2.32 >> result >> 1 >> CORRECT
192.1.2.3 shows as 2 because 192.1.2.3 is also a substring of 192.1.2.30.
Is there a way to exclude these partial matches so that the IP is not counted twice?

If your version of Excel supports TEXTJOIN(), you could try the following (here the search value is in C1):
=SUM(--(FILTERXML("<t><s>"&SUBSTITUTE(TEXTJOIN("</s><s>",TRUE,$A$1:$A$10),",","</s><s>")&"</s></t>","//s")=C1))

=COUNTIF(A:A,"*"&D1&"*") counts the rows in which D1 appears as a substring.
You want to match only a complete IP address, so you need a way to tell where an address starts and ends.
Your input is comma-separated, so we can use that: a value is a complete IP address if it sits between two commas.
=COUNTIF(A:A,"*,"&D1&",*")
However, the first and last IP addresses in each cell do not fit this pattern. I would add a comma to the start and end of each input cell (for example in a helper column) so that the first and last addresses are delimited too, and point the COUNTIF at that column:
=","&A1&","
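To sanity-check the comma-wrapping idea outside Excel, here is a minimal Python sketch of the same two matching rules. The function names are made up for illustration, and the two rows are the sample data from the question.

```python
def count_rows_substring(rows, ip):
    """Rough mirror of COUNTIF(A:A,"*"&D1&"*"): rows where ip appears anywhere."""
    return sum(1 for row in rows if ip in row)


def count_rows_delimited(rows, ip):
    """Mirror of the comma-wrapped COUNTIF: pad both the row and the
    search value with commas so only complete addresses can match."""
    needle = "," + ip + ","
    return sum(1 for row in rows if needle in "," + row + ",")


rows = ["192.1.2.30,192.1.2.39", "192.1.2.32,192.1.2.3"]
print(count_rows_substring(rows, "192.1.2.3"))  # 2 (partial match on 192.1.2.30)
print(count_rows_delimited(rows, "192.1.2.3"))  # 1
```

Both versions count rows, not occurrences, which matches how COUNTIF behaves.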

Assuming that all search patterns start with 192 (so that no address can match the tail end of another and only the right-hand side needs a delimiter), there are three main ways of counting exact matches of a string:
(1) Countif with wildcards:
=COUNTIF(A$2:A$9,"*"&C2&",*")+COUNTIF(A$2:A$9,"*"&C2)
(2) Find with isnumber and sum (or sumproduct):
=SUM(--ISNUMBER(FIND(C2&",",A$2:A$9&",")))
(3) Substitute with len and sum:
=SUM((LEN(A$2:A$9&",")-LEN(SUBSTITUTE(A$2:A$9&",",C2&",","")))/LEN(C2&","))
The three methods differ in what they count:
Countif counts once per row as long as the same IP address can't be repeated in the same row.
Find only counts once per row.
Substitute counts multiple occurrences per row.
If the above assumption doesn't hold (e.g. you want to search for 92.1.2.3 and still demand an exact match), then you would have the following:
(1) Using countif to count the number of lines containing the string, it's necessary to consider separate cases for when the string is on its own, at the start of a line, in the middle of a line, or at the end of a line. Assumes as before that there is no more than one occurrence per line:
=COUNTIF(A$2:A$9,C2)+COUNTIF(A$2:A$9,C2&",*")+COUNTIF(A$2:A$9,"*,"&C2&",*")+COUNTIF(A$2:A$9,"*,"&C2)
(2) Find still counts the number of lines containing at least one occurrence if the string is fully delimited:
=SUM(--ISNUMBER(FIND(","&C2&",",","&A$2:A$9&",")))
(3) I don't think Substitute can count all occurrences, including multiple occurrences in the same line, in the general case, because it becomes recursive. That would need VBA or, if you had Excel 365, a lambda, and in that case you would use Filterxml anyway.
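The difference between counting rows and counting occurrences is easy to check outside Excel. The following Python sketch splits each row on commas, which is roughly what the Filterxml approach does; the function names and the third sample row are assumptions added for illustration.

```python
def rows_containing(rows, ip):
    """Number of rows holding ip as a complete comma-separated item
    (what the Find and Countif variants count)."""
    return sum(1 for row in rows if ip in row.split(","))


def total_occurrences(rows, ip):
    """Number of complete items equal to ip across all rows
    (what the Substitute and Filterxml variants count)."""
    return sum(row.split(",").count(ip) for row in rows)


rows = [
    "192.1.2.30,192.1.2.39",
    "192.1.2.32,192.1.2.3",
    "192.1.2.3,92.1.2.3,192.1.2.3",  # hypothetical row with a repeated address
]
print(rows_containing(rows, "192.1.2.3"))    # 2
print(total_occurrences(rows, "192.1.2.3"))  # 3
print(total_occurrences(rows, "92.1.2.3"))   # 1
```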

Related

Nesting Excel formulas to extract e-mail address top-level domain

I want to extract the top-level domain from e-mail addresses using Excel formulas.
I first tried chaining RIGHT(..) formulas and splitting at the dot. Sadly, I do not know how to do this recursively with Excel formulas, so I switched to deleting all characters except the last 4. The problem now is that when I split my formulas into separate cells, it works perfectly fine, but when I combine them, I only get the output of the first, inner formula. How do I fix this?
=RIGHT(B8; LEN(B8)-(LEN(B8)-4))
=RIGHT(BF8;LEN(BF8)-FIND(".";BF8))
These are the formulas split into separate cells. And here are both combined:
=RIGHT(RIGHT(B8; LEN(B8)-(LEN(B8)-4));LEN(B8)-FIND(".";B8))
I get the same return value as in the first row from this formula
=RIGHT(B8; LEN(B8)-(LEN(B8)-4))
This =RIGHT(B8; LEN(B8)-(LEN(B8)-4)) is just a uselessly complicated version of =RIGHT(B8; 4).
Substituting this for BF8 in
=RIGHT(BF8;LEN(BF8)-FIND(".";BF8))
yields this
=RIGHT(RIGHT(B8; 4);LEN(RIGHT(B8; 4))-FIND(".";RIGHT(B8; 4)))
which can be simplified as
=RIGHT(RIGHT(B8; 4);4-FIND(".";RIGHT(B8; 4)))
So that's the answer to your question.
But note that this will fail on e-mail addresses whose top-level domain has more than 3 characters! So it won't work for e.g. test@test.info. Note that top-level domains can be up to 63 characters long!
In an earlier answer, I gave a more general solution to this problem that is not limited to searching a predetermined number of characters from the right:
=MID(B8;FIND(CHAR(1);SUBSTITUTE(B8;".";CHAR(1);LEN(B8)-LEN(SUBSTITUTE(B8;".";""))))+1;LEN(B8))
returns everything after the last . in the string.
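The trick in that formula is to count the dots, swap only the last one for a marker character, and then cut at the marker. A minimal Python sketch of the same steps follows; the function name and sample addresses are assumptions, and Python positions are 0-based where FIND is 1-based.

```python
def after_last_dot(text, marker="\x01"):
    """Mirror of MID(...FIND(CHAR(1);SUBSTITUTE(...;n))...): text after the last dot."""
    dots = text.count(".")  # LEN(B8)-LEN(SUBSTITUTE(B8;".";""))
    if dots == 0:
        # The Excel formula would return an error here as well.
        raise ValueError("no dot in text")
    # Replace only the last dot with the marker (SUBSTITUTE's instance argument).
    head, _, tail = text.rpartition(".")
    marked = head + marker + tail
    position = marked.index(marker)  # FIND(CHAR(1); ...)
    return marked[position + 1:]     # MID(...; position + 1; LEN(B8))


print(after_last_dot("test@test.info"))          # info
print(after_last_dot("john.johnson@email.com"))  # com
```

The marker only has to be a character that cannot occur in the input, which is why the formula uses CHAR(1).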
A dot may also appear in the local part of the e-mail address, as in john.johnson@email.com.
So you can't just find the first "."; you first need to find the @, then find the dot in the substring to its right.
These are the steps:
1. =FIND("@"; B8)
find the position of the @ character
2. =RIGHT(B8;LEN(B8) - FIND("@"; B8))
get the substring to the right of the @
3. =FIND(".";RIGHT(B8;LEN(B8) - FIND("@"; B8)))
find the "." in the step 2 substring
4. =RIGHT(RIGHT(B8;LEN(B8) - FIND("@"; B8)); LEN(RIGHT(B8;LEN(B8) - FIND("@"; B8))) - FIND(".";RIGHT(B8;LEN(B8) - FIND("@"; B8))))
take RIGHT(step2; LEN(step2) - step3)
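The four steps can be sketched in Python as follows. This is only an illustration: the function name is made up, and the separator between the local part and the domain is passed as a parameter, assumed here to be the usual @.

```python
def domain_after_first_dot(address, sep="@"):
    """Follow the four steps: locate the separator, take the text to its
    right, locate the first dot in that text, return what follows the dot."""
    sep_index = address.index(sep)     # step 1 (0-based; FIND is 1-based)
    domain = address[sep_index + 1:]   # step 2
    dot_index = domain.index(".")      # step 3
    return domain[dot_index + 1:]      # step 4


print(domain_after_first_dot("john.johnson@email.com"))  # com
print(domain_after_first_dot("user@mail.co.uk"))         # co.uk
```

Because step 3 finds the first dot after the separator, a domain with several labels returns everything after that first dot, not just the final label.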

Extracting certain numbers from a cell containing numbers and special characters

I have cells that contain both numbers and special characters such as this:
[1:250:10]
The 'coordinates' shown above can be in the following format.
[(1-9):(1-499):(1-15)] in terms of what numbers can be within each part.
How do I extract these three numbers into three separate cells?
Assuming your data is in cell A1, use the following formula to extract the first number:
=MID(A1,2,(FIND(":",A1,1)-2))
for second number use
=SUBSTITUTE(MID(SUBSTITUTE(":" & A1&REPT(" ",6),":",REPT(":",255)),2*255,255),":","")
finally for third number enter
=SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(A1,":",REPT(" ",LEN(A1))),LEN(A1))),"]","")
Just tossing out some other options.
The first number is a single digit at a fixed position on the left, so use the following:
=RIGHT(LEFT(A1,2))
The second number is found by locating the two : characters in the string:
=MID(A1,FIND(":",A1)+1,FIND(":",A1,FIND(":",A1)+1)-(FIND(":",A1)+1))
The third number is handled the same way as the second, but using the second : and the ] to determine where to start and how many characters to pull:
=MID(A1,FIND(":",A1,FIND(":",A1)+1)+1,FIND("]",A1)-(FIND(":",A1,FIND(":",A1)+1)+1))
All of these numbers will come through as text. If you want actual numbers in the cells, put each result through a math operation that does not change its value, such as +0, -0, or *1 at the end of the formula. Alternatively, you can add -- at the start of each formula (yes, that is a double minus, in case you were wondering whether it was a typo).
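The FIND/MID arithmetic above is easy to get off by one, so here is a small Python sketch of the same slicing that can be used to check the expected results. The function name is hypothetical, and the indices are 0-based where Excel's are 1-based.

```python
def split_coordinates(cell):
    """Slice '[a:b:c]' at the two colons and the closing bracket,
    then convert each piece to a number (the '+0' or '--' step)."""
    first_colon = cell.index(":")                    # FIND(":",A1)
    second_colon = cell.index(":", first_colon + 1)  # FIND(":",A1,FIND(":",A1)+1)
    closing = cell.index("]")                        # FIND("]",A1)
    a = cell[1:first_colon]
    b = cell[first_colon + 1:second_colon]
    c = cell[second_colon + 1:closing]
    return int(a), int(b), int(c)


print(split_coordinates("[1:250:10]"))  # (1, 250, 10)
print(split_coordinates("[9:4:15]"))    # (9, 4, 15)
```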

Count the number of individual entries in a cell

Is it possible to count the number of individual entries in a cell?
For example 2+2+4-1 = 4 entries
Using the count formula counts the entries as 1
I want to calculate the number of adjustments made in a particular period.
Each +/- in an individual cell represents 1 adjustment.
Assuming you're referring to a text cell, the trick is to count the symbols you'd like to find. Before we dig into that: if you want to enter this data as text, type an apostrophe (') before the text so that Excel stores it as text rather than evaluating it.
If you want to verify that it is text, you can use the TYPE function and look for a return value of 2 (TYPE returns other codes for numbers, logical values, and so on).
Excel has no built-in function that counts how often a particular character occurs, so the trick is to take the length of the original text and subtract the length of a copy from which the characters of interest have been removed. You mentioned you were trying to count the entries (i.e. the numbers), but your stated goal is to count the '+/-' operations. Counting numbers is tricky with Excel formulas (we would get hung up on 2- and 3-digit numbers), so I will count the operations instead. Here is a basic example:
length("2+1") = 3
- length("21") = 2 (we replaced the + with "" [blank])
= 1
So we know there is 1 '+', because removing it shortened the text by one character. The functions that do this are LEN and SUBSTITUTE.
SUBSTITUTE can only replace one symbol at a time, so we feed the output of one SUBSTITUTE into the next, nesting as many as we need to get the desired result.
We start with + for your example (assuming your data is in A1):
=LEN(A1)-LEN(SUBSTITUTE(A1,"+",""))
which gives us a result of 2. But we also need to find the - symbol. So we wrap another SUBSTITUTE:
=LEN(A1)-LEN(SUBSTITUTE(SUBSTITUTE(A1,"+",""),"-",""))
This counts the + and - characters in the cell, as you asked. To cover more mathematical operators, simply nest more SUBSTITUTE functions. Here is a complete formula to which I've added * and /:
=LEN(A1)-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"+",""), "-",""),"*",""),"/",""))
This formula replaces every digit with "", counts the remaining +/- characters, and adds one. It should work, but it is ugly:
=LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1;"0";"");"1";"");"2";"");"3";"");"4";"");"5";"");"6";"");"7";"");"8";"");"9";""))+1
It could probably be done with RegEx, but I don't know how to do that in formulas.
The formula turns 31+12+3-4 into ++-, takes the LEN (3), and adds 1.
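Both answers rely on the same length-difference idea, which can be checked with a short Python sketch. The function names are made up for illustration.

```python
def count_operators(text, operators="+-"):
    """LEN(text) - LEN(text with every operator removed)."""
    stripped = text
    for op in operators:  # one nested SUBSTITUTE per operator
        stripped = stripped.replace(op, "")
    return len(text) - len(stripped)


def count_entries(text, operators="+-"):
    """Entries are the operators plus one, assuming the text does not
    start or end with an operator."""
    return count_operators(text, operators) + 1


print(count_operators("2+2+4-1"))  # 3
print(count_entries("2+2+4-1"))    # 4
print(count_entries("31+12+3-4"))  # 4
```

A leading sign such as -2+3 would be counted as an operator, so the entry count would come out one too high for that kind of input.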

retrieve part of the info in a cell in EXCEL

I vaguely remember that it is possible to parse the data in a cell and keep only part of the data after setting up certain conditions. But I can't remember what exact commands to use. Any help/suggestion?
For example, A1 contains the following info
0/1:47,45:92:99:1319,0,1320
Is there a way to pick up, say, 0/1 or 1319,0,1320 and remove the rest unchosen data?
I know I can do text-to-column and set the delimiter, followed by manually removing the "un-needed" data, but my EXCEL spreadsheet contains 100 columns X 500000 rows with each cell looking similar to the data above, so I am afraid EXCEL may crash before finishing the work. (have been trying with LEFT, LEN, RIGHT, MID, but none seems to work the way I had hoped)
Any suggestion will be greatly appreciated.
I think what you are looking for is a combination of FIND and MID, but you'll have to work out exactly how you want to split your string:
A1 = 0/1:47,45:92:99:1319,0,1320 //your text
B1 = FIND(":",A1) //position of the first ":" symbol
C1 = LEN(A1) - B1 //number of characters after that ":"
=LEFT(A1,B1-1) //text to the left of the symbol
=MID(A1,B1+1,C1) //text to the right of the symbol (you can replace C1 with a smaller number to extract only 1 portion)
//You can also nest these formulas to save cells, but usually at the cost of more calculation
This is the concept; you will probably need to do it in multiple cells to split a string as long as yours. For multiple splits, repeat these formulas, each time targeting the result of the previous MID.
That way, you will get cell result sequence like:
0/1:47,45:92:99:1319,0,1320; 47,45:92:99:1319,0,1320; 92:99:1319,0,1320; 99:1319,0,1320......
From each of those you can take the left part of the string, up to the ":", to get each portion.
If you are working with a large table you probably want to look into VBA scripting. To my knowledge there is no single Excel command that can take 1 cell and split it into multiple ones.
Let me try to help with this. I am not a professional, so you may run into some problems. My solution adds 2 columns next to the source column; you can build on the same principle to improve the formulas.
Column B Formula:
=LEFT(A2,FIND(":",A2,1)-1)
Column C Formula:
=RIGHT(A2,LEN(A2)-FIND("|",SUBSTITUTE(A2,":","|",LEN(A2)-LEN(SUBSTITUTE(A2,":","")))))
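As a quick check of what the two columns should contain, here is a Python sketch of the same idea: everything before the first delimiter and everything after the last one. The function name is hypothetical.

```python
def first_and_last_field(cell, delimiter=":"):
    """Column B: text before the first delimiter.
    Column C: text after the last delimiter."""
    first = cell[:cell.index(delimiter)]
    last = cell[cell.rindex(delimiter) + 1:]
    return first, last


print(first_and_last_field("0/1:47,45:92:99:1319,0,1320"))
# ('0/1', '1319,0,1320')
```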
Given your statement that you have 100 columns, I imagine that in some instances you need to isolate characters in the middle of your string, so Left and Right may not always work. Use them where you can, though.
Assuming your string is in cell F2: 0/1:47,45:92:99:1319,0,1320
=LEFT(F2,3)
This returns 0/1 which are the first 3 characters in the string counting from the left. Likewise, Right functions similarly:
=RIGHT(F2,4)
This returns 1320, returning the 4 characters starting from the right.
You can use a combination of Mid and Find to dynamically find characters or strings based on defined characters. Here are a few examples of ways to dynamically isolate values in your string. Keep in mind that the key to these examples is the nested Find formula: the innermost Find is evaluated first, and each outer Find starts searching just past the position the inner one returned.
1) Return 2 characters after the second : character
In cell F2 I need to isolate the "92":
=MID(F2,FIND(":",F2,FIND(":",F2)+1)+1,2)
The innermost Find locates the first : in the string (the 4th character). We add +1 to move to the 5th character (past the first :, so the second Find will not see it), and the second Find starts looking for : again from there. This second Find returns 10, because the second : is the 10th character in the string. The outer +1 then moves past that colon, and Mid takes over: starting at the 11th character, it returns the next 2 characters. The number of characters returned is set by the 2 at the end of the formula (the last argument of Mid).
2) In this case I need to find the 2 characters after the 3rd : in the string. In this case "99":
=MID(F2,FIND(":",F2,FIND(":",F2,FIND(":",F2)+1)+1)+1,2)
You can see we have simply added one more nested Find to the formula in example 1.
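The nesting pattern generalizes to "find the nth delimiter, then take some characters after it". A minimal Python sketch of that pattern follows; the function names are made up, and positions are reported 1-based to match Excel's FIND.

```python
def find_nth(text, delimiter, n):
    """1-based position of the nth delimiter, like n nested FIND calls."""
    index = -1
    for _ in range(n):
        index = text.index(delimiter, index + 1)  # restart just past the last hit
    return index + 1  # convert 0-based index to an Excel-style position


def chars_after_nth(text, delimiter, n, length):
    """MID(text, position of nth delimiter + 1, length)."""
    position = find_nth(text, delimiter, n)
    # A 1-based position used as a 0-based index points at the next character.
    return text[position:position + length]


cell = "0/1:47,45:92:99:1319,0,1320"
print(find_nth(cell, ":", 1))            # 4
print(find_nth(cell, ":", 2))            # 10
print(chars_after_nth(cell, ":", 2, 2))  # 92
print(chars_after_nth(cell, ":", 3, 2))  # 99
```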

Formula for text before and after space in a string

I'm attempting to create usernames based on a given person's first and last name. Generally, we use the first initial and last name for a username. However, many of our users now have 2 last names, sometimes joined by a hyphen. I am trying to create a formula that gives me the first initial, the first letter of the FIRST last name, and then the second last name.
For example --
Amy Smith-Jones ==
asjones
This is what I am currently using, but, of course, it would yield "asmithjones".
=LOWER(LEFT(A1,1)&SUBSTITUTE(SUBSTITUTE(A2,"-","")," ",""))
I've tried some variations of this, but with no luck.
=LOWER(LEFT(A1,1)&LEFT(A2,1)&SUBSTITUTE(SUBSTITUTE(A2,"-","")," ",""))
Is there a way to generate both the first letter of the first string and the full text of the 2nd string?
EDIT
I came up with something, but now I face another challenge
=IFERROR(LOWER(LEFT(D2,1)&SUBSTITUTE(SUBSTITUTE(RIGHT(F2,LEN(F2)-FIND(" ",F2&" ")),"-","")," ","")),LOWER(LEFT(D2,1)&SUBSTITUTE(SUBSTITUTE(F2,"-","")," ","")))
Some users have only 1 last name, and the IFERROR fallback handles those. But some have a hyphen instead of a space. The SUBSTITUTE function accounts for both, but how can I make the FIND function do the same?
Try:
=LEFT(A1,1)&MID(A1,(SEARCH(" ",A1)+1),1)&RIGHT(A1,(LEN(A1)-(SEARCH(" ",(SUBSTITUTE(A1,"-"," ")),(SEARCH(" ",A1)+1)))))
Based on your edit, I'll assume first names are in column D and last names are in column F:
=LOWER(LEFT(D2) & IFERROR(LEFT(F2)&MID(F2,FIND("-",SUBSTITUTE(F2," ","-"))+1,99), F2))
SUBSTITUTE changes spaces to hyphens in the last name, so FIND can look for hyphens only.
If no hyphen is found (after substitution), FIND returns an error, IFERROR catches it, and the entire last name is used instead.
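The same logic can be sketched in Python to check the expected usernames. The function name and the sample names other than Amy Smith-Jones are assumptions for illustration.

```python
def username(first_name, last_name):
    """First initial, initial of the first last name, then the part of the
    last name after the first hyphen or space. With a single last name,
    fall back to first initial plus the whole last name."""
    normalized = last_name.replace(" ", "-")  # SUBSTITUTE(F2," ","-")
    hyphen = normalized.find("-")             # FIND("-", ...), -1 if absent
    if hyphen == -1:                          # the IFERROR branch
        return (first_name[:1] + last_name).lower()
    return (first_name[:1] + last_name[:1] + last_name[hyphen + 1:]).lower()


print(username("Amy", "Smith-Jones"))  # asjones
print(username("Amy", "Smith Jones"))  # asjones
print(username("Amy", "Smith"))        # asmith
```

The formula caps the second part at 99 characters through MID's last argument; the Python slice simply takes the rest of the string.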