Error - COUNTIF with Text containing special characters - excel

I've tried using COUNTIF to count how many times text values such as the ones shown below appear.
I've already searched a lot and I'm aware of the limitations that COUNTIF has for counting numeric data (such as the 15-digit limit).
But why isn't it working for the situation below, if the cells contain only text?
LV3/SC*CZ2 1 (=COUNTIF($A$1:$A$2;A1))
LV3*CZ2 2 (=COUNTIF($A$1:$A$2;A2))
Thanks in advance.

You need to escape the "*", which in COUNTIF(), as in some other functions, is a wildcard that matches zero or more characters. To make such a character match the symbol literally, prefix it with a tilde. Try (swap the commas for semicolons if your locale uses them as the argument separator):
=COUNTIF(A$1:A$2,SUBSTITUTE(A1,"*","~*"))
I like this source for some more explanation on the topic.
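To see why the unescaped criterion counts two cells, here is a minimal Python sketch that imitates the wildcard matching outside Excel. The helper names (excel_pattern_to_regex, countif, escape_wildcards) are made up for illustration, and the matcher only approximates COUNTIF: "*" is any run of characters, "?" is one character, and "~" escapes the next wildcard.

```python
import re


def excel_pattern_to_regex(criteria):
    """Translate an Excel-style wildcard criterion into a compiled regex (approximation)."""
    parts = []
    i = 0
    while i < len(criteria):
        ch = criteria[i]
        # "~*", "~?" and "~~" stand for a literal "*", "?" and "~"
        if ch == "~" and i + 1 < len(criteria) and criteria[i + 1] in "*?~":
            parts.append(re.escape(criteria[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def countif(values, criteria):
    """Count the values that match the whole criterion, wildcards included."""
    pattern = excel_pattern_to_regex(criteria)
    return sum(1 for v in values if pattern.fullmatch(v))


def escape_wildcards(text):
    """Prefix "~", "*" and "?" with a tilde so they match literally."""
    return text.replace("~", "~~").replace("*", "~*").replace("?", "~?")


cells = ["LV3/SC*CZ2", "LV3*CZ2"]
print(countif(cells, "LV3*CZ2"))                    # wildcard: matches both cells
print(countif(cells, escape_wildcards("LV3*CZ2")))  # literal: matches one cell
```

The SUBSTITUTE in the formula above escapes only "*"; escaping "?" and "~" as well is an extension added in this sketch.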

Related

How to make an excel (365) function that recognizes different words in the same cell and changes them individually

I have a list of product names, but unfortunately they are written in uppercase. I now want to make only the first letter uppercase and the rest lowercase, but I also want all words with 3 or fewer characters to stay uppercase.
I'm trying IF functions but nothing is really working.
I use the German Excel version, but I would be happy if someone has any idea how to do it. I have been trying different functions for hours but nothing is working:
=IF(LENGTH(C6)<=3,UPPER(C6),UPPER(LEFT(C6,1))&LOWER(RIGHT(C6,LENGTH(C6)-1)))
but I get a #NAME error; Excel does not recognize the first and the last bracket.
This is hard! Let me explain:
I do believe there are German words in the mix that are below 4 characters in length that you should exclude. My German isn't great, but there are probably a great many words below 4 characters;
There seem to be substrings that are more than 3 characters in length but should probably stay uppercase, e.g. '550E/ER';
There seem to be quite a few characters that could be used as delimiters to split the input into 'words'. It's hard to catch all of them without a full list;
Possible other reasons;
With the above in mind, I think it's safe to say that we can only try to accomplish what you want as best as we can. Therefore I'd suggest:
To split on multiple characters;
Exclude certain words from being uppercase when length < 4;
Include certain words to be uppercase when length > 3 and digits are present;
Assume 1st character could be made uppercase in any input;
For example:
Formula in B1:
=MAP(A1:A5,LAMBDA(v,LET(x,TEXTSPLIT(v,{"-","/"," ","."},,1),y,TEXTSPLIT(v,x,,1),z,TEXTJOIN(y,,MAP(x,LAMBDA(w,IF(SUM(--(w={"zu","ein","für","aus"})),LOWER(w),IF((LEN(w)<4)+SUM(IFERROR(FIND(SEQUENCE(10,,0),w),)),UPPER(w),LOWER(w)))))),UPPER(LEFT(z))&MID(z,2,LEN(v)))))
You can see how difficult it is to capture each and every possibility;
The minute you exclude a few words, another will pop up (the 'x' between numbers, for example, which should be upper- or lowercase depending on the context it is found in);
The second you include words containing digits, you notice that some should be excluded ('00SICHERUNGS....');
If the 1st character is a digit, the solution above will not uppercase the first alphabetic character;
Maybe some characters shouldn't be used as delimiters based on context? Think about hyphenated words;
Possible other reasons.
Point is, this is not just hard, it's extremely hard if not impossible to do on the type of data you are currently working with! Even if one is proficient at writing regular expressions (chuck in all tokens, quantifiers and methods, including those not available to Excel, if you like), I doubt all edge cases could be covered.
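For readers who want to experiment with the same rules outside Excel, here is a rough Python sketch of the logic in the formula above: split on several delimiters, lowercase the listed short words, keep short words and words containing digits uppercase, lowercase everything else, then uppercase the first character. The function name and the sample strings are made up for illustration, and the handling of consecutive delimiters may differ from TEXTSPLIT.

```python
import re

# Words that should be lowercased even though they are short (same list as the formula)
LOWERCASE_WORDS = {"zu", "ein", "für", "aus"}


def recase(text):
    """Re-case one product name using the rules suggested in the answer."""
    # The capture group makes re.split keep the delimiters at odd indices
    parts = re.split(r"([-/ .])", text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1 or part == "":
            out.append(part)                  # delimiter or empty piece: keep as-is
        elif part.lower() in LOWERCASE_WORDS:
            out.append(part.lower())          # excluded short word
        elif len(part) < 4 or any(c.isdigit() for c in part):
            out.append(part.upper())          # short word or word with digits
        else:
            out.append(part.lower())
    joined = "".join(out)
    return joined[:1].upper() + joined[1:]    # uppercase only the very first character


print(recase("SICHERUNG FÜR PUMPE 550E/ER"))
```

The sketch inherits every limitation listed above; it is only meant to make the rules easier to tweak and test.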
Because you are dealing with any number of words in a cell you'll need to get crafty with this one. Thankfully there is TEXTSPLIT() and TEXTJOIN() that can make short work of splitting the text into words, where we can then test the length, change the capitalization, and then join them back together all in one formula:
=TEXTJOIN(" ", TRUE, IF(LEN(TEXTSPLIT(C6," "))<=3,UPPER(TEXTSPLIT(C6," ")),PROPER(TEXTSPLIT(C6," "))))
I also used the PROPER() function, which capitalizes the first letter of each word and lowercases the rest.
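The same split, test, and join idea can be sketched in Python. The function name recase_words is hypothetical, and str.capitalize() only approximates PROPER(), which also capitalizes letters that follow a non-letter character.

```python
def recase_words(text, max_upper_len=3):
    """Keep words of up to max_upper_len characters uppercase, capitalize the rest."""
    words = text.split(" ")
    return " ".join(
        w.upper() if len(w) <= max_upper_len else w.capitalize()
        for w in words
        if w  # skip empty pieces, like TEXTJOIN with ignore_empty set to TRUE
    )


print(recase_words("ABC PUMPE XL GROSS"))
```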

How to search for items with multiple "-" in excel or VBA?

I have a list of item numbers (100K) like this:
Some of the items (thousands of them) have a format like SAG571A-244-4 and need to be filtered out so I can delete them and keep only the items that have ONE hyphen per SKU. How can I isolate the items that have two instances of "-" in their SKU? I'm open to solutions within Excel or using VBA as well.
Native text filters don't seem to be capable of this. I'm stumped.
As per John Coleman's comment, "*-*-*" can be used to isolate strings that have at least two dashes in them.
I would add that if you're entering them as a custom text filter, you should lose the double quotes (so just *-*-*) as otherwise the field seems to interpret the quotes literally.
Seems to work for me.
If you want just an excel formula to verify this and give you a result of the number of hyphens (0, 1, or 2+), here is one:
=IF(ISERROR(SEARCH("-",A1)),"0",IF(ISERROR(SEARCH("-",A1,IFERROR(SEARCH("-",A1)+1,LEN(A1)))),"1","2+"))
Replace A1 with your relevant column, then fill down. This is kind of a terrible way to do this performance-wise, but you avoid using VBA and possibly xlsm files.
The code first checks to see if there is one hyphen, then if there is it checks to see if there is another hyphen after the position the first one was found. Looking for multiple hyphens in this manner is cumbersome and I don't recommend it.
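If the list can be exported, counting the hyphens directly is simpler than chaining searches. A minimal Python sketch, with hypothetical function names and sample SKUs:

```python
def hyphen_bucket(sku):
    """Return "0", "1" or "2+" like the formula above."""
    count = sku.count("-")
    return "2+" if count >= 2 else str(count)


def keep_single_hyphen(skus):
    """Keep only the SKUs that contain exactly one hyphen."""
    return [sku for sku in skus if sku.count("-") == 1]


skus = ["SAG571A-244", "SAG571A-244-4", "SAG571A"]
print([hyphen_bucket(s) for s in skus])
print(keep_single_hyphen(skus))
```

Note that keep_single_hyphen also drops SKUs with no hyphen at all; whether that is wanted depends on the data.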

How can I split a phrase into a new line every x characters on Google Sheets?

I am translating a game, and the game's text box only supports 50 characters max per line. Is there a way to use a formula to split the entire sentence every 50 characters or whole word (49, 48, 47, etc)?
I am currently working with this formula.
=JOIN(CHAR(10),SPLIT(REGEXREPLACE(A1, "(.{50})", "/$1"),"/"))
The problem with this code, is that it splits at exactly 50 characters (one time), and will split in the middle of the word.
So again, my goal is to have it not split on the 50th character IF the 50th character is in the middle of the word, and for the rule to apply for the rest of the lines too because it only applies on the first line.
Please take a look at this test google sheet to get an example of what I am talking about.
If it's impossible to do it on Google Sheets, I don't mind moving to Excel provided I get a functioning code.
For the record, I did ask in Google's product forums 2 days ago, and still haven't received an answer.
=REGEXREPLACE(A1, "(.{1,50})\b", "$1" & CHAR(10))
{50} matches exactly 50 times, but what you need is 50 or fewer.
\b is a word boundary, which matches between an alphanumeric and a non-alphanumeric character.
= REGEXEXTRACT(A1,"(?ism)^"&REPT("([\w\d'\(\),. ]{0,49}\s)", ROUNDUP(LEN(A1)/50,0))&"([\w\d'\(\),. ]{0,49})$")
Tested with various expressions and works as intended. Note that only these characters, [a-zA-Z0-9_'(),.] plus the space, are allowed, which means - and other characters not mentioned will not work. If you need them, add them inside the REPT expression and in the final part of the regex. Otherwise, this will work as is.
You are pretty close. I'm not an expert in Sheets, so not sure if this is the best way, but your Regex is wrong for what you want.
Also, you need to be certain that you don't use a split character that might appear in the phrase itself. However, using CHAR(10) for the replace character allows you to insert LF without going through the JOIN SPLIT sequence.
Replace any line feeds, carriage returns and spaces with a single space.
Match strings that start with a non-space character followed by up to 49 more characters, which are followed by a space or the end of the string.
Replace each match with its capture group followed by CHAR(10) (which deletes the space that followed).
There will be an extra CHAR(10) at the end, which you can strip off.
EDIT Regex changed slightly due to a difference in behavior between Google's RE and what I am used to (probably has to do with how a non-backtracking regex works). The problem showed up on your example:
=regexreplace(REGEXREPLACE(REGEXREPLACE(A1 & " ","[\r\n\s]+"," "),"(\S.{0,49})\s","$1" & char(10)),"\n+\z","")
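The three steps of this last formula can be tried outside Sheets with a short Python sketch. The function name wrap_text is made up, and Python's re module is a backtracking engine, so edge cases may behave differently from Google's RE2. A word longer than the width still ends up on an overlong line.

```python
import re


def wrap_text(text, width=50):
    """Break text into lines of at most `width` characters at whitespace."""
    # Step 1: append a space, then collapse all runs of whitespace into one space
    normalized = re.sub(r"\s+", " ", text + " ")
    # Step 2: a non-space character plus up to width-1 more, followed by whitespace;
    # the whitespace is dropped and replaced by a line feed
    pattern = r"(\S.{0,%d})\s" % (width - 1)
    wrapped = re.sub(pattern, r"\1\n", normalized)
    # Step 3: strip the trailing line feed(s)
    return re.sub(r"\n+\Z", "", wrapped)


sample = ("The quick brown fox jumps over the lazy dog and keeps "
          "running through the forest until night falls")
for line in wrap_text(sample).split("\n"):
    print(len(line), line)
```

In Python itself, textwrap.wrap(text, width=50) from the standard library does similar greedy wrapping without a hand-written regex.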

Excel conditional formatting based on multiple cells and values

I am trying to apply various conditional formatting rules to a specific database. I looked for an answer around here but cannot find anything similar. It might not be possible, but it is worth a try.
I am performing various data cleansing and validation tasks.
Here is the case: (small sample, working with 100k data entries in this particular file)
Ultimately, what I want is a formula that compares the characters after the last "UNDERSCORE" in the low-level Description to the characters after the last "UNDERSCORE" of the higher level (highlighted). If they do not match, the cell should be highlighted.
Asking for too much, yes, no, maybe? I am open to any other suggestions on how can I perform various data cleaning and validation!
Thank you!
If you must use the last "UNDERSCORE" character, and can't depend on the suffixes being four characters, the formula becomes quite complex. For simplicity's sake, I assumed the higher level is always missing the last five characters of the lower level. If you must go by the last "DASH" character, then this will be a lot longer.
Use this formula to highlight the cells, defining the two names LEVELS and DESCRS to be the two columns:
=IFNA(MID(B2,FIND("[]",SUBSTITUTE(B2,"_","[]",LEN(B2)-LEN(SUBSTITUTE(B2,"_",""))))+1,999)<>MID(INDEX(DESCRS,MATCH(LEFT(A2,LEN(A2)-5),LEVELS,0),1),FIND("[]",SUBSTITUTE(INDEX(DESCRS,MATCH(LEFT(A2,LEN(A2)-5),LEVELS,0),1),"_","[]",LEN(INDEX(DESCRS,MATCH(LEFT(A2,LEN(A2)-5),LEVELS,0),1))-LEN(SUBSTITUTE(INDEX(DESCRS,MATCH(LEFT(A2,LEN(A2)-5),LEVELS,0),1),"_",""))))+1,999),FALSE)
This uses a very nice trick with SUBSTITUTE to find the last occurrence of a character.
BTW, I would probably write a Perl program to parse the data and find errors.
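As a rough illustration of such a script, here is a minimal Python sketch of the same check. It keeps the assumption stated above that the higher level's code is the lower level's code minus its last five characters; the function names and the sample rows are hypothetical.

```python
def suffix_after_last(text, sep="_"):
    """Return the part after the last separator (the whole text if there is none)."""
    return text.rpartition(sep)[2]


def find_mismatches(rows, parent_trim=5):
    """Return the level codes whose description suffix differs from the parent's."""
    by_level = {}
    for level, descr in rows:
        by_level.setdefault(level, descr)  # first occurrence wins, like MATCH(..., 0)
    mismatched = []
    for level, descr in rows:
        parent_descr = by_level.get(level[:-parent_trim])
        if parent_descr is None:
            continue  # no higher level found: do not highlight (the IFNA(..., FALSE) case)
        if suffix_after_last(descr) != suffix_after_last(parent_descr):
            mismatched.append(level)
    return mismatched


# Hypothetical sample data: (level code, description)
rows = [
    ("A001", "PUMP_MAIN_X1"),
    ("A001-0001", "PUMP_SUB_X1"),
    ("A001-0002", "PUMP_SUB_X2"),
]
print(find_mismatches(rows))
```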

Keeping leading zeros with find and replace

I'm using Excel's find and replace to remove hyphens from long numbers. They are a mix of birth dates and organisation numbers that have been padded with leading zeros to have the same number of characters. There are a LOT of numbers, so find and replace seems to be the simplest way to remove the hyphens.
But when I use find and replace, the leading zeros are truncated, and I have not found a way to keep them afterwards.
For example, I have:
19551230-1234
01234567-8901
and after find and replace I have
1,95512E+11
12345678901
but want the format as:
195512301234
012345678901
So I want to keep the leading zeros after find and replace. I've tried formatting the cells as text, but it doesn't work as the find and replace automatically truncates the leading zero and keeps the remaining characters, so the zero is completely removed. I am using Excel 2010, but answers for several versions are appreciated.
Just put a single quote in front of the number, e.g. '01234. Excel will then take the value literally as text, and the quote will not show in the cell.
Use the SUBSTITUTE formula instead of Find and Replace like so:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1," ",""),"/",""),")",""),"(",""),"-","")
The result is text.
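The reason this works is that the value never becomes a number. A small Python sketch shows the same effect; the name clean_number is made up, and the characters removed are the same ones the formula strips.

```python
# Characters to delete: space, slash, parentheses and hyphen
STRIP_CHARS = str.maketrans("", "", " /()-")


def clean_number(raw):
    """Remove separator characters but keep the result as text."""
    return raw.translate(STRIP_CHARS)


print(clean_number("01234567-8901"))       # text keeps the leading zero
print(int(clean_number("01234567-8901")))  # converting to a number drops it
```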

Resources