How to remove emoji characters in excel? - excel

I would like to remove all the emoji characters in an Excel spreadsheet. I know they are reported as CHAR(63), which makes them hard to target. Some people simply remove every character outside a-z and 0-9, but that doesn't solve my problem because my spreadsheet may contain many other languages, such as Chinese and Thai.
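If the data can be processed outside Excel, the idea of removing only emoji while leaving Chinese, Thai and other scripts alone can be sketched in Python. This is a rough sketch, not an Excel solution: the Unicode ranges below are an assumed, non-exhaustive selection, and the name `strip_emoji` is made up for illustration.

```python
import re

# Assumed, non-exhaustive emoji ranges; extend or trim them for your own data.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F000-\U0001FAFF"  # pictographs, emoticons, transport symbols, flags
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F"                 # variation selector-16 (emoji presentation)
    "\u200D"                 # zero-width joiner (assumption: only used in emoji here)
    "]+"
)


def strip_emoji(text: str) -> str:
    """Remove characters in the assumed emoji ranges and keep everything else."""
    return EMOJI_PATTERN.sub("", text)


if __name__ == "__main__":
    sample = "\u4f60\u597d \U0001F600 \u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35 \U0001F680 hello"
    print(strip_emoji(sample))
```

Because the pattern lists what to delete rather than what to keep, letters from any script pass through untouched. Spaces around a removed emoji are left in place.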

Related

How to make an excel (365) function that recognizes different words in the same cell and changes them individually

What I'm working with
I have a list of product names, but unfortunately they are written in uppercase. I now want to make only the first letter uppercase and the rest lowercase, but I also want all words with 3 or fewer characters to stay uppercase.
I'm trying IF functions, but nothing is really working.
I use the German Excel version, but I would be happy if someone has any idea how to do it. I have been trying different functions for hours, but nothing is working.
=IF(LENGTH(C6)<=3,UPPER(C6),UPPER(LEFT(C6,1))&LOWER(RIGHT(C6,LENGTH(C6)-1)))
But it returns a #NAME error; Excel does not recognize the first and the last bracket.
This is hard! Let me explain:
I believe there are German words in the mix that are shorter than 4 characters and that you should exclude. My German isn't great, but there are probably a great many words shorter than 4 characters;
There seem to be substrings of 3+ characters that should probably stay uppercase, e.g. '550E/ER';
There seem to be quite a few characters that could be used as delimiters to split the input into 'words'. It's hard to catch all of them without a full list;
Possible other reasons;
With the above in mind, I think the best we can do is approximate what you want. Therefore I'd suggest:
Split on multiple characters;
Exclude certain words from being uppercase when length < 4;
Keep words uppercase when length > 3 and digits are present;
Assume the 1st character can be made uppercase in any input;
For example:
Formula in B1:
=MAP(A1:A5,LAMBDA(v,LET(x,TEXTSPLIT(v,{"-","/"," ","."},,1),y,TEXTSPLIT(v,x,,1),z,TEXTJOIN(y,,MAP(x,LAMBDA(w,IF(SUM(--(w={"zu","ein","für","aus"})),LOWER(w),IF((LEN(w)<4)+SUM(IFERROR(FIND(SEQUENCE(10,,0),w),)),UPPER(w),LOWER(w)))))),UPPER(LEFT(z))&MID(z,2,LEN(v)))))
You can see how difficult it is to capture each and every possibility;
The minute you exclude a few words, another will pop up (the 'x' between numbers, for example, which should stay upper- or lowercase depending on the context it is found in);
The second you include words containing digits, you notice that some should be excluded ('00SICHERUNGS....');
If the 1st character is a digit, the above solution will not uppercase the first alphabetic character;
Maybe some characters shouldn't be used as delimiters, depending on context? Think about hyphenated words;
Possible other reasons.
The point is, this is not just hard, it's extremely hard if not impossible to do on the type of data you are currently working with! Even if one is proficient at writing regular expressions (chuck in all the tokens, quantifiers and methods not available in Excel, if you like), I doubt all edge cases could be covered.
Because you are dealing with any number of words in a cell, you'll need to get crafty with this one. Thankfully there are TEXTSPLIT() and TEXTJOIN(), which make short work of splitting the text into words. We can then test the length of each word, change its capitalization, and join the words back together, all in one formula:
=TEXTJOIN(" ", TRUE, IF(LEN(TEXTSPLIT(C6," "))<=3,UPPER(TEXTSPLIT(C6," ")),PROPER(TEXTSPLIT(C6," "))))
I also used PROPER(), which capitalizes the first letter of each word and lowercases the rest.
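For readers who want to see the split / test length / re-case / join logic outside a formula, here is a minimal Python sketch of the same steps. The function name `recase` and the sample product name are hypothetical, and the sketch capitalizes only the first character of each longer word, which is close to but not identical to what PROPER() does.

```python
def recase(name: str, max_upper_len: int = 3) -> str:
    """Keep short words uppercase and capitalize only the first letter of longer ones."""
    recased = []
    for word in name.split(" "):
        if not word:
            # Skip empty pieces, like TEXTJOIN does with its ignore_empty argument.
            continue
        if len(word) <= max_upper_len:
            recased.append(word.upper())
        else:
            recased.append(word[:1].upper() + word[1:].lower())
    return " ".join(recased)


if __name__ == "__main__":
    # Hypothetical product name, not taken from the original data.
    print(recase("KABEL FÜR USB ADAPTER"))
```

As with the formula, this splits on spaces only, so the edge cases described in the other answer (other delimiters, words containing digits, short German words) are not handled.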

Only include alphanumeric characters in a column?

I am quite new to Excel and am looking for the most straightforward way to keep only the alphanumeric characters in a column of strings. Any non-alphanumeric characters should simply be deleted from the string, leaving only the alphanumerics.
I'd like to avoid using nested SUBSTITUTE functions because it looks clunky, but more importantly, I can't know or predict all the special symbols that could come up. I can't do SUBSTITUTE('hello-world', "-", "") -> "helloworld" because I don't want to remove specific special characters so much as keep only alphanumeric characters. So it should be something more like KEEP('hello-world', [A-Z,a-z,0-9]), but I can't find anything like that.
Is there a more straightforward way to do this without Visual Basic or macros? I'd have to learn how to do those and it seems time-consuming. If not, I'd love a VBA or macro idea.
Thanks so much!
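The "keep only these characters" behaviour described above can be sketched in Python with a negated character class. This is not an Excel or VBA solution, only an illustration of the intended semantics; `keep_alnum` is a hypothetical name standing in for the imagined KEEP function.

```python
import re

# Matches every character that is NOT an ASCII letter or digit.
NON_ALNUM = re.compile(r"[^A-Za-z0-9]")


def keep_alnum(text: str) -> str:
    """Delete everything except A-Z, a-z and 0-9."""
    return NON_ALNUM.sub("", text)


if __name__ == "__main__":
    print(keep_alnum("hello-world"))
```

The negated class means no list of special symbols has to be known in advance, which is the point made in the question.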

Error - COUNTIF with Text containing special characters

I've tried using COUNTIF to count how many times text values such as the ones shown below appear.
I've already searched a lot and I'm aware of the limitations that COUNTIF has for counting numeric data (such as the 15-character limit).
But why isn't it working for the situation below, if it contains only text?
LV3/SC*CZ2 1 (=COUNTIF($A$1:$A$2;A1))
LV3*CZ2 2 (=COUNTIF($A$1:$A$2;A2))
Thanks in advance.
You need to escape the "*", which in COUNTIF(), as in some other functions, is a wildcard for zero or more characters. Prefix such characters with a tilde to make them match the symbol literally. Try:
=COUNTIF(A$1:A$2,SUBSTITUTE(A1,"*","~*"))
I like this source for some more explanation on the topic.
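To see why the second cell counts 2, here is a small Python sketch that models the wildcard rules: "*" matches any run of characters, "?" matches one character, and a tilde makes the next wildcard literal. It is a simplified model and not Excel's actual implementation. The names `criteria_to_regex`, `countif` and `escape_wildcards` are hypothetical, and escaping "?" and "~" as well as "*" is an assumption that goes beyond the formula above.

```python
import re


def criteria_to_regex(criteria: str) -> re.Pattern:
    """Translate a COUNTIF-style text criterion into a regex (simplified model)."""
    parts = []
    i = 0
    while i < len(criteria):
        ch = criteria[i]
        if ch == "~" and i + 1 < len(criteria) and criteria[i + 1] in "*?~":
            parts.append(re.escape(criteria[i + 1]))  # tilde-escaped literal
            i += 2
            continue
        if ch == "*":
            parts.append(".*")  # zero or more characters
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def countif(cells, criteria):
    """Count cells whose whole text matches the criterion."""
    pattern = criteria_to_regex(criteria)
    return sum(1 for cell in cells if pattern.fullmatch(cell))


def escape_wildcards(text: str) -> str:
    """Prefix wildcard characters with a tilde so they match literally."""
    return text.replace("~", "~~").replace("*", "~*").replace("?", "~?")


if __name__ == "__main__":
    cells = ["LV3/SC*CZ2", "LV3*CZ2"]
    print(countif(cells, cells[0]))                    # 1
    print(countif(cells, cells[1]))                    # 2: "*" acts as a wildcard
    print(countif(cells, escape_wildcards(cells[1])))  # 1: "*" is now literal
```

Used as a criterion, "LV3*CZ2" means "starts with LV3, ends with CZ2", which also matches "LV3/SC*CZ2". After escaping, only the exact text matches.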

Extract strings of a certain language from a dataframe in python

I have a pandas DataFrame that contains a column with sentences from different languages (6 languages). The DataFrame also contains a column which states which language the corresponding sentence belongs to. However, a sentence may contain non-letter ASCII characters such as =## etc., and words that may not belong to the same language, even though they may be written in the same script. For example, please refer to the sentence below, which has been marked as Spanish:
'¿Vas a venir a la tienda conmigo?+== #loja' #Note that 'loja' is a Portuguese word.
Since the sentence is marked as Spanish, I would like to remove all non-Spanish words and non-punctuation characters (+, =, #).
I have an idea for removing the non-punctuation characters: get the set of values and remove the ones that are not letters (there are only a few punctuation characters, so no need to search). However, would someone be able to help remove the words that do not belong to the tagged language, such as the Portuguese word in the above example, using Python?
Thanks & Best Regards
Michael
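The symbol-removal half of the question can be sketched with the standard library alone. This sketch covers only the stray symbols and does not attempt the language filtering. `SYMBOLS_TO_DROP` and `drop_symbols` are hypothetical names, and the symbol set is taken from the example sentence.

```python
# Symbols from the example sentence; extend this string for other data.
SYMBOLS_TO_DROP = "+=#"

# Translation table that deletes every character listed in SYMBOLS_TO_DROP.
_DROP_TABLE = str.maketrans("", "", SYMBOLS_TO_DROP)


def drop_symbols(sentence: str) -> str:
    """Delete the listed symbols and collapse any leftover runs of whitespace."""
    cleaned = sentence.translate(_DROP_TABLE)
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(drop_symbols("¿Vas a venir a la tienda conmigo?+== #loja"))
```

On a DataFrame this could be applied per row with something like `df["sentence"].map(drop_symbols)`, where the column name `sentence` is an assumption. The word 'loja' is left in place, since deciding which words belong to the tagged language needs a separate approach.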

How do I escape a dollar sign ('$') in an Excel formula?

I'm having a hard time checking whether cells contain a dollar sign ('$'), as Excel thinks I'm trying to make an absolute reference.
I'm working with imported data that includes a column of usernames, and many of the usernames have a '$' character at the end. In Excel, I'm omitting some of the data in the username column, based on strings they may contain. Some example-ish accounts:
chi_smithcleve
letter_admin
NYCDB140$
outside3
NYCPRD148$
ATLDB12$
chi_goadjames
I want to test the usernames for three conditions: they don't contain the string 'NYC', 'chi', or '$'. The character-strings are easy, but I can't figure out how to escape the dollar-sign character! All the documentation I've found suggests double-quotes as an escape mechanism in Excel, but that doesn't seem to be working. The primary formula that documentation says should work is:
=ISNUMBER(SEARCH(""$"",A2))
where I'm checking the cell A2 to see if the '$' character occurs. But Excel's just telling me that I have an error. I've tried several other possible escape characters, to no avail.
(I could do a 'character replace' function at some point upstream, to replace the '$'s with a more manipulable character, but I'd rather just leave the data in the same state as when it's received.)
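Try this, it is working for me. Inside a quoted string, '$' is a literal character and needs no escaping; the doubled quotes in your formula are what cause the error.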
=ISNUMBER((SEARCH("$",A2)))
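For comparison, the full three-condition filter from the question can be sketched in Python, where '$' in a plain substring test also needs no escaping. The names `EXCLUDED` and `is_kept` are hypothetical, and the comparison is made case-insensitive on the assumption that it should behave like SEARCH().

```python
# Example accounts copied from the question.
usernames = [
    "chi_smithcleve",
    "letter_admin",
    "NYCDB140$",
    "outside3",
    "NYCPRD148$",
    "ATLDB12$",
    "chi_goadjames",
]

# Substrings that disqualify a username, lowercased for case-insensitive matching.
EXCLUDED = ("nyc", "chi", "$")


def is_kept(username: str) -> bool:
    """True when the username contains none of the excluded substrings."""
    lowered = username.lower()
    return not any(fragment in lowered for fragment in EXCLUDED)


kept = [name for name in usernames if is_kept(name)]

if __name__ == "__main__":
    print(kept)
```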
