Sort text alphanumerically in Excel

I have part number data in a spreadsheet that is stored as text (not numeric, because the part numbers contain letters), and I need to sort it alphanumerically. From what I have read, this appears to be almost impossible because of nulls (I have none of these) and dashes (I have tons of these). As you will see below, letters and numbers appear in different positions within the field.
MS16624-2066
RWR80S
02-6009-23
23032-1910
31708-1370
11SM1-T
111SM1-5
The final result required is:
MS16624-2066
RWR80S
02-6009-23
11SM1-T
111SM1-5
23032-1910
31708-1370
I have tried as much as I could by looking at the sorts in this forum, but have had no luck. Can anyone suggest a working approach?

Assuming part numbers are in ColumnA starting in A1, in B1:
=SUBSTITUTE(A1,"-","")
Copy down to suit, then copy ColumnB and use Paste Special, Values over the top to replace the formulas with their results.
Apply Text to Columns (Fixed width) to ColumnB and split it character by character (positions 1 to 11 for your example).
Then sort A:M on ColumnC descending and move the rows with C numbers below the C letters.
You may then choose to delete ColumnsB:M.
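For comparison outside Excel, here is a minimal Python sketch that reproduces the ordering requested in the question. It is not the character-by-character method described above: it swaps in a natural sort, treating dashes as separators, ranking letter runs ahead of digit runs, and comparing digit runs by numeric value. The function name part_key is hypothetical, and the rule itself is an assumption inferred from the seven sample part numbers.

```python
import re

def part_key(part):
    """Build a sort key: letter runs rank before digit runs; digit runs compare by value."""
    key = []
    # Dashes are skipped because only letter runs and digit runs are collected.
    for chunk in re.findall(r"\d+|[A-Za-z]+", part):
        if chunk.isdigit():
            key.append((1, int(chunk), ""))
        else:
            key.append((0, 0, chunk.upper()))
    return key

parts = [
    "MS16624-2066", "RWR80S", "02-6009-23", "23032-1910",
    "31708-1370", "11SM1-T", "111SM1-5",
]
ordered = sorted(parts, key=part_key)
print(ordered)
```

With the sample data this puts the letter-led part numbers first and then orders the rest by their leading number (2, 11, 111, 23032, 31708).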

Related

Excel: How to copy specific cell data (zip codes) into a new column

I have two columns of cells that have irregularly formatted addresses.
I need:
1) just the zip codes to be copied into a new column;
2) the rows that do not contain zip codes to be either highlighted or empty so that I can easily identify which ones are missing.
This seems like it would be simple to do, but I can't figure out how to have Excel find all instances of 5 consecutive digits. The cells are currently formatted as text so that the leading zeros are displayed. Any help is greatly appreciated.
Here's what it would be to start with:
Here's what it would look like when done (highlighting optional):
Normal Excel does not have regular expressions; you would have to go into VBA for that. However, your case follows an easy pattern: notice how the zip code comes after the last space and is always 5 digits long? The challenge then becomes finding the index of this last space and extracting the 5 characters that follow it. It will be clearer if you split this into two formulas:
// C3 (index of last space character):
=FIND("|",SUBSTITUTE(B3," ","|",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))
// D3, the 5 characters after that.
// Return an empty string if the address doesn't match the pattern
=IFERROR(MID(B3,C3+1,5),"")
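The same last-space idea can be sketched in Python as a cross-check. The function name and the sample addresses are hypothetical. One deliberate difference from the two formulas above: this sketch also checks that the five characters are digits, which the MID formula does not do.

```python
def zip_after_last_space(address):
    """Return the 5 digits that follow the last space, or "" if the pattern is absent."""
    idx = address.rfind(" ")              # index of the last space, -1 if there is none
    if idx == -1:
        return ""
    candidate = address[idx + 1 : idx + 6]  # the 5 characters after that space
    # Extra check (an assumption, not in the formulas): require exactly 5 digits.
    if len(candidate) == 5 and candidate.isdigit():
        return candidate
    return ""

print(zip_after_last_space("123 Main St, Springfield, IL 62704"))
print(repr(zip_after_last_space("123 Main St")))
```

Rows that return an empty string are the ones to highlight as missing a zip code.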
Another approach to what Zoff Dino wrote is to break it out a bit as shown below:
In cell C3 enter the formula you see in the formula bar
Drag that down the row set and over 1 column (so it runs for column B as well)
In column use this formula: =IF(AND(C3="",D3=""),"",IF(C3="",D3,C3)) and drag it down.
This will account for all possible situations you have shown and not error out on you (unless other patterns emerge).
You can then use conditional formatting to highlight the rows with no zip code as shown in the picture:

Find duplicates with same number sequence

I am currently trying to filter through loads of user data to find duplicate accounts. The best way to identify the users is by telephone number.
Unfortunately, the numbers are not saved in the same format, nor do all the cells have the same number of digits. See below:
+1 912 555 1234
001 912 5551234
(912) 5551234
912 5551234
912-555-1234
Is there any way to search for duplicates of a certain digit sequence? In this case, 5551234.
I could remove all the special characters (brackets, dashes, spaces, etc.) manually with a simple "search and replace", right? But the cells would still have different numbers of digits, which is why a normal duplicate search does not work.
I really appreciate your help. Thank you a lot!
Assuming you can't use VBA, I've put together a quick series of functions to deal with all the examples you have above. It may not be comprehensive, but you'll get the general idea. Put all of the below code into row 2 of a spreadsheet (so you can use headings if you wish)
Column A: Tel numbers
Column B (remove whitespace): =SUBSTITUTE(A2, CHAR(32),"")
Column C (remove brackets and dashes): =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2, CHAR(40),""), CHAR(41),""),CHAR(45),"")
Column D (replace +1 with 0): =IF(LEFT(C2,1)="+","0"&RIGHT(C2,LEN(C2)-2),C2)
Column E (replace 001 with 0): =IF(LEFT(D2,3)="001","0"&RIGHT(D2,LEN(D2)-3),D2)
Column F (ensure leading 0): =IF(LEFT(E2,1)="0",E2,"0"&E2)
Just copy/paste the cells down, and all the numbers used in your example will have the same format (in column F).
Note that columns B/C could be combined easily into a single column, but I've left them separated to make it easier to understand how it works. The combined column would be
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2, CHAR(32),""), CHAR(40),""), CHAR(41),""),CHAR(45),"")
If you need to remove any more special characters (in addition to the brackets and dashes), you can find the ASCII codes to pass to the CHAR function in this table.
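The column-by-column steps above can be sketched in Python to check that the five sample numbers really end up identical. The function name normalize_phone is hypothetical. Like the formulas, the sketch assumes a leading "+" is always followed by a single-digit country code.

```python
def normalize_phone(raw):
    """Mirror columns B to F: strip separators, then force a single leading 0."""
    s = raw
    for ch in " ()-":                 # columns B and C: spaces, brackets, dashes
        s = s.replace(ch, "")
    if s.startswith("+"):             # column D: replace "+1" with "0"
        s = "0" + s[2:]
    if s.startswith("001"):           # column E: replace "001" with "0"
        s = "0" + s[3:]
    if not s.startswith("0"):         # column F: ensure a leading 0
        s = "0" + s
    return s

samples = [
    "+1 912 555 1234", "001 912 5551234", "(912) 5551234",
    "912 5551234", "912-555-1234",
]
normalized = [normalize_phone(s) for s in samples]
print(normalized)
```

Once every number has the same shape, an ordinary duplicate search on the normalized column works.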

Comparing two columns in excel for similarities

I have two columns in Excel, A and B, running from row 1 to row 1400.
The value in column A is 10 characters, "K0123456789" and column B is 9 characters "0123456789"
I need to check that the value in column B is the same as the value in column A without the "K", and highlight it if they do not match. I am not very familiar with Excel, so any information here would help so that I do not have to go through all these lines myself on a daily basis.
Thanks for any help!
GenZade
You can put a formula in column C such as (For cell C1):
=IF(A1="K"&B1,"Match","No Match")
Of course, you could also add conditional formatting with a similar formula if you want to literally highlight it.
I would have just commented on neelsg's post, but I don't have enough reputation for that, apologies. His solution compares string values rather than numerical values, so values that differ only in leading zeros will not match. To compare actual numerical values, you can use the following:
=IF(RIGHT(A1,LEN(A1)-1)*1=B1,"match","no match")
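So it depends on whether you want to match them as strings or as actual numbers. Note that the numeric version expects column B to hold real numbers; if column B is stored as text, coerce it as well (for example with B1*1).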
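The difference between the two comparisons can be illustrated with a short Python sketch. The function names are hypothetical, and the sample values assume column A always starts with a single "K".

```python
def match_as_text(a, b):
    """String comparison, like =IF(A1="K"&B1,...): leading zeros matter."""
    return a == "K" + b

def match_as_number(a, b):
    """Numeric comparison, like RIGHT(A1,LEN(A1)-1)*1: leading zeros are ignored."""
    return int(a[1:]) == int(b)

print(match_as_text("K0123456789", "0123456789"))
print(match_as_text("K0123456789", "123456789"))
print(match_as_number("K0123456789", "123456789"))
```

The second pair differs only by a leading zero, so it fails the text comparison but passes the numeric one.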

Remove Duplicates with or without sorting

I have a large column of texts (5 digit integers concatenated with two letters, like: 12345AB ) and values (up to 8 digit positive integers, like: 12345678) . The list is around 12,200 total and when I do remove duplicates, it reduces to 7015 total. If I sort the result and then do another remove duplicates, I am left with 6324 entries. On the other hand if I sort first and then do remove duplicates, I am left with 6324 entries.
Is it a common issue, when numbers and text are mixed, that removing duplicates works only after sorting?
I can upload my file if this is not a common issue and is a problem with my file. I'm guessing if the row starts with numbers (text) then the excel search algorithm only goes down the column till such a point that it stops seeing numbers (text) and we miss out on the duplicates that show up later?
I shudder at the thought that I've been using remove duplicates incorrectly all this while.
Please help. Thanks.
EDIT To Include the actual file I am working with:
Link here
It seems that what you want to ensure is that they're all the same type, no? An easy way to coerce a cell to text is:
=A1 & ""
and to a number:
=A1 * 1
I was able to accomplish this by using the Text to Columns option:
Select the column (B).
Select Text to Columns on the Data tab.
Select Delimited and click Next.
Click Next again, as there are no delimiters.
Under Column data format, select Text.
Then remove duplicates.
I ran into this issue with VLOOKUP before as well; this approach ensures that all data in the column is formatted consistently.
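The idea shared by both answers, coercing every entry to one type before removing duplicates, can be sketched in Python as follows. The function name and sample values are hypothetical, and the whitespace trimming is an extra assumption that neither answer mentions. Whether mixed types are actually the cause of the differing counts in the question is not confirmed by the source.

```python
def dedupe_as_text(values):
    """Coerce every entry to text, then keep the first occurrence of each value."""
    seen = set()
    result = []
    for v in values:
        key = str(v).strip()          # number 12345678 and text "12345678" become equal
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result

mixed = [12345678, "12345678", "12345AB", "12345AB ", 42]
print(dedupe_as_text(mixed))
```

Without the coercion, the number 12345678 and the text "12345678" would count as two distinct values.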

Formula for excel

Please help me find an Excel formula that takes all the words in a text (for example, the text in column A) and lists them in column B without repeating any word.
For example,
Column A
Text
Although simplicity is a virtue, theories regarding pedagogy do not work in practice if they are black and white. To say that the best way to teach is only to praise positive actions and to ignore negative ones is like saying that strawberries reduce one’s risk for cancer so people should cut apples out of their diet and only eat strawberries. In both situations, there does not have to be a choice.
Column B - Words from text
Although
simplicity
is
a
virtue,
theories
regarding
pedagogy
do
not
work
in
practice
if
they
are
black
and
white.
To
say
that
the
best
way
to
teach
is
only
to
praise
positive
actions
and
to
ignore
negative
ones
is
like
saying
that
strawberries
reduce
one’s
risk
for
cancer
so
people
should
cut
apples
out
of
their
diet
and
only
eat
strawberries.
In
both
situations,
there
does
not
have
to
be
a
choice.
This is rather complex for a single formula, so here is a multi-step method.
Part 1: splitting the text into single words:
A1: your text
A3: =SUBSTITUTE(A1,",","") .... removes commas
A5: =SUBSTITUTE(A3,".","") .... removes full stops (repeat this for any other punctuation you might have)
A8: constant value 0
A9: =FIND(" ",$A$5,A8+1) .... finds the next blank in $A$5 after the position given in the cell above .... copy this formula down until you get the first #VALUE! error
B9: =MID($A$5,A8+1,A9-A8-1) .... extracts the word between the previous blank position and this one .... copy this formula down until you get the first #VALUE! error
Because the last word has no blank after it, it falls on the first #VALUE! row; appending a blank to the text in A5 (for example with &" ") should let it be extracted as well.
When you are happy with your split list, copy/paste the list as values and add some headers.
Part 2: finding unique words:
You need to find each unique word exactly once. A method strictly without VBA would consist of the following:
Sort the text in column B ascending.
Enter in C8: =IF(B8=B7,C7+1,1) and copy down to the end of the list ... this creates a "running number" that starts at 1 and keeps incrementing as long as the word remains the same.
Autofilter column C for value = 1 ... this will display the first occurrence of each word.
Copy/paste the filtered list to wherever you want to store it for further processing ... I recommend a sheet different from your raw data.
You can restore the original sort order of the result by sorting on the numeric values in column A.
As you can see from the words "in" and "to" in the example, this method is case insensitive. A limitation is a possible false separation between "ones" and "one's" ... this needs to be decided.
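As a cross-check of the split-then-deduplicate method, here is a minimal Python sketch of the same idea. The function name unique_words is hypothetical. It assumes that only ASCII punctuation at the start or end of a word should be stripped, and it compares words case-insensitively, as the worksheet method does.

```python
import string

def unique_words(text):
    """Split on whitespace, strip edge punctuation, keep the first occurrence of each word."""
    seen = set()
    result = []
    for raw in text.split():
        word = raw.strip(string.punctuation)   # drops commas, full stops, etc. at the edges
        key = word.lower()                     # case-insensitive comparison
        if word and key not in seen:
            seen.add(key)
            result.append(word)
    return result

sample = "To say that the best way to teach is only to praise."
print(unique_words(sample))
```

Unlike the sort-and-filter steps, this keeps the words in their original order without a separate re-sort.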
You can try this formula:
=TRIM(MID(SUBSTITUTE($A$1;" ";REPT(" ";LEN($A$1)));1+(ROW(A1)-1)*LEN($A$1);LEN($A$1)))
Assuming the text is in A1, enter the formula in B1 and copy it down until you reach the last word.
Depending on your regional settings, you may need to replace ";" with ",".
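To see why the padding trick in that formula works, here is a small Python sketch of the same steps. The function name nth_word is hypothetical, and it assumes words are separated by single spaces. Each space is widened to the full length of the text, so the n-th slice of that width can contain only the n-th word plus padding, which is then trimmed away.

```python
def nth_word(text, n):
    """Return the n-th word (1-based) using the SUBSTITUTE/REPT/MID/TRIM padding trick."""
    width = len(text)
    padded = text.replace(" ", " " * width)   # SUBSTITUTE(A1," ",REPT(" ",LEN(A1)))
    start = (n - 1) * width                   # MID start, converted to 0-based indexing
    return padded[start : start + width].strip()   # MID(...) followed by TRIM

sentence = "To say that the best"
print([nth_word(sentence, n) for n in range(1, 7)])
```

Past the last word the slice is empty, which corresponds to the blank cells you get when the formula is copied down too far.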

Resources