Delete particular string values from a column

I have a column that contains 4000 unique values (rows). I want to delete values such as 'I__ND_LD(1)', 'I__ND_LD(2)', 'P__ND_LN(1)' and 'I__XF_XF(4)'. These values differ only by the number in the brackets; for example, 'I__ND_LD' starts at 'I__ND_LD(1)' and ends at 'I__ND_LD(70)'.
With the code below I can remove only one value. I want to remove all the values mentioned above.
eda[~eda.Devices.str.contains("^I__ND_LD(1)")]
Is there another technique that removes all of these values? The number of 'I__ND_LD' and 'P__ND_LN' entries also differs. I want to wrap this in a function so that I can just pass in the values and have it delete all matching values from the column.

# Regex alternatives; raw strings keep the backslash escapes intact
to_remove = [r'abc\(\d+\)', 'bca']
eda[~eda.Devices.str.contains('|'.join(to_remove), regex=True)]
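The question asks for a function that takes the values to remove. A minimal sketch of one way to do that follows, assuming the column is named `Devices` as in the question and that every unwanted value has the form `<prefix>(<digits>)`. The function name `drop_devices`, its parameters, and the sample rows are made up for illustration.

```python
import re

import pandas as pd


def drop_devices(df, prefixes, column="Devices"):
    """Return df without rows whose `column` value is '<prefix>(<digits>)'."""
    # re.escape guards any regex metacharacters inside the prefixes.
    alternatives = "|".join(re.escape(p) for p in prefixes)
    # Non-capturing group, anchored so only whole values match.
    pattern = rf"^(?:{alternatives})\(\d+\)$"
    mask = df[column].str.contains(pattern, regex=True, na=False)
    return df[~mask]


# Hypothetical sample data, not taken from the real 4000-row column.
eda = pd.DataFrame(
    {"Devices": ["I__ND_LD(1)", "I__ND_LD(70)", "P__ND_LN(1)",
                 "I__XF_XF(4)", "KEEP_ME(3)", "OTHER"]}
)
cleaned = drop_devices(eda, ["I__ND_LD", "P__ND_LN", "I__XF_XF"])
print(cleaned["Devices"].tolist())  # ['KEEP_ME(3)', 'OTHER']
```

Passing plain prefixes instead of hand-written regexes means the brackets and the number never have to be escaped by the caller.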

Related

Five random items from a list into a single cell separated by a comma

I have n unique values in n cells in Column A (for example: EDN12, EDN122, EDN991, ...).
I want to return any five of these values, without repetition and in random order, joined by commas in a single cell, and to do this n times. For example: (EDN12, EDN112, EDN991, EDN881, EDN12)
How do I achieve this?
I have tried this formula provided here (Return a random order of a list into a single cell):
=TEXTJOIN(",",,INDEX($A$1:$A$5,UNIQUE(RANDARRAY(1000,1,1,5,TRUE))))
But it only draws values from the first five cells in column A; the rest are omitted.
Assuming values in column A are unique on their own, try:
=LET(x,TOCOL(A:A,3),TEXTJOIN(", ",,TAKE(SORTBY(x,RANDARRAY(COUNTA(x))),5)))
Otherwise just nest 'x' in UNIQUE():
=LET(x,UNIQUE(TOCOL(A:A,3)),TEXTJOIN(", ",,TAKE(SORTBY(x,RANDARRAY(COUNTA(x))),5)))
This is an alternative formula that gets the required result without LET, although I prefer the solution that uses LET.
=TEXTJOIN(", ",TRUE,INDEX(A3:A22,INDEX(UNIQUE(RANDARRAY(COUNTA(A3:A22),1,1,COUNTA(A3:A22),TRUE)),SEQUENCE(5))))
Breaking it down:
Get an array of random numbers based on the number of data rows.
=RANDARRAY(COUNTA(A3:A22),1,1,COUNTA(A3:A22),TRUE)
Extract the unique values from the array of random numbers.
=UNIQUE(C3#)
Extract the first five unique values.
=INDEX(D3#,SEQUENCE(5))
Use the extracted values to extract matching rows from the source data.
=INDEX(A3:A22,E3#)
Finally join the values into a single cell.
=TEXTJOIN(", ",TRUE,F3#)
If your list of data is very short, the formula may not return five distinct values.
Your example appears to have at least 1000 data rows, though, so that should not be a problem.
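For readers who want to check the logic outside Excel, the same idea (deduplicate, shuffle, take five, join) can be sketched in Python. This is only an illustration of the technique, not a replacement for the formulas above; the function name `pick_five` and the sample values are assumptions.

```python
import random


def pick_five(values, k=5, sep=", "):
    """Join k distinct, randomly ordered entries of values into one string."""
    # Drop blanks and duplicates while keeping first-seen order,
    # similar in spirit to UNIQUE(TOCOL(A:A,3)).
    unique = list(dict.fromkeys(v for v in values if v not in ("", None)))
    # random.sample draws without replacement and raises ValueError
    # when fewer than k distinct values are available.
    return sep.join(random.sample(unique, k))


# Hypothetical column contents, including one duplicate and one blank.
column_a = ["EDN12", "EDN122", "EDN991", "EDN881", "EDN112", "EDN12", ""]
result = pick_five(column_a)
print(result)
```

Because the draw is made without replacement, a list that is too short fails loudly instead of silently returning repeats.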

Excel look up values based on comma separated string values

I need help with an Excel formula. I have to find a cell value based on a comma-separated list of values in another cell.
For example, G5 should hold the maximum of the Estimated End Date column (H) over the rows whose ID column contains 1 or 2 (the comma-separated list in E5). This is only an example; there could be more than two values in the list.
So G5 here should be 09/03/22, since it is the max of 04/03/22 and 09/03/22.
We normally only respond to questions that include code or formulas we can work from (it's probably why you've received a downvote). However, in this case, I can see how you wouldn't know where to start.
Assuming you have Excel 365, the steps are:
Convert CSV to array
Map the array to found row numbers
Grab the max of the resultant array
I don't have the TEXTSPLIT function, so I have to roll my own. I use a few helper functions for this (note that these are saved as Names via Formulas > Define Name, so you'll need to add them):
csvText
=LAMBDA(txt,"," & txt & ",")
csvCount
=LAMBDA(txt,LEN(txt)-LEN(SUBSTITUTE(txt,",",""))-1)
csvPosOf
=LAMBDA(n,txt,FIND("#",SUBSTITUTE(txt,",","#",n)))
csvLenOf
=LAMBDA(n,txt,csvPosOf(n+1,txt)-csvPosOf(n,txt)-1)
csvArray
=LAMBDA(txt,MAKEARRAY(1,csvCount(txt),LAMBDA(r,c,TRIM(MID(txt,csvPosOf(c,txt)+1,csvLenOf(c,txt))))))
This covers the splitting of your csv text to an array.
The mapping of the IDs to row numbers and MAX call is simply this formula:
=MAX(MAP(csvArray(csvText(C3)),LAMBDA(v,INDEX($A:$H,MATCH(VALUE(v),$A:$A,0),8))))
Add this formula to each row of your G column and adjust the cell references as needed. Note that you must call VALUE on the LAMBDA parameter v, because MATCH does not implicitly match a text representation of a number to an actual number.
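The three steps (split the list, map each ID to its row value, take the max) can also be sketched in Python to make the logic easy to verify. The function name `max_end_date`, the dictionary layout, and the row for ID 3 are assumptions; the dates for IDs 1 and 2 follow the question's example, read here as dd/mm/yy.

```python
from datetime import date


def max_end_date(id_list, end_dates):
    """Return the latest end date among the IDs in a comma-separated string."""
    # Step 1: split the text and convert each piece to a number,
    # which is the role VALUE plays in the Excel formula.
    ids = [int(part) for part in id_list.split(",") if part.strip()]
    # Steps 2 and 3: look up each ID, then take the maximum.
    return max(end_dates[i] for i in ids)


# Hypothetical ID -> Estimated End Date table.
end_dates = {
    1: date(2022, 3, 4),
    2: date(2022, 3, 9),
    3: date(2022, 3, 1),
}
latest = max_end_date("1,2", end_dates)
print(latest.strftime("%d/%m/%y"))  # 09/03/22
```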

Generate a unique ID (as much as possible) from a string in Excel using string functions

Let's say I have two strings in two cells
Cell A1 = Customer Country
Cell B1 = Customer City
I need to generate a unique ID using the Excel string functions (LEN, LEFT, MID, RIGHT etc.) or any other (CONCAT etc.) along with the ROW function.
Get first letter & last letter of each word, remove spaces and dashes, get the row number and return a unique string.
If I use
=IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=0,LEFT(A$1,1),IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=1,LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1),LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1)&MID(A$1,FIND(" ",A$1,FIND(" ",A$1)+1)+1,1))) &ROW(A$1)
I get CC1 in both cases. How would I get a unique ID in such a case?
The idea from @JosWoolley in the comment section is a good one. Be careful how and where you add a column index, though. If you just append the column index number, you create ambiguity between, say, CC111 from row 11, column 1 and the same ID from row 1, column 11. Appending the actual cell address instead of these indices helps, but it can still be ambiguous if you don't add a delimiter first. Therefore I'd suggest something along the lines of:
Formula in D1:
=CONCAT(LEFT(TEXTSPLIT(A1," ")),"|",ADDRESS(ROW(A1),COLUMN(A1),4))
Note: If you don't yet have access to TEXTSPLIT(), you can swap it for FILTERXML(). Also, you mentioned CONCAT(); in Excel 2019 you may need to confirm the formula with Ctrl+Shift+Enter (CSE).
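To see why the delimiter plus cell address avoids collisions, here is a small Python sketch of the same construction: the initials of each word, a "|", then the A1-style address. The helper names `column_letters` and `make_id` are made up for illustration.

```python
def column_letters(col):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def make_id(text, row, col):
    """Initials of each word, a delimiter, then the relative cell address."""
    initials = "".join(word[0] for word in text.split())
    return f"{initials}|{column_letters(col)}{row}"


id_a1 = make_id("Customer Country", 1, 1)
id_b1 = make_id("Customer City", 1, 2)
print(id_a1, id_b1)  # CC|A1 CC|B1
```

Both strings share the initials CC, yet the IDs differ because the address after the delimiter is unique per cell.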

Using the LEFT function to extract everything before a number doesn't work well with spilled arrays

I am currently trying to extract the prefix of a store ID so that I can then generate a list of stores with only that prefix.
Cell D1 has this formula to extract the unique prefixes:
=TRANSPOSE(UNIQUE(LEFT(C2#, MIN(FIND({0,1,2,3,4,5,6,7,8,9},C2#&"0123456789"))-1)))
Cell C2 has this formula to extract the unique store IDs from another sheet:
=UNIQUE(INDEX('Male Shoes'!A1#,,6))
The problem is that the formula in D1 returns only the first two characters for every store ID instead of the correct prefix for each one.
In column I, I set up the same formula as in D1 but without the TRANSPOSE() and UNIQUE() functions and with the # removed, to see whether it would return the correct value. I dragged it down the length of column C.
=LEFT(C2, MIN(FIND({0,1,2,3,4,5,6,7,8,9},C2&"0123456789"))-1)
In cell J2 I put the same formula as in I2 but kept the # as a control.
=LEFT(C2#, MIN(FIND({0,1,2,3,4,5,6,7,8,9},C2#&"0123456789"))-1)
I believe the MIN() function is returning the minimum for the entire array and not for each row. I haven't found how to mitigate that problem anywhere online.
In my sample data that is not a problem, since all the columns in D through G gave me the lists I was expecting, but as more countries get added I might end up with duplicate country prefixes (e.g. if the prefixes get shortened to 2 characters, Germany = GE and Georgia = GE).
If the prefix is always two or three characters, a simple IF instead of FIND will work:
=TRANSPOSE(UNIQUE(LEFT(C2#,IF(ISNUMBER(--MID(C2#,3,1)),2,3))))
Edit:
If there is only one time that it switches from alpha to numeric we can use:
=TRANSPOSE(UNIQUE(LEFT(C2#,MMULT(ISNUMBER(--MID(C2#,SEQUENCE(,MAX(LEN(C2#))),1))*ISERROR(--MID("A"&C2#,SEQUENCE(,MAX(LEN(C2#))),1)),SEQUENCE(MAX(LEN(C2#))))-1)))
This does not care how many characters are in the string, only that there is only 1 time that it switches from alpha to numeric. So ABDEFGHTEV4567 will work but A3D4 will not.
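The underlying requirement is that the position of the first digit is computed per value rather than once for the whole array. A minimal Python sketch of that per-row logic is shown here for comparison; the function name `store_prefix` and the sample store IDs are assumptions.

```python
import re


def store_prefix(store_id):
    """Return everything before the first digit (the whole ID if no digit)."""
    match = re.search(r"\d", store_id)
    return store_id if match is None else store_id[:match.start()]


# Hypothetical store IDs; Germany and Georgia keep distinct prefixes.
store_ids = ["GE001", "GEO002", "USA10", "GE003"]
# dict.fromkeys removes duplicates while preserving order, like UNIQUE().
prefixes = list(dict.fromkeys(store_prefix(s) for s in store_ids))
print(prefixes)  # ['GE', 'GEO', 'USA']
```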

VLOOKUP - lookup value may be separated by commas

I have two lists, and I need to see whether values from the 1st list are also present in the 2nd list. However, due to the way my system is formatted, some cells in the 1st list contain multiple values that each need to be looked up.
If just one of the values is present in the 2nd list, the formula should print that value.
1st list values:
COLUMN A:
C00276129, CDK1029191
CAE031070
CAU029379
2nd list values:
COLUMN B:
CDK1029191
CAE031070
CUS0000000
CUS0000002
As you can see, in list one some of the values may be printed on the same row, separated by a comma.
I am trying to get VLOOKUP to search for both values in list 1 and compare them to the entire list 2:
=IFERROR(VLOOKUP(A1 & "*";B:B;1;FALSE);"Value not present")
However, the formula above just returns "Value not present", even though a value on the first row is indeed present in list 2.
You can use this "clumsy" formula to return just the value that was found when a row holds two comma-separated values:
=TRIM(IFERROR(VLOOKUP(LEFT(A2,FIND(",",A2,1)-1),B:B,1,FALSE),"")&" "&IFERROR(VLOOKUP(RIGHT(A2,LEN(A2)-FIND(",",A2,1)-1),B:B,1,FALSE),"")&" "&IFERROR(VLOOKUP(A2,B:B,1,FALSE),""))
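The lookup logic generalises to any number of comma-separated values per cell. Here is a short Python sketch of that idea using the sample lists from the question; the function name `find_present` and the choice to join several hits with ", " are assumptions.

```python
def find_present(cell, lookup_values, missing="Value not present"):
    """Return the comma-separated parts of cell that occur in lookup_values."""
    lookup = set(lookup_values)
    # Split on commas and strip the surrounding spaces from each part.
    found = [part.strip() for part in cell.split(",") if part.strip() in lookup]
    return ", ".join(found) if found else missing


list_a = ["C00276129, CDK1029191", "CAE031070", "CAU029379"]
list_b = ["CDK1029191", "CAE031070", "CUS0000000", "CUS0000002"]
results = [find_present(cell, list_b) for cell in list_a]
print(results)  # ['CDK1029191', 'CAE031070', 'Value not present']
```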
