Excel Substitute given values and delete non given values - excel

I have 3 columns named Old Model, New Model, & Obsolete Model. The New Model should be the output and is the New name from the Old Model table which is the input where as: AB
should be IJ, CD should be KL, and EFshould be MN. The Obsolete Model containing GH should not be included in the output of column New Model.
Old Model New Model Obsolete Model
AB,CD,EF,GH IJ,KL,MN GH
I can simply use the Substitute excel formula for this one which goes like this:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"AB","IJ"),"CD","KL"),"EF","MN")
The problem with this formula is that it will still show GH in the New Model column.
Can Anyone help me with this one? I have more than 1000 lines and within are different Old Model different New Model and different Obsolete Model
Is there a simple approch for this one without using VBA codes since the fili I am running is quite big and I'm affraid it might slow down the process.
Best Regards,

EDIT
Extract one old model from the cell like this:
=MID(A1,FIND("CD",A1),(len("CD")))
Result: AB,CD,EF,GH --> CD
Concatenate as many of those sequences as necessary to isolate all your old models.
=MID(A1,FIND("AB",A1),(len("AB")))&MID(A1,FIND("CD",A1),(len("CD")))&MID(A1,FIND("EF",A1),(len("EF")))
Then in a separate column, you an apply your original substitute formula to the result.
Add another substitute statement and substitute "GH" with nothing, "".
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"AB","IJ"),"CD","KL"),"EF","MN"), "GH", "")
Then to isolate "GH" in the obsolete column, just replace all the other values with nothing.
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"AB",""),"CD",""),"EF",""), "GH", "")
Make sure to add commas to the strings to be replaced if you want to get rid of those also, i.e. "AB,", "CD,", "EF,".

Related

Returning a column reference from MATCH to avoid using INDIRECT with a Named Range

TL;DR: I'm basically trying to obtain a column range such as 'Sheet 1'!$A:$A where the A is obtained by matching the contents of a given cell to a 1:1 range within a sheet referenced by another given cell, for use in a dynamic range.
In the highly probable case where that made zero sense, here's an illustration:
PARAMETERS: A2 = "LIST" | C2 = "FirstName" | Desired result: 'LIST'!$A:$A
And I've obtained that, BUT, I can't use that output ('LIST'!$A:$A) within formulas (namely to create a dynamic range). For instance, here 'LIST'!$A:$A contains 101 cells with values in them:
V3 = NamedFormula = 'LIST'!$A:$A
COUNTA(INDIRECT(V3)) = 101
COUNTA(INDIRECT(NamedFormula)) = 1 because it evaluates to #VALUE and that is a singular result.
Before delving into the topic of using INDIRECT with a Named Range (which I've read about and am still getting over my confused grief), I'm realizing my Names are getting a bit out of hand. I tend to use Excel like a mad scientist. So, in case there's a much simpler solution to what I'm trying to do, here's my actual mission:
0. I'm building a tool to simplify a process where email addresses are built from different data, which needs to run without any scripts, only formulas.
1. A tab with no imposed name would contain a user database with minimally (firstname and lastname OR IDs) AND (potentially other data columns) in no specific order. Tool users would import that tab from wherever the data got to them depending on the client, and would only need to copy-paste relevant headers to the main tab without changing anything else here for data integrity.
2. The main tab would have specific input fields where tool users would paste in the name of the imported tab as well as the labels of the columns they need (for instance, the labels in the first row of the columns containing the first name and the last name), and an input field for the domain name to use to build those email addresses.
3. A Data tab is referenced for cleaning and preparing strings for email address formats.
4. The Export tab would spew out a list of clean email addresses that can be exported to CSV.
The Data tab is just 2 columns to use with SUBSTITUTE so that for instance apostrophes are removed but accented letters are normalized (é -> e). I've used LAMBDA within Names to get there. The problem is to tie everything in - to get those Named ranges into the final formula.
The Names I'm using so far (I'd like to use fewer but testing specific parts extended beyond simple usage I fear):
ALPH ={"A";"B";"C";"D";"E";"F";"G";"H";"I";"J";"K";"L";"M";"N";"O";"P";"Q";"R";"S";"T";"U";"V";"W";"X";"Y";"Z"}
LABELS =LAMBDA(labelname,ADDRESS(2,MATCH(labelname,INDIRECT("'"&PARAMETERS!$A$2&"'!$1:$1"),0),1,1,PARAMETERS!$A$2))
RANGECOL =LAMBDA(labelname,COLUMN(INDIRECT(LABELS(labelname))))
RNCOL =LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label))&":$"&INDEX(ALPH,RANGECOL(label)))
I haven't tied everything in the Data tab yet - I'm still trying to automate my main tab before pushing further and using the Data tab substitutions on top of everything. That will be the next step, not my current focus. But, for the curious and interested, on the Data tab I'm using something something I found on ablebits which works wonders =]
So, now if I use the offset range with a static LIST!A:A it works:
=IF($C$2<>"",LOWER(INDEX(OFFSET(INDIRECT(ADDRESS(2,MATCH($C$2,INDIRECT("'"&$A$2&"'!$1:$1"),0),1,1,$A$2)),0,0,COUNTA(LIST!A:A)-1,1),ROW())),"") &IF($C$3<>"","."&LOWER(INDEX(OFFSET(INDIRECT(ADDRESS(2,MATCH($C$3,INDIRECT("'"&$A$2&"'!$1:$1"),0),1,1,$A$2)),0,0,COUNTA(LIST!A:A)-1,1),ROW())),"") &"#"&$C$4
But when I try to use the dynamic RNCOL($C$3) it does not:
=IF($C$2<>"",LOWER(INDEX(OFFSET(INDIRECT(LABELS($C$2)),0,0,COUNTA(INDIRECT(RNCOL($C$2)))-1,1),ROW())),"") &IF($C$3<>"","."&LOWER(INDEX(OFFSET(INDIRECT(LABELS($C$3)),0,0,COUNTA(INDIRECT(RNCOL($C$3)))-1,1),ROW())),"") &"#"&$C$4
This just gives #REF, and evaluating shows the digression starting at INDIRECT(RNCOL($C$3)) equating to #VALUE.
I'm starting to see double here but my undying and completely normal love for Excel prevents me from going home from work as I'm way too far down the rabbit hole to let my obsession die here.
Any pointers as to how this can work?
Note - all of the names in the supplied sheet were generated by an online fake name generator, nothing in here is actual user data #GDPR
Thanks in advance! <3
Test sheet is available via Google Drive.
Your current set-up is not good for many reasons, and in my opinion would require a complete overhaul, the scope of which lies beyond a response on this website.
As to a 'quick fix' to your current issue, the reason your formula in E1 is currently returning an error is due to the fact that, as you can see via stepping through with the Evaluate Formula tool, the part
COUNTA(INDIRECT(RNCOL($C$2)))-1
is resolving to
COUNTA(INDIRECT({"'LIST'!$A:$A"}))-1
and this is not the same as
COUNTA(INDIRECT("'LIST'!$A:$A"))-1
in that the value being passed to INDIRECT is an array in the former though not in the latter. Although INDIRECT can accept arrays, it is only within certain constructions in conjunction with other suitable functions; here it will simply error.
And the reason that it is returning an array is due to the fact that RNCOL($C$2) is returning an array, and that is because that function is defined as
=LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label))&":$"&INDEX(ALPH,RANGECOL(label)))
and, since RANGECOL($C$2) resolves to 1 here, the above is equivalent to
"'PARAMETERS!$A$2'!$"&INDEX(ALPH,1)&":$"&INDEX(ALPH,1)
Here, because you are omitting the column_num parameter from INDEX, the part
INDEX(ALPH,1)
is resolving to
{"A"}
which is an array (albeit one comprising a single value) and technically different from
"A"
In most circumstances, this is not an issue. As such, it is almost always unnecessary to pass both a row_num and column_num parameter to INDEX when indexing a one-dimensional array. Here, however, it matters.
You can resolve this by explicitly including a column_num parameter, i.e. redefine RNCOL as
=LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label),1)&":$"&INDEX(ALPH,RANGECOL(label),1))

Excel array formula to multi dimensional intermediate arrays

I have the following data
What I want to achieve is these intermediate arrays
i.e. for JohnProject there are two entries A and C. For Project A there are two tags 10 and 6 in the 2nd table, and for Project C there are three tags 22,9,7. I want to get these as intermediate arrays in a bigger array formula.
With array formula
=UNIQUE(IF({"D"}=D2:D9,ROW(D2:D9)))
I can achieve the closer result for JuliaProject D and 4,8 but for any other who has two projects it doesnt work.
=UNIQUE(IF({"A","C"}=D2:D9,ROW(D2:D9)))
I want to have intermediate results in the form for arrays to be used in other array formula.
EDIT: The complex version of my problem.
Ok for the bigger problem. Its a bit complex to explain. But I can try. I have a mapping table, truth table and Input data. I want to find if the input data can be mapped using the truth table and mapping table. So its a bit of a mapping I want to achieve at the end. Mapping table itself is a many-many where one project can have multiple tags, as well as one tag can associated with multiple projects, but there is no Person in there. Purely projects and tags. The truth table has Person and Projects but not the tags. I want to map From the Unique person to a Tag. I can easily get all the projects associated with one person (In array formula as an intermediate array) and now for each project that I get in an array I want to find out all the tags and make them consolidated in one array. The problem lies in the many-many relationship in the mapping table, as it returns multidimensional array as a result not a single dimension. I want to get an array (intermediate in array formula) to finally search if a claimed tag is in that list. I hope it makes sense.
The following array formula can be used to join USER with a TAG:
{=AGGREGATE(15,6,INDEX($E$2:$E$9,N(IF(MMULT(--IFERROR(TRANSPOSE(INDEX($B$2:$B$6,N(IF($G2=$A$2:$A$6,ROW($B$2:$B$6)))-1))=$D$2:$D$9,FALSE),ROW($B$2:$B$6)^0),ROW($D$2:$D$9)))-1),COLUMNS($G$2:G2))}
I don't know where the formula will be used next, but to display the result I put it into the AGGREGATE function. Without AGGREGATE, the array is returned including error values.
"Clean" formula:
{=INDEX($E$2:$E$9,N(IF(MMULT(--IFERROR(TRANSPOSE(INDEX($B$2:$B$6,N(IF($G2=$A$2:$A$6,ROW($B$2:$B$6)))-1))=$D$2:$D$9,FALSE),ROW($B$2:$B$6)^0),ROW($D$2:$D$9)))-1)}
The following formula will display "Array result without error values" by hitting the highlighted formula inside the formula bar with keystroke F9
1] "Project" array result, in H2 formula copied down :
=INDEX(B$2:B$6,N(IF(1,AGGREGATE(15,6,ROW(B$1:B$5)/($A$2:$A$6=G2),ROW(INDIRECT("1:"&COUNTIF(A$2:A$6,G2)))))))
2] "Tag" array result, in I2 formula copied down :
=INDEX(E$2:E$9,N(IF(1,AGGREGATE(15,6,ROW(E$1:E$8)/ISNUMBER(MATCH(D$2:D$9,IF(A$2:A$6=G2,B$2:B$6),0)),ROW(INDIRECT("1:"&COUNT(MATCH(D$2:D$9,IF(A$2:A$6=G2,B$2:B$6),0))))))))

excel vba Delete entire row if cell contains the GREP search

I have a single column of text in Excel that is to be used for translating into foreign languages. The text is automatically generated from an InDesign File. I would like to clean it up for the translator by removing rows that simply contain a number ("20", 34.5" etc), or if they contain a measurement "5mm", "3.5 µm", etc. I've found many posts (see link below) on how to remove a row with specific string, but none that use search strings, such as those I typically use with GREP searches: "\d+" and "\d.\d µm"
How would I do this? I am on Mac iOS if that helps.
Note that I would need to delete the row if the cell only contains a number or a measurement, not if the number is contained within a phrase, sentence, or paragraph, etc.
https://stackoverflow.com/a/30569969
It may not be what you are looking for, but how about just sorting the column and remove the rows starting with numbers? It is a manual approach but from what I understand this translation process only happens from time to time. Am I right?
I see two possible issues in your question:
How to work with regular expressions in Excel?
How to delete rows in a loop?
Let me start with the second question: when you want to create a for-loop in order to remove items from a list, you MUST start at the end and go back to the beginning (it's a beginner's trick, but a lot of people trip over it.
About the first question: this is a very useful post about this subject, it's too large to even give a summary here.

Requesting help editing a formula in a TM1 worksheet

The said TM1 worksheet uses the DBRW formula to write values that users enter, to a cube, and also uses the same to fetch the value and display it in the worksheet. The values in the cube consist of movie codes such as 7500023. This movie code can be mapped to a movie title in a dimension DIM. It is to be noted that this movie code and the movie title are both aliases of the principal name used in the dimension, which runs along like 0007500023 (The movie code with zeroes preceding). I'd like the movie titles to be displayed in the worksheet instead of the movie codes.
I tried using the SUBNM function, but it opens up a subset editor and also doesn't write values to the cube like DBRW. So, that's ruled out.
The DBRA function seemed perfect when it came to fetching the movie titles from the dimension DIM. But this doesn't write the values to the cube.
Is there any way I can combine the DBRA and the DBRW function in a formula or is there an alternative function for this purpose?
I will add an extra column which will extract the alias = DBRA.
And then I will base your IF statement on the new column.
Otherwise, if you don't need the code, then why not use the alias in the first place? Export data with the alias on. Then you don't even need to retrieve the alias, because it's already there.
It just doesn't make sense to put them both in one formula.
When you recalculate - excel assess the formula once.
But you're expecting it to assess it multiple times. (hence the circular reference)

Sharepoint: Calculated Column replace all spaces

Seems like it would be a simple thing really (and it may be), but I'm trying to take the string data of a column and then through a calculated column, replace all the spaces with %20's so that the HTML link in the workflow produced email will actually not break off at the first space.
For example, we have this in our source column:
file:///Z:/data/This is our report.rpt
And would like to end up with this in the calculated column:
file:///Z:/data/This%20is%20our%20report.rpt
Already used the REPLACE, and made up a ghastly super nested REPLACE/SEARCH version, but the problem there is that you have to nest for EACH potential space, and if you don't know how many up front, it doesn't work, or will miss some.
Have any of you come across this scenario and how did you handle it?
Thanks in advance!
As far as I know there is no generic solution using the calculated-column syntax. The standard solution for this situation is using an ItemAdded (/ItemUpdated) event and initializing the field value from code.
I was able to solve this issue for my circumstances by using a series of calculated columns.
In the first calculated column (C1) I entered a formula to remove the first space, something like this:
=IF(ISNUMBER(FIND(" ",[Title])),REPLACE([Title],FIND(" ",[Title]),1,"%20"),[Title])
In the second Calculated column (C2) I used:
=IF(ISNUMBER(FIND(" ",[C1])),REPLACE([C1],FIND(" ",[C1]),1,"%20"),[C1]).
In my case, I wanted to encode upto four spaces, so I used 3 calculated columns (C1, C2, C3) in the same fashion and got the desired result.
This is not as efficient as using a single calculated column, but if SUBSTITUTE will not work in your SharePoint environment, and you cannot use an event handler or workflow, it may offer a workable alternative.
I actually used a slightly different formula, but it was on a work machine to which I don't have access at the moment, so I just grabbed this formula from a similar S.O. question. Any formula that will replace the first occurrence of a space with "%20" will work, the trick is to a) make sure the formula returns the original string unchanged if it does not have more spaces in it, and b) test, test, test. Create a view of your list that has the field you are trying to encode, plus the calculated fields, and see if you are getting the results you want.
so that the HTML link in the workflow produced email will actually not break off at the first space.
The browser only does this if you have not enclosed your link in quotes
If you wrap the link in quotes, it does not cut off at the first space
In a SharePoint Formula it would be:
="""file:///Z:/data/This is our report.rpt"""
becuase two quotes are the SP escape notation to output a quote
You can use this formula (Start trim for 1, in my case was 4):
=IF(ISBLANK([EUR Amount]),"",(TRIM(MID([EUR Amount],4,2))&TRIM(MID([EUR Amount],6,2))&TRIM(MID([EUR Amount],8,2))&TRIM(MID([EUR Amount],10,2))&TRIM(MID([EUR Amount],12,2))&TRIM(MID([EUR Amount],14,2)))*1)

Resources