Replacing a section of the data in a cell for thousands of excel data - excel

I have a large spreadsheet with column data like:
ABC:1:I.0
ABC:1:I.1
ABC:1:I.2
ABC:1:I.3
ABC:2:I.0
ABC:2:I.1
ABC:2:I.2
ABC:2:I.3
ABC:3:I.0
ABC:3:I.2
ABC:3:I.3
ABC:4:I.0
ABC:4:I.1
ABC:4:I.2
ABC:4:I.3
ABC:5:I.0
ABC:5:I.1
ABC:5:I.2
ABC:5:I.3
ETC.
I need to replace the above with the following:
ABC:I.Data[1].0
ABC:I.Data[1].1
ABC:I.Data[1].2
ABC:I.Data[1].3
ABC:I.Data[2].0
ABC:I.Data[2].1
ABC:I.Data[2].2
ABC:I.Data[2].3
ABC:I.Data[3].0
ABC:I.Data[3].2
ABC:I.Data[3].3
ABC:I.Data[4].0
ABC:I.Data[4].1
ABC:I.Data[4].2
ABC:I.Data[4].3
ABC:I.Data[5].0
ABC:I.Data[5].1
ABC:I.Data[5].2
ABC:I.Data[5].3
ETC.
Here is a sample of the data, most of the data follows a similar format with the exception of the naming "ABC", which can vary in size, so it might be "ABCD" and also with the exception of the letter "I", it can be "O" as well. Also, some might be missing some values such as ABC:3:I.1 which is missing from the data. I am not too familiar with excel formulas or VBA code. Does anyone know how to do this? I have no preference on which method it has to be done in as I don't mind learning some VBA code if someone provides me with a VBA solution.
I was thinking of using some sort of loop along with some conditional statements.
Thanks!

Please try:
=LEFT(F11,FIND(":",F11))&MID(F11,FIND(":",F11,6)+1,1)&".Data["&MID(F11,FIND(":",F11,2)+1,1)&"]."&RIGHT(F11,1)
copied down to suit, assuming placed in Row11 and your data is in ColumnF starting in Row11.
Curiosities:
When this A was first posted it attempted to address only the tabulated example input and output. I temporarily deleted that version while addressing that what was in the table as ABC might at times be ABCD and that what was I might at times be O.
OP has posted an answer that I edited to make no visible change but which shows as the deletion of two characters. A copy of the OP’s formula exhibited a syntax error prior to my edit.
OP suggested an edit to my answer but this was rejected by the review process. As it happens, I think the edit suggestion was incorrect.
I have edited my answer again to include these ‘curiosities’ and to match the cell reference used by the OP in his answer.

=LEFT(A1,SEARCH(":",A1)) & MID(A1, SEARCH(".",A1)-1, 2) &
"Data[" & MID(A1,SEARCH(":",A1)+1,1) & "]" & RIGHT(A1,2)

With the help of pnuts I was able to come up with my own solution:
=LEFT(F11,LEN(F11)-5)&MID(F11,LEN(F11)-2,2)&"Data["&MID(F11,LEN(F11)-4,1)&"]"&RIGHT(F11,2)
My solution works based on the fact that the length of the last six values in the string ABC:1:I:0 will always be the same in size for all the data I have, hence you see LEN(F11)-some number in my code. The only part of the string that changes in size is the first part, in this case ABC which can also be ABCDEF, etc.

If you'd like to use formulas rather than VBA, an easy option is to split the data into 4 columns, using the Text To Columns option - first split using the colon as a delimiter, then using a full-stop / period as a delimiter.
Once you have 4 columns of data (one for each block), you can use the Concatenate function to join them and add in the extra characters: =CONCATENATE(A1,":",C1,".","Data[",B1,"].",D1)
This should still work if you have extra / alternative characters (eg ABCD instead of ABC), as long as you have the same delimiters, but obviously you'd need to test to make sure.

Related

Counting number of occurences of a specific search string

I'm building a monitoring system that takes a log (where people register their work in a set format) and returns a counter, which I can use for analysis. The monitor and log are two separate workbooks. The log has entries like this: INITALS;DATE;HOUR:RESULT|
Each cell can contain multiple entries.
My first attempt was to do a simple countif and look for a string (note that I use ; instead of , in formulas since I work on a Dutch excel):
=COUNTIF('LOCATION'!Table[LOG];"*NB;??/??/????;??:??:#A*|*")
This worked fine, but the formula only counted the number of cells where this string was present, not the actual number of occurences. I then tried this solution.
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"NB";"")))
This indeed counted the number of times "NB" was present in the LOG. However, when I tried to use the original search string, this solution stopped working:
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"*NB;??/??/????;??:??:#A*|*";"")))
It seems to me that SUM does not recognize symbols like ? or * which are necessary to define the correct search string. Where did I go wrong? Or can this be solved in another way? I can still look into VBA, but the workbooks are slow as hell already.
"?" and "*" are wildcards. Some functions support these (like COUNTIFS()) where others don't. Like you found out, SUBSTITUTE() does not.
Here is one way to count, assuming ms365:
Formula in C1:
=REDUCE(0,A1:A2,LAMBDA(a,b,a+LET(X,SEQUENCE(LEN(b)),SUM(--(IFERROR(SEARCH("NB;??/??/????;??:??:#A*|*",b,X),0)=X)))))
Note: I removed the asterisk in front of "NB" just to make searching for a position valid in comparison to what i called variable "X".

Find value within cell from a series in Excel

I have a list of addresses, such as this:
Lake Havasu,  Lake Havasu City,  Arizona.
St. Johns River,  Palatka,  Florida.
Tennessee River,  Knoxville,  Tennessee.
I would like to extract the State from these addresses and then have a column showing the abbreviated State name (AZ, FL, TN etc.).
I have a table that has the States with their abbreviation and once I extract the State, doing a simple INDEX MATCH to get the abbreviation is easy. I don't want to use text-to-columns because this file will constantly have values added to it and it would be much easier to just have a formula that does the extraction for me.
The ways I've tried to approach this that have failed so far are:
Some kind of SEARCH() function that looks at the full State list and tries to find a value that exists in the cell
A MID or RIGHT approach to only capture the last section but I can't work out how to have FIND only look for the second ", "
A version of INDEX MATCH but that fails because I can't find a good way to search or find the values as per approach (1)
Any help would be appreciated!
Please try this formula, where A2 is the original text.
=FILTERXML("<data><a>" & SUBSTITUTE(A2,", ","</a><a>") & "</a></data>","data/a[3]")
An alternative would be to look for the 2nd comma as shown below. Note that the "50" in the formula is an irrelevant number required by the MID() function. It shouldn't be smaller than the number of characters you need to return, however.
Char(160) is a character that wouldn't (shouldn't) naturally occur in your text, as it might if the text comes from a UNIX database. You can replace it with another one that fits the description.
=TRIM(MID(A2, FIND(CHAR(160),SUBSTITUTE(A2,",",CHAR(160),2)) + 1,50))
The following variation of the above would remove the final period. It will fail if there is anything following the period, such as an unwanted blank. That could be accommodated within the formula as well but it would be easier to treat the original data, if that is an option for you.
=TRIM(MID(LEFT(A2, LEN(A2)-1), FIND(CHAR(160),SUBSTITUTE(A2,",",CHAR(160),2)) + 1,50))
To find the abbreviation I would recommend to use VLOOKUP rather than INDEX/MATCH.
Use this (screenshot refers):
=MID(MID(B3,1,LEN(B3)-1),SEARCH(",",B3,SEARCH(",",B3,1)+1)+3,LEN(B3))

How to search for items with multiple "-" in excel or VBA?

I have a list of item numbers (100K) like this:
Some of the items have format like SAG571A-244-4 (thousands) which need to be filtered so I can delete them and only keep the items that have ONE hyphen per SKU. How can I isolate the items that have two instances of "-" in it's SKU? I'm open to solutions within Excel or using VBA as well.
Native text filters don't seem to be capable of this. I'm stumped.
As per John Coleman's comment, "*-*-*" can be used to isolate strings that have at least two dashes in them.
I would add that if you're entering them as a custom text filter, you should lose the double quotes (so just *-*-*) as otherwise the field seems to interpret the quotes literally.
Seems to work for me.
If you want just an excel formula to verify this and give you a result of the number of hyphens (0, 1, or 2+), here is one:
=IF(ISERROR(SEARCH("-",A1)),"0",IF(ISERROR(SEARCH("-",A1,IFERROR(SEARCH("-",A1)+1,LEN(A1)))),"1","2+"))
Replace A1 with your relevant column, then fill down. This is kind of a terrible way to do this performance wise, but you avoid using VBA and possibly xlsm files.
The code first checks to see if there is one hyphen, then if there is it checks to see if there is another hyphen after the position the first one was found. Looking for multiple hyphens in this manner is cumbersome and I don't recommend it.

Vlookup Not working on text between two tables

This is not your average vlookup error.
I have two Power Query tables that I've setup. One is coming from a CSV file with a list of names. The other is from a website pulling a list of names.
i.e.
=John Smith = John Smith would not be true for some reason.
They vlookup should be able to find the name easily. I've tried proper,upper, clean, trimming and text to columns and everything else that I could think of. I've changed data types to no avail.
I know that one query is causing the issue. I can type the name exactly and do a vlookup from one, and it works. The second query that I do this to doesn't return anything on the typed text.
Anyone encounter this issue while using Power Query?
EDIT: See Jeeped's Answer - When I replace the space from the web query with a normal space it works.
#Jeeped's comment has a good answer:
Assuming you have already trimmed off leading and trailing spaces, one of the John Smith entries (likely the one from the web) uses a non-breaking space (e.e. CHAR(160) or ASCII 0×A0) instead of a regular space (e.g CHAR(32) or ASCII 0×20). Use
=CODE(MID(A$1, ROW(1:1), 1))
on both, fill down to get a ASCII code for each letter and compare the numbers.

Sharepoint: Calculated Column replace all spaces

Seems like it would be a simple thing really (and it may be), but I'm trying to take the string data of a column and then through a calculated column, replace all the spaces with %20's so that the HTML link in the workflow produced email will actually not break off at the first space.
For example, we have this in our source column:
file:///Z:/data/This is our report.rpt
And would like to end up with this in the calculated column:
file:///Z:/data/This%20is%20our%20report.rpt
Already used the REPLACE, and made up a ghastly super nested REPLACE/SEARCH version, but the problem there is that you have to nest for EACH potential space, and if you don't know how many up front, it doesn't work, or will miss some.
Have any of you come across this scenario and how did you handle it?
Thanks in advance!
As far as I know there is no generic solution using the calculated-column syntax. The standard solution for this situation is using an ItemAdded (/ItemUpdated) event and initializing the field value from code.
I was able to solve this issue for my circumstances by using a series of calculated columns.
In the first calculated column (C1) I entered a formula to remove the first space, something like this:
=IF(ISNUMBER(FIND(" ",[Title])),REPLACE([Title],FIND(" ",[Title]),1,"%20"),[Title])
In the second Calculated column (C2) I used:
=IF(ISNUMBER(FIND(" ",[C1])),REPLACE([C1],FIND(" ",[C1]),1,"%20"),[C1]).
In my case, I wanted to encode upto four spaces, so I used 3 calculated columns (C1, C2, C3) in the same fashion and got the desired result.
This is not as efficient as using a single calculated column, but if SUBSTITUTE will not work in your SharePoint environment, and you cannot use an event handler or workflow, it may offer a workable alternative.
I actually used a slightly different formula, but it was on a work machine to which I don't have access at the moment, so I just grabbed this formula from a similar S.O. question. Any formula that will replace the first occurrence of a space with "%20" will work, the trick is to a) make sure the formula returns the original string unchanged if it does not have more spaces in it, and b) test, test, test. Create a view of your list that has the field you are trying to encode, plus the calculated fields, and see if you are getting the results you want.
so that the HTML link in the workflow produced email will actually not break off at the first space.
The browser only does this if you have not enclosed your link in quotes
If you wrap the link in quotes, it does not cut off at the first space
In a SharePoint Formula it would be:
="""file:///Z:/data/This is our report.rpt"""
becuase two quotes are the SP escape notation to output a quote
You can use this formula (Start trim for 1, in my case was 4):
=IF(ISBLANK([EUR Amount]),"",(TRIM(MID([EUR Amount],4,2))&TRIM(MID([EUR Amount],6,2))&TRIM(MID([EUR Amount],8,2))&TRIM(MID([EUR Amount],10,2))&TRIM(MID([EUR Amount],12,2))&TRIM(MID([EUR Amount],14,2)))*1)

Resources