Spotfire - Extract text based on conditions - text

I have a column containing a string value as shown in the example below:
ZAE/GER-ERT/HEZ/PDC
The idea is to extract the first trigraph (ZAE in this example) and a second one based on a rule.
The rule is: if a '-' joins two trigraphs, we skip both of them and take the first trigraph that follows a '/' and has no '-' after it.
We then use a '-' to join the two results; for the example, the target is ZAE-HEZ.
I would like to get this value in a new calculated column.
I've tried to play with the indexes based on the Find() and ExtractRX() functions, but couldn't make it work.
Thanks in advance !

I am not sure this is the simplest way, but it works for your example (assuming the strings are always alphanumeric in chunks of 3).
You can do it via an intermediate column (for sanity, although you could put the [tmp] formula directly into the final column):
[tmp] as
RXReplace(RXReplace([your_column],'\\w{3}-\\w{3}','','g'),'/+','/','g')
This removes any hyphenated trigraph pair like GER-ERT and then collapses any leftover double /.
The final column then splits [tmp] by / and concatenates the first and second items:
Concatenate(Split([tmp],'/',1),'-',Split([tmp],'/',2))
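For readers who want to check the logic outside Spotfire, here is a minimal Python sketch of the same three steps. The function name extract_pair is made up for illustration, and the sketch assumes at least two plain trigraphs remain once the hyphenated pairs are removed.

```python
import re

def extract_pair(value: str) -> str:
    # Step 1: drop hyphenated trigraph pairs such as "GER-ERT".
    tmp = re.sub(r"\w{3}-\w{3}", "", value)
    # Step 2: collapse the runs of "/" left behind by the removal.
    tmp = re.sub(r"/+", "/", tmp)
    # Step 3: join the first two remaining items with "-".
    # Assumption: at least two items remain, otherwise this raises IndexError.
    parts = tmp.split("/")
    return f"{parts[0]}-{parts[1]}"

print(extract_pair("ZAE/GER-ERT/HEZ/PDC"))  # ZAE-HEZ
```

Note that the Python pattern uses single backslashes in a raw string, whereas the Spotfire expression doubles them.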

Related

Index Match Multiple Criteria in Excel, looking at a partial match

User Form Input
Loading Kit Reference
I have several columns of user input which needs to match each column in a reference sheet to return a given drawing number.
My initial attempt:
=INDEX('Loading Kits'!A$2:A$113,MATCH(1,('Shop Orders'!B5='Loading Kits'!C$2:C$113)*('Shop Orders'!E5='Loading Kits'!D$2:D$113)*('Shop Orders'!G5='Loading Kits'!E$2:E$113)*('Shop Orders'!H5='Loading Kits'!F$2:F$113)*('Shop Orders'!I5='Loading Kits'!G$2:G$113),0))
This works great when reference sheets only have one option for size ('Shop Orders'!B5='Loading Kits'!C$2:C$113).
How do I create a match when there are several (up to 6) options listed in one column delimited by commas (24C,24D,26A,26B,26AV,26BV)?
This is one of the few cases where I would say concatenation inside INDEX and MATCH is safe to work with, provided your values are joined by commas without a space (otherwise the formula is slightly different). Let me give you a minimal example of how to get something like that to work:
The formula in F3:
=INDEX(A2:A4,MATCH(1,INDEX((B2:B4=F1)*(ISNUMBER(FIND(","&F2&",",","&C2:C4&","))),),0))
If your values are separated by a comma and a space, you have to concatenate differently:
=INDEX(A2:A4,MATCH(1,INDEX((B2:B4=F1)*(ISNUMBER(FIND(", "&F2&",",", "&C2:C4&","))),),0))
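The reason for wrapping both the search value and the list in commas is to avoid partial hits (searching for 26A must not match 26AV). A rough Python sketch of that idea follows; the rows, drawing numbers, and the name find_drawing are invented for illustration and are not taken from the workbook.

```python
def find_drawing(rows, kit, size):
    """rows: (drawing, kit, sizes) tuples, where sizes is a comma-delimited string."""
    needle = f",{size},"              # like ","&F2&","
    for drawing, row_kit, sizes in rows:
        haystack = f",{sizes},"       # like ","&C2:C4&","
        # Both criteria must hold, like multiplying the two conditions in MATCH.
        if row_kit == kit and needle in haystack:
            return drawing
    return None

# Hypothetical reference data.
rows = [
    ("DWG-1", "K1", "24C,24D"),
    ("DWG-2", "K1", "26AV,26BV,26A,26B"),
    ("DWG-3", "K2", "26A"),
]
print(find_drawing(rows, "K1", "26A"))  # DWG-2
print(find_drawing(rows, "K1", "26"))   # None, because "26" is only a partial match
```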

Count the number of individual entries in a cell

Is it possible to count the number of individual entries in a cell?
For example 2+2+4-1 = 4 entries
Using the count formula counts the entries as 1
I want to calculate the number of adjustments made in a particular period.
Each +/- in an individual cell represents 1 adjustment.
Assuming you're referring to a text cell, the trick is to count the symbols you'd like to find. Before we dig into that, if you want to enter this data as text, you can type an apostrophe (') before your text to make sure it gets processed as text.
If you want to verify that it is text, you can use the TYPE function and look for a return result of 2 (check the link for other possible return types)
There is no direct function for counting occurrences of a character in Excel, so the trick is to take the length of the original text and subtract from it the length of a copy in which all of the special characters have been removed. You mentioned you were trying to count the entries (i.e. the numbers), but you said your goal was ultimately to count the number of '+/-' operations. Since counting numbers can be tricky with Excel formulas (we'd get hung up on 2 and 3 digit numbers), I am going to approach this problem from the perspective of counting the operations you are looking for. So here is a basic example:
length("2+1") = 3
- length("21") = 2 (we replaced the + with "" [blank])
= 1
So we know there is 1 '+' since we replaced it. The appropriate functions used to accomplish this are LEN and SUBSTITUTE
Since you can only find one symbol at a time using the SUBSTITUTE function, we must take the output of the first formula, and give it to the second formula, and so on and so forth. Ultimately, we can put together as many functions as we need to achieve the desired result.
So we start with + for your example (And assuming your data is in A1)
=LEN(A1)-LEN(SUBSTITUTE(A1,"+",""))
which gives us a result of 2. But we also need to find the - symbol. So we wrap another SUBSTITUTE:
=LEN(A1)-LEN(SUBSTITUTE(SUBSTITUTE(A1,"+",""),"-",""))
You have said you wanted to count the number of +/- in the cell, and this does accomplish that, but if you want to expand it to more mathematical operators, you simply add more SUBSTITUTE functions (here is a complete function where I've added * and /)
=LEN(A1)-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"+",""), "-",""),"*",""),"/",""))
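The same length-difference trick can be sketched in Python to sanity-check the counts. The function name count_operators is made up for illustration, and the input is assumed to be plain text like the examples above.

```python
def count_operators(text: str, operators: str = "+-") -> int:
    stripped = text
    for op in operators:
        # One replace per symbol, like one nested SUBSTITUTE per symbol.
        stripped = stripped.replace(op, "")
    # Characters removed = number of operators found.
    return len(text) - len(stripped)

print(count_operators("2+2+4-1"))          # 3
print(count_operators("2+2*4/1", "+-*/"))  # 3
```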
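Alternatively, this formula replaces all the digits with "" so that only the +/- symbols remain, counts them, and adds one. It should do the job, but it is ugly (note the ; argument separators, which depend on your regional settings; swap them for , if needed):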
=LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1;"0";"");"1";"");"2";"");"3";"");"4";"");"5";"");"6";"");"7";"");"8";"");"9";""))+1
It could probably be done with RegEx, but I don't know how to do that in formulas.
This turns 31+12+3-4 into ++-, counts the LEN (3), and adds 1, giving 4.
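As a rough illustration of the regex idea outside Excel, a Python sketch of the digit-stripping approach might look like this. It assumes the cell holds only digits and operators, with no spaces and no leading sign; count_entries is a hypothetical name.

```python
import re

def count_entries(text: str) -> int:
    # Remove every digit so only the operators remain, e.g. "31+12+3-4" -> "++-".
    operators_only = re.sub(r"[0-9]", "", text)
    # N operators separate N + 1 entries.
    return len(operators_only) + 1

print(count_entries("31+12+3-4"))  # 4
```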

retrieve part of the info in a cell in EXCEL

I vaguely remember that it is possible to parse the data in a cell and keep only part of the data after setting up certain conditions. But I can't remember what exact commands to use. Any help/suggestion?
For example, A1 contains the following info
0/1:47,45:92:99:1319,0,1320
Is there a way to pick up, say, 0/1 or 1319,0,1320 and remove the rest unchosen data?
I know I can do text-to-column and set the delimiter, followed by manually removing the "un-needed" data, but my EXCEL spreadsheet contains 100 columns X 500000 rows with each cell looking similar to the data above, so I am afraid EXCEL may crash before finishing the work. (have been trying with LEFT, LEN, RIGHT, MID, but none seems to work the way I had hoped)
Any suggestion will be greatly appreciated.
I think what you are looking for is combination of find and mid, but you'll have to work out exactly how you want to split your string:
A1 = 0/1:47,45:92:99:1319,0,1320 //your string
B1 = FIND(":",A1) //position of the first ":" symbol
C1 = LEN(A1) - B1 //number of characters after the first ":"
=LEFT(A1,B1-1) //everything to the left of the symbol
=MID(A1,B1+1,C1) //everything to the right of the symbol (you can also replace C1 with a smaller number to extract only one portion)
//You can also nest the statements to save cells, but usually at the cost of more processing
This is the concept, you will probably need to do it in multiple cells to split a string as long as yours. For multiple splits you probably want to replicate this command to target the result of previous right/mid command.
That way, you will get cell result sequence like:
0/1:47,45:92:99:1319,0,1320; 47,45:92:99:1319,0,1320; 92:99:1319,0,1320; 99:1319,0,1320......
From each of those you can retrieve left side of the string up to ":" to get each portion of a string.
If you are working with a large table you probably want to look into VB scripting. To my knowledge there is no single excel command that can take 1 cell and split it into multiple ones.
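To make the repeated find-and-cut idea concrete, here is a small Python sketch of the loop (Python rather than VB, purely for illustration; split_fields is a made-up name). Each pass keeps the part to the left of the first ":" and continues with the remainder.

```python
def split_fields(text: str, sep: str = ":") -> list[str]:
    fields = []
    rest = text
    while True:
        pos = rest.find(sep)       # like FIND, but 0-based and -1 when absent
        if pos == -1:
            fields.append(rest)    # no separator left: the remainder is the last field
            return fields
        fields.append(rest[:pos])  # like LEFT, up to but excluding the separator
        rest = rest[pos + 1:]      # like MID, starting just after the separator

print(split_fields("0/1:47,45:92:99:1319,0,1320"))
# ['0/1', '47,45', '92', '99', '1319,0,1320']
```

In Python itself, text.split(":") gives the same list in one call; the loop is spelled out only to mirror the cell-by-cell approach.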
Let me try to help with this; I am not a professional, so you may run into some problems. My solution adds two helper columns next to the source column. You can build on the formulas using the same principle.
Column B Formula:
=LEFT(A2,FIND(":",A2,1)-1)
Column C Formula:
=RIGHT(A2,LEN(A2)-FIND("|",SUBSTITUTE(A2,":","|",LEN(A2)-LEN(SUBSTITUTE(A2,":","")))))
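The Column C formula works by counting the ":" characters, replacing only the last one with "|", and taking everything to the right of that marker. A minimal Python sketch of what the two columns return is shown below (first_and_last is a hypothetical name); Python can split from the right directly, so no marker character is needed.

```python
def first_and_last(text: str, sep: str = ":") -> tuple[str, str]:
    first = text.split(sep, 1)[0]    # everything before the first separator (Column B)
    last = text.rsplit(sep, 1)[-1]   # everything after the last separator (Column C)
    return first, last

print(first_and_last("0/1:47,45:92:99:1319,0,1320"))  # ('0/1', '1319,0,1320')
```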
Given your statement about having 100 columns, I imagine that in some instances you need to isolate characters in the middle of your string, so Left and Right may not always work. Use them where you can, though.
Assuming your string is in cell F2: 0/1:47,45:92:99:1319,0,1320
=LEFT(F2,3)
This returns 0/1 which are the first 3 characters in the string counting from the left. Likewise, Right functions similarly:
=RIGHT(F2,4)
This returns 1320, returning the 4 characters starting from the right.
You can use a combination of Mid and Find to dynamically locate characters or strings based on defined delimiter characters. Here are a few examples of ways to dynamically isolate values in your string. Keep in mind that the key to these examples is the nested Find formula, where the innermost Find determines the character at which the next Find starts searching.
1) Return 2 characters after the second : character
In cell F2 I need to isolate the "92":
=MID(F2,FIND(":",F2,FIND(":",F2)+1)+1,2)
The innermost Find locates the first : in the string (4 characters in). We add the +1 to move to the 5th character (moving beyond the first : so the second Find will not see it), and the second Find starts looking for : again from that character. This second Find returns 10, as the second : is the 10th character in the string. The outer +1 then moves the start position to the 11th character, and the Mid formula takes over from there: starting at the 11th character, it returns 2 characters. Returning two characters is dictated by the 2 at the end of the formula (the last part of the Mid formula).
2) In this case I need to find the 2 characters after the 3rd : in the string. In this case "99":
=MID(F2,FIND(":",F2,FIND(":",F2,FIND(":",F2)+1)+1)+1,2)
You can see we have simply added one more nested Find to the formula in example 1.
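The nesting pattern generalizes to "find the n-th delimiter, then read a fixed number of characters after it". Here is a minimal Python sketch of that pattern, with a loop standing in for the nested Find calls (chars_after_nth is a made-up name, and Python positions are 0-based where Excel's are 1-based).

```python
def chars_after_nth(text: str, sep: str, n: int, count: int) -> str:
    pos = -1
    for _ in range(n):
        # Each search starts one character past the previous hit,
        # which is what the "+1" does between the nested Find calls.
        pos = text.find(sep, pos + 1)
        if pos == -1:
            raise ValueError(f"fewer than {n} occurrences of {sep!r}")
    # The final "+1" skips the delimiter itself, like the outer +1 in Mid.
    return text[pos + 1 : pos + 1 + count]

s = "0/1:47,45:92:99:1319,0,1320"
print(chars_after_nth(s, ":", 2, 2))  # 92
print(chars_after_nth(s, ":", 3, 2))  # 99
```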

How to modify numbers at the end of a cell using a formula

I have cells in excel containing data of the form v-1-2-1, v-1-2-10, v-1-2-100. I want to convert it to v-1-2-001, v-1-2-010,v-1-2-100. I have nearly 2000 entries
If all of the data follows the format shown then you could use FIND to return the position of '-'. There will be three instances of this character and you need to find the third one so use the position given by the first instance as the start position parameter of the second FIND and again for the third (essentially nesting FIND). Once you have the position of the third '-' you know where the final set of numbers are (from the returned third position+1 to the LEN of the string) and could use SUBSTITUTE or a combination of other excel string functions to configure the final portion as you need it.
I'm assuming that excel has your data formatted as text.
If you need further assistance I'm happy to knock up the formula in excel but I'm off to work now and won't be able to do so for around 9 hours.
Please try:
=LEFT(A1,6)&TEXT(MID(A1,7,10),"000")
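The formula above relies on the prefix always being 6 characters long (v-1-2-). As a cross-check, here is a small Python sketch that splits at the last '-' instead, which is a slightly different technique from the fixed-width LEFT/MID; pad_last_segment is a hypothetical name, and the input is assumed to end in '-' followed by digits.

```python
def pad_last_segment(code: str, width: int = 3) -> str:
    # Split at the LAST hyphen: "v-1-2-10" -> ("v-1-2", "-", "10").
    prefix, _, number = code.rpartition("-")
    # Zero-pad the trailing number, like TEXT(...,"000").
    return f"{prefix}-{int(number):0{width}d}"

for code in ["v-1-2-1", "v-1-2-10", "v-1-2-100"]:
    print(pad_last_segment(code))
# v-1-2-001
# v-1-2-010
# v-1-2-100
```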

How to find a string within a string

I have a list of about 100,000 site link strings.
Each link is unique, but each one contains a consistent ?Item=
After the item number, the string either ends or continues after an & symbol.
My question is: How do I pull out the item numbers?
I know replace function can offer similar functionality, but it works with Fixed sizes, in my case string can be different in size.
Link example:
www.site.com?sadfsf?sdfsdf&adfasfd?Item=JGFGGG55555
or
www.site.com?sadfsf?sdfsdf&adfasfd?Item=JGFGGG55555&sdafsdfsdfsdf
In both cases I need to get JGFGGG55555 only
If this always is the last portion of the string, you can use the following:
=MID(A1, FIND("?Item=", A1) + 6, 99)
This assumes:
no item numbers will be over 99 characters long.
no additional fields follow the item number.
Edit:
With the update to your question, it is apparent you have some strings with additional data after the ?Item= field. Without using VBA there is not a simple means of using MID and FIND to extract this.
However you could create a column which acts as a placeholder.
For example, create a column using:
=MID(A1,FIND("?Item=",A1)+6,99)
This gets you the following value: JGFGGG55555&sdafsdfsdfsdf
Next, create a column using:
=IF(ISERROR(FIND("&",B1)),B1,LEFT(B1,FIND("&",B1)-1))
This produces: JGFGGG55555 by searching the first value for a & and using the portion before it. If it is not found, the first value is simply repeated.
This formula should work for both the examples given:
=MID(A1,FIND("=",A1)+1,IFERROR(FIND("&",A1,FIND("=",A1))-FIND("=",A1)-1,LEN(A1)-FIND("=",A1)))
