Delete last two character of a string - Pentaho Data Integration - string

This is probably a stupid question but I can't figure out for the life of me which step would allow the last two character of a string to be deleted in Pentaho Data Integration V9.2

There are many ways to do that, perhaps the simplest is a Replace in strings step, using a Regex (e.g., search for (.*).{2} and replace by $1).
This will replace the entire string by the first capture group, which is made up of the entire string except the last two characters.

Related

How to Automatically add thousand separators for every number in a string?

How can i create a thousand separator for every number which is in my string?
So for example this string:
string = "123456,78+1234"
should be displayed as:
TextView = "123.456,78+1.234"
And the string should be editable, so the thousand separator should adapt when i remove or add a digit.
I have already read all the posts I could find about it, but I could never find an up-to-date working answer. So I would be really grateful for your help!
Your question contains two sub-questions:
A. You want to add thousand separators to a string which contains a group of numbers.
B. You want it to change.
And the answers are:
A: In your example there's , as a delimiter, so you need to split the string using this delimiter to an array of strings.
Then iterate over them and have your dots added to every 3nth index of their characters; you can also use String.format("%,d", substr.toLong()).
Lastly, append all of the strings back together with , as the separator.
B: This one can be done in different ways. You may store the original string somewhere and observe it, so when it changes it goes to the function which does A, and use the function result the way you like (which I suppose is to be set in a TextView).

Excel: Find words of certain length in string?

I have this file where I want to make a conditional check for any cell that contains the letter combination "_SOL", or where the string is followed by any numeric character like "_SOL1524", and stop looking after that. So I don't want matches for "_SOLUTION" or "_SOLothercharactersthannumeric".
So when I use the following formula, I also get results for words like "_SOLUTION":
=IF(ISNUMBER(FIND("_SOL",A1))=TRUE,"Yay","")
How can I avoid this, and only get matches if the match is "_SOL" or "_SOLnumericvalue" (one numeric character)
Clarification: The whole strings may be "Blabla_SOL_BLABLA", "Blabla_SOLUTION_BLABLA" or "Blabla_SOL1524_BLABLA"
Maybe this, which will check if the character after "_SOL" is numeric.
=IF(ISNUMBER(VALUE(MID(A1,FIND("_SOL",A1)+4,1))),"Yay","")
Or, as per OP's request and suggestion, to include the possibility of an underscore after "SOL"
=IF(OR(ISNUMBER(VALUE(MID(A1,FIND("_SOL",A1)+4,1))),ISNUMBER(FIND("_SOL_",A1))),"Yay","")
Here is an alternative way to check if your string contains SOL followed by either nothing or any numeric value up to any characters after SOL:
=IF(COUNT(FILTERXML("<t><s>"&SUBSTITUTE(A1,"_","1</s><s>")&"</s></t>","//s[substring-after(.,'SOL')*0=0]")>0),"Yey","Nay")
Just to use in an unfortunate event where you would encounter SOL1TEXT for example. Or, maybe saver (in case you have text like AEROSOL):
=IF(COUNT(FILTERXML("<t><s>"&SUBSTITUTE(A1,"_","</s><s>")&"</s></t>","//s[translate(.,'1234567890','')='SOL']")>0),"Yey","Nay")
And to prevent that you have text like 123SOL123 you could even do:
=IF(COUNT(FILTERXML("<t><s>"&SUBSTITUTE(A1,"_","1</s><s>")&"</s></t>","//s[starts-with(., 'SOL') and substring(., 4)*0=0]")>0),"Yey","Nay")

Extract Number from string into a list in Scala

I have the following string :
var myStr = "abc12ef4567gh90ijkl789"
The size of the list is not fixed and it contains number in between. I want to extract the numbers and store them in the form of a list in this manner:
List(12,4567,90,789)
I tried the solution mentioned here but cannot extend it to my case. I just want to know if there is any faster or efficient solution instead of just traversing the string and extracting the numbers one by one using brute force ? Also, the string can be arbitrary length.
It seems you may just collect the numbers using
("""\d+""".r findAllIn myStr).toList
See the Scala demo. \d+ matches one or more digits, findAllIn searches for multiple occurrences of the pattern inside a string (and also un-anchors the pattern so that partial matches could be found).
If you prefer a splitting approach, you might use
myStr.split("\\D+").filter(_.nonEmpty).toList
See another demo. Here, \D+ matches one or more non-digit chars, and these chunks are used to split on (texts between these chunks land in the result). .filter(_.nonEmpty) will remove empty items that usually appear due to matches at the start/end of the string.

How can I split a phrase into a new line every x characters on Google Sheets?

I am translating a game, and the game's text box only supports 50 characters max per line. Is there a way to use a formula to split the entire sentence every 50 characters or whole word (49, 48, 47, etc)?
I am currently working with this formula.
=JOIN(CHAR(10),SPLIT(REGEXREPLACE(A1, "(.{50})", "/$1"),"/"))
The problem with this code, is that it splits at exactly 50 characters (one time), and will split in the middle of the word.
So again, my goal is to have it not split on the 50th character IF the 50th character is in the middle of the word, and for the rule to apply for the rest of the lines too because it only applies on the first line.
Please take a look at this test google sheet to get an example of what I am talking about.
If it's impossible to do it on Google Sheets, I don't mind moving to Excel provided I get a functioning code.
For the record, I did ask in Google's product forums 2 days ago, and still haven't received an answer.
=REGEXREPLACE(A1, "(.{1,50})\b", "$1" & CHAR(10))
{50} matches exactly 50 times, but what you need is 50 or less.
\b is word boundary that matches between alphanumeric and non-alphanumeric character.
= REGEXEXTRACT(A1,"(?ism)^"&REPT("([\w\d'\(\),. ]{0,49}\s)", ROUNDUP(LEN(A1)/50,0))&"([\w\d'\(\),. ]{0,49})$")
Tested with various expressions and works as intended. Note that only these characters [a-zA-Z0-9_'(),.] are allowed, Which means - and other characters not mentioned will not work. If you need them, add them inside the REPT expression and finishing regexp formula. Otherwise, This will work perfectly.
You are pretty close. I'm not an expert in Sheets, so not sure if this is the best way, but your Regex is wrong for what you want.
Also, you need to be certain that you don't use a split character that might appear in the phrase itself. However, using CHAR(10) for the replace character allows you to insert LF without going through the JOIN SPLIT sequence.
replace any line feeds, carriage returns and spaces with a single space
Match strings that start with a non-Space character followed by up to 49 more characters which are followed by a space or the end of the string.
replace the capture group with the capturing group followed by the CHAR(10) (and delete the space following).
There will be extra CHAR(10) at the end which you can strip off.
EDIT Regex changed slightly due to a difference in behavior between Google's RE and what I am used to (probably has to do with how a non-backtracking regex works). The problem showed up on your example:
=regexreplace(REGEXREPLACE(REGEXREPLACE(A1 & " ","[\r\n\s]+"," "),"(\S.{0,49})\s","$1" & char(10)),"\n+\z","")

String manipulation: randomly swapping out the first letter with one of the other letters of the word

I need to randomly swap out the first letter of a word with one of the other letters. What i am having trouble is with specifying that i only need to randomly generate ONE character. I cant use any conditionals, so can anyone please recommend a method to use?

Resources