Format string to fit pattern in Lua? - string

Say I have a pattern, and a string:
String = "ABCDEF"
Pattern = "%w%w%w - %w%w%w"
How can I make String match the format of Pattern, so it becomes "ABC - DEF"?

Use string.gsub:
string.gsub("ABCDEF", "(%w%w%w)(%w%w%w)", "%1 - %2")
Note that this replaces all occurrences of the pattern.

There is no one-to-one correspondence between string, pattern, and capture.
The same capture can be produced by several patterns for the same string.
Also, if "%w%w%w - %w%w%w" in your example is a Lua string pattern, then
the string "ABC - DEF" does not match it. Patterns that do match it include
%w%w%w %- %w%w%w or %w+%W+%w+ or %w*%s*.%s*%w* and several others.
So I suggest defining your own subset of rules that you really need and
implementing your own function to handle it.

Related

Groovy How to replace the exact match word in a String

I want to replace only the exact matching word in a given string in Groovy. When I tried the code below, the replacement also hit a partial match:
def str="My Name is Richards and Richardson"
log.info(str)
str=str.replace("Richards","Praveen")
log.info("After"+str)
Output after executing the above
My Name is Richards and Richardson
AfterMy Name is Praveen and Praveenon
I am looking for output like: AfterMy Name is Praveen and Richardson
I tried the word boundary \b:
str=str.replace("\bRichards\b","Praveen")
which works in Java, but it is not working here. It looks like \b is a backslash escape sequence in Groovy.
Can someone help?
Using word boundaries (\b) will not work with String::replace because that method takes a plain string literal, not a regular expression pattern.
You have two options to get the expected outcome:
Instead of String::replace you can use String::replaceFirst. As the name suggests, it replaces only the first match, so here it replaces Richards and leaves Richardson as is. This works only because Richards appears before Richardson in this string.
str = str.replaceFirst("Richards", "Praveen")
Alternatively, use String::replaceAll. Unlike String::replace, it supports regular expressions, so you can use word-boundary tokens:
str = str.replaceAll("\\bRichards\\b","Praveen")
Mind the double backslashes!
Also, according to the String::replaceAll documentation:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.

Regular expression to capture n lines of text between two regex patterns

I need help with a regular expression to grab exactly n lines of text between two regex matches. For example, I need 17 lines of text, and I used the example below, which does not work.
Please see the sample code below:
import re
match_string = re.search(r'^.*MDC_IDC_RAW_MARKER((.*?\r?\n){17})Stored_EGM_Trigger.*\n', t, re.DOTALL).group()
value1 = re.search(r'value="(\d+)"', match_string).group(1)
value2 = re.search(r'value="(\d+\.\d+)"', match_string).group(1)
print(match_string)
print(value1)
print(value2)
I put a sample string here, because SO does not allow long code strings:
https://hastebin.com/aqowusijuc.xml
You are getting false positives because you are using the re.DOTALL flag, which allows the . character to match newline characters. That is, when matching ((.*?\r?\n){17}), the . can consume extra newline characters, so a single repetition may span several lines and the group no longer means exactly 17 lines. The \r? is also superfluous. Likewise, starting your regex with ^.* is superfluous: you anchor the search at the beginning and then tell the engine to skip as many characters as necessary to find MDC_IDC_RAW_MARKER, which re.search does anyway. So a simplified and correct regex would be:
match_string = re.search(r'MDC_IDC_RAW_MARKER.*\n((.*\n){17})Stored_EGM_Trigger.*\n', t).group()
Regex Demo
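To see why dropping re.DOTALL matters, here is a minimal sketch of the same idea on made-up data. The helper name capture_n_lines and the START/END markers are hypothetical stand-ins for the real markers in the question, and the sketch assumes the end marker sits on the line immediately after the n captured lines.

```python
import re

def capture_n_lines(text, start_marker, end_marker, n):
    """Return exactly n lines between the marker lines, or None if the count differs."""
    pattern = (
        re.escape(start_marker) + r".*\n"   # rest of the start-marker line
        + r"((?:.*\n){" + str(n) + r"})"    # n whole lines; without DOTALL, . stops at \n
        + r".*" + re.escape(end_marker)     # end marker must be on the very next line
    )
    m = re.search(pattern, text)
    return m.group(1) if m else None

# Illustrative sample: three lines sit between the two marker lines.
sample = "header\nSTART x\na\nb\nc\nEND y\n"

print(repr(capture_n_lines(sample, "START", "END", 3)))  # 'a\nb\nc\n'
print(capture_n_lines(sample, "START", "END", 2))        # None
```

Because . cannot cross a newline here, each repetition of the inner group is forced to be one line, so asking for the wrong count fails instead of silently matching.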

Determining a Regex for split()

I have a string, which represents a version:
abc-5.18.0.0_10
I'm trying to determine the regex to extract this subString:
abc-5.18.0
This in format terms is:
We want any characters up until the "-".
We also want between the "-" and the first ".", which is the major version 1 to nn.
We also want between the first "." and the second "." which is the minor version 0 to nn.
We also want the subminor version between the second and third ".", 0-nn.
xxx-nn.nn.nn
I'm trying this:
.split("//|\\.")
And I'm getting this:
abc-5.18
What am I doing wrong? Should it be "///|\."?
I think the regex you want is stuff, period, stuff, period, stuff up to the third period.
This uses a group () to capture what you want. It looks confusing because . represents any character and \. represents a period.
s = 'abc-5.18.0.0_10'
re.match(r"(.*\..*\..*)\.", s).group(1)
Out: 'abc-5.18.0'
You can use a more restrictive regular expression, such as:
^([^-]+-\d+\.\d+\.\d+)(\..*)?$
You can apply it and extract the value in the following way in Groovy:
import java.util.regex.Pattern
final String str = 'abc-5.18.0.0_10'
final Pattern pattern = Pattern.compile(/^([^-]+-\d+\.\d+\.\d+)(\..*)?$/)
final String result = (str =~ pattern).replaceFirst('$1')
println result
Output:
abc-5.18.0
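For comparison with the Python answer above, the same restrictive expression can be tried in Python. This is only a sketch: the names VERSION_PREFIX and version_prefix are made up for illustration, and returning None for input that does not fit is an assumption about the desired behaviour.

```python
import re

# Same expression as in the Groovy example: name, "-", then three dotted numbers.
VERSION_PREFIX = re.compile(r"^([^-]+-\d+\.\d+\.\d+)(\..*)?$")

def version_prefix(s):
    """Return the 'xxx-nn.nn.nn' prefix of s, or None if s does not fit the format."""
    m = VERSION_PREFIX.match(s)
    return m.group(1) if m else None

print(version_prefix("abc-5.18.0.0_10"))  # abc-5.18.0
print(version_prefix("abc-5.18"))         # None
```

Unlike the (.*\..*\..*)\. version, this one rejects strings with fewer than three numeric components instead of depending on how many periods the input happens to contain.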

Lua -- match strings including non-letter classes

I'm trying to find exact matches of strings in Lua, including special characters. I want the example below to report an exact match, but because of the - character it returns nil:
index = string.find("test-string", "test-string")
returns nil
index = string.find("test-string", "test-")
returns 1
index = string.find("test-string", "test")
also returns 1
How can I get it to do full matching?
- is a pattern operator (a lazy quantifier) in a Lua string pattern, and it applies to the single character before it. So when you say test-string, you're telling find() to match tes, then the character t as few times as possible (zero or more), then string. Since - isn't an actual minus sign in this case, the pattern matches text like teststring, not test-string.
Do as Mike has said and escape it with the % character.
I found this helpful for better understanding patterns.
You can also ask for a plain substring match that ignores magic characters:
string.find("test-string", "test-string",1,true)
You need to escape special characters in the pattern with the % character.
So in this case you are looking for:
local index = string.find('test-string', 'test%-string')

How to match a part of string before a character into one variable and all after it into another

I have a problem with splitting a string into two parts on a special character.
For example:
12345#data
or
1234567#data
The first part has 5-7 characters and is separated by "#" from the second part, which holds other data (letters, numbers, it doesn't matter what).
I need to store the two parts on each side of # in two variables:
x = 12345
y = data
without "#" character.
I was looking for a Lua string function like splitOn("#") or a substring-until-character function, but I haven't found one.
Use string.match and captures.
Try this:
s = "12345#data"
a,b = s:match("(.+)#(.+)")
print(a,b)
See this documentation:
First of all, although Lua does not have a split function in its standard library, it does have string.gmatch, which can be used instead of a split function in many cases. Unlike a split function, string.gmatch takes a pattern to match the non-delimiter text, instead of the delimiters themselves.
It is easily achievable with the help of a negated character class with string.gmatch:
local example = "12345#data"
for i in string.gmatch(example, "[^#]+") do
print(i)
end
See IDEONE demo
The [^#]+ pattern matches one or more characters other than # (so it "splits" a string on a single-character delimiter).
