Get words from a file using groovy

Using Groovy, how can I get the words or text enclosed in parentheses from a file?
Example:
George (a programmer) used to think much.
words to get: a programmer

Here is an example program that solves the problem:
// For a file, read its contents first, e.g. String inp = new File('input.txt').text (input.txt is a placeholder name)
String inp = 'George (a programmer) used to think much.'
def matcher = inp =~ /\(([^\)]+)\)/ // Try to find a match
if (matcher) { // Something found
    String str = matcher[0][1] // Get the 1st capture group of the 1st match
    printf("Found: %s.\n", str)
    def words = str.tokenize() // Create a list of words
    words.eachWithIndex { it, i -> printf("%d: %s.\n", i, it) }
} else {
    print("Not found")
}
Note the meaning of the parentheses in the regular expression:
The outer, backslash-escaped parentheses \( and \) are literal parentheses (the characters we are looking for).
The unescaped parentheses between them delimit the capture group.
The remaining escaped closing parenthesis, inside the character class [^\)], is the character that must not appear within the capture group.

Related

How do i find/count number of variable in string using Python

Here is an example string:
Hi {{1}},
The status of your leave application has changed,
Leaves: {{2}}
Status: {{3}}
See you soon back at office by Management.
Expected Result:
Variables Count = 3
I tried Python's count() with if/else, but I'm looking for a more sustainable solution.

You can use regular expressions:
import re
PATTERN = re.compile(r'\{\{\d+\}\}', re.DOTALL)
def count_vars(text: str) -> int:
    return sum(1 for _ in PATTERN.finditer(text))
PATTERN defines the regular expression. It matches one or more digits (\d+) enclosed in double curly brackets (\{\{ and \}\}). Curly brackets are special characters in regular expressions, so they are escaped with \. The re.DOTALL flag only changes how . matches; since this pattern contains no ., the flag has no effect here and can be dropped. The finditer method scans the whole text, including across line breaks, and we simply count the matches. Note that this counts occurrences: if {{1}} appears twice, it is counted twice.
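If the goal is the number of distinct variables rather than the number of occurrences, a small variation may help. This is only a sketch: the name count_distinct_vars and the TEMPLATE constant are made up for illustration, and it assumes placeholders are always written as {{N}} with N a run of digits.

```python
import re

# Capture the number inside {{...}} so that repeats can be collapsed.
PATTERN = re.compile(r"\{\{(\d+)\}\}")

def count_distinct_vars(text: str) -> int:
    """Count distinct placeholder numbers; {{1}} used twice counts once."""
    return len(set(PATTERN.findall(text)))

# The example string from the question.
TEMPLATE = """Hi {{1}},
The status of your leave application has changed,
Leaves: {{2}}
Status: {{3}}
See you soon back at office by Management."""

print("Variables Count =", count_distinct_vars(TEMPLATE))
```

With the question's text this reports three variables; a template that repeats {{1}} would still count it once.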

RegEX to extract Variables names, operators, and quoted strings with symbols

I'm trying to break a string (from command line argument) into 4 components. (C++ variable name):(python variable)(operator)(value or quoted string)
Example:
CVariable_1:PythonVariable.attribute<=2343.23
result=('CVariable_1','PythonVariable.attribute','<=','2343.23')
CVariable_2:PythonVariable2.attribute2.value<="Any string including SYMBOLS~!##$%^&*\"\'<> and spaces"
result=('CVariable_2','PythonVariable2.attribute2.value','<=','Any string including SYMBOLS~!##$%^&*\"\'<> and spaces')
The closest regex I've come up with is:
[^:'"<>=]+|[\.\w]+|[<>!=]+
But the string could have any symbols in it. Quotes would be escaped though.

I think (?:([^:]+):)?([^<>=!]+)([<>=!]+)(.*$) will work. The optional first group captures the C++ name and consumes the colon after it, so the colon does not leak into the Python name; the second group stops at the first operator character.
I tried the following code:
import re
import typing
pattern = re.compile(r"(?:([^:]+):)?([^<>=!]+)([<>=!]+)(.*$)")
def split_name(name: str) -> typing.Tuple[typing.Optional[str], str, str, str]:
    match = pattern.match(name)
    if match is None:
        raise ValueError("The name is invalid")
    cname = match.group(1)  # None when there is no "CName:" prefix
    pyname = match.group(2)
    operator = match.group(3)
    value = match.group(4)  # a quoted value keeps its surrounding quotes
    return cname, pyname, operator, value
test_names = ["PythonVariable1.attribute.value==False",
              "CVariable_2:PythonVariable2.attribute<=2343.23",
              "CVariable_3:PythonVariable3.attribute3.value<=\"Any string including SYMBOLS~!##$%^&*\\\"\\'<> and spaces\""]
print(list(map(split_name, test_names)))
If you want to write the allowed operators explicitly, you can change the third group. Note that the order is important in this case (<= needs to come before <):
(?:([^:]+):)?([^<>=!]+)(<=|>=|!=|==|<|>)(.*$)
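The question's expected result shows the value without its surrounding quotes. One way to get there is to split with an explicit operator list and unquote afterwards. The sketch below is an assumption-laden illustration, not part of the answer above: the names ARG_PATTERN and split_and_unquote are hypothetical, and it assumes values are wrapped in double quotes with inner quotes escaped by a backslash.

```python
import re
from typing import Optional, Tuple

# Optional "CName:" prefix, then the Python name, an explicit operator, and the value.
# Two-character operators are listed first so that "<=" is not read as "<".
ARG_PATTERN = re.compile(r'(?:([^:<>=!]+):)?([^:<>=!]+)(<=|>=|!=|==|<|>)(.*)$')

def split_and_unquote(arg: str) -> Tuple[Optional[str], str, str, str]:
    match = ARG_PATTERN.match(arg)
    if match is None:
        raise ValueError("The argument is invalid: %r" % arg)
    cname, pyname, operator, value = match.groups()
    # Assumption: a quoted value starts and ends with a double quote.
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1].replace('\\"', '"').replace("\\'", "'")
    return cname, pyname, operator, value

print(split_and_unquote('CVariable_1:PythonVariable.attribute<=2343.23'))
print(split_and_unquote('Py.attr=="a \\"b\\" <c>"'))
```

Unquoting after the match keeps the regular expression simple, because symbols inside the value never have to be described by the pattern.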

Lua: return content of "{foo}{bar}"

For a string like "{foo}{bar}", is there an easy way to make the following work?
str = "{foo}{bar}"
first, second = str:gmatch(...)...
should give first="foo" and second="bar"
The problem is that foo itself can contain more braces, e.g.:
str = "{foo {baz}{bar}"
so that first = "foo {baz". The bar part has only alphanumeric characters and no braces.

You may use
first, second = str:match('{([^}]*)}%s*{([^}]*)}')
The string.match function (called here as str:match) finds and returns the first match; since the pattern has two captures, two values are returned on a successful match.
The pattern means:
{ - a { char
([^}]*) - Group 1: any 0+ chars other than }
} - a } char
%s* - 0+ whitespaces (not necessary, but a bonus)
{([^}]*)} - same as above, just there is a Group 2 defined here.
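For readers who want to experiment with the same idea outside Lua, here is a rough Python counterpart of the pattern explained above. It is a sketch only: PAIR and split_pair are hypothetical names, and Python's \s stands in for Lua's %s.

```python
import re

# Python counterpart of the Lua pattern '{([^}]*)}%s*{([^}]*)}'.
# Braces are escaped here because they are quantifier syntax in Python regexes.
PAIR = re.compile(r"\{([^}]*)\}\s*\{([^}]*)\}")

def split_pair(s):
    """Return (first, second) for the first '{...}{...}' pair, or None."""
    m = PAIR.search(s)
    return m.groups() if m else None

print(split_pair("{foo}{bar}"))
print(split_pair("{foo {baz}{bar}"))
```

Because the first group only excludes }, an extra { inside foo is accepted, which is the behaviour the question asks for.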

How to compare upper and lowercase letters in a conditional in Swift

Apologies if this is a duplicate. I have a helper function called inputString() that takes user input and returns a String. I want to proceed based on whether an upper or lowercase character was entered. Here is my code:
print("What do you want to do today? Enter 'D' for Deposit or 'W' for Withdrawal.")
operation = inputString()
if operation == "D" || operation == "d" {
    print("Enter the amount to deposit.")
My program quits after the first print function, but gives no compiler errors. I don't know what I'm doing wrong.

It's important to keep in mind that there is a whole slew of purely whitespace characters that show up in strings, and sometimes, those whitespace characters can lead to problems just like this.
So, whenever you are certain that two strings should be equal, it can be useful to print them with some sort of non-whitespace character on either end of them.
For example:
print("Your input was <\(operation)>")
That should print the user input with angle brackets on either side of the input.
And if you stick that line into your program, you'll see it prints something like this:
Your input was <D
>
So it turns out that your inputString() method is capturing the newline character (\n) that the user presses to submit their input. You should improve your inputString() method to go ahead and trim that newline character before returning its value.
I feel it's really important to mention here that your inputString method is really clunky and requires importing modules. But there's a way simpler pure Swift approach: readLine().
Swift's readLine() method does exactly what your inputString() method is supposed to be doing, and by default, it strips the newline character off the end for you (there's an optional parameter you can pass to prevent the method from stripping the newline).
My version of your code looks like this:
// The `_` lets callers omit the argument label, as in the call below
func fetchInput(_ prompt: String? = nil) -> String? {
    if let prompt = prompt {
        print(prompt, terminator: "")
    }
    return readLine()
}
if let input = fetchInput("Enter some input: ") {
    if input == "X" {
        print("it matches X")
    }
}

The cause of the error that you experienced is explained in "Swift how to compare string which come from NSString". Essentially, we need to remove any whitespace or non-printing characters such as the newline.
I also used uppercased() to simplify the comparison.
The amended code, written against the current Foundation API, is as follows:
import Foundation
func inputString() -> String {
    let keyboard = FileHandle.standardInput
    let inputData = keyboard.availableData
    // Decode the bytes, falling back to an empty string if they are not valid UTF-8
    let str = String(data: inputData, encoding: .utf8) ?? ""
    // Strip surrounding whitespace, including the trailing newline
    return str.trimmingCharacters(in: .whitespacesAndNewlines)
}
print("What do you want to do today? Enter 'D' for Deposit or 'W' for Withdrawal.")
let operation = inputString()
if operation.uppercased() == "D" {
    print("Enter the amount to deposit.")
}

Standard ML string to a list

Is there a way in ML to take a string and produce a list of its substrings, splitting on spaces, newlines, or EOF, while keeping quoted strings intact?
Example: hello world "my id" is 5555
-> [hello, world, my id, is, 5555]
I then want to tokenize these into:
-> [word, word, string, word, int]

Sure you can! Here's the idea:
If we take a string like "Hello World, \"my id\" is 5555", we can split it at the quote marks, ignoring the spaces for now. This gives us ["Hello World, ", "my id", " is 5555"]. The important thing to notice here is that the list contains three elements - an odd number. As long as the string only contains pairs of quotes (as it will if it's properly formatted), we'll always get an odd number of elements when we split at the quote marks.
A second important thing is that all the even-numbered elements of the list will be strings that were unquoted (if we start counting from 0), and the odd-numbered ones were quoted. That means that all we need to do is tokenize the ones that were unquoted, and then we're done!
I put some code together along those lines; extend it as needed:
fun foo s =
  let
    (* String.fields keeps empty fields, so the even/odd positions stay
       aligned even when the string starts or ends with a quote mark *)
    val quoteSep = String.fields (fn c => c = #"\"") s
    (* Char.isSpace covers spaces, tabs and newlines *)
    val spaceSep = String.tokens Char.isSpace
    fun sepEven [] = []
      | sepEven [x] = spaceSep x (* the last piece is always unquoted *)
      | sepEven (x::y::xs) = spaceSep x @ (y :: sepEven xs) (* x was unquoted, y was quoted *)
  in
    if length quoteSep mod 2 = 0
    then raise Fail "uneven number of quote marks" (* something is wrong! *)
    else sepEven quoteSep
  end
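The even/odd idea described above can also be sketched in Python, extended with the word/string/int classification the question asks for. This is an illustration under assumptions, not a full lexer: the function name tokenize is made up, "int" here means an unsigned run of digits, and escaped quotes inside quoted strings are not handled.

```python
def tokenize(line):
    """Split on quote marks, then on whitespace inside the unquoted pieces."""
    # str.split keeps empty pieces, so even positions are always unquoted
    # and odd positions are always quoted.
    pieces = line.split('"')
    if len(pieces) % 2 == 0:
        raise ValueError("uneven number of quote marks")
    tokens = []
    for i, piece in enumerate(pieces):
        if i % 2 == 1:
            tokens.append(("string", piece))  # quoted piece, kept intact
        else:
            for word in piece.split():        # splits on spaces, tabs, newlines
                kind = "int" if word.isdigit() else "word"
                tokens.append((kind, word))
    return tokens

print(tokenize('hello world "my id" is 5555'))
```

For anything beyond this, such as escape sequences or nested quoting, a real lexer is the safer route.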

String.tokens brings you halfway there. But if you really want to handle quotes like you are sketching then there is no way around writing an actual lexer. MLlex, which comes with SML/NJ and MLton (but is usable with any SML) could help. Or you just write it by hand, which should be easy enough in this case as well.
