I want to split a function argument list into its individual arguments.
An argument can be a function invocation like:
foo(...)
or any sequence of characters. For instance:
"first_arg, some_foo(arg1, foo2(arg1, arg2, foo3()), arg3), third_arg"
I want to get:
List("first_arg", "some_foo(arg1, foo2(arg1, arg2, foo3()), arg3)", "third_arg")
I implemented it as follows:
private[this] def tokenizeArgumentList(argumentListExpression: String): List[String] = {
  var functionInvokationCounter = 0
  var previousArgumentPosition = 0
  var arguments: List[String] = List()
  for (i <- 0 until argumentListExpression.length)
    argumentListExpression.charAt(i) match {
      case '(' => functionInvokationCounter += 1
      case ')' =>
        if (functionInvokationCounter == 0)
          0
        else
          functionInvokationCounter -= 1
      case ',' if functionInvokationCounter == 0 =>
        arguments :+= argumentListExpression.substring(previousArgumentPosition, i).trim
        previousArgumentPosition = i + 1
      case _ =>
    }
  arguments :+= argumentListExpression.substring(previousArgumentPosition).trim
  arguments
}
It works, but it looks ugly: there are three mutable variables, and what I like least is this:
arguments :+= argumentListExpression.substring(previousArgumentPosition).trim
arguments
After the iteration over argumentListExpression is done, we still have to append the last argument.
Can we refactor it in a more functional way? Maybe foldLeft would help?
Recursion is divine. See - not a single var:
val s = "fisrt_arg, some_foo(arg1, foo2(arg1, arg2, foo3()), arg3), third_arg"
def tokenize (argumentlist: String): List[String] = {
  def tokenize (arglist: List[Char], sofar: String, inList: Int): List[String] = arglist match {
    case Nil => List (sofar)
    case '(' :: tail => tokenize (tail, sofar + '(', inList + 1)
    case ')' :: tail => tokenize (tail, sofar + ')', inList - 1)
    case ',' :: tail => if (inList > 0) {
        tokenize (tail, sofar + ',', inList)
      } else {
        sofar :: tokenize (tail, "", inList)
      }
    case c :: tail => tokenize (tail, sofar + c, inList)
  }
  tokenize (argumentlist.toList, "", 0)
}
tokenize (s)
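inList counts how deeply nested we currently are inside parentheses.
If we passed not only the sofar string but also the list built so far, we could make it fully tail-recursive. In practice the recursion is unlikely to get deep enough to matter for argument lists, even with functions taking functions as parameters, which in turn take functions as parameters, and so on.
Pitfall: parentheses and commas inside string literals are not recognized as such, so an input like this is split incorrectly: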
val s = "\"stringliteral w. (misleading brace\", f(a, b(c, d, e()), f), g"
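One way to handle that pitfall is to track whether the scanner is currently inside a string literal. The following is only a sketch in JavaScript, not part of the answer above: the function name splitArgs is made up, and it assumes literals are delimited by double quotes with backslash escapes.

```javascript
// Hypothetical helper: splits on top-level commas while ignoring
// parentheses and commas that appear inside double-quoted literals.
function splitArgs(input) {
  const args = [];
  let current = "";
  let depth = 0;        // nesting depth of parentheses outside literals
  let inString = false; // true while inside a "..." literal

  for (let i = 0; i < input.length; i++) {
    const c = input[i];
    if (inString) {
      current += c;
      if (c === "\\" && i + 1 < input.length) {
        current += input[++i]; // keep the escaped character verbatim
      } else if (c === '"') {
        inString = false;
      }
    } else if (c === '"') {
      inString = true;
      current += c;
    } else if (c === "," && depth === 0) {
      args.push(current.trim());
      current = "";
    } else {
      if (c === "(") depth++;
      else if (c === ")" && depth > 0) depth--;
      current += c;
    }
  }
  args.push(current.trim());
  return args;
}

console.log(splitArgs('"stringliteral w. (misleading brace", f(a, b(c, d, e()), f), g'));
```

The opening parenthesis inside the literal no longer changes the nesting depth, so the input is split into three arguments.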
You might like to apply some trimming at the end, since the results keep their leading spaces:
scala> val res = tokenize (s)
res: List[String] = List(fisrt_arg, " some_foo(arg1, foo2(arg1, arg2, foo3()), arg3)", " third_arg")
scala> res.mkString ("<", "><", ">")
res372: String = <fisrt_arg>< some_foo(arg1, foo2(arg1, arg2, foo3()), arg3)>< third_arg>
Indeed, foldLeft is a possibility. It helps us remove mutable variables (which we try to avoid in Scala):
val string =
  "fisrt_arg, some_foo(arg1, foo2(arg1, arg2, foo3()), arg3), third_arg"
val result = (string :+ ',')
  // The accumulator of foldLeft is a tuple (previous splits,
  // current split, nbr of opened parentheses)
  .foldLeft((List[String](), List[Char](), 0)) {
    // Opening of parenthesis (might be the first opening or not) =>
    // increment nbr of opened parentheses to stop splitting:
    case ((splits, currSplit, openPar), '(') =>
      (splits, '(' :: currSplit, openPar + 1)
    // Closing of parenthesis (might bring us back to 0, in which case
    // we can start splitting again):
    case ((splits, currSplit, openPar), ')') =>
      (splits, ')' :: currSplit, openPar - 1)
    // ',' (split char) and if the nbr of opened parentheses is 0 =>
    // we can split!
    case ((splits, currSplit, 0), ',') =>
      (currSplit.reverse.mkString :: splits, Nil, 0)
    // In any other case, we just add the new char to the current split:
    case ((splits, currSplit, openPar), char) =>
      (splits, char :: currSplit, openPar)
  }
  ._1
  .reverse
result.foreach(println)
which prints the three splits; result is
List("fisrt_arg", " some_foo(arg1, foo2(arg1, arg2, foo3()), arg3)", " third_arg")
Note that the splits keep their leading spaces; append .map(_.trim) if you want them removed.
foldLeft traverses a sequence (in our case the characters of the string) and processes each Char individually to fill an "accumulator" (whose first element is the List[String] to be returned).
Notice the initialization (string :+ ','), which allows us to also include the last split in the list of splits. Otherwise, at the end of foldLeft, the last split would still sit in the second item (List[Char]) of the accumulator tuple instead of being included in the first (List[String]).
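An alternative to appending a trailing ',' is to fold over the unmodified string and flush the pending split once the fold is done. Here is a rough JavaScript sketch of that variant, where reduce plays the role of foldLeft; the name splitTopLevel and the shape of the accumulator object are assumptions made for illustration.

```javascript
// Hypothetical variant: the accumulator carries the finished splits, the
// split currently being built, and the number of open parentheses.
function splitTopLevel(input) {
  const initial = { splits: [], current: "", open: 0 };
  const end = Array.from(input).reduce((acc, ch) => {
    if (ch === "," && acc.open === 0) {
      // Top-level comma: close the current split and start a new one.
      return { splits: [...acc.splits, acc.current], current: "", open: 0 };
    }
    const delta = ch === "(" ? 1 : ch === ")" ? -1 : 0;
    return { splits: acc.splits, current: acc.current + ch, open: acc.open + delta };
  }, initial);
  // Flush the last pending split instead of relying on a trailing ','.
  return [...end.splits, end.current].map((s) => s.trim());
}

console.log(splitTopLevel("fisrt_arg, some_foo(arg1, foo2(arg1, arg2, foo3()), arg3), third_arg"));
```

The trade-off is one extra step after the fold in exchange for leaving the input untouched.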
My idea turns out to be essentially the same as the other answer, except that it operates at the "word" level: split on every comma first, then glue fragments back together while parentheses are still open. I am not sure I am a fan of modifying the last element of the list in this way; carrying a "current split" in the accumulator would be an alternative.
val result = string
  .split(",")
  .foldLeft((0, List[String]())) {
    case ((0, l), term) =>
      (term.count(_ == '(') - term.count(_ == ')'), (term :: l))
    case ((openCount, l), term) =>
      val finalElt = l.head
      (
        openCount + term.count(_ == '(') - term.count(_ == ')'),
        List(finalElt, term).mkString(",") :: l.tail)
  }
  ._2
  .reverse
I want to replace a pattern in a text file, reading the variables x and y from the matched text and inserting them into the replacement.
I want to replace every occurrence of:
Array<x, y> someArray;
With the following:
Array<> someArray(x, y);
So, for example this line:
Array<3, 4> someArray;
Will be replaced with:
Array<> someArray(3, 4);
How do I achieve that using awk or sed?
With sed:
sed -E 's/^(Array)<([^>]+)>( someArray)/\1<>\3(\2)/' file
$ cat r.sh
awk '
BEGIN {
    b = "[ \t]+" # blank
    i = "[_a-zA-Z0-9]+" # identifier: should match x, 1 and so on
}
{
    parse()
    if (OK)
        printf "Array<> %s(%s, %s);\n", name, x, y
    else
        print $0
}
function parse() {
    l = $0; OK = 1
    n("Array")
    n("<")
    n(i); x = P
    n(",")
    n(i); y = P
    n(">")
    n(i); name = P
}
function n(p) {
    if (!OK) return
    n1(b); OK = 1 # skip blank
    return n1(p)
}
function n1(p) {
    p = "^" p
    if (match(l, p)) P = n0()
    else OK = 0
}
function n0( s) {
    s = substr(l, 1, RLENGTH)
    l = substr(l, RLENGTH + 1)
    return s
}
' "$@"
Usage
$ sh r.sh file
Array<> someArray(x, y);
Array<> someArray(3, 4);
I would like to substitute all numbers in a text; for instance, I would like to add some value V to every number. For example, for V=3:
var inp = "Try to replace thsis [11-16] or this [5] or this [1,2]";
the substitution should give me:
var output = "Try to replace thsis [14-19] or this [8] or this [4,5]";
With RegExp I would like to do something like:
var V = 12;
var re = new RegExp(/[0-9]+/g);
var s = inp.replace(re,'$1' + V);
but obviously this does not work.
In inp.replace(re, '$1' + V), the value of V is simply concatenated to the '$1' string, so the replacement pattern becomes $112. Since your pattern does not contain any capturing group, that replacement pattern is treated as a literal string.
You may use a callback inside the replace method where you may manipulate the match value:
var V = 3;
var inp = "Try to replace thsis [11-16] or this [5] or this [1,2]";
var re = /[0-9]+/g;
var outp = inp.replace(re, function($0) { return parseInt($0, 10) + V; });
console.log(outp);
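To make the difference concrete, here is a small sketch that wraps the callback approach in a reusable function and shows what the concatenation in the original attempt evaluates to. The name addToNumbers and its offset parameter are made up for this example.

```javascript
// Hypothetical helper: adds `offset` to every run of digits in `text`.
function addToNumbers(text, offset) {
  // The callback receives each matched digit run as a string.
  return text.replace(/[0-9]+/g, (match) => String(parseInt(match, 10) + offset));
}

// The concatenation happens before replace() ever sees the value,
// so with V = 12 the replacement argument is just the string "$112":
console.log("$1" + 12);

console.log(addToNumbers("Try to replace thsis [11-16] or this [5] or this [1,2]", 3));
```

Note that the pattern matches unsigned digit runs only, so the hyphen in [11-16] is treated as a separator rather than a minus sign.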
So I have the following code to split a string on whitespace:
text = "I am 'the text'"
for string in text:gmatch("%S+") do
  print(string)
end
The result:
I
am
'the
text'
But I need to do this:
I
am
the text --[[yep, without the quotes]]
How can I do this?
Edit: just to complement the question, the idea is to pass parameters from one program to another. Here is the pull request that I am working on, currently in review: https://github.com/mpv-player/mpv/pull/1619
There may be ways to do this with clever parsing, but an alternative way may be to keep track of a simple state and merge fragments based on detection of quoted fragments. Something like this may work:
local text = [[I "am" 'the text' and "some more text with '" and "escaped \" text"]]
local spat, epat, buf, quoted = [=[^(['"])]=], [=[(['"])$]=]
for str in text:gmatch("%S+") do
  local squoted = str:match(spat)
  local equoted = str:match(epat)
  local escaped = str:match([=[(\*)['"]$]=])
  if squoted and not quoted and not equoted then
    buf, quoted = str, squoted
  elseif buf and equoted == quoted and #escaped % 2 == 0 then
    str, buf, quoted = buf .. ' ' .. str, nil, nil
  elseif buf then
    buf = buf .. ' ' .. str
  end
  if not buf then print((str:gsub(spat,""):gsub(epat,""))) end
end
if buf then print("Missing matching quote for "..buf) end
This will print:
I
am
the text
and
some more text with '
and
escaped \" text
Updated to handle mixed and escaped quotes. Updated to remove quotes. Updated to handle quoted words.
Try this:
text = [[I am 'the text' and '' here is "another text in quotes" and this is the end]]
local e = 0
while true do
  local b = e+1
  b = text:find("%S",b)
  if b==nil then break end
  if text:sub(b,b)=="'" then
    e = text:find("'",b+1)
    b = b+1
  elseif text:sub(b,b)=='"' then
    e = text:find('"',b+1)
    b = b+1
  else
    e = text:find("%s",b+1)
  end
  if e==nil then e=#text+1 end
  print("["..text:sub(b,e-1).."]")
end
Lua patterns aren't powerful enough to handle this task properly. Here is an LPeg solution adapted from the Lua Lexer. It handles both single and double quotes.
local lpeg = require 'lpeg'
local P, S, C, Cc, Ct = lpeg.P, lpeg.S, lpeg.C, lpeg.Cc, lpeg.Ct
local function token(id, patt) return Ct(Cc(id) * C(patt)) end
local singleq = P "'" * ((1 - S "'\r\n\f\\") + (P '\\' * 1)) ^ 0 * "'"
local doubleq = P '"' * ((1 - S '"\r\n\f\\') + (P '\\' * 1)) ^ 0 * '"'
local white = token('whitespace', S('\r\n\f\t ')^1)
local word = token('word', (1 - S("' \r\n\f\t\""))^1)
local string = token('string', singleq + doubleq)
local tokens = Ct((string + white + word) ^ 0)
input = [["This is a string" 'another string' these are words]]
for _, tok in ipairs(lpeg.match(tokens, input)) do
  if tok[1] ~= "whitespace" then
    if tok[1] == "string" then
      print(tok[2]:sub(2,-2)) -- cut off quotes
    else
      print(tok[2])
    end
  end
end
Output:
This is a string
another string
these
are
words
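For comparison, in a regex flavor that supports alternation the same tokenization can be approximated with a single pattern. The following JavaScript sketch only illustrates the idea: it is not Lua, it does not handle escaped quotes, and the function name splitQuoted is made up.

```javascript
// Hypothetical helper: returns quoted phrases (without their quotes)
// and bare words, in order of appearance.
function splitQuoted(text) {
  // Alternatives: "double quoted" | 'single quoted' | run of non-whitespace
  const pattern = /"([^"]*)"|'([^']*)'|(\S+)/g;
  const tokens = [];
  for (const m of text.matchAll(pattern)) {
    // Exactly one of the three capture groups is defined per match.
    if (m[1] !== undefined) tokens.push(m[1]);
    else if (m[2] !== undefined) tokens.push(m[2]);
    else tokens.push(m[3]);
  }
  return tokens;
}

console.log(splitQuoted(`"This is a string" 'another string' these are words`));
```

Because the quoted alternatives are tried first, a quote at the start of a token opens a phrase, while a quote in the middle of a word (as in it's) stays part of that word.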
I am a newbie here. I tried the search but did not quite understand what I found, so I am asking the forum for help.
I want to put the result of the following code into a text box, but I get an error.
I am confused about how to overcome it and would appreciate any help. I believe the error comes from converting the LINQ grouping result to a string in order to assign it to the text box's Text property.
The goal is to display the word(s) that occur most often in a text file.
string sentence;
string[] result = {""};
sentence = txtParagraph.Text;
char[] delimiters = new char[] { ' ', '.', '?', '!' };
string[] splitStr = sentence.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
var dic = splitStr.ToLookup(w => w.ToLowerInvariant());
var orderedDic = dic.OrderByDescending(g => g.Count(m=>m.First()).ToString()));
txtFreqWord.Text = orderedDic.ToString();
Try the following to do what you are after. I am using regular expressions as well.
var resultsList = System.Text.RegularExpressions.Regex.Split("normal text here normal normal".ToLower(), @"\W+")
    .Where(s => s.Length > 3)
    .GroupBy(s => s)
    .OrderByDescending(g => g.Count());
string mostFrequent = resultsList.FirstOrDefault().Key;
To get all of them with their count, do the following:
foreach (var x in resultsList) {
    txtFreqWord.Text = txtFreqWord.Text + x.Key + " " + x.Count() +", ";
}
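The split, group, and sort pipeline is language-independent. Purely as an illustration, here is the same idea sketched in JavaScript; the function name wordFrequencies and the minLength parameter are assumptions that mirror the Length > 3 filter above.

```javascript
// Hypothetical helper: returns [word, count] pairs, most frequent first.
function wordFrequencies(text, minLength = 4) {
  const counts = new Map();
  for (const word of text.toLowerCase().split(/\W+/)) {
    if (word.length < minLength) continue; // also skips empty strings
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // Sort descending by count.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const freq = wordFrequencies("normal text here normal normal");
console.log(freq[0][0], freq[0][1]);
```

The first entry of the sorted result is the most frequent word together with its count.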