So I have the following code to split a string between whitespaces:
text = "I am 'the text'"
for string in text:gmatch("%S+") do
print(string)
end
The result:
I
am
'the
text'
But I need to do this:
I
am
the text --[[yep, without the quotes]]
How can I do this?
Edit: to complement the question, the idea is to pass parameters from one program to another. Here is the pull request I am working on, currently in review: https://github.com/mpv-player/mpv/pull/1619
There may be ways to do this with clever parsing, but an alternative way may be to keep track of a simple state and merge fragments based on detection of quoted fragments. Something like this may work:
local text = [[I "am" 'the text' and "some more text with '" and "escaped \" text"]]
local spat, epat, buf, quoted = [=[^(['"])]=], [=[(['"])$]=]
for str in text:gmatch("%S+") do
local squoted = str:match(spat)
local equoted = str:match(epat)
local escaped = str:match([=[(\*)['"]$]=])
if squoted and not quoted and not equoted then
buf, quoted = str, squoted
elseif buf and equoted == quoted and #escaped % 2 == 0 then
str, buf, quoted = buf .. ' ' .. str, nil, nil
elseif buf then
buf = buf .. ' ' .. str
end
if not buf then print((str:gsub(spat,""):gsub(epat,""))) end
end
if buf then print("Missing matching quote for "..buf) end
This will print:
I
am
the text
and
some more text with '
and
escaped \" text
Updated to handle mixed and escaped quotes. Updated to remove quotes. Updated to handle quoted words.
Try this:
text = [[I am 'the text' and '' here is "another text in quotes" and this is the end]]
local e = 0
while true do
local b = e+1
b = text:find("%S",b)
if b==nil then break end
if text:sub(b,b)=="'" then
e = text:find("'",b+1)
b = b+1
elseif text:sub(b,b)=='"' then
e = text:find('"',b+1)
b = b+1
else
e = text:find("%s",b+1)
end
if e==nil then e=#text+1 end
print("["..text:sub(b,e-1).."]")
end
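If the fields are needed as values rather than printed, the same scan can be wrapped in a function that returns a table. This is only a sketch: the name split_quoted is hypothetical, and like the loop above it does not handle escaped quotes.

```lua
-- Hypothetical helper (not part of the answer above): same scan, but the
-- fields are collected in a table instead of being printed.
local function split_quoted(text)
  local fields = {}
  local e = 0
  while true do
    local b = text:find("%S", e + 1)       -- start of the next field
    if b == nil then break end
    local first = text:sub(b, b)
    if first == "'" or first == '"' then
      e = text:find(first, b + 1, true)    -- plain search for the matching quote
      b = b + 1                            -- skip the opening quote
    else
      e = text:find("%s", b + 1)           -- field ends at the next whitespace
    end
    if e == nil then e = #text + 1 end     -- unterminated field: take the rest
    fields[#fields + 1] = text:sub(b, e - 1)
  end
  return fields
end

local fields = split_quoted([[I am 'the text']])
for i, field in ipairs(fields) do print(i, field) end
```

Returning a table keeps the parsing separate from whatever is done with the arguments afterwards.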
Lua patterns aren't powerful enough to handle this task properly. Here is an LPeg solution adapted from the Lua Lexer. It handles both single and double quotes.
local lpeg = require 'lpeg'
local P, S, C, Cc, Ct = lpeg.P, lpeg.S, lpeg.C, lpeg.Cc, lpeg.Ct
local function token(id, patt) return Ct(Cc(id) * C(patt)) end
local singleq = P "'" * ((1 - S "'\r\n\f\\") + (P '\\' * 1)) ^ 0 * "'"
local doubleq = P '"' * ((1 - S '"\r\n\f\\') + (P '\\' * 1)) ^ 0 * '"'
local white = token('whitespace', S('\r\n\f\t ')^1)
local word = token('word', (1 - S("' \r\n\f\t\""))^1)
local string = token('string', singleq + doubleq)
local tokens = Ct((string + white + word) ^ 0)
input = [["This is a string" 'another string' these are words]]
for _, tok in ipairs(lpeg.match(tokens, input)) do
if tok[1] ~= "whitespace" then
if tok[1] == "string" then
print(tok[2]:sub(2,-2)) -- cut off quotes
else
print(tok[2])
end
end
end
Output:
This is a string
another string
these
are
words
How would I change everything that matches in the string without changing the parts that don't match?
local a = "\" Hello World! I want to replace this with a bytecoded version of this!\" but not this!"
for i in string.gmatch(a, "\".*\"") do
print(i)
end
For example, I want [["Hello World!" Don't Replace this!]] to become [["\72\101\108\108\111\32\87\111\114\108\100\33" Don't Replace this!]]
Your question is a little bit tricky because it could involve:
Lua patterns
string.gsub function
Access to string's bytes with string.byte
String concatenations with table.concat
First things first: if you work with Lua patterns, note that Lua has a very handy syntax for strings that contain quotes. Instead of opening and closing the string with a double quote, you use the markers [[ and ]]. The key difference is that between these markers you no longer have to escape quote characters!
String = [["Hello World!" Don't Replace this!]]
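As a quick check of the long-bracket syntax, the sketch below compares an escaped literal with its bracketed form and shows that escape sequences are left untouched inside [[ ]]. The variable names are only illustrative.

```lua
-- Both literals denote the same string; only the source notation differs.
local escaped = "\"Hello World!\" Don't Replace this!"
local bracketed = [["Hello World!" Don't Replace this!]]
print(escaped == bracketed)  --> true

-- Escape sequences are NOT interpreted inside long brackets:
print(#"\72")    --> 1  (the single character "H")
print(#[[\72]])  --> 3  (a backslash, "7" and "2")
```

This is why [[\%d]] can be used as a format string that emits a literal backslash.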
Then we need to build the proper Lua pattern. One possibility is to match a double quote ("), then capture every character that is not a double quote, then match the closing double quote. This gives the following pattern:
[["([^"]+)"]]
  | *****  |
  |   \-> the expression to match
  |        |
quote    quote
Then, if we study the function string.gsub, we learn that it can call a callback each time the pattern matches; the matched text is replaced by the callback's return value.
function ConvertToByteString (MatchedString)
local ByteStrings = {}
local Len = #MatchedString
ByteStrings[#ByteStrings+1] = [[\"]]
for Index = 1, Len do
local Byte = MatchedString:byte(Index)
ByteStrings[#ByteStrings+1] = string.format([[\%d]], Byte)
end
ByteStrings[#ByteStrings+1] = [[\"]]
return table.concat(ByteStrings)
end
In this function, we iterate through all the characters of the matched string. Then for each of the characters, we extract its byte value with the function string.byte and convert it to a string using the string.format function. We put this string in a temporary array that we will concatenate at the end of the function.
The function that concatenates the sub-strings into a larger string is table.concat. It is very convenient and can be used as follows:
> table.concat({ [[\10]], [[\11]], [[\12]] })
\10\11\12
The remaining thing we need to do is to test this outstanding function:
> String = [["Hello World!" Don't Replace this!]]
> NewString = String:gsub([["([^"]+)"]], ConvertToByteString)
> NewString
\"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!
Edit: I got some remarks regarding code performance. Personally, I don't focus much on performance; I focus on getting the code correct and simple. To address the performance question, I compared the versions, starting with the output each one produces:
function SOLUTION_DarkWiiPlayer (String)
local result = String:gsub('"[^"]*"', function(str)
return str:gsub('[^"]', function(char)
return "\\" .. char:byte()
end)
end)
return result
end
function SOLUTION_Robert (String)
local function ConvertToByteString (MatchedString)
local ByteStrings = {}
local Len = #MatchedString
ByteStrings[#ByteStrings+1] = [[\"]]
for Index = 1, Len do
local Byte = MatchedString:byte(Index)
ByteStrings[#ByteStrings+1] = string.format([[\%d]], Byte)
end
ByteStrings[#ByteStrings+1] = [[\"]]
return table.concat(ByteStrings)
end
local Result = String:gsub([["([^"]+)"]], ConvertToByteString)
return Result
end
function SOLUTION_Piglet (String)
return String:gsub('%b""' , function (match)
local ret = ""
for _,v in ipairs{match:byte(1, -1)} do
ret = ret .. string.format("\\%d", v)
end
return ret
end)
end
function SOLUTION_Renshaw (String)
local function convert(str)
local byte_str = ""
for i = 1, #str do
byte_str = byte_str .. "\\" .. tostring(string.byte(str, i))
end
return byte_str
end
local Result = string.gsub(String, "\"(.*)\"", function(matched_str)
return "\"" .. convert(matched_str) .. "\""
end)
return Result
end
String = "\"Hello World!\" Don't Replace this!"
print("INITIAL REQUIREMENT FROM OP ", [[\"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!]])
print("TEST SOLUTION_Robert: ", SOLUTION_Robert(String))
print("TEST SOLUTION_DarkWiiPlayer:", SOLUTION_DarkWiiPlayer(String))
print("TEST SOLUTION_Piglet: ", SOLUTION_Piglet(String))
print("TEST SOLUTION_Renshaw: ", SOLUTION_Renshaw(String))
The results show that only one answer fulfills 100% of the OP's requirements. The other answers don't handle the opening and closing double quotes " properly.
INITIAL REQUIREMENT FROM OP \"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!
TEST SOLUTION_Robert: \"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!
TEST SOLUTION_DarkWiiPlayer: "\72\101\108\108\111\32\87\111\114\108\100\33" Don't Replace this!
TEST SOLUTION_Piglet: \34\72\101\108\108\111\32\87\111\114\108\100\33\34 Don't Replace this! 1
TEST SOLUTION_Renshaw: "\72\101\108\108\111\32\87\111\114\108\100\33" Don't Replace this!
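The trailing 1 in the SOLUTION_Piglet line comes from string.gsub itself, which returns the number of substitutions as a second value; a function that returns the gsub call directly passes both values on to print. A minimal sketch (the sample string is made up for illustration):

```lua
local s = [["Hi" there]]

-- string.gsub returns the new string AND the number of substitutions,
-- so print receives two arguments here.
print(s:gsub('%b""', "X"))     --> "X there" and 1, separated by a tab

-- Extra parentheses keep only the first value.
print((s:gsub('%b""', "X")))   --> X there

-- Assigning to a single variable also drops the count.
local replaced = s:gsub('%b""', "X")
print(replaced)                --> X there
```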
To finalize this post, one can dive a little deeper and check the performance with a micro-benchmark that can be copied and pasted directly into a Lua interpreter.
function SOLUTION_DarkWiiPlayer (String)
local result = String:gsub('"[^"]*"', function(str)
return str:gsub('[^"]', function(char)
return "\\" .. char:byte()
end)
end)
return result
end
function SOLUTION_Robert (String)
local function ConvertToByteString (MatchedString)
local ByteStrings = {}
local Len = #MatchedString
ByteStrings[#ByteStrings+1] = [[\"]]
for Index = 1, Len do
local Byte = MatchedString:byte(Index)
ByteStrings[#ByteStrings+1] = string.format([[\%d]], Byte)
end
ByteStrings[#ByteStrings+1] = [[\"]]
return table.concat(ByteStrings)
end
local Result = String:gsub([["([^"]+)"]], ConvertToByteString)
return Result
end
function SOLUTION_Piglet (String)
return String:gsub('%b""' , function (match)
local ret = ""
for _,v in ipairs{match:byte(1, -1)} do
ret = ret .. string.format("\\%d", v)
end
return ret
end)
end
function SOLUTION_Renshaw (String)
local function convert(str)
local byte_str = ""
for i = 1, #str do
byte_str = byte_str .. "\\" .. tostring(string.byte(str, i))
end
return byte_str
end
local Result = string.gsub(String, "\"(.*)\"", function(matched_str)
return "\"" .. convert(matched_str) .. "\""
end)
return Result
end
---
--- Micro-benchmark environment
---
COUNT = 600000
function TEST_Function (Name, Function, String, Count)
local TimerStart = os.clock()
for Index = 1, Count do
Function(String)
end
local ElapsedSeconds = (os.clock() - TimerStart)
print(string.format("[%25.25s] %f sec", Name, ElapsedSeconds))
end
String = "\"Hello World!\" Don't Replace this!"
TEST_Function("SOLUTION_DarkWiiPlayer", SOLUTION_DarkWiiPlayer, String, COUNT)
TEST_Function("SOLUTION_Robert", SOLUTION_Robert, String, COUNT)
TEST_Function("SOLUTION_Piglet", SOLUTION_Piglet, String, COUNT)
TEST_Function("SOLUTION_Renshaw", SOLUTION_Renshaw, String, COUNT)
The results show that @DarkWiiPlayer's answer is the fastest one.
[ SOLUTION_DarkWiiPlayer] 6.363000 sec
[ SOLUTION_Robert] 9.605000 sec
[ SOLUTION_Piglet] 7.943000 sec
[ SOLUTION_Renshaw] 8.875000 sec
local a = "\"Hello World!\" but not this!"
print(a:gsub('"[^"]*"', function(str)
return str:gsub('[^"]', function(char)
return "\\" .. char:byte()
end)
end))
You need string.gsub.
local a = "\"Hello World!\" Don't Replace this!"
local function convert(str)
local byte_str = ""
for i = 1, #str do
byte_str = byte_str .. "\\" .. tostring(string.byte(str, i))
end
return byte_str
end
-- Non-greedy ".-" stops at the next quote, so several quoted strings stay separate
a = string.gsub(a, "\"(.-)\"", function(matched_str)
return "\"" .. convert(matched_str) .. "\""
end)
print(a)
local a = "\" Hello World! I want to replace this with a bytecoded version of this!\" but not this!"
print((a:gsub('%b""' , function (match)
local ret = ""
for _,v in ipairs{match:byte(1, -1)} do
ret = ret .. string.format("\\%d", v)
end
return ret
end)))
Given a string like (2    5), I want to replace each run of multiple spaces with a semicolon in PowerBuilder.
Thanks in Advance
This can be done with a regular expression. However, support for regular expressions in PowerScript is minimal, so you need an external COM object like VBScript.RegExp to do something useful.
OLEObject re
int li_retcode
string s
string value
re = Create OLEObject
li_retcode = re.ConnectToNewObject("VBScript.RegExp")
re.Pattern = "\s\s+"
re.Global = True
s = "4    2"
value = re.Replace(s, ";")
MessageBox("", value) // 4;2
re.DisconnectObject()
Destroy re
Simply cut out each word, trim it, and join the pieces into a string.
string ls_key = '2 5 8 9'
string ls_new = ''
long ll_pos
do
ll_pos = Pos( ls_key, ' ')
if ll_pos > 0 then
ls_new += trim( left( ls_key, ll_pos - 1) ) + ';'
ls_key = trim( mid( ls_key, ll_pos + 1 ) )
else
ls_new += trim( ls_key )
ls_key = ''
end if
loop while ll_pos > 0
return ls_new
How would I attempt this?
I'm trying to create something that would remove all quotes (" ") in a Lua file, but I have had no luck so far. That might be because I'm a newbie at Lua.
I'm using this from GitHub.
function from_base64(to_decode)
local padded = to_decode:gsub("%s", "")
local unpadded = padded:gsub("=", "")
local bit_pattern = ''
local decoded = ''
for i = 1, string.len(unpadded) do
local char = string.sub(unpadded, i, i) -- index the cleaned string, not the raw input
local offset, _ = string.find(index_table, char)
if offset == nil then
error("Invalid character '" .. char .. "' found.")
end
bit_pattern = bit_pattern .. string.sub(to_binary(offset-1), 3)
end
for i = 1, string.len(bit_pattern), 8 do
local byte = string.sub(bit_pattern, i, i+7)
decoded = decoded .. string.char(from_binary(byte))
end
local padding_length = padded:len()-unpadded:len()
if (padding_length == 1 or padding_length == 2) then
decoded = decoded:sub(1,-2)
end
return decoded
end
I'm trying to create something that would remove all quotes (" ") in a Lua file
-- read contents of file into memory
local file = io.open(filename)
local text = file:read('*a')
file:close()
-- remove all double-quotes from the contents
text = text:gsub('"','')
-- write contents back to the file
file = io.open(filename, 'w')
file:write(text)
file:close()
I want to convert a string to a table, with the text split into characters. Every character must be a separate value in the table, for example:
a="text"
--converting string (a) to table (b)
--show table (b)
b={'t','e','x','t'}
You could use the string.gsub function:
t={}
str="text"
str:gsub(".",function(c) table.insert(t,c) end)
Just index each character and put it at the same position in the table.
local str = "text"
local t = {}
for i = 1, #str do
t[i] = str:sub(i, i)
end
The builtin string library treats Lua strings as byte arrays.
An alternative that works on multibyte (Unicode) characters is the
unicode library that
originated in the Selene project.
Its main selling point is that it can be used as a drop-in replacement
for the string library, making most string operations “magically”
Unicode-capable.
If you prefer not to add third party dependencies your task can easily
be implemented using LPeg.
Here is an example splitter:
local lpeg = require "lpeg"
local C, Ct, R = lpeg.C, lpeg.Ct, lpeg.R
local lpegmatch = lpeg.match
local split_utf8 do
local utf8_x = R"\128\191"
local utf8_1 = R"\000\127"
local utf8_2 = R"\194\223" * utf8_x
local utf8_3 = R"\224\239" * utf8_x * utf8_x
local utf8_4 = R"\240\244" * utf8_x * utf8_x * utf8_x
local utf8 = utf8_1 + utf8_2 + utf8_3 + utf8_4
local split = Ct (C (utf8)^0) * -1
split_utf8 = function (str)
str = str and tostring (str)
if not str then return end
return lpegmatch (split, str)
end
end
This snippet defines the function split_utf8() that creates a table
of UTF-8 characters (as Lua strings), but returns nil if the string
is not a valid UTF-8 sequence.
You can run this test code:
tests = {
en = [[Lua (/ˈluːə/ LOO-ə, from Portuguese: lua [ˈlu.(w)ɐ] meaning moon; ]]
.. [[explicitly not "LUA"[1]) is a lightweight multi-paradigm programming ]]
.. [[language designed as a scripting language with "extensible ]]
.. [[semantics" as a primary goal.]],
ru = [[Lua ([лу́а], порт. «луна») — интерпретируемый язык программирования, ]]
.. [[разработанный подразделением Tecgraf Католического университета ]]
.. [[Рио-де-Жанейро.]],
gr = [[Η Lua είναι μια ελαφρή προστακτική γλώσσα προγραμματισμού, που ]]
.. [[σχεδιάστηκε σαν γλώσσα σεναρίων με κύριο σκοπό τη δυνατότητα ]]
.. [[επέκτασης της σημασιολογίας της.]],
XX = ">\255< invalid"
}
-------------------------------------------------------------------------------
local limit = 14
for lang, str in next, tests do
io.write "\n"
io.write (string.format ("<%s %3d> ->", lang, #str))
local chars = split_utf8 (str)
if not chars then
io.write " INVALID!"
else
io.write (string.format (" <%3d>", #chars))
for i = 1, #chars > limit and limit or #chars do
io.write (string.format (" %q", chars [i]))
end
end
end
io.write "\n"
Btw., building a table with LPeg is significantly faster than calling
table.insert() repeatedly.
Here are stats for splitting the whole of Gogol’s Dead Souls (in
Russian, 1023814 bytes raw, 571395 characters UTF) on my machine:
library     method              time in ms
string      table.insert()      380
string      t [#t + 1] = c      310
string      gmatch & for loop   280
slnunicode  table.insert()      220
slnunicode  t [#t + 1] = c      200
slnunicode  gmatch & for loop   170
lpeg        Ct (C (...))         70
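For comparison, a dependency-free sketch using only the builtin string library is shown below. It relies on the fact that UTF-8 continuation bytes lie in the range 128-191. Unlike split_utf8() it does not validate its input, and the name split_utf8_unchecked is hypothetical. Lua 5.3 and later ship a similar pattern as utf8.charpattern.

```lua
-- Hypothetical helper: splits at UTF-8 lead bytes WITHOUT validating input.
local function split_utf8_unchecked(str)
  local chars = {}
  -- One byte that is not a continuation byte (128-191), followed by any
  -- number of continuation bytes.
  for ch in str:gmatch("[^\128-\191][\128-\191]*") do
    chars[#chars + 1] = ch
  end
  return chars
end

local chars = split_utf8_unchecked("a\195\177b")  -- "a", U+00F1, "b"
print(#chars)       --> 3
print(#chars[2])    --> 2 (bytes in the second character)
```

Malformed input is not rejected here, so the LPeg version remains the better choice when validity matters.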
You can use the code below to achieve this easily.
t = {}
str = "text"
for i=1, string.len(str) do
t[i]= (string.sub(str,i,i))
end
for k , v in pairs(t) do
print(k,v)
end
-- 1 t
-- 2 e
-- 3 x
-- 4 t
Using string.sub
string.sub(s, i [, j])
Return a substring of the string passed. The substring starts at i. If the third argument j is not given, the substring will end at the end of the string. If the third argument is given, the substring ends at and includes j.
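A few calls illustrate the argument rules described above; the sample string is the one from the question.

```lua
local s = "text"
print(s:sub(2))       --> ext   (j omitted: runs to the end of the string)
print(s:sub(2, 3))    --> ex    (ends at and includes position 3)
print(s:sub(3, 3))    --> x     (i == j gives a single character)
print(s:sub(-2))      --> xt    (negative indices count from the end)
```

The third form is the one used in the loop above to pick out each character.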
I have the string
'TEST1, TEST2, TEST3'
I want to have
'TEST1,TEST2,TEST3'
Is there a function in PowerBuilder like replace, substr, or something similar?
One way is to use the database since you probably have an active connection.
string ls_stringwithspaces = "String String String String"
string ls_stringwithnospace = ""
string ls_sql = "SELECT replace('" + ls_stringwithspaces + "', ' ', '')"
DECLARE db DYNAMIC CURSOR FOR SQLSA;
PREPARE SQLSA FROM :ls_sql USING SQLCA;
OPEN DYNAMIC db;
IF SQLCA.SQLCode <> 0 THEN
// error handling
END IF
FETCH db INTO :ls_stringwithnospace;
CLOSE db;
MessageBox("", ls_stringwithnospace)
Sure there is (you could have easily found it in the help), but it is not very helpful.
Its prototype is Replace ( string1, start, n, string2 ), so you need to know the position of the string to replace before calling it.
There is a common wrapper for this that consists of looping on pos() / replace() until there is nothing left to replace. The following is the source code of a global function:
global type replaceall from function_object
end type
forward prototypes
global function string replaceall (string as_source, string as_pattern, string as_replace)
end prototypes
global function string replaceall (string as_source, string as_pattern, string as_replace);//replace all occurrences of as_pattern in as_source by as_replace
string ls_target
long i, j
ls_target=""
i = 1
j = 1
do
i = pos( as_source, as_pattern, j )
if i>0 then
ls_target += mid( as_source, j, i - j )
ls_target += as_replace
j = i + len( as_pattern )
else
ls_target += mid( as_source, j )
end if
loop while i>0
return ls_target
end function
Beware that string functions (searching and concatenating) in PB are not that efficient. An alternative is the FastReplaceall() global function provided by the PbniRegex extension, a compiled C++ plugin for PB Classic versions 9 to 12.
I do it like this:
long space, ll_a
FOR ll_a = 1 to len(ls_string)
space = pos(ls_string, " ")
IF space > 0 THEN
ls_string= Replace(ls_string, space, 1, "")
END IF
NEXT