Lua split strings into Keys and Values of a table - string

I want to split two strings and return a table in which the pieces of one string are the keys and the pieces of the other are the values.
So if:
String1 = "Key1,Key2,Key3,Key4,Key Ect..."
String2 = "Value1,Value2,Value3,Value4,Value Ect..."
The output would be a table as follows:
Key1 - Value1
Key2 - Value2
Key3 - Value3
Key4 - Value4
Key Ect... - Value Ect...
I have been looking at this split function I found on the Lua wiki
split(String2, ",")
function split(str, pat) -- the body refers to the argument as str
local t = {} -- NOTE: use {n = 0} in Lua-5.0
local fpat = "(.-)" .. pat
local last_end = 1
local s, e, cap = str:find(fpat, 1)
while s do
if s ~= 1 or cap ~= "" then
table.insert(t,cap)
end
last_end = e+1
s, e, cap = str:find(fpat, last_end)
end
if last_end <= #str then
cap = str:sub(last_end)
table.insert(t, cap)
end
return t
end
But of course this only returns:
1 - Value1
2 - Value2
and so on...
I'm going to start trying to modify this code, but I don't know how far I'll get.

You can use it directly like this:
local t1 = split(String1, ",")
local t2 = split(String2, ",")
local result = {}
for k, v in ipairs(t1) do
result[v] = t2[k]
end
Or, create your own iterator:
local function my_iter(t1, t2)
local i = 0
return function() i = i + 1; return t1[i], t2[i] end
end
local result = {}
for v1, v2 in my_iter(t1, t2) do
result[v1] = v2
end
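For reference, the two steps above (split both strings, then pair the pieces by index) can be folded into one helper. This is only a minimal sketch and not part of the original answer: the name zip_to_table is hypothetical, and it assumes a single-character separator with no special meaning inside a character class, and no empty fields in either list.

```lua
-- Hypothetical helper: pairs the i-th field of keys_str with the i-th field of values_str.
local function zip_to_table(keys_str, values_str, sep)
  sep = sep or ","
  local pattern = "[^" .. sep .. "]+"
  -- Collect the values first so they can be looked up by index.
  local values = {}
  for v in values_str:gmatch(pattern) do
    values[#values + 1] = v
  end
  local result, i = {}, 0
  for k in keys_str:gmatch(pattern) do
    i = i + 1
    result[k] = values[i]  -- nil when there are fewer values than keys
  end
  return result
end

local t = zip_to_table("Key1,Key2,Key3", "Value1,Value2,Value3")
print(t.Key1, t.Key2, t.Key3)
```

If the key list is longer than the value list, the extra keys are simply absent from the result, because assigning nil to a table field stores nothing.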

The code below avoids creating two temporary tables:
function join(s1,s2)
local b1,e1,k=1
local b2,e2,v=1
local t={}
while true do
b1,e1,k=s1:find("([^,]+)",b1)
if b1==nil then break end
b1=e1+1
b2,e2,v=s2:find("([^,]+)",b2)
if b2==nil then break end
b2=e2+1
t[k]=v
end
return t
end
String1 = "Key1,Key2,Key3,Key4"
String2 = "Value1,Value2,Value3,Value4"
for k,v in pairs(join(String1,String2)) do
print(k,v)
end

Related

Lua split string to table

I'm looking for the most efficient way to split a Lua string into a table.
I found two possible ways using gmatch or gsub and tried to make them as fast as possible.
function string:split1(sep)
local sep = sep or ","
local result = {}
local i = 1
for c in (self..sep):gmatch("(.-)"..sep) do
result[i] = c
i = i + 1
end
return result
end
function string:split2(sep)
local sep = sep or ","
local result = {}
local pattern = string.format("([^%s]+)", sep)
local i = 1
self:gsub(pattern, function (c) result[i] = c i = i + 1 end)
return result
end
The second option takes ~50% longer than the first.
What is the right way and why?
Added: I added a third function with the same pattern.
It shows the best result.
function string:split3(sep)
local sep = sep or ","
local result = {}
local i = 1
for c in self:gmatch(string.format("([^%s]+)", sep)) do
result[i] = c
i = i + 1
end
return result
end
"(.-)"..sep treats sep as a sequence: it splits only where the whole separator occurs.
"([^" .. sep .. "]+)" treats sep as a set of single characters: it splits at every character that occurs in sep.
string.format("([^%s]+)", sep) is faster than "([^" .. sep .. "]+)".
The string.format("(.-)%s", sep) shows almost the same time as "(.-)"..sep.
result[i]=c i=i+1 is faster than result[#result+1]=c and table.insert(result,c)
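The difference between the two patterns shows up as soon as the separator is longer than one character. A small sketch, using standalone functions instead of the string methods above (split_seq and split_set are hypothetical names) and assuming the separator contains no pattern magic characters:

```lua
-- Splits on the whole separator sequence.
local function split_seq(str, sep)
  local result, i = {}, 1
  for c in (str .. sep):gmatch("(.-)" .. sep) do
    result[i] = c
    i = i + 1
  end
  return result
end

-- Splits on any single character that appears in sep.
local function split_set(str, sep)
  local result, i = {}, 1
  for c in str:gmatch("[^" .. sep .. "]+") do
    result[i] = c
    i = i + 1
  end
  return result
end

local a = split_seq("a b, c d", ", ")  -- { "a b", "c d" }
local b = split_set("a b, c d", ", ")  -- { "a", "b", "c", "d" }
print(#a, #b)
```

With a one-character separator and no empty fields, both functions return the same pieces.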
Code for test:
local init = os.clock()
local initialString = [[1,2,3,"afasdaca",4,"acaac"]]
local temTable = {}
for i = 1, 1000 do
table.insert(temTable, initialString)
end
local dataString = table.concat(temTable,",")
print("Creating data: ".. (os.clock() - init))
init = os.clock()
local data1 = {}
for i = 1, 1000 do
data1 = dataString:split1(",")
end
print("split1: ".. (os.clock() - init))
init = os.clock()
local data2 = {}
for i = 1, 1000 do
data2 = dataString:split2(",")
end
print("split2: ".. (os.clock() - init))
init = os.clock()
local data3 = {}
for i = 1, 1000 do
data3 = dataString:split3(",")
end
print("split3: ".. (os.clock() - init))
Times:
Creating data: 0.000229
split1: 1.189397
split2: 1.647402
split3: 1.011056
The gmatch version is preferred. gsub is intended for "global substitution" - string replacement - rather than iterating over matches; accordingly it presumably has to do more work.
The comparison isn't quite fair though as your patterns differ: For gmatch you use "(.-)"..sep and for gsub you use "([^" .. sep .. "]+)". Why don't you use the same pattern for both? In newer Lua versions you could even use the frontier pattern.
The different patterns also lead to different behavior: The gmatch-based func will return empty matches whereas the others won't. Note that the "([^" .. sep .. "]+)" pattern allows you to omit the parentheses.
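To make the behavioral difference concrete, here is a minimal sketch (collect is a hypothetical helper, not part of the code in the question) that runs both patterns over input containing an empty field:

```lua
-- Gathers everything an iterator yields into a table.
local function collect(iter)
  local t = {}
  for c in iter do
    t[#t + 1] = c
  end
  return t
end

local input = "a,,b"
-- "(.-)," keeps the empty field between the two commas.
local with_empty = collect((input .. ","):gmatch("(.-),"))
-- "[^,]+" needs at least one non-comma character, so the empty field is skipped.
local without_empty = collect(input:gmatch("[^,]+"))
print(#with_empty, #without_empty)
```

Which behavior is right depends on the data: for CSV-like input where column positions matter, dropping empty fields shifts every later column.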

Getting all strings in a lua script

I'm trying to encode the strings in my Lua script. Since the script has over 200k characters, I want to do this automatically, converting each string literal such as the examples below
local string = "stackoverflow"
local string = [[stackoverflow]]
local string = [==[stackoverflow]==]
local string = 'stackoverflow'
to
local string=decode("jkrtbfmviwcfn",519211)
I'm trying to catch all of the literal forms above with a gsub and have the gsub callback encode the string text with a random offset.
So far, I have only managed to catch strings in double quotation marks.
function encode(x,offset,a)
for char in string.gmatch(x, "%a") do
local encrypted = string.byte(char) + offset
while encrypted > 122 do
encrypted = encrypted - 26
end
while encrypted < 97 do
encrypted = encrypted + 26
end
a[#a+1] = string.char(encrypted)
end
return table.concat(a)
end
luacode=[==[thatstring.Value="Encryptme!" testvalue.Value=[[string with
a linebreak]] string.Text="STOP!"]==]
luacode=luacode:gsub([=["(.-)"]=],function(s)
print("Caught "..s)
local offset=math.random(1,4)
local encoded=encode(s,offset,{})
return [[decode("]]..encoded..[[",]]..offset..[[)]]
end)
print("\n"..luacode)
With its output being
Caught Encryptme!
Caught STOP!
thatstring.Value=decode("crgvctxqi",4) testvalue.Value=[[string with
a linebreak]] string.Text=decode("opkl",2)
Any better solutions?
local function strings_and_comments(lua_code, callback)
-- lua_code must be valid Lua code (an error may be raised on syntax error)
-- callback will be invoked as callback(object_type, value, start_pos, end_pos)
-- callback("comment", comment_text, start_pos, end_pos) -- for comments
-- callback("string", string_value, start_pos, end_pos) -- for string literals
local objects = {} -- possible comments and string literals in the code
-- search for all start positions of comments (with false positives)
for pos, br1, eq, br2 in lua_code:gmatch"()%-%-(%-*%[?)(=*)(%[?)" do
table.insert(objects, {start_pos = pos,
terminator = br1 == "[" and br2 == "[" and "]"..eq.."]" or "\n"})
end
-- search for all start positions of string literals (with false positives)
for pos, eq in lua_code:gmatch"()%[(=*)%[[%[=]*" do
table.insert(objects, {is_string = true, start_pos = pos,
terminator = "]"..eq.."]"})
end
for pos, quote in lua_code:gmatch"()(['\"])" do
table.insert(objects, {is_string = true, start_pos = pos, quote = quote})
end
table.sort(objects, function(a, b) return a.start_pos < b.start_pos end)
local end_pos = 0
for _, object in ipairs(objects) do
local start_pos, ok, symbol = object.start_pos
if start_pos > end_pos then
if object.terminator == "\n" then
end_pos = lua_code:find("\n", start_pos + 1, true) or #lua_code
-- exclude last spaces and newline
while lua_code:sub(end_pos, end_pos):match"%s" do
end_pos = end_pos - 1
end
elseif object.terminator then
ok, end_pos = lua_code:find(object.terminator, start_pos + 1, true)
assert(ok, "Not a valid Lua code")
else
end_pos = start_pos
repeat
ok, end_pos, symbol = lua_code:find("(\\?.)", end_pos + 1)
assert(ok, "Not a valid Lua code")
until symbol == object.quote
end
local value = lua_code:sub(start_pos, end_pos):gsub("^%-*%s*", "")
if object.terminator ~= "\n" then
value = assert((loadstring or load)("return "..value))()
end
callback(object.is_string and "string" or "comment", value, start_pos, end_pos)
end
end
end
local inv256
local function encode(str)
local seed = math.random(0x7FFFFFFF)
local result = '",'..seed..'))'
if not inv256 then
inv256 = {}
for M = 0, 127 do
local inv = -1
repeat inv = inv + 2
until inv * (2*M + 1) % 256 == 1
inv256[M] = inv
end
end
repeat
seed = seed * 3
until seed > 2^43
local K = 8186484168865098 + seed
result = '(decode("'..str:gsub('.',
function(m)
local L = K % 274877906944 -- 2^38
local H = (K - L) / 274877906944
local M = H % 128
m = m:byte()
local c = (m * inv256[M] - (H - M) / 128) % 256
K = L * 21271 + H + c + m
return ('%02x'):format(c)
end
)..result
return result
end
function hide_strings_in_lua_code(lua_code)
local text = { [[
local function decode(str, seed)
repeat
seed = seed * 3
until seed > 2^43
local K = 8186484168865098 + seed
return (str:gsub('%x%x',
function(c)
local L = K % 274877906944 -- 2^38
local H = (K - L) / 274877906944
local M = H % 128
c = tonumber(c, 16)
local m = (c + (H - M) / 128) * (2*M + 1) % 256
K = L * 21271 + H + c + m
return string.char(m)
end
))
end
]] }
local pos = 1
strings_and_comments(lua_code,
function (object_type, value, start_pos, end_pos)
if object_type == "string" then
table.insert(text, lua_code:sub(pos, start_pos - 1))
table.insert(text, encode(value))
pos = end_pos + 1
end
end)
table.insert(text, lua_code:sub(pos))
return table.concat(text)
end
Usage:
math.randomseed(os.time())
-- This is the program to be converted
local luacode = [===[
print"Hello world!"
print[[string with
a linebreak]]
local str1 = "stackoverflow"
local str2 = [[stackoverflow]]
local str3 = [==[stackoverflow]==]
local str4 = 'stackoverflow'
print(str1)
print(str2)
print(str3)
print(str4)
]===]
-- Conversion
print(hide_strings_in_lua_code(luacode))
Output (the converted program):
local function decode(str, seed)
repeat
seed = seed * 3
until seed > 2^43
local K = 8186484168865098 + seed
return (str:gsub('%x%x',
function(c)
local L = K % 274877906944 -- 2^38
local H = (K - L) / 274877906944
local M = H % 128
c = tonumber(c, 16)
local m = (c + (H - M) / 128) * (2*M + 1) % 256
K = L * 21271 + H + c + m
return string.char(m)
end
))
end
print(decode("ef869b23b69b7fbc7f89bbe7",2686976))
print(decode("c2dc20f7061c452db49302f8a1d9317aad1009711e0984",1210253312))
local str1 = (decode("84854df4599affe9c894060431",415105024))
local str2 = (decode("a5d7db792f0b514417827f34e3",1736704000))
local str3 = (decode("6a61bcf9fd6f403ed1b4846e58",1256259584))
local str4 = (decode("cad56d9dea239514aca9c8b8e0",1030488064))
print(str1)
print(str2)
print(str3)
print(str4)
Output produced by running the converted program:
Hello world!
string with
a linebreak
stackoverflow
stackoverflow
stackoverflow
stackoverflow
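The encoder and decoder above undo each other because every odd number has a multiplicative inverse modulo 256: decode multiplies by 2*M + 1, and encode multiplies by the matching entry of inv256. The standalone check below rebuilds that table the same way encode does and confirms the round trip for every byte. It only illustrates the arithmetic and is not part of the conversion itself.

```lua
-- Build the table of inverses of the odd numbers modulo 256, as in encode().
local inv256 = {}
for M = 0, 127 do
  local inv = -1
  repeat inv = inv + 2
  until inv * (2*M + 1) % 256 == 1
  inv256[M] = inv
end

-- For every multiplier and every byte, multiplying by the inverse and then
-- by the multiplier itself gives the original byte back.
local ok = true
for M = 0, 127 do
  for m = 0, 255 do
    local c = (m * inv256[M]) % 256
    if c * (2*M + 1) % 256 ~= m then
      ok = false
    end
  end
end
print(ok)
```

Even multipliers would not work here, since they share a factor with 256 and therefore have no inverse; that is why the code uses 2*M + 1.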

Compare to string of names

I am trying to compare the names in two strings and pick out the names that are not included in the other string.
h = 1;
for i = 1:name_size_main
checker = 0;
main_name = main(i);
for j = 1:name_size_image
image_name = image(j);
temp = strcmpi(image_name, main_name);
if temp == 1;
checker = temp;
end
end
if checker == 0
result(h) = main_name;
h = h+1;
end
end
But it keeps returning the entire main string as the result. The main string contains roughly 1000 names and the image string about 300, so result should hold about 700 names, yet it keeps returning all 1000.
I tried your code with small vectors:
main = ['aaa' 'bbb' 'ccc' 'ddd'];
image = ['bbb' 'ddd'];
name_size_main = size(main,2);
name_size_image = size(image,2);
h = 1;
for i = 1:name_size_main
checker = 0;
main_name = main(i);
for j = 1:name_size_image
image_name = image(j);
temp = strcmpi(image_name, main_name);
if temp == 1;
checker = temp;
end
end
if checker == 0
result(h) = main_name;
h = h+1;
end
end
I get result = 'aaaccc'. Isn't that what you want?
EDIT:
If you are using cell arrays, you should change the line result(h) = main_name; to result{h} = main_name; like this:
main = {'aaa' 'bbb' 'ccc' 'ddd'};
image = {'bbb' 'ddd'};
name_size_main = size(main,2);
name_size_image = size(image,2);
result = cell(0);
h = 1;
for i = 1:name_size_main
checker = 0;
main_name = main{i}; % curly braces return the string itself, not a 1x1 cell
for j = 1:name_size_image
image_name = image{j};
temp = strcmpi(image_name, main_name);
if temp == 1;
checker = temp;
end
end
if checker == 0
result{h} = main_name;
h = h+1;
end
end
You can use cell arrays of strings along with setdiff or setxor.
A = cellstr(('a':'t')') % a cell of string, 'a' to 't'
B = cellstr(('f':'z')') % 'f' to 'z'
C1 = setdiff(A,B,'rows') % gives 'a' to 'e'
C2 = setdiff(B,A,'rows') % gives 'u' to 'z'
C3 = setxor(A,B,'rows') % gives 'a' to 'e' and 'u' to 'z'

Find the last index of a character in a string

I want to be able to use a lastIndexOf method for the strings in my Lua (Luvit) project. Unfortunately there's no such method built in, and I'm a bit stuck now.
In Javascript it looks like:
'my.string.here.'.lastIndexOf('.') // returns 14
function findLast(haystack, needle)
local i=haystack:match(".*"..needle.."()")
if i==nil then return nil else return i-1 end
end
s='my.string.here.'
print(findLast(s,"%."))
print(findLast(s,"e"))
Note that to find . you need to escape it.
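If the needle is meant as literal text, the escaping can be automated. A minimal sketch under that assumption: escape_pattern and find_last_plain are hypothetical names, and the wrapper returns the position of the last character of the match, like findLast above.

```lua
local function escape_pattern(s)
  -- Prefix each punctuation character with "%" so it is matched literally.
  return (s:gsub("%p", "%%%0"))
end

local function find_last_plain(haystack, needle)
  -- The greedy ".*" skips as far as possible; "()" captures the position
  -- just after the last occurrence of the needle.
  local i = haystack:match(".*" .. escape_pattern(needle) .. "()")
  if i == nil then return nil end
  return i - 1
end

print(find_last_plain("my.string.here.", "."))     --> 15
print(find_last_plain("my.string.here.", "here"))  --> 14
print(find_last_plain("my.string.here.", "xyz"))   --> nil
```

For a needle longer than one character the result is the index where the match ends, so subtract #needle - 1 to get the index where it starts.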
If you have performance concerns, then this might be a bit faster if you're using Luvit which uses LuaJIT.
local find = string.find
local function lastIndexOf(haystack, needle)
local i, j
local k = 0
repeat
i = j
j, k = find(haystack, needle, k + 1, true)
until j == nil
return i
end
local s = 'my.string.here.'
print(lastIndexOf(s, '.')) -- This will be 15.
Keep in mind that Lua strings begin at 1 instead of 0 as in JavaScript.
Here’s a solution using LPeg’s position capture.
local lpeg = require "lpeg"
local Cp, P = lpeg.Cp, lpeg.P
local lpegmatch = lpeg.match
local cache = { }
local find_last = function (str, substr)
if not (str and substr)
or str == "" or substr == ""
then
return nil
end
local pat = cache [substr]
if not pat then
local p_substr = P (substr)
local last = Cp() * p_substr * Cp() * (1 - p_substr)^0 * -1
pat = (1 - last)^0 * last
cache [substr] = pat
end
return lpegmatch (pat, str)
end
find_last() finds the last occurrence of substr in the string
str, where substr can be a string of any length.
The first return value is the position of the first character of
substr in str, the second return value is the position of the
first character following substr (i.e. it equals the length of the
match plus the first return value).
Usage:
local tests = {
A = [[fooA]], --> 4, 5
[""] = [[foo]], --> nil
FOO = [[]], --> nil
K = [[foo]], --> nil
X = [[X foo X bar X baz]], --> 13, 14
XX = [[foo XX X XY bar XX baz X]], --> 17, 19
Y = [[YYYYYYYYYYYYYYYYYY]], --> 18, 19
ZZZ = [[ZZZZZZZZZZZZZZZZZZ]], --> 14, 17
--- Accepts patterns as well!
[P"X" * lpeg.R"09"^1] = [[fooX42barXxbazX]], --> 4, 7
}
for substr, str in next, tests do
print (">>", substr, str, "->", find_last (str, substr))
end
To search for the last instance of string needle in haystack:
function findLast(haystack, needle)
--Set the third arg to false to allow pattern matching
local found = haystack:reverse():find(needle:reverse(), nil, true)
if found then
return haystack:len() - needle:len() - found + 2
else
return found
end
end
print(findLast("my.string.here.", ".")) -- 15, because Lua strings are 1-indexed
print(findLast("my.string.here.", "here")) -- 11
print(findLast("my.string.here.", "there")) -- nil
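If you want to search for the last instance of a pattern instead, change the last argument to find to false (or remove it). Be aware that needle:reverse() also reverses the pattern itself, so a pattern such as "%." becomes ".%", which is malformed; the pattern has to be written so that it is still valid, and still means what you want, after reversal.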
Can be optimized but simple and does the work.
function lastIndexOf(haystack, needle)
local last_index = 0
while haystack:sub(last_index+1, haystack:len()):find(needle) ~= nil do
last_index = last_index + haystack:sub(last_index+1, haystack:len()):find(needle)
end
return last_index
end
local s = 'my.string.here.'
print(lastIndexOf(s, '%.')) -- 15

match check in matlab

I have strings like these:
s{1,2} = 'string';
s{2,2} = 'string2';
and a structure in the workspace like this:
U.W.string = [2 2.5 3]
I want to check, in a loop, whether s{1,2}, s{2,2}, or in general s{i,2} matches a field of the structure with the same name. If it does, I want to assign the values of that field to some variable var(i). How can this be done?
Use isfield to check whether a string is the name of a field in a struct. Then use the syntax struct.(name), where name is a string, to access the field. Your code might look something like this:
test = struct('hello', 'world', 'count', 42, 'mean', 10);
fields = {'test', 'count';
'hello', 'text';
'more', 'less'};
values = {pi, 'dummy', -1};
for row = 1 : size(fields, 1)
for column = 1 : size(fields, 2)
if isfield(test, fields{row, column})
test.(fields{row, column}) = values{row};
end
end
end
This converts the initial struct
test =
hello: 'world'
count: 42
mean: 10
to this one
test =
hello: 'dummy'
count: 3.1416
mean: 10
A shorter implementation is achieved by removing the inner loop and passing a cell array to isfield:
for row = 1 : size(fields, 1)
%# Note the parenthesis instead of curly braces in the next statement.
match = isfield(test, fields(row, :));
if any(match)
test.(fields{row, match}) = values{row};
end
end
Use isfield(structName,fieldName). This should do the trick:
strings{1,1} = 'foo';
strings{1,2} = 'bar';
strings{1, 3} = 'foobar';
U.W.foo = 1;
U.W.foobar = 5;
for idx = 1:length(strings)
if(isfield(U.W,strings{1,idx}))
expression = sprintf('outvar(idx) = U.W.%s',strings{1,idx});
eval(expression);
end
end

Resources