I want to remove words that are not in a list from a string.
For example, I have the string "i like pie and cake" or "pie and cake is good", and I want to remove every word that is not "pie" or "cake", ending up with the string "pie cake".
It would be great if the words it does not delete could be loaded from a table.
Here's another solution, but you may need to trim the last space in the result.
acceptable = { "pie", "cake" }
for k,v in ipairs(acceptable) do acceptable[v]=v.." " end
setmetatable(acceptable,{__index= function () return "" end})
function strip(s,t)
s=s.." "
print('"'..s:gsub("(%a+) %s*",t)..'"')
end
strip("i like pie and cake",acceptable)
strip("pie and cake is good",acceptable)
gsub is the key point here. There are other variations using gsub and a function, instead of setting a metatable for acceptable.
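As a rough sketch of the function-based variation (the name strip_with_function and the set-style keep table are illustrative, not taken from the answer), the callback returns the word when it is a key of the table and an empty string otherwise, and the trailing space is trimmed at the end:

```lua
-- Hypothetical helper: keep only the words that are keys of `keep`.
local function strip_with_function(s, keep)
  local out = s:gsub("(%a+)%s*", function(word)
    -- Kept words get a single trailing space; everything else is dropped.
    return keep[word] and (word .. " ") or ""
  end)
  -- The parentheses discard gsub's second result (the substitution count).
  return (out:gsub("%s+$", ""))
end

local keep = { pie = true, cake = true }
print(strip_with_function("i like pie and cake", keep))   --> pie cake
print(strip_with_function("pie and cake is good", keep))  --> pie cake
```

Compared with the metatable version, this keeps the lookup table free of side effects, at the cost of one function call per word.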
local function stripwords(inputstring, inputtable)
local retstring = {}
local itemno = 1;
for w in string.gmatch(inputstring, "%a+") do
if inputtable[w] then
retstring[itemno] = w
itemno = itemno + 1
end
end
return table.concat(retstring, " ")
end
This works provided that the words you want to keep are all keys of inputtable, for example { pie = true, cake = true }.
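If the words arrive as a plain list such as { "pie", "cake" }, they first have to be turned into keys. A minimal sketch, assuming a hypothetical helper named to_set (stripwords is repeated here only so the example runs on its own):

```lua
-- Hypothetical helper: turn { "pie", "cake" } into { pie = true, cake = true }.
local function to_set(list)
  local set = {}
  for _, word in ipairs(list) do
    set[word] = true
  end
  return set
end

-- Same logic as the answer's stripwords: collect the words that are keys of the table.
local function stripwords(inputstring, inputtable)
  local kept = {}
  for w in string.gmatch(inputstring, "%a+") do
    if inputtable[w] then
      kept[#kept + 1] = w
    end
  end
  return table.concat(kept, " ")
end

local keep = to_set({ "pie", "cake" })
print(stripwords("i like pie and cake", keep))   --> pie cake
print(stripwords("pie and cake is good", keep))  --> pie cake
```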
The following also implements the last part of the request (I hope):
it would be great, if the words it does not delete could be loaded from a table.
function stripwords(str, words)
local w = {};
return str:gsub("([^%s.,!?]+)%s*", function(word)
if words[word] then return "" end
w[#w+1] = word
end), w;
end
Keep in mind that Lua's pattern matcher is not multibyte-aware: character classes such as %a work on single bytes. This is why I used the pattern above. If you don't care about multibyte strings, you can use something like "(%a+)%s". In that case I would also run the words through string.upper, so that the lookup is case-insensitive.
Tests / Usage
local blacklist = { some = true, are = true, less = true, politics = true }
print((stripwords("There are some nasty words in here!", blacklist)))
local r, t = stripwords("some more are in politics here!", blacklist);
print(r);
for k,v in pairs(t) do
print(k, v);
end
Related
I'm trying to find the difference in text between two string values in Lua, and I'm just not quite sure how to do this effectively. I'm not very experienced in working with string patterns, and I'm sure that's my downfall on this one. Here's an example:
-- Original text
local text1 = "hello there"
-- Changed text
local text2 = "hello.there"
-- Finding the alteration of original text with some "pattern"
print(text2:match("pattern"))
In the example above, I'd want to output the text ".", since that's the difference between the two texts. Same goes for cases where the difference could be sensitive to a string pattern, like this:
local text1 = "hello there"
local text2 = "hello()there"
print(text2:match("pattern"))
In this example, I'd want to print "(" since at that point the new string is no longer consistent with the old one.
If anyone has any insight on this, I'd really appreciate it. Sorry I couldn't give more to work with code-wise, I'm just not sure where to begin.
Just iterate over the strings and find when they don't match.
function StringDifference(str1,str2)
for i = 1,#str1 do --Loop over strings
if str1:sub(i,i) ~= str2:sub(i,i) then --If that character is not equal to its counterpart
return i --Return that index
end
end
return #str1+1 --Return the index after where the shorter one ends as fallback.
end
print(StringDifference("hello there", "hello.there"))
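To print the differing character itself rather than its index, the returned position can be passed to string.sub. A small sketch (first_difference is an illustrative name; scanning up to the longer length and returning nil for equal strings are assumptions about the desired behaviour, not part of the answer above):

```lua
-- Hypothetical variant: index of the first differing character, or nil if equal.
local function first_difference(a, b)
  -- Scan up to the longer length; sub() past the end yields "", which counts as a difference.
  for i = 1, math.max(#a, #b) do
    if a:sub(i, i) ~= b:sub(i, i) then
      return i
    end
  end
  return nil
end

local old, new = "hello there", "hello.there"
local i = first_difference(old, new)
print(i, new:sub(i, i))  -- position of the change and the changed character
```

Because this compares plain characters, pattern-sensitive text such as "(" needs no escaping.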
local function get_inserted_text(old, new)
local prv = {}
for o = 0, #old do
prv[o] = ""
end
for n = 1, #new do
local nxt = {[0] = new:sub(1, n)}
local nn = new:sub(n, n)
for o = 1, #old do
local result
if nn == old:sub(o, o) then
result = prv[o-1]
else
result = prv[o]..nn
if #nxt[o-1] <= #result then
result = nxt[o-1]
end
end
nxt[o] = result
end
prv = nxt
end
return prv[#old]
end
Usage:
print(get_inserted_text("hello there", "hello.there")) --> .
print(get_inserted_text("hello there", "hello()there")) --> ()
print(get_inserted_text("hello there", "hello htere")) --> h
print(get_inserted_text("hello there", "heLlloU theAre")) --> LUA
I searched on Google and on Stack Overflow and didn't find an answer to this question. Looking at the documentation, I couldn't see how to do this, because every function that splits a string excludes the delimiter.
EDIT
function split(string, delimiter) -- Got this function from https://helloacm.com/split-a-string-in-lua/
  local result = {};
  for match in (string..delimiter):gmatch("(.-)"..delimiter) do
    table.insert(result, match);
  end
  return result;
end
local text = "Hello<a>World</a>!"
for i, word in pairs(split(text, "<(.-)>")) do
  print(word)
end
This code drops the parts that match "<(.-)>" instead of keeping them in the result.
Example:
Input: "Hello<a>World</a>!"
Expected Output: {"Hello", "<a>", "World", "</a>", "!"}
Real Output: {"Hello", "World", "!"}
s = "Hello<a>World</a>!"
for a in s:gsub('%b<>','\0%0\0'):gmatch'%Z+' do
print(a)
end
I assume this is related to HTML tags or similar.
One quick-n-dirty possibility I can think of that should cover your specific use case is this:
s = 'Hello<a>World</a>!'
function split(s)
local ans = {}
for a,b in (s..'<>'):gmatch '(.-)(%b<>)' do
ans[#ans+1] = a
ans[#ans+1] = b
end
ans[#ans] = nil
return ans
end
for _,v in ipairs(split(s)) do
print(v)
end
There are some discussions here, and utility functions, for splitting strings, but I need an ad-hoc one-liner for a very simple task.
I have the following string:
local s = "one;two;;four"
And I want to split it on ";". Eventually, I want to get { "one", "two", "", "four" } in return.
So I tried to do:
local s = "one;two;;four"
local words = {}
for w in s:gmatch("([^;]*)") do table.insert(words, w) end
But the result (the words table) is { "one", "", "two", "", "", "four", "" }. That's certainly not what I want.
Now, as I remarked, there are some discussions here on splitting strings, but they have "lengthy" functions in them and I need something succinct. I need this code for a program where I show the merit of Lua, and if I add a lengthy function to do something so trivial it would go against me.
local s = "one;two;;four"
local words = {}
for w in (s .. ";"):gmatch("([^;]*);") do
table.insert(words, w)
end
Appending one extra ; turns the string into "one;two;;four;", so every field, including the empty one, is followed by a separator. The pattern "([^;]*);" then captures each field: a greedy run of zero or more characters that are not ;, followed by a ;.
Test:
for n, w in ipairs(words) do
print(n .. ": " .. w)
end
Output:
1: one
2: two
3:
4: four
Just changing * to + works.
local s = "one;two;;four"
local words = {}
for w in s:gmatch("([^;]+)") do
table.insert(words, w)
print(w)
end
The magic character * matches 0 or more occurrences, so when the matcher reaches a ';', Lua matches an empty string there, because zero characters of [^;] is still a valid match.
Sorry for my carelessness: words[3] should be an empty string, which the + version drops. However, when I run the original code in the Lua 5.4 interpreter, everything works.
function split(str,sep)
local array = {}
local reg = string.format("([^%s]+)",sep)
for mem in string.gmatch(str,reg) do
table.insert(array, mem)
end
return array
end
local s = "one;two;;four"
local array = split(s,";")
for n, w in ipairs(array) do
print(n .. ": " .. w)
end
result:
1: one
2: two
3: four
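Note that this split drops the empty field between the two adjacent separators. A minimal sketch of a variant that keeps empty fields, assuming a hypothetical helper named split_keep_empty and a non-empty separator that is treated as plain text rather than as a pattern:

```lua
-- Hypothetical helper: split on a literal separator, keeping empty fields.
local function split_keep_empty(str, sep)
  assert(sep ~= "", "separator must not be empty")
  local fields, start = {}, 1
  while true do
    -- The fourth argument `true` turns off pattern matching (plain find).
    local first, last = str:find(sep, start, true)
    if not first then
      fields[#fields + 1] = str:sub(start)  -- remainder after the last separator
      break
    end
    fields[#fields + 1] = str:sub(start, first - 1)
    start = last + 1
  end
  return fields
end

for n, w in ipairs(split_keep_empty("one;two;;four", ";")) do
  print(n .. ": " .. w)
end
```

Since it does not rely on how gmatch treats empty matches, the result should not depend on the Lua version.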
I'm trying to write a library in Lua with some functions that manipulate strings.
I want to write a function that changes only the odd-numbered characters of each word to upper case.
This is an example:
Input: This LIBRARY should work with any string!
Result: ThIs LiBrArY ShOuLd WoRk WiTh AnY StRiNg!
I tried the "gsub" function, but I found it really difficult to use.
This almost works:
original = "This LIBRARY should work with any string!"
print(original:gsub("(.)(.)",function (x,y) return x:upper()..y end))
It fails when the string has odd length and the last char is a letter, as in
original = "This LIBRARY should work with any strings"
I'll leave that case as an exercise.
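One possible way to cover that case, offered as a sketch rather than as the intended solution: make the second capture optional with "(.?)" so that a trailing single character still matches, and apply the substitution word by word so the alternation restarts at each word. The name alternate_case and the lower-casing of even characters are assumptions based on the expected result in the question.

```lua
-- Hypothetical helper: upper-case odd characters and lower-case even ones, per word.
local function alternate_case(text)
  return (text:gsub("%a+", function(word)
    -- "(.)(.?)" takes characters in pairs; the second capture may be empty.
    return (word:gsub("(.)(.?)", function(odd, even)
      return odd:upper() .. even:lower()
    end))
  end))
end

print(alternate_case("This LIBRARY should work with any string!"))
--> ThIs LiBrArY ShOuLd WoRk WiTh AnY StRiNg!
```

The extra parentheses around each gsub call discard the substitution count, so only the string is returned.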
First, split the string into an array of words:
local original = "This LIBRARY should work with any string!"
local words = {}
for v in original:gmatch("%w+") do
words[#words + 1] = v
end
Then, write a function that converts a word as expected, with odd characters in upper case and even characters in lower case:
function changeCase(str)
local u = ""
for i = 1, #str do
if i % 2 == 1 then
u = u .. string.upper(str:sub(i, i))
else
u = u .. string.lower(str:sub(i, i))
end
end
return u
end
Use the function to modify every word:
for i,v in ipairs(words) do
words[i] = changeCase(v)
end
Finally, use table.concat to join the words into one string:
local result = table.concat(words, " ")
print(result)
-- Output: ThIs LiBrArY ShOuLd WoRk WiTh AnY StRiNg
Since I am coding mostly in Haskell lately, a functional-ish solution comes to mind:
local function head(str) return str:sub(1, 1) end  -- first character
local function tail(str) return str:sub(2) end     -- everything after it
local function helper(str, c)
  if #str == 0 then
    return ""
  end
  if c % 2 == 1 then
    -- odd position: upper-case the character
    return head(str):upper() .. helper(tail(str), c + 1)
  else
    return head(str) .. helper(tail(str), c + 1)
  end
end
function foo(str)
  return helper(str, 1)
end
Disclaimer: Not tested, just showing the idea.
And now for real: Lua strings cannot be indexed with [], but str:sub(i, i) gives you random access to each character, so a simple for loop with an index should do the trick just fine.
Problem Description:
Hi there. I'm trying to figure out how to use the Lua function "string.gsub". I've been reading the manual, which says:
This is a very powerful function and can be used in multiple ways.
Used simply it can replace all instances of the pattern provided with
the replacement. A pair of values is returned, the modified string and
the number of substitutions made. The optional fourth argument n can
be used to limit the number of substitutions made:
> = string.gsub("Hello banana", "banana", "Lua user")
Hello Lua user 1
> = string.gsub("banana", "a", "A", 2) -- limit substitutions made to 2
bAnAna 2
Question
When it says that a pair of values is returned; how do I get the new string value?
Code
local email_filename = "/var/log/test.txt"
local email_contents_file_exists = function(filename)
file = io.open(filename, "r")
if file == nil then
return false
else
file.close(file)
return true
end
end
local read_email_contents_file = function()
print('inside the function')
if not email_contents_file_exists(email_filename) then
return false
end
local f = io.open(email_filename, "rb")
local content = f:read("*all")
f:close()
print(content)
--content = string.gsub(content, '[username]', 'myusername')
--local tmp {}
--tmp = string.gsub(content, '[username]', 'myusername')
print(string.gsub(content, '[username]', 'myusername'))
return content
end
local test = read_email_contents_file()
What I've Tried So Far:
I've tried just printing the results, as you see above. That returns a bunch of garbled text. Tried saving to original string and I've also tried saving the results to an array (local tmp = {})
Any suggestions?
> = string.gsub('banana', 'a', 'A', 2)
bAnAna 2
> = (string.gsub('banana', 'a', 'A', 2))
bAnAna
You were going pretty good with reading the Lua users wiki.
In Lua, when a function returns more than one value, you can access them all as follows:
function sth()
return 1, "hi", false
end
x, y, z, a, b, c = sth() -- x = 1; y = "hi" and z = false(boolean); a = b = c = nil
Now, coming back to the string.gsub function: it returns two values. The first is the processed string and the second is the number of substitutions gsub made on the input string.
So, to get the new string value, something like this would be best:
local tempString = string.gsub(content, '%[username%]', 'myusername')
OR
local tempString = content:gsub('%[username%]', 'myusername')
Of course, here you need to be aware of the various patterns used in Lua, which are described in the Programming in Lua book.
You need to escape [ and ] because they are magic characters in Lua patterns.
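A short sketch of the difference, using a made-up content string (the sample text is an assumption, not the asker's file): unescaped, [username] is a character set matching any single one of those letters, while %[username%] matches the literal placeholder.

```lua
-- Illustrative sample text; the real content would come from the file.
local content = "Dear [username], welcome back."

-- Escaped brackets: matches the literal text "[username]".
local new_content, count = content:gsub("%[username%]", "myusername")
print(new_content)  --> Dear myusername, welcome back.
print(count)        --> 1

-- Unescaped: every single letter from the set is replaced, which garbles the text.
local mangled = content:gsub("[username]", "?")
print(mangled)
```

Assigning to two variables captures both return values; assigning to one keeps only the new string.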