I am trying to separate the string data in the HTTP protocol in Wireshark using Lua, but I am not having success finding the end of the string. This is what I currently have:
HTTP_protocol = Proto("ourHTTP", "HTTPProtocol")
first =ProtoField.string("HTTP_protocol.first", "first", base.ASCII)
second =ProtoField.string("HTTP_protocol.second", "second", base.ASCII)
HTTP_protocol.fields = {first}
function HTTP_protocol.dissector(buffer, pinfo, tree)
length = buffer:len()
if length ==0 then return end
pinfo.cols.protocol = HTTP_protocol.name
local subtree = tree:add(HTTP_protocol, buffer(), "HTTPProtocol data ")
local string_length
for i = 0, length - 1, 1 do
if (buffer(i,1):uint() == '\r') then
string_length = i - 0
break
end
end
subtree:add(first, buffer(0,string_length))
end
porttable = DissectorTable.get("tcp.port")
porttable:add(80, HTTP_protocol)
I have tried searching for '\r', '\0' and '\n', but no matter what I do, I still get all the strings run together as one. Is there something I am doing wrong?
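buffer(i,1):uint() returns a number, and in Lua a number never compares equal to a string, so the test against '\r' is always false. You can use 0x0D instead, which is the ASCII code for \r. So it will end up as: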
if (buffer(i,1):uint() == 0x0D) then
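To see why the original comparison never fires, here is a minimal sketch in plain Lua, outside Wireshark. The helper name first_line is hypothetical and only used for illustration; inside a dissector the same numeric comparison would be applied to buffer(i,1):uint().

```lua
-- A number and a string are never equal in Lua, so this is always false.
print(13 == "\r")               --> false

-- string.byte returns the numeric character code, the same kind of value uint() yields.
local CR = 0x0D
print(string.byte("\r") == CR)  --> true

-- Hypothetical helper: return everything before the first CR byte.
local function first_line(data)
  for i = 1, #data do
    if data:byte(i) == CR then
      return data:sub(1, i - 1)
    end
  end
  return data  -- no CR found: keep the whole string
end

print(first_line("GET / HTTP/1.1\r\nHost: example.com\r\n"))  --> GET / HTTP/1.1
```

Note that the helper falls back to the whole string when no CR is present, so the length is never left undefined.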
How would I change everything that matches in the string without changing the parts that don't match?
local a = "\" Hello World! I want to replace this with a bytecoded version of this!\" but not this!"
for i in string.gmatch(a, "\".*\"") do
print(i)
end
For example, I want [["Hello World!" Don't Replace this!]] to become [["\72\101\108\108\111\32\87\111\114\108\100\33" Don't Replace this!]]
Your question is a little bit tricky because it could involve:
Lua patterns
string.gsub function
Access to string's bytes with string.byte
String concatenations with table.concat
First things first: when you write Lua patterns, it helps to know that Lua has a very handy syntax for strings that contain quotes. With this syntax, instead of opening and closing a string with a double quote, you do it with the characters [[ and ]]. The key difference is that between these markers you don't have to escape the double quotes anymore!
String = [["Hello World!" Don't Replace this!]]
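As a small illustration of the long-bracket syntax described above, the following sketch shows that both spellings produce the same string. The variable names are only for this example.

```lua
-- The same text written with long brackets and with escaped quotes.
local long  = [["Hello World!" Don't Replace this!]]
local short = "\"Hello World!\" Don't Replace this!"
print(long == short)  --> true

-- If the text itself contains ]], add = signs to both delimiters.
local nested = [=[a [[b]] c]=]
print(nested)         --> a [[b]] c
```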
Then we need to build the proper Lua pattern. One possibility is to match a double quote ("), capture all the following characters that are not a double quote, and then match the closing double quote ("). This gives us the following pattern:
[["([^"]+)"]]
  |*******|
  |   \-> the expression to match
  |       |
quote   quote
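A quick way to check what the pattern captures is to run it through string.gmatch. This is only a sketch; the sample string with two quoted parts is made up for the example.

```lua
-- Each iteration yields the capture, i.e. the text between a pair of double quotes.
local sample = [["Hello World!" Don't "Replace" this!]]
local found = {}
for inner in sample:gmatch([["([^"]+)"]]) do
  found[#found + 1] = inner
  print(inner)
end
--> Hello World!
--> Replace
```

Because the capture excludes the quotes, any replacement built from it has to add the quotes back itself.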
Then, if we study the function string.gsub, we learn that it can call a callback whenever the pattern matches; the match is replaced by the callback's return value. When the pattern contains a capture, the callback receives the capture rather than the whole match.
function ConvertToByteString (MatchedString)
local ByteStrings = {}
local Len = #MatchedString
ByteStrings[#ByteStrings+1] = [[\"]]
for Index = 1, Len do
local Byte = MatchedString:byte(Index)
ByteStrings[#ByteStrings+1] = string.format([[\%d]], Byte)
end
ByteStrings[#ByteStrings+1] = [[\"]]
return table.concat(ByteStrings)
end
In this function, we iterate through all the characters of the matched string. Then for each of the characters, we extract its byte value with the function string.byte and convert it to a string using the string.format function. We put this string in a temporary array that we will concatenate at the end of the function.
The function that concatenates the sub-strings into a larger string is table.concat. This is a very convenient function which can be used as follows:
> table.concat({ [[\10]], [[\11]], [[\12]] })
\10\11\12
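table.concat also accepts an optional separator as its second argument. The sketch below builds a few pieces in a loop and joins them both ways; the byte values are arbitrary.

```lua
-- Collect the pieces in a table, then join them once at the end.
local parts = {}
for byte = 72, 74 do
  parts[#parts + 1] = string.format([[\%d]], byte)
end

print(table.concat(parts))        --> \72\73\74
print(table.concat(parts, ", "))  --> \72, \73, \74
```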
The remaining thing we need to do is to test this outstanding function:
> String = [["Hello World!" Don't Replace this!]]
> NewString = String:gsub([["([^"]+)"]], ConvertToByteString)
> NewString
\"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!
Edit: I got some remarks regarding code performance. Personally, I don't focus much on performance; I focus on getting the code correct and simple. In order to address the performance question, I wrote a micro-benchmark to compare the versions, starting with a check of their output:
function SOLUTION_DarkWiiPlayer (String)
local result = String:gsub('"[^"]*"', function(str)
return str:gsub('[^"]', function(char)
return "\\" .. char:byte()
end)
end)
return result
end
function SOLUTION_Robert (String)
local function ConvertToByteString (MatchedString)
local ByteStrings = {}
local Len = #MatchedString
ByteStrings[#ByteStrings+1] = [[\"]]
for Index = 1, Len do
local Byte = MatchedString:byte(Index)
ByteStrings[#ByteStrings+1] = string.format([[\%d]], Byte)
end
ByteStrings[#ByteStrings+1] = [[\"]]
return table.concat(ByteStrings)
end
local Result = String:gsub([["([^"]+)"]], ConvertToByteString)
return Result
end
function SOLUTION_Piglet (String)
return String:gsub('%b""' , function (match)
local ret = ""
for _,v in ipairs{match:byte(1, -1)} do
ret = ret .. string.format("\\%d", v)
end
return ret
end)
end
function SOLUTION_Renshaw (String)
local function convert(str)
local byte_str = ""
for i = 1, #str do
byte_str = byte_str .. "\\" .. tostring(string.byte(str, i))
end
return byte_str
end
local Result = string.gsub(String, "\"(.*)\"", function(matched_str)
return "\"" .. convert(matched_str) .. "\""
end)
return Result
end
String = "\"Hello World!\" Don't Replace this!"
print("INITIAL REQUIREMENT FROM OP ", [[\"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!]])
print("TEST SOLUTION_Robert: ", SOLUTION_Robert(String))
print("TEST SOLUTION_DarkWiiPlayer:", SOLUTION_DarkWiiPlayer(String))
print("TEST SOLUTION_Piglet: ", SOLUTION_Piglet(String))
print("TEST SOLUTION_Renshaw: ", SOLUTION_Renshaw(String))
The results show that only one answer fulfills 100% of the OP's requirements. The other answers don't handle the opening and closing double quotes " properly.
INITIAL REQUIREMENT FROM OP \"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!
TEST SOLUTION_Robert: \"\72\101\108\108\111\32\87\111\114\108\100\33\" Don't Replace this!
TEST SOLUTION_DarkWiiPlayer: "\72\101\108\108\111\32\87\111\114\108\100\33" Don't Replace this!
TEST SOLUTION_Piglet: \34\72\101\108\108\111\32\87\111\114\108\100\33\34 Don't Replace this! 1
TEST SOLUTION_Renshaw: "\72\101\108\108\111\32\87\111\114\108\100\33" Don't Replace this!
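The trailing 1 on the SOLUTION_Piglet line comes from string.gsub itself: it returns the new string and the number of substitutions, and a function that returns the gsub call directly passes both values on. A minimal sketch (the function name wrapped is hypothetical):

```lua
local s = '"ab" cd'

-- gsub returns two values; print shows both, separated by a tab.
print(s:gsub('"', "'"))    --> 'ab' cd	2

-- Wrapping the call in parentheses keeps only the first value.
print((s:gsub('"', "'")))  --> 'ab' cd

-- Returning the call directly forwards both values to the caller.
local function wrapped(str)
  return str:gsub('"', "'")
end
print(select("#", wrapped(s)))  --> 2
```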
To finalize this post, one can dive a little deeper and check the code's performance with a micro-benchmark, which can be copied and pasted directly into a Lua interpreter.
function SOLUTION_DarkWiiPlayer (String)
local result = String:gsub('"[^"]*"', function(str)
return str:gsub('[^"]', function(char)
return "\\" .. char:byte()
end)
end)
return result
end
function SOLUTION_Robert (String)
local function ConvertToByteString (MatchedString)
local ByteStrings = {}
local Len = #MatchedString
ByteStrings[#ByteStrings+1] = [[\"]]
for Index = 1, Len do
local Byte = MatchedString:byte(Index)
ByteStrings[#ByteStrings+1] = string.format([[\%d]], Byte)
end
ByteStrings[#ByteStrings+1] = [[\"]]
return table.concat(ByteStrings)
end
local Result = String:gsub([["([^"]+)"]], ConvertToByteString)
return Result
end
function SOLUTION_Piglet (String)
return String:gsub('%b""' , function (match)
local ret = ""
for _,v in ipairs{match:byte(1, -1)} do
ret = ret .. string.format("\\%d", v)
end
return ret
end)
end
function SOLUTION_Renshaw (String)
local function convert(str)
local byte_str = ""
for i = 1, #str do
byte_str = byte_str .. "\\" .. tostring(string.byte(str, i))
end
return byte_str
end
local Result = string.gsub(String, "\"(.*)\"", function(matched_str)
return "\"" .. convert(matched_str) .. "\""
end)
return Result
end
---
--- Micro-benchmark environment
---
COUNT = 600000
function TEST_Function (Name, Function, String, Count)
local TimerStart = os.clock()
for Index = 1, Count do
Function(String)
end
local ElapsedSeconds = (os.clock() - TimerStart)
print(string.format("[%25.25s] %f sec", Name, ElapsedSeconds))
end
String = "\"Hello World!\" Don't Replace this!"
TEST_Function("SOLUTION_DarkWiiPlayer", SOLUTION_DarkWiiPlayer, String, COUNT)
TEST_Function("SOLUTION_Robert", SOLUTION_Robert, String, COUNT)
TEST_Function("SOLUTION_Piglet", SOLUTION_Piglet, String, COUNT)
TEST_Function("SOLUTION_Renshaw", SOLUTION_Renshaw, String, COUNT)
The results show that @DarkWiiPlayer's answer is the fastest one.
[ SOLUTION_DarkWiiPlayer] 6.363000 sec
[ SOLUTION_Robert] 9.605000 sec
[ SOLUTION_Piglet] 7.943000 sec
[ SOLUTION_Renshaw] 8.875000 sec
local a = "\"Hello World!\" but not this!"
print(a:gsub('"[^"]*"', function(str)
return str:gsub('[^"]', function(char)
return "\\" .. char:byte()
end)
end))
You need string.gsub:
local a = "\"Hello World!\" Don't Replace this!"
local function convert(str)
local byte_str = ""
for i = 1, #str do
byte_str = byte_str .. "\\" .. tostring(string.byte(str, i))
end
return byte_str
end
a = string.gsub(a, "\"(.*)\"", function(matched_str)
return "\"" .. convert(matched_str) .. "\""
end)
print(a)
local a = "\" Hello World! I want to replace this with a bytecoded version of this!\" but not this!"
print((a:gsub('%b""' , function (match)
local ret = ""
for _,v in ipairs{match:byte(1, -1)} do
ret = ret .. string.format("\\%d", v)
end
return ret
end)))
Consider the two cases below:
local str1 = "abc"
str1:len() gives 3
local str2 = "£££"
str2:len() gives 6
Can someone explain this?
LuaJit version: 5.1
The length of strings in Lua is the number of bytes in it, not the number of chars.
To handle multibyte encodings, you need a library such as utf8, which is built in since Lua 5.3.
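The difference can be reproduced with a short sketch. It assumes the string is valid UTF-8 and writes "£" as its two bytes (194, 163) so the result does not depend on how the source file is saved. The character count works on Lua 5.1 and LuaJIT by counting every byte that is not a UTF-8 continuation byte.

```lua
local str2 = "\194\163\194\163\194\163"  -- "£££" encoded as UTF-8

-- The length operator counts bytes.
print(#str2)  --> 6

-- Count the bytes that start a character (everything outside 128..191).
local _, chars = str2:gsub("[^\128-\191]", "")
print(chars)  --> 3

-- On Lua 5.3 and later the built-in utf8 library gives the same answer.
if utf8 then
  print(utf8.len(str2))  --> 3
end
```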
Found a solution.
local function parse_string(str)
local ret = ""
local flag = true
for i = 1, #str do
local c = str:sub(i,i)
local char = string.char(b2i.toint(c, "big", false, 1))
if char > "\127" then
flag = not flag
if(flag) then
ret = ret .. char
end
else
ret = ret .. char
end
end
return ret
end
So I have the following code to split a string between whitespaces:
text = "I am 'the text'"
for string in text:gmatch("%S+") do
print(string)
end
The result:
I
am
'the
text'
But I need to do this:
I
am
the text --[[yep, without the quotes]]
How can I do this?
Edit: just to complement the question, the idea is to pass parameters from one program to another. Here is the pull request that I am working on, currently in review: https://github.com/mpv-player/mpv/pull/1619
There may be ways to do this with clever parsing, but an alternative way may be to keep track of a simple state and merge fragments based on detection of quoted fragments. Something like this may work:
local text = [[I "am" 'the text' and "some more text with '" and "escaped \" text"]]
local spat, epat, buf, quoted = [=[^(['"])]=], [=[(['"])$]=]
for str in text:gmatch("%S+") do
local squoted = str:match(spat)
local equoted = str:match(epat)
local escaped = str:match([=[(\*)['"]$]=])
if squoted and not quoted and not equoted then
buf, quoted = str, squoted
elseif buf and equoted == quoted and #escaped % 2 == 0 then
str, buf, quoted = buf .. ' ' .. str, nil, nil
elseif buf then
buf = buf .. ' ' .. str
end
if not buf then print((str:gsub(spat,""):gsub(epat,""))) end
end
if buf then print("Missing matching quote for "..buf) end
This will print:
I
am
the text
and
some more text with '
and
escaped \" text
Updated to handle mixed and escaped quotes. Updated to remove quotes. Updated to handle quoted words.
Try this:
text = [[I am 'the text' and '' here is "another text in quotes" and this is the end]]
local e = 0
while true do
local b = e+1
b = text:find("%S",b)
if b==nil then break end
if text:sub(b,b)=="'" then
e = text:find("'",b+1)
b = b+1
elseif text:sub(b,b)=='"' then
e = text:find('"',b+1)
b = b+1
else
e = text:find("%s",b+1)
end
if e==nil then e=#text+1 end
print("["..text:sub(b,e-1).."]")
end
Lua patterns aren't powerful enough to handle this task properly. Here is an LPeg solution adapted from the Lua Lexer. It handles both single and double quotes.
local lpeg = require 'lpeg'
local P, S, C, Cc, Ct = lpeg.P, lpeg.S, lpeg.C, lpeg.Cc, lpeg.Ct
local function token(id, patt) return Ct(Cc(id) * C(patt)) end
local singleq = P "'" * ((1 - S "'\r\n\f\\") + (P '\\' * 1)) ^ 0 * "'"
local doubleq = P '"' * ((1 - S '"\r\n\f\\') + (P '\\' * 1)) ^ 0 * '"'
local white = token('whitespace', S('\r\n\f\t ')^1)
local word = token('word', (1 - S("' \r\n\f\t\""))^1)
local string = token('string', singleq + doubleq)
local tokens = Ct((string + white + word) ^ 0)
input = [["This is a string" 'another string' these are words]]
for _, tok in ipairs(lpeg.match(tokens, input)) do
if tok[1] ~= "whitespace" then
if tok[1] == "string" then
print(tok[2]:sub(2,-2)) -- cut off quotes
else
print(tok[2])
end
end
end
Output:
This is a string
another string
these
are
words
I want to convert a string to a table, with the text split into characters. Every character must be a separate value in the table, for example:
a="text"
--converting string (a) to table (b)
--show table (b)
b={'t','e','x','t'}
You could use the string.gsub function:
t={}
str="text"
str:gsub(".",function(c) table.insert(t,c) end)
Just index each character and put it at the same position in the table.
local str = "text"
local t = {}
for i = 1, #str do
t[i] = str:sub(i, i)
end
The builtin string library treats Lua strings as byte arrays.
An alternative that works on multibyte (Unicode) characters is the
unicode library that
originated in the Selene project.
Its main selling point is that it can be used as a drop-in replacement
for the string library, making most string operations “magically”
Unicode-capable.
If you prefer not to add third party dependencies your task can easily
be implemented using LPeg.
Here is an example splitter:
local lpeg = require "lpeg"
local C, Ct, R = lpeg.C, lpeg.Ct, lpeg.R
local lpegmatch = lpeg.match
local split_utf8 do
local utf8_x = R"\128\191"
local utf8_1 = R"\000\127"
local utf8_2 = R"\194\223" * utf8_x
local utf8_3 = R"\224\239" * utf8_x * utf8_x
local utf8_4 = R"\240\244" * utf8_x * utf8_x * utf8_x
local utf8 = utf8_1 + utf8_2 + utf8_3 + utf8_4
local split = Ct (C (utf8)^0) * -1
split_utf8 = function (str)
str = str and tostring (str)
if not str then return end
return lpegmatch (split, str)
end
end
This snippet defines the function split_utf8() that creates a table
of UTF-8 characters (as Lua strings), but returns nil if the string
is not a valid UTF-8 sequence.
You can run this test code:
tests = {
en = [[Lua (/ˈluːə/ LOO-ə, from Portuguese: lua [ˈlu.(w)ɐ] meaning moon; ]]
.. [[explicitly not "LUA"[1]) is a lightweight multi-paradigm programming ]]
.. [[language designed as a scripting language with "extensible ]]
.. [[semantics" as a primary goal.]],
ru = [[Lua ([лу́а], порт. «луна») — интерпретируемый язык программирования, ]]
.. [[разработанный подразделением Tecgraf Католического университета ]]
.. [[Рио-де-Жанейро.]],
gr = [[Η Lua είναι μια ελαφρή προστακτική γλώσσα προγραμματισμού, που ]]
.. [[σχεδιάστηκε σαν γλώσσα σεναρίων με κύριο σκοπό τη δυνατότητα ]]
.. [[επέκτασης της σημασιολογίας της.]],
XX = ">\255< invalid"
}
-------------------------------------------------------------------------------
local limit = 14
for lang, str in next, tests do
io.write "\n"
io.write (string.format ("<%s %3d> ->", lang, #str))
local chars = split_utf8 (str)
if not chars then
io.write " INVALID!"
else
io.write (string.format (" <%3d>", #chars))
for i = 1, #chars > limit and limit or #chars do
io.write (string.format (" %q", chars [i]))
end
end
end
io.write "\n"
Btw., building a table with LPeg is significantly faster than calling
table.insert() repeatedly.
Here are stats for splitting the whole of Gogol’s Dead Souls (in
Russian, 1023814 bytes raw, 571395 UTF-8 characters) on my machine:
library      method               time in ms
string       table.insert()       380
string       t [#t + 1] = c       310
string       gmatch & for loop    280
slnunicode   table.insert()       220
slnunicode   t [#t + 1] = c       200
slnunicode   gmatch & for loop    170
lpeg         Ct (C (...))          70
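For comparison, a rough splitter can also be written with plain Lua patterns and no extra library. This is only a sketch: the name split_utf8_plain is hypothetical, and unlike split_utf8 above it does not validate its input. It simply groups each non-continuation byte with the continuation bytes (128 to 191) that follow it, so malformed input produces malformed pieces instead of nil.

```lua
-- Hypothetical plain-Lua splitter; assumes the input is already valid UTF-8.
local function split_utf8_plain(str)
  local chars = {}
  -- One lead byte (anything outside 128..191) plus its continuation bytes.
  for ch in str:gmatch("[^\128-\191][\128-\191]*") do
    chars[#chars + 1] = ch
  end
  return chars
end

-- "a", "£" (194 163) and Cyrillic "л" (208 187), written as byte escapes.
local chars = split_utf8_plain("a\194\163\208\187")
print(#chars)      --> 3
print(#chars[2])   --> 2
```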
You can use the code below to achieve this easily.
t = {}
str = "text"
for i=1, string.len(str) do
t[i]= (string.sub(str,i,i))
end
for k , v in pairs(t) do
print(k,v)
end
-- 1 t
-- 2 e
-- 3 x
-- 4 t
Using string.sub
string.sub(s, i [, j])
Return a substring of the string passed. The substring starts at i. If the third argument j is not given, the substring will end at the end of the string. If the third argument is given, the substring ends at and includes j.
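A few calls illustrate the argument rules described above, including negative indices, which count from the end of the string. The sample string is the one used in the question.

```lua
local s = "text"

print(s:sub(2))     --> ext   (from position 2 to the end)
print(s:sub(2, 3))  --> ex    (positions 2 through 3, inclusive)
print(s:sub(3, 3))  --> x     (a single character)
print(s:sub(-2))    --> xt    (the last two characters)
```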
I'm having issues writing strings to a binary file in Lua. There is an existing example and I tried modifying it. Take a look:
function StringToBinary()
local file = io.open("file.bin", "wb")
local t = {}
local u = {}
local str = "Hello World"
file:write("string len = " ..#str ..'\n')
math.randomseed(os.time())
for i=1, #str do
t[i] = string.byte(str[i])
file:write(t[i].." ");
end
file:write("\n")
for i=1, #str do
u[i] = math.random(0,255)
file:write(u[i].." ");
end
file:write("\n"..string.char(unpack(t)))
file:write("\n"..string.char(unpack(u)))
file:close()
end
file:write(t[i].." ") and file:write(u[i].." ") should write both tables as integer values. However, with my last two writes, unpack(t) displays the original text, while unpack(u) displays the binary data.
It's probably string.byte(str[i]) that is mistaken. What should I replace it with? Am I missing something?
t[i] = string.byte(str[i])
is wrong: indexing a string with a number looks the key up in the string library table, so str[i] is nil. It should be:
t[i] = string.byte(str, i)
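The fix can be checked in isolation with a short sketch. It also shows that string.byte can return all the bytes in one call, and it aliases unpack so the same lines run on Lua 5.1 (global unpack) and on 5.2 or later (table.unpack).

```lua
local unpack = table.unpack or unpack  -- works on Lua 5.1 and on 5.2+

local str = "Hello World"

-- Indexing a string with a number does not give a character.
print(str[1])  --> nil

-- string.byte(str, 1, -1) returns every byte of the string at once.
local t = { str:byte(1, -1) }
print(#t, t[1])  --> 11	72

-- string.char turns the byte values back into the original text.
print(string.char(unpack(t)) == str)  --> true
```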