Changing the length of a string (allocating memory for characters) - genie

def get_avail_mb(): int
    f: FILE = FILE.open("/proc/meminfo","r")
    s: string = ""
    while s.length < 200 do s += " "
    f.read(s, 200, 1)
    var a = s.split("\n")
    s = a[2]
    t: string = ""
    for var i = 0 to s.length
        if s[i] <= '9' && s[i] >= '0'
            t += s[i].to_string()
    return int.parse(t)/1000
Notice how I pad the string to 200 characters with while s.length < 200 do s += " " so that I can read bytes from the file into it? Is there a better way to set the length of a string to N characters in Genie than appending a space character N times?

Probably the best way is to create a fixed size array as a buffer and cast the buffer to a string. This avoids some C warnings when compiled. Compile with valac --pkg posix example.gs:
[indent=4]
uses
    Posix

init
    print( @"Available memory: $(get_avail_mb()) MB" )

def get_avail_mb():int
    f:FILE = FILE.open("/proc/meminfo","r")
    buffer:uint8[200]
    f.read(buffer, 200, 1)
    result:int = 0
    match_result:MatchInfo
    if ( /^MemAvailable:\s*([0-9]*).*$/m.match(
            (string)buffer,
            0,
            out match_result
            ))
        result = int.parse( match_result.fetch( 1 ))/1000
    return result
Alternatively you could try string.nfill ():
[indent=4]
uses
    Posix

init
    print( @"Available memory: $(get_avail_mb()) MB" )

def get_avail_mb():int
    f:FILE = FILE.open("/proc/meminfo","r")
    s:string = string.nfill( 200, ' ' )
    f.read(s, 200, 1)
    result:int = 0
    match_result:MatchInfo
    if ( /^MemAvailable:\s*([0-9]*).*$/m.match(
            s,
            0,
            out match_result
            ))
        result = int.parse( match_result.fetch( 1 ))/1000
    return result
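For comparison, here is a rough Python sketch of the same two ideas: read into a preallocated fixed-size buffer, then pull the MemAvailable value out with a multi-line regular expression. The function names parse_mem_available_mb and read_meminfo_head are made up for this illustration and are not part of the Genie answer; the division by 1000 simply mirrors the code above.

```python
import re

def parse_mem_available_mb(meminfo_text: str) -> int:
    """Return MemAvailable in MB (kB value divided by 1000, as in the answer)."""
    match = re.search(r"^MemAvailable:\s*([0-9]+)", meminfo_text, re.MULTILINE)
    if match is None:
        return 0  # same fallback as the Genie code: result stays 0
    return int(match.group(1)) // 1000

def read_meminfo_head(path: str = "/proc/meminfo", size: int = 200) -> str:
    """Read at most `size` bytes into a preallocated buffer and decode them."""
    buffer = bytearray(size)            # fixed-size buffer, no padding loop needed
    with open(path, "rb") as f:
        count = f.readinto(buffer)      # number of bytes actually read
    return buffer[:count].decode("ascii", errors="replace")

# Hypothetical sample text in the /proc/meminfo layout
sample = (
    "MemTotal:       16316412 kB\n"
    "MemFree:         1234567 kB\n"
    "MemAvailable:    8765432 kB\n"
)
print(parse_mem_available_mb(sample))  # 8765
```

On a Linux machine, parse_mem_available_mb(read_meminfo_head()) would combine the two steps.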

yes there is, just avoid the dreaded for-loop which cannot handle certain corner cases!

Related

How to convert an array to a cstring?

I found (here) the source of a program to compute the MD5 of a file using Nim.
This program no longer compiles (nim 1.4) as the implicit conversion between array to cstring has been disabled.
How can this be fixed?
import md5
import os

proc calculateMD5Incremental(filename: string) : string =
  const blockSize: int = 8192
  var
    c: MD5Context
    d: MD5Digest
    f: File
    bytesRead: int = 0
    buffer: array[blockSize, char]
    byteTotal: int = 0
  #read chunk of file, calling update until all bytes have been read
  try:
    f = open(filename)
    md5Init(c)
    bytesRead = f.readBuffer(buffer.addr, blockSize)
    while bytesRead > 0:
      byteTotal += bytesRead
      md5Update(c, buffer, bytesRead) # <--- HERE buffer should be cstring
      bytesRead = f.readBuffer(buffer.addr, blockSize)
    md5Final(c, d)
  except IOError:
    echo("File not found.")
  finally:
    if f != nil:
      close(f)
  result = $d

if paramCount() > 0:
  let arguments = commandLineParams()
  echo("MD5: ", calculateMD5Incremental(arguments[0]))
else:
  echo("Must pass filename.")
  quit(-1)
Note: I'm more interested in the general question and not in MD5, this was the example that came to hand.
In your answer you turned the array into a string with $. That does not return "ciao" but "['c', 'i', 'a', 'o']". I'm not sure that is what you want; the proper way to do the conversion is as follows:
const blockSize: int = 4
var
  c: cstring
  buffer: array[blockSize, char]
buffer = ['c','i','a','o']
c = cast[cstring](create(char, blockSize + 1))
moveMem(c[0].addr, buffer[0].addr, blockSize)
assert c == "ciao"
Mind that this is the fastest but unsafe solution. If you want to be safe at the cost of a little speed, you can use this code:
const blockSize: int = 4
var
  c: cstring
  buffer: array[blockSize, char]
  temp: string
buffer = ['c','i','a','o']
temp.setLen(blockSize)
for i in 0..<blockSize:
  temp[i] = buffer[i]
c = temp
assert c == "ciao"
UPDATE In this context this is completely wrong. As pointed out in the accepted answer, $array will render it in a "screen friendly" way.
Correct implementation
The working procedure for md5(fromFile):
proc calculateMD5Incremental(filename: string) : string =
  const blockSize: int = 8192
  var
    c: MD5Context
    d: MD5Digest
    f: File
    s: cstring
    bytesRead: int = 0
    buffer: array[blockSize, char]
    byteTotal: int = 0
  #read chunk of file, calling update until all bytes have been read
  try:
    f = open(filename)
    md5Init(c)
    bytesRead = f.readBuffer(buffer.addr, blockSize)
    while bytesRead > 0:
      byteTotal += bytesRead
      s = cast[cstring](create(char, blockSize + 1))
      moveMem(s[0].addr, buffer[0].addr, blockSize)
      md5Update(c, s, bytesRead)
      bytesRead = f.readBuffer(buffer.addr, blockSize)
    md5Final(c, d)
  except IOError:
    stderr.writeLine("ERROR: File not found: ", filename)
  finally:
    if f != nil:
      close(f)
  result = $d
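As a point of comparison only, the same read-a-block, update-the-hash loop can be sketched in Python with the standard hashlib module. This is not a translation of the Nim md5 API; md5_incremental and its block_size parameter are names chosen for this illustration. The key point it shares with the procedure above is that only the bytes actually read in each pass are fed to the hash.

```python
import hashlib

def md5_incremental(filename: str, block_size: int = 8192) -> str:
    """Hash a file block by block so the whole file never sits in memory."""
    digest = hashlib.md5()
    with open(filename, "rb") as f:
        while True:
            block = f.read(block_size)   # holds at most block_size bytes
            if not block:                # an empty read means end of file
                break
            digest.update(block)         # only the bytes actually read are hashed
    return digest.hexdigest()
```

Because each read returns a bytes object of exactly the length read, there is no separate buffer-to-string conversion step here.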
Wrong conversion
~I found that a regular string is accepted so $buffer will convert the array to a valid string:~
const blockSize: int = 4
var
  c: cstring
  buffer: array[blockSize, char]
buffer = ['c','i','a','o']
c = $buffer

Getting all strings in a lua script

I'm trying to encode the strings in my Lua script. Since the script is over 200k characters long, I want a function that converts every string literal in it, in any of the forms below
local string = "stackoverflow"
local string = [[stackoverflow]]
local string = [==[stackoverflow]==]
local string = 'stackoverflow'
to
local string=decode("jkrtbfmviwcfn",519211)
I want to run all of the forms above through a gsub and have the gsub encode the string text with a random offset number.
So far, I have only managed to match strings in double quotation marks.
function encode(x,offset,a)
    for char in string.gmatch(x, "%a") do
        local encrypted = string.byte(char) + offset
        while encrypted > 122 do
            encrypted = encrypted - 26
        end
        while encrypted < 97 do
            encrypted = encrypted + 26
        end
        a[#a+1] = string.char(encrypted)
    end
    return table.concat(a)
end
luacode=[==[thatstring.Value="Encryptme!" testvalue.Value=[[string with
a linebreak]] string.Text="STOP!"]==]
luacode=luacode:gsub([=["(.-)"]=],function(s)
    print("Caught "..s)
    local offset=math.random(1,4)
    local encoded=encode(s,offset,{})
    return [[decode("]]..encoded..[[",]]..offset..[[)]]
end)
print("\n"..luacode)
With its output being
Caught Encryptme!
Caught STOP!
thatstring.Value=decode("crgvctxqi",4) testvalue.Value=[[string with
a linebreak]] string.Text=decode("opkl",2)
Any better solutions?
local function strings_and_comments(lua_code, callback)
    -- lua_code must be valid Lua code (an error may be raised on syntax error)
    -- callback will be invoked as callback(object_type, value, start_pos, end_pos)
    --    callback("comment", comment_text, start_pos, end_pos) -- for comments
    --    callback("string", string_value, start_pos, end_pos) -- for string literals
    local objects = {} -- possible comments and string literals in the code
    -- search for all start positions of comments (with false positives)
    for pos, br1, eq, br2 in lua_code:gmatch"()%-%-(%-*%[?)(=*)(%[?)" do
        table.insert(objects, {start_pos = pos,
            terminator = br1 == "[" and br2 == "[" and "]"..eq.."]" or "\n"})
    end
    -- search for all start positions of string literals (with false positives)
    for pos, eq in lua_code:gmatch"()%[(=*)%[[%[=]*" do
        table.insert(objects, {is_string = true, start_pos = pos,
            terminator = "]"..eq.."]"})
    end
    for pos, quote in lua_code:gmatch"()(['\"])" do
        table.insert(objects, {is_string = true, start_pos = pos, quote = quote})
    end
    table.sort(objects, function(a, b) return a.start_pos < b.start_pos end)
    local end_pos = 0
    for _, object in ipairs(objects) do
        local start_pos, ok, symbol = object.start_pos
        if start_pos > end_pos then
            if object.terminator == "\n" then
                end_pos = lua_code:find("\n", start_pos + 1, true) or #lua_code
                -- exclude last spaces and newline
                while lua_code:sub(end_pos, end_pos):match"%s" do
                    end_pos = end_pos - 1
                end
            elseif object.terminator then
                ok, end_pos = lua_code:find(object.terminator, start_pos + 1, true)
                assert(ok, "Not a valid Lua code")
            else
                end_pos = start_pos
                repeat
                    ok, end_pos, symbol = lua_code:find("(\\?.)", end_pos + 1)
                    assert(ok, "Not a valid Lua code")
                until symbol == object.quote
            end
            local value = lua_code:sub(start_pos, end_pos):gsub("^%-*%s*", "")
            if object.terminator ~= "\n" then
                value = assert((loadstring or load)("return "..value))()
            end
            callback(object.is_string and "string" or "comment", value, start_pos, end_pos)
        end
    end
end
local inv256
local function encode(str)
    local seed = math.random(0x7FFFFFFF)
    local result = '",'..seed..'))'
    if not inv256 then
        inv256 = {}
        for M = 0, 127 do
            local inv = -1
            repeat inv = inv + 2
            until inv * (2*M + 1) % 256 == 1
            inv256[M] = inv
        end
    end
    repeat
        seed = seed * 3
    until seed > 2^43
    local K = 8186484168865098 + seed
    result = '(decode("'..str:gsub('.',
        function(m)
            local L = K % 274877906944 -- 2^38
            local H = (K - L) / 274877906944
            local M = H % 128
            m = m:byte()
            local c = (m * inv256[M] - (H - M) / 128) % 256
            K = L * 21271 + H + c + m
            return ('%02x'):format(c)
        end
    )..result
    return result
end
function hide_strings_in_lua_code(lua_code)
    local text = { [[
local function decode(str, seed)
repeat
seed = seed * 3
until seed > 2^43
local K = 8186484168865098 + seed
return (str:gsub('%x%x',
function(c)
local L = K % 274877906944 -- 2^38
local H = (K - L) / 274877906944
local M = H % 128
c = tonumber(c, 16)
local m = (c + (H - M) / 128) * (2*M + 1) % 256
K = L * 21271 + H + c + m
return string.char(m)
end
))
end
]] }
    local pos = 1
    strings_and_comments(lua_code,
        function (object_type, value, start_pos, end_pos)
            if object_type == "string" then
                table.insert(text, lua_code:sub(pos, start_pos - 1))
                table.insert(text, encode(value))
                pos = end_pos + 1
            end
        end)
    table.insert(text, lua_code:sub(pos))
    return table.concat(text)
end
Usage:
math.randomseed(os.time())
-- This is the program to be converted
local luacode = [===[
print"Hello world!"
print[[string with
a linebreak]]
local str1 = "stackoverflow"
local str2 = [[stackoverflow]]
local str3 = [==[stackoverflow]==]
local str4 = 'stackoverflow'
print(str1)
print(str2)
print(str3)
print(str4)
]===]
-- Conversion
print(hide_strings_in_lua_code(luacode))
Output (converted program)
local function decode(str, seed)
repeat
seed = seed * 3
until seed > 2^43
local K = 8186484168865098 + seed
return (str:gsub('%x%x',
function(c)
local L = K % 274877906944 -- 2^38
local H = (K - L) / 274877906944
local M = H % 128
c = tonumber(c, 16)
local m = (c + (H - M) / 128) * (2*M + 1) % 256
K = L * 21271 + H + c + m
return string.char(m)
end
))
end
print(decode("ef869b23b69b7fbc7f89bbe7",2686976))
print(decode("c2dc20f7061c452db49302f8a1d9317aad1009711e0984",1210253312))
local str1 = (decode("84854df4599affe9c894060431",415105024))
local str2 = (decode("a5d7db792f0b514417827f34e3",1736704000))
local str3 = (decode("6a61bcf9fd6f403ed1b4846e58",1256259584))
local str4 = (decode("cad56d9dea239514aca9c8b8e0",1030488064))
print(str1)
print(str2)
print(str3)
print(str4)
Output of output (output produced by the converted program)
Hello world!
string with
a linebreak
stackoverflow
stackoverflow
stackoverflow
stackoverflow
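The encode and decode functions above depend on one arithmetic fact: every odd number has a multiplicative inverse modulo 256, so multiplying a byte by 2*M + 1 can be undone by multiplying with the table entry inv256[M]. The Python sketch below isolates just that per-byte step. The names encode_byte and decode_byte, and the parameter t (standing in for the (H - M) / 128 term), are invented for this illustration; the key-stream update of K is left out.

```python
def build_inverse_table() -> list:
    """inv[M] is the odd number with inv[M] * (2*M + 1) % 256 == 1."""
    table = []
    for M in range(128):
        inv = 1
        while inv * (2 * M + 1) % 256 != 1:
            inv += 2                      # inverses of odd numbers are odd
        table.append(inv)
    return table

INV256 = build_inverse_table()

def encode_byte(m: int, M: int, t: int) -> int:
    """Mirror of the Lua expression (m * inv256[M] - t) % 256."""
    return (m * INV256[M] - t) % 256

def decode_byte(c: int, M: int, t: int) -> int:
    """Mirror of the Lua expression (c + t) * (2*M + 1) % 256."""
    return (c + t) * (2 * M + 1) % 256

# Round trip for one byte with arbitrary illustrative values of M and t
original = ord("s")
assert decode_byte(encode_byte(original, 77, 12345), 77, 12345) == original
```

Adding t back and multiplying by 2*M + 1 cancels the inverse, which is why the decoder never needs the inverse table itself.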

How to read a C generated binary file in Lua

I want to read a 32 bit integer binary file provided by another program. The file contains only integer and no other characters (like spaces or commas). The C code to read this file is as follows:
FILE* pf = fopen("C:/rktemp/filename.dat", "r");
int sz = width*height;
int* vals = new int[sz];
int elread = fread((char*)vals, sizeof(int), sz, pf);
for( int j = 0; j < height; j++ )
{
for( int k = 0; k < width; k++ )
{
int i = j*width+k;
labels[i] = vals[i];
}
}
delete [] vals;
fclose(pf);
But I don't know how to read this file into array using Lua.
I've tried to read this file using io.read, but part of the array looks like this:
~~~~~~xxxxxxxxyyyyyyyyyyyyyyzzzzzzzz{{{{{{{{{|||||||||}}}}}}}}}}}~~~~~~~~~xxxxxxxyyyyyyyyyyyyyyzzzzzz{{{{{{{{{{|||||||||}}}}}}}}}}}~~~~~~~~~xxyyyyyyyyyyyyyzzzzz{{{{{{|||}}}yyyyyyyyyyyz{{{yyyyyyyyÞľūơǿȵɶʢ˺̤̼ͽаҩӱľǿجٴȵɶʢܷݸ˺໻⼼ӱľǿ
Also the Matlab code to read this file is like this:
row = image_size(1);
colomn = image_size(2);
fid = fopen(data_path,'r');
A = fread(fid, row * colomn, 'uint32')';
A = A + 1;
B = reshape(A,[colomn, row]);
B = B';
fclose(fid);
I've tried a function to convert bytes to integer, my code is like this:
function bytes_to_int(b1, b2, b3, b4)
if not b4 then error("need four bytes to convert to int",2) end
local n = b1 + b2*256 + b3*65536 + b4*16777216
n = (n > 2147483647) and (n - 4294967296) or n
return n
end
local sup_filename = '1.dat'
fid = io.open(sup_filename, "r")
st = bytes_to_int(fid:read("*all"):byte(1,4))
print(st)
fid:close()
But it still does not read this file properly.
You are only calling bytes_to_int once. You need to call it for every int you want to read. e.g.
fid = io.open(sup_filename, "rb")
while true do
    local bytes = fid:read(4)
    if bytes == nil then break end -- EOF
    local st = bytes_to_int(bytes:byte(1,4))
    print(st)
end
fid:close()
With Lua 5.3 or later you can call string.unpack, which has many conversion options for its format string. The following options may be useful:
< sets little endian
> sets big endian
= sets native endian
i[n] a signed int with n bytes (default is native size)
I[n] an unsigned int with n bytes (default is native size)
The arch of your PC is unknown, so I assume the data to read is unsigned and native-endian.
Since you are reading binary data from the file, you should use io.open(sup_filename, "rb").
The following code may be useful:
local fid = io.open(sup_filename, "rb")
local contents = fid:read("a")
local now = 1
while now <= #contents - 3 do
    local n
    -- assign to the outer 'now'; declaring 'local n, now' here would shadow it and loop forever
    n, now = string.unpack("=I4", contents, now)
    print(n)
end
fid:close()
see also: Lua 5.4 manual
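The same idea (open in binary mode, then decode fixed-width 4-byte unsigned integers) looks like this in Python with the standard struct module. It is only a sketch for comparison: read_uint32_file is a name made up here, and the default "=" byte order follows the native-endian assumption made above.

```python
import struct

def read_uint32_file(path: str, byte_order: str = "=") -> list:
    """Read a file of packed 32-bit unsigned integers into a list."""
    with open(path, "rb") as f:            # binary mode, as advised above
        data = f.read()
    usable = len(data) - len(data) % 4     # ignore a trailing partial value
    return [value for (value,) in struct.iter_unpack(byte_order + "I", data[:usable])]
```

Pass "<" or ">" as byte_order if the file is known to be little-endian or big-endian.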

Finding minimum moves required for making 2 strings equal

This is a question from an online coding challenge (which has now finished).
I just need some logic for this as to how to approach.
Problem Statement:
We have two strings A and B with the same super set of characters. We need to change these strings to obtain two equal strings. In each move we can perform one of the following operations:
1. swap two consecutive characters of a string
2. swap the first and the last characters of a string
A move can be performed on either string.
What is the minimum number of moves that we need in order to obtain two equal strings?
Input Format and Constraints:
The first and the second line of the input contains two strings A and B. It is guaranteed that the superset their characters are equal.
1 <= length(A) = length(B) <= 2000
All the input characters are between 'a' and 'z'
Output Format:
Print the minimum number of moves to the only line of the output
Sample input:
aab
baa
Sample output:
1
Explanation:
Swap the first and last character of the string aab to convert it to baa. The two strings are now equal.
EDIT : Here is my first try, but I'm getting wrong output. Can someone guide me what is wrong in my approach.
int minStringMoves(char* a, char* b) {
    int length, pos, i, j, moves=0;
    char *ptr;
    length = strlen(a);
    for(i=0;i<length;i++) {
        // Find the first occurrence of b[i] in a
        ptr = strchr(a,b[i]);
        pos = ptr - a;
        // If its the last element, swap with the first
        if(i==0 && pos == length-1) {
            swap(&a[0], &a[length-1]);
            moves++;
        }
        // Else swap from current index till pos
        else {
            for(j=pos;j>i;j--) {
                swap(&a[j],&a[j-1]);
                moves++;
            }
        }
        // If equal, break
        if(strcmp(a,b) == 0)
            break;
    }
    return moves;
}
Take a look at this example:
aaaaaaaaab
abaaaaaaaa
Your solution: 8
aaaaaaaaab -> aaaaaaaaba -> aaaaaaabaa -> aaaaaabaaa -> aaaaabaaaa ->
aaaabaaaaa -> aaabaaaaaa -> aabaaaaaaa -> abaaaaaaaa
Proper solution: 2
aaaaaaaaab -> baaaaaaaaa -> abaaaaaaaa
You should check if swapping in the other direction would give you better result.
But sometimes you will also ruin the previous part of the string. eg:
caaaaaaaab
cbaaaaaaaa
caaaaaaaab -> baaaaaaaac -> abaaaaaaac
You need another swap here to put back the 'c' to the first place.
The proper algorithm is probably even more complex, but you can see now what's wrong in your solution.
The A* algorithm might work for this problem.
The initial node will be the original string.
The goal node will be the target string.
Each child of a node will be all possible transformations of that string.
The current cost g(x) is simply the number of transformations thus far.
The heuristic h(x) is half the number of characters in the wrong position.
Since h(x) is admissible (because a single transformation can't put more than 2 characters in their correct positions), the path to the target string will give the least number of transformations possible.
However, an elementary implementation will likely be too slow. Calculating all possible transformations of a string would be rather expensive.
Note that there's a lot of similarity between a node's siblings (its parent's children) and its children. So you may be able to just calculate all transformations of the original string and, from there, simply copy and recalculate data involving changed characters.
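A minimal Python sketch of the A* search described above, with no attempt at the sibling-sharing optimisation. It assumes the problem can be treated as turning one string into the other: every move is its own inverse, so moves made on B can be replayed in reverse on A. The function names are invented for this illustration, and the heuristic rounds half the mismatch count up, which stays admissible because move counts are whole numbers.

```python
import heapq

def neighbours(s: str):
    """Yield every string reachable from s in one move."""
    n = len(s)
    chars = list(s)
    for i in range(n - 1):                       # swap two consecutive characters
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
        yield "".join(chars)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    if n > 2:                                    # swap first and last characters
        chars[0], chars[-1] = chars[-1], chars[0]
        yield "".join(chars)

def heuristic(s: str, target: str) -> int:
    """Half the number of mismatched positions, rounded up."""
    mismatches = sum(1 for x, y in zip(s, target) if x != y)
    return (mismatches + 1) // 2

def a_star_moves(start: str, target: str) -> int:
    """Minimum number of moves turning start into target, or -1 if unreachable."""
    best = {start: 0}
    heap = [(heuristic(start, target), 0, start)]
    while heap:
        _, g, current = heapq.heappop(heap)
        if current == target:
            return g
        if g > best[current]:
            continue                             # stale queue entry
        for nxt in neighbours(current):
            if g + 1 < best.get(nxt, float("inf")):
                best[nxt] = g + 1
                heapq.heappush(heap, (g + 1 + heuristic(nxt, target), g + 1, nxt))
    return -1

print(a_star_moves("aab", "baa"))                # 1
print(a_star_moves("aaaaaaaaab", "abaaaaaaaa"))  # 2
```

This is fine for short strings, but the number of distinct arrangements grows very quickly, so it is unlikely to cope with lengths near the stated limit of 2000.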
You can use dynamic programming. Go over all swap possibilities while storing all the intermediate results along with the minimal number of steps that it took to get there. In effect, you calculate the minimum number of steps for every possible target string that can be obtained by applying the given rules any number of times. Once you have calculated it all, you can print the minimum number of steps needed to reach the target string. Here's the sample code in JavaScript, and its usage for the "aab" and "baa" example:
function swap(str, i, j) {
    var s = str.split("");
    s[i] = str[j];
    s[j] = str[i];
    return s.join("");
}
function calcMinimumSteps(current, stepsCount)
{
    if (typeof(memory[current]) !== "undefined") {
        if (memory[current] > stepsCount) {
            memory[current] = stepsCount;
        } else if (memory[current] < stepsCount) {
            stepsCount = memory[current];
        }
    } else {
        memory[current] = stepsCount;
        calcMinimumSteps(swap(current, 0, current.length-1), stepsCount+1);
        for (var i = 0; i < current.length - 1; ++i) {
            calcMinimumSteps(swap(current, i, i + 1), stepsCount+1);
        }
    }
}
var memory = {};
calcMinimumSteps("aab", 0);
alert("Minimum steps count: " + memory["baa"]);
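One way to compute the same table of minimum step counts is a breadth-first search, sketched here in Python. This is a swapped-in technique rather than a port of the recursive code above: breadth-first order reaches each string for the first time along a shortest path, so no stored count ever needs correcting. The name all_min_steps is made up for this illustration.

```python
from collections import deque

def all_min_steps(start: str) -> dict:
    """Map every string reachable from start to its minimum number of moves."""
    steps = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        chars = list(current)
        last = len(chars) - 1
        # candidate swaps: each adjacent pair, plus first-with-last
        pairs = [(i, i + 1) for i in range(last)] + [(0, last)]
        for i, j in pairs:
            chars[i], chars[j] = chars[j], chars[i]
            nxt = "".join(chars)
            chars[i], chars[j] = chars[j], chars[i]   # undo the swap
            if nxt not in steps:
                steps[nxt] = steps[current] + 1
                queue.append(nxt)
    return steps

print(all_min_steps("aab")["baa"])  # 1
```

The table holds one entry per distinct arrangement of the characters, so this only scales to short inputs.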
Here is the Ruby logic for this problem; copy this code into an .rb file and run it.
str1 = "education" #Sample first string
str2 = "cnatdeiou" #Sample second string
moves_count = 0
no_swap = 0
count = str1.length - 1

def ends_swap(str1,str2)
  str2 = swap_strings(str2,str2.length-1,0)
  return str2
end

def swap_strings(str2,cp,np)
  current_string = str2[cp]
  new_string = str2[np]
  str2[cp] = new_string
  str2[np] = current_string
  return str2
end

def consecutive_swap(str,current_position, target_position)
  counter=0
  diff = current_position > target_position ? -1 : 1
  while current_position!=target_position
    new_position = current_position + diff
    str = swap_strings(str,current_position,new_position)
    # p "-------"
    # p "CP: #{current_position} NP: #{new_position} TP: #{target_position} String: #{str}"
    current_position+=diff
    counter+=1
  end
  return counter,str
end

while(str1 != str2 && count!=0)
  counter = 1
  if str1[-1]==str2[0]
    # p "cross match"
    str2 = ends_swap(str1,str2)
  else
    # p "No match for #{str2}-- Count: #{count}, TC: #{str1[count]}, CP: #{str2.index(str1[count])}"
    str = str2[0..count]
    cp = str.rindex(str1[count])
    tp = count
    counter, str2 = consecutive_swap(str2,cp,tp)
    count-=1
  end
  moves_count+=counter
  # p "Step: #{moves_count}"
  # p str2
end
p "Total moves: #{moves_count}"
Please feel free to suggest any improvements in this code.
Try this code. Hope this will help you.
import java.util.Scanner;

public class TwoStringIdentical {
    // Length of the longest common subsequence of str1[0..m) and str2[0..n)
    static int lcs(String str1, String str2, int m, int n) {
        int L[][] = new int[m + 1][n + 1];
        int i, j;
        for (i = 0; i <= m; i++) {
            for (j = 0; j <= n; j++) {
                if (i == 0 || j == 0)
                    L[i][j] = 0;
                else if (str1.charAt(i - 1) == str2.charAt(j - 1))
                    L[i][j] = L[i - 1][j - 1] + 1;
                else
                    L[i][j] = Math.max(L[i - 1][j], L[i][j - 1]);
            }
        }
        return L[m][n];
    }

    static void printMinTransformation(String str1, String str2) {
        int m = str1.length();
        int n = str2.length();
        int len = lcs(str1, str2, m, n);
        System.out.println((m - len)+(n - len));
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        String str1 = scan.nextLine();
        String str2 = scan.nextLine();
        // use the two strings read from standard input
        printMinTransformation(str1, str2);
    }
}

How do you sort and efficiently find elements in a cell array (of strings) in Octave?

Is there built-in functionality for this?
Searching a cell array of strings in GNU Octave in linear time, O(n):
(The 15 year old code in this answer was tested and correct on GNU Octave 3.8.2, 5.2.0 and 7.1.0)
The other answer uses cellidx, which Octave has deprecated; it still runs, but the recommendation is to use ismember instead, like this:
%linear time string index search.
a = ["hello"; "unsorted"; "world"; "moobar"]
b = cellstr(a)
%b =
%{
% [1,1] = hello
% [2,1] = unsorted
% [3,1] = world
% [4,1] = moobar
%}
find(ismember(b, 'world')) %returns 3
ismember finds 'world' in index slot 3. This is an expensive linear-time O(n) operation because it has to iterate through all elements whether or not the value is found.
To achieve a logarithmic-time O(log n) solution, your list needs to come pre-sorted; then you can use binary search:
If your cell array is already sorted, you can search it in O(log n) worst case:
function i = binsearch(array, val, low, high)
  %binary search algorithm for numerics, Usage:
  %myarray = [ 30, 40, 50.15 ]; %already sorted list
  %binsearch(myarray, 30, 1, 3) %item 30 is in slot 1
  if ( high < low )
    i = 0;
  else
    mid = floor((low + high) / 2);
    if ( array(mid) > val )
      i = binsearch(array, val, low, mid-1);
    elseif ( array(mid) < val )
      i = binsearch(array, val, mid+1, high);
    else
      i = mid;
    endif
  endif
endfunction
function i = binsearch_str(array, val, low, high)
  % binary search for strings, usage:
  %myarray2 = [ "abc"; "def"; "ghi"]; #already sorted list
  %binsearch_str(myarray2, "abc", 1, 3) #item abc is in slot 1
  if ( high < low )
    i = 0;
  else
    mid = floor((low + high) / 2);
    if ( mystrcmp(array(mid, [1:end]), val) == 1 )
      % recurse with the string version, not the numeric binsearch
      i = binsearch_str(array, val, low, mid-1);
    elseif ( mystrcmp(array(mid, [1:end]), val) == -1 )
      i = binsearch_str(array, val, mid+1, high);
    else
      i = mid;
    endif
  endif
endfunction
function ret = mystrcmp(a, b)
%this function is just an octave string compare, its behavior follows the
%strcmp(str1,str2)'s in C and java.lang.String.compareTo(...)'s in Java,
%that is:
% -returns 1 if string a > b
% -returns 0 if string a == b
% -return -1 if string a < b
% The gt() operator does not support cell array. If the single word
% is passed as an one-element cell array, converts it to a string.
a_as_string = a;
if iscellstr( a )
a_as_string = a{1}; %a was passed as a single-element cell array.
endif
% The gt() operator does not support cell array. If the single word
% is passed as an one-element cell array, converts it to a string.
b_as_string = b;
if iscellstr( b )
b_as_string = b{1}; %b was passed as a single-element cell array.
endif
% Space-pad the shortest word so as they can be used with gt() and lt() operators.
if length(a_as_string) > length( b_as_string )
b_as_string( length( b_as_string ) + 1 : length( a_as_string ) ) = " ";
elseif length(a_as_string) < length( b_as_string )
a_as_string( length( a_as_string ) + 1 : length( b_as_string ) ) = " ";
endif
letters_gt = gt(a_as_string, b_as_string); %list of boolean a > b
letters_lt = lt(a_as_string, b_as_string); %list of boolean a < b
ret = 0;
%octave makes us roll our own string compare because
%strings are arrays of numerics
len = length(letters_gt);
for i = 1:len
if letters_gt(i) > letters_lt(i)
ret = 1;
return
elseif letters_gt(i) < letters_lt(i)
ret = -1;
return
endif
end;
endfunction
%Assuming that myarray is already sorted, (it must be for binary
%search to finish in logarithmic time `O(log-n))` worst case, then do
myarray = [ 30, 40, 50.15 ]; %already sorted list
binsearch(myarray, 30, 1, 3) %item 30 is in slot 1
binsearch(myarray, 40, 1, 3) %item 40 is in slot 2
binsearch(myarray, 50, 1, 3) %50 does not exist so return 0
binsearch(myarray, 50.15, 1, 3) %50.15 is in slot 3
%same but for strings:
myarray2 = [ "abc"; "def"; "ghi"]; %already sorted list
binsearch_str(myarray2, "abc", 1, 3) %item abc is in slot 1
binsearch_str(myarray2, "def", 1, 3) %item def is in slot 2
binsearch_str(myarray2, "zzz", 1, 3) %zzz does not exist so return 0
binsearch_str(myarray2, "ghi", 1, 3) %item ghi is in slot 3
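For readers more at home in another language, the same sorted-list lookup can be sketched in Python with the standard bisect module. This is only an illustration of the binary-search idea, not Octave code; binary_search is a made-up name, and it returns 1-based positions with 0 for a miss to match the examples above.

```python
from bisect import bisect_left

def binary_search(sorted_items: list, value) -> int:
    """Return the 1-based index of value in sorted_items, or 0 if absent."""
    pos = bisect_left(sorted_items, value)    # O(log n) comparisons
    if pos < len(sorted_items) and sorted_items[pos] == value:
        return pos + 1                        # 1-based, matching the Octave examples
    return 0

words = ["abc", "def", "ghi"]                 # must already be sorted
print(binary_search(words, "def"))  # 2
print(binary_search(words, "zzz"))  # 0
```

Strings compare lexicographically here without a hand-written comparison function, which is the part the Octave version has to implement itself.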
To sort your array if it isn't already:
Complexity of sorting depends on the kind of data you have and whatever sorting algorithm GNU octave language writers selected, it's somewhere between O(n*log(n)) and O(n*n).
myarray = [ 9, 40, -3, 3.14, 20 ]; %not sorted list
myarray = sort(myarray)
myarray2 = [ "the"; "cat"; "sat"; "on"; "the"; "mat"]; %not sorted list
myarray2 = sortrows(myarray2)
Credit for the changes that make this code work across GNU Octave 3, 5 and 7 goes to @Paulo Carvalho in the other answer here.
Yes check this: http://www.obihiro.ac.jp/~suzukim/masuda/octave/html3/octave_36.html#SEC75
a = ["hello"; "world"];
c = cellstr (a)
⇒ c =
{
[1,1] = hello
[2,1] = world
}
>>> cellidx(c, 'hello')
ans = 1
>>> cellidx(c, 'world')
ans = 2
The cellidx solution does not meet the OP's efficiency requirement, and is deprecated (as noted by help cellidx).
Håvard Geithus in a comment suggested using the lookup() function on a sorted cell array of strings, which is significantly more efficient than cellidx. It's still a binary search though, whereas most modern languages (and even many 20 year old ones) give us easy access to associative arrays, which would be a much better approach.
While Octave doesn't obviously have associative arrays, that's effectively what the interpreter is using for Octave's variables, including structs, so you can make use of that, as described here:
http://math-blog.com/2011/05/09/associative-arrays-and-cellular-automata-in-octave/
Built-in Function: struct ("field", value, "field", value,...)
Built-in Function: isstruct (expr)
Built-in Function: rmfield (s, f)
Function File: [k1,..., v1] = setfield (s, k1, v1,...)
Function File: [t, p] = orderfields (s1, s2)
Built-in Function: fieldnames (struct)
Built-in Function: isfield (expr, name)
Function File: [v1,...] = getfield (s, key,...)
Function File: substruct (type, subs,...)
Converting Matlab to Octave is there a containers.Map equivalent? suggests using javaObject("java.util.Hashtable"). That would come with some setup overhead, but would be a performance win if you're using it a lot. It may even be viable to link in some library written in C or C++? Do think about whether this is a maintainable option though.
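To show what the associative-array approach buys, here is a small Python sketch using a built-in dict as the hash table. It is an analogy only and says nothing about how Octave structs or java.util.Hashtable perform; build_index is a made-up name. The index is built once in O(n), after which each lookup takes constant time on average.

```python
def build_index(items: list) -> dict:
    """Map each string to the 1-based position of its first occurrence."""
    index = {}
    for position, item in enumerate(items, start=1):
        index.setdefault(item, position)   # keep the first position on duplicates
    return index

index = build_index(["hello", "unsorted", "world", "moobar"])
print(index.get("world", 0))   # 3
print(index.get("absent", 0))  # 0
```

Unlike binary search, this does not need the list to be sorted, at the cost of the extra memory for the index.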
Caveat: I'm relatively new to Octave, and writing this up as I research it myself (which is how I wound up here). I haven't yet run tests on the efficiency of these techniques, and while I've got a fair knowledge of the underlying algorithms, I may be making unreasonable assumptions about what's actually efficient in Octave.
This is a version of mystrcmp() that works in Octave of recent version (7.1.0):
function ret = mystrcmp(a, b)
%this function is just an octave string compare, its behavior follows the
%strcmp(str1,str2)'s in C and java.lang.String.compareTo(...)'s in Java,
%that is:
% -returns 1 if string a > b
% -returns 0 if string a == b
% -return -1 if string a < b
% The gt() operator does not support cell array. If the single word
% is passed as an one-element cell array, converts it to a string.
a_as_string = a;
if iscellstr( a )
a_as_string = a{1}; %a was passed as a single-element cell array.
endif
% The gt() operator does not support cell array. If the single word
% is passed as an one-element cell array, converts it to a string.
b_as_string = b;
if iscellstr( b )
b_as_string = b{1}; %b was passed as a single-element cell array.
endif
% Space-pad the shortest word so as they can be used with gt() and lt() operators.
if length(a_as_string) > length( b_as_string )
b_as_string( length( b_as_string ) + 1 : length( a_as_string ) ) = " ";
elseif length(a_as_string) < length( b_as_string )
a_as_string( length( a_as_string ) + 1 : length( b_as_string ) ) = " ";
endif
letters_gt = gt(a_as_string, b_as_string); %list of boolean a > b
letters_lt = lt(a_as_string, b_as_string); %list of boolean a < b
ret = 0;
%octave makes us roll our own string compare because
%strings are arrays of numerics
len = length(letters_gt);
for i = 1:len
if letters_gt(i) > letters_lt(i)
ret = 1;
return
elseif letters_gt(i) < letters_lt(i)
ret = -1;
return
endif
end;
endfunction

Resources