Most efficient interval type search in Elixir

I am starting my journey with Elixir and am looking for some advice on how best to approach a particular problem.
I have a data set that needs to be searched as quickly as possible. The data consists of two numbers that form an enclosed band and some meta data associated with each band.
For example:
From,To,Data
10000,10999,MetaData1
11000,11999,MetaData2
12000,12499,MetaData3
12500,12999,MetaData4
This data set could have upwards of 100,000 entries.
I have a struct defined that models the data, along with a parser that creates an Elixir list in-memory representation.
defmodule Band do
  defstruct from: 0, to: 0, metadata: 0
end
The parser returns a list of Band structs. I defined a find function that uses a list comprehension:
defp find_metadata(bands, number) do
  match? = fn(x) -> x.from <= number and x.to >= number end
  [match | _] = for band <- bands, match?.(band), do: band
  {:find, match}
end
Based on my newbie knowledge, using the list comprehension will require a full traversal of the list. To avoid scanning the full list, I have used search trees in other languages.
Is there an algorithm/mechanism/approach available in Elixir that would be more efficient for this type of search problem?
Thank you.

If the bands are mutually exclusive you could structure them into a tree sorted by from. Searching through that tree should take log(n) time. Something like the following should work:
defmodule Tree do
  defstruct left: nil, right: nil, key: nil, value: nil

  def empty do
    nil
  end

  # value is a {from, to} tuple; the tree is keyed on from
  def insert(tree, value = {key, _}) do
    cond do
      tree == nil -> %Tree{left: empty(), right: empty(), key: key, value: value}
      key < tree.key -> %{tree | left: insert(tree.left, value)}
      true -> %{tree | right: insert(tree.right, value)}
    end
  end

  def find_interval(tree, value) do
    cond do
      tree == nil -> nil
      value < tree.key -> find_interval(tree.left, value)
      between(tree.value, value) -> tree.value
      true -> find_interval(tree.right, value)
    end
  end

  def between({left, right}, value) do
    value >= left and value <= right
  end
end
Note that you can also use Ranges to store the "bands" as you call them. Also note that the tree isn't balanced, so the log(n) bound only holds if the insertion order happens to produce a balanced tree. A simple scheme to (probably) achieve a balanced tree is to shuffle the intervals before inserting them. Otherwise you'd need a more complex implementation that balances the tree. You can look at Erlang's gb_trees for inspiration.
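As a point of comparison, the same logarithmic lookup can be sketched without a tree at all: sort the bands once by their lower bound and binary-search that list. The sketch below is in Python rather than Elixir, assumes the bands do not overlap, and uses made-up names (bands, starts, find_band) purely for illustration.

```python
from bisect import bisect_right

# Sample data from the question as (from, to, metadata) tuples, in arbitrary order.
bands = [
    (12000, 12499, "MetaData3"),
    (10000, 10999, "MetaData1"),
    (12500, 12999, "MetaData4"),
    (11000, 11999, "MetaData2"),
]

bands.sort(key=lambda band: band[0])   # sort once by lower bound
starts = [band[0] for band in bands]   # parallel list of lower bounds


def find_band(number):
    """Return the band containing number, or None if no band contains it."""
    # Index of the last band whose lower bound is <= number.
    i = bisect_right(starts, number) - 1
    if i >= 0 and bands[i][0] <= number <= bands[i][1]:
        return bands[i]
    return None


print(find_band(11500))
print(find_band(9999))
```

Each lookup costs O(log n) comparisons after the one-off sort, and because the sorted list is always "balanced" there is no need to shuffle the input first.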

Related

How can I generate a graph by constraining it to be subisomorphic to a given graph, while not subisomorphic to another?

TL;DR: How can I generate a graph while constraining it to be subisomorph to every graph in a positive list while being non-subisomorph to every graph in a negative list?
I have a list of directed heterogeneous attributed graphs labeled as positive or negative. I would like to find the smallest list of patterns(graphs with special values) such that:
Every input graph has a pattern that matches(= 'P is subisomorphic to G, and the mapped nodes have the same attribute values')
A positive pattern can only match a positive graph
A positive pattern does not match any negative graph
A negative pattern can only match a negative graph
A negative pattern does not match any positive graph
Example:
Input g1(+),g2(-),g3(+),g4(+),g5(-),g6(+)
Acceptable solution: p1(+),p2(+),p3(-) where p1(+) matches g1(+) and g4(+); p2(+) matches g3(+) and g6(+); and p3(-) matches g2(-) and g5(-)
Non acceptable solution: p1(+),p2(-) where p1(+) matches g1(+),g2(-),g3(+); p2(-) matches g4(+),g5(-),g6(+)
Currently, I'm able to generate graphs matching every graph in a list, but I can't manage to enforce the constraint 'A positive pattern does not match any negative graph'. I made a predicate 'matches', which takes as input a pattern and a graph, and uses a local array of variables 'mapping' to try and map nodes together. But when I try to use that predicate in a negative context, the following error is returned: MiniZinc: flattening error: free variable in non-positive context.
How can I bypass that limitation? I tried to code the opposite predicate 'not_matches', but I've not yet found how to specify 'for all node mappings, the isomorphism is invalid'. I also can't define the mapping outside the predicate, because a pattern can match a graph more than once and I need to be able to get all mappings.
Here is a reproducible example:
include "globals.mzn";
predicate p(array [1..5] of var 0..10:arr1, array [1..5] of 1..10:arr2)=
let{array [1..5] of var 1..5: mapping; constraint all_different(mapping)} in (forall(i in 1..5)(arr1[i]=0\/arr1[i]=arr2[mapping[i]]));
array [1..5] of var 0..10:arr;
constraint p(arr,[1,2,3,4,5]);
constraint p(arr,[1,2,3,4,6]);
constraint not p(arr,[1,2,3,5,6]);
solve satisfy;
In that example, the decision variable is an array, and the predicate p is true if a mapping exists such that the values of the array are mapped together. One or more elements of the array can also be 0, used here as a wildcard.
[1,2,3,4,0] is an acceptable solution
[0,0,0,0,0] is not acceptable, it matches anything. And the solution should not match [1,2,3,5,6]
[1,2,3,4,7] is not acceptable, it doesn't match anything(as there is no 7 in the parameter arrays)
Thanks in advance! =)
Edit: Added non-acceptable solutions
It is probably good to note that MiniZinc's limitation is not coincidental. When the creation of a free variable is negated, rather than finding a valid assignment for the variable, the model would instead have to prove that no such valid assignment exists. This is a much harder problem that would bring MiniZinc into the field of quantified constraint programming. The only general solution (to still receive the same flattened constraint model) would be to iterate over all possible values for each variable and enforce the negated constraints. Since the number of possibilities quickly explodes and the chance of getting a good model is small, MiniZinc does not do this automatically and throws this error instead.
This technique would work in your case as well. In the not_matches version of your predicate, you can iterate over all possible permutations (the possible mappings) and enforce that they are not correct (partial) mappings. This would be a correct way to enforce the constraint, but would quickly explode. I believe, however, that there is a different way to enforce this constraint that will work better.
My idea stems from the fact that, although the most natural way to describe a permutation from one array to another is to actually create the assignment from the first to the second, when dealing with discrete variables you can instead enforce that each array has the exact same number of each possible value. As such, a predicate that enforces that X is a permutation of Y might be written as:
predicate is_perm(array[int] of var $$E: X, array[int] of var $$E: Y) =
let {
array[int] of int: vals = [i | i in (dom_array(X) union dom_array(Y))]
} in global_cardinality(X, vals) = global_cardinality(Y, vals);
Notably this predicate can be negated because it doesn't contain any free variables. All new variables (the resulting values of global_cardinality) are functionally defined. When negated, only the relation = has to be changed to !=.
In your model, we are not just considering full permutations, but rather partial permutations, and we use a dummy value otherwise. As such, the p predicate might also be written:
predicate p(array [int] of var 0..10: X, array [int] of var 1..10: Y) =
let {
set of int: vals = lb_array(Y)..ub_array(Y); % must not include dummy value
array[vals] of var int: countY = global_cardinality(Y, [i | i in vals]);
array[vals] of var int: countX = global_cardinality(X, [i | i in vals]);
} in forall(i in vals) (countX[i] <= countY[i]);
Again, this predicate does not contain any free variables and can be negated. In this case, the forall can be changed into an exists with a negated body.
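To see why comparing counts can stand in for searching for a mapping, here is a small sanity check outside MiniZinc, written in Python. It assumes both arrays have the same length and that 0 is the wildcard, as in the question; the function names are invented for this sketch. The brute-force version tries every permutation, while the counting version only compares how often each non-wildcard value occurs.

```python
from collections import Counter
from itertools import permutations

WILDCARD = 0  # dummy value used in the question's pattern arrays


def matches_by_mapping(x, y):
    """Brute force: does some permutation of positions map x onto y?"""
    return any(
        all(xi == WILDCARD or xi == y[m] for xi, m in zip(x, mapping))
        for mapping in permutations(range(len(y)))
    )


def matches_by_counts(x, y):
    """Counting form: every non-wildcard value occurs in x at most as often as in y."""
    count_x = Counter(v for v in x if v != WILDCARD)
    count_y = Counter(y)
    return all(n <= count_y[v] for v, n in count_x.items())


pattern = [1, 2, 3, 4, 0]
for graph in ([1, 2, 3, 4, 5], [1, 2, 3, 4, 6], [1, 2, 3, 5, 6]):
    print(graph, matches_by_mapping(pattern, graph), matches_by_counts(pattern, graph))
```

Because the counting form has no existentially quantified mapping, its negation is simply "some value occurs more often in x than in y", which is the property that makes the MiniZinc predicate negatable.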
There are a few things that we can still do to optimise p for this use case. First, it seems that global_cardinality is only defined for variables, but since Y is guaranteed par, we can rewrite it and have the correct counts computed during MiniZinc's compilation. Second, it can be seen that lb_array(Y)..ub_array(Y) gives the tightest possible set. In your example, this means that slightly different versions of the global cardinality function are evaluated for each call, where they could have been the same and reused through common subexpression elimination (CSE). Both changes are marked in the version below:
predicate p(array [1..5] of var 0..10: X, array [1..5] of 1..10: Y) =
let {
% CHANGE: Use declared values of Y to ensure CSE will reuse `global_cardinality` result values.
set of int: vals = 1..10; % do not include dummy value
% CHANGE: parameter evaluation of global_cardinality
array[vals] of int: countY = [count(j in index_set(Y)) (i = Y[j]) | i in vals];
array[vals] of var int: countX = global_cardinality(X, [i | i in 1..10]);
} in forall(i in vals) (countX[i] <= countY[i]);
Regarding the example: one approach might be to rewrite the not p(...) constraint as a specific not_p(...) constraint. But I'm not sure how that should be formulated.
Here's an example but it's probably not correct:
predicate not_p(array [1..5] of var 0..10:arr1, array [1..5] of 1..10:arr2)=
let{
array [1..5] of var 1..5: mapping;
constraint all_different(mapping)
} in
exists(i in 1..5)(
arr1[i] != 0
/\
arr1[i] != arr2[mapping[i]]
);
This gives 500 solutions such as
arr = [1, 0, 0, 0, 0];
----------
arr = [2, 0, 0, 0, 0];
----------
arr = [3, 0, 0, 0, 0];
...
----------
arr = [2, 0, 0, 3, 4];
----------
arr = [2, 0, 1, 3, 4];
----------
arr = [2, 1, 0, 3, 4];
Update
I added not before the exists loop.

Nim: Find index of element in seq based on predicate

If I have a sequence of values, how would I find the index of an element based on a predicate function? For example, if I had the following seq:
let values = @["pie", "cake", "ice cream"]
How would I find the index of the first element with four characters? I know of find, but it seems to only find an index by equality, and does not allow passing a predicate. I could implement this myself, but it feels as if it should be in the standard library if find is.
A simple solution would be to use map from sequtils to map the predicate over the input sequence, and then to use find to get the index of the first true value in the result. This returns -1 when no element of the sequence satisfies the predicate:
import sequtils
proc isLen4(s: string): bool =
  len(s) == 4
echo map(@["pie", "cake", "ice cream"], isLen4).find(true) #--> 1
This works, but is bad for large sequences since map processes the entire sequence. Thus even when the first element satisfies the predicate the entire sequence is processed. It would be better to just write a findIf procedure that returns the current index when the predicate is satisfied instead of continuing to process the rest of the input:
proc findIf[T](s: seq[T], pred: proc(x: T): bool): int =
  result = -1 # return -1 if no items satisfy the predicate
  for i, x in s:
    if pred(x):
      result = i
      break
echo @["pie", "cake", "ice cream"].findIf(isLen4) #--> 1
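For readers more familiar with Python, the same short-circuiting idea can be sketched with a lazy generator. This is only an illustration of the technique, not Nim code; find_if is a hypothetical helper name.

```python
def find_if(items, pred):
    """Return the index of the first item satisfying pred, or -1 if none does."""
    # The generator is consumed lazily, so pred is never called past the first match.
    return next((i for i, item in enumerate(items) if pred(item)), -1)


values = ["pie", "cake", "ice cream"]
print(find_if(values, lambda s: len(s) == 4))
```

As with the Nim findIf, the search stops at the first hit instead of mapping the predicate over the whole sequence.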

Permutation to disjoint cycles in Haskell

I was trying to implement permutation to cycles in Haskell without using Monad. The problem is as follows: given a permutation of the numbers [1..n], output the corresponding disjoint cycles. The function is defined like
permToCycles :: [Int] -> [[Int]]
For the input:
permToCycles [3,5,4,1,2]
The output should be
[[3,4,1],[5,2]]
By the definition of cyclic permutation, the algorithm itself is straightforward. Since [3,5,4,1,2] is a permutation of [1,2,3,4,5], we start from the first element 3 and follow the orbit until we get back to 3, then continue in the same way until we have traversed all elements. In this example, we have two cycles: 3 -> 4 -> 1 -> 3 and 5 -> 2 -> 5. Thus the output is [[3,4,1],[5,2]].
Using this idea, it is fairly easy to implement in any imperative language, but I have trouble doing it in Haskell. I found something similar in the module Math.Combinat.Permutations, but the implementation of the function permutationToDisjointCycles uses Monad, which is not easy to understand as I'm a beginner.
I was wondering if I could implement it without Monad. Any help is appreciated.
UPDATE: Here is the function implemented in Python.
def permToCycles(perm):
    pi_dict = {i+1: perm[i]
               for i in range(len(perm))}  # permutation as a dictionary
    cycles = []
    while pi_dict:
        first_index = next(iter(pi_dict))  # take the first key
        this_elem = pi_dict[first_index]  # the first element in perm
        next_elem = pi_dict[this_elem]  # next element according to the orbit
        cycle = []
        while True:
            cycle.append(this_elem)
            # delete the item in the dict when adding to cycle
            del pi_dict[this_elem]
            this_elem = next_elem
            if next_elem in pi_dict:
                # continue the cycle
                next_elem = pi_dict[next_elem]
            else:
                # end the cycle
                break
        cycles.append(cycle)
    return cycles
print(permToCycles([3, 5, 4, 1, 2]))
The output is
[[3,4,1],[5,2]]
I think the main obstacle when implementing it in Haskell is how to keep track of the marked (or unmarked) elements. In Python, it can easily be done using a dictionary as I showed above. Also, in functional programming we tend to use recursion to replace loops, but here I have trouble seeing the recursive structure of this problem.
Let's start with the basics. You hopefully started with something like this:
permutationToDisjointCycles :: [Int] -> [[Int]]
permutationToDisjointCycles perm = ...
We don't actually want to recur on the input list so much as we want to use an index counter. In this case, we'll want a recursive helper function, and the next step is to just go ahead and call it, providing whatever arguments you think you'll need. How about something like this:
permutationToDisjointCycles perm = cycles [] 0
  where
    cycles :: [Int] -> Int -> [[Int]]
    cycles seen ix = ...
Instead of declaring a pi_dict variable like in Python, we'll start with a seen list as an argument (I flipped it around to keeping track of what's been seen because that ends up being a little easier). We do the same with the counting index, which I here called ix. Let's consider the cases:
cycles seen ix
  | ix >= length perm = -- we've reached the end of the list
  | ix `elem` seen    = -- we've already seen this index
  | otherwise         = -- we need to generate a cycle.
That last case is the interesting one and corresponds to the inner while loop of the Python code. Another while loop means, you guessed it, more recursion! Let's make up another function that we think will be useful, passing along as arguments what would have been variables in Python:
  | otherwise = let c = makeCycle ix ix in c : cycles (c ++ seen) (ix+1)
makeCycle :: Int -> Int -> [Int]
makeCycle startIx currentIx = ...
Because it's recursive, we'll need a base case and recursive case (which corresponds to the if statement in the Python code which either breaks the loop or continues it). Rather than use the seen list, it's a little simpler to just check if the next element equals the starting index:
makeCycle startIx currentIx =
  if next == startIx
    then -- base case
    else -- recursive call, where we attach an index onto the cycle and recur
  where next = perm !! currentIx
I left a couple holes that need to be filled in as an exercise, and this version works on 0-indexed lists rather than 1-indexed ones like your example, but the general shape of the algorithm is there.
As a side note, the above algorithm is not super efficient. It uses lists for both the input list and the "seen" list, and lookups in lists take O(n) time. One very simple performance improvement is to immediately convert the input list perm into an array/vector, which has constant-time lookups, and then index into that instead of using the perm !! lookup at the end.
The next improvement is to change the "seen" list into something more efficient. To match the idea of your Python code, you could change it to a Set (or even a HashSet), which has logarithmic time lookups (or constant with a hashset).
The code you found in Math.Combinat.Permutations actually uses an array of Booleans for the "seen" list, and then uses the ST monad to do imperative-like mutation on that array. This is probably even faster than using Set or HashSet, but as you yourself could tell, readability of the code suffers a bit.
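To make the recursive shape concrete without giving away the Haskell exercise, here is one possible rendering of the same two-function structure in Python, the language of the question's reference implementation. It is a sketch under the same assumptions as the outline above: indices are 0-based (so perm[i] is the image of i), the "seen" collection is passed along as an argument rather than mutated, and the helper names simply mirror cycles and makeCycle.

```python
def perm_to_cycles(perm):
    """Disjoint cycles of a 0-indexed permutation, using recursion instead of loops."""

    def make_cycle(start_ix, current_ix):
        nxt = perm[current_ix]
        if nxt == start_ix:
            # Base case: the orbit has come back to where it started.
            return [current_ix]
        # Recursive case: attach this index and keep following the orbit.
        return [current_ix] + make_cycle(start_ix, nxt)

    def cycles(seen, ix):
        if ix >= len(perm):
            return []                      # reached the end of the list
        if ix in seen:
            return cycles(seen, ix + 1)    # index already belongs to a cycle
        c = make_cycle(ix, ix)
        return [c] + cycles(seen | set(c), ix + 1)

    return cycles(frozenset(), 0)


# The question's example [3,5,4,1,2], shifted down by one to be 0-indexed.
print(perm_to_cycles([2, 4, 3, 0, 1]))
```

Each cycle here starts at its smallest index, so the cycles are the same as in the question's output but may be written with a different starting element.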

Elixir - Convert a string number or empty string to a float or nil

I am trying to convert the price field, which is a string (eg "2.22" or "") to a float or nil, and then add it to the database.
def insert_product_shop(conn, product_id, shop_id, price) do
  priceFloat = nil
  if price not in [""] do
    price = elem(Float.parse(price), 0)
    priceFloat = price / 1
    IO.inspect(priceFloat)
  else
    priceFloat = nil
  end
  IO.inspect(priceFloat)
  changeset = Api.ProductShop.changeset(%Api.ProductShop{
    p_id: product_id,
    s_id: shop_id,
    price: priceFloat,
    not_in_shop_count: 0,
    is_in_shop_count: 0
  })
  errors = changeset.errors
  valid = changeset.valid?
  IO.inspect(changeset)
  case insert(changeset) do
    {:ok, product_shop} ->
      {:ok, product_shop}
    {:error, changeset} ->
      {:error, :failure}
  end
end
the output is:
2.22
nil
#Ecto.Changeset<action: nil, changes: %{}, errors: [], data: #Api.ProductShop<>,
valid?: true>
13:25:41.745 [debug] QUERY OK db=2.0ms
INSERT INTO "product_shops" ("is_in_shop_count","not_in_shop_count","p_id","s_id") VALUES ($1,$2,$3,$4) RETURNING "id" [0, 0, 40, 1]
As the output shows, priceFloat becomes nil, I assume because when I set it to 2.22 it was out of scope. Maybe my code is too imperative. How can I rewrite this to convert "2.22" to 2.22 without making it nil, and allow "" to be converted to nil?
As the output shows, priceFloat becomes nil, I assume because when I set it to 2.22 it was out of scope.
Almost right. Rather than the variable you are trying to set being out of scope, the problem is that the variable you assign to inside the if statement goes out of scope. It just happens to have the same name as the variable outside the if statement.
The solution is to assign the result of the if/else statement to the variable. Here is your code with minimal changes:
price = "2.22"
priceFloat =
  if price not in [""] do
    elem(Float.parse(price), 0)
  else
    nil
  end
IO.inspect(priceFloat)
However, it's still not very idiomatic. You can take advantage of the fact that Float.parse/1 returns :error when the input is the empty string and write it with a case expression:
priceFloat =
  case Float.parse(price) do
    {float, ""} -> float
    :error -> nil
  end
You can use case to evaluate the value returned by Float.parse and assign nil when it returns :error, assuming that the purpose of your if is to avoid the parsing error:
def insert_product_shop(conn, product_id, shop_id, price) do
  priceFloat = case Float.parse(price) do
    {value, _remainder} -> value
    :error -> nil
  end
  ...
end
You can use a combination of pattern matching and multiple function clauses to solve the problem:
defmodule Example do
  def parsePrice(""), do: nil
  def parsePrice(price) when is_float(price), do: price
  def parsePrice(price) when is_binary(price) do
    {float, _} = Float.parse(price)
    float
  end
end
Example.parsePrice(2.22) |> IO.inspect
Example.parsePrice("2.22") |> IO.inspect
(The equivalent is achievable using a case statement)
If you pass anything that is not a binary (a string) or a float to this function it will cause a pattern unmatched error. This may be good in case you have some error reporting in place, so you can detect unexpected usage of your code.
For a better debugging experience, I encourage you to use the built-in debugger via IEx.pry/0.
For the sake of diversity, I’d post another approach that uses the with/1 special form.
with {f, ""} <- Float.parse("3.14"),
  do: f,
  else: (_ -> nil)
Here we explicitly match only a parse result with nothing left over. Input with trailing garbage fails the match, as does :error. If the match succeeds, we return the float; otherwise, we return nil.
Beware that Float.parse/1 might be confused by garbage that looks like scientific notation.
(with {f, ""} <- Float.parse("3e14"), do: f) == 300_000_000_000_000
#⇒ true
Important sidenote: assigning priceFloat inside if does not change the value of the priceFloat variable outside of that scope. Scoping in Elixir is pretty important, and one cannot propagate local variables to the outermost scope, unlike in most languages.
foo = 42
if true, do: foo = 3.14
IO.puts(foo)
#⇒ 42
Well, to some extent it’s possible to affect outermost scope variables from macros with var!/2, and if is indeed a macro, but this whole stuff is definitely far beyond the scope of this question.

How to search Lua table values

I have a project that calls for a relational database like structure in an environment where an actual database isn't possible. The language is restricted to Lua, which is far from being my strongest language. I've got a table of tables with a structure like this:
table={
m:r={
x=1
y=1
displayName="Red"
}
m:y={
x=1
y=2
displayName="Yellow"
}
}
Building, storing and retrieving the table is straightforward enough. Where I'm running into issues is searching it. For the sake of clarity, if I could use SQL I'd do this:
SELECT * FROM table WHERE displayName="Red"
Is there a Lua function that will let me search this way?
The straightforward way is to iterate through all elements and find one that matches your criteria:
local t={
  r={
    x=1,
    y=1,
    displayName="Red",
  },
  y={
    x=1,
    y=2,
    displayName="Yellow",
  },
}
for key, value in pairs(t) do
  if value.displayName == 'Red' then
    print(key)
  end
end
This should print 'r'.
This may be quite slow on large tables. To speed up this process, you may keep track of the references in a hash that will provide much faster access. Something like this may work:
local cache = {}
local function findValue(key)
  if cache[key] == nil then
    local value
    -- do a linear search iterating through table elements searching for 'key'
    -- store the result if found
    cache[key] = value
  end
  return cache[key]
end
If the elements in the table change their values, you'll need to invalidate the cache when the values are updated or removed.
There are no built-in functions for searching tables. There are many ways to go about it which vary in complexity and efficiency.
local t = {
  r={displayname="Red", name="Ruby", age=15, x=4, y=10},
  y={displayname="Blue", name="Trey", age=22, x=3, y=2},
  t={displayname="Red", name="Jack", age=20, x=2, y=3},
  h={displayname="Red", name="Tim", age=25, x=2, y=33},
  v={displayname="Blue", name="Bonny", age=10, x=2, y=0}
}
In Programming in Lua they recommend building a reverse table for efficient lookups.
revDisplayName = {}
for k,v in pairs(t) do
  if revDisplayName[v.displayname] then
    table.insert(revDisplayName[v.displayname], k)
  else
    -- first row with this display name: key the list by the name, not by the row table
    revDisplayName[v.displayname] = {k}
  end
end
Then you can match display names easily:
for _, rowname in pairs(revDisplayName["Red"]) do
  print(t[rowname].x, t[rowname].y)
end
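The reverse-table idea is not specific to Lua. As a language-neutral illustration, here is a minimal Python sketch of the same index, using the sample rows from this answer; build_index and by_display_name are names made up for the example.

```python
from collections import defaultdict

# Same sample rows as the Lua table above (age omitted for brevity).
rows = {
    "r": {"displayname": "Red", "name": "Ruby", "x": 4, "y": 10},
    "y": {"displayname": "Blue", "name": "Trey", "x": 3, "y": 2},
    "t": {"displayname": "Red", "name": "Jack", "x": 2, "y": 3},
    "h": {"displayname": "Red", "name": "Tim", "x": 2, "y": 33},
    "v": {"displayname": "Blue", "name": "Bonny", "x": 2, "y": 0},
}


def build_index(table, field):
    """Map each value of field to the list of row keys that carry it."""
    index = defaultdict(list)
    for key, row in table.items():
        index[row[field]].append(key)
    return index


by_display_name = build_index(rows, "displayname")
for key in by_display_name["Red"]:
    print(key, rows[key]["x"], rows[key]["y"])
```

Building the index is a single pass over the table; afterwards each lookup by display name is a hash access rather than a scan. As with the cache approach, the index has to be rebuilt or updated whenever the indexed field changes.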
There is code for creating SQL-like queries in Lua, on Lua tables, in Beginning Lua Programming if you want to build complex queries.
If you just want to search through a few records for matches, you can abstract the searching using an iterator in Lua
function allmatching(tbl, kvs)
  return function(t, key)
    local row -- keep row local so it does not leak into the global environment
    repeat
      key, row = next(t, key)
      if key == nil then
        return
      end
      for k, v in pairs(kvs) do
        if row[k] ~= v then
          row = nil
          break
        end
      end
    until row ~= nil
    return key, row
  end, tbl, nil
end
which you can use like so:
for k, row in allmatching(t, {displayname="Red", x=2}) do
  print(k, row.name, row.x, row.y)
end
which prints
h Tim 2 33
t Jack 2 3
