Is it possible to destruct sequence in Nim? - nim-lang

Is it possible to get first N elements in Nim? Something like:
let [a, b, ...rest] = "a/b/c".split("/")
P.S. Use case: I'm trying to parse a string like "NYSE:MSFT".
import strutils, strformat
proc parse_esymbol*(esymbol: string): tuple[string, string] =
  let parts = esymbol.split(":")
  assert parts.len == 2, fmt"invalid esymbol '{esymbol}'"
  (parts[0], parts[1])
echo parse_esymbol("NYSE:MSFT")

You can assign variables from a tuple like this:
let (a,b) = ("a","b")
There isn't a built-in seq to tuple conversion, but you can do it with a little macro like this:
import macros, strutils
macro first[T](s: openArray[T], l: static[int]): untyped =
  result = newNimNode(nnkPar)
  for i in 0..<l:
    result.add nnkBracketExpr.newTree(s, newLit(i))
let (a,b) = "a/b/c".split('/').first(2)

There are currently at least two libraries that implement a macro like the one in this answer: unpack and definesugar.
import strutils
import unpack
block:
  [a, b, *rest] <- "a/b/c/d/e/f".split("/")
  echo a,b
  echo rest
import definesugar
block:
  (a, b, *rest) := "a/b/c/d/e/f".split("/")
  echo a,b
  echo rest
# output for both
# ab
# @["c", "d", "e", "f"]
Recent discussion: https://forum.nim-lang.org/t/7072
For your specific use case though, I would implement something with https://nim-lang.github.io/Nim/strscans.html

Related

python 'concatenate' requires extra parentheses

I'm trying to concatenate 3 lists. When I try to use concatenate, like so, I get an error (TypeError: 'list' object cannot be interpreted as an integer):
import numpy as np
a = [1]
b = [2]
c = [3]
z = np.concatenate(a, b, c)
But if I put "extra" parentheses, it works like so:
z = np.concatenate((a, b, c))
Why?
I am not sure what library you are using (concatenate is not a built-in python 3.x function). However, I'll explain what I think is going on.
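When you call concatenate(a, b, c), the function receives three separate positional arguments: a, b, and c. It expects the whole sequence of arrays as its first argument, so b and c end up bound to its other parameters. If this is NumPy's concatenate, the second positional parameter is axis, which is why the list b "cannot be interpreted as an integer".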
When you call concatenate((a, b, c)), a tuple (effectively a list that cannot be changed) is created with a value of (a, b, c), which is evaluated to ([1], [2], [3]). Then this tuple is passed to the concatenate function. The following code is actually equivalent to your second code snippet:
a = [1]
b = [2]
c = [3]
y = (a, b, c) # This evaluates to ([1], [2], [3]).
z = concatenate(y)
I hope I've explained this clearly enough. Here's an article that explains tuples in more depth, if I haven't: https://www.w3schools.com/python/python_tuples.asp
EDIT: Thanks for including the library. Here's the code for what you're probably trying to do:
import numpy as np
a = [1]
b = [2]
c = [3]
z = np.array(a + b + c) # Lists can be concatenated using the `+` operator. Then, to make a numpy array, just call the constructor
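To make the argument binding concrete, here is a minimal sketch that does not need NumPy. The concatenate function below is a hypothetical stand-in written for this illustration; it only mimics the parameter order of numpy.concatenate (sequence first, then axis, then out) and is not NumPy's implementation.

```python
import operator


def concatenate(arrays, axis=0, out=None):
    """Hypothetical stand-in: mimics only the parameter order of numpy.concatenate."""
    # Insist that axis is an integer; passing a list here raises TypeError.
    axis = operator.index(axis)
    joined = []
    for arr in arrays:
        joined.extend(arr)
    return joined


a, b, c = [1], [2], [3]

# One argument: a tuple holding the three lists.
z = concatenate((a, b, c))

# Three arguments: a is bound to arrays, b to axis, c to out.
try:
    concatenate(a, b, c)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(z)
print(error_message)
```

The "extra" parentheses are therefore not extra: they build the single sequence argument that the function expects.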

Why does Python 3 print statement appear to alter a variable, declared later in the code, but works fine without it?

I am running Python 3.6.2 on Windows 10 and was learning about the zip() function.
I wanted to print part of the object returned by the zip() function.
Here is my code, without the troublesome print statement:
a = ("John", "Charles", "Mike")
b = ("Jenny", "Christy", "Monica", "Vicky")
x = zip(a, b)
tup = tuple(x)
print(tup)
print(type(tup))
print(len(tup))
print(tup[1])
Here is my code with the troublesome print statement:
a = ("John", "Charles", "Mike")
b = ("Jenny", "Christy", "Monica", "Vicky")
x = zip(a, b)
print(tuple(x)[1])
tup = tuple(x)
print(tup)
print(type(tup))
print(len(tup))
print(tup[1])
The print(tuple(x)[1]) statement appears to change the tuple 'tup' into a zero-length one and causes the print(tup[1]) to fail later in the code!
In this line, you create an iterator:
x = zip(a, b)
Within the print statement, you convert the iterator to a tuple. This tuple has 3 elements. This exhausts the iterator and anytime you call it afterwards, it will return no further elements.
Therefore, upon your creation of tup, your iterator does not return an element. Hence, you have a tuple with length 0. And of course, this will raise an exception when you try to access the element with index 1.
For testing, consider this:
a = ("John", "Charles", "Mike")
b = ("Jenny", "Christy", "Monica", "Vicky")
x = zip(a, b)
tup1 = tuple(x)
tup2 = tuple(x)
print(tup1)
print(tup2)
It will give you the following result:
(('John', 'Jenny'), ('Charles', 'Christy'), ('Mike', 'Monica'))
()
This is basically what you do when creating a tuple out of an iterator twice.
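A minimal sketch of the usual fix, assuming you want to index the zipped pairs more than once: convert the iterator to a tuple a single time and reuse that tuple. The variable names pairs, first_pass and second_pass are only illustrative.

```python
a = ("John", "Charles", "Mike")
b = ("Jenny", "Christy", "Monica", "Vicky")

# Consume the zip iterator exactly once and keep the result.
pairs = tuple(zip(a, b))
second = pairs[1]      # indexing the stored tuple does not consume anything
length = len(pairs)    # still the full length afterwards

# For comparison: converting the same iterator twice.
x = zip(a, b)
first_pass = tuple(x)   # drains x
second_pass = tuple(x)  # x is already exhausted, so this is empty

print(second, length)
print(len(first_pass), len(second_pass))
```

If the pairs are needed in several places, calling zip(a, b) again also works, since each call creates a fresh iterator.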

Unpack multiple variables from sequence

I am expecting the code below to print chr7.
import strutils
var splitLine = "chr7 127471196 127472363 Pos1 0 +".split()
var chrom, startPos, endPos = splitLine[0..2]
echo chrom
Instead it prints @[chr7, 127471196, 127472363].
Is there a way to unpack multiple values from sequences at the same time?
And what would the tersest way to do the above be if the elements weren't contiguous? For example:
var chrom, startPos, strand = splitLine[0..1, 5]
Gives the error:
read_bed.nim(8, 40) Error: type mismatch: got (seq[string], Slice[system.int], int literal(5))
but expected one of:
system.[](a: array[Idx, T], x: Slice[system.int])
system.[](s: string, x: Slice[system.int])
system.[](a: array[Idx, T], x: Slice[[].Idx])
system.[](s: seq[T], x: Slice[system.int])
var chrom, startPos, strand = splitLine[0..1, 5]
^
This can be accomplished using macros.
import macros
macro `..=`*(lhs: untyped, rhs: tuple|seq|array): auto =
  # Check that the lhs is a tuple of identifiers.
  expectKind(lhs, nnkPar)
  for i in 0..len(lhs)-1:
    expectKind(lhs[i], nnkIdent)
  # Result is a statement list starting with an
  # assignment to a tmp variable of rhs.
  let t = genSym()
  result = newStmtList(quote do:
    let `t` = `rhs`)
  # assign each component to the corresponding
  # variable.
  for i in 0..len(lhs)-1:
    let v = lhs[i]
    # skip assignments to _.
    if $v.toStrLit != "_":
      result.add(quote do:
        `v` = `t`[`i`])
macro headAux(count: int, rhs: seq|array|tuple): auto =
  let t = genSym()
  result = quote do:
    let `t` = `rhs`
    ()
  for i in 0..count.intVal-1:
    result[1].add(quote do:
      `t`[`i`])
template head*(count: static[int], rhs: untyped): auto =
  # We need to redirect this through a template because
  # of a bug in the current Nim compiler when using
  # static[int] with macros.
  headAux(count, rhs)
var x, y: int
(x, y) ..= (1, 2)
echo x, y
(x, _) ..= (3, 4)
echo x, y
(x, y) ..= @[4, 5, 6]
echo x, y
let z = head(2, @[4, 5, 6])
echo z
(x, y) ..= head(2, @[7, 8, 9])
echo x, y
The ..= macro unpacks tuple or sequence assignments. You can accomplish the same with var (x, y) = (1, 2), for example, but ..= works for seqs and arrays, too, and allows you to reuse variables.
The head template/macro extracts the first count elements from a tuple, array, or seq and returns them as a tuple (which can then be used like any other tuple, e.g. for destructuring with let or var).
For anyone that's looking for a quick solution, here's a nimble package I wrote called unpack.
You can do sequence and object destructuring/unpacking with syntax like this:
someSeqOrTupleOrArray.lunpack(a, b, c)
[a2, b2, c2] <- someSeqOrTupleOrArray
{name, job} <- tim
tom.lunpack(job, otherName = name)
{job, name: yetAnotherName} <- john
Currently pattern matching in Nim only works with tuples. This also makes sense, because pattern matching requires a statically known arity. For instance, what should happen in your example, if the seq does not have a length of three? Note that in your example the length of the sequence can only be determined at runtime, so the compiler does not know if it is actually possible to extract three variables.
Therefore I think the solution which was linked by @def- was going in the right direction. This example uses arrays, which do have a statically known size. In this case the compiler knows the tuple arity, i.e., the extraction is well defined.
If you want an alternative (maybe convenient but unsafe) approach you could do something like this:
import macros
macro extract(args: varargs[untyped]): typed =
  ## assumes that the first expression is an expression
  ## which can take a bracket expression. Let's call it
  ## `arr`. The generated AST will then correspond to:
  ##
  ##   let <second_arg> = arr[0]
  ##   let <third_arg> = arr[1]
  ##   ...
  result = newStmtList()
  # the first vararg is the "array"
  let arr = args[0]
  var i = 0
  # all other varargs are now used as "injected" let bindings
  for arg in args.children:
    if i > 0:
      var rhs = newNimNode(nnkBracketExpr)
      rhs.add(arr)
      rhs.add(newIntLitNode(i-1))
      let assign = newLetStmt(arg, rhs) # could be replaced by newVarStmt
      result.add(assign)
    i += 1
  #echo result.treerepr
let s = @["X", "Y", "Z"]
s.extract(a, b, c)
# this essentially produces:
# let a = s[0]
# let b = s[1]
# let c = s[2]
# check if it works:
echo a, b, c
I have not included a check for the seq length yet, so you would simply get an out-of-bounds error if the seq does not have the required length. Another warning: if the first expression is not a literal, the expression would be evaluated several times.
Note that the _ literal is allowed in let bindings as a placeholder, which means that you could do things like this:
s.extract(a, b, _, _, _, x)
This would address your splitLine[0..1, 5] example, which btw is simply not a valid indexing syntax.
Yet another option is the definesugar package:
import strutils, definesugar
# need to use splitWhitespace instead of split to prevent empty string elements in sequence
var splitLine = "chr7 127471196 127472363 Pos1 0 +".splitWhitespace()
echo splitLine
block:
  (chrom, startPos, endPos) := splitLine[0..2]
  echo chrom # chr7
  echo startPos # 127471196
  echo endPos # 127472363
block:
  (chrom, startPos, strand) := splitLine[0..1] & splitLine[5] # splitLine[0..1, 5] not supported
  echo chrom
  echo startPos
  echo strand # +
# alternative syntax
block:
  (chrom, startPos, *_, strand) := splitLine
  echo chrom
  echo startPos
  echo strand
See https://forum.nim-lang.org/t/7072 for recent discussion.

Using string as variable in iPython interactive

I would like to run the following in iPython:
mylist = ['a','b']
def f(a,b):
    do_something
sliderinterval=(0,10,1)
w = interactive(f, a = sliderinterval, b = sliderinterval)
but instead of writing a and b, I would like to take them from mylist. Is that possible?
Build the dictionary with a dict comprehension, then pass it to the function as keyword arguments by unpacking it with **.
from ipywidgets import interactive
mylist = ['a','b']
def f(a,b):
    print(a,b)
sliderinterval=(0,10,1)
d = {k:sliderinterval for k in mylist}
w = interactive(f, **d)
**d is equivalent to writing key1=value1, key2=value2, ... by hand. You will often see it in function signatures as **kwargs or **kw. To unpack a list you need only one star, which usually appears as *args.
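The equivalence can be checked with plain functions, without ipywidgets. In this small sketch f and g are hypothetical helpers that simply return what they receive.

```python
def f(a, b):
    # Return the keyword arguments so the two call styles can be compared.
    return (a, b)


def g(*args):
    # Collect positional arguments into a tuple.
    return args


mylist = ['a', 'b']
sliderinterval = (0, 10, 1)

# Same dict comprehension as above: one entry per parameter name.
d = {k: sliderinterval for k in mylist}

unpacked = f(**d)                                # keywords taken from the dict
manual = f(a=sliderinterval, b=sliderinterval)   # keywords written by hand

positional = g(*mylist)                          # one star unpacks a list

print(unpacked == manual)
print(positional)
```

Note that the keys of d must match the parameter names of f exactly; an unknown key raises a TypeError.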

How to concat string + i?

for i=1:N
f(i) = 'f'+i;
end
gives an error in MATLAB. What's the correct syntax to initialize an array with N strings of the pattern fi?
It seems like even this is not working:
for i=1:4
f(i) = 'f';
end
You can concatenate strings using strcat. If you plan on concatenating numbers as strings, you must first use num2str to convert the numbers to strings.
Also, strings can't be stored in a vector or matrix, so f must be defined as a cell array, and must be indexed using { and } (instead of normal round brackets).
f = cell(N, 1);
for i=1:N
f{i} = strcat('f', num2str(i));
end
For versions prior to R2014a...
One easy non-loop approach would be to use genvarname to create a cell array of strings:
>> N = 5;
>> f = genvarname(repmat({'f'}, 1, N), 'f')
f =
'f1' 'f2' 'f3' 'f4' 'f5'
For newer versions...
The function genvarname has been deprecated, so matlab.lang.makeUniqueStrings can be used instead in the following way to get the same output:
>> N = 5;
>> f = strrep(matlab.lang.makeUniqueStrings(repmat({'f'}, 1, N), 'f'), '_', '')
f =
1×5 cell array
'f1' 'f2' 'f3' 'f4' 'f5'
Let me add another solution:
>> N = 5;
>> f = cellstr(num2str((1:N)', 'f%d'))
f =
'f1'
'f2'
'f3'
'f4'
'f5'
If N has two or more digits (N >= 10), you will start getting extra spaces. Add a call to strtrim(f) to get rid of them.
As a bonus, there is an undocumented built-in function sprintfc which nicely returns a cell array of strings:
>> N = 10;
>> f = sprintfc('f%d', 1:N)
f =
'f1' 'f2' 'f3' 'f4' 'f5' 'f6' 'f7' 'f8' 'f9' 'f10'
Using sprintf was already proposed by ldueck in a comment, but I think this is worth being an answer:
f{i} = sprintf('f%d', i);
This is in my opinion the most readable solution and also gives some nice flexibility (i.e. when you want to round a float value, use something like %.2f).
According to this, it looks like you have to set N before trying to use it, and it looks like it needs to be an int, not a string. I don't know much about MATLAB; this is just what I gathered from that site. Hope it helps :)
Try the following:
for i = 1:4
result = strcat('f',int2str(i));
end
If you use this for naming several files that your code generates, you are able to concatenate more parts to the name. For example, with the extension at the end and address at the beginning:
filename = strcat('c:\...\name',int2str(i),'.png');
