Julia: empty vector of strings - string

I'd like to initialize an empty vector and then add strings to it. However, x = [] creates an empty array of type Any, and I've read that specifying types improves performance.
I tried x = Vector{String}, but other functions (append, join and push) don't work as expected.
Is it possible to create an empty array of strings and then append strings to it?

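You can do it in two ways:
vs = String[]
or
vs = Vector{String}()
Note that Vector{String} without the trailing parentheses is the type itself, not an empty vector, which is why push! and append! do not work on it.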

Related

list of one string element getting converted to list of characters

I have a program which receives input from another program and uses it for further operations. The input can be a list, set or tuple, but a list is needed for the further operations, so I convert the input to a list.
The problem arises when the input my program receives is a list/set/tuple with just one element, as shown below.
import itertools
from operator import itemgetter

def not_mine(c):
    d = {'John': ['mid', 'forward'],
         'Lana': ['mid'],
         'Jacob': ['defence', 'mid'],
         'Ian': ['goal', 'mid']}
    n = itemgetter(*c)(d)
    n = list(set(itertools.chain.from_iterable(n)))
    return n

def mine(c):
    name = not_mine(c)
    name_1 = list(name)
    print(name_1)

mine(['Jacob', 'Ian'])
['defence', 'goal', 'mid']
mine(['Lana'])
['i', 'm', 'd']
Is there any way to prevent the second case? It should be a list of one element ['mid'].
Iterators
The function set treats its first argument as an iterable and builds the set from the items it yields. A str is itself an iterable. In other words, you can loop over a str, and on each iteration the loop variable is assigned the next character of the string.
for whatami in "hi!":
    print(whatami)
h
i
!
If you want to treat a single string input as a single item, explicitly pass set an iterable with that single item in it (list works the same way, BTW). A tuple is also an iterable, so let's use one to test our theory:
t1 = ('ourstring', )
print(f"t1 is of type {type(t1)}")
s1 = set(t1)
print(s1)
t1 is of type <class 'tuple'>
{'ourstring'}
It works!
What we've done with ('ourstring', ) is explicitly define a tuple with one item. The trailing comma is what tells Python "this is a tuple with only one item"; without it, the parentheses would just wrap a plain string.
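Applied to the code in the question, here is a minimal sketch of one way to avoid the problem. itemgetter returns the bare value, rather than a tuple of values, when it is given a single key, so the lone list ['mid'] ends up being flattened character by character. Looking each name up in a list comprehension always yields a list of lists. The names PLAYERS and positions_for, and the choice to sort the result, are assumptions made for illustration.

```python
from itertools import chain

# Same table as in the question.
PLAYERS = {
    'John': ['mid', 'forward'],
    'Lana': ['mid'],
    'Jacob': ['defence', 'mid'],
    'Ian': ['goal', 'mid'],
}

def positions_for(names, table=PLAYERS):
    """Return the sorted, de-duplicated positions for the given names."""
    # A comprehension always yields a list of lists, even for one name,
    # unlike itemgetter(*names), which returns the bare value for a single key.
    per_player = [table[name] for name in names]
    return sorted(set(chain.from_iterable(per_player)))

print(positions_for(['Jacob', 'Ian']))  # ['defence', 'goal', 'mid']
print(positions_for(['Lana']))          # ['mid']
```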
Input
To separate situations between ingesting a list of items and one string item, you can consider two approaches.
The most straightforward way is to agree on a delimiter in the input, such as comma-separated values: firstvalue,secondvalue,etc. The downside is that you'll quickly run into limitations on the kind of data you can receive.
To ease your development, argparse is strongly recommended for parsing command line arguments. It is part of the standard library and is a battle-hardened package made for this type of task. The first example in its docs even shows a multi-value field.
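As a rough sketch of the argparse route, a positional argument declared with nargs="+" always comes back as a list, whether one value or several were supplied. The function name parse_names and the argument name names are hypothetical.

```python
import argparse

def parse_names(argv):
    """Parse one or more names; the result is always a list of strings."""
    parser = argparse.ArgumentParser(description="Collect one or more names.")
    # nargs="+" gathers every supplied value into a list, even a single one.
    parser.add_argument("names", nargs="+", help="one or more names")
    return parser.parse_args(argv).names

print(parse_names(["Lana"]))          # ['Lana']
print(parse_names(["Jacob", "Ian"]))  # ['Jacob', 'Ian']
```

In a real script you would call parser.parse_args() with no argument so that it reads sys.argv.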

Python nested list comprehension

In my code I have created a nested list via a list comprehension; it contains hex numbers. My next step was to calculate the decimal value of these hex numbers.
My last step was removing the () brackets from each element, because the previous step created a tuple for each list element.
My question is: can I combine all three steps into one big step, and if so, will it be computationally more efficient?
My code looks like this:
from struct import unpack
from codecs import decode
self.step1 = [[self.inputlist[self.otherlist[i]+k] for i in range(len(self.otherlist))]
for k in range(asd)]
self.step2 = [[unpack("<B",decode(x,"hex")) for x in y] for y in self.step1]
self.step3 = [[p[0] for p in q] for q in self.step2]
This code works fine (I shortened it and am not showing how self.inputlist, self.otherlist and asd are defined). I am just curious whether I can put self.step1, self.step2 and self.step3 into one nested list comprehension.
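A minimal sketch of how the three steps might be folded into a single nested comprehension follows. It assumes every entry of inputlist is a two-character hex string, and it swaps unpack/decode for int(x, 16), which gives the same integer for a single byte. The sample values of inputlist, otherlist and asd are invented for illustration, since the question does not show them.

```python
# Hypothetical stand-ins for self.inputlist, self.otherlist and asd.
inputlist = ["0a", "ff", "10", "7f", "00", "01"]
otherlist = [0, 2]
asd = 2

# One pass: pick the element, convert it from hex, and store the plain
# integer directly, so no intermediate lists or 1-tuples are built.
combined = [
    [int(inputlist[offset + k], 16) for offset in otherlist]
    for k in range(asd)
]

print(combined)  # [[10, 16], [255, 127]]
```

Iterating over otherlist directly replaces the range(len(...)) indexing. A single pass avoids building two intermediate nested lists, so it should use less memory; whether it is measurably faster is best checked with timeit on real data.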

Find union of two variable-name scalars

I have a Stata program that outputs a local scalar of space-separated variable names.
I have to run the program twice on two samples (same dta) and store the overlap (that is, the intersection: the variable names appearing in both scalars) as a new space-separated local scalar (for input to another program).
I can't figure out how to split (on spaces) and/or test for the occurrence of variable names in each.
Stata has a bunch of extended macro functions to use on lists that you can find with help macrolists, where you can see that A & B returns the intersection of A and B. If A="a b c d" and B="b c f g", then A & B = "b c".
This allows you to do something like this:
clear
scalar l1="vara varb varc"
scalar l2="varc vard vare"
local l1 = scalar(l1)
local l2 = scalar(l2)
local inter: list l1 & l2
scalar inter="`inter'"
scalar list inter
You convert the scalars to locals, get their intersection, and convert that into a scalar. It is probably easier to just modify your code to use locals rather than scalars so you don't have to deal with conversions.
I am not sure I perfectly understand your question. If this is not the appropriate answer, please add an example for us to work with.
Here is code that checks two space-separated macros and gets their intersection. It's not the most elegant approach, but unless your macros are huge it should still be quite fast.
local list1 first list here
local list2 list two here
local intersection
foreach l1 in `list1' {
    foreach l2 in `list2' {
        // if they overlap, add to the intersection macro
        if "`l1'" == "`l2'" {
            local intersection `intersection' `l1'
        }
    }
}
mac list // show the macros stored currently in the do file
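For readers who find the logic easier to follow outside Stata, the same intersection can be sketched in Python. This is only an illustration of the idea, not a replacement for the macro list functions; the function name intersect_names is hypothetical, and keeping the order of the first list is a design choice made here.

```python
def intersect_names(first, second):
    """Return the names present in both space-separated strings,
    in the order they appear in `first`, without duplicates."""
    wanted = set(second.split())
    seen = set()
    result = []
    for name in first.split():
        if name in wanted and name not in seen:
            seen.add(name)
            result.append(name)
    return " ".join(result)

print(intersect_names("vara varb varc", "varc vard vare"))  # varc
print(intersect_names("a b c d", "b c f g"))                # b c
```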

Simple adding two arrays using numpy in python?

This might be a simple question. However, I wanted to get some clarifications of how the following code works.
import numpy as np
a = np.arange(1, 8)
a
array([1, 2, 3, 4, 5, 6, 7])
Example Function = a[0:-1]+a[1:]/2.0
In the Example Function, I want to draw your attention to the plus sign between the array a[0:-1]+a[1:]. How does that work? What does that look like?
For instance, is the plus sign (addition) adding the first index of each array? (e.g 1+2) or add everything together? (e.g 1+2+2+3+3+4+4+5+5+6+6+7)
Then, I assume /2.0 is just dividing it by 2...
A numpy array supports element-by-element arithmetic, so you can only add two arrays if they have the same shape (or shapes that numpy can broadcast together):
a = np.array([1,2,3,4,5])
b = np.array([1,1,1])
a+b # will throw an error
whilst
a = np.array([1,2,3,4,5])
b = np.array([1,1,1,1,1])
a+b # is ok
The division is also element by element.
Now to your question about the indexing
a = np.array([1,2,3,4,5])
a[0:-1] # array([1, 2, 3, 4])
a[1:] # array([2, 3, 4, 5])
or more generally a[start_index:end_index] is inclusive at start_index but exclusive at end_index. If end_index is omitted, as in a[start_index:], the slice runs up to and including the last element.
My final tip is just to try and play around with the structures. There is no harm in trying different things; the computer will not explode with a wrong value here or there. Unless you are trying to do so, of course.
If arrays have identical shapes, they can be added:
new_array = first_array + second_array
which is equivalent to first_array.__add__(second_array). This adds each value in first_array to the value at the same position in second_array and puts the result into new_array.
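To make the behaviour of the expression from the question concrete, here is a small sketch. It assumes a holds the values 1 to 7, as in the output shown in the question. Note that division binds tighter than addition, so only the second slice is halved; the parenthesised variant is shown in case the pairwise average of neighbours was intended, which is an assumption.

```python
import numpy as np

a = np.arange(1, 8)   # array([1, 2, 3, 4, 5, 6, 7])
left = a[0:-1]        # every element except the last
right = a[1:]         # every element except the first

# Division binds tighter than addition, so only `right` is halved.
as_written = left + right / 2.0
# Parentheses give the pairwise average of neighbours instead.
midpoints = (left + right) / 2.0

print(as_written)  # [2.  3.5 5.  6.5 8.  9.5]
print(midpoints)   # [1.5 2.5 3.5 4.5 5.5 6.5]
```

So the plus sign pairs up 1 with 2, 2 with 3, and so on; it never sums everything into a single number.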

F# Parse Excel File Into Tuple Array

I have a follow-up question on this post. I would like to take the contents of an Excel spreadsheet and put them into an Array of tuples, where each tuple corresponds to one row in the spreadsheet.
I started by looping through the entire range like this:
let path = "XXX.xlsx"
let app = ApplicationClass(Visible = false)
let book = app.Workbooks.Open path
let sheet = book.Worksheets.[1] :?> _Worksheet
let content = sheet.UsedRange.Value2 :?> obj[,]
for i=content.GetLowerBound(0) to content.GetUpperBound(0) do
    for j=content.GetLowerBound(1) to content.GetUpperBound(1) do
But this strikes me as very inefficient. Is there something in the base API, out of the box, that I can use?
Thanks in advance
The Array2D module implements some common functions for 2D arrays.
I am guessing you want to use Array2D.iter or Array2D.iteri to replace your for loop.
The straightforward way to convert each row of a two-dimensional array to a tuple (collected in a one-dimensional list) is to finish what you started: just iterate over all the rows and construct a tuple:
let tuples =
    [ for i in content.GetLowerBound(0) .. content.GetUpperBound(0) ->
        content.[i,0], content.[i,1], content.[i,2] ]
To do this, you need to know statically the length of the row, because F# tuples have a fixed length that is checked at compile time. The above example assumes that there are just 3 elements with indices 0, 1 and 2; arrays returned by Excel interop are typically 1-based, so adjust the column indices to match content.GetLowerBound(1). If the length is dynamic, then you probably should continue using 2D arrays rather than a list of tuples.
Another option is to bypass the Excel engine altogether. There are a couple of ways of doing this, but the one I've had great success with is 'SpreadsheetGear'. It isn't free (bring on the downvotes), but it replicates the Excel API very closely and is very fast.
