Find combinations of words in two vectors

I have a long list of words contained in two vectors.
The first vector looks like this:
x <- c("considerably", "much", "far")
The second vector looks like this:
y <- c("higher", "lower")
I need a vector returned that lists every possible combination of a word from x followed by a word from y. Using x and y, I would need this vector returned:
[1] "considerably higher" "considerably lower" "much higher" "much lower"
[5] "far higher" "far lower"
Therefore words in vector x must come before words in vector y. Is there a quick way of doing this?

You could use outer with paste; I think that will be quite quick!
as.vector( t( outer( x , y , "paste" ) ) )
# [1] "considerably higher" "considerably lower" "much higher"
# [4] "much lower" "far higher" "far lower"

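You could use expand.grid. Note that sort() puts the result in alphabetical order, so the "far" combinations come before the "much" ones.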
sort(apply(X = expand.grid(x, y), MARGIN = 1, FUN = function(x) paste(x[1], x[2], sep = " ")))
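For comparison, the requested ordering (each word of x paired with every word of y before moving to the next word of x) can be sketched in Python with itertools.product. This is only an illustrative translation and not part of the R answers above; the function name word_pairs is made up for the example.

```python
from itertools import product

def word_pairs(first, second):
    # product() varies its last argument fastest, so each word of `first`
    # is combined with every word of `second` before moving on.
    return [f"{a} {b}" for a, b in product(first, second)]

x = ["considerably", "much", "far"]
y = ["higher", "lower"]
pairs = word_pairs(x, y)
print(pairs)
```

Because the order comes directly from the order of x and y, no sorting step is needed.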

Related

Construct powerset without complements

Starting from this question I've built this code:
import itertools
n=4
nodes = set(range(0,n))
ss = set()
for i in range(1,n+1):
    ss = ss.union( set(itertools.combinations(range(0,n), i)))
ss2 = set()
for s in ss:
    cs = []
    for i in range(0,n):
        if not(i in s):
            cs.append(i)
    cs=tuple(cs)
    if not(s in ss2) and not(cs in ss2):
        ss2.add(s)
ss = ss2
The code constructs all subsets of S={0,1,...,n-1} (i) without complements (for example, for n=4, either (1,3) or (0,2) is contained; which one does not matter), (ii) without the empty set, but (iii) with S; the result is in ss. Is there a more compact way to do the job? I do not care if the result is a set/list of sets/lists/tuples. (The result contains 2**(n-1) elements.)
Additional options:
favor the subset or complement that has fewer elements
output sorted by increasing size

When you exclude complements, you actually exclude half of the combinations. So you could imagine generating all combinations and then kicking out the last half of them. You must be sure not to kick out a combination together with its complement, but with the way they are ordered, that will not happen.
Further along this idea, you don't even need to generate combinations that have a size that is more than n/2. For even values of n, you would need to halve the list of combinations with size n/2.
Here is one way to achieve all that:
import itertools
n=4
half = n//2
# generate half of the combinations
ss = [list(itertools.combinations(range(0,n), i))
      for i in range(1, half+1)]
# if n is even, kick out half of the last list
if n % 2 == 0:
    ss[-1] = ss[-1][0:len(ss[-1])//2]
# flatten
ss = [y for x in ss for y in x]
print(ss)
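As a rough check of the idea, here is the same approach wrapped in a function, together with a small helper that verifies no subset appears alongside its complement. The names subsets_without_complements and has_complement_pair are invented for this sketch. Like the code above, it leaves out the full set S, so the result has 2**(n-1) - 1 elements; append tuple(range(n)) if S is needed.

```python
import itertools

def subsets_without_complements(n):
    # Keep only sizes up to n // 2; larger subsets are complements of these.
    half = n // 2
    by_size = [list(itertools.combinations(range(n), k))
               for k in range(1, half + 1)]
    # For even n, the subsets of size n // 2 pair up with each other,
    # so keep only the first half of that (lexicographically ordered) list.
    if n % 2 == 0 and by_size:
        by_size[-1] = by_size[-1][: len(by_size[-1]) // 2]
    # Flatten; the result is already sorted by increasing size.
    return [s for group in by_size for s in group]

def has_complement_pair(subsets, n):
    # True if some subset and its complement are both present.
    seen = set(subsets)
    full = set(range(n))
    return any(tuple(sorted(full - set(s))) in seen for s in subsets)

result = subsets_without_complements(4)
print(result)
print(has_complement_pair(result, 4))
```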

Output all instances from 1 to 8 where the length of the spelling of a number is greater than the length of the spelling of a value higher than it?

I'm a complete Haskell noob and I've been trying to do this for an entire day now.
So one output could be:
Three,Six
(3 is less than 6 but the spelling of it is longer than the spelling of 6)
I came up with this in Haskell but the variables go out of scope, I don't really understand scope in Haskell yet. This might be completely wrong but any help is appreciated.
let numbers = [("One",1),("Two",2),("Three",3),("Four",4),("Five",5),("Six",6),("Seven",7),("Eight",8)]
[([ x | x <- numbers], [y | y <- numbers]) | length (fst x) > length (fst y), snd x < snd y]
Can someone help me to correct this nested list comprehension? Or even tell me if I can use a nested list comprehension at all?
To clarify:
I want to output a list of pairs, where the spelling of the first element in the pair is longer than the spelling of the second element in the pair, but also, the first element in the pair as a number, is less than the second element in the pair as a number.

It sounds like you want something like this:
[(y1, y2) | (x1, y1) <- numbers, (x2, y2) <- numbers, length x1 > length x2, y1 < y2]
That is, it's a list of pairs of numbers that meet the requirements you specify. I'm not able to test this at the moment; I think it should work, but let me know if you have any issues with it.
Your scope issues were because you were trying to do nested comprehensions and access variables from the inner comprehension in the outer one - this is not allowed, because a variable used inside a comprehension is only in scope in that particular comprehension.
I have also replaced your uses of fst and snd by explicit pattern-matching on the elements of the pair, which is almost always preferred because it's more explicit.
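The same single comprehension with two generators and two conditions can be sketched in Python, which may help if the Haskell syntax is unfamiliar. This is an illustrative translation only; the variable names are chosen for the example.

```python
numbers = [("One", 1), ("Two", 2), ("Three", 3), ("Four", 4),
           ("Five", 5), ("Six", 6), ("Seven", 7), ("Eight", 8)]

# Both loops live in one comprehension, so all four names are in scope
# for the conditions and for the result expression.
pairs = [(n1, n2)
         for (word1, n1) in numbers
         for (word2, n2) in numbers
         if len(word1) > len(word2) and n1 < n2]

print(pairs)
```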

How to convert (0,0) to [0,0] in prolog?

I'm making a predicate distance/3 that calculates the distance between 2 points on a 2d plane. For example :
?- distance((0,0), (3,4), X).
X = 5
Yes
My predicate only works if (0,0) is the list [0,0]. Is there a way to make this conversion?

You can do this with a simple rule that unifies its left and right sides:
convert((A,B), [A,B]).

Although the others have answered, keep in mind that (a,b) in Prolog is actually not what you might think it is:
?- write_canonical((a,b)).
','(a,b)
true.
So this is the term ','/2. If you are working with pairs, you can do two things that are probably "prettier":
Keep them as a "pair", a-b:
?- write_canonical(a-b).
-(a,b)
true.
The advantage here is that pairs like this can be manipulated with a bunch of de-facto standard predicates, for example keysort, as well as library(pairs).
Or, if they are actually a data structure that is part of your program, you might as well make that explicit, as in coor(a, b) for example. A distance in two-dimensional space will then take two coor/2 terms:
distance(coor(X1, Y1), coor(X2, Y2), D) :-
D is sqrt((X1-X2)^2 + (Y1-Y2)^2).
If you don't know how many dimensions you have, you can then indeed keep the coordinates of each point in a list. The message here is that lists are meant for things that can have 0 or more elements in them, while pairs, or other terms with arity 2, or any term with a known arity, are more explicit about the number of elements they have.
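To illustrate the list-based variant for an unknown number of dimensions, here is a minimal Python sketch (not Prolog, and not taken from the answers above). It assumes both points are sequences of numbers of equal length.

```python
import math

def distance(p, q):
    # Assumes p and q are sequences of numbers with the same length.
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d2 = distance([0, 0], [3, 4])        # two dimensions
d3 = distance([0, 0, 0], [1, 2, 2])  # three dimensions
print(d2, d3)
```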

If you just have a simple pair, you can use the univ operator and simply say something like:
X = (a,b) ,
X =.. [_|Y] .
which produces
X = (a,b) .
Y = [a,b] .
This doesn't work if X is something like (a,b,c), producing as it does
X = (a,b,c) .
Y = [a,(b,c)] .
[probably not what you want].
The more general case is pretty simple:
csv2list( X , [X] ) :-        % We have a list of length 1
  var(X) .                    % - if X is unbound.
csv2list( X , [X] ) :-        % We have a list of length 1
  nonvar(X) ,                 % - if X is bound, and
  X \= (_,_) .                % - X is not a (_,_) term.
csv2list( Xs , [A|Ys] ) :-    % Otherwise (the general case),
  nonvar(Xs) ,                % - if Xs is bound, and
  Xs = (A,Bs) ,               % - Xs is a (_,_) term,
  csv2list(Bs,Ys)             % - recurse down, adding the first item to the result list.
  .                           % Easy!
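The recursion above relies on (a,b,c) being the right-nested term ','(a, ','(b,c)). As a loose analogy only, the same flattening can be sketched in Python, using nested 2-tuples as a stand-in for ','/2 terms; the function name comma_term_to_list is hypothetical.

```python
def comma_term_to_list(term):
    # A 2-tuple plays the role of a ','/2 term: (head, rest).
    items = []
    while isinstance(term, tuple) and len(term) == 2:
        head, term = term
        items.append(head)
    # Whatever remains is the last element (it is not a pair).
    items.append(term)
    return items

flat = comma_term_to_list(("a", ("b", "c")))
print(flat)
```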

Haskell create a list with specific increment

In mathematical languages such as R, you can create a vector as follows:
x = seq(0, 2*pi, length.out = 100)
This outputs:
[1] 0.00000000 0.06346652 0.12693304 0.19039955 0.25386607 0.31733259 0.38079911
[8] 0.44426563 0.50773215 0.57119866 0.63466518 0.69813170 0.76159822 0.82506474
[15] 0.88853126 0.95199777 1.01546429 1.07893081 1.14239733 1.20586385 1.26933037
[22] 1.33279688 1.39626340 1.45972992 1.52319644 1.58666296 1.65012947 1.71359599
[29] 1.77706251 1.84052903 1.90399555 1.96746207 2.03092858 2.09439510 2.15786162
[36] 2.22132814 2.28479466 2.34826118 2.41172769 2.47519421 2.53866073 2.60212725
[43] 2.66559377 2.72906028 2.79252680 2.85599332 2.91945984 2.98292636 3.04639288
[50] 3.10985939 3.17332591 3.23679243 3.30025895 3.36372547 3.42719199 3.49065850
[57] 3.55412502 3.61759154 3.68105806 3.74452458 3.80799110 3.87145761 3.93492413
[64] 3.99839065 4.06185717 4.12532369 4.18879020 4.25225672 4.31572324 4.37918976
[71] 4.44265628 4.50612280 4.56958931 4.63305583 4.69652235 4.75998887 4.82345539
[78] 4.88692191 4.95038842 5.01385494 5.07732146 5.14078798 5.20425450 5.26772102
[85] 5.33118753 5.39465405 5.45812057 5.52158709 5.58505361 5.64852012 5.71198664
[92] 5.77545316 5.83891968 5.90238620 5.96585272 6.02931923 6.09278575 6.15625227
[99] 6.21971879 6.28318531
How can this be achieved in Haskell?
I tried creating a lambda function and using it with map, but I couldn't get the same output:
let myPi = (\x -> 2*pi)
map myPi [1..10]
Thanks

Well, you can just do
[0, 2*pi/99 .. 2*pi]
Note that this is not ideal both performance- and floating-point-rounding–wise (because it translates to enumFromThenTo), Daniel Fischer's version is better (it translates to enumFromTo). Thinking it over, GHC will probably compile both to almost equally-fast code, but I'm not sure. If it's really performance-critical, it's best not to use lists at all but e.g. Data.Vector.
As Jakub Hampl remarked, Haskell can deal with infinite lists. That's probably not much use to you here, but it opens interesting possibilities: for instance, you might not be sure which resolution you actually need. You can let your list begin with a very low resolution, then fold back and start again with a higher one. One simple way to achieve this:
import Data.Fixed
multiResS₁ = [ log x `mod'` 2*pi | x<-[1 .. ] ]
using this to plot the sine function looks like this
Prelude Data.Fixed Graphics.Rendering.Chart.Simple> let domainS₁ = take 200 multiResS₁
Prelude Data.Fixed Graphics.Rendering.Chart.Simple> plotPNG "multiresS1.png" domainS₁ sin

Easiest is a list comprehension,
[(2*pi)*k/99 | k <- [0 .. 99]]
(the multiplication with k/99 mitigates the floating point rounding, so the last value is exactly 2*pi.)
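The index-based formula can be sketched in Python as a small linspace-style helper. This is an illustration of the technique rather than a translation of any library function; the name linspace and its parameters are chosen for the example, and it assumes count is at least 2.

```python
import math

def linspace(start, stop, count):
    # Assumes count >= 2. Each value is computed directly from its index,
    # so rounding errors do not accumulate from one element to the next.
    steps = count - 1
    return [start + (stop - start) * k / steps for k in range(count)]

xs = linspace(0.0, 2 * math.pi, 100)
print(len(xs), xs[0], xs[1], xs[-1])
```

With count = 100 the step is 2*pi/99, which matches the spacing in the R output shown in the question.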

R: Call matrixes from a vector of string names?

Imagine I've got 100 numeric matrices with 5 columns each.
I keep the names of those matrices in a vector or list:
Mat <- c("GON1EU", "GON2EU", "GON3EU", "NEW4", ....)
I also have a vector of coefficients "coef",
coef <- c(1, 2, 2, 1, ...)
And I want to calculate a resulting vector in this way:
coef[1]*GON1EU[,1]+coef[2]*GON2EU[,1]+coef[3]*GON3EU[,1]+coef[4]*NEW4[,1]+.....
How can I do it in a compact way, using the vector of names?
Something like:
coef*(Object(Mat)[,1])
I think the key is how to call an object from a string containing its name and then use vectorized notation. But I don't know how.

get() allows you to refer to an object by a string. It will only get you so far though; you'll still need to construct the repeated call to get() on the list matrices etc. However, I wonder if an alternative approach might be feasible? Instead of storing the matrices separately in the workspace, why not store the matrices in a list?
Then you can use sapply() on the list to extract the first column of each matrix in the list. The sapply() step returns a matrix, which we multiply by the coefficient vector. The column sums of that matrix are the values you appear to want from your above description. At least I'm assuming that coef[1]*GON1EU[,1] is a vector of length(GON1EU[,1]), etc.
Here's some code implementing this idea.
vec <- 1:4 ## don't use coef - there is a function with that name
mat <- matrix(1:12, ncol = 3)
myList <- list(mat1 = mat, mat2 = mat, mat3 = mat, mat4 = mat)
colSums(sapply(myList, function(x) x[, 1]) * vec)
Here is some output:
> sapply(myList, function(x) x[, 1]) * vec
mat1 mat2 mat3 mat4
[1,] 1 1 1 1
[2,] 4 4 4 4
[3,] 9 9 9 9
[4,] 16 16 16 16
> colSums(sapply(myList, function(x) x[, 1]) * vec)
mat1 mat2 mat3 mat4
30 30 30 30
The above example suggests you create, or read in, your 100 matrices as components of a list from the very beginning of your analysis. This will require you to alter the code you used to generate the 100 matrices. Seeing as you already have your 100 matrices in your workspace, to get myList from these matrices we can use the vector of names you already have and use a loop:
Mat <- c("mat","mat","mat","mat")
## loop
myList <- vector("list", length(Mat))
for(i in seq_along(Mat)) {
    myList[[i]] <- get(Mat[i])
}
## or as lapply call - Kudos to Ritchie Cotton for pointing that one out!
## myList <- lapply(Mat, get)
myList <- setNames(myList, paste(Mat, 1:4, sep = ""))
## You only need:
myList <- setNames(myList, Mat)
## as you have the proper names of the matrices
I used "mat" repeatedly in Mat as that is the name of my matrix above. You would use your own Mat. If vec contains what you have in coef, and you create myList using the for loop above, then all you should need to do is:
colSums(sapply(myList, function(x) x[, 1]) * vec)
To get the answer you wanted.

See help(get) and that's that.
If you'd given us a reproducible example I'd have said a bit more. For example:
> a=1;b=2;c=3;d=4
> M=letters[1:4]
> M
[1] "a" "b" "c" "d"
> sum = 0 ; for(i in 1:4){sum = sum + i * get(M[i])}
> sum
[1] 30
Put whatever you need in the loop, or use apply over the vector M and get the object:
> sum(unlist(lapply(M,function(n){get(n)^2})))
[1] 30
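For readers who want to see the formula from the question spelled out, here is a minimal Python sketch of looking matrices up by name and forming coef[1]*A[,1] + coef[2]*B[,1] + ... row by row. It is not R and not taken from the answers above: the dictionary registry stands in for the workspace lookup that get() performs, matrices are plain lists of rows, and all names are hypothetical.

```python
def weighted_first_columns(names, weights, registry):
    # registry maps each name to a matrix stored as a list of rows.
    n_rows = len(registry[names[0]])
    total = [0] * n_rows
    for name, weight in zip(names, weights):
        matrix = registry[name]  # lookup by string name
        for i, row in enumerate(matrix):
            total[i] += weight * row[0]  # first column only
    return total

# Same values as matrix(1:12, ncol = 3) in R (filled column by column).
mat = [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
registry = {"mat1": mat, "mat2": mat, "mat3": mat, "mat4": mat}
names = ["mat1", "mat2", "mat3", "mat4"]
result = weighted_first_columns(names, [1, 2, 2, 1], registry)
print(result)
```

Each matrix's first column is scaled by its own coefficient and the scaled columns are added together, so the result has one value per row.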
