Can this be done without Quasi Quoter? - haskell

I have a tiny DSL that actually works quite well. When I say
import Language.CWMWL
main = runCWMWL $ do
  out (matrixMult, A, 1, row, 1 3 44 6 7)
then runCWMWL is a function exported by Language.CWMWL. It parses the expression and takes some action.
What I want is some way to repeat this, e.g. 1000 times, with the third element of the tuple running through the numbers 1 to 1000. My own DSL is not complete enough to do this. Eventually I want to change the string in the last element as well.
Is there any possibility to do this without Quasi Quotes? Are Quasi Quotes the best tool for this?
What binops / primitives would my DSL need to contain or need to wrap in order to allow this in an elegant way?

Unless I'm misunderstanding, I don't think quasiquotation will get you something much nicer than
main = runCWMWL $
  sequence [ out (matrixMult, A, n, row, 1 3 44 6 7) | n <- [1..1000] ]
You might also look into MonadComprehensions as well as RebindableSyntax for other ideas.
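To make the point concrete: because the DSL is embedded in Haskell, ordinary list and monad functions can generate the repeated commands. The following is only a sketch, not the real Language.CWMWL API. The names Op, Target, Axis, Command, commands and script are hypothetical stand-ins, and the payload string is invented to show that the last element can vary with n as well.

```haskell
-- Hypothetical stand-ins; none of these types come from Language.CWMWL.
data Op = MatrixMult deriving (Eq, Show)
data Target = A deriving (Eq, Show)
data Axis = Row deriving (Eq, Show)

type Command = (Op, Target, Int, Axis, String)

-- One command per index; the string payload may depend on the index too.
commands :: Int -> [Command]
commands count =
  [ (MatrixMult, A, n, Row, payload n) | n <- [1 .. count] ]
  where
    payload n = unwords (map show [n, n + 1, n + 2])

-- Feed every generated command to whatever `out` action the DSL provides.
script :: Monad m => (Command -> m ()) -> m ()
script out = mapM_ out (commands 1000)
```

Loaded in GHCi, script print would emit the 1000 generated tuples. With the real DSL, the out passed in would be the one exported by the library, assuming it runs in a monad.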

Related

How would I "iterate" through a range of numbers?

Let's say my function (foo) takes 2 arguments: startNum and endNum. I need to return a list of every multiple of 2 (that is, every number evenly divisible by 2) that falls within that range, checking each number one by one. It is assumed that endNum will always be greater than startNum.
For example, if the function signature was something like this:
foo :: Int -> Int -> [Int]
Then foo 5 10 would return [6, 8, 10].
So far I have tried to mimic a "for" loop, and attempted to use map and scan/scanl in slightly unconventional ways to try and account for the fact that I am not starting off with a list, but rather a range of numbers. However, I have not been able to find a solution using these methods (my level of experience with Haskell is very low, so that is the biggest factor here in why I have not been able to accomplish so simple of a task).
I am expecting the solution, in some way, to use recursion. I am not sure exactly how to begin an implementation of this, or if my previously attempted methods are even correct ways to go about it.
Iteration in Haskell usually means either recursion, or a list comprehension.
For recursion you need a base case and an update case. In your example, we know that if startNum is greater than endNum, the list must be empty. That's easy to write:
foo startNum endNum
  | startNum > endNum = []
The trick is the update. Or updates. What do you return if startNum is even? What about when it's not?
foo startNum endNum
  | startNum > endNum = []
  | even startNum = ...
  | otherwise = ...
More natural is a list comprehension with a condition. That code is trivial.
[x | x <- [startNum..endNum], even x]
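For comparison, here is one possible way to fill in both approaches. It is a sketch rather than the only answer to the questions posed above; the names evensRec and evensComp are made up for this example.

```haskell
-- Recursive version: walk the range one number at a time.
evensRec :: Int -> Int -> [Int]
evensRec startNum endNum
  | startNum > endNum = []                                    -- base case
  | even startNum     = startNum : evensRec (startNum + 1) endNum
  | otherwise         = evensRec (startNum + 1) endNum

-- List-comprehension version: filter the range directly.
evensComp :: Int -> Int -> [Int]
evensComp startNum endNum = [x | x <- [startNum .. endNum], even x]
```

Both evensRec 5 10 and evensComp 5 10 evaluate to [6,8,10].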

haskell implementation of a sequence

I just started Haskell and I'm struggling!!!
So I need to create a list in Haskell that follows the formula
F(n) = (F(n-1)+F(n-2)) * F(n-3)/F(n-4)
and I have F(0) =1, F(1)=1,F(2)=1,F(3)=1
So I thought of initializing the first 4 elements of the list and then creating a recursive function that runs for n>4 and appends the values to the list.
My code looks like this
let F=[1,1,1,1]
fib' n F
| n<4="less than 4"
|otherwise = (F(n-1)+F(n-2))*F(n-3)/F(n-4) : fib (n-1) F
My code looks conceptually right to me (not sure though), but I get an incorrect indentation error when I compile it. And am I allowed to initialize the elements of the list in the way that I have?
First off, variables in Haskell have to be lower case. Secondly, Haskell doesn't let you mix integers and fractions so freely as you may be used to from untyped or barely-typed languages. If you want to convert from an Int or an Integer to, say, a Double, you'll need to use fromIntegral. Thirdly, you can't stick a string in a context where you need a number. Fourthly, you may or may not have an indentation problem—be sure not to use tabs in your Haskell files, and to use the GHC option -fwarn-tabs to be sure.
Now we get to the heart of the matter: you're going about this all somewhat wrong. I'm going to give you a hint instead of a full answer:
thesequence = 1 : 1 : 1 : 1 : -- Something goes here that *uses* thesequence
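The hint relies on a list that is defined in terms of itself, which works because Haskell lists are lazy. To leave the exercise open, here is the same trick applied to a different, well-known sequence (Fibonacci); the name fibs is only illustrative.

```haskell
-- A lazily built list that refers to itself: each new element is the sum
-- of the two elements before it.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)
```

In GHCi, take 10 fibs gives [0,1,1,2,3,5,8,13,21,34]. The sequence in the question needs four earlier elements instead of two, but the shape of the definition is the same.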

replace within boxed structure

I have the following (for example) data
'a';'b';'c';'a';'b';'a'
┌─┬─┬─┬─┬─┬─┐
│a│b│c│a│b│a│
└─┴─┴─┴─┴─┴─┘
and I'd like to replace all 'a' with a number, 3, and 'b' with another number 4, and get back
┌─┬─┬─┬─┬─┬─┐
│3│4│c│3│4│3│
└─┴─┴─┴─┴─┴─┘
how can I do that?
Thanks for help.
rplc
If that was a string (like 'abcaba') there would be the easy solution of rplc:
'abcaba' rplc 'a';'3';'b';'4'
34c343
amend }
If you need to have it like boxed data (if, for example, 'a' represents something more complex than a character or atom), then maybe you can use amend }:
L =: 'a';'b';'c';'a';'b';'a'
p =: I. (<'a') = L NB. positions of 'a' in L
0 3 5
(<'3') p } L NB. 'amend' "3" on those positions
putting the above into a dyad:
f =: 4 :'({.x) (I.({:x) = y) } y' NB. amend '{.x' in positions where '{:x' = y
('3';'a') f L
┌─┬─┬─┬─┬─┬─┐
│3│b│c│3│b│3│
└─┴─┴─┴─┴─┴─┘
which you can use in more complex settings:
]L =: (i.5);'abc';(i.3);'hello world';(<1;2)
┌─────────┬───┬─────┬───────────┬─────┐
│0 1 2 3 4│abc│0 1 2│hello world│┌─┬─┐│
│ │ │ │ ││1│2││
│ │ │ │ │└─┴─┘│
└─────────┴───┴─────┴───────────┴─────┘
((1;2);(i.3)) f L
┌─────────┬───┬─────┬───────────┬─────┐
│0 1 2 3 4│abc│┌─┬─┐│hello world│┌─┬─┐│
│ │ ││1│2││ ││1│2││
│ │ │└─┴─┘│ │└─┴─┘│
└─────────┴───┴─────┴───────────┴─────┘
btw, {.y is the first item of y; {:y is the last item of y
bottom line
Here's a little utility you can put in your toolbox:
tr =: dyad def '(y i.~ ({." 1 x),y) { ({:" 1 x) , y'
] MAP =: _2 ]\ 'a';3; 'b';4
+-+-+
|a|3|
+-+-+
|b|4|
+-+-+
MAP tr 'a';'b';'c';'a';'b';'a'
+-+-+-+-+-+-+
|3|4|c|3|4|3|
+-+-+-+-+-+-+
just above the bottom line
The utility tr is a verb which takes two arguments (a dyad): the right argument is the target, and the left argument is the mapping table. The table must have two columns, and each row represents a single mapping. To make just a single replacement, a vector of two items is acceptable (i.e. 1D list instead of 2D table, so long as the list is two items long).
Note that the table must have the same datatype as the target (so, if you're replacing boxes, it must be a table of boxes; if characters, then a table of characters; numbers for numbers, etc).
And, since we're doing like-for-like mapping, the cells of the mapping table must have the same shape as the items of the target, so it's not suitable for tasks like string substitution, which may require shape-shifting. For example, ('pony';'horse') tr 'I want a pony for christmas' won't work (though, amusingly, 'pony horse' tr&.;: 'I want a pony for christmas' would, for reasons I won't get into).
way above the bottom line
There's no one, standard answer to your question. That said, there is a very common idiom to do translation (in the tr, or mapping 1:1, sense):
FROM =: ;: 'cat dog bird'
TO =: ;: 'tiger wolf pterodactyl'
input=: ;: 'cat bird dog bird bird cat'
(FROM i. input) { TO
+-----+-----------+----+-----------+-----------+-----+
|tiger|pterodactyl|wolf|pterodactyl|pterodactyl|tiger|
+-----+-----------+----+-----------+-----------+-----+
To break this down, the primitive i. is the lookup function and the primitive { is the selection function (mnemonic: i. gives you the *i*ndex of the elements you're looking for).
But the simplistic formulation above only applies when you want to replace literally everything in the input, and FROM is guaranteed to be total (i.e. the items of the input are constrained to whatever is in FROM).
These constraints make the simple formulation appropriate for tasks like case conversion of strings, where you want to replace all the letters, and we know the total universe of letters in advance (i.e. the alphabet is finite).
But what happens if we don't have a finite universe? What should we do with unrecognized items? Well, anything we want. This need for flexibility is the reason that there is no one, single translation function in J: instead, the language gives you the tools to craft a solution specific to your needs.
For example, one very common extension to the pattern above is the concept of substitution-with-default (for unrecognized items). And, because i. is defined to return #FROM (the length of the lookup list, i.e. one past its last valid index) for items not found in the lookup, the extension is surprisingly simple: we just extend the replacement list by one item, i.e. just append the default!
DEFAULT =: <'yeti'
input=: ;: 'cat bird dog horse bird monkey cat iguana'
(FROM i. input) { TO,DEFAULT
+-----+-----------+----+----+-----------+----+-----+----+
|tiger|pterodactyl|wolf|yeti|pterodactyl|yeti|tiger|yeti|
+-----+-----------+----+----+-----------+----+-----+----+
Of course, this is destructive in the sense it's not invertible: it leaves no information about the input. Sometimes, as in your question, if you don't know how to replace something, it's best to leave it alone.
Again, this kind of extension is surprisingly simple, and, once you see it, obvious: you extend the lookup table by appending the input. That way, you're guaranteed to find all the items of the input. And replacement is similarly simple: you extend the replacement list by appending the input. So you end up replacing all unknown items with themselves.
( (FROM,input) i. input) { TO,input
+-----+-----------+----+-----+-----------+------+-----+------+
|tiger|pterodactyl|wolf|horse|pterodactyl|monkey|tiger|iguana|
+-----+-----------+----+-----+-----------+------+-----+------+
This is the strategy embodied in tr.
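For readers who think in Haskell rather than J, the two strategies above (replace-or-default and replace-or-leave-alone) can be sketched roughly as follows. This is an analogy, not a translation of tr: it works on flat lists with an association list as the mapping table, and the names translate and translateOr are invented for the example.

```haskell
import Data.Maybe (fromMaybe)

-- Replace every item that appears as a key in the table; leave the rest alone.
translate :: Eq a => [(a, a)] -> [a] -> [a]
translate table = map (\x -> fromMaybe x (lookup x table))

-- Variant: items missing from the table become a fixed default instead.
translateOr :: Eq a => a -> [(a, a)] -> [a] -> [a]
translateOr def table = map (\x -> fromMaybe def (lookup x table))
```

Like i. in J, lookup returns the first match, so earlier rows of the table win over later ones with the same key.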
above the top line: an extension
BTW, when writing utilities like tr, J programmers will often consider the N-dimensional case, because that's the spirit of the language. As it stands, tr requires a 2-dimensional mapping table (and, by accident, will accept a 1-dimensional list of two items, which can be convenient). But there may come a day when we want to replace a plane inside a cube, or a cube inside a hypercube, etc (common in business intelligence applications). We may wish to extend the utility to cover these cases, should they ever arise.
But how? Well, we know the mapping table must have at least two dimensions: one to hold multiple simultaneous substitutions, and another to hold the rules for replacement (i.e. one "row" per substitution and two "columns" to identify an item and its replacement). The key here is that's all we need. To generalize tr, we merely need to say we don't care about what's beneath those dimensions. It could be a Nx2 table of single characters, or an Nx2 table of fixed-length strings, or an Nx2 table of matrices for some linear algebra purpose, or ... who cares? Not our problem. We only care about the frame, not the contents.
So let's say that, in tr:
NB. Original
tr =: dyad def '(y i.~ ({." 1 x),y) { ({:" 1 x) , y'
NB. New, laissez-faire definition
tr =: dyad def '(y i.~ ({."_1 x),y) { ({:"_1 x) , y'
A taxing change, as you can see ;). Less glibly: the rank operator " can take positive or negative arguments. A positive argument lets the verb address the content of its input, whereas a negative argument lets the verb address the frame of its input. Here, "1 (positive) applies {. to the rows of x, whereas "_1 (negative) applies it to the "rows" of x, where "rows" in scare-quotes simply means the items along the first dimension, even if they happen to be 37-dimensional hyperrectangles. Who cares?
Well, one guy cares. The original definition of tr let the laziest programmer write ('dog';'cat') tr ;: 'a dog makes the best pet' instead of (,:'dog';'cat') tr ;: 'a dog makes the best pet'. That is, the original tr (completely accidentally) allowed a simple list as a mapping table, which of course isn't a Nx2 table, even in an abstract, virtual sense (because it doesn't have at least two dimensions). Maybe we'd like to retain this convenience. If so, we'd have to promote degenerate arguments on the user's behalf:
tr =: dyad define
x=.,:^:(1=##$) x
(y i.~ ({."_1 x),y) { ({:"_1 x) , y
)
After all, laziness is a prime virtue of a programmer.
Here's the simplest way I can think of to accomplish what you have asked for:
(3;3;3;4;4) 0 3 5 1 4} 'a';'b';'c';'a';'b';'a'
┌─┬─┬─┬─┬─┬─┐
│3│4│c│3│4│3│
└─┴─┴─┴─┴─┴─┘
here's another approach
(<3) 0 3 5} (<4) 1 4} 'a';'b';'c';'a';'b';'a'
┌─┬─┬─┬─┬─┬─┐
│3│4│c│3│4│3│
└─┴─┴─┴─┴─┴─┘
Hypothetically speaking, you might want to be generalizing this kind of expression, or you might want an alternative. I think the other posters here have pointed out ways of doing that. But sometimes just seeing the simplest form can be interesting?
By the way, here's how I got my above indices (with some but not all of the irrelevancies removed):
I. (<'a') = 'a';'b';'c';'a';'b';'a'
0 3 5
('a') =S:0 'a';'b';'c';'a';'b';'a'
1 0 0 1 0 1
('a') -:S:0 'a';'b';'c';'a';'b';'a'
1 0 0 1 0 1
I.('a') -:S:0 'a';'b';'c';'a';'b';'a'
0 3 5
I.('b') -:S:0 'a';'b';'c';'a';'b';'a'
1 4

Funny Haskell Behaviour: min function on three numbers, including a negative

I've been playing around with some Haskell functions in GHCi.
I'm getting some really funny behaviour and I'm wondering why it's happening.
I realized that the function min is only supposed to be used with two values. However, when I use three values, in my case
min 1 2 -5
I'm getting
-4
as my result.
Why is that?
You are getting that result because this expression:
min 1 2 -5
parses as if it were parenthesized like this:
(min 1 2) -5
which is the same as this:
1 -5
which is the same as this:
1 - 5
which is of course -4.
In Haskell, function application is the most tightly-binding operation, but it is not greedy. In fact, even a seemingly simple expression like min 1 2 actually results in two separate function calls: the function min is first called with a single value, 1; the return value of that function is a new, anonymous function, which will return the smaller value between 1 and its single argument. That anonymous function is then called with an argument of 2, and of course returns 1. So a more accurate fully-parenthesized version of your code is this:
((min 1) 2) - 5
But I'm not going to break everything down that far; for most purposes, the fact that what looks like a function call with multiple arguments actually turns into a series of multiple single-argument function calls is a hand-wavable implementation detail. It's important to know that if you pass too few arguments to a function, you get back a function that you can call with the rest of the arguments, but most of the time you can ignore the fact that such calls are what's happening under the covers even when you pass the right number of arguments to start with.
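A small sketch of the partial application described above; the names atMostOne, twoStep and oneStep are made up for illustration.

```haskell
-- `min 1` is already a function: it is waiting for the second argument.
atMostOne :: Int -> Int
atMostOne = min 1

-- Both spellings mean the same thing; the parentheses only make the
-- two separate applications visible.
twoStep, oneStep :: Int
twoStep = (min 1) 2
oneStep = min 1 2
```

Here atMostOne 2 is 1 and atMostOne (-5) is -5, while twoStep and oneStep are both 1.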
So to find the minimum of three values, you need to chain together two calls to min (really four calls per the logic above, but again, hand-waving that):
min (min 1 2) (-5)
The parentheses around -5 are required to ensure that the - is interpreted as prefix negation instead of infix subtraction; without them, you have the same problem as your original code, only this time you would be asking Haskell to subtract a number from a function and would get a type error.
More generally, you could let Haskell do the chaining for you by applying a fold to a list, which can then contain as many numbers as you like:
foldl1 min [1, 2, -5]
(Note that in the literal list syntax, the comma and square bracket delimit the -5, making it clearly not a subtraction operation, so you don't need the parentheses here.)
The call foldl1 fun list means "take the first two items of list and call fun on them. Then take the result of that call and the next item of list, and call fun on those two values. Then take the result of that call and the next item of the list..." And so on, continuing until there's no more list, at which point the value of the last call to fun is returned to the original caller.
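That description can be written out by hand. The version below is a sketch for illustration, not the library's actual source, and myFoldl1 is a made-up name chosen to avoid clashing with the Prelude.

```haskell
-- A hand-written left fold without a starting value, mirroring the
-- description above: seed the accumulator with the first item, then
-- combine it with each remaining item in turn.
myFoldl1 :: (a -> a -> a) -> [a] -> a
myFoldl1 _ []       = error "myFoldl1: empty list"
myFoldl1 f (x : xs) = go x xs
  where
    go acc []       = acc               -- no more list: return the last result
    go acc (y : ys) = go (f acc y) ys   -- combine, then continue with the rest
```

With this definition, myFoldl1 min [1, 2, -5] is -5, and a single-element list comes back unchanged without f ever being called.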
For the specific case of min, there is a folded version already defined for you: minimum. So you could also write the above this way:
minimum [1, 2, -5]
That behaves exactly like my foldl1 solution; in particular, both will throw an error if handed an empty list, while if handed a single-element list, they will return that element unchanged without ever calling min.
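Putting the variants from this answer side by side (the binding names are only for illustration):

```haskell
-- The original expression: parsed as (min 1 2) - 5, a subtraction.
asWritten :: Int
asWritten = min 1 2 -5

-- Two chained calls, with the negative literal parenthesized.
nested :: Int
nested = min (min 1 2) (-5)

-- Letting a fold do the chaining over a list.
folded :: Int
folded = foldl1 min [1, 2, -5]

-- The ready-made folded version.
viaMinimum :: Int
viaMinimum = minimum [1, 2, -5]
```

asWritten evaluates to -4, while nested, folded and viaMinimum all evaluate to -5.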
Thanks to JohnL for reminding me of the existence of minimum.
When you type min 1 2 -5, Haskell doesn't group it as min 1 2 (-5), as you seem to think. It instead interprets it as (min 1 2) - 5, that is, it does subtraction rather than negation. The minimum of 1 and 2 is 1, obviously, and subtracting 5 from that will (perfectly correctly) give you -4.
Generally, in Haskell, you should surround negative numbers with parentheses so that this kind of stuff doesn't happen unexpectedly.
Nothing to add to the previous answers. But you are probably looking for this function.
import Data.List
minimum [1, 2, -4]

Repeat each element in a string a certain number of times

I'm using the rep() function to repeat each element in a character vector a number of times. Each character vector I have contains information for a state, and I need the first three elements of the vector repeated three times, and the fourth element repeated six times.
So let's say I have the following character vectors.
al <- c("AlabamaCity", "AlabamaCityST", "AlabamaCityState", "AlabamaZipCode")
ak <- c("AlaskaCity", "AlaskaCityST", "AlaskaCityState", "AlaskaZipCode")
az <- c("ArizonaCity", "ArizonaCityST", "ArizonaCityState", "ArizonaZipCode")
ar <- c("ArkansasCity", "ArkansasCityST", "ArkansasCityState", "ArkansasZipCode")
I want to end up having the following output.
AlabamaCity
AlabamaCity
AlabamaCity
AlabamaCityST
AlabamaCityST
AlabamaCityST
AlabamaCityState
AlabamaCityState
AlabamaCityState
AlabamaZipCode
AlabamaZipCode
AlabamaZipCode
AlabamaZipCode
AlabamaZipCode
AlabamaZipCode
...
I was able to get the desired output with the following command, but it's a little inconvenient when I'm running through all fifty states. Plus, I might have another column with 237 cities in Alabama, and I'll inevitably run into problems matching up the names in the first column with the values in the second column.
dat = data.frame(name=c(rep(al[1:3],each=3), rep(al[4],each=6),
rep(ak[1:3],each=3), rep(ak[4],each=6)))
dat
dat2 = data.frame(name=c(rep(al[1:3],each=3), rep(al[4],each=6),
rep(ak[1:3],each=3), rep(ak[4],each=6)),
city=c(rep("x",each=15), rep("y",each=15)))
dat2
Of course, in real life, the 'x' and 'y' won't be single values.
So my question is whether there is a more efficient way of performing this task. A closely related question: when does it become important to ditch procedural programming in favor of OOP in R? (I'm not a programmer, so the second part may be a really stupid question.) More importantly, is this a task where I should look for an OOP-related solution?
According to ?rep, times= can be a vector. So, how about this:
dat <- data.frame(name=rep(al, times=c(3,3,3,6)))
It would also be more convenient if your "state" data were in a list.
stateData <- list(al,ak,az,ar)
Data <- lapply(stateData, function(x) data.frame(name=rep(x, times=c(3,3,3,6))))
Data <- do.call(rbind, Data)
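For comparison with the Haskell threads on this page, the behaviour of rep(x, times = ...) with a vector of counts can be approximated as below. This is a loose analogy rather than a port of rep: the name repTimes is invented, and unlike R, which rejects a times vector whose length does not match, zipWith silently stops at the shorter list.

```haskell
-- Repeat the i-th element of the list as many times as the i-th count says.
repTimes :: [Int] -> [a] -> [a]
repTimes times xs = concat (zipWith replicate times xs)
```

For instance, repTimes [3,3,3,6] applied to the four Alabama names yields a 15-element list, the same length as the desired output in the question.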
I think you can combine the times argument of rep with sapply() to work through a list. So first, we need to make our list object:
vars <- list(al, ak, az, ar)
# Iterate through each object in vars. By default, this returns a column for each list item.
# Convert to vector and then to data.frame...This is probably not that efficient.
as.data.frame(as.vector(sapply(vars, function(x) rep(x, times = c(3,3,3,6)))))
1 AlabamaCity
2 AlabamaCity
3 AlabamaCity
4 AlabamaCityST
....snip....
....snip....
57 ArkansasZipCode
58 ArkansasZipCode
59 ArkansasZipCode
60 ArkansasZipCode
You might consider using expand.grid, then paste on the results from that.
