In response to the question about a FoldList-like primitive in J, I wanted to create an adverb fold so that x u fold y folds y with verb u and initial value x:
fold =: 2 : 0
z =.x
for_item. y do. z =. z u item end.
z
)
But I got an error when trying it out:
1 (+fold) 1 2 3
|value error: x
| z=. x
What's wrong here? Thanks.
Just a couple of small things.
First, the numeric code for an adverb is 1. The 2 : 0 you have is defining a conjunction, not an adverb. The way it stands now, J is expecting two direct arguments to fold, and you've only provided one (the +; the two numeric arrays are indirect, not direct, arguments). However, that's not what J is complaining about here, because the other issue is actually tripping it up first. I'll get to that in a second, but nevertheless the first thing you need to do is define fold as an adverb [1].
The more immediate issue that J is complaining about is that it doesn't know what you mean by x. Why? For the same reason that it would if you replaced 2 : 0 (or conjunction define) -- or even, more pertinently, adverb define -- with verb define: explicit verbs (direct or derived) are monadic by default and have no x argument, hence mentioning x is a value error. If you want to define a dyadic verb, you must ask for it explicitly.
Now, defining a dyadic verb directly is straightforward: instead of saying verb define, you simply say dyad define. But deriving a dyadic verb from a modifier (adverb or conjunction) is a little less obvious. You must use the special colon syntax which allows you to separate the monadic and dyadic valences of explicit definitions. This syntax applies to all explicit definitions, including verbs, adverbs, and conjunctions, but for adverbs and conjunctions it is the only way to derive an explicit verb.
Bottom line:
fold =: adverb define
NB. Note solitary colon on next line. Everything after that is dyadic.
:
z =. x
for_item. y do. z =. z u item end.
z
)
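With this definition the original test runs (expected result shown as a comment; note that, unlike Mathematica's FoldList, this plain fold returns only the final value, not the intermediates):
1 (+fold) 1 2 3 NB. 7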
[1]: You may find using the standard covers for name classes easier to remember (and read later), as in adverb define and conjunction define (for one-liners, you can use def in place of define).
I have been using J for a couple of weeks now and recently started using it for simple problems rather than just toying with concepts.
The problem is to replace all 'x' characters in a string with 'y'. My solution works, but using the dyadic form of the final verb gives me unexpected output.
Let's use the following example input:
input =. 'abcxdefxghi'
First, I need to find the indexes of x characters in the right argument for amend.
findx =. I.@:([:'x'&= ])
findx input NB. 3 7
0 findx input NB. 3 7
1 findx input NB. 3 7
Then, I amend the results of findx with a bonded y on the left.
trxy =. 'y'&(findx })
trxy input NB. abcydefyghi
_1 trxy input NB. domain error
0 trxy input NB. abcxdefxghi <= this is the really unexpected result
1 trxy input NB. abcydefyghi <= somewhat unexpected, works with strictly positive ints
'a' trxy input NB. domain error
There are two things I don't understand:
Why does trxy sometimes work as a dyad, when I thought I had bonded the left side of my amend?
Why does a 0 left argument stop trxy from working?
With 1 trxy input you are executing 1 'y'&(findx }) input – and x u&n y is probably not what you expect. It is documented (somewhat hidden) at the bottom of this page: https://code.jsoftware.com/wiki/Vocabulary/ampm
It is equivalent to x (m&n@]^:[) y, i.e. it applies m&n (n with m bound on the left) to y, x times. That's why with 0 trxy y you aren't executing anything, so y stays the same. With 1 trxy y you are applying trxy once to y. As trxy has no trivial inverse, _1 trxy y results in an error. And because 'a' isn't a number, the last one is just a plain error.
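To see this x u&n y behaviour in isolation, here is a quick session sketch with a simpler bonded verb (my own example, expected results as comments):
3 (1&+) 5 NB. 8: 1&+ applied three times
0 (1&+) 5 NB. 5: applied zero times, y unchanged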
If you – for whatever reason – just want to be able to write trxy as a monad and a dyad that ignores the left-hand side, you could use trxy =. 'y' findx} ]. (As you could also use findx =. I.@:('x' = ]) or just findx =. [:I. 'x'=].)
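A quick sketch of that last suggestion in action (expected results as comments):
trxy =. 'y' findx} ]
trxy input NB. abcydefyghi
0 trxy input NB. abcydefyghi, the left argument is discarded by ]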
Total Julia noob here (with basic knowledge of Python). I am trying to do linear regression and things I read suggest the GLM package. Here is some sample code I found here:
using DataFrames, GLM
y = 1:10
df = DataFrame(y = y, x1 = y.^2, x2 = y.^3)
sm = GLM.lm( @formula(y ~ x1 + x2), df )
coef(sm)
Can someone explain the syntax here? What does @formula mean? Docs here say @foo means a macro, which I guess is basically just a function, but where do I find the function/macro formula? Just looking at the use here, though, I would have thought it is maybe passing y ~ x1 + x2 (whatever that is) as the formula argument to lm (similar to keyword arguments with = in Python)?
Next, what is ~ here? General docs say ~ means negation, but I'm not seeing how that makes sense here.
Is there a place in the GLM docs where all of this is explained? I'm not seeing it; there are only a few examples, not a full breakdown of each function and all of its arguments.
You have stumbled upon the @formula language that is defined in the StatsModels.jl package and implemented in many statistics/econometrics-related packages across the Julia ecosystem.
As you say, @formula is a macro, which transforms the expression given to it (here y ~ x1 + x2) into some other Julia expression. If you want to find out what happens when a macro gets called in Julia - which I admit can often look like magic to new (and sometimes experienced!) users - the @macroexpand macro can help you. In this case:
julia> @macroexpand @formula(y ~ x1 + x2)
:(StatsModels.Term(:y) ~ StatsModels.Term(:x1) + StatsModels.Term(:x2))
The result above is the expression constructed by the @formula macro. We see that the variables in our formula are transformed into StatsModels.Term objects. If we were to use StatsModels directly, we could construct this ourselves by doing:
julia> Term(:y) ~ Term(:x1) + Term(:x2)
FormulaTerm
Response:
y(unknown)
Predictors:
x1(unknown)
x2(unknown)
julia> (Term(:y) ~ Term(:x1) + Term(:x2)) == @formula(y ~ x1 + x2)
true
Now what is going on with ~, which as you say can be used for negation in Julia? What has happened here is that StatsModels has defined methods for ~ (which in Julia is an infix operator, meaning essentially that it is a function that can be written in between its arguments rather than having to be called with its arguments in brackets):
julia> (Term(:y) ~ Term(:x)) == ~(Term(:y), Term(:x))
true
So writing y::Term ~ x::Term is the same as calling ~(y::Term, x::Term), and this method for calling ~ with terms on the left and right hand side is defined by StatsModels (see method no. 6 below):
julia> methods(~)
# 6 methods for generic function "~":
[1] ~(x::BigInt) in Base.GMP at gmp.jl:542
[2] ~(::Missing) in Base at missing.jl:100
[3] ~(x::Bool) in Base at bool.jl:39
[4] ~(x::Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}) in Base at int.jl:254
[5] ~(n::Integer) in Base at int.jl:138
[6] ~(lhs::Union{AbstractTerm, Tuple{Vararg{AbstractTerm,N}} where N}, rhs::Union{AbstractTerm, Tuple{Vararg{AbstractTerm,N}} where N}) in StatsModels at /home/nils/.julia/packages/StatsModels/pMxlJ/src/terms.jl:397
Note that you also find the general negation meaning here (method 3 above, which defines the behaviour for calling ~ on a boolean argument and is in Base Julia).
I agree that the GLM.jl docs maybe aren't the most comprehensive in the world, but one of the reasons for that is that the whole machinery behind @formula actually isn't a GLM.jl thing - so do check out the StatsModels docs linked above, which I think are quite good.
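For completeness, here is a minimal end-to-end sketch of the question's example (assuming reasonably recent DataFrames.jl and GLM.jl; the intermediate name f is mine), showing that the formula object can be built and inspected separately before fitting:
using DataFrames, GLM
df = DataFrame(y = 1:10, x1 = (1:10).^2, x2 = (1:10).^3)
f = @formula(y ~ x1 + x2)  # the macro builds a FormulaTerm at parse time
sm = lm(f, df)             # lm dispatches on the FormulaTerm
coef(sm)                   # vector of fitted coefficients (intercept, x1, x2)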
This question occurred to me while solving this problem.
NB. Find the next number whose prime factorization exponents
NB. match those of the given number.
exps=. /:~@{:@(__&q:)
f=. 3 : 0
target=. exps y
(>:^:(-.@(target-:exps))^:_) y+1
)
f 20 NB. 28
Note that in order to specify the while condition of the Do... While, I first had to calculate the prime exponents of the argument y and save that answer to target. I was then able to write -.@(target-:exps) as the while condition.
This of course breaks the tacit style. So I'd like to know if there is a way to achieve the same thing that my verb above achieves, but do so as a single tacit verb?
The way I approached this was to think of f as the centre of a dyadic fork, where the left argument is exps y (the unchanging comparison target) and the right argument is >: y (the initial incrementing). The next step was to use ] at each ^: in f to keep exps monadic. The [ pulls in exps y from the left argument.
Written in tacit form:
exps=. /:~#{:#(__&q:)
ft=: exps >:@]^:([ -.@-: exps@])^:_ >:
ft 20
28
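As an extra sanity check (the expected value is worked by hand, not from the original post): 12 = 2^2 * 3 and 18 = 2 * 3^2 share the sorted exponent list 1 2, and no number in between does, so:
ft 12 NB. 18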
It's cool that 3 * 4 results in 12, and * 4 results in 1, but does using the same primitive for both operations ever provide a benefit? For example, let's say I were to define the following:
SIGNUM =: * : [:
TIMES =: [: : *
If I were to only ever use SIGNUM and TIMES instead of *, would I ever miss out on a clever use of *? That is, x TIMES y seems to be exactly the same as x * y for every x I can imagine (although my imagination is pretty limited in this regard). Is there an x where x * y produces the same result as SIGNUM y?
In case * : [: isn't immediately clear, the following should illustrate:
SIGNUM =: * : [:
TIMES =: [: : *
SIGNUM 4
1
3 TIMES 4
12
* 4
1
3 * 4
12
3 SIGNUM 4
|domain error: SIGNUM
| 3 SIGNUM 4
TIMES 4
|domain error: TIMES
| TIMES 4
Let's write down the conclusions from the comments:
There is no direct language-level reason not to use names for primitives
Using names instead of primitives can, however, harm performance, as special code does not necessarily get triggered. I think this can be remedied by fixing verbs after building them, with f. (a quick sketch follows).
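A quick sketch of what f. does (whether it actually recovers the special-code paths is the conjecture above):
TIMES =: *
TIMES f. NB. displays *, the name resolved away to the primitive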
The reason for having the same name for monadic and dyadic verbs is historical: APL used it before J. Most verbs have related actions in their monadic and dyadic versions and in their inflections (a number of trailing dots and colons).
For instance, ^ can be expressed in traditional notation as pow(x, y) for the dyad and exp(y) for the monad, where x and y are the left and right arguments. The monadic version is the dyadic version with a sensible default left argument, Euler's constant e. Different inflections of the same root are all power-related verbs:
- ^. does logarithms (base e for the monad)
- ^: does Power conjunction, applying a verb a variable number of times.
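A short session sketch of these relatives (my own examples, expected results as comments):
^ 1 NB. e^1, about 2.71828
2 ^ 10 NB. pow(2,10): 1024
2 ^. 8 NB. log base 2 of 8: 3
+: ^: 3 ] 1 NB. the double verb +: applied 3 times to 1: 8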
Other relations between monadic and dyadic verbs can also exist, for example $ can be said to get or set the Shape of an array, depending on whether it is used as monad or dyad.
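For example (session sketch):
2 3 $ 7 NB. dyad: make a 2-by-3 array of 7s
$ 2 3 $ 7 NB. monad: read the shape back, 2 3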
That said, I think that once one gets a bit of experience with J, it becomes easier to spot which valence a verb has based on the sentence it is used in. Examples are:
Monad @ Ambiv NB. Mv is always used monadically, Av depends on arguments
Ambiv & Monad
(Dyad Monad) NB. A hook, where verb 1 is always dyadic
(Ambiv Dyad Ambiv) NB. A fork, the middle one is always dyadic
It was probably a mistake to use the same symbols for dyadic and monadic built-ins except for those where the monadic case is a default parameter to the dyad.
TIMES =: 1&$: : *
would be a good definition that doesn't give an error.
As for ambivalent cases,
(3 * TIMES) 4
12
2 (3 * TIMES) 4
24
Another useful ambivalent verb is:
TIMESORSQUARE =: *~
*~ 3
9
2 *~ 3
6
Concatenative languages have some very intriguing characteristics, such as being able to compose functions of different arity and being able to factor out any section of a function. However, many people dismiss them because of their use of postfix notation and how it's tough to read. Plus the Polish probably don't appreciate people using their carefully crafted notation backwards.
So, is it possible to have prefix notation? If it is, what would the tradeoffs be?
I have an idea of how it could work, but I'm not experienced with concatenative languages so I'm probably missing something. Basically, a function would be evaluated in reverse order and values would be pulled from the stack in reverse order. To demonstrate this, I'll compare postfix to what prefix would look like. Here are some concatenative expressions with the traditional postfix notation.
5 dup * ! Multiply 5 by itself
3 2 - ! Subtract 2 from 3
(1, 2, 3, 4, 5) [2 >] filter length ! Get the number of integers from 1 to 5
! that are greater than 2
The expressions are evaluated from left to right: in the first example, 5 is pushed on the stack, then dup duplicates the top value on the stack, then * multiplies the top two values on the stack. Functions pull their last argument first from the stack: in the second example, when - is called, 2 is at the top of the stack, but it is the last argument.
Here is what I think prefix notation would look like:
* dup 5
- 3 2
length filter (1, 2, 3, 4, 5) [< 2]
The expressions are evaluated from right to left, and functions pull their first argument first from the stack. Note how the prefix filter example reads much more closely to its description and looks similar to the applicative style. One issue I noticed is that factoring things out might not be as useful. For example, in postfix notation you can factor out 2 - from 3 2 - to create a subtractTwo function. In prefix notation you can factor out - 3 from - 3 2 to create a subtractFromThree function, which doesn't seem as useful.
Barring any glaring issues, perhaps a concatenative language that uses prefix notation could win over the people who dislike postfix notation. Any insight is appreciated.
Well certainly, if your words are still fixed-arity then it's just a matter of executing tokens right to left.
It's only because of n-arity functions that prefix notation implies parentheses, and it's only because of wanting human "reading order" to match execution order that being a stack language implies postfix.
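To make that concrete, here is a minimal right-to-left evaluator for fixed-arity prefix words. It is a sketch in Python, since the languages discussed here are hypothetical; the word set and arities are my own assumptions:

OPS = {
    '+': (2, lambda a, b: a + b),
    '-': (2, lambda a, b: a - b),
    '*': (2, lambda a, b: a * b),
}

def run_prefix(source):
    # Tokens are executed right to left; words pull their
    # first argument from the stack first.
    stack = []
    for tok in reversed(source.split()):
        if tok == 'dup':               # duplicate top of stack
            stack.append(stack[-1])
        elif tok in OPS:
            arity, fn = OPS[tok]
            args = [stack.pop() for _ in range(arity)]
            stack.append(fn(*args))
        else:
            stack.append(int(tok))     # literals are pushed
    return stack

print(run_prefix('* dup 5'))   # [25]
print(run_prefix('- 3 2'))     # [1]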
I'm writing such a language right now as it happens, and so far I like some of the side-effects of using prefix notation. The semantics are based on Joy:
Files are parsed from left to right, but executed from right to left.
By extension, definitions must come after the point at which they are used.
As a nice side-effect, comments are simply lists which are dropped.
Here's the factorial function, for instance:
def 'fact [cond [* fact - 1 dup] [1 drop] dup]
I also find it easier to reason about the code as I'm writing it, but I don't have a strong background in concatenative languages. Here's my (probably-naive) derivation of the map function over lists. The 'nb' function drops something and is used for comments. 'stash [f]' pops into a temp, runs 'f' on the rest of the stack, then pushes the temp back on.
def 'map [q [cons map stash [head swap i] dup stash [tail dup]] [nb] is_cons nip]
nb [map [f] (cons x y) -> cons map [f] x f y
stash [tail dup] [f] (cons x y) = [f] y (cons x y)
dup [f] y (cons x y) = [f] [f] y (cons x y)
stash [head swap i] [f] [f] y (cons x y) = [f] x (f y)
cons map [f] x (f y) = cons map [f] x f y
map [f] [] -> []]
I just came from reading about the Om Language.
It seems to be just what you are talking about. From its description (emphasis mine):
The Om language is:
a novel, maximally-simple concatenative, homoiconic programming and algorithm notation language with:
minimal syntax, comprised of only three elements.
prefix notation, in which functions manipulate the remainder of the program itself. [...]
It also states that it's not finished and will still change a lot.
Still, it seems to be working, and it is really interesting as a proof of concept.
I imagine a concatenative prefix language without a stack. It could call functions, which would then themselves interpret code until they got all the operands they need. The interpreter would then call the next function. It would need only one memory construct: the result. Everything else could be read from the source code at execution time. As you might have noticed, I am talking about an interpreted language, not a compiled one.