How to properly use (0 : 0) in verbs - j

I am trying to define the following entity
x =: fn
text
...
)
NB. x is a noun
Notice that no argument other than the text is given to fn. If I understand correctly, fn could only be a verb. I found a package which achieves similar results, but fn there is an adverb. I tried to use 0 : 0 in a verb (monad and dyad) but got a syntax error. What am I doing wrong? Why can the 0 : 0 trick be used only in adverbs?

0 : 0 creates a noun based on the subsequent lines until a line containing only a ) is reached.
If we use the verb ;: (Words, which boxes the words of a J sentence given as its argument) as fn, then this is what we get from the assignment.
a=: ;: (0 : 0)
here
we
go
)
a
+----+-+--+-+--+-+
|here| |we| |go| |
+----+-+--+-+--+-+
(<LF)= each a
+-------+-+---+-+---+-+
|0 0 0 0|1|0 0|1|0 0|1|
+-------+-+---+-+---+-+
The last part shows us that the blanks in the boxes are the linefeeds between lines.
Hope this helps
Cheers, bob
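For readers who do not have J at hand, here is a rough Python sketch of the same idea: collect lines until a line that is only ), then split into words while keeping each linefeed as its own token. The helper names read_here_noun and words are made up for this illustration, and the regular expression only approximates ;: on simple input like the example above.

```python
import re

def read_here_noun(lines):
    """Collect lines up to (not including) a line that is only ')'.

    Every collected line keeps a trailing linefeed, mirroring what the
    J output above shows for the noun built by 0 : 0.
    """
    collected = []
    for line in lines:
        if line == ")":
            break
        collected.append(line + "\n")
    return "".join(collected)

def words(text):
    """Approximate ;: for simple input: a linefeed is a token of its own."""
    return re.findall(r"\n|[^\s]+", text)

script = ["here", "we", "go", ")", "a"]   # the lines that follow (0 : 0)
noun = read_here_noun(script)
tokens = words(noun)
is_lf = [tok == "\n" for tok in tokens]   # same check as (<LF) = each a
print(tokens)
print(is_lf)
```

The alternating pattern in is_lf corresponds to the boxes that looked blank in the J display.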

Related

How can I fix 'noun result was required' error in J?

I'm trying to do the 4th Advent of Code problem using J, and I've run into lots of problems, but now I have code that looks like this:
fn =. < 'D:/PyCharm/AOC/output4.txt'
data =. freads fn
concatIntegers =. ,&'x'#,&.":
separado =. ;: data
lista =. > ((i.(#separado)%2)*2) {separado
n_lista =. #lista
n_2 =. n_lista % 4
lista_2 =. concatIntegers each/|:lista
matriz =. 250 4 $ > lista_2
loop =: monad : 0
res =. 0
i=.0
condicion=.0
for_i. i.250 do.
fila =. i { matriz
elfo11 =. 0 {fila
elfo12 =. 1 {fila
elfo21 =. 2 {fila
elfo22 =. 3 {fila
condicion =. ((elfo11 <: elfo21) *. (elfo12 >: elfo22)) +. ((elfo21 <: elfo11) *. (elfo22 >: elfo12))
if. condicion do.
res =. >: res
end.
end.
res
)
loop matriz
What this should do is: Loads a txt file, parses it, creates a matrix, and then using the verb loop it would add 1 to a counter every time the condition is applied.
The thing is, I can't make that loop work, every time I try running it, it gives me the same error:
|noun result was required: loop
| condicion
|[-30] d:\pycharm\aoc\day4.ijs
I am losing my mind
The code works until it reaches the loop verb I created, but I've been looking through documentation for ages and I can't spot my error
The code until that works as intended
Problem #1: variable i used in:
fila =. i { matriz
is not defined, so J treats it as an unknown verb rather than a noun.
Problem #2: loop iterates over matriz, which has 250 elements (each element is a list of 4 integers), but it does 1000 iterations, so it indexes out of bounds.
Try to replace the line:
for. i.1000 do.
by the line
for_i. i.250 do.
Problem #3: J has no operator precedence (expressions are evaluated right to left), so the condition needs explicit parentheses. I am guessing at the intent here:
condicion =. ((elfo11 <: elfo21) *. (elfo12 >: elfo22)) +. ((elfo21 <: elfo11) *. (elfo22 >: elfo12))
Problem #4: res increment is not saved, try to replace the line:
>: res
by the line
res=. >: res
Problem #5: the loop verb cannot see the matriz noun since it is local. Try to replace the line:
matriz =. 250 4 $ > lista_2
by the line
matriz =: 250 4 $ > lista_2
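As a cross-check of the logic in Problem #3, here is a small Python sketch of the same counting loop. It assumes each row holds the four bounds in the order elfo11, elfo12, elfo21, elfo22; the function name and the sample rows are invented for illustration and are not taken from the puzzle input.

```python
def count_full_containment(rows):
    """Count rows where one range fully contains the other.

    Each row is (a_lo, a_hi, b_lo, b_hi), matching elfo11, elfo12,
    elfo21, elfo22 in the J code (an assumption about the column order).
    """
    count = 0
    for a_lo, a_hi, b_lo, b_hi in rows:
        first_contains_second = a_lo <= b_lo and a_hi >= b_hi
        second_contains_first = b_lo <= a_lo and b_hi >= a_hi
        if first_contains_second or second_contains_first:
            count += 1          # the increment is stored, as in res =. >: res
    return count

# Made-up sample rows, not real puzzle data.
sample = [(2, 4, 6, 8), (2, 8, 3, 7), (6, 6, 4, 6), (2, 6, 4, 8)]
result = count_full_containment(sample)
print(result)
```

Comparing this count against the J result on the same rows is a quick way to confirm the parenthesization.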

How to fix indentation problem with haskell if statement

I have the following Haskell code:
f :: Int -> Int
f x =
let var1 = there in
case (there) of
12 -> 0
otherwise | (there - 1) >= 4 -> 2
| (there + 1) <= 2 -> 3
where there = 6
The function alone is garbage, ignore what exactly it does.
I want to replace the guards with if
f x =
let var1 = there in
case (there) of
12 -> 0
otherwise -> if (there - 1) >= 4 then 2
else if (there + 1) <= 2 then 3
where there = 6
I tried moving the if to the next line, the then to the next line, lining them up, unlining them, but nothing seems to work.
I get a parsing error and I don't know how to fix it:
parse error (possibly incorrect indentation or mismatched brackets)
|
40 | where there = 6
| ^
You have a few misunderstandings in here. Let's step through them starting from your original code:
f x =
A function definition, but the function never uses the parameter x. Strictly speaking this is a warning and not an error, but most code bases will use -Werror so consider omitting the parameter or using _ to indicate you are explicitly ignoring the variable.
let var1 = there in
This is unnecessary. Again, you are not using var1 (the code below uses there), so why have it?
case (there) of
Sure. Or just case there of, no need for excessive parens cluttering up the code.
12 -> 0
Here 12 is a pattern match, and it's fine.
otherwise ->
Here you used the variable name otherwise as a pattern, which will unconditionally match the value there. This is another warning: otherwise is a global value equal to True so it can be used in guards, such as function foo | foo < 1 = expr1 ; | otherwise = expr2. Your use is not like that; using otherwise as a pattern shadows the global value. Instead consider the catch-all pattern with underscore:
_ -> if (there - 1) >= 4
then 2
else if (there + 1) <= 2
then 3
where there = 6
OK... but what if there were equal to 3? 3-1 is not greater than or equal to 4, and 3+1 is not less than or equal to 2, so neither branch yields a value. You always need an else with your if expression. There is no bare if {} in Haskell; instead there is if ... then ... else ..., much like the ternary operator in C, as explained in the Haskell wiki.

Importing csv file into J and using them as variable

I saved this data (20 vectors, v) into a csv file like this
v=:<"1 (? 20 2 $ 20)
makecsv v
v writecsv jpath'~temp/position.csv'
]vcsv =: freads jpath '~temp/position.csv'
fixcsv vcsv
, and I could import the csv file by
readcsv jpath '~temp/position.csv'
However, it doesn't give same result if I name it as
w=: readcsv jpath '~temp/position.csv'
diff=: ([{]) ,. ]
0 diff v
0 diff w
Actually, 0 diff w gives a length error
Is there another approach I should use to get the same results from both v (the original) and w (the imported csv data)?
Thank you!
I'm a J beginner so you may get a better answer later, but poking at it I think I have found something.
First, the tables/csv addon docs state that readcsv "Reads csv file into a boxed array," emphasis mine, while writecsv "Writes an array to a csv file." In other words, readcsv and writecsv are not symmetric operations. And the shapes of the values seem to confirm that:
$ w
1 20
$ v
20
This is also why the diff works for v but not w. If you simply unbox the result, it seems to work better:
0 diff 0 { w
┌───┬─────┐
│1 3│1 3 │
├───┼─────┤
...
├───┼─────┤
│1 3│5 8 │
└───┴─────┘
However, the shapes are still not exactly the same:
$ > v
20 2
$ > 0 { w
20 5
I think this is because readcsv doesn't know that your values are numeric; you probably need to throw a ". in there somewhere to decode them.
When you write the CSV file, you just have a bunch of ASCII characters. In this case, you've got numbers, spaces, and commas.
When you read the CSV, J has no guarantees about the format or contents. fixcsv gets your commas and line-breaks translated into a grid of cells, but J boxes it all to be safe, because it's a bunch of variable-length ASCII strings.
If you want to get back to v, you have two things you need to do. The first is to get the dimensions right. CSV files, pretty much by definition, are two-dimensional. If you change your example to write a two-dimensional array to the CSV, you'll find that you have the same shape after fixcsv readcsv.
u =: 4 5 $ v
u writecsv jpath'~temp/position.csv'
104
] t =: fixcsv freads jpath '~temp/position.csv'
┌────┬─────┬────┬────┬─────┐
│9 11│1 4 │8 3 │3 12│5 4 │
├────┼─────┼────┼────┼─────┤
│7 11│10 11│9 10│0 8 │6 16 │
├────┼─────┼────┼────┼─────┤
│13 8│17 12│13 2│5 19│17 14│
├────┼─────┼────┼────┼─────┤
│2 15│19 10│3 1 │12 7│14 13│
└────┴─────┴────┴────┴─────┘
$ v
20
$ u
4 5
$ t
4 5
If you're definitely dealing with a one-dimensional list (albeit of boxed number pairs), then you can Ravel (,) what you read to get it down to one dimension.
$ w
1 20
$ , w
20
Once you have them in the same shape, you need to convert the ASCII text into number arrays. Do that with Numbers (".).
10 * > {. v
90 110
10 * > {. , w
|domain error
| 10 *>{.,w
'a' , > {. , w
a9 11
10 * _ ". > {. , w
90 110
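The same two steps (fix the shape, then decode the text) show up in any language when data takes a round trip through CSV. Here is a minimal Python sketch using only the standard csv module. The sample pairs are copied from the first cells of the table above, and the in-memory buffer stands in for the file.

```python
import csv
import io

pairs = [[9, 11], [1, 4], [8, 3]]          # numeric data before writing

# Write one CSV row whose cells are the space-separated pairs.
buffer = io.StringIO()
csv.writer(buffer).writerow([" ".join(map(str, p)) for p in pairs])

# Reading gives a two-dimensional table of strings, never numbers.
buffer.seek(0)
table = list(csv.reader(buffer))

# Step 1: flatten the 1-by-n table to a plain list (like , w in J).
cells = [cell for row in table for cell in row]

# Step 2: decode each text cell into numbers (like _ ". in J).
decoded = [[int(tok) for tok in cell.split()] for cell in cells]

print(table)
print(decoded)
```

Only after both steps does the value compare equal to what was written.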

Pattern Matching BASIC programming Language and Universe Database

I need to identify following patterns in string.
- "2N':'2N':'2N"
- "2N'-'2N'-'2N"
- "2N'/'2N'/'2N"
- "2N'/'2N'-'2N"
AND SO ON.....
Basically, in plain language, I want this pattern:
2 NUMBERS [: / -] 2 NUMBERS [: / -] 2 NUMBERS
So is there any way to write one pattern that covers all the possible scenarios? Otherwise I have to write 9 patterns in total and match each of them against the string. And that is not even the real scenario in my code: there I have to match four 2-digit groups separated by [: / -], for which I would have to write 27 patterns in total. I used the case of three 2-digit groups here to keep the example simple.
Please help me...Thank you
Maybe you could try something like (Pick R83 style)
OK = X MATCH "2N1X2N1X2N" AND X[3,1]=X[6,1] AND INDEX(":/-",X[3,1],1) > 0
Where variable X is some input string like: 12-34-56
Should set variable OK to 1 if validation passes, else 0 for any invalid format.
This seems to get all your required validation into a single statement. I have assumed that the non-numeric characters have to be the same. If this is not true, the check could be changed to something like:
OK = X MATCH "2N1X2N1X2N" AND INDEX(":/-",X[3,1],1) > 0 AND INDEX(":/-",X[6,1],1) > 0
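For comparison, here is how the two checks above might look with a regular expression in Python. This is only a sketch for readers more familiar with regex syntax, not something UniVerse BASIC provides; the function name is_valid and the same_separator flag are made up for this example.

```python
import re

# Two digits, a separator, two digits, the SAME separator, two digits.
SAME_SEP = re.compile(r"[0-9]{2}([:/-])[0-9]{2}\1[0-9]{2}")
# Two digits, any listed separator, two digits, any listed separator, two digits.
ANY_SEP = re.compile(r"[0-9]{2}[:/-][0-9]{2}[:/-][0-9]{2}")

def is_valid(text, same_separator=True):
    """Return True when the whole string has the required shape."""
    pattern = SAME_SEP if same_separator else ANY_SEP
    return pattern.fullmatch(text) is not None

print(is_valid("12-34-56"))                         # same separator twice
print(is_valid("12/34-56"))                         # mixed, strict mode
print(is_valid("12/34-56", same_separator=False))   # mixed, relaxed mode
```

The character class [:/-] plays the role of the INDEX(":/-", ...) test, and the back-reference \1 plays the role of X[3,1]=X[6,1].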
Ok, I guess the requirement of surrounding characters was not obvious to me. Still, it does not make it much harder. You just need to 'parse' the string looking for the first (I assume) such pattern (if any) in the input string. This can be done in a couple of lines of code. Here is a (rather untested ) R83 style test program:
PROMPT ":"
LOOP
LOOP
CRT 'Enter test string':
INPUT S
WHILE S # "" AND LEN(S) < 8 DO
CRT "Invalid input! Hit RETURN to exit, or enter a string with >= 8 chars!"
REPEAT
UNTIL S = "" DO
*
* Look for 1st occurrence of pattern in string..
CARDNUM = ""
FOR I = 1 TO LEN(S)-7 WHILE CARDNUM = ""
IF S[I,8] MATCH "2N1X2N1X2N" THEN
IF INDEX(":/-",S[I+2,1],1) > 0 AND INDEX(":/-",S[I+5,1],1) > 0 THEN
CARDNUM = S[I,8] ;* Found it!
END ELSE I = I + 8
END
NEXT I
*
CRT CARDNUM
REPEAT
There are only 7 or 8 lines here that actually look for the card number pattern in the source/test string.
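The scan for the first occurrence inside a longer string can be sketched in Python as well. This mirrors the intent of the FOR loop above (first 8-character window with the right shape) rather than its exact control flow; find_first is a hypothetical name and the sample strings are invented.

```python
import re

# Two digits, separator, two digits, separator, two digits (separators may differ).
CARD = re.compile(r"[0-9]{2}[:/-][0-9]{2}[:/-][0-9]{2}")

def find_first(text):
    """Return the first matching 8-character substring, or "" if none."""
    match = CARD.search(text)
    return match.group(0) if match else ""

print(find_first("CARD 12:34:56 IS IN THIS STRING"))
print(find_first("NO CARD HERE"))
```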
Not quite perfect, but how about 2N1X2N1X2N? This gets you 2 numbers followed by 1 of any character, followed by 2 numbers, and so on.
This might help:
BIG.STRING ="HELLO TILDE ~ CARD 12:34:56 IS IN THIS STRING"
TEMP.STRING = BIG.STRING
CONVERT "~:/-" TO "*~~~" IN TEMP.STRING
IF TEMP.STRING MATCHES '0X2N"~"2N"~"2N0X' THEN
FIRST.TILDE.POSN = INDEX(TEMP.STRING,"~",1)
CARD.STRING = BIG.STRING[FIRST.TILDE.POSN-2,8]
PRINT CARD.STRING
END

algorithm/code in R to find pattern from any position in a string

I want to find the pattern from any position in any given string such that the pattern repeats for a threshold number of times at least.
For example for the string "a0cc0vaaaabaaaabaaaabaa00bvw" the pattern should come out to be "aaaab". Another example: for the string "ff00f0f0f0f0f0f0f0f0000" the pattern should be "0f".
In both cases threshold has been taken as 3 i.e. the pattern should be repeated for at least 3 times.
If someone can suggest an optimized method in R for finding a solution to this problem, please do share with me. Currently I am achieving this by using 3 nested loops, and it's taking a lot of time.
Thanks!
Use regular expressions, which are made for this type of stuff. There may be more optimized ways of doing it, but in terms of easy to write code, it's hard to beat. The data:
vec <- c("a0cc0vaaaabaaaabaaaabaa00bvw","ff00f0f0f0f0f0f0f0f0000")
The function that does the matching:
find_rep_path <- function(vec, reps) {
regexp <- paste0(c("(.+)", rep("\\1", reps - 1L)), collapse="")
match <- regmatches(vec, regexpr(regexp, vec, perl=T))
substr(match, 1, nchar(match) / reps)
}
And some tests:
sapply(vec, find_rep_path, reps=3L)
# a0cc0vaaaabaaaabaaaabaa00bvw ff00f0f0f0f0f0f0f0f0000
# "aaaab" "0f0f"
sapply(vec, find_rep_path, reps=5L)
# $a0cc0vaaaabaaaabaaaabaa00bvw
# character(0)
#
# $ff00f0f0f0f0f0f0f0f0000
# [1] "0f"
Note that with threshold as 3, the actual longest pattern for the second string is 0f0f, not 0f (reverts to 0f at threshold 5). In order to do this, I use back references (\\1), and repeat these as many times as necessary to reach the threshold. I then need to substr the result because, annoyingly, base R doesn't have an easy way to get just the captured sub-expressions when using Perl-compatible regular expressions. There is probably a not-too-hard way to do this, but the substr approach works well in this example.
Also, as per the discussion in @G. Grothendieck's answer, here is the version with a cap on the length of the pattern, which just adds the limit argument and a slight modification of the regexp.
find_rep_path <- function(vec, reps, limit) {
regexp <- paste0(c("(.{1,", limit,"})", rep("\\1", reps - 1L)), collapse="")
match <- regmatches(vec, regexpr(regexp, vec, perl=T))
substr(match, 1, nchar(match) / reps)
}
sapply(vec, find_rep_path, reps=3L, limit=3L)
# a0cc0vaaaabaaaabaaaabaa00bvw ff00f0f0f0f0f0f0f0f0000
# "a" "0f"
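The back-reference idea is not specific to R. As an illustration, here is a sketch of the same regular expression built in Python, where the captured group can be read directly so no substr step is needed. The function name find_rep_pattern is made up, and limit=None stands for the uncapped version.

```python
import re

def find_rep_pattern(text, reps, limit=None):
    """Return the first unit that repeats `reps` times in a row, or None.

    The regex is a capture group followed by reps-1 back-references,
    e.g. (.+)\\1\\1 for reps=3. With `limit`, the unit is capped at
    that many characters: (.{1,limit})\\1\\1.
    """
    unit = "(.+)" if limit is None else "(.{1,%d})" % limit
    match = re.search(unit + r"\1" * (reps - 1), text)
    return match.group(1) if match else None

vec = ["a0cc0vaaaabaaaabaaaabaa00bvw", "ff00f0f0f0f0f0f0f0f0000"]
for s in vec:
    print(s, find_rep_pattern(s, 3), find_rep_pattern(s, 5),
          find_rep_pattern(s, 3, limit=3))
```

On the two sample strings this gives the same units as the R output above.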
find.string finds substring of maximum length subject to (1) substring must be repeated consecutively at least th times and (2) substring length must be no longer than len.
reps <- function(s, n) paste(rep(s, n), collapse = "") # repeat s n times
find.string <- function(string, th = 3, len = floor(nchar(string)/th)) {
for(k in len:1) {
pat <- paste0("(.{", k, "})", reps("\\1", th-1))
r <- regexpr(pat, string, perl = TRUE)
if (attr(r, "capture.length") > 0) break
}
if (r > 0) substring(string, r, r + attr(r, "capture.length")-1) else ""
}
and here are some tests. The last test processes the entire text of James Joyce's Ulysses in 1.4 seconds on my laptop:
> find.string("a0cc0vaaaabaaaabaaaabaa00bvw")
[1] "aaaab"
> find.string("ff00f0f0f0f0f0f0f0f0000")
[1] "0f0f"
>
> joyce <- readLines("http://www.gutenberg.org/files/4300/4300-8.txt")
> joycec <- paste(joyce, collapse = " ")
> system.time(result <- find.string2(joycec, len = 25))
user system elapsed
1.36 0.00 1.39
> result
[1] " Hoopsa boyaboy hoopsa!"
ADDED
Although I developed my answer before having seen BrodieG's, as he points out they are very similar to each other. I have added some features of his to the above to get the solution below and tried the tests again. Unfortunately when I added the variation of his code the James Joyce example no longer works although it does work on the other two examples shown. The problem seems to be in adding the len constraint to the code and may represent a fundamental advantage of the code above (i.e. it can handle such a constraint and such constraints may be essential for very long strings).
find.string2 <- function(string, th = 3, len = floor(nchar(string)/th)) {
pat <- paste0(c("(.", "{1,", len, "})", rep("\\1", th-1)), collapse = "")
r <- regexpr(pat, string, perl = TRUE)
ifelse(r > 0, substring(string, r, r + attr(r, "capture.length")-1), "")
}
> find.string2("a0cc0vaaaabaaaabaaaabaa00bvw")
[1] "aaaab"
> find.string2("ff00f0f0f0f0f0f0f0f0000")
[1] "0f0f"
> system.time(result <- find.string2(joycec, len = 25))
user system elapsed
0 0 0
> result
[1] "w"
REVISED The James Joyce test that was supposed to be testing find.string2 was actually using find.string. This is now fixed.
This function is not optimized (even though it is fast), but I think it is a more R-like way to do this.
1. Get all patterns of a certain length > threshold: vectorized using mapply and substr.
2. Get the occurrences of these patterns and extract the one with the maximum occurrence: vectorized using str_locate_all.
3. Repeat steps 1-2 for all lengths and take the one with the maximum occurrence.
Here is my code. I am creating 2 functions (one for steps 1-2 and one for step 3):
library(stringr)
ss = "ff00f0f0f0f0f0f0f0f0000"
ss <- "a0cc0vaaaabaaaabaaaabaa00bvw"
find_pattern_length <-
function(length=1,ss){
patt = mapply(function(x,y) substr(ss,x,y),
1:(nchar(ss)-length),
(length+1):nchar(ss))
res = str_locate_all(ss,unique(patt))
ll = unlist(lapply(res,length))
list(patt = patt[which.max(ll)],
rep = max(ll))
}
get_pattern_threshold <-
function(ss,threshold =3 ){
res <-
sapply(seq(threshold,nchar(ss)),find_pattern_length,ss=ss)
res[,which.max(res['rep',])]
}
some tests:
get_pattern_threshold('ff00f0f0f0f0f0f0f0f0000',5)
$patt
[1] "0f0f0"
$rep
[1] 6
> get_pattern_threshold('ff00f0f0f0f0f0f0f0f0000',2)
$patt
[1] "f0"
$rep
[1] 18
Since you want at least three repetitions, there is a nice O(n^2) approach.
For each possible pattern length d cut string into parts of length d. In case of d=5 it would be:
a0cc0
vaaaa
baaaa
baaaa
baa00
bvw
Now look at each pair of subsequent strings A[k] and A[k+1]. If they are equal then there is a pattern of at least two repetitions. Then go further (k+2, k+3) and so on. Finally, also check whether the suffix of A[k-1] and the prefix of A[k+n] fit (where k+n is the first string that doesn't match).
Repeat it for each d starting from some upper bound (at most n/3).
You have n/3 possible lengths, then n/d strings of length d to check for each d. It should give complexity O(n (n/d) d)= O(n^2).
Maybe not optimal but I found this cutting idea quite neat ;)
For a bounded pattern (i.e not huge) it's best I think to just create all possible substrings first and then count them. This is if the sub-patterns can overlap. If not change the step fun in the loop.
pat="a0cc0vaaaabaaaabaaaabaa00bvw"
len=nchar(pat)
thr=3
reps=floor(len/2)
# all poss strings up to half length of pattern
library(zoo)  # provides rollapply
pat=strsplit(pat, "")[[1]]  # split into single characters
str.vec=vector()
for(win in 2:reps)
{
str.vec= c(str.vec, rollapply(data=pat,width=win,FUN=paste0, collapse=""))
}
# the max length string repeated at least 3 times
tbl=table(str.vec)
tbl=tbl[tbl>=3]
tbl[which.max(nchar(names(tbl)))]
aaaabaa
3
NB Whilst I'm lazy and append/grow the str.vec here in a loop, for a larger problem I'm pretty sure the actual length of str.vec is predetermined by the length of the pattern if you care to work it out.
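The create-all-substrings-then-count idea can be sketched compactly in Python with collections.Counter. This is an illustration of the approach, not a port of the R code: the function name is invented, windows overlap (step of 1), and widths run from 2 up to half the string length as in the loop above.

```python
from collections import Counter

def longest_frequent_substring(text, threshold=3):
    """Longest substring seen at least `threshold` times (overlaps allowed).

    Returns (substring, count), or None when nothing reaches the threshold.
    """
    max_width = len(text) // 2
    counts = Counter(
        text[start:start + width]
        for width in range(2, max_width + 1)          # window widths
        for start in range(len(text) - width + 1)     # sliding start, step 1
    )
    frequent = [s for s, n in counts.items() if n >= threshold]
    if not frequent:
        return None
    best = max(frequent, key=len)
    return best, counts[best]

result = longest_frequent_substring("a0cc0vaaaabaaaabaaaabaa00bvw")
print(result)
```

Because overlapping windows are counted, the answer here is longer than the consecutive-repeat unit found by the regex answers.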
Here is my solution. It's not optimized (for example, it grows the vector with patterns <- c() ; patterns <- c(patterns, x)) and can be improved, but I think it is simpler than yours.
I can't tell exactly which pattern should be returned (I just return the most frequent one), but you can adjust the code to what you want exactly.
str <- "a0cc0vaaaabaaaabaaaabaa00bvw"
findPatternMax <- function(str){
nb <- nchar(str):1
length.patt <- rev(nb)
patterns <- c()
for (i in 1:length(nb)){
for (j in 1:nb[i]){
patterns <- c(patterns, substr(str, j, j+(length.patt[i]-1)))
}
}
patt.max <- names(which(table(patterns) == max(table(patterns))))
return(patt.max)
}
findPatternMax(str)
> findPatternMax(str)
[1] "a"
EDIT :
Maybe you want the returned pattern to have a minimum length?
Then you can add a nchar.patt parameter, for example:
nchar.patt <- 2 #For a pattern of 2 char min
nb <- nb[length.patt >= nchar.patt]
length.patt <- length.patt[length.patt >= nchar.patt]

Resources