How to define a variable based on an if/then/else statement - haskell

I'm trying to translate some Python code to Haskell. However, I've reached a point where I'm not sure how to proceed.
if len(prod) % 2 == 0:
    ss = float(1.5 * count_vowels(cust))
else:
    ss = float(count_consonants(cust)) # multiplication by 1 is implied.
if len(cust_factors.intersection(prod_factors)) > 0:
    ss *= 1.5
return ss
I've tried to translate it to this:
if odd length prod
then ss = countConsonants cust
else ss = countVowels cust
if length (cust intersect prod) > 0
then ss = 1.5 * ss
else Nothing
return ss
But I keep getting errors of:
parse error on input `='
Any help or words of wisdom on this would be greatly appreciated.

Don't think of programming in Haskell as "if this, then do that, then do the other thing" — the entire idea of doing things in a sequence is imperative. You're not checking a condition and then defining a variable — you're just calculating a result that depends on a condition. In functional programming, if is an expression and variables are assigned the result of an expression, not assigned inside it.
The most direct translation would be:
let ss = if odd $ length prod
           then countConsonants cust
           else countVowels cust
in if length (cust `intersect` prod) > 0
     then Just $ 1.5 * ss
     else Nothing

In Haskell, if is an expression, not a statement. This means it returns a value (like a function) instead of performing an action. Here's one way to translate your code:
let ss = if odd (length prod)
           then countConsonants cust
           else countVowels cust
in if length (cust `intersect` prod) > 0
     then Just $ 1.5 * ss
     else Nothing
Here's another way:
if length (cust `intersect` prod) > 0
  then Just $ 1.5 * if odd (length prod)
                      then countConsonants cust
                      else countVowels cust
  else Nothing
As Matt has pointed out, however, your Python code doesn't return None. Every code path sets ss to a number. If this is how it's supposed to work, here's a Haskell translation:
let ss = if odd $ length prod
           then countConsonants cust
           else countVowels cust
in if length (cust `intersect` prod) > 0
     then 1.5 * ss
     else ss

If I were you I'd use guards. Maybe I'm a Haskell heathen.
ss prod prodfactors cust
  | even $ length prod = extratest . (1.5 *) . countvowels $ cust
  | otherwise          = extratest . countconsonants $ cust
  where
    extratest curval
      | custfactorsintersection prodfactors > 0 = curval * 1.5
      | otherwise                               = curval

I would write it like this in Haskell:
if (not $ null $ cust_factors `intersect` prod_factors)
  then ss * 1.5
  else ss
  where
    ss = if (even $ length prod)
           then 1.5 * count_vowels cust
           else count_consonants cust
Some comments about what you wrote:
You can bind names in Haskell using the let and where syntax. In general, everything you write in Haskell is an expression. In your case you have to write the whole thing as a single expression, and using let or where simplifies that task.
return in Haskell means something different than in Python: it is an ordinary function that wraps a value in a monadic context (such as IO), and it does not exit the enclosing function. For your example there is no need for it.
Nothing is a special value of the type Maybe a. This type represents values of type a with possible failure (the Nothing).
And to answer your direct question.
This Python code
if b:
    s = 1
else:
    s = 2
translates to Haskell as s = if b then 1 else 2 inside a let or where clause.
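The same idea can be tried out in the question's original Python, where a conditional expression plays the role of Haskell's if/then/else: each name is bound once to the value of an expression instead of being assigned inside a branch. This is only a sketch. The two counting helpers and the function name suitability are assumptions standing in for code the question does not show, and the factor collections are assumed to be Python sets.

```python
def count_vowels(s):
    """Count the letters a, e, i, o, u (assumed stand-in for the asker's helper)."""
    return sum(1 for ch in s.lower() if ch in "aeiou")


def count_consonants(s):
    """Count alphabetic characters that are not vowels (also an assumed stand-in)."""
    return sum(1 for ch in s.lower() if ch.isalpha() and ch not in "aeiou")


def suitability(prod, cust, prod_factors, cust_factors):
    # First binding: choose the base score with a single expression.
    base = 1.5 * count_vowels(cust) if len(prod) % 2 == 0 else float(count_consonants(cust))
    # Second binding: derive the result from base instead of mutating it.
    return base * 1.5 if cust_factors & prod_factors else base
```

Written this way, the function has the same two-step shape as the let ... in translation: one expression for the base score, and one expression that depends on it.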

Functional programming is different from imperative programming. Trying to "translate" line by line isn't how Haskell is meant to be used.

To answer your question specifically: "ss" already has a value, and it simply isn't possible to give it a different one. ss = ss * 1.5 makes no sense as an update; in Haskell it would be a recursive definition of ss in terms of itself.

Related

How to fix indentation problem with haskell if statement

I have the following Haskell code:
f :: Int -> Int
f x =
  let var1 = there in
    case (there) of
      12 -> 0
      otherwise | (there - 1) >= 4 -> 2
                | (there + 1) <= 2 -> 3
  where there = 6
The function alone is garbage, ignore what exactly it does.
I want to replace the guards with if
f x =
  let var1 = there in
    case (there) of
      12 -> 0
      otherwise -> if (there - 1) >= 4 then 2
                   else if (there + 1) <= 2 then 3
  where there = 6
I tried moving the if to the next line, the then to the next line, lining them up, unlining them, but nothing seems to work.
I get a parsing error and I don't know how to fix it:
parse error (possibly incorrect indentation or mismatched brackets)
   |
40 | where there = 6
   | ^
You have a few misunderstandings in here. Let's step through them starting from your original code:
f x =
A function definition, but the function never uses the parameter x. Strictly speaking this is a warning and not an error, but many code bases build with -Werror, so consider omitting the parameter or using _ to indicate that you are explicitly ignoring it.
let var1 = there in
This is unnecessary: again, you are not using var1 (the code below refers to there directly), so why have it?
case (there) of
Sure. Or just case there of; there is no need for extra parentheses cluttering up the code.
12 -> 0
Here 12 is a pattern match, and it's fine.
otherwise ->
Here you used the variable name otherwise as a pattern, which will unconditionally match the value there. This is another warning: otherwise is a global value equal to True so that it can be used in guards, as in function foo | foo < 1 = expr1 ; | otherwise = expr2. Your use is not like that; using otherwise as a pattern shadows the global value. Instead consider the catch-all pattern, an underscore:
      _ -> if (there - 1) >= 4
             then 2
             else if (there + 1) <= 2
                    then 3
  where there = 6
OK... what if there were equal to 3? 3-1 is not greater than or equal to 4, and 3+1 is not less than or equal to 2. You always need an else with your if expression. There is no if {} in Haskell; instead there is if ... then ... else ..., much like the ternary operator in C, as explained in the Haskell wiki.
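The "an if needs an else" rule is not unique to Haskell: Python's conditional expression has the same shape, which may make the point easier to experiment with. The sketch below is an illustration only. The helper is_expression and the fallback value 1 are assumptions added here, since the question never says what f should return when neither test holds.

```python
def is_expression(source):
    """Return True if `source` parses as a single Python expression."""
    try:
        compile(source, "<sketch>", "eval")
        return True
    except SyntaxError:
        return False


def f(there=6):
    """Total version of the question's f; 1 is an assumed fallback value."""
    if there == 12:
        return 0
    # Every conditional expression carries its own else branch.
    return 2 if there - 1 >= 4 else (3 if there + 1 <= 2 else 1)
```

Here is_expression("2 if c else 3") succeeds while is_expression("2 if c") does not, for the same reason the Haskell parser rejects an if without an else: an expression must produce a value on every path.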

What is the shift used in the simplified Galil-Seiferas string matching algorithm?

I'm self-studying problem 32-1 in CLRS; part c), presents the following algorithm for string matching:
REPETITION-MATCHER(P, T)
    m = P.length
    n = T.length
    k = 1 + ρ'(P)
    q = 0
    s = 0
    while s <= n-m
        if T[s+q+1] == P[q+1]
            q = q+1
            if q==m
                print "Pattern occurs with shift" s
        if q==m or T[s+q+1] != P[q+1]
            s = s+max(1, ceil(q/k))
            q = 0
Here, ρ'(P), which is a function of P only, is defined as the largest integer r such that some prefix P[1..i] = y^r, i.e. some prefix consists of a substring y repeated r times.
This algorithm appears to be 95 percent similar to the naive brute-force string matcher. However, the one part which greatly confuses me, and which seems to be the centerpiece of the entire algorithm, is the second to last line. Here, q is the number of characters of P matched so far. What is the rationale behind ceil(q/k)? It is completely opaque to me. It would have made more sense if that line were something like s = s + max(1+q, 1+i), where i is the length of the prefix that gives rise to ρ'(P).
CLRS claims that this algorithm is due to Galil and Seiferas, but in the reference they provide, I cannot find anything that resembles the algorithm provided above. It appears that reference contains, if anything, a much more advanced version of what is here. Can someone explain this ceil(q/k) value, and/or point me toward a reference that describes this particular algorithm, instead of the more well-known main Galil Seiferas paper?
Example #1:
Match aaaa in aaaaab, here ρ' = 4. Consider state:
aaaa ab
^
The whole pattern has matched here (q = m), and we want to move forward by one symbol, no more, because the full pattern matches again at the next shift (the last line resets q to zero). q = 4 and k = 5, so ceil(q/k) = 1, which is exactly right.
Example #2: Match abcd.abcd.abcd.X in abcd.abcd.abcd.abcd.X. Consider state:
abcd.abcd.abcd. abcd.X
^
We have a mismatch here, and we would like to move forward by five symbols. q = 15 and k = 4, so ceil(q/k) = 4. That's OK: it is almost 5, and we can still match our pattern afterwards. Had ρ' been bigger, say 10, we would have ceil(50/(10+1)) = 5.
Yes, the algorithm skips forward by fewer symbols than KMP does; in the case ρ' = 10 its running time is O(10n+m), while KMP runs in O(n+m).
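To experiment with the two examples above, here is a near-literal Python transcription of REPETITION-MATCHER. It is a sketch under a few assumptions: shifts are reported 0-based, the pattern is assumed non-empty, and ρ' is computed with the KMP prefix function (a choice made here for brevity; the question does not say how ρ' is obtained). All function names are made up for this illustration.

```python
def prefix_function(p):
    """pi[i] = length of the longest proper prefix of p[:i] that is also a suffix of it."""
    pi = [0] * (len(p) + 1)
    k = 0
    for i in range(1, len(p)):
        while k > 0 and p[i] != p[k]:
            k = pi[k]
        if p[i] == p[k]:
            k += 1
        pi[i + 1] = k
    return pi


def max_repetition_factor(p):
    """rho'(P): the largest r such that some prefix of p is a string y repeated r times."""
    pi = prefix_function(p)
    best = 1
    for i in range(1, len(p) + 1):
        period = i - pi[i]          # shortest period of the prefix p[:i]
        if i % period == 0:         # the prefix is an exact power of its period
            best = max(best, i // period)
    return best


def repetition_matcher(p, t):
    """Return every 0-based shift at which p occurs in t."""
    m, n = len(p), len(t)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    k = 1 + max_repetition_factor(p)
    shifts = []
    q = s = 0
    while s <= n - m:
        if t[s + q] == p[q]:
            q += 1
            if q == m:
                shifts.append(s)
        if q == m or t[s + q] != p[q]:
            s += max(1, (q + k - 1) // k)   # integer form of ceil(q / k)
            q = 0
    return shifts
```

For Example #1, repetition_matcher("aaaa", "aaaaab") advances by one position after each full match; for Example #2 the mismatch at q = 15 advances the shift by 4, and one more single step brings it to the real occurrence.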
I figured out the proof of correctness.
Let k = ρ'(P) + 1, where ρ'(P) is the largest repetition factor over all the prefixes of P.
Suppose T[s+1..s+q] = P[1..q], and either q=m or T[s+q+1] != P[q+1]
Then, for 1 <= j <= floor(q/k) (except for the case q=m and m mod k = 0, in which the upper limit must be ceil(m/k)), we have
T[s+1..s+j] = P[1..j]
T[s+j+1..s+2j] = P[j+1..2j]
T[s+2j+1..s+3j] = P[2j+1..3j]
...
T[s+(k-1)j+1..s+kj] = P[(k-1)j+1..kj]
where the k blocks P[1..j], P[j+1..2j], ..., P[(k-1)j+1..kj] cannot all be equal to one another: if they were, the prefix P[1..kj] would have repetition factor k, but the largest repetition factor of any prefix of P is k-1.
Suppose we now make a comparison at shift s' = s+j, so that we will make the following comparisons
T[s+j+1..s+2j] with P[1..j]
T[s+2j+1..s+3j] with P[j+1..2j]
T[s+3j+1..s+4j] with P[2j+1..3j]
...
T[s+kj+1..s+(k+1)j] with P[(k-1)j+1..kj]
We claim that not every comparison can match, e.g. at least one of the above "with"s must be replaced with !=. We prove by contradiction. Suppose every "with" above is replaced by =. Then, comparing to the first set of comparisons we did, we would immediately have the following:
P[1..j] = P[j+1..2j]
P[j+1..2j] = P[2j+1..3j]
...
P[(k-2)j+1..(k-1)j] = P[(k-1)j+1..kj]
However, this cannot be true, because k is not a repetition factor, hence a contradiction.
Hence, for any 1 <= j <= floor(q/k), testing a new shift s'=s+j is guaranteed to mismatch.
Hence, the smallest shift that can possibly result in a match is s + floor(q/k) + 1 >= s + ceil(q/k).
Note the code uses ceil(q/k) for simplicity, solely to deal with the case that q = m and m mod k = 0, in which case k * (floor(q/k)+1) would be greater than m, so only ceil(q/k) would do. However, when q mod k = 0 and q < m, then ceil(q/k) = floor(q/k), so is slightly suboptimal, since that shift is guaranteed to fail, and floor(q/k)+1 is the first shift that has any chance of matching.

Fortran nested WHERE statement

I have Fortran 90 source code with a nested WHERE statement. There is a problem, but it is difficult to understand what exactly happens. I would like to transform it into a DO-IF structure in order to debug it. What is not clear to me is how to translate the nested WHERE.
All the arrays have the same size.
WHERE (arrayA(:) > 0)
   diff_frac(:) = 1.5 * arrayA(:)
   WHERE (diff_frac(:) > 2)
      arrayC(:) = arrayC(:) + diff_frac(:)
   ENDWHERE
ENDWHERE
My option A:
DO i=1, SIZE(arrayA)
   IF (arrayA(i) > 0) THEN
      diff_frac(i) = 1.5 * arrayA(i)
      DO j=1, SIZE(diff_frac)
         IF (diff_frac(j) > 2) THEN
            arrayC(j) = arrayC(j) + diff_frac(j)
         ENDIF
      ENDDO
   ENDIF
ENDDO
My option B:
DO i=1, SIZE(arrayA)
   IF (arrayA(i) > 0) THEN
      diff_frac(i) = 1.5 * arrayA(i)
      IF (diff_frac(i) > 2) THEN
         arrayC(i) = arrayC(i) + diff_frac(i)
      ENDIF
   ENDIF
ENDDO
Thank you
According to the thread "Nested WHERE constructs" in comp.lang.fortran (particularly Ian's reply), it seems that the first code in the Question translates to the following:
do i = 1, size( arrayA )
   if ( arrayA( i ) > 0 ) then
      diff_frac( i ) = 1.5 * arrayA( i )
   endif
enddo
do i = 1, size( arrayA )
   if ( arrayA( i ) > 0 ) then
      if ( diff_frac( i ) > 2 ) then
         arrayC( i ) = arrayC( i ) + diff_frac( i )
      endif
   endif
enddo
This is almost the same as that in Mark's answer except for the second mask part (see below). Key excerpts from the F2008 documents are something like this:
7.2.3 Masked array assignment – WHERE (page 161)
7.2.3.2 Interpretation of masked array assignments (page 162)
... 2. Each statement in a WHERE construct is executed in sequence.
... 4. The mask-expr is evaluated at most once.
... 8. Upon execution of a WHERE statement that is part of a where-body-construct, the control mask is established to have the value m_c .AND. mask-expr.
... 10. If an elemental operation or function reference occurs in the expr or variable of a where-assignment-stmt or in a mask-expr, and is not within the argument list of a nonelemental function reference, the operation is performed or the function is evaluated only for the elements corresponding to true values of the control mask.
If I understand the above thread/documents correctly, the conditional diff_frac( i ) > 2 is evaluated after arrayA( i ) > 0, so corresponding to double IF blocks (if I assume that A .and. B in Fortran does not specify the order of evaluation).
However, as noted in the above thread, the actual behavior may depend on compilers... For example, if we compile the following code with gfortran5.2, ifort14.0, or Oracle fortran 12.4 (with no options)
integer, dimension(4) :: x, y, z
integer :: i
x = [1,2,3,4]
y = 0 ; z = 0
where ( 2 <= x )
   y = x
   where ( 3.0 / y < 1.001 ) !! possible division by zero
      z = -10
   end where
end where
print *, "x = ", x
print *, "y = ", y
print *, "z = ", z
they all give the expected result:
x = 1 2 3 4
y = 0 2 3 4
z = 0 0 -10 -10
But if we compile with debugging options
gfortran -ffpe-trap=zero
ifort -fpe0
f95 -ftrap=division (or with -fnonstd)
gfortran and ifort abort with floating-point exception by evaluating y(i) = 0 in the mask expression, while f95 runs with no complaints. (According to the linked thread, Cray behaves similarly to gfortran/ifort, while NAG/PGI/XLF are similar to f95.)
As a side note, when we use "nonelemental" functions in WHERE constructs, the control mask does not apply and all the elements are used in the function evaluation (according to Sec. 7.2.3.2, sentence 9 of the draft above). For example, the following code
integer, dimension(4) :: a, b, c
a = [ 1, 2, 3, 4 ]
b = -1 ; c = -1
where ( 3 <= a )
   b = a * 100
   c = sum( b )
endwhere
gives
a = 1 2 3 4
b = -1 -1 300 400
c = -1 -1 698 698
which means that sum( b ) = 698 is obtained from all the elements of b, with the two statements evaluated in sequence.
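The control-mask reading quoted from the standard can also be mimicked outside Fortran, which is sometimes handy when checking expected values by hand. The Python sketch below emulates the nested WHERE from the question on plain lists. It assumes the interpretation described in this answer (outer mask evaluated once, statements executed in sequence, inner mask only evaluated where the outer mask holds) and does not model the compiler-dependent behaviour discussed above; the function name is made up for the illustration.

```python
def nested_where(array_a, diff_frac, array_c):
    """Emulate the question's nested WHERE; returns new (diff_frac, array_c) lists."""
    diff_frac = list(diff_frac)
    array_c = list(array_c)
    n = len(array_a)

    # Outer mask: evaluated once, before any statement in the body runs.
    outer = [a > 0 for a in array_a]

    # First body statement, applied to every element selected by the outer mask.
    for i in range(n):
        if outer[i]:
            diff_frac[i] = 1.5 * array_a[i]

    # Inner control mask = outer .AND. inner test; `and` short-circuits, so the
    # inner test is only evaluated where the outer mask is true.
    inner = [outer[i] and diff_frac[i] > 2 for i in range(n)]

    # Second body statement, applied under the combined mask.
    for i in range(n):
        if inner[i]:
            array_c[i] = array_c[i] + diff_frac[i]

    return diff_frac, array_c
```

With array_a = [1.0, -1.0, 2.0] and diff_frac initially all 9.0, only the third element of array_c is updated: the second element is excluded by the outer mask even though its stale diff_frac value exceeds 2, which is the case that option A in the question would get wrong.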
Why not
WHERE (arrayA(:) > 0)
   diff_frac(:) = 1.5 * arrayA(:)
ENDWHERE
WHERE (diff_frac(:) > 2 .and. arrayA(:) > 0)
   arrayC(:) = arrayC(:) + diff_frac(:)
ENDWHERE
?
I won't say it can't be done with nested wheres, but I don't see why it has to be. Then, if you must translate to do loops, the translation is very straightforward.
Your own attempts suggest you think of where as a kind of looping construct. I think it's better to think of it as a masked assignment (which is how it's explained in the language standard) in which each individual assignment happens at the same time. These days you might consider translating into do concurrent constructs.
Sorry about deflecting the question a bit, but this is interesting. I am not sure that I can tell how the nested where is going to be compiled. It may even be one of those cases that push the envelope.
I agree with High Performance Mark that where is best thought of as a masking operation and then it is unclear (to me) whether your "A" or "B" will result.
I do think that his solution should be the same as your nested where.
My point: Since this is tricky to even discern, can you write new code instead of this, from scratch? Not to translate it, but delete it, forget about it, and write code to do the job.
If you know exactly what this piece of code needs to do, its pre- and post- conditions, then it shouldn't be difficult. If you don't know that then the algorithm may be too entangled in which case this should be rewritten anyway. There may be subtleties involved between what this was intended to do and what it does. You say you are debugging this code already.
Again, sorry to switch context but I think that there is a possibility that this is one of those situations where code is best served by a complete rewrite.
If you want to keep it and only write loops for debugging: Why not write them and compare output?
Run it with where as it is, then run it with "A" instead, then with "B". Print values.

Why is the layout below parsed correctly in Haskell?

I am testing my understanding of the layout-parsing function in the Haskell report (here).
I could understand that:
test case 1 will pass due to good alignment
test case 2 will fail because "in a + b" is considered a new item at module-level
However, I could not understand why test case 3 would be correctly parsed. So, Questions:
Why will test-case 3 be correctly parsed?
Which pattern in the LHS of the parsing function L (see Here) does test-case 3 match?
-- test case 1
f_1 = let a = 1
          b = 2
      in a + b
-- test case 2
f_2 = let a = 1
          b = 2
in a + b
-- test case 3
f_3 = let a = 1
          b = 2
          in a + b
Test case 3 matches the parse-error(t) rule. Because the token in is not legal at that point in the let block, a } is inserted before the in to end it.
The parse-error rule can be confusing, but it is also very flexible; using it you can e.g. write Haskell one-liners with rarely any explicit {} at all.

algorithm/code in R to find pattern from any position in a string

I want to find the pattern from any position in any given string such that the pattern repeats for a threshold number of times at least.
For example for the string "a0cc0vaaaabaaaabaaaabaa00bvw" the pattern should come out to be "aaaab". Another example: for the string "ff00f0f0f0f0f0f0f0f0000" the pattern should be "0f".
In both cases threshold has been taken as 3 i.e. the pattern should be repeated for at least 3 times.
If someone can suggest an optimized method in R for finding a solution to this problem, please do share with me. Currently I am achieving this by using 3 nested loops, and it's taking a lot of time.
Thanks!
Use regular expressions, which are made for this type of stuff. There may be more optimized ways of doing it, but in terms of easy to write code, it's hard to beat. The data:
vec <- c("a0cc0vaaaabaaaabaaaabaa00bvw","ff00f0f0f0f0f0f0f0f0000")
The function that does the matching:
find_rep_path <- function(vec, reps) {
  regexp <- paste0(c("(.+)", rep("\\1", reps - 1L)), collapse="")
  match <- regmatches(vec, regexpr(regexp, vec, perl=TRUE))
  substr(match, 1, nchar(match) / reps)
}
And some tests:
sapply(vec, find_rep_path, reps=3L)
# a0cc0vaaaabaaaabaaaabaa00bvw ff00f0f0f0f0f0f0f0f0000
# "aaaab" "0f0f"
sapply(vec, find_rep_path, reps=5L)
# $a0cc0vaaaabaaaabaaaabaa00bvw
# character(0)
#
# $ff00f0f0f0f0f0f0f0f0000
# [1] "0f"
Note that with threshold as 3, the actual longest pattern for the second string is 0f0f, not 0f (it reverts to 0f at threshold 5). To do this, I use back references (\\1) and repeat them as many times as necessary to reach the threshold. I then need to substr the result because, annoyingly, base R doesn't have an easy way to get just the captured sub-expressions when using Perl-compatible regular expressions. There is probably a not-too-hard way to do this, but the substr approach works well in this example.
Also, as per the discussion in @G. Grothendieck's answer, here is the version with a cap on the length of the pattern, which just adds the limit argument and slightly modifies the regexp.
find_rep_path <- function(vec, reps, limit) {
  regexp <- paste0(c("(.{1,", limit,"})", rep("\\1", reps - 1L)), collapse="")
  match <- regmatches(vec, regexpr(regexp, vec, perl=TRUE))
  substr(match, 1, nchar(match) / reps)
}
sapply(vec, find_rep_path, reps=3L, limit=3L)
# a0cc0vaaaabaaaabaaaabaa00bvw ff00f0f0f0f0f0f0f0f0000
# "a" "0f"
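For readers who want to try the back-reference idea outside R, here is a small Python sketch of the same technique using the standard re module. It is not the R code above: the function name find_repeated_unit is made up for this illustration, and reading the capture group directly replaces the substr step. Like the R version, it returns the unit found at the leftmost position where a repetition exists, and long inputs may backtrack heavily when no limit is given.

```python
import re


def find_repeated_unit(text, reps, limit=None):
    """Return the first unit repeated `reps` times in a row, or None if there is none."""
    # "(.+)" for an unbounded unit, "(.{1,limit})" when the unit length is capped.
    unit = ".+" if limit is None else ".{1,%d}" % limit
    # One capture group followed by reps - 1 back references to it.
    pattern = "(" + unit + ")" + r"\1" * (reps - 1)
    match = re.search(pattern, text)
    return match.group(1) if match else None
```

Calling it on the two test strings with reps=3 reproduces the units reported above, and passing limit=3 reproduces the capped results.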
find.string finds substring of maximum length subject to (1) substring must be repeated consecutively at least th times and (2) substring length must be no longer than len.
reps <- function(s, n) paste(rep(s, n), collapse = "") # repeat s n times
find.string <- function(string, th = 3, len = floor(nchar(string)/th)) {
  for(k in len:1) {
    pat <- paste0("(.{", k, "})", reps("\\1", th-1))
    r <- regexpr(pat, string, perl = TRUE)
    if (attr(r, "capture.length") > 0) break
  }
  if (r > 0) substring(string, r, r + attr(r, "capture.length")-1) else ""
}
and here are some tests. The last test processes the entire text of James Joyce's Ulysses in 1.4 seconds on my laptop:
> find.string("a0cc0vaaaabaaaabaaaabaa00bvw")
[1] "aaaab"
> find.string("ff00f0f0f0f0f0f0f0f0000")
[1] "0f0f"
>
> joyce <- readLines("http://www.gutenberg.org/files/4300/4300-8.txt")
> joycec <- paste(joyce, collapse = " ")
> system.time(result <- find.string(joycec, len = 25))
   user  system elapsed
   1.36    0.00    1.39
> result
[1] " Hoopsa boyaboy hoopsa!"
ADDED
Although I developed my answer before having seen BrodieG's, as he points out they are very similar to each other. I have added some features of his to the above to get the solution below and tried the tests again. Unfortunately when I added the variation of his code the James Joyce example no longer works although it does work on the other two examples shown. The problem seems to be in adding the len constraint to the code and may represent a fundamental advantage of the code above (i.e. it can handle such a constraint and such constraints may be essential for very long strings).
find.string2 <- function(string, th = 3, len = floor(nchar(string)/th)) {
  pat <- paste0(c("(.", "{1,", len, "})", rep("\\1", th-1)), collapse = "")
  r <- regexpr(pat, string, perl = TRUE)
  ifelse(r > 0, substring(string, r, r + attr(r, "capture.length")-1), "")
}
> find.string2("a0cc0vaaaabaaaabaaaabaa00bvw")
[1] "aaaab"
> find.string2("ff00f0f0f0f0f0f0f0f0000")
[1] "0f0f"
> system.time(result <- find.string2(joycec, len = 25))
user system elapsed
0 0 0
> result
[1] "w"
REVISED The James Joyce test that was supposed to be testing find.string2 was actually using find.string. This is now fixed.
This function is not optimized (even though it is fast), but I think it is a more R-like way to do this:
1. Get all patterns of a certain length > threshold: vectorized using mapply and substr.
2. Get the occurrences of these patterns and extract the one with the maximum number of occurrences: vectorized using str_locate_all.
3. Repeat steps 1-2 for all lengths and take the one with the maximum number of occurrences.
Here is my code. I am creating two functions, one for steps 1-2 and one for step 3:
library(stringr)
ss = "ff00f0f0f0f0f0f0f0f0000"
ss <- "a0cc0vaaaabaaaabaaaabaa00bvw"
find_pattern_length <-
  function(length=1,ss){
    patt = mapply(function(x,y) substr(ss,x,y),
                  1:(nchar(ss)-length),
                  (length+1):nchar(ss))
    res = str_locate_all(ss,unique(patt))
    ll = unlist(lapply(res,length))
    list(patt = patt[which.max(ll)],
         rep = max(ll))
  }
get_pattern_threshold <-
  function(ss,threshold =3 ){
    res <-
      sapply(seq(threshold,nchar(ss)),find_pattern_length,ss=ss)
    res[,which.max(res['rep',])]
  }
some tests:
get_pattern_threshold('ff00f0f0f0f0f0f0f0f0000',5)
$patt
[1] "0f0f0"
$rep
[1] 6
> get_pattern_threshold('ff00f0f0f0f0f0f0f0f0000',2)
$patt
[1] "f0"
$rep
[1] 18
Since you want at least three repetitions, there is a nice O(n^2) approach.
For each possible pattern length d cut string into parts of length d. In case of d=5 it would be:
a0cc0
vaaaa
baaaa
baaaa
baa00
bvw
Now look at each pair of subsequent strings A[k] and A[k+1]. If they are equal, then there is a pattern with at least two repetitions. Then go further (k+2, k+3, and so on). Finally, also check whether the suffix of A[k-1] and the prefix of A[k+n] fit (where k+n is the first string that doesn't match).
Repeat it for each d starting from some upper bound (at most n/3).
You have n/3 possible lengths, then n/d strings of length d to check for each d. It should give complexity O(n (n/d) d)= O(n^2).
Maybe not optimal but I found this cutting idea quite neat ;)
For a bounded pattern (i.e not huge) it's best I think to just create all possible substrings first and then count them. This is if the sub-patterns can overlap. If not change the step fun in the loop.
pat="a0cc0vaaaabaaaabaaaabaa00bvw"
len=nchar(pat)
thr=3
reps=floor(len/2)
# all possible substrings up to half the length of the pattern
library(zoo)  # provides rollapply
pat=strsplit(pat, "")[[1]]  # split into single characters (base R)
str.vec=vector()
for(win in 2:reps)
{
  str.vec= c(str.vec, rollapply(data=pat,width=win,FUN=paste0, collapse=""))
}
# the longest string that occurs at least 3 times
tbl=table(str.vec)
tbl=tbl[tbl>=3]
tbl[which.max(nchar(names(tbl)))]
aaaabaa
3
NB Whilst I'm lazy and append/grow the str.vec here in a loop, for a larger problem I'm pretty sure the actual length of str.vec is predetermined by the length of the pattern if you care to work it out.
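The same "enumerate every window, then count" idea can be sketched in Python with collections.Counter, which avoids growing a vector in a loop. This is an illustration rather than a port of the R code: the function name and its parameters are made up here, overlapping occurrences are counted (as in the table above), and ties between equally long substrings are broken by whichever was seen first.

```python
from collections import Counter


def longest_frequent_substring(text, threshold=3, min_width=2):
    """Return (substring, count) for the longest substring seen at least `threshold` times."""
    max_width = len(text) // 2
    # Count every window of every width from min_width up to half the text length.
    counts = Counter(
        text[i:i + width]
        for width in range(min_width, max_width + 1)
        for i in range(len(text) - width + 1)
    )
    frequent = [sub for sub, n in counts.items() if n >= threshold]
    if not frequent:
        return None
    best = max(frequent, key=len)
    return best, counts[best]
```

On the example string this returns the same window as the table above, "aaaabaa" with a count of 3, because the three occurrences overlap inside the repeated region.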
Here is my solution. It's not optimized (it grows a vector with patterns <- c() ; patterns <- c(patterns, x), for example) and can be improved, but I think it is simpler than yours.
I don't understand exactly which pattern should be returned (I just return the most frequent one), but you can adjust the code to do what you want.
str <- "a0cc0vaaaabaaaabaaaabaa00bvw"
findPatternMax <- function(str){
  nb <- nchar(str):1
  length.patt <- rev(nb)
  patterns <- c()
  for (i in 1:length(nb)){
    for (j in 1:nb[i]){
      patterns <- c(patterns, substr(str, j, j+(length.patt[i]-1)))
    }
  }
  patt.max <- names(which(table(patterns) == max(table(patterns))))
  return(patt.max)
}
findPatternMax(str)
> findPatternMax(str)
[1] "a"
EDIT :
Maybe you want the returned pattern to have a minimum length?
Then you can add an nchar.patt parameter, for example:
nchar.patt <- 2 #For a pattern of 2 char min
nb <- nb[length.patt >= nchar.patt]
length.patt <- length.patt[length.patt >= nchar.patt]

Resources