Error in script to count the number of occurrences - groovy

I am entirely new to Groovy scripting and need help.
I tried the following script to count the number of lines that contain a specific text occurrence.
Error observed:
groovy.lang.MissingMethodException: No signature of method: java.io.File.eachline() is applicable for argument types: (Hemanth_v1$_run_closure2) values: [Hemanth_v1$_run_closure2@10f39d0]
Possible solutions: eachLine(groovy.lang.Closure), eachLine(int, groovy.lang.Closure), eachLine(java.lang.String, groovy.lang.Closure), eachFile(groovy.lang.Closure), eachLine(java.lang.String, int, groovy.lang.Closure), eachFile(groovy.io.FileType, groovy.lang.Closure)
Script:
def file = new File('C:\\NE\\header.txt');
count = 0
def data1= file.filterLine { line ->
line.contains('smtpCus:');
}
//custom code by Hemanth
file.eachline { line, count ->
if (line.contains('Received:')) {
count++
}
}

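There are two issues with the script you've shown us:
there is a typo in file.eachline - it should be file.eachLine (Groovy method names are case-sensitive)
the closure passed to eachLine declares a second parameter named count. eachLine passes the line number as that second argument, and the parameter shadows the outer count variable, so count++ only changes the closure-local value and the outer count remains 0 after the execution.
Here is what your script should look like: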
def file = new File('C:\\NE\\header.txt')
count = 0
def data1 = file.filterLine { line ->
line.contains('smtpCus:')
}
//custom code by Hemanth
file.eachLine { line ->
if (line.contains('Received:')) {
count++
}
}
println count
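The shadowing problem can be checked in isolation. The sketch below is only an illustration: it writes a small temporary file (a stand-in for header.txt, not part of the original script) and counts the matching lines twice, once with eachLine and once with readLines().count, which avoids the mutable counter altogether.

```groovy
// Hypothetical sample data standing in for C:\NE\header.txt
def file = File.createTempFile('header', '.txt')
file.deleteOnExit()
file.text = 'Received: from a\nSubject: test\nReceived: from b\n'

// eachLine supplies the line number as the second closure argument,
// so that parameter must not reuse the name of the outer counter.
def count = 0
file.eachLine { line, lineNumber ->
    if (line.contains('Received:')) {
        count++
    }
}
println count

// Alternative without a mutable counter: count the matching lines directly.
def viaCount = file.readLines().count { it.contains('Received:') }
println viaCount
```

Note that readLines() loads the whole file into memory, so the second variant suits small files only.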
Reading file as java.util.stream.Stream<T>
There is one more thing worth mentioning when it comes to reading files in Groovy (and Java in general). If you work with a huge file, it's good practice to read it lazily with the Java Stream API - Files.lines(path)
import java.nio.file.Files
import java.nio.file.Paths
long counter = Files.lines(Paths.get('C:\\NE\\header.txt'))
.filter { line -> line.contains('Received:') }
.count()
println counter
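The stream returned by Files.lines keeps the file open until it is closed, so longer-running code usually closes it explicitly. A minimal sketch of that, assuming a temporary sample file in place of header.txt (the file and its contents are made up for illustration):

```groovy
import java.nio.file.Files

// Hypothetical sample file standing in for C:\NE\header.txt
def path = Files.createTempFile('header', '.txt')
path.toFile().deleteOnExit()
path.toFile().text = 'Received: from a\nSubject: test\nReceived: from b\n'

def lines = Files.lines(path)
long counter
try {
    // The closure is coerced to java.util.function.Predicate
    counter = lines.filter { line -> line.contains('Received:') }.count()
} finally {
    // Release the underlying file handle
    lines.close()
}
println counter
```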

Groovy methods use camel case. It should be eachLine() instead of eachline(). See if that helps!

Related

Collecting a GPars loop to a Map

I need to iterate on a List and for every item run a time-expensive operation and then collect its results to a map, something like this:
List<String> strings = ['foo', 'bar', 'baz']
Map<String, Object> result = strings.collectEntries { key ->
[key, expensiveOperation(key)]
}
So that then my result is something like
[foo: <an object>, bar: <another object>, baz: <another object>]
Since the operations I need to run take a long time and don't depend on each other, I'd like to use GPars to run the loop in parallel.
However, GPars has a collectParallel method that loops through a collection in parallel and collects to a List, but no collectEntriesParallel that collects to a Map. What's the correct way to do this with GPars?
There is no collectEntriesParallel because it would have to produce the same result as:
collectParallel {}.collectEntries {}
as Tim mentioned in the comment. It is hard to reduce a list of values to a map (or any other mutable container) deterministically in parallel. The reliable way is to collect the results to a list in parallel and then build the map entries sequentially. Consider the following sequential example:
import groovyx.gpars.GParsPool // requires the GPars library on the classpath
static def expensiveOperation(String key) {
Thread.sleep(1000)
return key.reverse()
}
List<String> strings = ['foo', 'bar', 'baz']
GParsPool.withPool {
def result = strings.inject([:]) { seed, key ->
println "[${Thread.currentThread().name}] (${System.currentTimeMillis()}) seed = ${seed}, key = ${key}"
seed + [(key): expensiveOperation(key.toString())]
}
println result
}
In this example we use Collection.inject(initialValue, closure), which is the equivalent of the good old "fold left" operation: it starts with the initial value [:], iterates over all values and adds each one as a key-value pair to the map. Sequential execution in this case takes approximately 3 seconds (each expensiveOperation() call sleeps for 1 second).
Console output:
[main] (1519925046610) seed = [:], key = foo
[main] (1519925047773) seed = [foo:oof], key = bar
[main] (1519925048774) seed = [foo:oof, bar:rab], key = baz
[foo:oof, bar:rab, baz:zab]
And this is basically what collectEntries() does - it is a kind of reduction operation where the initial value is an empty map.
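To make the equivalence concrete, here is a small sketch without the sleep (key.reverse() stands in for the expensive operation) that builds the same map with inject and with collectEntries:

```groovy
def strings = ['foo', 'bar', 'baz']

// Fold left: start from an empty map and add one entry per element
def viaInject = strings.inject([:]) { acc, key -> acc + [(key): key.reverse()] }

// collectEntries performs the same reduction with the empty map implied
def viaCollectEntries = strings.collectEntries { key -> [(key): key.reverse()] }

println viaInject
println viaCollectEntries
```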
Now let's see what happens if we try to parallelize it - instead of inject we will use the injectParallel method:
GParsPool.withPool {
def result = strings.injectParallel([:]) { seed, key ->
println "[${Thread.currentThread().name}] (${System.currentTimeMillis()}) seed = ${seed}, key = ${key}"
seed + [(key): expensiveOperation(key.toString())]
}
println result
}
Here is the result:
[ForkJoinPool-1-worker-1] (1519925323803) seed = foo, key = bar
[ForkJoinPool-1-worker-2] (1519925323811) seed = baz, key = [:]
[ForkJoinPool-1-worker-1] (1519925324822) seed = foo[bar:rab], key = baz[[:]:]:[]
foo[bar:rab][baz[[:]:]:[]:][:]:]:[[zab]
As you can see, the parallel version of inject does not care about the order (which is expected): for example, the first thread received foo as the seed variable and bar as the key, and the initial [:] even shows up as a key. This is what can happen when a reduction to a map (or any mutable object) is performed in parallel and without a specific order.
Solution
There are two ways to parallelize the process:
1. collectParallel + collectEntries combination
As Tim Yates mentioned in the comment, you can run the expensive operation in parallel and then collect the results to a map sequentially:
import groovyx.gpars.GParsPool // requires the GPars library on the classpath
static def expensiveOperation(String key) {
Thread.sleep(1000)
return key.reverse()
}
List<String> strings = ['foo', 'bar', 'baz']
GParsPool.withPool {
def result = strings.collectParallel { [it, expensiveOperation(it)] }.collectEntries { [(it[0]): it[1]] }
println result
}
This example executes in approximately 1 second and produces the following output:
[foo:oof, bar:rab, baz:zab]
2. Java's parallel stream
Alternatively you can use Java's parallel stream with Collectors.toMap() reducer function:
import java.util.function.Function
import java.util.stream.Collectors
static def expensiveOperation(String key) {
Thread.sleep(1000)
return key.reverse()
}
List<String> strings = ['foo', 'bar', 'baz']
def result = strings.parallelStream()
.collect(Collectors.toMap(Function.identity(), { str -> expensiveOperation(str)}))
println result
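This example also executes in approximately 1 second and produces output like this (the key order may differ, because Collectors.toMap() makes no guarantee about the ordering of the returned map):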
[bar:rab, foo:oof, baz:zab]
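If the order of the keys matters, the four-argument form of Collectors.toMap() lets you supply the map implementation. The following is a sketch under the assumption that a LinkedHashMap is acceptable; the closure expensiveOperation is a stand-in without the sleep, and the merge function simply keeps the first value because the keys here are unique.

```groovy
import java.util.function.BinaryOperator
import java.util.function.Function
import java.util.function.Supplier
import java.util.stream.Collectors

// Stand-in for the real expensive call (no sleep, for brevity)
def expensiveOperation = { String key -> key.reverse() }

List<String> strings = ['foo', 'bar', 'baz']

def collector = Collectors.toMap(
        Function.identity(),
        { key -> expensiveOperation(key) } as Function,
        { first, second -> first } as BinaryOperator,   // keys are unique, keep the first value
        { new LinkedHashMap() } as Supplier)            // insertion-ordered result map

def result = strings.parallelStream().collect(collector)
println result
```

Because the source list is ordered and toMap() is not an unordered collector, the partial maps are merged in encounter order, so the keys should come out as foo, bar, baz.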
Hope it helps.

How to sort elements in an array or just print them sorted?

I wrote the following Groovy code, which returns an array of the CIDR blocks in use across the 3 AWS regions we use. The results populate a Jenkins extended parameter:
def regions = ['us-west-2', 'us-east-1', 'eu-west-1']
def output = []
regions.each { region ->
def p = ['/usr/local/bin/aws', 'ec2', 'describe-vpcs', '--region', region].execute() | 'grep -w CidrBlock'.execute() | ['awk', '{print $2}'].execute() | ['tr', '-d', '"\\"\\|,\\|\\{\\|\\\\["'].execute() | 'uniq'.execute()
p.waitFor()
p.text.eachLine { line ->
output << line
}
}
output.each {
println it
}
The output of the code looks like so:
172.31.0.0/16
172.56.0.0/16
172.55.0.0/16
172.64.0.0/16
172.52.0.0/16
I would like to sort the output numerically. Can it be done?
Edit #1:
If I use ".sort()" I get the following error:
Caught: groovy.lang.MissingMethodException: No signature of method: java.lang.String.sort() is applicable for argument types: () values: []
Possible solutions: drop(int), tr(java.lang.CharSequence, java.lang.CharSequence), wait(), toSet(), size(), size()
groovy.lang.MissingMethodException: No signature of method: java.lang.String.sort() is applicable for argument types: () values: []
Possible solutions: drop(int), tr(java.lang.CharSequence, java.lang.CharSequence), wait(), toSet(), size(), size()
at populate_parameter_with_used_cidrs$_run_closure2.doCall(populate_parameter_with_used_cidrs.groovy:15)
at populate_parameter_with_used_cidrs.run(populate_parameter_with_used_cidrs.groovy:14)
Some general hints about your code first:
p.waitFor() is not necessary if you do p.text, as this waits for the process to finish first anyway.
To get a list of Strings for the lines of a multi-line String, you can simply use readLines().
To transform one list into another list you can use collect() or collectMany().
This would boil your code down to
def regions = ['us-west-2', 'us-east-1', 'eu-west-1']
def output = regions.collectMany { region ->
// The whole pipeline is parenthesized: otherwise .text.readLines() would bind only to 'uniq'.execute()
(['/usr/local/bin/aws', 'ec2', 'describe-vpcs', '--region', region].execute() | 'grep -w CidrBlock'.execute() | ['awk', '{print $2}'].execute() | ['tr', '-d', '"\\"\\|,\\|\\{\\|\\\\["'].execute() | 'uniq'.execute()).text.readLines()
}
output.each { println it }
And to get number-aware sorting, add this to it:
output = output.sort { a, b ->
def aparts = a.split('[./]').collect { it as short }
def bparts = b.split('[./]').collect { it as short }
(0..4).collect { aparts[it] <=> bparts[it] }.find() ?: 0
}
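The difference from a plain string sort only shows up when the octets have different numbers of digits. The sketch below uses made-up CIDR blocks (not the ones from the question) and a helper closure named parts, introduced here for illustration, to compare the two orderings:

```groovy
// Hypothetical sample data with octets of different widths
def cidrs = ['172.31.0.0/16', '172.56.0.0/16', '10.0.0.0/8', '172.5.0.0/16', '9.255.0.0/16']

// Split "a.b.c.d/n" into five integers
def parts = { String cidr -> cidr.split('[./]').collect { it.toInteger() } }

// sort(false) returns a sorted copy and leaves the original list untouched
def numeric = cidrs.sort(false) { a, b ->
    def ap = parts(a)
    def bp = parts(b)
    // The first non-zero comparison decides; 0 means the entries are equal
    (0..4).collect { ap[it] <=> bp[it] }.find() ?: 0
}
def lexical = cidrs.sort(false)

println numeric
println lexical
```

With this input the numeric sort puts 9.255.0.0/16 first, while the string sort puts it last.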
How about .sort()?
def list = ['172.31.0.0/16', '172.56.0.0/16', '172.55.0.0/16', '172.64.0.0/16', '172.52.0.0/16']
println list.sort()
As an option: to sort and remove duplicates
(output as SortedSet).each {
println it
}

Groovy find the last iteration inside a closure?

In Groovy, how can I find the last iteration inside a closure?
def closure = { it->
//here I need to print last line only
}
new File (file).eachLine{ closure(it)}
I need to detect the last iteration from inside the closure.
Update 1:
Instead of reading a file, how can I find the last iteration inside a closure in general?
def closure = { it->
//Find last iteration here
}
I guess you need eachWithIndex:
def f = new File('TODO')
def lines = f.readLines().size()
def c = { l, i ->
if(i == lines - 1) {
println "last: $i $l"
}
}
f.eachWithIndex(c)
Of course, for big files you need a more efficient way to count the lines.
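For the general case from the update, two simple options come to mind. The sketch below is one possible reading, with made-up sample data: for a collection whose size is known, compare the index with size() - 1; for a file, keep the most recent line in a variable so that it holds the last line once the single pass has finished.

```groovy
// Known size: compare the index against the last valid index
def items = ['a', 'b', 'c']
def lastByIndex = null
items.eachWithIndex { item, i ->
    if (i == items.size() - 1) {
        lastByIndex = item
    }
}

// Unknown size (e.g. a file): one pass, remembering the most recent line
def file = File.createTempFile('lines', '.txt')   // hypothetical sample file
file.deleteOnExit()
file.text = 'first\nsecond\nthird\n'
def lastLine = null
file.eachLine { line -> lastLine = line }

println "last item: $lastByIndex, last line: $lastLine"
```

The second variant cannot know inside the closure that the current line is the last one; it only knows afterwards, which avoids reading the file twice.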

Using MetaProgramming to Add collectWithIndex and injectWithIndex similar to eachWithIndex

Please help with a metaprogramming configuration such that I can add collections methods called collectWithIndex and injectWithIndex that work in a similar manner to eachWithIndex but of course include the base functionality of collect and inject. The new methods would accept a two (three with maps) argument closure just like eachWithIndex. I would like to have the capability to utilize these methods across many different scripts.
Use case:
List one = [1, 2, 3]
List two = [10, 20, 30]
assert [10, 40, 90] == one.collectWithIndex { value, index ->
value * two [index]
}
Once the method is developed then how would it be made available to scripts? I suspect that a jar file would be created with special extension information and then added to the classpath.
Many thanks in advance
I'm still sure this is not a proper SO question, but here is an example of how you can enrich the metaclass for multiple scripts.
The idea is based on a base script that adds the required method to List's metaClass in its constructor. You have to implement the collect logic yourself, though it's pretty easy. Alternatively, you can use a wrapping closure, as described after this example.
import org.codehaus.groovy.control.CompilerConfiguration
class WithIndexInjector extends Script {
WithIndexInjector() {
println("Adding collectWithIndex to List")
List.metaClass.collectWithIndex {
int i = 0
def result = []
for (o in delegate) // delegate is a ref holding initial list.
result << it(o, i++) // it is closure given to method
result
}
}
@Override Object run() {
return null
}
}
def configuration = new CompilerConfiguration()
configuration.scriptBaseClass = WithIndexInjector.name
new GroovyShell(configuration).evaluate('''
println(['a', 'b'].collectWithIndex { it, id -> "[$id]:$it" })
''')
// will print [[0]:a, [1]:b]
If you'd like to do it in a more functional way, without repeating the collect logic, you can use a wrapping proxy closure. I expect it to be slower, but that may not matter. Just replace collectWithIndex with the following implementation.
List.metaClass.collectWithIndex {
def wrappingProxyClosure = { Closure collectClosure, int startIndex = 0 ->
int i = startIndex
return {
// "it" is the list element supplied by the default collect method.
// The returned closure holds on to the outer collectClosure and i, and calls collectClosure with the index as an extra argument.
collectClosure(it, i++)
}
}
delegate.collect(wrappingProxyClosure(it))
}
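The question also asks for injectWithIndex, which can be added the same way. The following is a self-contained sketch rather than a definitive implementation: to keep it simple it registers both methods on ArrayList (the class of Groovy list literals) instead of List, and the parameter names body, initial and acc are chosen here for illustration.

```groovy
// collect with an index: body receives (item, index)
ArrayList.metaClass.collectWithIndex = { Closure body ->
    def result = []
    delegate.eachWithIndex { item, index -> result << body(item, index) }
    result
}

// inject with an index: body receives (accumulator, item, index)
ArrayList.metaClass.injectWithIndex = { initial, Closure body ->
    def acc = initial
    delegate.eachWithIndex { item, index -> acc = body(acc, item, index) }
    acc
}

List one = [1, 2, 3]
List two = [10, 20, 30]

def products = one.collectWithIndex { value, index -> value * two[index] }
def weighted = one.injectWithIndex(0) { acc, value, index -> acc + value * index }

println products
println weighted
```

Both definitions could equally live in the constructor of the base script class shown earlier, so that every script evaluated with that configuration gets them.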
Off-topic: in the SO community, your question as written will only attract downvotes, not answers.

print the closure definition/source in Groovy

Does anyone know how to print the source of a closure in Groovy?
For example, I have this closure (bound to a)
def a = { it.twice() }
I would like to have the String "it.twice()" or "{ it.twice() }"
Just a simple toString of course won't work:
a.toString(); //results in: Script1$_run_closure1_closure4_closure6@12f1bf0
short answer is you can't. long answer is:
depending on what you need the code for, you could perhaps get away with
// file: example1.groovy
def a = { it.twice() }
println a.metaClass.classNode.getDeclaredMethods("doCall")[0].code.text
// prints: { return it.twice() }
BUT
you will need the source code of the script available in the classpath AT RUNTIME as explained in
groovy.lang.MetaClass#getClassNode()
"Obtains a reference to the original AST for the MetaClass if it is available at runtime
@return The original AST or null if it cannot be returned"
AND
the text trick does not really return the same code, just a code-like representation of the AST, as can be seen in this script
// file: example2.groovy
def b = {p-> p.twice() * "p"}
println b.metaClass.classNode.getDeclaredMethods("doCall")[0].code.text
// prints: { return (p.twice() * p) }
still, it might be useful as it is if you just want to take a quick look
AND, if you have too much time on your hands and don't know what to do you could write your own org.codehaus.groovy.ast.GroovyCodeVisitor to pretty print it
OR, just steal an existing one like groovy.inspect.swingui.AstNodeToScriptVisitor
// file: example3.groovy
def c = {w->
[1,2,3].each {
println "$it"
(1..it).each {x->
println 'this seems' << ' somewhat closer' << ''' to the
original''' << " $x"
}
}
}
def node = c.metaClass.classNode.getDeclaredMethods("doCall")[0].code
def writer = new StringWriter()
node.visit new groovy.inspect.swingui.AstNodeToScriptVisitor(writer)
println writer
// prints: return [1, 2, 3].each({
// this.println("$it")
// return (1.. it ).each({ java.lang.Object x ->
// return this.println('this seems' << ' somewhat closer' << ' to the \n original' << " $x")
// })
// })
now.
if you want the original, exact, runnable code ... you are out of luck
i mean, you could use the source line information, but last time i checked, it wasn't really getting them right
// file: example1.groovy
....
def code = a.metaClass.classNode.getDeclaredMethods("doCall")[0].code
println "$code.lineNumber $code.columnNumber $code.lastLineNumber $code.lastColumnNumber"
new File('example1.groovy').readLines()
... etc etc you get the idea.
line numbers should be at least near the original code though
That isn't possible in groovy. Even when a groovy script is run directly, without compiling it first, the script is converted into JVM bytecode. Closures aren't treated any differently, they are compiled like regular methods. By the time the code is run, the source code isn't available any more.
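If you control how the closure is created, one workaround is to start from the source text and keep it next to the compiled closure. This is only a sketch of that idea, not something the answers above propose: the closure body { it * 2 } and the variable names are made up for illustration, and the text is compiled with GroovyShell.

```groovy
// Keep the source text yourself; the compiled closure does not retain it
String source = '{ it * 2 }'

// "return" avoids the ambiguity between a closure and a bare code block
Closure doubler = (Closure) new GroovyShell().evaluate("return $source")

def result = doubler(21)
println result
println source
```

This does not help with closures that already exist in compiled code, but it gives an exact, runnable source string for closures that are defined as text in the first place.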

Resources