Groovy grep words in file - groovy

I want to grep for words in the files under a given path. How can I do this the Groovy way, and how do I count how many matches I find in each file?
import groovy.io.FileType
def splitStatements() {
String path = "C:\\Users\\John\\test"
def result = new AntBuilder().fileset( dir: path ) {
containsregexp expression: 'END|BEGIN'
}*.file
println result
}
splitStatements()

This does what I want:
import groovy.io.FileType
def wordCount_END = 0
def wordCount_BEGIN = 0
def dir = new File("C:\\Users\\John\\test")
// Walk the tree once and count both words in the same pass
dir.eachFileRecurse (FileType.FILES) { file ->
    // withCloseable closes the Scanner even if reading fails
    new Scanner(file).withCloseable { s ->
        while (s.hasNext()) {
            def word = s.next()
            if (word == 'END') wordCount_END++
            else if (word == 'BEGIN') wordCount_BEGIN++
        }
    }
}
println("END count per lock: " + wordCount_END)
println("BEGIN count per lock: " + wordCount_BEGIN)
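For a count per file rather than one total, a minimal sketch could look like the following. The helper name countWordsPerFile and the choice of the file path as the map key are assumptions made for illustration; tokens are split on whitespace, which is roughly what Scanner does by default.

```groovy
import groovy.io.FileType

// Hypothetical helper: for every file under dir, count how often each
// word appears as a whole whitespace-separated token.
def countWordsPerFile(File dir, List<String> words) {
    def result = [:]
    dir.eachFileRecurse(FileType.FILES) { File file ->
        // tokenize() with no arguments splits on whitespace
        def tokens = file.text.tokenize()
        result[file.path] = words.collectEntries { w -> [(w): tokens.count(w)] }
    }
    result
}

// Example call; the directory is the one used in the question
// countWordsPerFile(new File("C:\\Users\\John\\test"), ['BEGIN', 'END']).each { path, counts ->
//     println "$path: $counts"
// }
```

This reads each file fully into memory, so it suits small and medium files; for very large files the Scanner loop above is the safer choice.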

Related

Finding a String from list in a String is not efficient enough

def errorList = readFile WORKSPACE + "/list.txt"
def knownErrorListbyLine = errorList.readLines()
def build_log = new URL (Build_Log_URL).getText()
def found_errors = null
for(knownError in knownErrorListbyLine) {
if (build_log.contains(knownError)) {
found_errors = build_log.readLines().findAll{ it.contains(knownError) }
for(error in found_errors) {
println "FOUND ERROR: " + error
}
}
}
I wrote this code to find listed errors in a string, but it takes about 20 seconds.
How can I improve the performance? I would love to learn from this.
Thanks a lot!
list.txt contains a string per line:
Step ... was FAILED
[ERROR] Pod-domainrouter call failed
#type":"ErrorExtender
[postDeploymentSteps] ... does not exist.
etc...
The build log is where I need to find these errors.
Try this:
def errorList = readFile WORKSPACE + "/list.txt"
def knownErrorListbyLine = errorList.readLines()
def build_log = new URL (Build_Log_URL)
def found_errors = null
for(knownError in knownErrorListbyLine) {
build_log.eachLine{
if ( it.contains(knownError) ) {
println "FOUND ERROR: " + it
}
}
}
This might be even more performant:
def errorList = readFile WORKSPACE + "/list.txt"
def knownErrorListbyLine = errorList.readLines()
def build_log = new URL (Build_Log_URL)
def found_errors = null
build_log.eachLine{
for(knownError in knownErrorListbyLine) {
if ( it.contains(knownError) ) {
println "FOUND ERROR: " + it
}
}
}
Alternatively, use the last variant but download the log once with getText() and call eachLine on the resulting String:
def errorList = readFile WORKSPACE + "/list.txt"
def knownErrorListbyLine = errorList.readLines()
def build_log = new URL (Build_Log_URL).getText()
def found_errors = null
build_log.eachLine{
for(knownError in knownErrorListbyLine) {
if ( it.contains(knownError) ) {
println "FOUND ERROR: " + it
}
}
}
Try moving build_log.readLines() into a variable outside the loop, so the log is split into lines only once:
def errorList = readFile WORKSPACE + "/list.txt"
def knownErrorListbyLine = errorList.readLines()
def build_log = new URL (Build_Log_URL).getText()
def found_errors = null
def buildLogByLine = build_log.readLines()
for(knownError in knownErrorListbyLine) {
if (build_log.contains(knownError)) {
found_errors = buildLogByLine.findAll{ it.contains(knownError) }
for(error in found_errors) {
println "FOUND ERROR: " + error
}
}
}
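Another option, offered as a sketch and not as a measured improvement, is to scan the log in a single pass with one combined pattern. The helper name findKnownErrors is hypothetical. Each known error is quoted with Pattern.quote, so it is matched literally, like contains() in the code above. Unlike the loops above, a line that matches several known errors is reported only once.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: returns every log line that contains at least one
// of the known error strings, in log order.
def findKnownErrors(String buildLog, List<String> knownErrors) {
    def literals = knownErrors.findAll { it }   // drop empty entries
    if (!literals) return []
    // One alternation of literal strings: quoted1|quoted2|...
    def pattern = Pattern.compile(literals.collect { Pattern.quote(it) }.join('|'))
    buildLog.readLines().findAll { pattern.matcher(it).find() }
}

// Example call with the variables from the question
// findKnownErrors(build_log, knownErrorListbyLine).each { println "FOUND ERROR: " + it }
```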
Update: using multiple threads
Note: this may help if errorList is large enough and the matching errors are distributed evenly across it.
def sublists = knownErrorListbyLine.collate(x)
// int x - the sublist size,
// depends on the knownErrorListbyLine size, set the value to get e. g. 4 sublists (threads).
// Also do not use more than 2 threads per CPU. Start from 1 thread per CPU.
def logsWithErrors = []// list for store results per thread
def lock = new Object()
def threads = sublists.collect { errorSublist ->
Thread.start {
def logs = build_log.readLines()
errorSublist.findAll { build_log.contains(it) }.each { error ->
def results = logs.findAll { it.contains(error) }
synchronized(lock) {
logsWithErrors << results
}
}
}
}
threads*.join() // wait for all threads to finish
logsWithErrors.flatten().each {
println "FOUND ERROR: $it"
}
Also, as another user suggested earlier, try measuring the log download time, since it could be the bottleneck:
def errorList = readFile WORKSPACE + "/list.txt"
def knownErrorListbyLine = errorList.readLines()
def start = Calendar.getInstance().timeInMillis
def build_log = new URL(Build_Log_URL).getText()
def end = Calendar.getInstance().timeInMillis
println "Logs download time: ${(end-start)/1000} s"
def found_errors = null

Download a zip file using Groovy

I need to download a zip file from a url using groovy.
Test url: https://gist.github.com/daicham/5ac8461b8b49385244aa0977638c3420/archive/17a929502e6dda24d0ecfd5bb816c78a2bd5a088.zip
What I've done so far:
def static downloadArtifacts(url,filename) {
new URL(url).openConnection().with { conn ->
conn.setRequestProperty("PRIVATE-TOKEN", "xxxx")
url = conn.getHeaderField( "Location" )
if( !url ) {
new File((String)filename ).withOutputStream { out ->
conn.inputStream.with { inp ->
out << inp
inp.close()
}
}
}
}
}
But while opening the downloaded zip file I get an error "An error occurred while loading the archive".
Any help is appreciated.
URL url2download = new URL(url)
File file = new File(filename)
file.bytes = url2download.bytes
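The snippet above loads the whole download into memory before writing it. A streaming variant, as a minimal sketch (the helper name downloadTo is an assumption, and no request headers such as the PRIVATE-TOKEN from the question are set here):

```groovy
// Hypothetical helper: copies the content behind a URL into a file
// without holding the whole payload in memory.
File downloadTo(String url, String filename) {
    def target = new File(filename)
    new URL(url).withInputStream { inp ->
        // Both streams are closed when their closures return
        target.withOutputStream { out -> out << inp }
    }
    target
}

// Example call with placeholder arguments
// downloadTo(url, 'archive.zip')
```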
You can do it with HttpBuilder-NG:
// https://http-builder-ng.github.io/http-builder-ng/
@Grab('io.github.http-builder-ng:http-builder-ng-core:1.0.3')
import groovyx.net.http.HttpBuilder
import groovyx.net.http.optional.Download
def target = 'https://gist.github.com/daicham/5ac8461b8b49385244aa0977638c3420/archive/17a929502e6dda24d0ecfd5bb816c78a2bd5a088.zip'
File file = HttpBuilder.configure {
request.uri = target
}.get {
Download.toFile(delegate, new File('a.zip'))
}
You can do it:
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream
class SampleZipController {
def index() { }
def downloadSampleZip() {
response.setContentType('APPLICATION/OCTET-STREAM')
response.setHeader('Content-Disposition', 'Attachment;Filename="example.zip"')
ZipOutputStream zip = new ZipOutputStream(response.outputStream);
def file1Entry = new ZipEntry('first_file.txt');
zip.putNextEntry(file1Entry);
zip.write("This is the content of the first file".bytes);
def file2Entry = new ZipEntry('second_file.txt');
zip.putNextEntry(file2Entry);
zip.write("This is the content of the second file".bytes);
zip.close();
}
}

Groovy function with parameter

I have a class that parses a CSV-based file, and I would like to add a parameter for the separator token.
How should I change the function, and how do I then call it in my program?
class CSVParser{
static def parseCSV(file,closure) {
def lineCount = 0
file.eachLine() { line ->
def field = line.tokenize(';')
lineCount++
closure(lineCount,field)
}
}
}
use(CSVParser.class) {
File file = new File("test.csv")
file.parseCSV { index,field ->
println "row: ${index} | ${field[0]} ${field[1]} ${field[2]}"
}
}
You'll have to add the parameter in between the file and closure parameters.
When you create a category class with static methods, the first parameter is the object the method is being called on so file must be first.
Having a closure as the last parameter allows the syntax where the open brace of the closure follows the function invocation without parentheses.
Here's how it would look:
class CSVParser{
static def parseCSV(file,separator,closure) {
def lineCount = 0
file.eachLine() { line ->
def field = line.tokenize(separator)
lineCount++
closure(lineCount,field)
}
}
}
use(CSVParser) {
File file = new File("test.csv")
file.parseCSV(',') { index,field ->
println "row: ${index} | ${field[0]} ${field[1]} ${field[2]}"
}
}
Just add the separator as the second parameter to the parseCSV method:
class CSVParser{
static def parseCSV(file, sep, closure) {
def lineCount = 0
file.eachLine() { line ->
def field = line.tokenize(sep)
closure(++lineCount, field)
}
}
}
use(CSVParser.class) {
File file = new File("test.csv")
file.parseCSV(";") { index,field ->
println "row: ${index} | ${field[0]} ${field[1]} ${field[2]}"
}
}
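If existing callers should keep working without passing a separator, one possible variation is to give the new parameter a default value. This is a sketch under that assumption, not something the answers above proposed; the first parameter is left untyped, so the category method also applies to a String, which is used in the example to avoid needing a file.

```groovy
class CSVParser {
    // separator defaults to ';', so calls without it behave as before
    static def parseCSV(source, String separator = ';', Closure closure) {
        def lineCount = 0
        source.eachLine { line ->
            closure(++lineCount, line.tokenize(separator))
        }
    }
}

use(CSVParser) {
    // Default separator
    "a;b;c".parseCSV { index, field -> println "row: ${index} | ${field}" }
    // Explicit separator
    "a,b,c".parseCSV(',') { index, field -> println "row: ${index} | ${field}" }
}
```

Note that tokenize treats every character of its argument as a delimiter and drops empty fields, so it is not a full CSV parser.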

How to use not equalto in Groovy in this case

I just want to print files which are not located in ss
def folder = "./test-data"
// println "reading files from directory '$folder'"
def basedir = new File(folder)
basedir.traverse {
if (it.isFile()) {
def rec = it
// println it
def ss = rec.toString().substring(12)
if(!allRecords contains(ss)) {
println ss
}
}
Your question isn't exactly clear, but it looks like you're just trying to do
if (!allRecords.contains(ss)) {
println ss
}
in the last part of your code segment.

Tail a file in Groovy

I am wondering if there is a simple way to tail a file in Groovy. I know how to read a file, but how do I read a file and then wait for more lines to be added, read them, wait, and so on?
I have what I am sure is a really stupid solution:
def lNum = 0
def num= 0
def numLines = 0
def myFile = new File("foo.txt")
def origNumLines = myFile.eachLine { num++ }
def precIndex = origNumLines
while (true) {
num = 0
lNum = 0
numLines = myFile.eachLine { num++ }
if (numLines > origNumLines) {
myFile.eachLine({ line ->
if (lNum > precIndex) {
println line
}
lNum++
})
}
precIndex = numLines
Thread.sleep(5000)
}
Note that I am not really interested in invoking the Unix "tail" command. Unless it is the only solution.
I wrote a groovy class which resembles the basic tail functionality:
class TailReader {
boolean stop = false
public void stop () {
stop = true
}
public void tail (File file, Closure c) {
def runnable = {
def reader
try {
reader = file.newReader()
reader.skip(file.length())
def line
while (!stop) {
line = reader.readLine()
if (line) {
c.call(line)
}
else {
Thread.currentThread().sleep(1000)
}
}
}
finally {
reader?.close()
}
} as Runnable
def t = new Thread(runnable)
t.start()
}
}
The tail method takes a file and a closure as parameters. It runs in a separate thread and passes each new line appended to the file to the given closure. The tail method runs until the stop method is called on the TailReader instance. Here's a short example of how to use the TailReader class:
def reader = new TailReader()
reader.tail(new File("/tmp/foo.log")) { println it }
// Do something else, e.g.
// Thread.currentThread().sleep(30 * 1000)
reader.stop()
In reply to Christoph :
For my use case, I've replaced the block
line = reader.readLine()
if (line) {
c.call(line)
}
else {
Thread.currentThread().sleep(1000)
}
with
while ((line = reader.readLine()) != null)
c.call(line)
Thread.currentThread().sleep(1000)
The != null check ensures that empty lines are output as well.
The inner while loop ensures that every available line is read as fast as possible.
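The same two points (keep empty lines, drain everything that is available before sleeping) can also be sketched without a background thread, by remembering a byte offset between polls. The helper name readAppended is hypothetical. RandomAccessFile.readLine maps each byte to a character, so this sketch assumes single-byte text such as ASCII, and a line that is still being written may be returned in pieces.

```groovy
// Hypothetical helper: returns [lines appended since offset, new offset].
def readAppended(File file, long offset) {
    def lines = []
    def raf = new RandomAccessFile(file, 'r')
    try {
        raf.seek(offset)
        String line
        // != null keeps empty lines; the loop drains all available lines
        while ((line = raf.readLine()) != null) {
            lines << line
        }
        return [lines, raf.filePointer]
    } finally {
        raf.close()
    }
}

// Example polling loop; the file name and interval are placeholders
// def offset = 0L
// while (true) {
//     def (lines, next) = readAppended(new File('foo.txt'), offset)
//     lines.each { println it }
//     offset = next
//     Thread.sleep(1000)
// }
```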

Resources