words count example in Scala?

I see a lot of Scala tutorials with examples doing things like recursive traversals or solving math problems. In my daily programming life, I have the feeling that most of my coding time is spent on mundane tasks like string manipulation, database queries and date manipulation. Would anyone care to give a Scala version of the following Perl script?
#!/usr/bin/perl
use strict;
# opens a file with one word on each line and counts the number of occurrences
# of each word, case insensitive
print "Enter the name of your file, ie myfile.txt:\n";
my $val = <STDIN>;
chomp ($val);
open (HNDL, "$val") || die "wrong filename";
my %count = ();
while ($val = <HNDL>)
{
    chomp($val);
    $count{lc $val}++;
}
close (HNDL);
print "Number of instances found of:\n";
foreach my $word (sort keys %count) {
    print "$word\t: " . $count{$word} . " \n";
}
In summary:
1. ask for a filename
2. read the file (contains 1 word per line)
3. do away with line ends (cr, lf or crlf)
4. lowercase the word
5. increment the count of the word
6. print out each word, sorted alphabetically, and its count
TIA

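A simple word count like that could be written as follows (scala.io.StdIn needs Scala 2.11 or later):
import scala.io.{Source, StdIn}
import java.io.FileNotFoundException
object WC {
  def main(args: Array[String]): Unit = {
    println("Enter the name of your file, ie myfile.txt:")
    val fileName = StdIn.readLine()
    val words = try {
      val source = Source.fromFile(fileName)
      // Read eagerly so the file can be closed straight away
      try source.getLines().map(_.trim.toLowerCase).toList
      finally source.close()
    } catch {
      case e: FileNotFoundException =>
        sys.error("No file named %s found".format(fileName))
    }
    // Group identical words and count the size of each group
    val counts = words.groupBy(identity).map { case (word, ws) => (word, ws.size) }
    println("Number of instances found of:")
    // Sort alphabetically, as the question asks
    for ((word, count) <- counts.toSeq.sortBy(_._1)) println("%s\t: %d".format(word, count))
  }
}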

If you're going for concise/compact, you can do this in 2.10:
// Opens a file with one word on each line and counts
// the number of occurrences of each word (case-insensitive)
object WordCount extends App {
  println("Enter the name of your file, e.g. myfile.txt: ")
  val lines = util.Try{ io.Source.fromFile(readLine).getLines().toSeq } getOrElse
    { sys.error("Wrong filename.") }
  println("Number of instances found of:")
  lines.map(_.trim.toLowerCase).groupBy(identity).toSeq.
    map{ case (w,ws) => s"$w\t: ${ws.size}" }.sorted.foreach(println)
}

val lines : List[String] = List("this is line one" , "this is line 2", "this is line three")
val linesConcat : String = lines.foldRight("")( (a , b) => a + " "+ b)
linesConcat.split(" ").groupBy(identity).toList.foreach(p => println(p._1+","+p._2.size))
This prints the following (the order is not guaranteed, because groupBy returns an unordered map):
this,3
is,3
three,1
line,3
2,1
one,1
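For comparison, the steps listed in the question can also be sketched in a few lines of Python. This is only a minimal sketch, not a translation of any answer above: the names count_words and report are made up for illustration, and skipping blank lines is an assumption that the original Perl script does not make.

```python
from collections import Counter


def count_words(lines):
    """Count case-insensitive occurrences, assuming one word per line."""
    # strip() removes surrounding whitespace, including cr, lf or crlf line ends
    words = (line.strip().lower() for line in lines)
    # Assumption: blank lines are ignored rather than counted as an empty word
    return Counter(word for word in words if word)


def report(counts):
    """Return one 'word<TAB>: count' string per word, sorted alphabetically."""
    return [f"{word}\t: {count}" for word, count in sorted(counts.items())]


if __name__ == "__main__":
    # Hypothetical sample input standing in for the lines of a file
    sample = ["Apple\r\n", "banana\n", "apple\n", "\n"]
    for line in report(count_words(sample)):
        print(line)
```

To read from a real file, an open file object can be passed directly as lines, since iterating over it yields one line at a time.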

Related

How to convert the describe table output list or map in groovy

How can I convert the output below into a map or list in Groovy?
col_name,data_type,comment
"brand","string",""
"tactic_name","string",""
"tactic_id","string",""
"content_description","string",""
"id","bigint",""
"me","bigint",""
"npi","bigint",""
"fname","string",""
"lname","string",""
"addr1","string",""
"addr2","string",""
"city","string",""
"state","string",""
"zip","int",""
"event","string",""
"event_date","timestamp",""
"error_flag","string",""
"error_reason","string",""
"vendor","string",""
"year","int",""
"month","int",""
"",,
"# Partition Information",,
"# col_name ","data_type ","comment "
"",,
"vendor","string",""
"year","int",""
"month","int",""
I need the partition columns in one map and the normal columns in a separate map.
Expected output:
[[brand,string],[...]]

Try this code:
CsvParser is used to read the text, but your text needs some alteration before it can be parsed, so I did some text processing to fit it into CSV format.
@Grab('com.xlson.groovycsv:groovycsv:0.2')
import com.xlson.groovycsv.CsvParser
def csv = '''col_name,data_type,comment
"brand","string",""
"tactic_name","string",""
"tactic_id","string",""
"content_description","string",""
"id","bigint",""
"me","bigint",""
"npi","bigint",""
"fname","string",""
"lname","string",""
"addr1","string",""
"addr2","string",""
"city","string",""
"state","string",""
"zip","int",""
"event","string",""
"event_date","timestamp",""
"error_flag","string",""
"error_reason","string",""
"vendor","string",""
"year","int",""
"month","int",""
"",,
"# Partition Information",,
"# col_name ","data_type ","comment "
"",,
"vendor","string",""
"year","int",""
"month","int",""'''
// Split the text at the partition marker: part 0 holds all columns, part 1 the partition columns
def maptxt = csv.split('"# Partition Information",,')
def map1txt = maptxt[0].trim()
// Strip '#' and whitespace so the second header line reads col_name,data_type,comment
def map2txt = maptxt[1].trim().readLines().collect {
    it = it.replace('#', '')
    it = it.replaceAll("\\s", "")
}.join('\n')
println getAsMap(map1txt)
println getAsMap(map2txt)
Map getAsMap(def txt) {
    Map ret = [:]
    def data = new CsvParser().parse(txt)
    for (each in data) {
        if (each.col_name) // rows with an empty col_name are skipped
            ret[each.col_name] = each.data_type
    }
    ret
}
Your text has rows with an empty col_name; this code skips those rows.
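The same idea (switch to a second map once the partition marker is seen, and skip header and empty rows) can be sketched with only a standard library CSV reader. The following Python is an illustrative sketch, not the answerer's method: split_columns and the shortened DESCRIBE_OUTPUT sample are hypothetical names and data chosen for the example.

```python
import csv
import io

# Shortened, hypothetical sample in the same shape as the describe output above
DESCRIBE_OUTPUT = '''col_name,data_type,comment
"brand","string",""
"zip","int",""
"vendor","string",""
"year","int",""
"",,
"# Partition Information",,
"# col_name ","data_type ","comment "
"",,
"vendor","string",""
"year","int",""
'''


def split_columns(text):
    """Return (columns, partition_columns) as dicts of name -> data type."""
    columns, partitions = {}, {}
    target = columns
    for row in csv.reader(io.StringIO(text)):
        name = row[0].strip() if row else ""
        if not name:
            continue  # skip empty rows such as "",,
        if name == "# Partition Information":
            target = partitions  # everything after the marker is a partition column
            continue
        if name in ("col_name", "# col_name"):
            continue  # skip the two header rows
        target[name] = row[1].strip()
    return columns, partitions


if __name__ == "__main__":
    cols, parts = split_columns(DESCRIBE_OUTPUT)
    print(cols)
    print(parts)
```

As in the Groovy answer, the first map still contains the partition columns, because the describe output lists them in both sections; removing them from the first map would be a separate step.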

How can I read and print the values from a text file

I have data in the text file.
M10 M2GBXR100A.PGM 8.00000000 3.0000000 3.00000000 2545.07500000sec 0.0
I am trying to read and print the text file data, but how can I get just the individual values?
I have used
File file = new File("C:/File/stat_l15.txt")
println file.text
String Name = file.text.substring(0, file.text.indexOf(' '))
With this I am able to retrieve M10, but how can I get M2GBXR100A?
Finally I need the output as
Name : M10
pg_name : M2GBXR100A.PGM
right : 8.00000000
left : 3.0000000
I am saving this data in a table.

Since your file is delimited by whitespace, you can use split:
File file = new File("C:/File/stat_l15.txt")
println file.text
// trim() drops the trailing newline; /\s+/ also copes with repeated spaces or tabs
List values = file.text.trim().split(/\s+/)
println "Name: ${values[0]}"
println "pg_name: ${values[1]}"
println "right: ${values[2]}"
println "left : ${values[3]}"
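The split-and-label idea can be sketched independently of Groovy. The Python below is a minimal illustration under stated assumptions: parse_line and LABELS are hypothetical names, and the four labels are the ones listed in the question.

```python
# The sample line from the question
LINE = "M10 M2GBXR100A.PGM 8.00000000 3.0000000 3.00000000 2545.07500000sec 0.0"

# Labels requested in the question, in field order
LABELS = ["Name", "pg_name", "right", "left"]


def parse_line(line):
    """Split on any run of whitespace and pair the leading fields with LABELS."""
    fields = line.split()  # no argument: splits on spaces, tabs and newlines
    return dict(zip(LABELS, fields))  # zip stops after the last label


if __name__ == "__main__":
    for label, value in parse_line(LINE).items():
        print(f"{label} : {value}")
```

Returning a dictionary rather than printing directly keeps the values available for a later step, such as the table insert the question mentions.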

format the following result file into a tabular format using Perl

I have a bit of a problem, and I am still a novice with Perl.
I just want to ask how I can format the following results file into an Excel-readable format (let's say CSV).
Example results file, llq1_dly.mt0:
$MEAS COMMANDS SOURCE='llq1_dly.meas' DB FILE='clk_top_45h_lpe_sim.fsdb'
.TITLE '**-------------'
tdrll10_0 tdfll10_0 tdrll10_1 tdfll10_1 tdrll10_2 tdfll10_2 tdrll10_3
2.106560e-10 1.990381e-10 2.102583e-10 1.986280e-10 2.095036e-10 1.978480e-10 2.083813e-10
into a file with a result like the one below:
llq1_dly,tdr,tdf,
ll10_0,2.106560e-10,1.990381e-10,
ll10_1,2.102583e-10,1.986280e-10,
ll10_2,2.095036e-10,1.978480e-10,
ll10_3,2.083813e-10,1.967019e-10,
...
or more likely this one (to be compatible with engineering scientific notations):
llq1_dly,tdr,tdf,
ll10_0,210.6560e-12,199.0381e-12,
ll10_1,210.2583e-12,198.6280e-12,
ll10_2,209.5036e-12,197.8480e-12,
ll10_3,208.3813e-12,196.7019e-12,
...

Here's a program that produces the output you ask for. I don't generally approve of offering answers to questions where the OP hasn't made any attempt to write a solution themselves, but this question interested me.
It may well be that this could be written more simply, but you don't say what parts of the input are invariant. For instance, I have written it so that there can be any number of different columns with any names, rather than just tdr and tdf every time. As it is I have had to guess that the trailing part of each header ends in ll, so for instance tdrll10_0 is tdr and ll10_0. If that is wrong then you will need a different way of splitting the string.
I have written the program so that it reads from the DATA file handle. I trust you are able to write an open statement to read from the correct input file?
I hope this helps
use strict;
use warnings;
use 5.010;
use Number::FormatEng 'format_eng';
Number::FormatEng::use_e_zero();
my $fh = \*DATA;
my ($source, @headers, @values);
while ( <$fh> ) {
  if ( /SOURCE=(?|'([^']+)'|"([^"]+)")/ ) { #' code highlighting fix
    ($source = $1) =~ s/\.[^.]*\z//;
  }
  elsif ( /^\.TITLE/ ) {
    @headers = split ' ', <$fh>;
    @values = split ' ', <$fh>;
    last;
  }
}
my @title = ( $source );
my (%headers, @table, @line);
for my $i ( 0 .. $#headers) {
  my @fields = split /(?=ll)/, $headers[$i];
  if ( $headers{$fields[0]} ) {
    push @table, [ @line ];
    @line = ();
    %headers = ();
  }
  ++$headers{$fields[0]};
  push @line, $fields[1] if @line == 0;
  push @line, format_eng($values[$i]);
  push @title, $fields[0] unless @table;
}
print "$_," for @title;
print "\n";
for ( @table ) {
  print "$_," for @$_;
  print "\n";
}
__DATA__
$MEAS COMMANDS SOURCE='llq1_dly.meas' DB FILE='clk_top_45h_lpe_sim.fsdb'
.TITLE '**-------------'
tdrll10_0 tdfll10_0 tdrll10_1 tdfll10_1 tdrll10_2 tdfll10_2 tdrll10_3
2.106560e-10 1.990381e-10 2.102583e-10 1.986280e-10 2.095036e-10 1.978480e-10 2.083813e-10
output
llq1_dly,tdr,tdf,
ll10_0,210.656e-12,199.0381e-12,
ll10_1,210.2583e-12,198.628e-12,
ll10_2,209.5036e-12,197.848e-12,
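The two core steps, splitting each header at "ll" and rewriting each value with an exponent that is a multiple of three, can also be sketched in Python. This is an illustrative sketch rather than a port of the Perl program: to_eng and build_rows are hypothetical names, the sample is shortened, and the standard decimal module is swapped in for Number::FormatEng. Unlike format_eng, Decimal.to_eng_string keeps trailing zeros, and it falls back to plain notation for values that are not very small or very large.

```python
from decimal import Decimal

# Shortened sample taken from the header and value lines in the question
HEADERS = "tdrll10_0 tdfll10_0 tdrll10_1 tdfll10_1".split()
VALUES = "2.106560e-10 1.990381e-10 2.102583e-10 1.986280e-10".split()


def to_eng(text):
    """Rewrite a number so that its exponent is a multiple of three."""
    # Decimal parses the text exactly, so no binary floating-point noise appears
    return Decimal(text).to_eng_string().lower()


def build_rows(headers, values):
    """Group values by the 'll...' suffix of their header."""
    rows = {}
    for header, value in zip(headers, values):
        # Assumption (as in the answer): the row name starts at the first 'll'
        measure, _, suffix = header.partition("ll")
        rows.setdefault("ll" + suffix, {})[measure] = to_eng(value)
    return rows


if __name__ == "__main__":
    for name, cells in build_rows(HEADERS, VALUES).items():
        print(",".join([name, *cells.values()]) + ",")
```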

Groovy File Traverse behavior when parsing file

I have a closure working properly on traverse, but another of the same kind is failing. I'm suspecting scope or timing is causing this to fail. The working code sums the size of files in the file system. The code not working is inspecting the content of the file and only prints one match. Running these with Grails 2.3.7
Working code:
def groovySrcDir = new File('.', 'plugins/')
def countSmallFiles = 0
def postDirVisitor = {
    if (countSmallFiles > 0) {
        println "Found $countSmallFiles files with small filenames in ${it.name}"
    }
    countSmallFiles = 0
}
groovySrcDir.traverse(type: FILES, postDir: postDirVisitor, nameFilter: ~/.*\.groovy$/) {
    if (it.name.size() < 15) {
        countSmallFiles++
    }
}
Problem code:
def datamap = [:]
def printDomainFound = {
    //File currentFile = new File(it.canonicalPath)
    def fileText = it.text
    if(fileText.indexOf("@Table ") > 0){
        //println "Found a Table annotation in ${it.name} "
        datamap.put(it.name, it.name)
    }
}
groovySrcDir.traverse type: FILES, visit: printDomainFound, nameFilter: filterGroovyFiles
datamap.each {
    println it.key
}

I tested your code and it worked fine.
Which behaviour are you expecting?
I find a couple of suspicious things:
If fileText begins with "@Table ", then indexOf will return 0 and the condition if(fileText.indexOf("@Table ") > 0) will not be satisfied.
"@Table " has a trailing space, so a file containing, for example, "@Table(" will not be printed.
You can also check that filterGroovyFiles has the appropriate value.
I hope this helps.
-- EDIT --
Running the code with def filterGroovyFiles = ~/.*\.groovy$/ and this file tree:
plugins
|--sub1
| |-dum.groovy
| |-dum2.groovy
dum3.groovy
With all three Groovy files containing (but not starting with!) "@Table " (with the trailing space!), I get the expected output:
dum3.groovy
dum.groovy
dum2.groovy
(Note that both dum.groovy and dum2.groovy, from the same folder sub1, appear.)
I'm using Groovy 2.0.5.
Please recheck that your files:
have the correct extension
contain the string "@Table ", but not at the beginning (index == 0)
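The index-0 pitfall described above is easy to reproduce in isolation. The Python sketch below uses str.find, which like indexOf returns 0 for a match at the very start and -1 for no match; both function names are hypothetical and exist only for this illustration.

```python
def has_table_buggy(text):
    """Mirrors indexOf(...) > 0: misses a match at position 0."""
    return text.find("@Table ") > 0


def has_table(text):
    """Containment test without the trailing space, so '@Table(' also matches."""
    return "@Table" in text


if __name__ == "__main__":
    samples = [
        "@Table class A",           # match at index 0
        "@Table(name='x') class B",  # no space after @Table
        "import x\n@Table class C",  # match later in the text
    ]
    for sample in samples:
        print(repr(sample), has_table_buggy(sample), has_table(sample))
```

In the Groovy code, the equivalent change would be to compare against >= 0 or to use contains, and to decide whether the trailing space in the search string is really wanted.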

how to replace a string/word in a text file in groovy

Hello, I am using Groovy 2.1.5 and I have to write code that shows the contents (files) of a directory at a given path, then makes a backup of a file and replaces a word/string in that file.
Here is the code I have used to try to replace a word in the selected file:
String contents = new File( '/geretd/resume.txt' ).getText( 'UTF-8' )
contents = contents.replaceAll( 'visa', 'viva' )
Here is also my complete code. If anyone would like to modify it to make it more efficient, I would appreciate it, since I am learning.
def dir = new File('/geretd')
dir.eachFile {
    if (it.isFile()) {
        println it.canonicalPath
    }
}
copy = { File src,File dest->
    def input = src.newDataInputStream()
    def output = dest.newDataOutputStream()
    output << input
    input.close()
    output.close()
}
//File srcFile = new File(args[0])
//File destFile = new File(args[1])
File srcFile = new File('/geretd/resume.txt')
File destFile = new File('/geretd/resumebak.txt')
copy(srcFile,destFile)
x = " "
println x
def dire = new File('/geretd')
dir.eachFile {
    if (it.isFile()) {
        println it.canonicalPath
    }
}
String contents = new File( '/geretd/resume.txt' ).getText( 'UTF-8' )
contents = contents.replaceAll( 'visa', 'viva' )

As with nearly everything Groovy, AntBuilder is the easiest route:
def ant = new AntBuilder() // assumes AntBuilder (the groovy-ant module) is on the classpath
ant.replace(file: "myFile", token: "NEEDLE", value: "replacement")

As an alternative to loading the whole file into memory, you could process each line in turn:
new File( 'destination.txt' ).withWriter { w ->
    new File( 'source.txt' ).eachLine { line ->
        w << line.replaceAll( 'World', 'World!!!' ) + System.getProperty("line.separator")
    }
}
Of course, this (and dmahapatro's answer) relies on the words you are replacing not spanning lines.

I use this code to replace port 8080 with ${port.http} directly in a certain file:
def file = new File('deploy/tomcat/conf/server.xml')
def newConfig = file.text.replace('8080', '${port.http}')
file.text = newConfig
The first line creates a File object. The second line reads the whole file and performs the replacement (the single quotes keep ${port.http} as literal text). The third line writes the result back to the file.

Answers that use File objects are good and quick, but in a sandboxed Jenkins Pipeline they usually cause the following error, which can of course be avoided, but only at the cost of loosening security:
Scripts not permitted to use new java.io.File java.lang.String.
Administrators can decide whether to approve or reject this signature.
This solution avoids the problem above by using the readFile and writeFile Pipeline steps:
String filenew = readFile('dir/myfile.yml').replaceAll('xxx','YYY')
writeFile file:'dir/myfile2.yml', text: filenew

Refer to this answer, where patterns are replaced. The same principle can be used to replace strings.
Sample:
def copyAndReplaceText(source, dest, Closure replaceText){
    dest.write(replaceText(source.text))
}
def source = new File('source.txt') //Hello World
def dest = new File('dest.txt') //blank
copyAndReplaceText(source, dest) {
    it.replaceAll('World', 'World!!!!!')
}
assert 'Hello World' == source.text
assert 'Hello World!!!!!' == dest.text

Another simple solution is the following closure:
def replace = { File source, String toSearch, String replacement ->
    source.write(source.text.replaceAll(toSearch, replacement))
}
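The question also asks for a backup before the replacement, which the answers above leave out. A minimal Python sketch of that combined step follows; backup_and_replace and the ".bak" suffix are hypothetical choices for illustration, and like the whole-file answers above it loads the entire file into memory and does a literal (non-regex) replacement.

```python
import shutil
import tempfile
from pathlib import Path


def backup_and_replace(path, old, new, backup_suffix=".bak"):
    """Copy the file next to itself, then replace old with new in the original."""
    path = Path(path)
    backup = path.with_name(path.name + backup_suffix)
    shutil.copyfile(path, backup)  # backup holds the untouched original
    text = path.read_text(encoding="utf-8")
    path.write_text(text.replace(old, new), encoding="utf-8")
    return backup


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real path
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "resume.txt"
        target.write_text("my visa card", encoding="utf-8")
        saved = backup_and_replace(target, "visa", "viva")
        print(target.read_text(encoding="utf-8"))
        print(saved.read_text(encoding="utf-8"))
```

Making the copy before the file is rewritten means the original content can still be recovered if the replacement turns out to be wrong.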
