Groovier way to parse tsv file into map


I have a TSV file in the form of "key \t value", and I need to read it into a map. Currently I do it like this:
referenceFile.eachLine { line ->
def (name, reference) = line.split(/\t/)
referencesMap[name.toLowerCase()] = reference
}
Is there a shorter/nicer way to do it?

It's already quite short. Two answers I can think of:
First one avoids the creation of a temporary map object:
referenceFile.inject([:]) { map, line ->
def (name, reference) = line.split(/\t/)
map[name.toLowerCase()] = reference
map
}
Second one is more functional:
referenceFile.collect { it.split(/\t/) }.inject([:]) { map, val -> map[val[0].toLowerCase()] = val[1]; map }

The only other way I can think of doing it would be with an Iterator like you'd find in Commons IO:
@Grab( 'commons-io:commons-io:2.4' )
import org.apache.commons.io.FileUtils
referencesMap = FileUtils.lineIterator( referenceFile, 'UTF-8' )
.collectEntries { line ->
line.tokenize( '\t' ).with { k, v ->
[ (k.toLowerCase()): v ]
}
}
Or with a CSV parser:
@Grab('com.xlson.groovycsv:groovycsv:1.0')
import static com.xlson.groovycsv.CsvParser.parseCsv
referencesMap = referenceFile.withReader { r ->
parseCsv( [ separator:'\t', readFirstLine:true ], r ).collectEntries {
[ (it[ 0 ].toLowerCase()): it[ 1 ] ]
}
}
But neither of them is shorter, and not necessarily nicer either...
Though I prefer option 2, as it deals with quoted strings and so can handle cases such as:
"key\twith\ttabs"\tvalue

This is the comment tim_yates added to melix's answer, and I think it's the shortest/clearest answer:
referenceFile.collect { it.tokenize( '\t' ) }.collectEntries { k, v -> [ k.toLowerCase(), v ] }
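A minimal, self-contained sketch of the same idea, assuming every non-blank line holds exactly one tab. The temp file and its sample contents are made up for illustration; readLines() is used only to make the File-to-lines step explicit.

```groovy
// Hypothetical sample data, written to a temp file so the sketch runs on its own.
def referenceFile = File.createTempFile('references', '.tsv')
referenceFile.deleteOnExit()
referenceFile.text = 'Alpha\tone\n\nBeta\ttwo\n'

def referencesMap = referenceFile.readLines()
    .findAll { it.trim() }                          // skip blank lines
    .collectEntries { line ->
        def (name, reference) = line.tokenize('\t') // split on the tab
        [(name.toLowerCase()): reference]           // lower-cased key, value as-is
    }

assert referencesMap == [alpha: 'one', beta: 'two']
```

Lines with more than one tab would silently lose the extra fields here, which is the case the CSV-parser answer handles better.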

Related

Remove hyphens from keys in deeply nested map

I posted this question in the Groovy mailing lists, but I've not yet gotten an answer. I was wondering if someone can help here. I am re-posting relevant text from my original question.
I have an input json that’s nested, that is read via a JsonSlurper, and some of the keys have hyphens in them. I need to replace those keys that have hyphens with underscores and convert it back to json for downstream processing. I looked at the JsonGenerator.Options documentation and I could not find any documentation for this specific requirement.
I also looked through options to iterate through the Map that is produced from JsonSlurper, but unfortunately I’m not able to find an effective solution that iterates through a nested Map, changes the keys and produces another Map which could be converted to a Json string.
Example Code
import groovy.json.*
// This json can be nested many levels deep
def inputJson = """{
"database-servers": {
"dc-1": [
"server1",
"server2"
]
},
"discovery-servers": {
"dc-3": [
"discovery-server1",
"discovery-server2"
]
}
}
"""
I need to convert the above to JSON that looks like the example below. I can iterate through and convert using the collectEntries method, which only works on the first level, but I need to do it recursively, since the input JSON can be nested many levels deep.
{
"database_servers": {
"dc_1": [
"server1",
"server2"
]
},
"discovery_servers": {
"dc_3": [
"discovery-server1",
"discovery-server2"
]
}
}

Seems like you just need a recursive method to process the slurped Map and its sub-Maps.
import groovy.json.JsonSlurper
JsonSlurper slurper = new JsonSlurper()
def jsonmap = slurper.parseText( inputJson )
Map recurseMap( def inputMap ) {
return inputMap.collectEntries { key, val ->
String newkey = key.replace( "-", "_" )
if ( val instanceof Map ) {
return [ newkey, recurseMap( val ) ]
}
return [ newkey, val ]
}
}
def retmap = recurseMap( jsonmap )
println retmap // at this point you can output this however you like, e.g. convert it back to JSON
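One case the method above does not cover is a map nested inside a JSON array. A hedged sketch of a variant that walks lists as well and converts the result back to JSON with JsonOutput; the name renameKeys and the sample input are made up for illustration:

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Rebuilds maps with '-' replaced by '_' in their keys. Lists are walked so
// that maps inside arrays are converted too; other values are left untouched.
def renameKeys(node) {
    if (node instanceof Map) {
        return node.collectEntries { key, val -> [(key.replace('-', '_')): renameKeys(val)] }
    }
    if (node instanceof List) {
        return node.collect { renameKeys(it) }
    }
    node
}

def input = '{"database-servers": {"dc-1": ["server-1", {"rack-id": 7}]}}'
def converted = renameKeys(new JsonSlurper().parseText(input))

assert converted == [database_servers: [dc_1: ['server-1', [rack_id: 7]]]]
println JsonOutput.prettyPrint(JsonOutput.toJson(converted))
```

Only keys are rewritten, so string values such as "server-1" keep their hyphens, as in the expected output in the question.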

Scala - Remove all elements in a list/map of strings from a single String

Working on an internal website where the URL contains the source reference from other systems. This is a business requirement and cannot be changed.
i.e. "http://localhost:9000/source.address.com/7808/project/repo"
"http://localhost:9000/build.address.com/17808/project/repo"
I need to remove these strings from the "project/repo" string/variables using a trait so this can be used natively from multiple services. I also want to be able to add more sources to this list (which already exists) and not modify the method.
"def normalizePath" is the method accessed by services. I have made two non-ideal but reasonable attempts so far. I am getting stuck on using foldLeft, which I would like some help with, or a simpler way of doing what is described. Code samples below.
1st attempt using an if-else (not ideal as need to add more if/else statements down the line and less readable than pattern match)
trait NormalizePath {
def normalizePath(path: String): String = {
if (path.startsWith("build.address.com/17808")) {
path.substring("build.address.com/17808".length, path.length)
} else {
path
}
}
}
and 2nd attempt (not ideal as likely more patterns will get added and it generates more bytecode than if/else)
trait NormalizePath {
val pattern = "build.address.com/17808/"
val pattern2 = "source.address.com/7808/"
def normalizePath(path: String) = path match {
case s if s.startsWith(pattern) => s.substring(pattern.length, s.length)
case s if s.startsWith(pattern2) => s.substring(pattern2.length, s.length)
case _ => path
}
}
Last attempt is to use an address list(already exists elsewhere but defined here as MWE) to remove occurrences from the path string and it doesn't work:
trait NormalizePath {
val replacements = (
"build.address.com/17808",
"source.address.com/7808/")
private def remove(path: String, string: String) = {
path-string
}
def normalizePath(path: String): String = {
replacements.foldLeft(path)(remove)
}
}
Appreciate any help on this!

If you are just stripping out those strings:
val replacements = Seq(
"build.address.com/17808",
"source.address.com/7808/")
replacements.foldLeft("http://localhost:9000/source.address.com/7808/project/repo"){
case(path, toReplace) => path.replaceAll(toReplace, "")
}
// http://localhost:9000/project/repo
If you are replacing those string by something else:
val replacementsMap = Seq(
"build.address.com/17808" -> "one",
"source.address.com/7808/" -> "two/")
replacementsMap.foldLeft("http://localhost:9000/source.address.com/7808/project/repo"){
case(path, (toReplace, replacement)) => path.replaceAll(toReplace, replacement)
}
// http://localhost:9000/two/project/repo
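The replacements collection can come from elsewhere in the code, so normalizePath does not need to change when entries are added. Note that replaceAll treats its first argument as a regular expression, so each "." in these strings matches any character; use replace instead if the match must be literal.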
// method replacing by empty string
def normalizePath(path: String) = {
replacements.foldLeft(path){
case(startingPoint, toReplace) => startingPoint.replaceAll(toReplace, "")
}
}
normalizePath("foobar/build.address.com/17808/project/repo")
// foobar/project/repo
normalizePath("whateverPath")
// whateverPath
normalizePath("build.address.com/17808build.address.com/17808/project/repo")
// /project/repo

A very simple replacement could be made as follows:
val replacements = Seq(
"build.address.com/17808",
"source.address.com/7808/")
def normalizePath(path: String): String = {
replacements.find(path.startsWith(_)) // find the first occurrence
.map(prefix => path.substring(prefix.length)) // remove the prefix
.getOrElse(path) // if not found, return the original string
}
Since the expected replacements are very similar, have you tried to generalize them and use regex matching?

There are a million and one ways to extract /project/repo from a String in Scala. Here are a few I came up with:
val list = List("build.address.com/17808", "source.address.com/7808") //etc
def normalizePath(path: String) = {
path.stripPrefix(list.find(x => path.contains(x)).getOrElse(""))
}
Output:
scala> normalizePath("build.address.com/17808/project/repo")
res0: String = /project/repo
val list = List("build.address.com/17808", "source.address.com/7808") //etc
def normalizePath(path: String) = {
list.map(x => if (path.contains(x)) {
path.takeRight(path.length - x.length)
}).filter(y => y != ()).head
}
Output:
scala> normalizePath("build.address.com/17808/project/repo")
res0: Any = /project/repo
val list = List("build.address.com/17808", "source.address.com/7808") //etc
def normalizePath(path: String) = {
list.foldLeft(path)((a, b) => a.replace(b, ""))
}
Output:
scala> normalizePath("build.address.com/17808/project/repo")
res0: String = /project/repo
Depends how complicated you want your code to look (or how silly you want to be), really. Note that the second example has return type Any, which might not be ideal for your scenario. Also, these examples aren't meant to be able to just take the String out of the middle of your path... they can be fairly easily modified if you want to do that though. Let me know if you want me to add some examples just stripping things like build.address.com/17808 out of a String - I'd be happy to do so.

How to get value by name with Erlang struct?

Erlang newbie here.
I have JSON like this:
{
"ReadCardResultResult":{
"amount":"0",
"balance":"9400",
"Status":1,
"Commands":[
],
"message":"0000000000000000",
"ret":{
"code":0,
"desc":"SUCCESS",
"subReturn":null
},
"transactionId":103979,
"txnInfo":[
{
"infoId":101,
"infoName":"TestName1",
"infoValue":"04432FBAA53080"
},
{
"infoId":102,
"infoName":"TestName2",
"infoValue":""
},
{
"infoId":103,
"infoName":"TestName3",
"infoValue":"9400"
},
{
"infoId":104,
"infoName":"TestName4",
"infoValue":"5"
}
]
}
}
My task is to get specific infoValue out of txnInfo according to infoName. For example: I need to get infoValue with "TestName3", that would be "9400".
So far I narrowed the Json with proplists:get_value(<<"txnInfo">>, ReadCardResultResult). and now I have this:
[{struct,[{<<"infoId">>,101},
{<<"infoName">>,<<"TestName1">>},
{<<"infoValue">>,<<"043A2FBAA53080">>}]},
{struct,[{<<"infoId">>,108},
{<<"infoName">>,<<"TestName2">>},
{<<"infoValue">>,<<"772">>}]},
{struct,[{<<"infoId">>,108},
{<<"infoName">>,<<"TestName3">>},
{<<"infoValue">>,<<"772">>}]},
{struct,[{<<"infoId">>,125},
{<<"infoName">>,<<"TestName4">>},
{<<"infoValue">>,<<>>}]}]
Now, where do I go from here? I'm really stuck on this. Any help would be appreciated.

To efficiently get the first item of a list matching a predicate, you can invert the predicate and use lists:dropwhile/2 (see this answer for more info about that). Other than that, it's just some pattern matching and a case expression:
-module(a).
-compile([export_all]).
main() ->
TxnInfo = [{struct,[{<<"infoId">>,101},
{<<"infoName">>,<<"TestName1">>},
{<<"infoValue">>,<<"043A2FBAA53080">>}]},
{struct,[{<<"infoId">>,108},
{<<"infoName">>,<<"TestName2">>},
{<<"infoValue">>,<<"772">>}]},
{struct,[{<<"infoId">>,108},
{<<"infoName">>,<<"TestName3">>},
{<<"infoValue">>,<<"9400">>}]},
{struct,[{<<"infoId">>,125},
{<<"infoName">>,<<"TestName4">>},
{<<"infoValue">>,<<>>}]}],
WantName = <<"TestName3">>,
case lists:dropwhile(fun({struct, PropList}) -> proplists:get_value(<<"infoName">>, PropList) /= WantName end, TxnInfo) of
[] ->
io:format("no matches~n");
[{struct, PropList} | _] ->
io:format("first match: ~p~n", [proplists:get_value(<<"infoValue">>, PropList)])
end.
Output:
first match: <<"9400">>
If you only care about the first result and want to crash if none is found, you can replace the case with just:
[{struct, PropList} | _] = lists:dropwhile(...),

Compare two maps and find differences using Groovy or Java

I would like to find difference in two maps and create a new csv file with the difference (and put the difference between **) like below:
Map 1
[
[cuInfo:"T12",service:"3",startDate:"14-01-16 13:22",appId:"G12355"],
[cuInfo:"T13",service:"3",startDate:"12-02-16 13:00",appId:"G12356"],
[cuInfo:"T14",service:"9",startDate:"10-01-16 11:20",appId:"G12300"],
[cuInfo:"T15",service:"10",startDate:"26-02-16 10:20",appId:"G12999"]
]
Map 2
[
[name:"Apple", cuInfo:"T12",service:"3",startDate:"14-02-16 10:00",appId:"G12351"],
[name:"Apple",cuInfo:"T13",service:"3",startDate:"14-01-16 13:00",appId:"G12352"],
[name:"Apple",cuInfo:"T16",service:"3",startDate:"14-01-16 13:00",appId:"G12353"],
[name:"Google",cuInfo:"T14",service:"9",startDate:"10-01-16 11:20",appId:"G12301"],
[name:"Microsoft",cuInfo:"T15",service:"10",startDate:"26-02-16 10:20",appId:"G12999"],
[name:"Microsoft",cuInfo:"T18",service:"10",startDate:"26-02-16 10:20",appId:"G12999"]
]
How can I get the output csv like below
Map 1 data | Map 2 data
service 3;name Apple;
cuInfo;startDate;appId | cuInfo;startDate;appId
T12;*14-02-16 10:00*;*G12351* | T12;*14-01-16 13:22*;*G12355*
T13;*14-01-16 13:00*;*G12352* | T13;*12-02-16 13:00*;*G12356*
service 9;name Google;
T14;*10-01-16 11:20*;*G12301* | T12;*10-01-16 11:20*;*G12300*
Thanks

In the following I'm assuming that the list of maps is sorted appropriately so that the comparison is fair, and that both lists are of the same length:
First, create an Iterator to traverse both lists simultaneously:
@groovy.transform.TupleConstructor
class DualIterator implements Iterator<List> {
Iterator iter1
Iterator iter2
boolean hasNext() {
iter1.hasNext() && iter2.hasNext()
}
List next() {
[iter1.next(), iter2.next()]
}
void remove() {
throw new UnsupportedOperationException()
}
}
Next, process the lists to get rows for the CSV file:
def rows = new DualIterator(list1.iterator(), list2.iterator())
.findAll { it[0] != it[1] } // Grab the non-matching lines.
.collect { // Mark the non-matching values.
def (m1, m2) = it
m1.keySet().each { key ->
if(m1[key] != m2[key]) {
m1[key] = "*${m1[key]}*"
m2[key] = "*${m2[key]}*"
}
}
[m1, m2]
}.collect { // Merge the map values into a List of String arrays
[it[0].values(), it[1].values()].flatten() as String[]
}
Finally, write the header and rows out in CSV format. NOTE: I'm using a proper CSV; your example is actually invalid because the number of columns is inconsistent:
def writer = new CSVWriter(new FileWriter('blah.csv'))
writer.writeNext(['name1', 'cuInfo1', 'service1', 'startDate1', 'appId1', 'name2', 'cuInfo2', 'service2', 'startDate2', 'appId2'] as String[])
writer.writeAll(rows)
writer.close()
The output looks like this:
"name1","cuInfo1","service1","startDate1","appId1","name2","cuInfo2","service2","startDate2","appId2"
"Apple","T12","3","*14-02-16 10:00*","*G12351*","Apple","T12","3","*14-01-16 13:22*","*G12355*"
"Apple","T13","3","*14-01-16 13:00*","*G12352*","Apple","T13","3","*12-02-16 13:00*","*G12356*"
"Google","T14","9","10-01-16 11:20","*G12301*","Google","T14","9","10-01-16 11:20","*G12300*"
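The approach above pairs rows by position. If the two lists are not aligned, one alternative (not the answerer's method) is to join them on cuInfo. The sketch below assumes cuInfo is unique within the second list and uses a trimmed copy of the question's data; rows without a partner (T16 here) are skipped.

```groovy
def list1 = [
    [cuInfo: 'T12', service: '3', startDate: '14-01-16 13:22', appId: 'G12355'],
    [cuInfo: 'T14', service: '9', startDate: '10-01-16 11:20', appId: 'G12300'],
]
def list2 = [
    [name: 'Apple',  cuInfo: 'T12', service: '3', startDate: '14-02-16 10:00', appId: 'G12351'],
    [name: 'Google', cuInfo: 'T14', service: '9', startDate: '10-01-16 11:20', appId: 'G12301'],
    [name: 'Apple',  cuInfo: 'T16', service: '3', startDate: '14-01-16 13:00', appId: 'G12353'],
]

// Index the second list by cuInfo so rows are paired by key, not by position.
def byCuInfo = list2.collectEntries { [(it.cuInfo): it] }
def fields = ['cuInfo', 'startDate', 'appId']

def rows = list1.findAll { byCuInfo.containsKey(it.cuInfo) }.collect { m1 ->
    def m2 = byCuInfo[m1.cuInfo]
    // Wrap a value in asterisks only when the two sides disagree.
    def mark = { a, b -> a == b ? a : "*${a}*".toString() }
    def left  = fields.collect { mark(m1[it], m2[it]) }
    def right = fields.collect { mark(m2[it], m1[it]) }
    left.join(';') + ' | ' + right.join(';')
}

rows.each { println it }
assert rows[0] == 'T12;*14-01-16 13:22*;*G12355* | T12;*14-02-16 10:00*;*G12351*'
assert rows[1] == 'T14;10-01-16 11:20;*G12300* | T14;10-01-16 11:20;*G12301*'
```

The rows are plain strings in the "left | right" layout from the question; feeding them to a CSV writer would need them split back into columns.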

How to add dynamically node to XML in Groovy with the StreamingMarkupBuilder

I am trying to dynamically create an XML file with Groovy. I'm pleased with how simply everything works, but I am having a hard time understanding the whole mechanism behind closures and delegates. While it appears to be easy to add properties and child nodes with a fixed name, adding a node with a dynamic name appears to be a special case.
My use case is creating a _rep_policy file, which can be used in CQ5.
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:rep="internal"
jcr:primaryType="rep:ACL">
<allow
jcr:primaryType="rep:GrantACE"
rep:principalName="administrators"
rep:privileges="{Name}[jcr:all]"/>
<allow0
jcr:primaryType="rep:GrantACE"
rep:principalName="contributor"
rep:privileges="{Name}[jcr:read]"/>
</jcr:root>
Processing the collections works fine, but generating the name ...
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlUtil
def _rep_policy_files = [
'/content': [ // the path
'deny': [ // permission
'jcr:read': [ // action
'a1', 'b2']], // groups
'allow': [
'jcr:read, jcr:write': [
'c2']
]
]
]
def getNodeName(n, i) {
(i == 0) ? n : n + (i - 1)
}
_rep_policy_files.each {
path, permissions ->
def builder = new StreamingMarkupBuilder();
builder.encoding = "UTF-8";
def p = builder.bind {
mkp.xmlDeclaration()
namespaces << [
jcr: 'http://www.jcp.org/jcr/1.0',
rep: 'internal'
]
'jcr:root'('jcr:primaryType': 'rep:ACL') {
permissions.each {
permission, actions ->
actions.each {
action, groups ->
groups.eachWithIndex {
group, index ->
def nodeName = getNodeName(permission, index)
"$nodeName"(
'jcr:primaryType': 'rep:GrantACE',
'rep:principalName': "$group",
'rep:privileges': "{Name}[$action]")
}
}
}
}
}
print(XmlUtil.serialize(p))
}

Is this something (or similar) that you are looking for?
'jcr:root'('jcr:primaryType': 'rep:ACL') {
_rep_policy_files['/content'].each {k, v ->
if(k == 'allow')
"$k"('jcr:primaryType': 'rep:GrantACE',
'rep:principalName': 'administrators',
'rep:privileges': "$v" ){}
if(k == 'deny')
"$k"('jcr:primaryType': 'rep:GrantACE',
'rep:principalName': 'contributor',
'rep:privileges': "$v" ){}
}
}
The resulting XML shown in the question cannot be derived from what you have in _rep_policy_files.
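On the dynamic node name itself: a quoted GString followed by an argument list is a method call whose name is computed at runtime, and the builder turns that call into an element. A stripped-down sketch, with the namespaces left out and a made-up root element, attribute name and group list:

```groovy
import groovy.xml.StreamingMarkupBuilder

// Hypothetical data: one node per group, named allow, allow0, allow1, ...
def groups = ['administrators', 'contributor']

def xml = new StreamingMarkupBuilder().bind {
    root {
        groups.eachWithIndex { group, index ->
            // Same naming rule as getNodeName in the question, inlined here.
            def nodeName = index == 0 ? 'allow' : "allow${index - 1}"
            // The GString is evaluated first, then used as the element name.
            "$nodeName"(principalName: group)
        }
    }
}.toString()

println xml
assert xml.contains('<allow ')
assert xml.contains('<allow0 ')
```

The name is computed in a local variable inside the closure, so the sketch does not depend on how a script-level helper such as getNodeName is resolved from within the builder's closures.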
