Groovy String Concatenation

Current code:
row.column.each(){column ->
println column.attributes()['name']
println column.value()
}
Column is a Node that has a single attribute and a single value. I am parsing an XML file to create INSERT statements for Access. Is there a Groovy way to create a statement with the following structure:
Insert INTO tablename (col1, col2, col3) VALUES (1,2,3)
I am currently storing the attribute and value to separate arrays then popping them into the correct order.

I think this can be a lot easier in Groovy than the currently accepted answer. The collect and join methods are built for this kind of thing. join takes care of the concatenation and does not leave a trailing comma on the string:
def names = row.column.collect { it.attributes()['name'] }.join(",")
def values = row.column.collect { it.value() }.join(",")
def result = "INSERT INTO tablename($names) VALUES($values)"

You could just use two StringBuilders. Something like this, which is rough and untested:
def columns = new StringBuilder("Insert INTO tablename(")
def values = new StringBuilder("VALUES (")
row.column.each() { column ->
columns.append(column.attributes()['name'])
columns.append(", ")
values.append(column.value())
values.append(", ")
}
// chop off the trailing ", " in place (substring would return a String, which has no append)
columns.setLength(columns.length() - 2)
columns.append(") ")
values.setLength(values.length() - 2)
values.append(")")
columns.append(values)
def result = columns.toString()
You can find all sorts of Groovy string manipulation operators here.

Related

Add single quotes around string in python

I have converted a list to a string, but after the conversion the string has no single quotes around each element.
For example:
items = ['aa','bb','cc']
items = ','.join(items)
Output: aa,bb,cc
Expected output: 'aa','bb','cc'
You could use a list comprehension to quote the individual strings in the list:
items = ['aa','bb','cc']
items = ','.join([f"'{i}'" for i in items])
print(items) # 'aa','bb','cc'
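If the quoting is needed in more than one place, the list-comprehension approach could be wrapped in a small helper. This is only a sketch: quote_join and its sep and quote parameters are hypothetical names, not anything from the standard library, and it assumes the items themselves contain no quote characters.

```python
def quote_join(items, sep=",", quote="'"):
    """Wrap each item in `quote` and join the results with `sep`.

    Hypothetical helper; assumes items do not contain `quote` themselves.
    """
    return sep.join(f"{quote}{item}{quote}" for item in items)


items = ["aa", "bb", "cc"]
result = quote_join(items)
print(result)  # 'aa','bb','cc'
```

Note that print shows the string itself. Evaluating the expression in an interactive session instead shows its repr, which adds an extra pair of outer quotes.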
One way to accomplish this is by passing the list into a string formatter, which will place the outer quotes around each list element. The list is mapped to the formatter, then joined, as you have shown.
For example:
','.join(map("'{}'".format, items))
Output:
"'aa','bb','cc'"

Is there a simple way to parse comma separated Key:Value pairs in Excel, Power Query or VBA if the values contain unescaped commas?

I'm working with a CSV export of identity data containing ~22000 records. One of the fields included is titled 'ExtendedAttributes' and each cell in the column contains a quote bound string of comma separated Key:Value pairs. Each record in the file has an arbitrary number of extended attributes (up to around 50). My ultimate objective is to expand these extended attributes into their own columns in Excel (2016). I already have solutions for the expansions into columns from other data using formulae, simple VBA and most recently Power Query based approaches.
However, my previous solutions have all been based on the Key:Value pairs being simple to delimit. In this export, the ExtendedAttributes field has:
Value data that may contain unescaped/unquoted commas. e.g.
"Key1: Value1, name: surname, forename, Key2: Value2, ... "
Keys that may contain multiple comma separated values, which are also unquoted/unescaped. e.g.
"Key1: Value1, emailAlias: alias1#domain, alias2#domain, alias3#domain, Key2: Value2, ... "
My usual approach, where the Key:Value pairs don't have these problems, is to split on commas to get the key-value pairs, transpose the data into rows, then split on the colon to populate my new columns and their values, as described in the Power BI community pages.
This doesn't work here because delimiting using a comma breaks the values.
Is there a straightforward way to parse this into the constituent Key:Value pairs using (ideally) Power Query? Happy to also go with VBA or formula based solutions.
My instinctive approach would be to identify substrings containing a colon and prepend them with a unique character, which can then be used as a delimiter. (It's not impossible that the data also includes unescaped colons, but I'm happy to assume that it doesn't.) I recognise, though, that this may be needlessly complex, and I'm unsure how best to do it.
I'm happy to keep values with multiple comma separated items as a single unit (A problem for me to deal with later).
For the example data:
"Key1: Value1, name: surname, forename, emailAlias: alias1#domain, alias2#domain, alias3#domain, Key2: Value2, ... "
I'd like to end up with something that lets me treat the data like this, using maybe a ! as an example unique character that I could then use as a delimiter:
"Key1: Value1!name: surname, forename!emailAlias: alias1#domain, alias2#domain, alias3#domain!Key2: Value2!..."
I don't have access to the original data (vendor controlled system) and have limited data processing tools on my corporate desktop (Excel 2016, VBA, PQ).
Appreciate any help.
In Power Query, you can define a function Partition as follows:
let
Output = (str as text, sep as text) as text =>
Text.RemoveRange(
Text.Replace(
Text.Combine(
List.Transform(
Text.Split(str, " "),
each if Text.Contains(_, ":") then sep & _ else _
),
" "
), ", " & sep, sep
),
0, Text.Length(sep)
)
in
Output
Example text transformation using separator !
Starting text:
Key1: Value1, name: surname, forename, emailAlias: alias1#domain
Split the string based on spaces into a list
Key1:
Value1,
name:
surname,
forename,
emailAlias:
alias1#domain
Prepend any list items containing : with separator !
!Key1:
Value1,
!name:
surname,
forename,
!emailAlias:
alias1#domain
Combine the list back into a string
!Key1: Value1, !name: surname, forename, !emailAlias: alias1#domain
Replace , ! with !
!Key1: Value1!name: surname, forename!emailAlias: alias1#domain
Remove the first separator
Key1: Value1!name: surname, forename!emailAlias: alias1#domain
Once you have this function defined, you can call it in a column transformation that would look something like
= Table.TransformColumns(#"Prev Step", {{"ColName", each Partition(_,"!") , type text}})
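For readers who want to check the logic outside Power Query, the same steps can be sketched in Python. This is a rough translation, not part of the original answer: the function name partition mirrors the Power Query function, and it relies on the same assumptions (keys contain no spaces, values contain no colons, and the text starts with a key).

```python
def partition(text: str, sep: str) -> str:
    """Mark each 'Key:' token with `sep`, mirroring the Power Query steps."""
    # 1. Split the string on spaces.
    tokens = text.split(" ")
    # 2. Prepend the separator to every token that contains a colon.
    marked = [sep + tok if ":" in tok else tok for tok in tokens]
    # 3. Recombine, then collapse ", <sep>" into just "<sep>".
    combined = " ".join(marked).replace(", " + sep, sep)
    # 4. Drop the separator that was added in front of the first key.
    return combined[len(sep):]


source = "Key1: Value1, name: surname, forename, emailAlias: alias1#domain"
marked_up = partition(source, "!")
print(marked_up)

# Once marked, the pairs split cleanly on the separator and the first ": ".
pairs = dict(item.split(": ", 1) for item in marked_up.split("!"))
print(pairs)
```

Values that contain commas, such as "surname, forename", stay together as a single unit, which is the behaviour the question asks for.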
You do not say anything about the parts of each record that fall outside the "ExtendedAttributes" field, so I prepared a function that handles only that field. Please use the following code:
Function separateKeys(x As String, sep As String) As String
Dim arr1, arr2, i As Long, k As Long
arr1 = Split(x, ": ")
ReDim arr2(UBound(arr1))
For i = 0 To UBound(arr1) - 1
If arr1(i + 1) = arr1(UBound(arr1)) Then Exit For
arr2 = Split(arr1(i + 1), " ")
arr2(UBound(arr2) - 1) = Replace(arr2(UBound(arr2) - 1), ",", sep)
arr1(i + 1) = Join(arr2, " ")
Next
separateKeys = Replace(Join(arr1, ":"), sep & " ", sep)
End Function
The above function can (probably) be adapted to skip the rest of each record, or to also turn every remaining comma into the sep character (simply by using Replace).
To test the function, please use the following testing Sub:
Sub testSepKeys()
Dim x As String, sep As String
sep = "|" 'any other character works, provided it is unlikely to appear in the processed text
x = "Key1: Value1, name: surname, forename, emailAlias: alias1#domain, alias2#domain, alias3#domain, Key2: Value2, Value3, key3: Val1, Val2"
Debug.Print separateKeys(x, sep)
End Sub
As an overall approach, I would suggest splitting the file on the line separator, processing all the array elements (lines) with the above (adapted) function, and finally joining them back on the line separator.
The newly created file should then be opened using Workbooks.OpenText with DataType:=xlDelimited, Other:=True, OtherChar:=sep.
Please, test the above function and send some feedback.

groovy iterate through list of key and value

I have this list:
service_name_status=[a-service=INSTALL, b-service=UPGRADE, C-service=UPGRADE, D-service=INSTALL]
And I need to iterate through this list so that the first element becomes the value of a parameter called "SERVICE_NAME" and the second element becomes the value of a parameter called "HELM_COMMAND".
After assigning those values to the parameters, I will run my command that uses them and then continue with the next items on the list, which should replace the values of the parameters with items 3 and 4, and so on.
So what I am looking for is something like that:
def service_name_status=[a-service=INSTALL, b-service=UPGRADE, C-service=UPGRADE, D-service=INSTALL]
def SERVICE_NAME
def HELM_COMMAND
for(x in service_name_status){
SERVICE_NAME=x(0,2,4,6,8...)
HELM_COMMAND=x(1,3,5,7,9...)
println SERVICE_NAME=$SERVICE_NAME
println HELM_COMMAND=$HELM_COMMAND
}
the output should be:
SERVICE_NAME=a-service
HELM_COMMAND=INSTALL
SERVICE_NAME=b-service
HELM_COMMAND=UPGRADE
SERVICE_NAME=c-service
HELM_COMMAND=UPGRADE
SERVICE_NAME=d-service
HELM_COMMAND=INSTALL
and so on...
I couldn't find anything that takes every other element in Groovy; any help will be appreciated.
The collection you want is a Map, not a List.
Take note of the quotes in the map: the keys and values are strings, so you need the quotes or it won't work. You may have to change that at the source where your data comes from.
I kept your all-caps variable names so you will feel at home, but they are not the convention.
Note the map iteration with .each { key, value -> ... }
This will work:
Map service_name_status = ['a-service':'INSTALL', 'b-service':'UPGRADE', 'C-service':'UPGRADE', 'D-service':'INSTALL']
service_name_status.each {SERVICE_NAME, HELM_COMMAND ->
println "SERVICE_NAME=${SERVICE_NAME}"
println "HELM_COMMAND=${HELM_COMMAND}"
}
EDIT:
The following can be used to convert that to a map. Be careful, the replaceAll part is fragile and depends on the data to always look the same.
//assuming you can have it in a string like this
String st = "[a-service=INSTALL, b-service=UPGRADE, C-service=UPGRADE, D-service=INSTALL]"
//this part is dependent on format
String mpStr = st.replaceAll(/\[/, "['")
.replaceAll(/=/, "':'")
.replaceAll(/]/, "']")
.replaceAll(/, /, "', '")
println mpStr
//convert the properly formatted string to a map
Map mp = evaluate(mpStr)
assert mp instanceof java.util.LinkedHashMap

Insert variable number of items with FORMAT ( )

I have a text with some strings that I want to replace with a variable. For example:
message_template = """I am a message for {user} and you have purchased the following items {items} with color {color}"""
There I want to replace {user}, {items} and {color} with variables using the following code:
message = message_template.format(user='Ali', ID = ID1)
The problem is that in some cases I will have one item and in other cases more than 5 and I need to insert them independently. Also, color and item are part of a Dataframe.
Any idea about how could I insert a changing number of variables with .format( )?
Thanks
For multiple items, convert your list to a string using ', '.join(items):
items = ['i1','i2','i3']
message = message_template.format(user='Ali', items = ', '.join(items), color='orange')
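The same idea works whether there is one item or many, because the template only ever sees a single pre-joined string per placeholder. A minimal sketch, assuming the items and colors are available as plain lists; the render helper is a hypothetical name introduced for illustration.

```python
message_template = (
    "I am a message for {user} and you have purchased the following "
    "items {items} with color {color}"
)


def render(user, items, colors):
    """Fill the template; each list is joined into one string first."""
    return message_template.format(
        user=user,
        items=", ".join(items),
        color=", ".join(colors),
    )


one = render("Ali", ["i1"], ["orange"])
many = render("Ali", ["i1", "i2", "i3"], ["orange", "blue", "red"])
print(one)
print(many)
```

If the items and colors come from DataFrame columns, something like df["item"].tolist() would presumably supply the lists; the column name here is an assumption.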

Excel wrongly converts ranges into dates, how to avoid it?

I have a .tsv file with some fields being ranges like 1 - 4. I want to read these fields exactly as they are written. However, when the file is opened, Excel automatically converts those range fields to dates. For instance, 1 - 4 is converted to 4-Jan. If I try to format the cell back to another type, the value has already changed and I can only get a useless number (39816). Even if the range fields are within double quotes, the wrong conversion to date still takes place. How can I avoid this behavior?
I think your best option is to use the import facility in Excel, though you may have to manually change the file extension to .csv.
When importing, be sure to select Text for all the columns with these values.
My question is in fact a duplicate of at least:
1) Stop Excel from automatically converting certain text values to dates
2) Excel: Default to TEXT rather than GENERAL when opening a .csv file
The possible solutions for Excel are 1) writing the fields with special double quotes, e.g. "May 16, 2011" as "=""May 16, 2011""", or 2) importing the csv/tsv file with the external data wizard and then manually selecting which columns to read as TEXT rather than GENERAL (which may convert fields to dates).
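For anyone who does want to try option 1, the rewrite could be scripted. The following is only a sketch under stated assumptions: protect and rewrite_tsv are hypothetical helper names, the regular expression is just one guess at what a "range-like" field looks like, and how Excel treats the result in a .tsv file has not been verified here.

```python
import csv
import io
import re

# Assumption: a "range" is two integers separated by a dash, e.g. "1 - 4".
RANGE_LIKE = re.compile(r"^\s*\d+\s*-\s*\d+\s*$")


def protect(value):
    """Wrap range-like values as an Excel text formula: ="1 - 4"."""
    if RANGE_LIKE.match(value):
        return '="{}"'.format(value.replace('"', '""'))
    return value


def rewrite_tsv(src, dst):
    """Copy TSV rows from src to dst, protecting range-like fields."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        # The csv writer doubles the embedded quotes, giving "=""1 - 4""".
        writer.writerow([protect(field) for field in row])


source = io.StringIO("id\trange\n1\t1 - 4\n")
target = io.StringIO()
rewrite_tsv(source, target)
print(target.getvalue())
```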
As for my use case, I was only using Excel to remove some columns. Neither solution appealed to me, because I didn't want to rewrite the tsv files with special quotes, and because I had hundreds of columns and didn't want to manually select each one to be read as TEXT.
Therefore I wrote a Scala script to filter tsv files by column names:
package com.jmcejuela.ml
import java.io.InputStream
import java.io.Writer
import scala.io.Codec
import scala.io.Source
import Table._
/**
* Class to represent tables with a fixed size of columns. All rows have the same columns.
*/
class Table(val rows: Seq[Row]) {
lazy val numDiffColumns = rows.foldLeft(Set[Int]())((set, row) => set + row.size)
def toTSV(out: Writer) {
if (rows.isEmpty) out.write(TableEmpty.toString)
else {
out.write(writeLineTSV(rows.head.map(_.name))) //header
rows.foreach(r => out.write(writeLineTSV(r.map(_.value))))
out.close
}
}
/**
* Get a Table with only the given columns.
*/
def filterColumnsByName(columnNames: Set[String]): Table = {
val existingNames = rows.head.map(_.name).toSet
assert(columnNames.forall(n => existingNames.contains(n)), "You want to include column names that do not exist")
new Table(rows.map { row => row.filter(col => columnNames.contains(col.name)) })
}
}
object TableEmpty extends Table(Seq.empty) {
override def toString = "Table(Empty)"
}
object Table {
def apply(rows: Row*) = new Table(rows)
type Row = Array[Column]
/**
* Column representation. Note that each column has a name and a value. Since the class Table
* is a sequence of rows which are a size-fixed array of columns, the name field is redundant
* for Table. However, this column representation could be used in the future to support
* schemata-less tables.
*/
case class Column(name: String, value: String)
private def parseLineTSV(line: String) = line.split("\t")
private def writeLineTSV(line: Seq[String]) = line.mkString("", "\t", "\n")
/**
* It is assumed that the first row gives the names to the columns
*/
def fromTSV(in: InputStream)(implicit encoding: Codec = Codec.UTF8): Table = {
val linesIt = Source.fromInputStream(in).getLines
if (linesIt.isEmpty) TableEmpty
else {
val columnNames = parseLineTSV(linesIt.next)
val padding = {
//add padding of empty columns-fields to lines that do not include last fields because they are empty
def infinite[A](x: A): Stream[A] = x #:: infinite(x)
infinite("")
}
val rows = linesIt.map { line =>
((0 until columnNames.size).zip(parseLineTSV(line) ++: padding).map { case (index, field) => Column(columnNames(index), field) }).toArray
}.toStream
new Table(rows)
}
}
}
Write 01-04 instead of 1-4 in Excel.
I had a "text" formatted cell in Excel being populated with a chemical CAS number with the value "8013-07-8" that was being reformatted into a date. To remedy the problem, I concatenated a single quote to the beginning of the value, and it rendered correctly when viewing the results. When you click on the cell, you see the prefixed single quote, but at least I stopped seeing it as a date.
In my case, when I typed 5-14 in cell D2, Excel converted it to the date 14 May. With help from somebody, I was able to change the date back to the number range (5-14) using the following approach, and I wanted to share it (using my case as an example).
Using the cell format in Excel, I first converted the date in D2 (14 May) to a number (in my case it gave me 43599).
Then I used the formula below to convert it to 5-14:
=IF(EXACT(D2, 43599), "5-14", D2)
