XmlSlurper node removal issues (Groovy)

I have an XML file from which I want to remove a few nodes. My approach is to iterate over the children of the root node and write the nodes I want to keep to another file.
One problem I see: the attributes of each node are reordered in the written file, which I don't want.
My code looks like this:
def xml = new XmlSlurper(false, false).parse(args[0])
ant.delete(file: fileName)
File f = new File(fileName)
xml.children().each {
    String str = it.@name
    // "someCondition" is a placeholder for the actual test
    if (someCondition == false)
        f << groovy.xml.XmlUtil.serialize(it)
}
Another problem is that it inserts the following at the beginning of every node:
<?xml version="1.0" encoding="UTF-8"?>

The question does not include a concrete example of the XML. Here is an example of how a node can be removed:
import groovy.xml.XmlUtil
def xml = '''
<School>
  <Classes>
    <Class>
      <Teachers>
        <Name>Rama</Name>
        <Name>Indhu</Name>
      </Teachers>
      <Name>Anil</Name>
      <RollNumber>16</RollNumber>
    </Class>
    <Class>
      <Teachers>
        <Name>Nisha</Name>
        <Name>Ram</Name>
      </Teachers>
      <Name>manu</Name>
      <RollNumber>21</RollNumber>
    </Class>
  </Classes>
</School>
'''
def parsed = new XmlSlurper().parseText( xml )
parsed.'**'.each {
    if(it.name() == 'Teachers') {
        it.replaceNode { }
    }
}
XmlUtil.serialize( parsed )
In the above example, the Teachers nodes are removed by doing a depth-first traversal, checking each node's name, and calling replaceNode with an empty closure on every match. I hope this can serve as the logic you want.
PS: I have omitted the File operations for brevity.
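For comparison, here is a minimal sketch of the same removal using the JDK's standard DOM and Transformer APIs, written in Java rather than Groovy. It removes the nodes in place and serializes the whole document once, so no XML declaration is written per node, and OMIT_XML_DECLARATION suppresses the single remaining one. The class and method names are hypothetical, and the element name Teachers is taken from the example above. It does not address attribute order, which XML treats as insignificant and which DOM does not promise to preserve.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class NodeRemovalSketch {

    // Removes every element named tagName and returns the serialized document.
    static String removeElements(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // getElementsByTagName returns a live list, so copy it before removing nodes.
            NodeList live = doc.getElementsByTagName(tagName);
            List<Node> toRemove = new ArrayList<>();
            for (int i = 0; i < live.getLength(); i++) {
                toRemove.add(live.item(i));
            }
            for (Node node : toRemove) {
                node.getParentNode().removeChild(node);
            }

            // Serialize the whole document once, without the XML declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<School><Class><Teachers><Name>Rama</Name></Teachers>"
                + "<Name>Anil</Name></Class></School>";
        System.out.println(removeElements(xml, "Teachers"));
    }
}
```

Writing the returned string to a file in one step replaces the per-node append used in the question.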

The API works with a replacement stack, so the effect of replaceNode {} becomes visible only when you serialize the node, for example:
GPathResult body = parsePath.Body
int oldSize = parsePath.children().size()
body.list()
body[0].replaceNode {} // Removes the node, but the change is not visible to the parent object, only at serialization time, because the API pushes it onto a stack of changes that is applied during serialization
String newXmlContent = XmlUtil.serialize(parsePath)
GPathResult newParsePath = new XmlSlurper().parseText(newXmlContent)
int newSize = newParsePath.children().size()
assertNotNull(this.parsePath)
assertEquals(2, oldSize)
assertEquals(1, newSize)
assertTrue(newSize < oldSize)
assertNotNull(body)

Related

How to make Flexmark-Java process "@"-mentions?

When processing Markdown, GitHub supports the @-syntax to mention a user or team. How can such mentions be processed with Flexmark-Java? With the following code snippet, Hello, @world ! is reported as a single Text node, whereas I would expect world to be reported separately as some kind of MentionsNode:
final Parser parser = Parser.builder(new Parser.Builder()).build();
final Document document = parser.parse("Hello, @world !");
new NodeVisitor(Collections.emptyList()) {
    public void visit(Node node) {
        System.out.println("Node: " + node);
        super.visit(node);
    }
}.visit(document);
Output:
Node: Document{}
Node: Paragraph{}
Node: Text{text=Hello, @world !}

There is a corresponding extension for this:
final MutableDataHolder options = new MutableDataSet()
.set(Parser.EXTENSIONS, Collections.singletonList(GfmUsersExtension.create()));
final Parser parser = Parser.builder(options).build();
final Document document = parser.parse("Hello, @world, and #1!");
...

Problems retrieving content controls with Open XML sdk

I am developing a solution that generates Word documents. The documents are generated from a template document that defines content controls. Everything worked well when I had only one content control in my template, but after adding more content controls to the template I am getting exceptions. It seems that I am not finding the content controls.
This is my method:
private void CreateReport(File file)
{
    var byteArray = file.OpenBinary();
    using (var mem = new MemoryStream())
    {
        mem.Write(byteArray, 0, byteArray.Length);
        try
        {
            using (var wordDoc = WordprocessingDocument.Open(mem, true))
            {
                var mainPart = wordDoc.MainDocumentPart;
                var firstName = mainPart.Document.Body.Descendants<SdtBlock>().Where
                    (r => r.SdtProperties.GetFirstChild<Tag>().Val == "FirstName").Single();
                var t = firstName.Descendants<Text>().Single();
                t.Text = _firstName;
                var lastName = mainPart.Document.Body.Descendants<SdtBlock>().Where
                    (r => r.SdtProperties.GetFirstChild<Tag>().Val == "LastName").Single();
                var t2 = lastName.Descendants<Text>().Single();
                t2.Text = _lastName;
                mainPart.Document.Save();
                SaveFileToSp(mem);
            }
        }
        catch (FileFormatException)
        {
        }
    }
}
This is the exception I get:
An exception of type 'System.InvalidOperationException' occurred in System.Core.dll but was not handled in user code. Innerexception: Null
Any tips on how I can write a better method for finding the controls?

Your issue is that one (or more) of your calls to Single() is being called on a sequence that has more than one element. The documentation for Single() states (emphasis mine):
Returns the only element of a sequence, and throws an exception if there is not exactly one element in the sequence.
In your code this can happen in one of two scenarios. The first is if you have more than one control with the same Tag value, for example you might have two controls in the document labelled "LastName" which would mean that this line
var lastName = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == "LastName")
would return two elements.
The second is if your content control has more than one Text element in it in which case this line
var t = firstName.Descendants<Text>();
would return multiple elements. For example if I create a control with the content "This is a test" I end up with XML which has 4 Text elements:
<w:p w:rsidR="00086A5B" w:rsidRDefault="003515CB">
<w:r>
<w:rPr>
<w:rStyle w:val="PlaceholderText" />
</w:rPr>
<w:t xml:space="preserve">This </w:t>
</w:r>
<w:r>
<w:rPr>
<w:rStyle w:val="PlaceholderText" />
<w:i />
</w:rPr>
<w:t>is</w:t>
</w:r>
<w:r>
<w:rPr>
<w:rStyle w:val="PlaceholderText" />
</w:rPr>
<w:t xml:space="preserve"> </w:t>
</w:r>
<w:r w:rsidR="00E1178E">
<w:rPr>
<w:rStyle w:val="PlaceholderText" />
</w:rPr>
<w:t>a test</w:t>
</w:r>
</w:p>
How to get round the first issue depends on whether you wish to replace all of the matching Tag elements or just one particular one (such as the first or last).
If you want to replace just one you can change the call from Single() to First() or Last() for example but I guess you need to replace them all. In that case you need to loop around each matching element for each tag name you wish to replace.
Removing the call to Single() will return an IEnumerable<SdtBlock> which you can iterate around replacing each one:
IEnumerable<SdtBlock> firstNameFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
    (r => r.SdtProperties.GetFirstChild<Tag>().Val == "FirstName");
foreach (var firstName in firstNameFields)
{
    var t = firstName.Descendants<Text>().Single();
    t.Text = _firstName;
}
To get around the second problem is slightly more tricky. The easiest solution in my opinion is to remove all of the existing paragraphs from the content and then add a new one with the text you wish to output.
Breaking this out into a method probably makes sense as there's a lot of repeated code - something along these lines should do it:
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
    //grab all the tag fields
    IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
        (r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
    foreach (var field in tagFields)
    {
        //remove all paragraphs from the content block
        field.SdtContentBlock.RemoveAllChildren<Paragraph>();
        //create a new paragraph containing a run and a text element
        Paragraph newParagraph = new Paragraph();
        Run newRun = new Run();
        Text newText = new Text(tagValue);
        newRun.Append(newText);
        newParagraph.Append(newRun);
        //add the new paragraph to the content block
        field.SdtContentBlock.Append(newParagraph);
    }
}
Which can then be called from your code like so:
using (var wordDoc = WordprocessingDocument.Open(mem, true))
{
    var mainPart = wordDoc.MainDocumentPart;
    ReplaceTags(mainPart, "FirstName", _firstName);
    ReplaceTags(mainPart, "LastName", _lastName);
    mainPart.Document.Save();
    SaveFileToSp(mem);
}

I think your Single() call is causing the exception.
When you have only one content control, Single() returns the only available element. Once you add more content controls, Single() can throw an InvalidOperationException because the sequence contains more than one element. If that is the case, loop over the results and handle one element at a time.

Writing dynamic query results into file

I am trying to write a generic program in Groovy that reads the SQL from a config file, along with other parameters, and writes the query results to a file.
Here is the program:
def config = new ConfigSlurper().parse(new File("config.properties").toURL())
Sql sql = Sql.newInstance(config.db.url, config.db.login, config.db.password, config.db.driver);
def fileToWrite = new File(config.copy.location)
def writer = fileToWrite.newWriter()
writer.write(config.file.headers)
sql.eachRow(config.sql) { res ->
    writer.write(config.file.rows)
}
In the config, the SQL is something like this:
sql="select * from mydb"
and
file.rows="${res.column1}|${res.column2}|${res.column3}\n"
when I run it I get
[:]|[:]|[:]
[:]|[:]|[:]
[:]|[:]|[:]
in the file. If I replace
writer.write(config.file.rows)
with
writer.write("${res.column1}|${res.column2}|${res.column3}\n")
it outputs the actual results. What do I need to do differently to get the results?

You can accomplish this by using lazy evaluation of the GString combined with changing the delegate.
First make the GString lazy by making each value the result of calling a closure:
file.rows="${->res.column1}|${->res.column2}|${-> res.column3}"
Then, before evaluating the GString, set the delegate of the closures:
config.file.rows.values.each {
    if (Closure.class.isAssignableFrom(it.getClass())) {
        it.resolveStrategy = Closure.DELEGATE_FIRST
        it.delegate = this
    }
}
The delegate must have the variable res in scope. Here is a full working example:
class Test {
    Map res
    void run() {
        String configText = '''file.rows="${->res.column1}|${->res.column2}|${-> res.column3}"
sql="select * from mydb"'''
        def slurper = new ConfigSlurper()
        def config = slurper.parse(configText)
        config.file.rows.values.each {
            if (Closure.class.isAssignableFrom(it.getClass())) {
                it.resolveStrategy = Closure.DELEGATE_FIRST
                it.delegate = this
            }
        }
        def results = [
            [column1: 1, column2: 2, column3: 3],
            [column1: 4, column2: 5, column3: 6],
        ]
        results.each {
            res = it
            println config.file.rows.toString()
        }
    }
}
new Test().run()

The good news is that the ConfigSlurper is more than capable of doing the GString variable substitution for you as intended. The bad news is that it does this substitution when it calls the parse() method, way up above, long before you have a res variable to substitute into the parser. The other bad news is that if the variables being substituted are not defined in the config file itself, then you have to supply them to the slurper in advance, via the binding property.
So, to get the effect you want, you have to parse the properties on each pass of eachRow. Does that mean you have to create a new ConfigSlurper and re-read the file once for every row? No. You will have to create a new ConfigObject for each pass, but you can reuse the ConfigSlurper and the file text, as follows:
def slurper = new ConfigSlurper();
def configText = new File("scripts/config.properties").text
def config = slurper.parse(configText)
Sql sql = Sql.newInstance(config.db.url, config.db.login, config.db.password, config.db.driver);
def fileToWrite = new File(config.copy.location)
def writer = fileToWrite.newWriter()
writer.write(config.file.headers)
sql.eachRow(config.sql) { result ->
    slurper.binding = [res: result]
    def reconfig = slurper.parse(configText)
    print(reconfig.file.rows)
}
Please notice that I changed the name of the Closure parameter from res to result. I did this to emphasize that the slurper was drawing the name res from the binding map key, not from the closure parameter name.
If you want to reduce wasted "reparsing" time and effort, you could move the file.rows property into its own separate file. I would still read that file's text once and reuse the text in the per-row parsing.
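Both answers come down to deferring the substitution until a row is available. As a rough illustration of that idea without any GString machinery, the row format can be kept as plain text and its placeholders resolved per row. The sketch below is in Java and is not ConfigSlurper's mechanism: the class and method names are hypothetical, the ${res.name} placeholder syntax is borrowed from the question, and a plain Map stands in for a result row.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RowTemplateSketch {

    // Matches placeholders of the form ${res.columnName}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{res\\.(\\w+)\\}");

    // Resolves every placeholder in the template against the given row.
    static String render(String template, Map<String, ?> row) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = row.get(m.group(1));
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(value)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The template is read once; it is only text until a row is supplied.
        String template = "${res.column1}|${res.column2}|${res.column3}\n";
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("column1", 1);
        row.put("column2", 2);
        row.put("column3", 3);
        System.out.print(render(template, row));
    }
}
```

The trade-off is that the template can only reference column values, whereas the GString approaches above can evaluate arbitrary Groovy expressions.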

Parsing attributes and Values using Groovy's XmlParser

I have the following part of an XML file:
<properties>
<project.version>module.version</project.version>
<ie.version>17-8-103</ie.version>
<leg_uk.version>17-6-6</leg_uk.version>
<leg_na.version>17-8-103</leg_na.version>
</properties>
I want to generate a file with the following content:
ie.project.version = 17-8-103
leg_uk.project.version = 17-8-103
How can I generate such a file?

Is this what you're looking for?
def txt = """<properties>
<project.version>module.version</project.version>
<ie.version>17-8-103</ie.version>
<leg_uk.version>17-6-6</leg_uk.version>
<leg_na.version>17-8-103</leg_na.version>
</properties>"""
def xml = new XmlParser().parseText(txt)
new File('tmp').withWriter { w ->
    xml.children().each {
        w << "${it.name()}=${it.value().text()}\n"
    }
}

XML parsing in C# 4.0

I have the following XML to parse:
<config>
<ParametricTesting>Y</ParametericTesting>
<FunctionalTesting>Y</FunctionalTesting>
<Utilities>N</Utilities>
<CommonApi>N</CommonApi>
<ClientData>N</ClientData>
<DataSourceTest>Y<DataSourceTest>
<Excel>
<ExcelFilePath>myexcel1.xlsx</ExcelFilePath>
</Excel>
<Access>
<AccessDB> </AccessDB>
</Access>
<Sql>
<SqlConnectionString> </SqlConnectionString>
</Sql>
<RunnerConsole>N</RunnerConsole>
<Schedular>N</Schedular>
</config>
I am using XmlReader to read the XML, but since I am new to C# I don't know why the code breaks after reading the second tag, i.e. ParametricTesting.
Code:
string ConfigXml = Path.GetFullPath("Config.xml");
XmlReader xmlReader = XmlReader.Create(ConfigXml);
while (xmlReader.Read()) {
    if ((xmlReader.NodeType == XmlNodeType.Element) && xmlReader.Name.Equals("ParametricTesting")) {
        // TODO : write code relevant for parametric testing
        xmlReader.Read();
    }
    else if ((xmlReader.NodeType == XmlNodeType.Element) && xmlReader.Name.Equals("DataSourceTest")) {
        string Datasource = xmlReader.GetAttribute("DataSourceTest");
        if (Datasource.Equals("Y")) {
            if (xmlReader.Name.Equals("Excel") && (xmlReader.NodeType == XmlNodeType.Element)) {
                string excelfile = xmlReader.GetAttribute("ExcelFilePath");
                string ExcelPath = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location) + "\\Files\\" + excelfile;
                objExcel.DataSourceName = excelfile;
                objExcel.Open();
            }
        }
        xmlReader.Read();
    }
    xmlReader.Read();
}
My code does not read any element beyond ParametricTesting. Please help.

The opening tag of "ParametricTesting" in Config.xml is spelled differently from its closing tag ("ParametericTesting"). Correct it and that line parses.
Also, the "DataSourceTest" tag is never closed.
Here is the fixed XML:
<config>
<ParametricTesting>Y</ParametricTesting>
<FunctionalTesting>Y</FunctionalTesting>
<Utilities>N</Utilities>
<CommonApi>N</CommonApi>
<ClientData>N</ClientData>
<DataSourceTest>Y</DataSourceTest>
<Excel>
<ExcelFilePath>myexcel1.xlsx</ExcelFilePath>
</Excel>
<Access>
<AccessDB> </AccessDB>
</Access>
<Sql>
<SqlConnectionString> </SqlConnectionString>
</Sql>
<RunnerConsole>N</RunnerConsole>
<Schedular>N</Schedular>
</config>
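Mismatched or unclosed tags like these are well-formedness errors, so any XML parser will reject the file and usually report the line where it gave up. As a minimal sketch of such a check, here is a Java version using the JDK's standard SAX parser (not the C# XmlReader from the question). The class and method names are hypothetical, and the two sample strings are shortened versions of the XML above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class WellFormedCheckSketch {

    // Returns null when the XML is well-formed, otherwise a description of the first error.
    static String firstError(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // The parser stops at the first fatal error and records its position.
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        String broken = "<config>\n<ParametricTesting>Y</ParametericTesting>\n</config>";
        String fixed = "<config>\n<ParametricTesting>Y</ParametricTesting>\n</config>";
        System.out.println("broken: " + firstError(broken));
        System.out.println("fixed:  " + firstError(fixed));
    }
}
```

In C#, the analogous signal is the XmlException thrown by XmlReader.Read(), which carries the line number and position of the error.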
