reading XSD from file using eclipse-mdt-xsd - xsd

I'm trying to use the MDT-XSD library to parse XSD files, but I can't figure out how to create an XSDSchema object from an XSD file, or whether there is a way to also read the files referenced through xs:include, etc.

I found a way to do it. Not sure if this is the easiest way.
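DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
docFactory.setNamespaceAware(true);
// The original called newDocumentBuilder() on an undeclared variable named factory.
DocumentBuilder builder = docFactory.newDocumentBuilder();
InputSource is = new InputSource(new FileReader(file));
Document doc = builder.parse(is);
// getDocumentElement() returns the root element even when a comment precedes it.
XSDSchema xsd = XSDSchemaImpl.createSchema(doc.getDocumentElement());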


Error parsing xml when namespace directives are present

I need to read multiple XML-files using node.js. When the root node contains namespace directives, parsing the xml file fails. When removing the namespace directives, all works well. All my files can have different declarations. How do I parse the XML, ignoring the namespace attributes? I need to use xPath to get some values.
I'm using ...
var fs = require('fs');
var xpath = require('xpath');
var dom = require('xmldom').DOMParser;
var xml = fs.readFileSync('/test.xml', 'utf8').toString();
var doc = new dom().parseFromString(xml);
var id = xpath.select("/export/asset/id", doc);
console.log(id[0].firstChild.data);
XML-file
<export xmlns="some url" xmlns:xsi="some url" format="archive" version="2.4" xsi:schemaLocation="some url.xsd">
<asset>
<id>1445254514291</id>
<name>test</name>
<displayName />
<origin>demo</origin>
</asset>
</export>
Generally speaking, taking into account namespaces is preferable, but if you have to deal with too many of these, one way to avoid them altogether is to use xpath and the somewhat convoluted, but effective, local-name() function.
So I would change
var id = xpath.select("/export/asset/id", doc);
to
var id = xpath.select("//*[local-name()='export']//*[local-name()='asset']//*[local-name()='id']", doc);
With the sample xml in the question, this should output:
"1445254514291"
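The same idea can be sketched outside node.js. The block below is a minimal Java illustration using only the JDK's built-in XPath support, not the xpath/xmldom packages from the question. The class name LocalNameXPathDemo, the helper evaluate, and the namespace URI urn:example:export are made up for the example. It uses single slashes rather than //, which keeps the match anchored to the export/asset/id path instead of any descendant.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of any library mentioned in the thread.
final class LocalNameXPathDemo {

    // Parses the XML namespace-aware and returns the string value of the expression.
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the file in the question; the namespace URI is an assumption.
        String xml = "<export xmlns=\"urn:example:export\" version=\"2.4\">"
                + "<asset><id>1445254514291</id><name>test</name></asset>"
                + "</export>";

        // Unprefixed names in XPath 1.0 only match elements in no namespace.
        String plain = evaluate(xml, "/export/asset/id");

        // local-name() compares the element name without its namespace.
        String byLocalName = evaluate(xml,
                "/*[local-name()='export']/*[local-name()='asset']/*[local-name()='id']");

        System.out.println("plain path:   [" + plain + "]");
        System.out.println("local-name(): [" + byLocalName + "]");
    }
}
```

With the default namespace declared on the root, the plain path selects nothing and evaluates to an empty string, while the local-name() path returns the id text.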

cucumber hook scenario.embed always create screenshot at project root

In my Cucumber hook, scenario.embed always creates the screenshot in my project root directory. I need it to be created in a different location.
scenario.embed(screenshot, "image/png");
I wrote the code below, but it still does not solve the problem:
File screenShot=((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
// extracting date for folder name.
SimpleDateFormat dateFormatForFoldername = new SimpleDateFormat("yyyy-MM-dd");//dd/MM/yyyy
Date currentDate = new Date();
String folderDateFormat = dateFormatForFoldername.format(currentDate);
// extracting date and time for snapshot file
SimpleDateFormat dateFormatForFileName = new SimpleDateFormat("yyyy-MM-dd HH-mm-ss");//dd/MM/yyyy
String fileDateFormet = dateFormatForFileName.format(currentDate);
String filefolder="./ScreenShots"+"/FailCase/"+folderDateFormat+"/";
// Creating folders and files
File screenshot = new File(filefolder+fileDateFormet+".png"); // Selenium screenshots are PNG data, matching the "image/png" MIME type below
FileUtils.copyFile(screenShot, screenshot);
byte[] fileContent = Files.readAllBytes(screenshot.toPath());
scenario.embed(fileContent, "image/png");
How can I pass a directory path to the embed function, or override it?
@mpkorstanje pointed out the right approach.
As per his comment:
Use OutputType.BYTES and send the bytes directly to scenario.embed and write them directly to the screenshot file
But my actual issue was that I am using the mkolisnyk package, and when I use its @AfterSuite annotation, it creates the failure images in the root folder. This seems to be a bug in the mkolisnyk package.
Mixing TestNG's @AfterSuite with mkolisnyk's @ExtendedCucumberOptions works for me.
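The "write them directly to the screenshot file" half of that comment can be sketched with plain JDK calls. This is only an illustration under assumptions: the class ScreenshotStore and its save method are hypothetical names, and the FailCase/date folder layout is copied from the question. In a real hook the byte array would come from ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES), and the same array would be passed to scenario.embed(bytes, "image/png").

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of Cucumber, Selenium or the mkolisnyk package.
final class ScreenshotStore {

    private static final DateTimeFormatter FOLDER = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter FILE = DateTimeFormatter.ofPattern("yyyy-MM-dd HH-mm-ss");

    // Writes the PNG bytes to <baseDir>/FailCase/<date>/<date time>.png and returns the path.
    static Path save(byte[] png, Path baseDir, LocalDateTime when) throws IOException {
        Path folder = baseDir.resolve("FailCase").resolve(when.format(FOLDER));
        Files.createDirectories(folder); // no-op if the folder already exists
        Path target = folder.resolve(when.format(FILE) + ".png");
        return Files.write(target, png);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; a real hook would use the bytes returned by the driver.
        byte[] fakeScreenshot = {(byte) 0x89, 'P', 'N', 'G'};
        Path base = Files.createTempDirectory("ScreenShots");
        Path saved = save(fakeScreenshot, base, LocalDateTime.now());
        System.out.println("Saved to " + saved);
    }
}
```

Because the bytes never pass through a temporary file chosen by the driver, the only file written is the one at the path built here.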

Exception when open excel: File contains corrupted data

I am trying to read an excel with OpenXML.
What I did is simply as following:
private WorkbookPart wbPart = null;
private SpreadsheetDocument document = null;
public byte[] GetExcelReport()
{
byte[] original = File.ReadAllBytes(this.originalFilename);
using (MemoryStream stream = new MemoryStream())
{
stream.Write(original, 0, original.Length);
using (SpreadsheetDocument excel = SpreadsheetDocument.Open(stream, true))
{
this.document = excel;
this.wbPart = document.WorkbookPart;
UpdateValue();
}
stream.Seek(0, SeekOrigin.Begin);
byte[] data = stream.ToArray();
return data;
}
}
I initialized this.originalFilename in the constructor. It is the name of a file ending in '.xlsx', which I created with Excel 2010.
But this line of code
using (SpreadsheetDocument excel = SpreadsheetDocument.Open(stream, true))
gives the exception: Message: System.IO.FileFormatException: File contains corrupted data.
The StackTrace:
Does anyone know how to solve this problem? At first I didn't use the stream; I just called SpreadsheetDocument.Open(filename, true). However, that throws exactly the same exception.
I've also tried creating a new .xlsx file, but the result is the same.
There is an MSDN page that describes how to read and write Excel files using streams and the Open XML SDK:
http://msdn.microsoft.com/en-us/library/office/ff478410.aspx
Try extracting the document contents with a zip application and check whether you get the standard folders inside, such as xl, docProps and _rels.
This tells you whether the file is properly packaged as an archive.
Hope this helps.
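The same check can be scripted instead of done by hand. The sketch below uses Java's built-in zip support rather than the Open XML SDK. The class XlsxPackageCheck and the method missingParts are hypothetical names. The three part names are an assumption about what a workbook saved by Excel normally contains, not a rule taken from the SDK.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipFile;

// Hypothetical helper; it only inspects the archive and does not validate the workbook.
final class XlsxPackageCheck {

    // Parts assumed to be present in a typical Excel-written .xlsx package.
    static final List<String> EXPECTED =
            Arrays.asList("[Content_Types].xml", "_rels/.rels", "xl/workbook.xml");

    // Returns the expected parts that are absent; throws ZipException if the file is not a zip.
    static List<String> missingParts(Path file) throws IOException {
        List<String> missing = new ArrayList<>();
        try (ZipFile zip = new ZipFile(file.toFile())) {
            for (String name : EXPECTED) {
                if (zip.getEntry(name) == null) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java XlsxPackageCheck <file.xlsx>");
            return;
        }
        try {
            List<String> missing = missingParts(Paths.get(args[0]));
            System.out.println(missing.isEmpty()
                    ? "Archive opens and the expected parts are present."
                    : "Archive opens, but these parts are missing: " + missing);
        } catch (IOException e) {
            System.out.println("Not a readable zip package: " + e.getMessage());
        }
    }
}
```

If the file cannot even be opened as a zip archive, the problem lies with the file itself rather than with the SpreadsheetDocument.Open call.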

How to parse & index different portions of an HTML page using Tika & Lucene?

I have been trying to parse and index different portions of an HTML page using Lucene and Tika. For example, I would like to index the text within the Title, H1, H2 and A tags of an HTML page separately and give each of them a different boost. I am using Tika for HTML parsing and creating a Document object with the appropriate fields that need to be indexed. However, I could not find anything in Tika that would let me index the tags I want out of the box.
My code looks something like this :
InputStream is = new FileInputStream(f);
Parser parser = new AutoDetectParser();
ContentHandler handler = new BodyContentHandler(-1);
Metadata metadata = new Metadata(); // not declared in the original snippet
ParseContext context = new ParseContext();
context.set(HtmlMapper.class, DefaultHtmlMapper.INSTANCE);
try {
parser.parse(is, handler, metadata, context);
} finally {
is.close();
}
Document doc = new Document();
doc.add(new Field("contents", handler.toString(),
Field.Store.NO, Field.Index.ANALYZED));
for (String name : metadata.names()) {
String value = metadata.get(name);
if (textualMetadataFields.contains(name)) {
doc.add(new Field("contents", value,
Field.Store.NO, Field.Index.ANALYZED));
}
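// Field.Index has no YES constant; NOT_ANALYZED indexes the value as a single term (use Field.Index.NO to store it without indexing).
doc.add(new Field(name, value, Field.Store.YES, Field.Index.NOT_ANALYZED));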
}
Stepping into Tika's HTML parsing code I found that it is org.apache.tika.parser.html.HtmlHandler class that fills up metadata object.
Do I need to write a custom HTML handler like HtmlHandler ?
Is there some class in Tika which can parse out text within different HTML tags that one specifies ?
Can someone please provide code samples for solutions that you propose ?
Um. Are you using a search engine for the project for any specific reason? I used one to search for an answer, imagine that ;-)
A good and relevant tutorial

Programmatically Edit Infopath Form Fields?

I have a form library in my SharePoint site. I need to fill in some fields programmatically. Can I do that? If anyone knows how, please provide some sample code. First I need to retrieve the InfoPath document, and then I need to fill in the fields.
What axel_c posted is pretty dang close. Here's some cleaned up and verified working code...
public static void ChangeFields()
{
//Open SharePoint site
using (SPSite site = new SPSite("http://<SharePoint_Site_URL>"))
{
using (SPWeb web = site.OpenWeb())
{
//Get handle for forms library
SPList formsLib = web.Lists["FormsLib"];
if (formsLib != null)
{
foreach (SPListItem item in formsLib.Items)
{
XmlDocument xml = new XmlDocument();
//Open XML file and load it into XML document
using (Stream s = item.File.OpenBinaryStream())
{
xml.Load(s);
}
//Do your stuff with xml here. This is just an example of setting a boolean field to false.
XmlNodeList nodes = xml.GetElementsByTagName("my:SomeBooleanField");
foreach (XmlNode node in nodes)
{
node.InnerText = "0";
}
//Get binary data for new XML
byte[] xmlData = System.Text.Encoding.UTF8.GetBytes(xml.OuterXml);
using (MemoryStream ms = new MemoryStream(xmlData))
{
//Write data to SharePoint XML file
item.File.SaveBinary(ms);
}
}
}
}
}
}
The Infopath document is just a regular XML file, the structure of which matches the data sources you defined in the Infopath form.
You just need to access the file via the SharePoint object model, modify it using standard methods (XmlDocument API) and then write it back to the SharePoint list. You must be careful to preserve the structure and insert valid data or you won't be able to open the form using Infopath.
You should really check out a book on SharePoint if you plan to do any serious development. Infopath is also a minefield.
Object model usage examples: here, here and here. The ridiculously incomplete MSDN reference documentation is here.
EDIT: here is some example code. I haven't done SharePoint for a while so I'm not sure this is 100% correct, but it should give you enough to get started:
// Open SharePoint site
using (SPSite site = new SPSite("http://<SharePoint_Site_URL>"))
{
using (SPWeb web = site.OpenWeb())
{
// Get handle for forms library
SPList formsLib = web.Lists["FormsLib"];
if (formsLib != null)
{
SPListItem itm = formsLib.Items["myform.xml"];
// Open xml and load it into XML document
using (Stream s = itm.File.OpenBinaryStream ()) // OpenBinary () returns byte[], not a Stream
{
MemoryStream ms;
byte[] xmlData;
XmlDocument xml = new XmlDocument ();
xml.Load (s);
s.Close ();
// Do your stuff with xml here ...
// Get binary data for new XML; xml.OuterXml keeps the leading processing instructions that InfoPath relies on
xmlData = System.Text.Encoding.UTF8.GetBytes (xml.OuterXml);
ms = new MemoryStream (xmlData);
// Write data to sharepoint item
itm.File.SaveBinary (ms);
ms.Close ();
itm.Update ();
}
}
web.Close();
}
site.Close();
}
It depends a bit on your available tool set, skills and exact requirements.
There are two main ways of pre-populating data inside an InfoPath form:
1. Export the relevant fields as part of the form's publishing process. The fields then become columns on the Document / Forms library, where you can manipulate them manually, via a workflow, or from wherever your custom code is located.
2. Directly manipulate the form using code similar to what Axel_c provided previously. The big question here is: what will trigger this code? An event receiver on the Document Library, a SharePoint Designer workflow, a Visual Studio workflow, etc.?
If you are trying to do this from a SharePoint Designer workflow then have a look at the Workflow Power Pack for SharePoint. It allows C# and VB code to be embedded directly into the workflow without the need for complex Visual Studio development. An example of how to query InfoPath data from a workflow can be found here. If you have some development skills you should be able to amend it to suit your needs.
I also recommend the site www.infopathdev.com, they have excellent and active forums. You will almost certainly find an answer to your question there.
Thanks for the sample code, @axel_c and @Jeff Burt.
Below is the same code from Jeff Burt, modified for a file in a Document Set, which is what I needed. If you don't already have the Document Set reference, this site shows how to grab one:
http://howtosharepoint.blogspot.com/2010/12/programmatically-create-document-set.html
Also, the code opens the .xml version of the InfoPath form, not the .xsn template version, which you might run into.
Thanks again everyone...
private void ChangeFields(DocumentSet docSet)
{
string extension = "";
SPFolder documentsetFolder = docSet.Folder;
foreach (SPFile file in documentsetFolder.Files)
{
extension = Path.GetExtension(file.Name);
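if (extension != ".xml") //skip anything that is not an .xml form file
continue; //a return here would stop at the first non-.xml file and leave later forms untouched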
XmlDocument xml = new XmlDocument();
//Open XML file and load it into XML document, needs to be .xml file not .xsn
using (Stream s = file.OpenBinaryStream())
{
xml.Load(s);
}
//Do your stuff with xml here. This is just an example of setting a text field to a new value.
XmlNodeList nodes = xml.GetElementsByTagName("my:fieldtagname");
foreach (XmlNode node in nodes)
{
node.InnerText = "xyz";
}
//Get binary data for new XML
byte[] xmlData = System.Text.Encoding.UTF8.GetBytes(xml.OuterXml);
using (MemoryStream ms = new MemoryStream(xmlData))
{
//Write data to SharePoint XML file
file.SaveBinary(ms);
}
}
}
I had this issue and resolved it with help from Jeff Burt / Axel_c's posts.
I was trying to use the XMLDocument.Save([stream]) and SPItem.File.SaveBinary([stream]) methods to write an updated InfoPath XML file back to a SharePoint library. It appears that XMLDocument.Save([stream]) writes the file back to SharePoint with the wrong encoding, regardless of what it says in the XML declaration.
When trying to open the updated InfoPath form I kept getting the error "a calculation in the form has not been completed..."
I've written these two functions to get and update an InfoPath form. Just manipulate the XML returned from ReadSPFiletoXMLdocument() in the usual way and send it back to your server using WriteXMLtoSPFile().
private System.Xml.XmlDocument ReadSPFiletoXMLdocument(SPListItem item)
{
//get SharePoint file XML
System.Xml.XmlDocument xDoc = new System.Xml.XmlDocument();
try
{
using (System.IO.Stream xmlStream = item.File.OpenBinaryStream())
{
xDoc.Load(xmlStream);
}
}
catch (Exception ex)
{
//put your own error handling here
}
return xDoc;
}
private void WriteXMLtoSPFile(SPListItem item, XmlDocument xDoc)
{
byte[] xmlData = System.Text.Encoding.UTF8.GetBytes(xDoc.OuterXml);
try
{
using (System.IO.MemoryStream outStream = new System.IO.MemoryStream(xmlData))
{
item.File.SaveBinary(outStream);
}
}
catch (Exception ex)
{
//put your own error handling here
}
}
