I am a complete beginner with Cassandra, and I am finding it difficult to model the XML file shown below.
My requirement is:
I receive many XML files like the one below every day. I need to write a C# program that reads each XML file and stores its contents in Cassandra.
Once stored, the data will be read by thousands of customers, thousands of times. It will NOT be updated after it is stored.
NOTE: Each parent node (e.g. Customer, Site) may appear more than once, i.e. I may have Customer2, Site2, another EquipmentDetails node, etc.
My XML:
<?xml version="1.0" standalone="yes"?>
<MyAnalystXMLReport>
<MC-DomainID ID="XYZ123">
<Customer Name="CUSTOMER1" ID="53043">
<Site Name="SITE1" ID="488688">
<EquipmentDetails>
<EquipmentDescription>Test Desc</EquipmentDescription>
<EquipmentRefId>T3567111</EquipmentRefId>
<ComponentDescription>COM Oil</ComponentDescription>
<ComponentRefId>
</ComponentRefId>
<AnanlystNo>1235LKJU</AnanlystNo>
<ComponentType>TestComp</ComponentType>
<Sample SampleNo="976023696">
<SampleCondition>USUAL</SampleCondition>
<AnalysisComments>Test Comments</AnalysisComments>
<DateRecieved>2015-12-10</DateRecieved>
<DateAnalysed>2015-12-18</DateAnalysed>
<EquipmentLife>
</EquipmentLife>
<LubricantLife>
</LubricantLife>
<TopUpVolume>0</TopUpVolume>
<FuelUsed>
</FuelUsed>
<Tests>
<TestGroup Name="Group NAME1" ID="667">
<Test Name="Test Name1" ID="1785">
<Result>171.3</Result>
</Test>
</TestGroup>
<TestGroup Name="Group NAME2" ID="617">
<Test Name="Test NAME2" ID="1763">
<Result>153.40</Result>
</Test>
</TestGroup>
</Tests>
</Sample>
</EquipmentDetails>
</Site>
</Customer>
</MC-DomainID>
</MyAnalystXMLReport>
How should I model this XML efficiently to meet this requirement?
And what is the easiest way to model and store this data in Cassandra using C#? Any example would be greatly appreciated.
Thanks for your help.
To begin with, modelling data for Cassandra is a bit different from modelling it for a traditional RDBMS. My first suggestion would be to look at this link:
http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling
Basically, Cassandra uses a table-per-query methodology to determine the data model. This means that you need to know how your application will query the data, and then build your logical and physical persistence models around those queries. DataStax has some excellent tutorial videos on Cassandra in general and on data modelling specifically. See here:
https://academy.datastax.com/courses/ds220-data-modeling
As for using it from C#, there is a good video on getting started with C# here:
https://academy.datastax.com/demos/getting-started-apache-cassandra-and-c-net
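To make the table-per-query idea a bit more concrete, here is a minimal sketch in Python (chosen only for brevity; the same shape carries over to C#). It assumes one hypothetical query, "all test results for a given customer and site", and flattens the XML from the question into one denormalized row per test result. The table name results_by_site, its columns and the choice of keys are assumptions for illustration only, not a recommended schema; your real queries should drive the design.

```python
import xml.etree.ElementTree as ET

# Hypothetical CQL for one assumed query: "all test results for a customer's site".
# (customer_id, site_id) is the partition key; the other key columns cluster rows.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS results_by_site (
    customer_id text, site_id text, sample_no text,
    test_group_id text, test_id text,
    equipment_ref_id text, date_analysed text, test_name text, result text,
    PRIMARY KEY ((customer_id, site_id), sample_no, test_group_id, test_id)
)
"""

# Trimmed copy of the report from the question.
SAMPLE = """<MyAnalystXMLReport>
<MC-DomainID ID="XYZ123">
<Customer Name="CUSTOMER1" ID="53043">
<Site Name="SITE1" ID="488688">
<EquipmentDetails>
<EquipmentRefId>T3567111</EquipmentRefId>
<Sample SampleNo="976023696">
<DateAnalysed>2015-12-18</DateAnalysed>
<Tests>
<TestGroup Name="Group NAME1" ID="667">
<Test Name="Test Name1" ID="1785"><Result>171.3</Result></Test>
</TestGroup>
<TestGroup Name="Group NAME2" ID="617">
<Test Name="Test NAME2" ID="1763"><Result>153.40</Result></Test>
</TestGroup>
</Tests>
</Sample>
</EquipmentDetails>
</Site>
</Customer>
</MC-DomainID>
</MyAnalystXMLReport>"""


def flatten_report(xml_text):
    """Return one dict per <Test>, carrying the identifiers of its ancestors."""
    rows = []
    root = ET.fromstring(xml_text)
    # Every level may repeat (Customer2, Site2, ...), so loop at each level.
    for customer in root.iter("Customer"):
        for site in customer.iter("Site"):
            for equipment in site.iter("EquipmentDetails"):
                for sample in equipment.iter("Sample"):
                    for group in sample.iter("TestGroup"):
                        for test in group.iter("Test"):
                            rows.append({
                                "customer_id": customer.get("ID"),
                                "site_id": site.get("ID"),
                                "sample_no": sample.get("SampleNo"),
                                "test_group_id": group.get("ID"),
                                "test_id": test.get("ID"),
                                "equipment_ref_id": equipment.findtext("EquipmentRefId"),
                                "date_analysed": sample.findtext("DateAnalysed"),
                                "test_name": test.get("Name"),
                                "result": test.findtext("Result"),
                            })
    return rows


if __name__ == "__main__":
    for row in flatten_report(SAMPLE):
        print(row)
```

Each dict corresponds to one row you would insert. Because the data is written once and read many times, repeating the customer and site identifiers on every row is the usual trade-off: reads for one site hit a single partition. If you have other query patterns (for example by equipment or by sample), the same rows would be written to additional tables keyed for those queries.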
I'm currently extending the features of a Google Site Search/Custom Search setup on a website.
I'm not very experienced with this subject, but I've set up the sitemap with a "PageMap" element that contains some extra data, such as date and category.
<url>
<loc>
http://www.videnscenterfordemens.dk/viden-om-demens/til-patienter-og-paaroerende
</loc>
<lastmod>2013-10-28</lastmod>
<PageMap xmlns="http://www.google.com/schemas/sitemap-pagemap/1.0">
<DataObject type="document" id="hibachi">
<Attribute name="category">En kategori</Attribute>
<Attribute name="date">20131028</Attribute>
</DataObject>
</PageMap>
</url>
I use queries like this to get pages with a specific category:
q=patienter+more:pagemap:document-category:En kategori
But that doesn't return the above page; instead it returns other pages that don't even have a category set.
Any idea what I need to do to be able to search for pages with a specific category attribute?
PS. The sitemap has been indexed by Google since the changes were made.
Try changing your page map as follows and you should see it show up in the structured data testing tool that Devnook linked.
<PageMap >
<DataObject type="document" id="hibachi">
<Attribute name="category" value="En kategori" />
<Attribute name="date" value="20131028" />
</DataObject>
</PageMap>
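If you have many URLs to change, a small script can do the rewrite for you. This is only a sketch, in Python, of the conversion suggested above: it moves the text of each Attribute element into a value attribute. The function name text_to_value_attributes is made up for this example, and it assumes the PageMap keeps the namespace declaration from the question.

```python
import xml.etree.ElementTree as ET

PAGEMAP_NS = "http://www.google.com/schemas/sitemap-pagemap/1.0"

# PageMap fragment from the question (Attribute values given as element text).
SAMPLE = """<PageMap xmlns="http://www.google.com/schemas/sitemap-pagemap/1.0">
<DataObject type="document" id="hibachi">
<Attribute name="category">En kategori</Attribute>
<Attribute name="date">20131028</Attribute>
</DataObject>
</PageMap>"""


def text_to_value_attributes(pagemap_xml):
    """Rewrite <Attribute name="x">text</Attribute> as <Attribute name="x" value="text"/>."""
    # Serialize the PageMap namespace as the default namespace (no prefix).
    ET.register_namespace("", PAGEMAP_NS)
    root = ET.fromstring(pagemap_xml)
    for attribute in root.iter("{%s}Attribute" % PAGEMAP_NS):
        text = (attribute.text or "").strip()
        # Leave elements alone that already use the value="..." form.
        if text and "value" not in attribute.attrib:
            attribute.set("value", text)
            attribute.text = None
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(text_to_value_attributes(SAMPLE))
```

After regenerating the sitemap you would still need to wait for Google to re-crawl it before the new attributes can show up in search.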
One way to debug this is the Structured Data Testing Tool:
In the Google Custom Search tab you can see the attributes Google recognizes for search. It seems your PageMap is not appearing there. Maybe check the other pages you mentioned and look for differences?
Also, annotating via sitemaps works only for verified site owners. If you're the owner of videnscenterfordemens.dk, maybe you need to verify your site with Webmaster Tools?
I am using Apache Solr for indexing using DataImportHandler.
The document structure is as follows:
id(long), title(text), abstract(text), pubDate(date)
I combined the title and abstract fields for text searching. My problem is that when I query
"title: utility" I get the following results:
id, title
6, Financial Deal Insights Energy & Utilities December 2008
11,Residential utility retail strategies in an economic downturn
16,Financial Deal Insights: Energy & Utilities Review of 2008
41,Solar at the heart of utility corporate strategy
I want to search only for "utility", but the results also include "utilities".
I also tried title:"utility" and title:utility~1, but neither worked.
I read about 'stemming', but I don't have any idea how to use it.
Please help me.
Thanks.
This is because of the PorterStemFilterFactory in your text analysis chain:
<filter class="solr.PorterStemFilterFactory"/>
The stemmer reduces words to their root form, so "utility" matches "utilities" as well.
Check whether you need stemming for search; if not, you can remove it from your filter chain.
Otherwise, look for a less aggressive stemmer that fits your needs.
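To see why the two words collide, here is a small Python sketch. Note that toy_stem is NOT the Porter algorithm and is not what Solr runs; it is a deliberately simplified stand-in with three suffix rules, just enough to show both words being reduced to the same root. The function names are invented for this illustration.

```python
def toy_stem(word):
    """Simplified stand-in for a stemmer (assumption: only three suffix rules)."""
    w = word.lower()
    if w.endswith("ies"):
        w = w[:-3] + "i"          # utilities -> utiliti
    elif w.endswith("s") and not w.endswith("ss"):
        w = w[:-1]                # insights -> insight
    if w.endswith("y") and len(w) > 2:
        w = w[:-1] + "i"          # utility -> utiliti
    if w.endswith("iti") and len(w) > 5:
        w = w[:-3]                # utiliti -> util
    return w


def matches(query_term, title, stem=True):
    """Return True if the query term equals some title token after normalization."""
    normalize = toy_stem if stem else str.lower
    tokens = [token.strip(".,:;&") for token in title.split()]
    return normalize(query_term) in {normalize(t) for t in tokens if t}


if __name__ == "__main__":
    titles = [
        "Financial Deal Insights Energy & Utilities December 2008",
        "Solar at the heart of utility corporate strategy",
    ]
    for title in titles:
        print(title, "| stemmed:", matches("utility", title),
              "| exact:", matches("utility", title, stem=False))
```

With stemming on, both titles match "utility"; with it off, only the title containing the exact word does. In Solr the same effect comes from the analyzer on the field type, so after removing or swapping the stem filter you need to reindex for the change to apply to documents already in the index.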
I just need some clarification from an expert. I need to translate my whole site into another language. The site consists of hundreds of articles, and each article needs to be translated in full. Should I create a .po or XML file for each article?
If that is the only way, please let me know an efficient way to create the .po and XML files, since these are not short messages.
I see you've tagged your post with 'expressionengine', so I'm assuming that your site is built on EE. In that case, neither .po files nor XML files are the way to go. Since EE offers completely customizable fields and channels, you can have your secondary-language content managed just like your primary-language content.
There are many different approaches to this in EE, each with its own pros and cons. The article linked below gives a great overview of them and offers many links to additional reading. It's more than one answer on SO can properly cover.
Multi-language Solutions for ExpressionEngine on EE Insider
To export as XML:
http://devot-ee.com/add-ons/export-it
or
http://devot-ee.com/add-ons/ajw-export
Alternatively you can simply build a template that outputs the XML using standard {exp:channel:entries} tag pair, making the template type XML and adding the correct header and code for XML.
To re-import:
http://devot-ee.com/add-ons/datagrab
All of the above require knowing which fields you want to export, along with their table and row references, so the content can easily be re-imported.
I strongly suggest you thoroughly test the export and import tools you choose, to make sure they work, before beginning any translation process.
Example XML Template (this is to build sitemap.xml but gives you a start on building your own XML structure):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
{exp:channel:entries channel="pages" entry_id="not 117|104" limit="500" disable="member_data|pagination|trackbacks" rdf="off" dynamic="no" status="Open" sort="asc"}
<url>
<loc>{page_url}</loc>
<lastmod>{gmt_edit_date format='%Y-%m-%dT%H:%i:%s%Q'}</lastmod>
<changefreq>daily</changefreq>
<priority>1</priority>
</url>
{/exp:channel:entries}
</urlset>
Info: I'm using Spring-ws 1.5.9 and Spring 2.5.6
I'm currently in the process of building a lot of web services and have a few questions about how the architecture should look.
Right now I have a single web service. It (of course) contains a single WSDL, a single endpoint and so forth.
I'm currently extending the web service, and for that I have created another XSD, auto-generated code using JAXB2 (xjc) and so forth.
Now, how should I handle these XSDs, WSDL(s), code and so forth? I cannot see what Spring-WS recommends...
My architect would like to have a single WSDL, which can be achieved using the following:
<bean id="schemaCollection" class="org.springframework.xml.xsd.commons.CommonsXsdSchemaCollection">
<property name="xsds">
<list>
<value>one.xsd</value>
<value>two.xsd</value>
</list>
</property>
<property name="inline" value="true"/>
</bean>
Is this a good way to do it? I'm going to end up with about 10-15 web services and thus a large WSDL.
How about endpoints? Should I create a single endpoint and test for the type of request (e.g. using instanceof)? I myself think that having one endpoint mapping to one request is more elegant/clean.
Finally, what about marshalling? I have this (with one ws/schema):
<oxm:jaxb2-marshaller id="marshaller" contextPath="mydomain.signals.one.v1_0.schemas"/>
<oxm:jaxb2-marshaller id="unmarshaller" contextPath="mydomain.signals.v1_0.schemas"/>
But how should I add another schema to this? I'm trying something like the following, which doesn't seem to be working for me right now:
<bean id="marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="contextPaths">
<list>
<value>mydomain.signals.one.v1_0.schemas</value>
<value>mydomain.signals.two.v1_0.schemas</value>
</list>
</property>
</bean>
I hope this makes sense. What I'm aiming for is pointers and hints as to what I should do.
If you have 10-15 web services, there's no way around having a huge WSDL. If the goal is a single WSDL, what you are doing seems acceptable.
I would prefer endpoints for each request.
Also, have you tried using a colon-separated list of values for your context path? That is, don't use the list; just have one long string with each context path separated by colons.
From the Spring-WS documentation:
The context path is a list of colon (:) separated Java package names
that contain schema derived classes.
I know that passage is for Jaxb1 but I'm pretty sure it still applies to the Jaxb2Marshaller. I think you'd only use the list variant if you were specifying classes.
http://static.springsource.org/spring-ws/site/reference/html/oxm.html
I am using the SPWeb.ProcessBatchData() method to batch-create folders inside a document library. Everything works fine, except that the created folders all have very strange names. For example, if my document library's name is 000, then the folder's name is "1._000". I have tried a lot of other properties, but had no luck finding out how to set the folder name correctly. Can someone help me with this?
Cheers
You are right that the web services are a bit stricter about the characters you can put in and cannot process the same number of requests, but you can work around that, I guess :)
What you can do, if you really want to use the ProcessBatchData method, is reuse the result you get back from it. If the call succeeds, you will get the ListItemId of each folder back. Using those IDs, you can create another batch to rename the titles of the items.
But if I were you, I would switch to the web service and work around its limitations :)
This is the correct XML syntax for creating a folder with the proper title:
<?xml version="1.0" encoding="utf-8"?>
<ows:Batch OnError="Continue">
<Method ID="Test">
<SetList Scope="Request">82d62a9a-55ba-49c8-a9b8-68ec965a5931</SetList>
<SetVar Name="Cmd">Save</SetVar>
<SetVar Name="ID">New</SetVar>
<SetVar Name="Type">1</SetVar>
<SetVar Name="owsfileref">/sites/1/docs/folder1</SetVar>
</Method>
</ows:Batch>
The critical line is this one:
<SetVar Name="Type">1</SetVar>
"Type" is the accepted alias of the FSObjType field
Regards,
Ahmad
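When creating many folders at once, the batch XML is usually generated rather than written by hand. Below is a rough sketch, in Python for brevity, of building one Method element per folder with the same SetVar entries as the example above, including Type set to 1. The function name build_folder_batch and the Method IDs (Folder1, Folder2, ...) are invented for this illustration; the resulting string is what you would pass to SPWeb.ProcessBatchData from your C# code.

```python
from xml.sax.saxutils import escape

# One <Method> per folder; same SetVar entries as the hand-written example.
METHOD_TEMPLATE = (
    '<Method ID="Folder{n}">'
    '<SetList Scope="Request">{list_id}</SetList>'
    '<SetVar Name="Cmd">Save</SetVar>'
    '<SetVar Name="ID">New</SetVar>'
    '<SetVar Name="Type">1</SetVar>'
    '<SetVar Name="owsfileref">{url}</SetVar>'
    '</Method>'
)


def build_folder_batch(list_id, folder_urls):
    """Return batch XML that creates one folder per server-relative URL."""
    methods = [
        # escape() protects characters such as & and < in folder names.
        METHOD_TEMPLATE.format(n=index, list_id=escape(list_id), url=escape(url))
        for index, url in enumerate(folder_urls, start=1)
    ]
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<ows:Batch OnError="Continue">' + "".join(methods) + '</ows:Batch>'
    )


if __name__ == "__main__":
    print(build_folder_batch(
        "82d62a9a-55ba-49c8-a9b8-68ec965a5931",
        ["/sites/1/docs/folder1", "/sites/1/docs/folder2"],
    ))
```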