How do I schema-validate GML 3.3.0? What is the root element?

I have to post data to a web service, and the specification says it must be valid GML 3.3.0, so I am trying to implement validation for GML 3.3.0.
For GML 3.2.1, the entry point is the top-level schema document http://schemas.opengis.net/gml/3.2.1/gml.xsd, and I can simply do:
$ xmllint --schema http://schemas.opengis.net/gml/3.2.1/gml.xsd data.gml
However, consider the content of these two folders:
http://schemas.opengis.net/gml/3.2.1/
http://schemas.opengis.net/gml/3.3.0/
The file sets are entirely different, and I can't tell which schema document is the entry point for 3.3.0.
How do I validate for 3.3.0?
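For reference, the same check can also be done programmatically. Below is a minimal Java sketch using the standard javax.xml.validation API; the 3.2.1 entry point and the file name data.gml are placeholders, since identifying the 3.3.0 entry point is exactly what is being asked.

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class GmlValidate {
    public static void main(String[] args) throws Exception {
        // Compile the entry-point schema document (gml.xsd for 3.2.1;
        // substitute the 3.3.0 equivalent here once it is known).
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(
                new StreamSource("http://schemas.opengis.net/gml/3.2.1/gml.xsd"));
        // validate() throws SAXException if the instance document is invalid.
        Validator validator = schema.newValidator();
        validator.validate(new StreamSource(new File("data.gml")));
        System.out.println("document is schema-valid");
    }
}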

Related

Bitbake: What data structure is datastore?

The following is a sentence from the BitBake user's manual:
"BitBake parses each recipe and append file located with BBFILES and stores the values of various variables into the datastore."
What data type is the 'datastore'? Is it a list, a tuple, a dictionary, or something else?
BitBake's datastore is a complex store of key/value pairs in which each key can also carry flags (themselves key/value pairs). It is a custom structure with a copy-on-write backend, and it supports the idea of 'overrides', where one variable with special naming can override another. See https://git.openembedded.org/bitbake/tree/lib/bb/data_smart.py and https://git.openembedded.org/bitbake/tree/lib/bb/data.py for the implementation, the BitBake manual for how to use the datastore, and https://git.openembedded.org/bitbake/tree/lib/bb/tests/data.py for its unit tests.
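To make the shape concrete, here is a toy Java model of that idea. It is purely illustrative: the method names mirror BitBake's Python API (d.setVar, d.getVar, d.setVarFlag), but this class is hypothetical and the "_" override separator is a simplification; the real implementation is the Python code linked above.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyDataStore {
    // Each variable has a value plus its own flag map (key/value pairs).
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Map<String, String>> flags = new HashMap<>();
    private final List<String> overrides;

    public ToyDataStore(List<String> overrides) { this.overrides = overrides; }

    public void setVar(String key, String value) { values.put(key, value); }

    public void setVarFlag(String key, String flag, String value) {
        flags.computeIfAbsent(key, k -> new HashMap<>()).put(flag, value);
    }

    // Override-aware lookup: a specially named variant shadows the base name.
    public String getVar(String key) {
        String result = values.get(key);
        for (String o : overrides) {
            String overridden = values.get(key + "_" + o);
            if (overridden != null) result = overridden;
        }
        return result;
    }

    public static void main(String[] args) {
        ToyDataStore d = new ToyDataStore(List.of("qemuarm"));
        d.setVar("FOO", "default");
        d.setVar("FOO_qemuarm", "arm-specific");
        d.setVarFlag("FOO", "doc", "an example variable");
        System.out.println(d.getVar("FOO")); // prints "arm-specific"
    }
}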
You can work out the type of an object in Python by executing type(foo) in the same environment. As for this specific type (the datastore), a quick Google search indicates that it is neither a tuple nor a dictionary, but a custom object with its API documented here.

Unable to start geomesa-accumulo

hduser@Neha-PC:/usr/local/geomesa-tutorials$ java -cp geomesa-tutorials-accumulo/geomesa-tutorials-accumulo-quickstart/target/geomesa-tutorials-accumulo-quickstart-2.3.0-SNAPSHOT.jar org.geomesa.example.accumulo.AccumuloQuickStart --accumulo.instance.id accumulo --accumulo.zookeepers localhost:2184 --accumulo.user root --accumulo.password PASS1234 --accumulo.catalog table1
Picked up JAVA_TOOL_OPTIONS: -Dgeomesa.hbase.coprocessor.path=hdfs://localhost:8020/hbase/lib/geomesa-hbase-distributed-runtime_2.11-2.2.0.jar
Loading datastore
java.lang.IncompatibleClassChangeError: Method org.locationtech.geomesa.security.AuthorizationsProvider.apply(Ljava/util/Map;Ljava/util/List;)Lorg/locationtech/geomesa/security/AuthorizationsProvider; must be InterfaceMethodref constant
at org.locationtech.geomesa.accumulo.data.AccumuloDataStoreFactory$.buildAuthsProvider(AccumuloDataStoreFactory.scala:234)
at org.locationtech.geomesa.accumulo.data.AccumuloDataStoreFactory$.buildConfig(AccumuloDataStoreFactory.scala:162)
at org.locationtech.geomesa.accumulo.data.AccumuloDataStoreFactory.createDataStore(AccumuloDataStoreFactory.scala:48)
at org.locationtech.geomesa.accumulo.data.AccumuloDataStoreFactory.createDataStore(AccumuloDataStoreFactory.scala:36)
at org.geotools.data.DataAccessFinder.getDataStore(DataAccessFinder.java:121)
at org.geotools.data.DataStoreFinder.getDataStore(DataStoreFinder.java:71)
at org.geomesa.example.quickstart.GeoMesaQuickStart.createDataStore(GeoMesaQuickStart.java:103)
at org.geomesa.example.quickstart.GeoMesaQuickStart.run(GeoMesaQuickStart.java:77)
at org.geomesa.example.accumulo.AccumuloQuickStart.main(AccumuloQuickStart.java:25)
You need to ensure that all of the GeoMesa versions on the classpath are the same. Just from your command, it seems you are at least mixing 2.3.0-SNAPSHOT with 2.2.0. Try checking out the git tag of the tutorials project that corresponds to the GeoMesa version you want, as described here. If you want to use a SNAPSHOT version, you need to make sure that you have pulled the latest changes for each project.
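One way to track the mismatch down is to print which jar the class from the stack trace is actually loaded from. A small diagnostic sketch (not part of the quickstart; run it with the same -cp as above so it sees the same classpath):

public class WhichJar {
    public static void main(String[] args) throws Exception {
        // Prints the jar a class was loaded from, which makes mixed
        // GeoMesa versions on the classpath easy to spot.
        // (getCodeSource() can be null for JDK bootstrap classes.)
        Class<?> c = Class.forName("org.locationtech.geomesa.security.AuthorizationsProvider");
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
    }
}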

Sphinx4 figuring out correct models

I am trying to use the Sphinx4 library for speech recognition, but I cannot seem to figure out the correct combination of acoustic model, dictionary, and language model. I have tried various combinations, and I get a different error every time.
I am trying to follow the tutorial at http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4. I do not have a config.xml, as I would if I were using ConfigurationManager instead of Configuration, because there is no apparent way of passing the location of the config file to the Configuration itself (ConfigurationManager takes it as a constructor argument); and that might be my problem right there. I just do not know how to point to one, and since the tutorial says "It is possible to configure low-level components of the application through XML file although you should do that ONLY IF you understand what is going on.", I assume having a config.xml file is not compulsory.
Combining the latest dictionary (7b, obtained from SourceForge) with the latest acoustic model (cmusphinx-en-us-5.2.tar.gz, also from SF) and the language model (cmusphinx-5.0-en-us.lm.gz, also from SF) results in a NullPointerException in startRecognition. The issue is similar to the problem here: sphinx-4 NullPointerException at startRecognition, but the link given in the answer no longer works. I obtained 0.7a from SF (since that is the dict the link seems to point at), but with that one I get Error loading word: ;;; even earlier in the execution. I tried downloading the latest models and dict from the GitHub repo, but that results in java.lang.IndexOutOfBoundsException: Index: 16128, Size: 16128.
Any help is much appreciated!
You need to use the latest code from GitHub
http://github.com/cmusphinx/sphinx4
as described in the tutorial
http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4
The correct models (en-us) are already included, so you should not replace anything. You should not configure any XML files; use the samples as provided in the sources.
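For reference, the quickstart from that tutorial configures everything in code and points at the models bundled on the classpath, roughly like this (a sketch following the tutorial; the resource: paths refer to the models shipped inside the sphinx4-data jar, and the class name is illustrative):

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

public class QuickstartDemo {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // Bundled en-us models; nothing needs to be downloaded or replaced.
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
        configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

        LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
        recognizer.startRecognition(true); // true flushes any cached audio
        SpeechResult result = recognizer.getResult();
        System.out.println(result.getHypothesis());
        recognizer.stopRecognition();
    }
}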

gora-mongodb.mapping.XML properties File

I'm new to Nutch (2.2.1) and trying to run it on Cygwin/Windows 7 with the latest version of Gora (0.5) so I can persist data to a MongoDB (2.6) datastore. I changed the nutch-site.xml file to include my Mongo property, but I'm a little confused about the gora-mongodb.mapping.xml properties file that's needed. Just wondering, do I need to:
1) create a Java class within the Nutch/Gora project, which I then specify in the class-name property in the gora-mongodb.mapping file, or will Gora create this for me? The documentation doesn't appear to be very clear.
2) I created a sample file in my apache-nutch-2.2.1\runtime\local\conf folder and added the name of my MongoDB collection. When I run Nutch I get the following error:
$ ./nutch crawl urls -dir testCrawl -depth 3 -topN 5
cygpath: can't convert empty path
Exception in thread "main" org.apache.gora.util.GoraException: java.lang.IllegalStateException: A collection is not specified
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:221)
at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.lang.IllegalStateException: A collection is not specified
at org.apache.gora.mongodb.store.MongoMappingBuilder.build(MongoMappingBuilder.java:77)
at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:168)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
... 8 more
Any help or clarification around this file would be appreciated.
You need 2 files in nutch/conf:
gora.properties: where you declare that you are going to use the MongoDB backend (typically by setting gora.datastore.default=org.apache.gora.mongodb.store.MongoStore along with the MongoDB connection properties).
gora-mongodb-mapping.xml (notice the dash, not the dot you wrote): where you map the fields of the Gora entities to the fields in the datastore, including the collection name, which is what the "A collection is not specified" error above is complaining about.
I really don't think the version you are using is prepared to work with Gora 0.5, but give it a shot: copy gora-mongodb-mapping.xml from Nutch-2.3-SNAPSHOT into nutch/conf/.
If that does not work, try using Nutch-2.3-SNAPSHOT instead of 2.2.1.

Apache Pig: Load a file that shows fine using hadoop fs -text

I have files that are named part-r-000[0-9][0-9] and that contain tab separated fields. I can view them using hadoop fs -text part-r-00000 but can't get them loaded using pig.
What I've tried:
x = load 'part-r-00000';
dump x;
x = load 'part-r-00000' using TextLoader();
dump x;
but that only gives me garbage. How can I view the file using pig?
What might be of relevance is that my hdfs is still using CDH-2 at the moment.
Furthermore, if I download the file locally and run file part-r-00000, it says part-r-00000: data. I don't know how to unzip it locally.
According to the HDFS documentation, hadoop fs -text <file> can be used on "zip and TextRecordInputStream" data, so your data may be in one of those formats.
If the file was compressed, Hadoop would normally add the extension when outputting to HDFS, but if this was missing, you could try testing by unzipping/ungzipping/unbzip2ing/etc. locally. It appears Pig should do this decompression automatically, but it may require the file extension to be present (e.g. part-r-00000.zip) -- more info.
I'm not too sure about TextRecordInputStream; it sounds like it would just be Pig's default method, but I could be wrong. I didn't see any mention of LOADing this data via Pig in a quick Google search.
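One quick local check (a sketch): Hadoop SequenceFiles start with the 3-byte magic "SEQ" followed by a version byte, so reading the header tells you immediately whether that is what you have.

import java.io.DataInputStream;
import java.io.FileInputStream;

public class SeqCheck {
    public static void main(String[] args) throws Exception {
        // Hadoop SequenceFiles begin with the bytes 'S' 'E' 'Q'
        // followed by a single format-version byte.
        byte[] header = new byte[3];
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            in.readFully(header);
        }
        boolean isSeq = header[0] == 'S' && header[1] == 'E' && header[2] == 'Q';
        System.out.println(isSeq ? "looks like a SequenceFile" : "not a SequenceFile");
    }
}

(Run as: java SeqCheck part-r-00000.)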
Update:
Since you've discovered it is a sequence file, here's how you can load it using PiggyBank:
-- using Cloudera directory structure:
REGISTER /usr/lib/pig/contrib/piggybank/java/piggybank.jar;
--REGISTER /home/hadoop/lib/pig/piggybank.jar;
DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();
-- Sample job: grab counts of tweets by day
A = LOAD 'mydir/part-r-000{00..99}' -- not sure if pig likes the {00..99} syntax, but worth a shot
USING SequenceFileLoader AS (key:long, val:long, etc.);
If you want to manipulate (read/write) sequence files with Pig, you can also give Twitter's Elephant Bird a try.
You can find examples of how to read and write them here.
If you use custom Writables in your sequence file, you can implement a custom converter by extending AbstractWritableConverter.
Note that Elephant Bird requires Thrift to be installed on your machine.
Before building it, make sure it uses the Thrift version you have installed, and provide the correct path to the Thrift executable in its pom.xml:
<plugin>
  <groupId>org.apache.thrift.tools</groupId>
  <artifactId>maven-thrift-plugin</artifactId>
  <version>0.1.10</version>
  <configuration>
    <thriftExecutable>/path_to_thrift/thrift</thriftExecutable>
  </configuration>
</plugin>
