Property exception component:'wsjLoader' - cmusphinx

In recent days I have been reading a lot about modifying the HelloWorld demo and adding new words of my own choice to it. But I am encountering a serious problem which I am unable to resolve. Below I list my steps and then the error the program gives me.
Any help is much appreciated!
First I extracted the WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz jar file. Then I added some new words and their pronunciations to the cmudict.0.6d file. After that I saved it and compressed it back into a jar file using a jar file maker.
I repeated the same step with the HelloWorld jar file. After extracting it, I modified its hello.gram file by adding new words (the words I inserted into the dictionary, as well as a few words that were already there, e.g. John). Then I compressed it back the same way and loaded both files in Eclipse. Both of them give me a similar error: while the original demo files work fine, the two files I modified no longer do.
If I replace the helloworld.jar file, I get this error:
Exception in thread "main" Property exception component:'jsgfGrammar' property:'grammarLocation' - Can't locate resource:/edu/cmu/sphinx/demo/helloworld/
edu.cmu.sphinx.util.props.InternalConfigurationException: Can't locate resource:/edu/cmu/sphinx/demo/helloworld/
at edu.cmu.sphinx.util.props.ConfigurationManagerUtils.getResource(ConfigurationManagerUtils.java:483)
at edu.cmu.sphinx.jsgf.JSGFGrammar.newProperties(JSGFGrammar.java:232)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.linguist.flat.FlatLinguist.newProperties(FlatLinguist.java:246)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.newProperties(SimpleBreadthFirstSearchManager.java:182)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.decoder.AbstractDecoder.newProperties(AbstractDecoder.java:65)
at edu.cmu.sphinx.decoder.Decoder.newProperties(Decoder.java:37)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.recognizer.Recognizer.newProperties(Recognizer.java:90)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:161)
at edu.cmu.sphinx.demo.helloworld.HelloWorld.main(HelloWorld.java:36)
And if I replace WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar, I get:
Exception in thread "main" Property exception component:'wsjLoader' property:'location' - Can't locate resource:/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz
edu.cmu.sphinx.util.props.InternalConfigurationException: Can't locate resource:/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz
at edu.cmu.sphinx.util.props.ConfigurationManagerUtils.getResource(ConfigurationManagerUtils.java:483)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.newProperties(Sphinx3Loader.java:243)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel.newProperties(TiedStateAcousticModel.java:102)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.linguist.flat.FlatLinguist.setupAcousticModel(FlatLinguist.java:278)
at edu.cmu.sphinx.linguist.flat.FlatLinguist.newProperties(FlatLinguist.java:244)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.newProperties(SimpleBreadthFirstSearchManager.java:182)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.decoder.AbstractDecoder.newProperties(AbstractDecoder.java:65)
at edu.cmu.sphinx.decoder.Decoder.newProperties(Decoder.java:37)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.recognizer.Recognizer.newProperties(Recognizer.java:90)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:161)
at edu.cmu.sphinx.demo.helloworld.HelloWorld.main(HelloWorld.java:36)
I know there is some problem with locating the files, but I don't see how to fix it. Could the reason be that I am compressing the jar files back in the wrong way? Keep in mind that the original demo files work fine.

You went wrong from the beginning: you should use the updated version, sphinx4-5prealpha, which is much easier to use, and you should not repackage any sphinx4 jars. Here are the steps to create an application that uses sphinx4 and recognizes a custom grammar:
1. Set up a new Java application in your IDE.
2. Add the sphinx4 dependency, either as jars or with Maven/Gradle.
3. Write the grammar you need and add it to the application resources.
4. Write the dictionary you need and add it to the application resources.
5. Copy the SpeechRecognizer code from the tutorial and modify the paths according to the location of the resources you created (see the sketch below).
For more details, see the sphinx4 tutorial.
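A minimal sketch of step 5 with the sphinx4-5prealpha high-level API; the class name and the resource:/myapp/... paths are placeholders for wherever you put your own grammar and dictionary, while the acoustic model path is the one bundled in the sphinx4-data jar:

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;

public class GrammarDemo {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // Acoustic model shipped with the sphinx4-data jar
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        // Placeholders: point these at your own dictionary and grammar resources
        configuration.setDictionaryPath("resource:/myapp/mydict.dic");
        configuration.setGrammarPath("resource:/myapp/");
        configuration.setGrammarName("hello");
        configuration.setUseGrammar(true);

        LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
        recognizer.startRecognition(true);
        // Print the hypothesis for one utterance, then stop
        System.out.println(recognizer.getResult().getHypothesis());
        recognizer.stopRecognition();
    }
}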

Related

SSIS Package strange data flow issue, spitting out empty excel with large dataset

I am having an issue with an SSIS package: running it from BIDS I could export 400K records successfully, but when I run it from a Job, the package completes successfully yet the Excel file is empty.
The user I am running the package as has full access to the C:\Users folders, and I can see it saving the data into the temporary folder, but it never writes that data into the file and finishes with an empty file.
For example, with 230000 records (works):
Create the Excel file
Load the temporary data
Write data into the file
Close the file
With 330000 records (not working):
Create the Excel file
Load the temporary data
Write data into the file <- this step is missing from the Process Monitor trace
Close the file
The suggested fix of giving the user executing the package permission to the folder C:\Users\Default doesn't work for me.
Please help!
Sorry for bugging you guys, I found the problem. There was just 1.6GB of disk space left on the server; although the file itself takes only about 200MB, the export generates lots of temporary files, causing a disk-full error. Strangely, the SSIS package ran successfully without giving any warning or error. Thanks for looking into it.

Spark (PySpark) File Already Exists Exception

I am trying to save a data frame as a text file; however, I am getting a File Already Exists exception. I tried adding the mode to the code, but to no avail. Furthermore, the file does not actually exist. Would anyone have an idea how I can solve this problem? I am using PySpark.
This is the code:
import os

distFile = sc.textFile("/Users/jeremy/Downloads/sample2.nq")
mapper = distFile.map(lambda q: __q2v(q))  # __q2v is the asker's own helper (not shown)
reducer = mapper.reduceByKey(lambda a, b: a + os.linesep + b)
data_frame = reducer.toDF(["context", "triples"])
data_frame.coalesce(1).write.partitionBy("context").text("/Users/jeremy/Desktop/so")
May I add that the exception is being raised after some time and that some data is actually stored in temporary files (which are obviously deleted).
Thanks!
Edit: Exception can be found here: https://gist.github.com/jerdeb/c30f65dc632fb997af289dac4d40c743
You can use overwrite or append mode to replace the file or to add the data to the same file:
data_frame.coalesce(1).write.mode('overwrite').partitionBy("context").text("/Users/jeremy/Desktop/so")
or
data_frame.coalesce(1).write.mode('append').partitionBy("context").text("/Users/jeremy/Desktop/so")
I had the same problem and was able to get around it with this:
outputDir = "/FileStore/tables/my_result/"
dbutils.fs.rm(outputDir , True)
Just change the outputDir variable to whatever directory you are writing to (dbutils is available on Databricks).
You should check your executors and look at the logs of the ones that are failing.
In my case, I had a coalesce(1) on a large DF. 4 of my executors failed - 3 of them had the same error of org.apache.hadoop.fs.FileAlreadyExistsException: File already exists.
However, 1 of them had a different exception: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 262144 bytes of memory, got 148328
I was able to fix it by increasing the executor memory so that the coalesce did not cause an out of memory error.
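The question is PySpark, but spark.executor.memory is a plain Spark configuration key, so the same fix applies from any API or via spark-submit --conf. A minimal sketch with the JVM-side builder (the 8g value is a placeholder; size it for your cluster):

import org.apache.spark.sql.SparkSession;

public class ExecutorMemoryExample {
    public static void main(String[] args) {
        // Give executors more headroom so coalesce(1), which funnels all
        // partitions through a single task, does not exhaust executor memory
        SparkSession spark = SparkSession.builder()
                .appName("coalesce-example")
                .config("spark.executor.memory", "8g") // placeholder value
                .getOrCreate();
        spark.stop();
    }
}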

shutil move raising Invalid cross-device link error

I am using Python 3.5.
I am using shutil.move(src, dest) to move a file between two different file systems.
As I understand it, the cross-device link error is raised by the OS because it can't create hard links across two different file systems, which is fine.
But as per the documentation, shutil.move can move files by copying them to the destination and then deleting them at the source.
My traceback further says that the exception comes from the os.rename that shutil.move calls internally on line 538 of its source code.
Does anyone know how to make shutil.move work?
I have read tons of posts suggesting shutil.move should definitely work for moving files between two file systems, including the documentation.

Unable to Load Resource Using getResource

I am trying to simply load a file from a package's resources folder. My project keeps the txt files under a names resources directory (project structure screenshot omitted), and I have tried the following in an attempt to load each of them from the Populator.groovy script:
File file = new File(Populator.class.getResource("/names/first-names.txt").getFile())
The above results in a FileNotFoundException if any methods are called on the file instance. The path returned is correct, and the file is indeed where the path specifies. I am also using very similar methods of extracting resources in other modules, and no errors occur there. What's going on here?
Why not
File file = new File(Populator.class.getResource("/names/first-names.txt").toURI())
Not sure why you want it as a file though? Wouldn't an input stream do?
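For example, a minimal sketch of reading it as a stream instead; this also keeps working when the resource is packaged inside a jar, where getResource(...).getFile() no longer points at a real file on disk:

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Read the classpath resource line by line without going through File
try (InputStream in = Populator.class.getResourceAsStream("/names/first-names.txt");
     BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line); // or collect the names as needed
    }
}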

ParseExceptions when using HQL file on HDInsight

I'm following this tutorial http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-hive/ but have become stuck when changing the source of the query to use a file.
It all works happily when using New-AzureHDInsightHiveJobDefinition -Query $queryString, but when I try New-AzureHDInsightHiveJobDefinition -File "/example.hql", with example.hql stored in the "root" of the blob container, I get ExitCode 40000 and the following in standarderror:
Logging initialized using configuration in file:/C:/apps/dist/hive-0.11.0.1.3.7.1-01293/conf/hive-log4j.properties
FAILED: ParseException line 1:0 character 'ï' not supported here
line 1:1 character '»' not supported here
line 1:2 character '¿' not supported here
Even when I deliberately misspell the hql filename, the above error is still generated along with the expected file-not-found error, so it's not the content of the hql that's causing the error.
I have not been able to find hive-log4j.properties in the blob store to see if it's corrupt. I have torn down the HDInsight cluster, deleted the associated blob store, and started again, but ended up with the same result.
Would really appreciate some help!
I am able to induce a similar error by putting a UTF-8 or Unicode encoded .hql file into blob storage and attempting to run it; those three characters are the UTF-8 byte-order mark. Try saving your example.hql file as 'ANSI' in Notepad (open it, choose Save As, and the encoding option is at the bottom of the dialog), then copy it to blob storage and try again.
If the file is not found when Start-AzureHDInsightJob runs, that cmdlet errors out and does not return a new AzureHDInsightJob object. If you had a previous instance of the result saved, the subsequent Wait-AzureHDInsightJob and Get-AzureHDInsightJobOutput would refer to that previous run, giving the illusion of the same error in the not-found case. The error itself definitely indicates a problem reading a UTF-8 or Unicode file where one is not expected.
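As an alternative to re-saving in Notepad, a minimal sketch of stripping the byte-order mark programmatically (the file names are placeholders):

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StripBom {
    public static void main(String[] args) throws Exception {
        Path src = Paths.get("example.hql");        // placeholder input
        Path dest = Paths.get("example-ansi.hql");  // placeholder output
        String text = new String(Files.readAllBytes(src), StandardCharsets.UTF_8);
        // Drop a leading byte-order mark (U+FEFF) if present
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            text = text.substring(1);
        }
        // Re-save as plain ASCII, which Hive accepts
        Files.write(dest, text.getBytes(StandardCharsets.US_ASCII));
    }
}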
