CMU Sphinx 5prealpha alignment issue - cmusphinx

I am using sphinx 5prealpha to do alignment but I am getting bad results. I tried different AMs and dictionaries, and the results are always the same. When I use the same AMs and dictionaries with an older version (sphinx4) I get very good results.
For this audio file and text:
files
The result is good for sphinx4 but not for 5prealpha. I am sure that if you try with any French AM and dictionary you will see the difference.
Is there any way to fix this issue? Any help will be appreciated.
---- UPDATE ----
I tried with the two AMs and dictionaries here:
Ester and Sphinx models
I am using this source code: Sphinx source code
When I use the Ester AM and dictionary, I expect to get (the result from the old sphinx version 4):
expected.txt
But I get (with 5prealpha):
what_i_get.txt
I am displaying the result with:
List<WordResult> wr = aligner.align(audioUrl, text);
for (WordResult result : wr) {
    System.out.println(
            result.getWord().toString() + " " + Long.toString(result.getTimeFrame().getStart())
            + " " + Long.toString(result.getTimeFrame().getEnd()));
}

The old algorithm worked best for short utterances like yours; the new one is designed for very long files, so it is not that great for short utterances.
One has to slightly fix the alignment algorithm to make it work for this case. I'm planning to look into it this week, but it will probably take some more time.

Related

Android java get wav file frames

I need to get waveform data from a wav file, but my code does not return the right waveform (I compared my results with the waveform from FL Studio).
This is my code:
path = "/storage/emulated/0/FLM User
Files/My Samples/808 (16).wav";
waveb = FileUtil.readFile(path);
waveb = waveb.substring((int) (waveb.indexOf("data") + 4), (int)(waveb.length()));
byte[] b = waveb.getBytes();
for(int i= 0; i < (int)(b.length/4); i++) {
map = new HashMap<>();
map.put("value", String.valueOf((long)((b[i*4] & 0xFF) +
((b[i*4+1] & 0xFF) << 8))));
map.put("byte", String.valueOf((long)(b[i*4])));
l.add(map);
}
listview1.setAdapter(new
Listview1Adapter(l));
( (BaseAdapter)listview1.getAdapter()).notifyDataSetChanged();
My results:
FL Studio Mobile results:
I'm not sure I can help, given what I know off the top of my head, but perhaps this will trigger some ideas in your search for a solution.
It looks to me like you are assuming the sound file is 16-bit stereo, little-endian, and that you are only attempting to inspect one track of the stereo frame. Can you confirm this?
There's at least one way this plan could go awry: the .wav header may be an odd number of bytes in length, and you might not be properly parsing frame boundaries as a result. As an experiment, maybe try adding a different increment when you reference the b[] array? For example b[i*4 + 1] and b[i*4 + 2] instead of b[i*4] and b[i*4 + 1]. This won't solve the general problem of parsing .wav headers, but it could at least get you closer to understanding the situation.
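To make that experiment concrete, here is a minimal sketch in plain Java, assuming the file really is 16-bit little-endian stereo PCM and that the samples start 8 bytes past the "data" tag (the 4-byte chunk-size field follows the tag). The path, class name and helper are placeholders of mine; this is not a general .wav parser:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WavLeftChannelDump {
    public static void main(String[] args) throws IOException {
        // Read raw bytes; going through a String mangles binary data.
        byte[] b = Files.readAllBytes(Paths.get("/storage/emulated/0/test.wav"));

        // The 4 bytes after the "data" tag hold the chunk size,
        // so the PCM samples start 8 bytes past the tag.
        int dataTag = indexOf(b, "data".getBytes());
        if (dataTag < 0) {
            System.err.println("no data chunk found");
            return;
        }
        int pcmStart = dataTag + 8;

        // Assume 16-bit little-endian stereo: 4 bytes per frame, left channel first.
        // Shift pcmStart by 1 or 2 to try the offsets suggested above.
        for (int i = pcmStart; i + 3 < b.length; i += 4) {
            short left = (short) ((b[i] & 0xFF) | ((b[i + 1] & 0xFF) << 8));
            System.out.println(left);
        }
    }

    // Naive search for a byte pattern inside a byte array.
    private static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }
}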
It sure looks like Java's AudioInputStream is not accessible in Android, and all the searches I have made asking whether there is an Android equivalent are turning up unanswered.
I've used AudioTrack for the playback of raw PCM, but I don't know an Android equivalent for reading wav files. The AudioRecord class and read() methods look interesting as the read methods store PCM data in a short array, but I've never used them, and they seem to be hard-coded to the microphone for input.
There used to be a Google Group: andraudio@googlegroups.com. IDK if it is still around. I used to go there and occasionally ask about things.
Maybe there is code you can use from Oboe or libGDX? The latter makes use of OpenAL and is for cross-platform development, with Android as one of the target platforms. I have not looked into either for this question.
If you do find the answer, it would be great to post it as a solution. This seems to be a matter that many have tried to solve and given up on.

Take mean of feature set using openSMILE audio feature extractor

My problem is taking the mean of all features from different frames in one sample .wav file. I am trying cFunctionals in the "chroma_fft.conf" file, which belongs to the latest OpenEar framework. For the best explanation, I am writing out the essential code which I wrote in "chroma_fft.conf", shown below:
[componentInstances:cComponentManager]
instance[functL1].type = cFunctionals
[functL1:cFunctional]
reader.dmLevel = chroma
writer.dmLevel = func
frameMode = full
frameSize=0
frameStep=0
functionalsEnabled = Means
Means.amean = 1
[csvSink:cCsvSink]
reader.dmLevel = func
..NOT-IMPORTANT......
..NOT-IMPORTANT......
However, when I run it from the command prompt in Windows, I get this error:
"(ERROR) [1] in configManager : base instance of field 'functL1.reader.dmInstance' not found in configmanager!"
Very similar code runs successfully from "emo_large.conf", but this code gets an error. If anybody knows how to use the openSMILE audio feature extractor, please advise why it produces this error and how to use "cFunctionals" properly to take the mean, variance, moments, etc. of large feature sets.
Thanks!
In this case you have a typo in
[functL1:cFunctional]
which should be
[functL1:cFunctionals]
I admit the error message
"(ERROR) [1] in configManager : base instance of field 'functL1.reader.dmInstance' not found in configmanager!"
is not intuitive, but it refers to the fact that openSMILE expects a configuration section functL1 of type cFunctionals in the config to read the mandatory (sub-)field functL1.reader.dmInstance, which it then cannot find, because the section (due to the typo) is not defined.
Cheers,
Florian

Finding Related Topics using Google Knowledge Graph API

I'm currently working on a behavioral targeting application and I need a considerably large keyword database/tool/provider that enables my app to reach similar keywords via a given keyword. I recently found Freebase, which had been providing a similar service before Google acquired it and integrated it into their Knowledge Graph. I was wondering if it's possible to get a list of related topics/keywords for a given entity.
import json
import urllib

api_key = 'API_KEY_HERE'
query = 'Yoga'
service_url = 'https://kgsearch.googleapis.com/v1/entities:search'
params = {
    'query': query,
    'limit': 10,
    'indent': True,
    'key': api_key,
}
url = service_url + '?' + urllib.urlencode(params)
response = json.loads(urllib.urlopen(url).read())
for element in response['itemListElement']:
    print element['result']['name'] + ' (' + str(element['resultScore']) + ')'
The script above returns the results below, though I'd like to receive topics related to yoga, such as health, fitness, gym and so on, rather than things that have the word "Yoga" in their name.
Yoga Sutras of Patanjali (71.245544)
Yōga, Tokyo (28.808222)
Sri Aurobindo (28.727333)
Yoga Vasistha (28.637642)
Yoga Hosers (28.253984)
Yoga Lin (27.524054)
Patanjali (27.061115)
Yoga Journal (26.635073)
Kripalu Center (26.074436)
Yōga Station (25.10318)
I'd really appreciate any suggestions, and I'm also open to using any other API if there is any that I could make use of. Cheers.
I see your point :) So here's the script I use for that, based on Serpstat's API. Here's how it works:
The script collects keywords from Serpstat's database
Then it collects search suggestions from Serpstat's database
Finally, it collects search suggestions from Google's suggestions
Note that to make the script work correctly, it's preferable to fill in all input boxes, but not all of them are required.
Keyword — required keyword
Search Engine — the search engine for which the analysis will be carried out. For example, for US Google, you need to set g_us. The entire list of available search engines can be found here.
Limit — the maximum number of phrases from the organic results that will take part in the analysis. You cannot set more than 1000 here.
Default keys — list of two-word keywords. You should give each of them some "weight" to receive some kind of result if something goes wrong.
Format: type, keyword, "weight". Every keyword should be written from a new line.
Types:
w — one word
p — two words
Examples:
"w; bottle; 50" — initial weight of word bottle is 50.
"p; plastic bottle; 30" — initial weight of phrase plastic bottle is 30.
"w; plastic bottle; 20" — incorrect. You cannot use a two-word phrase for the "w" type.
Bad words — comma-separated list of words you want the script to exclude from the results.
Token — here you need to enter your token for API access. It can be found on your profile page.
You can download the source code for the script here

Using Perl6 to process a large text file, and it's Too Slow. (2014-09)

The code is hosted at https://github.com/yeahnoob/perl6-perf, as follows:
use v6;

my $file = open "wordpairs.txt", :r;
my %dict;
my $line;
repeat {
    $line = $file.get;
    my ($p1, $p2) = $line.split(' ');
    if ?%dict{$p1} {
        %dict{$p1} = "{%dict{$p1}} {$p2}".words;
    } else {
        %dict{$p1} = $p2;
    }
} while !$file.eof;
It runs well when "wordpairs.txt" is small.
But when the "wordpairs.txt" file is about 140,000 lines (two words per line), it runs Very Very Slow and cannot finish, even after 20 seconds of running.
What's the problem with it? Is there any fault in the code?
Thanks for any help!
The following content was added 2014-09-04. THANKS for the many suggestions from the SE answers and from IRC (#perl6 on freenode).
The code (for now, 2014-09-04):
my %dict;

grammar WordPairs {
    token word-pair { (\S*) ' ' (\S*) "\n" }
    token TOP { <word-pair>* }
}

class WordPairsActions {
    method word-pair($/) { %dict{$0}.push($1) }
}

my $match = WordPairs.parse(slurp, :actions(WordPairsActions));
say ?$match;
Running time cost (for now):
$ time perl6 countpairs.pl wordpairs.txt
True
The pairs count of the key word "her" in wordpairs.txt is 1036
real 0m24.043s
user 0m23.854s
sys 0m0.181s
$ perl6 --version
This is perl6 version 2014.08 built on MoarVM version 2014.08
This test's time performance is not reasonable for now (the same proper Perl 5 code only costs about 160ms), but it is Much Better than my original old Perl6 code. :)
PS. The whole thing, including original test code, patch and sample text, is on github.
I've tested this with code very similar to Christoph's using a file containing 10,000 lines. It takes around 15 seconds, which, as you say, is significantly slower than Perl 5. I suspect that the code is slow because something this code uses hasn't seen as much optimisation effort as other parts of Rakudo and MoarVM have received recently. I'm sure that the performance of the code will improve dramatically over the next few months as whatever is slow sees more attention.
When trying to determine why some Perl 6 code is slow I suggest running perl6 on MoarVM with --profile to see whether it helps you find the bottleneck. Unfortunately, with this code it'll point to rakudo internals rather than anything you can improve.
It's certainly worth talking to #perl6 on irc.freenode.net as they'll have the knowledge to offer an alternative solution and will be able to improve its performance in the future.
Rakudo isn't exactly known for its stellar performance.
Using more idiomatic code might or might not help:
my %dict;
for open('wordpairs.txt', :r).lines {
    my ($key, @words) = .words;
    push %dict{$key}, @words;
}
You could also check the other backends (Rakudo runs on MoarVM, Parrot and JVM) to see if it is equally slow everywhere.
It would be interesting to know if it's IO or processing that's slow, e.g. via
my %dict;
say 'start IO';
my @lines = eager open('wordpairs.txt', :r).lines;
say 'done IO';
say 'start processing';
for @lines { ... }
say 'done processing';
I believe there's also a profiler available, if you want to dig into the issue yourself.

Using indexed types for ElasticSearch in Titan

I currently have a VM running Titan over a local Cassandra backend and would like the ability to use ElasticSearch to index strings using CONTAINS matches and regular expressions. Here's what I have so far:
After titan.sh is run, a Groovy script is used to load in the data from separate vertex and edge files. The first stage of this script loads the graph from Titan and sets up the ES properties:
config.setProperty("storage.backend","cassandra")
config.setProperty("storage.hostname","127.0.0.1")
config.setProperty("storage.index.elastic.backend","elasticsearch")
config.setProperty("storage.index.elastic.directory","db/es")
config.setProperty("storage.index.elastic.client-only","false")
config.setProperty("storage.index.elastic.local-mode","true")
The second part of the script sets up the indexed types:
g.makeKey("property").dataType(String.class).indexed("elastic",Edge.class).make();
The third part loads in the data from the CSV files; this has been tested and works fine.
My problem is, I don't seem to be able to use the ElasticSearch functions when I do a Gremlin query. For example:
g.E.has("property",CONTAINS,"test")
returns 0 results, even though I know this field contains the string "test" for that property at least once. Weirder still, when I change CONTAINS to something that isn't recognised by ElasticSearch I get a "no such property" error. I can also perform exact string matches and any numerical comparisons, including greater than or less than; however, I suspect the default indexing method is being used instead of ElasticSearch in these cases.
Due to the lack of errors when I try to run a more advanced ES query, I am at a loss as to what is causing the problem here. Is there anything I may have missed?
Thanks,
Adam
I'm not quite sure what's going wrong in your code. From your description everything looks fine. Can you try the following script (just paste it into your Gremlin REPL):
config = new BaseConfiguration()
config.setProperty("storage.backend","inmemory")
config.setProperty("storage.index.elastic.backend","elasticsearch")
config.setProperty("storage.index.elastic.directory","/tmp/es-so")
config.setProperty("storage.index.elastic.client-only","false")
config.setProperty("storage.index.elastic.local-mode","true")
g = TitanFactory.open(config)
g.makeKey("name").dataType(String.class).make()
g.makeKey("property").dataType(String.class).indexed("elastic",Edge.class).make()
g.makeLabel("knows").make()
g.commit()
alice = g.addVertex(["name":"alice"])
bob = g.addVertex(["name":"bob"])
alice.addEdge("knows", bob, ["property":"foo test bar"])
g.commit()
// test queries
g.E.has("property",CONTAINS,"test")
g.query().has("property",CONTAINS,"test").edges()
The last 2 lines should return something like e[1t-4-1w][4-knows-8]. If that works and you still can't figure out what's wrong in your code, it would be good if you can share your full code (e.g. in Github or in a Gist).
Cheers,
Daniel
