I'd like to implement something like Google's direct answers, which uses the Knowledge Graph. Are there any useful resources I can read? Also, where can I find data for that?
Thanks in advance
From the semantic web side of your question, I guess an interesting starting point for you is the DBpedia dataset (example queries), maybe in combination with the strings-to-Wikipedia-concepts dataset.
I am excited about this free class: Probabilistic Graphical Models, taught by Professor Daphne Koller. I would look her up and watch her YouTube videos in the meantime.
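To give an idea of what a DBpedia query looks like, here is a minimal sketch that builds a SPARQL query for the abstract of one resource and the corresponding request URL. It assumes the public endpoint at https://dbpedia.org/sparql and its `format` request parameter; the function names and the example resource `Berlin` are hypothetical, chosen only for illustration. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Assumption: the public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_abstract_query(resource_name: str, lang: str = "en") -> str:
    """Build a SPARQL query that asks for the abstract of one DBpedia resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource_name}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )


def build_request_url(query: str) -> str:
    """Encode the query into a GET URL that asks for JSON results."""
    # Assumption: the endpoint accepts a `format` parameter for the result type.
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    query = build_abstract_query("Berlin")
    print(query)
    print(build_request_url(query))
```

The resulting URL can be fetched with any HTTP client. For a direct-answers style system, the question text would first have to be mapped to a resource name, which is where a strings-to-concepts dataset could help.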
I know that object recognition is currently not supported by Google's ARCore.
My simple goal: detect cups and show some coffee inside them. (Ideally this would be displayed live on the phone.)
Is there really no way to detect objects?
Do you know of any additional computational approaches that can recognize objects via ARCore?
Train a CNN, but instead of training on image + annotation pairs, use point cloud + annotation pairs. Is this approach viable?
Is there any approach to record a video plus point cloud and process them on a backend?
Is Snapchat using ARCore?
Are they detecting the face and pose to put the virtual makeup on the mesh?
How is the mesh computed?
I don't expect answers to every question, just ideas.
Maybe someone knows of similar projects, interesting links, or something to think about.
Thanks in advance.
I have a text script that is used to create podcasts. So the words in podcast audio are exactly the same as in my text. Now what I want to have is the following:
Word in text | Pronunciation started at
Hello        | 0:0:0.000
my           | 0:0:1.125
friends      | 0:0:2.750
Is that possible to do at all?
Thanks in advance!
One keyword you could start with to approach this problem is "forced alignment". This site also covers questions on this topic, e.g. here, which leads you to questions and answers about HTK (the Hidden Markov Model Toolkit) via the related threads.
You can find a more hands-on style description of how to use forced alignment in automated audio segmentation here.
So the answer is: yes, it is possible, but it is algorithmically very complex and even in its best implementations it is not error-free.
PS.: I found you a really simple tool
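Once a forced aligner has produced a start time for each word, turning that into the table from the question is the easy part. The sketch below assumes the aligner output is already available as a list of (word, start time in seconds) pairs; that input format and both function names are hypothetical, and the alignment itself is not implemented here.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as h:m:s.mmm, matching the style used in the question."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours}:{minutes}:{secs}.{millis:03d}"


def format_alignment_table(aligned_words) -> str:
    """Render (word, start_seconds) pairs as a plain-text table."""
    lines = ["Word in text | Pronunciation started at"]
    for word, start_seconds in aligned_words:
        lines.append(f"{word} | {format_timestamp(start_seconds)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical aligner output for the example in the question.
    example = [("Hello", 0.0), ("my", 1.125), ("friends", 2.75)]
    print(format_alignment_table(example))
```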
I have an alphabet which has not been tackled before, so when scanned, there's no way to detect the letters for recognition with OCR. I'm trying to program OCR for it, but don't have much experience in this. I'd appreciate some hints as to where to get started, and how such a system is normally implemented.
Take a look at this page--it describes the training process for an open source OCR engine.
The free Stanford Online Machine Learning class has a great set of lessons on Photo OCR in Part XVIII.
This blog post has a brief description of the example taught in the class.
There are some excellent resources on Google Books. Likewise, if you search for Optical Character Recognition on Amazon, there are some pretty up-to-date books that look to be fairly thick and intellectually challenging :D heh
btw - I'm well aware this post has some age, but you never know when some other person might stumble across this and find just what they need. And if this even has the chance of helping out, then so be it. OCR is such a strange subject, that there's not too much out there that can really really answer the deep-machine ended questions. Especially if you're going to attempt to write your own library. :P
I am looking for a way to compare a user-submitted audio recording against a reference recording in order to give someone a grade or percentage for language learning.
I realize that this is a very unscientific way of doing things and is more of a gimmick than anything.
My first thoughts are some sort of audio fingerprinting, or waveform comparison.
Any ideas where I should be looking?
This is by no means a trivial problem to solve, though there is an abundance of research on the topic. Presently the most successful forms of machine learning in the speech recognition domain apply Hidden Markov Model techniques.
You may also want to take a look at existing implementations of HMM algorithms. One such library in its early stages is ghmm.
Perhaps even better and more readily applicable to your problem is HTK.
In addition to chomp's great answer, one important keyword you probably need to look up is Dynamic Time Warping (DTW). This is the Wikipedia article: http://en.wikipedia.org/wiki/Dynamic_time_warping
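To show what DTW computes, here is a minimal sketch of the textbook dynamic-programming formulation on two sequences of numbers. For real speech you would compare sequences of feature vectors (for example MFCC frames) with a vector distance rather than raw samples; the function name and the absolute-difference cost are assumptions made for this illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Return the DTW alignment cost between two non-empty numeric sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] is the cheapest alignment of seq_a[:i] with seq_b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            # Extend the cheapest of: step in seq_a only, step in seq_b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    return cost[n][m]


if __name__ == "__main__":
    reference = [1, 2, 3]
    slower_attempt = [1, 2, 2, 3]  # same shape, stretched in time
    print(dtw_distance(reference, slower_attempt))
```

Because the warping absorbs differences in speaking rate, the stretched sequence above has a cost of 0 against the reference. Turning such a cost into a grade or percentage would still need some normalization, for example by sequence length.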
Given images from certain viewpoints, is there some software out there that can help me interpolate between the views (i.e. viewpoint interpolation software)?
Thanks,
View interpolation is an ill-posed and very difficult problem, which in general has no optimal solution in real-world scenarios.
There are several reasons for this, including:
Wide Baseline Matching
Disparity Estimation
Occlusion Handling (Folds and Holes)
The quality of the outcome highly depends on your video footage, especially on how close together the cameras are.
Nevertheless, research is ongoing. Dyer and Seitz for example achieved nice results on constrained examples:
http://homes.cs.washington.edu/~seitz/vmorph/vmorph.htm
Stich et al. from TU Braunschweig showed some amazing results with their Virtual Camera system, which is probably the thing you are looking for:
http://www.youtube.com/watch?v=uqKTbyNoaxE
And finally, for soccer enthusiasts:
http://www.youtube.com/watch?v=cUK1UobhCX0
http://www.youtube.com/watch?v=UePrOp2s31c
As a starting point, look for Image Morphing and View Morphing.
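The core geometric step that morphing methods share can be sketched as follows: given corresponding points in two images, the in-between positions are a linear blend of the two. This is only a sketch under strong assumptions. It takes the correspondences as given (finding them is the hard matching problem listed above), it ignores occlusions and the pixel blending itself, and for view morphing the two images would first have to be rectified to parallel views. The function name is hypothetical.

```python
def interpolate_points(points_a, points_b, t):
    """Linearly blend corresponding 2D points; t=0 gives view A, t=1 gives view B."""
    if len(points_a) != len(points_b):
        raise ValueError("point lists must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [
        ((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
        for (xa, ya), (xb, yb) in zip(points_a, points_b)
    ]


if __name__ == "__main__":
    # Hypothetical correspondences: the same two features seen in both views.
    view_a = [(0.0, 0.0), (4.0, 2.0)]
    view_b = [(10.0, 4.0), (8.0, 2.0)]
    print(interpolate_points(view_a, view_b, 0.5))
```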
If you want, I can provide some more papers on the topic.
Good luck! :)
EDIT: Also, if you're interested, here's my work on Spatial and Temporal Interpolation of Multi-View Image-Sequence.
http://tobiasgurdan.de/research/