There is a lot of research going on about gesture recognition. I figured I would narrow this down to the topic of hand gesture recognition (i.e. stationary hand positions, up to as complex and dynamic as sign language recognition).
Considering the image processing techniques available in real-time, such as blob detection, edge detection, point of interest tracking, etc. Coupled with Hidden Markov Models and other comparison AI, what techniques/algorithms would you use to do real-time motion tracking and gesture recognition?
I remember previous colleagues of mine have worked on a similar problem. They have published a conference paper on this topic: Framework for a portable gesture interface. Hope this helps.
I think for being accurate you have to combine all these techniques.
I did such a thing with wii-mote, it recognizes the hand move but not gesture.
We used hidden markov model and it was successful in real time.
Maybe it is not a real answer but you cant know without trying.
well i think you can use active shape models or active appearence models i remember seeing some papers about it
Related
How do you determine which onsets are beats? I am using Spectral Flux for Note Onset Detection and a Running Mean for peak-picking/thresholding.
I am just working with the guitar instrument so the presence of percussions may not help with this. Any ideas?
Thanks!
EDIT: Wow...just realized this question is 3 years old...sorry to resurrect an old post.
My Master's thesis was in beat detection and the main advantage of my method over all other published methods of beat detection was in resolution, both in the time domain and frequency (beat) domain. You can find my thesis here. What it basically boils down to (after alot of filtering) is a comb-filter convolution. My code is an adaptation of this project, which contains Matlab files for you to see how it works.
My code (both in C++ and the Matlab port) is not publicly available due to possible copywrite issues with my university, but if you email me at dberm22[at]gmail[dot]com, I'd be more than willing to ahem::discuss my work with you.
Try using a beat tracking algorithm. Beat tracking is a distinct problem from onset detection.
I think there's a good algorithm in the Queen Mary plugin set for Sonic Visualizer. The plugins are open source, so you can have a look at the code to figure out how they work.
Or do a search on google scholar for "beat tracking". There are a number of effective approaches. Dan Ellis' is a good one to start with. It's intuitive, and there's code available in Matlab and Java.
I am looking for a way to compare a user submitted audio recording against a reference recording for comparison in order to give someone a grade or percentage for language learning.
I realize that this is a very un-scientific way of doing things and is more than a gimmick than anything.
My first thoughts are some sort of audio fingerprinting, or waveform comparison.
Any ideas where I should be looking?
This is by no means a trivial problem to solve, though there is an abundance of research on the topic. Presently the most successful forms of machine learning in the speech recognition domain apply Hidden Markov Model techniques.
You may also want to take a look at existing implementations of HMM algorithms. One such library in its early stages is ghmm.
Perhaps even better and more readily applicable to your problem is HTK.
In addition to chomp's great answer, one important keyword you probably need to look up is Dynamic Time Warping (DTW). This is the wikipedia article: http://en.wikipedia.org/wiki/Dynamic_time_warping
Given images from a certain viewpoints is there some software out there which can help me interpolate the views(i.e. Viewpoint interpolation software?).
Thanks,
View Interpolation is an illposed and very difficult problem, which in general has no optimal solution in real world scenarios.
This is because of several reasons, some of which include
Wide Baseline Matching
Disparity Estimation
Occlusion Handling (Folds and Holes)
The quality of the outcome highly depends on your video footage, especially on how close together the cameras are.
Nevertheless, research is ongoing. Dyer and Seitz for example achieved nice results on constrained examples:
http://homes.cs.washington.edu/~seitz/vmorph/vmorph.htm
Stich et.al. from TU Braunschweig showed some amazing results with their Virtual Camera system, which probably is the one thing, you are looking for:
http://www.youtube.com/watch?v=uqKTbyNoaxE
And finally, for soccer enthusiasts:
http://www.youtube.com/watch?v=cUK1UobhCX0
http://www.youtube.com/watch?v=UePrOp2s31c
As a starting point, look for Image Morphing and View Morphing.
If you want, I can provide some more papers on the topic.
Good luck! :)
EDIT: Also, if you're interested, here's my work on Spatial and Temporal Interpolation of Multi-View Image-Sequence.
http://tobiasgurdan.de/research/
I am in the process of developping a game, and after two months of work (not full time mind you), I have come to realise that our specs for the game are lacking a lot of details. I am not a professional game developper, this is only a hobby.
What I would like to receive help or advices for is this: What are the major components that you find in games, that have to be developped or already exists as librairies? The objective of this question is for me to be able to specify more game aspects.
Currently, we had specified pretty much only how we would work on the visual, completely forgetting everything about game logic (AI, Entities interactions, Quest logic (how do we decide whether or not a quest is completed)).
So far, I have found those points:
Physics (collision detection, actual forces, etc.)
AI (pathfinding, objectives, etc.)
Model management
Animation management
Scene management
Combat management
Inventory management
Camera (make sure not to render everything that is in the scene)
Heightmaps
Entities communication (Player with NPC, enemy, other players, etc)
Game state
Game state save system
In order to reduce the scope of this queston, I'd like it if you could specifically discuss aspects related to developping an RPG type of game. I will also point out that I am using XNA to develop this game, but I have almost no grasp of all the classes available yet (pretty much only using the Game component with some classes that are related to it such as GameTime, SpriteBatch, GraphicDeviceManager) but not much more.
You have a decent list, but you are missing storage (save load), text (text is important in RPGs : Unicode, font rendering), probably a macro system for text (something that replaces tokens like {player} with the player characters name), and most important of all content generation tools (map editor, chara editor, dialog editor) because RPGs need content (or auto generation tools if you need ). By the way have links to your work?
I do this exact stuff for a living so if you need more pointers perhaps I can help.
I don't know if this is any help, but I have been reading articles from http://www.gamasutra.com/ for many years.
I don't have a perfect set of tools from the beginning, but your list covers most of the usual trouble for RUNNING the game. But have you found out what each one of the items stands for? How much have you made already? "Inventory Management" sounds very heavy, but some games just need a simple "array" of objects. Takes an hour to program + some graphical integration (if you have your GUI Management done already).
How to start planning
When I develop games in my spare time, I usually get an idea because another game lacks this function/option. Then I start up what ever development tool I am currently using and try to see if I can make a prototype showing this idea. It's not always about fancy graphics, but most often it's more about finding out how to solve a certain problem. Green and red boxes will help you most of the way, but otherwise, use Google Images and do a quick search for prototype graphics. But remember that these images are probably copyrighted, so only use them for internal test purposes and to explain to your graphic artists what type of game/graphic you want to make.
Secondly, you'll find that you need to find/build tools to create the "maps/missions/quests" too. Today many develop their own "object script" where they can easily add new content/path to a game.
Many of the ideas we (my friends and I) have been testing started with a certain prototype of the interface, to see if its possible to generate that sort of screen output first. Then we build a quick'n'dirty map/level-editor that can supply us with test maps.
No game logic at this point, still figuring out if the game-engine in general is running.
My first game-algorithm problem
Back when I was in my teens I had a Commodore 64 and I was wondering, how do they sort 10 numbers in order for a Highscore? It took me quite a while to find a "scalable" way of doing this, but I learned a lot about programming too.
The second problem I found
How do I make a tank/cannon fire a bullet in the correct direction when I fly my helicopter around the screen?
I sat down and drew quick sketches of the actual problem, looked at the bullet lines, tried some theories of my own and found something that seemed to be working (by dividing and multiplying positions etc.) later on in school I discovered this to be more or less Pythagoras. LOL!
Years and many game attempts later
I played "Dune" and the later C&C + the new game Warcraft (v1/v2) - I remember it started to annoyed me how the lame AI worked. The path finding algorithms were frustrating for the player, I thought. They moved in direction of target position and then found a wall, but if the way was to complex, the object just stopped. Argh!
So I first sat with large amounts of paper, then I tried to draw certain scenarios where an "object" (tank/ork/soldier) would go from A to B and then suddenly there was a "structure" (building/other object) in the pathway - what then?
I learned about A-star pathfinding (after solving it first on my own in a similar way, then later reading about the reason for this working). A very "cpu heavy" way of finding a path, but I learned a lot from the process of "cracking this nut". These thoughts have helped me a lot developing other game algortimes over time.
So what I am saying is: I think you'll have to think more of:
How is the game to be played?
What does the user experience look like?
Why would the user want to come back to the game?
What requirements are needed? Broadband? 19" monitor with 1280x1024?
An RPG, yes - but will it be multi-user or single?
Do we need a fast network/server setup or do we need to develop a strong AI for the NPCs?
And much more...
I am not sure this is what you asked for, but I hope you can use it somehow?
There are hundreds of components needed to make a game, from time management to audio. You'll probably need to roll your own GUI, as native OS controls are very non-gamey. You will probably also need all kinds of tools to generate your worlds, exporters to convert models and textures into something suitable for your game etc.
I would strongly recommend that you start with one of the many free or cheap game engines that are out there. Loads of them come with the source code, so you can learn how they have been put together as you go.
When you think you are ready, you can start to replace parts of the engine you are using to better suit your needs.
I agree with Robert Gould's post , especially about tools and I'd also add
Scripting
Memory Management
Network - especially replication of game object states and match-making
oh and don't forget Localisation - particularly for text strings
Effects and effect timers (could be magical effects, could just be stuff like being stunned.)
Character professions, skills, spells (if that kind of game).
World creation tools, to make it easy for non-programmer builders.
Think about whether or not you want PvP. If so, you need to really think about how you're going to do your combat system and any limits you want on who can attack whom.
Equipment, "treasure", values of things and how you want to do the economy.
This is an older question, but IMHO now there is a better answer: use Unity (or something akin to it). It gives you 90% of what you need to make a game up front, so you can jump in and focus directly on the part you care about, which is the gameplay. When you run aground because there's something it doesn't do out of the box, you can usually find a resource in the Asset Store for free or cheap that will save you a lot of work.
I would also add that if you're not working on your game full-time, be mindful of the complexity and the time-frame of the task. If you'll try to integrate so many different frameworks into your RPG game, you can easily end up with several years worth of work; maybe it would be more advisable to start small and only develop the "core" of your game first and not bother about physics, for example. You could still add it in the second version.
With the popularity of the Apple iPhone, the potential of the Microsoft Surface, and the sheer fluidity and innovation of the interfaces pioneered by Jeff Han of Perceptive Pixel ...
What are good examples of Graphical User Interfaces which have evolved beyond the
Windows, Icons, ( Mouse / Menu ), and Pointer paradigm ?
Are you only interested in GUIs? A lot of research has been done and continues to be done on tangible interfaces for example, which fall outside of that category (although they can include computer graphics). The User Interface Wikipedia page might be a good place to start. You might also want to explore the ACM CHI Conference. I used to know some of the people who worked on zooming interfaces; the Human Computer Interaction Lab an the University of Maryland also has a bunch of links which you may find interesting.
Lastly I will point out that a lot of innovative user interface ideas work better in demos than they do in real use. I bring that up because your example, as a couple of commenters have pointed out, might, if applied inappropriately, be tiring to use for any extended period of time. Note that light pens were, for the most part, replaced by mice. Good design sometimes goes against naive intuition (mine anyway). There is a nice rant on this topic with regard to 3d graphics on useit.com.
Technically, the interface you are looking for may be called Post-WIMP user interfaces, according to a paper of the same name by Andries van Dam. The reasons why we need other paradigms is that WIMP is not good enough, especially for some specific applications such as 3D model manipulation.
To those who think that UI research builds only cool-looking but non-practical demos, the first mouse was bulky and it took decades to be prevalent. Also Douglas Engelbart, the inventor, thought people would use both mouse and (a short form of) keyboard at the same time. This shows that even a pioneer of the field had a wrong vision about the future.
Since we are still in WIMP era, there are diverse comments on how the future will be (and most of them must be wrong.) Please search for these keywords in Google for more details.
Programming by example/demonstration
In short, in this paradigm, users show what they want to do and computer will learn new behaviors.
3D User Interfaces
I guess everybody knows and has seen many examples of this interface before. Despite a lot of hot debates on its usefulness, a part of 3D interface ongoing research has been implemented into many leading operating systems. The state of the art could be BumpTop. See also: Zooming User Interfaces
Pen-based/Sketch-based/Gesture-based Computing
Though this interface may use the same hardware setup like WIMP but, instead of point-and-click, users command through strokes which are information-richer.
Direct-touch User Interface
This is ike Microsoft's Surface or Apple's iPhone, but it doesn't have to be on tabletop. The interactive surface can be vertical, say wall, or not flat.
Tangible User Interface
This has already been mentioned in another answer. This can work well with touch surface, a set of computer vision system, or augmented reality.
Voice User Interface, Mobile computing, Wearable Computers, Ubiquitous/Pervasive Computing, Human-Robot Interaction, etc.
Further information:
Noncommand User Interface by Jakob Nielsen (1993) is another seminal paper on the topic.
If you want some theoretical concepts on GUIs, consider looking at vis, by Tuomo Valkonen. Tuomo has been extremely critical of WIMP concept for a long, he has developed ion window manager, which is one of many tiling window managers around. Tiling WMs are actually a performance win for the user when used right.
Vis is the idea of an UI which actually adapts to the needs of the particular user or his environment, including vision impairment, tactile preferences (mouse or keyboard), preferred language (to better suit right-to-left languages), preferred visual presentation (button order, mac-style or windows-style), better use of available space, corporate identity etc. The UI definition is presentation-free, the only things allowed are input/output parameters and their relationships. The layout algorithms and ergonomical constraints of the GUI itself are defined exactly once, at system level and in user's preferences. Essentially, this allows for any kind of GUI as long as the data to be shown is clearly defined. A GUI for a mobile device is equally possible as is a text terminal UI and voice interface.
How about mouse gestures?
A somewhat unknown, relatively new and highly underestimated UI feature.
They tend to have a somewhat steeper learning curve then icons because of the invisibility (if nobody tells you they exist, they stay invisible), but can be a real time saver for the more experienced user (I get real aggrevated when I have to browse without mouse gestures).
It's kind of like the hotkey for the mouse.
Sticking to GUIs puts limits on the physical properties of the hardware. Users have to be able to read a screen and respond in some way. The iPhone, for example: It's interface is the whole top surface, so physical size and the IxD are opposing factors.
Around Christmas I wrote a paper exploring the potential for a wearable BCI-controlled device. Now, I'm not suggesting we're ready to start building such devices, but the lessons learnt are valid. I found that most users liked the idea of using language as the primary interaction medium. Crucially though, all expressed concerns about ambiguity and confirmation.
The WIMP paradigm is one that relies on very precise, definite actions - usually button pressing. Additionally, as Nielsen reminds us, good feedback is essential. WIMP systems are usually pretty good at (or at least have the potential to) immediately announcing the receipt and outcome of a users actions.
To escape these paired requirements, it seems we really need to write software that users can trust. This might mean being context aware, or it might mean having some sort of structured query language based on a subset of English, or it might mean something entirely different. What it certainly means though, is that we'd be free of the desktop and finally be able to deploy a seamlessly integrated computing experience.
NUI Group people work primarily on multi-touch interfaces and you can see some nice examples of modern, more human-friendly designs (not counting the endless photo-organizing-app demos ;) ).
People are used to WIMP, the other main issue is that most of the other "Cool" interfaces require specialized hardware.
I'm not in journalism; I write software for a living.
vim!
It's definitely outside the realm of WIMP, but whether it's beyond it or way behind it is up to judgment!
I would recommend the following paper:
Jacob, R. J., Girouard, A., Hirshfield, L. M., Horn, M. S., Shaer, O., Solovey, E. T., and Zigelbaum, J. 2008. Reality-based interaction: a framework for post-WIMP interfaces. In Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems (Florence, Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, 201-210. see DOI