Anki - finding success rate by interval

I've been using Anki for about two years now, and have 5,000-odd cards. I am trying to optimise the algorithm and think it would be useful to find out when on average I am most likely to forget cards.
Is there any way of getting stats for at what interval I am most likely to answer 'Again' to a card (maybe through an add-on to statistics or even downloading my data and importing into a spreadsheet)?
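One possible route for the "download my data" idea, offered as a sketch rather than a tested recipe: Anki keeps its review history in a SQLite file (`collection.anki2` in the profile folder), in a table called `revlog`. Assuming the commonly documented schema, where `ease = 1` means 'Again', `type = 1` marks an ordinary review, and `lastIvl` holds the interval in days before that review, the failure rate per interval can be tallied as below. The names `again_rate_by_interval` and `demo_connection` and the demo rows are made up for illustration.

```python
import sqlite3
from collections import defaultdict


def again_rate_by_interval(conn):
    """Map each previous interval (days) to (reviews, agains, again_rate)."""
    # Assumed revlog columns: ease (1 = Again), lastIvl (interval before
    # the review; positive = days, negative = seconds), type (1 = review).
    rows = conn.execute(
        "SELECT lastIvl, ease FROM revlog WHERE type = 1 AND lastIvl > 0"
    )
    counts = defaultdict(lambda: [0, 0])  # interval -> [reviews, agains]
    for last_ivl, ease in rows:
        counts[last_ivl][0] += 1
        if ease == 1:
            counts[last_ivl][1] += 1
    return {ivl: (n, a, a / n) for ivl, (n, a) in sorted(counts.items())}


def demo_connection():
    """Build a tiny in-memory stand-in for a collection (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revlog (id INTEGER, cid INTEGER, usn INTEGER, "
        "ease INTEGER, ivl INTEGER, lastIvl INTEGER, factor INTEGER, "
        "time INTEGER, type INTEGER)"
    )
    rows = [  # (ease, lastIvl, type)
        (1, 10, 1), (3, 10, 1), (3, 10, 1), (4, 10, 1),
        (1, 30, 1), (3, 30, 1),
        (1, -600, 0),  # learning step, ignored by the query
    ]
    conn.executemany(
        "INSERT INTO revlog (ease, lastIvl, type) VALUES (?, ?, ?)", rows
    )
    return conn


if __name__ == "__main__":
    stats = again_rate_by_interval(demo_connection())
    for ivl, (n, agains, rate) in stats.items():
        print(f"{ivl:>4} d  reviews={n:<4} again={agains:<4} rate={rate:.1%}")
```

For real data, the same function would be pointed at `sqlite3.connect(...)` on a copy of the collection file made while Anki is closed. The resulting per-interval rows can then be pasted into a spreadsheet and bucketed into wider interval ranges.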

Related

Getting the frequency of audio at every time interval in Android

I am using the MediaRecorder class to record sound. I read the amplitude at regular intervals and convert it to decibels, but I also want to get the frequency of the audio over each interval, alongside the corresponding amplitude or decibel value. I searched, but I could not find a clear explanation of how to do it.
If someone can guide me, please help.
The process most often used to determine pitch is called the "fast Fourier transform". Try searching for those keywords, or the common abbreviation "FFT", along with the language or platform you are working on; that should bring up libraries you can incorporate. Coding an FFT is pretty complex, so you'll probably want to use a library. But if you are curious about the math and how it works, check out The Scientist and Engineer's Guide to Digital Signal Processing.
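To make the idea concrete, here is a minimal sketch in Python of picking the strongest frequency out of one block of samples. It assumes raw PCM samples are available, which on Android generally means reading them with `AudioRecord`, since `MediaRecorder` only reports a peak amplitude. The small radix-2 FFT is written out only to keep the example dependency-free; in practice a library routine would replace it. The function names are illustrative.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin."""
    n = len(samples)
    if n < 2 or n & (n - 1):
        raise ValueError("number of samples must be a power of two")
    # A Hann window reduces leakage from tones that fall between bins.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    spectrum = fft(windowed)
    # Only the first half of the spectrum is meaningful for real input.
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = 1 + max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / n
```

The answer is only as fine as the bin spacing, `sample_rate / n` (about 7.8 Hz for 1024 samples at 8 kHz). The strongest bin is also not always the perceived pitch, because a harmonic can be louder than the fundamental.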

Finding the "noise level" of an audio recording programmatically

I am tasked with something seemingly trivial which is to
find out how "noisy" a given recording is.
This recording was made with a voice recorder, an
OLYMPUS VN-733 PC, which was fairly cheap (I am not
advertising it; I mention it only because I in no way
aim to do anything "professional" here, I simply need to
solve a seemingly simple problem).
To preface this, I have already obtained several datasets
from different outside locations, in particular parks or
near-road recordings. The aim is to measure the noise at
these specific locations and then compare it, on
average, with the other locations.
In other words:
I must find out how noisy location A is compared to location
B and C.
I have made 1-minute recordings at each location so that
at least the time span of a recording can be compared
across locations (and I used the very same voice
recorder at all positions, at the same height,
etc.).
A sample file can be found at:
http://shevegen.square7.ch/test.mp3
(This may eventually be moved later on; it just serves as
an example of how these recordings may sound right now. I am
unhappy about the initial noisy clipping-sound, ideally
I'd only capture the background noise of the cars etc..
but for now this must suffice.)
Now my specific question is, how can I find out how "noisy"
or "loud" this is?
The primary goal is to compare them to the other .mp3
files, which would suffice for my purpose just fine.
But ideally it would be nice to calculate on average
how "loud" every individual .mp3 is and then compare
it to the other ones (there are several recordings
per given geolocation, so I could even merge them
together).
There are some similar questions but not one in particular
that I was able to find that could answer this in an
objective manner, or perhaps I did not understand the
problem at hand. I have all the audio datasets already
but I have no idea how to find out how "loud" any one
of them is individually; there are some apps on smartphones
that claim that they can do this automatically but since
I do not have any smartphone, this is a dead end for me.
Any general advice will be much appreciated.
Noise is a difficult notion to define, so I will focus on loudness.
You could compute the energy of each file. For that, you need access to the samples of the audio signal (generally through a built-in function of your programming language). Then you can compute the RMS energy of the signal.
That would be the most basic processing.
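As a rough illustration of the RMS idea, here is a sketch in Python using only the standard library. It assumes each .mp3 has first been converted to a 16-bit mono WAV file with some external tool, since the standard library does not decode MP3. The function names are made up for the example.

```python
import array
import math
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono WAV file and return samples scaled to [-1, 1)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono WAV file")
        data = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV data is little-endian
    return [s / 32768.0 for s in data]


def rms(samples):
    """Root mean square of the samples (the 'RMS energy')."""
    if not samples:
        raise ValueError("no samples")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_dbfs(samples, full_scale=1.0):
    """RMS level in decibels relative to full scale (0 dBFS = full_scale)."""
    level = rms(samples) / full_scale
    return float("-inf") if level == 0 else 20 * math.log10(level)
```

Because the same recorder was used at the same height everywhere, comparing `rms_dbfs` values between files gives a relative ranking of the locations. It does not give calibrated sound pressure levels. It may also be worth trimming the initial clipping sound before measuring, since a short loud burst raises the RMS of a 1-minute file.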

How can I synchronize two audio recordings *without* timestamps?

Let's say I have two separate recordings of the same concert (created on a user's phone and then uploaded to our server). These recordings are then aligned according to their creation timestamp. However, when these recordings are played together or quickly toggled between, it is revealed that their creation timestamps must be off because there is a perceptible delay.
Since the time stamp is not a reliable way to align these recordings, what is an alternative? I would really prefer not to have to learn about audio signal processing to solve this problem, but recognize this may be the only way. So, I guess my question is:
1. Can I get away with doing some kind of clock synchronization? Is that even possible if the internal device clocks are clearly off by an unknown amount? If yes, a general outline of how this would work and key words would be appreciated.
2. If #1 is not an option, I guess I need to learn about audio signal processing? Again, a general outline of how to tackle the problem from that angle and some key words would be appreciated.
There are 2 separate issues you need to deal with. Issue 1 is the alignment of the start time of the recordings. I doubt you can expect that both users pressed record at the exact same moment. Even if they did, they may be located at different distances from the speaker, and it takes time for sound to travel. Aligning the start times by hand is pretty trivial, because the human brain is good at comparing the similarities of sounds. Programmatically it's a different story. You might try using something like cross correlation, or look over on dsp.stackexchange.com. There is no exact method though.
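For the cross-correlation idea, a brute-force sketch in Python might look like the following. It slides one recording against the other and keeps the shift with the largest sum of products. The function name `best_lag` and the `max_lag` search window are assumptions made for illustration. Real recordings would normally be downsampled first, or correlated via an FFT, because this loop costs roughly `len(a) * max_lag` multiplications.

```python
def best_lag(a, b, max_lag):
    """Return the shift (in samples) at which b lines up best with a.

    A positive result means the shared content appears that many samples
    later in b than in a; a negative result means it appears earlier.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]  # large when the two signals agree
        if score > best_score:
            best, best_score = lag, score
    return best
```

Dividing the returned lag by the sample rate gives the offset in seconds to apply to one of the recordings.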
Issue 2 is that the clocks driving the A/D converters on the two devices are not going to be running at exactly the same rate. So even if you synchronize the start time, eventually the two are going to drift apart. The time it takes to noticeably drift is a function of the difference between the two clock frequencies. If they are relatively close, you may not notice in a short recording. To counteract this you need to stretch the time of one of the recordings, which increases or decreases its duration without affecting the pitch. There are plenty of audio recording apps that allow you to time stretch, but they don't give you any help in figuring out by how much. Start by googling "time stretching", or again have a look at dsp.stackexchange.com.
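One way to estimate "by how much", sketched here in Python, is to measure the offset between the recordings twice: once near the start and once near the end. This is an assumption about the workflow, not something the apps provide. If an event at time t in recording A shows up at t + offset in recording B, the change in offset over a known span gives the drift. The function names are illustrative.

```python
def stretch_factor(offset_start, offset_end, span):
    """Factor by which B's duration should be multiplied to match A.

    offset_start, offset_end: offset of B relative to A (seconds), measured
    at two points that are `span` seconds apart on A's timeline.
    """
    b_span = span + (offset_end - offset_start)  # same events, B's clock
    if span <= 0 or b_span <= 0:
        raise ValueError("measurement points must be in increasing order")
    return span / b_span


def drift_ppm(offset_start, offset_end, span):
    """Clock drift of B relative to A in parts per million."""
    return (offset_end - offset_start) / span * 1e6
```

For example, offsets of 2.00 s and 2.06 s measured 600 s apart mean B runs about 100 ppm long, so B would need to be shortened by a factor of roughly 0.9999.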
I realize neither of these is a direct answer; they are suggestions rather than solutions.
Take a look at this document, which describes how you can align recordings using Sonic Visualiser (GPL) and a plugin.
I've not used it before, but found the document (and this question) when I was faced with a similar problem.

Tools for parsing natural language questions in realtime

photos in washington VS show me photos in washington VS I wanna see all my photos in washington taken day before yesterday
what:photos
entities:washington (don't want to assume too much)
when: 2013-03-14
I want to parse preset queries into conditions (like above). I want these qualities:
I can extract relevant terms even in the presence of fluff ("I wanna see") and lowercase nouns
The warm (already running) program can accept requests over HTTP, or allows me to add some network communication
The warm program responds within 50 ms and needs at most 500 MB of memory for reasonable sentences
I am more experienced in Python, less so in Java
Parser data structure is easy to handle
I use NLTK, but it's slow. I see StanfordNLP and OpenNLP as viable alternatives, but I find their program-start latency too high. I don't mind integrating them via servlets if I am left with no alternative.
The Stanford Parser is a solid choice, and pretty well-supported (as research code goes). But it sounds like low latency is an important requirement for you, so I'd also suggest you look at the BUBS Parser (full disclosure - I'm one of the primary researchers working on BUBS).
I haven't compared directly to NLTK, but I think you may find that the Stanford Parser doesn't meet your performance needs. This paper found a total throughput of ~60 words/second (~2-3 sentences/second). Those timings are pretty old, so newer hardware will certainly improve that, but probably still won't come close to 50 ms latency.
As you note, startup time will be an issue with any parser - a high-accuracy model is necessarily quite large. And 500 MB is probably pretty tight too (I usually run BUBS with 1-1.2 GB). But once loaded, BUBS latency is generally in the neighborhood of 10 ms per sentence (for ~20-25-word sentences), and we can push the total throughput up around 2500 words/second before accuracy starts to drop off. I think those numbers might meet your performance needs, and I don't know of any other high-accuracy (F1 >= 88-89) parser that comes close in speed.
Note: the fastest results are with recent pruning models that aren't yet posted to the website, but I can get you a model if you need. Hope that helps, and if you have more questions, feel free to ask.

What's the best interactive Analysis and Plotting Tool for software testing?

My realtime app generates a data log: 100 words of data at 10 kHz. I need to analyze it and produce some plots of the results. There are intermediate calculations involved - I need to take some differences, averages, etc. Excel would work fine, except for:
the 32000 item limit on graph data series is too small - that's only 3 seconds of data.
the glacial speed at which it processes changes to graphs containing large data series is unbearable.
What are good alternatives to Excel for manipulating and plotting large quantities of data? I'm looking for something interactive, not a library.
For this sort of stuff we typically roll our own, but I know that isn't the solution you want. Can you use a good quality database (e.g. Oracle) to do the manipulation, then maybe put the summarized data back into Excel for the plotting? I believe Excel can link to databases these days, so you could make it quite automated.
Otherwise there are statistical tools like SAS (http://www.sas.com/technologies/analytics/statistics/stat/index.html), but get your cheque book out first.
There are also several free tools for analysing and plotting (see below). But I am not sure whether they have components to handle data in real-time.
R (similar to SAS) for statistical computations
Octave (similar to Matlab) for mathematical computations
R (for data manipulation) and its ggplot2 package for creating sexy graphs. Incredibly useful.
If you need real-time graphics, then I'd look at building something using matplotlib. It's a Python module, and you can link it to R using rpy2 if required.
In particle and nuclear physics the big tool is ROOT, which I have seen used in an "update every two seconds as the data comes in" mode with a lot of data and a modest amount of intermediate processing.
Mind you, the student who wrote that module was a very slick programmer, and it took a while to shake the bugs out, even so.
ROOT is available for free, and provides all kinds of tools and support.

Resources