Can a single MIDI track play more than one note at once? - audio

I am writing my own MIDI parser and everything seems to be going nicely.
I am testing against some of the files I see in the wild. I noticed that a MIDI track never appears to have more than one note on at once (i.e. producing more than one tone). Is this by design, or can a MIDI track require more than one note to play at once?
(I am not referring to the number of simultaneous tracks, but to the number of simultaneous tones within a single track.)
The MIDI files I have tested look like this:
ON_NOTE71:ON_NOTE75:ON_NOTE79
ON_NOTE71:OFF_NOTE71:ON_NOTE75:OFF_NOTE75:ON_NOTE79:OFF_NOTE79
Can it look like this?
ON_NOTE71:ON_NOTE73:OFF_NOTE73:OFF_NOTE71
How do I detect this alternative structure?

Yes. Playing more than one note at once is known as polyphony. Different MIDI specifications define support for different levels of polyphony.
See http://www.midi.org/techspecs/gm.php

The number of notes that can play at once is a hardware implementation detail. Your software should allow for any number of notes to be sounding simultaneously. I suggest keeping a table of which notes are currently on, so that you can send a note off for each one when playback is stopped. Ideally the table should hold a count for each note that is incremented on a note on and decremented on a note off. That way, if a certain pitch has two note on events pending, you can send two note off events. You can't know how the device you're communicating with will handle successive note on events for the same pitch, so it's safest to send an equal number of note off events.
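A minimal sketch of such a table in plain Java (the NoteOffSender callback is hypothetical - replace it with however your player actually sends MIDI bytes). It also answers the detection question above: if a note on arrives while a different pitch is still active on the same channel, the track is polyphonic.

// Minimal bookkeeping table for pending note ons (no MIDI library assumed).
// A NOTE ON with velocity 0 is treated as a NOTE OFF, as many files in the wild use it.
public class ActiveNoteTable {
    // counts[channel][pitch] = number of NOTE ONs not yet matched by a NOTE OFF
    private final int[][] counts = new int[16][128];

    public void noteOn(int channel, int pitch, int velocity) {
        if (velocity == 0) {                 // NOTE ON with velocity 0 means NOTE OFF
            noteOff(channel, pitch);
            return;
        }
        counts[channel][pitch]++;
    }

    public void noteOff(int channel, int pitch) {
        if (counts[channel][pitch] > 0) {
            counts[channel][pitch]--;
        }
    }

    /** Number of distinct pitches currently sounding on a channel. */
    public int activeNotes(int channel) {
        int n = 0;
        for (int pitch = 0; pitch < 128; pitch++) {
            if (counts[channel][pitch] > 0) {
                n++;
            }
        }
        return n;
    }

    /** True if more than one pitch is sounding at once, i.e. the track is polyphonic. */
    public boolean isPolyphonic(int channel) {
        return activeNotes(channel) > 1;
    }

    /** On stop: emit one NOTE OFF per pending NOTE ON, then clear the table. */
    public void allNotesOff(NoteOffSender sender) {
        for (int channel = 0; channel < 16; channel++) {
            for (int pitch = 0; pitch < 128; pitch++) {
                while (counts[channel][pitch] > 0) {
                    sender.sendNoteOff(channel, pitch);
                    counts[channel][pitch]--;
                }
            }
        }
    }

    /** Hypothetical callback; wire it to whatever actually transmits your MIDI messages. */
    public interface NoteOffSender {
        void sendNoteOff(int channel, int pitch);
    }
}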

Yes. Both controllers and software can produce such events.

Related

Record LineOut output directly to file with JSyn

I have built a loopstation in JSyn. It allows you to record and play back samples. By playing multiple samples you can layer up sounds (e.g. one percussion sample, one melody, etc.).
JSyn allows me to connect each of the sample players directly to my LineOut, where they are mixed automatically. Now I would like to record the sound just as the user hears it to a .wav file, but I am not sure what I should connect the input port of the recorder to.
What is the smartest way to connect the audio output of all samples to the WaveRecorder?
In other words: in the Programmer's Guide there is an example for this, but I am not sure how to create the "finalMix" used there.
Rather than using multiple LineOuts, just use one LineOut.
You can mix all of your signals together using a chain of MultiplyAdd units.
http://www.softsynth.com/jsyn/docs/javadocs/com/jsyn/unitgen/MultiplyAdd.html
Or you can use a Mixer unit.
http://www.softsynth.com/jsyn/docs/javadocs/com/jsyn/unitgen/MixerStereoRamped.html
Then connect the mix to your WaveRecorder and to your single LineOut.
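A rough sketch of the wiring, based on the public JSyn javadocs (the playerA/playerB units stand in for whatever sample readers your loopstation already uses, and the WaveRecorder constructor and channel details should be double-checked against the Programmer's Guide):

// Rough sketch (unverified against the current JSyn release): build a
// "finalMix" with a MultiplyAdd unit (output = inputA * inputB + inputC),
// then send that one signal to both the WaveRecorder and a single LineOut.
import java.io.File;

import com.jsyn.JSyn;
import com.jsyn.Synthesizer;
import com.jsyn.unitgen.LineOut;
import com.jsyn.unitgen.MultiplyAdd;
import com.jsyn.util.WaveRecorder;

public class MixAndRecord {
    public static void main(String[] args) throws Exception {
        Synthesizer synth = JSyn.createSynthesizer();

        LineOut lineOut = new LineOut();
        MultiplyAdd finalMix = new MultiplyAdd();   // finalMix = A * B + C
        synth.add(lineOut);
        synth.add(finalMix);
        // synth.add(playerA); synth.add(playerB);  // your existing sample players

        finalMix.inputB.set(1.0);                   // so the unit just sums A + C
        // playerA.output.connect(finalMix.inputA); // hypothetical player outputs
        // playerB.output.connect(finalMix.inputC);

        // One LineOut, fed from the mix (both stereo parts).
        finalMix.output.connect(0, lineOut.input, 0);
        finalMix.output.connect(0, lineOut.input, 1);

        // The recorder listens to the same mix. The default WaveRecorder is
        // assumed to be stereo here; drop the second connection if yours is mono.
        WaveRecorder recorder = new WaveRecorder(synth, new File("mix.wav"));
        finalMix.output.connect(0, recorder.getInput(), 0);
        finalMix.output.connect(0, recorder.getInput(), 1);

        synth.start();
        lineOut.start();
        recorder.start();

        synth.sleepFor(10.0);                       // record ten seconds as a demo

        recorder.stop();
        recorder.close();                           // finishes writing the WAV file
        synth.stop();
    }
}

For more than two sources, chain one additional MultiplyAdd per extra source (feed the previous sum into inputC), or use the Mixer unit linked above and connect only its output to the LineOut and the recorder.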

Advice on dynamically combining MPEG-DASH MPD data

I'm doing research for a project that's about to start.
We will be supplied hundreds of 30-second video files that the end user can select (via various filters); we then want to play them as if they were one video.
It seems that Media Source Extensions with MPEG-DASH is the way to go.
I feel like it could possibly be solved in the following way, but I'd like to ask if this sounds right to anyone who has done similar things.
My theory:
Create an MPD for each video (via MP4Box or a similar tool)
User makes selections (each of which has an MPD)
Read each MPD and get its <Period> elements (most likely only one in each)
Create a new MPD file and insert all the <Period> elements into it in order.
Caveats
I imagine this may be problematic if the videos were all different sizes, formats, etc., but in this case we can assume consistency.
So my question to anyone with MPEG-DASH / MPD experience is: does this sound right, or is there a better way to achieve this?
Sounds right; multi-period is the only feasible way, in my opinion.
Ideally you would encode all the videos with the same settings to provide the end user a consistent experience. However, from a technical point of view it shouldn't be a problem if the quality or even the aspect ratio changes from one period to another. You'll need a player which supports multi-period, such as dash.js or Bitmovin.
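A rough sketch of step 4 of the plan above, using only the JDK's DOM APIs. The simplifications (ignoring BaseURL resolution, Period@start/@duration and MPD@mediaPresentationDuration) are mine; a real implementation would need to handle them.

// Copy every <Period> from each selected MPD into the first one and save the result.
import java.io.File;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MpdConcat {
    public static void concat(List<File> selectedMpds, File outputMpd) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder builder = dbf.newDocumentBuilder();

        // Use the first MPD as the template for the combined manifest.
        Document combined = builder.parse(selectedMpds.get(0));
        Node combinedRoot = combined.getDocumentElement();

        for (File mpd : selectedMpds.subList(1, selectedMpds.size())) {
            Document doc = builder.parse(mpd);
            NodeList periods = doc.getElementsByTagNameNS("*", "Period");
            for (int i = 0; i < periods.getLength(); i++) {
                // Import the Period (deep copy) into the combined document, in order.
                Node imported = combined.importNode(periods.item(i), true);
                combinedRoot.appendChild(imported);
            }
        }

        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(combined), new StreamResult(outputMpd));
    }
}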

Most efficient way for doing multiple PCAP filters

I have an application that uses libpcap as the mechanism for acquiring packets, and I need to filter out different protocols for different parts of the application. I need to consider optimization, as the streams will have a high rate of traffic (100-400 Mbps).
What I would really like to be able to do is set up a live capture (no filter), and then selectively filter packets after the capture is made. It doesn't seem like this is possible (the BPF filter is built into the capture mechanism, from what I can tell).
If this indeed is not possible, there are two other ways of doing it (that I have thought of), and I am not sure what would be considered more efficient or 'better':
Make multiple captures, each with its own filter
Make one capture (no filter) that dumps to FIFOs, and have other captures read from those FIFOs (with their own filters)
The FIFO approach is probably not very efficient, as it involves copying lots and lots of memory from A to B (e.g. 400 Mbps buffered - the readers must not block each other - into four FIFOs, each applying a different filter and deciding to throw away 99.99% of the accumulated 1600 Mbps). Multiple captures, on the other hand, only trigger action in userland if there is actually something to do. The filtering is (usually) done in the kernel.
A third approach is to use libwireshark, the lower portion of Wireshark, to do the filtering (and wtap for capturing). This involves quite some code overhead, as libwireshark is not exactly in perfect shape for third-party use outside of Wireshark.
However, this does come with the ability to use Wireshark's "display filters", which are compiled to bytecode and are reasonably efficient. Many filters can be compiled once and applied to the same frame one after another. You may be able to "stack" filters, since e.g. "ip.tcp" implies "ip".
This becomes quite efficient if you can extract the most common element of all the filters and apply it as a BPF filter on your capturing device. The display filters then only look at data that might interest at least one of them.
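For the "multiple captures, each with its own kernel filter" option, here is a rough Java sketch using the third-party pcap4j bindings for libpcap (pcap4j is not mentioned in the question; the device name, snap length and filter strings are placeholders, and the class and method names should be checked against the pcap4j documentation):

// One capture handle per consumer; the BPF filter is compiled and applied in
// the kernel, so only matching packets ever reach userland.
import org.pcap4j.core.BpfProgram.BpfCompileMode;
import org.pcap4j.core.PcapHandle;
import org.pcap4j.core.PcapNetworkInterface;
import org.pcap4j.core.PcapNetworkInterface.PromiscuousMode;
import org.pcap4j.core.Pcaps;
import org.pcap4j.packet.Packet;

public class MultiFilterCapture {
    public static void main(String[] args) throws Exception {
        PcapNetworkInterface nif = Pcaps.getDevByName("eth0"); // placeholder device

        String[] filters = { "tcp port 80", "udp port 53" };   // one filter per consumer
        for (String filter : filters) {
            PcapHandle handle = nif.openLive(65536, PromiscuousMode.PROMISCUOUS, 10);
            handle.setFilter(filter, BpfCompileMode.OPTIMIZE);  // kernel-side BPF

            Thread consumer = new Thread(() -> {
                try {
                    while (true) {
                        Packet packet = handle.getNextPacket();  // null on timeout
                        if (packet != null) {
                            // hand the packet to the part of the application
                            // that registered this filter
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }, "capture[" + filter + "]");
            consumer.start();
        }
    }
}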

Methods for determining acoustical similarity (but not fingerprinting)

I'm looking for methods that work in practice for determining some kind of acoustic similarity between different songs.
Most of the methods I've seen so far (MFCC etc.) actually seem to aim only at finding identical songs (i.e. fingerprinting for music recognition, not recommendation), while most recommendation systems seem to work on network data (co-listened songs) and tags.
Most MPEG-7 audio descriptors also seem to be along this line. Plus, most of them are defined at the "extract this and that" level, but nobody seems to actually make use of these features for computing some kind of song similarity, let alone an efficient search for similar items...
Tools such as http://gjay.sourceforge.net/ and http://imms.luminal.org/ seem to use some simple spectral analysis, file system location, tags, plus user input such as the "color" and rating manually assigned by the user, or how often the song was listened to and skipped.
So: which audio features are reasonably fast to compute for a common music collection, and can be used to generate interesting playlists and find similar songs? Ideally, I'd like to feed in an existing playlist, and get out a number of songs that would match this playlist.
So I'm really interested in acoustic similarity, not so much identification / fingerprinting. Actually, I'd just want to remove identical songs from the result, because I don't want them twice.
And I'm also not looking for query by humming. I don't even have a microphone attached.
Oh, and I'm not looking for an online service. First of all, I don't want to send all my data to Apple etc.; secondly, I only want recommendations from the songs I own (I don't want to buy additional music right now while I haven't explored all of my music - I haven't even converted all my CDs into mp3 yet ...); and thirdly, my music taste is not mainstream, so I don't want the system to recommend Mariah Carey all the time.
Plus of course, I'm really interested in what techniques work well, and which don't... Thank you for any recommendations of relevant literature and methods.
Only one application has ever done this really well: MusicIP Mixer.
http://www.spicefly.com/article.php?page=musicip-software
It hasn't been updated for about ten years (and even then the interface was a bit clunky), it requires a very old version of Java, and it doesn't work with all file formats - but it was and still is cross-platform and free. It does everything you're asking: it generates acoustic fingerprints for every mp3/ogg/flac/m3u in your collection, saves them to a tag on the song, and, given one or more songs, generates a playlist similar to those songs. It only uses the acoustics of the songs, so it's just as likely to add an unreleased track that only you have on your own hard drive as a famous song.
I love it, but every time I update my operating system / buy a new computer it takes forever to get it working again.

Comparing audio recordings

I have 5 recorded wav files. I want to compare the new incoming recordings with these files and determine which one it resembles most.
In the final product I need to implement it in C++ on Linux, but now I am experimenting in Matlab. I can see FFT plots very easily. But I don't know how to compare them.
How can I compute the similarity of two FFT plots?
Edit: There is only speech in the recordings. Actually, I am trying to identify the responses of the answering machines of a few telecom companies. It's enough to distinguish two messages: "this person cannot be reached at the moment" and "this number is not used anymore".
This depends a lot on your definition of "resembles most". Depending on your use case this can be a lot of things. If you just want to compare the bare spectra of the whole files, you can simply correlate the values returned by the two FFTs.
However, spectra tend to change a lot when the files are warped in time. To deal with this, you need to do a windowed FFT and compare the spectra for each window. This then defines the difference function you can use in a dynamic time warping (DTW) algorithm.
If you need perceptual resemblance, an FFT probably does not get you what you need. MFCCs of the recordings are most likely much closer to this problem. Again, you might need to calculate windowed MFCCs instead of MFCCs of the whole recording.
If you have musical recordings, you again need completely different approaches. There is a blog posting that describes how Shazam works, so you might be able to find it on Google. Or, if you want real musical similarity, have a look at this book.
EDIT:
The best solution for the problem specified above would be the one described here (the "Shazam algorithm" mentioned above). This is, however, a bit complicated to implement, and an easier solution might do well enough.
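For instance, here is a toy Java sketch of the "correlate the whole-file spectra" idea from the beginning of this answer. It assumes equal-length mono sample arrays and uses a naive DFT to stay self-contained; for real use you would swap in an FFT library and probably the windowed comparison described above.

public class SpectrumCorrelation {

    /** Magnitude spectrum of a real signal via a naive O(n^2) DFT. */
    static double[] magnitudeSpectrum(double[] x) {
        int n = x.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Pearson correlation between two equally long spectra (1.0 = identical shape). */
    static double correlation(double[] a, double[] b) {
        double meanA = 0, meanB = 0;
        for (int i = 0; i < a.length; i++) { meanA += a[i]; meanB += b[i]; }
        meanA /= a.length;
        meanB /= b.length;
        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < a.length; i++) {
            cov  += (a[i] - meanA) * (b[i] - meanB);
            varA += (a[i] - meanA) * (a[i] - meanA);
            varB += (b[i] - meanB) * (b[i] - meanB);
        }
        return cov / Math.sqrt(varA * varB);
    }

    /** Pick the reference whose spectrum correlates best with the incoming clip. */
    static int bestMatch(double[] incoming, double[][] references) {
        double[] incomingSpec = magnitudeSpectrum(incoming);
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < references.length; i++) {
            double score = correlation(incomingSpec, magnitudeSpectrum(references[i]));
            if (score > bestScore) { bestScore = score; best = i; }
        }
        return best;
    }
}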
If you know that there are only 5 different possible incoming files, I would suggest first trying something as simple as computing the Euclidean distance between the two signals (in the time or Fourier domain). It is likely to give you good results.
Edit: To cope with different possible start offsets, try a cross-correlation between the incoming recording and each reference file and see which one gives the highest peak.
I suggest you compute a simple sound parameter like the fundamental frequency. There are several methods of getting this value - I tried autocorrelation and cepstrum, and for voice signals they worked fine. With such a function in place you can do a time analysis and compare the frequency of two signals (base - the one you compare against, in - the one you would like to match) over a given interval. Comparing several intervals on such criteria can tell you which base sample matches best.
Of course, everything depends on what you mean by "resembles most". To refine the comparison you can introduce other parameters like volume, noise, clicks, pitches...
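A minimal sketch of the autocorrelation-based fundamental frequency estimate mentioned above, for a single frame of mono samples; the 50-500 Hz search range is an assumption suited to speech, and voicing detection and smoothing are omitted.

public class PitchEstimator {

    /**
     * Estimate the fundamental frequency of one frame of mono samples.
     * Returns the frequency in Hz, searching 50..500 Hz (typical speech range).
     */
    static double fundamentalFrequency(double[] frame, double sampleRate) {
        int minLag = (int) (sampleRate / 500.0);   // highest pitch considered
        int maxLag = (int) (sampleRate / 50.0);    // lowest pitch considered
        int bestLag = minLag;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int lag = minLag; lag <= maxLag && lag < frame.length; lag++) {
            double sum = 0;
            for (int i = 0; i + lag < frame.length; i++) {
                sum += frame[i] * frame[i + lag];  // autocorrelation at this lag
            }
            if (sum > bestValue) {
                bestValue = sum;
                bestLag = lag;
            }
        }
        return sampleRate / bestLag;
    }
}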
