Using 4 KByte/sec Channel for Voice Transmission - audio

Is it possible to transfer voice through a channel that allows no more than 4 KByte/sec? I am looking for the lowest-quality transmission method. I considered the Opus codec, but it looks like the lowest rate it supports is 6 kb/s.
Please refer me to such methods.

The lowest rate supported by Opus is 6 k*bit*/s, while you have 4 kByte/s, i.e. 32 kbit/s. At 32 kbit/s you can have very high quality voice.
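To make the unit conversion in the answer explicit, a quick sketch (the helper name is just for illustration):

```python
# A channel capacity is quoted in kilobytes per second; codec rates are
# quoted in kilobits per second. Converting makes the comparison direct.

def kbytes_to_kbits(kbytes_per_sec):
    """1 byte = 8 bits, so multiply by 8."""
    return kbytes_per_sec * 8

channel_kbit = kbytes_to_kbits(4)   # the 4 kByte/s channel
print(channel_kbit)                  # 32 (kbit/s)
print(channel_kbit >= 6)             # True: Opus's 6 kbit/s floor fits easily
```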

Related

Multichannel DAC configuration

We need 96 channels of analog output on our board. Each channel should be capable of driving a separate waveform simultaneously. The DAC should support a 4 MSPS update rate per channel and 12-bit resolution. When we search for DACs with this specification, we only find parts with 1 or 2 channels if we require 4 MSPS and 12 bits. There are also 4-channel parts, but with around 40 I/O pins per part, so the total I/O for 96 channels would be very large.
Can anyone suggest a suitable part or an alternative approach?
4 MSPS sample rate, 12-bit resolution, 1 Vpp output (more is also fine).
Total I/O for all 96 channels should be around 350, not more than that (so we can implement it with a single FPGA).
Please suggest.
Searched for suitable parts from multiple manufacturers.

Difference between sampling rate, bit rate and bit depth

This is a basic question that might sound too obvious to many of you, but I am getting badly confused.
Here is what a Quora user says. It is now clear to me what a sampling rate is: the number of samples you take of a sound signal in one second is its sampling rate.
Now my doubt here is - this rate should have nothing to do with the quantisation, right?
About bit depth: is the quantisation dependent on bit depth? As in 32-bit (2^32 levels) and 64-bit (2^64 levels)? Or is it something else?
And the bit rate: is it the number of bits transferred in one second? If an audio file says 320 kbps, what does that really mean?
I assume readers now have some sense of why I am panicking about where bit rate and bit depth have significance.
EDIT: Also see this question if you have worked with the Linux OS and the GStreamer framework.
Now my doubt here is - this rate should have nothing to do with the quantisation, right?
Wrong. Sampling is a process that results in quantisation. Sampling, as the name implies, means taking samples (amplitudes) of a (usually) continuous signal (e.g. audio) at regular time intervals and converting them to a different representation thereof. In digital signal processing, this representation is discrete (not continuous). An example of this process is a wave file (e.g. recording your own voice and saving it as a wav).
About bit-depth, is the quantisation dependent on bit-depth? As in 32-bit (2^32 levels) and 64-bit (2^64 levels). Or is it something else?
Yes. The CD format, for example, has a bit depth of 16 (16 bits per sample). Bit depth is a part of the format of a sound (wave) file (along with the number of channels and sampling rate).
Since sound (think of a pure sine tone) has both positive and negative parts, 16-bit PCM is normally stored as signed integers: 2^16 = 65536 levels in total, covering -32768 to +32767, i.e. roughly 2^15 amplitude levels on each side of zero.
and the bit-rate, is it the number of bits transferred in one second? If an audio file says 320 kbps what does that really mean?
Yes. Bit rates are usually meaningful in the context of network transfers. 320 kbps == 320 000 bits per second (for kilobits you multiply by 1000, rather than 1024).
Let's take a worked example: 'Red-Book' CD audio.
The Bit depth is 16-bit. This is the number of bits used to represent each sample. This is intimately coupled with quantisation.
The Sample-rate is 44.1 kHz
The Frame-rate is 44.1kHz (two audio channels make up a stereo pair)
The Bit-rate is therefore 16 * 44100 * 2 = 1411200 bits/sec
There are a few twists with compressed audio streams such as MP3 or AAC. In these, there is a non-linear relationship between bit-rate, sample-rate and bit-depth. The bit-rate is generally the maximum rate per second, and the efficiency of the codec is content-dependent.
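The Red-Book arithmetic above can be checked in a few lines (all numbers come from the example itself):

```python
# Red-Book CD audio: 16-bit samples, 44.1 kHz frame rate, 2 channels (stereo).
bit_depth = 16
sample_rate = 44_100
channels = 2

bit_rate = bit_depth * sample_rate * channels
print(bit_rate)                  # 1411200 bits per second
print(bit_rate // 8)             # 176400 bytes per second
print(bit_rate // 8 * 60)        # bytes per minute of stereo audio (~10.6 MB)
```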

What is the bit rate?

I am new to audio programming, but I am wondering about the formula for bit rate.
According to wiki https://en.wikipedia.org/wiki/Bit_rate#Audio,
bit rate = sample rate X bit depth X channels
and
sample rate is the number of samples (or snapshots taken) per second obtained by a digital audio device.
bit depth is the number of bits of information in each sample.
So why bit rate = sample rate X bit depth X channels?
From my perspective, if bit depth = 2 bits and sample rate = 3 Hz,
then I can transfer 6 bits of data in 1 second.
For example:
Sample data = 00 //at 1/3 second.
Sample data = 01 //at 2/3 second.
Sample data = 10 //at 3/3 second.
So I transfer 000110 in 1 second. Is that logic correct?
Bit-rate is the expected number of bits per interval (e.g. per second).
Sound frequencies are measured in hertz, where 1 hertz = 1 cycle per second. So to get the full sound data representing 1 second of audio, you calculate how many bits need to be sent (media players check the bit-rate in the file format's header so they can read & play back correctly).
Why are channels involved (isn't sample rate X bit-depth enough)?
In digital audio, samples are sent for each "ear" (the L/R channels). There will always be twice as many samples in stereo sound as in mono. Usually there is a "flag" specifying whether the sound is stereo or mono.
Logic Example: (ignoring bit depth, and assuming 1 bit per sample)...
The speech "Hello" is recorded at 200 samples/sec but played at a rate of 100/sec. What happens?
If the stereo flag is set, each ear gets 100 samples per second (the correct total of 200 is played).
If mono, the speech will sound slowed to half speed: only 100 of the 200 recorded samples are played per second, so you get half of "hello" in the first second and the other half in the next (= slowed speech).
As the above example shows, you will run into these slow/double-speed adventures in your "new to audio programming" experience. The fix is to set either the channel count or the bit rate correctly. Good luck.
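The slow-playback scenario above can be expressed as a small duration calculation (a sketch; the 200-sample "Hello" figures come from the example, not a real recording):

```python
def playback_seconds(total_samples, rate_per_channel, channels):
    """How long a buffer of interleaved samples takes to play back."""
    frames = total_samples / channels        # one frame = one sample per channel
    return frames / rate_per_channel

total = 200  # samples recorded for the one-second "Hello"

# Interpreted as stereo: 100 frames per channel -> plays in 1 second.
print(playback_seconds(total, 100, channels=2))  # 1.0

# Misinterpreted as mono at the same rate: 200 frames -> plays in 2 seconds,
# i.e. the speech sounds slowed to half speed.
print(playback_seconds(total, 100, channels=1))  # 2.0
```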
The 'sample rate' is the rate at which each channel is sampled.
So 'sample rate X bit depth' will give you the bit rate for a single channel.
You then need to multiply that by the number of channels to get the total bit rate flowing through the system.
For example, the CD standard has a sample rate of 44100 samples per second and a bit depth of 16, giving a bit rate of 705600 bits per second per channel and a total bit rate of 1411200 bits per second for stereo.
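The formula from this answer can be wrapped in a helper (the function name is this sketch's choice) and checked against both the question's toy numbers and the CD figures:

```python
def bit_rate(sample_rate, bit_depth, channels=1):
    """bit rate = sample rate x bit depth x channels (bits per second)."""
    return sample_rate * bit_depth * channels

# The question's toy example: 2-bit samples at 3 Hz, mono -> 6 bits/s,
# matching the six-bit stream 000110 sent in one second.
print(bit_rate(3, 2))            # 6

# CD audio: 44100 Hz, 16-bit, stereo.
print(bit_rate(44_100, 16))      # 705600 per channel
print(bit_rate(44_100, 16, 2))   # 1411200 total
```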

About definition for terms of audio codec

When I was studying the Cocoa Audio Queue documentation, I came across several terms used in audio codecs. They are defined in a structure named AudioStreamBasicDescription.
Here are the terms:
1. Sample rate
2. Packet
3. Frame
4. Channel
I know about sample rate and channel, but I am confused by the other two. What do they mean?
You can also answer this question with an example. For example, I have a dual-channel PCM-16 source with a sample rate of 44.1 kHz, which means there are 2 * 44100 = 88200 samples of PCM data per second. But what about packet and frame?
Thanks in advance!
You are already familiar with the sample rate definition.
The sampling frequency or sampling rate, fs, is defined as the number of samples obtained in one second (samples per second), thus fs = 1/T.
So for a sampling rate of 44100 Hz, you have 44100 samples per second (per audio channel).
The number of frames per second in video is a similar concept to the number of samples per second in audio. Frames for our eyes, samples for our ears. Additional infos here.
If you have 16-bit depth stereo PCM, it means you have 16 * 44100 * 2 = 1411200 bits per second => ~172 kB per second => around 10 MB per minute.
Here are Apple's definitions, in reworded terms:
Sample: a single number representing the value of one audio channel at one point in time.
Frame: a group of one or more samples, with one sample for each channel, representing the audio on all channels at a single point in time.
Packet: a group of one or more frames, representing the audio format's smallest encoding unit, and the audio for all channels across a short amount of time.
As you can see there is a subtle difference between audio and video frame notions. In one second you have for stereo audio at 44.1 kHz: 88200 samples and thus 44100 frames.
Compressed formats like MP3 and AAC pack multiple frames into packets (these packets can then be written to an MP4 file, for example, where they can be efficiently interleaved with video content). You can see that dealing with large packets helps the codec identify bit patterns for better coding efficiency.
MP3, for example, uses packets of 1152 frames, which are the basic atomic unit of an MP3 stream. PCM audio is just a series of samples, so it can be divided down to the individual frame, and it really has no packet size at all.
For AAC you can have 1024 (or 960) frames per packet. This is described in the Apple document you pointed to:
The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
In MPEG-based file formats a packet is referred to as a data frame (not to be confused with the audio frame notion above). See Brad's comment for more information on the subject.
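From the frame counts quoted above (1152 frames per MP3 packet, 1024 for typical AAC), the span of time one packet covers follows directly; a small sketch (the helper name is illustrative):

```python
def packet_duration_ms(frames_per_packet, sample_rate):
    """Time spanned by one packet. A frame holds one sample per channel,
    so frames / sample_rate gives seconds regardless of channel count."""
    return frames_per_packet / sample_rate * 1000

# At the CD sample rate of 44.1 kHz:
print(round(packet_duration_ms(1152, 44_100), 2))  # ~26.12 ms per MP3 packet
print(round(packet_duration_ms(1024, 44_100), 2))  # ~23.22 ms per AAC packet
```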

Modulate digital data into audio using AFSK

I want to modulate digital data into audio, transmit it through any audio channel, and demodulate it back from audio to data at the destination. To do this I hope to use a computer sound card and a software modem, without any hardware implementation. On the internet I found that this can be done with a technique called Audio Frequency-Shift Keying (AFSK). I want to know whether I can obtain a bit rate higher than 1200 bps from AFSK, and if not, what the reason behind this limitation is.
Is there any technique more efficient than AFSK for this purpose?
The most common currently-used form of AFSK is the Bell202 modem at 1200 baud. There are a few other standards which also use 1200 baud, and some that run at less than 1200 bits per second, but none that I know of that run greater than 1200.
However, as far as I know, there's no reason you couldn't write a software modem to transmit and receive at a higher baud rate. Bell202 uses bit stuffing (allowing the data stream to use the same tone no more than 5 bits in a row) to help keep the transmitter and receiver from falling out of sync with each other, so a higher baud rate might require bit stuffing at a lower threshold (every 4 or 3 bits).
Another consideration is that the sound cards you're using should run at a sampling rate that is a multiple of the baud rate you choose. This is one of the reasons 1200 baud is so common: it divides evenly into 48000 Hz, a very common sample rate for audio hardware.
So 1200 baud isn't a limit. It's just a standard.
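As a rough illustration of a software AFSK transmitter, here is a continuous-phase two-tone modulator. The Bell202 mark/space frequencies (1200/2200 Hz) and the 48 kHz rate are taken from common practice; the function name and structure are this sketch's own, not any standard API:

```python
import math

def afsk_modulate(bits, baud=1200, sample_rate=48_000,
                  mark_hz=1200, space_hz=2200):
    """Return float samples encoding bits as two alternating tones.

    Phase is carried across bit boundaries (continuous-phase FSK),
    which avoids audible clicks at each tone switch.
    """
    samples_per_bit = sample_rate // baud   # 40 samples per bit at these settings
    phase = 0.0
    out = []
    for bit in bits:
        freq = mark_hz if bit else space_hz
        step = 2 * math.pi * freq / sample_rate
        for _ in range(samples_per_bit):
            out.append(math.sin(phase))
            phase = (phase + step) % (2 * math.pi)
    return out

samples = afsk_modulate([1, 0, 1, 1])
print(len(samples))  # 160 samples: 4 bits x 40 samples per bit
```

Raising the baud rate in this sketch just shrinks `samples_per_bit`; the practical limits are the audio channel's bandwidth and the receiver's ability to stay synchronised, not the software.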
