I'm developing a high-speed, high-resolution video camera for robotics applications. For various reasons I need to adopt gigabit ethernet (1Ge) or 10Ge to interface my cameras to PCs. Either that or I'll need to develop my own PCIe card which I prefer not to do (more work, plus then I'd have to create drivers).
I have two questions that I am not certain about after reading linux documentation.
#1: My desired ethernet frame is:
8-byte interpacket pad + sync byte
6-byte MAC address (destination)
6-byte MAC address (source)
2-byte packet length (varies 6KB to 9KB depending on lossless compression)
n-byte image data (number of bytes specified in previous 2-byte field)
4-byte CRC32
The question is, will Linux accept this packet if the application tells Linux to expect AF_PACKET packets (assuming applications CAN tell Linux this)? It is acceptable if the application that controls the camera (sends packets to it) and receives the image data in packets must run with root privilege.
#2: Which will be faster:
A: linux sockets with AF_PACKET protocol
B: libpcap application
Speed is crucial, because packets will arrive with little space between them, since each packet contains one horizontal row of pixels in my own lossless compression format (unless I can find a better algorithm that can also be implemented in the FPGA at real time speeds). There will be a pause between frames, but that is after 1200 or more horizontal rows (ethernet frame packets).
Because the application is robotics, each horizontal row will be immediately decompressed and stored in a simple packed array of RGBA pixels just like OpenGL accepts as textures. So robotics software can immediately inspect each image as the image arrives row by row and possibly react as quickly as inhumanly possible.
The data for the first RGBA pixel in each row immediately follows the last RGBA pixel in the previous row, so at the end of the last horizontal row of pixels the image is complete and ready to transfer to GPUs and/or save to disk. Each horizontal row will be a multiple of 16 pixels, so no "padding" is required.
NOTE: The camera must be directly plugged into the RJ45 jack without routers or other devices between camera and PC.
I think you will have to change your Ethernet frame format to use the first two bytes after the source and destination MACs as the type, not the length. Old-style length values must be 1500 or less; anything from 1536 (0x0600) upward is treated as an EtherType field instead. As you want 6K or more, there's a chance the receiving Ethernet chip / Linux packet handler will discard your frames because they're badly formatted.
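To make the type-versus-length point concrete, here is a minimal Python sketch. It is not the asker's format: it assumes the frame carries the IEEE "local experimental" EtherType 0x88B5 and moves the row length into the first two payload bytes, and the helper names are hypothetical. The `open_capture_socket` function shows the usual way to open an AF_PACKET socket on Linux; it needs root and is not exercised here.

```python
import socket
import struct

ETH_P_EXPERIMENTAL = 0x88B5  # IEEE 802 Local Experimental EtherType 1 (assumed choice)
MAX_LENGTH_VALUE = 1500      # largest value interpreted as an 802.3 length
MIN_ETHERTYPE = 0x0600       # 1536: smallest value interpreted as an EtherType


def classify_type_length(value):
    """Say how a receiver interprets the 2 bytes after the MAC addresses."""
    if value <= MAX_LENGTH_VALUE:
        return "length"
    if value >= MIN_ETHERTYPE:
        return "ethertype"
    return "undefined"


def build_frame(dst_mac, src_mac, payload, ethertype=ETH_P_EXPERIMENTAL):
    """Build MAC header + payload; preamble and CRC32 are left to the NIC.

    Hypothetical layout: the row length travels inside the payload as a
    2-byte big-endian field, followed by the compressed row data.
    """
    header = struct.pack("!6s6sH", dst_mac, src_mac, ethertype)
    return header + struct.pack("!H", len(payload)) + payload


def open_capture_socket(ifname, ethertype=ETH_P_EXPERIMENTAL):
    """Open an AF_PACKET socket that only sees one EtherType (Linux, root)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ethertype))
    sock.bind((ifname, 0))
    return sock
```

A 6 KB "length" such as 6144 would be classified as an EtherType by this rule, which is exactly why the original layout is at risk of being dropped or misinterpreted.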
As for performance, the Golden Rule is measure, don't guess. Pick the one that is simplest to program and try.
Hope this helps.
Related
My goal is to record audio using an electret microphone hooked into the analog pin of an esp8266 (12E) and then be able to play this audio on another device. My circuit is:
In order to check the output of the microphone I connected the circuit to the oscilloscope and got this:
In the "gif" above you can see the waves made by my voice when talking to microphone.
Here is my code on the ESP8266:
void loop() {
  sensorValue = analogRead(sensorPin);
  Serial.print(sensorValue);
  Serial.print(" ");
}
I would like to play the audio in Audacity in order to get an understanding of the result. Therefore, I copied the numbers from the serial monitor and pasted them into Python code that maps the data to the (-1, 1) interval:
def mapPoint(value, currentMin, currentMax, targetMin, targetMax):
    # Linearly rescale value from [currentMin, currentMax] to [targetMin, targetMax].
    currentInterval = currentMax - currentMin
    targetInterval = targetMax - targetMin
    valueScaled = float(value - currentMin) / float(currentInterval)
    return round(targetMin + (valueScaled * targetInterval), 5)

class mapper():
    def __init__(self, raws):
        # split() without arguments tolerates the trailing space left by Serial.print(" ").
        self.raws = [float(i) for i in raws.split()]

    def mapAll(self):
        # Compute min and max once instead of once per sample.
        lo, hi = min(self.raws), max(self.raws)
        self.mappeds = [mapPoint(i, lo, hi, -1, 1) for i in self.raws]
        self.strmappeds = " ".join(str(m) for m in self.mappeds)
        return self.strmappeds
This takes the string of numbers, maps them onto the target interval (-1, +1) and returns a space-separated string of data ready to import into Audacity (Tools > Sample Data Import, then select the text file containing the data). The result of importing data from almost 5 seconds of voice:
which is about half a second, and when I play it I hear unintelligible noise. I also tried lower frequencies, but there was only noise there, too.
The suspected causes for the problem are:
1- The ESP8266 is not capable of reading the analog pin fast enough to return meaningful data (which is probably not the case since its clock speed is around 100MHz).
2- The way the software gathers the data and outputs it is not efficient enough (in the loop, Serial.print, etc.).
3- The microphone circuit output is too noisy (which might be, but as observed in the oscilloscope test, my voice should make a difference in the output audio, and that was not audible in Audacity).
4- The way I mapped and prepared the data for Audacity.
Is there something else I could try?
Are there similar projects out there? (To my surprise, I couldn't find anything that was done transparently!)
What would be the right way to do this? (It could be a very useful and economical method for recording, transmitting and analyzing audio.)
There are many issues with your project:
You do not set a bias voltage on A0. The ADC can only measure voltages between ground and VCC. With the microphone removed from the circuit, the voltage at A0 should be close to VCC/2. This is usually achieved by adding a voltage divider made of 2 resistors between VCC and GND, with its midpoint connected directly to A0, between the cap and A0.
Also, your circuit looks weird... Is the 47uF cap connected directly to the 3.3V? If that's the case, you should connect it to pin 2 of the microphone instead. This would also mean that right now your ADC is only recording noise (a missing bias voltage will do that).
You do not pace your input, meaning that you do not have a constant sampling rate. That is a very important issue. I suggest you set yourself a realistic target that is well within the limits of the ADC and the limits of your serial port. The transfer rate in bytes/sec of a serial port is at most baud-rate / 8, and closer to baud-rate / 10 with the common 8N1 framing (one start and one stop bit per byte). For 9600 baud, that's only about 1000 to 1200 bytes/sec, which means that once converted to text, your max transfer rate drops to about 400 samples per second at best. This issue needs to be addressed and the max calculated before you begin, as the max attainable overall sample rate is the minimum of the sample rate of the ADC and the transfer rate of the serial port.
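The serial-port budget above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration, and the default of 10 bits on the wire per byte assumes 8N1 framing, which the original does not state.

```python
def serial_samples_per_second(baud, chars_per_sample, bits_per_byte=10):
    """Upper bound on text samples per second over a serial link.

    bits_per_byte=10 assumes 8N1 framing (start bit + 8 data bits + stop bit);
    pass 8 to reproduce the more optimistic baud/8 estimate.
    chars_per_sample counts the digits plus the separator.
    """
    bytes_per_second = baud / bits_per_byte
    return bytes_per_second / chars_per_sample


def attainable_sample_rate(adc_rate, serial_rate):
    """The slower of the two stages limits the whole chain."""
    return min(adc_rate, serial_rate)


# A 10-bit reading such as "512 " is up to 4 digits plus one space.
rate_9600 = serial_samples_per_second(9600, chars_per_sample=5)
rate_115200 = serial_samples_per_second(115200, chars_per_sample=5)
```

Sending raw binary (2 bytes per sample) instead of text, or raising the baud rate, moves the limit up considerably, but the minimum of the two rates still applies.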
The way to grab samples depends a lot on your needs and what you are trying to do with this project, your audio bandwidth, resolution and audio quality requirements for the application and the amount of work you can put into it. Reading from a loop as you are doing now may work with a fast enough serial port, but the quality will always be poor.
The way that is usually done is with a timer interrupt starting the ADC measurement and an ADC interrupt grabbing the result and storing it in a small FIFO, while the main loop transfers from this ADC FIFO to the serial port, alongside the other tasks assigned to the chip. This cannot be done directly with the Arduino libraries, as you need to control the ADC directly to do that.
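The small FIFO mentioned above is typically a fixed-size ring buffer. The sketch below models the logic in Python purely for illustration; on the chip it would be written in C, with `push` called from the ADC interrupt and `pop` from the main loop. The class name and the drop-when-full policy are assumptions, not something taken from the original.

```python
class SampleFifo:
    """Fixed-size ring buffer: producer pushes samples, consumer pops them."""

    def __init__(self, capacity):
        self._buf = [0] * capacity
        self._capacity = capacity
        self._head = 0      # next slot to write (producer side)
        self._tail = 0      # next slot to read (consumer side)
        self._count = 0     # samples currently stored
        self.dropped = 0    # samples lost because the consumer was too slow

    def push(self, sample):
        """Store one sample; returns False and counts a drop if full."""
        if self._count == self._capacity:
            self.dropped += 1
            return False
        self._buf[self._head] = sample
        self._head = (self._head + 1) % self._capacity
        self._count += 1
        return True

    def pop(self):
        """Return the oldest sample, or None if the FIFO is empty."""
        if self._count == 0:
            return None
        sample = self._buf[self._tail]
        self._tail = (self._tail + 1) % self._capacity
        self._count -= 1
        return sample
```

A non-zero `dropped` counter is a direct sign that the serial port cannot keep up with the chosen sample rate.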
Here is a short checklist of things to do:
Get the full ESP8266 datasheet from Espressif. Look up the actual specs of the ADC, mainly: the sample rates and resolutions available with your oscillator, and also its electrical constraints, at least its input voltage range and input impedance.
Once you know these numbers, set yourself some targets; the math needed for a successful project needs input numbers. What is your application? Do you want to record audio or just detect a nondescript noise? What are the minimum requirements needed for things to work?
Look up in the Arduino documentation how to set up a timer interrupt and an ADC interrupt.
Look up in the datasheet which registers you'll need to access to configure and run the ADC.
Fix the voltage bias issue on the ADC input. Nothing can work before that's done, and you do not want to destroy your processor.
Make sure the input AC voltage (the 'swing' voltage) is large enough to give you the results you want. It is not unusual to have to amplify a mic signal (with an opamp or a transistor), just for impedance matching.
Then you can start writing code.
This may sound awfully complex for such a small task, but that's what the average day of an embedded programmer looks like.
[EDIT] Your circuit would work a lot better if you simply replaced the 47uF DC blocking capacitor with a series resistor. Its value should be in the 2.2k to 7.6k range, to keep the circuit impedance within the 10k Ohms or so needed for the ADC. This would ensure that the input voltage to A0 is within the operating limits of the ADC (GND-3.3V on the NodeMCU board, 0-1V with the bare chip).
The signal may still be too weak for your application, though. What is the amplitude of the signal on your scope? How many bits of resolution does that range cover once converted by the ADC? For example, for a 0.1V peak-to-peak signal (SIG = 0.1), an ADC range of 0-3.3V (RNG = 3.3) and 10 bits of resolution (RES = 1024), you'll have
binary-range = RES * (SIG / RNG)
             = 1024 * (0.1 / 3.3)
             = 1024 * 0.0303
             = 31.03
A range of 31, which means around log2(31) (~= 5) useful bits of resolution. Is that enough for your application?
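The same estimate can be recomputed for other signal amplitudes with a short Python sketch. The function names are hypothetical; the inputs are the SIG, RNG and RES values from the example above.

```python
import math


def used_adc_codes(signal_vpp, adc_range_v, adc_levels):
    """Number of distinct ADC codes the signal swing actually covers."""
    return adc_levels * (signal_vpp / adc_range_v)


def useful_bits(signal_vpp, adc_range_v, adc_levels):
    """Approximate bits of resolution left for the signal itself."""
    return math.log2(used_adc_codes(signal_vpp, adc_range_v, adc_levels))


codes = used_adc_codes(0.1, 3.3, 1024)   # about 31 codes
bits = useful_bits(0.1, 3.3, 1024)       # just under 5 bits
```

Amplifying the microphone signal so that it spans most of the ADC range is what brings the useful resolution back toward the full 10 bits.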
As an aside: the ADC will give you positive values with a DC offset. You will probably need to filter the digital output with a DC blocking filter before playback. https://manual.audacityteam.org/man/dc_offset.html
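As a rough illustration of removing the DC offset in software, the sketch below subtracts the mean from the raw ADC readings, scales them to the (-1, 1) range and writes a mono 16-bit WAV file with Python's standard `wave` module, which Audacity can open directly. Subtracting the mean is the simplest stand-in for a DC blocking filter, not a replacement for one, and the sample rate passed in must be the real, constant rate at which the samples were taken. The function names are made up for this example.

```python
import struct
import wave


def remove_dc_and_normalize(raw):
    """Center the samples on zero, then scale the peak to +/-1.0."""
    mean = sum(raw) / len(raw)
    centered = [v - mean for v in raw]
    peak = max(abs(v) for v in centered) or 1.0  # avoid dividing by zero
    return [v / peak for v in centered]


def write_wav(target, samples, sample_rate):
    """Write samples in [-1.0, 1.0] as mono 16-bit PCM.

    target may be a file name or a writable, seekable binary file object.
    """
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

For example, `write_wav("voice.wav", remove_dc_and_normalize(readings), 4000)` would store a list of readings captured at an assumed 4000 samples per second.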
I had an assignment for college where we needed to play a precompiled WAV, stored as an integer array, through the PWM and DAC. I wanted more of a challenge, so I went out of my way and created an audio DAC over USB using the microcontroller in question: the STM32F051. It basically listens to my sound card output using a WASAPI loopback recorder, reduces the resolution from 16 to 12 bits (since the DAC on the STM32 only has 12-bit resolution) and sends it over USART, using 10x the byte rate as the baud rate (in my case 960000: 48000 samples/s x 2 bytes per sample x 10 bits on the wire per byte). All done in C#.
On the microcontroller I simply use an interrupt for the USART and push the received data to the DAC.
It works pretty well, much better than PWM, and at a decent sample frequency of 48kHz.
But... here it comes... When there is some (mostly) high-pitched symphonic melody, it starts to sound "wobbly".
Here is a video where you can hear it: https://youtu.be/xD3uTP9etuA?t=88
I read up on the internet a bit about DIY DACs, and someone somewhere (I don't remember where) mentioned that MCUs in general have interrupt jitter. So my basic question is: is interrupt jitter actually causing this? If so, are there ways to limit the jitter?
Or is this something entirely different?
I am thinking of compacting the PCM data sent over serial and buffering it on the MCU. As said before, the resolution is 12 bits, but each sample is sent as two 8-bit bytes forming 16 bits, so the byte rate is twice the sample rate. My plan is to shift each 12-bit value toward the MSB and add four bits of the next 12-bit value to the current 16-bit variable, so that only 12 transfers instead of 16 are needed per 8 samples. (I might read up on more efficient ways of compacting data for transport.) I would then put the samples in a buffer and use another timer that triggers at 48kHz to send the samples to the DAC. Would this concept work? Or would I just waste time?
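The packing idea described above can be sketched as follows: two 12-bit samples share three bytes, so 8 samples take 12 bytes instead of 16. This is only one possible bit layout (first sample in the high bits), written in Python for illustration; the function names are hypothetical and inputs are assumed to be already limited to 0..4095.

```python
def pack_12bit(samples):
    """Pack pairs of 12-bit samples (0..4095) into 3 bytes per pair."""
    if len(samples) % 2:
        raise ValueError("need an even number of samples")
    out = bytearray()
    for a, b in zip(samples[0::2], samples[1::2]):
        out.append(a >> 4)                        # upper 8 bits of a
        out.append(((a & 0x0F) << 4) | (b >> 8))  # lower 4 of a, upper 4 of b
        out.append(b & 0xFF)                      # lower 8 bits of b
    return bytes(out)


def unpack_12bit(data):
    """Inverse of pack_12bit: 3 bytes back into two 12-bit samples."""
    samples = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        samples.append((b0 << 4) | (b1 >> 4))
        samples.append(((b1 & 0x0F) << 8) | b2)
    return samples
```

One design consequence worth noting: with packed data the receiver must stay aligned on 3-byte boundaries, so a single lost byte corrupts every following sample until the stream is resynchronized.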
For code, here is the project: https://github.com/EldinZenderink/SoundOverSerial
As I understand it, the term "word length" (spi_bits_per_word) in SPI defines the CS (chip select) active time.
It therefore seems that the Linux driver will function correctly when dealing with simple SPI protocols that keep the word size constant.
But how can we deal with SPI protocols that use different word sizes as part of the protocol?
For example, CS needs to be active while sending a 9-bit SPI word and then reading 8 bits or 24 bits (the length of the register read is different each time, depending on the register).
How can we implement that using spi_write_then_read?
Do we need to set one bits_per_word size for sending and then another bits_per_word for receiving?
Regards,
Ran
"Word length" means the number of bits you can send in one transaction. It doesn't define the CS (chip select) active time. You can keep CS active for as long as you want (the minimum is one word length).
SPI has a defined format. You cannot read or write an arbitrary number of bits. Most SPI controllers support 4-bit, 8-bit, 16-bit and 32-bit modes. If none of the available modes fits your requirement, you need to break the transfer up. For example, to read 24-bit data, use three transfers with an 8-bit word length.
Generally SPI is full duplex, which means it reads at the same time as it writes.
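As a small illustration of the 24-bit example, the sketch below reassembles a register value from three 8-bit words. It is written in Python only to show the bit arithmetic, and it assumes the device sends the most significant byte first, which depends on the actual chip; the function name is hypothetical.

```python
def words_to_value(words, bits_per_word=8):
    """Combine SPI words into one integer, first word most significant."""
    mask = (1 << bits_per_word) - 1
    value = 0
    for word in words:
        value = (value << bits_per_word) | (word & mask)
    return value


# Three 8-bit words received for a 24-bit register read.
register = words_to_value([0x12, 0x34, 0x56])
```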
I want to capture audio on Linux with low latency in a program I'm writing.
I've run some experiments using the ALSA API, using snd_pcm_readi() to
capture sound, then immediately using snd_pcm_writei() to play it back.
I've tried playing with the number of frames captured, and the buffer size,
but I don't seem to be able to get the latency down to less than a second
or so.
Am I better off using PulseAudio or JACK? Can those be used to play the
captured audio?
To reduce capture latency, reduce the period size of the capture device.
To reduce playback latency, reduce the buffer size of the playback device.
Jack can play the captured audio (just connect the input ports to the output ports), but you still have to configure its periods/buffers.
Also see Relation between period size of speaker and mic and Recording from ALSA - understanding memory mapping.
I've been doing some work on low-latency audio programming.
My experience is that, first, your capture buffer should be small, like a 10 ms period buffer (let's assume you're using a 512-frame buffer and a 48000 Hz sample rate).
Then, you should configure your output device's start_threshold to at least 2 * frame size (1 * frame size if you don't do much processing of the recorded data).
For the record device, like CL. said, a relatively small period size is better, but not so small that it causes too many IRQs.
Also, you can change your process's scheduling policy to FIFO scheduling.
Then, hopefully, you will get about 20 ms total latency.
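The figures in this answer follow from simple arithmetic, sketched here in Python with hypothetical function names: a block of N frames at a given sample rate represents N / rate seconds of audio, so 512 frames at 48000 Hz is roughly 10 ms, and two such blocks queued before playback starts give roughly 20 ms.

```python
def frames_to_ms(frames, sample_rate):
    """Duration in milliseconds of a block of audio frames."""
    return 1000.0 * frames / sample_rate


def frames_for_ms(target_ms, sample_rate):
    """Number of frames needed to cover a given duration."""
    return int(round(target_ms * sample_rate / 1000.0))


period_ms = frames_to_ms(512, 48000)          # one capture period
start_ms = frames_to_ms(2 * 512, 48000)       # two periods before playback
```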
I believe you should at first ensure that you are running a Linux kernel which actually allows you to achieve low typical latency.
There are several kernel compile-time configuration options which you might look into:
CONFIG_HZ_1000
CONFIG_IRQ_FORCED_THREADING
CONFIG_PREEMPT
CONFIG_PREEMPT_RT_FULL (available only with RT patch)
Apart from that, there are more things you can do in order to optimize your audio latency in Linux. Some starting reference points can be found here:
http://wiki.linuxaudio.org/wiki/real_time_info
Does the framebuffer contain depth buffer information, or just color buffer information in graphics applications? What about GUIs on Windows: is there a framebuffer, and does it hold both color and depth info or just color info?
If you're talking about the kernel-level framebuffer in Linux, it sets both the resolution and the color depth. Here's a list of common framebuffer modes; notice that the modes are determined by both the resolution and the color depth. You can override the framebuffer by passing a command line parameter to the kernel in your bootloader (vga=...).
Unlike Linux, on Windows the graphics subsystem is a part of the OS. I don't think (and, please, someone correct me if I'm wrong) there is support for non-VGA output devices in the latest Windows, so the framebuffer is deprecated/unavailable there.
As for real-time 3D, the "standard" buffers are the color buffer, in RGBA format at 1 byte per component, and the depth buffer, at 3 bytes per sample. There is one sample per fragment (i.e., if you have 8x antialiasing, you will have 8 color and 8 depth samples per pixel).
Now, many applications use additional, custom buffers. They are usually called g-buffers. These can include: object ID, material ID, albedo, shininess, normals, tangent, binormal, AO factor, the most influential lights affecting the fragment, etc. At 1080p with 4x MSAA and double- or triple-buffering, this can take huge amounts of memory, so all this information is usually packed as tightly as possible.
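To give a feel for the memory figures, here is a back-of-the-envelope Python sketch using the sizes quoted above (4 bytes of RGBA color and 3 bytes of depth per sample). The function names are made up, and the result ignores any padding or compression a real driver may apply.

```python
def buffer_bytes(width, height, samples_per_pixel, bytes_per_sample):
    """Raw size of one multisampled buffer, without padding."""
    return width * height * samples_per_pixel * bytes_per_sample


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 * 1024)


# 1080p, 4x MSAA: color (4 bytes) plus depth (3 bytes) per sample.
color = buffer_bytes(1920, 1080, 4, 4)
depth = buffer_bytes(1920, 1080, 4, 3)
total_mib = to_mib(color + depth)   # roughly 55 MiB for one color + depth pair
```

Every additional g-buffer layer of 4 bytes per sample adds about another 32 MiB at this resolution and sample count, which is why tight packing matters.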