String compression to refresh WS2811 RGB LEDs faster - node.js

I have the following problem. I am using WS2811 LEDs, an Arduino Due and Node.js in my project. I want to stream video from a device connected to a Node.js server and show it on an array of LEDs. Right now I am able to capture video from any device with a browser and a camera, scale it down to my desired resolution (15x10), and build a string containing the colour information (R, G, B) of all the LEDs. I send this string from the Node.js server to the Arduino over the serial port at a baud rate of 115200. Unfortunately the sending process is too slow; I would like to refresh the LED array at least 10 times per second. So I was wondering whether I could compress the string before sending it to the Arduino, decompress it there, and then set the LED colours. Maybe you have experience with a similar project and can advise me what to do.
For driving the LEDs I am using the Adafruit_NeoPixel library.

If I were you I would try to convert the video to a 16-bit encoding (like RGB565), or maybe even 8-bit, on your server.
Even at that low resolution I'm not certain the Due's SAM3X8E is powerful enough to convert it back to 24-bit and push the data out to the display, but try it and see. If it doesn't work, you might want to consider switching to a BeagleBone or a Raspberry Pi.
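As a rough illustration of that idea, here is how 24-bit RGB pixels could be packed into RGB565 on the Node.js side; the frame-size and frame-rate figures in the comments are back-of-the-envelope estimates assuming 8N1 framing on the serial link:

    // Pack one 24-bit RGB pixel into a 16-bit RGB565 value (two bytes
    // instead of three). r, g and b are assumed to be in the 0-255 range.
    function toRGB565(r, g, b) {
      return ((r & 0xf8) << 8) | ((g & 0xfc) << 3) | (b >> 3);
    }

    // For a 15x10 frame: 150 pixels * 2 bytes = 300 bytes per frame
    // instead of 450. At 115200 baud with 8N1 framing (~11520 bytes/s)
    // that allows roughly 38 raw frames per second before any protocol
    // overhead, compared to about 25 for 24-bit data.
    const frame = Buffer.alloc(300);
    for (let i = 0; i < 150; i++) {
      frame.writeUInt16BE(toRGB565(255, 128, 0), i * 2); // hypothetical test colour
    }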

If you have large areas of a similar colour, especially if you have dropped the bit depth to 16 or 8 bits as suggested in the previous answer, run-length encoding (RLE) compression might be worth a try.
It's easy to implement in a few lines of code:
https://en.wikipedia.org/wiki/Run-length_encoding
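A minimal sketch of such an encoder on the Node.js side, assuming the frame is already a Buffer of pixel bytes (the decoder on the Arduino just reverses the loop):

    // Minimal run-length encoder for a Buffer of pixel bytes.
    // Output is a sequence of [runLength, value] byte pairs; runs are
    // capped at 255 so each pair fits in two bytes.
    function rleEncode(buf) {
      const out = [];
      let i = 0;
      while (i < buf.length) {
        const value = buf[i];
        let run = 1;
        while (i + run < buf.length && buf[i + run] === value && run < 255) {
          run++;
        }
        out.push(run, value);
        i += run;
      }
      return Buffer.from(out);
    }

    // Example: a 15x10 single-colour frame of RGB565 data (300 bytes)
    // shrinks to 4 bytes; the worst case (no repeats) doubles the size.
    console.log(rleEncode(Buffer.alloc(300, 0x1f)).length); // 4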

Related

Sending a webcam input to zoom using a recorded clip

I have an idea that I have been working on, but there are some technical details that I would love to understand before I proceed.
From what I understand, Linux communicates with the underlying hardware through device files in /dev. I was messing around with feeding my webcam input into Zoom and I found someone explaining that I need to create a virtual device and hook it up to the output of another program called v4l2loopback.
My questions are:
1- How does Zoom detect the webcams available for input? My /dev directory has two video "files" (/dev/video0 and /dev/video1), yet Zoom only detects one webcam. Is the webcam communication done through these video files or not? If yes, why does simply creating one not affect Zoom's input choices? If not, how does Zoom detect the input and read the webcam feed?
2- Can I create a virtual device and write a kernel module for it that feeds the input from a local file? I have written a lot of kernel modules, and I know they have read, write and release methods. I want to serve the video whenever a read request from Zoom is issued. How should the video be encoded? Is it MP4, a raw format, or something else? How fast should I be sending input (in terms of kilobytes per second)? I think it is a function of my webcam recording specs: if it is 1920x1080, each pixel is 3 bytes (RGB), and it is recording at 20 fps, I can simply calculate how many bytes are generated per second (see the estimate below), but how does Zoom expect the input to be fed into it? Assuming that it reads the stream in real time, it should be reading input every few milliseconds. How do I get access to such information?
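For reference, a rough estimate of the raw data rate at those specs (uncompressed RGB, which is itself an assumption about what Zoom actually reads):

    // Raw bandwidth of an uncompressed 1920x1080 RGB feed at 20 fps.
    const width = 1920, height = 1080, bytesPerPixel = 3, fps = 20;
    const bytesPerSecond = width * height * bytesPerPixel * fps;
    console.log((bytesPerSecond / 1e6).toFixed(1) + ' MB/s'); // ~124.4 MB/s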
Thank you in advance. This is a learning experiment, I am just trying to do something fun that I am motivated to do, while learning more about Linux-hardware communication. I am still a beginner, so please go easy on me.
Apparently, there are two types of /dev/video* files: one for metadata and the other for the actual stream from the webcam. Creating a virtual device of the same type as the stream in the /dev directory did result in Zoom recognizing it as an independent webcam, even without creating its metadata file. I did finally achieve what I wanted, but I used the OBS Studio virtual camera feature that was added in update 26.0.1, and it is working perfectly so far.

Is interrupt jitter causing the annoying wobble in audio using the MCU's DAC?

I had an assignment for college where we needed to play a precompiled WAV (as an integer array) through PWM and the DAC. I wanted more of a challenge, so I went out of my way and created an audio DAC over USB using the microcontroller in question: the STM32F051. It basically listens to my sound card output using a WASAPI loopback recorder, changes the resolution from 16 to 12 bits (since the DAC on the STM32 only has 12-bit resolution) and sends it over USART, using 10x the sample rate as the baud rate (in my case 960000). All done in C#.
On the microcontroller I simply use an interrupt for the USART and push the received data to the DAC.
It works pretty well, much better than PWM, and at a decent sample frequency of 48kHz.
But... here it comes... When there is some (mostly) high-pitched symphonic melody, it starts to sound "wobbly".
Here's a video where you can hear it: https://youtu.be/xD3uTP9etuA?t=88
I read up on the internet a bit about DIY DACs, and someone somewhere (I don't remember where) mentioned that MCUs in general have interrupt jitter. So my basic question is: is interrupt jitter actually causing this? If so, are there ways to limit the jitter?
Or is this something entirely different?
I am thinking of compacting the PCM data sent over serial. As said before, the resolution is 12 bits, but the samples are sent as packets of two 8-bit bytes forming 16 bits, hence the byte rate being twice the sample rate. My plan is to shift the 12 bits to the MSB and add four bits of the next 12-bit value to the current 16-bit variable, so only 12 byte transfers are needed instead of 16 per 8 samples (I might read up on more efficient ways of compacting data for transport). I would then put the samples in a buffer and use another timer that triggers at 48kHz to send the samples to the DAC. Would this concept work, or would I just be wasting my time?
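A minimal sketch of that packing scheme, assuming unsigned 12-bit samples (shown in JavaScript for illustration, even though the sender in this project is written in C#):

    // Pack pairs of 12-bit samples into 3 bytes (24 bits) instead of 4,
    // cutting the serial traffic by 25% (12 bytes per 8 samples).
    function pack12bit(samples) {
      const out = [];
      for (let i = 0; i < samples.length; i += 2) {
        const a = samples[i] & 0x0fff;
        const b = (i + 1 < samples.length ? samples[i + 1] : 0) & 0x0fff;
        out.push(a >> 4);                        // top 8 bits of sample A
        out.push(((a & 0x0f) << 4) | (b >> 8));  // low 4 bits of A, top 4 of B
        out.push(b & 0xff);                      // low 8 bits of sample B
      }
      return Uint8Array.from(out);
    }

    // 8 samples become 12 bytes instead of 16:
    console.log(pack12bit(new Uint16Array(8).fill(0x0abc)).length); // 12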
For code, here is the project: https://github.com/EldinZenderink/SoundOverSerial

How to convert PCM audio stream for online play

I have access to an audio stream of PCM audio buffers. I should be clear that I do not have access to the audio file; I only have access to a stream of 4096-byte chunks of the audio data.
The PCM buffers come in with the following format:
PCM Int 16
Little Endian
Two Channels
Interleaved
To support audio playback on a standard browser I need to convert the audio to the following format:
PCM Float 32
Big Endian
Two channels (at most)
Deinterleaved
This audio is coming from an iOS app, so I have access to Swift and Objective-C (although I am not very comfortable with Objective-C, which makes Apple's Audio Converter Services almost impossible to use because Swift really doesn't like pointers).
Additionally, the playback will occur in a browser, so I could handle the conversion in client-side JavaScript or on the server side. I am proficient enough in the following server-side languages to do a conversion:
Java (preferred)
PHP
Node.js
Python
If anyone knows a way to do this in any of these languages please let me know. I have worked on this for long enough that I will probably understand even a very technical description of how to do this.
My current plan is to use bitwise operations to deinterleave the left and right channels, then cast the Int 16 Buffer to a Float 32 Buffer with the Web Audio API. Does this seem like a good plan?
Any help is appreciated, thank you.
Yes, that is exactly what you need to do. I do the exact same thing in my applications, and this method works well and is really the only way that makes sense to do it. You don't want to send 32-bit float samples to the client from the server due to the amount of bandwidth. Do the conversion client-side.
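A minimal sketch of that client-side conversion, assuming each 4096-byte chunk arrives as interleaved signed 16-bit little-endian stereo:

    // Convert an interleaved Int16 stereo chunk into two deinterleaved
    // Float32 channel arrays, ready for AudioBuffer.copyToChannel().
    function int16StereoToFloat32(chunk) {
      const view = new DataView(chunk.buffer, chunk.byteOffset, chunk.byteLength);
      const frames = chunk.byteLength / 4; // 2 channels * 2 bytes per sample
      const left = new Float32Array(frames);
      const right = new Float32Array(frames);
      for (let i = 0; i < frames; i++) {
        left[i] = view.getInt16(i * 4, true) / 32768;      // true = little-endian
        right[i] = view.getInt16(i * 4 + 2, true) / 32768; // scale to [-1, 1)
      }
      return { left, right };
    }

    // Usage with the Web Audio API (the 44100 Hz sample rate is an assumption):
    // const buf = audioCtx.createBuffer(2, frames, 44100);
    // buf.copyToChannel(left, 0);
    // buf.copyToChannel(right, 1);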

Streaming audio from avconv via NodeJs WebSockets into Chrome with AudioContext

We're having trouble playing streamed audio in a browser (using Chrome).
We have a process which is streaming some audio (for example an internet radio) on udp on some port. It's avconv (avconv -y -i SOMEURL -f alaw udp://localhost:PORT).
We have a NodeJs server which receives this audio stream and forwards it to multiple clients connected via websockets. The audio stream which NodeJs receives is wrapped in a buffer which is an array with numbers from 0 to 255. The data is sent to the browser without any issues and then we're using AudioContext to play the audio stream in the browser (our code is based on AudioStreamer - https://github.com/agektmr/AudioStreamer).
At first, all we got at this point was static. When looking into the AudioStreamer code, we realized that the audio stream data should be in the -1 to 1 range. With this knowledge we tried modifying each value in the buffer with the formula x = (x/128) - 1. We did it just to see what would happen, and surprisingly the static became a bit less awful - you could even make out melodies of songs, or words if the audio was speech. But it's still very, very bad, with lots of static, so this is obviously not a solution - but it does show that we are indeed receiving the audio stream via the websockets and not just some random data.
So the question is - what are we doing wrong? Is there a codec/format we should be using? Of course all the code (the avconv, NodeJs and client side) can be modified at will. We could also use another browser if needed, though I assume that's not the problem here. The only thing we do know is that we really need this to work through websockets.
The OS running the avconv and NodeJs is Ubuntu (various versions 10-13)
Any ideas? All help will be appreciated.
Thanks!
Tomas
The conversion from integer samples to floating point samples is incorrect. You must take into account:
Number of channels
Number of bits per sample
Signed/unsigned
Endianness
Let's assume you have a typical WAV file at 16-bit stereo, signed, little-endian. You're on the right track with your formula, but signed samples don't need the -1 offset; just scale them:
x = x / 32768
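As a sketch, assuming avconv is switched to raw signed 16-bit little-endian output (e.g. -f s16le rather than alaw; the exact flag choice is an assumption here), the conversion on the browser side could look like this:

    // Turn a chunk of signed 16-bit little-endian PCM received over the
    // websocket into Float32 samples in the -1..1 range for AudioContext.
    function pcm16leToFloat32(bytes) {
      const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
      const out = new Float32Array(bytes.byteLength / 2);
      for (let i = 0; i < out.length; i++) {
        out[i] = view.getInt16(i * 2, true) / 32768; // true = little-endian
      }
      return out;
    }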

Low latency sounds on key presses

I am trying to write an application (I'm a GUI first-timer) for my son, who has autism. There is a video player in the top half and a text entry area in the bottom. When letters are typed, sounds are produced to mimic the words in the video.
There have been other posts on this site about playing sounds on key presses using GStreamer as a system call. I have also tried libcanberra, but both seem to have significant delays between sounds. I can write the app in Python or C, but will likely do at least some of it in C.
I also want to mention that the video portion is being played by GStreamer. I tried to create two instances of GStreamer to avoid expensive system calls, but the audio instance seemed to kill the app when called.
If anyone has any tips on creating faster responding sounds I would really appreciate it.
You can upload a raw audio sample directly to PulseAudio, so there will be no decoding and perhaps some extra switches saved, by using the following function from libcanberra:
http://developer.gnome.org/libcanberra/unstable/libcanberra-canberra.html#ca-context-cache
The next ca_context_play() will use it.
However, the biggest problem you'll encounter in this scenario (with simultaneous video playback) is that the audio device might be configured with a large latency in PulseAudio (up to half a second or more for normal playback). It may be reasonable to file a bug against libcanberra asking for a LOW_LATENCY flag, as it currently doesn't attempt to minimize delay for sound events, as far as I know. That would be great to have.
GStreamer's pulsesink could probably achieve low latency too (it has some properties for that), but I am afraid it won't be as lightweight as libcanberra, and you won't be able to cache a sample, for instance. Ideally, GStreamer could also learn to cache samples, or to pre-fill PulseAudio...
