How to get audio peaks with FFmpeg?

I am working on a music app and need to generate an audio spectrum (waveform) for my files.
So I tried using audiowaveform like this:
audiowaveform -i music.mp3 --pixels-per-second 1 -o out.dat
which gives me the following (correct) results; the first 10 words are metadata:
0000000 0001 0000 0000 0000 bb80 0000 bb80 0000
0000020 00f9 0000 df3e 1fa2 e22c 1ef3 e0bb 1e5a
0000040 e099 1e88 dfcf 1c33 e29f 1d4c e055 1f80
0000060 df63 1e3a e1b4 1f31 e271 1d81 e0e5 1b1c
0000100 e06d 1be4 dee2 1cb0 e118 1da1 e026 1dea
0000120 e055 1dac df9b 1dbf e0c3 2063 ded4 21b2
0000140 dec9 1f8d de5b 20c8 e02d 216a dd7e 21af
0000160 dea1 20ac de6c 2170 de80 1e12 de6f 1fb9
0000200 dde3 2106 e0d9 21be de88 218c de81 1f9f
0000220 decb 20ff deb2 1edc df32 20c4 dde7 ...
But when I do the same kind of job with FFmpeg:
ffmpeg -y -i music.mp3 -acodec pcm_s16le -f s16le -ac 1 -ar 1 -v quiet out.pcm
I get the following results, which are not the same at all:
0000000 0001 fffe fffe fffe 0000 ffff fffd 0000
0000020 ffff ffff fffe 0001 0001 fffd 0001 fffe
0000040 0002 fffe fffc 0002 ffff fffc fffe 000b
0000060 0007 fffb 0004 0001 ffff fffd ffff 0002
0000100 0008 0006 fffe ffff 0001 0000 0003 000a
0000120 fffd ffff 0004 ffff 0001 ffff fffd ffff
0000140 fffe ffff 0001 fffd fffe 0000 fffb 0002
0000160 0002 0000 fffe 0000 fffb fffe fffe 0000
0000200 ffff 0000 ffff fffc 0002 0003 0005 0003
0000220 0002 fffb fffb fffa fffa 0004 0009 ...
You may wonder why I am using -ar 1 or --pixels-per-second 1. This is because I want to draw one line for each second, so I need the peak for each second. I don't know what I am missing, but I expected to get the same results from FFmpeg.
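The audiowaveform dump above can be decoded by hand, which makes it easier to compare against other tools. The following is a minimal sketch, assuming the version 1 binary layout that the dump appears to follow: five little-endian 32-bit header fields (version, flags, sample rate, samples per pixel, length) followed by signed 16-bit min/max pairs. The function name parse_waveform_dat is hypothetical and not part of audiowaveform.

```python
import struct

def parse_waveform_dat(blob):
    """Decode a version 1, 16-bit audiowaveform .dat blob (layout assumed from the dump)."""
    version, flags, sample_rate, samples_per_pixel, length = struct.unpack_from("<iIiiI", blob, 0)
    if version != 1 or flags != 0:
        raise ValueError("only version 1 with 16-bit samples is handled in this sketch")
    # The 20-byte header is followed by `length` (min, max) pairs of int16 values.
    values = struct.unpack_from("<%dh" % (2 * length), blob, 20)
    peaks = list(zip(values[0::2], values[1::2]))
    return {"sample_rate": sample_rate, "samples_per_pixel": samples_per_pixel, "peaks": peaks}
```

Read this way, the header in the dump says 48000 Hz (0xbb80), 48000 samples per pixel and 249 pixels (0xf9), and the first pair df3e 1fa2 is a minimum of -8386 and a maximum of 8098 for the first second.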

This is not an FFmpeg solution, but it still produces a waveform array.
My solution was to use the audiowaveform Linux package, which has a simple CLI to extract the waveform data at the desired resolution.
You can install it on Ubuntu like this:
sudo add-apt-repository ppa:chris-needham/ppa
sudo apt-get update
sudo apt-get install audiowaveform
Or on macOS with Homebrew like this:
brew tap bbc/audiowaveform
brew install audiowaveform
First I used the command from the question to draw the waveform, but the result was inaccurate and noisy, because it produces only one point per second, which is not what I was looking for. So I decided to take 100 points per second and average them with some JS code. The command to extract the waveform becomes:
audiowaveform -i /root/audio.mp3 --pixels-per-second 100 --output-format json -
This writes the waveform data, along with some metadata, to stdout (the hyphen at the end does the trick). In my case I used Node.js to capture this output and reduce the waveform array to the average of each block. Note that I removed the negative numbers from the waveform so that I only work on its upper half.
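import {execFile} from "child_process";

// Resolves with one averaged peak per block of `blockSize` points (one per second).
export default function getAudioWaveform(filename, blockSize = 100) {
    return new Promise((resolve, reject) => {
        // Passing the arguments as an array avoids shell quoting problems with the filename.
        // The trailing "-" makes audiowaveform write to stdout.
        const args = ["-i", filename, "--pixels-per-second", String(blockSize), "--output-format", "json", "-"];
        execFile("audiowaveform", args, (error, stdout) => {
            if (error) return reject(error);
            try {
                // data alternates min, max; keep only the max values (odd indices).
                const data = JSON.parse(stdout).data.filter((_, i) => i % 2 === 1);
                const waveform = [];
                // Stop at the end of the data instead of producing empty trailing blocks.
                for (let i = 0; i * blockSize < data.length; i++) {
                    const block = data.slice(i * blockSize, (i + 1) * blockSize);
                    waveform.push(Math.round(block.reduce((s, n) => s + n, 0) / block.length));
                }
                resolve(waveform);
            } catch (ex) {
                reject(ex);
            }
        });
    });
}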
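As for why the FFmpeg output in the question looks so different: -ar 1 asks FFmpeg to resample the audio down to one sample per second, which is not the same operation as taking the minimum and maximum of every second, so the values end up close to zero. A possible way to get peaks from FFmpeg output is to decode at a normal rate and compute the peaks afterwards. The following is a minimal sketch, assuming the file was decoded to mono s16le PCM with the command from the question but with, for example, -ar 8000 instead of -ar 1. The function name peaks_per_second is hypothetical.

```python
import array
import sys

def peaks_per_second(pcm_bytes, sample_rate):
    """Return one (min, max) pair per second of mono signed 16-bit little-endian PCM."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes[: len(pcm_bytes) // 2 * 2])  # drop a trailing odd byte
    if sys.byteorder == "big":
        samples.byteswap()  # the PCM data is little-endian
    peaks = []
    for start in range(0, len(samples), sample_rate):
        block = samples[start:start + sample_rate]
        peaks.append((min(block), max(block)))
    return peaks
```

The resulting pairs play the same role as the min/max pairs in the audiowaveform output, although the exact numbers may still differ slightly because the two tools do not necessarily decode and resample the file identically.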

Related

How to make the DOS .exe relocation table smaller with OpenWatcom linker?

I've created the following DOS .exe file with OpenWatcom:
$ xxd prog.exe
00000000: 4d5a 8200 0100 0100 0300 4000 ffff 0500 MZ........@.....
00000010: 0204 0000 0000 0000 2000 0000 0000 0000 ........ .......
00000020: 0100 0000 0000 0000 0000 0000 0000 0000 ................
00000030: b804 008e d8e8 0900 b44c cd21 d1e2 01d0 .........L.!....
00000040: c353 52ba 0200 b409 cd21 ba0c 00b4 09cd .SR......!......
00000050: 21ba 0f00 b409 cd21 ba08 00b8 0700 e8db !......!........
00000060: ff89 c3ba 0a00 b809 00e8 d0ff 01d8 5a5b ..............Z[
00000070: c300 4865 6c6c 6f21 0d0a 2400 6162 0063 ..Hello!..$.ab.c
00000080: 6400 d.
Regions:
0x0...0x1c: DOS .exe header.
0x1c...0x20: 4 bytes of padding.
0x20...0x24: 4 bytes containing 1 relocation entry.
0x24...0x30: 12 bytes of padding.
0x30...: code (_TEXT) segment with 16-bit 8086 machine code.
...
How do I get rid of the 4 bytes of padding and the 12 bytes of padding, so that the code starts at offset 0x20? Is there a WLINK flag for this? Should I use a different linker? Should I post-process the generated .exe?

I wasn't able to find a configuration option for this, so I ended up writing my own linker and using it instead of WLINK. This way the .exe header became only 24 (0x18) bytes, and I didn't need any relocations.
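To see where the padding in the question comes from, the header fields can be decoded directly. The following is a minimal sketch that only inspects a file; it does not answer whether WLINK can be configured to emit a smaller header. It uses the standard 28-byte MZ header layout (fourteen little-endian 16-bit words). The field names and both function names are hypothetical, and conventional_header_size assumes the fixed header is kept intact with the relocation table placed directly after it.

```python
import struct

MZ_FIELDS = ("magic", "last_page_bytes", "pages", "reloc_count", "header_paragraphs",
             "min_alloc", "max_alloc", "ss", "sp", "checksum", "ip", "cs",
             "reloc_offset", "overlay")

def parse_mz_header(blob):
    """Decode the 28-byte DOS MZ header into a dict of named 16-bit fields."""
    header = dict(zip(MZ_FIELDS, struct.unpack_from("<14H", blob, 0)))
    if header["magic"] != 0x5A4D:  # "MZ" read as a little-endian word
        raise ValueError("not an MZ executable")
    return header

def conventional_header_size(reloc_count):
    """28 fixed bytes plus 4 bytes per relocation, rounded up to a 16-byte paragraph."""
    needed = 28 + 4 * reloc_count
    return (needed + 15) // 16 * 16
```

For the file in the question, the header occupies 3 paragraphs (0x30 bytes) with the relocation table at 0x20, while one relocation placed directly after the fixed header would fit in 0x20 bytes. That 16-byte difference is the 4 + 12 bytes of padding listed above.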

Python stops binary reading of file after getting byte 0xa or 0xd

I want to read a binary file. It is a big file, so I use a maximalOffset variable to stop reading once that offset is reached. But reading always ends at the same offset, 8199. The last byte I get is 0xa. In xxd it is part of the word 0a0d.
I am using Ubuntu 18 and Python 3.
I found some information about 0x1A on Windows (it's an EOF symbol or something), but the solution there was to use binary reading, and 0xA is not 0x1A...
maximalOffsetString = "2070"
maximalOffset=int(maximalOffsetString,16)
offset=-16 # first 16 bytes must be on 0x0 offset
line = [ ]
pagefile = open("./pagefile", "rb")
for bytes in pagefile:
    for byte in bytes:
        if maximalOffset==offset: break
        if len(line) == 16:
            print(hex(offset))
            print(str(offset)+" : "+str(maximalOffset))
            print(line)
            del line[:]
        line.append(hex(byte))
        offset=offset+1
    break
pagefile.close()
# here i see what was the last symbols in array:
print(hex(offset))
print(str(offset)+" : "+str(maximalOffset))
print(line)
Output:
0x2007
8199 : 8304
['0xf0', '0xa9', '0xc', '0x7', '0x71', '0xc0', '0xa']
As you can see, my maximalOffset is 8304, but the reading stops at 8199. In xxd this line is:
00002010: f0a9 0c07 71c0 0a0d 0000 006c 0105 5c00
The whole file before this is only zeros. After 0x2000 there are random bytes.
00001fb0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001fc0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001fd0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001fe0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001ff0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00002000: 0104 0000 6f01 005c 0094 008c e026 6469 ....o..\.....&di
00002010: f0a9 0c07 71c0 0a0d 0000 006c 0105 5c00 ....q......l..\.
00002020: 9500 8c20 b800 8040 0001 10ab 0c07 4230 ... ...@......B0
00002030: 0dba 0069 010a 5c00 9600 8ce8 b800 38a7 ...i..\.......8.
00002040: 0c07 fbd0 7b01 6601 0f5c 0097 0008 0020 ....{.f..\.....
00002050: 208c f8b8 0090 940d 0724 0000 7a01 6301 ........$..z.c.
00002060: 0c5c 0098 008c 0027 6469 9892 0d07 f2b9 .\.....'di......
00002070: 0009 0080 4100 4100 6001 115c 0099 008c ....A.A.`..\....
00002080: 08b8 0020 0d0e 072b 7c01 7d01 165c 009a ... ...+|.}..\..
00002090: 008c 10b8 0028 a20c 0727 bc00 8100 4200 .....(...'....B.
000020a0: 7a01 1b5c 009b 008c 18b9 009f 0d07 29bc z..\..........).
000020b0: 0077 0118 5c00 9c00 8c98 b803 6091 0d07 .w..\.......`...
000020c0: 06b0 3b05 4000 0103 7401 1d5c 009d 7801 ..;.@...t..\..x.
000020d0: b800 208f 0d07 10f0 097a 0471 0122 5c00 .. ......z.q."\.

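I think you are breaking out of the outer for loop before reading is finished. Iterating over a file opened in binary mode yields chunks that end at each newline byte (0x0a), so the first chunk ends at the first 0x0a in the file, and the trailing break stops the loop right after it. Remove the break at the bottom of the outer for loop.
...
for bytes in pagefile:
    for byte in bytes:
        ...
        line.append(hex(byte))
        offset=offset+1
    break # <- Remove this
pagefile.close()
...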
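An alternative that sidesteps newline-based iteration altogether is to read fixed-size chunks with read(). The following is a minimal sketch, not the asker's code: the name dump_rows and the in-memory demo data are assumptions made for illustration.

```python
import io

def dump_rows(stream, max_offset, width=16):
    """Yield (offset, [hex strings]) rows until max_offset bytes have been read."""
    offset = 0
    while offset < max_offset:
        chunk = stream.read(min(width, max_offset - offset))
        if not chunk:  # end of file reached before max_offset
            break
        yield offset, [hex(b) for b in chunk]
        offset += len(chunk)

# Demo on in-memory data containing 0x0a and 0x0d: reading continues past them.
demo = io.BytesIO(b"\x00" * 20 + b"\x0a\x0d" + b"\xff" * 20)
rows = list(dump_rows(demo, 40))
```

With the file from the question, the same generator could be driven by a stream obtained from open("./pagefile", "rb") and a max_offset of 0x2070.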

Bash: How to detect multimedia keypresses in a shell script?

So I've been working on an old Kobo e-reader (no touch screen) and I've been trying to figure out how to detect when its buttons are pressed.
So far I've used hexdump to figure out the keycodes, but they don't work like a regular keyboard, in that showkey doesn't work on them. Here's the hexdump output I got for the buttons:
hexdump /dev/input/event0
upPress 0000000 fc92 5512 92dd 0003 0001 0067 0001 0000
upRelease 0000010 fc92 5512 7905 0006 0001 0067 0000 0000
rightPress 0000020 fcab 5512 0cec 000b 0001 006a 0001 0000
rightRelease 0000030 fcab 5512 7de5 000d 0001 006a 0000 0000
downPress 0000040 fcb6 5512 48eb 0001 0001 006c 0001 0000
downRelease 0000050 fcb6 5512 b9e4 0003 0001 006c 0000 0000
leftPress 0000060 fcc0 5512 2b98 000f 0001 0069 0001 0000
leftRelease 0000070 fcc1 5512 3342 0002 0001 0069 0000 0000
middlePress 0000080 fccd 5512 acaa 0000 0001 001c 0001 0000
middleRelease 0000090 fccd 5512 1da4 0003 0001 001c 0000 0000
I've determined from this that the keycodes are the 7th number on each line, so 0x67 for example. The only problem I have now is that I can't figure out how to detect those in a shell script.
This has got me stumped. Right now the device has Linux 2.6.28, Busybox v1.17.1 and a few other programs. It is connected to the internet, so I might be able to install some things, but there's no package manager, so I'd prefer not to.
Edit: Things I've tried -
read doesn't work, at least the way I'm using it:
#!/bin/bash
read -n 1 -s key
echo "key pressed:" $key
Lots of Google searches: most of the suggestions require X, which I don't have. The bind command might work, but I don't have it on the system.
Edit 2: More things -
More research has pointed me to the cat command. It shows the output from the keys as seemingly garbled characters like this:
cat /dev/input/event0
T)U┐Ä☺g☺T)Utè☺gW)U╗☺l☺W)Uúp
☺lY)U3⌐
☺l☺Y)U"☺lZ)Uë"
☺l☺Z)Uæ║
☺l\)U║╙☺i☺\)U▓D♥☺i
Unfortunately, it looks different every time, so I don't know how to make sense of it.
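The hexdump lines above are consistent with 16-byte Linux input event records, where the leading words are a timestamp (which would explain why the raw output looks different every time) followed by the event type, the key code and the value. The following is a minimal sketch of decoding such records, assuming the 32-bit layout suggested by the dump (two 32-bit timestamp fields, 16-bit type, 16-bit code, 32-bit value). It is written in Python for illustration only, and whether Python can be run on the device is itself an assumption. The name decode_events is hypothetical.

```python
import struct

# Assumed 32-bit layout: tv_sec, tv_usec (uint32 each), type, code (uint16 each), value (int32).
EVENT = struct.Struct("<IIHHi")
EV_KEY = 1  # event type used for key presses and releases

def decode_events(blob):
    """Yield (code, value) for each key event in a buffer of 16-byte records."""
    for pos in range(0, len(blob) - EVENT.size + 1, EVENT.size):
        _sec, _usec, ev_type, code, value = EVENT.unpack_from(blob, pos)
        if ev_type == EV_KEY:
            yield code, value  # value 1 = press, 0 = release

# The upPress and upRelease records from the hexdump, as raw bytes.
up_press = bytes.fromhex("92fc1255dd9203000100670001000000")
up_release = bytes.fromhex("92fc1255057906000100670000000000")
events = list(decode_events(up_press + up_release))
```

Under these assumptions the codes in the dump are 0x67 (up), 0x6a (right), 0x6c (down), 0x69 (left) and 0x1c (middle), each with value 1 on press and 0 on release.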

Transforming a binary file into a decimal format using bash, how to?

I have a binary file generated by my program, and I need to inspect its contents in decimal to check whether they meet my requirements. However, I can't seem to find a way to do this using bash. Is there a command that lets me do this? Whenever I open the file in a program like Sublime Text I get the contents in hex form, which is not what I am looking for.
5249 4646 5200 0000 5741 5645 666d 7420
1000 0000 0100 0200 44ac 0000 10b1 0200
0400 1000 6461 7461 2e00 0000 0200 0200
0200 0300 0300 0900 0900 0900 0c00 0c00
1400 1400 1400 1800 1800 0c00 0c00 0c00
0600 0600 0200 0200 0200

For a dump of bytes in decimal:
od -t u1 filename
For a dump of 2-byte words in decimal:
od -t u2 filename
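If the decimal values need to be checked programmatically rather than read from a dump, the same conversion can be sketched in a few lines. This is a minimal sketch, assuming little-endian 16-bit words (which matches the WAV-style data in the question), whereas od uses the byte order of the machine it runs on. The function name decimal_dump is hypothetical.

```python
import struct

def decimal_dump(blob, word_size=1):
    """Return the bytes (word_size=1) or little-endian 16-bit words (word_size=2) as unsigned ints."""
    if word_size == 1:
        return list(blob)
    if word_size == 2:
        usable = len(blob) // 2 * 2  # ignore a trailing odd byte
        return [word for (word,) in struct.iter_unpack("<H", blob[:usable])]
    raise ValueError("word_size must be 1 or 2")
```

For example, the bytes 02 00 03 00 09 00 from the data section of the dump in the question come out as the words 2, 3 and 9.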

For a dump in binary digits (bits rather than decimal), in bash simply type
xxd -b yourbinaryfile

Accessing shell variable in awk but not interpreted

I am very new to awk programming ...
Here is my code, where I am trying to access the shell variable count in awk:
ivlen=`cat record.txt | awk -F " " '{printf "%s",$10}'`
echo $ivlen
count=` expr $ivlen / 2 `
echo $count
echo "\nInitialization Vector : (Value) "
This is the part that needs attention (edited):
iv=`awk -v count=$count 'BEGIN {RS=" ";ORS=" ";}
{if (NR > 4 && NR < count+4 )print $0}' esp_payload.txt`
echo $iv
Input:
$cat esp_payload.txt
0000 5FB4 0000 0041
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 5361 6C74 6564 5F5F
D678 E0DA A075 5361 02B4 6273 D970 2F72
Required output (I want those twelve 0000 strings):
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
Output actually displayed on screen (not what I want):
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 5361
What is going wrong? Why is one 0000 not printed, and why is 5361 printed?

Your script is counting 0041\n0000 as a single record because it has no space character in it. You're also getting 0000\n0000 as a single field in your output, but you can't tell because you echo $iv instead of echo "$iv".
Change RS=" " to RS="[ \n]".
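The effect of the record separator can be reproduced outside awk. The following is a minimal sketch that mimics how the records are numbered under each separator; the name select_records is hypothetical, and count = 12 is an illustrative value chosen because it reproduces the output shown in the question.

```python
import re

def select_records(text, count, separator):
    """Mimic awk's NR > 4 && NR < count + 4 for a given record-separator regex."""
    records = re.split(separator, text)
    # NR is 1-based in awk, so enumerate from 1.
    return [rec for nr, rec in enumerate(records, start=1) if 4 < nr < count + 4]

PAYLOAD = (
    "0000 5FB4 0000 0041\n"
    "0000 0000 0000 0000 0000 0000 0000 0000\n"
    "0000 0000 0000 0000 5361 6C74 6564 5F5F\n"
    "D678 E0DA A075 5361 02B4 6273 D970 2F72\n"
)

space_only = select_records(PAYLOAD, 12, " ")            # like RS=" "
space_or_newline = select_records(PAYLOAD, 12, "[ \n]")  # like RS="[ \n]"
```

With a space-only separator, the selection contains the merged record 0000\n0000 and ends with 5361. With the space-or-newline separator, every selected record is a single 0000. Note that the condition NR > 4 && NR < count + 4 selects count - 1 records, so getting exactly twelve may also need the upper bound adjusted.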

You can pass variables to awk by using -v and your script can be simplified a bit, because {print $0} is the default action:
iv=`awk -v count="$count" 'BEGIN { RS=" "; ORS=" " }
(NR > 4 && NR < count)' esp_payload.txt`

Your script is in single quotes, and bash doesn't substitute variables in single-quoted strings.
The cleanest way to pass parameters to awk: awk 'script.....' count=$count.
