FFmpeg: check audio channels for silence

I have two .mp4 files, both reporting 8 (7.1) audio channels. But I've been told that one actually carries stereo audio plus 2 SAP (secondary audio program) channels on channels 7-8, while the other carries 6 (5.1) audio channels plus 2 SAP channels on channels 7-8. So the latter has real content on channels such as the Center channel, whereas the former, stereo one has those channels too, but they are apparently silent/muted.
I've been looking for some differentiating metadata in MediaInfo, but the metadata for both files look exactly the same. I also tried some basic metadata retrieval with ffmpeg and ffprobe; again, both look the same - no luck:
ffprobe -i 2ch.mp4 -show_streams -select_streams a:0
So the question is: does ffmpeg or ffprobe have a quick way to differentiate the two? Is there an audio filter that can detect whether a specific audio channel is silent? Or any other differentiating metadata? I would prefer to differentiate the two through metadata rather than content analysis.
This is a sample of the 2-channel mp4 file, and this one is a sample of the 6-channel mp4.

Both of your sample files have 4 audio streams or tracks. Each audio track has 2 channels, with a layout of stereo.
Apparently, the audio encoder is constant bit-rate, and so the metadata cannot be used to distinguish silent tracks from sound-bearing ones.
You'll need to parse each suspect audio stream.
ffmpeg -i file -map 0:a:1 -af astats -f null -
At the end of the console log, statistics for the audio stream will be printed, e.g.:
[Parsed_astats_0 @ 0000000003c3aec0] Channel: 1
[Parsed_astats_0 @ 0000000003c3aec0] DC offset: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Min level: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Max level: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Min difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Max difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Mean difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] RMS difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Peak level dB: -6153.053111
[Parsed_astats_0 @ 0000000003c3aec0] RMS level dB: -inf
[Parsed_astats_0 @ 0000000003c3aec0] RMS peak dB: -3076.526556
[Parsed_astats_0 @ 0000000003c3aec0] RMS trough dB: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Crest factor: 1.000000
[Parsed_astats_0 @ 0000000003c3aec0] Flat factor: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Peak count: 662528
[Parsed_astats_0 @ 0000000003c3aec0] Bit depth: 0/0
[Parsed_astats_0 @ 0000000003c3aec0] Dynamic range: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Zero crossings: 0
[Parsed_astats_0 @ 0000000003c3aec0] Zero crossings rate: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Channel: 2
[Parsed_astats_0 @ 0000000003c3aec0] DC offset: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Min level: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Max level: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Min difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Max difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Mean difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] RMS difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Peak level dB: -6153.053111
[Parsed_astats_0 @ 0000000003c3aec0] RMS level dB: -inf
[Parsed_astats_0 @ 0000000003c3aec0] RMS peak dB: -3076.526556
[Parsed_astats_0 @ 0000000003c3aec0] RMS trough dB: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Crest factor: 1.000000
[Parsed_astats_0 @ 0000000003c3aec0] Flat factor: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Peak count: 662528
[Parsed_astats_0 @ 0000000003c3aec0] Bit depth: 0/0
[Parsed_astats_0 @ 0000000003c3aec0] Dynamic range: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Zero crossings: 0
[Parsed_astats_0 @ 0000000003c3aec0] Zero crossings rate: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Overall
[Parsed_astats_0 @ 0000000003c3aec0] DC offset: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Min level: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Max level: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Min difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Max difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Mean difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] RMS difference: 0.000000
[Parsed_astats_0 @ 0000000003c3aec0] Peak level dB: -6153.053111
[Parsed_astats_0 @ 0000000003c3aec0] RMS level dB: -inf
[Parsed_astats_0 @ 0000000003c3aec0] RMS peak dB: -3076.526556
[Parsed_astats_0 @ 0000000003c3aec0] RMS trough dB: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Flat factor: -inf
[Parsed_astats_0 @ 0000000003c3aec0] Peak count: 662528.000000
[Parsed_astats_0 @ 0000000003c3aec0] Bit depth: 0/0
[Parsed_astats_0 @ 0000000003c3aec0] Number of samples: 662528
If the RMS level dB is -inf, then that channel is silent.
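If you have many files or streams to check, this is easy to script. Below is a minimal sketch in Python that runs the same astats pass over every audio stream and flags those whose overall "RMS level dB" is -inf; the helper name silent_streams is mine, and the parsing assumes the log format shown above.
import re
import subprocess

def silent_streams(path):
    """Return indices of audio streams whose overall astats RMS level is -inf."""
    # Count the audio streams with ffprobe.
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a",
         "-show_entries", "stream=index", "-of", "csv=p=0", path],
        capture_output=True, text=True, check=True).stdout
    n_audio = len(out.split())

    silent = []
    for i in range(n_audio):
        # astats prints its report to stderr at the end of the run.
        proc = subprocess.run(
            ["ffmpeg", "-i", path, "-map", f"0:a:{i}",
             "-af", "astats", "-f", "null", "-"],
            capture_output=True, text=True)
        # Read "RMS level dB" from the final "Overall" block.
        m = re.search(r"Overall.*?RMS level dB:\s*(\S+)", proc.stderr, re.S)
        if m and m.group(1) == "-inf":
            silent.append(i)
    return silent

print(silent_streams("2ch.mp4"))  # e.g. [1, 2, 3] if only the first track has sound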

Related

Pandas: mask dataframe by a rolling window

I have a dataframe df_snow_or_ice which indicates whether there is snow on a given day, as follows:
df_snow_or_ice
Out[63]:
SWE
datetime_doy
2007-01-01 0.000000
2007-01-02 0.000000
2007-01-03 0.000000
2007-01-04 0.000000
2007-01-05 0.000000
...
2019-12-27 0.000000
2019-12-28 0.000000
2019-12-29 0.000000
2019-12-30 0.000000
2019-12-31 0.000064
[4748 rows x 1 columns]
I also have a dataframe gpi_data_tmp that I want to mask based on whether there is snow (i.e. df_snow_or_ice['SWE'] > 0) within a rolling window of 42 days. That is, if df_snow_or_ice['SWE'] > 0 anywhere in the interval df_snow_or_ice.iloc[d-21:d+21], then gpi_data_tmp.iloc[d] is masked as np.nan. Written as a for-loop, it looks like:
half_width = 21
for i in range(half_width, len(df_snow_or_ice) - half_width + 1, 1):
    if df_snow_or_ice['SWE'].iloc[i] > 0:
        gpi_data_tmp.iloc[(i - half_width):(i + half_width)] = np.nan
for i in range(len(df_snow_or_ice)):
    if df_snow_or_ice['SWE'].iloc[i] > 0:
        gpi_data_tmp.iloc[i] = np.nan
How can I write this efficiently, using pandas functions? Thanks!
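For what it's worth, here is a vectorized sketch of the same logic. It assumes, as in the loops above, that a day is masked whenever any SWE value in the surrounding 42-day window is positive; the edge handling via min_periods=1 differs slightly from the explicit loop, which skips the first and last half_width rows.
import numpy as np

half_width = 21
# 0/1 indicator of snowy days; rolling() needs a numeric dtype.
snowy = df_snow_or_ice['SWE'].gt(0).astype(float)
# A centered rolling max of size 2*half_width is 1 exactly when some day in
# the surrounding window is snowy, which is the same test the loops perform.
mask = snowy.rolling(2 * half_width, center=True, min_periods=1).max().astype(bool)
gpi_data_tmp.loc[mask.to_numpy()] = np.nan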

Assign different colors to polydata in paraview

I am trying to avoid defining multiple individual polygons/quads, so I use polydata.
I need to define multiple polydata in a MATLAB-generated vtk file, but each one should be assigned a different color (defined in a lookup table).
The following file gives an error and accepts only the first color, which it assigns to all polydata.
# vtk DataFile Version 5.1
vtk output
ASCII
DATASET POLYDATA
POINTS 12 float
0.500000 1.000000 0.000000
0.353553 1.000000 -0.353553
0.000000 1.000000 -0.500000
-0.353553 1.000000 -0.353553
-0.500000 1.000000 0.000000
-0.353553 1.000000 0.353553
0.000000 1.000000 0.500000
0.353553 1.000000 0.353553
0. 0. 0.
1. 1. 1.
2. 2. 2.
1. 2. 1.
POLYGONS 3 12
OFFSETS vtktypeint64
0 8 12
CONNECTIVITY vtktypeint64
0 1 2 3 4 5 6 7
9 10 11 12
CELL_DATA 2
SCALARS SMEARED float 1
LOOKUP_TABLE victor
0 1
LOOKUP_TABLE victor 1
1.000000 0.000000 0.000000 1.000000
0.000000 1.000000 0.000000 1.000000
The error is in this line:
LOOKUP_TABLE victor 1
It should be LOOKUP_TABLE victor 2, since you define 2 RGBA entries in your table.
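For reference, the corrected CELL_DATA section would read as follows; only the declared table size changes:
CELL_DATA 2
SCALARS SMEARED float 1
LOOKUP_TABLE victor
0 1
LOOKUP_TABLE victor 2
1.000000 0.000000 0.000000 1.000000
0.000000 1.000000 0.000000 1.000000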

How to add path to texture in OBJ or MTL file?

I have the following problem:
My project consists of an .obj file, an .mtl file, and a texture (.jpg).
I need to divide the texture into multiple files. But when I do, the UV coordinates (after mapping and reverse mapping) are the same across several files, which causes an error when viewing the OBJ in MeshLab.
How can I solve this?
MeshLab does support OBJ files with several textures; just use a separate material for each texture. It is not clear whether you are generating your OBJ files with MeshLab or another program, so I'm not sure if this is a MeshLab-related question.
Here is a sample of a minimal multitexture .obj file (8 vertices, 4 triangles, 2 textures):
mtllib ./TextureDouble.obj.mtl
# 8 vertices, 8 vertices normals
vn 0.000000 0.000000 1.570796
v 0.000000 0.000000 0.000000
vn 0.000000 0.000000 1.570796
v 1.000000 0.000000 0.000000
vn 0.000000 0.000000 1.570796
v 1.000000 1.000000 0.000000
vn 0.000000 0.000000 1.570796
v 0.000000 1.000000 0.000000
vn 0.000000 0.000000 1.570796
v 2.000000 0.000000 0.000000
vn 0.000000 0.000000 1.570796
v 3.000000 0.000000 0.000000
vn 0.000000 0.000000 1.570796
v 3.000000 1.000000 0.000000
vn 0.000000 0.000000 1.570796
v 2.000000 1.000000 0.000000
# 4 coords texture
vt 0.000000 0.000000
vt 1.000000 0.000000
vt 1.000000 1.000000
vt 0.000000 1.000000
# 2 faces using material_0
usemtl material_0
f 1/1/1 2/2/2 3/3/3
f 1/1/1 3/3/3 4/4/4
# 4 coords texture
vt 0.000000 0.000000
vt 1.000000 0.000000
vt 1.000000 1.000000
vt 0.000000 1.000000
# 2 faces using material_1
usemtl material_1
f 5/5/5 6/6/6 7/7/7
f 5/5/5 7/7/7 8/8/8
And here is the TextureDouble.obj.mtl file. To test the files, you must provide two image files named TextureDouble_A.png and TextureDouble_B.png.
newmtl material_0
Ka 0.200000 0.200000 0.200000
Kd 1.000000 1.000000 1.000000
Ks 1.000000 1.000000 1.000000
Tr 1.000000
illum 2
Ns 0.000000
map_Kd TextureDouble_A.png
newmtl material_1
Ka 0.200000 0.200000 0.200000
Kd 1.000000 1.000000 1.000000
Ks 1.000000 1.000000 1.000000
Tr 1.000000
illum 2
Ns 0.000000
map_Kd TextureDouble_B.png

I have a problem understanding sklearn's TfidfVectorizer results

Given a corpus of 3 documents, for example:
sentences = ["This car is fast",
             "This car is pretty",
             "Very fast truck"]
I am computing tf-idf by hand.
For document 1 and the word "car", I find that:
TF = 1/4
IDF = log(3/2)
TF-IDF = 1/4 * log(3/2)
The same result should apply to document 2, since it also has 4 words, one of which is "car".
I have tried to apply this in sklearn, with the code below:
from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd
data = {'text': sentences}
df = pd.DataFrame(data)
tv = TfidfVectorizer()
tfvector = tv.fit_transform(df.text)
print(pd.DataFrame(tfvector.toarray(), columns=tv.get_feature_names()))
And the result I get is:
car fast is pretty this truck very
0 0.500000 0.50000 0.500000 0.000000 0.500000 0.000000 0.000000
1 0.459854 0.00000 0.459854 0.604652 0.459854 0.000000 0.000000
2 0.000000 0.47363 0.000000 0.000000 0.000000 0.622766 0.622766
I understand that sklearn uses L2 normalization, but still, shouldn't the tf-idf score of "car" be the same in the first two documents? Can anyone help me understand the results?
It is because of the normalization. If you pass norm=None to the vectorizer, i.e. TfidfVectorizer(norm=None), you will get the following result, which has the same value for "car":
car fast is pretty this truck very
0 1.287682 1.287682 1.287682 0.000000 1.287682 0.000000 0.000000
1 1.287682 0.000000 1.287682 1.693147 1.287682 0.000000 0.000000
2 0.000000 1.287682 0.000000 0.000000 0.000000 1.693147 1.693147
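The unnormalized value itself comes from sklearn's default smoothed idf (smooth_idf=True), not the plain log(3/2) from the hand calculation. A quick check of the "car" value:
import numpy as np

# sklearn's default idf (smooth_idf=True): idf(t) = ln((1 + n) / (1 + df(t))) + 1
# "car" appears in 2 of the 3 documents, with a raw count (tf) of 1 in each:
idf_car = np.log((1 + 3) / (1 + 2)) + 1  # ~1.287682
print(1 * idf_car)                       # matches the "car" column with norm=None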

Plot 3 dimensional unequal length of data

I am new to gnuplot, so excuse me if this looks simple. I have a data file like the one below, and I want to draw a 3D mesh diagram from it:
t x = 0.00 0.20 0.40 0.60 0.80 1.00
0.00 0.000000 0.640000 0.960000 0.960000 0.640000 0.000000
0.02 0.000000 0.480000 0.800000 0.800000 0.480000 0.000000
0.04 0.000000 0.400000 0.640000 0.640000 0.400000 0.000000
0.06 0.000000 0.320000 0.520000 0.520000 0.320000 0.000000
0.08 0.000000 0.260000 0.420000 0.420000 0.260000 0.000000
0.10 0.000000 0.210000 0.340000 0.340000 0.210000 0.000000
0.12 0.000000 0.170000 0.275000 0.275000 0.170000 0.000000
0.14 0.000000 0.137500 0.222500 0.222500 0.137500 0.000000
0.16 0.000000 0.111250 0.180000 0.180000 0.111250 0.000000
0.18 0.000000 0.090000 0.145625 0.145625 0.090000 0.000000
0.20 0.000000 0.072813 0.117813 0.117813 0.072813 0.000000
The equivalent GNU Octave command is something like this:
mesh(tplot,xplot,ttplot);
Well, as with many things, it is simple if you know how. This is straightforward to plot if you remove the x = and the t from the data file, e.g.:
0 0.00 0.20 0.40 0.60 0.80 1.00
0.00 0.000000 0.640000 0.960000 0.960000 0.640000 0.000000
0.02 0.000000 0.480000 0.800000 0.800000 0.480000 0.000000
0.04 0.000000 0.400000 0.640000 0.640000 0.400000 0.000000
0.06 0.000000 0.320000 0.520000 0.520000 0.320000 0.000000
0.08 0.000000 0.260000 0.420000 0.420000 0.260000 0.000000
0.10 0.000000 0.210000 0.340000 0.340000 0.210000 0.000000
0.12 0.000000 0.170000 0.275000 0.275000 0.170000 0.000000
0.14 0.000000 0.137500 0.222500 0.222500 0.137500 0.000000
0.16 0.000000 0.111250 0.180000 0.180000 0.111250 0.000000
0.18 0.000000 0.090000 0.145625 0.145625 0.090000 0.000000
0.20 0.000000 0.072813 0.117813 0.117813 0.072813 0.000000
Then the data can be interpreted as a "nonuniform" matrix, even though it is actually uniform. This is useful because gnuplot then reads the first row and first column correctly. See help matrix and help matrix nonuniform for more. For example:
echo 'splot "data" nonuniform matrix with lines' | gnuplot --persist
This gives a basic 3D surface plot.
To make it similar to the output produced by the GNU Octave mesh command, do something like this:
set xlabel "x"
set ylabel "t"
set zlabel "u"
set view 20,210
set border 4095 lw 2
set hidden3d
set xyplane 0
set autoscale fix
set nokey
set notics
splot "data" nonuniform matrix lt -1 lw 2 with lines
This results in a mesh-style plot resembling the Octave output.
