I need to get the length of a .wav file.
Using:
sox output.wav -n stat
Gives:
Samples read: 449718
Length (seconds): 28.107375
Scaled by: 2147483647.0
Maximum amplitude: 0.999969
Minimum amplitude: -0.999969
Midline amplitude: 0.000000
Mean norm: 0.145530
Mean amplitude: 0.000291
RMS amplitude: 0.249847
Maximum delta: 1.316925
Minimum delta: 0.000000
Mean delta: 0.033336
RMS delta: 0.064767
Rough frequency: 660
Volume adjustment: 1.000
How do I use grep or some other method to only output the value of the length in the second column, i.e. 28.107375?
There is a better way:
soxi -D out.wav
The stat effect sends its output to stderr; use 2>&1 to redirect it to stdout, then use sed to extract the relevant value:
sox out.wav -n stat 2>&1 | sed -n 's#^Length (seconds):[^0-9]*\([0-9.]*\)$#\1#p'
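For example, to capture the value in a shell variable (a sketch, assuming the 28.107375-second file from the question):
length=$(sox out.wav -n stat 2>&1 |
         sed -n 's#^Length (seconds):[^0-9]*\([0-9.]*\)$#\1#p')
echo "$length"    # prints 28.107375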
This can be done with soxi:
soxi -D input.mp3   # outputs the duration directly in seconds
soxi -d input.mp3   # outputs the duration formatted as hh:mm:ss.ss
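With the file from the question, the two forms would print something like this (sketched from the formats just described):
soxi -D out.wav    # 28.107375
soxi -d out.wav    # 00:00:28.11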
This worked for me (on Windows):
sox --i -D out.wav
I just added an option for JSON output on the 'stat' and 'stats' effects. This should make getting info about an audio file a little easier.
https://github.com/kylophone/SoxJSONStatStats
$ sox somefile.wav -n stat -json
For Ruby:
string = `sox --i -D file_wav 2>&1`
string.strip.to_f
Here is my solution for C# (unfortunately, sox --i -D out.wav returns a wrong result in some cases):
// Requires: using System; using System.Diagnostics;
// using System.Globalization; using System.Linq;
public static double GetAudioDuration(string soxPath, string audioPath)
{
    double duration = 0;
    var startInfo = new ProcessStartInfo(soxPath,
        string.Format("\"{0}\" -n stat", audioPath));
    startInfo.UseShellExecute = false;
    startInfo.CreateNoWindow = true;
    startInfo.RedirectStandardError = true;
    startInfo.RedirectStandardOutput = true;
    var process = Process.Start(startInfo);
    // Read the output before WaitForExit to avoid a deadlock if the
    // pipe buffer fills up; the stat effect writes to stderr.
    string str = process.StandardError.ReadToEnd();
    if (string.IsNullOrEmpty(str))
        str = process.StandardOutput.ReadToEnd();
    process.WaitForExit();
    try
    {
        string[] lines = str.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
        string lengthLine = lines.First(line => line.Contains("Length (seconds)"));
        // Parse with the invariant culture so the decimal point is
        // accepted regardless of the system locale.
        duration = double.Parse(lengthLine.Split(':')[1], CultureInfo.InvariantCulture);
    }
    catch (Exception)
    {
        // Leave duration at 0 if the output could not be parsed.
    }
    return duration;
}
On CentOS:
sox out.wav -n stat 2>&1 | sed -n 's#^Length (seconds):[^0-9]*\([0-9.]*\)$#\1#p'
Convert the sox stat output to an array and JSON-encode it (PHP):
$stats_raw = array();
// stat writes to stderr, so redirect it into exec()'s output array.
exec('sox file.wav -n stat 2>&1', $stats_raw);
$stats = array();
foreach ($stats_raw as $stat) {
    // Split each "Name: value" line on the first colon.
    $word = explode(':', $stat, 2);
    $stats[] = array('name' => trim($word[0]), 'value' => trim($word[1]));
}
echo json_encode($stats);
I am trying to write a bash script to create multiple .txt files.
With the below code I created the files, but when I run the script again I get the same output instead of more files with increasing numbers.
#!/bin/bash
for z in $(seq -w 1 10); do
    [[ ! -f "${z}_name.txt" ]] && { touch "${z}_name.txt"; }
done
Based in part on work by Raman Sailopal in a now-deleted answer (and on comments I made about that answer, as well as comments I made about the question), you could use:
shopt -s nullglob
touch $(seq -f '%.0f_name.txt' \
        $(printf '%s\n' [0-9]*_name.txt |
          awk 'BEGIN { max = 0 }
               { val = $0 + 0; if (val > max) max = val; }
               END { print max + 1, max + 10 }'
         )
       )
The shopt -s nullglob command means that if there are no names that match the glob expression [0-9]*_name.txt, nothing will be generated in the arguments to the printf command.
The touch command is given a list of file names. The seq command formats a range of numbers using zero decimal places (so it formats them as integers) plus the rest of the name (_name.txt). The range is given by the output of printf … | awk …. The printf command lists file names that start with a digit and end with _name.txt, one per line. The awk command keeps track of the current maximum number; it coerces each name into a number (awk ignores the material after the last digit) and checks whether the number is larger than the previous maximum. At the end, it prints two values, the largest value plus 1 and the largest value plus 10 (defaulting to 1 and 10 if there were no files). Adding the -w option to seq is irrelevant when you specify -f and a format; the file names won't be generated with leading zeros. There are ways to deal with this if it's crucial: probably simplest is to drop the -f option to seq, add the -w option, and pipe the output through sed 's/$/_name.txt/', as sketched below.
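That zero-padded variant might look like this (a sketch along the lines just described, reusing the same awk range computation):
touch $(seq -w $(printf '%s\n' [0-9]*_name.txt |
                 awk 'BEGIN { max = 0 }
                      { val = $0 + 0; if (val > max) max = val }
                      END { print max + 1, max + 10 }') |
        sed 's/$/_name.txt/')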
You can squish the awk script onto a single line; you can squish the whole command onto a single line. However, it is arguably easier to see the organization of the command when it is spread over multiple lines.
Note that (apart from a possible TOCTOU — Time of Check, Time of Use — issue), there is no need to check whether the files exist. They don't; they'd have been listed by the glob [0-9]*_name.txt if they did, and the number would have been accounted for. If you want to ensure no damage to existing files, you'd need to use set -C or set -o noclobber and then create the files one by one using shell I/O redirection.
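A minimal sketch of that safer approach (assuming max holds the highest existing number, computed as above):
set -C    # same as set -o noclobber
for i in $(seq $((max + 1)) $((max + 10))); do
    : > "${i}_name.txt" || echo "refusing to clobber ${i}_name.txt" >&2
done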
[…time passes…]
Actually, you can have awk do the file name generation instead of using seq at all:
touch $(printf '%s\n' [0-9]*_name.txt |
        awk 'BEGIN { max = 0 }
             { val = $0 + 0; if (val > max) max = val; }
             END { for (i = max + 1; i <= max + 10; i++)
                       printf "%d_name.txt\n", i
             }'
       )
And, if you try a bit harder, you can get rid of the printf command too:
touch $(awk 'BEGIN { max = 0
             for (i = 1; i < ARGC; i++)
             {
                 val = ARGV[i] + 0;
                 if (val > max)
                     max = val
             }
             for (i = max + 1; i <= max + 10; i++)
                 printf "%d_name.txt\n", i
             }' [0-9]*_name.txt
       )
Don't forget the shopt -s nullglob — that's still needed for maximum resiliency.
You might even choose to get rid of the separate touch command by having awk write to the files:
awk 'BEGIN { max = 0
     for (i = 0; i < ARGC; i++)
     {
         val = ARGV[i] + 0;
         if (val > max)
             max = val
     }
     for (i = max + 1; i <= max + 10; i++)
     {
         name = sprintf("%d_name.txt", i)
         printf "" > name
     }
     exit
     }' [0-9]*_name.txt
Note the use of exit. Note that the POSIX specification for awk says that ARGC is the number of arguments in ARGV and that the elements in ARGV are indexed from 0 to ARGC - 1 — as in C programs.
There are few shell scripts that cannot be improved. The first version shown runs 4 commands; the last runs just one. That difference could be quite significant if there were many files to be processed.
Beware: eventually, the argument list generated by the glob will get too big; then you have to do more work. You might be obliged to filter the output from ls (with its attendant risks and dangers) and feed the output (the list of file names) into the awk script and process the lines of input once more. While your lists remain a few thousand files long, it probably won't be a problem.
I'm using the command below, via an alias, to print the sum of all file sizes by owner in a directory:
ls -l $dir | awk '
    NF > 3 { file[$3] += $5 }
    END {
        for (i in file) {
            ss = file[i]
            if (ss >= 1024*1024*1024) { size = ss/1024/1024/1024; unit = "G" }
            else if (ss >= 1024*1024) { size = ss/1024/1024; unit = "M" }
            else { size = ss/1024; unit = "K" }
            res = sprintf("%.2f%s", size, unit)
            printf "%-8s %12d\t%s\n", res, file[i], i
        }
    }' | sort -k2 -nr
but it doesn't seem to be fast all the time.
Is it possible to get the same output in some other way, but faster?
Another perl one that displays total sizes sorted by user:
#!/usr/bin/perl
use warnings;
use strict;
use autodie;
use feature qw/say/;
use File::Spec;
use Fcntl qw/:mode/;

my $dir = shift;
my %users;
opendir(my $d, $dir);
while (my $file = readdir $d) {
    my $filename = File::Spec->catfile($dir, $file);
    my ($mode, $uid, $size) = (stat $filename)[2, 4, 7];
    $users{$uid} += $size if S_ISREG($mode);
}
closedir $d;

my @sizes = sort { $a->[0] cmp $b->[0] }
            map { [ getpwuid($_) // $_, $users{$_} ] } keys %users;
local $, = "\t";
say @$_ for @sizes;
Get a listing, add up sizes, and sort it by owner (with Perl)
perl -wE'
    chdir (shift // ".");
    for (glob ".* *") {
        next if not -f;
        ($owner_id, $size) = (stat)[4,7]
            or do { warn "Trouble stat for: $_"; next };
        $rept{$owner_id} += $size
    }
    say (getpwuid($_)//$_, " => $rept{$_} bytes") for sort keys %rept
'
I didn't get to benchmark it, and it'd be worth trying it out against an approach where the directory is iterated over, as opposed to glob-ed (while I found glob much faster in a related problem).
I expect good runtimes in comparison with ls, which slows down dramatically as the file list in a single directory gets long. This happens at the system level, so Perl will be affected as well, but as far as I recall it handles it far better. However, I've only seen a dramatic slowdown once entries reach half a million or so, not a few thousand, so I am not sure why it runs slowly on your system.
If this needs to be recursive in the directories it finds, then use File::Find. For example:
perl -MFile::Find -wE'
    $dir = shift // ".";
    find( sub {
        return if not -f;
        ($owner_id, $size) = (stat)[4,7]
            or do { warn "Trouble stat for: $_"; return };
        $rept{$owner_id} += $size
    }, $dir );
    say (getpwuid($_)//$_, " => $rept{$_} bytes") for keys %rept
'
This scans a directory with 2.4 GB of mostly small files, over a hierarchy of subdirectories, in a little over 2 seconds. du -sh took around 5 seconds (the first time round).
It is reasonable to bring these two into one script:
use warnings;
use strict;
use feature 'say';
use File::Find;
use Getopt::Long;

my %rept;

sub get_sizes {
    return if not -f;
    my ($owner_id, $size) = (stat)[4,7]
        or do { warn "Trouble stat for: $_"; return };
    $rept{$owner_id} += $size
}

my ($dir, $recurse) = ('.', '');
GetOptions('recursive|r!' => \$recurse, 'directory|d=s' => \$dir)
    or die "Usage: $0 [--recursive] [--directory dirname]\n";

($recurse)
    ? find( { wanted => \&get_sizes }, $dir )
    : find( { wanted => \&get_sizes,
              preprocess => sub { return grep { -f } @_ } }, $dir );

say (getpwuid($_)//$_, " => $rept{$_} bytes") for keys %rept;
I find this to perform about the same as the one-dir-only code above, when run non-recursively (default as it stands).
Note that the File::Find::Rule interface has many conveniences but is slower in some important use cases, which clearly matters here. (That analysis should be redone, since it's a few years old.)
Parsing output from ls - bad idea.
How about using find instead?
start in directory ${dir}
limit to that directory level (-maxdepth 1)
limit to files (-type f)
print a line with user name and file size in bytes (-printf "%u %s\n")
run the results through a perl filter
split each line (-a)
add to a hash under key (field 0) the size (field 1)
at the end (END {...}) print out the hash contents, sorted by key, i.e. user name
$ find ${dir} -maxdepth 1 -type f -printf "%u %s\n" | \
perl -ane '$s{$F[0]} += $F[1]; END { print "$_ $s{$_}\n" foreach (sort keys %s); }'
stefanb 263305714
A solution using Perl:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use File::Spec;

my %users;
foreach my $dir (@ARGV) {
    opendir(my $dh, $dir);
    # files in this directory
    while (my $entry = readdir($dh)) {
        my $file = File::Spec->catfile($dir, $entry);
        # only files
        if (-f $file) {
            my ($uid, $size) = (stat($file))[4, 7];
            $users{$uid} += $size
        }
    }
    closedir($dh);
}

print "$_ $users{$_}\n" foreach (sort keys %users);
exit 0;
Test run:
$ perl dummy.pl .
1000 263618544
Interesting difference. The Perl solution discovers 3 more files in my test directory than the find solution. I have to ponder why that is...
Did I see some awk in the OP? Here is one in GNU awk using the filefuncs extension:
$ cat bar.awk
@load "filefuncs"
BEGIN {
    FS=":"                          # passwd field separator
    passwd="/etc/passwd"            # get usernames from passwd
    while ((getline < passwd)>0)
        users[$3]=$1
    close(passwd)                   # close passwd
    if (path=="")                   # set path with -v path=...
        path="."                    # default path is cwd
    pathlist[1]=path                # path from the command line
                                    # you could have several paths
    fts(pathlist,FTS_PHYSICAL,filedata)   # don't follow links (vs. FTS_LOGICAL)
    for (p in filedata)                   # p for paths
        for (f in filedata[p])            # f for files
            if (filedata[p][f]["stat"]["type"]=="file")   # mind files only
                size[filedata[p][f]["stat"]["uid"]]+=filedata[p][f]["stat"]["size"]
    for (i in size)
        print (users[i]?users[i]:i), size[i]   # print username if found, else uid
    exit
}
Sample outputs:
$ ls -l
total 3623
drwxr-xr-x 2 james james 3690496 Mar 21 21:32 100kfiles/
-rw-r--r-- 1 root root 4 Mar 21 18:52 bar
-rw-r--r-- 1 james james 424 Mar 21 21:33 bar.awk
-rw-r--r-- 1 james james 546 Mar 21 21:19 bar.awk~
-rw-r--r-- 1 james james 315 Mar 21 19:14 foo.awk
-rw-r--r-- 1 james james 125 Mar 21 18:53 foo.awk~
$ awk -v path=. -f bar.awk
root 4
james 1410
Another:
$ time awk -v path=100kfiles -f bar.awk
root 4
james 342439926
real 0m1.289s
user 0m0.852s
sys 0m0.440s
Yet another test with a million empty files:
$ time awk -v path=../million_files -f bar.awk
real 0m5.057s
user 0m4.000s
sys 0m1.056s
Not sure why the question is tagged perl when awk is being used anyway.
Here's a simple Perl version:
#!/usr/bin/perl
chdir($ARGV[0]) or die("Usage: $0 dir\n");
map {
    if ( ! m/^[.][.]?$/o ) {
        ($s,$u) = (stat)[7,4];
        $h{$u} += $s;
    }
} glob ".* *";
map {
    $s = $h{$_};
    $u = !( $s >> 10 )           ? ""
       : !( ($s >>= 10) >> 10 )  ? "k"
       : !( ($s >>= 10) >> 10 )  ? "M"
       : !( ($s >>= 10) >> 10 )  ? "G"
       :  ( $s >>= 10 )          ? "T"
       : undef
       ;
    printf "%-8s %12d\t%s\n", $s.$u, $h{$_}, getpwuid($_)//$_;
} keys %h;
glob gets our file list
m// discards . and ..
stat the size and uid
accumulate sizes in %h
compute the unit by bitshifting (>>10 is integer divide by 1024)
map uid to username (// provides fallback)
print results (unsorted)
NOTE: unlike some other answers, this code doesn't recurse into subdirectories
To exclude symlinks, subdirectories, etc., change the if to appropriate -X tests (e.g. (-f $_), (!-d $_ and !-l $_), etc.). See the perl docs on the _ filehandle optimisation for caching stat results.
Using datamash (and Stefan Becker's find code):
find ${dir} -maxdepth 1 -type f -printf "%u\t%s\n" | datamash -sg 1 sum 2
Looking for a way to extract the volume from
pactl list sink-inputs
Output example:
Sink Input #67
Driver: protocol-native.c
Owner Module: 12
Client: 32
Sink: 0
Sample Specification: s16le 2ch 44100Hz
Channel Map: front-left,front-right
Format: pcm, format.sample_format = "\"s16le\"" format.channels = "2" format.rate = "44100" format.channel_map = "\"front-left,front-right\""
Corked: no
Mute: no
Volume: front-left: 19661 / 30% / -31.37 dB, front-right: 19661 / 30% / -31.37 dB
balance 0.00
Buffer Latency: 100544 usec
Sink Latency: 58938 usec
Resample method: n/a
Properties:
media.name = "'Alerion' by 'Asking Alexandria'"
application.name = "Clementine"
native-protocol.peer = "UNIX socket client"
native-protocol.version = "32"
media.role = "music"
application.process.id = "16924"
application.process.user = "gray"
application.process.host = "gray-kubuntu"
application.process.binary = "clementine"
application.language = "en_US.UTF-8"
window.x11.display = ":0"
application.process.machine_id = "54f542f950a5492c9c335804e1418e5c"
application.process.session_id = "3"
application.icon_name = "clementine"
module-stream-restore.id = "sink-input-by-media-role:music"
media.title = "Alerion"
media.artist = "Asking Alexandria"
I want to extract the
30
from the line
Volume: front-left: 19661 / 30% / -31.37 dB, front-right: 19661 / 30% / -31.37 dB
Note: There may be multiple sink inputs, and I need to extract the volume only from Sink Input #67
P.S. I need this for a script of mine which should increase or decrease the volume of my music player. I'm completely new to both Linux and bash, so I couldn't figure out a way to solve the problem.
Edit:
My awk version
gray@gray-kubuntu:~$ awk -W version
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan
compiled limits:
max NF 32767
sprintf buffer 2040
Since you are pretty new to standard text processing tools, I will provide an answer with a detailed explanation; feel free to refer to it in the future.
I am basing this answer on the GNU Awk I have installed, but it should likely also work with the mawk installed on your system.
pactl list sink-inputs | \
mawk '/Sink Input #67/ { f=1; next }
      f && /Volume:/ {
          n=split($0, matchGroup, "/")
          val=matchGroup[2]
          gsub(/^[[:space:]]+/, "", val)
          gsub(/%/, "", val)
          print val
          f=0
      }'
Awk processes one line at a time, based on a /pattern/ { action1; action2 } syntax. In our case we match the line /Sink Input #67/ and enable a flag (f) to mark the next occurrence of the Volume: string in the lines below; without the flag set, we could match the Volume: lines of other sink inputs.
So once we match the Volume: line, we split it on the delimiter / and take the second element, which is stored in the array matchGroup. Then we call gsub() twice: once to strip the leading whitespace and once to remove the % sign after the number.
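If you'd rather not hard-code #67, a variation (an untested sketch; sink is an assumed variable name) passes the index in with -v:
pactl list sink-inputs | \
mawk -v sink=67 '$0 ~ ("Sink Input #" sink "$") { f=1; next }
                 f && /Volume:/ { split($0, g, "/"); v=g[2];
                                  gsub(/[ %]/, "", v); print v; exit }'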
This script I wrote might be what you want. It lets me adjust the volume easily using the pacmd and pactl commands. It seems to work well on a GNOME desktop (Wayland or Xorg), and it's working on RHEL/Fedora and Ubuntu so far. I haven't tried using it with other desktops/distros, or with surround sound systems, etc.
Drop it in your path, and run it without any values to see the current volume; alternatively, set the volume by passing it a percentage. A single value sets both speakers, two values set left and right separately. In theory you shouldn't use a value outside of 0%-200%, but the script doesn't check for that (and neither does PulseAudio, apparently), so be careful: a volume higher than 200% may harm your speakers.
[~]# volume
L R
20% 20%
[~]# volume 100% 50%
[~]# volume
L R
100% 50%
[~]# volume 80%
[~]# volume
L R
80% 80%
#!/bin/bash
[ ! -z "$1" ] && [ $# -eq 1 ] && export LVOL="$1" && export RVOL="$1"
[ ! -z "$1" ] && [ ! -z "$2" ] && [ $# -eq 2 ] && export LVOL="$1" && export RVOL="$2"
SINK=$(pacmd list-sinks | grep -e '* index:' | grep -Eo "[0-9]*$")
if [ -z "$LVOL" ] || [ -z "$RVOL" ]; then
    # pacmd list-sinks | grep -e '* index:' -A 20 | grep -e 'name:' -e '^\s*volume:.*\n' -e 'balance' --color=none
    printf "%-5s%-4s\n%-5s%-4s\n" "L" "R" $(pacmd list-sinks | grep -e '* index:' -A 20 | grep -e '^\s*volume:.*\n' --color=none | grep -Eo "[0-9]*%" | tr "\n" " " | sed "s/ $/\n/g")
    exit 0
elif [[ ! "$LVOL" =~ ^[0-9]*%$ ]] || [[ ! "$RVOL" =~ ^[0-9]*%$ ]]; then
    printf "The volume should be specified as a percentage, from 0%% to 200%%.\n"
    exit 1
elif [ "$SINK" == "" ]; then
    printf "Unable to find the default sound output.\n"
    exit 1
fi
pactl -- set-sink-volume $SINK $LVOL $RVOL
How can I generate random numbers using AShell (restricted bash)? I am using a BusyBox binary on the device which does not have od or $RANDOM. My device has /dev/urandom and /dev/random.
$RANDOM and od are optional features in BusyBox, I assume given your question that they aren't included in your binary. You mention in a comment that /dev/urandom is present, that's good, it means what you need to do is retrieve bytes from it in a usable form, and not the much more difficult problem of implementing a random number generator. Note that you should use /dev/urandom and not /dev/random, see Is a rand from /dev/urandom secure for a login key?.
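As a first step, you can check what your build actually provides (a best-effort sketch; the --list option itself depends on the BusyBox version):
busybox --list | grep -E '^(od|tr|dd|head|sed)$'   # applets compiled in
echo "RANDOM=$RANDOM"    # expands empty if the shell lacks $RANDOM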
If you have tr or sed, you can read bytes from /dev/urandom and discard any byte that isn't a desirable character. You'll also need a way to extract a fixed number of bytes from a stream: either head -c (requiring FEATURE_FANCY_HEAD to be enabled) or dd (requiring dd to be compiled in). The more bytes you discard, the slower this method will be. Still, generating random bytes is usually rather fast in comparison with forking and executing external binaries, so discarding a lot of them isn't going to hurt much. For example, the following snippet will produce a random number between 0 and 65535:
n=65536
while [ $n -ge 65536 ]; do
    n=1$(</dev/urandom tr -dc 0-9 | dd bs=5 count=1 2>/dev/null)
    n=$((n-100000))
done
Note that due to buffering, tr is going to process quite a few more bytes than what dd will end up keeping. BusyBox's tr reads a bufferful (at least 512 bytes) at a time, and flushes its output buffer whenever the input buffer is fully processed, so the command above will always read at least 512 bytes from /dev/urandom (and very rarely more since the expected take from 512 input bytes is 20 decimal digits).
If you need a unique printable string, just discard non-ASCII characters, and perhaps some annoying punctuation characters:
nonce=$(</dev/urandom tr -dc A-Za-z0-9-_ | head -c 22)
In this situation, I would seriously consider writing a small, dedicated C program. Here's one that reads four bytes and outputs the corresponding decimal number. It doesn't rely on any libc function other than the wrappers for the system calls read and write, so you can get a very small binary. Supporting a variable cap passed as a decimal integer on the command line is left as an exercise; it'll cost you hundreds of bytes of code (not something you need to worry about if your target is big enough to run Linux).
#include <stddef.h>
#include <unistd.h>

int main () {
    int n;
    unsigned long x = 0;
    unsigned char buf[4];
    char dec[11]; /* Must fit 256^sizeof(buf) in decimal plus one byte */
    char *start = dec + sizeof(dec) - 1;

    n = read(0, buf, sizeof(buf));
    if (n < (int)sizeof(buf)) return 1;
    for (n = 0; n < (int)sizeof(buf); n++) x = (x << 8 | buf[n]);
    *start = '\n';
    if (x == 0) *--start = '0';
    else while (x != 0) {
        --start;
        *start = '0' + (x % 10);
        x = x / 10;
    }
    while (n = write(1, start, dec + sizeof(dec) - start),
           n > 0 && n < dec + sizeof(dec) - start) {
        start += n;
    }
    return n < 0;
}
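A possible build-and-use sketch (the rand32 file and binary names are assumptions):
gcc -Os -o rand32 rand32.c
n=$(./rand32 </dev/urandom)    # prints one number in 0..4294967295
echo $((n % 65536))            # note: plain modulo is slightly biased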
</dev/urandom sed 's/[^[:digit:]]\+//g' | head -c10
/dev/random or /dev/urandom are likely to be present.
Another option is to write a small C program that calls srand(), then rand().
I tried Gilles' first snippet with BusyBox 1.22.1 and I have some patches, which didn't fit into a comment:
n=65536
while [ $n -gt 65535 ]; do
    n=$(</dev/urandom tr -dc 0-9 | dd bs=5 count=1 2>/dev/null | sed -e 's/^0\+//')
done
The loop condition checks for a value greater than the maximum; otherwise there would be 0 executions.
I silenced dd's stderr.
Leading zeros are removed, since they could lead to surprises in contexts where the value would be interpreted as octal (e.g. inside $(( )); see the example below).
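To see why the leading zeros matter:
echo $((010))     # prints 8: a leading zero means octal inside $(( ))
# echo $((08))    # error: 8 is not a valid octal digit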
hexdump and dc are both available in BusyBox. Use /dev/urandom for mostly random, or /dev/random for better random. Either of these options is better than $RANDOM, and both are faster than looping while looking for printable characters.
32-bit decimal random number:
CNT=4
RND=$(dc 10 o 0x$(hexdump -e '"%02x" '$CNT' ""' -n $CNT /dev/random) p)
24-bit hex random number:
CNT=3
RND=0x$(hexdump -e '"%02x" '$CNT' ""' -n $CNT /dev/random)
To get smaller numbers, change the format of the hexdump format string and the count of bytes that hexdump reads.
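For example, a 16-bit hex number follows the same pattern (a sketch, reading two bytes):
CNT=2
RND=0x$(hexdump -e '"%02x" '$CNT' ""' -n $CNT /dev/random)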
Trying escitalopram's solution didn't work on BusyBox v1.29.0, but it inspired me to write a function.
I did come up with a portable random number generation function that asks for the number of digits and should work fairly well (tested on Linux, WinNT10 bash, BusyBox and msys2 so far).
# Get a random number on Windows BusyBox alike, also works on most Unixes
function PoorMansRandomGenerator {
    local digits="${1}"  # The number of digits of the number to generate
    local minimum=1
    local maximum
    local n=0

    if [ "$digits" == "" ]; then
        digits=5
    fi

    # Minimum already has a digit
    for n in $(seq 1 $((digits-1))); do
        minimum=$minimum"0"
        maximum=$maximum"9"
    done
    maximum=$maximum"9"

    #n=0; while [ $n -lt $minimum ]; do n=$n$(dd if=/dev/urandom bs=100 count=1 2>/dev/null | tr -cd '0-9'); done; n=$(echo $n | sed -e 's/^0//')
    # bs=19 since if real random strikes, having a 19 digits number is not supported
    while [ $n -lt $minimum ] || [ $n -gt $maximum ]; do
        if [ $n -lt $minimum ]; then
            # Add numbers
            n=$n$(dd if=/dev/urandom bs=19 count=1 2>/dev/null | tr -cd '0-9')
            n=$(echo $n | sed -e 's/^0//')
            if [ "$n" == "" ]; then
                n=0
            fi
        elif [ $n -gt $maximum ]; then
            n=$(echo $n | sed 's/.$//')
        fi
    done
    echo $n
}
The following gives a number between 1000 and 9999
echo $(PoorMansRandomGenerator 4)
I improved the above reply into a simpler version that also runs much faster, and is still compatible with BusyBox, Linux, msys and WinNT10 bash:
function PoorMansRandomGenerator {
    local digits="${1}"  # The number of digits to generate
    local number

    # Some read bytes can't be used, so we read twice the number of required bytes
    dd if=/dev/urandom bs=$digits count=2 2> /dev/null | while read -r -n1 char; do
        number=$number$(printf "%d" "'$char")
        if [ ${#number} -ge $digits ]; then
            echo ${number:0:$digits}
            break;
        fi
    done
}
Use with
echo $(PoorMansRandomGenerator 5)
Linux environment. We have a program, t_show, which when executed with an ID writes price data for that ID to the console. There is no other way to get the data.
I need to copy the price data for IDs 1-10,000 between two servers, using minimum bandwidth and a minimum number of connections. On the destination server the data will be a separate file for each ID, with the format:
<id>.dat
Something like this would be the long-winded solution:
On the source:
files=`seq 1 10000`
for id in $files; do
    ./t_show $id > $id
done
tar cf - $files | nice gzip -c > dat.tar.gz
On the destination:
scp user@source:dat.tar.gz ./
gunzip dat.tar.gz
tar xvf dat.tar
That is, write each output to its own file, compress and tar, send it over the network, and extract.
It has the problem that I need to create a new file for each id. This takes up tonnes of space and doesn't scale well.
Is it possible to write the console output directly to a (compressed) tar archive without creating the intermediate files? Any better ideas (maybe writing compressed data directly across network, skipping tar)?
The tar archive would need to extract, as I said, on the destination server as a separate file for each ID.
You could just send the data formatted in some way and parse it on the receiver.
foo.sh on the sender:
#!/bin/bash
for (( id = 0; id <= 10000; id++ ))
do
    data="$(./t_show $id)"
    size=$(wc -c <<< "$data")
    echo $id $size
    cat <<< "$data"
done
On the receiver:
ssh -C user@server 'foo.sh' | while read file size; do
    dd of="$file" bs=1 count="$size"
done
ssh -C compresses the data during transfer
You can at least tar stuff over an ssh connection:
tar -czf - inputfiles | ssh remotecomputer "tar -xzf -"
How to populate the archive without intermediate files, however, I don't know.
EDIT: OK, I suppose you could do it by writing the tar file manually. The header is specified here and doesn't seem too complicated, but that isn't exactly my idea of convenient...
I don't think this is workable with a plain bash script, but you could have a look at the Archive::Tar module for Perl or other scripting languages.
The Perl module has a function add_data to create a "file" on the fly and add it to the archive for streaming across the network.
The documentation can be found in the Archive::Tar docs on CPAN.
You can do better without tar:
#!/bin/bash
for id in `seq 1 10000`
do
    ./t_show $id
done | gzip
The only difference is that you will not get the boundaries between different IDs (see the sketch at the end of this answer).
Now put that in a script, say show_me_the_ids, and run it from the client:
ssh user@source ./show_me_the_ids | gunzip
And there they are!
Alternatively, you can pass the -C flag to compress the SSH connection and drop the gzip/gunzip steps altogether.
If you are really into it, you may try combinations of ssh -C, gzip -9 and other compression programs.
Personally I'd bet on lzma -9.
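If the boundaries between IDs matter, a sketch of one fix is to emit a marker line in front of each block, which is essentially what the accepted solution at the end of this thread does over a single ssh connection:
for id in $(seq 1 10000); do
    echo "HEADER $id"    # marker the receiver can split on
    ./t_show "$id"
done | gzip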
I would try this:
(for ID in $(seq 1 10000); do echo $ID: $(./t_show $ID); done) | ssh user@destination "ImportscriptOrProgram"
This prints "1: ValueOfID1" to standard output, which is transferred via ssh to the destination host, where you can start your import script or program, which reads the lines from standard input.
Thanks all
I've taken the advice 'just send the data formatted in some way and parse it on the receiver'; it seems to be the consensus. Skipping tar and using ssh -C for simplicity.
A Perl script breaks the IDs into groups of 1000 (the IDs are source_id in a hash table). All data is sent via a single ssh invocation, delimited by 'HEADER' lines, so it is written to the appropriate file. This is a lot more efficient:
sub copy_tickserver_files {
    my $self = shift;
    my $cmd = 'cd tickserver/ ; ';
    my $i = 1;
    while ( my ($source_id, $dest_id) = each ( %{ $self->{id_translations} } ) ) {
        $cmd .= qq{ echo HEADER $source_id ; ./t_show $source_id ; };
        $i++;
        if ( $i % 1000 == 0 ) {
            $cmd = qq{ssh -C dba\@$self->{source_env}->{tickserver} " $cmd " | };
            $self->copy_tickserver_files_subset( $cmd );
            $cmd = 'cd tickserver/ ; ';
        }
    }
    $cmd = qq{ssh -C dba\@$self->{source_env}->{tickserver} " $cmd " | };
    $self->copy_tickserver_files_subset( $cmd );
}

sub copy_tickserver_files_subset {
    my $self = shift;
    my $cmd = shift;
    my $output = '';
    open TICKS, $cmd;
    while(<TICKS>) {
        if ( m{HEADER [ ] ([0-9]+) }mxs ) {
            my $id = $1;
            $output = "$self->{tmp_dir}/$id.ts";
            close TICKSOP;
            open TICKSOP, '>', $output;
            next;
        }
        next unless $output;
        print TICKSOP "$_";
    }
    close TICKS;
    close TICKSOP;
}