How do I insert the results of several commands on a file as part of my sed stream?

I use DJing software on linux (xwax) which uses a 'scanning' script (visible here) that compiles all the music files available to the software and outputs a string which contains a path to the filename and then the title of the mp3. For example, if it scans path-to-mp3/Artist - Test.mp3, it will spit out a string like so:
path-to-mp3/Artist - Test.mp3[tab]Artist - Test
I have tagged all my mp3s with BPM information via the id3v2 tool and have a commandline method for extracting that information as follows:
id3v2 -l name-of-mp3.mp3 | grep TBPM | cut -d: -f2
That spits out JUST the numerical BPM to me. What I'd like to do is prepend the BPM number from the above command as part of the xwax scanning script, but I'm not sure how to insert that command in the midst of the script. What I'd want it to generate is:
path-to-mp3/Artist - Test.mp3[tab][bpm]Artist - Test
Any ideas?

It's not clear to me where in that script you want to insert the BPM number, but the idea is this:
To embed the output of one command into the arguments of another, you can use the "command substitution" notation `...` or $(...). For example, this:
rm $(echo abcd)
runs the command echo abcd and substitutes its output (abcd) into the overall command; so that's equivalent to just rm abcd. It will remove the file named abcd.
The above doesn't work inside single-quotes. If you want, you can just put it outside quotes, as I did in the above example; but it's generally safer to put it inside double-quotes (so as to prevent some unwanted postprocessing). Either of these:
rm "$(echo abcd)"
rm "a$(echo bc)d"
will remove the file named abcd.
In your case, you need to embed the command substitution into the middle of an argument that's mostly single-quoted. You can do that by simply putting the single-quoted strings and double-quoted strings right next to each other with no space in between, so that Bash will combine them into a single argument. (This also works with unquoted strings.) For example, either of these:
rm a"$(echo bc)"d
rm 'a'"$(echo bc)"'d'
will remove the file named abcd.
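As a harmless way to see these quoting rules in action, here is a minimal sketch that builds the same strings but prints them instead of passing them to rm. The variable names are made up for illustration.

```shell
# Illustrative variable names; nothing here touches the filesystem.
unquoted=a$(echo bc)d            # substitution outside quotes
double="a$(echo bc)d"            # substitution inside double quotes
mixed='a'"$(echo bc)"'d'         # single- and double-quoted pieces glued together
single='a$(echo bc)d'            # inside single quotes nothing is substituted

printf '%s\n' "$unquoted" "$double" "$mixed" "$single"
```

The first three values are abcd; the last one is the literal text a$(echo bc)d, because single quotes suppress the substitution.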
Edited to add: O.K., I think I understand what you're trying to do. You have a command that either (1) outputs all the files in a specified directory (and any subdirectories and so on), one per line, or (2) outputs the contents of a file, where the contents of that file is a list of files, one per line. So in either case, it's outputting a list of files, one per line. And you're piping that list into this command:
sed -n '
{
# /[<num>[.]] <artist> - <title>.ext
s:/\([0-9]\+.\? \+\)\?\([^/]*\) \+- \+\([^/]*\)\.[A-Z0-9]*$:\0\t\2\t\3:pi
t
# /<artist> - <album>[/(Disc|Side) <name>]/[<ABnum>[.]] <title>.ext
s:/\([^/]*\) \+- \+\([^/]*\)\(/\(disc\|side\) [0-9A-Z][^/]*\)\?/\([A-H]\?[A0-9]\?[0-9].\? \+\)\?\([^/]*\)\.[A-Z0-9]*$:\0\t\1\t\6:pi
t
# /[<ABnum>[.]] <name>.ext
s:/\([A-H]\?[A0-9]\?[0-9].\? \+\)\?\([^/]*\)\.[A-Z0-9]*$:\0\t\t\2:pi
}
'
which runs a sed script over that list. What you want is for all of the replacement-strings to change from \0\t... to \0\tBPM\t..., where BPM is the BPM number computed from your command. Right? And you need to compute that BPM number separately for each file, so instead of relying on sed's implicit line-by-line looping, you need to handle the looping yourself, and process one line at a time. Right?
So, you should change the above command to this:
while read -r LINE ; do # loop over the lines, saving each one as "$LINE"
BPM=$(id3v2 -l "$LINE" | grep TBPM | cut -d: -f2) # save BPM as "$BPM"
sed -n '
{
# /[<num>[.]] <artist> - <title>.ext
s:/\([0-9]\+.\? \+\)\?\([^/]*\) \+- \+\([^/]*\)\.[A-Z0-9]*$:\0\t'"$BPM"'\t\2\t\3:pi
t
# /<artist> - <album>[/(Disc|Side) <name>]/[<ABnum>[.]] <title>.ext
s:/\([^/]*\) \+- \+\([^/]*\)\(/\(disc\|side\) [0-9A-Z][^/]*\)\?/\([A-H]\?[A0-9]\?[0-9].\? \+\)\?\([^/]*\)\.[A-Z0-9]*$:\0\t'"$BPM"'\t\1\t\6:pi
t
# /[<ABnum>[.]] <name>.ext
s:/\([A-H]\?[A0-9]\?[0-9].\? \+\)\?\([^/]*\)\.[A-Z0-9]*$:\0\t'"$BPM"'\t\t\2:pi
}
' <<<"$LINE" # take $LINE as input, rather than reading more lines
done
(where the only change to the sed script itself was to insert '"$BPM"'\t in a few places to switch from single-quoting to double-quoting, then insert the BPM, then switch back to single-quoting and add a tab).
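To try the quote-switching trick without any mp3 files, here is a small sketch under stated assumptions: get_bpm is a hypothetical stand-in for the id3v2 pipeline, the sample path is made up, and the sed expression is a much simplified pattern rather than the one from the xwax script. It uses a literal tab character instead of \t so that it does not depend on a particular sed.

```shell
# Hypothetical stand-in for: id3v2 -l "$1" | grep TBPM | cut -d: -f2
get_bpm() {
    printf '128\n'
}

tab=$(printf '\t')                      # a literal tab character
line='music/Artist - Test.mp3'          # sample input line (made up)
bpm=$(get_bpm "$line")

# Simplified pattern: capture the base name without its extension.
# The script is single-quoted except for the "$tab$bpm$tab" piece in the middle.
result=$(printf '%s\n' "$line" |
    sed 's:/\([^/]*\)\.[A-Za-z0-9]*$:&'"$tab$bpm$tab"'\1:')

printf '%s\n' "$result"
```

Because the value is pasted into the sed script as-is, a BPM string containing the delimiter (: here), an & or a backslash would need escaping first; a plain number is safe.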

Related

Find and copy specific files by date

I've been trying to get a script working to backup some files from one machine to another but have been running into an issue.
Basically what I want to do is copy two files, one .log and one (or more) .dmp. Their format is always as follows:
something_2022_01_24.log
something_2022_01_24.dmp
I want to do three things with these files:
find the second-to-last .log file (i.e. if something_2022_01_24.log is the latest, I want to find the one before that, say something_2022_01_22.log)
get a substring with just the date (2022_01_22)
copy every .dmp that matches the date (i.e. something_2022_01_24.dmp, something01_2022_01_24.dmp)
For the first one, from what I could find, the best way is to do: ls -t *.log | head -2 as it displays the second to last file created.
As for the second one I'm more at a loss because I'm not sure how to parse the output of the first command.
The third one I think I could manage with something of the sort:
[ -f "/var/www/my_folder/*$capturedate.dmp" ] && cp "/var/www/my_folder/*$capturedate.dmp" /tmp/
What do you guys think is there any way to do this? How can I compare the substring?
Thanks!
Would you please try the following:
#!/bin/bash
dir="/var/www/my_folder"
second=$(ls -t "$dir/"*.log | head -n 2 | tail -n 1)
if [[ $second =~ .*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log ]]; then
capturedate=${BASH_REMATCH[1]}
cp -p "$dir/"*"$capturedate".dmp /tmp
fi
second=$(ls -t "$dir"/*.log | head -n 2 | tail -n 1) picks the second-newest log file. Note that it assumes the file's timestamp has not been modified since the file was created, and that the filename does not contain special characters such as a newline. This is a simple solution and may need hardening for robustness.
The regex .*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log matches the log filename and captures the date substring (the part enclosed in parentheses), which bash stores in ${BASH_REMATCH[1]}.
The cp command then does the job. Be careful to keep the wildcard * outside the double quotes so that it is expanded properly.
FYI here are some alternatives to extract the date string.
With sed:
capturedate=$(sed -E 's/.*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log/\1/' <<< "$second")
With parameter expansion of bash (if something does not include underscores):
capturedate=${second%.log}
capturedate=${capturedate#*_}
With cut command (if something does not include underscores):
capturedate=$(cut -d_ -f2,3,4 <<< "${second%.log}")
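If the something part may itself contain underscores, one possible variation on the parameter-expansion approach is to strip the date from the right instead of the left. This is only a sketch: the sample path is made up, and it assumes the name always ends in three underscore-separated fields before .log; it does not check that those fields are digits.

```shell
# Sample path, made up for illustration.
second='/var/www/my_folder/some_thing01_2022_01_22.log'

base=${second##*/}           # some_thing01_2022_01_22.log
stem=${base%.log}            # some_thing01_2022_01_22
prefix=${stem%_*_*_*}        # drop the shortest suffix of the form _x_y_z
capturedate=${stem#"$prefix"_}

printf '%s\n' "$capturedate"
```

Here prefix ends up as some_thing01 and capturedate as 2022_01_22.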

Bash deleting a specific row in .dat file

So, I have this assignment which requires me to delete a certain line from a .dat file. Now my file is basically a phone book. I have a Bash script that adds the ID, name, last name, phone number, address, etc., to the .dat file. Now one of the flags is supposed to be -delete and it takes the parameter id. So, basically I need to implement the function where I'd put ./phonebook.sh -delete -id 7 and it would delete the row where the id is 7.
I tried using sed and awk, but nothing is working and it's frustrating. My current code for that short script (delete.sh) is:
id=$1
sed "/$id/d" phonebook.dat
Try this:
On Mac:
sed -i '' -e "/$id/d" phonebook.dat
Otherwise:
sed -i -e "/$id/d" phonebook.dat
By default, sed will output the results to stdout. So, your command was working, but the output wasn't going back into the file. The -i flag says that the file should be replaced with the results. -i can also back up the original file if you give it a suffix. For example:
sed -i.bk -e "/$id/d" phonebook.dat
The above will create a copy of the original called phonebook.dat.bk (attaching the suffix directly to -i works with both GNU and macOS sed). To do in-place replacement without a backup, give -i no suffix. On macOS, sed requires a value for -i, so you pass it an empty string (making sure there is a space between the -i and the empty quotes).
I'm making some assumptions because I don't know what the format of your dat file is. I'll assume that the id field is the first field and the file is comma delimited. If I'm wrong, you should be able to modify the below to fit your needs.
I personally like to use grep -v for this problem. From the --help:
-v, --invert-match select non-matching lines
Running this will output every line of a file that does not match your pattern.
id="$1"
grep -v "^${id}," phonebook.dat > phonebook.temp
mv phonebook.temp phonebook.dat
The pattern consists of
^: Beginning of the line
${id}: Your variable
,: Our assumed delimiter
The reason for specifying the beginning of the line to the first delimiter is to avoid deleting entries where the entered id ($1) is a substring of other ids. You wouldn't want to enter 22 and delete id 22 as well as id 122.
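To see the anchored pattern at work, here is a small self-contained sketch. It assumes the same layout as above (comma-delimited, id in the first field); the sample rows and the temporary directory are made up for the demonstration.

```shell
# Build a throwaway phone book so the real file is never touched.
tmpdir=$(mktemp -d)
book="$tmpdir/phonebook.dat"
printf '%s\n' \
    '7,Ann,Lee,555-0100' \
    '22,Bob,Ray,555-0101' \
    '122,Cy,Poe,555-0102' > "$book"

id=22
# Keep every line that does NOT start with "22,", then replace the file.
grep -v "^${id}," "$book" > "$book.tmp"
mv "$book.tmp" "$book"

remaining=$(cat "$book")
printf '%s\n' "$remaining"
rm -rf "$tmpdir"
```

Only the row with id 22 is removed; the rows with ids 7 and 122 remain.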

Dynamic searching and string copying in bash

I use mailget for a home-made "backup" system, which backs pre-specified files up when receiving a mail containing the string "backup" by using the following search command:
$ grep -rnw '/path/to/mailbox/' -e "backup"
I want to extract a mail address into a variable $var. The string "Return-Path: " (13 chars) is always present at the beginning of each mail file, as follows:
Return-Path: <someone#domain.com>
In conclusion: when a file containing the string "backup" is detected under a given path, the script is supposed to extract the mail address from that file into $var.
Can't get my head around this one, grateful for any help.
The natural mechanism for capturing the output of a command in a variable is "command substitution". The syntax for a command substitution is $( <the command> ); it expands to the standard output of the specified command.
The standard lightweight general tools appropriate for extracting text from a file such as yours are sed and awk. You can also use grep's -l option to make it emit the name of the file wherein it found a match, rather than the match itself. You might put those together something like this:
var=$(sed -n -e '/^Return-Path:/ {s/.*<\(.*\)>.*/\1/;p;q}' $(grep -rlw '/path/to/mailbox/' -e "backup"))
The nested command substitution obtains the names of the files containing the target string; the sed command processes those files and extracts (only) the text between the < and > on the first line starting with "Return-Path:". It makes some assumptions that render it shorter but less robust; my objective is merely to demonstrate, not to write production-quality code for you.
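A slightly more defensive variant, sketched here against a throwaway mailbox, reads the file names from grep one per line, so names containing spaces and multiple matching files are both handled. The directory, file names and addresses are all made up for the demonstration.

```shell
# Throwaway mailbox with two sample mails; only the first mentions "backup".
mailbox=$(mktemp -d)
printf '%s\n' 'Return-Path: <alice@example.com>' 'Subject: hi' '' \
    'please run backup now' > "$mailbox/mail one.eml"
printf '%s\n' 'Return-Path: <bob@example.com>' 'Subject: hello' '' \
    'nothing to see here' > "$mailbox/mail two.eml"

# grep -l lists matching files; sed prints the address from the first
# Return-Path line of each such file and then quits.
var=$(grep -rlw -e 'backup' "$mailbox" | while IFS= read -r file; do
    sed -n -e '/^Return-Path:/{s/.*<\(.*\)>.*/\1/;p;q;}' "$file"
done)

printf '%s\n' "$var"
rm -rf "$mailbox"
```

With several matching mails, var would hold one address per line.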

Get numeric value from file name

I am a new guy of Linux. I have a question:
I have a bunch of files in a directory, like:
abc-188_1.out
abc-188_2.out
abc-188_3.out
how can a get the number 188 from those names?
Assuming (since you are on linux and are working with files), that you will use a shell / bash-script... (If you use something different (say, python, ...), the solution will, of course, be a different one.)
... this will work
for file in `ls *`; do out=`echo "${file//[!0-9]/ }"|xargs|cut -d' ' -f1`; echo $out; done
Explanation
The basic problem is to extract a number from a string in bash script (search stackoverflow for this, you will find dozens of different solutions).
This is done in the command above as (the string from which numbers are to be extracted being saved in the variable file):
${file//[!0-9]/ }
or, without spaces
${file//[!0-9]/}
It is complicated here by two things:
Do this repeatedly, once for each file in a directory. This is done here with a bash for loop (note that the variable file takes as value the name of each of the files in the current working directory, one after another)
for file in `ls *`; do (commands you want done for every file in the CWD, separated by ";"); done
There are multiple numbers in the filenames, you just want the first one.
Therefore, we leave the spaces in, and pipe the result (that being only numbers and spaces from the current file name) into two other commands, xargs (removes leading and trailing whitespace) and cut -d' ' -f1 (returns only the part of the string before the first remaining space, i.e. the first number in our filename),
We save the resulting string in a variable "out" and print it with echo $out,
out=`echo "${file//[!0-9]/ }"|xargs|cut -d' ' -f1`; echo $out
Note that the number is still a string. You can evaluate it as an integer if you want by using double parentheses preceded by $: out_int=$((out))
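For comparison, here is a sketch of a different technique that needs no external commands: instead of the ${file//...} substitution, it strips everything before the first digit and then everything from the next non-digit onward. The function name first_number is made up for this example.

```shell
# Print the first run of digits found in the given string (empty if none).
first_number() {
    name=$1
    rest=${name#"${name%%[0-9]*}"}      # drop everything before the first digit
    printf '%s\n' "${rest%%[!0-9]*}"    # drop everything from the first non-digit on
}

for file in abc-188_1.out abc-188_2.out abc-188_3.out; do
    first_number "$file"
done
```

Each of the three sample names yields 188. In a real directory the loop header could be for file in *.out, which also avoids parsing the output of ls.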

Extracting sub-strings in Unix

I'm using cygwin on Windows 7. I want to loop through a folder consisting of about 10,000 files and perform a signal processing tool's operation on each file. The problem is that the files names have some excess characters that are not compatible with the operation. Hence, I need to extract just a certain part of the file names.
For example if the file name is abc123456_justlike.txt.rna I need to use abc123456_justlike.txt. How should I write a loop to go through each file and perform the operation on the shortened file names?
I tried the cut -b1-10 command but that doesn't let my tool perform the necessary operation. I'd appreciate help with this problem.
Try some shell scripting, using the ${NAME%TAIL} parameter substitution: the contents of variable NAME are expanded, but any suffix material which matches the TAIL glob pattern is chopped off.
$ NAME=abc12345.txt.rna
$ echo ${NAME%.rna} # prints abc12345.txt
# process all files in the directory, taking off their .rna suffix
$ for x in *; do signal_processing_tool "${x%.rna}" ; done
If there are variations among the file names, you can classify them with a case:
for x in * ; do
case $x in
*.rna )
# do something with .rna files
;;
*.txt )
# do something else with .txt files
;;
* )
# default catch-all-else case
;;
esac
done
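Putting the suffix stripping and the loop together, a minimal sketch might look like the following. The directory and file names are made up, and the printf line is only a placeholder for the real tool invocation.

```shell
# Throwaway directory with one .rna file and one unrelated file.
workdir=$(mktemp -d)
touch "$workdir/abc123456_justlike.txt.rna" "$workdir/other.txt"

shortened=""
for path in "$workdir"/*.rna; do
    [ -e "$path" ] || continue      # the glob matched nothing
    name=${path##*/}                # strip the directory part
    short=${name%.rna}              # strip the .rna suffix
    shortened="$shortened$short"
    printf 'would process: %s\n' "$short"
done
rm -rf "$workdir"
```

Restricting the glob to *.rna means files with other extensions are never passed to the tool, and quoting the expansions keeps names with spaces intact.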
Try sed:
echo a.b.c | sed 's/\.[^.]*$//'
The s command in sed performs a search-and-replace operation; in this case it replaces the regular expression \.[^.]*$ (meaning: a dot, followed by any number of non-dots, at the end of the string) with the empty string.
If you are not yet familiar with regular expressions, this is a good point to learn them. I find manipulating strings using regular expressions much more straightforward than using tools like cut (or their equivalents).
If you are trying to extract the list of filenames from a directory use the below command.
ls -ltr | awk -F " " '{print $9}' | cut -c1-10

Resources