Loop command, piping previous command's output - linux

I understand that you can use loops in bash to repeat a command a certain number of times, although they are more conveniently used within bash scripts.
For example, say I have a file that has been compressed several times, and I wish to fully decompress it.
cat file.txt | gzip -d | gzip -d | gzip -d
This is practical enough for a file that has been compressed 3 times, but it would become unwieldy if the file was compressed, for example, 18 times. How could this be simplified? I want to run the gzip -d command on the previous command's output n times. Is there a way to execute this from the command line?

You could do it with something along these lines (pardon any syntax errors; consider this pseudo-code close to bash syntax):
#!/bin/bash
# $1 = number of decompressions to perform
# $2 = final output file
recursive_gzip()
{
    if [[ "$1" -gt 1 ]]
    then
        gzip -d | recursive_gzip $(($1 - 1)) "$2"
    else
        gzip -d > "$2"
    fi
}
recursive_gzip 18 "file.txt" <"file.txt.gz"
Please note I replaced your cat with a redirection.
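If you'd rather avoid recursion altogether, a plain counting loop over a scratch file does the same job. A minimal sketch, assuming hypothetical work files work.gz/work.tmp and 18 compressions:
n=18
cp file.txt.gz work.gz              # work on a copy, keep the original
for (( i = 0; i < n; i++ )); do
    gzip -d < work.gz > work.tmp    # one decompression stage
    mv work.tmp work.gz             # feed the result back into the next pass
done
mv work.gz file.txt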
You could generalize the idea to share the same function for compression and decompression, and in fact make it work for an arbitrary command, by passing that command as the positional arguments from $3 onward.
#!/bin/bash
# $1 = number of times to run the command
# $2 = final output file
# $3... = the command to run at each stage
recursive_pipe()
{
    if [[ "$1" -gt 1 ]]
    then
        "${@:3}" | recursive_pipe $(($1 - 1)) "$2" "${@:3}"
    else
        "${@:3}" > "$2"
    fi
}
# Create gzipped file
recursive_pipe 18 "file.txt.gz" gzip <"file.txt"
# Uncompress gzipped file
recursive_pipe 18 "file.txt" gzip -d <"file.txt.gz"

What if you can't remember, or were never told, how many times the file was compressed?
You would like to gunzip until it is a normal file.
If you do not want to overwrite your original file, making a copy first is the easiest approach:
tmpfile=/tmp/mightbegz
cp file.txt "${tmpfile}"
while [ $? -eq 0 ]; do
    echo "Gunzipping again"
    mv "${tmpfile}" "${tmpfile}.gz"
    gunzip "${tmpfile}.gz"
done
The while loop stops as soon as gunzip fails, i.e. when the file can no longer be decompressed.
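Alternatively you can test the payload's type explicitly with file(1) instead of relying on the exit status. A sketch, assuming a scratch copy named work; note that older versions of file report application/x-gzip rather than application/gzip:
cp file.txt work
while [ "$(file -b --mime-type work)" = "application/gzip" ]; do
    mv work work.gz
    gunzip work.gz      # replaces work.gz with the decompressed "work"
done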

Related

Stream edit text file and replace string with increments

I want to duplicate a markdown file into ~200 files and replace some strings in each document with incrementing numbers. How should I do this? I am a new web developer; my familiarity with PHP and Node.js is just so-so, but I am a decent Linux user. I am thinking of using sed but can't wrap my mind around it.
Let's say I want to duplicate:
post.md "this is post _INCREMENT_"
then I run:
run generate "post.md" --number 2
to generate:
post1.md "this is post 1"
post2.md "this is post 2"
Combining sed and bash, would you please try the following:
#!/bin/bash

# show usage on improper arguments
usage() {
    echo "usage: $1 filename.md number"
    exit 1
}

file=$1                        # obtain filename
[[ -n $file ]] || usage "$0"   # filename is not specified
num=$2                         # obtain end number
(( num > 0 )) || usage "$0"    # number is not specified or illegal

for (( i = 1; i <= num; i++ )); do
    new="${file%.md}$i.md"     # new filename with the serial number
    cp -p -- "$file" "$new"    # duplicate the file
    sed -i "s/_INCREMENT_/$i/g" "$new"
                               # replace "_INCREMENT_" with the serial number
done
Save the script above as a file named generate (or whatever), then run:
bash generate post.md 2
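If you don't need the argument checks, the same substitution can be done as a one-off loop at the prompt (a sketch reusing the post.md example above):
for i in 1 2; do
    sed "s/_INCREMENT_/$i/g" post.md > "post$i.md"
done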

Shell script that filters command output and saves it in Json formated list

I've never worked with shell scripts before, but I need to for my current task.
So I have to run a command that returns output like this:
awd54a7w6ds54awd47awd refs/heads/SomeInfo1
awdafawe23413f13a3r3r refs/heads/SomeInfo2
a8wd5a8w5da78d6asawd7 refs/heads/SomeInfo3
g9reh9wrg69egs7ef987e refs/heads/SomeInfo4
And I need to loop over every line of the output, take only the "SomeInfo" part, and write it to a file in a format like this:
["SomeInfo1","SomeInfo2","SomeInfo3"]
I've tried things like this:
for i in $(some command); do
echo $i | cut -f2 -d"heads/" >> text.txt
done
But I don't know how to format it into an array without using a temporary file.
Sorry if the question is dumb and probably too easy; I'm sure I could figure it out on my own, but I just don't have the time, because it's only an extra convenience feature that I personally want to implement.
Try this
#!/bin/bash
# json_encoder.sh
arr=()
while read -r line; do
    arr+=("\"$(basename "$line")\"")
done
printf "[%s]\n" "$(IFS=,; echo "${arr[*]}")"
And then invoke
./your_command | bash json_encoder.sh
PS. I personally do this kind of data massaging with Vim.
Using a Perl one-liner:
$ cat petar.txt
awd54a7w6ds54awd47awd refs/heads/SomeInfo1
awdafawe23413f13a3r3r refs/heads/SomeInfo2
a8wd5a8w5da78d6asawd7 refs/heads/SomeInfo3
g9reh9wrg69egs7ef987e refs/heads/SomeInfo4
$ perl -ne ' { /.*\/(.*)/ and push(@res,"\"$1\"") } END { print "[".join(",",@res)."]\n" }' petar.txt
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]
While you should rarely ever use a shell script to format JSON, in your case you are simply parsing the output into a comma-separated line with added end-caps of [...]. You can use bash parameter expansion to avoid spawning any additional subshells when obtaining the last field of each line, as follows:
#!/bin/bash

[ -z "$1" -o ! -r "$1" ] && {            ## validate file given as argument
    printf "error: file doesn't exist or not readable.\n" >&2
    exit 1
}

c=0                                      ## simple flag variable
while read -r line; do                   ## read each line
    if [ "$c" -eq '0' ]; then            ## is flag 0?
        printf "[\"%s\"" "${line##*/}"   ## output ["last"
    else                                 ## otherwise
        printf ",\"%s\"" "${line##*/}"   ## output ,"last"
    fi
    c=1                                  ## set flag 1
done < "$1"                              ## redirect the file argument to the loop
echo "]"                                 ## append closing ]
Example Use/Output
Using your given data as the input file, you would get the following:
$ bash script.sh file
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]
Look things over and let me know if you have any questions.
You can also use awk without any loops, I guess:
cat prev_output | awk -v ORS=',' -F'/' '{print "\042"$3"\042"}' | \
sed 's/^/[/g ; s/,$/]\n/g' > new_output
cat new_output
["SomeInfo1","SomeInfo2","SomeInfo3","SomeInfo4"]

Running diff and have it stop on a difference

I have a script running that checks multiple directories and compares them to expanded tarballs of the same directories elsewhere.
I am using diff -r -q, and what I would like is for diff to stop as soon as it finds any difference in the recursive run, instead of going through more directories in the same run.
All help appreciated!
Thank you
@bazzargh I did try it like you suggested, or like this.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]];
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
But this only works for files that exist in both directories. If one file is missing, I won't get information about that. Also, the directories I am working with contain over 300,000 files, so it seems to be a lot of overhead to run a find for each file and then diff.
I would like something like this to work, with an elif statement that checks whether $runid.tmp contains data and breaks if it does. I added 2> after the first if statement so stderr is sent to the $runid.tmp file.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]] 2> /tmp/$runid.tmp;
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
elif [[ -s /tmp/$runid.tmp ]];
then echo differs: $file >> /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
Would this work?
You can do the loop over files with 'find' and break when they differ, e.g. for dirs foo, bar:
for file in $(find foo -type f); do if [[ $(diff -q $file ${file/#foo/bar}) ]]; then echo differs: $file; break; else echo same: $file; fi; done
NB this will not detect if 'bar' has directories that do not exist in 'foo'.
Edited to add: I just realised I overlooked the really obvious solution:
diff -rq foo bar | head -n1
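If you also want an exit status to branch on, a sketch is to pipe into grep -q: grep -q exits as soon as it sees any line at all, and diff is then killed by SIGPIPE shortly after reporting the first difference:
if diff -rq foo bar | grep -q .; then
    echo "differs, stopping" >&2
    exit 1
fi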
It's not 'diff', but with 'awk' you can compare two files (or more) and then exit when they have a different line.
Try something like this (sorry, it's a little rough)
awk '{ h[$0] = ! h[$0] } END { for (k in h) if (h[k]) exit }' file1 file2
edit: to break out of the loop when two files have the same line, you may have to do the loop in awk.
You can try the following:
#!/usr/bin/env bash
# Determine directories to compare
d1='./someDir1'
d2='./someDir2'
# Loop over the file lists and diff corresponding files
while IFS= read -r line; do

    # Split the 3-column `comm` output into indiv. variables.
    lineNoTabs=${line//$'\t'}
    numTabs=$(( ${#line} - ${#lineNoTabs} ))
    d1Only='' d2Only='' common=''
    case $numTabs in
        0)
            d1Only=$lineNoTabs
            ;;
        1)
            d2Only=$lineNoTabs
            ;;
        *)
            common=$lineNoTabs
            ;;
    esac
    # If a file exists in both directories, compare them,
    # and exit if they differ; continue otherwise.
    if [[ -n $common ]]; then
        diff -q "$d1/$common" "$d2/$common" || {
            echo "EXITING: Diff found: '$common'" 1>&2;
            exit 1; }
    # Deal with files unique to either directory.
    elif [[ -n $d1Only ]]; then ## file only in $d1
        echo "File '$d1Only' only in '$d1'."
    else # implies: if [[ -n $d2Only ]]; then
        echo "File '$d2Only' only in '$d2'."
    fi
    # Note: The `comm` command below is CASE-SENSITIVE, which means:
    # - The input directories must be specified case-exact.
    #   To change that, add `I` after the last `|` in _both_ `sed` commands.
    # - The paths and names of the files diffed must match in case too.
    #   To change that, insert `| tr '[:upper:]' '[:lower:]'` before _both_
    #   `sort` commands.
done < <(comm \
    <(find "$d1" -type f | sed 's|'"$d1/"'||' | sort) \
    <(find "$d2" -type f | sed 's|'"$d2/"'||' | sort))
The approach is based on building a list of files (using find) containing relative paths (using sed to remove the root path) for each input directory, sorting the lists, and comparing them with comm, which produces 3-column, tab-separated output to indicate which lines (and therefore files) are unique to the first list, which are unique to the second list, and which lines they have in common.
Thus, the values in the 3rd column can be diffed and action taken if they're not identical.
Also, the 1st and 2nd-column values can be used to take action based on unique files.
The somewhat complicated splitting of the 3 column values output by comm into individual variables is necessary, because:
read will treat multiple tabs in sequence as a single separator
comm outputs a variable number of tabs; e.g., if there's only a 1st-column value, no tab is output at all.
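A tiny demo of that column layout, with two hypothetical sorted lists:
comm <(printf 'a\nb\nc\n') <(printf 'b\nc\nd\n')
# Output (tabs shown as column offsets):
# a            <- column 1, no tab: only in the first list
#         b    <- column 3, two tabs: common to both lists
#         c
#     d        <- column 2, one tab: only in the second list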
I got a solution to this thanks to @bazzargh.
I use this code in my script and now it works perfectly.
for file in $(find ${intfolder} -type f);
do if [[ $(diff -q $file ${file/#${intfolder}/${EXPANDEDROOT}/${runid}/$(basename ${intfolder})}) ]] 2> ${resultfile}.tmp;
then echo differs: $file > ${resultfile}.tmp 2>&1; break;
elif [[ -s ${resultfile}.tmp ]];
then echo differs: $file >> ${resultfile}.tmp 2>&1; break;
else echo same: $file > /dev/null;
fi; done
thanks!

bash script to select multiple file formats at once for encode process

I have a BASH script I downloaded (that works) to do some video conversions using HandBrakeCLI, but at the moment it only allows conversion from a single input format at a time (avi to mkv only). I would like to be able to have it convert any type of input file (avi, wmv, flv, ... to mkv) all at once instead of changing the script each time for each input format. How can I adjust the bash script to allow this?
So how can I get this bash script with the line input_file_type="avi" to work with input_file_type="avi,wmv,flv,mp4"?
PS: I tried posting the bash script but the formatting got all messed up; if someone knows how to post a bash script to the forum with correct formatting, let me know and I'll post it here instead of the link below.
http://pastebin.com/hzyxnsYY
Pasting the code here:
#!/bin/sh
###############################################################################
#
# Script to recursively search a directory and batch convert all files of a given
# file type into another file type via HandBrake conversion.
#
# To run in your environment set the variables:
# hbcli - Path to your HandBrakeCLI
#
# source_dir - Starting directory for recursive search
#
# input_file_type - Input file type to search for
#
# output_file_type - Output file type to convert into
#
#
# Change log:
# 2014-01-27: Initial release. Tested on ubuntu 13.10.
#
###############################################################################
hbcli=HandBrakeCLI
source_dir="/media/rt/1tera_ext/1_Video_Stuff/1 Documentary"
#source_dir="/media/rt/1tera_ext/1_Video_Stuff/1 Nova and bbc/Carbon diamonds"
input_file_type="avi"
output_file_type="mkv"
echo "# Using HandBrakeCLI at "$hbcli
echo "# Using source directory " "$source_dir"
echo "# Converting "$input_file_type" to "$output_file_type
# Convert from one file to another
convert() {
# The beginning part, echo "" | , is really important. Without that, HandBrake exits the while loop.
#echo "" | $hbcli -i "$1" -o "$2" --preset="Universal";
echo "" | $hbcli -i "$1" -t 1 --angle 1 -c 1 -o "$2" -f mkv --decomb --loose-anamorphic --modulus 2 -e x264 -q 20 --cfr -a 1,1 -E faac,copy:ac3 -6 dpl2,auto -R Auto,Auto -B 160,0 -D 0,0 --gain 0,0 --audio-fallback ffac3 --x264-profile=high --h264-level="4.1" --verbose=1
#echo "" | $hbcli -i "$1" -t 1 --angle 1 -c 1 -o "$2" -f mkv --decomb -w 640 --loose-anamorphic --modulus 2 -e x264 -q 20 --cfr -a 1,1 -E faac,copy:ac3 -6 dpl2,auto -R Auto,Auto -B 160,0 -D 0,0 --gain 0,0 --audio-fallback ffac3 --x264-profile=high --h264-level="4.1" --verbose=1
}
# Find the files and pipe the results into the read command. The read command properly handles spaces in directories and files names.
find "$source_dir" -name *.$input_file_type | while read in_file
do
echo "Processing…"
echo ">Input "$in_file
# Replace the file type
out_file=$(echo $in_file|sed "s/\(.*\.\)$input_file_type/\1$output_file_type/g")
echo ">Output "$out_file
# Convert the file
convert "$in_file" "$out_file"
if [ $? != 0 ]
then
echo "$in_file had problems" >> handbrake-errors.log
fi
echo ">Finished "$out_file "\n\n"
done
echo "DONE CONVERTING FILES"
Assuming the conversion command is the same for all file types, you can use a single find like this:
find "$source_dir" -type f -regex ".*\.\(avi\|wmv\|flv\|mp4\)" -print0 | while IFS= read -r -d $'\0' in_file
do
done
Alternatively, create an array of file types that you are interested in and loop over them:
input_file_types=(avi wmv flv mp4)
# loop over the types and convert
for input_file_type in "${input_file_types[@]}"
do
    find "$source_dir" -name "*.$input_file_type" -print0 | while IFS= read -r -d $'\0' in_file
    do
        # process "$in_file" here, e.g. call convert
    done
done
In order to correctly handle filenames containing whitespace and newline characters, you should use null delimited output. That's what the -print0 and read -d $'\0' is for.
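A quick demo of why this matters, with a hypothetical directory containing awkward names:
mkdir -p demo && touch "demo/has space.avi" $'demo/has\nnewline.avi'
find demo -name '*.avi' -print0 | while IFS= read -r -d $'\0' f
do
    printf 'found: %q\n' "$f"    # %q makes the embedded newline visible
done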
You can use the OR (-o) operator in the find line.
E.g.
find "$source_dir" -name *.avi -o -name *wmv -o -name *.flv -o -name *.mp4 | while read in_file
If run by bash:
#!/usr/bin/bash
input_file_type="avi|wmv|flv|mp4"
find "$source_dir" -type f|egrep "$input_file_type" | while read in_file
do
echo "Processing…"
echo ">Input "$in_file
# Replace the file type
out_file=${in_file%.*}.${output_file_type}   # replace the extension via parameter expansion
echo ">Output "$out_file
# Convert the file
convert "$in_file" "$out_file"
if [ $? != 0 ]
then
echo "$in_file had problems" >> handbrake-errors.log
fi
echo ">Finished "$out_file "\n\n"
done
Here's the final code, which may help someone else. Sorry about the link; I didn't know how to paste a bash script in here without the formatting getting all messed up.
#!/bin/bash
###############################################################################
#execute using bash mkvconv.sh
# Script to recursively search a directory and batch convert all files of a given
# file type into another file type via HandBrake conversion.
#
# To run in your environment set the variables:
# hbcli - Path to your HandBrakeCLI
#
# source_dir - Starting directory for recursive search
#
# input_file_types - Input file types to search for
#
# output_file_type - Output file type to convert into
#
#
# Change log:
# 2014-01-27: Initial release. Tested on ubuntu 13.10.
#http://stackoverflow.com/questions/21404059/bash-script-to-select-multiple-file-formats-at-once-for-encode-process/21404530#21404530
###############################################################################
hbcli=HandBrakeCLI
source_dir="/media/rt/1tera_ext/1_Video_Stuff/1 Nova and bbc/Carbon diamonds"
input_file_types=(avi wmv flv mp4 webm mov mpg)
output_file_type="mkv"
echo "# Using HandBrakeCLI at "$hbcli
echo "# Using source directory " "$source_dir"
echo "# Converting "$input_file_types" to "$output_file_type
# Convert from one file to another
convert() {
# The beginning part, echo "" | , is really important. Without that, HandBrake exits the while loop.
#echo "" | $hbcli -i "$1" -o "$2" --preset="Universal"; # dont use with preses things are left out
echo "" | $hbcli -i "$1" -t 1 --angle 1 -c 1 -o "$2" -f mkv --decomb --loose-anamorphic --modulus 2 -e x264 -q 20 --cfr -a 1,1 -E faac,copy:ac3 -6 dpl2,auto -R Auto,Auto -B 160,0 -D 0,0 --gain 0,0 --audio-fallback ffac3 --x264-profile=high --h264-level="4.1" --verbose=1
}
# loop over the types and convert
for input_file_type in "${input_file_types[@]}"
do
# Find the files and pipe the results into the read command. The read command properly handles spaces in directories and files names.
#find "$source_dir" -name *.$input_file_type | while read in_file
find "$source_dir" -name "*.$input_file_types" -print0 | while IFS= read -r -d $'\0' in_file
#In order to correctly handle filenames containing whitespace and newline characters, you should use null delimited output. That's what the -print0 and read -d $'\0' is for.
do
echo "Processing…"
echo ">Input "$in_file
# Replace the file type
out_file=$(echo $in_file|sed "s/\(.*\.\)$input_file_type/\1$output_file_type/g")
echo ">Output "$out_file
# Convert the file
convert "$in_file" "$out_file"
if [ $? != 0 ]
then
echo "$in_file had problems" >> handbrake-errors.log
fi
echo ">Finished "$out_file "\n\n"
done
done
echo "DONE CONVERTING FILES"

Create new file but add number if filename already exists in bash

I found similar questions, but not for Linux/Bash.
I want my script to create a file with a given name (via user input) but add a number at the end if the filename already exists.
Example:
$ create somefile
Created "somefile.ext"
$ create somefile
Created "somefile-2.ext"
The following script can help you. You should not run several copies of the script at the same time, to avoid a race condition.
name=somefile
if [[ -e $name.ext || -L $name.ext ]] ; then
    i=0
    while [[ -e $name-$i.ext || -L $name-$i.ext ]] ; do
        let i++
    done
    name=$name-$i
fi
touch -- "$name".ext
Easier:
touch file`ls file* | wc -l`.ext
You'll get:
$ ls file*
file0.ext file1.ext file2.ext file3.ext file4.ext file5.ext file6.ext
To avoid the race conditions:
name=some-file
n=
set -o noclobber
until
    file=$name${n:+-$n}.ext
    { command exec 3> "$file"; } 2> /dev/null
do
    ((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
And in addition, you have the file open for writing on fd 3.
With bash-4.4+, you can make it a function like:
create() { # fd base [suffix [max]]
    local fd="$1" base="$2" suffix="${3-}" max="${4-}"
    local n= file
    local - # ash-style local scoping of options in 4.4+
    set -o noclobber
    REPLY=
    until
        file=$base${n:+-$n}$suffix
        eval 'command exec '"$fd"'> "$file"' 2> /dev/null
    do
        ((n++))
        ((max > 0 && n > max)) && return 1
    done
    REPLY=$file
}
To be used for instance as:
create 3 somefile .ext || exit
printf 'File: "%s"\n' "$REPLY"
echo something >&3
exec 3>&- # close the file
The max value can be used to guard against infinite loops when the files can't be created for reasons other than noclobber.
Note that noclobber only applies to the > operator, not >> nor <>.
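A minimal demonstration of that behaviour (f.txt is an assumed scratch name):
set -o noclobber
echo one > f.txt     # succeeds: the file did not exist yet
echo two > f.txt     # fails with "cannot overwrite existing file"
echo two >> f.txt    # succeeds: noclobber does not guard '>>'
echo two >| f.txt    # succeeds: '>|' explicitly overrides noclobber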
Remaining race condition
Actually, noclobber does not remove the race condition in all cases. It only prevents clobbering regular files (not other types of files, so that cmd > /dev/null for instance doesn't fail) and has a race condition itself in most shells.
The shell first does a stat(2) on the file to check if it's a regular file or not (fifo, directory, device...). Only if the file doesn't exist (yet) or is a regular file does 3> "$file" use the O_EXCL flag to guarantee not clobbering the file.
So if there's a fifo or device file by that name, it will be used (provided it can be opened write-only), and a regular file may still be clobbered if it gets created as a replacement for a fifo/device/directory... in between that stat(2) and open(2) without O_EXCL!
Changing the
{ command exec 3> "$file"; } 2> /dev/null
to
[ ! -e "$file" ] && { command exec 3> "$file"; } 2> /dev/null
would avoid using an already existing non-regular file, but would not address the race condition.
Now, that's only really a concern in the face of a malicious adversary that would want to make you overwrite an arbitrary file on the file system. It does remove the race condition in the normal case of two instances of the same script running at the same time. So, in that, it's better than approaches that only check for file existence beforehand with [ -e "$file" ].
For a working version without race condition at all, you could use the zsh shell instead of bash which has a raw interface to open() as the sysopen builtin in the zsh/system module:
zmodload zsh/system
name=some-file
n=
until
    file=$name${n:+-$n}.ext
    sysopen -w -o excl -u 3 -- "$file" 2> /dev/null
do
    ((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
Try something like this
name=somefile
path=$(dirname "$name")
filename=$(basename "$name")
extension="${filename##*.}"
filename="${filename%.*}"
if [[ -e $path/$filename.$extension ]] ; then
    i=2
    while [[ -e $path/$filename-$i.$extension ]] ; do
        let i++
    done
    filename=$filename-$i
fi
target=$path/$filename.$extension
Use touch or whatever you want instead of echo:
echo file$((`ls file* | sed -n 's/file\([0-9]*\).*/\1/p' | sort -rh | head -n 1`+1))
Parts of the expression explained:
list files by pattern: ls file*
take only the number part of each line: sed -n 's/file\([0-9]*\).*/\1/p'
apply reverse human-numeric sort: sort -rh
take only the first line (i.e. the max value): head -n 1
combine it all in a pipe and increment (full expression above)
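A worked example with hypothetical files:
touch file1.ext file2.ext file5.ext
echo file$((`ls file* | sed -n 's/file\([0-9]*\).*/\1/p' | sort -rh | head -n 1`+1))
# prints file6: 5 is the current maximum index, so the next one is 6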
Try something like this (untested, but you get the idea):
filename=$1

# If the file doesn't exist, create it
if [[ ! -f $filename ]]; then
    touch "$filename"
    echo "Created \"$filename\""
    exit 0
fi

# If the file already exists, find a similar filename that is not yet taken
digit=1
while true; do
    temp_name=$filename-$digit
    if [[ ! -f $temp_name ]]; then
        touch "$temp_name"
        echo "Created \"$temp_name\""
        exit 0
    fi
    digit=$(($digit + 1))
done
Depending on what you're doing, replace the calls to touch with whatever code is needed to create the files that you are working with.
This is a much better method I've used for creating directories incrementally.
It could be adjusted for filenames too.
LAST_SOLUTION=$(echo $(ls -d SOLUTION_[[:digit:]][[:digit:]][[:digit:]][[:digit:]] 2> /dev/null) | awk '{ print $(NF) }')
if [ -n "$LAST_SOLUTION" ] ; then
    mkdir SOLUTION_$(printf "%04d\n" $(expr ${LAST_SOLUTION: -4} + 1))
else
    mkdir SOLUTION_0001
fi
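Step by step, assuming SOLUTION_0007 is the newest existing directory:
LAST_SOLUTION=SOLUTION_0007
echo "${LAST_SOLUTION: -4}"      # 0007, the last four characters
expr ${LAST_SOLUTION: -4} + 1    # 8; expr reads 0007 as plain decimal, whereas
                                 # bash's $((...)) would parse the leading zeros
                                 # as octal and choke on 0008/0009
printf "SOLUTION_%04d\n" $(expr ${LAST_SOLUTION: -4} + 1)   # SOLUTION_0008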
A simple repackaging of choroba's answer as a generalized function:
autoincr() {
    f="$1"
    ext=""
    # Extract the file extension (if any), with preceding '.'
    [[ "$f" == *.* ]] && ext=".${f##*.}"
    if [[ -e "$f" ]] ; then
        i=1
        f="${f%.*}"
        while [[ -e "${f}_${i}${ext}" ]]; do
            let i++
        done
        f="${f}_${i}${ext}"
    fi
    echo "$f"
}
touch "$(autoincr "somefile.ext")"
Without looping, and without using regexes or expr:
last=$(ls $1* | tail -n1)
last_wo_ext=$(basename "$last" .ext)
n=$(echo "$last_wo_ext" | rev | cut -d - -f 1 | rev)
if [ "x$n" = "x$last_wo_ext" ]; then   # no "-n" suffix yet
    n=2
else
    n=$((n + 1))
fi
echo "$1-$n.ext"
Simpler still, without extension handling, and with the quirk that the first numbered copy gets "-1":
n=$(ls $1* | tail -n1 | rev | cut -d - -f 1 | rev)
n=$((n + 1))
echo $1-$n.ext
