Linux: append all filenames in path to text file

I want to add the filenames of all files of a certain type (*.cub) in a directory to a text file in the same directory. This file will become the batch (.submit) file that I can run overnight. I also need to adapt the names a bit.
I do not really know how to describe it better, so I'll give an example:
Let's say I have three files: 001.cub, 002.cub & 003.cub
Then the final text file must be:
[program] -i 001.cub -o 001.vdb
[program] -i 002.cub -o 002.vdb
[program] -i 003.cub -o 003.vdb
It seems a fairly easy operation, but I simply can't get it right.
Also, it really has to become a .submit (or at least some text) file. I cannot run the program immediately.
I hope someone can help!

A simple for loop will do the job:
for i in *.cub
do
    b=$(basename "$i" .cub)
    echo "program -i \"$b.cub\" -o \"$b.vdb\""
done >output.txt
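
The same loop can be written with shell parameter expansion instead of basename, writing straight to a .submit file. This is only a minimal sketch: the function name make_submit, the output file name, and the placeholder command name "program" are assumptions, not part of the original answer.

```shell
# Hypothetical helper: write one 'program -i X.cub -o X.vdb' line per .cub
# file found in directory $1 into the file $2. "program" is a placeholder.
make_submit() {
    dir=$1
    out=$2
    : > "$out"                      # start with an empty output file
    for f in "$dir"/*.cub; do
        [ -e "$f" ] || continue     # the glob matched nothing
        name=${f##*/}               # strip the directory part
        base=${name%.cub}           # strip the .cub suffix
        printf 'program -i "%s" -o "%s.vdb"\n' "$name" "$base" >> "$out"
    done
}
```

Calling it as make_submit . jobs.submit would list the .cub files of the current directory. The [ -e "$f" ] check keeps the loop from emitting a bogus line for the literal pattern when no .cub file exists.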

1. Create an empty sh file.
2. List the *.cub files and loop through them.
3. Store the sequence by splitting the filename on the dot (.).
4. echo the required string and append it to the sh file from step 1.
echo -n "" > 'Run.sh'
for filename in *.cub
do
    sequence=$(echo "$filename" | cut -d "." -f1)
    echo "Program -i $filename -o $sequence.vdb" >> Run.sh
done
Alternatively, redirect the output of the whole loop into the file:
for filename in *.cub
do
    sequence=$(echo "$filename" | cut -d "." -f1)
    echo "Program -i $filename -o $sequence.vdb"
done > Run.sh
To keep everything before the extension in the variable:
for filename in *.cub
do
    sequence=$(echo "$filename" | rev | cut -d "." -f2- | rev)
    echo "Program -i $filename -o $sequence.vdb"
done > Run.sh
To extract only the digits from the filename and use them:
for filename in *.cub
do
    sequence=$(echo "$filename" | sed 's/[^0-9]*//g')
    echo "Program -i $filename -o $sequence.vdb"
done > Run.sh
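
The three extraction variants above differ only when a name contains more than one dot or non-digit characters. A small sketch of the difference, using shell parameter expansion and tr in place of cut, rev and sed (the file name run.001.cub is a made-up example):

```shell
filename="run.001.cub"   # hypothetical name with more than one dot

first_field=${filename%%.*}                        # like cut -d "." -f1
stem=${filename%.*}                                # like rev | cut -d "." -f2- | rev
digits=$(printf '%s' "$filename" | tr -cd '0-9')   # like sed 's/[^0-9]*//g'

# Prints "run", "run.001" and "001", one per line.
printf '%s\n' "$first_field" "$stem" "$digits"
```

For names such as 001.cub all three give the same result, so the choice only matters for less regular file names.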

This one-liner will do what you want:
ls *.cub | sort | awk '{split($1,x,"."); print "[program] -i "$1" -o "x[1]".vdb "}' > something.sh

Related

Copy files containing a word and not containing other. / grep not working with for loop

I am new to Linux and got stuck when I tried to use piped grep or find commands. I need to find a file that:
has a name matching the pattern request_q*_t*.xml
contains "Phrase 1"
does not contain "word 2"
and copy it to a specific location.
I tried piping grep commands to locate the file and then copy it.
for filename in $(grep --include=request_q*_t*.xml -li '"phrase 1"' $d/ | xargs grep -L '"word 2"')
do
echo "coping file: '$filename'"
cp $filename $outputpath
filefound=true
done
When I tried this grep command on the command line, it worked fine:
grep --include=request_q*_t*.xml -li '"phrase 1"' $d/ | xargs grep -L '"word 2"'
but I am getting an error in the for loop. For some reason the output of the grep command is
(Standard Input)
(Standard Input)
(Standard Input)
(Standard Input)
I am not sure what I am doing wrong.
What is an efficient way to do this? It's a huge filesystem that I have to search.
find . -name "request_q*_t*.xml" -exec sh -c 'if grep -q "phrase 1" "$1" && ! grep -q "word 2" "$1"; then cp "$1" /path/to/somewhere/; fi' sh {} \;
You can use AWK for this in combination with xargs. The catch is that each file has to be read completely to be sure it does not contain the excluded string, but you can stop reading a file early as soon as that string is found:
awk '(FNR==1){if(a) print fname; fname=FILENAME; a=0}
/Phrase 1/{a=1}
/Word 2/{a=0;nextfile}
END{if(a) print fname}' request_q*_t*.xml \
| xargs -I{} cp "{}" "$outputpath"
If you want to store "Phrase 1" and "Word 2" in variables, you can use:
awk -v include="Phrase 1" -v exclude="Word 2" \
'(FNR==1){if(a) print fname; fname=FILENAME; a=0}
($0~include){a=1}
($0~exclude){a=0;nextfile}
END{if(a) print fname}' request_q*_t*.xml \
| xargs -I{} cp "{}" "$outputpath"
You can nest the $() constructs:
for filename in $( grep -L '"word 2"' $(grep --include=request_q*_t*.xml -li '"phrase 1"' $d/ ))
do
echo "coping file: '$filename'"
cp $filename $outputpath
filefound=true
done
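
For a large tree, the find-based approach can avoid starting one shell per file by handing batches of files to a single sh with -exec ... {} +. The following is a sketch under assumptions: the function name copy_matching and its four parameters are hypothetical, and the include pattern is matched case-insensitively as in the question's grep -li.

```shell
# Hypothetical helper: copy every request_q*_t*.xml under $1 into $2 if it
# contains the pattern $3 (case-insensitive) and does not contain $4.
copy_matching() {
    src=$1; dest=$2; include=$3; exclude=$4
    find "$src" -type f -name 'request_q*_t*.xml' -exec sh -c '
        dest=$1; include=$2; exclude=$3
        shift 3
        for f in "$@"; do
            # grep -q stops reading at the first match
            if grep -qi -e "$include" "$f" && ! grep -q -e "$exclude" "$f"; then
                cp "$f" "$dest"
            fi
        done
    ' sh "$dest" "$include" "$exclude" {} +
}
```

Because the file names are passed as arguments and never go through word splitting, paths containing spaces are handled, which the for filename in $(...) loops are not able to do.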

shell one-liner to cat a file only if it is an ASCII file

Is it possible to write a one-liner to cat a file only if it is a text file, and not if it is a binary?
Something like:
echo "/root/mydir/foo" | <if file is ASCII then cat "/root/mydir/foo"; else echo "file is a binary">
You can use the output of the file command with the --mime and -b options.
$ file -b --mime filename.bin
application/octet-stream; charset=binary
The -b option suppresses the filename from being printed in the output so you don't have to worry about false matching the filename and --mime will give you the character set.
You can use grep to test for the occurrence of charset=binary
$ file -b --mime filename.bin | grep -q "charset=binary"
You can then use the exit status of grep and the &&, || operators to cat the file or echo a message.
$ echo filename | xargs -I% bash -c 'file -b --mime % | grep -q "charset=binary" && echo "binary file" || cat %'
Finally xargs is used to plug in the filename from the previous command echo filename and replace the symbol % in our binary testing command.
filename=$(echo "/root/mydir/foo")
if file "$filename" | grep -q "ASCII text"; then cat "$filename"; else echo "file is a binary"; fi
But why does it have to be on one line? It's much more readable if you spread it out:
filename=$(echo "/root/mydir/foo")
if file "$filename" | grep -q "ASCII text"
then cat "$filename"
else echo "file is a binary"
fi
With $filename being your file,
grep -qP '[\x80-\xFF]' "$filename" && echo "file is a binary" || cat "$filename"
will be tantamount to cat if and only if $filename does not contain any non-ASCII character.
Note that GNU grep is required, since the one-liner requires Perl-style pattern matching capabilities (-P).
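
Where grep -P is not available, the same byte-range test can be sketched with tr and wc. The function name cat_if_ascii is hypothetical. Like the grep version, it only checks for bytes outside the 7-bit range, so a file made of NUL bytes would still count as ASCII here.

```shell
# Hypothetical helper: cat the file $1 only if every byte is 7-bit ASCII.
cat_if_ascii() {
    # Delete all bytes in the range 000-177 (octal) and count what is left.
    non_ascii=$(( $(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c) ))
    if [ "$non_ascii" -eq 0 ]; then
        cat "$1"
    else
        echo "file is a binary"
    fi
}
```

Usage would be cat_if_ascii /root/mydir/foo. LC_ALL=C makes tr treat the input as plain bytes rather than multibyte characters.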
Not exactly what the op is asking, but strings is dead simple:
> strings --help
Usage: strings [option(s)] [file(s)]
Display printable strings in [file(s)] (stdin by default)
strings /root/mydir/foo

Creating a directory name based on a file name

In my script I am taking a text file and splitting it into sections. Before doing any splitting, I reformat the name of the text file. The problem is creating a folder/directory named after the formatted file name, which is where the segments are placed. The script breaks when the text file's name has spaces in it, which is exactly why I am trying to reformat the name first and then do the rest of the operations. How can I do things in that sequence?
execute script: text_split.sh -s "my File .txt" -c 2
text_split.sh
# remove whitespace and format file name
FILE_PATH="/archive/"
find $FILE_PATH -type f -exec bash -c 'mv "$1" "$(echo "$1" \
| sed -re '\''s/^([^-]*)-\s*([^\.]*)/\L\1\E-\2/'\'' -e '\''s/ /_/g'\'' -e '\''s/_-/-/g'\'')"' - {} \;
sleep 1
# arg1: path to input file / source
# create directory
function fallback_out_file_format() {
__FILE_NAME=`rev <<< "$1" | cut -d"." -f2- | rev`
__FILE_EXT=`rev <<< "$1" | cut -d"." -f1 | rev`
mkdir -p $FILE_PATH${__FILE_NAME};
__OUT_FILE_FORMAT="$FILE_PATH${__FILE_NAME}"/"${__FILE_NAME}-part-%03d.${__FILE_EXT}"
echo $__OUT_FILE_FORMAT
exit 1
}
# Set variables and default values
OUT_FILE_FORMAT=''
# Grab input arguments
while getopts "s:c:" OPTION
do
case $OPTION in
s) SOURCE=$(echo "$OPTARG" | sed 's/ /\\ /g' ) ;;
c) CHUNK_LEN="$OPTARG" ;;
?) usage
exit 1
;;
esac
done
if [ -z "$OUT_FILE_FORMAT" ] ; then
OUT_FILE_FORMAT=$(fallback_out_file_format $SOURCE)
fi
Your script takes a filename argument, specified with -s, then modifies a hard-coded directory by renaming the files it contains, then uses the initial filename to generate an output directory and filename. It definitely sounds like the workflow should be adjusted. For instance, instead of trying to correct all the bad filenames in /archive/, just fix the name of the file specified with -s.
To get filename and extension, use bash's string manipulation ability, as shown in this question:
filename="${fullfile##*/}"
extension="${filename##*.}"
name="${filename%.*}"
You can trim whitespace from the input string using tr -d ' '.
You can then join this to your FILE_PATH variable with something like this:
FILE_NAME=$(echo $1 | tr -d ' ')
FILE_PATH="/archive/"
FILE_PATH=$FILE_PATH$FILE_NAME
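
Putting the two suggestions together, the name of the -s argument can be cleaned first and the output pattern derived from the cleaned name. This is a sketch only: the function name out_file_format and its two parameters are assumptions, and it prints the pattern without creating any directory.

```shell
# Hypothetical helper: print the output pattern for source file $2 under
# base directory $1, after removing spaces from the file name.
out_file_format() {
    base_dir=$1
    fullfile=$2
    filename=${fullfile##*/}                       # drop leading directories
    clean=$(printf '%s' "$filename" | tr -d ' ')   # remove all spaces
    name=${clean%.*}                               # part before the last dot
    ext=${clean##*.}                               # part after the last dot
    printf '%s/%s/%s-part-%%03d.%s\n' "$base_dir" "$name" "$name" "$ext"
}

# Prints /archive/myFile/myFile-part-%03d.txt
out_file_format /archive "my File .txt"
```

The argument is quoted at every step, so the spaces in the original name never reach mkdir or the splitting command unescaped.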
You can escape the space using a backslash (\).
The user may not always provide the backslash, so the script can use sed to convert every space to a backslash followed by a space:
sed 's/ /\\ /g'
You can then obtain the new directory name as:
dir_name=$(echo "$1" | sed 's/ /\\ /g')

Bash substring from position not printing

I am using the following format ${string:start:length} to extract the file name from wget's .listing file, line by line.
The format for the file is something I think we are all familiar with:
04-30-13 01:41AM 7033614 some_archive.zip
04-29-13 08:13PM <DIR> DIRECTORY NAME 1
04-29-13 05:41PM <DIR> DIRECTORY NAME 2
All file names start at pos:40, so setting :start to 39, with no :length should (and does) return the file name for each line:
#!/bin/bash
cat .listing | while read line; do
file="${line:39}"
echo $file
done
Correctly returns:
some_archive.zip
DIRECTORY NAME 1
DIRECTORY NAME 2
However, if I get any more creative, it breaks:
#!/bin/bash
cat .listing | while read line; do
file="${line:39}"
dir=$(echo $line | egrep -o '<DIR>' | head -n1)
if [ $dir ]; then
echo "the file $file is a $dir"
fi
done
Returns:
$ ./test.sh
is a <DIR>ECTORY NAME 1
is a <DIR>ECTORY NAME 2
What gives? I lose "the file " and the rest of the test looks like it prints on top of "the file DIRECTORY NAME 1" from pos:0.
It's weird, what's it on account of?
The answer, as I am learning more and more as I progress with Linux, is non-printing control characters.
Adding a pipe to egrep to keep only printing characters solved the problem:
#!/bin/bash
cat .listing | while read line; do
file=$(echo ${line:39} | egrep -o '[[:print:]]+' | head -n1)
dir=$(echo $line | egrep -o '<DIR>' | head -n1)
if [ $dir ]; then
echo "the file $file is a $dir"
fi
done
Correctly returns:
$ ./test.sh
the file DIRECTORY NAME 1 is a <DIR>
the file DIRECTORY NAME 2 is a <DIR>
I wish there were a better way to visualize these control characters, but what the above does is basically take the string segment, pull out the first run of printable characters, and assign it to the variable.
I assume there is a control character at the end of the line that returns the cursor to the beginning of the line, causing the rest of the echo to be printed there and overwrite the previous characters.
Odd.
You can remove the \r control characters from the whole file by using the tr command on the first line of your script:
#!/bin/bash
cat .listing | tr -d '\015' | while read line; do
file="${line:39}"
dir=$(echo $line | egrep -o '<DIR>' | head -n1)
if [ $dir ]; then
echo "the file $file is a $dir"
fi
done
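
If only the trailing carriage return of each line needs to go, it can also be stripped per line with parameter expansion instead of filtering the whole file. A minimal sketch, where the function name strip_cr is hypothetical:

```shell
# Hypothetical helper: print $1 without one trailing carriage return.
strip_cr() {
    cr=$(printf '\r')              # a literal CR character
    printf '%s\n' "${1%"$cr"}"     # remove it from the end, if present
}

# Example: a DOS-style line is cleaned before it is used in a message.
line=$(printf 'DIRECTORY NAME 1\r')
clean=$(strip_cr "$line")
echo "the file $clean is a <DIR>"
```

To see such characters in the first place, cat -v .listing shows each carriage return as ^M, and od -c .listing shows it as \r.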

File that autoruns itself

The same way it’s possible to write a file that autoextracts itself, I’m looking for a way to autorun a program within a script (or whatever it needs). I want the program part of the script, because I just want one file. It’s actually a challenge: I have a xz compressed program, and I wanna be able to run it without any intervention of the xz program by the user (just a ./theprogram).
Any idea?
Autorun after doing what? Login? Call it in ~/.bashrc. During boot? Write an appropriate /etc/init.d/yourprog and link it to the desired runlevel. Selfextract? Make it a shell archive (shar file). See the shar utility, http://linux.die.net/man/1/shar
Sorry but I was just thinking... Something like this would not work?
(I am assuming it is a script...)
#!/bin/bash
cat << 'EOF' > yourfile
yourscript
EOF
chmod +x yourfile
./yourfile
Still, it's pretty hard to understand exactly what you are trying to do... it seems to me that the "autorun" is pretty similar to a "call the program from within the script"..
I had written a script for this. This should help:
#!/bin/bash
set -e
payload=$(cat $0 | grep --binary-files=text -n ^PAYLOAD: | cut -d: -f1 )
filaname=`head $0 -n $payload | tail -n 1 | cut -d: -f2-`
tail -n +$(( $payload + 1 )) $0 > /tmp/$filaname
set +e
#Do whatever with the payload
exit 0
#Command to add payload:
#read x; ls $x && ( cp 'binary_script.sh' ${x}_binary_script.sh; echo $x >> ${x}_binary_script.sh; cat $x >> ${x}_binary_script.sh )
#Note: there must be NO character after "PAYLOAD:", not even a newline.
PAYLOAD:
Sample usage:
Suppose myNestedScript.sh contains the following:
#!/bin/bash
echo hello world
Then run
x=myNestedScript.sh; ls $x && ( cp 'binary_script.sh' ${x}_binary_script.sh; echo $x >> ${x}_binary_script.sh; cat $x >> ${x}_binary_script.sh )
It will generate the file below, which you can execute directly. When executed, it extracts myNestedScript.sh to /tmp and runs that script.
#!/bin/bash
set -e
payload=$(cat $0 | grep --binary-files=text -n ^PAYLOAD: | cut -d: -f1 )
filaname=`head $0 -n $payload | tail -n 1 | cut -d: -f2-`
tail -n +$(( $payload + 1 )) $0 > /tmp/$filaname
set +e
chmod 755 /tmp/$filaname
/tmp/$filaname
exit 0
PAYLOAD:myNestedScript.sh
#!/bin/bash
echo hello world
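
A variation on the same payload technique is sketched below, with a builder function and a temporary file from mktemp instead of a fixed name in /tmp. The names make_selfrun and __PAYLOAD__ are made up for this example, and grep -a (treat binary input as text) is assumed to be available, as it is in GNU grep.

```shell
# Hypothetical builder: wrap the executable file $1 into a single
# self-running script written to $2.
make_selfrun() {
    payload=$1
    out=$2
    cat > "$out" << 'STUB'
#!/bin/sh
# Find the marker line, then extract everything after it and run it.
marker=$(grep -a -n '^__PAYLOAD__$' "$0" | head -n 1 | cut -d: -f1)
tmp=$(mktemp) || exit 1
tail -n +"$((marker + 1))" "$0" > "$tmp"
chmod +x "$tmp"
"$tmp" "$@"
status=$?
rm -f "$tmp"
exit "$status"
__PAYLOAD__
STUB
    cat "$payload" >> "$out"    # append the payload after the marker line
    chmod +x "$out"
}
```

After make_selfrun myNestedScript.sh theprogram, running ./theprogram extracts and runs the embedded script. For an xz-compressed payload, the output of tail could be piped through xz -dc before it is written to the temporary file, although xz would then still have to be installed on the machine.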
