I want to have a command in a variable that runs a program and specifies the output filename for it depending on the number of files that exist (so that it works on a new file each time).
Here is what I have:
export MY_COMMAND="myprogram -o ./dir/outfile-0.txt"
However, I would like the outfile number to increase each time MY_COMMAND is executed. You may assume that myprogram creates the file before the next call, so the number can be derived from the number of files that exist in the directory ./dir/. I cannot change myprogram itself or the way MY_COMMAND is used.
Thanks in advance.
Given that you can't change myprogram (its -o option will always write to the file given on the command line), and assuming that something else outside your control runs MY_COMMAND so that you can't change how it gets called, you still have control over what MY_COMMAND contains.
For the rest of this answer I'm going to change the name MY_COMMAND to callprog mostly because it's easier to type.
You can define callprog as a variable as in your example export callprog="myprogram -o ./dir/outfile-0.txt", but you could instead write a shell script and name that callprog, and a shell script can do pretty much anything you want.
So, you have a directory full of outfile-<num>.txt files and you want to output to the next non-colliding outfile-<num+1>.txt.
Your shell script can get the numbers by listing the files, cutting out only the numbers, sorting them, and then taking the highest number.
If we have these files in dir:
outfile-0.txt
outfile-1.txt
outfile-5.txt
outfile-10.txt
ls -1 ./dir/outfile*.txt produces the list
./dir/outfile-0.txt
./dir/outfile-1.txt
./dir/outfile-10.txt
./dir/outfile-5.txt
(using outfile and .txt in the pattern means this will work even if there are other files not named outfile)
Scrape out the number by piping it through the stream editor sed … capture the number and keep only that part:
ls -1 ./dir/outfile*.txt | sed -e 's:^.*dir/outfile-\([0-9][0-9]*\)\.txt$:\1:'
(I'm using colon : instead of the standard slash / so I don't have to escape the directory separator in dir/outfile)
Now you just need to pick the highest number. Sort the numbers and take the top one:
| sort -rn | head -1
Sorting with -n is numeric rather than lexicographic, and -r reverses the order so that the highest number comes first, not last.
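To see why -n matters here, a quick check with the numbers from the example listing (a small sketch, nothing in it is specific to myprogram):

```shell
#!/bin/bash
# Lexicographic reverse sort compares character by character, so "5" beats "10".
lexical=$(printf '%s\n' 0 1 10 5 | sort -r | head -1)
# Numeric reverse sort compares values, so 10 comes first.
numeric=$(printf '%s\n' 0 1 10 5 | sort -rn | head -1)
echo "lexical: $lexical, numeric: $numeric"
```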
Putting it all together, this will list the files, edit the names keeping only the numeric part, sort, and get just the first entry. You want to assign that to a variable to work with it, so it is:
high=$(ls -1 ./dir/outfile*.txt | sed -e 's:^.*dir/outfile-\([0-9][0-9]*\)\.txt$:\1:' | sort -rn | head -1)
In the shell you can do arithmetic on that with $((high + 1)), so if high is 10, the expression produces 11. (The older $[high + 1] form is bash-only and deprecated, so it should not be used under #!/bin/sh.)
You would use that as the numeric part of your filename.
The whole shell script then just needs to use that number in the filename. Here it is, with lines broken for better readability:
#!/bin/sh
high=$(ls -1 ./dir/outfile*.txt \
| sed -e 's:^.*dir/outfile-\([0-9][0-9]*\)\.txt$:\1:' \
| sort -rn | head -1)
echo "myprogram -o ./dir/outfile-$((high + 1)).txt"
Of course you wouldn't echo myprogram, you'd just run it.
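If you'd rather not parse the output of ls, here is a sketch of the same idea as a bash function that uses only globbing and parameter expansion. The name next_outfile is made up for this example, and it assumes bash rather than plain sh (because of local and the 10# base prefix). It starts at 0 when the directory has no outfiles yet.

```shell
#!/bin/bash
# next_outfile DIR: print the path of the next unused outfile-<n>.txt in DIR.
next_outfile() {
    local dir=$1 high=-1 f n
    for f in "$dir"/outfile-*.txt; do
        [ -e "$f" ] || continue                      # the glob matched nothing
        n=${f##*/outfile-}                           # strip directory and prefix
        n=${n%.txt}                                  # strip the suffix
        case $n in ''|*[!0-9]*) continue ;; esac     # skip names without a plain number
        if [ "$((10#$n))" -gt "$high" ]; then        # 10# avoids octal for e.g. 08
            high=$((10#$n))
        fi
    done
    printf '%s/outfile-%d.txt\n' "$dir" "$((high + 1))"
}
```

callprog would then run something like myprogram -o "$(next_outfile ./dir)".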
You could do this in a bash function in your .bashrc by using wc to get the number of files in the directory and then adding 1 to the result:
yourfunction () {
    dir=/path/to/dir
    filenum=$(( $(ls "$dir" | wc -l) + 1 ))
    myprogram -o "$dir/outfile-${filenum}.txt"
}
This gets the number of entries in $dir and adds 1 to it to get the number for the filename. Because it counts every entry, it only yields an unused number when $dir contains nothing but outfiles numbered without gaps. If you place it in your .bashrc or .bash_aliases and then source .bashrc, it should work like any other shell command.
You can try exporting a function for MY_COMMAND to run.
next_outfile () {
myprogram -o ./dir/outfile-${_next_number}.txt
((_next_number ++ ))
}
export -f next_outfile
export MY_COMMAND="next_outfile" _next_number=0
This relies on a "private" global variable _next_number being initialized to 0 and not otherwise modified.
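One caveat worth checking: the counter only advances if MY_COMMAND is run by the same shell process each time. If whatever runs it starts a subshell or a new bash for each call, the increment is lost. A small sketch to illustrate (next_name is a stand-in that only prints the filename instead of running myprogram):

```shell
#!/bin/bash
_next_number=0

next_name() {
    printf 'outfile-%d.txt\n' "$_next_number"
    _next_number=$((_next_number + 1))   # a plain assignment always returns status 0
}

next_name        # same shell: prints outfile-0.txt, counter becomes 1
next_name        # prints outfile-1.txt, counter becomes 2
( next_name )    # subshell: prints outfile-2.txt, but the increment is lost
next_name        # prints outfile-2.txt again
```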
I've been trying to get a script working to backup some files from one machine to another but have been running into an issue.
Basically what I want to do is copy two files, one .log and one (or more) .dmp. Their format is always as follows:
something_2022_01_24.log
something_2022_01_24.dmp
I want to do three things with these files:
find the second-to-last .log file (i.e. if something_2022_01_24.log is the latest, I want to find the one before that, say something_2022_01_22.log)
get a substring with just the date (2022_01_22)
copy every .dmp that matches the date (e.g. something_2022_01_24.dmp, something01_2022_01_24.dmp)
For the first one, from what I could find, the best way is to do ls -t *.log | head -2, as it displays the second to last file created.
As for the second one I'm more at a loss because I'm not sure how to parse the output of the first command.
The third one I think I could manage with something of the sort:
[ -f "/var/www/my_folder/*$capturedate.dmp" ] && cp "/var/www/my_folder/*$capturedate.dmp" /tmp/
What do you guys think? Is there any way to do this? How can I compare the substring?
Thanks!
Would you please try the following:
#!/bin/bash
dir="/var/www/my_folder"
second=$(ls -t "$dir/"*.log | head -n 2 | tail -n 1)
if [[ $second =~ .*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log ]]; then
capturedate=${BASH_REMATCH[1]}
cp -p "$dir/"*"$capturedate".dmp /tmp
fi
second=$(ls -t "$dir/"*.log | head -n 2 | tail -n 1) will pick the second-to-last log file. Please note that it assumes the timestamp of the file has not been modified since it was created, and that the filename does not contain special characters such as a newline. This is an easy solution, and it may need further improvement for robustness.
The regex .*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log matches the log filename. It captures the date substring (the part enclosed in parentheses), and bash stores it in ${BASH_REMATCH[1]}.
Then the following cp command will do the job. Please be careful not to include the wildcard * within the double quotes, so that the wildcard is properly expanded.
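To try the regex on its own before touching any files, something like the following can be used. The function name extract_date and the sample paths are only for illustration:

```shell
#!/bin/bash
# Print the YYYY_MM_DD part of a log filename, or return 1 if there is none.
extract_date() {
    if [[ $1 =~ _([0-9]{4}_[0-9]{2}_[0-9]{2})\.log$ ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

extract_date "/var/www/my_folder/something_2022_01_22.log"   # prints 2022_01_22
extract_date "/var/www/my_folder/notes.log" || echo "no date found"
```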
FYI here are some alternatives to extract the date string.
With sed:
capturedate=$(sed -E 's/.*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log/\1/' <<< "$second")
With bash parameter expansion (if something does not include underscores):
capturedate=${second##*/}   # drop the directory part first; my_folder itself contains an underscore
capturedate=${capturedate%.log}
capturedate=${capturedate#*_}
With the cut command (same assumption about something):
capturedate=$(basename "$second" .log | cut -d_ -f2,3,4)
How can I use a program as a key to sort in a Unix shell? In other words, how can I sort the output of 'ls' (or any other program) by the return value of a program applied to each line?
I'll give two example solutions:
A one-line command that is simpler and therefore something I'd try first.
A bash script that allows sorting a list by output from an arbitrary bash function that reads each line of the list as input.
Example 1 (without executing command on each line)
If the question is how to, in general, sort outputs of programs like ls, below is an example specific to ls that sorts by inode. However, every program may have its own idiosyncrasies when generating its output so this example may have to be adapted:
ls -ail /home/user/ | tail -n+2 | tr -s ' ' | sort -t' ' -k1,1 -g
Here are the different parts of this command broken down:
ls -ail /home/user/
Lists all (-a) files in directory /home/user/ in list (-l) format with inode (-i).
tail -n+2
Cuts off the first line of the ls output (the "total" line).
tr -s ' '
Squeezes (-s) runs of multiple spaces (' ') into one for sort.
sort -t' ' -k1,1 -g
Sorts the list by the first (1) field, compared as numbers (-g), with fields separated by one space (' ').
Example 2 (executing command with each line as input)
Here is a more adaptable example in a bash script I worked up to show how the list of files generated from ls -a1 can be fed into a bash function getinode, which uses stat to output the inode for each file. A while loop repeats this process for each file, saving the data in comma-delimited format by repeatedly prepending to a variable named OUTPUT, which at the end is sorted by sort using the first field.
The important part is that the function getinode can be anything, so long as it outputs a string. I set up getinode to receive a file path as input (first argument $1) and to then output the inode to stdout via echo $INODE. The script calls getinode via $(getinode "$FILEPATH").
#!/bin/bash
# Usage: lsinodesort.sh [directory]
# Refs/attrib:
# [1]: How to sort a csv file by sorting on a single field. https://stackoverflow.com/a/44744800
# [2]: How to read a while loop variable. https://stackoverflow.com/a/16854326
WORKDIR="$1" # read directory from first argument
getinode() {
    # Usage: getinode [path]
    INODE="$(stat "$1" --format=%i)"
    echo $INODE
}
if [ -d "$WORKDIR" ]; then
    FILELIST="$(ls -a1 "$WORKDIR")" # save `ls` output (not in LINES, which bash uses for the terminal height)
else
    exit 1; # not a valid directory
fi
while read -r line; do
    path="$WORKDIR"/"$line" # Determine path.
    if [ -f "$path" ]; then # Check if path is a file.
        FILEPATH="$path"
        FILENAME="$(basename "$path")" # Determine filename from path.
        FILEINODE=$(getinode "$FILEPATH") # Get inode.
        OUTPUT="$FILEINODE"",""$FILENAME""\n""$OUTPUT" ; # Prepend inode and file name to OUTPUT
    fi
done <<< "$FILELIST" # See [2].
# '%b' expands the \n escapes without treating file names as a format string.
OUTPUT=$(printf '%b' "${OUTPUT}" | sort -t, -k1,1) # sort OUTPUT. See [1]
OUTPUT="inode","filename""\n""$OUTPUT"
printf '%b\n' "${OUTPUT}" # print final OUTPUT.
When I run it on my own home folder I get output like this:
inode,filename
3932162,.bashrc
3932165,.bash_logout
3932382,.zshrc
3932454,.gitconfig
3933234,.bash_aliases
3933512,.profile
3933612,.viminfo
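The same decorate, sort, undecorate idea can be written more compactly as a reusable function. This is only a sketch: sort_by_key and linelen are names invented here, and the key command is assumed to print a single number (without tabs) for each line.

```shell
#!/bin/bash
# sort_by_key CMD [ARGS...]: read lines on stdin and print them ordered by
# the number that "CMD ARGS... LINE" prints for each line.
sort_by_key() {
    local line tab=$'\t'
    while IFS= read -r line; do
        # Decorate: key, a tab, then the original line.
        # stdin is redirected so CMD cannot swallow the remaining input lines.
        printf '%s\t%s\n' "$("$@" "$line" < /dev/null)" "$line"
    # Sort numerically on the key only, then cut the key off again.
    done | sort -t "$tab" -k1,1n | cut -f2-
}

# Example key command: the length of the line.
linelen() { printf '%s\n' "${#1}"; }

printf '%s\n' ccc a bb | sort_by_key linelen
```

With GNU stat, something like ls -A | sort_by_key stat --format=%i run inside the directory should give an inode ordering similar to the script above.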
I'm not sure I understand your question, so I'll try to rephrase it first.
If I'm not mistaken, you want to sort the output of a program (it may be ls or any other command in a Unix shell).
I suggest using the pipeline feature available in Unix shells.
For instance, you can sort the output of the ls command using:
ls /home | sort
Pipelines are not limited to the ls command; they work with any program's output.
By the way, there are optional flags you can use for sorting ls results, if that's your specific use case:
ls -S # for sorting by file size
ls -t # for sorting by modification time
You can also append the --reverse or -r flag for displaying the result in reverse order.
As for the sort command, there are also flags that let you customize the result as needed:
sort -n # for sorting numerically instead of alphabetically
sort -k5 # for sorting based on the 5th column
sort -t "," # for using the comma as a field separator
You can combine them. For example, this sorts the output of ls -l on field 2 and then field 5 (both numeric), and finally on field 9 (alphabetically). The columns of ls -l are separated by whitespace, which is sort's default, so no -t option is needed here:
ls -l /home/$USER | sort -k2,2n -k5,5n -k9
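For data that really is comma-separated, -t and -k work together like this (the sample lines are made up):

```shell
#!/bin/bash
# Sort three name,count records numerically on the second field.
sorted=$(printf '%s\n' 'b,10' 'a,9' 'c,100' | sort -t, -k2,2n)
echo "$sorted"
```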
There is a program that I run from the command line. Its output is a file. I have to run the program with various parameters, so I always have to change the output filename (otherwise it will always be the same and the older file will be overwritten) and run the program again and again. I tried:
./program param1 param2 > result1.txt
but not surprisingly
cat result1.txt
run the program. I need a command line that will automatically rename the output file at the end of the program.
I can not change the program code.
Thanks
You can wrap your command line in another script that does something like:
PARAM_1="$1"
PARAM_2="$2"
CMD="./program"
"$CMD" "$PARAM_1" "$PARAM_2" > "result-${PARAM_1}-${PARAM_2}"
The script calls your command and redirects the output to a file whose name depends on the input parameters.
This works with 2 parameters, but it can easily be generalised.
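One way to generalise it, as a sketch: join all the parameters with a dash to build the name. The function name result_name is just for illustration, and this assumes the parameters contain no slashes or other characters that are awkward in filenames.

```shell
#!/bin/bash
# Build an output filename from any number of parameters.
result_name() {
    local IFS=-     # "$*" joins the arguments with the first character of IFS
    printf 'result-%s.txt\n' "$*"
}

result_name 0.5 fast 42     # prints result-0.5-fast-42.txt
```

The wrapper script could then call ./program "$@" > "$(result_name "$@")".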
UPDATE:
I just thought of a different version that uses MD5 for the output filename, so that it stays consistent even with long, messy parameters, and it also works for any number of params:
#!/bin/bash
HASH="$(echo "$@" | md5sum | cut -f1 -d' ')"
CMD="./program"
"$CMD" "$@" > "result-$HASH.txt"
Just rename the output file at the end of your script, using a nanosecond-resolution date value in the new name:
mv result.txt "result-$(date --rfc-3339=ns).txt"
I have several header files in a directory with the format imageN.hd where N is some integer. Only one of these header files contains the text 'trans'. What I am trying to do is find which image contains this expression using csh (I need to use csh for this purpose - although I can call sed or perl one-liners) and show the corresponding image.
show iN
Here is my initial unsophisticated approach which does not work.
#find number of header files in directory
set n_images = `ls | grep 'image[0-9]*.hd' | wc -l`
foreach N(`seq 1 n_images`)
if (`more image$N{.hd} | grep -i 'trans`) then
show i$N
sc c image #this command uses an alias to set the displayed image as current within the script
endif
end
I'm not sure what is wrong with the above commands but it does not return the correct image number.
Also I'm sure there is a more elegant one line perl or sed solution but I am fairly unfamiliar with both
show `grep -l trans image[0-9]*.hd | sed 's/image/i/'`
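To see what the pipeline inside the backticks produces, here is a sketch run from a Bourne-style shell on throwaway files (the grep and sed invocations are the same ones csh would run). The file contents are invented, and the extra s/\.hd$// is an assumption in case show expects i2 rather than i2.hd:

```shell
#!/bin/bash
demo=$(mktemp -d)
printf 'size 10\n'    > "$demo/image1.hd"
printf 'mode trans\n' > "$demo/image2.hd"
printf 'size 12\n'    > "$demo/image3.hd"

# grep -l prints only the names of files that contain a match;
# sed then turns "image2.hd" into "i2".
match=$(cd "$demo" && grep -l trans image[0-9]*.hd | sed 's/image/i/; s/\.hd$//')
echo "$match"
rm -rf "$demo"
```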
Let me first describe my situation. I am working on a Linux platform and have a collection of .bmp files whose picture numbers increase by one, from filename0022.bmp up to filename0680.bmp, so a total of 659 pictures. I want to run each of these pictures through a .exe file that operates on the picture and then writes the result to a file specified by the user. It also takes some threshold arguments: lower, upper. So the typical call for the executable is:
./filter inputfile outputfile lower upper
Is there a way that I can loop this call over all the files just from the terminal or by creating some kind of bash script? My problem is similar to this: Execute a command over multiple files with a batch file but this time I am working in a Linux command line terminal.
You may be interested in looking into bash scripting.
You can execute commands in a for loop directly from the shell.
Here is a simple loop that generates the numbers you specifically mentioned. For example, from the shell:
user@machine $ for i in {22..680} ; do
> echo "filename${i}.bmp"
> done
This will give you a list from filename22.bmp to filename680.bmp. That simply handles the iteration over the range you mentioned; it doesn't cover zero-padding the numbers. To do this you can use printf. The printf syntax is printf format argument. We can use the $i variable from our previous loop as the argument and apply the %Wd format, where W is the width. Prefixing W with 0 pads the number with zeros instead of spaces. Example:
user@machine $ for i in {22..680} ; do
> echo "filename$(printf '%04d' $i).bmp"
> done
In the above, $() is command substitution: it runs the command inside it and is replaced by that command's output, rather than holding a predefined value like a variable.
This should now give you the filenames you specified. We can take that and apply it to the actual application (the out prefix for the output files is just an example name):
user@machine $ for i in {22..680} ; do
> n=$(printf '%04d' $i)
> ./filter "filename$n.bmp" "out$n.bmp" lower upper
> done
This can be rewritten as one line:
user@machine $ for i in {22..680} ; do n=$(printf '%04d' $i) ; ./filter "filename$n.bmp" "out$n.bmp" lower upper ; done
One thing to note from the question: .exe files are generally Windows executables in PE (COFF-based) format, whereas Linux expects an ELF-format executable.
Here is a simple example:
for i in {1..100}; do echo "Hello Linux Terminal"; done
To append the output to a file (>> appends; you can also use > to overwrite):
for i in {1..100}; do echo "Hello Linux Terminal" >> file.txt; done
You can try something like this...
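#!/bin/bash
# No leading zero on 22: bash would read 022 as octal (18).
for ((a=22; a <= 680; a++))
do
    # Build the command line with zero-padded numbers and hand it to sh.
    printf './filter filename%04d.bmp outputfile%04d.bmp lower upper\n' "$a" "$a" | sh
done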
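A detail to keep in mind with loops like this: inside bash arithmetic a leading zero marks an octal number, so the zero padding belongs in the printf format and not in the loop counter. A quick check:

```shell
#!/bin/bash
echo $((022))          # octal 22, i.e. decimal 18
echo $((10#022))       # force base 10: 22
printf '%04d\n' 22     # zero-padded for the filename: 0022
```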
You can leverage xargs for iterating:
ls | xargs -I {} ./filter {} {}_out lower upper
Note:
{} is replaced by one line of output from the pipe, here the input file name.
Output files will be named with the suffix '_out'.
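If the file names may contain spaces, a variant that passes the names NUL-separated is safer. This sketch assumes an xargs that supports -0 (the GNU and BSD versions do) and uses echo in place of ./filter so it can be tried without the program:

```shell
#!/bin/bash
# printf emits each name followed by a NUL byte; xargs -0 splits on NUL only,
# so "my picture.bmp" stays a single argument.
printf '%s\0' 'my picture.bmp' 'other.bmp' |
    xargs -0 -I {} echo ./filter {} {}_out lower upper
```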
You can test that AS-IS in your shell :
for i in *; do
echo "$i" | tr '[:lower:]' '[:upper:]'
done
If your files are in a specific path, replace * with your path plus a glob, e.g.:
for i in /home/me/*.exe; do ...
See http://mywiki.wooledge.org/glob
This will prepend filtered_ to the names of the output images, like filtered_filename0055.bmp:
for i in *; do
    ./filter "$i" "filtered_$i" lower upper
done
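A plain * also matches the filtered_ files from an earlier run and anything else in the directory. A slightly more defensive sketch restricts the glob to the input naming scheme from the question. The function name filter_all is hypothetical, and the command to run is passed in so the loop can be tried with echo first:

```shell
#!/bin/bash
# filter_all DIR LOWER UPPER CMD...: run CMD on every filenameNNNN.bmp in DIR,
# writing to filtered_filenameNNNN.bmp in the same directory.
filter_all() {
    local dir=$1 lower=$2 upper=$3 src
    shift 3
    for src in "$dir"/filename[0-9][0-9][0-9][0-9].bmp; do
        [ -e "$src" ] || continue        # the glob matched nothing
        "$@" "$src" "$dir/filtered_${src##*/}" "$lower" "$upper"
    done
}
```

For a dry run, filter_all . lower upper echo prints the argument lists; replacing echo with ./filter runs the real program.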