How to avoid overwriting an existing file when both are named with the RANDOM special variable - linux

I have written a basic script to save the non-empty error files from the different parts of a big program, because they are deleted after a while.
I thought I had solved the problem by saving them under different names using the bash special variable $RANDOM. It worked well, but I have just realized that I have lost some error files, probably because they were overwritten by the random naming procedure. How can I save the new (non-empty) error files without overwriting the older ones?
My script is:
while [ ! -e ${myfile} ]; do
    for FILE in $( find dirnames -name job.err )
    do
        if [[ -s ${FILE} ]] ; then
            echo ${FILE} >> LIST
            cp ${FILE} COLLECT/job_${RANDOM}.err
        fi
    done
    sleep 3600
done

You could use mktemp to create a file name that's guaranteed to be unique. Put the destination directory in the template, so the file is created (and the uniqueness check happens) where the copies actually land. For example:
cp "${FILE}" "$(mktemp COLLECT/job_XXXXXXXXX)"
You lose the ".err" suffix in the above case, but you could work around that with some additional code if you really want it.
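For instance, GNU mktemp can keep the suffix with --suffix (a GNU extension, so check your platform):
# create the unique name directly inside COLLECT, then copy over it
dest=$(mktemp --suffix=.err COLLECT/job_XXXXXXXXX)
cp "${FILE}" "$dest"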

Well, for a simple solution, just add the time since the epoch in nanoseconds to the filename. While it won't guarantee the absence of collisions, it will at least make them very unlikely, especially when combined with the random value. E.g.
cp "${FILE}" "COLLECT/job_${RANDOM}-$(date +%s.%N).err"

Using $RANDOM draws each name from only 32768 possibilities, so collisions are pretty likely: by the birthday problem, once you have collected a couple of hundred files, a repeated name is more likely than not.
You already know how to check if a file exists, use that same method to approach your file copy:
copy="COLLECT/job_${RANDOM}.err"
while [[ -e "$copy" ]] ; do
    copy="COLLECT/job_${RANDOM}.err"
done
cp "${FILE}" "$copy"
Of course, the closer you get to 32767 files, the longer the loop will take to find a free destination. It would be better to use a scheme that doesn't depend on randomness, such as a counter-based one.
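For instance, a minimal counter-based sketch (the variable n is new, not from the original script):
n=0
copy="COLLECT/job_${n}.err"
while [[ -e "$copy" ]] ; do
    n=$((n + 1))
    copy="COLLECT/job_${n}.err"
done
cp "${FILE}" "$copy"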

Related

How to rename multiple files in linux and store the old file names with the new file name in a text file?

I am a novice Linux user. I have 892 .pdb files, and I want to rename all of them in sequential order as L1, L2, L3, ..., L892. Then I want a text file which maps the old names to the new names (i.e. L1, L2, L3). Please help me with this. Thank you for your time.
You could just do:
#!/bin/sh
i=0
for f in *.pdb; do
    : $((i += 1))
    mv "$f" "L$i" && echo "$f --> L$i"
done > filelist
Note that you probably want to move the files into a different directory, as that will make it easier to recover if an error occurs midway through. Also be wary that this will overwrite any existing files and can make a big mess: it's not idempotent, so you can't safely run it twice. You would probably be better off not doing the move at all and instead doing something like:
#!/bin/sh
i=0
mkdir -p newfiles
for f in *.pdb; do
    i=$((i + 1))    # POSIX sh has no ++ operator
    ln "$f" "newfiles/L$i" && printf '%s\0%s\0' "$f" "L$i"
done > filelist
This latter solution creates links to the original files in a subdirectory, so you can run it multiple times without munging the original data. Also, it uses null separators in the file list so you can unambiguously distinguish names that contain newlines, tabs or spaces. That makes the list not particularly human-readable, but you can easily filter it through tr to make it pretty.
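For example, to turn the null-separated list into tab-separated lines (safe as long as the original names contain no newlines):
tr '\0' '\n' < filelist | paste - -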

bash compare files between folders, and if they don't exist, do something

I have a folder with regular pictures, and another with resized ones.
The goal is: if a picture has not been resized yet, resize it and save it in another folder.
I'm using an echo for simplicity, because I don't have the comparison working.
for file in ../regular/*.jpg; do
    img=$(basename "$file")
    FILE=./resized/$img
    if [ ! -f "$FILE" ]; then
        echo "$img NOT RESIZED"
    fi
done
This code just echoes NOT RESIZED for all the pictures in the regular folder, i.e. it doesn't seem to make the comparison at all.
Where is my mistake?
Look at these two lines:
for file in ../regular/*.jpg;
FILE=./resized/$img
Try using an absolute path. You can also add echo "$FILE" to see what the script actually tries to verify.
If the directory contains a huge number of files, you can exceed the command line length limit (usually ~4 kB-32 kB).
You are quoting in the basename command; if your images can contain spaces, you should quote just as consistently everywhere else, including the "if" test. Check the script below:
for file in ../regular/*.jpg; do
    img=$(basename "$file")
    if [ ! -f "./resized/$img" ]; then
        echo "$img NOT RESIZED"
    fi
done
You could also use the diff command to compare the two directories recursively:
diff -r "$PATH1" "$PATH2"
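Once the missing files are detected, the resize step itself could look like this sketch using ImageMagick's convert (the 800x800 geometry is an assumption, adjust to taste):
for file in ../regular/*.jpg; do
    img=$(basename "$file")
    if [ ! -f "./resized/$img" ]; then
        # -resize fits the image within 800x800 while preserving aspect ratio
        convert "$file" -resize 800x800 "./resized/$img"
    fi
done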

find returning inverted results

In a few words: I wrote this little script to clean up some directories where I had consolidated directories/files from multiple sources. I had used the cp command with the --backup=numbered option so that files with identical names would get a suffix like .~1~ appended instead of being overwritten. I then ran fdupes to remove duplicate files; in some cases fdupes removed the file which did not have the suffix appended (the original file). So I wanted to scan the directories for files with the cp-appended suffix and, if the file does not exist with the suffix removed, mv the file to that name; otherwise I would leave it alone, to avoid deleting anything that fdupes did not consider a duplicate.
The issue is that the test condition if [ -f ... ] in the code below returns results inverted from what it should, and I cannot understand why. For example, when the file exists it returns false, and when the file does not exist it returns true. I fixed it by reversing the actions based on the inverted return code, verified it was working as intended, and ran it that way, but I would like to know why it behaves this way. I am not a bash scripting expert by any means, so it's possible that I missed something simple.
#!/bin/bash
logfile=$$.log
exec > $logfile 2>&1
IFS='
'
#set -f
for FILE in $(find . -type f -regextype posix-extended -regex '^.*(\.~[0-9]+~)+$')
do
    FILE2=${FILE%%.~[0-9]*}    # remove the suffix
    if [ -f "${FILE2}" ]
    then
        echo ERROR: "${FILE2}" already exists!
    else
        echo "${FILE}" renamed "${FILE2}"
        mv "${FILE}" "${FILE2}"
    fi
done
You might be able to see the problem by modifying your script to show both FILE and FILE2 in the error message. There are a few minor problems with the script which could cause some confusion (but not the "inverted" logic):
find output is not sorted. If you had more than one backup file, a randomly chosen one would replace the original file; you could sort the output by appending something like |sort -t~ -n -k2 to the find command.
the regular expression allows multiple matches of the .~[0-9]+~ pattern. Conceivably you could have some odd file which ends with .~1~.~2~.
the suffix-stripping assumes a single .~[0-9]+~ sits at the end of the filename, but %% removes the longest match. An embedded .~0, e.g., foo.~0bar.~1~, would reduce FILE2 to foo. The workaround for that is more cumbersome (since the suffix-stripping uses globbing), but could be done with a case statement matching an explicit number of digits (likely three digits would be enough), as in the sketch below.
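A sketch of that case approach, stripping only the final suffix and matching up to three digits explicitly:
case "$FILE" in
    *.~[0-9]~)           FILE2=${FILE%.~[0-9]~} ;;
    *.~[0-9][0-9]~)      FILE2=${FILE%.~[0-9][0-9]~} ;;
    *.~[0-9][0-9][0-9]~) FILE2=${FILE%.~[0-9][0-9][0-9]~} ;;
esac
The single % strips the shortest matching suffix, so an embedded .~0 earlier in the name is left untouched.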

Centos copy file into another file, if exists, create a version

Does anyone know of a way (via bash) to set up a "versioning" copy of a file? For example: I am copying file into file.bak. If file.bak exists, I currently overwrite it. What I'd like is for it to create multiple files: file, file.bak, file.bak.1, file.bak.2, etc...
Right now, I'm using:
cp -rf file file.bak
This currently overwrites the file(as expected)
or:
cp --backup=t file1 file2
Repeat it a few times to see the result...
see https://www.gnu.org/software/coreutils/manual/html_node/cp-invocation.html
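For example, starting with two existing files (names taken from the line above):
cp --backup=t file1 file2   # existing file2 is renamed to file2.~1~ first
cp --backup=t file1 file2   # and now file2.~2~ appears as well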
Simply use a test
[ -e file.bak ] && cp -r file file.bak.$(date +%s) || cp -r file file.bak
This will create a unique backup if file.bak already exists, in the form file.bak.1411505497.
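One caveat with the test && cp || cp chain: if the first cp fails for any reason, the second cp runs as well. A plain if avoids that:
if [ -e file.bak ]; then
    cp -r file "file.bak.$(date +%s)"
else
    cp -r file file.bak
fi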
There are many ways to skin this cat.
Since you're using Linux, it's likely you've got the GNU mv command, which includes a --backup option. You could wrap this in a shell function:
bkp() {
    file="$1"
    if [ -f "$file" ]; then
        # Move an empty temp file onto "$file"; --backup=numbered first
        # renames the existing file to file.~1~, file.~2~, etc., so the
        # original content survives in the backup and "$file" ends up empty.
        /bin/mv -v --backup=numbered "$(mktemp "${file}XXX")" "$file"
        #/bin/rm "$file"
    fi
}
You can put this in your .bashrc, for example. Then you can use this as follows:
# bkp foo
This will move foo's current contents into a numbered backup (foo.~1~, foo.~2~, ...), leaving an empty foo in its place. You can uncomment the rm if this is, for example, a log file that you're rotating.
Another option, more portable to operating systems that don't ship GNU tools (i.e. FreeBSD, OS X), is a quick-and-dirty solution like this:
bkp() {
    file="$1"
    if [ -f "$file" ]; then
        # increment existing backups, up to 10
        for n in {9..1}; do
            if [ -f "$file.$n" ]; then
                # remove -v if you want less noise.
                mv -v "${file}.$n" "${file}.$((n+1))"
            fi
        done
        # move the original to first backup position
        mv "$file" "$file.1"
    else
        echo "Not found: $file" >&2
    fi
}
It suffers in that it won't compact your list of backups if some numbers are missing, but that's something you can add if it's important. You'd use it pretty much the same way, changing the final mv to a cp if you need to keep the original in place.
The final option I'll mention was also raised in the comments. Since you've said that you're using this solution to back up "system files" (by which I assume you mean things in /etc/), you should consider using an actual version control system to manage their versions.
Many options exist, but I'd recommend RCS for its simplicity and low overhead. Simply install the package, mkdir /etc/RCS to keep your /etc directory clean, read the man pages for rcs, ci, co, rlog, rcsdiff and perhaps rcsintro, and you're good to go, as in the sketch below. You'll get better control of diffs and history, the opportunity for comments, and none of the overhead of a repository for a large VCS like SVN or Git. I've been using this on various servers for years, as RCS is still built in to the base system in FreeBSD. :)
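A minimal sketch of that workflow, using /etc/fstab as an arbitrary example file:
mkdir /etc/RCS              # keeps the ,v history files out of /etc proper
ci -l /etc/fstab            # check in; -l keeps the file checked out and locked
rlog /etc/fstab             # show the revision history
rcsdiff -r1.1 /etc/fstab    # diff the working file against revision 1.1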

Shell script best way to remove files not in a pair

I have a set of files that come in pairs:
/var/log/messages-20111001
/var/log/messages-20111001.hash
I've had several of these rotate away and now I'm left with a ton of /var/log/messages-201110xx.hash files with no associated log. I'd like to clean up the mess, but I'm uncertain how to remove a file that isn't part of a "pair". I can use bash or zsh (or any LSB tool, really). I need to remove all the .hash files that don't have an associated log.
Example
/var/log/messages-20111001.hash
/var/log/messages-20111002.hash
/var/log/messages-20111003.hash
/var/log/messages-20111004.hash
/var/log/messages-20111005
/var/log/messages-20111005.hash
/var/log/messages-20111006
/var/log/messages-20111006.hash
Should be reduced to:
/var/log/messages-20111005
/var/log/messages-20111005.hash
/var/log/messages-20111006
/var/log/messages-20111006.hash
for file in *.hash; do test -f "${file%.hash}" || rm -- "$file"; done
Something like this?
for f in /var/log/messages-????????.hash ; do
    [[ -e "${f%.hash}" ]] || rm "$f"
done
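To preview what would be removed before committing, substitute echo for rm:
for f in /var/log/messages-????????.hash ; do
    [[ -e "${f%.hash}" ]] || echo "would remove $f"
done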
