I referred to all the previous responses to this question from Stack Overflow and tried out the following, but unfortunately I am still encountering an issue.
I have a text file named Rels_obs inside my directory home/manuela/PycharmProjects/knowledgegraphidentification/data. The script downloads a kgis.tar.gz archive and extracts it in the following manner.
#!/bin/bash
readonly DATA_URL='https://linqs-data.soe.ucsc.edu/public/psl-examples-data/kgi.tar.gz'
readonly DATA_FILE='kgis.tar.gz'
readonly DATA_DIR='kgi'
function main() {
trap exit SIGINT
check_requirements
fetch_file "${DATA_URL}" "${DATA_FILE}" 'data'
extract_tar "${DATA_FILE}" "${DATA_DIR}" 'data'
}
The extraction creates two directories within the data directory at home/manuela/PycharmProjects/knowledgegraphidentification/data:
eval directory : home/manuela/PycharmProjects/knowledgegraphidentification/data/kgi/eval
learn directory : home/manuela/PycharmProjects/knowledgegraphidentification/data/kgi/learn
What I want to do is to copy my Rels_obs file to both of these newly available directories, eval and learn.
I tried doing the following but it resulted in an error as shown below.
#!/bin/bash
readonly DATA_URL='https://linqs-data.soe.ucsc.edu/public/psl-examples-data/kgi.tar.gz'
readonly DATA_FILE='kgis.tar.gz'
readonly DATA_DIR='kgi'
function main() {
trap exit SIGINT
check_requirements
fetch_file "${DATA_URL}" "${DATA_FILE}" 'data'
extract_tar "${DATA_FILE}" "${DATA_DIR}" 'data'
echo "COPYING"
# I have only one file of plain text format within the data directory
for file in ~/PycharmProjects/knowledgegraphidentification/data/*.txt
do
name="$(basename "$file" .txt)"
cp "$file" "~/PycharmProjects/knowledgegraphidentification/data/kgi/eval"
cp "$file" "~/PycharmProjects/knowledgegraphidentification/data/kgi/learn"
done
echo "SUCCESSFULLY COPIED FILES"
}
Error
COPYING
cp: cannot stat '/home/manuelanayantarajeyaraj/PycharmProjects/knowledgegraphidentification/data/.txt': No such file or directory
cp: cannot stat '/home/manuelanayantarajeyaraj/PycharmProjects/knowledgegraphidentification/data/.txt': No such file or directory
ls -l on the data directory
total 20524
-rwxr-xr-x 1 manuelanayantarajeyaraj manuelanayantarajeyaraj     2210 Feb  5 15:19 fetchData.sh
drwxrwxr-x 4 manuelanayantarajeyaraj manuelanayantarajeyaraj     4096 Nov 19  2017 kgi
-rw-rw-r-- 1 manuelanayantarajeyaraj manuelanayantarajeyaraj 18546351 Feb  5 15:21 kgis.tar.gz
-rw-rw-r-- 1 manuelanayantarajeyaraj manuelanayantarajeyaraj  2459319 Feb  5 13:31 Rels_obs
Any suggestions in this regard will be highly appreciated.
It says no such file or directory, which is visible from the file name:
/home/manuelanayantarajeyaraj/PycharmProjects/knowledgegraphidentification/data/.txt
The glob *.txt matched nothing (your only file, Rels_obs, has no .txt extension), so bash handed the unmatched pattern to cp literally instead of a real file name.
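A minimal demonstration of bash's unmatched-glob behavior (run in a throwaway temp directory) shows why the loop body still runs even when no file matches, and how the nullglob option changes that:

```shell
#!/bin/bash
# Demonstration: an unmatched glob is passed through literally by default,
# so the loop body runs once with the pattern itself as "$f".
cd "$(mktemp -d)" || exit 1
touch Rels_obs                     # a text file with no .txt extension

for f in ./*.txt; do echo "got: $f"; done
# got: ./*.txt

shopt -s nullglob                  # unmatched globs now expand to nothing
for f in ./*.txt; do echo "got: $f"; done
# (prints nothing -- the loop body never runs)
```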
As a hot fix you can use the following ($HOME instead of a quoted "~", which would not expand, and a plain glob instead of parsing ls output):
SRC_DIR="$HOME/PycharmProjects/knowledgegraphidentification/data"
for file in "${SRC_DIR}"/*.txt; do
    cp "$file" "${SRC_DIR}/kgi/eval"
    cp "$file" "${SRC_DIR}/kgi/learn"
done
echo "SUCCESSFULLY COPIED FILES"
Or you can just copy them with two commands:
SRC_DIR="$HOME/PycharmProjects/knowledgegraphidentification/data"
cp "${SRC_DIR}"/*.txt "${SRC_DIR}/kgi/eval/"
cp "${SRC_DIR}"/*.txt "${SRC_DIR}/kgi/learn/"
You mention a text file named Rels_obs, but is it named that, or is it named Rels_obs.txt? Linux (text) files do not need an extension per se; extensions are purely for the user's convenience. So if the file does not have the extension, this script will not find it, since it is looking for the extension.
If you're sure this file will always have the same name, and you only want this file, I'd simply use the name.
i.e.
cp "$HOME/PycharmProjects/knowledgegraphidentification/data/Rels_obs" "$HOME/PycharmProjects/knowledgegraphidentification/data/kgi/eval"
cp "$HOME/PycharmProjects/knowledgegraphidentification/data/Rels_obs" "$HOME/PycharmProjects/knowledgegraphidentification/data/kgi/learn"
If you really want all plaintext files in that directory, you could do this:
for file in "$HOME"/PycharmProjects/knowledgegraphidentification/data/*
do
    [[ "$(file "$file")" == *"ASCII text"* ]] || continue
    cp "$file" "$HOME/PycharmProjects/knowledgegraphidentification/data/kgi/eval"
    cp "$file" "$HOME/PycharmProjects/knowledgegraphidentification/data/kgi/learn"
done
Or you could rename the file to add the .txt extension first somehow.
Related
I've got (what feels like) a fairly simple problem but my complete lack of experience in bash has left me stumped. I've spent all day trying to synthesize a script from many different SO threads explaining how to do specific things with unintuitive commands, but I can't figure out how to make them work together for the life of me.
Here is my situation: I've got a directory full of nested folders each containing a file with extension .7 and another file with extension .pc, plus a whole bunch of unrelated stuff. It looks like this:
Folder A
Folder 1
Folder x
data_01.7
helper_01.pc
...
Folder y
data_02.7
helper_02.pc
...
...
Folder 2
Folder z
data_03.7
helper_03.pc
...
...
Folder B
...
I've got a script that I need to run in each of these folders that takes in the name of the .7 file as an input.
pc_script -f data.7 -flag1 -other_flags
The current working directory needs to be the folder with the .7 file when running the script and the helper.pc file also needs to be present in it. After the script is finished running, there are a ton of new files and directories. However, I need to take just one of those output files, result.h5, and copy it to a new directory maintaining the same folder structure but with a new name:
Result Folder/Folder A/Folder 1/Folder x/new_result1.h5
I then need to run the same script again with a different flag, flag2, and copy the new version of that output file to the same result directory with a different name, new_result2.h5.
The folders all have pretty arbitrary names, though there aren't any spaces or special characters beyond underscores.
Here is an example of what I've tried:
#!/bin/bash
DIR=".../project/data"
for d in */ ; do
for e in */ ; do
for f in */ ; do
for PFILE in *.7 ; do
echo "$d/$e/$f/$PFILE"
cd "$DIR/$d/$e/$f"
echo "Performing operation 1"
pc_script -f "$PFILE" -flag1
mkdir -p ".../results/$d/$e/$f"
mv "results.h5" ".../project/results/$d/$e/$f/new_results1.h5"
echo "Performing operation 2"
pc_script -f "$PFILE" -flag 2
mv "results.h5" ".../project/results/$d/$e/$f/new_results2.h5"
done
done
done
done
Obviously, this didn't work. I've also tried using find with -execdir but then I couldn't figure out how to insert the name of the file into the script flag. I'd appreciate any help or suggestions on how to carry this out.
Another, perhaps more flexible, approach to the problem is to use the find command with the -exec option to run a short "helper-script" for each file whose name ends in ".7" found below a directory path. The -name option allows find to locate all files ending in ".7" below a given directory using simple file-globbing (wildcards). The helper-script then performs the same operation on each file found by find and handles moving the result.h5 to the proper directory.
The form of the command will be:
find /path/to/search -type f -name "*.7" -exec /path/to/helper-script '{}' \;
Where the -type f option tells find to only return files (not directories), and -name "*.7" matches names ending in ".7". Your helper-script needs to be executable (e.g. chmod +x helper-script) and, unless it is in your PATH, you must provide the full path to the script in the find command. The '{}' will be replaced by the filename (including relative path) and passed as an argument to your helper-script. The \; simply terminates the command executed by -exec.
(note there is another form for -exec called -execdir and another terminator '+' that can be used to process the command on all files in a given directory -- that is a bit safer, but has additional PATH requirements for the command being run. Since you have only one ".7" file per-directory -- there isn't much benefit here)
The helper-script just does what you need to do in each directory. Based on your description it could be something like the following:
#!/bin/bash
dir="${1%/*}" ## trim file.7 from end of path
cd "$dir" || { ## change to directory or handle error
printf "unable to change to directory %s\n" "$dir" >&2
exit 1
}
destdir="/Result_Folder/$dir" ## set destination dir for result.h5
mkdir -p "$destdir" || { ## create with all parent dirs or exit
printf "unable to create directory %s\n" "$destdir" >&2
exit 1
}
ls *.pc >/dev/null 2>&1 || exit 1 ## check a .pc file exists or exit
file7="${1##*/}" ## trim path from file.7 name
pc_script -f "$file7" -flags1 -other_flags ## first run
## check result.h5 exists and non-empty and copy to destdir
[ -s "result.h5" ] && cp -a "result.h5" "$destdir/new_result1.h5"
pc_script -f "$file7" -flags2 -other_flags ## second run
## check result.h5 exists and non-empty and copy to destdir
[ -s "result.h5" ] && cp -a "result.h5" "$destdir/new_result2.h5"
Which essentially stores the path part of the file.7 argument in dir and changes to that directory. If unable to change to the directory (due to read-permissions, etc..) the error is handled and the script exits. Next the full directory structure is created below your Result_Folder with mkdir -p with the same error handling if the directory cannot be created.
ls is used as a simple check to verify that a file ending in ".pc" exists in that directory. There are other ways to do this, such as piping the results to wc -l, but that spawns additional subshells that are best avoided.
(also note that Linux and Mac have files ending in ".pc" for use by pkg-config when building programs from source -- they should not conflict with your files -- but be aware they exist in case you start chasing why weird ".pc" files are found)
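If you'd rather avoid the external ls call entirely, one pure-bash alternative is the compgen builtin; a small sketch run in a temp directory:

```shell
#!/bin/bash
# compgen -G succeeds (exit status 0) only if the glob matches at least
# one existing file, so it works as a quiet existence check.
cd "$(mktemp -d)" || exit 1

compgen -G '*.pc' > /dev/null && echo "found a .pc file" || echo "no .pc file"
# no .pc file

touch helper_01.pc
compgen -G '*.pc' > /dev/null && echo "found a .pc file" || echo "no .pc file"
# found a .pc file
```

In the helper-script above it would take the place of the ls line: `compgen -G '*.pc' > /dev/null || exit 1`.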
After all tests are performed, the path is trimmed from the current ".7" filename, storing just the filename in file7. The file7 variable is then used in your pc_script command (which should also include the full path to the script if it is not in your PATH). After pc_script is run, [ -s "result.h5" ] verifies that result.h5 exists and is non-empty before copying that file to your Result_Folder location.
That should get you started. Using find to locate all ".7" files is a simple way to let the tool designed for finding files do its job -- rather than trying to hand-roll your own solution. That way you only have to concentrate on what should be done for each file found. (note: I don't have pc_script or the files, so I have not tested this end-to-end, but it should be very close if not right-on-the-money)
There is nothing wrong in writing your own routine, but using find eliminates a lot of area where bugs can hide in your own solution.
Let me know if you have further questions.
I have a question. I have a script that takes a file and copies it to several computers, but the script is in /opt/scripts while the files to copy are in /opt/file/copy. How can I do this without moving the file from its directory?
#Here I put a list or array of my server endings
IPS=('15')
#Name of the file to copy
FILE="$1"
DIRECTORY1=/opt/bots
# if number of parameters is less than or equal to 0
if [ $# -le 0 ]; then
echo "The tar name must be entered."
exit 1
fi
# I loop through the array or list with a for
for i in "${IPS[@]}"
do
xxxxxxxx
done
If you want to copy files from serverA (in /sourcedirectory) to serverB (in /targetdirectory), you can use scp, assuming ssh is setup.
On serverA, do:
cd /sourcedirectory
scp file useronserverB@serverB:/targetdirectory/
Then the file on serverA did not move, and it is copied into the /targetdirectory on serverB.
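To fill in the loop from the question, something like the following sketch could work. The user name, IP prefix, and remote path are placeholders (assumptions); scp is echoed here as a dry run, so drop the echo to actually copy:

```shell
#!/bin/bash
# Copy one file to several servers whose IPs share a common prefix.
IPS=('15' '16' '23')                     # server endings, as in the question
FILE="/opt/file/copy/$1"                 # full source path -- no cd or mv needed

for i in "${IPS[@]}"; do
    # echo makes this a dry run; remove it to perform the real copy
    echo scp "$FILE" "user@192.168.1.${i}:/opt/file/copy/"
done
```

Because scp takes a full source path, the script can live in /opt/scripts while the file stays in /opt/file/copy.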
In Linux how do I move files without replacing if a particular file already exists in the destination?
I tried the following command:
mv --backup=t <source> <dest>
The file doesn't get replaced but the issue is the extension gets changed because it puts "~" at the back of the filename.
Is there any other way to preserve the extension but only the filename gets changed when moving?
E.g.
test~1.txt instead of test.txt~1
When the extension gets replaced, subsequently you can't just view a file by double clicking on it.
If you want to make it in shell, without requiring atomicity (so if two shell processes are running the same code at the same time, you could be in trouble), you simply can (using the builtin test(1) feature of your shell)
[ -f destfile.txt ] || mv srcfile.txt destfile.txt
If you require atomicity (something that works when two processes are simultaneously running it), things are quite difficult, and you'll need to call some system calls in C. Look into renameat2(2)
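For completeness, GNU mv also has an -n/--no-clobber flag that skips the move when the destination already exists. It is convenient for the non-atomic case, though it is not guaranteed race-free on every platform or coreutils version. A quick sketch in a temp directory:

```shell
#!/bin/bash
cd "$(mktemp -d)" || exit 1
echo old > destfile.txt
echo new > srcfile.txt

mv -n srcfile.txt destfile.txt   # -n: do not overwrite an existing file
cat destfile.txt
# old   -- destination untouched; srcfile.txt is still present
```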
Perhaps you should consider using some version control system like git ?
mv has an option:
-S, --suffix=SUFFIX
override the usual backup suffix
which you might use; however afaik mv doesn't have a functionality to change part of the filename but not the extension. If you just want to be able to open the backup file with a text editor, you might consider something like:
mv --suffix=.backup.txt <source> <dest>
how this would work: suppose you have
-rw-r--r-- 1 chris users 2 Jan 25 11:43 test2.txt
-rw-r--r-- 1 chris users 0 Jan 25 11:42 test.txt
then after the command mv --suffix=.backup.txt test.txt test2.txt you get:
-rw-r--r-- 1 chris users 0 Jan 25 11:42 test2.txt
-rw-r--r-- 1 chris users 2 Jan 25 11:43 test2.txt.backup.txt
@aandroidtest: if you are able to rely upon a Bash shell script, and the source directory (where the files reside presently) and the target directory (where you want them to move to) are on the same file system, I suggest you try out a script that I wrote. You can find it at https://github.com/jmmitchell/movestough
In short, the script allows you to move files from a source directory to a target directory while taking into account new files, duplicate files (same file name, same contents), file collisions (same file name, different contents), as well as replicating needed subdirectory structures. In addition, the script handles file collision renaming in three forms. As an example, if /some/path/somefile.name.ext were found to be a conflicting file, it would be moved to the target directory with a name like one of the following, depending on the deconflicting style chosen (via the -u= or --unique-style= flag):
default style : /some/path/somefile.name.ext-< unique string here >
style 1 : /some/path/somefile.name.< unique string here >.ext
style 2 : /some/path/somefile.< unique string here >.name.ext
Let me know if you have any questions.
I guess the mv command is quite limited when moving files with the same filename.
Below is a bash script that can be used to move a file; if a file with the same filename exists at the destination, it appends a number to the filename, and the extension is preserved for easier viewing.
I modified the script that can be found here:
https://superuser.com/a/313924
#!/bin/bash
source=$1
dest=$2
file=$(basename "$source")
basename=${file%.*}
ext=${file##*.}
if [[ ! -e "$dest/$basename.$ext" ]]; then
mv "$source" "$dest"
else
num=1
while [[ -e "$dest/$basename$num.$ext" ]]; do
(( num++ ))
done
mv "$source" "$dest/$basename$num.$ext"
fi
I am working with Linux, bash.
I have one directory with 100 folders in it, each one named different.
In each of these 100 folders, there is a file called first.bars (so I have 100 files named first.bars). Although all named first.bars, the files are actually slightly different.
I want to get all these files moved to one new folder and rename/number these files so that I know which file comes from which folder. So the first first.bars file must be renamed to 001.bars, the second to 002.bars.. etc.
I have tried the following:
ls -d * >> /home/directorywiththe100folders/list.txt
cat list.txt | while read line;
do cd $line;
mv first.bars /home/newfolder
This does not work because I can't have 100 files with the same name in one folder. So I only need to know how to rename them. The renaming must be connected to the cat list.txt, because the first line is the folder containing the first file which is moved and renamed. That file will be called 001.bars.
Try doing this :
$ rename 's/^.*?\./sprintf("%03d.", $c++)/e' *.bars
If you want more information about this command, see this response I gave earlier: How do I rename multiple files beginning with a Unix timestamp - imapsync issue
If the rename command is not available,
for d in /home/directorywiththe100folders/*/; do
    newfile=$(printf "/home/newfolder/%03d.bars" $(( ++c )) )
    mv "$d/first.bars" "$newfile"
done
(%03d produces zero-padded names like 001.bars, and ++c starts the numbering at 1, as requested.)
I was sent a zip file containing 40 files with the same name.
I wanted to extract each of these files to a separate folder OR extract each file with a different name (file1, file2, etc).
Is there a way to do this automatically with standard linux tools? A check of man unzip revealed nothing that could help me. zipsplit also does not seem to allow an arbitrary splitting of zip files (I was trying to split the zip into 40 archives, each containing one file).
At the moment I am (r)enaming my files individually. This is not so much of a problem with a 40 file archive, but is obviously unscalable.
Anyone have a nice, simple way of doing this? More curious than anything else.
Thanks.
Assuming that no such tool currently exists, then it should be quite easy to write one in python. Python has a zipfile module that should be sufficient.
Something like this (maybe, untested):
#!/usr/bin/env python
import os
import sys
import zipfile
count = 0
z = zipfile.ZipFile(sys.argv[1],"r")
for info in z.infolist():
    directory = str(count)
    os.makedirs(directory)
    z.extract(info, directory)
    count += 1
z.close()
I know this is a couple years old, but the answers above did not solve my particular problem here so I thought I should go ahead and post a solution that worked for me.
Without scripting, you can just use command line input to interact with the unzip tools text interface. That is, when you type this at the command line:
unzip file.zip
and it contains files of the same name, it will prompt you with:
replace sameName.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename:
If you wanted to do this by hand, you would type "r", and then at the next prompt:
new name:
you would just type the new file name.
To automate this, simply create a text file with the responses to these prompts and use it as the input to unzip, as follows.
r
sameName_1.txt
r
sameName_2.txt
...
That is generated pretty easily using your favorite scripting language. Save it as unzip_input.txt and then use it as input to unzip like this:
unzip < unzip_input.txt
For me, this was less of a headache than trying to get the Perl or Python extraction modules working the way I needed. Hope this helps someone...
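As an illustration of "generated pretty easily", a short bash loop can write those prompt responses; the entry count (40) and base name (sameName) here just match the example above:

```shell
#!/bin/bash
# Write "r" / "sameName_N.txt" response pairs for 40 duplicate entries,
# one pair per archive member that unzip will prompt about.
for i in $(seq 1 40); do
    printf 'r\nsameName_%d.txt\n' "$i"
done > unzip_input.txt

head -n 4 unzip_input.txt
# r
# sameName_1.txt
# r
# sameName_2.txt
```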
Here is a Linux shell-script version.
In this case 834733991_T_ONTIME.csv is the name of the file that is the same inside every zip file, and the .csv after "$count" simply has to be swapped with the file type you want.
#!/bin/bash
count=0
for a in *.zip
do
unzip -q "$a"
mv 834733991_T_ONTIME.csv "$count".csv
count=$(($count+1))
done
This thread is old but there is still room for improvement. Personally I prefer the following one-liner in bash
unzipd ()
{
unzip -d "${1%.*}" "$1"
}
A nice, clean, and simple way to remove the extension and use the zip file's base name as the extraction directory.
Using unzip -B file.zip did the trick for me. It creates a backup file suffixed with ~<number> in case the file already exists.
For example:
$ rm *.xml
$ unzip -B bogus.zip
Archive: bogus.zip
inflating: foo.xml
inflating: foo.xml
inflating: foo.xml
inflating: foo.xml
inflating: foo.xml
$ ls -l
-rw-rw-r-- 1 user user 1161 Dec 20 20:03 bogus.zip
-rw-rw-r-- 1 user user 1501 Dec 16 14:34 foo.xml
-rw-rw-r-- 1 user user 1520 Dec 16 14:45 foo.xml~
-rw-rw-r-- 1 user user 1501 Dec 16 14:47 foo.xml~1
-rw-rw-r-- 1 user user 1520 Dec 16 14:53 foo.xml~2
-rw-rw-r-- 1 user user 1520 Dec 16 14:54 foo.xml~3
Note: the -B option does not show up in unzip --help, but is mentioned in the man pages: https://manpages.org/unzip#options