How to read the complete path till the end of the directory structure using loop in scripting

I have a directory structure such as
/home/ABCD/apple/ball/car/divider.txt. /home/ABCD acts as the root directory for my apps and I can get it easily, but the subfolders below it vary from case to case, so I am looking for a generic program that can extract the path, perhaps with a loop.
I want to extract the directory part into a separate variable, as "/home/ABCD/apple/ball/car/".
Can anyone help me?
Second example: /home/ABCD/adam/nest/mary/user.txt
The variable should get the following value: "/home/ABCD/adam/nest/mary/"

Use dirname:
$ dirname /home/ABCD/apple/ball/car/divider.txt
/home/ABCD/apple/ball/car
To assign the result to a variable:
var=$(dirname /home/ABCD/apple/ball/car/divider.txt)
echo "$var"
There must be no spaces before or after the =.

If the trailing slash / is required, you could pick one of these:
kent$ echo "/home/ABCD/adam/nest/mary/user.txt"|grep -Po '.*/'
/home/ABCD/adam/nest/mary/
or
kent$ echo "/home/ABCD/adam/nest/mary/user.txt"|sed -r 's#(.*/).*#\1#'
/home/ABCD/adam/nest/mary/
or
kent$ echo $(dirname /home/ABCD/adam/nest/mary/user.txt)"/"
/home/ABCD/adam/nest/mary/
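The same result can also be had without any external command, using the shell's own suffix removal. This is only a minimal sketch: the function name dir_with_slash is made up for the example, and the loop simply walks over the two paths from the question.

```shell
#!/usr/bin/env bash
# dir_with_slash is a hypothetical helper, not a standard command.
# It prints the directory part of a path, always ending in a slash.
dir_with_slash() {
    case $1 in
        */*) printf '%s/\n' "${1%/*}" ;;  # drop everything after the last slash
        *)   printf './\n' ;;             # no slash: the file is in the current directory
    esac
}

# Loop over any number of paths and keep each result in a variable.
for path in /home/ABCD/apple/ball/car/divider.txt /home/ABCD/adam/nest/mary/user.txt; do
    dir=$(dir_with_slash "$path")
    echo "$dir"
done
```

Because only the part after the last slash is removed, the depth of the subfolders does not matter.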

Related

How to auto insert a string in filename by bash?

I have the output file day by day:
linux-202105200900-foo.direct.tar.gz
The date and time string, ex: 202105200900 will change every day.
I need to manually rename these files to
linux-202105200900x86-foo.direct.tar.gz
( insert a short string x86 after date/time )
any bash script can help to do this?

If you're always inserting the string "x86" after character #18 of the string, you can use this:
var="linux-202105200900-foo.direct.tar.gz"
var2=${var:0:18}"x86"${var:18}
echo "$var2"
The 2nd line means: "assign to variable var2 the first 18 characters of var, followed by x86, followed by the rest of the variable var".
If you want to insert "x86" just before the last hyphen in the string, you may write it like this:
var="linux-202105200900-foo.direct.tar.gz"
var2=${var%-*}"x86-"${var##*-}
echo "$var2"
The 2nd line means: "assign to variable var2:
- the content of the variable var after removing the shortest matching pattern "-*" at the end,
- the string "x86-",
- the content of the variable var after removing the longest matching pattern "*-" at the beginning".
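If the part after the timestamp could itself contain a hyphen, inserting before the last hyphen would land in the wrong place. A hedged sketch that splits on the first two hyphens instead; the function name insert_after_second_field and its tag argument are invented for this illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: insert a tag at the end of the second hyphen-separated field.
insert_after_second_field() {
    local name=$1 tag=$2 first rest second tail
    case $name in
        *-*-*) ;;          # at least two hyphens are required
        *) return 1 ;;     # anything else is left alone
    esac
    first=${name%%-*}      # text before the first hyphen
    rest=${name#*-}        # text after the first hyphen
    second=${rest%%-*}     # the timestamp field
    tail=${rest#*-}        # everything after the second hyphen
    printf '%s-%s%s-%s\n' "$first" "$second" "$tag" "$tail"
}

insert_after_second_field "linux-202105200900-foo.direct.tar.gz" x86
# prints linux-202105200900x86-foo.direct.tar.gz
insert_after_second_field "linux-202105200900-foo-bar.tar.gz" x86
# prints linux-202105200900x86-foo-bar.tar.gz
```

For the file name in the question both approaches give the same result; they only differ when extra hyphens appear later in the name.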

In addition to the very good answer by @Jean-Loup Sabatier, another, perhaps more general way would simply be to replace the second occurrence of '-' with x86-, which you can do with sed. Let's say you have:
fname=linux-202105200900-foo.direct.tar.gz
You can update that with:
fname="$(sed 's/-/x86-/2' <<< "$fname")"
This uses command substitution with sed and a here-string to modify fname, assigning the modified result back to fname.
Example Use/Output
$ fname=linux-202105200900-foo.direct.tar.gz
$ fname="$(sed 's/-/x86-/2' <<< "$fname")"
$ echo "$fname"
linux-202105200900x86-foo.direct.tar.gz

Do you need this?
❯ dat=$(date '+%Y%m%d%H%M%S'); echo ${dat}
20210520170336
❯ filename="linux-${dat}x86-foo.direct.tar.gz"; echo ${filename}
linux-20210520170336x86-foo.direct.tar.gz

I wanted to keep it as simple as possible. Considering that only the timestamp is going to change, this script should do it. Just run it inside the folder where the files are located and you'll get all of them renamed with x86.
#!/bin/bash
for file in *; do
    replaced=$(echo "$file" | sed 's|-foo|x86-foo|g')
    # Only rename files whose name actually contains -foo
    if [ "$file" != "$replaced" ]; then mv -- "$file" "$replaced"; fi
done
This is my output
filip@filip-ThinkPad-T14-Gen-1:~/test$ ls
linux-202105200900-foo.direct.tar.gz linux-202105201000-foo.direct.tar.gz linux-202105201100-foo.direct.tar.gz
filip@filip-ThinkPad-T14-Gen-1:~/test$ ./../development/bash-utils/bulk-rename.sh
filip@filip-ThinkPad-T14-Gen-1:~/test$ ls
linux-202105200900x86-foo.direct.tar.gz linux-202105201000x86-foo.direct.tar.gz linux-202105201100x86-foo.direct.tar.gz
The script simply iterates through all the files in the current folder, pipes each name to sed to replace -foo with x86-foo, and then renames the file with the mv command.
As David mentioned in a comment, if you're worried that there could be multiple occurrences of -foo, replace the g flag (global) with 1 (first occurrence only) and that's it!

There is also the rename utility (https://man7.org/linux/man-pages/man1/rename.1.html), you could use:
rename -v 0-foo.direct.tar.gz 0x86-foo.direct.tar.gz *
which results in
`linux-202105200900-foo.direct.tar.gz' -> `linux-202105200900x86-foo.direct.tar.gz'
`linux-202205200900-foo.direct.tar.gz' -> `linux-202205200900x86-foo.direct.tar.gz'
`linux-202305200900-foo.direct.tar.gz' -> `linux-202305200900x86-foo.direct.tar.gz'

In addition to the very good answer by @David C. Rankin, here it is in a loop that renames the files:
#!/usr/bin/bash
for file in linux*   # All files starting with linux
do
    [ -e "$file" ] || continue   # Skip the literal pattern if nothing matched
    echo "$file"
    fname="$(sed 's/-/x86-/2' <<< "$file")"
    mv "$file" "$fname"   # Rename file
done
Output received:
linux-202105200900x86-foo.direct.tar.gz
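Since the files arrive day by day, the loop will probably be run repeatedly, and running it twice would insert x86 twice. A possible variant that skips names already containing x86-; the function name add_x86 and the linux-*-*.tar.gz pattern are assumptions based on the file names in the question:

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename linux-<timestamp>-*.tar.gz files in a directory,
# leaving files that were already renamed untouched.
add_x86() {
    local dir=${1:-.} file name new
    for file in "$dir"/linux-*-*.tar.gz; do
        [ -e "$file" ] || continue                   # the glob matched nothing
        name=${file##*/}                             # strip the directory part
        case $name in *x86-*) continue ;; esac       # already renamed, skip it
        new=$(printf '%s\n' "$name" | sed 's/-/x86-/2')  # replace the 2nd hyphen
        mv -- "$file" "$dir/$new"
    done
}
```

Calling add_x86 with no argument works on the current directory; add_x86 /some/dir works on another one.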

Changing the file names and copying into different directory

I have about 1000 files. I want to rename them by cutting a few characters out of each file name, and copy the results to another directory.
Example of the original file names:
vfcon062562~19.xml
vfcon058794~29.xml
vfcon072009~3.xml
vfcon071992~10.xml
vfcon071986~2.xml
vfcon071339~4.xml
vfcon069979~43.xml
The required output cuts the ~ and the characters that follow it, keeping the extension.
Example output:
vfcon058794.xml
vfcon062562.xml
vfcon069979.xml
vfcon071339.xml
vfcon071986.xml
vfcon071992.xml
vfcon072009.xml
But I want to place them in a different directory.

If you are using bash or similar you can use the following simple loop:
for input in vfcon*xml
do
    mv "$input" "targetDir/$(echo "$input" | awk -F~ '{print $1".xml"}')"
done
Or in a single line:
for input in vfcon*xml; do mv "$input" "targetDir/$(echo "$input" | awk -F~ '{print $1".xml"}')"; done
This uses awk with ~ as the field separator, prints the first field (everything before the ~) and appends ".xml" to create the output file name. The result is prefixed with targetDir, which can be a full path. Use cp instead of mv if the originals should stay in place.
If you are using csh / tcsh then the syntax of the loop will be slightly different but the commands will be the same.
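For roughly 1000 files, starting awk once per file is not necessary, because the shell can cut the name itself. A rough sketch that copies rather than moves; the function name copy_stripped and its two arguments (source and target directory) are assumptions for the example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy vfcon*~N.xml files to another directory,
# dropping the ~ and everything after it from the name (the .xml is re-added).
copy_stripped() {
    local src=$1 dest=$2 f base stem
    mkdir -p "$dest" || return
    for f in "$src"/vfcon*~*.xml; do
        [ -e "$f" ] || continue     # nothing matched the pattern
        base=${f##*/}               # file name without the directory
        stem=${base%%\~*}           # remove the longest suffix starting at ~
        cp -- "$f" "$dest/$stem.xml"
    done
}
```

Note that two source files sharing the same prefix (for example a ~1 and a ~2 version of the same number) would end up under one target name, and the later copy would overwrite the earlier one.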

I like to make sure that my data set is correct prior to changing anything so I would put that into a variable first and then check over it.
files=$(ls vfcon*xml)
echo $files | less
Then, like @Stefan said, use a loop:
for i in $files
do
    mv "$i" "$(echo "$i" | sed 's/~[0-9]*//')"
done
If you need help with bash you can use http://www.shellcheck.net/

Split files according to a field and save in subdirectory created using the root name

I am having trouble with several bits of code. Unfortunately I am no expert in Linux Bash programming; I have spent all day trying, without success, to find something that works for my task, and was hoping you could help guide me in the right direction.
I have many large files that I would like to split according to the third field within each of them, I would like to keep the header in each of the sub-files, and save the created sub-files in new directories created from the root names of the files.
The initial files stored in the original directory are:
Downloads/directory1/Levels_CHG_Lab_S_sample1.txt
Downloads/directory1/Levels_CHG_Lab_S_sample2.txt
Downloads/directory1/Levels_CHG_Lab_S_sample3.txt
and so on..
Each of these files has 200 columns, and column 3 contains values from 1 through 10.
I would like to split each of the files above based on the value of this column, and store the subfiles in subfolders, so for example sub-folder "Downloads/directory1/sample1" will contain 10 files (with the header line) derived by splitting the file Downloads/directory1/Levels_CHG_Lab_S_sample1.txt.
I have now tried many different approaches for these steps, with no success. I must be making this more complicated than it is, since the code I have tried looks awful…
Here is the code I am trying to work from:
FILES=Downloads/directory1/
for f in $FILES
do
# Create folder with root name by stripping file names
fname=${echo $f | sed 's/.txt//;s/Levels_CHG_Lab_S_//'}
echo "Creating sub-directory [$fname]"
mkdir "$fname"
# Save the header
awk 'NR==1{print $0}' $f > header
# Split each file by third column
echo "Splitting file $f"
awk 'NR>1 {print $0 > $3".txt" }' $f
# Move newly created files in sub directory
mv {1..10}.txt $fname # I have no idea how to do specify the files just created
# Loop through the sub-files to attach header row:
for subfile in $fname
do
cat header $subfile >> tmp_file
mv -f tmp_file $subfile
done
done
All these steps seem very complicated to me, I would very much appreciate if you could help me solve this in the right way. Thank you very much for your help.
-fra

You have a few problems with your code right now. First of all, at no point do you list the contents of your downloads directory. You are simply setting the FILES variable to a string that is the path to that directory. You would need something like:
FILES=$(ls Downloads/directory1/*.txt)
You also never cd to the Downloads/directory1 folder, so your mkdir would create directories in cwd; probably not what you want.
If you know that the numbers in column 3 always range from 1 to 10, I would just pre-populate those files with the header line before you split the file.
Try this code to do what you want (untested):
BASEDIR=Downloads/directory1
FILES=$(ls ${BASEDIR}/*.txt)
for f in $FILES; do
    # Create folder with root name by stripping the path, extension and prefix
    name=$(basename "$f" .txt)
    dirname="${BASEDIR}/${name#Levels_CHG_Lab_S_}/"
    echo "Creating sub-directory [$dirname]"
    mkdir -p "$dirname"
    # Save the header to each file
    HEADER_LINE=$(head -n1 "$f")
    for i in {1..10}; do
        echo "${HEADER_LINE}" > "${dirname}${i}.txt"
    done
    # Split each file by third column
    echo "Splitting file $f"
    awk -v dirname="${dirname}" 'NR>1 {filename=dirname $3 ".txt"; print $0 >> filename }' "$f"
done
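If the values in column 3 are not guaranteed to be exactly 1 through 10, awk can write the header itself the first time it meets a new value. This is only a sketch under that assumption; the function names split_by_col3 and split_all are made up, and fields are taken to be whitespace-separated as in awk's default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split one file by its third column into <outdir>/<value>.txt,
# repeating the header line at the top of every output file.
split_by_col3() {
    local f=$1 outdir=$2
    mkdir -p "$outdir" || return
    awk -v dir="$outdir" '
        NR == 1 { header = $0; next }          # remember the header line
        {
            out = dir "/" $3 ".txt"
            if (!(out in seen)) {              # first row for this value
                print header > out
                seen[out] = 1
            }
            print > out                        # appends while the file stays open
        }
    ' "$f"
}

# Hypothetical wrapper: process every Levels_CHG_Lab_S_*.txt file in a directory.
split_all() {
    local dir=$1 f name
    for f in "$dir"/Levels_CHG_Lab_S_*.txt; do
        [ -e "$f" ] || continue
        name=${f##*/}
        name=${name%.txt}
        split_by_col3 "$f" "$dir/${name#Levels_CHG_Lab_S_}"
    done
}
```

With the layout from the question, split_all Downloads/directory1 would create Downloads/directory1/sample1, Downloads/directory1/sample2 and so on.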

EASY - How can I add text into all text files in a certain directory using script?

I need to write a script that adds a certain text into all the text files in a certain directory. Here's what I have:
$1
for f in *.txt
do
echo "$2" >> *.txt <-- HERE's the problem and I don't understand why
done
...everything else works just fine.

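You have to use the variable you defined to iterate through the text files. In Bash, >> *.txt fails with an "ambiguous redirect" error as soon as more than one file matches, because a redirection needs a single target. (A semicolon after the file list is only needed when do sits on the same line as for, as it does here.)
for f in *.txt; do
    echo "$1" >> "$f"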
done
I don't know for which reason you used $2. If you really need more than one argument for your script, you have to adjust the script accordingly.
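One possible reading of the original script is that $1 was meant to be the directory and $2 the text to add. Under that assumption (it is a guess, not something the question states), a sketch could look like this, with append_text being an invented name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one line of text to every .txt file in a directory.
# Assumed arguments: $1 = directory, $2 = text to append.
append_text() {
    local dir=$1 text=$2 f
    for f in "$dir"/*.txt; do
        [ -f "$f" ] || continue            # skip if nothing matched
        printf '%s\n' "$text" >> "$f"      # one redirection target per iteration
    done
}
```

It would be called as append_text /path/to/dir "some text".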

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.

To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1"
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that quoting your variables is almost always a good practice. The glob in for FILENAME in *.bak copes with spaces in filenames, but if no .bak file exists the loop still runs once with the literal string *.bak. Read David W.'s answer for a more-robust solution, and this document for alternative solutions.

There are several ways to remove file suffixes:
In BASH and Kornshell, you can use the environment variable filtering (parameter expansion). Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of %.
If you use a double filter (i.e. ## or %%), you are trying to filter on the biggest match. If you have a single filter (i.e. # or %), you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/} #Matches "this/" and will print out "is/my/file/name.txt"
echo ${file##*/} #Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*} #Matches "/name.txt" and will print out "this/is/my/file"
echo ${file%%/*} #Matches "/is/my/file/name.txt" and will print out "this"
Notice this is a glob match and not a regular expression match! If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, the result is the same as your original string.
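The difference between the smallest and the biggest match shows up as soon as a name has more than one period. A small illustration with a made-up file name:

```shell
#!/usr/bin/env bash
# The path below is only an example value.
file="backup/archive.tar.gz"

short=${file%.*}       # smallest match: removes ".gz"
long=${file%%.*}       # biggest match: removes ".tar.gz"
nomatch=${file%.bak}   # nothing matches, so the value is unchanged

echo "$short"      # backup/archive.tar
echo "$long"       # backup/archive
echo "$nomatch"    # backup/archive.tar.gz
```

Keep in mind that %% looks at the whole string, so a period in a directory name would also count.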
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=${file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while IFS= read -r -d $'\0' file
do
    echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates out the names of the files with a NUL symbol -- which is one of the few characters not allowed in a file name. The -d $'\0' means that your input separators are NUL symbols. See how nicely find -print0 and read -d $'\0' work together?
You should almost never use the for file in $(ls *.bak) method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
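A generated find.out can still trip over names that contain a single quote, since the quotes are written into the file as plain text. An alternative sketch keeps the preview and the real run in one Bash function; the name strip_bak and the --dry-run argument are inventions for this example, not options of any existing tool:

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove the .bak suffix from all files under a directory.
# Usage: strip_bak DIR --dry-run   (only print what would happen)
#        strip_bak DIR             (actually rename)
strip_bak() {
    local dir=${1:-.} mode=${2:-} file
    find "$dir" -type f -name '*.bak' -print0 |
    while IFS= read -r -d '' file; do      # NUL-separated names, safe with spaces
        if [ "$mode" = "--dry-run" ]; then
            printf 'would rename: %s -> %s\n' "$file" "${file%.bak}"
        else
            mv -- "$file" "${file%.bak}"
        fi
    done
}
```

Running it with --dry-run first gives the same kind of check as reading through find.out, without having to execute generated text.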

rename .bak '' *.bak
(rename is in the util-linux package)

Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done

You can always use the find command to include all the subdirectories. Letting find run mv itself keeps names with spaces intact:
find . -name "*.bak" -exec sh -c 'mv --force "$1" "${1%.bak}"' _ {} \;
