Find specific string in files then delete other files with same filename - linux

I need some help making a bash script.
Okay, so let's say I have these 3 types of files in a system:
fe516148-3e8b-4816-8481-6fd079a46ae9.desc
fe516148-3e8b-4816-8481-6fd079a46ae9.meta
fe516148-3e8b-4816-8481-6fd079a46ae9~fe516148-3e8b-4816-8481-6fd079a46ae9.alias
I need to find a specific "string" in the .alias files. If I find it in an .alias file, I then need to delete the associated .meta and .desc files. They share the same base name, but the .alias file name has an additional name after the ~.
How would I script this?
I have
find . -name "*.alias" -exec grep -l "string" {} \;
and it returns
./fe516148-3e8b-4816-8481-6fd079a46ae9~fe516148-3e8b-4816-8481-6fd079a46ae9.alias
which is correct, but now I need it to only return fe516148-3e8b-4816-8481-6fd079a46ae9, then delete all the files with filename fe516148-3e8b-4816-8481-6fd079a46ae9 regardless of file extension including the .alias file.
That's as far as I got.

Script:
filename=fe516148-3e8b-4816-8481-6fd079a46ae9~fe516148-3e8b-4816-8481-6fd079a46ae9.alias
string=abcd
# grep -q prints nothing and exits with status 0 when the string is found
if grep -q -- "$string" "$filename"
then
f1=$(echo "$filename" | cut -d'~' -f1)
rm -f -- "$f1.desc"
rm -f -- "$f1.meta"
fi
This script follows the description: it greps for the string in the .alias file and, if the string is found, cuts the prefix from the .alias file name and deletes the matching .desc and .meta files.
Hope this helps.

Thank you all, especially @Barmar. I got it to work with
for file in $(grep -l "string" *.alias)
do
prefix=${file%%~*}
rm -- "$prefix"*.*
done
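The loop above can be wrapped in a small function that also copes with spaces in file names. This is only a sketch: the function name `delete_alias_groups` and its arguments are made up for illustration, it assumes every .alias name contains a `~`, and it only looks at one directory (no recursion).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): for every .alias file in DIR that
# contains PATTERN as a literal string, delete all files sharing its prefix,
# i.e. the part of the file name before the first "~".
delete_alias_groups() {
    local pattern=$1 dir=${2:-.} alias_file base prefix
    for alias_file in "$dir"/*.alias; do
        [ -f "$alias_file" ] || continue       # glob matched nothing, or file already removed
        if grep -qF -- "$pattern" "$alias_file"; then
            base=${alias_file##*/}             # drop the directory part
            prefix=${base%%\~*}                # keep the text before the first "~"
            # Removes PREFIX.desc, PREFIX.meta, and PREFIX~....alias
            rm -f -- "$dir/$prefix".* "$dir/$prefix~"*
        fi
    done
}
```

Calling `delete_alias_groups "string" .` would mirror the loop above. Replacing `rm -f` with `echo rm -f` first is a cheap way to preview what would be deleted.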

Related

Renaming folders and files in subdirectories using text file linux

I am trying to rename the files and directories using a text file separated by space.
The text file looks like this:
dir1-1 dir1_1
dir2-1 dir223_1
My command is as follows:
xargs -r -a files.txt -L1 mv
This command renames only the folders (dir1-1 to dir1_1, dir2-1 to dir223_1, and so on), but it doesn't rename the files in the subdirectories. The files in those directories also have the directory names as prefixes.
Looking forward to your assistance.
Assuming you don't have special characters (space or tab...) in your file/dir names,
try
perl_script=$(
echo 'chop($_); $orig=$_;'
while read -r src tgt; do
echo 'if (s{(.*)/'"$src"'([^/]*)}{$1/'"$tgt"'\2}) { print "$orig $_\n";next;}'
done < files.txt)
find . -depth | perl -ne "$perl_script" | xargs -r -L1 echo mv
Remove echo once you see it does what you wanted.
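The same idea can be sketched in plain bash without Perl. Everything named here (`rename_by_prefix`, its two arguments) is hypothetical, and the sketch assumes the prefixes contain no glob characters, that no old prefix is the start of another old prefix, and that `find` supports `-mindepth` and `-print0` (GNU and BSD find do).

```bash
#!/usr/bin/env bash
# Hypothetical helper: MAP holds "old-prefix new-prefix" pairs, one per line.
# Every file or directory below ROOT whose name starts with an old prefix is
# renamed so that it starts with the new prefix instead.
rename_by_prefix() {
    local map=$1 root=${2:-.} src tgt path base
    while read -r src tgt <&3; do
        [ -n "$src" ] && [ -n "$tgt" ] || continue    # skip blank or incomplete lines
        # -depth lists a directory's contents before the directory itself,
        # so children are renamed while their parent path is still valid.
        find "$root" -mindepth 1 -depth -name "$src*" -print0 |
        while IFS= read -r -d '' path; do
            base=${path##*/}                          # name without the directory part
            mv -- "$path" "${path%/*}/$tgt${base#"$src"}"
        done
    done 3< "$map"                                    # fd 3 keeps stdin free for the inner loop
}
```

With the files.txt from the question, `rename_by_prefix files.txt .` would be the call. Putting `echo` in front of `mv` gives the same dry run as in the Perl version.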

How can I make a bash script where I can move certain files to certain folders which are named based on a string in the files?

This is the script that I'm using to move files with the string "john" in them (124334_john_rtx.mp4 , 3464r64_john_gty.mp4 etc) to a certain folder
find /home/peter/Videos -maxdepth 1 -type f -iname '*john*' -print0 | \
xargs -0 --no-run-if-empty echo mv --target-directory=/home/peter/Videos/john/
Since I have a large number of videos with various names in their file names, I want a bash script that moves each video to a folder named after the string between the underscores. For example, if a file is named 4345655_ben_rts.mp4, the script would identify the string "ben" between the underscores, create a folder named "ben", and move the file to that folder. Any advice is greatly appreciated!
My way to do it:
cd /home/peter/Videos # Change directory to your start directory
for name in $(ls *.mp4 | cut -d'_' -f2 | sort -u) # loops over the unique names between the first and second underscore
do
mkdir -p "/home/peter/Videos/${name}" # create the target directory if it doesn't exist
mv -- *_"${name}"_*.mp4 "/home/peter/Videos/${name}" # move the files
done
This bash loop should do what you need:
find dir -maxdepth 1 -type f -iname '*mp4' -print0 | while IFS= read -r -d '' file
do
if [[ $file =~ _([^_]+)_ ]]; then
TARGET_DIR="/PARENTPATH/${BASH_REMATCH[1]}"
mkdir -p "$TARGET_DIR"
mv "$file" "$TARGET_DIR"
fi
done
It'll only move the files if it finds a directory token.
I used _([^_]+)_ to make sure there is no _ in the dir name, but you didn't specify what you want if there are more than two _ in the file name. _(.+)_ will work if foo_bar_baz_buz.mp4 is meant to go into directory bar_baz.
And this answer to a different question explains the find | while logic: https://stackoverflow.com/a/64826172/3216427 .
EDIT: As per a question in the comments, I added mkdir -p to create the target directory. The -p means recursively create any part of the path that doesn't already exist, and will not error out if the full directory already exists.
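For reference, the regex approach can be packaged as a function that matches on the file name only, so underscores in the directory path cannot be mistaken for the token. This is a sketch under assumptions: `sort_by_token` and its arguments are invented names, only `*.mp4` in lower case is handled, and the token is taken to be the text between the first two underscores.

```bash
#!/usr/bin/env bash
# Hypothetical helper: move each SRC/*.mp4 whose name looks like
# <id>_<token>_<rest>.mp4 into DEST/<token>/, creating the folder if needed.
sort_by_token() {
    local src=$1 dest=$2 file base
    local re='^[^_]*_([^_]+)_'        # token = text between the first two underscores
    for file in "$src"/*.mp4; do
        [ -f "$file" ] || continue    # glob matched nothing, or not a regular file
        base=${file##*/}              # match on the name only, not the directory path
        if [[ $base =~ $re ]]; then
            mkdir -p -- "$dest/${BASH_REMATCH[1]}"
            mv -- "$file" "$dest/${BASH_REMATCH[1]}/"
        fi
    done
}
```

For the layout in the question, the call would be `sort_by_token /home/peter/Videos /home/peter/Videos`. Files without two underscores are left where they are.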

Searching through every file in a directory (and in any sub-directories) one by one

I'm trying to loop through every file in a directory (including files in its subdirectories) and perform some action if the file meets an if-condition.
Part of my code is as follows:
for f in $direc/*
do
if grep -q 'search_term' $f; then
#action on this file
fi
done
However, this fails in the case of subdirectories. I would be very grateful if someone could help me out.
Thank you!
The -R option to grep will read all files in the directory tree including subdirectories. Combined with the -l option to print only the matching file names, you can use that to perform an action on each file that matches.
grep -ERl pattern directory | while IFS= read -r path; do echo "$path" && mv -- "$path" /tmp; done
For example, that would print the file name and move the file to a different directory.
Find | xargs is the usual pattern I use, and has the advantage of not getting hung up on special characters in file names (spaces etc.) if you use the -print0 option of find.
find . -type f -print0 | xargs -0 -I{} sh -c 'if grep -q "search string" "$1"; then cmd-to-run "$1"; fi' sh {}
Yes, because with this syntax grep expects to process files, not directories. A minimal change to your script would be to test whether $f is a file:
...
if [ -f "$f" ] && grep -q 'search_term' "$f"; then
...
In reality you would probably want to get the list of files with a pattern match and act on those:
while IFS= read -r f; do
: # action on file $f
done < <(grep -rl 'search_term' "$direc"/)
I've opted for getting the list of files through <(list) because piping it into while would cause the body of your loop to run in another process, which could be a problem, in particular if you expect any variable changes to be accessible from outside. And unlike a simple for with ``, it's not as sensitive to what file names you encounter (namely, I have spaces in mind; this would still get confused by newlines, though). Speaking of which:
while IFS= read -r -d "" f; do
: # action on file $f
done < <(grep -rZl 'search_term' "$direc"/)
Nothing should be able to confuse that, as entries are NUL-delimited and that character cannot appear in a file name.
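To make the point about variable visibility concrete, here is a small sketch that collects the matching files into an array. The names `collect_matches` and `matches` are hypothetical, and the search term is treated as a literal string (`grep -F`), which is an assumption rather than something the question specifies.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fill the global array "matches" with every regular file
# below DIR that contains TERM as a literal string.
collect_matches() {
    local term=$1 dir=$2 f
    matches=()
    # Process substitution keeps the loop in the current shell, so the array
    # is still populated after the loop ends; -print0 with read -d '' copes
    # with spaces and newlines in file names.
    while IFS= read -r -d '' f; do
        matches+=("$f")
    done < <(find "$dir" -type f -exec grep -qF -- "$term" {} \; -print0)
}
```

After `collect_matches 'search_term' "$direc"`, a loop such as `for f in "${matches[@]}"; do ...; done` can act on each file. Had the loop been fed through a pipe instead, `matches` would be empty once the loop finished.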
Assuming no newlines in your file names:
find "$direc" -type f -exec grep -q 'search_term' {} \; -print |
while IFS= read -r f; do
: # action on this file
done

Removing Colons From Multiple FIles on Linux

I am trying to take some directories and transfer them from Linux to Windows. The problem is that the file names on Linux have colons in them. I need to copy these directories (I cannot alter them directly, since they are needed as they are on the server) to files with names that Windows can use. For example, the name of a directory on the server might be:
IAPLTR2b-ERVK-LTR_chr9:113137544-113137860_-
while I need it to be:
IAPLTR2b-ERVK-LTR_chr9-113137544-113137860_-
I have about sixty of these directories and I have collected the names of the files with their absolute paths in a file I call directories.txt. I need to walk through this file changing the colons to hyphens. Thus far, my attempt is this:
#!/bin/bash
$DIRECTORIES=`cat directories.txt`
for $i in $DIRECTORIES;
do
cp -r "$DIRECTORIES" "`echo $DIRECTORIES | sed 's/:/-/'`"
done
However I get the error:
./my_shellscript.sh: line 10: =/bigpartition1/JKim_Test/test_bs_1/129c-test-biq/IAPLTR1_Mm-ERVK-LTR_chr10:104272652-104273004_+.fasta: No such file or directory
./my_shellscript.sh: line 14: `$i': not a valid identifier
Can anyone here help me identify what I am doing wrong and maybe what I need to do?
Thanks in advance.
This monstrosity will rename the directories in situ:
find tmp -depth -type d -exec sh -c '[ -d "{}" ] && echo mv "{}" "$(echo "{}" | tr : -)"' \;
I use -depth so it descends down into the deepest subdirectories first.
The [ -d "{}" ] is necessary because as soon as the subdirectory is renamed, its parent directory (as found by find) may no longer exist (having been renamed).
Change "echo mv" to "mv" if you're satisfied it will do what you want.
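Since the question asks for copies rather than renames, a sketch along the lines of the original attempt may also be useful. Both errors in that attempt come from writing `$` where a variable is being named: assignments and `for`/`read` targets take the bare name. The function name `copy_without_colons` is hypothetical, and the sketch assumes one path per line with no trailing slash, changing only the last path component.

```bash
#!/usr/bin/env bash
# Hypothetical helper: LIST holds one path per line (no trailing slash).
# Each path is copied next to the original under a name with ":" replaced by "-".
copy_without_colons() {
    local list=$1 src base dest
    while IFS= read -r src; do            # note: no "$" when naming the variable
        [ -n "$src" ] || continue         # skip blank lines
        base=${src##*/}                   # last path component
        dest=${src%"$base"}$(printf '%s' "$base" | tr : -)
        [ "$dest" != "$src" ] || continue # no colon in the name, nothing to copy
        cp -r -- "$src" "$dest"
    done < "$list"
}
```

`copy_without_colons directories.txt` would then leave the originals untouched and create the colon-free copies beside them. Names inside the copied directories are not changed by this sketch.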

Printing the name of each file with certain extension

how can I print out the name of each file in a certain directory with a specific extension?
Here's what I have so far:
#!/bin/sh
DIR="~/Desktop"
SUFFIX="in"
for file in $DIR/*.$SUFFIX
do
if [ -f $file ]; then
echo $file
fi
done
Unfortunately it doesn't work.
What's wrong with it?
In your DIR="~/Desktop", the "~" is not expanded because it is inside double quotes.
Remove the quotes: DIR=~/Desktop
You could use find with -type f
#!/bin/sh
DIR=~/Desktop
SUFFIX="in"
find "$DIR" -maxdepth 1 -type f -name "*.${SUFFIX}" -exec somecommand {} \;
For your information: "file" in Unix systems is typically the name of a command.
Its purpose is to analyze the files given as argument and infer the format. Example:
$ file entries.jar
entries.jar: Zip archive data, at least v2.0 to extract
for file in `ls $DIR/*.$SUFFIX`
Note the ls and backticks
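Putting the points above together, a plain glob with quoted variables is enough and avoids parsing `ls` output. This is a minimal sketch; the function name `list_with_suffix` is made up, and it uses bash's `local`, which goes slightly beyond the `#!/bin/sh` of the question.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every regular file in DIR whose name ends in .SUFFIX.
list_with_suffix() {
    local dir=$1 suffix=$2 f
    for f in "$dir"/*."$suffix"; do
        if [ -f "$f" ]; then          # also skips the literal pattern when nothing matches
            printf '%s\n' "$f"
        fi
    done
}
```

The call for the question's case would be `list_with_suffix "$HOME/Desktop" in`. Using `$HOME` sidesteps the tilde problem entirely, because variables are expanded inside double quotes while `~` is not.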
