Moving folders with the same name to new directory Linux, Ubuntu - linux

I have a folder with 100,000 sub-folders.
Because of its size I cannot open the folder.
I am looking for a shell script to help me move or split the folders.
Current = Folder Research, with 100,000 sub-folders (sorted A, B, C, D).
Needed = All folders whose names start with A-science should be moved to a new folder AScience.
All folders whose names start with B-Science should be moved to a new folder BScience.
I found the script below, but I don't know how to make it work.
find /home/spenx/src -name "a1a2*txt" | xargs -n 1 dirname | xargs -I list mv list /home/spenx/dst/
find ~ -type d -name "*99966*" -print

I had a look at the command you supplied to see what it does. Here is what each part does (correct me if I'm wrong):
| = pipes the output of the command on its left to the input of the command on its right
find /home/spenx/src -name "a1a2*txt" = finds everything under the given directory whose name matches the pattern between the quotes and writes the paths to the pipe
xargs -n 1 dirname = takes each path from find, one at a time, and outputs the name of the directory that contains it
xargs -I list mv list /home/spenx/dst = runs mv once per input line, substituting that line wherever the placeholder "list" appears, so each directory is moved into the given folder
find ~ -type d -name "*99966*" -print = searches your home directory for directories whose names contain 99966 and prints any it finds (this line is only a check; it is not needed for the actual move)
/home/spenx/src = folder to look in (absolute file path, or just folder name without '/')
/home/spenx/dst = folder to move all files to (absolute file path, or just folder name without '/')
"a1a2*txt" = files to look for (to catch every file use "*"; note that "*.*" only matches names that contain a dot)
"*99966*" = the name pattern to check for, but I'm not sure what you would put here
I modified the command a little, but as far as I can tell it still won't sort each folder category (i.e. A-science, B-science) into separate directories; it just moves every matching folder in a given directory to one destination.
You might want to find all folders of each category (A-science) and move them to a destination folder such as Ascience, one category at a time, like so (the destination folders must already exist, and -name is case-sensitive):
find /home/spenx/src -mindepth 1 -maxdepth 1 -type d -name "A-science*" | xargs -I list mv list /home/spenx/dst/Ascience
find /home/spenx/src -mindepth 1 -maxdepth 1 -type d -name "B-science*" | xargs -I list mv list /home/spenx/dst/Bscience
Again, test the command on a copy before using it on your actual files.
You might want to take a look at this question, specifically:
list.txt
1abc
2def
3xyz
script to run:
while read pattern; do
mv "${pattern}"* ../folder_b/"$pattern"
done < list.txt
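The per-category move described above can also be wrapped in a small function. This is only a sketch under assumptions: the name move_by_prefix and its three parameters are made up for illustration, and it relies on GNU find and GNU mv (for mv -t), which is what Ubuntu ships.

```shell
#!/usr/bin/env bash
# Sketch: move every direct sub-folder of SRC whose name starts with PREFIX
# into DEST. move_by_prefix and its parameter names are illustrative only.
move_by_prefix() {
    local src=$1 prefix=$2 dest=$3
    mkdir -p -- "$dest" || return 1
    # -mindepth 1 -maxdepth 1 restricts the search to direct children of SRC.
    # find streams its results, so the shell never has to expand 100,000
    # names into one command line.
    # mv -t DEST (GNU coreutils) puts the target first so that
    # -exec ... {} + can pass many folders to a single mv call.
    find "$src" -mindepth 1 -maxdepth 1 -type d -name "${prefix}*" \
        -exec mv -t "$dest" -- {} +
}
```

It could be called once per category, for example move_by_prefix Research A-science AScience. The match is case-sensitive; swapping -name for -iname would make it ignore case.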

Related

Remove all but newest file from all sub directories

I have found the following which will list the files in all subdirectories, hide the last 5, and then delete the rest:
find -type f -printf '%T@ %P\n' | sort -n | cut -d' ' -f2- | head -n -5 | xargs rm
Unfortunately, if I don't know how many subdirectories there are, it won't delete the correct number of files. Does anyone have a way to traverse each directory and then delete all but the newest file in each subdirectory?
Directory structure would be the following:
-> Base Directory -> Parent Directory -> Child directory
I'd write a script.
It would be a recursive function:
call function: rm_files(base_dir)
    list all directories
    if there are directories:
        go through the list and call rm_files(act_dir) for each item
    else (if there are no directories):
        list all files
        delete all files but the newest
    return from function
With a lot of nested subdirectories, memory may become a problem because of the recursion.
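A rough shell version of that idea is sketched below. It differs from the pseudocode in two ways that should be read as assumptions: find does the directory walk instead of a recursive function, and every directory (not only leaf directories) keeps just its newest file. The name keep_newest_per_dir is hypothetical, and the -z options need reasonably recent GNU coreutils.

```shell
#!/usr/bin/env bash
# Sketch: in every directory under BASE, delete all regular files except the
# most recently modified one. keep_newest_per_dir is an illustrative name.
keep_newest_per_dir() {
    local base=$1 dir
    # Visit every directory under BASE; names are NUL-terminated so that
    # spaces and newlines in directory names are handled.
    find "$base" -type d -print0 |
    while IFS= read -r -d '' dir; do
        # Emit "mtime<TAB>path" records for this directory's own files,
        # sort newest first, skip the first record, strip the timestamp,
        # and remove whatever is left.
        find "$dir" -maxdepth 1 -type f -printf '%T@\t%p\0' |
            sort -z -t $'\t' -k1,1nr |
            tail -z -n +2 |
            cut -z -f2- |
            xargs -0 -r rm -f --
    done
}
```

Replacing rm -f with echo in the last stage gives a dry run that only prints what would be deleted.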
I found I was able to do what I needed to do with the following one liner:
find . -name '*.*' -mmin +59 -delete > /dev/null

Bash script to rename folder with dynamic name and replace its strings inside

I'm just starting to use Docker and I'm a newbie with bash scripts, but now I need to write a bash script that does the following:
I have 2 dirs with some subdirs: Rtp/ and Rtp-[version]/. If the Rtp-[version]/ dir exists, I need to rename it to Rtp/ and override its content. [version] is a dynamic number.
My dir structure:
|-- Rtp
|--- subdir 1
|--- subdir 2
|-- Rtp-1.0 (or Rtp-1.6, Rtp-2.7)
|--- subdir 1
|--- subdir 2
After this I need to find a specific file, app.properties, in the new Rtp/ dir and change the string myvar=my value inside it to myvar=new value, and do the same with 3 more files.
I tried the approach from this link: http://stackoverflow.com/questions/15290186/…: find . -name 'Rtp-' -exec bash -c 'mv $0 ${0/*/Rtp}' {} \; The problem is that if the dir already exists, it moves one directory into the other.
Also, I want to rename it rather than copy it, because it's a big dir and copying can take some time.
Thanks in advance. Could you please explain the solution, so that I can change it in the future if something changes?
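1.
for dir in $(find Rtp-[version] -mindepth 1 -maxdepth 1 -type d); do
cp -Rf "$dir" Rtp
done
Find all directories directly inside Rtp-[version] (-mindepth 1 keeps Rtp-[version] itself out of the list)
Iterate through all of the results (for...)
Copy recursively to Rtp/, and -f will overwrite
Note that the $(find ...) loop splits on whitespace, so it only works if the directory names contain no spaces.
2.
for f in $(find Rtp -type f -name "app.properties"); do
sed -i -e 's/myvar=myval/myvar=newval/' "$f"
done
Find all files named app.properties
Use sed (the stream editor) with -i to edit the file in place and -e to give the expression that searches for a string (by regex) and replaces it (e.g. s/<oldval>/<newval>/). Keep -i and -e separate: GNU sed reads -ie as "edit in place and keep a backup with the suffix e". Note that regex special characters in oldval and newval need to be escaped. If they contain a lot of /'s, you could use a different delimiter, such as s|<oldval>|<newval>|.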
Based on @Brian Hazeltine's answer and Check if a file exists with wildcard in shell script
I found the following solution:
if ls Rtp-*/ 1> /dev/null 2>&1; then
mv -T Rtp-*/ Rtp
find Rtp -type f -name app.properties -exec sed -i -e 's/myvar=my value/myvar=new value/' {} \;
fi
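One caveat with mv -T is that it refuses to replace a destination directory that is not empty. A sketch that removes the old Rtp first and then renames is shown below. The function name replace_rtp and its root argument are hypothetical, and the check for exactly one Rtp-* directory is an assumption about the layout rather than something stated in the question.

```shell
#!/usr/bin/env bash
# Sketch: replace ROOT/Rtp with the single ROOT/Rtp-<version> directory and
# patch app.properties. replace_rtp and ROOT are illustrative names.
replace_rtp() {
    local root=$1
    local -a versions
    shopt -s nullglob                 # an unmatched glob expands to nothing
    versions=("$root"/Rtp-*/)
    shopt -u nullglob
    # Assumption: exactly one versioned directory is expected at a time.
    if (( ${#versions[@]} != 1 )); then
        echo "expected one Rtp-* directory, found ${#versions[@]}" >&2
        return 1
    fi
    rm -rf -- "$root/Rtp"             # discard the old contents
    # On the same filesystem mv is a rename, so nothing is copied.
    mv -- "${versions[0]%/}" "$root/Rtp"
    # -i edits in place; -e supplies the substitution (GNU sed).
    find "$root/Rtp" -type f -name app.properties \
        -exec sed -i -e 's/myvar=my value/myvar=new value/' {} +
}
```

Calling replace_rtp . from the directory that holds Rtp and Rtp-1.6 would perform the swap. Because the old Rtp is deleted before the rename, this is destructive and worth trying on a copy first.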

How to find/list the directories where a particular sub-directory is not present

I am writing a shell script that checks whether a bin directory is present under every user's directory in /home. The bin directory can be directly under the user's directory or under a child directory of it.
For example, say I have a user amit under /home. The bin directory can be present directly as /amit/bin or as /amit/jash/bin.
My requirement is a list of the user directories where the bin directory is not present, either directly under the user directory or under one of its child directories. I tried this command:
find /home -type d ! -exec test -e '{}/bin' \; -print
but it is not working. However, when I replace the bin directory with some file, the command works fine. It looks like this command only works for files. Is there a similar command for directories? Any help on this will be greatly appreciated.
You're on the right track. The catch is that your test of "does the following directory NOT exist in this target" can't be expressed within find's conditions in such a way as to return only the top-level directory. So you need to nest, one way or another.
One strategy would be to use a for loop in bash:
$ mkdir foo bar baz one two
$ mkdir bar/bin baz/bin
$ for d in /home/*/; do find "$d" -type d -name bin | grep -q . || echo "$d"; done
foo/
one/
two/
This uses pathname expansion (globbing) to generate the list of directories to test, and then checks for the existence of "bin". If that check fails (i.e. find outputs nothing), the directory is printed. Note the trailing slash on /home/*/, which ensures that you will only be searching within directories, rather than files that might accidentally exist in /home/.
Another possibility might be to use nested finds, if you don't want to depend on bash:
$ find /home/ -type d -depth 1 -not -exec sh -c "find {}/ -type d -name bin -print | grep -q . " \; -print
/home/foo
/home/one
/home/two
This roughly duplicates the effect of the bash for loop above, but by nesting find within find -exec. It uses grep -q . to convert the output of find into an exit status that can be used as a condition for the outer find.
Note that since you're looking for a bin directory, we want to use test -d rather than test -e (which would also check for a bin file, which probably does not matter to you.)
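On Linux the -depth 1 form above may not be accepted, since it is BSD find syntax; GNU find spells the same restriction -mindepth 1 -maxdepth 1. A sketch of the loop approach as a reusable function follows. The name homes_without_bin and its optional root argument are made up for illustration, and -quit is assumed to be available (it is in GNU find).

```shell
#!/usr/bin/env bash
# Sketch: print each directory directly under ROOT (default /home) that has
# no directory named bin anywhere inside it. homes_without_bin is an
# illustrative name.
homes_without_bin() {
    local root=${1:-/home} d
    for d in "$root"/*/; do
        [ -d "$d" ] || continue       # the glob matched nothing
        # -mindepth 1 keeps a user directory that is itself called "bin"
        # from counting as a match. -print -quit stops at the first hit,
        # so a large home directory is not walked in full.
        if [ -z "$(find "$d" -mindepth 1 -type d -name bin -print -quit)" ]; then
            printf '%s\n' "${d%/}"
        fi
    done
}
```

Run with no argument it inspects /home; passing another directory makes it easy to try on a test tree first.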
Another option is to use bash process substitution. On multiple lines for easier reading:
cd /home/
comm -3 \
<(printf '%s\n' */ | sed 's|/.*||' | sort) \
<(find */ -type d -name bin | cut -d/ -f1 | uniq)
This unfortunately requires you to change to the /home directory before running, because of the way it strips off subdirectories. You can of course collapse this into a big long one-liner if you feel so inclined.
This comm solution also has the risk of failing on directories with special characters in their names, like newlines.
One last option is bash-only but more than a one-liner. It involves subtracting the directories containing "bin" from the full list. It uses an associative array and globstar, so it depends on bash version 4.
#!/usr/bin/env bash
shopt -s globstar
# Go to our root
cd /home
# Declare an associative array
declare -A dirs=()
# Populate the array with our "full" list of home directories
for d in */; do dirs[${d%/}]=""; done
# Remove directories that contain a "bin" somewhere inside 'em
for d in **/bin; do unset dirs[${d%%/*}]; done
# Print the result in reproducible form
declare -p dirs
# Or print the result just as a list of words.
printf '%s\n' "${!dirs[@]}"
Note that we're storing directories in the array index, which (1) makes it easy for us to find and delete items, and (2) ensures unique entries, even if one user has multiple "bin" directories under their home.
cd /home
find . -maxdepth 1 -type d ! -name . | sort > a
find . -type d -name bin | cut -d/ -f1,2 | sort > b
comm -23 a b
Here, I'm making two sorted lists. The first contains all the home directories, and the second contains the top parent of any bin subdirectory. Finally I output any items from the first list not present in the second.

Find all directories containing a file that contains a keyword in linux

In my hierarchy of directories I have many text files called STATUS.txt. These text files each contain one keyword such as COMPLETE, WAITING, FUTURE or OPEN. I wish to execute a shell command of the following form:
./mycommand OPEN
which will list all the directories that contain a file called STATUS.txt, where this file contains the text "OPEN"
In future I will want to extend this script so that the directories returned are sorted. Sorting will be determined by a numeric value stored in the file PRIORITY.txt, which lives in the same directories as STATUS.txt. However, this can wait until my competence level improves. For the time being I am happy to list the directories in any order.
I have searched Stack Overflow for the following, but to no avail:
unix filter by file contents
linux filter by file contents
shell traverse directory file contents
bash traverse directory file contents
shell traverse directory find
bash traverse directory find
linux file contents directory
unix file contents directory
linux find name contents
unix find name contents
shell read file show directory
bash read file show directory
bash directory search
shell directory search
I have tried the following shell commands:
This helps me identify all the directories that contain STATUS.txt
$ find ./ -name STATUS.txt
This reads STATUS.txt for every directory that contains it
$ find ./ -name STATUS.txt | xargs -I{} cat {}
This doesn't return any text, I was hoping it would return the name of each directory
$ find . -type d | while read d; do if [ -f STATUS.txt ]; then echo "${d}"; fi; done
... or the other way around:
find . -name "STATUS.txt" -exec grep -lF "OPEN" \{} +
If you want to wrap that in a script, a good starting point might be:
#!/bin/sh
[ $# -ne 1 ] && echo "One argument required" >&2 && exit 2
find . -name "STATUS.txt" -exec grep -lF "$1" \{} +
As pointed out by @BroSlow, if you are looking for directories containing the matching STATUS.txt files, this might be more what you are looking for:
fgrep --include='STATUS.txt' -rl 'OPEN' | xargs -L 1 dirname
Or better
fgrep --include='STATUS.txt' -rl 'OPEN' |
sed -e 's|^[^/]*$|./&|' -e 's|/[^/]*$||'
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# simulate `xargs -L 1 dirname` using `sed`
# (no trailing `\`; returns `.` for path without dir part)
Maybe you can try this:
grep -rl "OPEN" . --include='STATUS.txt'| sed 's/STATUS.txt//'
where grep -r means recursive , -l means only list the files matching, '.' is the directory location. You can pipe it to sed to remove the file name.
You can then wrap this in a bash script file where you can pass in keywords such as 'OPEN', 'FUTURE' as an argument.
#!/bin/bash
grep -rl "$1" . --include='STATUS.txt'| sed 's/STATUS.txt//'
Try something like this
find -type f -name "STATUS.txt" -exec grep -q "OPEN" {} \; -exec dirname {} \;
or in a script
#!/bin/bash
(($#==1)) || { echo "Usage: $0 <pattern>" && exit 1; }
find -type f -name "STATUS.txt" -exec grep -q "$1" {} \; -exec dirname {} \;
You could use grep and awk instead of find:
grep -r OPEN * | awk '{split($1, path, ":"); print path[1]}' | xargs -I{} dirname {}
The above grep will list all files containing "OPEN" recursively inside your dir structure. The result will be something like:
dir_1/subdir_1/STATUS.txt:OPEN
dir_2/subdir_2/STATUS.txt:OPEN
dir_2/subdir_3/STATUS.txt:OPEN
Then the awk script will split this output at the colon and print the first part of it (the dir path).
dir_1/subdir_1/STATUS.txt
dir_2/subdir_2/STATUS.txt
dir_2/subdir_3/STATUS.txt
The dirname will then return only the directory path, not the file name, which I suppose is what you want.
I'd consider using Perl or Python if you want to evolve this further, though, as it might get messier if you want to add priorities and sorting.
Taking up the accepted answer, it does not output a sorted and unique directory list. At the end of the "find" command, add:
| sort -u
or:
| sort | uniq
to get the unique list of the directories.
Credits go to Get unique list of all directories which contain a file whose name contains a string.
IMHO you should write a Python script which:
Examines your directory structure and finds all files named STATUS.txt.
For each found file:
reads the file and executes mycommand depending on what the file contains.
If you want to extend the script later with sorting, you can find all the interesting files first, save them to a list, sort the list and execute the commands on the sorted list.
Hint: http://pythonadventures.wordpress.com/2011/03/26/traversing-a-directory-recursively/
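The sorting extension mentioned in the question can also be done without leaving the shell. The sketch below is only one possible approach and rests on assumptions: the function name list_by_priority is made up, each PRIORITY.txt is assumed to hold a single number on its first line, a missing PRIORITY.txt is treated as priority 0, and lower numbers are listed first. It needs GNU grep for -Z.

```shell
#!/usr/bin/env bash
# Sketch: list directories whose STATUS.txt contains KEYWORD, ordered by the
# number in the neighbouring PRIORITY.txt. list_by_priority is an
# illustrative name; ROOT defaults to the current directory.
list_by_priority() {
    local keyword=$1 root=${2:-.} status dir prio
    # -r recurse, -l print only names of matching files, -F literal keyword,
    # -Z terminate each name with NUL so odd file names survive the pipe.
    grep -rlFZ --include='STATUS.txt' -- "$keyword" "$root" |
    while IFS= read -r -d '' status; do
        dir=$(dirname -- "$status")
        prio=0                        # assumption: no PRIORITY.txt means 0
        [ -r "$dir/PRIORITY.txt" ] && read -r prio < "$dir/PRIORITY.txt"
        printf '%s\t%s\n' "${prio:-0}" "$dir"
    done |
    sort -t $'\t' -k1,1n |            # numeric sort on the priority column
    cut -f2-                          # drop the priority, keep the path
}
```

Saved in a script that ends with list_by_priority "$1", it would be invoked as ./mycommand OPEN, matching the form asked for in the question.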

Create a bash script to delete folders which do not contain a certain filetype

I have recently run into a problem.
I used a utility to move all my music files into directories based on tags. This left a LOT of almost empty folders. The folders, in general, contain a thumbs.db file or some sort of image for album art. The mp3s have the correct album art in their new directories, so the old ones are okay to delete.
Basically, I need to find any directories within D:/Music/ that:
-Do not have any subdirectories
-Do not contain any mp3 files
And then delete them.
I figured this would be easier to do in a shell script or bash script or whatever else linux/unix world than in Windows 8.1 (HAHA).
Any suggestions? I'm not very experienced writing scripts like this.
This should get you started
find /music -mindepth 1 -type d |
while IFS= read -r dt
do
find "$dt" -mindepth 1 -type d | read && continue
find "$dt" -iname '*.mp3' -type f | read && continue
echo DELETE "$dt"
done
Here's the short story...
find . \( -name '*.mp3' -o -type d \) -printf '%h\n' | sort | uniq > non-empty-dirs.tmp
find . -type d -print | sort | uniq > all-dirs.tmp
comm -23 all-dirs.tmp non-empty-dirs.tmp > dirs-to-be-deleted.tmp
less dirs-to-be-deleted.tmp
cat dirs-to-be-deleted.tmp | xargs -d '\n' rm -rf
Note that you might have to run all the commands a few times (depending on your repository's directory depth) before you're done deleting all recursive empty directories...
And the long story goes...
You can approach this problem from two basic perspectives. Either you find all directories, then iterate over each of them and check whether it contains any mp3 file or any subdirectory; if not, you mark that directory for deletion. It will work, but on very large repositories you can expect a significant run time.
Another approach, which is in my opinion much more interesting, is to build a list of directories NOT to be deleted, and subtract that list from the list of all directories. Let's work through the second strategy, one step at a time...
First of all, to find the path of all directories that contain mp3 files, you can simply do:
find . -name '*.mp3' -printf '%h\n' | sort | uniq
This means "find any file ending with .mp3, then print the path to its parent directory".
Now, I could certainly name at least ten different approaches to find directories that contain at least one subdirectory, but keeping the same strategy as above, we can easily get...
find . -type d -printf '%h\n' | sort | uniq
What this means is: "Find any directory, then print the path to its parent."
Both of these queries can be combined in a single invocation, producing a single list containing the paths of all directories NOT to be deleted. The parentheses are needed so that -printf applies to both tests rather than only to -type d. Let's redirect that list to a temporary file.
find . \( -name '*.mp3' -o -type d \) -printf '%h\n' | sort | uniq > non-empty-dirs.tmp
Let's similarly produce a file containing the paths of all directories, no matter if they are empty or not.
find . -type d -print | sort | uniq > all-dirs.tmp
So there, we have, on one side, the complete list of all directories, and on the other, the list of directories not to be deleted. What now? There are tons of strategies, but here's a very simple one:
comm -23 all-dirs.tmp non-empty-dirs.tmp > dirs-to-be-deleted.tmp
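Once you have that, review it, and if you are satisfied, pipe it through xargs to rm to actually delete the directories. The -d '\n' option (GNU xargs) makes xargs treat each line as one path, so album folders with spaces in their names are handled.
cat dirs-to-be-deleted.tmp | xargs -d '\n' rm -rf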
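The same subtraction can be sketched without temporary files by using bash process substitution. This is an illustration, not a drop-in replacement: the function name list_deletable_dirs is made up, it only prints candidates rather than deleting them, and like the commands above it assumes GNU find and directory names without newlines.

```shell
#!/usr/bin/env bash
# Sketch: print directories under ROOT that contain neither an .mp3 file nor
# a subdirectory. list_deletable_dirs is an illustrative name.
list_deletable_dirs() {
    local root=$1
    # First input: every directory below ROOT.
    # Second input (the keep-list): the parent (%h) of every .mp3 file and of
    # every directory, i.e. each directory holding music or a subdirectory.
    # The parentheses make -printf apply to both tests.
    # comm -23 prints the lines that appear only in the first input.
    comm -23 \
        <(find "$root" -mindepth 1 -type d -print | sort -u) \
        <(find "$root" -mindepth 1 \( -type f -iname '*.mp3' -o -type d \) \
              -printf '%h\n' | sort -u)
}
```

After reviewing the output, it could be fed to xargs -d '\n' rm -rf in the same way as the list file above. As with the original commands, several passes may be needed before parents that become empty are picked up.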
