Linux command to concatenate multiple files with content separated by filenames?

I am looking for a command that will concatenate multiple files in a directory tree, with names matching a pattern, such that the resulting file has the contents of all the files separated by the name (path) of each file. I tried using find -exec and sed but couldn't get it to work. Please help.
More specifically, I have a directory containing many sub-directories, each having a file named 'test.FAILED'. I want to concatenate all the test.FAILED files, separated by their paths, so that I can look at all of them at the same time.

for i in <pattern>
do
echo "$i"
cat "$i"
done > output

Using (gnu) find:
find . -name \*.FAILED -print -exec cat "{}" \;
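A minimal sketch of the same idea on throwaway data, assuming a scratch tree made with mktemp (the directory names a and b, the header format, and the output name all_failed.txt are made up for illustration). It prints a header line with each path before that file's contents, and uses -exec ... {} + so that one shell handles many files:

```shell
#!/bin/sh
# Build a small sample tree (hypothetical names, for illustration only).
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
echo "failure in a" > "$tmp/a/test.FAILED"
echo "failure in b" > "$tmp/b/test.FAILED"

cd "$tmp" || exit 1

# For every matching file, print a header with its path, then its contents.
# The output name does not match the search pattern, so it is never re-read.
find . -type f -name 'test.FAILED' -exec sh -c '
  for f do
    printf "==> %s <==\n" "$f"
    cat "$f"
  done' sh {} + > all_failed.txt

cat all_failed.txt
```

The order of the files follows find's traversal order, so pipe the file list through sort first if a stable order matters.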

Related

Copying a type of file, in specific directories, to another directory

I have a .txt file that contains a list of directories. I want to make a script that goes through this .txt file and copies every file of a certain type from each listed directory to another directory.
I've never done this with directories, only files.
How can I edit this simple script so that it reads a directory list, looks for .csv files, and copies them to another directory?
cat filenames.list | \
while read FILENAME
do
find . -name "$FILENAME" -exec cp '{}' new_dir \;
done
for DIRNAME in $(cat dirname.list); do find "$DIRNAME" -type f -name "*.csv" -exec cp {} dest \; ; done
Sorry, in my first answer I didn't understand what you were asking for.
The line above simply takes each entry in your directory list as a path, searches it for files ending in ".csv", and copies each one into the destination you want.
But you could do it with less code:
for DIRNAME in $(cat dirname.list); do cp "$DIRNAME"/*.csv dest ; done
Despite the filename of the list filenames.list, let me assume the file contains the list of directory names, not filenames. Then would you please try:
while IFS= read -r dir; do
find "$dir" -type f -name "*.mp3" -exec cp -p -- {} new_dir \;
done < filenames.list
The find command searches in "$dir" for files which have an extension .mp3 then copies them to the new_dir.
The script above does not handle duplicate filenames: a later file silently overwrites an earlier one with the same name. If you want to keep the original directory tree and/or need a countermeasure for duplicate filenames, please let me know.
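One possible countermeasure for duplicate names, sketched here on scratch data, is to recreate each file's directory path under the destination. This is only an illustration under stated assumptions: the names src/x, src/y and song.mp3 are hypothetical, and the listed directories are assumed to be relative paths.

```shell
#!/bin/sh
# Scratch data: two directories holding a file with the same name.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/x" "$tmp/src/y" "$tmp/new_dir"
echo one > "$tmp/src/x/song.mp3"
echo two > "$tmp/src/y/song.mp3"
printf '%s\n' src/x src/y > "$tmp/filenames.list"

cd "$tmp" || exit 1

# Copy each match to new_dir/<original relative path>, creating the
# parent directories first so equal file names cannot collide.
while IFS= read -r dir; do
  find "$dir" -type f -name '*.mp3' -exec sh -c '
    for f do
      mkdir -p "new_dir/$(dirname "$f")" &&
      cp -p -- "$f" "new_dir/$f"
    done' sh {} +
done < filenames.list

find new_dir -type f | sort
```

With GNU cp, cp --parents does the path recreation in one step; the loop above avoids depending on it.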
Using find inside a while loop works, but find runs once per line of the file. An alternative is to save the list in an array so that find can search all the listed directories in a single run.
If you have bash4+ you can use mapfile.
mapfile -t directories < filenames.list
If you're stuck on bash 3:
directories=()
while IFS= read -r line; do
directories+=("$line")
done < filenames.list
Now, if you're after just one file type, such as files ending in .csv:
find "${directories[@]}" -type f -name '*.csv' -exec sh -c 'cp -v -- "$@" /newdirectory' _ {} +
If you have multiple file types to match and multiple directories to copy the files to:
while IFS= read -r -d '' file; do
case $file in
*.csv) cp -v -- "$file" /foodirectory;; ##: csv file copy to foodirectory
*.mp3) cp -v -- "$file" /bardirectory;; ##: mp3 file copy to bardirectory
*.avi) cp -v -- "$file" /bazdirectory;; ##: avi file copy to bazdirectory
esac
done < <(find "${directories[@]}" -type f -print0)
find's -print0 pairs with read's -d '' to handle file names containing whitespace and newlines. See "How can I find and deal with file names containing newlines, spaces or both?"
The -- is there so that if you have a problematic filename that starts with a dash (-), cp will not interpret it as an option.
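The pieces above can be tried end to end on scratch data. This is only a sketch: the list file, the directory names and the two destination folders are hypothetical and created under mktemp -d, and it assumes bash 4+ for mapfile.

```shell
#!/usr/bin/env bash
# Scratch data: every name below is made up for the demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir one" "$tmp/dir2/sub" "$tmp/csv_out" "$tmp/mp3_out"
touch "$tmp/dir one/a.csv" "$tmp/dir2/sub/b.mp3" "$tmp/dir2/skip.txt"
printf '%s\n' "$tmp/dir one" "$tmp/dir2" > "$tmp/filenames.list"

# Read the list into an array, one directory per line (bash 4+).
mapfile -t directories < "$tmp/filenames.list"

# One find run over all directories; route each file by extension.
# NUL-delimited names survive spaces and newlines in paths.
while IFS= read -r -d '' file; do
  case $file in
    *.csv) cp -- "$file" "$tmp/csv_out" ;;
    *.mp3) cp -- "$file" "$tmp/mp3_out" ;;
  esac
done < <(find "${directories[@]}" -type f -print0)

ls "$tmp/csv_out" "$tmp/mp3_out"
```

The directory "dir one" contains a space on purpose, to show that quoting the array expansion keeps it as a single argument.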
Given find's ability to process multiple folders, and assuming the goal is to 'flatten' all csv files into a single destination, consider the following.
Note that it assumes folder names do not have special characters (including spaces, tabs, new lines, etc).
As a side benefit, it will minimize the number of 'cp' calls, making the process efficient across a large number of files/folders.
find $(<filenames.list) -name '*.csv' | xargs cp -t DESTINATION/
For the more complex case, where folder names and file names can contain anything (including spaces, '*', etc.), consider using a NUL separator (-print0 and -0).
xargs -I{} -t find '{}' -name '*.csv' -print0 <filenames.list | xargs -0 -I{} -t cp -t new/ '{}'
This forks one find per listed directory and one cp per file.

Best way to tar and zip files meeting specific name criteria?

I'm writing a shell script on a Linux machine to be run via a crontab which is meant to move all files older than the current day to a new folder, and then tar and zip the entire folder. Seems like a simple task but for some reason, I'm running into all kinds of roadblocks. I'm new to this and self-taught so any help or redirection would be greatly appreciated.
Specific criteria for which files to archive:
All log files are in /home/tech/logs/ and all pdfs are in /home/tech/logs/pdf
All files are over a day old as indicated by the file name (file name does not include $CURRENT_DATE)
All files must be *.log or *.pdf (i.e. don't archive files that don't include $CURRENT_DATE if they aren't log or pdf files).
Filename formatting specifics:
All the log file names are in /home/tech/logs in the format NAME 00_20180510.log, and all the pdf files are in a "pdf" subdirectory (/home/tech/logs/pdf) with the format NAME 00_20180510_00000000.pdf ("20180510" would be whenever the file was created and the 0's would be any number). I need to use the name rather than the file metadata for the creation date, and all files (pdf/log) whose name does not include the current date are "old". I also can't just move all files that don't contain $CURRENT_DATE in the name because it would take any non-*.pdf or *.log files with it.
Right now the script creates a new folder with a new pdf subdir for the old files (mkdir -p /home/tech/logs/$ARCHIVE_NAME/pdf). I then want to move the old logs into $ARCHIVE_NAME, and move all old pdfs from the original pdf subdirectory into $ARCHIVE_NAME/pdf.
Current code:
find /home/tech/logs -maxdepth 1 -name ( "*[^$CURRENT_DATE].log" "*.log" ) -exec mv -t "$ARCHIVE_NAME" '{}' ';'
find /home/tech/logs/pdf -maxdepth 1 -name ( "*[^$CURRENT_DATE]*.pdf" "*.pdf" ) -exec mv -t "$ARCHIVE_NAME/pdf" '{}' ';'
This hasn't been working because it treats the numbers in $CURRENT_DATE as a list of numbers to exclude rather than a literal string.
I've considered just using tar's exclude options like this:
tar -cvzPf "$ARCHIVE_NAME.tgz" --directory /home/tech/logs --exclude="$CURRENT_DATE" --no-unquote --recursion --remove-files --files-from="/home/tech/logs/"
But a) it doesn't work, and b) it would theoretically include all files that weren't *.pdf or *.log files, which would be a problem.
Am I overcomplicating this? Is there a better way to go about this?
I would go about this using bash's extended glob features, which allow you to negate a pattern:
#!/bin/bash
shopt -s extglob
mv /home/tech/logs/!(*"$CURRENT_DATE"*).log "$ARCHIVE_NAME"
mv /home/tech/logs/pdf/!(*"$CURRENT_DATE"*).pdf "$ARCHIVE_NAME"/pdf
With extglob enabled, !(pattern) expands to everything that doesn't match the pattern (or list of pipe-separated patterns).
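A small sketch for checking a negated pattern before moving anything, assuming bash and scratch files created under mktemp -d (the file names and the fixed date are made up for the demo). Note that the negation has to wrap the whole stem: a pattern such as *!(DATE)*.log matches every .log file, because !(DATE) can match the empty string while the surrounding * absorbs the date.

```shell
#!/bin/bash
# extglob must be enabled on its own line before the pattern is parsed.
shopt -s extglob nullglob

tmp=$(mktemp -d)
CURRENT_DATE=20180510   # fixed value so the demo is repeatable
touch "$tmp/NAME 00_20180509.log" "$tmp/NAME 00_20180510.log" "$tmp/notes.txt"

# Names ending in .log whose stem does not contain the current date.
old_logs=( "$tmp"/!(*"$CURRENT_DATE"*).log )

# Print just the base names of the matches.
printf '%s\n' "${old_logs[@]##*/}"
```

Collecting the matches in an array first makes it easy to inspect them and to skip the mv entirely when nothing matched.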
Using find it should also be possible:
find /home/tech/logs -name '*.log' -not -name "*$CURRENT_DATE*" -exec mv -t "$ARCHIVE_NAME" {} +
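The find variant can be tried on scratch data as follows. This is a sketch under stated assumptions: the paths live under mktemp -d rather than /home/tech/logs, the date is fixed, and the small sh -c wrapper stands in for GNU mv's -t option so the destination can be passed as the last argument.

```shell
#!/bin/sh
tmp=$(mktemp -d)
CURRENT_DATE=20180510              # fixed here for a repeatable demo
ARCHIVE_NAME="$tmp/logs/archive"   # hypothetical archive folder
mkdir -p "$tmp/logs" "$ARCHIVE_NAME"
touch "$tmp/logs/NAME 00_20180509.log" \
      "$tmp/logs/NAME 00_20180510.log" \
      "$tmp/logs/notes.txt"

# Move .log files whose name lacks the current date; -maxdepth 1 keeps
# find out of the archive folder and any other subdirectory.
find "$tmp/logs" -maxdepth 1 -type f -name '*.log' ! -name "*$CURRENT_DATE*" \
  -exec sh -c 'dest=$1; shift; mv -- "$@" "$dest"' sh "$ARCHIVE_NAME" {} +

ls "$ARCHIVE_NAME"
```

Only the log from the previous day ends up in the archive folder; the current log and the non-log file stay where they were.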
Building on @tom-fenech's answer, optimized to avoid many mv invocations (NUL-separated so that file names containing spaces survive):
find /home/tech/logs -maxdepth 1 -name '*.log' -not -name "*_${CURRENT_DATE?}.log" -print0 | \
xargs -0 -r mv -t "${ARCHIVE_NAME?}"
A useful side effect of passing the file names through a pipe is that you can filter them with extra tools such as grep, which can (arguably) be more readable:
find /home/tech/logs -maxdepth 1 -name '*.log' | grep -Fv "_${CURRENT_DATE?}" | \
xargs -d '\n' -r mv -t "${ARCHIVE_NAME?}"
Do the same for the pdf files. You can "dry-run" the commands above by replacing mv with echo mv.
--jjo

Finding a file within recursive directory of zip files

I have an entire directory structure with zip files. I would like to:
Traverse the entire directory structure recursively grabbing all the zip files
I would like to find a specific file "*myLostFile.ext" within one of these zip files.
What I have tried
1. I know that I can list files recursively pretty easily:
find myLostfile -type f
2. I know that I can list files inside zip archives:
unzip -ls myfilename.zip
How do I find a specific file within a directory structure of zip files?
You can skip find for single-level searches of .zip files (or recursive ones in bash 4 with globstar) by using a for loop:
for i in *.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done
For recursive searching in bash 4:
shopt -s globstar
for i in **/*.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done
You can use xargs to process the output of find, or you can do something like the following:
find . -type f -name '*.zip' -exec sh -c 'unzip -l "$1" | grep -q myLostfile' _ {} \; -print
This starts searching in . for files that match *.zip, runs unzip -l on each one, and searches the listing for your filename. If the filename is found, find prints the name of the zip file that contains it.
Some have suggested using ugrep to search zip files and tarballs. To find the zip files that contain a myLostfile file, specify it as a -g glob pattern like so:
ugrep -z -l -g'myLostfile' ''
With the empty regex pattern '', this recursively searches all files below the working directory, including any zip, tar, cpio/pax archives, for myLostfile. If you only want to search the zip files located in the working directory:
ugrep -z -l -g'myLostfile' '' *.zip

Search for text files in a directory and append a (static) line to each of them

I have a directory with many subdirectories and files with suffixes in those subdirectories (e.g FileA-suffixA FileB-SuffixB FileC-SuffixC FileD-SuffixA, etc).
How can I recursively search for files with a certain suffix, and append a user-defined line of text to those files? I feel like this is a job for grep and sed, but I'm not sure how I would go about doing it. I'm fairly new to scripting, so please bear with me.
You can do it like
find /where/to/search -type f -iname '*.SUFFIX' -exec sh -c 'echo "USER DEFINED STRING" >> "$1"' _ {} \;
find searches in the supplied path
-type f finds only files
-iname '*.SUFFIX' find the .SUFFIXed names, case ignored
find ./ -name "*suffix" -exec bash -c 'echo "line_to_add" >> "$1"' -- {} \;
Basically, you use find to get a list of the files. Then you use bash to echo your line and append it to each file in that list. The redirection has to happen inside the shell started by -exec; otherwise the calling shell applies it once, to a file literally named {}.
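A self-contained sketch of the append step on scratch files, assuming hypothetical names (FileA-suffixA, FileB-suffixB) and a made-up line of text. The line is passed to the child shell as an argument, so it never has to be spliced into the script text:

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
printf 'first\n' > "$tmp/sub/FileA-suffixA"
printf 'first\n' > "$tmp/FileB-suffixB"

# Append the line to each match; the >> runs inside the child shell,
# once per file, with "$f" set to that file's path.
find "$tmp" -type f -name '*-suffixA' -exec sh -c '
  line=$1; shift
  for f do
    printf "%s\n" "$line" >> "$f"
  done' sh "user-defined line" {} +

cat "$tmp/sub/FileA-suffixA"
```

Files with a different suffix are left untouched, which is easy to confirm by checking FileB-suffixB afterwards.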

How to pass hundreds of files from multiple subdirectories to cat?

I have a directory full of subdirectories and each subdirectory has some text files inside of them (i.e. depth is 1).
I'd like to cat all these files (in no particular order) into one file:
cat file1 file2.... fileN >new.txt
Is there a bash shell one-liner that could list all the files inside of these directories and pass them to cat?
How about this?
find . -name '*.txt' ! -name concatenated.txt -exec cat {} \; > concatenated.txt
Granted it calls cat a bunch of times rather than just once, but the effect is the same.
find . -type f -print0 | xargs -0 cat
find will recursively search for files (-type f) and print their names as null-terminated strings (-print0).
xargs will read null terminated strings (-0) from stdin and pass them as arguments to cat
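A middle ground between the two answers is find's own batching form, -exec ... {} +, which passes many file names to each cat much like xargs does. The sketch below assumes scratch files under mktemp -d (the names d1, d2, one.txt and two.txt are made up) and writes the result outside the searched tree so the output file is never picked up as input:

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/d1" "$tmp/d2"
printf 'alpha\n' > "$tmp/d1/one.txt"
printf 'beta\n'  > "$tmp/d2/two.txt"

# The output file lives outside the tree being searched.
out=$(mktemp)

# {} + hands cat a batch of names, so cat runs far fewer times than
# with the \; form.
find "$tmp" -type f -name '*.txt' -exec cat {} + > "$out"

sort "$out"
```

The order of the concatenated pieces follows find's traversal order, which matches the "in no particular order" requirement in the question.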

Resources