How to use a text file containing a list of files as input for the cat Linux command?

I would like to use the cat command to concatenate multiple files located in different folders. I have a text file containing the name and location (path) of each file as a long list (e.g. filesLocationNames.txt), and would like to use it as an input for the cat command.
I tried: 'cat filesLocationNames.txt | cat * > output.txt'
but it didn't work.

How about cat filesLocationNames.txt | xargs cat > output.txt

cat will pipe input to output, so piping the names in won't work. You need to supply the filenames as arguments to cat. E.g.:
cat `cat filesLocationNames.txt`
But that has the same problem with spaces in filenames/pathnames...
In TCSH you can try:
cat "`cat filesLocationNames.txt`"
That is a double quote (") followed by a backquote (`) at the start, with no space between them, and the reverse at the end. This form will handle spaces in the names...
Also, in TCSH:
foreach FILE ( "`cat filesLocationNames.txt`" )
echo $FILE
end
Will handle spaces in the names...
Only one catch: If filesLocationNames.txt is too long, it will exceed the line buffer and you'll need xargs. How big is it?
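For Bourne-style shells, a minimal sketch of a loop that copes with spaces in paths is shown here. It assumes the list holds exactly one path per line; the directories and file contents are invented for the demo. It runs cat once per file, trading a little speed for robustness.

```shell
# Demo data (hypothetical): two files, one under a directory with a space.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir one" "$tmp/dir2"
printf 'alpha\n' > "$tmp/dir one/a.txt"
printf 'beta\n'  > "$tmp/dir2/b.txt"
printf '%s\n' "$tmp/dir one/a.txt" "$tmp/dir2/b.txt" > "$tmp/filesLocationNames.txt"

# Read one path per line. IFS= keeps leading/trailing blanks and
# -r keeps backslashes, so each line reaches cat as a single argument.
while IFS= read -r path; do
    cat -- "$path"
done < "$tmp/filesLocationNames.txt" > "$tmp/output.txt"

cat "$tmp/output.txt"
```

Because the redirection is applied to the whole loop, output.txt is opened only once, and the length of the list never hits the command-line limit.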

Related

Store files to a list with ls, while removing parts of the dirname with sed

I have a folder lets say located here:
/Users/spotter/Downloads
and within the root folder there are two files:
test1.txt and test2.txt.
I want to write a shell script to save all the files to a list with a line like this:
file_list="$(ls /Users/spotter/Downloads)"
and echo $file_list will return:
/Users/spotter/Downloads/test1.txt
/Users/spotter/Downloads/test2.txt
However I want to change part of the dirname. Particularly I want to remove the /Users/spotter part.
I tried this like so:
file_list="$(ls /Users/spotter/Downloads |
while read path; do dirname "$path" | sed 's/users/spotter///'; done)"
which returns:
sed: 1: "s/users/spotter/Downloa ...": bad flag in substitute command: 'D'
sed: 1: "s/users/spotter/Downloa ...": bad flag in substitute command: 'D'
when I do echo $file_list I want this to be the output:
Downloads/test1.txt
Downloads/test2.txt
The problem is that sed thinks '/' is the delimiter between the RE and the substitution, so sed is not reading the other '/'s the way you want it to. You can use other characters as a delimiter. For instance 's~/Users/spotter/~~'.
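A quick sketch of the alternative delimiter in action, using the two paths from the question as literal sample input (no real directory is read). The '^' anchor is an addition here, so that only a leading /Users/spotter/ is removed.

```shell
# Sample paths taken from the question.
paths='/Users/spotter/Downloads/test1.txt
/Users/spotter/Downloads/test2.txt'

# '~' is the delimiter, so the slashes in the pattern need no escaping.
# The '^' anchor restricts the match to the start of each line.
file_list=$(printf '%s\n' "$paths" | sed 's~^/Users/spotter/~~')

printf '%s\n' "$file_list"
```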

Copy a txt file twice to a different file using bash

I am trying to cat a file.txt and loop it twice through the whole content and copy it to a new file file_new.txt. The bash command I am using is as follows:
for i in {1..3}; do cat file.txt > file_new.txt; done
The above command is just giving me the same file contents as file.txt. Hence file_new.txt is also of the same size (1 GB).
Basically, if file.txt is a 1GB file, then I want file_new.txt to be a 2GB file, double the contents of file.txt. Please, can someone help here? Thank you.
Simply apply the redirection to the for loop as a whole:
for i in {1..3}; do cat file.txt; done > file_new.txt
The advantage of this over using >> (aside from not having to open and close the file multiple times) is that you needn't ensure that a preexisting output file is truncated first.
Note that the generalization of this approach is to use a group command ({ ...; ...; }) to apply redirections to multiple commands; e.g.:
$ { echo hi; echo there; } > out.txt; cat out.txt
hi
there
Given that whole files are being output, the cost of invoking cat for each repetition will probably not matter that much, but here's a robust way to invoke cat only once:[1]
# Create an array of repetitions of filename 'file' as needed.
files=(); for ((i=0; i<3; ++i)); do files[i]='file'; done
# Pass all repetitions *at once* as arguments to `cat`.
cat "${files[@]}" > file_new.txt
[1] Note that, hypothetically, you could run into your platform's command-line length limit, as reported by getconf ARG_MAX - given that on Linux that limit is 2,097,152 bytes (2MB) that's not likely, though.
You could use the append operator, >>, instead of >. Then adjust your loop count as needed to get the output size desired.
You should adjust your code so it is as follows:
for i in {1..3}; do cat file.txt >> file_new.txt; done
The >> operator appends data to a file rather than writing over it (>)
if file.txt is a 1GB file,
cat file.txt > file_new.txt
cat file.txt >> file_new.txt
The > operator will create file_new.txt(1GB),
The >> operator will append file_new.txt(2GB).
for i in {1..3}; do cat file.txt >> file_new.txt; done
This command will make file_new.txt 3GB (assuming it does not already exist), because for i in {1..3} runs three times.
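The difference between the three redirection placements can be checked on a tiny stand-in file. This is only a sketch: the one-line file.txt and the output names are made up for the demo, and line counts stand in for gigabytes.

```shell
tmp=$(mktemp -d)
printf 'line\n' > "$tmp/file.txt"      # stand-in for the 1GB file

# '>' inside the loop: each pass truncates the output, so one copy survives.
for i in 1 2 3; do cat "$tmp/file.txt" > "$tmp/truncated.txt"; done

# '>>' inside the loop: each pass appends, so three copies accumulate.
for i in 1 2 3; do cat "$tmp/file.txt" >> "$tmp/appended.txt"; done

# '>' on the loop as a whole: the file is opened once and gets all three copies.
for i in 1 2 3; do cat "$tmp/file.txt"; done > "$tmp/whole_loop.txt"

# Print the number of lines in each result.
grep -c . "$tmp/truncated.txt" "$tmp/appended.txt" "$tmp/whole_loop.txt"
```

Running the '>>' loop a second time would grow appended.txt to six lines, while the whole-loop form would still produce three.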
As others have mentioned, you can use >> to append. But, you could also just invoke cat once and have it read the file 3 times. For instance:
n=3; cat $( yes file.txt | sed ${n}q ) > file_new.txt
Note that this solution exhibits a common anti-pattern and fails to properly quote the arguments, which will cause issues if the filename contains whitespace. See mklement's solution for a more robust solution.
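For shells without arrays, a whitespace-safe way to invoke cat once is to build the argument list in the positional parameters. The helper name repeat_cat and the sample file are hypothetical; this is a sketch, not a replacement for the array-based answer.

```shell
# repeat_cat FILE N: write N copies of FILE to stdout with a single cat call.
# (repeat_cat is a hypothetical helper name.)
repeat_cat() {
    file=$1
    n=$2
    [ "$n" -gt 0 ] || return 0  # nothing to do; avoid cat reading stdin
    set --                      # clear the positional parameters
    i=0
    while [ "$i" -lt "$n" ]; do
        set -- "$@" "$file"     # append the name as one argument, spaces intact
        i=$((i + 1))
    done
    cat -- "$@"
}

tmp=$(mktemp -d)
printf 'data\n' > "$tmp/my file.txt"
repeat_cat "$tmp/my file.txt" 3 > "$tmp/file_new.txt"
cat "$tmp/file_new.txt"
```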

Looping through a file with path and file names and within these file search for a pattern

I have a file called lookupfile.txt with the following info:
path, including filename
Within bash I would like to search through the files listed in lookupfile.txt for a pattern: myerrorisbeinglookedat. When it is found, output the matching lines into another file. All the results can land in the same file.
Please help.
You can write a single grep statement to achieve this:
grep myerrorisbeinglookedat $(< lookupfile.txt) > outfile
Assuming:
the number of entries in lookupfile.txt is small (tens or hundreds)
there are no white spaces or wildcard characters in the file names
Otherwise:
while IFS= read -r file; do
# print the file names separated by a NULL character '\0'
# to be fed into xargs
printf '%s\0' "$file"
done < lookupfile.txt | xargs -0 grep myerrorisbeinglookedat > outfile
xargs takes output of the loop, tokenizes them correctly and invokes grep command. xargs batches up the files based on operating system limits in case there are a large number of files.
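The NUL-delimited hand-off to xargs can be tried end to end on throwaway files. This sketch swaps the while loop for tr, which performs the same newline-to-NUL conversion when the list holds one path per line; the log file names and contents are invented for the demo.

```shell
tmp=$(mktemp -d)
printf 'ok\nmyerrorisbeinglookedat here\n' > "$tmp/app one.log"
printf 'myerrorisbeinglookedat again\n'    > "$tmp/app2.log"
printf '%s\n' "$tmp/app one.log" "$tmp/app2.log" > "$tmp/lookupfile.txt"

# Turn each newline into a NUL so xargs -0 treats every line as one argument,
# even when the path contains spaces.
tr '\n' '\0' < "$tmp/lookupfile.txt" |
    xargs -0 grep myerrorisbeinglookedat > "$tmp/outfile"

cat "$tmp/outfile"
```

Since grep receives more than one file here, each matching line in outfile is prefixed with the name of the file it came from.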

Linux bash output directory files to a text file with xargs and add new line

I want to generate a text file with the list of files present in the folder
ls | xargs echo > text.txt
I want to prepend the IP address to each file so that I can run parallel wget as per this post : Parallel wget in Bash
So my text.txt file content will have these lines :
123.123.123.123/file1
123.123.123.123/file2
123.123.123.123/file3
How can I append a string as the ls feeds xargs? (and also add line break at the end.)
Thank you
Simply printf and globbing to get the filenames:
printf '123.123.123.123/%s\n' * >file.txt
Or longer approach, leverage a for construct with help from globbing:
for f in *; do echo "123.123.123.123/$f"; done >file.txt
Assuming no filename with newline exists.
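A self-contained sketch of the printf approach on three empty sample files (file1 to file3, mirroring the question). The scratch directory layout is an assumption made for the demo.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/files"
touch "$tmp/files/file1" "$tmp/files/file2" "$tmp/files/file3"

# Run the glob inside the directory, but write the list outside it so the
# output file cannot show up in its own listing.
( cd "$tmp/files" && printf '123.123.123.123/%s\n' * ) > "$tmp/text.txt"

cat "$tmp/text.txt"
```

printf reuses its format string for every argument, which is why a single call emits one prefixed line per file name.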

How to append contents of multiple files into one file

I want to copy the contents of five files to one file as is. I tried doing it using cp for each file. But that overwrites the contents copied from the previous file. I also tried
paste -d "\n" 1.txt 0.txt
and it did not work.
I want my script to add the newline at the end of each text file.
eg. Files 1.txt, 2.txt, 3.txt. Put contents of 1,2,3 in 0.txt
How do I do it ?
You need the cat (short for concatenate) command, with shell redirection (>) into your output file
cat 1.txt 2.txt 3.txt > 0.txt
Another option, for those of you who still stumble upon this post like I did, is to use find -exec:
find . -type f -name '*.txt' -exec cat {} + >> output.file
In my case, I needed a more robust option that would look through multiple subdirectories so I chose to use find. Breaking it down:
find .
Look within the current working directory.
-type f
Only interested in files, not directories, etc.
-name '*.txt'
Whittle down the result set by name
-exec cat {} +
Execute the cat command on the results. The "+" passes as many file names as possible to each cat invocation, so normally only 1 instance of cat is spawned (thx @gniourf_gniourf)
>> output.file
As explained in other answers, append the cat-ed contents to the end of an output file.
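The breakdown above can be exercised on a small scratch tree. The directory names and file contents below are hypothetical, and the output is written outside the searched tree so that find cannot pick it up.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub/deeper"
printf 'one\n'     > "$tmp/src/1.txt"
printf 'two\n'     > "$tmp/src/sub/2.txt"
printf 'three\n'   > "$tmp/src/sub/deeper/3.txt"
printf 'skipped\n' > "$tmp/src/sub/notes.md"

# Concatenate every *.txt file found at any depth under src.
find "$tmp/src" -type f -name '*.txt' -exec cat {} + > "$tmp/output.file"

# find's traversal order is not guaranteed, so sort only for display.
sort "$tmp/output.file"
```

If the order of the pieces matters, the file names would need to be sorted before they reach cat, since find makes no promise about traversal order.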
if you have a certain output type then do something like this
cat /path/to/files/*.txt >> finalout.txt
If all your files are named similarly you could simply do:
cat *.log >> output.log
If all your files are in single directory you can simply do
cat * > 0.txt
Files 1.txt,2.txt, .. will go into 0.txt
for i in {1..3}; do cat "$i.txt" >> 0.txt; done
I found this page because I needed to join 952 files together into one. I found this to work much better if you have many files. This will do a loop for however many numbers you need and cat each one using >> to append onto the end of 0.txt.
Edit:
as brought up in the comments:
cat {1..3}.txt >> 0.txt
or
cat {0..3}.txt >> all.txt
Another option is sed:
sed r 1.txt 2.txt 3.txt > merge.txt
Or...
sed h 1.txt 2.txt 3.txt > merge.txt
Or...
sed -n p 1.txt 2.txt 3.txt > merge.txt # -n is mandatory here
Or without redirection ...
sed wmerge.txt 1.txt 2.txt 3.txt
Note that the last command also writes to merge.txt (not wmerge.txt!). You can use w"merge.txt" to avoid confusion with the file name, and -n for silent output.
Of course, you can also shorten the file list with wildcards. For instance, in case of numbered files as in the above examples, you can specify the range with braces in this way:
sed -n w"merge.txt" {1..3}.txt
If your files contain headers and you want to remove them in the output file, you can use:
for f in *.txt; do sed '2,$!d' "$f" >> 0.out; done
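A small sketch of header stripping on hypothetical sample files. It uses a glob directly rather than parsing ls output and quotes "$f" so that names with spaces survive; '1d' is used as a shorter equivalent of '2,$!d'.

```shell
tmp=$(mktemp -d)
printf 'id,name\n1,a\n' > "$tmp/1.txt"
printf 'id,name\n2,b\n' > "$tmp/2.txt"

# '1d' deletes the first (header) line of each file and prints the rest.
for f in "$tmp"/*.txt; do
    sed '1d' "$f"
done > "$tmp/0.out"

cat "$tmp/0.out"
```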
All of the (text-) files into one
find . | xargs cat > outfile
xargs makes the output-lines of find . the arguments of cat.
find has many options, like -name '*.txt' or -type.
you should check them out if you want to use it in your pipeline
Note that plain cat copies bytes unchanged, so non-printable characters in the original files are preserved in the output file; they may simply not display properly on a terminal. With 'cat -v', the non-printables are converted to visible character strings, so the output file no longer contains the actual non-printable characters of the original file. With a small number of files, an alternative is to open the first file in an editor (e.g. vim) that handles non-printing characters. Then move to the bottom of the file and enter ":r second_file_name". That will pull in the second file, including non-printing characters. The same can be done for additional files. When all files have been read in, enter ":w". The end result is that the first file will now contain what it did originally, plus the content of the files that were read in.
Send multi file to a file(textall.txt):
cat *.txt > textall.txt
If you want to append contents of 3 files into one file, then the following command will be a good choice:
cat file1 file2 file3 | tee -a file4 > /dev/null
It will combine the contents of all files into file4, throwing console output to /dev/null.
