Linux: Merging multiple files, each on a new line

I am using cat *.txt to merge multiple txt files into one, but I need each file's content to start on a separate line.
What is the best way to merge files with each file appearing on a new line?

Just use awk:
awk 'FNR==1{print ""}1' *.txt
FNR==1 is true on the first line of each input file, so a blank line is printed before each file's contents (including the first); the trailing 1 prints every line.

If you have a paste that supports long options,
paste --delimiter='\n' --serial *.txt
does a really great job.
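If your paste only has the short options (e.g. on BSD/macOS), the equivalent short-option form should behave the same way:
paste -s -d '\n' *.txt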

You can iterate through each file with a for loop:
for filename in *.txt; do
# each time through the loop, ${filename} will hold the name
# of the next *.txt file. You can then arbitrarily process
# each file
cat "${filename}"
echo
# You can add redirection after the done (which ends the
# for loop). Any output within the for loop will be sent to
# the redirection specified here
done > output_file

for file in *.txt
do
    cat "$file"
    echo
done > newfile

I'm assuming you want a line break between files.
for file in *.txt
do
    cat "$file" >> result
    echo >> result
done

Related

Split and rename single file into multiple files using keywords present in file

I am new to awk-like commands. I have a single text file holding SQL DDLs in the format below:
DROP TABLE IF EXISTS $database.TABLE_A ;
...
...
DROP TABLE IF EXISTS $database.TABLE_B ;
...
...
I would like to split this single file into multiple files such as:
TABLE_A.SQL
TABLE_B.SQL
TABLE_X.SQL
I am able to get the table names from the single file with the awk command below, but I am still struggling to split the file and name each piece TABLE_X.SQL.
awk 'FNR==1 {split($5,a,"."); print a[2]}' *.SQL
I am using Windows 10 DOS shell.
Finally, I was able to achieve the desired output with the help of the shell script below, which can be run in a bash shell on Windows:
#!/bin/bash
#Split single file
awk '/DROP/{x="F"++i} {print > (x".TXT")}' "$1"
#Create output directory
mkdir -p ./_output
#Move each file, changing its extension
for f in *.TXT ; do
    newfilename=$(awk 'FNR==1 {split($5,a,"."); print a[2]}' "$f")
    echo "Processed $f ... new file is ${newfilename}.SQL ..."
    mv "$f" ./_output/"${newfilename}.SQL"
done
Could you please try the following:
awk '/DROP/{if(file){close(file)};match($0,/TABLE_[^ ]*/);file=substr($0,RSTART,RLENGTH)".SQL"} {print > (file)}' Input_file
awk -F "[. ]" '{print >($(NF-1)".SQL")}' file.sql

Save Bash Shell Script Output To a File with a String

I have an executable that takes a file and outputs a line.
I am running a loop over a directory:
for file in $DIRECTORY/*.png
do
./eval $file >> out.txt
done
The output of the executable does not contain the name of the file. I want to record the file name alongside each output.
EDIT1
Perhaps I did not explain it correctly.
I want the name of the file as well as the output of the program that processes that file. Right now I am doing the following:
for file in $DIRECTORY/*.png
do
echo -n $file >> out.txt
or
printf "%s" "$file" >> out.txt
./eval $file >> out.txt
done
In both cases, a newline is inserted between the file name and the output.
If I understood your question, what you want is:
get the name of the file,
...and the output of the program processing the file (in your case, eval),
...on the same line. And this last part is your problem.
Then I'd suggest composing a single line of text (using echo), comprising:
the name of the file, this is the $file part,
...followed by a separator; you may not need one, but it can help further processing of the result. I used ":". You can skip this part if it is not useful to you,
...followed by the output of the program processing the file: this is the $(...) construct
echo $file ":" $(./eval $file) >> out.txt
...and finally appending this line of text to a file, you got that part right.
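If you prefer printf over echo (it avoids surprises with backslashes or a file name that starts with a dash), a minimal sketch of the same idea:
printf '%s : %s\n' "$file" "$(./eval "$file")" >> out.txt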
Please use it like this:
echo -n "$(echo "${file}" | tr -d '\n')" >> out.txt
OR
newname=$(echo "${file}" | tr -d '\n')
echo -n "$newname" >> out.txt

Paste files from list of paths into single output file

I have a file containing a list of filenames and their paths, as in the example below:
$ cat ./filelist.txt
/trunk/data/9.20.txt
/trunk/data/9.30.txt
/trunk/data/50.3.txt
/trunk/data/55.100.txt
...
All of these files, named as X.Y.txt, contain a list of double values. For example:
$ cat ./9.20.txt
1.23
1.0e-6
...
I'm trying to paste all of these X.Y.txt files into a single file, but I'm not sure about how to do it. Here's what I've been able to do so far:
cat ./filelist.txt | xargs paste output.txt >> output.txt
Any ideas on how to do it properly?
You could simply cat-append each file into your output file, as in:
$ cat <list_of_paths> | xargs -I {} cat {} >> output.txt
In the above command, each line from your input file will be taken by xargs, and will be used to replace {}, so that each actual command being run is:
$ cat <X.Y.txt> >> output.txt
If all you're looking to do is to read each line from filelist.txt and append the contents of the file that the line refers to to a single output file, use this:
while read -r file; do
[[ -f "$file" ]] && cat "$file"
done < "filelist.txt" > "output.txt"
Edit: If you know your input file to only contain lines that are file paths (and optionally empty lines) - and no comments, etc. - @Rubens' xargs-based solution is the simplest.
The advantage of the while loop is that you can pre-process each line from the input file, as demonstrated by the -f test above, which ensures that the input line refers to an existing file.
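For example, a small sketch of such pre-processing that also skips blank lines and #-comment lines in the list:
while read -r file; do
    [[ -z "$file" || "$file" == "#"* ]] && continue   # skip blank lines and comments
    [[ -f "$file" ]] && cat "$file"
done < "filelist.txt" > "output.txt"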
More complex but without argument length limit
Well, the limit here is the available computer memory.
The file buffer.txt must not exist already.
touch buffer.txt
cat filelist.txt | xargs -I XX bash -c 'paste buffer.txt XX > output.txt; mv output.txt buffer.txt';
mv buffer.txt output.txt
What this does, by line:
Create a buffer.txt file which must be initially empty. (paste does not seem to like non-existent files. There does not seem to be a way to make it treat such files as empty.)
Run paste buffer.txt XX > output.txt; mv output.txt buffer.txt. XX is replaced by each file in the filelist.txt file. You can't just do paste buffer.txt XX > buffer.txt because buffer.txt will be truncated before paste processes it. Hence the mv rigmarole.
Move buffer.txt to output.txt so that you get your output with the file name you wanted. Also makes it safe to rerun the whole process.
The previous version forced xargs to issue exactly one paste per file you want to paste but for even better performance, you can do this:
touch buffer.txt;
cat filelist.txt | xargs bash -c 'paste buffer.txt "$@" > output.txt; mv output.txt buffer.txt' FILLER;
mv buffer.txt output.txt
Note the presence of "$@" in the command that bash executes, so paste gets its file arguments from the arguments given to bash. The FILLER parameter passed to bash gives it a value for $0. If it were not there, the first file that xargs gives to bash would be used as $0, and paste would then skip the first file of each batch.
This way, xargs can pass hundreds of parameters to paste with each invocation and thus reduce dramatically the number of times paste is invoked.
Simpler but limited way
This method suffers from limitations on the number of arguments that a shell can pass to a command it executes. However, in many cases it is good enough. I can't count the number of times I have performed spur-of-the-moment operations where using xargs would have been superfluous. (As part of a long-term solution, that's another matter.)
The simpler way is:
paste $(cat filelist.txt) > output.txt
It seems you were thinking that xargs would execute paste output.txt >> output.txt multiple times, but that's not how it works. The redirection applies to the entire cat ./filelist.txt | xargs paste output.txt pipeline (as you initially had it). If you want the redirection to apply to the individual commands launched by xargs, you have to make xargs launch a shell, as I do above.
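A minimal illustration of that difference (only to show the redirection semantics, not a replacement for the buffer.txt approach above):
# redirection opened once by the invoking shell, around the whole pipeline:
cat filelist.txt | xargs paste >> output.txt
# redirection opened inside each command that xargs launches:
cat filelist.txt | xargs -I{} bash -c 'paste "$1" >> output.txt' _ {}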
#!/usr/bin/env bash
set -x
while read -r
do
    cat "${REPLY}" >> output.txt
done < filelist.txt
Or, to get the files directly:
#!/usr/bin/env bash
set -x
find *.txt -type f | while read -r files
do
    cat "${files}" >> output.txt
done
A simple while loop should do the trick:
while read -r line; do
    cat "${line}" >> output.txt
done < filelist.txt

Prepend data from one file to another

How do I prepend the data from file1.txt to file2.txt?
The following command will take the two files and merge them into one
cat file1.txt file2.txt > file3.txt; mv file3.txt file2.txt
You can do this in a pipeline using sponge from moreutils:
cat file1.txt file2.txt | sponge file2.txt
Another way using GNU sed:
sed -i -e '1rfile1.txt' -e '1{h;d}' -e '2{x;G}' file2.txt
That is:
On line 1, append the content of the file file1.txt
On line 1, copy pattern space to hold space, and delete pattern space
On line 2, exchange the content of the hold and pattern spaces, and append the hold space to pattern space
The reason it's a bit tricky is that the r command appends content,
and line 0 is not addressable, so we have to do it on line 1,
moving the content of the original line out of the way and then bringing it back after the content of the file is appended.
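A quick illustration, assuming GNU sed and a file2.txt with at least two lines (the hold-space shuffle needs a second line to bring the first one back):
$ printf 'one\ntwo\n' > file1.txt
$ printf 'alpha\nbeta\n' > file2.txt
$ sed -i -e '1rfile1.txt' -e '1{h;d}' -e '2{x;G}' file2.txt
$ cat file2.txt
one
two
alpha
beta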
If it's available on your system, then sponge from moreutils is designed for this. Here is an example:
cat file1.txt file2.txt | sponge file2.txt
If you don't have sponge, then the following script does the same job using a temporary file. It makes sure that the temporary file is not accessible by other users, and cleans it up at the end.
If your system or the script crashes, you may need to clean up the temporary file manually. Tested on Bash 4.4.23 and Debian 10 (Buster) GNU/Linux.
#!/bin/bash
#
# ----------------------------------------------------------------------------------------------------------------------
# usage: [ from ] [ to ]
# ----------------------------------------------------------------------------------------------------------------------
# Purpose:
# Prepend the contents of file [from], to file [to], leaving the result in file [to].
# ----------------------------------------------------------------------------------------------------------------------
# check
[[ $# -ne 2 ]] && echo "[exit]: two filenames are required" >&2 && exit 1
# init
from="$1"
to="$2"
tmp_fn=$( mktemp -t TEMP_FILE_prepend.XXXXXXXX )
chmod 600 "$tmp_fn"
# prepend
cat "$from" "$to" > "$tmp_fn"
mv "$tmp_fn" "$to"
# cleanup
rm -f "$tmp_fn"
# [End]
There are two ways to write to a file: 1) append at the end of the file, or 2) rewrite the whole file.
If you want to put the content of file1.txt ahead of file2.txt, I'm afraid you need to rewrite the combined file.
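A minimal sketch of that rewrite, using mktemp so the temporary file gets a unique name:
tmp=$(mktemp) && cat file1.txt file2.txt > "$tmp" && mv "$tmp" file2.txt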

How can I add stdout to the top of a file (not the bottom)?

I am using bash on Linux to add content to the top of a file.
So far I know that I can get this done by using a temporary file,
so I am doing it this way:
tac lines.bar > lines.foo
echo "a" >> lines.foo
tac lines.foo > lines.bar
But is there a better way of doing this without having to write a second file?
echo a | cat - file1 > file2
Same as shellter's, and with sed in one line:
sed -i -e '1 i<whatever>' file1
This will insert into file1 in place.
(the sed example I referred to)
tac is a very 'expensive' solution, especially as you need to use it twice. While you still need to use a tmp file, this will take less time:
Edit: per notes from KeithThompson, now using a '.$$' filename and a conditional /bin/mv.
{
    echo "a"
    cat file1
} > file1.$$ && /bin/mv file1.$$ file1
I hope this helps
Using a named pipe and in-place replacement with sed, you can add the output of a command to the top of a file without explicitly needing a temporary file:
mkfifo output
your_command >> output &
sed -i -e '1x' -e '1routput' -e '1d' -e '2{H;x}' file
rm output
What this does is buffer the output of your_command in a named pipe (fifo), and insert this output in place using the r command of sed. For that, you need to start your_command in the background to avoid blocking on output to the fifo.
Note that the r command outputs the file at the end of the cycle, so we need to buffer the 1st line of file in the hold space and output it together with the 2nd line.
I say "without explicitly needing a temporary file" because sed -i may use one for itself.
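For instance, with date standing in for your_command (purely an illustration of the same steps):
mkfifo output
date >> output &
sed -i -e '1x' -e '1routput' -e '1d' -e '2{H;x}' file
rm output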
