How to combine multiple files in Linux with a delimiter separating them? - linux

I'm trying to combine multiple files into one file using the cat command.
However, I want to add a separator line such as "----" between the contents of each file.
Is there a way to achieve this with cat or another tool?
cat file1 file2 file3 file4 > newfile

You can use the following command to combine multiple files with a --- delimiter between them.
awk 'FNR==1 && NR!=1 {print "---"}{print}' file1 file2 > newfile
The command is copied from this post on Unix & Linux Stack Exchange:
https://unix.stackexchange.com/questions/163782/combine-two-text-files-with-adding-some-separator-between
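To see what the one-liner does, here is a minimal sketch with three throwaway files (the file names and contents are made up for illustration, and the separator is the "----" from the question rather than "---"). FNR restarts at 1 for every input file while NR keeps counting across files, so the condition is true on the first line of every file except the first.

```shell
#!/bin/sh
# Throwaway working directory; file names and contents are illustrative only.
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/file1"
printf 'gamma\n' > "$workdir/file2"
printf 'delta\n' > "$workdir/file3"

# Print the separator before the first line of every file except the first,
# then print the line itself.
awk 'FNR==1 && NR!=1 {print "----"} {print}' \
    "$workdir/file1" "$workdir/file2" "$workdir/file3" > "$workdir/newfile"

cat "$workdir/newfile"
```

With these inputs, newfile should contain alpha and beta, a "----" line, gamma, another "----" line, and delta. No separator is written before the first file or after the last one.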

Related

Append multiple directories contents to one file with AWK?

I've got a shell script which merges all my migration and seeder files to two bigger files, but I want to merge migrate.sql and seed.sql into just one big file called deploy.sql.
Is there a way with AWK to accept multiple directories into one final file?
Example:
#!/bin/bash
mkdir -p output
awk '{print}' ./migrations/*.sql > "output/migrate.sql"
awk '{print}' ./seeders/*.sql > "output/seed.sql"
Is there a way with AWK to accept multiple directories into one final
file?
GNU AWK accepts files, not directories, as input arguments. In your case
awk '{print}' ./migrations/*.sql > "output/migrate.sql"
awk '{print}' ./seeders/*.sql > "output/seed.sql"
each argument containing * is replaced by the shell with the names of all matching files before awk is even started. Consider the following example: say the current directory contains only these files
file1.txt
file2.txt
file3.txt
all of which are empty. Then
awk 'BEGIN{print ARGV[1],ARGV[2],ARGV[3]}' file*.txt
outputs
file1.txt file2.txt file3.txt
Observe that even in BEGIN, ARGV has an entry for each file rather than a single entry containing file*.txt.
You can pass more than one argument containing * to GNU AWK, for example:
awk '{print}' ./migrations/*.sql ./seeders/*.sql > "output/deploy.sql"
(tested in gawk 4.2.1)
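Since the globs are expanded before the command runs, the same idea works with plain cat as well. Below is a small sketch under made-up assumptions: the directory layout mirrors the question, but the .sql file names and their contents are hypothetical.

```shell
#!/bin/sh
# Illustrative layout only: the .sql file names and contents are made up.
workdir=$(mktemp -d)
mkdir -p "$workdir/migrations" "$workdir/seeders" "$workdir/output"
printf 'CREATE TABLE t (id INT);\n' > "$workdir/migrations/001_create.sql"
printf 'INSERT INTO t VALUES (1);\n' > "$workdir/seeders/001_seed.sql"

# The shell expands both globs into file names before the command runs,
# so cat (or awk '{print}') only ever sees a flat list of files.
cat "$workdir"/migrations/*.sql "$workdir"/seeders/*.sql > "$workdir/output/deploy.sql"

cat "$workdir/output/deploy.sql"
```

The migrations end up before the seeders because their glob comes first on the command line; within each glob the shell sorts the names, which is why numeric prefixes are a common way to control the order.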

Compare two files in unix and add the delta to one file

Both files have lines of string and numeric data, with a minimum of 2000 lines each.
How can I add the non-duplicate data from file2.txt to file1.txt?
Basically, file2 has the new data lines, but we also want to ensure we are not adding duplicate lines to file1.txt.
File1.txt > this is the main data file
File2.txt > this file has the new data we want to add to file1
Sort the two files together with the -u option to remove duplicates.
sort -u File1.txt File2.txt > NewFile.txt && mv NewFile.txt File1.txt
Another option if both files are already sorted, just to have some choice (and I like comm :) )
comm --check-order --output-delimiter='' -13 File1.txt File2.txt >> File1.txt
Use awk (this prints the merged lines, without duplicates and in their original order, to standard output):
awk '!a[$0]++' File1.txt File2.txt
You can use grep, like this:
# grep those lines from file2 which are not in file1
grep -vFf file1 file2 > new_file2
# append the results to file1
cat new_file2 >> file1
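One thing to watch with the grep approach: without -x, a line of file2 is dropped whenever any line of file1 occurs anywhere inside it, not only when the two lines are identical. The sketch below adds -x to require whole-line matches; the sample data and the new_lines.txt name are made up for illustration.

```shell
#!/bin/sh
# Sample data is illustrative; "banana 10" contains the File1 line "banana".
workdir=$(mktemp -d)
printf 'apple\nbanana\n' > "$workdir/File1.txt"
printf 'banana\nbanana 10\ncherry\n' > "$workdir/File2.txt"

# -x: the whole line must match, -F: patterns are fixed strings,
# -f: read the patterns from File1.txt, -v: keep lines matching none of them.
# grep exits with status 1 when it selects no lines; "|| true" keeps that
# from aborting scripts that run with set -e.
grep -vxFf "$workdir/File1.txt" "$workdir/File2.txt" > "$workdir/new_lines.txt" || true

# Append the genuinely new lines to the main file.
cat "$workdir/new_lines.txt" >> "$workdir/File1.txt"

cat "$workdir/File1.txt"
```

Here File1.txt ends up with apple, banana, banana 10 and cherry. Without -x, "banana 10" would have been treated as a duplicate of "banana" and silently skipped.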

shell script to compare two files and write the difference to third file

I want to compare two files and redirect the difference between the two files to third one.
file1:
/opt/a/a.sql
/opt/b/b.sql
/opt/c/c.sql
If a line in either file has a # before the path (for example before /opt/c/c.sql), the # should be skipped.
file2:
/opt/c/c.sql
/opt/a/a.sql
I want to get the difference between the two files. In this case, /opt/b/b.sql should be stored in a different file. Can anyone help me to achieve the above scenarios?
file1
$ cat file1 #both file1 and file2 may contain spaces which are ignored
/opt/a/a.sql
/opt/b/b.sql
/opt/c/c.sql
/opt/h/m.sql
file2
$ cat file2
/opt/c/c.sql
/opt/a/a.sql
Run:
awk 'NR==FNR{line[$1];next}
{if(!($1 in line)){if($0!=""){print}}}
' file2 file1 > file3
file3
$ cat file3
/opt/b/b.sql
/opt/h/m.sql
Notes:
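The order of the files passed to awk matters here: pass the file to check against (file2 here) first, followed by the master file (file1).
NR==FNR is true only while the first file is being read, so its first fields are stored as keys of the line array; lines of the second file are then printed only if their first field is not among those keys and the line is not empty. Check the awk documentation for details.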
You can use some tools like cat, sed, sort and uniq.
The main observation is this: if a line is in both files, then it is not unique in the output of cat file1 file2.
Furthermore, in cat file1 file2 | sort, all duplicates are adjacent. Using uniq -u we keep only the lines that occur once, which gives this pipe:
cat file1 file2 | sort | uniq -u
Using sed to remove leading whitespace, empty and comment lines, we get this final pipe:
cat file1 file2 | sed -r 's/^[ \t]+//; /^#/ d; /^$/ d;' | sort | uniq -u > file3
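Note that uniq -u also reports lines that exist only in file2. If only the lines of file1 that are missing from file2 are wanted, the two ideas above can be combined in a single awk call. This is a sketch under one assumption that the question leaves open: lines starting with # are treated as comments and ignored entirely, as in the sed pipe. The sample data is made up.

```shell
#!/bin/sh
# Sample data is illustrative: file1 has an indented line, a commented-out
# line and a blank line.
workdir=$(mktemp -d)
printf '/opt/a/a.sql\n  /opt/b/b.sql\n#/opt/c/c.sql\n\n/opt/h/m.sql\n' > "$workdir/file1"
printf '/opt/c/c.sql\n/opt/a/a.sql\n' > "$workdir/file2"

awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim leading and trailing whitespace
    /^#/ || /^$/ { next }             # assumption: ignore comments and blank lines
    NR == FNR { seen[$0]; next }      # first file (file2): remember each path
    !($0 in seen)                     # second file (file1): print paths not seen
' "$workdir/file2" "$workdir/file1" > "$workdir/file3"

cat "$workdir/file3"
```

With this input file3 contains /opt/b/b.sql and /opt/h/m.sql. The NR == FNR test assumes the first file is not empty; with an empty file2 it would also be true for file1 and nothing would be printed.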

Awk to print each file

I have about 50 files in a directory
Have
File1: 1|2|3
File2: 3|4|5
File3: A|B|C
WANT
File1: A|1|2|3
File2: A|3|4|5
File3: A|A|B|C
I'd appreciate it if anyone can solve this with an awk command. I'm open to other solutions in Linux. Also, I want to run it once and perform the edits on all files in a directory.
The solution I have (see below) requires me to run it on each file one at a time, and I don't think that's efficient.
awk '{print "A|"$0}' File1
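Try the sed command below (-i edits the files in place; this is the GNU sed syntax, BSD sed expects a backup suffix argument after -i):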
sed -i 's/^/A|/' file1 file2 file3
To make it work on all the files in the current directory:
sed -i 's/^/A|/' *
With GNU awk for -i inplace:
gawk -i inplace '{print "A|"$0}' file1 file2 file3
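If neither GNU sed nor gawk is available, a plain loop with a temporary file does the same job with any POSIX awk. A minimal sketch, using a scratch directory and the sample data from the question; the tmp variable is just a name chosen here.

```shell
#!/bin/sh
# Scratch directory with the three sample files from the question.
workdir=$(mktemp -d)
printf '1|2|3\n' > "$workdir/File1"
printf '3|4|5\n' > "$workdir/File2"
printf 'A|B|C\n' > "$workdir/File3"

for f in "$workdir"/*; do
    [ -f "$f" ] || continue      # skip directories and other non-regular files
    tmp=$(mktemp)                # scratch file for the rewritten copy
    # Write the prefixed lines to the scratch file, then copy them back.
    # Copying with cat keeps the permissions of the original file.
    awk '{print "A|" $0}' "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
done

cat "$workdir"/File1 "$workdir"/File2 "$workdir"/File3
```

This prints A|1|2|3, A|3|4|5 and A|A|B|C. To run it on a real directory, replace "$workdir" with that directory, keeping in mind that every regular file in it gets rewritten.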

Separating a joined file to original files in Linux

I know that to append or join multiple files in Linux, we can use the command: cat file1 >> file2.
But I couldn't find any command to separate file1 from file2 after joining them. In other words, I want both original file1 and file2 back again. I tried to use the split command but it just dismembers a file into multiple files with the same size.
Is there a way to do it?
There is no such command, since no information about what was file1 or file2 is retained. The new combined file is just a data stream.
In order to "split" them back up, you need rules about how to do so (such as, how many bytes file1 and file2 were).
When you perform the concatenation, the system doesn't keep track of how the resulting file was created. So it has no way of remembering where the original split was located in that file.
Can you explain what you are trying to do?
No problem, as long as you still have file1:
$ echo foobar >file1
$ echo blah >file2
$ cat file1 >> file2
$ truncate -s $(( $(stat -c '%s' file2) - $(stat -c '%s' file1) )) file2
$ cat file2
blah
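Also, instead of stat -c '%s' filename you can use wc -c < filename, which is more portable. Reading from standard input makes wc print only the number, so no cut step is needed (cutting on spaces breaks on systems where wc pads its output with leading spaces).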
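As mentioned above, splitting is possible once you have a rule such as the byte size of the first part. Here is a sketch that records that size before joining and then uses it to recover both parts. The joined and .recovered names are made up for illustration, and head -c is assumed to be available (it is in GNU and BSD userlands, although POSIX does not require it).

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'foobar\n' > "$workdir/file1"
printf 'blah\n' > "$workdir/file2"

# Record the size of the first part before joining. The arithmetic expansion
# strips any padding that wc may add around the number.
size1=$(( $(wc -c < "$workdir/file1") ))
cat "$workdir/file1" "$workdir/file2" > "$workdir/joined"

# The first size1 bytes are file1; everything from byte size1+1 on is file2.
head -c "$size1" "$workdir/joined" > "$workdir/file1.recovered"
tail -c +"$((size1 + 1))" "$workdir/joined" > "$workdir/file2.recovered"

# cmp is silent and returns 0 when two files are byte-for-byte identical.
cmp "$workdir/file1" "$workdir/file1.recovered" &&
    cmp "$workdir/file2" "$workdir/file2.recovered" &&
    echo "both files recovered"
```

The size has to be stored somewhere outside the joined file (or the original file1 kept around); the joined file itself carries no trace of where the boundary was.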
