Concatenating Files And Inserting A New Line In Between Files - linux

I have multiple files which I want to concat with cat.
Let's say
File1.txt:
foo
File2.txt:
bar
File3.txt:
qux
I want to concat them so that the final file looks like:
foo

bar

qux
instead of this, which is what the usual cat File*.txt > finalfile.txt gives:
foo
bar
qux
What's the right way to do it?

You can do:
for f in *.txt; do (cat "${f}"; echo) >> finalfile.txt; done
Make sure the file finalfile.txt does not exist before you run the above command.
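The reason: finalfile.txt itself matches the *.txt glob, so if it already exists its old contents get concatenated in as well. A minimal guard (just a sketch) is to delete it first; the glob is expanded once when the loop starts, so the file created during the loop is not picked up:
rm -f finalfile.txt
for f in *.txt; do (cat "${f}"; echo) >> finalfile.txt; done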
If you are allowed to use awk you can do:
awk 'FNR==1 && NR>1 {print ""} 1' *.txt > finalfile.txt
FNR==1 is true on the first line of each input file, and NR>1 excludes the very first file, so the blank line is printed only between files.

If you have few enough files that you can list each one, then you can use process substitution in Bash, inserting a newline between each pair of files:
cat File1.txt <(echo) File2.txt <(echo) File3.txt > finalfile.txt

If it were me doing it I'd use sed:
sed -e '$s/$/\n/' -s *.txt > finalfile.txt
In this sed expression $ has two meanings: first it matches the last line only (as an address selecting the lines the command applies to), and second it matches the end of the line in the substitution pattern.
If your version of sed doesn't have -s (process input files separately) you can do it all as a loop though:
for f in *.txt ; do sed -e '$s/$/\n/' "$f" ; done > finalfile.txt

This works in Bash:
for f in *.txt; do cat "$f"; echo; done
In contrast to answers with >> (append), the output of this command can be piped into other programs.
Examples:
for f in File*.txt; do cat "$f"; echo; done > finalfile.txt
(for ... done) > finalfile.txt (parens are optional)
for ... done | less (piping into less)
for ... done | head -n -1 (this strips off the trailing blank line)

You may do it using xargs if you like, but the main idea is still the same:
find *.txt | xargs -I{} sh -c "cat {}; echo ''" > finalfile.txt
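Note that substituting {} into the sh -c string breaks on filenames containing spaces or quotes. A more defensive variant (a sketch, assuming GNU find and xargs) passes each name as a positional argument instead, and excludes the output file from the search:
find . -maxdepth 1 -name '*.txt' ! -name 'finalfile.txt' -print0 | xargs -0 -I{} sh -c 'cat "$1"; echo' _ {} > finalfile.txt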

That's how I just did it on OS X 10.10.3:
for f in *.txt; do (cat "$f"; echo '') >> fullData.txt; done
since a plain echo with no arguments ended up inserting no new lines there.

In Python 2, this concatenates the files with blank lines between them (the trailing , suppresses the extra trailing blank line):
print '\n'.join(open(f).read() for f in filenames),
Here is the ugly Python one-liner that can be called from the shell and writes the output to a file:
python -c "from sys import argv; print '\n'.join(open(f).read() for f in argv[1:])," File*.txt > finalfile.txt

You could use grep, with -h to suppress the filenames:
grep -h "" File*.txt
This will give:
foo
bar
qux

Related

How to rename a file that starts with a sequence using bash command line

I have multiple files in a folder that I want to rename. The file names are currently in the below format:
axuajsnd_file1.txt
asdeacasasacas_file2.txt
What I am trying to do is rename all these files to the name after the underscore, so axuajsnd_file1.txt would be file1.txt.
Can I do this using a single-line command, or would I need a script to rename all my files?
With a for loop, parameter expansion and mv:
for f in *_*.txt; do echo mv -v "$f" "${f#*_}"; done
Remove the echo if you're satisfied with the output, so mv can move/rename the files.
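Here ${f#*_} removes the shortest prefix matching *_, i.e. everything up to and including the first underscore:
f=axuajsnd_file1.txt
echo "${f#*_}"   # prints: file1.txt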
Let's say you have:
$ ls
axuajsnd_file1.txt
asdeacasasacas_file2.txt
sdmsdmksdmsddsms_file3.txt
skdksdksdkmdskm_file4.txt
Check that the result is correct:
$ for i in *.txt; do echo "$i -> $(echo "$i" | awk -F '_' '{print $2}')"; done
asdeacasasacas_file2.txt -> file2.txt
axuajsnd_file1.txt -> file1.txt
sdmsdmksdmsddsms_file3.txt -> file3.txt
skdksdksdkmdskm_file4.txt -> file4.txt
Now that you checked, you may rename your files:
$ for i in *.txt; do mv "$i" "$(echo "$i" | awk -F '_' '{print $2}')"; done
Output:
$ ls
file1.txt file2.txt file3.txt file4.txt

Filename manipulation

Kindly help me with a unix script to modify the filenames into the required format as shown below:
AN_555a_orange_20190513.txt
AN_555b_apple_20190513.txt
Required format: the fruit name's first character should be capitalized, and its position should be changed to second:
AN_Orange_555a_20190513.txt
AN_Apple_555b_20190513.txt
And it should apply to all files present in the directory.
Below is the command I'm trying, which is not working:
for in in aaal*
do
out=${in#*_}
out=${out%_*_*_*}
out=${out%[0-9]}
out1=${out#*_}
out2=${out%_*}
AAAI_$out1$out2.txt
done
This script is simple, but worked with your sample:
#!/bin/bash
for i in AN*; do
NAME=$(echo "$i" | awk -F_ '{printf "%s_%s%s_%s_%s", $1, toupper(substr($3,1,1)), substr($3,2), $2, $4}')
echo "--> $NAME"
done
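The script above only prints the new names. To actually rename the files, a sketch building on the same awk call could be:
#!/bin/bash
for i in AN*; do
  NAME=$(echo "$i" | awk -F_ '{printf "%s_%s%s_%s_%s", $1, toupper(substr($3,1,1)), substr($3,2), $2, $4}')
  mv -v "$i" "$NAME"   # rename, verbosely
done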
An interesting solution for this case is to use sed, just like this:
$ ls -1 | sed 's/\(AN_\)\([^_]*_\)\([a-z]*_\)\([0-9]*.txt\)/mv "&" "\1\u\3\2\4"/e'
Note the final e at the end of the sed command. It tells GNU sed to execute the result of the substitution as a shell command.
So if you remove the e (which you could do at first, to check the substitution works as expected), you would get in the console:
$ ls -1 | sed 's/\(AN_\)\([^_]*_\)\([a-z]*_\)\([0-9]*.txt\)/mv "&" "\1\u\3\2\4"/'
mv "AN_555a_orange_20190513.txt" "AN_Orange_555a_20190513.txt"
mv "AN_555b_apple_20190513.txt" "AN_Apple_555b_20190513.txt"
(The sed substitution matches the several groups of characters, reorders them and creates the mv ... ... line. Note that & in the replacement pattern denotes the whole pattern matched, and \u tells sed to put the next character as upper case.)
Then add back that final e, and instead of printing these lines sed will execute them, effectively renaming the files.
This one-liner could give you more ideas:
awk -F_ '{printf "mv %s %s_%s%s_%s_%s\n", $0, $1,toupper(substr($3,1,1)), substr($3, 2),$2,$4}' <(ls *.txt)
This will print something like:
mv AN_555a_orange_20190513.txt AN_Orange_555a_20190513.txt
mv AN_555b_apple_20190513.txt AN_Apple_555b_20190513.txt
Then, if you are happy with the results, pipe it to sh, for example:
awk -F_ '{printf "mv %s %s_%s%s_%s_%s\n", $0, $1,toupper(substr($3,1,1)), substr($3, 2),$2,$4}' <(ls *.txt) | sh

How to get the first 2 characters of the last line of a file in linux

I have files containing below format, generated by another system
12;453453;TBS;OPPS;
12;453454;TGS;OPPS;
12;453455;TGS;OPPS;
12;453456;TGS;OPPS;
20;787899;THS;CLST;
33;786789;
I have to check whether the last line contains 33; if so, I have to continue and copy the file/files to the other location, else discard the file.
Currently I am doing it as below:
tail -1 abc.txt >> c.txt
awk '{print substr($0,0,2)}' c.txt
Then the output is saved to another variable and used for the copying.
Can anyone suggest any other simple way?
Thank you!
Imagine you have the following input file:
$ cat file
a
b
c
d
e
agc
Then you can run any of the following commands (awk, sed, grep, cut, or a Bash substring) to get the first 2 chars of the last line:
AWK
$ awk 'END{print substr($0,1,2)}' file
ag
SED
$ sed -n '$s/^\(..\).*/\1/p' file
ag
GREP
$ tail -1 file | grep -oE '^..'
ag
CUT
$ tail -1 file | cut -c '1-2'
ag
BASH SUBSTRING
line=$(tail -1 file); echo ${line:0:2}
All of those commands do what you are looking for. The awk command operates directly on the last line of the file, so you no longer need tail. The sed command extracts the last line of the file into its pattern buffer, replaces everything after the first 2 chars with nothing, and then prints the pattern buffer (the first 2 chars of the last line). Another option is to tail the last line of the file and extract the first 2 chars with grep; by piping those 2 commands you can also do it in one step without intermediate variables or files.
Now if you want to put everything in one script this become:
$ more file check_2chars.sh
::::::::::::::
file
::::::::::::::
a
b
c
d
e
33abc
::::::::::::::
check_2chars.sh
::::::::::::::
#!/bin/bash
s1=$(tail -1 file | cut -c 1-2) #you can use other commands from this post
s2=33
if [ "$s1" == "$s2" ]
then
echo "match" #implement the copy/discard logic
fi
Execution:
$ ./check_2chars.sh
match
I will let you implement the copy/discard logic
Given the task of either copying or deleting files based on their contents, shell variables aren't necessary.
Using sed's F (print file name) command and xargs, the whole task can be done in just one line:
find | xargs -l sed -n '${/^33/!F}' | xargs -r rm ; cp * dest/dir/
Or preferably, with GNU sed:
sed -sn '${/^33/!F}' * | xargs -r rm ; cp * dest/dir/
Or if all the filenames contain no whitespace:
rm -r $(sed -sn '${/^33/!F}' *) ; cp * dest/dir/
That assumes all the files in the current directory are to be tested.
sed looks at the last line ($) of every file, and runs what's in the curly braces.
If any of those last lines do not begin with 33 (/^33/!), sed outputs just those unwanted file names (F).
Supposing the unwanted files are named foo and baz -- those are piped to xargs which runs rm foo baz.
At this point the only files left should be copied over to dest/dir/: cp * dest/dir/.
It's efficient, cp and rm need only be run once.
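As a quick sanity check of the F trick, here is a sketch with two throwaway files; only the file whose last line does not start with 33 gets printed:
printf '20;x;\n33;786789;\n' > keep.txt
printf '20;x;\n12;y;\n' > drop.txt
sed -sn '${/^33/!F}' keep.txt drop.txt   # prints: drop.txt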
If a shell variable must be used, here are two more methods:
Using tail and bash, store first two chars of the last line to $n:
n="$(tail -1 abc.txt)" n="${n:0:2}"
Here's a more portable POSIX shell version:
n="$(tail -1 abc.txt)" n="${n%${n#??}}"
You may explicitly test with sed for the last line ($) starting with 33 (/^33.*/):
echo " 12;453453;TBS;OPPS;
12;453454;TGS;OPPS;
12;453455;TGS;OPPS;
12;453456;TGS;OPPS;
20;787899;THS;CLST;
33;786789;" | sed -n "$ {/^33.*/p}"
33;786789;
If you store the result in a variable, you may test it for being empty or not:
lastline33=$(echo " 12;453453;TBS;OPPS;
12;453454;TGS;OPPS;
12;453455;TGS;OPPS;
12;453456;TGS;OPPS;
20;787899;THS;CLST;
33;786789;" | sed -n "$ {/^33.*/p}")
echo $(test -n "$lastline33" && echo not null || echo null)
not null
You probably want the regular expression to include the semicolon, because otherwise it would also match 330, 331, ... 339, 33401345 and so on; maybe that can be excluded by context, but to me it seems a good idea:
lastline33=$(sed -n "$ {/^33;.*/p}" abc.txt)
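Putting it together, the copy/discard decision could then look like this (a sketch; the destination directory is a placeholder):
lastline33=$(sed -n '${/^33;/p}' abc.txt)
if [ -n "$lastline33" ]; then
    cp abc.txt /some/dest/   # last line starts with 33; copy it
else
    rm abc.txt               # otherwise discard
fi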

Cat several files into one file with the file name before the data

I have several log files with data in them. What I want to do is cat all these files into one file. But before the data goes in I want the file name to be there, without the extension. For example:
Files I have:
file1.log file2.log file3.log
The file that i want: all.log
all.log to have in it:
file1
file1's data
file2
file2's data
file3
file3's data
Using awk
awk 'FNR==1{sub(/[.][^.]*$/, "", FILENAME); print FILENAME} 1' file*.log >all.log
FNR is the file record number. It is one at the beginning of each file. Thus, the test FNR==1 tells us if we are at the beginning of a file. If we are, then we remove the extension from the filename using sub(/[.][^.]*$/, "", FILENAME) and then we print it.
The final 1 in the program is awk's cryptic way of saying print-this-line.
The redirection >all.log saves all the output in file all.log.
Using shell
for f in file*.log; do echo "${f%.*}"; cat "$f"; done >all.log
Or:
for f in file*.log
do
echo "${f%.*}"
cat "$f"
done >all.log
In shell, for f in file*.log; do starts a loop over all files matching the glob file*.log. The statement echo "${f%.*}" prints the file name minus the extension. ${f%.*} is an example of suffix removal. cat "$f" prints the contents of the file. done >all.log terminates the loop and saves all the output in all.log.
This loop will work correctly even if file names contain spaces, tabs, newlines, or other difficult characters.
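For instance, ${f%.*} strips the shortest suffix matching .*:
f=file1.log
echo "${f%.*}"   # prints: file1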
Suppose you have two files:
foo:
a
b
c
bar:
d
e
f
Using Perl:
perl -lpe 'print $ARGV if $. == 1; close(ARGV) if eof' foo bar > all.log
foo
a
b
c
bar
d
e
f
$. is the line number
$ARGV is the name of the current file
close(ARGV) if eof resets the line number at the end of each file
Using grep:
grep '' foo bar > all.log
foo:a
foo:b
foo:c
bar:d
bar:e
bar:f
for i in $(ls file*); do echo "$i" | awk -F'.' '{print $1}' >> all.log; cat "$i" >> all.log; done

Insert the contents of a file into another (at a specific line of the target file) - BASH/LINUX

I tried doing it with cat, and after I typed the second file I added | head -$line | tail -1, but it doesn't work because it performs the cat first.
Any ideas? I need to do it with cat or something else.
I'd probably use sed for this job:
line=3
sed -e "${line}r file2" file1
If you're looking to overwrite file1 and you have GNU sed, add the -i option. Otherwise, write to a temporary file and then copy/move the temporary file over the original, cleaning up as necessary (that's the trap stuff below). Note: copying the temporary over the file preserves links; moving does not (but is swifter, especially if the file is big).
line=3
tmp="./sed.$$"
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
sed -e "${line}r file2" file1 > $tmp
cp $tmp file1
rm -f $tmp
trap 0
Just for fun, and just because we all love ed, the standard editor, here's an ed version. It's very efficient (ed is a genuine text editor)!
ed -s file2 <<< $'3r file1\nw'
If the line number is stored in the variable line then:
ed -s file2 <<< "${line}r file1"$'\nw'
Just to please Zack, here's one version with less bashism, in case you don't like bash (personally, I don't like pipes and subshells, I prefer herestrings, but hey, as I said, that's only to please Zack):
printf "%s\n" "${line}r file1" w | ed -s file2
or (to please Sorpigal):
printf "%dr %s\nw" "$line" file1 | ed -s file2
As Jonathan Leffler mentions in a comment, and if you intend to use this method in a script, use a heredoc (it's usually the most efficient):
ed -s file2 <<EOF
${line}r file1
w
EOF
Hope this helps!
P.S. Don't hesitate to leave a comment if you feel you need to express yourself about the ways to drive ed, the standard editor.
cat file1 >>file2
will append content of file1 to file2.
cat file1 file2
will concatenate file1 and file2 and send output to terminal.
cat file1 file2 >file3
will create or overwrite file3 with the concatenation of file1 and file2.
cat file1 file2 >>file3
will append concatenation of file1 and file2 to end of file3.
Edit:
For truncating file2 before adding file1:
sed -e '11,$d' -i file2 && cat file1 >>file2
or for making a 500-line file:
n=$((500-$(wc -l <file1)))
sed -e "1,${n}d" -i file2 && cat file1 >>file2
Lots of ways to do it, but I like to choose a way that involves making tools.
First, setup test environment
rm -rf /tmp/test
mkdir /tmp/test
printf '%s\n' {0..9} > /tmp/test/f1
printf '%s\n' {one,two,three,four,five,six,seven,eight,nine,ten} > /tmp/test/f2
Now let's make the tool, and in this first pass we'll implement it badly.
# insert contents of file $1 into file $2 at line $3
insert_at () { insert="$1" ; into="$2" ; at="$3" ; { head -n $at "$into" ; ((at++)) ; cat "$insert" ; tail -n +$at "$into" ; } ; }
Then run the tool to see the amazing results.
$ insert_at /tmp/test/f1 /tmp/test/f2 5
But wait, the result is on stdout! What about overwriting the original? No problem, we can make another tool for that.
insert_at_replace () { tmp=$(mktemp) ; insert_at "$@" > "$tmp" ; mv "$tmp" "$2" ; }
And run it
$ insert_at_replace /tmp/test/f1 /tmp/test/f2 5
$ cat /tmp/test/f2
"Your implementation sucks!"
I know, but that's the beauty of making simple tools. Let's replace insert_at with the sed version.
insert_at () { insert="$1" ; into="$2" ; at="$3" ; sed -e "${at}r ${insert}" "$into" ; }
And insert_at_replace keeps working (of course). The implementation of insert_at_replace can also be changed to be less buggy, but I'll leave that as an exercise for the reader.
I like doing this with head and tail if you don't mind managing a new file:
head -n 16 file1 > file3 &&
cat file2 >> file3 &&
tail -n+56 file1 >> file3
You can collapse this onto one line if you like. Then, if you really need it to overwrite file1, do: mv file3 file1 (optionally include && between commands).
Notes:
head -n 16 file1 means first 16 lines of file1
tail -n+56 file1 means file1 starting from line 56 to the end
Hence, I actually skipped lines 17 through 55 from file1.
Of course, you could change 56 to 17 so that no lines are skipped.
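Parameterized, a sketch that inserts file2 after line N of file1 without skipping any lines:
N=16
{ head -n "$N" file1; cat file2; tail -n "+$((N+1))" file1; } > file3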
I prefer to mix simple head and tail commands rather than try a magic sed command.
