How to copy the contents of a file to the end of the same file? - linux

I need to copy the contents of a file to the end of the same file.
I wrote the following code.
#!/bin/bash
cat file.txt >> file1.txt
cat file1.txt >> file.txt
rm file1.txt
But it creates an additional file. How can this be done without creating an additional file?

You could do something like:
input=file.txt
dd bs="$(wc -c < "$input")" count=1 if="$input" >> "$input"
but... why? There's nothing wrong with using an auxiliary file, and if you remove it shortly after it is created there is almost zero chance it will ever cause any actual disk operations. Just create the temp file and remove it. Note that this has an inherent race condition: the computed file size (and wc -c is a terrible way to compute it!) may be wrong if another process is mutating the file at the same time, but that issue is inherent in your problem description.
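If you do go the auxiliary-file route, here is a minimal sketch using mktemp, which generates a unique temporary name for you:
tmp=$(mktemp) || exit 1    # create a unique temporary file
cp file.txt "$tmp"         # snapshot the current contents
cat "$tmp" >> file.txt     # append the snapshot to the original
rm -f "$tmp"               # remove the auxiliary file right away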

You can read the file's contents into a variable and then append that variable to the end of the file :)
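A minimal sketch of that idea (note that command substitution strips trailing newlines and cannot hold NUL bytes, so this is only safe for ordinary text files):
contents=$(< file.txt)                  # read the whole file into a variable
printf '%s\n' "$contents" >> file.txt   # append it to the same file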

Or use sponge (from the moreutils package) - that's exactly what it's there for.
cat file.txt file.txt | sponge file.txt

Related

Copy a txt file twice to a different file using bash

I am trying to cat file.txt, loop through its whole content twice, and copy the result to a new file, file_new.txt. The bash command I am using is as follows:
for i in {1..3}; do cat file.txt > file_new.txt; done
The above command is just giving me the same file contents as file.txt. Hence file_new.txt is also of the same size (1 GB).
Basically, if file.txt is a 1GB file, then I want file_new.txt to be a 2GB file, double the contents of file.txt. Please, can someone help here? Thank you.
Simply apply the redirection to the for loop as a whole:
for i in {1..3}; do cat file.txt; done > file_new.txt
The advantage of this over using >> (aside from not having to open and close the file multiple times) is that you needn't ensure that a preexisting output file is truncated first.
Note that the generalization of this approach is to use a group command ({ ...; ...; }) to apply redirections to multiple commands; e.g.:
$ { echo hi; echo there; } > out.txt; cat out.txt
hi
there
Given that whole files are being output, the cost of invoking cat for each repetition will probably not matter that much, but here's a robust way to invoke cat only once:[1]
# Create an array of repetitions of filename 'file' as needed.
files=(); for ((i=0; i<3; ++i)); do files[i]='file'; done
# Pass all repetitions *at once* as arguments to `cat`.
cat "${files[#]}" > file_new.txt
[1] Note that, hypothetically, you could run into your platform's command-line length limit, as reported by getconf ARG_MAX - given that on Linux that limit is 2,097,152 bytes (2MB) that's not likely, though.
You could use the append operator, >>, instead of >. Then adjust your loop count as needed to get the output size desired.
You should adjust your code so it is as follows:
for i in {1..3}; do cat file.txt >> file_new.txt; done
The >> operator appends data to a file rather than writing over it (>).
if file.txt is a 1GB file,
cat file.txt > file_new.txt
cat file.txt >> file_new.txt
The > operator creates/truncates file_new.txt (1GB);
the >> operator then appends to it, making it 2GB.
for i in {1..3}; do cat file.txt >> file_new.txt; done
This command makes file_new.txt 3GB, because for i in {1..3} runs three times.
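A quick way to sanity-check this behaviour with small files (a sketch, using throwaway names so the real file.txt is untouched):
printf 'x\n' > small.txt        # a tiny stand-in for the 1GB file
cat small.txt >  small_new.txt  # first copy (truncate + write)
cat small.txt >> small_new.txt  # second copy (append)
wc -c small.txt small_new.txt   # small_new.txt should be twice the size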
As others have mentioned, you can use >> to append. But, you could also just invoke cat once and have it read the file 3 times. For instance:
n=3; cat $( yes file.txt | sed ${n}q ) > file_new.txt
Note that this solution exhibits a common anti-pattern: it fails to quote the command substitution, which will cause issues if the filename contains whitespace. See mklement's solution above for a more robust approach.

How do I update a file using commands run against the same file?

As an easy example, consider the following command:
$ sort file.txt
This will output the file's data in sorted order. How do I put that data right back into the same file? I want to update the file with the sorted results.
This is not the solution:
$ sort file.txt > file.txt
... as it will cause the file to come out blank. Is there a way to update this file without creating a temporary file?
Sure, I could do something like this:
sort file.txt > temp.txt; mv temp.txt file.txt
But I would rather keep the results in memory until processing is done, and then write them back to the same file. sort actually has a flag that makes this possible:
sort file.txt -o file.txt
...but I'm looking for a solution that doesn't rely on the binary having a special flag for this, as not all commands are guaranteed to have one. Is there some kind of Linux command that will hold the data until the processing is finished?
For sort, you can use the -o option.
For a more general solution, you can use sponge, from the moreutils package:
sort file.txt | sponge file.txt
Note that error handling here is tricky: you may end up with an empty file if something goes wrong in the steps before sponge.
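One way to guard against that (a sketch; it holds the output in a shell variable, so it is only suitable for text of reasonable size):
out=$(sort file.txt) && printf '%s\n' "$out" > file.txt   # only overwrite if sort succeeded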
This is a duplicate of this question, which discusses the solutions above: How do I execute any command editing its file (argument) "in place" using bash?
You can do it with sed (with its r command), and Process Substitution:
sed -ni r<(sort file) file
In this way, you're telling sed not to print the (original) lines (-n option) and to append the file generated by <(sort file).
The well-known -i option is what does the trick.
Example
$ cat file
b
d
c
a
e
$ sed -ni r<(sort file) file
$ cat file
a
b
c
d
e
Try the vim way:
$ ex -s +'%!sort' -cxa file.txt
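For reference, a breakdown of that one-liner:
ex -s +'%!sort' -cxa file.txt
# -s         silent (batch) mode, no full-screen UI
# +'%!sort'  filter the whole buffer (%) through the external sort command
# -cxa       then run the ex command 'xa': write all changes and exit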

Best way to overwrite file with itself

What is the fastest / most elegant way to read out a file and then write the content to that same file?
On Linux, this is not always the same as 'touching' a file, e.g. if the file represents some hardware device.
One possibility that worked for me is echo $(cat $file) > $file but I wondered if this is best practice.
You cannot generically do this in one step because writing to the file can interfere with reading from it. If you try this on a regular file, it's likely the > redirection will blank the file out before anything is even read.
Your safest bet is to split it into two steps. You can hide it behind a function call if you want.
rewrite() {
  local file=$1
  local contents=$(< "$file")     # read the entire file into memory first...
  cat <<< "$contents" > "$file"   # ...then truncate it and write the contents back
}
...
rewrite /sys/misc/whatever
I understand that you are not interested in a simple touch $file but rather in some form of transformation of the file.
Since transforming the file can temporarily corrupt it, it is NOT good to do it in place while others may be reading it at the same moment.
Because of that, a temporary file is your best choice; only once the conversion is complete should you replace the file, in one step:
cat "$file" | process >"$file.$$"
mv "$file.$$" "$file"
This is the safest set of operations.
Of course I have omitted operations like checking whether the file really exists, and so on, leaving that to you.
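A slightly more robust sketch of the same pattern, using mktemp so the temporary name cannot collide (process again stands for whatever transformation you run):
tmp=$(mktemp "$file.XXXXXX") || exit 1           # unique temp file next to the original
process < "$file" > "$tmp" && mv "$tmp" "$file"  # replace only if process succeeded
Creating the temp file in the same directory keeps the final mv an atomic rename on the same filesystem.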
The sponge utility in the moreutils package allows this
~$ cat file
foo
~$ cat file | sponge file
~$ cat file
foo

How can I add stdout to the top of a file (not the bottom)?

I am using bash on Linux and want to add content to the top of a file.
Thus far I know that I am able to get this done by using a temporary file, so
I am doing it this way:
tac lines.bar > lines.foo
echo "a" >> lines.foo
tac lines.foo > lines.bar
But is there a better way of doing this without having to write a second file?
echo a | cat - file1 > file2
same as shellter's
And sed in one line:
sed -i -e '1 i<whatever>' file1
This will insert into file1 in place.
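For example, to prepend the literal line a (GNU sed syntax; BSD/macOS sed handles -i and i differently):
sed -i '1i a' file1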
tac is a very 'expensive' solution, especially as you need to use it twice. While you still need to use a tmp file, this will take less time:
Edit per notes from KeithThompson: now using a '.$$' filename and a conditional /bin/mv.
{
  echo "a"    # the new first line
  cat file1   # followed by the original contents
} > file1.$$ && /bin/mv file1.$$ file1
I hope this helps
Using a named pipe and in-place replacement with sed, you can add the output of a command at the top of a file without explicitly needing a temporary file:
mkfifo output              # named pipe that buffers the new content
your_command >> output &   # run in the background so writing to the fifo doesn't block
sed -i -e '1x' -e '1routput' -e '1d' -e '2{H;x}' file
rm output
What this does is buffer the output of your_command in a named pipe (fifo) and insert that output in place using the r command of sed. For that, you need to start your_command in the background to avoid blocking when writing to the fifo.
Note that the r command outputs the file at the end of the cycle, so we need to buffer the 1st line of file in the hold space and output it together with the 2nd line.
I say 'without explicitly needing a temporary file' because sed -i may use one internally.

Why doesn't "sort file1 > file1" work?

When I am trying to sort a file and save the sorted output in itself, like this
sort file1 > file1;
the contents of file1 are erased altogether, whereas when I try to do the same with the tee command, like this
sort file1 | tee file1;
it works fine [ed: "works fine" only for small files with lucky timing; it will cause lost data on large ones or with unhelpful process scheduling], i.e. it overwrites file1 with its own sorted output and also shows it on standard output.
Can someone explain why the first case does not work?
As other people explained, the problem is that the I/O redirection is done before the sort command is executed, so the file is truncated before sort gets a chance to read it. If you think for a bit, the reason why is obvious - the shell handles the I/O redirection, and must do that before running the command.
The sort command has 'always' (since at least Version 7 UNIX) supported a -o option to make it safe to output to one of the input files:
sort -o file1 file1 file2 file3
The trick with tee depends on timing and luck (and probably a small data file). If you had a megabyte or larger file, I expect it would be clobbered, at least in part, by the tee command. That is, if the file is large enough, the tee command would open the file for output and truncate it before sort finished reading it.
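A sketch that demonstrates the race (do not run this on data you care about; whether data is lost depends on timing and file size):
seq 1000000 > big.txt                   # roughly 6.9MB of test data
sort big.txt | tee big.txt > /dev/null
wc -l big.txt                           # frequently far fewer than 1000000 lines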
It doesn't work because '>' redirection implies truncation, and bash sets up the redirection (truncating the file) before running sort rather than buffering the whole output of sort in memory. Thus, the contents of file1 are truncated before sort has a chance to read them.
It's unwise to depend on either of these commands to work the way you expect.
The way to modify a file in place is to write the modified version to a new file, then rename the new file to the original name:
sort file1 > file1.tmp && mv file1.tmp file1
This avoids the problem of reading the file after it's been partially modified, which is likely to mess up the results. It also makes it possible to deal gracefully with errors; if the file is N bytes long, and you only have N/2 bytes of space available on the file system, you can detect the failure creating the temporary file and not do the rename.
Or you can rename the original file, then read it and write to a new file with the same name:
mv file1 file1.bak && sort file1.bak > file1
Some commands have options to modify files in place (for example, perl and sed both have -i options; note that the syntax of sed's -i option can vary). But these options work by creating temporary files; it's just done internally.
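A few illustrative one-liners (a sketch; note the GNU and BSD forms of sed -i differ):
sed -i 's/foo/bar/' file1            # GNU sed: in place, no backup
sed -i '' 's/foo/bar/' file1         # BSD/macOS sed: an (empty) backup suffix is required
perl -i -pe 's/foo/bar/' file1       # perl: in place, no backup
perl -i.bak -pe 's/foo/bar/' file1   # perl: keep a .bak backup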
The redirection is performed first. So in the first case, > file1 takes effect and empties the file before sort runs.
The first command doesn't work (sort file1 > file1) because, when using the redirection operator (> or >>), the shell creates/truncates the file before the sort command is even invoked, since redirections are set up first.
The second command appears to work (sort file1 | tee file1) because sort happens to read lines from the file before tee truncates it; as noted above, this is a timing accident and not something to rely on.
So when using any other similar command, you should avoid using the redirection operator when reading from and writing to the same file; use a relevant in-place editor instead (e.g. ex, ed, sed), for example:
ex '+%!sort' -cwq file1
or use other utils such as sponge.
Luckily for sort there is the -o parameter, which writes results to the file (as suggested by @Jonathan), so the solution is straightforward: sort -o file1 file1.
Bash opens (and truncates to) a new empty file when it sets up the redirection, and only then runs sort.
In the second case, tee opens the file after sort has (with luck) already read the contents.
You can use this method
sort file1 -o file1
This will sort and store the result back in the original file. Also, you can use this command to remove duplicate lines:
sort -u file1 -o file1
