Redirecting the cat output of a file to the same file - Linux

In a particular directory, I made a file named "fileName" and added contents to it. When I typed cat fileName, its contents were printed on the terminal. Then I used the following command:
cat fileName > fileName
No error was shown. But now when I try to see the contents of the file using
cat fileName
nothing is shown in the terminal, and the file is empty (I checked). What is the reason for this?

Redirection to the same file will create/truncate the file before the cat command is invoked, as the shell handles the redirection first. You could avoid this by writing to an intermediate file and then copying it back to the actual file, or you could use tee:
cat fileName | tee fileName

To clarify SMA's answer: the file is truncated because redirection is handled by the shell, which opens the file for writing before invoking the command. When you run cat file > file, the shell truncates and opens file for writing, sets stdout to it, and only then executes ["cat", "file"]. So you will have to use some other command for the task, such as tee.
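A quick demonstration (using a throwaway file name):
echo hello > file
cat file > file     # the shell truncates file before cat ever runs
cat file            # prints nothing: cat copied an already-empty file to itself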

The answers given here are wrong. You will have a problem with truncation regardless of whether you use a redirect or a pipeline, although it may APPEAR to work sometimes, depending on the size of the file or the length of your pipeline. It is a race condition: the reader may get a chance to read some or all of the file before the writer starts, but the whole point of a pipeline is to run all of its commands at the same time, so they start together, and the first thing the tee executable does is open its output file (truncating it in the process). The only way you would not have a problem in this scenario is if the last command in the pipeline loaded the entire output into memory and only wrote it to the file on shutdown. That is unlikely to happen, and it defeats the point of having a pipeline.
The proper solution for making this reliable is to write to a temp file and then rename the temp file over the original filename:
TMP="$(mktemp fileName.XXXXXXXX)"
cat fileName | grep something | tee "${TMP}"
mv "${TMP}" fileName

Related

Using tee command with soft linked file

I have a script start.sh which runs another script run.sh. run.sh starts my executable.
I wanted to record what run.sh does, so in start.sh I used a tee command to log its output into a file loglink:
exec run.sh | tee -a loglink
loglink is a soft linked file.
I have a rotation logic with 3 log files, log1.txt, log2.txt and log3.txt, where each file may have a max size of only 1024 bytes. So in my code I check every 5 seconds whether the current log has reached the max size; if it has, I change the softlink loglink to point to the next file (log1.txt to log2.txt, log2.txt to log3.txt, and so on, in a circular way).
As per my understanding, when I change the softlink from log1.txt to log2.txt, tee should start printing to log2.txt. But strangely, tee is still saving output to log1.txt and not log2.txt.
And to add on:
I see the softlink change in ls -l.
When I try something like ls -l | tee loglink from another shell, it does write to log2.txt.
Why is the tee in script start.sh not recognising this link change?
Am I missing some logic here?
In short, a filename or symbolic link is just a proxy that a program uses to tell the kernel to set up a reading or writing path to the real file representation inside the kernel.
tee uses file descriptors to represent files, as its source code (from FreeBSD) shows:
for (exitval = 0; *argv; ++argv)
        if ((fd = open(*argv, append ? O_WRONLY|O_CREAT|O_APPEND :
            O_WRONLY|O_CREAT|O_TRUNC, DEFFILEMODE)) < 0) {
                warn("%s", *argv);
                exitval = 1;
        } else
                add(fd, *argv);
Once a file is opened (in your case, the symbolic link is followed and the target log file is opened), the write path to that file is established, and the symbolic link or filename is not needed anymore.
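You can see this with a small experiment (log1.txt, log2.txt and loglink are the names from the question):
touch log1.txt log2.txt
ln -s log1.txt loglink
( while true; do date; sleep 1; done ) | tee -a loglink &
sleep 3
ln -sf log2.txt loglink    # re-point the symlink while tee is running
sleep 3
kill $!                    # stop tee
wc -c log1.txt log2.txt    # log1.txt kept growing; log2.txt is still empty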
A program which opens a file keeps that file handle. If you change the link from outside, the program is not affected and keeps writing to (or reading from) the original file.
Only if your program closes the file and reopens it, it will be using the new link.
You may, for example, open a file in vlc and play it, then, while it is playing, move it to a different directory. No problem. Then delete it. You now can't open it with a new program, but the old one keeps using it until it closes the file.
It's normal behaviour, as rightly explained in the other answers.
As a solution, you should periodically close and reopen the output file in your run.sh, or use
a very nice utility for changing another process's output at runtime:
reredirect -m newfile.log `pidof run.sh`
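A minimal sketch of the reopen approach, assuming run.sh writes to stdout: append one line at a time with >>, so the symlink is re-resolved on every single write:
run.sh | while IFS= read -r line; do
    printf '%s\n' "$line" >> loglink    # each >> open follows the current symlink
done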

Why would vim create a new file every time you save a file?

I have a file named test:
[test@mypc ~]$ ls -i
4982967 test
Then I use vim to change its content and enter :w to save it.
It now has a different inode:
[test@mypc ~]$ ls -i
4982968 test
That means it's already a different file. Why would vim save it to another file when I used :w to save the original one?
You see, echo to a file will not change the inode, which is expected:
[test@mypc ~]$ echo v >> test
[test@mypc ~]$ ls -i
4982968 test
It is trying to protect you from disk and OS problems. It writes out a complete copy of the file and, when it is satisfied that this has finished properly, renames the copy to the required filename. Hence the new inode number.
If there were a crash during the save process, the original file would remain untouched, possibly saving you from losing the file completely.
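The same write-then-rename pattern can be sketched in shell (hypothetical file names):
cp test test.new           # write a complete new copy
echo change >> test.new    # ...the edits go into the copy...
mv test.new test           # atomic rename: test is now a different inode
ls -i test                 # shows the new inode number
If you need vim to write in place and keep the inode (for example because of hard links), its 'backupcopy' option controls this behaviour; see :help backupcopy.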

Copy modified content to new file in Linux

How can we write a shell script in Linux to copy newly added content from a file and append it to another file?
I have a log file where errors are stored, and I am supposed to retrieve the new errors and store them in a database table. I will run a cron job invoking the shell script at a certain interval.
Edited:
Sample Log
140530 13:48:57 [ERROR] Event Scheduler: [root@%][test.event] Table 'test.test_event' doesn't exist
140530 13:48:57 [Note] Event Scheduler: [root@%].[test.event] event execution failed.
140530 13:49:57 [ERROR] Event Scheduler: [root@%][test.event] Table 'test.test_event' doesn't exist
140530 13:49:57 [Note] Event Scheduler: [root@%].[test.event] event execution failed.
Initially I copied this into a file using cat, but later more errors will be logged, and only the newly added lines should be copied. How can I do this on a routine basis?
Kindly help! Thanks in advance!
Simplest case
You can use tail -f to keep retrieving data from a file whenever it is appended to, then use >> (appending redirect) to append it to your second file.
tail -f file1.txt >> file2.txt
will "watch" file1.txt and append new content to file2.txt.
To test that it works, open another terminal and do:
echo "Hello!" >> file1.txt
You should see "Hello!" appear in file2.txt.
Please note that this will only work if the underlying I/O operation on file1.txt is an actual append. It won't work if you open file1.txt in a text editor and change its content, for instance. It also won't work as a cron job, because tail -f needs to run continuously.
With cron
To periodically check for appends, you could do a diff on an earlier version of the file you saved somewhere, then use sed to get only those lines that were appended in the meantime:
diff file1_old.txt file1_current.txt | \
sed -r -e '/^[^>]/ d' -e 's/^> //' >> file2.txt
But then you have to take care of storing the earlier versions somewhere etc. in your cron job as well.
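Putting it together, a sketch of a script that cron could run (log.txt, log_old.txt and new_errors.txt are hypothetical names; run touch log_old.txt once before installing the job):
#!/bin/sh
# append only the lines added since the previous run, keeping just the errors
diff log_old.txt log.txt | sed -n 's/^> //p' | grep '\[ERROR\]' >> new_errors.txt
# remember the current state of the log for the next run
cp log.txt log_old.txt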
If you need to append (concatenate) one file to another, use the cat command:
cat file1.txt file2.txt > fileall.txt
But if you need to modify the contents of a file, I recommend using sed, or grep if what you need is a filter.
Sorry, your specification is a bit loose, so I cannot give you a more exact answer.
BTW, a database table? Can you please explain?

Is it OK to use the same input file as output of a piped command?

Consider something like:
cat file | command > file
Is this good practice? Could this overwrite the input file at the same time as we are reading it, or is the input always read into memory first and then piped to the second command?
Obviously, I can use a temp file as an intermediary step, but I'm just wondering...
t=$(mktemp)
cat file | command > "$t" && mv "$t" file
No, it is not OK. All commands in a pipeline execute at the same time, and the shell prepares the redirections before executing the commands. So it is likely that command will overwrite the file before cat reads it.
You need sponge(1) from moreutils.
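For example, sponge soaks up all of its input before opening the output file, so this is safe:
command < file | sponge file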
You can also use something like this (not recommended, use explicit temp files in production code):
{ rm file && your_command > file; } < file
This works because the shell opens file for reading before rm unlinks its name; the open file descriptor keeps the old contents readable while the > redirection creates a new file under the same name.
Not only should you NOT write your output to your input; you should also avoid looping your output back to your input.
When dealing with big files, I tried
cat *allfastq30 > Sample_All_allfastq30
and it generated this error message:
cat: Sample_All_allfastq30: input file is output file

Replacing file using "sort x.txt > x.txt" in Cygwin

Why is it that "sort x.txt > x.txt" clears the contents of a file while "sort x.txt > y.txt" writes the sorted file to y.txt as you would expect
The shell truncates x.txt before it invokes the command sort x.txt, so by the time the sort command is running, there is nothing to sort.
Just about all shells behave this way (including Windows CMD window); it is not just a feature of Cygwin.
When you run the command, you're effectively telling the shell to open x.txt for writing (>> would be append, which would be different) and then dump the results of sort x.txt into it. It just so happens that, since the file is opened for writing first, the shell effectively starts a new, empty file named x.txt and then executes sort x.txt, which sorts an empty file.
I'm not positive why the timing is the way it is, but I believe it may keep you from running a command whose output would go to a file you don't have permission to write to (i.e. the shell opens the file for writing first to make sure it can).
sort a > b opens both a and b, a for reading and b for writing. Because b is opened for writing, it is truncated.
When executing the command, the shell first opens the output file to write the program's output to, effectively truncating it to zero length. Then it starts the sort command, and in the sort x.txt > x.txt case this sorts the newly emptied file x.txt.
When the shell sees the command sort x.txt > x.txt, it sees that the output of the sort command needs to go into the file x.txt, so it opens x.txt for writing; this wipes out the contents of the file if it already had anything in it.
If you want to avoid this, you can redirect the sort output to a temp file and then rename the temp file to x.txt:
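sort x.txt > x.txt.tmp && mv x.txt.tmp x.txt
Alternatively, sort has a -o option that is documented to be safe even when the output file is one of the inputs, because sort reads all of its input before opening the output:
sort -o x.txt x.txt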
