command line "cat >> " behavior - linux

What does the command cat t.txt >> t.txt do? Let's say t.txt has only one line of text, "abc123". I assumed the output "abc123" would be appended to t.txt, so I should end up with two lines of "abc123". However, it just goes into an infinite loop and doesn't stop until I hit Control-C. Is this the expected behavior of >>?

The cat program opens the file for reading, reads it, and writes to standard output.
>> is the shell's append redirect.
What you are seeing is the following cycle:
1. cat reads a line from t.txt
2. cat writes the line to standard output
3. because of the redirect, the line is appended to t.txt
4. cat tests whether it is at the end of the file
The test in step 4 is always false, because by the time the EOF check happens a new line has been written. cat never finishes because the write always happens first.
If you want to prevent that behavior, you can add a buffer in between:
$ cat t.txt | cat >> t.txt
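This way, for a small file, the write usually occurs only after cat t.txt has reached EOF. This depends on timing and on the pipe's buffer size, so it is not guaranteed.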

What you are trying to do by:
cat t.txt >> t.txt
is like telling your system to read t.txt line by line and append each line to t.txt. Put another way: "append the file to itself". The file keeps filling up with repetitions of its original contents, which is the reason for your infinite loop.
Generally speaking, try to stay away from reading and writing the same file using redirections. Is it not possible to break this down into two steps: 1. read from the file and write to a temporary file, 2. append the temporary file to the original file?
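The two-step approach can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the scratch directory and the sample contents of t.txt are assumptions made so the sketch can run without touching real files.

```shell
# Work in a scratch directory (assumption: we only want a demo file).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'abc123\n' > t.txt

# Step 1: snapshot the current contents into a temporary file.
tmp=$(mktemp)
cat t.txt > "$tmp"

# Step 2: append the snapshot to the original, then clean up.
cat "$tmp" >> t.txt
rm -f "$tmp"

cat t.txt
```

Because cat never reads and writes the same file at once, the command finishes and t.txt ends up with the original line twice.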

cat is a command in unix-like systems that concatenates multiple input files and sends their result to the standard output. If only one file is specified, it just outputs that one file. The >> part redirects the output to the given file name, in your case t.txt.
But what you have says: append the contents of t.txt to t.txt itself, while it is still being read. I don't think this behavior is defined, so I'm not surprised that you get an infinite loop!

Related

How do you append a string built with interpolation of vars and STDIN to a file?

Can someone fix this for me?
After moving to a repo directory, the script should copy a version log file to a backup.
Then it should append the line given as input to the log file, with some formatting.
That's it.
Assume existence of log file and test directory.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG |
VHENTRY="- **${LOGDATE}** | ${VHMSG}"
cat ${VHENTRY} >> versionlog.MD
shell output
virufac#box:~/Git/test$ ~/.logvh.sh
MSG > testing script
EOF
EOL]
EOL
e
E
CTRL + C to get out of stuck in reading lines of input
virufac#box:~/Git/test$ cat versionlog.MD
directly outputs the markdown
# Version Log
## version 0.0.1 established 01-22-2020
*Working Towards Working Mission 1 Demo in 0.1 *
- **01-22-2020** | discovered faker.Faker and deprecated old namelessgen
EOF
EOL]
EOL
e
E
I finally got it to save the damned input lines to the file instead of just echoing the command I wanted to run on the screen without executing it. But why isn't it adding the line built from the VHENTRY variable? And why does it sometimes stop reading after one line, but not this time? You can see I was trying to find something that would tell it to stop reading input.
After realizing that something I had done in the script was an accident, I tried to fix it and saw that the | at the end of the read command was seemingly the only reason the script saved anything to the file in the first place.
I would have done this in python3 if I had known this script wouldn't be the simplest thing I had ever done. Now I just have to know how you do it, after all the time spent on it, so that I can remember never to think a shell script will save time again.
Use printf to write a string to a file. cat treats its arguments as the names of files to read, and the argument - means read from standard input until EOF. Because ${VHENTRY} is unquoted and begins with "- ", cat receives - as its first argument, so your script hangs waiting for you to type input.
Don't put quotes around the path when it starts with ~, as the quotes make it a literal instead of expanding to the home directory.
Get rid of | at the end of the read line. read doesn't write anything to stdout, so there's nothing to pipe to the following command.
There isn't really any need for the VHENTRY variable, you can do that formatting in the printf argument.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG
printf -- '- **%s** | %s\n' "${LOGDATE}" "$VHMSG" >> versionlog.MD
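To see why the original cat ${VHENTRY} line waited for input, it can help to look at the words the shell produces from the unquoted variable. The following is only an illustrative sketch; the sample date and message are assumptions, and set -f is used so that ** is not expanded against file names.

```shell
VHENTRY="- **01-22-2020** | testing script"

set -f             # disable globbing for the demonstration
set -- $VHENTRY    # intentionally unquoted: the shell splits it into words
set +f

argc=$#
first=$1
printf 'argc=%s first=%s\n' "$argc" "$first"
```

The first word is a lone -, which cat interprets as "read standard input"; the remaining words would be treated as file names.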

understanding I/O redirection in Unix

I am not sure I fully understand redirection in Unix.
I was playing around with this command: cat hello.txt world.txt > world.txt
The output I get is strange to me. I was guessing that world.txt would be emptied first and the content of hello.txt would then be written to it.
When I run the command and later check the content of world.txt, I see the content of hello.txt copied over and over.
For example, if hello.txt has the content "Hello World!", then world.txt contains "Hello World!" repeated for hundreds of lines.
Is there any reason why this happens?
Thank you.
Your understanding of redirection is fine; it's your understanding of cat that is lacking. Your implementation of cat writes the contents of hello.txt to world.txt (which was indeed empty when cat started due to the redirection), then goes into the following loop:
1. Read from world.txt.
2. If no data was found, exit.
3. Write data to world.txt.
4. GOTO 1
POSIX doesn't require such a loop; cat could (I think) simply read the entire contents of world.txt first, write it out to standard output, then exit without checking if world.txt has grown. But I suspect no implementation would ever actually do that, for performance reasons.
First, cat will open hello.txt and effectively copy its contents to world.txt because of the redirection. What happens next, however, is that cat opens world.txt, which is now a copy of hello.txt. As cat reads data from world.txt, its output goes to the redirection target (i.e., world.txt again), meaning there is always more to read (assuming infinite disk).
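If the goal is to end up with hello.txt followed by the previous contents of world.txt, one way to avoid the loop is to write to a temporary file and rename it afterwards. This is a sketch under assumptions: the scratch directory, the sample file contents, and the temporary file name pattern are all made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Hello World!\n' > hello.txt
printf 'existing line\n' > world.txt

# Write the concatenation somewhere else, so world.txt is neither
# truncated nor growing while cat is still reading it.
tmp=$(mktemp ./world.XXXXXX)
cat hello.txt world.txt > "$tmp" && mv "$tmp" world.txt

cat world.txt
```

Note that the replaced world.txt takes on the permissions of the temporary file, which may differ from the original's.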

Redirecting output of a program to a rotating file

I am trying to redirect the output of a continuously running program to a file say error_log.
The command I am using looks like this, where myprogram.sh runs continuously generating data and writing to /var/log/httpd/error_log
myprogram.sh >> /var/log/httpd/error_log
Now I am using logrotate to rotate this file per hour. I am using create command in logrotate so that it renames the original file and creates a new one.
The logrotate config looks like
/var/log/httpd/error_log {
# copytruncate
create
rotate 4
dateext
missingok
ifempty
.
.
.
}
But here the redirection fails. What I want is for myprogram.sh to keep writing data to error_log regardless of it being rotated by logrotate, that is, to the newly created error_log file.
Any idea how to make the redirection work based on the file name and not the descriptor?
OR
Is there any other way of doing it in bash?
If I understood your problem, one solution (without modifying myprogram.sh) could be:
$ myprogram.sh | while true; do head -10 >> /var/log/httpd/error_log; done
Explaining:
myprogram.sh writes to stdout
We send this output to the bash while statement through a pipe |.
while true is an infinite loop that never ends, not even when myprogram.sh ends, which should break the pipe.
In each iteration, head is called to append the first 10 lines read from the pipe to the end of the current /var/log/httpd/error_log file (which may be a different file from the previous iteration because of logrotate).
(You can change the number of lines written in each iteration. Be aware that head may read more than 10 lines from the pipe at once and discard the excess, so lines can be lost if myprogram.sh writes quickly.)
And another way is:
$ myprogram.sh | while IFS= read -r line; do printf '%s\n' "$line" >> /var/log/httpd/error_log; done
That's very similar to the first one, but this loop ends when myprogram.sh ends or closes its stdout.
It works line by line instead of in groups of lines.
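The reason the line-by-line loop survives rotation is that the log file is reopened by name for every line. A small simulation may make this clearer; here producer is a hypothetical stand-in for myprogram.sh, and the mv imitates what logrotate's create mode does when it renames the file.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for myprogram.sh.
producer() {
    printf 'line 1\n'
    printf 'line 2\n'
}

producer | while IFS= read -r line; do
    # The >> redirection opens error_log by name on every iteration.
    printf '%s\n' "$line" >> error_log
    # Simulate a rotation right after the first line has been written.
    [ -e error_log.1 ] || mv error_log error_log.1
done

cat error_log.1 error_log
```

After the simulated rotation, the second line lands in a freshly created error_log rather than in the renamed file.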

How to use sed command to delete lines without backup file?

I have a large file, 130GB in size.
# ls -lrth
-rw-------. 1 root root 129G Apr 20 04:25 syslog.log
I need to reduce the file size by deleting the lines which start with "Nov 2", so I ran the following command:
sed -i '/Nov 2/d' syslog.log
I can't edit the file with the Vim editor either.
When I run the sed command, it also creates a backup file, but I don't have much space in root. Please suggest an alternative way to delete particular lines from this file without using more space on the server.
It does not create a real backup file. sed is a stream editor. When applied to a file with the option -i, it streams that file through the sed process, writes the output to a new (temporary) file and, when everything is done, renames the new file to the original name.
(There are options to create backup files also, but you didn't give them, so I won't mention that further.)
In your case you have a very large file and don't want to create any copy, however temporary. For this you need to open the file for reading and writing at the same time, then your sed process can overwrite the original. After this, you will have to truncate the file at the end of the writing.
To demonstrate how this can be done, we first perform a test case.
Create a test file, containing lots of lines:
seq 0 999999 > x
Now, let's say we want to remove all lines containing the digit 4:
grep -v 4 1<>x <x
This will open the file for reading and writing as STDOUT (1), and for reading as STDIN. The grep command will read all lines and will output only the lines not containing a 4 (option -v).
This will effectively overwrite the beginning of the original file.
You will not know how long the output is, so after the output the original contents of the file will appear:
…
999991
999992
999993
999995
999996
999997
999998
999999
537824
537825
537826
537827
537828
537829
…
You can use the Unix tool truncate to shorten your file manually afterwards. In a real scenario you will have trouble finding the right spot for this, so it makes sense to count the number of bytes written (using wc):
(Don't forget to recreate the original x for this test.)
(grep -v 4 <x | tee /dev/stderr 1<>x) |& wc -c
This will perform the step above and additionally print the number of bytes written to the terminal; in this example the output will be 3653658. Now use truncate:
truncate -s 3653658 x
Now you have the result you want.
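If you want to do this in a script, i.e. without interaction, you can use this (note the space between $( and (, which keeps the shell from reading it as the start of an arithmetic expansion):
length=$( (grep -v 4 <x | tee /dev/stderr 1<>x) |& wc -c )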
truncate -s "$length" x
I cannot guarantee that this will work for files >2GB or >4GB on your machine; depending on your operating system (32bit?) and the versions of the installed tools you might run into largefile issues. I'd perform tests with large files first (>4GB as this is typically a limit for many things) and then cross your fingers and give it a try :)
Some caveats you have to keep in mind:
Of course, nobody is supposed to append log entries to that log file while the procedure is running.
Also, any abort during the running of the process (power failure, signal caught, etc.) will leave the file in an undefined state. But re-running the command again after such a mishap will in most cases produce the correct output; some lines might be doubled, but not more than a single line should be corrupted then.
The output must be smaller than the input, of course, otherwise the writing will overtake the reading, corrupting the whole result so that lines which should be there will be missing (or truncated at the start).
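Applied to the original question, the same technique might look like the sketch below. It is an illustration only: the tiny sample syslog.log is made up, the pattern ^Nov 2 is an assumption about what "starts with Nov 2" means, truncate is assumed to be the GNU coreutils tool, and 2>&1 | is used in place of |& so the snippet also runs in a plain POSIX shell.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Nov 2 a\nNov 3 b\nNov 2 c\nDec 1 d\n' > syslog.log

# grep reads the file; tee overwrites it from the start (1<> opens it
# without truncating) and copies the same bytes to stderr, which is
# piped to wc so we learn how many bytes were written.
length=$( (grep -v '^Nov 2' <syslog.log | tee /dev/stderr 1<>syslog.log) 2>&1 | wc -c | tr -d ' ' )

# Cut off the stale tail left over from the original contents.
truncate -s "$length" syslog.log

cat syslog.log
```

As with the test case above, try this on a copy first; the caveats listed for the procedure apply unchanged.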

How can I replace a specific line by line number in a text file?

I have a 2GB text file on my linux box that I'm trying to import into my database.
The problem I'm having is that the script that is processing this rdf file is choking on one line:
mismatched tag at line 25462599, column 2, byte 1455502679:
<link r:resource="http://www.epuron.de/"/>
<link r:resource="http://www.oekoworld.com/"/>
</Topic>
=^
I want to replace the </Topic> with </Line>. I can't do a search/replace on all lines, but I do have the line number, so I'm hoping there's some easy way to replace just that one line with the new text.
Any ideas/suggestions?
sed -i yourfile.xml -e '25462599s!</Topic>!</Line>!'
sed -i '25462599 s|</Topic>|</Line>|' nameoffile.txt
The tool for editing text files in Unix is called ed (as opposed to sed, which, as the name implies, is a stream editor).
ed was originally intended as an interactive editor, but it can also easily be scripted. The way ed works is that all commands take an address parameter. The way to address a specific line is just the line number, and the way to change the addressed line(s) is the s command, which takes the same regexp that sed would. So, to change the 42nd line, you would write something like 42s/old/new/.
Here's the entire command:
FILENAME=/path/to/whereever
LINENUMBER=25462599
ed -- "${FILENAME}" <<-HERE
${LINENUMBER}s!</Topic>!</Line>!
w
q
HERE
The advantage of this is that ed is standardized, while the -i flag to sed is a non-standard extension that is not available on every system and behaves differently between GNU and BSD sed.
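If neither ed nor sed -i is available, a portable fallback is plain sed writing to a temporary file that is then renamed. The sketch below uses a three-line sample.xml and a .tmp suffix that are assumptions for illustration; for the real 2GB file this needs roughly as much free disk space as the file itself.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '<Topic>\n<link/>\n</Topic>\n' > sample.xml

LINENUMBER=3
# Only the addressed line is subject to the substitution.
sed "${LINENUMBER}s!</Topic>!</Line>!" sample.xml > sample.xml.tmp &&
    mv sample.xml.tmp sample.xml

cat sample.xml
```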
Use "head" to get the first 25462598 lines and "tail" to get the remaining lines (starting at 25462600), and write the replacement line in between. Though... for a 2GB file this will likely take a while.
Also, are you sure the problem is just with that line and not somewhere earlier? (The error looks like an XML parse error, which might mean the actual problem is someplace else.)
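The head/tail idea can be sketched like this: keep everything before the target line, emit the replacement, then resume at the line after it. The sample file, the output name sample.fixed.xml and the variable n are assumptions for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '<Topic>\n<link/>\n</Topic>\n<Topic>\n' > sample.xml

n=3   # number of the line to replace
{
    head -n "$((n - 1))" sample.xml     # lines 1 .. n-1
    printf '%s\n' '</Line>'             # the replacement line
    tail -n "+$((n + 1))" sample.xml    # lines n+1 .. end
} > sample.fixed.xml

cat sample.fixed.xml
```

Unlike the s command, this replaces the whole line, so any indentation on the original line has to be included in the replacement text.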
My shell script:
#!/bin/bash
awk -v line="$1" -v new_content="$2" '{
    if (NR == line) {
        print new_content;
    } else {
        print $0;
    }
}' "$3"
Arguments:
first: line number you want change
second: text you want instead original line contents
third: file name
This script prints its output to stdout, so you need to redirect it to a new file. Example:
./script.sh 5 "New fifth line text!" file.txt > new_file.txt
You can improve it, for example, by checking that all arguments have the expected values.
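A compact variant of the same idea, wrapped in a function and combined with a temporary file so the original can be replaced safely, might look like the following. The function name replace_line and the sample file are hypothetical. Note that awk interprets backslash escapes in values passed with -v, so replacement text containing backslashes would need extra care.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# replace_line LINE TEXT FILE: print FILE with line LINE replaced by TEXT.
replace_line() {
    awk -v line="$1" -v new_content="$2" \
        'NR == line { print new_content; next } { print }' "$3"
}

printf 'one\ntwo\nthree\n' > file.txt
# Never redirect straight back into the file being read.
replace_line 2 "New second line text!" file.txt > file.txt.new &&
    mv file.txt.new file.txt

cat file.txt
```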
