Catalina.out not logging after being edited by a bash script - linux

This is a Tomcat 7 installation with the default logging configuration; catalina.out is only rolled over when we restart the server. As it is a prod server, we cannot restart it very often. A huge number of entries go to that file, which causes catalina.out to grow very large within a few days until it consumes all the disk space.
We don't want to change the logging configuration, as it is puppetized and we would need to create devops tickets and all that slow stuff, so I wrote a bash script, run every 5 minutes via crontab, that cuts the log file in half when a size limit is reached. The script looks like the following:
if [ "$catalinaSize" -gt "$catalinaThreshold" ]; then
    middle=$(wc -l "$catalinaLoc" | awk '{ print $1 }')
    middle=$(( middle / 2 ))
    sed -i -e "1,${middle}d" "$catalinaLoc"
    echo "+++ catalina.out was cut by half"
fi
Basically, this script checks the current size of the file and compares it to a threshold value; it then uses wc and awk to retrieve the number of lines in the file so it can use sed to cut the file in half.
I tested the script in other environments and it worked. The problem is that, after several days of successful runs in production, catalina.out suddenly stopped receiving any log entries from Tomcat a few days ago.
The explanation I can think of is that Tomcat is no longer able to write into that file because of the cut-by-half operation.
Is it possible to find out what is preventing Tomcat from writing into that file?

I suspect it is sed -i doing the damage: behind the scenes, it writes the output stream to a temp file, then moves the temp file over the original name. The file handle held by Catalina therefore still points to the old, now-unlinked file, not to the new one.
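One way to see this for yourself is to compare inode numbers on a scratch copy of the log (a minimal sketch; the test path is just an example):
cp catalina.out /tmp/test.log       # work on a copy, not the live log
ls -i /tmp/test.log                 # note the inode number
sed -i -e '1d' /tmp/test.log        # "in-place" delete of one line
ls -i /tmp/test.log                 # a different inode: sed replaced the file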
You'll have to find a way to actually edit the file, not replace it. This might be a valid replacement for sed:
printf "%s\n" "1,${middle}d" "wq" | ed "$catalinaLoc"
Tangentially, an easier way to get the number of lines:
middle=$(( $(wc -l < "$catalinaLoc") / 2 ))
When you redirect the file into wc instead of passing it as an argument, it no longer prints the filename.
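Putting the two suggestions together, the cron script from the question might end up looking like this (a sketch only, reusing the question's variable names):
if [ "$catalinaSize" -gt "$catalinaThreshold" ]; then
    middle=$(( $(wc -l < "$catalinaLoc") / 2 ))
    # ed rewrites the file in place, so the name keeps pointing at the same file
    printf "%s\n" "1,${middle}d" "wq" | ed "$catalinaLoc"
    echo "+++ catalina.out was cut by half"
fi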

Related

Need suggestion to move a big live file in linux

Multiple scripts are running on my Linux server and generating huge amounts of data. I realise they will eat all of my 500 GB of storage in the next 2-5 days, while the scripts need about 10 more days to finish, which means they need more space. So I am most likely going to run into a space problem and will have to restart the entire process again.
The process looks like this:
script1.sh contains something like this:
"calling an api" > /tmp/output1.txt
script2.sh contains something like this:
"calling an api" > /tmp/output2.txt
They are executed like this:
nohup ./script1.sh &   ### this creates the file /tmp/output1.txt
nohup ./script2.sh &   ### this creates the file /tmp/output2.txt
My initial understanding was that if I followed the steps below, it should work:
While the scripts are running with nohup in the background, execute this command:
mv /tmp/output1.txt /tmp/output1.txt_bkp; touch /tmp/output1.txt
Then I would transfer /tmp/output1.txt_bkp to another server via FTP and remove it afterwards to free space on the server, while the script keeps writing to /tmp/output1.txt.
But this assumption was wrong: the script keeps writing to /tmp/output1.txt_bkp. I think the script writes based on the inode number, which is why it keeps writing to the old file.
Now the question is: how do I avoid the space issue without killing/restarting the scripts?
Essentially what you're trying to do is pull a file out from under a script that's actively writing into it. I'm not sure how nohup would let you do that.
May I suggest a different approach?
Why don't you move a number of lines from your /tmp/output[x].txt to /tmp/output[x].txt_bkp? You can do so without much trouble while your script is running and dumping data into /tmp/output[x].txt. That way you can free up space by shrinking your output[x] files.
Try this as a test. Open two terminals (or use screen) on your Linux box and make sure both are in the same directory. Run this command in one of them:
for line in `seq 1 2000000`; do echo $line >> output1.txt; done
And then run this command in the other before the first one finishes:
head -1000 output1.txt > output1.txt_bkp && sed -i '1,+999d' output1.txt
Here is what's going to happen. The first command will start producing a file that looks like this:
1
2
3
...
2000000
The second command will chop off the first 1000 lines of output1.txt and put them into output1.txt_bkp and it will do so WHILE the file is being generated.
Afterwards, look inside output1.txt and output1.txt_bkp; you will see that the former looks like this:
1001
1002
1003
1004
...
2000000
While the latter will have the first 1000 lines. You can do the same exact thing with your logs.
A word of caution: Based on your description, your box is under a heavy load from all that dumping. This may negatively impact the process outlined above.
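If your writers behave like the >> loop in the test (re-opening the file for each append) rather than holding a single descriptor open, the same trim can be wrapped in a small script and run periodically; the batch size and the backup naming below are placeholders, not part of the answer:
#!/bin/bash
# Sketch: move the oldest lines of each live output file into a backup,
# then ship the backup off the box to free space.
for f in /tmp/output1.txt /tmp/output2.txt; do
    bkp="${f}_bkp.$(date +%Y%m%d%H%M%S)"
    head -n 100000 "$f" > "$bkp" && sed -i '1,100000d' "$f"
    # transfer "$bkp" to the other server (ftp/scp) and then delete it locally
done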

How to take control on files in Linux before processing starts - bash

I am currently working on a project to automate a manual task in my office. We have a process where we have to re-trigger some of our IDs when they fall into repair. As part of the process, we have to extract those IDs from an Oracle DB table, put them in a file on our Linux server, and run a command like this:
Example file:
$cat /task/abc_YYYYMMDD_1.txt
23456
45678
...and so on
cat abc_YYYYMMDD_1.txt | scripttoprocess -args
I am using an existing Java-based code called 'scripttoprocess'. I can't see what's inside this code, as it seems to be encrypted. I simply go to the location where my files are present and then use it like this:
cd /export/incoming/task
for i in `ls abc_YYYYMMDD*.txt`; do
    cat $i | scripttoprocess -args
    if [ $? -eq 0 ]; then
        mv $i /export/incoming/HIST/
    fi
done
scripttoprocess is an existing script; I am just calling it in my own script, which runs continuously in a loop in the background. It simply searches for abc_YYYYMMDD_1.txt files in the /task directory and, if it detects such a file, starts processing it. But I have noticed that my script starts processing the file well before it is fully written, and sometimes moves the file to HIST without fully processing it.
How can I handle this situation? I want to be fully sure that the file is completely written before I start processing it. Secondly, is there any way to take control of the files, for example by preparing a control file which contains the list of files present in the /task directory? Then I could cat this control file and pick up the file names from inside it. Your guidance will be much appreciated.
I used
iwatch -e close_write -c "/usr/bin/pdflatex -interaction batchmode %f" document.tex
to run a command (LaTeX-to-PDF conversion) when a file (document.tex) is closed after being written to, which you could do as well.
However, there is a caveat: this was only meant to catch manual edits to the file, and failure was not critical. It therefore ignores the case where the file is opened and written to again immediately after closing. Ask yourself whether that is good enough for you.
I agree with @TenG: normally you shouldn't move a file until it is fully written. If you know for sure that the file is finished (like a file from yesterday), then you can move it safely; otherwise you can process it, but not move it. You can, for example, process part of it and remember the number of processed rows so that you don't restart from scratch next time.
If you really really want to work with files that are "in progress", sometimes tail -F works for this case, but then your bash script is an ongoing process as well, not a job, and you have to manage it.
You can also check if a file is currently open (and thus unfinished) using lsof (see https://superuser.com/questions/97844/how-can-i-determine-what-process-has-a-file-open-in-linux ; check if file is open with lsof ).
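A minimal sketch of that check applied to the loop from the question (assuming lsof is installed and the writer opens the file under the same path):
cd /export/incoming/task
for f in abc_YYYYMMDD*.txt; do
    if lsof "$f" > /dev/null 2>&1; then
        continue                      # some process still has the file open
    fi
    cat "$f" | scripttoprocess -args && mv "$f" /export/incoming/HIST/
done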
Change the process that extracts the IDs from the Oracle DB table.
You can use the mv approach commented by @TenG, or put something special in the file that shows the work is done:
#!/bin/bash
source file_that_runs_sqlcommands_with_credentials
output=$(your_sql_function "select * from repairjobs")
# Something more for removing them from the table and check the number of deleted records
printf "%s\nFinished\n" "${output}" >> /task/abc_YYYYMMDD_1.txt
or
#!/bin/bash
source file_that_runs_sqlcommands_with_credentials
output=$(your_sql_function "select * from repairjobs union select 'EOF' from dual")
# Something more for removing them from the table and check the number of deleted records
printf "%s\n" "${output}" >> /task/abc_YYYYMMDD_1.txt

Linux tail on rotating log file using busybox

In my bash script I am trying to monitor the output of the /var/log/messages log file - and to keep doing so even when the file rotates (is re-created and started again). I tried using tail -f filename but quickly realised this is no good for when the file rotates.
So there are lots of answers for using tail -F filename or tail -f --retry filename (and a few other variants).
But on my embedded Linux I am using busybox which has a lightweight version of tail:
tail [OPTIONS] [FILE]...
Print last 10 lines of each FILE to standard output. With more than one
FILE, precede each with a header giving the file name. With no FILE, or
when FILE is -, read standard input.
Options:
-c N[kbm] Output the last N bytes
-n N[kbm] Print last N lines instead of last 10
-f Output data as the file grows
-q Never output headers giving file names
-s SEC Wait SEC seconds between reads with -f
-v Always output headers giving file names
If the first character of N (bytes or lines) is a '+', output begins with
the Nth item from the start of each file, otherwise, print the last N items
in the file. N bytes may be suffixed by k (x1024), b (x512), or m (1024^2).
So I can't do the usual tail -F ... since that option is not implemented. The help text above is from the latest busybox version - mine is a bit older.
So I need another way of logging /var/log/messages since the file gets overwritten at a certain size.
I was thinking of some simple bash one-liner. I saw things like inotifywait, but busybox does not have that. I looked at the busybox docs and there is an inotifyd, but my version does not have that particular command either. So I am wondering if there is a clever way of doing this with simple Linux commands, or a combination of commands like watch, tail -f, and cat/less/more etc. I can't quite figure out what I need to do with the limited commands that I have :(
How are the logs rotated? Are you using a logrotate utility?
If yes, have you tried adding your line to the postrotate section in the config file?
from man logrotate
postrotate/endscript
The lines between postrotate and endscript (both of which must
appear on lines by themselves) are executed after the log file
is rotated. These directives may only appear inside of a log
file definition. See prerotate as well.
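For example, a stanza along these lines would run a command of your choice after each rotation (the log path, size and the restart script are placeholders, not part of the answer):
/var/log/messages {
    size 1M
    rotate 4
    postrotate
        # re-launch whatever is tailing the log so it re-opens the new file
        /usr/local/bin/restart-log-monitor.sh
    endscript
}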

How to run a script automatically when the user logs out in Linux?

I need to implement a feature that monitors which user logs in or out of the Linux desktop. When a user logs in or out, a script needs to be run automatically to notify a daemon process which user logged in or out.
I searched Google and found that a script under /etc/profile.d will be run automatically after the user logs in.
But I didn't find a common solution that runs a script automatically when the user logs out. It looks like the solution differs between Linux distributions. For example:
For Ubuntu, I need to modify the file /etc/lightdm/lightdm.conf
I need to support multiple Linux distributions, including CentOS, Ubuntu, Red Hat, and so on. If I use a different solution for each Linux distribution, my code will become very complicated.
I would like to find a common solution for different Linux distributions. Can you please give me some clues?
In bash, the ~/.bash_logout file is executed when a login shell exits, so place the script you want to run in it.
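A minimal example of what could go in it (the logger call is only a placeholder for whatever notifies your daemon):
# ~/.bash_logout -- runs when a bash login shell exits
logger -t logout-watch "user $USER logged out"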
Simply find out who is logged in and record when you first see them and when you no longer see them, then read the crontab manual page and install a process that keeps track of this.
the basic command: who | awk '{ print $1 }' | sort -u
set -- /tmp/whoseloggedin /tmp/whoWASloggedin
Saving the data: ... | tee $1
comm -23 $1 $2 | sed "s/^/$(date) /" >> /tmp/justloggedIN
comm -13 $1 $2 | sed "s/^/$(date) /" >> /tmp/justloggedOFF
mv $1 $2
Sleep for a second or two, and repeat.
You might want to store the data in a more reliable place than /tmp/.
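Put together, that loop might look like this (paths kept from the fragments above; the two-second interval is arbitrary):
#!/bin/bash
set -- /tmp/whoseloggedin /tmp/whoWASloggedin
touch "$1" "$2"
while true; do
    who | awk '{ print $1 }' | sort -u > "$1"
    # present now but not in the previous snapshot -> just logged in
    comm -23 "$1" "$2" | sed "s/^/$(date) /" >> /tmp/justloggedIN
    # present in the previous snapshot but not now -> just logged off
    comm -13 "$1" "$2" | sed "s/^/$(date) /" >> /tmp/justloggedOFF
    mv "$1" "$2"
    sleep 2
done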

How to make nohup.out update with perl script?

I have a perl script that copies a large number of files. It prints some text to standard output and also writes a logfile. However, when running it with nohup, both of these show up as blank files:
tail -f nohup.out
tail -f logfile.log
The files don't update until the script is done running. Moreover, for some reason tailing the .log file does work if I don't use nohup!
I found a similar question for Python (How come I can't tail my log?).
Is there a similar way to flush the output in perl?
I would use tmux or screen, but they don't exist on this server.
Check perldoc,
HANDLE->autoflush( EXPR );
To disable buffering on standard output, that would be:
use IO::Handle;    # may be needed on older perls before calling autoflush on a handle
STDOUT->autoflush(1);
