Process logging binary data to log file - node.js

I'm starting a node process with the following upstart script, logging stdout & stderr into separate files:
script
sudo -u node /usr/local/bin/node /var/node/services/someServer.js 1> /var/log/node/someServer.log 2> /var/log/node/someServer.error.log
end script
The problem is that both log files have binary data at the beginning. I can't use less or more to quickly check those logs, which is terribly annoying. Any ideas how I can prevent the process from logging binary data?

Try opening the files with less using the -f and -R options: -f will force less to open binary files, and -R will handle control characters better, if there are any. Does cat display the contents OK?
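For example, with the log paths from the question:
less -fR /var/log/node/someServer.log
less -fR /var/log/node/someServer.error.log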

Related

Redirect wget screen output to a log file in bash

First of all, thank you everyone for your help. I have the following file that contains a series of URLs:
Salmonella_enterica_subsp_enterica_Typhi https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/717/755/GCF_003717755.1_ASM371775v1/GCF_003717755.1_ASM371775v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Paratyphi_A https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/818/115/GCF_000818115.1_ASM81811v1/GCF_000818115.1_ASM81811v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Paratyphi_B https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/705/GCF_000018705.1_ASM1870v1/GCF_000018705.1_ASM1870v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Infantis https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/011/182/555/GCA_011182555.2_ASM1118255v2/GCA_011182555.2_ASM1118255v2_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Typhimurium_LT2 https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_translated_cds.faa.gz
Salmonella_enterica_subsp_diarizonae https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/324/755/GCF_003324755.1_ASM332475v1/GCF_003324755.1_ASM332475v1_translated_cds.faa.gz
Salmonella_enterica_subsp_arizonae https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/635/675/GCA_900635675.1_31885_G02/GCA_900635675.1_31885_G02_translated_cds.faa.gz
Salmonella_bongori https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/006/113/225/GCF_006113225.1_ASM611322v2/GCF_006113225.1_ASM611322v2_translated_cds.faa.gz
I have to download each URL using wget. I have already managed to download them, but wget's typical output appears in the shell:
--2021-04-23 02:49:00-- https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/635/675/GCA_900635675.1_31885_G02/GCA_900635675.1_31885_G02_translated_cds.faa.gz
Reusing existing connection to ftp.ncbi.nlm.nih.gov:443.
HTTP request sent, awaiting response... 200 OK
Length: 1097880 (1,0M) [application/x-gzip]
Saving to: ‘GCA_900635675.1_31885_G02_translated_cds.faa.gz’
GCA_900635675.1_31885_G0 100%[=================================>] 1,05M 2,29MB/s in 0,5s
2021-04-23 02:49:01 (2,29 MB/s) - ‘GCA_900635675.1_31885_G02_translated_cds.faa.gz’ saved [1097880/1097880]
I want to redirect that output to a log file. Also, as the files are downloaded, I want to decompress them, because they are gzipped (.gz). My code is the following:
cat $ncbi_urls_file | while read line
do
    echo " Downloading fasta files from NCBI..."
    awk '{print $2}' | wget -i-
done
wget does have options for logging to a file; from man wget:
Logging and Input File Options
-o logfile
--output-file=logfile
Log all messages to logfile. The messages are normally reported to standard error.
-a logfile
--append-output=logfile
Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created.
-d
--debug
Turn on debug output, meaning various information important to the developers of Wget if it does not work properly. Your system administrator may have chosen to compile Wget without debug support, in which case -d will not work. Please note that compiling with debug support is always safe---Wget compiled with the debug support will not print any debug info unless requested with -d.
-q
--quiet
Turn off Wget's output.
-v
--verbose
Turn on verbose output, with all the available data. The default output is verbose.
-nv
--no-verbose
Turn off verbose without being completely quiet (use -q for that), which means that error messages and basic information still get printed.
You would need to experiment to get exactly what you need. If you want all logs in a single file, use -a log.out, which makes wget append its logging information to that file instead of writing it to stderr.
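For example, to collect the messages from a download in a log file instead of on stderr (log.out is just an example name):
wget -a log.out example.com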
Standard output can be redirected to a file in bash using the >> operator (for appending to the file) or the > operator (for truncating / overwriting the file). e.g.
echo hello >> log.txt
will append "hello" to log.txt. If you still want to be able to see the output in your terminal and also write it to a log file, you can use tee:
echo hello | tee log.txt
However, wget outputs most of its basic progress information through standard error rather than standard output. This is actually a very common practice: displaying progress often involves special characters that overwrite lines (e.g. to update a progress bar), change terminal colors, and so on. A terminal can process these characters sensibly in real time, but it usually does not make much sense to store them in a file. For this reason, incremental progress output is often kept separate from the output that is worth storing in a log file, so that each can be redirected independently, and hence it is usually written to standard error rather than standard output.
However, you can still redirect standard error to a log file:
wget example.com 2>> log.txt
Or using tee:
wget example.com 2>&1 | tee log.txt
(2>&1 redirects standard error through standard output, which is then piped to tee).
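Putting this together with the loop from the question, one possible sketch (variable and file names are examples, and gunzip is assumed to be available for decompressing the .gz files the question mentions):
#!/bin/bash
# Sketch only: read "name URL" pairs from the input file, append wget's
# messages to a log file, then decompress each downloaded .gz archive.
while read -r name url; do
    echo "Downloading $name..."
    wget -a wget.log "$url"
    gunzip -f "$(basename "$url")"
done < "$ncbi_urls_file"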

How to suppress printing of warning messages in a bash script?

I have a bash script which, when it finishes running, writes some files into a directory.
It prints a lot of warnings, which slows down the process, so I'm looking for an effective way to prevent the warnings from being printed in the shell.
I added 2> /dev/null to the end of my command in mybash.sh:
#!/bin/bash
command -f file 2> /dev/null
the original command:
java -mx28000M -jar ChromHMM.jar BinarizeBam CHROMSIZES/hg19.txt /Volumes/Control/ /primary_bam_files/FCX/Control/Marks_Run1.txt /Volumes/Control/output1/
When I run mybash.sh, it writes 20 different output files into a directory.
However, when I use 2>/dev/null, it does not print any warnings in the shell, which is great, but it also does not write anything to the output directory, which should in principle contain those 20 files.
Can anyone help me solve this problem?
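One way to keep the shell quiet without throwing the warnings away would be to redirect stderr to a file instead of /dev/null; a minimal sketch using the command from the question (warnings.log is just an example name):
#!/bin/bash
# Sketch: warnings and other stderr output go to warnings.log instead of the terminal.
java -mx28000M -jar ChromHMM.jar BinarizeBam CHROMSIZES/hg19.txt \
    /Volumes/Control/ /primary_bam_files/FCX/Control/Marks_Run1.txt \
    /Volumes/Control/output1/ 2> warnings.log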

Python - read from a remote logfile that is updated frequently

I have a logfile that is written constantly on a remote networking device (F5 bigip). I have a Linux hopping station from where I can fetch that log file and parse it. I did find a solution that would implement a "tail -f", but I cannot use nice or similar to keep my script running after I log out. What I can do is run a cronjob and copy the file over every 5 minutes, let's say. I can process the file I downloaded, but the next time I copy it, it will contain a lot of data in common with the previous copy, so how do I process only what is new? Any help or suggestions are welcome!
Two possible (non-Python) solutions to your problem. If you want to keep a script running on your machine after logout, check nohup in combination with &, like:
nohup my_program > /dev/null &
On a Linux machine you can extract the difference between the two files with
grep -Fxv -f old.txt new.txt > dif.txt
This might be slow if the file is large. The dif.txt file will only contain the new stuff and can be inspected by your program. There also might be a solution involving diff.
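A rough sketch of how those two pieces could fit into the cron-driven approach from the question (the host name and file paths are placeholders):
#!/bin/bash
# Sketch only: fetch the current copy of the remote log, keep just the lines
# that were not present in the previous copy, then rotate the copies.
touch old.txt                                # so the first run has something to compare against
scp bigip-host:/path/to/remote.log new.txt   # placeholder host and path
grep -Fxv -f old.txt new.txt > dif.txt       # lines of new.txt that do not appear in old.txt
# ... parse dif.txt here ...
mv new.txt old.txt                           # remember what has already been processed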

Write to STDERR by filename even if not writable for the user

My user does not have write permissions for STDERR:
user@host:~> readlink -e /dev/stderr
/dev/pts/19
user@host:~> ls -l /dev/pts/19
crw--w---- 1 sysuser tty 136, 19 Apr 26 14:02 /dev/pts/19
It is generally not a big issue,
echo > /dev/stderr
fails with
-bash: /dev/stderr: Permission denied
but usual redirection like
echo >&2
works alright.
However, I now need to cope with a 3rd-party binary which only provides logging output to a specified file:
./utility --log-file=output.log
I would like to see the logging output directly in STDERR.
I cannot do it the easy way like --log-file=/dev/stderr due to missing write permissions. On other systems where the write permissions are set this works alright.
Furthermore, I also need to parse output of the process to STDOUT, therefore I cannot simply send log to STDOUT and then redirect to STDERR with >&2. I tried to use the script utility (where the redirection to /dev/stderr works properly) but it merges STDOUT and STDERR together as well.
You can use a Bash process substitution:
./utility --log-file=>(cat>&2)
The substitution will appear to the utility like --log-file=/dev/fd/63, which can be opened. The cat process inherits fd 2 without needing to open it, so it can do the forwarding.
I tested the above using chmod -w /dev/stderr and dd if=/etc/issue of=/dev/stderr. That fails, but changing to dd if=/etc/issue of=>(cat>&2) succeeds.
Note that your error output may suffer more buffering than you would necessarily want/expect, and will not be synchronous with your shell prompt. In other words, your prompt may appear mixed in with error output that arrives at your terminal after utility has completed. The dd example will likely demonstrate this. You may want to append ;wait after the command to ensure that the cat has finished before your PS1 prompt appears: ./utility --log-file=>(cat>&2); wait
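Since the question also needs to parse the utility's standard output separately, the same trick leaves stdout free for a pipeline; a minimal sketch, where your_parser is a hypothetical placeholder for whatever consumes stdout:
# Log lines reach the terminal's stderr via the process substitution,
# while the utility's stdout is piped to the (hypothetical) parser.
./utility --log-file=>(cat >&2) | your_parser
wait   # as noted above, lets the cat finish before the prompt returns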

How to make nohup.out update with perl script?

I have a Perl script that copies a large number of files. It prints some text to standard out and also writes a logfile. However, when running with nohup, both of these appear blank:
tail -f nohup.out
tail -f logfile.log
The files don't update until the script is done running. Moreover, for some reason tailing the .log file does work if I don't use nohup!
I found a similar question for Python (How come I can't tail my log?).
Is there a similar way to flush the output in Perl?
I would use tmux or screen, but they don't exist on this server.
Check perldoc IO::Handle:
HANDLE->autoflush( EXPR );
To disable buffering on standard output, that would be:
use IO::Handle;   # needed on older perls for the autoflush method
STDOUT->autoflush(1);
