Why does dos2unix print to stderr?

When running dos2unix on a file, I get the following printed to the terminal:
dos2unix: converting file <filename> to UNIX format ...
While trying to suppress the output by sending it to /dev/null, I noticed that the message goes to stderr instead of stdout as I'd expected (since it seems like a normal informational message, not an error). Is there a reason for this?

In Unix-like environments it is common to chain processes: the output of one program is used as input for the next. Mixing results with diagnostics would confuse the next processing stage, and it would also hide the diagnostics from a user watching the terminal, since results piped to the next program are not shown there.
This is the reason results and diagnostics are separated into stdout and stderr. Diagnostics are not restricted to errors; they cover everything that is not a processing result which subsequent programs would expect.
As for the actual question: dos2unix is often used to transform files in place, but it can also write to stdout (when called without a file name it reads from stdin and writes to stdout), and stdout can then be redirected independently of stderr. Consider cat blados | dos2unix > blaunix. You would still see the diagnostics (which may contain error messages!), but the result of the processing goes to blaunix.
Printing diagnostics on success at all is not that common; this one is probably a little nod to DOS users. It would be pretty bad if the processing result contained the informational message; for example, it would break a C file.
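To see the separation in practice, you can send each stream somewhere different (a small sketch; blados, blaunix, and conversion.log are placeholder names):
dos2unix blados 2> conversion.log                    # convert in place; diagnostics go to a log instead of the terminal
cat blados | dos2unix > blaunix 2> conversion.log    # filter mode; converted text goes to blaunix, anything on stderr to the log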

There is no deep reason, but stderr is not just for error output. It is a second stream that is commonly used for logging or informational messages. Since a log message is not part of the program's result, it is not sent to stdout, which is reserved for the results of the program.
The reason it's printed on your terminal is that your shell attaches both streams to the terminal by default; it is not really controlled by the application.

Try dos2unix -q <filename>
-q, --quiet
Quiet mode. Suppress all warnings and messages. The return value is zero, except when wrong command-line options are used.
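In other words, for the original suppression attempt: redirect stderr instead of stdout, or use the flag (a sketch; file.txt is a placeholder name):
dos2unix file.txt 2> /dev/null    # discard everything dos2unix writes to stderr
dos2unix -q file.txt              # let dos2unix itself stay quiet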

Simply because that's the way it was implemented...
If you check out the source code, you'll see:
...
if (!pFlag->Quiet)
    fprintf(stderr, _("dos2unix: converting file %s to file %s in UNIX format ...\n"), argv[ArgIdx-1], argv[ArgIdx]);
...

I use this one-liner to redirect stderr to stdout, skip the resulting first irrelevant line, and send the rest back into stderr.
dos2unix thefile 2>&1 | tail -n +2 1>&2

Related

How to remove a part of stdout of a running shell script based on starting and ending strings?

I have a shell script that calls other scripts, which in turn create new processes. There is a certain part of the output logs that I don't want to see on the screen, since it is excessive. If this were a text file, I could remove that part using the following line:
sed 's/String1.*String2//g'
However, when running a script (continuous logging), or even when receiving the logs through cat, the logs arrive line by line, so sed cannot match a regular expression that spans several lines.
Long story short, I want to be able to do this to shorten the stdout of the test run:
./test1 | sed 's/String1.*String2//g'
and as a result get rid of almost 200 redundant lines of logs on the output screen.
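One line-by-line approach (just a sketch, not from the thread, and assuming String1 and String2 each appear on lines of their own) is sed's range addressing, which deletes every line from a match of the first pattern through a match of the second:
./test1 | sed '/String1/,/String2/d'
Unlike the s///g form this drops the marker lines themselves as well; with GNU sed, adding -u (unbuffered) can help when the input is a continuous log.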

Bash - process backspace control character when redirecting output to file

I have to run a third-party program in the background and capture its output to a file. I'm doing this simply using the_program > output.txt. However, the coders of said program decided to be flashy and show processed lines in real time, using \b characters to erase the previous value. So one of the lines in output.txt ends up like Lines: 1(b)2(b)3(b)4(b)5, where (b) is an unprintable character with ASCII code 08. I want that line to end up as Lines: 5.
I'm aware that I can write it as-is and post-process the file using AWK, but I wonder if it's possible to process the control characters in place, by using some kind of shell option or by piping some commands together, so that the line becomes Lines: 5 without having to run any additional commands after the program is done.
Edit:
Just a clarification: what I wrote here is a simplified version; the actual line count processed by the program is in the hundreds of thousands, so that string ends up quite long.
Thanks for your comments! I ended up piping the output of that program to the AWK script I linked in the question, and I get a well-formed file in the end.
the_program | ./awk_crush.sh > output.txt
The only downside is that I get the output only once the program itself has finished, even though the initial output exceeds 5M and should be passed along in smaller chunks. I don't know the exact reason; perhaps the AWK script waits for EOF on stdin. Either way, on a more modern system I would use
stdbuf -oL the_program | ./awk_crush.sh > output.txt
to process the output line by line. I'm stuck on RHEL4 with expired support, though, so I'm unable to use either stdbuf or unbuffer. I'll leave it as-is; it's fine too.
The contents of awk_crush.sh are based on this answer, except that the ^H sequences (which are supposed to be literal ASCII 08 characters entered via Vim commands) are replaced with the escape sequence \b:
#!/usr/bin/awk -f
function crushify(data) {
    while (data ~ /[^\b]\b/) {
        gsub(/[^\b]\b/, "", data)
    }
    print data
}
crushify($0)
Basically, it replaces the character before each \b, together with the \b itself, with an empty string, and repeats this while there are \b characters left in the string, which is just what I needed. It doesn't handle other escape sequences, though; if that's necessary, there's a more complete sed solution by Thomas Dickey.
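As a quick check (a sketch; it assumes an awk, such as gawk, that treats \b in a regexp as a literal backspace):
printf 'Lines: 1\b2\b3\b4\b5\n' | ./awk_crush.sh    # prints "Lines: 5"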
Pipe it to col -b, from util-linux:
the_program | col -b
Or, if the input is a file, not a program:
col -b < input > output
Mentioned in Unix & Linux: Evaluate large file with ^H and ^M characters.
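The same quick check works here; the printf line just stands in for the real program's output:
printf 'Lines: 1\b2\b3\b4\b5\n' | col -b    # prints "Lines: 5"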

Bash (or other shell): wrap all commands with function/script

Edit: This question was originally bash specific. I'd still rather have a bash solution, but if there's a good way to do this in another shell then that would be useful to know as well!
Okay, top-level description of the problem. I would like to be able to add a hook to bash such that, when a user enters, for example, $ cat foo | sort -n | less, this is intercepted and translated into wrapper 'cat foo | sort -n | less'. I've seen ways to run commands before and after each command (using DEBUG traps or PROMPT_COMMAND or similar), but nothing about how to intercept each command and allow it to be handled by another process. Is there a way to do this?
For an explanation of why I'd like to do this, in case people have other suggestions of ways to approach it:
Tools like script let you log everything you do in a terminal (as, to an extent, does bash history). However, they don't do it very well: script mixes input with output into one big string and gets confused by applications such as vi that take over the screen, history only gives you the raw commands being typed in, and neither of them works well if you have commands being entered into multiple terminals at the same time. What I would like to do is capture much richer information: for example, the command, the time it started, the time it completed, the exit status, and the first few lines of stdin and stdout. I'd also prefer to send this to a listening daemon somewhere which could happily multiplex multiple terminals. The easy way to do this is to pass the command to another program which can exec a shell to handle the command as a subprocess while getting handles to stdin, stdout, exit status, etc. One could write a shell to do this, but you'd lose much of the functionality already in bash, which would be annoying.
The motivation for this comes from trying to make sense of exploratory data analysis like procedures after the fact. With richer information like this, it would be possible to generate decent reporting on what happened, squashing multiple invocations of one command into one where the first few gave non-zero exits, asking where files came from by searching for everything that touched the file, etc etc.
Run this bash script:
#!/bin/bash
while read -e line
do
    wrapper "$line"
done
In its simplest form, wrapper could consist of eval "$line". You mentioned wanting timings, so maybe instead use time eval "$line". You wanted to capture the exit status, so this should be followed by the line save=$?. And you wanted to capture the first few lines of stdout, so some redirecting is in order. And so on (the sketch after the next paragraph pulls these pieces together).
MORE: Jo So suggests that handling for multi-line bash commands be included. In its simplest form, if eval returns with "syntax error: unexpected end of file", then you want to prompt for another line of input before proceeding. Better yet, to check for proper bash commands, run bash -n <<<"$line" before you do the eval. If bash -n reports the unexpected-end-of-file error, prompt for more input to append to "$line". And so on.
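Pulling those pieces together, here is a minimal sketch of the loop-plus-wrapper idea (the log path, the prompt strings, and the choice of capturing three output lines are arbitrary, and a real version would need job control, terminal handling, and more careful quoting):
#!/bin/bash
# Sketch only: read commands, check completeness with bash -n,
# time them, record the exit status, and log the first lines of output.
logfile=~/command-audit.log    # hypothetical log destination

while IFS= read -e -r -p '$ ' line; do
    # keep reading until bash -n accepts the accumulated input
    while ! bash -n <<<"$line" 2>/dev/null; do
        IFS= read -e -r -p '> ' more || break
        line+=$'\n'"$more"
    done

    start=$(date +%s)
    output=$(eval "$line" 2>&1)    # run it, capturing stdout and stderr together
    status=$?
    end=$(date +%s)

    printf '%s\n' "$output"        # still show the output to the user

    {
        printf 'cmd: %s\n' "$line"
        printf 'exit: %d  duration: %ds\n' "$status" "$((end - start))"
        printf 'first lines:\n%s\n---\n' "$(head -n 3 <<<"$output")"
    } >> "$logfile"
done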
binfmt_misc comes to mind. The Linux kernel can recognize arbitrary executable file formats and pass them to a user-space application.
You could use this capability to register your wrapper, but instead of handling one particular executable format it would have to handle all executables.

Does piping write to stdin?

Does running something like below cause the textfile lines to be directed to the STDIN of program.sh?
cat textfile | program.sh
Yes; the rest of this answer exists only to satisfy SO's requirement of a minimum of 30 characters per answer (excluding links).
http://en.wikipedia.org/wiki/Pipeline_(Unix)
Yes. You're writing the stdout from cat to the stdin of program.sh. Because cat isn't doing much except reading the file, you can also write it as:
program.sh < textfile
...which does the same thing.
From a technical standpoint, stdin is accessed through file descriptor 0, while stdout is file descriptor 1 and stderr is file descriptor 2. With this information, you can build more complicated redirections, such as sending stderr to the same place as stdout (or to a different place!). For a cheat sheet about redirections, see Peteris Krumins's Bash Redirections Cheat Sheet.
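For instance, building on the descriptor numbers above (a small sketch; out.log and err.log are placeholder names):
program.sh < textfile > out.log 2>&1          # stderr goes to the same place as stdout
program.sh < textfile > out.log 2> err.log    # stderr goes somewhere different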
Yes.
You are running the command cat on a text file. Its output is fed to the stdin of program.sh.

How to eliminate the delay in message redirection to a file in Vim?

I have this line in my vimrc:
redir! >/Users/seanmackesey/.vim/.vimmessages
But messages do not show up in this file immediately after they are generated. When I run tail -f .vimmessages in the shell, messages show up slowly and somewhat erratically, and I sometimes get a big dump of messages when I run the :messages command, but I can't figure out exactly what the pattern is. Is there a way to simply append each message to the end of a file immediately, as it occurs?
The problem with a global :redir is that it doesn't nest, so it will also cause errors in mappings and functions that use :redir themselves. Instead, use
:set verbosefile=/Users/seanmackesey/.vim/.vimmessages
to capture all messages. Because Vim's implementation uses buffered output, you'd still experience some amount of chunking, though.
You didn't mention where you intend to use this output, so it's hard to give a better recommendation. If you really need immediate output to an external file, you'd have to use writefile() or use an embedded scripting language to write and flush a file.
This seems more likely to be simple data buffering than any specific time delay.
I grepped through the Vim 7.3 source, and it looks like the redirection is done with fopen, puts, putc, and fclose (i.e. stdio). There did not appear to be any calls to fflush, setbuf, setbuffer, setlinebuf, or setvbuf, so the redirection will always use the default buffering provided by your system's stdio (probably "block buffering" of some convenient size).
You could periodically stop and restart the redirection to effectively flush the data:
redir END | redir! >>~/.vim/.vimmessages
Short of that, there does not seem to be a nice way to do what you want with redir to a file.
