Passing a dashed argument in Python encodes it incorrectly - python-3.x

I am sending a Linux command via os.system in Python. The command contains a dashed argument (-archive_dir), but the system does not recognize the command because it sees the dash as \xe2\x80\x93. How do I pass dashed arguments so the dash is seen as what it is, a plain dash?
# cmd I'm sending
os.system('-archive_dir')    <--- cmd
\xe2\x80\x93archive_dir      <--- what the Linux system sees

I believe the problem is that your shell is not interpreting the character encoding of the string '-archive_dir' correctly. It's important to realize that, at this level, characters are just bytes; your shell needs to know how to decode those bytes in order to interpret them properly. Notably, \xe2\x80\x93 is the UTF-8 encoding of an en dash (U+2013), which is a different character from the ASCII hyphen (-), so the dash in your source may not be a plain hyphen to begin with.
I think subprocess is a bit smarter about converting to your local character encoding. Although I couldn't find it in the docs, the subprocess module is generally more robust for system calls than the os module, and it should coerce the string into the encoding your shell expects. I can't test in your exact environment, but this is at least worth a try:
import subprocess
subprocess.call(["-archive_dir"])
If you need to specify multiple arguments (since your argument looks like a flag and not a command), you have to separate them in the list. For example:
import subprocess
subprocess.call(["ls", "-a"]) # System call: 'ls -a'
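As a quick sanity check, you can also inspect the raw bytes of the flag to see which dash it actually contains. A minimal sketch, using the flag name from the question:

arg = "\u2013archive_dir"      # en dash (U+2013), as seen in the question
print(arg.encode("utf-8"))     # b'\xe2\x80\x93archive_dir'

# Replacing the en dash with a plain ASCII hyphen fixes the flag:
fixed = arg.replace("\u2013", "-")
print(fixed.encode("utf-8"))   # b'-archive_dir'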

Related

Multi-line command to os.system

There may be something obvious that I'm missing here, but searching Google/SO has not turned up anything useful.
I'm writing a Python script that uses tkinter's filedialog.askopenfilename to open a file picker. Without getting into too much detail, I have the following line, which serves to bring the file picker to the front of the screen (taken directly from a helpful answer elsewhere):
os.system('''/usr/bin/osascript -e 'tell app "Finder" to set frontmost of process "Python" to true' ''')
As you can see from the above code snippet, this line is too long for PEP 8 guidelines and I'd like to break it down.
However, despite my best efforts I can't seem to get it to split. This is due (I think) to the fact that the line contains both single and double quotes, and unfortunately os.system seems to insist on receiving it as a single line.
I've tried
Triple quotes
String literal patching (\ at end, and + at beginning of each line)
Triple quotes on a per line basis
If it's relevant: using OSX and running python 3.6.4.
What is the correct (and ideally, minimal) way to go about breaking this line down?
Using the much-improved subprocess module is usually a better, more powerful, and safer way to call an external executable.
You can of course pass variables with \n in them as the arguments as well.
Note: the doubled parentheses (()) are there because the first parameter is a tuple.
import subprocess

subprocess.call((
    '/usr/bin/osascript',
    '-e',
    'tell app "Finder" to set frontmost of process "Python" to true',
))
There are at times reasons to call through the shell, but not usually.
https://docs.python.org/3.6/library/subprocess.html
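If you do want to stay with os.system, the literal ask (splitting the long line) can also be met with implicit string literal concatenation. A minimal sketch, using the command from the question:

import os

# Adjacent string literals inside the parentheses are joined into one
# command string, keeping each source line within the PEP 8 limit.
os.system(
    "/usr/bin/osascript -e "
    "'tell app \"Finder\" to set frontmost of process \"Python\" to true'"
)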

How to output Python3 (unicode) strings while ignoring un-encodable characters

Consider the following terminal command line
python3 -c 'print("hören")'
In most terminals this prints "hören" (German for "to hear"), but in some terminals you get the error
UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 1: ordinal not in range(128)
In my Python 3 program I don't want a simple print to be able to raise an exception like this; instead, I'd rather output just the characters that can be encoded.
So my question is: how can I output (unicode) strings in Python 3 while ignoring un-encodable characters?
Some notes
What I've tried so far
I tried using sys.stdout.write instead of print, but the encoding problem can still occur.
I tried encoding the string into bytes via
bytes = line.encode('utf-8')
This never raises an exception on print, but even in capable terminals non-ASCII characters are replaced by their escaped byte values (printing the bytes object shows b'h\xc3\xb6ren').
I tried using the decode method with the 'ignore' parameter:
bytes = line.encode('utf-8')
decoded = bytes.decode('utf-8', 'ignore')
print(decoded)
But the problem is not the decoding of the string; it's the encoding inside the print function.
Here are some terminals which appear not to be capable of all characters:
bash shell inside Emacs on macOS.
Receiving a "printed" string in AppleScript via do shell script, e.g.:
set txt to do shell script "/usr/local/bin/python3 -c \"print('hören')\" "
Update: These terminals all return the value US-ASCII from locale.getpreferredencoding().
My preferred way is to set the PYTHONIOENCODING environment variable depending on the terminal you are using.
For UTF-8-enabled terminals you can do:
export PYTHONIOENCODING='utf-8'
For printing '?'s in ASCII terminals, you can do:
export PYTHONIOENCODING='ascii:replace'
Or even better, if you don't care about the encoding, you should be able to do:
export PYTHONIOENCODING=':replace'
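If you'd rather handle this inside the program than in the environment, a minimal sketch is to encode explicitly with errors='replace' and write the bytes to sys.stdout.buffer. The helper name safe_print here is mine, not a standard function:

import sys

def safe_print(text):
    # Fall back to ASCII if the stream reports no encoding.
    encoding = sys.stdout.encoding or 'ascii'
    # 'replace' substitutes '?' for anything the terminal can't encode.
    sys.stdout.buffer.write(text.encode(encoding, errors='replace') + b'\n')

safe_print('hören')  # "hören" on UTF-8 terminals, "h?ren" on ASCII ones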

How not to escape an ampersand with Python subprocess

I'd like to execute, with subprocess.Popen(), this command containing an ampersand that should be interpreted as a batch concatenation operator:
COMMAND = r"c:\p\a.exe & python run.py"
subprocess.Popen(COMMAND, cwd=wd, shell=False)
The ampersand &, however, is interpreted as an argument of a.exe and not as a batch operator.
Solution 1: Having seen the related question, a solution could be to set shell=True, but that gives the error 'UNC paths are not supported', since my working directory is remote. This solution does not work as-is.
Solution 2: The executable a.exe could take a -inputDir parameter to specify the remote location of the files and use a local working directory. I think this solution could work, but I may not have the source code of the executable.
Solution 3: I could instead write c:\p\a.exe & python run.py into command.bat and then use
COMMAND = r"c:\p\command.bat"
subprocess.Popen(COMMAND, cwd=wd, shell=False)
Could this approach work?
Solution 4: I am trying to solve this by changing only the subprocess.Popen() call. Is that possible? Based on the Python Popen docs I suspect it is not. Please tell me I am wrong.
See also this related question.
UPDATE:
Solution 5: @mata suggested using PowerShell: Popen(['powershell', '-Command', r'C:\p\a.exe; python run.py']). This actually works, but now I'd have to deal with slightly different commands, and being lazy I've decided to use Solution 3.
My favourite solution was Solution 3: create a .bat file and call it:
COMMAND = r"c:\p\command.bat"
subprocess.Popen(COMMAND, cwd=wd, shell=False)
I would use Solution 3
The & character is used to separate multiple commands on one command line.
Cmd.exe runs the first command, and then the second command.
In this case you could just write your batch file like this:
@echo off
c:\p\a.exe
python run.py
Also, it's worth noting when using cmd.exe:
The ampersand (&), pipe (|), and parentheses ( ) are special characters
that must be preceded by the escape character (^) or quotation marks
when you pass them as arguments.
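Since & just runs the two commands one after the other, another option is to skip the shell entirely and issue two subprocess calls in sequence. A sketch, not tested against the asker's setup; the remote path in wd is made up:

import subprocess

wd = r'\\server\share\workdir'  # hypothetical remote working directory

# cmd's "a & b" simply means "run a, then run b", so two calls in a row
# reproduce it without shell=True or a .bat file.
subprocess.call(r'c:\p\a.exe', cwd=wd)
subprocess.call(['python', 'run.py'], cwd=wd)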

Bash (or other shell): wrap all commands with function/script

Edit: This question was originally bash-specific. I'd still rather have a bash solution, but if there's a good way to do this in another shell, that would be useful to know as well!
Okay, top-level description of the problem. I would like to be able to add a hook to bash such that, when a user enters, for example, $ cat foo | sort -n | less, this is intercepted and translated into wrapper 'cat foo | sort -n | less'. I've seen ways to run commands before and after each command (using DEBUG traps or PROMPT_COMMAND or similar), but nothing about how to intercept each command and allow it to be handled by another process. Is there a way to do this?
For an explanation of why I'd like to do this, in case people have other suggestions of ways to approach it:
Tools like script let you log everything you do in a terminal to a log (as, to an extent, does bash history). However, they don't do it very well: script mixes input with output into one big string and gets confused by applications such as vi which take over the screen, history only gives you the raw commands being typed in, and neither of them works well if you have commands being entered into multiple terminals at the same time.

What I would like to do is capture much richer information: for example, the command, the time it executed, the time it completed, the exit status, and the first few lines of stdin and stdout. I'd also prefer to send this to a listening daemon somewhere which could happily multiplex multiple terminals. The easy way to do this is to pass the command to another program which can exec a shell to handle the command as a subprocess whilst getting handles to stdin, stdout, exit status, etc. One could write a shell to do this, but you'd lose much of the functionality already in bash, which would be annoying.
The motivation for this comes from trying to make sense of exploratory data analysis like procedures after the fact. With richer information like this, it would be possible to generate decent reporting on what happened, squashing multiple invocations of one command into one where the first few gave non-zero exits, asking where files came from by searching for everything that touched the file, etc etc.
Run this bash script:
#!/bin/bash
# Read each command line (with readline editing) and hand it to the wrapper.
while read -e line
do
    wrapper "$line"
done
In its simplest form, wrapper could consist of eval "$line". You mentioned wanting timings, so maybe instead use time eval "$line". You wanted to capture the exit status, so that should be followed by the line save=$?. And you wanted to capture the first few lines of stdout, so some redirecting is in order. And so on.
MORE: Jo So suggests that handling for multi-line bash commands be included. In its simplest form, if eval returns with "syntax error: unexpected end of file", you want to prompt for another line of input before proceeding. Better yet, to check for a complete bash command, run bash -n <<<"$line" before you do the eval. If bash -n reports the end-of-file error, prompt for more input to append to $line. And so on.
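For the richer logging described in the question (timings, exit status, the first few lines of stdout), the wrapper itself could be a small Python program along these lines. This is only a sketch; the function name and record fields are illustrative:

import subprocess
import time

def wrap(command_line, head=5):
    # Run one shell command line and return a log record for it.
    started = time.time()
    proc = subprocess.run(command_line, shell=True,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return {
        'command': command_line,
        'started': started,
        'finished': time.time(),
        'exit_status': proc.returncode,
        'stdout_head': proc.stdout.splitlines()[:head],
    }

# A record like this could then be shipped off to the listening daemon.
print(wrap('sort -n foo | head'))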
Binfmt_misc comes to mind. The Linux kernel has a capability that allows arbitrary executable file formats to be recognized and passed to a user-space application.
You could use this capability to register your wrapper, but instead of handling one particular executable format, it would handle all executables.

Why do my keystrokes turn into crazy characters after I dump a bunch of binary data into my terminal?

If I do something like:
$ cat /bin/ls
into my terminal, I understand why I see a bunch of binary data, representing the ls executable. But afterwards, when I get my prompt back, my own keystrokes look crazy. I type "a" and I get a weird diagonal line. I type "b" and I get a degree symbol.
Why does this happen?
Because somewhere in your binary data were some control sequences that your terminal interpreted as requests to, for example, change the character set used to draw. You can restore everything to normal like so:
reset
Just copy and paste:
echo -e '\017'
into your bash session and the characters will return to normal. If you don't run bash, try the following keystrokes:
<Ctrl-V><Ctrl-O><Enter>
and hopefully your terminal will return to normal, even as the shell complains that it can't find a <Ctrl-V><Ctrl-O> or a <Ctrl-O> command to run.
<Ctrl-N>, or character 14, when sent to your terminal, tells it to switch to a special graphics mode in which letters and numbers are replaced with symbols. <Ctrl-O>, or character 15, restores things back to normal.
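You can reproduce the effect deliberately from Python. A small sketch; whether graphics glyphs actually appear depends on which character set your terminal has mapped to its alternate slot:

import sys

SO = '\x0e'  # character 14, Ctrl-N: "shift out" to the alternate charset
SI = '\x0f'  # character 15, Ctrl-O: "shift in" back to normal text

sys.stdout.write(SO + 'hello' + SI + '\n')  # "hello" may render as symbols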
The terminal will try to interpret the binary data thrown at it as control codes, and garble itself up in the process, so you need to sanitize your tty.
Run:
stty sane
And things should be back to normal. Even if the command looks garbled as you type it, the actual characters are being stored correctly, and when you press return the command will be invoked.
You can find more information about the stty command in its man page.
You're getting some control characters piped into the shell that are telling the shell to alter its behavior and print things differently.
VT100 is pretty much the standard command set used for terminal windows, but there are a lot of extensions. Some control the character set used, keyboard mapping, and so on.
When you send a lot of binary characters to such a terminal, a lot of settings change. Some terminals have options to 'clear' the settings back to default, but in general they simply weren't made for binary data.
VT100 and its successors are what allow Linux to print colored text (such as colored ls listings) in a simple terminal program.
-Adam
If you really must dump binary data to your terminal, you'd have much better luck if you pipe it to a pager like less, which will display it in a slightly more readable format. (You may also be interested in strings and od, both can be useful if you're fiddling around with binary files.)
