Printed python variable does not equal variable value - python-3.x

I have a python script which calls a php function using subprocess. The output of the php script is captured as a variable for further use. When analysing the stored variable I have found that if I print the variable it has a different value than the variable itself and I can't understand why. I am bracing myself to be embarrassed so please enlighten me.
Yes the password in this fictional example is "password"
cmd = "/usr/bin/php -f /opt/hello_world.php {} {}".format(encrypted_password,of_secret)
decrypted_password = subprocess.getoutput(cmd)
print(decrypted_password)
pdb.set_trace()
In pdb, here is the output:
password
(Pdb) decrypted_password
'\x00p\x00a\x00s\x00s\x00w\x00o\x00r\x00d'
(Pdb) print(decrypted_password)
password
(Pdb) locals()
{'cmd': '/usr/bin/php -f /opt/hello_world.php f2e57ba074b3c3d8d4d010bcff13083dc5928107f8cfbfaa4a52fff0155eebe5 JqiRnKJBaSwEOCI', 'decrypted_password': '\x00p\x00a\x00s\x00s\x00w\x00o\x00r\x00d'}

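subprocess.getoutput() returns a str, but the PHP script wrote UTF-16 BE bytes, so every character arrives preceded by a NUL (\x00) that print() does not show. So you need to first .encode() the string back to bytes and then .decode() it with the right codec: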
>>> '\x00p\x00a\x00s\x00s\x00w\x00o\x00r\x00d'.encode().decode('utf_16_be')
'password'
The output is encoded in UTF-16 BE, which uses two bytes for each of these characters, hence the \x00 before every letter.
You may be able to adjust the locale or otherwise manipulate the encoding to recognize the string properly.
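An alternative is to capture the raw bytes and decode them once with the right codec, rather than repairing the str afterwards. This is only a sketch: the child process below is a stand-in for the PHP script (it just writes "password" as UTF-16 BE), and it assumes the real script always emits UTF-16 BE.

```python
import subprocess
import sys

# Stand-in for the PHP script (an assumption for this sketch):
# a child process that writes "password" to stdout as UTF-16 BE bytes.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write('password'.encode('utf-16-be'))",
]

# Capture stdout as bytes (no text mode), so nothing is decoded behind our back.
result = subprocess.run(child, stdout=subprocess.PIPE, check=True)
raw = result.stdout

# Decode once, with the codec the child actually used.
decrypted_password = raw.decode("utf-16-be")

print(repr(raw))
print(decrypted_password)
```

Passing the command as a list also avoids building a shell string with .format().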

Related

pynag with python3.6 TypeError

I'm trying to read my nagios config data as follows:
pynag.Model.cfg_file = "path_to_nagios.cfg"
all_hosts = pynag.Model.Host.objects.all
This returns an error
TypeError: endswith first arg must be bytes or a tuple of bytes
From what I've read so far, it seems that it's related to how files are opened in python3
Do you know how to correct this?
Thanks.
The fix was in the library code. parse_file() opens files in 'rb' mode, so readlines() returns bytes. This is an error in Python 3 and not in Python 2 because Python 2 treats bytes as an alias for str. It doesn't make a distinction between byte strings and unicode strings as is done in Python 3.
In pynag/Parsers/__init__.py I changed
lines = open(self.filename, 'rb').readlines()
to
lines = open(self.filename, 'r').readlines()
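The difference between the two modes can be reproduced without pynag. This is a minimal sketch with a made-up config file, not the library's actual parsing code: lines read in 'rb' mode are bytes, and calling endswith() on them with a str argument raises the TypeError from the question.

```python
import os
import tempfile

# Write a tiny, made-up config file to read back in both modes.
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as handle:
    handle.write("define host {\n}\n")
    path = handle.name

try:
    with open(path, "rb") as f:
        bytes_lines = f.readlines()  # list of bytes objects
    with open(path, "r") as f:
        text_lines = f.readlines()   # list of str objects
finally:
    os.remove(path)

# bytes.endswith() only accepts bytes, so a str argument fails in Python 3.
try:
    bytes_lines[0].endswith("\n")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(text_lines[0].endswith("\n"))
```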

How to use f'string bytes'string together? [duplicate]

I'm looking for a formatted byte string literal. Specifically, something equivalent to
name = "Hello"
bytes(f"Some format string {name}")
Possibly something like fb"Some format string {name}".
Does such a thing exist?
No. The idea is explicitly dismissed in PEP 498:
For the same reason that we don't support bytes.format(), you may
not combine 'f' with 'b' string literals. The primary problem
is that an object's __format__() method may return Unicode data
that is not compatible with a bytes string.
Binary f-strings would first require a solution for
bytes.format(). This idea has been proposed in the past, most
recently in PEP 461. The discussions of such a feature usually
suggest either
adding a method such as __bformat__() so an object can control how it is converted to bytes, or
having bytes.format() not be as general purpose or extensible as str.format().
Both of these remain as options in the future, if such functionality
is desired.
In 3.6+ you can do:
>>> a = 123
>>> f'{a}'.encode()
b'123'
You were actually super close in your suggestion; if you add an encoding kwarg to your bytes() call, then you get the desired behavior:
>>> name = "Hello"
>>> bytes(f"Some format string {name}", encoding="utf-8")
b'Some format string Hello'
Caveat: This works for me in 3.8, and the note at the bottom of the Bytes Objects section in the docs seems to suggest that it should work with any method of string formatting in all of 3.x (using str.format() for versions <3.6, since f-strings were only added in 3.6, but the OP specifically asks about 3.6+).
Percent formatting for bytes (added in Python 3.5 by PEP 461) works for some use cases:
print(b"Some stuff %a. Some other stuff" % my_byte_or_unicode_string)
But as AXO commented:
This is not the same. %a (or %r) will give the representation of the string, not the string itself. For example b'%a' % b'bytes' will give b"b'bytes'", not b'bytes'.
Which may or may not matter, depending on whether you just need to present the formatted byte_or_unicode_string in a UI or potentially need to do further manipulation.
As noted here, you can format this way:
>>> name = b"Hello"
>>> b"Some format string %b World" % name
b'Some format string Hello World'
You can see more details in PEP 461
Note that in your example you could simply do something like:
>>> name = b"Hello"
>>> b"Some format string " + name
b'Some format string Hello'
This was one of the bigger changes made from Python 2 to Python 3. They handle Unicode text and byte strings differently.
This is how you'd convert a string to bytes.
string = "some string format"
encoded = string.encode()  # encode() returns a new bytes object; string itself is unchanged
print(encoded)
This is how you'd decode the bytes back to a string.
encoded.decode()
I got a better appreciation for how the handling of unicode changed between Python 2 and 3 through this Coursera lecture by Charles Severance. You can watch the entire 17 minute video or fast forward to somewhere around 10:30 if you want to get to the differences between Python 2 and 3 and how they handle characters and specifically unicode.
I understand your actual question is how you could format a string that has both strings and bytes.
inBytes = b"testing"
inString = 'Hello'
type(inString) #This will yield <class 'str'>
type(inBytes) #this will yield <class 'bytes'>
Here you can see that I have a string variable and a bytes variable.
This is how you would combine bytes and a string into one string.
formattedString = inString + ' ' + inBytes.decode()  # bytes must be decoded before joining with str
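To pull the answers in this thread together, here is a short sketch of the two directions. The variable names are only illustrative, and UTF-8 is assumed as the encoding: either decode the bytes and build a str, or encode the str and build bytes.

```python
in_bytes = b"testing"
in_string = "Hello"

# Direction 1: the result should be a str, so decode the bytes first.
as_text = in_string + " " + in_bytes.decode("utf-8")

# Direction 2: the result should be bytes, so encode the str first.
as_bytes = in_string.encode("utf-8") + b" " + in_bytes

# %b formatting on bytes (Python 3.5+) only accepts bytes-like values.
formatted = b"%b %b" % (in_string.encode("utf-8"), in_bytes)

# An f-string builds a str, which is then encoded as a whole.
fstring_bytes = f"{in_string} {in_bytes.decode('utf-8')}".encode("utf-8")

print(as_text)
print(as_bytes)
```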

How does ruamel.yaml determine the encoding of escaped byte sequences in a string?

I am having trouble figuring out where to modify or configure ruamel.yaml's loader to get it to parse some old YAML with the correct encoding. The essence of the problem is that an escaped byte sequence in the document seems to be interpreted as latin1, and I have no earthly clue where it is doing that, after some source diving here. Here is a code sample that demonstrates the behavior (this in particular was run in Python 3.6):
from ruamel.yaml import YAML
yaml = YAML()
yaml.load('a:\n b: "\\xE2\\x80\\x99"\n') # Note that this is a str (that is, unicode) with escapes for the byte escapes in the YAML document
# ordereddict([('a', ordereddict([('b', 'â\x80\x99')]))])
Here are the same bytes decoded manually, just to show what it should parse to:
>>> b"\xE2\x80\x99".decode('utf8')
'’'
Note that I don't really have any control over the source document, so modifying it to produce the correct output with ruamel.yaml is out of the question.
ruamel.yaml doesn't interpret individual strings, it interprets the
stream it gets handed, i.e. the argument to .load(). If that
argument is a byte-stream or a file-like object then its encoding is
determined based on the BOM, defaulting to UTF-8. But again: that is
at the stream level, not at individual scalar content after
interpreting escapes. Since you hand .load() Unicode (as this is
Python 3) that "stream" needs no further decoding. (Although
irrelevant for this question: it is done in the reader.py:Reader methods stream and
determine_encoding)
The hex escapes (of the form \xAB) will just put a specific hex
value in the type the loader uses to construct the scalar, that is
the value for key 'b', and that is a normal Python 3 str, i.e. Unicode in
one of its internal representations. You get the â in your
output because each escape becomes one code point of that str, and
code point 0xE2 is â.
So you won't "find" the place where ruamel.yaml decodes that
byte-sequence, because that is already assumed to be Unicode.
So the thing to do is to decode your double quoted
scalars a second time (you only have to address those, as plain, single quoted,
literal/folded scalars cannot have the hex escapes). There are various
points at which you can try to do that, but I think
constructor.py:RoundTripConstructor.construct_scalar and
scalarstring.py:DoubleQuotedScalarString are the best candidates. The former of those might take some digging to find, but the latter is actually the type you'll get if you inspect
that string after loading when you add the option to preserve quotes:
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load('a:\n b: "\\xE2\\x80\\x99"\n')
print(type(data['a']['b']))
which prints:
<class 'ruamel.yaml.scalarstring.DoubleQuotedScalarString'>
Knowing that, you can inspect that rather simple wrapper class:
class DoubleQuotedScalarString(ScalarString):
    __slots__ = ()
    style = '"'
    def __new__(cls, value, anchor=None):
        # type: (Text, Any) -> Any
        return ScalarString.__new__(cls, value, anchor=anchor)
"update" the only method there (__new__) to do your double
encoding (you might have to put in additional checks to not double encode all
double quoted scalars):
import ruamel.yaml
def my_new(cls, value, anchor=None):
    # type information only needed if using mypy
    # value is of type 'str': encode to bytes "without conversion" (latin-1), then decode as UTF-8
    value = value.encode('latin_1').decode('utf-8')
    return ruamel.yaml.scalarstring.ScalarString.__new__(cls, value, anchor=anchor)
# replace the constructor of the wrapper class with the re-decoding version
ruamel.yaml.scalarstring.DoubleQuotedScalarString.__new__ = my_new
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load('a:\n b: "\\xE2\\x80\\x99"\n')
print(data)
which gives:
ordereddict([('a', ordereddict([('b', '’')]))])
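The latin-1 round trip used in my_new can be checked on its own, without ruamel.yaml. The sketch below also shows one possible form of the "additional checks" mentioned above. fix_mojibake is a hypothetical helper name, and the fallback policy (return the value unchanged when the round trip fails) is an assumption, not something the library provides.

```python
def fix_mojibake(value):
    """Hypothetical helper: reinterpret a str whose code points are really UTF-8 bytes.

    Returns the value unchanged when it cannot be such a string.
    """
    try:
        # latin-1 maps code points 0-255 one-to-one onto bytes 0-255.
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either a code point above 255, or the bytes are not valid UTF-8.
        return value


loaded = "\xe2\x80\x99"  # what the loader produces for "\xE2\x80\x99"
print(repr(fix_mojibake(loaded)))
print(repr(fix_mojibake("caf\xe9")))  # a lone 0xE9 is not valid UTF-8, so left alone
```

This is still a heuristic: a string that happens to be valid UTF-8 when viewed as latin-1 bytes would be converted as well, so apply it only to documents known to contain such escapes.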

Python OrderedDict to valid json

I have a python OrderedDict as follows.
sample_dict = OrderedDict([('foo', 'bar'), ('foo1', 'bar1')])
I need to convert it to valid JSON. I tried
json.loads(json.dumps(sample_dict))
The output is
{'foo1': 'bar1', 'foo': 'bar'}
The output contains single quotes, but I'm expecting double quotes.
json.dumps(sample_dict)
Already returns the JSON, that's enough.
You then feed it to json.loads, which turns it into a Python object in memory again. When you print that, Python is free to choose whether to use ' or " (it really doesn't matter) and happens to choose '. But that has nothing to do with JSON.
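A short sketch of the difference, reusing the sample data from the question: json.dumps() gives the JSON text (a str with double quotes), while json.loads() turns it back into a Python dict, whose printed form is Python's repr and not JSON.

```python
import json
from collections import OrderedDict

sample_dict = OrderedDict([("foo", "bar"), ("foo1", "bar1")])

# This string is the valid JSON; key order follows the OrderedDict.
json_text = json.dumps(sample_dict)

# Loading it again gives a plain Python dict, not JSON text.
round_tripped = json.loads(json_text)

print(json_text)            # JSON, with double quotes
print(repr(round_tripped))  # Python repr, with single quotes
```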

Why is str.translate() returning an error and how can I fix it?

import os
def rename_files():
    file_list = os.listdir(r"D:\360Downloads\test")
    saved_path = os.getcwd()
    os.chdir(r"D:\360Downloads\test")
    for file_name in file_list:
        os.rename(file_name, file_name.translate(None,"0123456789"))
rename_files()
The error message is TypeError: translate() takes exactly one argument (2 given). How can I write this so that translate() does not raise an error?
Hope this helps!
os.rename(file_name,file_name.translate(str.maketrans('','','0123456789')))
or
os.rename(file_name,file_name.translate({ ord(i) : None for i in '0123456789' }))
Explanation:
I think you're using Python 3.x with Python 2.x syntax. In Python 3.x the translate() syntax is
str.translate(table)
which takes only one argument, unlike Python 2.x, where the translate() syntax is
str.translate(table[, deletechars])
which can take more than one argument.
We can make a translation table easily using the str.maketrans() function.
In this case, the first two parameters are empty strings, so no characters are replaced, and the third parameter specifies which characters are to be removed.
We can also make a translation table manually using a dictionary in which each key is the Unicode code point (ordinal) of the original character and each value is the code point of its replacement. To remove a character, its value must be None.
E.g. if we want to replace 'A' with 'a' and remove '1' in a string, then our dictionary looks like this
{65: 97, 49: None}
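Both kinds of table can be tried on plain strings before renaming any real files. In this sketch the file names are made up, and strip_digits is a hypothetical helper that leaves the extension alone, since removing digits from the whole name would also turn ".mp3" into ".mp".

```python
import os

# Table built with str.maketrans: no replacements, delete every digit.
digits_removed = str.maketrans("", "", "0123456789")

# Equivalent hand-written style of table: code point -> code point or None.
manual_table = {ord("A"): ord("a"), ord("1"): None}


def strip_digits(file_name):
    """Hypothetical helper: remove digits from the stem but keep the extension."""
    stem, extension = os.path.splitext(file_name)
    return stem.translate(digits_removed) + extension


print("photo2019.jpg".translate(digits_removed))
print("A1B1".translate(manual_table))
print(strip_digits("track01.mp3"))
```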

Resources