Streaming data from the R&S RTO oscilloscope - UnicodeDecodeError Python 3.6 - python-3.x

I'm trying to get the signal data for a specific channel on the Rohde & Schwarz RTO oscilloscope. I'm using the vxi11 Python (3.6) library to communicate with the scope.
On my first try, I was able to extract all the data of the scope channel I was querying without any errors (using the query command CHAN1:WAV1:DATA?), but soon after I started getting this error message:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 10: invalid continuation byte
The weird thing is that I'm still able to get the head of the data without any issues. It's only when I request the entire data set to be sent over that I see this error.
I've tried changing the format of the data between REAL (binary) and ASCii, but to no avail.
Another weird thing is that when I switch the encoding of the received data to 'latin-1', it works for a moment (giving me a strange character string, which I'm assuming is the data I want, just in another format) and then crashes.
The entire output looks as follows:
****IDN : Rohde&Schwarz,RTO,1329.7002k04/100938,4.20.1.0
FORM[:DATA]ASCii : None
CHAN1:WAV1:DATA:HEAD? : -0.2008,0.1992,10000000,1
'utf-8' codec can't decode byte 0xc7 in position 10: invalid continuation byte
'utf-8' codec can't decode byte 0xc7 in position 10: invalid continuation byte
Traceback (most recent call last):
File "testing_rtodto.py", line 21, in ask_query
logger.debug(print(query+" :",str(conn._ask(query))))
File "../lib_maxiv_rtodto/client.py", line 187, in _ask
response = self.instrument.ask(data)#, encoding="latin-1")
File "/usr/lib/python3.6/site-packages/vxi11/vxi11.py", line 743, in ask
return self.read(num, encoding)
File "/usr/lib/python3.6/site-packages/vxi11/vxi11.py", line 731, in read
return self.read_raw(num).decode(encoding).rstrip('\r\n')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 10: invalid continuation byte

Alrighty, I found a fix, thanks mostly to this thread: https://github.com/pyvisa/pyvisa/issues/306
Though I'm not using the same communication library as they are, the problem turned out to be the way I was querying the data, not how the library was reading it.
It turns out you have to follow R&S's instrument instructions very, very, VERY closely (although their documentation is extremely confusing and hard to find, not to mention the lack of example query strings for important query functions).
Essentially, the query command that worked was FORM ASC;:CHAN1:DATA?. This explicitly converts the data to ASCii format before returning it to the communicating library.
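For reference, a minimal sketch of what that query looks like with the vxi11 library; the IP address is a placeholder and the timeout value is only a guess for a long record:
import vxi11

scope = vxi11.Instrument("192.168.0.2")    # placeholder address
scope.timeout = 30                         # seconds; long records take a while

print(scope.ask("*IDN?"))

# Force ASCII transfer and request the waveform in a single query string,
# as described above.
raw = scope.ask("FORM ASC;:CHAN1:DATA?")

# The reply is a comma-separated list of sample values.
samples = [float(v) for v in raw.split(",")]
print(len(samples), samples[:5])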
I also found some sample Python scripts that R&S has provided (https://cdn.rohde-schwarz.com/pws/service_support/driver_pagedq/files_1/directscpi/DirectSCPI_PyCharm_Python_Examples.zip).
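For completeness: the UnicodeDecodeError in the question comes from the library trying to decode a binary (REAL) block as UTF-8 text. The following is not from the answer above, just a hedged sketch of how binary transfer could be handled by bypassing the text decode with ask_raw and unpacking the IEEE 488.2 definite-length block yourself; the FORM REAL,32 syntax and the byte order are assumptions from SCPI convention, so check the RTO manual and the FORMat:BORDer setting:
import struct
import vxi11

scope = vxi11.Instrument("192.168.0.2")    # placeholder address

scope.write("FORM REAL,32")                # 32-bit float binary transfer
block = scope.ask_raw(b"CHAN1:DATA?")      # bytes in, bytes out: no UTF-8 decode

# Definite-length block: b'#', one digit giving the length of the length field,
# the payload length in bytes, then the payload itself.
assert block[0:1] == b"#"
ndigits = int(block[1:2])
nbytes = int(block[2:2 + ndigits])
payload = block[2 + ndigits:2 + ndigits + nbytes]

# '<' assumes little-endian data; adjust if FORMat:BORDer says otherwise.
samples = struct.unpack("<%df" % (nbytes // 4), payload)
print(len(samples), samples[:5])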

Related

double encoding through cp1252 and base 64

From a client I am getting a PDF file, which is encoded in cp1252 and, for transfer, also encoded in base64. Until now, a PowerShell program has converted the file back to its original form through this code line:
$output = [System.Text.Encoding]::GetEncoding(1252).GetString([System.Convert]::FromBase64String($input))
and this works.
Now I am implementing a Python version to supersede this implementation. It looks generally like this:
enc_file = read_from_txt.open_file(location_of_file)
plain_file = base64.b64decode(enc_file)
with open('filename', 'w') as writer:
    writer.write(plain_file.decode('cp1252'))
where read_from_txt.open_file just does this:
with open(file_location, 'rb') as fileReader:
    read = fileReader.read()
    return read
But for some reason, I am getting an error from plain_file.decode('cp1252'), where it cannot decode a byte in the file. From what I understand, though, the Python program should do exactly the same thing as the PowerShell does.
Concrete error is:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 188: character maps to <undefined>
Any help is appreciated.
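No answer is shown above, but for comparison, here is a hedged sketch that sidesteps the cp1252 decode entirely: since a PDF is binary data, the base64-decoded bytes can simply be written back out in binary mode; the decode/encode round trip is only needed if the goal is to reproduce the PowerShell string handling byte for byte. The input path and output filename below are placeholders:
import base64

location_of_file = 'input_base64.txt'      # placeholder path

with open(location_of_file, 'rb') as reader:
    enc_file = reader.read()

plain_bytes = base64.b64decode(enc_file)

# Write the decoded bytes directly: there is no text decode, so bytes such as
# 0x81 (undefined in Python's strict cp1252 table) cannot raise an error.
with open('output.pdf', 'wb') as writer:
    writer.write(plain_bytes)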

NLTK access local files; UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 834: character maps to <undefined>

This is my first post and I have little to no experience but I like to learn.
I hope this post is comprehensible but please feel free to ask for further details.
I am working with Cygwin, and occasionally I use IDLE Python 3.9 to complete some tasks for University.
Currently I am trying to use the NLTK module and tokenize a text.
The first thing I do is open Python (through Cygwin or directly from IDLE, but I've mostly been using Cygwin).
>>>import nltk
>>> from nltk import word_tokenize
>>> from nltk.book import *
at which point a library of different books would be downloaded for me to access. I don't really need them, though, because I need to access a local file, in a folder called "Tint".
The command that I HAVE managed to do but cannot replicate is
>>>Rev = open("/Users/acer/OneDrive - Università di Pavia/Desktop/Tint/amazon_jamon.no_alterations.txt", "r").read()
In the past, the first problem I experienced was an escape-sequence error due to the backslashes, but once I changed them to regular slashes it worked. Now that I am trying to access a similar .txt file in the same "Tint" folder, I get a different error from this command:
>>> desc = open("/Users/acer/OneDrive - Università di Pavia/Desktop/Tint/salame_P.txt", "r").read()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python\Python 395\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 834: character maps to <undefined>
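No answer is shown above, but the traceback points at cp1252.py, i.e. Windows' default text encoding. A minimal sketch, under the assumption that the file is actually UTF-8 encoded, is to pass the encoding to open() explicitly:
# The path is the one from the question; 'utf-8' is an assumption about the file.
path = "/Users/acer/OneDrive - Università di Pavia/Desktop/Tint/salame_P.txt"
desc = open(path, "r", encoding="utf-8").read()

# If a few bytes still turn out not to be valid UTF-8, errors="replace"
# keeps reading instead of raising:
# desc = open(path, "r", encoding="utf-8", errors="replace").read()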

Python : Base64 Decode codec can't decode bytes in position 47-48 : invalid continuation byte

I found many questions related to this one on Stack Overflow, and even after following and applying what solved the problem for other users, I'm still at the starting line.
I'm receiving a response in UTF-8 encoded format. It's an XML file that I want to decode.
I saved the response in a .txt file with UTF-8 encoding and tried the following:
import base64
with open('docdata.txt', 'r') as f:
    e = f.read()
print(e[:50])
decoded = base64.b64decode(e)
print(str(decoded, "utf-8"))
When I run the above program, I get this error:
print(str(decoded, "utf-8"))
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 47-48: invalid continuation byte
The file size is around 26 MB. When I tried uploading the same file to Base64decode, I got a proper output file without any error.
print(decoded[:50])
>> b'PK\x03\x04\x14\x00\x08\x08\x08\x005O=Q\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x11\x00\x00\x00newUserPkList.xml\xec\xbd\xcb'
print(decoded[47:50])
>> b'\xec\xbd\xcb'
Please let me know what mistake I'm making and how I can solve this error.
Thanks.
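This is not from the original post, but the printed prefix b'PK\x03\x04' is the ZIP local-file-header signature, so the base64 payload appears to be a ZIP archive containing newUserPkList.xml rather than raw UTF-8 XML. A hedged sketch under that assumption:
import base64
import io
import zipfile

with open('docdata.txt', 'r') as f:
    decoded = base64.b64decode(f.read())

# Treat the decoded bytes as a ZIP archive instead of decoding them as UTF-8.
archive = zipfile.ZipFile(io.BytesIO(decoded))
print(archive.namelist())                            # expect ['newUserPkList.xml']
xml_text = archive.read('newUserPkList.xml').decode('utf-8')
print(xml_text[:200])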

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 591: character maps to <undefined>

I have some code to convert docx files to pure text:
import docx
import glob
def getText(filename):
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)
for file in glob.glob('*.docx'):
    outfile = open(file.replace('.docx', '-out.txt'), 'w', encoding='utf8')
    for line in open(file):
        print(getText(filename), end='', file=outfile)
    outfile.close()
However, when I execute it, there is the following error:
Traceback (most recent call last):
File "C:\Users\User\Desktop\add spaces docx\converting docx to pure text.py", line 16, in <module>
for line in open(file):
File "C:\Users\User\AppData\Local\Programs\Python\Python35-32\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 591: character maps to <undefined>
I am using Python 3.5.2.
Can anyone help to resolve this issue?
Thanks in advance.
Although I do not know the docx module that well, I think I can find a solution.
According to fileformat, the character 0x8f (which is what the charmap codec couldn't decode, resulting in the UnicodeDecodeError) is a control character.
When reading files (which seems to be what the docx module is doing), you should watch out for control characters, because sometimes Python can't decode them.
The solution to this is to give up on the docx module, learn how .docx files work and are formatted, and when you read a docx file, use open(filename, "rb") so Python reads raw bytes and does not try to decode them.
However, this might not be the problem. As you can see in the traceback, Python is using cp1252 from the encodings directory as its default encoding instead of utf-8. Try changing it to utf_8.py (for me it comes up as utf_8.pyc).
NOTE: Sorry for the lack of links. This is because I do not have higher than 10 reputation (because I am new to Stack Overflow).
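Independent of the suggestion above, the traceback shows the decode error being raised by the text-mode for line in open(file) loop, not by python-docx itself. Here is a hedged sketch of the question's script with that loop removed, letting docx.Document do the (binary) file access:
import glob
import docx

def getText(filename):
    doc = docx.Document(filename)
    return '\n'.join(para.text for para in doc.paragraphs)

for file in glob.glob('*.docx'):
    # docx.Document reads the file itself, so nothing is opened in text mode
    # and the cp1252 decode never happens.
    with open(file.replace('.docx', '-out.txt'), 'w', encoding='utf8') as outfile:
        print(getText(file), end='', file=outfile)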

UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 5

I am trying to read a file using the following code.
precomputed = pickle.load(open('test/vgg16_features.p', 'rb'))
features = precomputed['features']
But I am getting this error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 5: ordinal not in range(128)
The file I am trying to read contains image features extracted using deep neural networks. The file content looks like this:
(dp0
S'imageIds'
p1
(lp2
I262145
aI131074
aI131075
aI393221
aI393223
aI393224
aI524297
aI393227
aI393228
aI262146
aI393230
aI262159
aI524291
aI322975
aI131093
aI524311
....
....
....
Please note that this is a big file, about 2.8 GB in size.
I know this is a duplicate question, but I followed the suggested solutions in other Stack Overflow posts and couldn't solve it. Any help would be appreciated!
Finally, I found the solution. The problem was actually about unpickling a Python 2 object with Python 3, which I didn't realize at first because the pickle file I was given had been written by a Python 2 program.
Thanks to this answer, which solved the problem. So all I needed to do was set the encoding parameter of pickle.load() to latin1, because latin1 works for any input: it maps the byte values 0-255 directly to the first 256 Unicode code points.
So, the following worked for me!
precomputed = pickle.load(open('test/vgg16_features.p', 'rb'), encoding='latin1')
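As a quick worked check of that claim (not part of the original answer): latin-1 maps every byte value 0-255 to the Unicode code point with the same number, so decoding can never fail and the bytes round-trip losslessly:
data = bytes(range(256))                  # every possible byte value
text = data.decode('latin-1')             # never raises: byte n -> U+00nn
assert [ord(c) for c in text] == list(range(256))
assert text.encode('latin-1') == data     # lossless round trip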
