Trying to convert CSV to Excel file in python - python-3.x

I have tried several approaches and searched online for a solution, but the code below still fails.
df_new = pd.read_csv(path+'output.csv')
writer = pd.ExcelWriter(path+'output.xlsx')
df_new.to_excel(writer, index = False)
writer.save()
I get the error below when I run it. I tried setting the encoding to latin, but that did not work. With ignore_error it runs, but it does not produce any result. Please guide me.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
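A byte 0xff at position 0 is often the start of a UTF-16 byte-order mark (FF FE), which would suggest the CSV was saved as UTF-16 rather than UTF-8. That is a guess about this particular file, not something the question confirms. The sketch below uses a hypothetical helper, guess_encoding_from_bom, to pick a codec from the first bytes of a file; the sample bytes stand in for output.csv.

```python
import codecs


def guess_encoding_from_bom(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on a leading byte-order mark, if any."""
    # UTF-32 is checked first because its little-endian BOM begins with
    # the same two bytes as the UTF-16 little-endian BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if raw.startswith(bom):
            return name
    return default


# Made-up sample data encoded the way a UTF-16 CSV export would be.
sample = "name,city\nZoë,Zürich\n".encode("utf-16")

encoding = guess_encoding_from_bom(sample[:4])
text = sample.decode(encoding)  # the utf-16 codec consumes the BOM
print(encoding)
print(text)
```

If the guess holds, the detected name can be passed on as pd.read_csv(path+'output.csv', encoding=encoding). Note also that ExcelWriter.save() was removed in pandas 2.0; calling df_new.to_excel(path+'output.xlsx', index=False) directly avoids the writer object altogether.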

Related

How do I use importlib.resources with pickle files?

I'm trying to load a pickle file with importlib.resources, but I'm getting the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
The bit that is raising the error is:
with importlib.resources.open_text("directory_with_pickle_file", "pickle_file.pkl") as f:
    data = pickle.load(f)
I'm certain that the file (pickle_file.pkl) was created with pickle.dump.
What am I doing wrong?
Through lots of trial and error I figured out that importlib.resources has a read_binary function which can be used to read pickled files like so:
text = importlib.resources.read_binary("directory_with_pickle_file", "pickle_file.pkl")
data = pickle.loads(text)
Here, data is the unpickled object.
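The underlying reason is that a pickle is binary data, so any text-mode reader will try to decode it and fail on the first byte. The sketch below reproduces this with an ordinary temporary file instead of a package resource (an assumption made to keep it self-contained); the object being pickled is made up.

```python
import pickle
import tempfile
from pathlib import Path

obj = {"answer": 42}  # hypothetical payload

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pickle_file.pkl"
    path.write_bytes(pickle.dumps(obj))

    # Text mode decodes the bytes as UTF-8 and fails on the protocol
    # marker byte 0x80 at the start of the pickle.
    try:
        path.read_text(encoding="utf-8")
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

    # Binary mode hands the raw bytes to pickle unchanged.
    with path.open("rb") as f:
        data = pickle.load(f)

print(text_mode_failed, data)
```

For package resources on Python 3.9 and later, importlib.resources.files("directory_with_pickle_file").joinpath("pickle_file.pkl").open("rb") should give a binary file object that can be passed to pickle.load in the same way.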

Seaborn Error: 'ascii' codec can't encode character '\xda' in position 4710: ordinal not in range(128)

I'm trying to create a bar graph similar to this example https://seaborn.pydata.org/examples/grouped_barplot.html
My code is as follows:
sns.set(style="whitegrid")
df_noshow = sns.load_dataset(df)
g = sns.catplot(x="Noshow", y="SMS_received", hue="Gender",
                data=df_noshow, height=6, kind="bar", palette="muted")
g.despine(left=True)
g.set_ylabels("Text Message Received")
I'm getting this error: UnicodeEncodeError: 'ascii' codec can't encode character '\xda' in position 4710: ordinal not in range(128)
Additionally, I'm not 100% sure I did the following code correctly:
df_noshow = sns.load_dataset(df)
I did create df earlier with
import pandas as pd
df= pd.read_csv('noshow2016.csv')
All the previous code has been working, and I can't imagine the Unicode error has anything to do with the CSV file not loading correctly, but I wanted to include it just in case. Thank you.

Need to open and read a .bin file in Python. Getting error: 'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte

I am trying to read and convert binary into text that anyone could read. I am having trouble with the error message:
'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte
I have gone through "Reading binary file and looping over each byte", trying multiple ways to open and read the binary file. From what I read about this error message, most people either had trouble with .csv files or had to change utf-8 to utf-16. But from my reading of https://en.wikipedia.org/wiki/UTF-16#Byte_order_encoding_schemes , Python does not use UTF-16 anymore.
Also, if I add encoding = utf-16/32, the error states: binary mode doesn't take an encoding argument
Here is my code:
with open(b"P:\Projects\2018\1809-0068-R\Bin_Files\snap-pac-eb1-R10.0d.bin", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        byte = f.read(1)
        print(f)
I am expecting to be able to read and write to the binary file. I would like to translate it to Hex and then to text (or to legible text somehow), but I think I have to go through this step before. If anyone could help with what I am missing, that would be greatly appreciated! Any way to open and read a binary file would be accepted. Thank you for your time!
I am not sure but this might help:
import binascii
with open('snap-pac-eb1-R10.0d.bin', 'rb') as f:
    header = f.read(6)
b = bytearray(header)
binary=[bin(i)[2:].zfill(8) for i in b]
n = int('0b'+''.join(binary), 2)
nn = binascii.unhexlify('%x' % n)
nnn=nn.decode("ascii")[0:-1]
result='.'.join(str(ord(c)) for c in nnn[0:-1])
print(result)
Output:
16.0.8.0
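One likely cause of the original error is the path literal rather than the file contents. In a non-raw literal, \201 is an octal escape for the single byte 0x81, and in the question's path it lands at index 11, which matches the error message. The sketch below shows this on a shortened copy of the path, then reads a temporary file in binary mode and renders it as hex; the file name example.bin and its bytes are made up for illustration.

```python
import tempfile
from pathlib import Path

# "\\" is one backslash; "\201" is an octal escape for the byte 0x81,
# so the "201" of "2018" never reaches the file system as digits.
broken = b"P:\\Projects\2018"
bad_index = broken.index(b"\x81")

# A raw string keeps every backslash and digit literally.
safe = r"P:\Projects\2018"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.bin"  # hypothetical file
    path.write_bytes(bytes([0x01, 0x02, 0xFF, 0x81]))

    # "rb" returns bytes, so no codec is involved and nothing can fail to decode.
    with path.open("rb") as f:
        raw = f.read()

hex_text = raw.hex()
print(bad_index, safe, hex_text)
```

Writing the real path as a raw string, r"P:\Projects\...", or with forward slashes avoids the escape problem. Whether the resulting bytes can then be turned into legible text depends on the file format, which the question does not specify.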

'charmap' codec can't encode characters in position XX

I have a simple script that attempts to extract multiple JSON objects from a single file and store them as a list:
import json
URL = r"C:\Users\Kenneth\Youtube_comment_parser\Testing.txt"
with open(URL, 'r', encoding="utf-8") as handle:
    json_data = [json.loads(line) for line in handle]
print(json_data) # Can't .encode() because it's a list
Even after specifying utf-8 encoding, I'm still running into a codec error. If possible, I would also like to change this object into a dictionary, but this is as far as I've got.
The exact error reads:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 394-395: character maps to <undefined>
Thanks in advance.
I was able to solve this issue by removing one Unicode character that was producing "<undefined>", the string '\ufeff'; after that the rest displayed nicely. This required me to iterate over the keys in the list of dictionaries and replace it where necessary.
import json
URL = r"C:\Users\Kenneth\Youtube_comment_parser\Testing.txt"
json1_file = open(URL, encoding='utf-8')
json1_str = json1_file.read()
json1_str = [d.strip() for d in json1_str.splitlines()]
json1_data = [json.loads(i) for i in json1_str]
json1_data = [{key:value.replace(u'\ufeff', '') for
               key, value in json1_data[index].items()} for
              index in range(len(json1_data))]
print(json1_data[1]['text'].encode('utf-8'))
Still not sure why I have to open with utf-8 and then encode again with my print statement, but it produced the string nicely.
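The two steps are separate conversions. Opening with utf-8 decodes bytes from the file into str, while print encodes str back into bytes for the console, and the 'charmap' in the error points at a legacy console code page such as cp1252 (an assumption about this machine). The sketch below uses made-up file content to show both halves: the utf-8-sig codec drops a leading '\ufeff' while decoding, and a character outside cp1252 can be escaped rather than raising.

```python
import json

# Hypothetical file content: a UTF-8 byte-order mark followed by one JSON line.
raw = '\ufeff{"text": "caf\u00e9 \u2603"}\n'.encode("utf-8")

# 'utf-8-sig' strips the leading BOM, so no manual replace is needed.
decoded = raw.decode("utf-8-sig")
record = json.loads(decoded)

# Printing on a cp1252 console implies this encode step; U+2603 has no
# cp1252 mapping, which is what "character maps to <undefined>" means.
try:
    record["text"].encode("cp1252")
    console_ok = True
except UnicodeEncodeError:
    console_ok = False

# backslashreplace keeps the output printable instead of raising.
printable = record["text"].encode("cp1252", errors="backslashreplace").decode("cp1252")
print(console_ok, printable)
```

Calling .encode('utf-8') before print sidesteps the console codec because a bytes object is printed as its repr, which is why the workaround in the answer ran without an error.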

Trying to output hex data as readable text in Python 3.6

I am trying to read hex values from specific offsets in a file, and then show that as normal text. Upon reading the data from the file and saving it to a variable named uName, and then printing it, this is what I get:
Card name is: b'\x95\xdc\x00'
Here's the code:
cardPath = str(input("Enter card path: "))
print("Card name is: ", end="")
with open(cardPath, "rb+") as f:
f.seek(0x00000042)
uName = f.read(3)
print(uName)
How can I remove the 'b' I am getting at the beginning? And how can I remove the '\x'es so that b'\x95\xdc\x00' becomes 95dc00? If I can do that, then I guess I can convert it to text using binascii.
I am sorry if my mistake is really really stupid because I don't have much experience with Python.
A literal that starts with b in Python is a byte string (a bytes object).
Usually, you can use decode() or str(byte_string, 'UTF-8') to decode a byte string (i.e. a literal that starts with b') into a string.
EXAMPLE
str(b'\x70\x79\x74\x68\x6F\x6E','UTF-8')
'python'
b'\x70\x79\x74\x68\x6F\x6E'.decode()
'python'
However, in your case, decoding raises a UnicodeDecodeError.
str(b'\x95\xdc\x00','UTF-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x95 in position 0: invalid start byte
I guess you need to find out the encoding for your file and then specify it when you open the file, like below:
open("u.item", encoding="THE_ENCODING_YOU_FOUND")
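For the specific output the question asks for (95dc00), no decoding is needed: bytes objects have a hex() method, and binascii.hexlify gives the same digits as bytes. A minimal sketch, using the three bytes from the question and an illustrative variable name:

```python
import binascii

u_name = b"\x95\xdc\x00"  # the bytes shown in the question

# bytes.hex() returns the hex digits as an ordinary str, with no b prefix.
hex_text = u_name.hex()

# binascii.hexlify returns bytes, so decode it to get the same str.
same = binascii.hexlify(u_name).decode("ascii")

# bytes.fromhex reverses the conversion.
round_trip = bytes.fromhex(hex_text)

print(hex_text)  # prints 95dc00
```

Whether those bytes represent readable text at all still depends on the file's encoding, as noted above.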
