Convert pickle file from protocol 3 to protocol 2 - python-3.x

I dumped a pickle file using protocol 3, the default in Python 3, but I am deploying to Google Cloud, which runs Python 2, so I need the file in protocol 2. I want to convert the existing protocol-3 pickle file directly to a protocol-2 pickle file. How can I do that?

Can you try something like the following? I did not find any direct converter in the standard library; maybe someone else knows of one.
Load the file into an object obj, then re-dump it, passing the protocol as the third argument to dump:
pickle.dump(obj, fileObject, 2)
The protocol is an option of the dump function:
https://docs.python.org/3.1/library/pickle.html#pickle.dump
Rough code:
import pickle

# load the protocol-3 file with Python 3 ...
with open('data1.pickle', 'rb') as f1:
    data = pickle.load(f1)

# ... and re-dump it using protocol 2 so Python 2 can read it
with open('data2.pickle', 'wb') as f2:
    pickle.dump(data, f2, 2)
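To confirm the re-dumped file really carries protocol 2, you can inspect the first two bytes of the stream: pickles at protocol 2 and above start with the PROTO opcode (0x80) followed by the protocol number. A small self-contained check (the sample object here is illustrative):

```python
import pickle

obj = {"a": [1, 2, 3]}

# protocol 3, as Python 3.0-3.7 write by default
p3 = pickle.dumps(obj, protocol=3)
# re-dump as protocol 2 so Python 2 can load it
p2 = pickle.dumps(pickle.loads(p3), protocol=2)

# \x80 is the PROTO opcode; the next byte is the protocol number
assert p3[:2] == b"\x80\x03"
assert p2[:2] == b"\x80\x02"
assert pickle.loads(p2) == obj
print("protocol:", p2[1])  # → protocol: 2
```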

Related

Read a utf-16LE file directly in cloud function -python/GCP

I have a CSV file with UTF-16LE encoding. I tried to open it in a Cloud Function using:
import pandas as pd
from io import StringIO as sio

with open("gs://bucket_name/my_file.csv", "r", encoding="utf16") as f:
    read_all_once = f.read()
read_all_once = read_all_once.replace('"', "")
file_like = sio(read_all_once)
df = pd.read_csv(file_like, sep=";", skiprows=5)
I get an error that the file is not found at that location. What is the issue? When I run the same code locally with a local path, it works.
Also, when the file is in UTF-8 encoding I can read it directly with:
df = pd.read_csv("gs://bucket_name/my_file.csv", delimiter=";", encoding="utf-8", skiprows=0, low_memory=False)
I need to know whether I can read the UTF-16 file directly with pd.read_csv(). If not, how do I make open() recognize the path?
Thanks in advance!
Yes, you can read the UTF-16 CSV file directly with the pd.read_csv() method.
For the method to work, make sure the service account attached to your function has permission to read the CSV file in the Cloud Storage bucket.
Also check whether the encoding of your CSV file is "utf-16", "utf-16le", or "utf-16be", and pass the appropriate one to the method.
I used the Python 3.7 runtime. My main.py and requirements.txt files look as below; you can modify main.py according to your use case.
main.py
import pandas as pd

def hello_world(request):
    # please change the file's URI
    data = pd.read_csv('gs://bucket_name/file.csv', encoding='utf-16le')
    print(data)
    return 'check the results in the logs'
requirements.txt
pandas==1.1.0
gcsfs==0.6.2
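The distinction between "utf-16", "utf-16le", and "utf-16be" mentioned above matters: "utf-16" expects and consumes a byte-order mark (BOM), while "utf-16le"/"utf-16be" assume a fixed byte order and leave any BOM in the decoded text. A quick standard-library check (sample data made up for illustration):

```python
import codecs

text = "a;b\n1;2\n"

# bytes as written by open(..., encoding="utf-16"): a BOM, then the data
with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# "utf-16" detects and consumes the BOM
assert with_bom.decode("utf-16") == text
# "utf-16le" assumes the byte order, so the BOM survives as a stray U+FEFF
assert with_bom.decode("utf-16-le") == "\ufeff" + text

# BOM-less little-endian data, as in the question, needs "utf-16le"
no_bom = text.encode("utf-16-le")
assert no_bom.decode("utf-16-le") == text
print("ok")  # → ok
```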

How to read a binary file and write it in a txt or csv file using python?

I have a binary file (.man) containing data that I want to read using Python 3.7. The idea is to convert this binary file into a txt or a csv file.
I know the total number of values in the binary file but not the number of bytes per value.
I have read many posts about binary files but none was helpful...
Thank you in advance.
You can read the raw contents directly:
with open('file.man', 'rb') as f:
    data = f.readlines()
print(data)  # binary values shown as bytes objects
Opening a file with the mode 'rb' means it is read as binary: you get raw bytes objects, with no text decoding applied.
The solution I found is this (n_values stands for the known number of values in the file):
import struct

data = []
with open('binary_file', 'rb') as f:
    while len(data) < n_values:
        data.extend(struct.unpack('f', f.read(4)))
Of course, this works because I know the values are single-precision (4-byte) floats, which is what the 'f' format expects.
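If NumPy is available, numpy.fromfile does the same unpacking in one call, and numpy.savetxt produces the txt/csv output the question asks for. A sketch under the same assumption of 4-byte single-precision floats (the file names here are made up):

```python
import struct
import numpy as np

# create a small binary file of float32 values for the demo
values = [1.5, -2.25, 3.0]
with open("demo.bin", "wb") as f:
    f.write(struct.pack("%df" % len(values), *values))

# read every float32 in the file at once
data = np.fromfile("demo.bin", dtype=np.float32)
print(data.tolist())  # → [1.5, -2.25, 3.0]

# write the values out as CSV text
np.savetxt("demo.csv", data, delimiter=",")
```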

Cannot load csv file using numpy loadtxt

I cannot load a CSV file using the numpy loadtxt function. There must be something wrong with the file format or something else. I am using an Anaconda notebook on a MacBook.
np.loadtxt("Macintosh HD\\Users\\binhao\\Downloads\\Iris_data.csv")
OSError: Macintosh HD\\Users\\binhao\\Downloads\\Iris_data.csv not found.
I tried a solution I found on Stack Overflow:
f = open(u"Macintosh HD\\Users\\binhao\\Downloads\\Iris_data.csv")
f = open("Macintosh HD\\Users\\binhao\\Downloads\\Iris_data.csv")
Neither works: "No such file or directory".
Most of the time this is due to a non-escaped character; try using a raw string:
r"Macintosh HD\Users\binhao\Downloads\Iris_data.csv"
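On macOS there is an extra wrinkle beyond escaping: "Macintosh HD" is the Finder's display name for the root volume, not part of the POSIX path, and separators are forward slashes. Assuming the user name from the question, the path that open() and np.loadtxt() expect would look like this:

```python
from pathlib import Path

# POSIX form of Finder's "Macintosh HD > Users > binhao > Downloads"
p = Path("/Users/binhao/Downloads/Iris_data.csv")
print(p)  # → /Users/binhao/Downloads/Iris_data.csv

# equivalent, without hard-coding the user name
print(Path.home() / "Downloads" / "Iris_data.csv")
```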

how to write data from a point cloud file to a text file

I am trying to make a program that takes point cloud data and writes it into a txt file. For some reason I am getting this error when I run my code:
File "readpts.py", line 14, in <module>
    f.write("%d "%float(array[i][0].item()))
io.UnsupportedOperation: not writable
This should be a simple fix; I just don't know what I am doing wrong. Here is my code:
import numpy as np
import open3d as o3d
pcd = o3d.io.read_point_cloud("cloud_cd.ply")
#print(pcd)
#print(np.asarray(pcd.points))
array = np.asarray(pcd.points)
f = open("cloud_cd.ply")
#print(type(float(array[0][0].item())))
for i in range(len(array)):
    f.write("%d "%float(array[i][0].item()))
    f.write("%d "%float(array[i][1].item()))
    f.write("%d \n"%float(array[i][2].item()))
You are opening your file in read mode, which is the default when using the open function.
You should do something like this:
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("cloud_cd.ply")
array = np.asarray(pcd.points)
with open("points.txt", mode='w') as f:  # note the mode='w'
    for i in range(len(array)):
        f.write("%f " % float(array[i][0].item()))
        f.write("%f " % float(array[i][1].item()))
        f.write("%f \n" % float(array[i][2].item()))
The with statement ensures the file is closed even if an error occurs.
Edit
For the rounding issue: it is caused by the %d, which truncates the value to an integer. To keep the float, replace %d with %f (done in the code above). If you want only two decimals, use %.2f (more information in the docs).
If you are on Python 3.6+, you can use f-strings instead.
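For concreteness, the %.2f form and its f-string equivalent produce identical output; the coordinate values below are made up:

```python
x, y, z = 1.23456, -2.5, 0.128

# printf-style with two decimals
line_percent = "%.2f %.2f %.2f\n" % (x, y, z)
# f-string equivalent (Python 3.6+)
line_fstring = f"{x:.2f} {y:.2f} {z:.2f}\n"

assert line_percent == line_fstring
print(line_percent, end="")  # → 1.23 -2.50 0.13
```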

Python 3 pickle load from Python 2

I have a pickle file that was created (I don't know exactly how) in Python 2. It is intended to be loaded by the following Python 2 lines, which (unsurprisingly) do not work in Python 3:
with open('filename','r') as f:
    foo, bar = pickle.load(f)
Result:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1219: ordinal not in range(128)
Manual inspection of the file suggests it is UTF-8 encoded, therefore:
with open('filename','r', encoding='utf-8') as f:
    foo, bar = pickle.load(f)
Result:
TypeError: a bytes-like object is required, not 'str'
With binary mode and an encoding:
with open('filename','rb', encoding='utf-8') as f:
    foo, bar = pickle.load(f)
Result:
ValueError: binary mode doesn't take an encoding argument
With binary mode alone:
with open('filename','rb') as f:
    foo, bar = pickle.load(f)
Result:
UnpicklingError: invalid load key, '\n'.
Is this pickle file just broken? If not, how can I pry this thing open in python 3? (I have browsed the extensive collection of related questions and not found anything that works yet.)
Finally, note that the original
import cPickle as pickle
has been replaced with
import _pickle as pickle
Loading Python 2 pickles in Python 3 (version 3.7.2 in this example) can be helped by the fix_imports parameter of pickle.load, though in my case it also worked without setting that parameter to True.
I was attempting to load a scipy.sparse.csr.csr_matrix contained in a pickle generated with Python 2.
When inspecting the file with the Unix file command, it reports:
$ file -bi python2_generated.pckl
application/octet-stream; charset=binary
I could load the pickle in Python 3 using the following code:
with open("python2_generated.pckl", "rb") as fd:
    bh01 = pickle.load(fd, fix_imports=True, encoding="latin1")
Note that loading succeeded both with and without fix_imports set to True.
As for the "latin1" encoding, the Python 3 documentation (version 3.7.2) for the pickle.load function says:
Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2
Although this is specific to scipy matrices (and NumPy arrays), and since Novak did not clarify what his pickle file contained, I hope this can be of help to other users :)
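To see what encoding='latin1' does, here is a hand-crafted protocol-0 stream of the kind Python 2 writes for a non-ASCII str (the byte string below is constructed for illustration): the default ASCII decoding fails, while latin-1 maps every byte to a character:

```python
import pickle

# what Python 2's pickle writes for the str 'caf\xe9' at protocol 0
payload = b"S'caf\\xe9'\np0\n."

try:
    pickle.loads(payload)  # the default encoding is ASCII
except UnicodeDecodeError as e:
    print("ascii failed:", e.reason)

print(pickle.loads(payload, encoding="latin1"))  # → café
```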
Two errors were compounding each other.
First: By the time the .p file reached me, it had almost certainly been corrupted in transit, likely by FTP-ing (or similar) in ASCII rather than binary mode. I was able to get my hands on a properly transmitted copy, which allowed me to discover...
Second: Whatever the file might have implied on the inside, the proper encoding was 'latin1' not 'utf-8'.
So in a sense, yes, the file was broken, and even after that I was doing it wrong. I leave this here as a reminder to whoever eventually hits the next bizarre pickle/python2/python3 issue: there can be multiple things gone wrong, and they have to be solved in the correct order.
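The first failure mode is easy to reproduce: an ASCII-mode transfer rewrites newline bytes, and since a binary pickle freely contains 0x0A bytes, the stream's opcodes get shifted and unpickling fails with the same "invalid load key" error seen above. Simulating the damage in-process (the sample data is arbitrary):

```python
import pickle

data = pickle.dumps(list(range(300)), protocol=2)
assert pickle.loads(data) == list(range(300))

# an ASCII-mode FTP transfer would rewrite \n as \r\n; simulate that
corrupted = data.replace(b"\n", b"\r\n")
try:
    pickle.loads(corrupted)
except pickle.UnpicklingError as e:
    print("corrupted pickle:", e)
```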
