Pickle can't be loaded for Pascal VOC pickle dataset - python-3.x

I'm trying to load the Pascal VOC dataset from the Stanford website here. I'm also trying to implement the code from the "Semantic Image Segmentation on Pascal VOC" PyStruct blog post. But I get a UnicodeDecodeError when I try to load the pickle file. This is the code I have tried so far:
import numpy as np
try:
    import cPickle as pickle
except ImportError:
    import pickle
from pystruct import learners
import pystruct.models as crfs
from pystruct.utils import SaveLogger
data_train = pickle.load(open("trainingData/data_train.pickle"))
C = 0.01
And I got this error:
Traceback (most recent call last):
File "/Users/mypath/PycharmProjects/semantic_segmentation_ex/ex1.py", line 11, in <module>
data_train = pickle.load(open("trainingData/data_train.pickle"))
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
I couldn't find the same problem or a solution anywhere. How do I get this to work?

One of my friends told me the reason. The serialized object is a Python 2 object, so if you load it with Python 2, it opens directly without any problem.
But if you would like to load it with Python 3, you need to open the file in binary mode ('rb') and pass the encoding parameter to pickle.load, not to the open function. Here is sample code:
import numpy as np
try:
    import cPickle as pickle
except ImportError:
    import pickle
with open('data_train.pickle', 'rb') as f:
    # Python 3 needs encoding='bytes' to read a pickle written by Python 2.
    # In Python 2, leave the encoding argument out (pickle.load does not accept it).
    data_train = pickle.load(f, encoding='bytes')
print("Finished loading data!")
print(data_train.keys())
Special thanks to #ahmet-sezgin-duran
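To see what encoding='bytes' does without downloading the dataset, here is a small sketch for Python 3. The byte string PY2_PICKLE is a hand-written protocol-2 pickle of the Python 2 dict {'a': 'b'}, and the helper names load_py2_pickle and decode_keys are made up for this example. One thing to expect: Python 2 str objects come back as bytes, so the dictionary keys are bytes too.

```python
import io
import pickle

# Hand-written protocol-2 pickle of the Python 2 dict {'a': 'b'}
# (illustrative stand-in for data_train.pickle).
PY2_PICKLE = b"\x80\x02}q\x00U\x01aq\x01U\x01bq\x02s."


def load_py2_pickle(fileobj):
    """Load a Python 2 pickle from a binary file object under Python 3."""
    # encoding='bytes' keeps Python 2 str objects as raw bytes.
    return pickle.load(fileobj, encoding='bytes')


def decode_keys(d):
    """Return a copy of d whose bytes keys are decoded to str."""
    return {k.decode('ascii') if isinstance(k, bytes) else k: v
            for k, v in d.items()}


data = load_py2_pickle(io.BytesIO(PY2_PICKLE))
print(data)               # keys and values are bytes
print(decode_keys(data))  # keys are str again
```

With a real file you would pass the object returned by open(path, 'rb') instead of the io.BytesIO wrapper.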

Related

AttributeError: 'FastTextKeyedVectors' object has no attribute 'vocab'

I'm trying to load and use some pre-trained fastText embeddings (that I trained myself and stored in .kv). In the same directory I have stored the "vectors_1920_fullsample.kv.vectors_vocab.npy" file. Does someone know what is going on?
This doesn't give any error:
import matplotlib
matplotlib.use('Agg')
import numpy as np
from scipy.spatial.distance import cosine
from nltk.stem.snowball import SnowballStemmer
stemmer = SnowballStemmer("english")
import matplotlib.pyplot as plt
from wordcloud import WordCloud
import os
import joblib
from gensim.models import Word2Vec
import random
from gensim.models import KeyedVectors
import pandas as pd
model = KeyedVectors.load(wd_model + '/vectors_1920_fullsample.kv', mmap='r')
words = ['immigrant','immigrants','migrant','migrants','foreign','foreigner','foreigners','alien','aliens','expatriate','expatriates','emigrant','emigrants','nonnative','nonnatives','stranger','strangers']
But then when I do this I get the error below:
words = pd.DataFrame([np.array(model[word]) for word in words])
Error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <listcomp>
File "/cluster/apps/nss/gcc-6.3.0/python/3.7.4/x86_64/lib64/python3.7/site-packages/gensim/models/keyedvectors.py", line 353, in __getitem__
return self.get_vector(entities)
File "/cluster/apps/nss/gcc-6.3.0/python/3.7.4/x86_64/lib64/python3.7/site-packages/gensim/models/keyedvectors.py", line 471, in get_vector
return self.word_vec(word)
File "/cluster/apps/nss/gcc-6.3.0/python/3.7.4/x86_64/lib64/python3.7/site-packages/gensim/models/keyedvectors.py", line 2124, in word_vec
if word in self.vocab:
AttributeError: 'FastTextKeyedVectors' object has no attribute 'vocab'

"TypeError: 'int' object is not callable" unique problem in python3

I'm getting an error I have never seen in python before, and I'm having trouble finding any information on the internet that solves my problem. I'm trying to program a voice assistant. Here is my code:
from fuzzywuzzy import fuzz
import sys
#sys.path.insert(0, '/home/pi/AIY-voice-kit-python/src/aiy/voice')
from aiy.voice import tts
#import tts
tts.say('Harry activated')
from os import environ, path
import os
#sys.path.insert(0, '/home/pi/AIY-voice-kit-python/src/examples/voice')
#from __init__ import LiveSpeech, get_model_path
from pocketsphinx import LiveSpeech, get_model_path
model_path = get_model_path()
print("active")
speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dic=os.path.join(model_path, 'cmudict-en-us.dict')
)
for phrase in speech:
    p = str(phrase)
    print(p)
    r1 = (fuzz.ratio(p,"harry"))
    print(r1)
    ri = int(r1)
    if ri > 60():
        print("you said my name")
I'm not having any problems with imports. It's just the speech recognition has accuracy issues which is why I'm experimenting with "fuzzywuzzy". Python spits this error at me:
Traceback (most recent call last):
File "/home/pi/Desktop/AIY-projects-python/src/examples/voice/speack.py", line 31, in <module>
if ri > 60():
TypeError: 'int' object is not callable
I don't know where to go from here. Does anyone know how to resolve this issue? (Yes, I know Stack Overflow already has a similar question, but the answers don't seem to apply to my situation.)
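I don't think you need the () after 60. Writing 60() tries to call the integer 60 as if it were a function, which is what raises the TypeError. Since ri is already an int, try:
if ri > 60: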
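Here is a small sketch that reproduces the error and shows the corrected comparison. It runs without the microphone or fuzzywuzzy: the similarity function below is a stand-in built on the standard library's difflib, not fuzz.ratio itself, and the names similarity and heard_name are made up for this example.

```python
from difflib import SequenceMatcher

# Reproduce the error: putting () after an int tries to call it.
threshold = 60
try:
    threshold()
except TypeError as exc:
    message = str(exc)
print(message)


def similarity(a, b):
    """Stand-in for fuzz.ratio: similarity of two strings on a 0-100 scale."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def heard_name(phrase, name="harry", limit=60):
    """Return True when the phrase is similar enough to the name."""
    # Plain comparison against the number, with no parentheses after it.
    return similarity(phrase.lower(), name) > limit


print(heard_name("harry"))
print(heard_name("xyz"))
```

In the original loop, the same idea is just the if line with the parentheses removed.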

Python3 Glove Attribute Error: 'generator' object has no attribute 'shape'

I am trying to train my model using glove. My code is as below:
#!/usr/bin/env python3
from __future__ import print_function
import argparse
import pprint
import gensim
from glove import Glove
from tensorflow.python.keras.utils.data_utils import Sequence
def read_corpus(filename):
    delchars = [chr(c) for c in range(256)]
    delchars = [x for x in delchars if not x.isalnum()]
    delchars.remove(' ')
    delchars = ''.join(delchars)
    with open(filename, 'r') as datafile:
        for line in datafile:
            yield line.lower().translate(None, delchars).split(' ')
if __name__ == '__main__':
    base_path = "/home/hunzala_awan/vocab.pubmed1.txt"
    get_data = read_corpus(base_path)
    glove = Glove(no_components=100, learning_rate=0.05)
    glove.fit(get_data, epochs=10, verbose=True)
    pprint.pprint(glove.most_similar("cancer", number=10))
When I try to run this code, I get the following error:
Traceback (most recent call last):
File "mytest3.py", line 36, in <module>
glove.fit(get_data, epochs=10, verbose=True)
File "/usr/local/lib/python3.5/dist-packages/glove/glove.py", line 86, in fit
shape = matrix.shape
AttributeError: 'generator' object has no attribute 'shape'
What am I missing? Any help in this issue will be highly appreciated.
Thanks in advance
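I'm not familiar with Glove, but it seems that it can't fit from a generator function. You can try consuming the generator ahead of time and converting it to a list (this will use more memory). Note that the traceback shows fit reading matrix.shape, so it may actually expect a matrix rather than a list of sentences: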
glove.fit(list(get_data), epochs=10, verbose=True)
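Separately from the generator issue, read_corpus in the question uses the Python 2 form translate(None, delchars). In Python 3, str.translate takes a single table argument, so that line would fail as soon as the generator is actually consumed. Below is a sketch of a Python 3 version, assuming the intent is to strip every non-alphanumeric character except spaces. It takes any iterable of lines so it can be tried without the corpus file, and build_delete_table is a name made up for this example.

```python
def build_delete_table():
    """Translation table that deletes non-alphanumeric characters except space."""
    # Same character set as in the question: code points 0-255.
    delchars = [chr(c) for c in range(256) if not chr(c).isalnum()]
    delchars.remove(' ')
    # Third argument of str.maketrans lists the characters to delete.
    return str.maketrans('', '', ''.join(delchars))


def read_corpus(lines):
    """Yield one list of lowercase tokens per input line."""
    table = build_delete_table()
    for line in lines:
        yield line.lower().translate(table).split()


sentences = list(read_corpus(["Cancer, cells!\n", "p53-gene"]))
print(sentences)
```

For the real corpus, pass the open file object (for example the datafile from the with block) as lines.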

looping over images in a directory

I have images in the same directory as a Python file. I am trying to loop over the images and convert them to base64, but I am getting this error.
I am using Ubuntu 14.0.4
Traceback (most recent call last):
File "convert_to_base64.py", line 33, in <module>
print(main())
File "convert_to_base64.py", line 26, in main
convert_to_base64()
File "convert_to_base64.py", line 19, in convert_to_base64
with open("*.jpg", "rb") as f:
IOError: [Errno 2] No such file or directory: '*.jpg'
Here is my python code
# -*- coding: utf-8 -*-
import os
import sys
import xlrd
import base64
import urllib
from datetime import datetime
reload(sys) # to re-enable sys.setdefaultencoding()
sys.setdefaultencoding('utf-8')
def convert_to_base64():
    """
    Read all jpg images in a folder,
    and print them in base64
    """
    with open("*.jpg", "rb") as f:
        data = base64.b64decode(f.read())
        print data
def main():
    start_datetime = datetime.now()
    convert_to_base64()
    end_datetime = datetime.now()
    print '------------------------------------------------------'
    print 'Script started : {}'.format(start_datetime)
    print 'Script finished: {}'.format(end_datetime)
if __name__ == '__main__':
    print(main())
    print('Done')
Can someone help me figure out what I am doing wrong?
Thanks
This is how I loop over images in a directory:
import os
pictures = []
for file in os.listdir("pictures"):
    if file[-3:].lower() in ["png"]:
        pictures.append(file)
Please refer to the Python documentation https://docs.python.org/2/tutorial/inputoutput.html for more info on the open() function:
open() returns a file object, and is most commonly used with two arguments: open(filename, mode).
The filename must name one existing file. open() does not expand wildcards such as *.jpg, which is why you get "No such file or directory".
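Because open() takes a single filename, a pattern like *.jpg has to be expanded into real paths first. Here is a minimal sketch of one way to do that, assuming Python 3 and the standard glob module. The function name images_to_base64 and its parameters are made up for this example. It also uses b64encode: the question calls b64decode, which goes in the opposite direction (base64 text back to raw bytes).

```python
import base64
import glob
import os


def images_to_base64(directory=".", pattern="*.jpg"):
    """Return {filename: base64 string} for every file matching the pattern."""
    encoded = {}
    # glob expands the wildcard into a list of real paths.
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(path, "rb") as f:
            raw = f.read()
        # b64encode returns bytes; decode to get a printable str.
        encoded[os.path.basename(path)] = base64.b64encode(raw).decode("ascii")
    return encoded


if __name__ == "__main__":
    for name, text in images_to_base64().items():
        print(name, text[:40])
```

Called with no arguments it looks for .jpg files in the current working directory, which matches the setup in the question only if the script is run from the folder that holds the images.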

Procedure for writing image pixel data to a file, one value per line?

import cv2
import numpy as np
import os
k=[]
file1=open("TextData.txt",'w')
fn=input("Enter filename : ")
img=cv2.imread(fn,cv2.IMREAD_GRAYSCALE)
l=len(img)
w=len(img[0])
print(str(l)+"\n"+str(w))
for i in range(len(img)):
    for j in range(len(img[0])):
        k.append(img[i,j])
for a in range(len[k]):
    file1.write(str(k[a])+"\n")
file1.close()
Basically, I'm running into the following error:
Traceback (most recent call last):
File "imagereads.py", line 17, in <module>
for a in range(len[k]):
TypeError: 'builtin_function_or_method' object is not subscriptable
I'm trying to write a program that will store each image's pixel data in a file and access it later when needed. Can anyone help me with this? I'm doing this so that I can directly use file1.readlines() to read the data back later on.
At first I tried appending each element to k, converting it to a string and storing it directly. But I'm having problems getting the data back from the file into a list. Any help on this matter will be appreciated too.
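The TypeError itself comes from len[k]: square brackets try to index the built-in len function, and the call should be len(k). For the round trip, here is a minimal sketch, assuming a 2-D grayscale image (anything indexable as rows of integer values, which includes what cv2.imread returns with IMREAD_GRAYSCALE). The function names write_pixels and read_pixels are made up for this example, and plain nested lists stand in for the image so it runs without OpenCV.

```python
def write_pixels(rows, path):
    """Write height, width, then one pixel value per line."""
    height = len(rows)
    width = len(rows[0]) if height else 0
    with open(path, "w") as f:
        f.write("{}\n{}\n".format(height, width))
        for row in rows:
            for value in row:
                f.write("{}\n".format(int(value)))


def read_pixels(path):
    """Read the file back into a list of rows of ints."""
    with open(path) as f:
        values = [int(line) for line in f if line.strip()]
    height, width, flat = values[0], values[1], values[2:]
    # Cut the flat list back into rows using the stored width.
    return [flat[r * width:(r + 1) * width] for r in range(height)]


if __name__ == "__main__":
    image = [[0, 128, 255], [7, 8, 9]]  # stand-in for a grayscale image
    write_pixels(image, "TextData.txt")
    print(read_pixels("TextData.txt"))
```

Storing the height and width on the first two lines is a design choice of this sketch. Without them the file is only a flat list of values and the original shape cannot be recovered.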

Resources