I am training my system for texture analysis using Local Binary Patterns. Here I am training on images, with code taken from a tutorial. I am getting an error when defining the paths of the images.
# OpenCV bindings
import cv2
# To perform path manipulations
import os
# Local Binary Pattern function
from skimage.feature import local_binary_pattern
# To calculate a normalized histogram
from scipy.stats import itemfreq
from sklearn.preprocessing import normalize
# Utility package -- use pip install cvutils to install
import cvutils
# To read class from file
import csv
# Store the paths of training images in train_images
train_images = cvutils.imlist("'C:\Users\Babar\MATLAB\isp\training images\fire-image1.jpg', 'C:\Users\Babar\MATLAB\isp\training images\fire-image2.jpg', 'C:\Users\Babar\MATLAB\isp\training images\fire-image3.jpg'")
# Dictionary containing image paths as keys and corresponding label as value
train_dic = {'fire-image1':0,'fire-image2':0,'fire-image3':0}
with open('C:\Users\Babar\MATLAB\isp\class_train.txt', 'rb') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ')
    for row in reader:
        train_dic[row[0]] = int(row[1])
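A likely fix (a sketch; it assumes all training images sit in one folder, and that cvutils.imlist takes a directory path as in the tutorial this code is based on): use a raw string so the backslashes in the Windows path are not treated as escape sequences, and pass a single directory rather than a quoted list of file names. Building the list with glob works as well:
import glob

# Raw string (r'...') keeps the backslashes literal on Windows.
train_images = glob.glob(r'C:\Users\Babar\MATLAB\isp\training images\*.jpg')

# Equivalent, if cvutils.imlist is given the directory itself:
# train_images = cvutils.imlist(r'C:\Users\Babar\MATLAB\isp\training images')
On Python 3 the class file should also be opened in text mode ('r' rather than 'rb') for csv.reader to work.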
I am currently doing an assignment on deep learning, using the assignment files downloaded from GitHub.
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset
%matplotlib inline
You are given a dataset ("data.h5") containing:
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3), where 3 is for the 3 channels (RGB); thus each image is square (height = num_px and width = num_px)
# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
I ran the setup.sh file too but the error doesn't seem to go away.
lr_utils.py file:
import numpy as np
import h5py
def load_dataset():
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels
    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
Kindly help!
I solved the issue by downloading uncorrupted .h5 files and putting them in the folder datasets/ in the same directory.
The files you downloaded are corrupted. You can visit https://github.com/abdur75648/Deep-Learning-Specialization-Coursera to download the uncorrupted files.
You can download uncorrupted files from here:
https://www.kaggle.com/datasets/muhammeddalkran/catvnoncat
and replace them in the directory of the corrupted files.
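If you want to confirm whether a downloaded .h5 file is intact before re-running the notebook, a quick check (a sketch; the paths assume the datasets/ layout used by lr_utils.py above) is to try opening it with h5py and listing its keys:
import h5py

for path in ('datasets/train_catvnoncat.h5', 'datasets/test_catvnoncat.h5'):
    try:
        with h5py.File(path, 'r') as f:
            print(path, 'OK, keys:', list(f.keys()))
    except OSError as e:
        print(path, 'is unreadable (likely corrupted):', e)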
I want to decompress a batch of .nii.gz files in Python so that they can be processed in sitk later on. When I decompress a single file manually by right-clicking the file and choosing 'Extract..', the file is then correctly interpreted by sitk (I do sitk.ReadImage(unzipped)). But when I try to decompress it in Python using the following code:
with gzip.open(segmentation_zipped, "rb") as f:
    bindata = f.read()
segmentation_unzipped = os.path.join(segmentation_zipped.replace(".gz", ""))
with gzip.open(segmentation_unzipped, "wb") as f:
    f.write(bindata)
I get an error when sitk tries to read the file:
RuntimeError: Exception thrown in SimpleITK ReadImage: C:\d\VS14-Win64-pkg\SimpleITK\Code\IO\src\sitkImageReaderBase.cxx:82:
sitk::ERROR: Unable to determine ImageIO reader for "E:\BraTS19_2013_10_1_seg.nii"
Also when trying to do it a little differently:
input = gzip.GzipFile(segmentation_zipped, 'rb')
s = input.read()
input.close()
segmentation_unzipped = os.path.join(segmentation_zipped.replace(".gz", ""))
output = open(segmentation_unzipped, 'wb')
output.write(s)
output.close()
I get:
RuntimeError: Exception thrown in SimpleITK ReadImage: C:\d\VS14-Win64-pkg\SimpleITK-build\ITK\Modules\IO\PNG\src\itkPNGImageIO.cxx:101:
itk::ERROR: PNGImageIO(0000022E3AF2C0C0): PNGImageIO failed to read header for file:
Reason: fread read only 0 instead of 8
Can anyone help?
There is no need to unzip the NIfTI images; libraries such as Nibabel can handle them without decompression.
#==================================
import nibabel as nib
import numpy as np
import matplotlib.pyplot as plt
#==================================
# load image (4D) [X, Y, Z_slice, time]
nii_img = nib.load('path_to_file.nii.gz')
nii_data = nii_img.get_fdata()
# derive the grid size from the volume itself
number_of_slices = nii_data.shape[2]
number_of_frames = nii_data.shape[3]
fig, ax = plt.subplots(number_of_frames, number_of_slices, constrained_layout=True)
fig.canvas.set_window_title('4D Nifti Image')
fig.suptitle('4D_Nifti 10 slices 30 time Frames', fontsize=16)
#-------------------------------------------------------------------------------
mng = plt.get_current_fig_manager()
mng.full_screen_toggle()
for slice in range(number_of_slices):
    # if your data is 4D; otherwise remove this loop
    for frame in range(number_of_frames):
        ax[frame, slice].imshow(nii_data[:, :, slice, frame], cmap='gray', interpolation=None)
        ax[frame, slice].set_title("layer {} / frame {}".format(slice, frame))
        ax[frame, slice].axis('off')
plt.show()
Or you can use SimpleITK as follows:
import SimpleITK as sitk
import numpy as np
# A path to a T1-weighted brain .nii image:
t1_fn = 'path_to_file.nii'
# Read the .nii image containing the volume with SimpleITK:
sitk_t1 = sitk.ReadImage(t1_fn)
# and access the numpy array:
t1 = sitk.GetArrayFromImage(sitk_t1)
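If you do need a decompressed .nii on disk, note that the first snippet in the question opens the output with gzip.open as well, so the "unzipped" file is gzip-compressed all over again, which is why no ImageIO recognizes it. A minimal sketch of the manual route (assuming segmentation_zipped holds the .nii.gz path):
import gzip
import shutil

segmentation_unzipped = segmentation_zipped.replace(".gz", "")
# Read decompressed bytes from the archive and stream them to a plain file.
with gzip.open(segmentation_zipped, "rb") as f_in, \
        open(segmentation_unzipped, "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)
SimpleITK can also read .nii.gz paths directly, so sitk.ReadImage(segmentation_zipped) may work without any decompression at all.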
import time
# cv2.cvtColor takes a numpy ndarray as an argument
import numpy as nm
import pytesseract
# importing OpenCV
import cv2
from PIL import ImageGrab, Image
bboxes = [(1469, 1014, 1495, 1029)]
def imToString():
    # Path of tesseract executable
    pytesseract.pytesseract.tesseract_cmd = r'D:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
    while (True):
        for box in bboxes:
            # ImageGrab - to capture the screen image in a loop.
            # Bbox used to capture a specific area.
            cap = ImageGrab.grab(bbox=box)
            # Convert the image to monochrome so it is more easily
            # read by the OCR, and obtain the output string.
            tesstr = pytesseract.image_to_string(
                cv2.cvtColor(nm.array(cap), cv2.COLOR_BGR2GRAY), lang='eng', config='digits')
            cap.show()
            #input()
            time.sleep(5)
            print(tesstr)

# Calling the function
imToString()
It captures a small image of a number (in this example, 53).
It isn't always two digits; it can be one or three digits too.
Pytesseract returns values like 'asi' and 'oli'.
So, which image-to-text (OCR) approach should I use for this problem, and how should I use it? I need a fairly precise value; in this example it's 53, so the output should be around 50.
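Not from the original thread, but one approach that often helps with such tiny crops is to upscale and binarize the grab before OCR, and to restrict tesseract to a single line of digits. A sketch (the bbox is the one from the question; the whitelist option works with tesseract's legacy engine and 4.1+):
import cv2
import numpy as np
import pytesseract
from PIL import ImageGrab

# Grab the same region as in the question.
cap = ImageGrab.grab(bbox=(1469, 1014, 1495, 1029))
# PIL returns RGB, so convert RGB (not BGR) to grayscale.
img = cv2.cvtColor(np.array(cap), cv2.COLOR_RGB2GRAY)
# Upscale the tiny crop; tesseract struggles below roughly 20 px character height.
img = cv2.resize(img, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)
# Binarize with Otsu's threshold to get clean black-on-white digits.
_, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# --psm 7 treats the image as one text line; the whitelist keeps digits only.
text = pytesseract.image_to_string(
    img, config='--psm 7 -c tessedit_char_whitelist=0123456789')
print(text.strip())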
I am a beginner. I am converting audio files into MFCCs; I have done it for one file but don't know how to iterate over the whole dataset. I have multiple folders in a Training folder, one of them is 001(0), from which one wav file is converted. I want to convert the wav files of all folders present in the Training folder.
import os
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
import scipy.io.wavfile as wav
from python_speech_features import mfcc, logfbank
# Read the input audio file
(rate,sig) = wav.read('Downloads/DataVoices/Training/001(0)/001000.wav')
# Take the first 10,000 samples for analysis
sig = sig[:10000]
features_mfcc = mfcc(sig,rate)
# Print the parameters for MFCC
print('\nMFCC:\nNumber of windows =', features_mfcc.shape[0])
print('Length of each feature =', features_mfcc.shape[1])
# Plot the features
features_mfcc = features_mfcc.T
plt.matshow(features_mfcc)
plt.title('MFCC')
# Extract the Filter Bank features
features_fb = logfbank(sig, rate)
# Print the parameters for Filter Bank
print('\nFilter bank:\nNumber of windows =', features_fb.shape[0])
print('Length of each feature =', features_fb.shape[1])
# Plot the features
features_fb = features_fb.T
plt.matshow(features_fb)
plt.title('Filter bank')
plt.show()
You can use glob recursively with wildcards to find all of the wav files.
import glob  # import the module itself, since glob.glob is used here

for f in glob.glob(r'Downloads/DataVoices/Training/**/*.wav', recursive=True):
    (rate, sig) = wav.read(f)
    # Rest of your code
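If you also want to keep track of which Training subfolder each file came from, one possible extension (a sketch; it assumes the folder name, e.g. 001(0), is the label you need) is:
import os
import glob
import scipy.io.wavfile as wav
from python_speech_features import mfcc

features = []
labels = []
for f in glob.glob(r'Downloads/DataVoices/Training/**/*.wav', recursive=True):
    (rate, sig) = wav.read(f)
    sig = sig[:10000]  # first 10,000 samples, as in the original script
    features.append(mfcc(sig, rate))
    # The parent folder name, e.g. '001(0)', serves as the label.
    labels.append(os.path.basename(os.path.dirname(f)))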
I have followed this example:
https://www.pyimagesearch.com/2017/10/30/how-to-multi-gpu-training-with-keras-python-and-deep-learning/
and had an issue with the following line (line #51):
((trainX, trainY), (testX, testY)) = cifar10.load_data()
as I would like to train it on my own data.
Is there any simple way to generate this kind of output without digging deep into cifar10's implementation?
I am pretty sure it is something people have already done, but I cannot find a sample/tutorial/example.
Thanks.
Assume you have your images in .jpg format and your labels in a CSV file called label.csv, separated into two folders, a train folder and a test folder.
Then you can do the following to get x_train:
import cv2  # library for reading images
import numpy as np
import glob  # library for reading files in a folder

x_train = []
for file in glob.glob("train/*.jpg"):
    im = cv2.imread(file)  # read each image from the folder
    x_train.append(im)
x_train = np.array(x_train)
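One caveat (an assumption, since the answer does not mention image sizes): np.array(x_train) only stacks into a single 4-D array if every image has the same height and width; otherwise you get a ragged object array. If your images vary in size, resize each one on load, for example:
im = cv2.resize(im, (32, 32))  # e.g. 32x32, the CIFAR-10 size, before x_train.append(im)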
And you can do the following to get the y_train
import csv

y_train = []
with open('train/label.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        y_train.append([int(row[0])])  # convert the string to int (otherwise the csv data is read as strings)
y_train = np.array(y_train)
You can do the same for your test folder; just change the folder and file names accordingly.
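A final caveat (an assumption about the data layout, not from the original answer): glob.glob returns files in arbitrary order, so the i-th image may not match the i-th row of label.csv. If the CSV was written in sorted-filename order, sort the glob result so images and labels stay aligned:
# Deterministic file order so images line up with the label.csv rows
for file in sorted(glob.glob("train/*.jpg")):
    x_train.append(cv2.imread(file))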