Trying to print image count - conv-neural-network

I am new to Python and am starting a CNN for a project. I mounted Google Drive and am trying to load images from a directory on the drive. After that, I want to count the images in that directory. Here is my code:
import pathlib
dataset_dir = "content/drive/My Drive/Species_Samples"
data_dir = tf.keras.utils.get_file('Species_Samples', origin=dataset_dir, untar=True)
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir('*/*.png')))
print(image_count)
However, I get the following error.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-78-e5d9409807d9> in <module>()
----> 1 image_count = len(list(data_dir('*/*.png')))
2 print(image_count)
TypeError: 'PosixPath' object is not callable
Can you help, please?
After suggestion, my code looks like this:
import pathlib
data_dir = pathlib.Path("content/drive/My Drive/Species_Samples/")
count = len(list(data_dir.rglob("*.png")))
print(count)

You are trying to glob files, so you need to use one of the glob methods that pathlib provides:
import pathlib
data_dir = pathlib.Path("/path/to/dir/")
count = len(list(data_dir.rglob("*.png")))
In this case .rglob is a recursive glob: it matches the pattern in data_dir and in every subdirectory below it.
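To make the difference between the two glob methods concrete, here is a minimal sketch. The folder layout and the helper name count_pngs are made up for illustration and are built in a temporary directory, so nothing here depends on the Drive path from the question.

```python
import tempfile
from pathlib import Path


def count_pngs(root):
    """Return (one-level count, recursive count) of .png files under root."""
    root = Path(root)
    one_level = len(list(root.glob("*/*.png")))  # files exactly one folder deep
    recursive = len(list(root.rglob("*.png")))   # files at any depth, root included
    return one_level, recursive


# Hypothetical layout, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "cats").mkdir()
    (root / "cats" / "a.png").touch()
    (root / "cats" / "nested").mkdir()
    (root / "cats" / "nested" / "b.png").touch()
    (root / "top.png").touch()
    counts = count_pngs(root)

print(counts)
```

The pattern '*/*.png' from the question would work with .glob as long as every image sits exactly one folder below data_dir; .rglob is the safer choice when the nesting depth is unknown.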

Related

get current working directory in jupyter notebook

I have code that gets the parent directory of the current file. It works when run in VS Code, but when I move the code to a Jupyter notebook it stops working.
import pandas as pd
import os
from pathlib import Path
import matplotlib.pyplot as plt
cur_path = Path(os.path.dirname(__file__))
root_path = cur_path.parent.absolute()
train_data_path = '{}\\data\\train_codified.csv'.format(str(root_path))
test_data_path = '{}\\data\\test_codified.csv'.format(str(root_path))
This returns the following error:
NameError Traceback (most recent call last)
Input In [1], in <cell line: 7>()
3 from pathlib import Path
4 import matplotlib.pyplot as plt
----> 7 cur_path = Path(os.path.dirname(__file__))
8 root_path = cur_path.parent.absolute()
10 train_data_path = '{}\\data\\train_codified.csv'.format(str(root_path))
NameError: name '__file__' is not defined
Why does the code not work in a notebook?
I have replaced
cur_path = Path(os.path.dirname(__file__))
with
cur_path = Path(os.getcwd())
This works, so my problem is solved. I am still not sure why __file__ is not defined, so if anyone knows, please let me know.
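On the open question: Python sets __file__ when it runs or imports a module from a source file. Notebook cells are executed interactively rather than loaded from a .py file, so the name is never defined there. A small sketch of a fallback that works in both settings follows; the helper name base_dir is an assumption made for this example, not part of any library.

```python
from pathlib import Path


def base_dir(namespace):
    """Return the folder of the running script, or the working directory.

    `namespace` is expected to be the caller's globals(); in a notebook it
    has no "__file__" key, so the current working directory is used instead.
    """
    file_attr = namespace.get("__file__")
    if file_attr is None:
        return Path.cwd()
    return Path(file_attr).resolve().parent


cur_path = base_dir(globals())
root_path = cur_path.parent
# Joining with "/" avoids hard-coding Windows backslashes in the path.
train_data_path = root_path / "data" / "train_codified.csv"
```

Note that in a notebook the working directory is usually, but not always, the folder that contains the notebook, so the two branches can point to different places.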

Python Code Image Stitching not working - Mac M1

A friend of mine sent me this code, which works for him, to stitch images together vertically. However, when I try to run it, it gives me this:
Traceback (most recent call last):
File "/Users/usr/Downloads/Test/Concat.py", line 15, in <module>
final = Image.new('RGB', (images[0].width, totalHeight))
IndexError: list index out of range
How can I resolve this?
Here's my code:
import os
from PIL import Image
from os import listdir
from os.path import isfile
files = [f for f in listdir() if isfile(f)]
images = []
totalHeight = 0
for file in files:
    if file.lower().endswith(('.png', '.jpg', '.jpeg', '.tiff', '.bmp', '.gif')):
        images.append(Image.open(file))
        totalHeight += images[-1].height
final = Image.new('RGB', (images[0].width, totalHeight))
runningHeight = 0
for im in images:
    final.paste(im, (0, runningHeight))
    runningHeight += im.height
final.save("Concat.png")
Thanks in advance.
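The IndexError on images[0] means the images list is empty, so no file in the searched folder matched the extensions. listdir() without an argument lists the current working directory, which is not necessarily the folder that contains Concat.py. A possible way to make this explicit is sketched below; collect_image_paths is a hypothetical helper, and the temporary folder only stands in for the real one.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}


def collect_image_paths(folder):
    """Return sorted image paths in `folder`, failing loudly if there are none."""
    folder = Path(folder)
    paths = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not paths:
        raise FileNotFoundError(f"no image files found in {folder}")
    return paths


# Demonstration on a throwaway folder with two images and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.PNG", "b.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    names = [p.name for p in collect_image_paths(tmp)]

print(names)
```

In the script itself, passing Path(__file__).parent as the folder would search next to Concat.py regardless of where the script is launched from, and the returned paths can be handed to Image.open unchanged.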

Semantic Segmentation in google colab using fastai

I keep getting the error "unsupported operand type(s) for /: 'str' and 'str'" in my code.
I have created a dataset for semantic segmentation of sidewalks across a campus. I want to train on this dataset, but I am getting errors when trying to get the labels from the labeled images and map them to the input images with the function 'get_y_fn'. I want to train on this dataset with the fastai library in Google Colab.
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import fastai
from fastai import *
from fastai.vision import *
import pathlib
import os
from PIL import Image
import matplotlib.pyplot as plt
fnames = get_image_files(path_img)
lbl_names = get_image_files(path_lbl)
get_y_fn = lambda x: path_lbl/f'{x.stem}.png'
data = (SegmentationItemList.from_folder(path_img)
.random_split_by_pct()
.label_from_func(get_y_fn,classes=codes)
.transform(get_transforms(),size=128,tfm_y=True)
.databunch(bs=4))
TypeError Traceback (most recent call last)
<ipython-input-18-80efbaeba6e7> in <module>()
2 data = (SegmentationItemList.from_folder(path_img)
3 .split_by_rand_pct()
----> 4 .label_from_func(get_y_fn,classes=codes)
5 .transform(get_transforms(),size=128,tfm_y=True)
6 .databunch(bs=4))
3 frames
<ipython-input-10-44f94a438cac> in <lambda>(x)
----> 1 get_y_fn = lambda x: path_lbl/f'{x.stem}.png'
TypeError: unsupported operand type(s) for /: 'str' and 'str'
Going through your code in Google Colab, I found that you are using a string as a path, whereas the fastai code you are trying to reproduce uses path objects, not strings. You can simply replace:
get_y_fn = lambda x: path_lbl/f'{x.stem}_mask{x.suffix}'
with
get_y_fn = lambda x: path_lbl + "/" +f'{x.stem}_mask{x.suffix}'
This works because path_lbl is a string, not a path object.
You can also convert path_lbl from a string to a path object using Python's pathlib library.
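A minimal sketch of the pathlib variant follows. The folder name "labels" and the example file name are placeholders, since the actual value of path_lbl is not shown in the question.

```python
from pathlib import Path

# Placeholder label folder; wrap whatever string path_lbl currently holds.
path_lbl = Path("labels")

# With a Path on the left-hand side, the / operator joins path components.
get_y_fn = lambda x: path_lbl / f"{x.stem}_mask{x.suffix}"

label_path = get_y_fn(Path("images/img_001.png"))
print(label_path)
```

Wrapping the string once at the point where path_lbl is defined keeps the original lambda from the fastai examples unchanged.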

python cv2.imread return none on 6th image

I am trying to import and read all images in a folder. However, when I have more than 5 images, cv2.imread returns None for the 6th image. I have tried using different file names, different files, etc., but I can't get it to work.
import cv2
import numpy as np
import matplotlib.pyplot as plt
from tkinter import filedialog
import os
from mpl_toolkits.mplot3d import Axes3D
global scan_dir
scan_dir = filedialog.askdirectory()
print(scan_dir)
x=os.listdir(scan_dir)
img={}
print(x)
for i in range(0,len(x)):
    print(i)
    img[i] = cv2.imread(x[i], cv2.IMREAD_GRAYSCALE)
    indices[i] = np.where(img[i]<100)
I get the following error (None is the output of print(img[i]) on the 6th iteration of the loop):
None
Traceback (most recent call last):
File "C:\CodeRepository\US-3D\GettingCloser.py", line 55, in <module>
indices[i] = np.where(img[i]<100)
TypeError: '<' not supported between instances of 'NoneType' and 'int'
I have the same problem if I try this
global scan_dir
scan_dir = filedialog.askdirectory()
print(scan_dir)
x=os.listdir(scan_dir)
img = cv2.imread(x[5], cv2.IMREAD_GRAYSCALE)
It will return that img is None. This is true for anything beyond the 5th image.
Something must be wrong with that file; the dict itself should not cause an error that always appears on the same iteration. I have changed the loop so that it no longer throws the error, but you still need to debug that image:
for i in range(0,len(x)):
    print(i)
    img[i] = cv2.imread(x[i], cv2.IMREAD_GRAYSCALE)
    # cv2.imread returns None when the file cannot be read
    if img[i] is not None:
        indices[i] = np.where(img[i]<100)
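One thing worth checking while debugging: os.listdir returns bare file names, not full paths, so cv2.imread(x[i]) looks for each name in the current working directory rather than in scan_dir, and it returns None for any name it cannot find there. Whether that is the cause here is an assumption, but joining the names with the folder rules it out. The helper name list_full_paths and the temporary folder below are illustrative only.

```python
import os
import tempfile


def list_full_paths(scan_dir):
    """Return the entries of scan_dir as full paths, in sorted order."""
    return [os.path.join(scan_dir, name) for name in sorted(os.listdir(scan_dir))]


# Demonstration on a throwaway folder standing in for the chosen scan_dir.
with tempfile.TemporaryDirectory() as scan_dir:
    for name in ("scan_1.png", "scan_2.png"):
        open(os.path.join(scan_dir, name), "wb").close()
    paths = list_full_paths(scan_dir)
    all_exist = all(os.path.isfile(p) for p in paths)

print(all_exist)
```

In the original loop this would mean calling cv2.imread on each full path instead of on x[i].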

Not able to write the Count Vectorizer vocabulary

I want to save and load the count vectorizer vocabulary. This is my code:
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(max_features = 1500)
Cv_vec = cv.fit(X['review'])
X_cv=Cv_vec.transform(X['review']).toarray()
dictionary_filepath='CV_dict'
pickle.dump(Cv_vec.vocabulary_, open(dictionary_filepath, 'w'))
It shows me
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-407-3a9b06f969a9> in <module>()
1 dictionary_filepath='CV_dict'
----> 2 pickle.dump(Cv_vec.vocabulary_, open(dictionary_filepath, 'w'))
TypeError: write() argument must be str, not bytes
I want to save the vocabulary of the count vectorizer and load it. Can anyone help me with this, please?
Open the file in binary mode when pickling an object, and try to use a context manager, i.e.
import pickle
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(max_features = 1500)
Cv_vec = cv.fit(X['review'])
X_cv=Cv_vec.transform(X['review']).toarray()
dictionary_filepath='CV_dict.pkl'
# 'wb' opens the file for writing bytes, which pickle requires
with open(dictionary_filepath, 'wb') as fout:
    pickle.dump(Cv_vec.vocabulary_, fout)
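Since the question also asks about loading, here is a short round-trip sketch. A plain dict stands in for Cv_vec.vocabulary_ (which maps terms to column indices), and the file is written to a temporary folder so the example does not touch real data.

```python
import os
import pickle
import tempfile

# Stand-in for Cv_vec.vocabulary_: a mapping from term to feature index.
vocabulary = {"good": 0, "bad": 1, "movie": 2}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "CV_dict.pkl")

    # Save: binary write mode, matching the answer above.
    with open(path, "wb") as fout:
        pickle.dump(vocabulary, fout)

    # Load: binary read mode is required for the same reason.
    with open(path, "rb") as fin:
        restored = pickle.load(fin)

print(restored == vocabulary)
```

The restored mapping can then be passed to a new vectorizer through the vocabulary parameter of CountVectorizer, so that it produces the same columns as the original.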
