For example, I have a 3D ndarray of shape (10,10,10), and whenever I try to set all the cells in the section [5,:,9] to a single value, I also end up changing values in the section [4,:,9], which makes no sense to me. I do not get this behavior when I convert to a list of lists.
I use a simple for loop:
for i in range(0,10):
    matrix[5,i,9] = matrix[5,9,9]
Is there any way to avoid this? I do not get this behavior when using a list of lists, but I don't want to convert back and forth between the two because it takes too much processing time.
Doesn't happen that way for me:
In [232]: arr = np.ones((10,10,10),int)
In [233]: arr[5,9,9] = 10
In [234]: for i in range(10): arr[5,i,9]=arr[5,9,9]
In [235]: arr[5,:,9]
Out[235]: array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10])
In [236]: arr[4,:,9]
Out[236]: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
or assigning a whole "column" at once:
In [237]: arr[5,:,9] = np.arange(10)
In [239]: arr[5]
Out[239]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 3],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 4],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 5],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 6],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 7],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 8],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 9]])
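The loop can also be replaced by a single broadcast assignment. Here is a minimal sketch that reuses the array shape from the question (the name `arr` and the fill values are only illustrative) and checks that no other cell is touched:

```python
import numpy as np

arr = np.ones((10, 10, 10), dtype=int)
arr[5, 9, 9] = 10

# Broadcast the scalar arr[5, 9, 9] across the whole [5, :, 9] slice.
arr[5, :, 9] = arr[5, 9, 9]

# Indices of every cell that no longer holds the initial value 1.
changed = np.argwhere(arr != 1)
print(changed.shape)   # (10, 3): only the ten targeted cells changed
print(arr[4, :, 9])    # still all ones
```

If a neighbouring slice does change in your own code, one thing worth checking (an assumption, since the construction of `matrix` is not shown in the question) is whether the array was built from views that share memory; `np.shares_memory` can confirm that.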
I am currently implementing a genetic algorithm. I have written my crossover and mutation methods, and I am now writing my fitness method.
I need to convert my list of 0s and 1s to decimal values for calculating distance.
The output I am currently working with is a list of lists of integer 1s and 0s (example below):
[[0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1], [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1]]
<class 'list'>
I want to convert each group of 4 bits to its decimal equivalent.
I have tried splitting the list into groups of 4 and then calling a binaryToDecimal function to convert the bits to decimal values. However, I'm getting the error "TypeError: 'numpy.ndarray' object is not callable".
I have summarized my code and this is what it looks like so far.
def converting_binary_to_decimal(L):
    output = []
    for l in L:
        l = list(map(str, l))
        sub_output = []
        for j in range(0, len(l)-1, 4):
            sub_output.append(int(''.join(l[j:j+4]), 2))
        output.append(sub_output)
    return output

def chunks(L, n):
    for i in range(0, len(L), n):
        yield L[i:i+n]

def fitness(child):
    newList1=list(chunks(child[0], 4))
    newList2=list(chunks(child[1], 4))

if __name__ == "__main__":
    myFitness = fitness(afterMU)
A sample output of what I want is:
[[0, 13, 6, 8, 12, 8, 10, 5, 15], [0, 8, 7, 0, 4, 4, 1, 8, 15]]
Try this code.
def converting_binary_to_decimal(L):
    output = []
    for l in L:
        l = list(map(str, l))
        sub_output = []
        for j in range(0, len(l)-1, 4):
            sub_output.append(int(''.join(l[j:j+4]), 2))
        output.append(sub_output)
    return output

L = [[0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1], [0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]]
converting_binary_to_decimal(L)
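As an alternative to joining strings, the same conversion can be sketched with NumPy by reshaping each row into groups of 4 bits and weighting them by powers of two. The function name `bits_to_decimal` and its `width` parameter are illustrative, and the sketch assumes every row has the same length and that this length is a multiple of `width`:

```python
import numpy as np

def bits_to_decimal(bit_lists, width=4):
    """Convert each row of 0/1 values to integers, `width` bits at a time."""
    bits = np.asarray(bit_lists)
    # One row per input list, one sub-row per group: shape (rows, groups, width).
    groups = bits.reshape(bits.shape[0], -1, width)
    # Place values for most-significant-bit-first groups, e.g. [8, 4, 2, 1].
    weights = 2 ** np.arange(width - 1, -1, -1)
    return (groups @ weights).tolist()

example = [[0, 1, 1, 0, 1, 1, 1, 1], [1, 0, 0, 1, 0, 0, 0, 1]]
result = bits_to_decimal(example)
print(result)  # [[6, 15], [9, 1]]
```

Because the result is converted back with `.tolist()`, the rest of the code can keep working with plain Python lists.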
I think I figured it out.
x=[0, 1, 1, 0]
k = 4
n = len(x)//k
for i in range(n):
    y = x[i*k:(i+1)*k]
    y = [str(j) for j in y]
    y = ''.join(y)
    y = int(y,2)
    print(y)
Thank you.
I am currently working on detecting outliers in my dataset using Isolation Forest in Python, and I did not completely understand the example and explanation given in the scikit-learn documentation.
Is it possible to use Isolation Forest to detect outliers in my dataset that has 258 rows and 10 columns?
Do I need a separate dataset to train the model? If yes, is it necessary to have that training dataset free from outliers?
This is my code:
import numpy as np
from sklearn.ensemble import IsolationForest
rng = np.random.RandomState(42)
X = 0.3*rng.randn(100,2)
X_train = np.r_[X+2,X-2]
clf = IsolationForest(max_samples=100, random_state=rng, contamination='auto')
clf.fit(X_train)
y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)  # X_test is not defined in this snippet
print(len(y_pred_train))
I tried by loading my dataset to X_train but that does not seem to work.
Do I need a separate dataset to train the model?
The short answer is "No". You train and predict outliers on the same data.
IsolationForest is an unsupervised learning algorithm intended to clean your data of outliers (see the docs for more). In a typical machine learning setting, you would run it to clean your training dataset. As far as your toy example is concerned:
import numpy as np
from sklearn.ensemble import IsolationForest
rng = np.random.RandomState(42)
X = 0.3*rng.randn(100,2)
X_train = np.r_[X+2,X-2]
# The original answer also passed behaviour="new"; scikit-learn deprecated that parameter in 0.22 and removed it in 0.24.
clf = IsolationForest(max_samples=100, random_state=rng, contamination=.1)
clf.fit(X_train)
y_pred_train = clf.predict(X_train)
y_pred_train
array([ 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1,
1, -1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, -1, 1, -1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, -1, 1, -1, 1, 1, 1, 1, 1, -1, -1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
-1, 1, 1, -1, 1, 1, 1, 1, -1, -1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
where 1 represents inliers and -1 represents outliers. As specified by the contamination parameter, the fraction of outliers is 0.1.
Finally, you would remove outliers like:
X_train_cleaned = X_train[y_pred_train == 1]
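For a dataset of the size mentioned in the question (258 rows, 10 columns), the same workflow might look like the sketch below. The data here is synthetic and only stands in for the real dataset, and the shifted rows, seed, and parameter values are assumptions chosen for illustration. It uses `fit_predict` to train and label in one step, and `score_samples` to rank rows by how anomalous they look:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for a 258 x 10 dataset (replace with your own array).
rng = np.random.RandomState(0)
X = rng.randn(258, 10)
X[:5] += 6  # shift a few rows far from the rest so they look anomalous

clf = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
labels = clf.fit_predict(X)      # 1 = inlier, -1 = outlier
scores = clf.score_samples(X)    # lower score = more anomalous

# Keep only the rows labelled as inliers.
X_clean = X[labels == 1]
print(X.shape, X_clean.shape)
```

With `contamination="auto"` the threshold comes from the algorithm's default offset rather than from a fixed outlier fraction, so the number of flagged rows depends on the data.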
I've got a 3D NumPy bit array and I need to pack the bits along the third axis, which is exactly what numpy.packbits does. Unfortunately, it only packs to uint8 and I need wider values. Is there a similar way to pack to uint16 or uint32?
Depending on your machine's endianness it is either a matter of simple view casting or of byte swapping and then view casting:
>>> a = np.random.randint(0, 2, (4, 16))
>>> a
array([[1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
       [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1],
       [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1]])
>>> np.packbits(a.reshape(-1, 2, 8)[:, ::-1]).view(np.uint16)
array([53226, 23751, 25853, 64619], dtype=uint16)
# check:
>>> [bin(x + (1<<16))[-16:] for x in _]
['1100111111101010', '0101110011000111', '0110010011111101', '1111110001101011']
You may have to reshape in the end.
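A variant that should not depend on the machine's endianness is to view the packed bytes with an explicitly big-endian dtype and then convert to the native one. This is a minimal sketch, not the answer's original method: the helper name `pack_last_axis` and its `width` parameter are made up for illustration, and it assumes the last axis length is a multiple of `width`:

```python
import numpy as np

def pack_last_axis(bits, width=16):
    """Pack groups of `width` bits along the last axis into unsigned integers."""
    if width not in (16, 32, 64):
        raise ValueError("width must be 16, 32 or 64")
    bits = np.asarray(bits, dtype=np.uint8)
    if bits.shape[-1] % width:
        raise ValueError("last axis length must be a multiple of width")
    packed = np.packbits(bits, axis=-1)           # bytes, most significant bit first
    big_endian = packed.view(f">u{width // 8}")   # reinterpret byte groups as big-endian ints
    return big_endian.astype(f"=u{width // 8}")   # convert to the native byte order

a = np.array([[1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0],
              [0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1],
              [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1]])
packed16 = pack_last_axis(a, 16)
print(packed16[:, 0])  # [53226 23751 25853 64619]
```

For a 3D input of shape `(x, y, 32)`, `pack_last_axis(bits, 32)` returns shape `(x, y, 1)`, so indexing with `[..., 0]` drops the trailing axis.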
I have SAR CEOS format files which consist of data file, leader file, null volume directory file and volume directory file.
I am reading the data file using gdal ReadAsArray and then I am doing operations on this 2d Array and now I want to save this 2d array as an ENVI binary file.
Kindly guide how to do this in Python 3.5.
The Python GDAL/OGR Cookbook has tutorials that can help: https://pcjericks.github.io/py-gdalogr-cookbook/
For example, the following array-to-raster code uses the ENVI driver:
from osgeo import gdal, osr
import numpy as np

def array2raster(newRasterfn,rasterOrigin,pixelWidth,pixelHeight,array):
    cols = array.shape[1]
    rows = array.shape[0]
    originX = rasterOrigin[0]
    originY = rasterOrigin[1]
    driver = gdal.GetDriverByName('ENVI')  # writes a raw binary file plus a .hdr header
    # GDT_Byte stores 8-bit values; pick the GDAL type that matches your array's dtype
    outRaster = driver.Create(newRasterfn, cols, rows, 1, gdal.GDT_Byte)
    outRaster.SetGeoTransform((originX, pixelWidth, 0, originY, 0, pixelHeight))
    outband = outRaster.GetRasterBand(1)
    outband.WriteArray(array)
    outRasterSRS = osr.SpatialReference()
    outRasterSRS.ImportFromEPSG(4326)
    outRaster.SetProjection(outRasterSRS.ExportToWkt())
    outband.FlushCache()
    outRaster = None  # close the dataset so everything is written to disk

def main(newRasterfn,rasterOrigin,pixelWidth,pixelHeight,array):
    reversed_arr = array[::-1] # reverse array so the raster looks like the array
    array2raster(newRasterfn,rasterOrigin,pixelWidth,pixelHeight,reversed_arr) # convert array to raster

if __name__ == "__main__":
    rasterOrigin = (-123.25745,45.43013)
    pixelWidth = 10
    pixelHeight = 10
    newRasterfn = 'test.dat'  # ENVI data file; the header is written alongside it
    array = np.array([[ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                      [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                      [ 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1],
                      [ 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1],
                      [ 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1],
                      [ 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1],
                      [ 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1],
                      [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                      [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                      [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
    main(newRasterfn,rasterOrigin,pixelWidth,pixelHeight,array)
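If georeferencing is not needed, an ENVI file can also be sketched without GDAL, since the format is a flat binary file accompanied by a small text header. The helper `write_envi` below and its dtype table are illustrative assumptions rather than part of any library; it writes a single band in little-endian byte order and fills in only the basic header fields:

```python
import os
import tempfile
import numpy as np

# ENVI "data type" codes for a few common NumPy dtypes.
ENVI_DTYPE_CODES = {
    "uint8": 1, "int16": 2, "int32": 3, "float32": 4,
    "float64": 5, "complex64": 6, "complex128": 9, "uint16": 12,
}

def write_envi(path, array):
    """Write a 2D array as flat binary plus an ENVI .hdr file; return the header path."""
    array = np.ascontiguousarray(array)
    if array.ndim != 2:
        raise ValueError("expected a 2D array")
    code = ENVI_DTYPE_CODES[array.dtype.name]
    rows, cols = array.shape
    # Force little-endian bytes so the header can state "byte order = 0".
    array.astype(array.dtype.newbyteorder("<"), copy=False).tofile(path)
    header_path = os.path.splitext(path)[0] + ".hdr"
    with open(header_path, "w") as f:
        f.write("ENVI\n")
        f.write(f"samples = {cols}\n")
        f.write(f"lines = {rows}\n")
        f.write("bands = 1\n")
        f.write("header offset = 0\n")
        f.write("file type = ENVI Standard\n")
        f.write(f"data type = {code}\n")
        f.write("interleave = bsq\n")
        f.write("byte order = 0\n")
    return header_path

# Round trip on a small float32 array in a temporary directory.
demo = np.arange(12, dtype=np.float32).reshape(3, 4)
data_path = os.path.join(tempfile.mkdtemp(), "demo.dat")
header_path = write_envi(data_path, demo)
restored = np.fromfile(data_path, dtype="<f4").reshape(demo.shape)
with open(header_path) as f:
    header_text = f.read()
```

Unlike the `gdal.GDT_Byte` example, this keeps the array's own dtype, which matters for SAR data stored as floating-point or complex values.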