How can I display percentages on the y-axis? Doesn't stat='frequency' do this?
import matplotlib.pyplot as plt
import seaborn as sns
data = [6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 2, 1, 5, 1, 8, 4, 1, 6, 3, 9, 4, 2, 1, 6, 1, 9, 4, 2, 7, 3, 1, 5, 1, 8, 4, 2, 1, 5, 1, 8, 4, 2, 1, 9, 2, 1, 7, 2, 1, 5, 2, 8, 4, 2, 6, 3, 9, 4, 2, 1, 5, 2, 1, 4, 2, 6, 3, 1, 5, 2, 1, 6, 3, 9, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 2, 1, 5, 1, 8, 4, 1, 6, 3, 9, 4, 2, 1, 6, 1, 9, 4, 2, 7, 3, 1, 5, 1, 8, 4, 2, 1, 5, 1, 8, 4, 2, 1, 9, 4, 1, 7, 3, 1, 5, 1, 8, 2, 1, 6, 3, 9, 4, 2, 1, 5, 2, 1, 4, 2, 6, 3, 1, 5, 2, 1, 6, 3, 9, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 2, 1, 5, 1, 8, 4, 1, 6, 3, 9, 4, 2, 1, 6, 1, 9, 4, 2, 7, 3, 1, 5, 1, 8, 4, 2, 1, 5, 1, 8, 4, 2, 1, 9, 2, 1, 4, 2, 1, 5, 1, 8, 2, 1, 6, 3, 9, 4, 2, 1, 5, 2, 1, 4, 2, 6, 3, 1, 5, 2, 1, 6, 3, 9, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 2, 1, 5, 1, 8, 4, 1, 6, 3, 9, 4, 2, 1, 6, 1, 9, 4, 2, 7, 3, 1, 5, 1, 8, 4, 2, 1, 5, 1, 8, 4, 2, 1, 9, 4, 2, 7, 3, 1, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 1, 5, 1, 8, 4, 2, 6, 3, 1, 5, 2, 1, 6, 3, 9, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 
4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 2, 1, 5, 1, 8, 4, 1, 6, 3, 9, 4, 2, 1, 6, 1, 9, 4, 2, 7, 3, 1, 5, 1, 8, 4, 2, 1, 5, 1, 8, 4, 2, 1, 9, 2, 1, 7, 3, 1, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 1, 5, 1, 8, 4, 2, 6, 3, 1, 5, 2, 1, 6, 3, 9, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 2, 1, 5, 1, 8, 4, 1, 6, 3, 9, 4, 2, 1, 6, 1, 9, 4, 2, 7, 3, 1, 5, 1, 8, 4, 2, 1, 5, 1, 8, 4, 2, 1, 9, 4, 1, 7, 2, 1, 5, 1, 8, 4, 2, 1, 5, 1, 7, 3, 1, 5, 1, 8, 4, 2, 6, 3, 1, 5, 2, 1, 6, 3, 9, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 9, 2, 1, 4, 2, 6, 3, 9, 4, 1, 7, 3, 1, 9, 2, 1, 4, 2, 1, 3, 1, 4, 2, 7, 3, 1, 5, 2, 7, 3, 1, 5, 1, 8, 4, 1, 6, 3, 1, 5, 2, 7, 3, 1, 5, 2, 8, 4, 1, 6, 3, 9, 4, 1, 7, 2, 1, 3, 1, 4, 2, 7, 3, 1, 9, 2, 1, 4]
fig, ax = plt.subplots()
ax = sns.histplot(data, discrete=True, kde=False, stat='frequency')
plt.show()
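As far as I know, stat='frequency' divides each count by the bin width, so with discrete=True (bin width 1) it shows plain counts rather than percentages. A minimal sketch of the percentages one would expect on the y-axis, computed by hand with the standard library; `percent_per_value` and `sample` are hypothetical names, and `sample` is only a short stand-in for the data above:

```python
from collections import Counter

def percent_per_value(values):
    """Return {value: share of all observations, in percent}."""
    counts = Counter(values)
    total = len(values)
    return {v: 100 * c / total for v, c in sorted(counts.items())}

sample = [6, 3, 1, 5, 2, 7, 3, 1, 9, 2]  # short stand-in for the data above
shares = percent_per_value(sample)
for value, pct in shares.items():
    print(f"{value}: {pct:.1f}%")
```

Passing stat='percent' to sns.histplot (available in seaborn 0.11.2 and later) should put values normalized this way on the y-axis, with the bar heights summing to 100.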
I used the PyTorch DataLoader to create my batch-data loader, but I ran into a problem.
The documentation defines the DataLoader shuffle parameter as follows:
shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False)
So with shuffle=True the data is reshuffled at every epoch.
But even with shuffle set to False, I expected to get a different batch on every iteration within the same epoch.
import torchvision
from torch.utils import data
from torchvision.transforms import ToTensor
testData = torchvision.datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)
CurrentFoldTestDataLoader = data.DataLoader(testData, batch_size=32, shuffle=False)
for i in range(1000):
    test_features, test_labels = next(iter(CurrentFoldTestDataLoader))
    print(i,test_labels)
Here I got the same batch in every iteration.
0 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
1 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
2 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
3 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
4 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
5 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
6 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
7 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
8 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
9 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
10 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
Why is this? Is my understanding of the definition of shuffle inaccurate?
The problem with your code is that next(iter(...)) creates a new iterator on every step of the for loop. With shuffle=False, every fresh iterator starts at the beginning of the dataset, so it always yields the same first batch. Create the loader once and iterate over it directly instead:
loader = data.DataLoader(testData, batch_size=32, shuffle=False)
for i, batch in enumerate(loader):
    test_features, test_labels = batch
    print(i, test_labels)
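The same effect can be reproduced without PyTorch. This is only a sketch: `Loader` and `batches` are hypothetical stand-ins that mimic an unshuffled loader, not the real DataLoader implementation.

```python
def batches(items, batch_size):
    """Yield consecutive slices of items, like an unshuffled loader."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

class Loader:
    """Hypothetical stand-in: each iter() call restarts from the beginning."""
    def __init__(self, items, batch_size):
        self.items = items
        self.batch_size = batch_size

    def __iter__(self):
        return batches(self.items, self.batch_size)

loader = Loader(list(range(10)), batch_size=4)

# A fresh iterator on every step: always the first batch.
repeated = [next(iter(loader)) for _ in range(3)]

# One iterator, consumed to the end: every batch exactly once.
sequential = list(loader)

print(repeated)
print(sequential)
```

Calling iter() once and reusing the resulting iterator (or simply looping over the loader) is what advances through the batches.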
Let's say I have defined a function to generate a list of 6 random integers ranging from 0 to 10
import random
def func():
    randomlist = random.sample(range(11), 6)
    return randomlist
Run:
func()
Output:
[3, 7, 10, 9, 4, 1]
Now I want to define another function that takes a parameter n and calls func() to build a nested list: each element is replaced by a newly created list generated by func(), and n is the number of nesting levels, so that the total number of integers in the result is 6^n. For example:
when n=1, expected result:
[3, 7, 10, 9, 4, 1]
when n=2, expected result:
[[6, 2, 9, 1, 4, 0],
[7, 8, 1, 9, 4, 1],
[1, 0, 4, 6, 3, 1],
[9, 4, 3, 8, 6, 7],
[2, 4, 3, 9, 5, 6],
[4, 7, 2, 0, 1, 8]]
when n=3, expected result:
[[[4, 7, 3, 0, 2, 8],[1, 5, 6, 5, 4, 8],[8, 9, 6, 5, 10, 4],[7, 8, 6, 6, 4, 10],[7, 8, 1, 0, 2, 3],[4, 5, 8, 5, 4, 6]],
[[1, 7, 2, 0, 2, 8],[4, 5, 8, 8, 4, 5],[9, 5, 6, 2, 1, 3],[5, 4, 1, 2, 6, 10],[7, 5, 4, 1, 1, 4],[9, 6, 5, 2, 2, 1]],
[[8, 2, 7, 10, 2, 7],[8, 9, 5, 4, 5, 5],[5, 8, 7, 7, 4, 6],[9, 5, 9, 10, 5, 4],[1, 4, 5, 6, 5, 7],[9, 8, 7, 6, 5, 1]],
[[0, 7, 4, 0, 1, 9],[4, 7, 3, 0, 2, 8],[8, 9, 6, 5, 10, 4],[8, 2, 7, 10, 2, 7],[7, 5, 4, 1, 1, 4],[7, 8, 1, 9, 4, 1]],
[[9, 5, 3, 9, 2, 8],[8, 9, 6, 5, 10, 4],[9, 4, 3, 8, 6, 7],[3, 7, 10, 9, 4, 1],[4, 7, 3, 0, 2, 8],[9, 4, 3, 8, 6, 7]],
[[5, 3, 4, 5, 2, 10],[7, 5, 4, 1, 1, 4],[4, 7, 3, 0, 2, 8],[4, 5, 8, 8, 4, 5],[7, 8, 1, 9, 4, 1],[8, 2, 7, 10, 2, 7]]]
I think I need a recursive function, but I have no idea where to start. Any insights? Thanks a lot.
The base case n = 1 just calls your func(); otherwise, build a list of 6 recursive calls on n-1, which gives 6^n integers in total:
def recur_6(n):
    if n <= 1:
        return func()
    return [recur_6(n-1) for _ in range(6)]
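One way to sanity-check a recursive builder like this is to count the integers it produces and compare against 6^n. A minimal sketch, assuming 6 entries per nesting level as in the expected results; `nested` and `count_leaves` are hypothetical helper names:

```python
import random

def func():
    # Same generator as in the question: 6 distinct integers from 0..10.
    return random.sample(range(11), 6)

def nested(n):
    """Hypothetical helper: n levels of nesting, 6 entries per level."""
    if n <= 1:
        return func()
    return [nested(n - 1) for _ in range(6)]

def count_leaves(obj):
    """Count the integers in an arbitrarily nested list."""
    if isinstance(obj, list):
        return sum(count_leaves(item) for item in obj)
    return 1

for n in (1, 2, 3):
    # Prints n, the number of integers found, and the expected 6**n.
    print(n, count_leaves(nested(n)), 6 ** n)
```

If the two counts differ for n = 3, the recursion is branching too many or too few times per level.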
I'm hoping to calculate the distances between two points in a (Nx1) numpy array, i.e.:
a = [2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1]
I'm hoping to get a square matrix with the (normed) distances between each point:
sq = [[0, |2-5|, |2-5|, |2-12|, |2-5|, ...],
[|5-2|, 0, ...], ...]
So far, what I have doesn't work and gives wrong values for the square distance matrix. Is there also a way to vectorise my method (I'm not sure if that is the correct term)? I'm unfamiliar with advanced indexing.
What I currently have is the following:
sq = np.zero((len(a), len(a))
for i in a:
    for j in len(a+1):
        sq[i,j] = np.abs(a[:,0] - a[:,0])
Would appreciate any help!
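I think the fastest solution is to exploit numpy broadcasting: subtracting a column vector from its transpose (a row vector) yields every pairwise difference at once:
import numpy as np
a = [2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1]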
a = np.array(a).reshape(-1,1)
sq = np.abs(a.T-a)
sq
array([[ 0, 3, 3, 10, 3, 1, 8, 6, 1, 1, 1],
[ 3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4],
[ 3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4],
[10, 7, 7, 0, 7, 9, 2, 4, 11, 9, 11],
[ 3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4],
[ 1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2],
[ 8, 5, 5, 2, 5, 7, 0, 2, 9, 7, 9],
[ 6, 3, 3, 4, 3, 5, 2, 0, 7, 5, 7],
[ 1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0],
[ 1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2],
[ 1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0]])
With numpy, the following may be the shortest way to get your result:
import numpy as np
a = np.array([2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1])
sq = np.array([np.array([(np.abs(i - j)) for j in a]) for i in a])
print(sq)
The following would give you the desired result without numpy.
a = [2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1]
sq = []
for i in a:
    distances = []
    for j in a:
        distances.append(abs(i-j))
    sq.append(distances)
print(sq)
With both, the result comes as:
[[0, 3, 3, 10, 3, 1, 8, 6, 1, 1, 1], [3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4], [3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4], [10, 7, 7, 0, 7, 9, 2, 4, 11, 9, 11], [3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4], [1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2], [8, 5, 5, 2, 5, 7, 0, 2, 9, 7, 9], [6, 3, 3, 4, 3, 5, 2, 0, 7, 5, 7], [1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0], [1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2], [1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0]]
There may be more than one way to do this, but one way is to use only numpy operations instead of Python loops, because numpy runs array operations in compiled code.
One way to do this using only array operations is to create an NxN matrix b in which row i repeats the i-th element of the original array (a) N times.
E.g:
a = [1, 2, 3]
b = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
Then subtracting a from b broadcasts a across every row, so you can take the element-wise absolute difference:
ans = abs(b - a)
Assuming a is a 1-D numpy array, you can do:
b = np.repeat(a, a.shape[0]).reshape((a.shape[0], a.shape[0]))
ans = np.abs(b - a)
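The approaches in this thread can be cross-checked against each other. A small sketch, assuming a is a 1-D integer array; the variable names `by_broadcast`, `by_repeat` and `by_loops` are made up for illustration:

```python
import numpy as np

a = np.array([2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1])

# Broadcasting: (N, 1) column minus (1, N) row -> (N, N) differences.
by_broadcast = np.abs(a[:, None] - a[None, :])

# Repeat-based construction: row i of b holds a[i] repeated N times.
n = a.shape[0]
b = np.repeat(a, n).reshape(n, n)
by_repeat = np.abs(b - a)

# Plain nested loops as a reference.
by_loops = np.array([[abs(i - j) for j in a] for i in a])

print(np.array_equal(by_broadcast, by_repeat))
print(np.array_equal(by_broadcast, by_loops))
```

The result is symmetric with a zero diagonal, which is a quick check that the rows and columns were not mixed up.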
I'm using the scipy.stats.randint to get random numbers.
Here is my source code and result.
Input:
from scipy.stats import randint
randint.rvs(0.00001, 10, size=100)
Output:
array([6, 4, 6, 7, 9, 7, 3, 0, 2, 5, 1, 1, 0, 3, 6, 7, 3, 6, 4, 8, 6, 5,
0, 0, 5, 1, 3, 2, 3, 1, 0, 6, 5, 2, 0, 0, 9, 1, 5, 2, 3, 6, 1, 4,
3, 1, 4, 4, 9, 5, 6, 3, 4, 3, 7, 7, 2, 4, 0, 2, 0, 6, 8, 1, 5, 6,
4, 6, 5, 0, 8, 8, 5, 9, 3, 2, 8, 7, 1, 4, 6, 0, 7, 3, 9, 1, 2, 7,
7, 6, 4, 3, 3, 3, 4, 7, 7, 4, 1, 1])
My question is: I set low to 0.00001, so how can 0s appear in the output?
Thanks for your help.
SciPy's randint invokes mtrand.randint, which is part of the NumPy package.
As you can see from its source code, the lower bound is truncated to an integer using (int)(low), so 0.00001 becomes 0.
So, to get random numbers from the closed interval [1, 10], do the following:
randint.rvs(1, 11, size=100)
Note that you need to increase the high bound by 1 because high is exclusive, as can be seen from the probability mass function (pmf) of randint.
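The truncation itself can be reproduced in plain Python. This sketch does not call SciPy; it uses random.randrange only as a stand-in that follows the same half-open [low, high) convention:

```python
import random

low = 0.00001
print(int(low))  # truncation toward zero turns the bound into 0

# Stand-in for the half-open convention: randrange(1, 11) draws from 1..10.
draws = [random.randrange(1, 11) for _ in range(1000)]
print(min(draws) >= 1, max(draws) <= 10)
```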