genfromtxt return numpy array not separated by comma - python-3.x

I have a *.csv file that store two columns of float data.
I am using this function to import it but it generates the data not separated with comma.
data=np.genfromtxt("data.csv", delimiter=',', dtype=float)
output:
[[ 403.14915 150.560364 ]
[ 403.7822265 135.13165 ]
[ 404.5017 163.4669 ]
[ 434.02465 168.023224 ]
[ 373.7655 177.904114 ]
[ 450.608429 208.4187315]
[ 454.39475 239.9666595]
[ 453.8055 248.4082 ]
[ 457.5625305 247.70315 ]
[ 451.729431 258.19335 ]
[ 366.74405 225.169922 ]
[ 377.0055235 258.110077 ]
[ 380.3581 261.760071 ]
[ 383.98615 262.33805 ]
[ 388.2516785 272.715332 ]
[ 408.378174 200.9713135]]
How to format it to get a numpy array like
[[ 403.14915, 150.560364 ]
[ 403.7822265, 135.13165 ],....]
?

NumPy doesn't display commas when you print arrays. If you really want to see them, you can use
print(repr(data))
The repr function forces a str representation not ment for "nice" printing, but for the literal representation you would use yourself to type the data in your code.

Related

Convert an array of array to array of JSON

I have an array of arrays such as this:
pl = [
["name1", "address1"],
["name2", ["address2"],
["name3", "address3"]
....
]
but I need to convert it into an array of objects:
pl = [
{"name1": "address1"},
{"name2": ["address2"},
{"name3": "address3"}
....
]
I'm struggling, with no luck.
Docs: https://docs.python.org/3/library/json.html
Example from docs:
import json
json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])

Changing the values of matrix is changing the weights of the model

I am working with neural network weights and I am seeing a weird thing. I have written this code:
x = list(mnist_classifier.named_parameters())
weight = x[0][1].detach().cpu().numpy().squeeze()
print(weight)
So I get the following values:
[[[-0.2435195 0.05255396 -0.32765684]
[ 0.06372751 0.03564635 -0.31417745]
[ 0.14694464 -0.03277654 -0.10328879]]
[[-0.13716389 0.0128522 0.24107361]
[ 0.45231998 0.15497956 0.11112727]
[ 0.18206735 -0.22820294 -0.29146808]]
[[ 1.1747813 0.9206593 0.49848938]
[ 1.1558323 1.0859997 0.7743778 ]
[ 1.0287125 0.52122927 0.4096022 ]]
[[-0.2980809 -0.04358199 -0.26461622]
[-0.1165191 -0.2267315 0.37054354]
[ 0.4429275 0.44967037 0.06866694]]
[[ 0.39549246 0.10898255 0.32859102]
[-0.07753246 0.1628792 0.03021396]
[ 0.323148 0.5103844 0.16282919]]
....
Now, when I change the value of the first matrix weight[0] to 0.1, it changes the values of the original weights:
x = list(mnist_classifier.named_parameters())
weight = x[0][1].detach().cpu().numpy().squeeze()
weight[0] = weight[0] * 0 + 0.1
print(list(mnist_classifier.named_parameters()))
[('conv1.weight', Parameter containing:
tensor([[[[ 0.1000, 0.1000, 0.1000],
[ 0.1000, 0.1000, 0.1000],
[ 0.1000, 0.1000, 0.1000]]],
[[[-0.1372, 0.0129, 0.2411],
[ 0.4523, 0.1550, 0.1111],
[ 0.1821, -0.2282, -0.2915]]],
[[[ 1.1748, 0.9207, 0.4985],
[ 1.1558, 1.0860, 0.7744],
[ 1.0287, 0.5212, 0.4096]]],
...
What is going on here? How is weight[0] connected to the neural network?
I found the answer. Apparently, when copying np arrays, you are supposed to use copy() otherwise it's a pass-by reference. So using copy() helped.

Reconstructing Map in Groovy

May I ask is it possible in groovy to transform the map recursively by removing all the key and only keep the value for maps with certain key prefix (e.g. pk-).
Example from this:
[
applications:[
pk-name-AppA:[name:AppA, keyA1:valueA1Prod, keyA2:valueA2],
pk-name-AppB:[name:AppB, keyB1:valueB1Prod, keyB2:valueB2],
otherAppC:[name:AppC, keyA1:valueA1Prod, keyA2:valueA2]
]
]
to
[
applications:[
[name:AppA, keyA1:valueA1Prod, keyA2:valueA2],
[name:AppB, keyB1:valueB1Prod, keyB2:valueB2],
otherAppC:[name:AppC, keyA1:valueA1Prod, keyA2:valueA2]
]
]

logstash parse complex message from Telegram

I'm processing through Telegram history (txt file) and I need to extract & process quite complex (nested) multiline pattern.
Here's the whole pattern
Free_Trade_Calls__AltSignals:IOC/ BTC (bittrex)
BUY : 0.00164
SELL :
TARGET 1 : 0.00180
TARGET 2 : 0.00205
TARGET 3 : 0.00240
STOP LOS : 0.000120
2018-04-19 15:46:57 Free_Trade_Calls__AltSignals:TARGET
basically I am looking for a pattern starting with
Free_Trade_Calls__AltSignals: ^%(
and ending with a timestamp.
Inside that pattern (telegram message)
- exchange - in brackets in the 1st line
- extract value after BUY
- SELL values in a array of 3 SELL[3] : target 1-3
- STOP loss value (it can be either STOP, STOP LOSS, STOP LOS)....
I've found this Logstash grok multiline message but I am very new to logstash firend advised it to me. I was trying to parse this text in NodeJS but it really is pain in the ass and mad about it.
Thanks Rob :)
Since you need to grab values from each line, you don't need to use multi-line modifier. You can skip empty line with %{SPACE} character.
For your given log, this pattern can be used,
Free_Trade_Calls__AltSignals:.*\(%{WORD:exchange}\)\s*BUY\s*:\s*%{NUMBER:BUY}\s*SELL :\s*TARGET 1\s*:\s*%{NUMBER:TARGET_1}\s*TARGET 2\s*:\s*%{NUMBER:TARGET_2}\s*TARGET 3\s*:\s*%{NUMBER:TARGET_3}\s*.*:\s*%{NUMBER:StopLoss}
please note that \s* equals to %{SPACE}
It will output,
{
"exchange": [
[
"bittrex"
]
],
"BUY": [
[
"0.00164"
]
],
"BASE10NUM": [
[
"0.00164",
"0.00180",
"0.00205",
"0.00240",
"0.000120"
]
],
"TARGET_1": [
[
"0.00180"
]
],
"TARGET_2": [
[
"0.00205"
]
],
"TARGET_3": [
[
"0.00240"
]
],
"StopLoss": [
[
"0.000120"
]
]
}

Hierarchical Clustering with cosine similarity metric in fcluster package

I use scipy.cluster.hierarchy to do a hierarchical clustering on a set of points using "cosine" similarity metric. As an example, I have:
import scipy.cluster.hierarchy as hac
import matplotlib.pyplot as plt
Points =
np.array([[ 0. , 0.23508573],
[ 0.00754775 , 0.26717266],
[ 0.00595464 , 0.27775905],
[ 0.01220563 , 0.23622067],
[ 0.00542628 , 0.14185873],
[ 0.03078922 , 0.11273108],
[ 0.06707743 ,-0.1061131 ],
[ 0.04411757 ,-0.10775407],
[ 0.01349434 , 0.00112159],
[ 0.04066034 , 0.11639591],
[ 0. , 0.29046682],
[ 0.07338036 , 0.00609912],
[ 0.01864988 , 0.0316196 ],
[ 0. , 0.07270636],
[ 0. , 0. ]])
z = hac.linkage(Points, metric='cosine', method='complete')
labels = hac.fcluster(z, 0.1, criterion="distance")
plt.scatter(Points[:, 0], Points[:, 1], c=labels.astype(np.float))
plt.show()
Since I use cosine metric, in some cases the dot product of two vectors can be negative or norm of some vectors can be zero. It means z output will have some negative or infinite elements which is not valid for fcluster (as below):
z =
[[ 0.00000000e+00 1.00000000e+01 0.00000000e+00 2.00000000e+00]
[ 1.30000000e+01 1.50000000e+01 0.00000000e+00 3.00000000e+00]
[ 8.00000000e+00 1.10000000e+01 4.26658708e-13 2.00000000e+00]
[ 1.00000000e+00 2.00000000e+00 2.31748880e-05 2.00000000e+00]
[ 3.00000000e+00 4.00000000e+00 8.96700489e-05 2.00000000e+00]
[ 1.60000000e+01 1.80000000e+01 3.98805492e-04 5.00000000e+00]
[ 1.90000000e+01 2.00000000e+01 1.33225099e-03 7.00000000e+00]
[ 5.00000000e+00 9.00000000e+00 2.41120340e-03 2.00000000e+00]
[ 6.00000000e+00 7.00000000e+00 1.52914684e-02 2.00000000e+00]
[ 1.20000000e+01 2.20000000e+01 3.52441432e-02 3.00000000e+00]
[ 2.10000000e+01 2.40000000e+01 1.38662986e-01 1.00000000e+01]
[ 1.70000000e+01 2.30000000e+01 6.99056531e-01 4.00000000e+00]
[ 2.50000000e+01 2.60000000e+01 1.92543748e+00 1.40000000e+01]
[ -1.00000000e+00 2.70000000e+01 inf 1.50000000e+01]]
To solve this problem, I checked linkage() function and inside it I needed to check _hierarchy.linkage() method. I use pycharm text editor and when I asked for "linkage" source code, it opened up a python file namely "_hierarchy.py" inside the directory like the following:
.PyCharm40/system/python_stubs/-1247972723/scipy/cluster/_hierarchy.py
This python file doesn't have any definition for all included functions.
I am wondering what is the correct source of this function to revise it or is there another way to solve this problem.
I would be appreciated for your helps and hints.
You have a zero vector 0 0 in your data set. For such data, cosine distance is undefined, so you are using an inappropriate distance function!
This is a definition gap that cannot be trivially closed. inf is as incorrect as 0. The distance to 0 0 with cosine cannot be defined without contraditions. You must not use cosine on such data.
Back to your actual question: _hierarchy is a Cython module. It is not pure python, but it is compiled to native code. You can easily see the source code on Github:
https://github.com/scipy/scipy/blob/master/scipy/cluster/_hierarchy.pyx

Resources