Data standardisation of a list of lists in Python - python-3.x

I want a standardized list of lists. I have the main list
Data=[[5, 3, 2, 8, 5, 10, 8, 1, 2],
[5, 1, 1, 1, 2, 1, 1, 1, 1],
[1, 1, 1, 1, 4, 3, 1, 1, 1],
[1, 1, 1, 1, 2, 2, 1, 1, 1],
[2, 1, 1, 1, 2, 1, 3, 1, 1],
[1, 1, 1, 2, 1, 1, 1, 1, 1],
[8, 10, 3, 2, 6, 4, 3, 10, 1]]
I have calculated the list of column averages and a list of Standard Deviation for 9 columns.
col_avg=[4.47,3.14,3.10,2.67, 3.25,3.83,3.16,2.99, 1.60]
col_std=[2.86,2.98,2.77,2.76,2.13,3.77,2.17, 3.16,1.67]
Now I want a list of lists where the elements are standardized.
The formula: (x-col_avg)/col_std
I don't want to use any Python packages such as Pandas or NumPy; it should be plain Python.

At the risk of reiterating what @ScootCork said, I had the same solution, although using an outer for loop. The function enumerate gets the index of every element in a row, which is used to look up that column's values in your pre-computed lists.
out = []
for row in Data:
    # i is the column index, so it selects the matching average and standard deviation
    standardized_row = [(x - col_avg[i]) / col_std[i] for i, x in enumerate(row)]
    out.append(standardized_row)
print(out)

You could do so with nested list comprehension, in combination with enumerate.
[[(e - col_avg[i]) / col_std[i] for i, e in enumerate(line)] for line in Data]
Basically you loop over Data, for each line you loop over the elements together with the index (from enumerate). With this index you can then 'lookup' the relevant avg and std values.
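If the column statistics are not already available, they can also be computed with the standard library alone. The following is a minimal sketch, assuming the population standard deviation is wanted (statistics.pstdev; statistics.stdev would give the sample version) and that no column is constant, since a constant column has a standard deviation of zero. The values it computes from the seven rows shown will not necessarily match the col_avg and col_std listed in the question.

```python
from statistics import mean, pstdev

Data = [[5, 3, 2, 8, 5, 10, 8, 1, 2],
        [5, 1, 1, 1, 2, 1, 1, 1, 1],
        [1, 1, 1, 1, 4, 3, 1, 1, 1],
        [1, 1, 1, 1, 2, 2, 1, 1, 1],
        [2, 1, 1, 1, 2, 1, 3, 1, 1],
        [1, 1, 1, 2, 1, 1, 1, 1, 1],
        [8, 10, 3, 2, 6, 4, 3, 10, 1]]

# zip(*Data) transposes the rows into columns.
columns = list(zip(*Data))
col_avg = [mean(col) for col in columns]
col_std = [pstdev(col) for col in columns]  # assumption: population std

# Pair every element with the statistics of its own column.
standardized = [
    [(x - avg) / std for x, avg, std in zip(row, col_avg, col_std)]
    for row in Data
]

for row in standardized:
    print([round(value, 2) for value in row])
```

Using zip(row, col_avg, col_std) avoids indexing altogether, which leaves no room to confuse the row index with the column index.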

Related

How to rearrange a set of points (X and Y coordinates) of a 2D geometry so that consecutive node numbers are placed next to each other in Python?

Problem statement: I am working on a task where I need to arrange nodes in an aligned order. I have a set of points (X and Y coordinates) of a 2D geometry, as shown in the picture below. For this picture, the coordinates of the nodes are:
x_origional = [2, 4, 2, 1, 0, 3, 1, 0, 2, 1, 1, 2, 3, 1, 4, 2]
y_origional = [2, 3, 4, 2, 4, 3, 4, 3, 1, 3, 1, 3, 4, 0, 4, 0]
Expected output: I have to rearrange the node numbers into an aligned order. I want my output to look like the picture below.
x_reformed = [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4]
y_reformed = [3, 4, 4, 3, 2, 1, 0, 0, 1, 2, 3, 4, 4, 3, 3, 4]
Basically, I want my code to order the nodes so that every pair of consecutive nodes (node 3 and node 4, or node 6 and node 7) is placed next to each other, as the picture below shows.
What I have done and what output I got: to achieve the alignment of nodes described above, I wrote the code below.
merged_list1 = sorted(zip(x_origional, y_origional))
xandy1 = list(map(list, zip(*merged_list1)))
x1_reformed, y1_reformed = xandy1
print('Reformed X coordinate for first geometry is', x1_reformed)
print('Reformed Y coordinate for first geometry is', y1_reformed)
The output I got using above code is,
x_reformed = [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4]
y_reformed = [3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 3, 4, 3, 4]
This is not the output I am expecting.
Kindly let me know what I am doing wrong and suggest code that would fix it.
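One reading of the expected output is a serpentine ordering: points are grouped by x, and the y direction alternates from one column of points to the next. Plain sorted() orders y ascending inside every column, which would explain the difference. The following is a minimal sketch under that assumption; the pictures are not available here, so the starting direction (ascending y in the first column) is inferred from the x_reformed and y_reformed lists in the question.

```python
from itertools import groupby

x_origional = [2, 4, 2, 1, 0, 3, 1, 0, 2, 1, 1, 2, 3, 1, 4, 2]
y_origional = [2, 3, 4, 2, 4, 3, 4, 3, 1, 3, 1, 3, 4, 0, 4, 0]

# Sort by x, then by y, so that each x value forms one contiguous group.
points = sorted(zip(x_origional, y_origional))

x_reformed, y_reformed = [], []
for col_index, (x, group) in enumerate(groupby(points, key=lambda p: p[0])):
    ys = [y for _, y in group]
    # Assumption: every second column is walked from top to bottom.
    if col_index % 2 == 1:
        ys.reverse()
    x_reformed.extend([x] * len(ys))
    y_reformed.extend(ys)

print(x_reformed)
print(y_reformed)
```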

Sorting of list depending on frequency in descending order

I have a list of numbers:
[1, 2, 3, 1, 1, 2, 3, 1, 2, 3].
I want to sort it by the frequency of elements to get this:
[1, 1, 1, 1, 3, 3, 3, 2, 2, 2].
If several elements have the same frequency, they should be sorted by descending value. Is there a way to do this? I'm using the method below, but my output is [1, 1, 1, 1, 2, 2, 2, 3, 3, 3].
from collections import Counter
list = [1, 2, 3, 1, 1, 2, 3, 1, 2, 3]
c = Counter(list)
x = sorted(c.most_common(), key=lambda x: (-x[1], x[0]))
y = [([v] * n) for (v, n) in x]
z = sum(y, [])
print(z)
Looks like you need to use reverse=True
Ex:
from collections import Counter
data = [1, 2, 3, 1, 1, 2, 3, 1, 2, 3]
c = Counter(data)
data.sort(key=lambda x: (c[x], x), reverse=True)
print(data)
Output:
[1, 1, 1, 1, 3, 3, 3, 2, 2, 2]
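The approach from the question can also be kept. The ascending tie-break there comes from the x[0] term in the sort key, so negating that term as well gives descending values among equal frequencies. A minimal sketch, with illustrative variable names:

```python
from collections import Counter

data = [1, 2, 3, 1, 1, 2, 3, 1, 2, 3]
counts = Counter(data)

# Sort (value, count) pairs by count descending, then by value descending.
ordered = sorted(counts.items(), key=lambda pair: (-pair[1], -pair[0]))

# Repeat each value as many times as it occurred.
result = [value for value, n in ordered for _ in range(n)]
print(result)  # [1, 1, 1, 1, 3, 3, 3, 2, 2, 2]
```

Negating the key only works for numeric values; for other types, the reverse=True form is the safer choice.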
If your list is long, you may want to sort only the distinct items that occur the same number of times, rather than sorting the whole list:
from collections import Counter
from itertools import groupby
lst = [1, 2, 3, 1, 1, 2, 3, 1, 2, 3]
c = Counter(lst)
ret = []
for key, group in groupby(c.most_common(), key=lambda x: x[1]):
    items = sorted((item for item, _ in group), reverse=True)
    for item in items:
        ret.extend(key * [item])
print(ret)
# [1, 1, 1, 1, 3, 3, 3, 2, 2, 2]

Python 3: combine many lists into one

I have several lists with the same number of elements, example:
[[1, 2, 3, 1], [3, 2, 1, 2], [3, 3, 1, 1], [...], .....etc...
I would like to get a single list containing all the items in the listed lists, example:
[1, 2, 3, 1, 3, 2, 1, 2, 3, 3, 1, 1, ..........]
How can I get this result in the simplest way possible?
I tried this way but I can not find a solution.
a = [[1, 2, 3, 1], [3, 2, 1, 2], [3, 3, 1, 1]]
b = len(a)
for n in range(b):
    a[0] += a[n]
    print(a[0])
result:
[1, 2, 3, 1, 1, 2, 3, 1]
[1, 2, 3, 1, 1, 2, 3, 1, 3, 2, 1, 2]
[1, 2, 3, 1, 1, 2, 3, 1, 3, 2, 1, 2, 3, 3, 1, 1]
I only want the last list produced, without the repetition of the first list, but I cannot work out how to correct the code to get it.
Thank you.
A straightforward solution would use the extend method of a list.
a = [[1, 2, 3, 1], [3, 2, 1, 2], [3, 3, 1, 1]]
result = []
for aa in a:
    result.extend(aa)
I hope you will find Python's itertools interesting for this as well.
from itertools import chain
a = [[1, 2, 3, 1], [3, 2, 1, 2], [3, 3, 1, 1]]
result = list(chain(*a))
*a is basically unpacking the list for you.
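As a variation on the same idea, itertools also provides chain.from_iterable, which takes the outer list directly instead of unpacking it into arguments. A short sketch, shown next to the equivalent nested list comprehension (the name result_comprehension is only illustrative):

```python
from itertools import chain

a = [[1, 2, 3, 1], [3, 2, 1, 2], [3, 3, 1, 1]]

# chain.from_iterable walks the outer list lazily, so the sublists are
# never unpacked into separate function arguments.
result = list(chain.from_iterable(a))

# The same flattening written as a nested list comprehension.
result_comprehension = [item for sublist in a for item in sublist]

print(result)  # [1, 2, 3, 1, 3, 2, 1, 2, 3, 3, 1, 1]
```

Neither form modifies a, unlike the loop in the question, which grows a[0] in place.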
Python has a reduce() function that can be used like this:
a = [[1, 2, 3, 1], [3, 2, 1, 2], [3, 3, 1, 1]]
import functools
functools.reduce(lambda x, y: x+y, a)
Result:
[1, 2, 3, 1, 3, 2, 1, 2, 3, 3, 1, 1]
This may be less efficient than modifying an element in-place, as in the question. If you want to keep using that, the simple thing to do is to start with an empty list rather than re-using a[0]:
result = []
for e in a:
    result += e
print(result)

How to update MultiIndex after dropping columns in a Dataframe?

I cannot find out how to update a column MultiIndex that still holds dead values after columns have been dropped. I have read the documentation, but I cannot find a method that fits my needs.
MWE:
MultiIndex(levels=[['41B001', '41B004', '41B011', '41MEU1', '41N043', '41R001', '41R002', '41R012', '41WOL1'], ['CO', 'NO', 'NO2', 'O3', 'PM-10.0', 'PM-2.5', 'SO2']],
labels=[[1, 1, 2, 2, 4, 4, 5, 5, 7, 7, 8, 8], [2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3]],
names=['sitekey', 'measurandkey'])
Keys 41B001, 41MEU1, 41R002 and CO, NO, PM-10.0, PM-2.5, SO2 have been removed, but they are still referenced in the MultiIndex levels. I would like a new index without those unused labels.
The following command does the trick, but I do not find it clean:
data3.T.reset_index().set_index(['sitekey', 'measurandkey']).index
Returns what I expect:
MultiIndex(levels=[['41B004', '41B011', '41N043', '41R001', '41R012', '41WOL1'], ['NO2', 'O3']],
labels=[[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]],
names=['sitekey', 'measurandkey'])
Is there a better way (efficient, cleaner, pythonic, pandas friendly) to achieve this working only on MultiIndex instead of transposing the DataFrame?

Removing duplicates with no sets, no for loops, maintain order and update the original list

Ok, so I have to remove duplicates from a list and maintain order at the same time. However, there are certain conditions: I'm not allowed to use set or for loops, and the function mustn't return a new list but update the original list. I have the following code, but it only works partially. I know I'm only comparing each element with its neighbour, but I'm not sure how to proceed further.
def clean_list(values):
    i = len(values)-1
    while i > 0:
        if values[i] == values[i-1]:
            values.pop(i)
        i -= 1
    return values
values = [1, 2, 0, 1, 4, 1, 1, 2, 2, 5, 4, 3, 1, 3, 3, 4, 2, 4, 3, 1, 3, 0, 3, 0, 0]
new_values = clean_list(values)
print(new_values)
Gives me the result:
[1, 2, 0, 1, 4, 1, 2, 5, 4, 3, 1, 3, 4, 2, 4, 3, 1, 3, 0, 3, 0]
Thanks
Try the following.
Using two while loops, the first will get your unique item, the second will then search through the rest of the list for any other matching items and remove them, maintaining order.
def clean_list(lst):
    i = 0
    while i < len(lst):
        item = lst[i]  # item to check
        j = i + 1  # start next item along
        while j < len(lst):
            if item == lst[j]:
                lst.pop(j)
            else:
                j += 1
        i += 1
values = [1, 2, 0, 1, 4, 1, 1, 2, 2, 5, 4, 3, 1, 3, 3, 4, 2, 4, 3, 1, 3, 0, 3, 0, 0]
clean_list(values)
print(values)
# Output
[1, 2, 0, 4, 5, 3]
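Update: an improved function that should be faster, since the first one is O(n^2) in the worst case. Note that it relies on a set for the membership test, which the question rules out, so it only applies if that restriction can be relaxed.
def clean_list(lst):
    seen = set()
    i = 0
    while i < len(lst):
        item = lst[i]
        if item in seen:
            lst.pop(i)
        else:
            seen.add(item)
            i += 1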
values = [1, 2, 0, 1, 4, 1, 1, 2, 2, 5, 4, 3, 1, 3, 3, 4, 2, 4, 3, 1, 3, 0, 3, 0, 0]
clean_list(values)
print(values)
# Output
[1, 2, 0, 4, 5, 3]
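A third variant stays within the stated constraints (no set, no for loop, same list object) and avoids calling pop in the middle of the list. It is only a sketch: it keeps a write position, copies each first occurrence forward, and truncates the tail once at the end. The names read and write are illustrative, and the membership test on a slice is still linear, so the worst case remains quadratic.

```python
def clean_list(values):
    """Remove duplicates in place, keeping the first occurrence of each item."""
    write = 0  # values[:write] always holds the unique items found so far
    read = 0
    while read < len(values):
        if values[read] not in values[:write]:
            values[write] = values[read]
            write += 1
        read += 1
    # Drop everything after the unique prefix without creating a new list.
    del values[write:]


values = [1, 2, 0, 1, 4, 1, 1, 2, 2, 5, 4, 3, 1, 3, 3, 4, 2, 4, 3, 1, 3, 0, 3, 0, 0]
clean_list(values)
print(values)  # [1, 2, 0, 4, 5, 3]
```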
