How is this a series in Pandas? - python-3.x

The code below produces a pandas Series:
import pandas as pd
df = pd.read_csv(path)
s = df.groupby(['Pregnancies', 'Glucose'])['BloodPressure'].sum()
print(s)
I know I can turn it into a DataFrame with reset_index(), but I am confused about how the output below is a Series, given that a Series is supposed to be a 1D array:
Pregnancies  Glucose
0            57          60
             67          76
             73           0
             74          52
             78          88
             84         146
             86          68
             91         148
             93         220
             94          70
             95         229
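The printed output looks two-dimensional because the groupby keys become a two-level index (a MultiIndex), while the values themselves are still a single 1D column. A minimal sketch of that, using a small made-up DataFrame in place of the CSV (the rows here are hypothetical, not taken from the question's file):

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question.
df = pd.DataFrame({
    "Pregnancies": [0, 0, 0, 1],
    "Glucose": [57, 57, 67, 80],
    "BloodPressure": [30, 30, 76, 64],
})

s = df.groupby(["Pregnancies", "Glucose"])["BloodPressure"].sum()

print(type(s).__name__)   # the result is a Series
print(s.ndim)             # number of dimensions of the values
print(s.index.nlevels)    # number of index levels (one per groupby key)
print(s.loc[(0, 57)])     # one value, addressed by a (Pregnancies, Glucose) tuple
```

The two leftmost columns in the printout are index labels, not data. Calling reset_index() moves those labels into ordinary columns, which is why the result then becomes a DataFrame.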

Related

Python-3 - get values from list by frequency and then by the values with equal counts in descending order

I have a list of integers from which I would like to get the unique numbers, ordered first by number of occurrences; numbers with equal counts should then be ordered in descending order.
example 1:
input1 = [1,2,2,1,6,2,1,7]
expected output = [2,1,7,6]
explanation: both 2 and 1 appear thrice, while 6 and 7 appear once. So the numbers occurring thrice are placed first, in descending order, and the same goes for the set that appears once.
another example case:
input_2 = list(map(int, '40 29 2 44 30 79 46 85 118 66 113 52 55 63 48 99 123 51 110 66 40 115 107 46 6 114 36 99 13 108 85 39 14 121 42 37 56 11 104 28 24 123 63 51 118 52 120 28 64 43 44 86 42 71 101 78 93 1 6 14 42 33 88 107 35 70 74 30 54 76 27 91 115 71 63 103 94 109 39 4 16 108 97 83 29 57 86 121 53 94 28 7 5 31 123 21 2 17 112 104 75 124 88 30 108 14 65 118 28 81 80 14 14 107 21 60 47 97 50 53 19 112 43 46'.split()))
output_2 = list(map(int, '14 28 123 118 108 107 63 46 42 30 121 115 112 104 99 97 94 88 86 85 71 66 53 52 51 44 43 40 39 29 21 6 2 124 120 114 113 110 109 103 101 93 91 83 81 80 79 78 76 75 74 70 65 64 60 57 56 55 54 50 48 47 37 36 35 33 31 27 24 19 17 16 13 11 7 5 4 1'.split()))
This was from a coding test I took. It must be solved without functions from imports such as collections or itertools; only names already available in Python's namespace, such as dict and sorted, are allowed. How do I do this as efficiently as possible?
def sort_sort(input1):
    a = {i: input1.count(i) for i in set(input1)}
    b = {i: [] for i in set(a.values())}
    for k, v in a.items():
        b[v].append(k)
    for v in b.values():
        v.sort(reverse=True)
    output = []
    quays = list(b.keys())
    quays.sort(reverse=True)
    for q in quays:
        output += b[q]
    print(output)
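One possible tightening, as a sketch rather than the definitive answer: calling list.count once per unique value rescans the list each time, so counting in a single pass with a plain dict and then sorting once on a (negative count, negative value) key should scale better. The function name sort_by_frequency is made up for this example, and it returns the list instead of printing it:

```python
def sort_by_frequency(values):
    """Unique values, most frequent first; ties broken by larger value first."""
    counts = {}
    for v in values:
        # Single pass over the input: O(n) counting with a plain dict.
        counts[v] = counts.get(v, 0) + 1
    # Negating both parts of the key gives descending order on each.
    return sorted(counts, key=lambda v: (-counts[v], -v))


print(sort_by_frequency([1, 2, 2, 1, 6, 2, 1, 7]))  # [2, 1, 7, 6]
```

This uses only dict and sorted, so it stays within the "no imports" constraint from the question.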

excel formula if average is greater than or equal to 85 then count scores lower than 80

Given the data in the table, how can I count the scores below 80, but only for those who have an average score of 85 or above?
score
joy 75 82 77 76 75 75 77 82 82 85 75 80 75 AVERAGE is 77
jay 85 93 92 95 90 80 86 88 91 82 84 94 87 AVERAGE is 89
jan 75 77 76 75 78 75 75 75 75 78 80 80 75 AVERAGE is 76
jen 88 95 88 92 89 85 89 97 94 92 89 95 91 AVERAGE is 91
Try the formula below:
=IF(O2>=85,COUNTIF(B2:N2,"<80"),"")
You can also compute the count without calculating the average in a separate column. Try:
=IF(AVERAGE(B2:N2)>=85,COUNTIF(B2:N2,"<80"),"")

Issue with loading text file using numpy loadtxt

I am trying to load a text file which contains a set of arrays which looks like:
[ 90 91 92 93 94 95 96 97 100 101 102 103 157 158 159 160]
[ 58 59 60 61 62 63 76 77 78 79 80 81 82 83 84 85 86 87
88 89 90 91 92 93 94 95 96 97 98 99 100 102 103 104 105 108
109 110 111 127 128 129 130 131 132 133 134 135 137 138 139 140 145 146
147 148 171 172 173]
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 23 24 25 26 27 47 48 49 50 51 52 53 54 55 56 57 58
59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76
77 78 79 80 81 82 83 84 85 86 87 88 89 165 166 167 168 169
170 171 172 173]
I have tried using test = np.loadtxt('filename.txt') to load the text file but I keep getting this issue: could not convert string to float: [. The file does not contain any headers. Any help would be much appreciated. Thank you.
The following code will return a list of lists containing integers. This assumes that the data you have is all integers. If not, you can change the map call to convert to float instead:
import os.path as path

# Path to the data file, assumed to sit next to this script.
path_to_data = path.join(path.dirname(path.realpath(__file__)),
                         'data.txt')
print(path_to_data)

# Read file contents.
with open(path_to_data) as file:
    data = file.read()
print(data)

# Collapse newlines and runs of spaces into single spaces, then
# strip the padding just inside the square brackets.
data = " ".join(data.split())
data = data.replace("[ ", "[").replace(" ]", "]")
print(data)

# Now we loop through the string looking for [ symbols.
data_list = []
start_index = 0
end_index = 0
while 0 <= data.find('[', start_index):
    start_index = data.find('[', start_index)
    if 0 <= start_index <= data.find(']', start_index):
        end_index = data.find(']', start_index)
        # Don't include the brackets: grab what's in between.
        temp_str = data[start_index+1:end_index]
        # Now split into a list based on whitespace.
        data_list.append(list(map(int, temp_str.split())))
        start_index = end_index
    else:
        # No closing bracket left: stop instead of looping forever.
        break
print(data_list)
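The same bracket scan can probably be written more compactly with a regular expression. This is a sketch under the same assumption as above (every value is an integer); the helper name parse_bracketed_ints and the inline sample text are made up for illustration:

```python
import re


def parse_bracketed_ints(text):
    """Return one list of ints per [...] group found in text."""
    # [^\]]* also matches newlines, so groups spanning several lines work.
    groups = re.findall(r"\[([^\]]*)\]", text)
    return [[int(token) for token in group.split()] for group in groups]


sample = """[ 90 91 92]
[ 58 59
 60 61]
"""
print(parse_bracketed_ints(sample))  # [[90, 91, 92], [58, 59, 60, 61]]
```

Because the rows have different lengths, the result is a list of lists rather than one rectangular array; each inner list can be passed to np.array separately if arrays are needed.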

Generate correlated data using numpy function

I have a numpy ndarray x = [67 21 80 36 53 90 82 36 95 56 41 20 49 93 79 37 95 42 76 90]. Is there a function in numpy to generate another ndarray y that has a specific correlation (for example 0.8) with x?
Thanks in advance.
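I am not aware of a single numpy function that does this directly, but one common construction is to mix the centered x with random noise that has been made orthogonal to it. The sketch below takes that approach; the function name correlated_with and the seed parameter are made up for this example, and the returned y is on an arbitrary scale (zero mean, unit norm), which does not affect the correlation:

```python
import numpy as np


def correlated_with(x, rho, seed=0):
    """Return y whose sample Pearson correlation with x is rho."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)

    xc = x - x.mean()                        # centered x
    noise = rng.standard_normal(x.size)
    noise -= noise.mean()                    # centered noise
    noise -= (noise @ xc) / (xc @ xc) * xc   # remove the part along xc

    xu = xc / np.linalg.norm(xc)             # unit vector along x
    nu = noise / np.linalg.norm(noise)       # unit vector orthogonal to x
    return rho * xu + np.sqrt(1 - rho ** 2) * nu


x = [67, 21, 80, 36, 53, 90, 82, 36, 95, 56,
     41, 20, 49, 93, 79, 37, 95, 42, 76, 90]
y = correlated_with(x, 0.8)
print(np.corrcoef(x, y)[0, 1])  # 0.8 up to floating-point error
```

Since correlation is unchanged by shifting and positive scaling, y can afterwards be rescaled (for example y * some_std + some_mean) to whatever range is wanted.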

Can I format n numbers in Python

How can I print all numbers in a given range in a given number of columns, where every column is 6 characters wide and there is a space between columns? I tried to use format:
for i in range(0,nolines):
    for j in range(0,nocolums):
        print("{0:6}{1:6}".format(number1,number2))
but found that this approach won't work, as I need more general code to format n numbers, where n is given by user input, instead of two. So can I print n numbers by using format?
For example, if input is
min = 20, max = 104, numbers on one line = 10
the program should print
    20     21     22     23     24     25     26     27     28     29
    30     31     32     33     34     35     36     37     38     39
    40     41     42     43     44     45     46     47     48     49
    50     51     52     53     54     55     56     57     58     59
    60     61     62     63     64     65     66     67     68     69
    70     71     72     73     74     75     76     77     78     79
    80     81     82     83     84     85     86     87     88     89
    90     91     92     93     94     95     96     97     98     99
   100    101    102    103    104
def print_range(start, stop, ncolumns, width=6):
    for i in range(start, stop, ncolumns):
        print(' '.join(['{:{}d}'.format(j, width)
                        for j in range(i, min(i + ncolumns, stop))]))
Example:
>>> print_range(20, 105, ncolumns=10)
    20     21     22     23     24     25     26     27     28     29
    30     31     32     33     34     35     36     37     38     39
    40     41     42     43     44     45     46     47     48     49
    50     51     52     53     54     55     56     57     58     59
    60     61     62     63     64     65     66     67     68     69
    70     71     72     73     74     75     76     77     78     79
    80     81     82     83     84     85     86     87     88     89
    90     91     92     93     94     95     96     97     98     99
   100    101    102    103    104
You could use the str.rjust method:
lines = [
    [1, 2, 3],
    [111, 222, 333],
]
for line in lines:
    for n in line:
        print(str(n).rjust(6), end='')
    print()
