Generate correlated data using numpy function

I have a NumPy ndarray x = [67 21 80 36 53 90 82 36 95 56 41 20 49 93 79 37 95 42 76 90]. Is there a NumPy function that generates another ndarray y with a specific correlation (for example, 0.8) with x?
Thanks in advance.
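As far as I know, NumPy has no single function for this, but one common construction is easy to sketch: center x, draw random noise, remove the part of the noise that lies along x, and mix the two unit vectors with weights r and sqrt(1 - r^2). The helper name correlated_with and its seed parameter are made up for this illustration, and the sketch assumes x is not constant, has at least 3 elements, and that -1 <= r <= 1.

```python
import numpy as np

def correlated_with(x, r, seed=None):
    """Return y whose sample Pearson correlation with x is r (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                      # centered copy of x
    noise = rng.standard_normal(x.size)
    noise -= noise.mean()                  # centered random noise
    # Remove the component of the noise that is parallel to xc.
    resid = noise - (noise @ xc) / (xc @ xc) * xc
    xu = xc / np.linalg.norm(xc)           # unit vector along x
    eu = resid / np.linalg.norm(resid)     # unit vector orthogonal to x
    # Mixing two orthonormal, zero-mean vectors gives correlation r with x.
    return r * xu + np.sqrt(1.0 - r**2) * eu

x = np.array([67, 21, 80, 36, 53, 90, 82, 36, 95, 56,
              41, 20, 49, 93, 79, 37, 95, 42, 76, 90])
y = correlated_with(x, 0.8, seed=0)
print(np.corrcoef(x, y)[0, 1])
```

The returned y has zero mean and unit length. Correlation does not change under a positive rescaling or a shift, so y can be moved to any desired scale afterwards, for example y * x.std() + x.mean().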

Related

Amend a subarray in place in J?

I've got two ways to amend a subarray in J but I don't like either of them.
(Imagine selecting a rectangle in a paint program and applying some arbitrary
operation to that rectangle in place.)
t =. i. 10 10 NB. table to modify
xy=. 2 3 [ wh =. 3 2 NB. region I want to modify
u =. -@|. NB. operation to perform on region
I can fetch the subarray and apply the
operation in one step with cut (;.0):
st =. ((,./xy),:(|,./wh)) u;.0 t
Putting it back is easy enough, but seems to require
building a large boxed array of indices:
(,st) (xy&+each,{;&:i./wh) } t
I also tried recursively splitting and glueing
the table into four "window panes" at a time:
split =: {. ; }. NB. split y into 2 subarrays at index x
panes =: {{ 2 2$ ; |:L:0 X split&|:&.> Y split y [ 'Y X'=.x }}
glue =: [: ,&>/ ,.&.>/"1 NB. reassemble
xy panes t
┌────────┬────────────────────┐
│ 0 1 2│ 3 4 5 6 7 8 9│
│10 11 12│13 14 15 16 17 18 19│
├────────┼────────────────────┤
│20 21 22│23 24 25 26 27 28 29│
│30 31 32│33 34 35 36 37 38 39│
│40 41 42│43 44 45 46 47 48 49│
│50 51 52│53 54 55 56 57 58 59│
│60 61 62│63 64 65 66 67 68 69│
│70 71 72│73 74 75 76 77 78 79│
│80 81 82│83 84 85 86 87 88 89│
│90 91 92│93 94 95 96 97 98 99│
└────────┴────────────────────┘
NB. then split the lower right pane again,
NB. extract *its* upper left pane...
s0 =. 1 1 {:: p0 =. xy panes t
s1 =. 0 0 {:: p1 =. wh panes s0
NB. apply the operation and reassemble:
p1a =. (<u s1) (<0 0) } p1
glue (<glue p1a) (<1 1) } p0
The first approach seems to be the quicker and
easier option, but it feels like there ought
to be a more primitive way to apply a verb at
a sub-array without extracting it, or to paste
in a subarray at some coordinates without manually
creating the array of indices for each element.
Have I missed a better option?

I would begin by creating the set of indices that I wanted to amend
[ ind =. < xy + each i. each wh
┌───────────┐
│┌─────┬───┐│
││2 3 4│3 4││
│└─────┴───┘│
└───────────┘
I can use those to select the atoms I want from t
ind { t
23 24
33 34
43 44
And if I can select with them then I can use the same indices to amend t
_ ind } t
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 _ _ 25 26 27 28 29
30 31 32 _ _ 35 36 37 38 39
40 41 42 _ _ 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
and finally I can use a hook with the left tine being ind}~ after preprocessing t with the right tine (ind u@{ ]) to get my result
(ind}~ ind u@{ ]) t
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 _43 _44 25 26 27 28 29
30 31 32 _33 _34 35 36 37 38 39
40 41 42 _23 _24 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
You actually gave me the solution when you asked how you can 'amend' your array in place.

How is this a series in Pandas?

The code below produces a pandas Series:
import pandas as pd
df = pd.read_csv(path)
s = df.groupby(['Pregnancies', 'Glucose'])['BloodPressure'].sum()
print(s)
I know I can turn it into a DataFrame with reset_index(). But I am confused about how the output below is a Series, given that a Series should be a 1D array.
Pregnancies  Glucose
0            57          60
             67          76
             73           0
             74          52
             78          88
             84         146
             86          68
             91         148
             93         220
             94          70
             95         229
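One way to see what is going on is to inspect the object directly. Grouping by two columns gives the result a two-level MultiIndex, but the values themselves are still a single 1D column. The small DataFrame below is made-up data used only for illustration (the original CSV is not available); only the column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the CSV from the question.
df = pd.DataFrame({
    "Pregnancies": [0, 0, 0, 1],
    "Glucose": [57, 57, 67, 80],
    "BloodPressure": [30, 30, 76, 64],
})

s = df.groupby(["Pregnancies", "Glucose"])["BloodPressure"].sum()

print(type(s).__name__)   # Series
print(s.ndim, s.shape)    # 1 (3,): one value per (Pregnancies, Glucose) pair
print(s.index.nlevels)    # 2: the two left-hand columns are index levels
print(s.loc[(0, 57)])     # 60: look up a value by its index tuple
```

The two left-hand columns in the printed output are labels, not data. Repeated values in the outer level are simply left blank when the Series is printed.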

How do you find the largest product of 4 consecutive numbers in a grid? [closed]

Here's the problem:
Find the largest product of a horizontal or vertical line of four consecutive numbers in this grid. Here, "consecutive" means four numbers that lie next to each other in the same row or column. For example, the top row starts with the four consecutive numbers 8, 2, 22, and 97. Consecutive numbers do not "wrap around" sides of the grid.
I know that I should convert the grid into a number of lists and go through each quad (like pairs, but with 4 numbers instead of 2). But I'm not sure how to save the grid as a file or how to get from there to code that produces the answer.
My SPECIFIC question is: what exactly is the problem asking me to do, and what would the pseudocode be? Taking 4 two-digit numbers and multiplying them does not give an 8-digit number (unless I'm crazy).
I have no code as of now.
Here is the 20x20 grid:
It should output an 8-digit number.

import numpy as np
np.random.seed(10)
a = np.random.randint(1,10,(6,7))
func = lambda arr,n:(arr[:,np.arange(arr.shape[1]-n+1)[:,None] + np.arange(n)]).prod(2).max()
np.r_[func(a,4),func(a.T,4)].max()
Now using the grid (GRID is assumed to hold the grid as a multi-line string of whitespace-separated numbers):
import io
grid = np.loadtxt(io.StringIO(GRID))
np.r_[func(grid,4),func(grid.T,4)].max()
Out: 51267216.0
Without using numpy, you could do:
from functools import reduce
prod = lambda lst: reduce(lambda x,y:x*y,lst)
def slice_grid_max(grid,n):
    n_cols = len(list(zip(*grid)))
    f = lambda grid: max([prod(j[i:i+n]) for j in grid for i in range(n_cols-n+1)])
    return max((f(grid),f(zip(*grid))))
slice_grid_max(grid,4)
Out: 51267216.0
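The index arithmetic in the numpy one-liner can be hard to read. As an alternative sketch (not the answerer's code), numpy.lib.stride_tricks.sliding_window_view, available in NumPy 1.20 and later, builds the windows directly. The function name max_line_product and the small demo grid are made up for this illustration, and the grid is assumed to have at least n rows and n columns.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_line_product(arr, n):
    """Largest product of n adjacent values in any row or column (hypothetical helper)."""
    arr = np.asarray(arr, dtype=np.int64)
    # Windows along each row: shape (rows, cols - n + 1, n).
    row_best = sliding_window_view(arr, n, axis=1).prod(axis=-1).max()
    # Windows along each column: shape (rows - n + 1, cols, n).
    col_best = sliding_window_view(arr, n, axis=0).prod(axis=-1).max()
    return int(max(row_best, col_best))

demo = np.array([[1, 2, 3, 4, 5],
                 [9, 1, 1, 1, 1],
                 [9, 2, 2, 2, 2],
                 [9, 1, 3, 1, 1],
                 [9, 1, 1, 1, 1]])
print(max_line_product(demo, 4))  # 6561, from the four 9s in the first column
```

Casting to int64 keeps the result an integer instead of the float that np.loadtxt produces by default.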

If you would like to solve this problem using Python's standard library without having to resort to any third-party modules, you can use the following program as a demonstration of how you could go about doing that:
#! /usr/bin/env python3
import enum
# The grid was converted here: https://easypdf.com/ocr-online
# After some manual cleanup of the conversion, you have this:
GRID = '''\
08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 93 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 33 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 83 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 34
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48'''
CONSECUTIVE = 4

def main():
    matrix = convert_grid_to_matrix()
    products = {}
    calculate_horizontal_products(matrix, products)
    calculate_vertical_products(matrix, products)
    largest_products = calculate_largest_products(products)
    display_largest_products(matrix, largest_products)

def convert_grid_to_matrix():
    return tuple(tuple(map(int, line.split())) for line in GRID.splitlines())

def calculate_horizontal_products(matrix, products):
    calculate_all_products(matrix, products, Orientation.HORIZONTAL)

def calculate_all_products(matrix, products, orientation):
    for row, column in generate_starting_coordinates(matrix, orientation):
        products[(row, column, orientation)] = calculate_product(
            get_numbers(matrix, row, column, orientation)
        )

def calculate_product(numbers, start=1):
    for value in numbers:
        start *= value
    return start

def generate_starting_coordinates(matrix, orientation):
    if orientation is Orientation.HORIZONTAL:
        for row in range(len(matrix)):
            for column in range(len(matrix[row]) - CONSECUTIVE + 1):
                yield row, column
    elif orientation is Orientation.VERTICAL:
        for row in range(len(matrix) - CONSECUTIVE + 1):
            for column in range(len(matrix[row])):
                yield row, column
    else:
        raise ValueError(f'{orientation!r} is not a valid orientation')

def get_numbers(matrix, row, column, orientation):
    if orientation is Orientation.HORIZONTAL:
        return matrix[row][column:column + CONSECUTIVE]
    if orientation is Orientation.VERTICAL:
        return tuple(
            matrix[row + offset][column] for offset in range(CONSECUTIVE)
        )
    raise ValueError(f'{orientation!r} is not a valid orientation')

def calculate_vertical_products(matrix, products):
    calculate_all_products(matrix, products, Orientation.VERTICAL)

def calculate_largest_products(products):
    max_value = max(products.values())
    for key, value in products.items():
        if value == max_value:
            yield key

def display_largest_products(matrix, largest_products):
    print('The largest product(s) can be found here:')
    for coordinate in largest_products:
        print(' Row: {}, Column: {}, {}'.format(*coordinate))
        display_calculation(matrix, *coordinate)

def display_calculation(matrix, *coordinate):
    numbers = get_numbers(matrix, *coordinate)
    operation = ' * '.join(map(str, numbers))
    print(f' {operation} = {calculate_product(numbers):,}')

@enum.unique
class Orientation(enum.Enum):
    HORIZONTAL = enum.auto()
    VERTICAL = enum.auto()

if __name__ == '__main__':
    main()

Why are the results of these two commands different?

The output of the second command (ls -d [!0-100]*) includes everything in the output of the first (ls -d [!0-99]*). I expected the two commands to give the same result. Can anyone explain the result of the second command?
jack@DESKTOP-KRIB7TB:~$ ls -d [!0-99]*
a b c d e f g h i j k
jack@DESKTOP-KRIB7TB:~$ ls -d [!0-100]*
2 25 30 36 41 47 52 58 63 69 74 8 85 90 96 c i
20 26 31 37 42 48 53 59 64 7 75 80 86 91 97 d j
21 27 32 38 43 49 54 6 65 70 76 81 87 92 98 e k
22 28 33 39 44 5 55 60 66 71 77 82 88 93 99 f
23 29 34 4 45 50 56 61 67 72 78 83 89 94 a g
24 3 35 40 46 51 57 62 68 73 79 84 9 95 b h

The [...] syntax in a glob is a per-character match, not a numeric range. [a-c] is the same as [abc], for instance. You can have multiple ranges or individual characters in a block, so [a-cfh-k] is the same as [abcfhijk].
So [!0-99] is the same as [!01234567899] (notice the redundant 9), whereas [!0-100] is the same as [!0100], which excludes only names whose first character is 0 or 1.
You can list all non-digit directories with ls -d [!0-9]*, but I don't know that there's a robust way (with globs and ls) to list directories with names that are numerals greater than 100.
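The per-character behavior is easy to reproduce outside the shell. Here is a small sketch using Python's fnmatch module, which uses the same [!...] bracket syntax. The list of names is made up for illustration, and the last line shows one possible way to select numeric names greater than 100 by comparing them as integers instead of using a glob.

```python
from fnmatch import fnmatchcase

# Hypothetical directory names, for illustration only.
names = ["a", "b", "0", "1", "10", "2", "25", "99", "100", "150"]

# [!0-99] excludes any first character in 0-9, so only letters remain.
not_digit = [n for n in names if fnmatchcase(n, "[!0-99]*")]

# [!0-100] is the range 0-1 plus two literal 0s: only a leading 0 or 1 is excluded.
not_0_or_1 = [n for n in names if fnmatchcase(n, "[!0-100]*")]

# Numeric comparison needs real integers, not a character class.
over_100 = [n for n in names if n.isdigit() and int(n) > 100]

print(not_digit)    # ['a', 'b']
print(not_0_or_1)   # ['a', 'b', '2', '25', '99']
print(over_100)     # ['150']
```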

Can I format n numbers in Python

How can I print all numbers in a given range in a given number of columns, where every column is 6 characters wide and there is a space between columns? I tried to use format:
for i in range(0,nolines):
    for j in range(0,nocolums):
        print("{0:6}{1:6}".format(number1,number2))
but this approach won't work, because I need more general code that formats n numbers per line, where n is given by user input, instead of two. Can I print n numbers using format?
For example, if input is
min = 20, max = 104, numbers on one line = 10
the program should print
    20     21     22     23     24     25     26     27     28     29
    30     31     32     33     34     35     36     37     38     39
    40     41     42     43     44     45     46     47     48     49
    50     51     52     53     54     55     56     57     58     59
    60     61     62     63     64     65     66     67     68     69
    70     71     72     73     74     75     76     77     78     79
    80     81     82     83     84     85     86     87     88     89
    90     91     92     93     94     95     96     97     98     99
   100    101    102    103    104

def print_range(start, stop, ncolumns, width=6):
    for i in range(start, stop, ncolumns):
        print(' '.join(['{:{}d}'.format(j, width)
                        for j in range(i, min(i + ncolumns, stop))]))
Example:
>>> print_range(20, 105, ncolumns=10)
    20     21     22     23     24     25     26     27     28     29
    30     31     32     33     34     35     36     37     38     39
    40     41     42     43     44     45     46     47     48     49
    50     51     52     53     54     55     56     57     58     59
    60     61     62     63     64     65     66     67     68     69
    70     71     72     73     74     75     76     77     78     79
    80     81     82     83     84     85     86     87     88     89
    90     91     92     93     94     95     96     97     98     99
   100    101    102    103    104

You could use the str.rjust method:
lines = [
    [1, 2, 3],
    [111, 222, 333],
]
for line in lines:
    for n in line:
        print(str(n).rjust(6), end='')
    print()
