Find all combinations by columns - combinatorics

I have an n-row, m-column matrix and want to find all combinations that take one value from each column. For example:
2 5 6 9
5 2 8 3
1 1 9 4
2 5 3 9
my program will print
2-5-6-9
2-5-6-3
2-5-6-4
2-5-6-9
2-5-8-9
2-5-8-3...
I can't write m nested for loops, since m isn't fixed. How can I do that?

Use recursion. It is enough to specify, for each position, which values can appear there (the columns), and write a recursive function whose parameter is the list of numbers chosen for the positions handled so far. In each recursive call, iterate through the possibilities for the next position.
Python implementation:
def C(choose_numbers, possibilities):
    if len(choose_numbers) >= len(possibilities):
        print('-'.join(map(str, choose_numbers)))  # format output
    else:
        for i in possibilities[len(choose_numbers)]:  # every value allowed at the next position
            C(choose_numbers + [i], possibilities)

c = [[2, 5, 1, 2], [5, 2, 1, 5], [6, 8, 9, 3], [9, 3, 4, 9]]  # the columns of the matrix
C([], c)
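Alternatively, the standard library already covers this: itertools.product takes one iterable per position and yields exactly these combinations. A minimal equivalent of the recursion above:
from itertools import product

c = [[2, 5, 1, 2], [5, 2, 1, 5], [6, 8, 9, 3], [9, 3, 4, 9]]  # the columns again
for combo in product(*c):
    print('-'.join(map(str, combo)))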

Related

Create a matrix from another matrix in Python 3.11

I need to create two new numpy arrays from another matrix: one keeping only the odd elements and one keeping only the even elements, with zeros inserted in all other positions. How can I do that?
I tried accessing the indexes of the elements directly, but this method doesn't seem to work with arrays.
Example input:
1 2 3
4 5 6
7 8 9
should yield two matrices like:
0 2 0         1 0 3
4 0 6   and   0 5 0
0 8 0         7 0 9
You can use np.where with a parity mask:
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])  # the example input

is_odd = a % 2                     # 1 where the element is odd, 0 where it is even
odd = np.where(is_odd, a, 0)       # keep odd elements, zeros elsewhere
even = np.where(1 - is_odd, a, 0)  # keep even elements, zeros elsewhere
output:
# odd
array([[1, 0, 3],
       [0, 5, 0],
       [7, 0, 9]])
# even
array([[0, 2, 0],
       [4, 0, 6],
       [0, 8, 0]])
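Since is_odd contains only 0s and 1s, plain multiplication by the mask is an equivalent alternative to np.where for integer input:
odd = a * is_odd          # zeros out the even entries
even = a * (1 - is_odd)   # zeros out the odd entries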

Assigning a value based on whether a cell is between external tuple values

I have a pandas Series of integer values and a dictionary of keys and tuples (2 integers each).
The tuples represent a low/high value pair for each key. I'd like to map the key to each cell of my series based on which tuple's range the series value falls into.
Example:
d = {'a': (1, 5), 'b': (6, 10), 'c': (11, 15)}  # keys and tuples are ordered and never repeated
s = pd.Series([5, 6, 5, 8, 15, 5, 2, 5])  # I can sort the series; values may be repeated or absent
For a short list I believe I could do this manually with a for loop, but I can potentially have a big dictionary with many keys.
Let's try pd.Interval:
lookup = pd.Series(list(d.keys()),
                   index=[pd.Interval(x, y, closed='both') for x, y in d.values()])
lookup.loc[s]
Output:
[1, 5] a
[6, 10] b
[1, 5] a
[6, 10] b
[11, 15] c
[1, 5] a
[1, 5] a
[1, 5] a
dtype: object
reindex also works and is safer in case you have out-of-range data:
lookup.reindex(s)
Output:
5 a
6 b
5 a
8 b
15 c
5 a
2 a
5 a
dtype: object
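For example (adding a hypothetical out-of-range value that is not in the original data), anything that falls outside every interval comes back as NaN instead of raising a KeyError:
lookup.reindex([5, 20])  # 5 -> 'a', 20 -> NaN (outside every interval)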
Another idea using pd.IntervalIndex and Series.map:
m = pd.Series(list(d.keys()),
              index=pd.IntervalIndex.from_tuples(d.values(), closed='both'))
s = s.map(m)
Result:
0 a
1 b
2 a
3 b
4 c
5 a
6 a
7 a
dtype: object

Pandas: How to aggregate by range inclusion?

I have a dataframe with a "range" column and some value columns:
In [1]: df = pd.DataFrame({
            "range": [[1,2], [[1,2], [6,11]], [4,5], [[1,3], [5,7], [9, 11]], [9,10], [[5,6], [9,11]]],
            "A": range(1, 7),
            "B": range(6, 0, -1)
        })
Out[1]:
range A B
0 [1, 2] 1 6
1 [[1, 2], [6, 11]] 2 5
2 [4, 5] 3 4
3 [[1, 3], [5, 7], [9, 11]] 4 3
4 [9, 10] 5 2
5 [[5, 6], [9, 11]] 6 1
For every row I need to check whether its range is entirely included (with all of its parts) in the range of another row, and if so sum the other columns (A and B), keeping the longer range. The rows are arbitrarily ordered.
The detailed steps for the example dataframe: row 0 is entirely included in rows 1 and 3; rows 1, 2 and 3 are not entirely included in any other row; and row 4 is included in rows 1, 3 and 5, but because row 5 is itself included in row 3, row 4 should only be counted once in row 3's sum.
Hence my output dataframe would be:
Out[2]:
range A B
0 [[1, 2], [6, 11]] 8 13
1 [4, 5] 3 4
2 [[1, 3], [5, 7], [9, 11]] 16 12
I thought about sorting the rows first to put the longest ranges at the top, so merging would be easier and more efficient, but unfortunately I have no idea how to do this in pandas...
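One possible starting point is a brute-force sketch (the helpers to_parts and included are my own names, and the double loop is quadratic in the number of rows, so this only suits small frames): normalize each range to a list of parts, drop every row whose parts all fit inside another row, and sum A and B over every row a survivor contains.
import pandas as pd

df = pd.DataFrame({
    "range": [[1,2], [[1,2], [6,11]], [4,5], [[1,3], [5,7], [9, 11]], [9,10], [[5,6], [9,11]]],
    "A": range(1, 7),
    "B": range(6, 0, -1)
})

def to_parts(r):
    # normalize: [1, 2] -> [(1, 2)]; [[1, 2], [6, 11]] -> [(1, 2), (6, 11)]
    return [tuple(r)] if not isinstance(r[0], list) else [tuple(p) for p in r]

def included(a, b):
    # True if every part of a lies within some part of b
    return all(any(lo >= blo and hi <= bhi for blo, bhi in b) for lo, hi in a)

parts = df['range'].map(to_parts)
keep = [i for i in df.index  # rows not entirely included in any other row
        if not any(i != j and included(parts[i], parts[j]) for j in df.index)]
rows = []
for i in keep:
    members = [j for j in df.index if included(parts[j], parts[i])]
    rows.append({'range': df.at[i, 'range'],
                 'A': df.loc[members, 'A'].sum(),
                 'B': df.loc[members, 'B'].sum()})
out = pd.DataFrame(rows)  # reproduces the expected output above
On the sample data this keeps rows 1, 2 and 3 and counts row 4 toward both surviving rows that contain it, matching the expected totals.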

How to create a separate df after applying groupby?

I have a df as follows:
Product Step
1 1
1 3
1 6
1 6
1 8
1 1
1 4
2 2
2 4
2 8
2 8
2 3
2 1
3 1
3 3
3 6
3 6
3 8
3 1
3 4
What I would like to do is:
For each Product, every Step must be grabbed and the order must not be changed. That is, if we look at Product 1, after Step 8 there is a 1 coming, and that 1 must stay after the 8. So the expected output for Products 1 and 3 should be the order 1, 3, 6, 8, 1, 4; for Product 2 it must be 2, 4, 8, 3, 1.
Update:
Here I only want one value of 6 for Products 1 and 3, since in the main df the two 6s are next to each other, but both values of 1 must be present since they are not next to each other.
Once the first step is done, the products with the same Steps must be grouped together into a new df (in the example below, Products 1 and 3 have the same Steps, so they must be grouped together).
What I have done:
import pandas as pd
sid = pd.DataFrame(data.groupby('Product').apply(lambda x: x['Step'].unique())).reset_index()
But it is yielding a result like:
Product 0
0 1 [1 3 6 8 4]
1 2 [2 4 8 3 1]
2 3 [1 3 6 8 4]
which is not the result I want. I would like the value for the first and third product to be [1 3 6 8 1 4].
IIUC, create a Newkey by using diff and cumsum, so that consecutive duplicate steps share a key and can be dropped:
df['Newkey'] = df.groupby('Product').Step.apply(lambda x: x.diff().ne(0).cumsum())
df.drop_duplicates(['Product', 'Newkey'], inplace=True)
s = df.groupby('Product').Step.apply(tuple)
s.reset_index().groupby('Step').Product.apply(list)
Step
(1, 3, 6, 8, 1, 4) [1, 3]
(2, 4, 8, 3, 1) [2]
Name: Product, dtype: object
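To see what Newkey marks, here is the intermediate state for Product 1 right after the Newkey assignment (derived from the code above): the key only increments when the step changes, so the two consecutive 6s share a key and drop_duplicates keeps just one of them.
df[df.Product == 1][['Step', 'Newkey']]
#    Step  Newkey
# 0     1       1
# 1     3       2
# 2     6       3
# 3     6       3   <- same key, dropped by drop_duplicates
# 4     8       4
# 5     1       5
# 6     4       6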
groupby preserves the order of rows within a group, so there isn't much need to worry about rows shifting.
A straightforward, though not greatly performant, solution is apply(tuple): tuples are hashable, so you can group on them to see which Products are identical. The helper form_seq below makes consecutive values appear only once in the list of steps before forming the tuple.
def form_seq(x):
    x = x[x != x.shift()]  # drop consecutive duplicate steps
    return tuple(x)

s = df.groupby('Product').Step.apply(form_seq)
s.groupby(s).groups
#{(1, 3, 6, 8, 1, 4): Int64Index([1, 3], dtype='int64', name='Product'),
# (2, 4, 8, 3, 1): Int64Index([2], dtype='int64', name='Product')}
Or if you'd like a DataFrame:
s.reset_index().groupby('Step').Product.apply(list)
#Step
#(1, 3, 6, 8, 1, 4) [1, 3]
#(2, 4, 8, 3, 1) [2]
#Name: Product, dtype: object
The values of that dictionary are the groupings of products that share the step sequence (given by the dictionary keys). Products 1 and 3 are grouped together by the step sequence 1, 3, 6, 8, 1, 4.
Another very similar way:
df_no_dups = df[df.shift() != df].dropna(how='all').ffill()
df_no_dups_grouped = df_no_dups.groupby('Product')['Step'].apply(list)
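This stops at a Series of step lists per Product; since lists aren't hashable, one way to finish the grouping (a sketch reusing the tuple trick from the previous answer) is:
df_no_dups_grouped.apply(tuple).reset_index().groupby('Step').Product.apply(list)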

How to calculate minimum number of swap to make the median of two sorted arrays equal?

Given two sorted arrays A and B:
A = [1, 2, 3, 3, 5, 6, 7]
B = [4, 6, 8, 8, 9, 9, 9]
We need just two swap operations.
First swap operation:
Take 1 from A and 9 from B and swap them.
Now the arrays look like this: A = [2, 3, 3, 5, 6, 7, 9] and B = [1, 4, 6, 8, 8, 9, 9].
Second swap operation:
Take 2 from A and 9 from B and swap them.
Now the arrays look like this: A = [3, 3, 5, 6, 7, 9, 9] and B = [1, 2, 4, 6, 8, 8, 9].
Now the median of both arrays is 6.
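A quick sanity check of the final state (just verifying the medians, not a general algorithm):
import statistics

A = [3, 3, 5, 6, 7, 9, 9]
B = [1, 2, 4, 6, 8, 8, 9]
print(statistics.median(A), statistics.median(B))  # 6 6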