Rust: permutations between fixed-size and varying-size values

My data is structured this way:
[
(
permutation_result0,
permutation_result1,
permutation_result2,
(varying_size_values0),
(varying_size_values1),
(varying_size_values2)
)
...
]
where permutation_result0 relates to varying_size_values0 and so on...
Example:
[
(banana, apple, orange, (50, 20, 10), (66), (10, 2, 3))
(apple, beef, orange, (49), (5), (10, 20))
(cabbage, beef, apple, (30), (4, 3), (2, 1, 444))
]
Desired output:
[
(banana, apple, orange, 50, 66, 10)
(banana, apple, orange, 50, 66, 2)
(banana, apple, orange, 50, 66, 3)
(banana, apple, orange, 20, 66, 10)
(banana, apple, orange, 20, 66, 2)
...
(apple, beef, orange, 49, 5, 10)
(apple, beef, orange, 49, 5, 20)
(cabbage, beef, apple, 30, 4, 2)
(cabbage, beef, apple, 30, 4, 1)
...
]
or even:
[
(50, 66, 10)
(50, 66, 2)
...
(49, 5, 10)
(49, 5, 20)
...
(30, 4, 2)
(30, 4, 1)
...
]
The first three entries must stay fixed until every combination of their corresponding varying values has been exhausted.
The best idea I can come up with so far is using contains:
let mut results_vec = vec![];
for i in data_structure{
    let mut operating_vec = vec![];
    operating_vec.push(
        i.0, i.1, i.2,
        (i.3, i.3.len()),
        (i.4, i.4.len()),
        (i.5, i.5.len())
    );
    let mut inside_value_vec = vec![];
    let mut permutations_vec = vec![];
    for i in operating_vec{
        // 1. if all len > 1
        if i.3.1>1 && i.4.1>1 && i.5.1>1 {
            for xxx in i.3.0{
                inside_value_vec.push(xxx);
            };
            for xxxx in i.4.0{
                inside_value_vec.push(xxxx);
            };
            for xxxxx in i.5.0{
                inside_value_vec.push(xxxxx);
            };
            for x in inside_value_vec.iter().permutations(3){
                if i.3.0.contains(x.0)&&
                   i.4.0.contains(x.1)&&
                   i.5.0.contains(x.2){
                    results_vec.push(
                        i.0, i.1, i.2,
                        x.0, x.1, x.2
                    );
                }
            };
            // 2.check if & which values' len == 1
            //problem is I can't do it 1 by 1, I have to check all 3 values for that condition
            //then remember which positions were len = 1 and glue the derived permutations' values accordingly.
            // 3. else continue ( for 0 len values )
        };
    };
};
As per the comments in my code I can't figure out any remotely clean way for the len == 1 conditionals. I hope this problem is at least somewhat interesting to the reader and that I can receive guidance. Thanks.
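One possible direction, sketched under assumptions: the desired output looks like a Cartesian product of the three value lists rather than a permutation, so three nested loops would cover the len == 1 and len == 0 cases without any special handling. The `Row` alias, the `expand` name, and the choice of `&str` labels with `i32` values are hypothetical, since the question does not state the concrete types.

```rust
/// One input row: three fixed labels plus three variable-length value lists.
/// (Hypothetical types; the question does not specify them.)
type Row<'a> = (&'a str, &'a str, &'a str, Vec<i32>, Vec<i32>, Vec<i32>);

/// One output row: the three labels plus one value picked from each list.
type Flat<'a> = (&'a str, &'a str, &'a str, i32, i32, i32);

/// Expands each row into the Cartesian product of its three value lists,
/// keeping the three labels fixed. The last list varies fastest.
/// A row with an empty list contributes nothing, and a list of length 1
/// simply repeats its single value, so no branching on `len()` is needed.
fn expand<'a>(rows: &[Row<'a>]) -> Vec<Flat<'a>> {
    let mut out = Vec::new();
    for (a, b, c, xs, ys, zs) in rows {
        for &x in xs {
            for &y in ys {
                for &z in zs {
                    out.push((*a, *b, *c, x, y, z));
                }
            }
        }
    }
    out
}
```

With the example data, the first row expands to 9 tuples, the second to 2 and the third to 6, in the order shown under "Desired output". If itertools is already a dependency, its `iproduct!` macro expresses the same nested loops more compactly.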

Related

Modifying overlapping time period to include 1 day difference

I am trying to modify the overlapping time period problem so that if there is 1 day difference between dates, it should still be counted as an overlap. As long as the difference in dates is less than 2 days it should be seen as an overlap.
This is the dataframe containing the dates
df_dates = pd.DataFrame({"id": [102, 102, 102, 102, 103, 103, 104, 104, 104, 102, 104, 104, 103, 106, 106, 106],
"start dates": [pd.Timestamp(2002, 1, 1), pd.Timestamp(2002, 3, 3), pd.Timestamp(2002,10,20), pd.Timestamp(2003, 4, 4), pd.Timestamp(2003, 8, 9), pd.Timestamp(2005, 2, 8), pd.Timestamp(1993, 1, 1), pd.Timestamp(2005, 2, 3), pd.Timestamp(2005, 2, 16), pd.Timestamp(2002, 11, 16), pd.Timestamp(2005, 2, 23), pd.Timestamp(2005, 10, 11), pd.Timestamp(2015, 2, 9), pd.Timestamp(2011, 11, 24), pd.Timestamp(2011, 11, 24), pd.Timestamp(2011, 12, 21)],
"end dates": [pd.Timestamp(2002, 1, 3), pd.Timestamp(2002, 12, 3),pd.Timestamp(2002,11,20), pd.Timestamp(2003, 4, 4), pd.Timestamp(2004, 11, 1), pd.Timestamp(2015, 2, 8), pd.Timestamp(2005, 2, 3), pd.Timestamp(2005, 2, 15) , pd.Timestamp(2005, 2, 21), pd.Timestamp(2003, 2, 16), pd.Timestamp(2005, 10, 8), pd.Timestamp(2005, 10, 21), pd.Timestamp(2015, 2, 17), pd.Timestamp(2011, 12, 31), pd.Timestamp(2011, 11, 25), pd.Timestamp(2011, 12, 22)]
})
This was helpful with answering the overlap question but I am not sure how to modify it (red circle) to include 1 day difference
This was my attempt at answering the question, which kind of did (red circle), but then the overlap calculation is not always right (yellow circle)
def Dates_Restructure(df, pers_id, start_dates, end_dates):
    df.sort_values([pers_id, start_dates], inplace=True)
    df['overlap'] = (df.groupby(pers_id)
                     .apply(lambda x: (x[end_dates].shift() - x[start_dates]) < timedelta(days=-1))
                     .reset_index(level=0, drop=True))
    df['cumsum'] = df.groupby(pers_id)['overlap'].cumsum()
    return df.groupby([pers_id, 'cumsum']).aggregate({start_dates: min, end_dates: max}).reset_index()
I will appreciate your help with this. Thanks
This was the answer I came up with and it worked. I combined the 2 solutions in my question to get this solution.
def Dates_Restructure(df_dates, pers_id, start_dates, end_dates):
    df2 = df_dates.copy()
    startdf2 = pd.DataFrame({pers_id: df2[pers_id], 'time': df2[start_dates], 'start_end': 1})
    enddf2 = pd.DataFrame({pers_id: df2[pers_id], 'time': df2[end_dates], 'start_end': -1})
    mergedf2 = pd.concat([startdf2, enddf2]).sort_values([pers_id, 'time'])
    mergedf2['cumsum'] = mergedf2.groupby(pers_id)['start_end'].cumsum()
    mergedf2['new_start'] = mergedf2['cumsum'].eq(1) & mergedf2['start_end'].eq(1)
    mergedf2['group'] = mergedf2.groupby(pers_id)['new_start'].cumsum()
    df2['group_id'] = mergedf2['group'].loc[mergedf2['start_end'].eq(1)]
    df3 = df2.groupby([pers_id, 'group_id']).aggregate({start_dates: min, end_dates: max}).reset_index()
    df3.sort_values([pers_id, start_dates], inplace=True)
    df3['overlap'] = (df3.groupby(pers_id).apply(lambda x: (x[end_dates].shift() - x[start_dates]) < timedelta(days=-1))
                      .reset_index(level=0, drop=True))
    df3['GROUP_ID'] = df3.groupby(pers_id)['overlap'].cumsum()
    return df3.groupby([pers_id, 'GROUP_ID']).aggregate({start_dates: min, end_dates: max}).reset_index()

efficient SIMD dot product in rust

I'm trying to create an efficient SIMD version of the dot product, for the i16 type, in order to implement a 2D convolution for an FIR filter.
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

#[target_feature(enable = "avx2")]
unsafe fn dot_product(a: &[i16], b: &[i16]) {
    let a = a.as_ptr() as *const [i16; 16];
    let b = b.as_ptr() as *const [i16; 16];
    let a = std::mem::transmute(*a);
    let b = std::mem::transmute(*b);
    let ms_256 = _mm256_mullo_epi16(a, b);
    dbg!(std::mem::transmute::<_, [i16; 16]>(ms_256));
    let hi_128 = _mm256_castsi256_si128(ms_256);
    let lo_128 = _mm256_extracti128_si256(ms_256, 1);
    dbg!(std::mem::transmute::<_, [i16; 8]>(hi_128));
    dbg!(std::mem::transmute::<_, [i16; 8]>(lo_128));
    let temp = _mm_add_epi16(hi_128, lo_128);
}
fn main() {
    let a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    let b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    unsafe {
        dot_product(&a, &b);
    }
}
I] ~/c/simd (master|…) $ env RUSTFLAGS="-C target-cpu=native" cargo run --release | wl-copy
warning: unused variable: `temp`
--> src/main.rs:16:9
|
16 | let temp = _mm_add_epi16(hi_128, lo_128);
| ^^^^ help: if this is intentional, prefix it with an underscore: `_temp`
|
= note: `#[warn(unused_variables)]` on by default
warning: 1 warning emitted
Finished release [optimized] target(s) in 0.00s
Running `target/release/simd`
[src/main.rs:11] std::mem::transmute::<_, [i16; 16]>(ms_256) = [
0,
1,
4,
9,
16,
25,
36,
49,
64,
81,
100,
121,
144,
169,
196,
225,
]
[src/main.rs:14] std::mem::transmute::<_, [i16; 8]>(hi_128) = [
0,
1,
4,
9,
16,
25,
36,
49,
]
[src/main.rs:15] std::mem::transmute::<_, [i16; 8]>(lo_128) = [
64,
81,
100,
121,
144,
169,
196,
225,
]
While I understand SIMD conceptually I'm not familiar with exact instructions and intrinsics.
I know that I need to multiply the two vectors and then horizontally sum the result by repeatedly halving it and using instructions that vertically add the two smaller halves.
I've found the madd instruction, which supposedly performs one such summation right after the multiplications, but I'm not sure what to do with the result.
If I use mul instead of madd, I'm not sure which instructions to use to reduce the result further.
Any help is welcome!
PS
I've tried packed_simd but it seems like it doesn't work on stable rust.
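A minimal sketch of the madd route, assuming an x86_64 CPU with AVX2 and an `i32` result: `_mm256_madd_epi16` multiplies the sixteen `i16` pairs and adds adjacent products, leaving eight `i32` lanes that can be accumulated across chunks and reduced once at the end by adding halves. The function names `dot_scalar`, `dot_avx2` and `dot_product` are made up for illustration, and overflow of the `i32` accumulator for very long or large-valued inputs is not handled.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar reference: widen to i32 before multiplying so single products cannot wrap.
fn dot_scalar(a: &[i16], b: &[i16]) -> i32 {
    a.iter().zip(b).map(|(&x, &y)| x as i32 * y as i32).sum()
}

/// AVX2 version. The caller must make sure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn dot_avx2(a: &[i16], b: &[i16]) -> i32 {
    let n = a.len().min(b.len());
    let chunks = n / 16;
    let mut acc = _mm256_setzero_si256();
    for i in 0..chunks {
        // Unaligned loads of 16 x i16 from each slice.
        let va = _mm256_loadu_si256(a.as_ptr().add(i * 16) as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr().add(i * 16) as *const __m256i);
        // madd: 16 products, adjacent pairs summed, giving 8 x i32 lanes.
        acc = _mm256_add_epi32(acc, _mm256_madd_epi16(va, vb));
    }
    // Horizontal sum: 8 lanes -> 4 -> 2 -> 1.
    let lo = _mm256_castsi256_si128(acc); // lower 128 bits
    let hi = _mm256_extracti128_si256(acc, 1); // upper 128 bits
    let sum4 = _mm_add_epi32(lo, hi);
    // Swap the two 64-bit halves and add.
    let sum2 = _mm_add_epi32(sum4, _mm_shuffle_epi32(sum4, 0b01_00_11_10));
    // Swap adjacent 32-bit lanes and add; lane 0 now holds the total.
    let sum1 = _mm_add_epi32(sum2, _mm_shuffle_epi32(sum2, 0b10_11_00_01));
    let total = _mm_cvtsi128_si32(sum1);
    // Leftover elements that did not fill a 16-lane chunk.
    total + dot_scalar(&a[chunks * 16..n], &b[chunks * 16..n])
}

/// Picks the AVX2 path when available at run time, otherwise the scalar one.
fn dot_product(a: &[i16], b: &[i16]) -> i32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // AVX2 support was checked on the line above.
            return unsafe { dot_avx2(a, b) };
        }
    }
    dot_scalar(a, b)
}
```

For the 16-element arrays in the question, `dot_product(&a, &b)` returns 1240, the sum of the squares of 0 through 15. Reducing once after the loop, rather than once per chunk, keeps the horizontal shuffles out of the hot path.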

get multiple tuples from list of tuples using min function

I have a list that looks like this
mylist = [('Part1', 5, 5), ('Part2', 7, 7), ('Part3', 11, 9),
('Part4', 45, 45), ('part5', 5, 5)]
I am looking for all the tuples that have a number closest to my input.
Right now I am using this code:
result = min([x for x in mylist if x[1] >= 4 and x[2] >= 4])
The result I am getting is
('part5', 5, 5)
But I am looking for a result more like
[('Part1', 5, 5), ('part5', 5, 5)]
If more tuples match (there are 2 in this example, but there could be more), I would like to get all of them back.
The whole code:
mylist = [('Part1', 5, 5), ('Part2', 7, 7), ('Part3', 11, 9), ('Part4', 45, 45), ('part5', 5, 5)]
result = min([x for x in mylist if x[1] >= 4 and x[2] >= 4])
print(result)
threshold = 4
mylist = [('Part1', 5, 5), ('Part2', 7, 7), ('Part3', 11, 9), ('Part4', 45, 45), ('part5', 5, 5)]
filtered = [x for x in mylist if x[1] >= threshold and x[2] >= threshold]
keyfunc = lambda x: x[1]
my_min = keyfunc(min(filtered, key=keyfunc))
result = [v for v in filtered if keyfunc(v)==my_min]
# [('Part1', 5, 5), ('part5', 5, 5)]

Range function in M?

Is it possible to create a numerical range in M? For example something like:
let
x = range(1,10) // {1,2,3,4,5,6,7,8,9,10}, from 1 to 10, increment by 1
x = range(1,10,2) // {1,3,5,7,9}, from 1 to 10, increment by 2
For simple scenarios, a..b might be appropriate. Some examples:
let
firstList = {1..10}, // {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
secondList = {1, 5, 12, 14..17, 18..20}, // {1, 5, 12, 14, 15, 16, 17, 18, 19, 20}
thirdList = {Character.ToNumber("a")..100, 110..112}, // {97, 98, 99, 100, 110, 111, 112}
fourthList = {95..(10 * 10)} // {95, 96, 97, 98, 99, 100}
in
fourthList
Otherwise, maybe try a custom function which internally uses List.Generate:
let
range = (inclusiveStart as number, inclusiveEnd as number, optional step as number) as list =>
let
interval = if step = null then 1 else step,
conditionFunc =
if (interval > 0) then each _ <= inclusiveEnd
else if (interval < 0) then each _ >= inclusiveEnd
else each false,
generated = List.Generate(() => inclusiveStart, conditionFunc, each _ + interval)
in generated,
firstList = range(1, 10), // {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
secondList = range(1, 10, 2), // {1, 3, 5, 7, 9}
thirdList = range(1, 10 , -1), // {} due to the combination of negative "step", but "inclusiveEnd" > "inclusiveStart"
fourthList = range(10, 1, 0), // {} current behaviour is to return an empty list if 0 is passed as "step" argument.
fifthList = range(10, 1, -1), // {10, 9, 8, 7, 6, 5, 4, 3, 2, 1}
sixthList = range(10, 1, 1) // {} due to the combination of positive "step", but "inclusiveEnd" < "inclusiveStart"
in
sixthList
The above range function should be able to generate both ascending and descending sequences (see examples in code).
Alternatively, you could create a custom function which uses List.Numbers internally by first calculating what the count argument needs to be. But I chose to go with List.Generate.

Inserting multiple elements in a numpy array

Is there a function in Python that allows me to insert several 100s, or consecutive non-zero numbers, after each element of the array [1,2,3,4,5]?
Output should be [1, 100, 100, 100, 2, 100, 100, 100, 3 .....] or [ 1, 100, 101, 102, 2 , 100, 101, 102, 3...]
I have tried numpy.insert()
ar2=np.insert(ar1, slice(1,None), range(100,103))
Output: array([ 1, 100, 2, 101, 3, 102, 4, 100, 5, 101])
The numpy.insert() call above adds only a single number between the input elements. Let me know your thoughts on this.
You can use numpy.kron. For the first output:
np.kron([1,2,3,4,5],[1,0,0,0]) + 100*np.kron(np.ones(5),[0,1,1,1])
and for the second one:
np.kron([1,2,3,4,5],[1,0,0,0]) + np.kron(np.ones(5),[0,100,101,102])
