I have this code and I want every combination to be multiplied:
fn main() {
let min = 1;
let max = 9;
for i in (min..=max).rev() {
for j in (min..=max).rev() {
println!("{}", i * j);
}
}
}
Result is something like:
81
72
[...]
9
72
65
[...]
8
6
4
2
9
8
7
6
5
4
3
2
1
Is there a clever way to produce the results in descending order (without collecting and sorting) and without duplicates?
Note that this answer provides a solution for this specific problem (multiplication table) but the title asks a more general question (any two iterators).
The naive solution of storing all elements in a vector and then sorting it uses O(n^2 log n) time and O(n^2) space (where n is the size of the multiplication table).
You can use a priority queue to reduce the memory to O(n):
use std::collections::BinaryHeap;
fn main() {
let n = 9;
let mut heap = BinaryHeap::new();
for j in 1..=n {
heap.push((n * j, j));
}
let mut last = n * n + 1;
while let Some((val, j)) = heap.pop() {
if val < last {
println!("{val}");
last = val;
}
if val > j {
heap.push((val - j, j));
}
}
}
The conceptual idea behind the algorithm is to consider 9 separate sequences
9*9, 9*8, 9*7, .., 9*1
8*9, 8*8, 8*7, .., 8*1
...
1*9, 1*8, 1*7, .., 1*1
Since they are all decreasing, at a given moment, we only need to consider one element of each sequence (the largest one we haven't reached yet).
These are inserted into the priority queue which allows us to efficiently find the maximum one.
Once we have printed a given element we move onto the next one in the sequence and insert that into the priority queue.
By keeping track of the last element printed we can avoid duplicates.
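The same idea can be wrapped in an iterator so the products are produced lazily. This is only a sketch: the type name `DescendingProducts` is made up for illustration, and it assumes the table runs from 1 to n on both axes with products that fit in a `u32`.

```rust
use std::collections::BinaryHeap;

/// Lazily yields the distinct products i * j (1 <= i, j <= n), largest first.
/// Hypothetical helper type, not part of any library.
struct DescendingProducts {
    heap: BinaryHeap<(u32, u32)>, // (current product, step j of that sequence)
    last: Option<u32>,            // last value handed out, used to skip duplicates
}

impl DescendingProducts {
    fn new(n: u32) -> Self {
        // One entry per sequence: its largest element n * j.
        let heap = (1..=n).map(|j| (n * j, j)).collect();
        Self { heap, last: None }
    }
}

impl Iterator for DescendingProducts {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        while let Some((val, j)) = self.heap.pop() {
            // Advance this sequence to its next (smaller) element.
            if val > j {
                self.heap.push((val - j, j));
            }
            // Values come out in non-increasing order, so duplicates are adjacent.
            if self.last != Some(val) {
                self.last = Some(val);
                return Some(val);
            }
        }
        None
    }
}
```

Something like `DescendingProducts::new(9).take(5)` would then only do the work needed for the first five values, while the heap never holds more than n entries.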
I've been rewriting some performance-sensitive parts of my code to use AArch64 NEON. For some things, like population count, I've managed to get a 12x speedup. But for some algorithms I'm having trouble.
The high-level problem is quickly adding a list of newline-separated strings to a hashset. Assuming the hashset functionality is optimal (I am looking into it next), I first need to scan for the strings in the buffer.
I have tried various techniques, but my intuition tells me that I can create a list of pointers to each newline, and then insert them into the hashset afterwards now that I have the slices.
The fundamental problem is I can't work out an efficient way to load a vector, compare against the newline, and spit out a list of pointers to the newlines, i.e. the output has a variable length, depending on how many newlines were found in the input vector.
Here is my approach:
fn read_file7(mut buffer: Vec<u8>, needle: u8) -> Result<HashSet<Vec<u8>>, Error>
{
let mut set = HashSet::new();
let mut chunk_offset: usize = 0;
let special_finder_big = [
0x80u8, 0x40u8, 0x20u8, 0x10u8, 0x08u8, 0x04u8, 0x02u8, 0x01u8, // high
0x80u8, 0x40u8, 0x20u8, 0x10u8, 0x08u8, 0x04u8, 0x02u8, 0x01u8, // low
];
let mut next_start: usize = 0;
let needle_vector = unsafe { vdupq_n_u8(needle) };
let special_finder_big = unsafe { vld1q_u8(special_finder_big.as_ptr()) };
let mut line_counter = 0;
// we process 16 chars at a time
for chunk in buffer.chunks(16) {
unsafe {
let src = vld1q_u8(chunk.as_ptr());
let out = vceqq_u8(src, needle_vector);
let anded = vandq_u8(out, special_finder_big);
// each of these is a bitset of each matching character
let vadded = vaddv_u8(vget_low_u8(anded));
let vadded2 = vaddv_u8(vget_high_u8(anded));
let list = [vadded2, vadded];
// combine bitsets into one big one!
let mut num = std::mem::transmute::<[u8; 2], u16>(list);
// while our bitset has bits left, find the set bits
while num > 0 {
let mut xor = 0x8000u16; // only set the highest bit
let clz = (num).leading_zeros() as usize;
set.get_or_insert_owned(&buffer[(next_start)..(chunk_offset + clz)]);
// println!("found '{}' at {} | clz is {} ", needle.escape_ascii(), start_offset + clz, clz);
// println!("string is '{}'", input[(next_start)..(start_offset + clz)].escape_ascii());
xor = xor >> clz;
num = num ^ xor;
next_start = chunk_offset + clz + 1;
//println!("new num {:032b}", num);
line_counter += 1;
}
}
chunk_offset += 16;
}
// get the remaining
set.get_or_insert_owned(&buffer[(next_start)..]);
println!(
"line_counter: {} unique elements {}",
line_counter,
set.len()
);
Ok(set)
}
If I unroll this to do 64 bytes at a time, on a big input it is slightly faster than memchr, but not by much.
Any tips would be appreciated.
I've shown this to a colleague who's come up with better intrinsics code than I would. Here's his suggestion. It has not been compiled, so some pseudo-code pieces still need finishing off, but something along the lines of the code below should work and be much faster:
let mut line_counter = 0;
for chunk in buffer.chunks(32) { // Read 32 bytes at a time
unsafe {
let src1 = vld1q_u8(chunk.as_ptr());
let src2 = vld1q_u8(chunk.as_ptr().add(16));
let out1 = vceqq_u8(src1, needle_vector);
let out2 = vceqq_u8(src2, needle_vector);
// We slot these next to each other in the same vector.
// In this case the bottom 64-bits of the vector will tell you
// if there are any needle values inside the first vector and
// the top 64-bits tell you if you have any needle values in the
// second vector.
let combined = vpmaxq_u8(out1, out2);
// Now we get another maxp which compresses this information into
// a single 64-bit value, where the bottom 32-bits tell us about
// src1 and the top 32-bit about src2.
let combined = vpmaxq_u8(combined, combined);
let remapped = vreinterpretq_u64_u8 (combined);
let val = vgetq_lane_u64 (remapped, 0);
if val == 0 { // most chunks won't have a new-line
... // If val is 0 that means no match was found in either vector, adjust offset and continue.
}
if val & 0xFFFF_FFFF != 0 { // bottom 32 bits
... // there must be a match in src1. use below code in a function
}
if val & 0xFFFF_FFFF_0000_0000 != 0 { // top 32 bits
... // there must be a match in src2. use below code in a function
}
...
}
}
Now that we know which vector to look in, we should find the index in the vector.
As an example, let's assume matchvec is the vector we found above (so either out1 or out2).
To find the first index:
// We create a mask of repeating 0xf00f chunks. When we fill an entire vector
// with it we get a pattern where every byte is 0xf0 or 0x0f. We'll use this
// to find the index of the matches.
let mask = unsafe { vreinterpretq_u8_u16(vdupq_n_u16(0xf00f)) };
// We first clear the bits we don't want, which leaves 4 bits of free space,
// alternating between the high and low half of each adjacent 8-bit entry.
let masked = vandq_u8(matchvec, mask);
// Which means when we do a pairwise addition
// we are sure that no overflow will ever happen. The entries slot next to each other
// and a non-zero nibble marks a matching element.
// We've also compressed the values into the lower 64-bits again.
let compressed = vpaddq_u8(masked, masked);
let val = vgetq_lane_u64(vreinterpretq_u64_u8(compressed), 0);
// val now contains one 4-bit entry per input byte, with the first byte of the
// vector in the lowest 4 bits.
// This assumes Rust has kept val on the SIMD side. If it did not, then it's best to
// do the bit scan on the lower 64-bits of compressed and transfer the result.
let pos = val.trailing_zeros() as usize;
// Each entry is 4 bits wide, so shift pos right by 2 to get the actual index.
let pos = pos >> 2;
pos will now contain the index of the first needle value.
If you were processing out2, remember to add 16 to the result.
To find all the indices we can run through the bitmask without using clz, we avoid the repeated register file transfers this way.
// set masked and compressed as above
let masked = vandq_u8(matchvec, mask);
let compressed = vpaddq_u8(masked, masked);
let mut val = vgetq_lane_u64(vreinterpretq_u64_u8(compressed), 0);
let mut idx = current_offset;
while val != 0 {
    if val & 0xf != 0 {
        // entry found at idx.
    }
    idx += 1;
    val >>= 4;
}
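To make the nibble-mask trick easier to check without NEON hardware, here is a scalar model of it in portable Rust. It is not the intrinsics code itself: `nibble_mask` and `match_indices` are hypothetical helper names, and the input array stands in for the 0xFF/0x00 bytes that a byte-wise compare would produce.

```rust
/// Scalar model of the mask-and-pairwise-add step. `matches[i]` is 0xFF where
/// byte i equalled the needle and 0x00 elsewhere.
fn nibble_mask(matches: [u8; 16]) -> u64 {
    let mut packed = [0u8; 8];
    for (k, pair) in matches.chunks_exact(2).enumerate() {
        // Keep the low nibble of the even byte and the high nibble of the odd
        // byte, then add them: the two halves cannot overflow into each other.
        packed[k] = (pair[0] & 0x0f) + (pair[1] & 0xf0);
    }
    // Byte 0 of the vector ends up in the lowest bits of the 64-bit value.
    u64::from_le_bytes(packed)
}

/// Returns the indices (0..16) of all matching bytes, lowest first.
fn match_indices(mut mask: u64) -> Vec<usize> {
    let mut out = Vec::new();
    while mask != 0 {
        // Each input byte owns 4 bits, so bit position / 4 is the byte index.
        let idx = (mask.trailing_zeros() / 4) as usize;
        out.push(idx);
        mask &= !(0xfu64 << (idx * 4)); // clear this entry's nibble
    }
    out
}
```

Adding the chunk offset (and 16 for the second vector) to each returned index would give the position in the buffer.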
In [3, 2, 1, 1, 1, 0], if the value we are searching for is 1, then the function should return 2.
I found binary search, but it seems to return the last occurrence.
I do not want a function that iterates over the entire vector and matches one by one.
binary_search assumes that the elements are sorted in less-to-greater order. Yours is reversed, so you can use binary_search_by:
let x = 1; //value to look for
let data = [3,2,1,1,1,0];
let idx = data.binary_search_by(|probe| probe.cmp(&x).reverse());
Now, as you say, you do not get the first one. That is expected, for the binary search algorithm will select an arbitrary value equal to the one searched. From the docs:
If there are multiple matches, then any one of the matches could be returned.
That is easily solvable with a loop:
let mut idx = data.binary_search_by(|probe| probe.cmp(&x).reverse());
if let Ok(ref mut i) = idx {
while *i > 0 {
if data[*i - 1] != x {
break;
}
*i -= 1;
}
}
But if you expect many duplicates that may negate the advantages of the binary search.
If that is a problem for you, you can try to be smarter. For example, you can take advantage of this comment in the docs of binary_search:
If the value is not found then Result::Err is returned, containing the index where a matching element could be inserted while maintaining sorted order.
So to get the index of the first value with a 1 you look for an imaginary value just between 2 and 1 (remember that your array is reversed), something like 1.5. That can be done by tweaking the comparison function a bit:
let mut idx = data.binary_search_by(|probe| {
//the 1s in the slice are greater than the 1 in x
probe.cmp(&x).reverse().then(std::cmp::Ordering::Greater)
});
There is a handy function Ordering::then() that does exactly what we need (the Rust stdlib is amazingly complete).
Or you can use a simpler direct comparison:
let mut idx = data.binary_search_by(|probe| {
use std::cmp::Ordering::*;
if *probe > x { Less } else { Greater }
});
The only detail left is that this function will always return Err(i), where i is either the position of the first 1 or the position where the 1 would be if there are none. An extra comparison is necessary to resolve this ambiguity:
if let Err(i) = idx {
//beware! i may be 1 past the end of the slice
if data.get(i) == Some(&x) {
idx = Ok(i);
}
}
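Put together, the pieces above might look like the following helper. This is a sketch rather than a canonical solution: the function name `first_index_desc` is invented here, and it assumes an `i32` slice sorted in descending order.

```rust
use std::cmp::Ordering;

/// Index of the first element equal to `x` in a slice sorted in descending order.
fn first_index_desc(data: &[i32], x: i32) -> Option<usize> {
    // Never report Equal: elements greater than x sort before the target,
    // everything else (including x itself) sorts after it.
    let result = data.binary_search_by(|probe| {
        if *probe > x {
            Ordering::Less
        } else {
            Ordering::Greater
        }
    });
    // The search therefore always ends in Err with the insertion point.
    let i = result.unwrap_or_else(|i| i);
    // i may be one past the end, so use get() instead of indexing.
    (data.get(i) == Some(&x)).then_some(i)
}
```

Because the comparator never returns Equal, the number of duplicates does not affect the O(log n) running time.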
Since Rust 1.52.0, slices ([T]) have the method partition_point, which finds the partition point for a predicate in O(log N) time.
In your case, it should be:
let xs = vec![3, 2, 1, 1, 1, 0];
let idx = xs.partition_point(|&a| a > 1);
if idx < xs.len() && xs[idx] == 1 {
println!("Found first 1 idx: {}", idx);
}
I am really new to Constraint Programming and I am trying to solve a problem where, from a two-dimensional array of numbers, I need to take as few (2D) sub-arrays as possible while covering as much of the original 2D array as possible, obeying the following rules:
Every sub array must be a rectangle part of the original
The sum of numbers in each sub array must not exceed a specific number
Every sub array must have at least 2 numbers in it
For example for the following matrix:
3 5 1 4
5 1 2 8
0 8 1 3
8 3 2 1
For a maximum sum of 10, a solution would be:
3 -not picked
{ 5 1 4 }
{ 5 1 }
{ 2 8 }
{ 0 8 }
{ 1 3
2 1 }
8 -not picked
Right now I am using the diffn() equivalent of or-tools (MakeNonOverlappingBoxesConstraint()) to create the rectangles that will cover the original array.
My problem is how to get the rectangles created by diffn() and split the original matrix based on the position and size of each one, so I can apply the Sum constraint.
If there is another way of achieving the same constraints without using diffn() then I would try it out, but I can't think of any other way.
Thank you!
The way to get a value from an array based on an IntVar, inside the solver, is by using the MakeElement() function, and in this case the 2d version of it.
That way you can get a specific value from the matrix but not a range based on two IntVars (for example from x to x + dx of a rectangle). To accomplish the range part you can use a loop and a ConditionalExpression() to figure out if the specified value is in range.
For example, in a 1d array, getting the elements of data at positions x to x + dx would be as follows:
int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
IntVar x = solver.MakeIntVar(0, data.Length - 1);
IntVar dx = solver.MakeIntVar(1, data.Length);
solver.Add(x + dx <= data.Length);
IntVarVector range = new IntVarVector();
for (int i = 0; i < dx.Max(); i++)
{
range.Add(solver.MakeConditionalExpression((x + i < x + dx).Var() , solver.MakeElement(data, (x + i).Var()), 0).Var());
}
solver.Add(range.ToArray().Sum() <= 10);
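To see what that loop encodes, it may help to look at the same sum for fixed values of x and dx in plain code. The sketch below is only a model of the expression in Rust, not or-tools API: `conditional_range_sum` is a hypothetical name, and out-of-range reads are treated as 0, which is an assumption made here for simplicity.

```rust
/// Sum of data[x .. x + dx], written the same way as the constraint model:
/// loop over the largest possible width and contribute 0 outside the range.
fn conditional_range_sum(data: &[i64], x: usize, dx: usize) -> i64 {
    let max_dx = data.len(); // upper bound of dx in the model
    (0..max_dx)
        .map(|i| {
            if i < dx {
                // Condition true: the "element" term data[x + i].
                // Reads past the end count as 0 (assumption of this sketch).
                data.get(x + i).copied().unwrap_or(0)
            } else {
                0 // Condition false: the fallback value.
            }
        })
        .sum()
}
```

The loop length is fixed up front, which is why the model iterates up to the maximum of dx and lets the condition zero out the unused terms.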
In the case of the 2d array (as in the question), you just iterate through both dimensions. The only difference is that the 2d version of MakeElement() accepts an IndexEvaluator2 item (LongLongToLong in C#), so you have to make your own class that inherits from LongLongToLong and overrides the Run() function.
class DataValues: LongLongToLong
{
private int[,] _data;
private int _rows;
private int _cols;
public DataValues(int[,] data, int rows, int cols)
{
_rows = rows;
_cols = cols;
_data = data;
}
public override long Run(long arg0, long arg1)
{
if (arg0 >= _rows || arg1 >= _cols)
return 0;
return _data[arg0, arg1];
}
}
The only problem with this class is that it can be asked for a value outside the array, so we must handle that ourselves with if (arg0 >= _rows || arg1 >= _cols).
P.S. I don't know if this is the best method of accomplishing it, but it was the best I could think of, since I couldn't find anything similar online.
We need to find the maximum element in an array which is also equal to product of two elements in the same array. For example [2,3,6,8] , here 6=2*3 so answer is 6.
My approach was to sort the array and then use a two-pointer method to check, for each element, whether the product exists. This is an O(n log n) + O(n^2) = O(n^2) approach. Is there a faster way to do this?
There is a slightly better solution, O(n * sqrt(M)), if you are allowed to use O(M) memory, where M is the maximum number in A.
Use an array of size M to mark every number while you traverse them from smallest to largest.
For each number, try all its factors and see whether they were already marked in the array map.
Here is pseudo code for that:
#define M 1000000
int array_map[M+2];
int ans = -1;
sort(A,A+n);
for(i=0;i<n;i++) {
for(j=1;j<=sqrt(A[i]);j++) {
int num1 = j;
if(A[i]%num1==0) {
int num2 = A[i]/num1;
if(array_map[num1] && array_map[num2]) {
if(num1==num2) {
if(array_map[num1]>=2) ans = A[i];
} else {
ans = A[i];
}
}
}
}
array_map[A[i]]++;
}
There is an even better approach: if you know how to find all possible factors in log(M), this just becomes O(n*logM). You have to use a sieve and backtracking for that.
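For comparison, the factor-enumeration idea from the pseudo code can be sketched in Rust as follows. Two things differ from the pseudo code and are assumptions of this sketch: a `HashMap` replaces the array of size M, and the function name `max_product_element` is made up. It also assumes positive inputs, so 0 is never reported.

```rust
use std::collections::HashMap;

/// Largest value that is the product of two other elements of `values`,
/// or None if there is no such value. Assumes positive inputs.
fn max_product_element(values: &[u64]) -> Option<u64> {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let mut seen: HashMap<u64, usize> = HashMap::new(); // value -> occurrences so far
    let mut best = None;
    for &v in &sorted {
        let mut d = 1;
        // d <= v / d is d * d <= v without the risk of overflow.
        while d <= v / d {
            if v % d == 0 {
                let q = v / d;
                let count_d = seen.get(&d).copied().unwrap_or(0);
                let count_q = seen.get(&q).copied().unwrap_or(0);
                // A square needs two separate copies of its root.
                let found = if d == q {
                    count_d >= 2
                } else {
                    count_d >= 1 && count_q >= 1
                };
                if found {
                    best = Some(v); // values ascend, so later hits are larger
                }
            }
            d += 1;
        }
        *seen.entry(v).or_insert(0) += 1;
    }
    best
}
```

Marking a value only after its factors have been checked is what prevents an element from being used as its own factor.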
@JerryGoyal's solution is correct. However, I think it can be optimized even further if, instead of using the B pointer, we use binary search to find the other factor of the product whenever arr[c] is divisible by arr[a]. Here's the modification of his code:
for(c=n-1;(c>1)&& (max==-1);c--){ // loop through C
for(a=0;(a<c-1)&&(max==-1);a++){ // loop through A
if(arr[c]%arr[a]==0) // If arr[c] is divisible by arr[a]
{
if(binary_search(arr+a+1, arr+c, arr[c]/arr[a])) // std::binary_search from <algorithm>, searches arr[a+1..c-1]
{
max = arr[c]; // if the other factor x of arr[c] is also in the array such that arr[c] = arr[a] * x
break;
}
}
}
}
I would have commented this on his solution, unfortunately I lack the reputation to do so.
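A Rust sketch of the same binary-search variant is below. The function name `max_product_binary_search` is invented for illustration, and the code assumes positive values, since the sorted-order argument does not hold for negative numbers.

```rust
/// Largest element equal to the product of two other elements, assuming
/// positive values. Returns None if no such element exists.
fn max_product_binary_search(values: &[u64]) -> Option<u64> {
    let mut arr = values.to_vec();
    arr.sort_unstable();
    // c walks the candidates from the largest element downwards.
    for c in (2..arr.len()).rev() {
        for a in 0..c - 1 {
            if arr[a] == 0 || arr[c] % arr[a] != 0 {
                continue; // arr[a] cannot be a factor of arr[c]
            }
            let cofactor = arr[c] / arr[a];
            // Look for the other factor strictly between a and c.
            if arr[a + 1..c].binary_search(&cofactor).is_ok() {
                return Some(arr[c]);
            }
        }
    }
    None
}
```

Each candidate now costs O(n log n) instead of O(n^2), at the price of one division per pair.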
Try this.
Written in c++
#include <vector>
#include <algorithm>
using namespace std;

// Returns the largest element that is the product of two other elements,
// or -1 if there is none. Assumes the values are positive.
int MaxElement(vector< int > Input)
{
    sort(Input.begin(), Input.end());
    const int n = static_cast<int>(Input.size());
    // Try each candidate, starting from the largest element.
    for (int c = n - 1; c >= 0; c--)
    {
        const int LargestElementOfInput = Input[c];
        for (int i = 0; i < n; i++)
        {
            // Skip the candidate itself and values that do not divide it.
            if (i == c || Input[i] == 0 || LargestElementOfInput % Input[i] != 0)
                continue;
            const int AllowedValue = LargestElementOfInput / Input[i];
            for (int j = 0; j < n; j++)
            {
                if (Input[j] > AllowedValue)
                    break; // sorted input: no later element can match
                if (j == i || j == c)
                    continue;
                if (Input[j] == AllowedValue)
                    return LargestElementOfInput;
            }
        }
    }
    return -1;
}
Once you have sorted the array, then you can use it to your advantage as below.
One improvement I can see, since you want to find the max element that meets the criteria:
1. Start from the rightmost element of the array (8).
2. Divide that by the first element of the array (8/2 = 4).
3. Now continue with the two-pointer approach while the element at the second pointer does not exceed the value from step 2 and no match has been found (i.e., stop once the second pointer value is > 4 or a match is found).
4. If a match is found, then you have the max element.
5. Otherwise, continue the loop with the next highest element of the array (6).
Efficient solution:
2 3 8 6
Sort the array
Keep three pointers C, B and A.
Place C at the last index, A at index 0 and B at index 1.
Traverse the array using pointers A and B up to C and check whether A*B=C.
If it does, then C is your answer.
Otherwise, move C one position back and traverse again, starting with A at index 0 and B at index 1.
Keep repeating this until you find the product or C reaches index 1.
Here's the complete solution:
int arr[] = new int[]{2, 3, 8, 6};
Arrays.sort(arr);
int n=arr.length;
int a,b,c,prod,max=-1;
for(c=n-1;(c>1)&& (max==-1);c--){ // loop through C
for(a=0;(a<c-1)&&(max==-1);a++){ // loop through A
for(b=a+1;b<c;b++){ // loop through B
prod=arr[a]*arr[b];
if(prod==arr[c]){
System.out.println("A: "+arr[a]+" B: "+arr[b]);
max=arr[c];
break;
}
if(prod>arr[c]){ // no need to go further
break;
}
}
}
}
System.out.println(max);
I came up with the solution below, where I am using one array list and the following formula:
divisor(a or b) X quotient(b or a) = dividend(c)
1. Sort the array.
2. Put the array into a Collection col (e.g. one which has faster lookup and maintains insertion order).
3. Have 2 pointers a, c.
4. Keep c at the last index, and a at 0.
5. Try to follow (divisor(a or b) X quotient(b or a) = dividend(c)).
6. Check if a is a divisor of c; if yes, then check for b in col.(a
7. If a is a divisor and the list has b, then c is the answer.
8. Else increase a by 1 and follow steps 5 and 6 till c-1.
9. If max is not found, then decrease the c index and follow steps 4 and 5.
Check this C# solution:
-Loop through each element,
-loop and multiply each element with other elements,
-verify if the product exists in the array and is the max
private static int GetGreatest(int[] input)
{
int max = 0;
int p = 0; //product of pairs
//loop through the input array
for (int i = 0; i < input.Length; i++)
{
for (int j = i + 1; j < input.Length; j++)
{
p = input[i] * input[j];
if (p > max && Array.IndexOf(input, p) != -1)
{
max = p;
}
}
}
return max;
}
Time complexity O(n^2)
I was asked this question at a Google interview. I could do it in O(n*n) ... Can I do it in better time?
A string can be formed only by 1 and 0.
Definition:
X & Y are strings formed by 0 or 1
D(X,Y) = Remove the common prefix from both X and Y, then add the lengths of what remains of the two strings.
For example:
D(1111, 1000) = Only the first character is common, so the remaining strings are 111 and 000. Therefore the result is length("111") + length("000") = 3 + 3 = 6.
D(101, 1100) = Only the first character is common, so the remaining strings are 01 and 100. Therefore the result is length("01") + length("100") = 2 + 3 = 5.
It is pretty obvious that computing such a crazy distance is linear, O(m).
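As a concrete reading of the definition, a minimal Rust sketch of D could look like this. The function name `crazy_distance` is made up here, and the strings are assumed to be ASCII so that byte length equals character count.

```rust
/// D(x, y): drop the longest common prefix, then add the remaining lengths.
fn crazy_distance(x: &str, y: &str) -> usize {
    // Number of leading positions where both strings agree.
    let common = x
        .bytes()
        .zip(y.bytes())
        .take_while(|(a, b)| a == b)
        .count();
    (x.len() - common) + (y.len() - common)
}
```

The scan stops at the first mismatch or at the end of the shorter string, so one call costs O(m).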
Now the question is
Given n inputs, for example
1111
1000
101
1100
Find out the maximum crazy distance possible.
n is the number of input strings.
m is the max length of any input string.
The O(n^2 * m) solution is pretty simple. Can it be done in a better way?
Let's assume that m is fixed. Can we do this in better than O(n^2) ?
Put the strings into a tree, where 0 means go left and 1 means go right. So for example
1111
1000
101
1100
would result in a tree like
        Root
                1
          0           1
       0     1*    0     1
     0*          0*        1*
where the * means that an element ends there. Constructing this tree clearly takes O(n m).
Now we have to find the diameter of the tree (the longest path between two nodes, which is the same thing as the "crazy distance"). The optimized algorithm presented there hits each node in the tree once. There are at most min(n m, 2^m) such nodes.
So if n m < 2^m, then the algorithm is O(n m).
If n m > 2^m (and we necessarily have repeated inputs), then the algorithm is still O(n m) from the first step.
This also works for strings with a general alphabet; for an alphabet with k letters build a k-ary tree, in which case the runtime is still O(n m) by the same reasoning, though it takes k times as much memory.
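A possible Rust sketch of this trie approach is shown below. It is one way to fill in the details, not the only one: the names `Node`, `walk` and `max_crazy_distance` are invented, nodes live in a `Vec` and refer to each other by index, and any byte other than '1' is treated as '0'. The diameter step only pairs nodes where a string ends, which matters when one input is a prefix of another.

```rust
#[derive(Default)]
struct Node {
    children: [Option<usize>; 2], // index 0 = left ('0'), index 1 = right ('1')
    terminal: bool,               // true if some input string ends here
}

/// Returns (distance from `node` down to its deepest string end,
///          best distance between two string ends inside this subtree).
fn walk(nodes: &[Node], node: usize) -> (usize, usize) {
    let mut best = 0;
    let mut depths = Vec::new();
    if nodes[node].terminal {
        depths.push(0); // a string ends at this node
    }
    for &child in nodes[node].children.iter().flatten() {
        let (height, inner) = walk(nodes, child);
        best = best.max(inner);
        depths.push(height + 1);
    }
    depths.sort_unstable_by(|a, b| b.cmp(a));
    // The two deepest branches meet at this node.
    if depths.len() >= 2 {
        best = best.max(depths[0] + depths[1]);
    }
    (depths.first().copied().unwrap_or(0), best)
}

/// Maximum crazy distance over all pairs, or None for fewer than two strings.
fn max_crazy_distance(strings: &[&str]) -> Option<usize> {
    if strings.len() < 2 {
        return None;
    }
    let mut nodes = vec![Node::default()]; // index 0 is the root
    for s in strings {
        let mut cur = 0;
        for b in s.bytes() {
            let bit = usize::from(b == b'1'); // anything else counts as '0'
            let existing = nodes[cur].children[bit];
            cur = match existing {
                Some(next) => next,
                None => {
                    nodes.push(Node::default());
                    let next = nodes.len() - 1;
                    nodes[cur].children[bit] = Some(next);
                    next
                }
            };
        }
        nodes[cur].terminal = true;
    }
    Some(walk(&nodes, 0).1)
}
```

Building the trie touches each input character once and the walk touches each node once, so the whole thing stays within O(n m). The recursion depth is at most m.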
I think this is possible in O(nm) time by creating a binary tree where each bit in a string encodes the path (0 left, 1 right). Then finding the maximum distance between nodes of the tree which can be done in O(n) time.
This is my solution, I think it works:
Create a binary tree from all strings. The tree will be constructed in this way:
at every round, select a string and add it to the tree. so for your example, the tree will be:
<root>
<1> <empty>
<1> <0>
<1> <0> <1> <0>
<1> <0> <0>
So each path from root to a leaf will represent a string.
Now the distance between any two leaves is the distance between the two strings. To find the crazy distance, you must find the diameter of this graph, which you can do easily with DFS or BFS.
The total complexity of this algorithm is:
O(n*m) + O(n*m) = O(n*m).
I think this problem is something like "find prefix for two strings"; you can use a trie (http://en.wikipedia.org/wiki/Trie) to accelerate searching.
I have a google phone interview 3 days before, but maybe I failed...
Best luck to you
To get an answer in O(nm) just iterate across the characters of all string (this is an O(n) operation). We will compare at most m characters, so this will be done O(m). This gives a total of O(nm). Here's a C++ solution:
int max_distance(char** strings, int numstrings, int &distance) {
distance = 0;
// loop O(n) for initialization
for (int i=0; i<numstrings; i++)
distance += strlen(strings[i]);
int max_prefix = 0;
bool done = false;
// loop max O(m)
while (!done) {
int c = -1;
// loop O(n)
for (int i=0; i<numstrings; i++) {
if (strings[i][max_prefix] == 0) {
done = true; // it is enough to reach the end of one string to be done
break;
}
int new_element = strings[i][max_prefix] - '0';
if (-1 == c)
c = new_element;
else {
if (c != new_element) {
done = true; // mismatch
break;
}
}
}
if (!done) {
max_prefix++;
distance -= numstrings;
}
}
return max_prefix;
}
void test_misc() {
char* strings[] = {
"10100",
"10101110",
"101011",
"101"
};
std::cout << std::endl;
int distance = 0;
std::cout << "max_prefix = " << max_distance(strings, sizeof(strings)/sizeof(strings[0]), distance) << std::endl;
}
Not sure why you would use trees when iteration gives you the same big-O computational complexity without the code complexity. Anyway, here is my version of it in JavaScript, O(mn):
var len = process.argv.length -2; // in node first 2 arguments are node and program file
var input = process.argv.splice(2);
var current;
var currentCount = 0;
var currentCharLoc = 0;
var totalCount = 0;
var totalComplete = 0;
var same = true;
while ( totalComplete < len ) {
current = null;
currentCount = 0;
for ( var loc = 0 ; loc < len ; loc++) {
if ( input[loc].length === currentCharLoc) {
totalComplete++;
same = false;
} else if (input[loc].length > currentCharLoc) {
currentCount++;
if (same) {
if ( current === null ) {
current = input[loc][currentCharLoc];
} else {
if (current !== input[loc][currentCharLoc]) {
same = false;
}
}
}
}
}
if (!same) {
totalCount += currentCount;
}
currentCharLoc++;
}
console.log(totalCount);