Problem in implementing Persistent Segment Tree - persistent

I am trying to implement Persistent Segment Tree. The queries are of 2 types: 1 and 2.
1 ind val : update the value at ind to val in the array
2 k l r : find the sum of elements from index l to r after the kth update operation.
I have implemented the update and query functions properly and they are working fine on an array. But the problem arises when I am forming different versions. Basically this is my part of code
while (q--) {
cin >> type;
if (type == 1) {
cin >> ind >> val;
node *t = new node;
*t = *ver[size - 1];
update(t, ind, val);
ver.pb(t);
size++;
}
}
cout << query(ver[0], 0, 1) << ' ' << query(ver[1], 0, 1) << query(ver[2], 0, 1);
Now the problem is that it is also changing the values of all the nodes stored in the array. That means after the updates, all the versions are storing the latest tree. This is probably because I am not properly allocating the new pointer. The changes made through the new pointer are getting reflected in all the pointers in the array.
For example if I give this input
5
1 2 3 4 5
2
1 1 10
1 0 5
where 5 is the number of elements in the array, followed by the array itself. Then comes q, the number of queries, and then the queries. After carrying out the updates, the query function called with (l, r) = (0, 1) returns 15 for all 3 versions, but it should return 3, 11, 15. What am I doing wrong?

So let's say we have some simple segment tree.
For a persistent segment tree, during an update we generate new nodes for all changed nodes (every node on the path from the root to the updated leaf) and replace pointers with pointers to the new nodes where needed. Every other subtree is shared with the previous version.
All you're doing is replacing the root and copying its data: *t = *ver[size - 1] creates a new root whose child pointers still refer to the old nodes, so update(t, ind, val) then modifies nodes that every version shares.
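A minimal sketch of path copying in Python, assuming a sum segment tree over a fixed index range. The names Node, build, update and query are illustrative and are not taken from the question's code. Each update returns a new root and allocates new nodes only along the path to the changed leaf, so older roots keep seeing the old values.

```python
class Node:
    """Immutable-by-convention segment tree node holding a range sum."""
    __slots__ = ("left", "right", "total")

    def __init__(self, left=None, right=None, total=0):
        self.left = left
        self.right = right
        self.total = total


def build(arr, lo, hi):
    # Build the initial version over arr[lo..hi].
    if lo == hi:
        return Node(total=arr[lo])
    mid = (lo + hi) // 2
    left = build(arr, lo, mid)
    right = build(arr, mid + 1, hi)
    return Node(left, right, left.total + right.total)


def update(node, lo, hi, idx, val):
    # Return the root of a NEW version; never modify an existing node.
    if lo == hi:
        return Node(total=val)
    mid = (lo + hi) // 2
    if idx <= mid:
        left = update(node.left, lo, mid, idx, val)
        right = node.right          # unchanged subtree is shared
    else:
        left = node.left            # unchanged subtree is shared
        right = update(node.right, mid + 1, hi, idx, val)
    return Node(left, right, left.total + right.total)


def query(node, lo, hi, ql, qr):
    # Sum of positions ql..qr in the version rooted at node.
    if qr < lo or hi < ql:
        return 0
    if ql <= lo and hi <= qr:
        return node.total
    mid = (lo + hi) // 2
    return (query(node.left, lo, mid, ql, qr)
            + query(node.right, mid + 1, hi, ql, qr))


# The input from the question: array 1 2 3 4 5, then "1 1 10" and "1 0 5".
arr = [1, 2, 3, 4, 5]
n = len(arr)
versions = [build(arr, 0, n - 1)]
for ind, val in [(1, 10), (0, 5)]:
    versions.append(update(versions[-1], 0, n - 1, ind, val))

print(*(query(v, 0, n - 1, 0, 1) for v in versions))  # prints: 3 11 15
```

Each update allocates only O(log n) new nodes, which is why keeping every version stays cheap.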

Multiply numbers from two iterators in order and without duplicates

I have this code and I want every combination to be multiplied:
fn main() {
let min = 1;
let max = 9;
for i in (min..=max).rev() {
for j in (min..=max).rev() {
println!("{}", i * j);
}
}
}
Result is something like:
81
72
[...]
9
72
65
[...]
8
6
4
2
9
8
7
6
5
4
3
2
1
Is there a clever way to produce the results in descending order (without collecting and sorting) and without duplicates?
Note that this answer provides a solution for this specific problem (multiplication table) but the title asks a more general question (any two iterators).
The naive solution of storing all elements in a vector and then sorting it uses O(n^2 log n) time and O(n^2) space (where n is the size of the multiplication table).
You can use a priority queue to reduce the memory to O(n):
use std::collections::BinaryHeap;
fn main() {
let n = 9;
let mut heap = BinaryHeap::new();
for j in 1..=n {
heap.push((n * j, j));
}
let mut last = n * n + 1;
while let Some((val, j)) = heap.pop() {
if val < last {
println!("{val}");
last = val;
}
if val > j {
heap.push((val - j, j));
}
}
}
The conceptual idea behind the algorithm is to consider 9 separate sequences
9*9, 9*8, 9*7, .., 9*1
8*9, 8*8, 8*7, .., 8*1
...
1*9, 1*8, 1*7, .., 1*1
Since they are all decreasing, at a given moment, we only need to consider one element of each sequence (the largest one we haven't reached yet).
These are inserted into the priority queue which allows us to efficiently find the maximum one.
Once we have printed a given element we move onto the next one in the sequence and insert that into the priority queue.
By keeping track of the last element printed we can avoid duplicates.
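The same idea extends to two arbitrary inputs, provided both are sorted in descending order and contain no negative numbers, so that every row of products is non-increasing. The following Python sketch is one way to write that. The name products_descending is hypothetical, and the heap holds one entry per element of the first list.

```python
import heapq


def products_descending(a, b):
    """Yield the distinct products a[i] * b[j] in descending order.

    Assumes a and b are lists sorted in descending order with no
    negative entries, so each row a[i]*b[0], a[i]*b[1], ... never grows.
    """
    if not a or not b:
        return
    # heapq is a min-heap, so store negated products to pop the largest.
    heap = [(-(x * b[0]), i, 0) for i, x in enumerate(a)]
    heapq.heapify(heap)
    last = None
    while heap:
        neg, i, j = heapq.heappop(heap)
        val = -neg
        if val != last:             # equal values pop consecutively
            yield val
            last = val
        if j + 1 < len(b):          # advance within row i
            heapq.heappush(heap, (-(a[i] * b[j + 1]), i, j + 1))


digits = list(range(9, 0, -1))
print(list(products_descending(digits, digits))[:5])  # [81, 72, 64, 63, 56]
```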

What is the worst case for binary search

Where should an element be located in the array so that the run time of the Binary search algorithm is O(log n)?
With the implementation below, the last element gives the worst case, as it requires the maximum number of comparisons. The first element matches it only when the array length is one less than a power of two.
Example:
1 2 3 4 5 6 7 8 9
Here searching for 9 gives the worst case, with the result coming in the 4th pass (compare 9 & 5, compare 9 & 7, compare 9 & 8 and finally 9). Searching for 1 needs only 3 passes (compare 1 & 5, compare 1 & 2 and finally 1).
1 2 3 4 5 6 7 8
In this case, searching for 8 gives the worst case, with the result coming in 4 passes.
Note that in the second case searching for 1 (the first element) can also be done in just 3 passes (compare 1 & 4, compare 1 & 2 and finally 1).
So, whether the number of elements is odd or even here, the last element gives the worst case.
This assumes all arrays are 0-indexed. It happens because the mid is taken as the floor of (start + end) / 2, which leaves the right half at least as large as the left half.
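The pass counts for any array can be checked with a small Python sketch that runs the same floor-midpoint loop and counts iterations. The function name passes is hypothetical.

```python
def passes(arr, x):
    """Count loop iterations of an iterative binary search for x in arr."""
    lo, hi, count = 0, len(arr) - 1, 0
    while lo <= hi:
        count += 1
        mid = lo + (hi - lo) // 2   # floor midpoint
        if arr[mid] == x:
            return count
        if arr[mid] < x:
            lo = mid + 1            # ignore the left half
        else:
            hi = mid - 1            # ignore the right half
    return count                    # x was not present


nine = list(range(1, 10))
eight = list(range(1, 9))
print(passes(nine, 1), passes(nine, 9))    # 3 4
print(passes(eight, 1), passes(eight, 8))  # 3 4
```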
// Java implementation of iterative Binary Search
class BinarySearch
{
// Returns index of x if it is present in arr[],
// else return -1
int binarySearch(int arr[], int x)
{
int l = 0, r = arr.length - 1;
while (l <= r)
{
int m = l + (r-l)/2;
// Check if x is present at mid
if (arr[m] == x)
return m;
// If x greater, ignore left half
if (arr[m] < x)
l = m + 1;
// If x is smaller, ignore right half
else
r = m - 1;
}
// if we reach here, then element was
// not present
return -1;
}
// Driver method to test above
public static void main(String args[])
{
BinarySearch ob = new BinarySearch();
int arr[] = {2, 3, 4, 10, 40};
int n = arr.length;
int x = 10;
int result = ob.binarySearch(arr, x);
if (result == -1)
System.out.println("Element not present");
else
System.out.println("Element found at " +
"index " + result);
}
}
Time Complexity:
The time complexity of Binary Search can be written as
T(n) = T(n/2) + c
The above recurrence can be solved using either the Recurrence Tree method or the Master method. It falls in case II of the Master Method, and the solution of the recurrence is Theta(Logn).
Auxiliary Space: O(1) in case of iterative implementation. In case of recursive implementation, O(Logn) recursion call stack space.

Get dynamic submatrix and apply constraints

I am really new to Constraint Programming and I am trying to solve a problem where from a two dimensional array, consisting of numbers, I need to take the least amount of sub arrays (2D) as possible, covering as much of the original 2D array as possible, obeying the following rules:
Every sub array must be a rectangle part of the original
The sum of numbers in each sub array must not exceed a specific number
Every sub array must have at least 2 numbers in it
For example for the following matrix:
3 5 1 4
5 1 2 8
0 8 1 3
8 3 2 1
For a maximum sum of 10, a solution would be:
3 -not picked
{ 5 1 4 }
{ 5 1 }
{ 2 8 }
{ 0 8 }
{ 1 3
2 1 }
8 -not picked
Right now I am using the diffn() equivalent of or-tools (MakeNonOverlappingBoxesConstraint()) to create the rectangles that are gonna cover the original array.
My problem is how to get the rectangles created by diffn() and split the original matrix based on the position and size of each one, so I can apply the Sum constraint.
If there is another way of achieving the same constraints without using the diffn() then I would try it out, but I can't think any other way.
Thank you!
The way to get a value from an array based on an IntVar, inside the solver, is by using the MakeElement() function and in this case the 2d version of it.
That way you can get a specific value from the matrix, but not a range based on two IntVars (for example from x to x + dx of a rectangle). To accomplish the range part you can use a loop and a ConditionalExpression() to decide whether each value is in range.
For example, in a 1d array, getting the elements of data at positions x to x + dx would look as follows:
int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
IntVar x = solver.MakeIntVar(0, data.Length - 1);
IntVar dx = solver.MakeIntVar(1, data.Length);
solver.Add(x + dx <= data.Length);
IntVarVector range = new IntVarVector();
for (int i = 0; i < dx.Max(); i++)
{
range.Add(solver.MakeConditionalExpression((x + i < x + dx).Var() , solver.MakeElement(data, (x + i).Var()), 0).Var());
}
solver.Add(range.ToArray().Sum() <= 10);
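The loop-plus-conditional trick can be illustrated outside the solver. The following plain Python sketch makes no or-tools calls and uses hypothetical names throughout. It only shows why a fixed-length loop in which each term is either the element or 0 produces the sum over a variable-length range.

```python
def conditional_range_sum(data, x, dx, max_dx):
    """Sum data[x : x + dx] using a loop of fixed length max_dx.

    Every iteration contributes either the element (when i < dx) or 0,
    mirroring the conditional expression built inside the solver loop.
    """
    total = 0
    for i in range(max_dx):
        in_range = i < dx
        # Treat positions past the end of the array as 0.
        value = data[x + i] if x + i < len(data) else 0
        total += value if in_range else 0
    return total


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(conditional_range_sum(data, 2, 3, len(data)))  # 12, i.e. 3 + 4 + 5
```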
In the case of the 2d array (as in the question), you just iterate through both dimensions. The only difference is that the 2d version of MakeElement() accepts an IndexEvaluator2 item (LongLongToLong in C#), so you have to make your own class that inherits from LongLongToLong and overrides the Run() function.
class DataValues: LongLongToLong
{
private int[,] _data;
private int _rows;
private int _cols;
public DataValues(int[,] data, int rows, int cols)
{
_rows = rows;
_cols = cols;
_data = data;
}
public override long Run(long arg0, long arg1)
{
if (arg0 >= _rows || arg1 >= _cols)
return 0;
return _data[arg0, arg1];
}
}
The only problem with this class is that it can be asked for a value outside the array, so we must handle that ourselves with if (arg0 >= _rows || arg1 >= _cols).
P.S. I don't know if this is the best method of accomplishing it, but it was the best I could think of, since I couldn't find anything similar online.

Generate string permutations recursively; each character appears n times

I'm trying to write an algorithm that will generate all strings of length nm, with exactly n of each number 1, 2, ... m,
For instance all strings of length 6, with exactly two 1's, two 2's and two 3's e.g. 112233, 121233,
I managed to do this with just 1's and 2's using a recursive method, but can't seem to get something that works when I introduce 3's.
When m = 2, the algorithm I have is:
generateAllStrings(int len, int K, String str)
{
if(len == 0)
{
output(str);
}
if(K > 0)
{
generateAllStrings(len - 1, K - 1, str + '2');
}
if(len > K)
{
generateAllStrings(len - 1, K, str + '1');
}
}
I've tried inserting similar conditions for the third number but the algorithm doesn't give a correct output. After that I wouldn't even know how to generalise for 4 numbers and above.
Is recursion the right thing to do? Any help would be appreciated.
One option would be to list all distinct permutations of the string 111...1222...2...mmm...m. There are nice algorithms for enumerating all distinct permutations of a string in time proportional to the length of the string per permutation, and they'd probably be a good way to go about solving this problem.
To use a simple recursive algorithm, give each recursion the permutation so far (variable perm), and the number of occurrences of each digit that is still available (array count).
Run the code snippet to generate all unique permutations for n=2 and m=4 (set: 11223344).
function permutations(n, m) {
var perm = "", count = []; // start with empty permutation
for (var i = 0; i < m; i++) count[i] = n; // set available number for each digit = n
permute(perm, count); // start recursion with "" and [n,n,n...]
function permute(perm, count) {
var done = true;
for (var i = 0; i < count.length; i++) { // iterate over all digits
if (count[i] > 0) { // more instances of digit i available
var c = count.slice(); // create hard copy of count array
--c[i]; // decrement count of digit i
permute(perm + (i + 1), c); // add digit to permutation and recurse
done = false; // digits left over: not the last step
}
}
if (done) document.write(perm + "<BR>"); // no digits left: complete permutation
}
}
permutations(2, 4);
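For comparison, here is a sketch of the same counting recursion in Python. It is an assumption-laden variant rather than a translation: it shares one counts list and undoes each decrement after the recursive call instead of copying the array, and it assumes m is at most 9 so that every number is a single character.

```python
def permutations(n, m):
    """Return all strings containing each of the digits 1..m exactly n times."""
    counts = [n] * m        # remaining uses of each digit
    out = []

    def permute(prefix):
        if len(prefix) == n * m:        # no digits left: complete string
            out.append(prefix)
            return
        for d in range(m):
            if counts[d] > 0:
                counts[d] -= 1          # use one copy of digit d + 1
                permute(prefix + str(d + 1))
                counts[d] += 1          # backtrack

    permute("")
    return out


print(permutations(2, 2))
# ['1122', '1212', '1221', '2112', '2121', '2211']
```

The number of results is (n*m)! / (n!)^m, for example 90 strings for n = 2 and m = 3.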
You can easily do this using DFS (or alternatively BFS). We can define a graph such that each node contains one string, and a node is connected to every node whose string differs from it by swapping one pair of characters. This graph is connected, so a traversal from any one node generates the set of all nodes, which contains all the strings we are looking for:
set generated_strings
list nodes
nodes.add(generateInitialString(N , M))
generated_strings.add(generateInitialString(N , M))
while(!nodes.empty())
string tmp = nodes.remove(0)
for (int i in [0 , N * M))
for (int j in distinct([0 , N * M) , i))
string new = swap(tmp , i , j)
if (!generated_strings.contains(new))
nodes.add(new)
generated_strings.add(new)
//generated_strings now contains all strings that can possibly be generated.
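The pseudocode above could be made runnable along these lines in Python. This is only a sketch: the function name is hypothetical, single-digit symbols are assumed (m at most 9), and swaps of equal characters are skipped because they cannot produce a new string.

```python
from collections import deque


def all_strings_by_swaps(n, m):
    """Breadth-first search over strings connected by single swaps."""
    start = "".join(str(d) * n for d in range(1, m + 1))   # e.g. "112233"
    seen = {start}
    queue = deque([start])
    while queue:
        chars = list(queue.popleft())
        for i in range(len(chars)):
            for j in range(i + 1, len(chars)):
                if chars[i] == chars[j]:
                    continue                    # swap would change nothing
                chars[i], chars[j] = chars[j], chars[i]
                candidate = "".join(chars)
                chars[i], chars[j] = chars[j], chars[i]     # undo the swap
                if candidate not in seen:
                    seen.add(candidate)
                    queue.append(candidate)
    return seen


print(len(all_strings_by_swaps(2, 3)))  # 90
```

This tries every pair of positions for every generated string, so it does considerably more work per result than the counting recursion.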

Traverse a graph in parallel

I'm revising for an exam (still) and have come across a question (posted below) that has me stumped. I think, in summary, the question is asking "Think of any_old_process that has to traverse a graph and do some work on the objects it finds, including adding more work.". My question is, what data structure can be parallelised to achieve the goals set out in the question?
The role of a garbage collector (GC) is to reclaim unused memory.
Tracing collectors must identify all live objects by traversing graphs
of objects induced by aggregation relationships. In brief, the GC has
some work-list of tasks to perform. It repeatedly (a) acquires a task
(e.g. an object to inspect), (b) performs the task (e.g. marks the
object unless it is already marked), and (c) generates further tasks
(e.g. adds the children of an unmarked task to the work-list). It is
desirable to parallelise this operation.
In a single-threaded
environment, the work-list is usually a single LIFO stack. What would
you have to do to make this safe for a parallel GC? Would this be a
sensible design for a parallel GC? Discuss designs of data structure
to support a parallel GC that would scale better. Explain why you
would expect them to scale better.
The natural data structure for a graph is, well, a graph, i.e. a set of graph elements (nodes) which can refer to other elements. For better cache reuse, though, the elements can be placed/allocated in an array or arrays (generally, vectors) in order to put neighbor elements as close in memory as possible. Generally, each element or group of elements should have a mutex (spin_mutex) to protect access to it. Contention on that mutex means that some other thread is already working on the element, so there is no need to wait for it. If possible, though, an atomic operation on a flag/state field is preferable, since it marks the element as visited without a lock. For example, the simplest data structure can be the following:
struct object {
vector<object*> references;
atomic<bool> is_visited; // for simplicity, or epoch counter
// if nothing resets it to false
void inspect(); // processing method
};
vector<object> objects; // also for simplicity, if it can be for real
// things like `parallel_for` would be perfect here
Given this data structure and the way the GC work is described, it fits recursive parallelism such as the divide-and-conquer pattern perfectly:
void object::inspect() {
if( ! is_visited.exchange(true) ) {
for( object* o : references ) // alternatively it can be `parallel_for` in some variants
cilk_spawn o->inspect(); // for Cilk or `task_group::run` for TBB or PPL
// further processing of the object
}
}
If the data structure in the question is about how the tasks are organized, I'd recommend a work-stealing scheduler (like TBB or Cilk). There are tons of papers on this subject. To put it simply, each worker thread has its own, but shared, deque of tasks, and when its deque is empty, a thread steals tasks from other threads' deques.
The scalability comes from the property that each task can add other tasks, which can then run in parallel.
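For contrast with work stealing, the single shared work-list from the exam question can be made safe by putting a lock around every push and pop. The Python sketch below does this with the standard library's thread-safe queue.LifoQueue. The function name parallel_mark and the children mapping are hypothetical stand-ins for the object graph. Every worker contends on the same queue lock, which is the reason this design is not expected to scale, and under the GIL of standard CPython builds it shows the structure rather than a speedup.

```python
import queue
import threading


def parallel_mark(roots, children, workers=4):
    """Mark every object reachable from roots using one shared LIFO work-list."""
    marked = set()
    marked_lock = threading.Lock()
    work = queue.LifoQueue()            # internally locked stack
    for root in roots:
        work.put(root)

    def worker():
        while True:
            obj = work.get()            # (a) acquire a task
            if obj is None:             # sentinel: no more work
                work.task_done()
                return
            with marked_lock:           # (b) mark unless already marked
                first_visit = obj not in marked
                marked.add(obj)
            if first_visit:             # (c) generate further tasks
                for child in children.get(obj, ()):
                    work.put(child)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    work.join()                         # all tasks, including spawned ones, done
    for _ in threads:
        work.put(None)                  # release the workers
    for t in threads:
        t.join()
    return marked


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"], "x": ["a"]}
print(sorted(parallel_mark(["a"], graph)))  # ['a', 'b', 'c', 'd']
```

Children are pushed before the parent's task_done() call, so the count of unfinished tasks cannot reach zero while work remains.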
Your questions:
Think of any_old_process that has to traverse a graph and do some work on the objects it finds, including adding more work.
... what data structure can be parallelised to achieve the goals set out in the question?
Quoted questions:
Some stuff about garbage collection.
Since you are specifically interested in parallelizing graph algorithms, I'll give an example of one kind of graph traversal that can be parallelized well.
Executive Summary
Finding local minima ("basins") or maxima ("peaks") are useful operations in digital image processing. A concrete example is geological watershed analysis. One approach to the problem treats each pixel or small group of pixels in the image as a node and finds non-overlapping minimum spanning trees (MST) with the local minima as the tree roots.
Gory details
Below is a simplistic example. It's a web interview question from Palantir Technologies brought to Programming Puzzles & Code Golf by AnkitSablok. It's simplified by two assumptions (bolded below):
That a pixel/cell only has 4 neighbors instead of the usual eight.
That a cell either has all uphill neighbors (it is a local minimum) or has a unique downhill neighbor, i.e. plains aren't allowed.
Below that is some JavaScript that solves this problem. It violates every reasonable coding standard against use of side-effects, but illustrates where some of the opportunities for parallelization exist.
In the "Create list of sinks (i.e. roots)" loop, note that each cell can be evaluated completely independently for elevation with respect to its neighbors as long as the elevation data is static. In a sequential program, one thread of execution examines each cell. In a parallel program, the cells are divvied up so that one, and only one, thread reads and writes the local minima state information (sink[] in the program below). If generating the list of minima/roots in parallel, the queuing operations for the stack would have to be synchronized. For a discussion of how to do that for stacks and other queues, see "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms", Michael & Scott, 1996. For modern updates, follow the citation tree on Google Scholar (no mutex required :).
In the "Each root explores its basin" loop, note that each basin could be explored/enumerated/flooded in parallel.
If you want to dive deeper into parallelizing MSTs, see "Scalable Parallel Minimum Spanning Forest Computation", Nobari, Cao, Karras, Bressan, 2012. The first two pages contain a clear and concise survey of the field.
Simplified example
A group of farmers has some elevation data, and we’re going to help them understand how rainfall flows over their farmland. We’ll represent the land as a two-dimensional array of altitudes and use the following model, based on the idea that water flows downhill:
If a cell’s four neighboring cells all have higher altitudes, we call this cell a sink; water collects in sinks. Otherwise, water will flow to the neighboring cell with the lowest altitude. If a cell is not a sink, you may assume it has a unique lowest neighbor and that this neighbor will be lower than the cell.
Cells that drain into the same sink – directly or indirectly – are said to be part of the same basin.
Your challenge is to partition the map into basins. In particular, given a map of elevations, your code should partition the map into basins and output the sizes of the basins, in descending order.
Assume the elevation maps are square. Input will begin with a line with one integer, S, the height (and width) of the map. The next S lines will each contain a row of the map, each with S integers – the elevations of the S cells in the row. Some farmers have small land plots such as the examples below, while some have larger plots. However, in no case will a farmer have a plot of land larger than S = 5000.
Your code should output a space-separated list of the basin sizes, in descending order. (Trailing spaces are ignored.)
Here's an example:
Input:
5
1 0 2 5 8
2 3 4 7 9
3 5 7 8 9
1 2 5 4 2
3 3 5 2 1
Output: 11 7 7
The basins, labeled with A’s, B’s, and C’s, are:
A A A A A
A A A A A
B B A C C
B B B C C
B B C C C
// lm.js - find the local minima
// Globalization of variables.
/*
The map is a 2 dimensional array. Indices for the elements map as:
[0,0] ... [0,n]
...
[n,0] ... [n,n]
Each element of the array is a structure. The structure for each element is:
Item Purpose Range Comment
---- ------- ----- -------
h Height of cell integers
s Is it a sink? boolean
x X of downhill cell (0..maxIndex) if s is true, x&y point to self
y Y of downhill cell (0..maxIndex)
b Basin name ('A'..'A'+# of basins)
Use a separate array-of-arrays for each structure item. The index range is
0..maxIndex.
*/
var height = [];
var sink = [];
var downhillX = [];
var downhillY = [];
var basin = [];
var maxIndex;
// A list of sinks in the map. Each element is an array of [ x, y ], where
// both x & y are in the range 0..maxIndex.
var basinList = [];
// An unordered list of basin sizes.
var basinSize = [];
// Functions.
function isSink(x,y) {
var myHeight = height[x][y];
var imaSink = true;
var bestDownhillHeight = myHeight;
var bestDownhillX = x;
var bestDownhillY = y;
/*
Visit the neighbors. If this cell is the lowest, then it's the
sink. If not, find the steepest downhill direction.
*/
function visit(deltaX,deltaY) {
var neighborX = x+deltaX;
var neighborY = y+deltaY;
if (myHeight > height[neighborX][neighborY]) {
imaSink = false;
if (bestDownhillHeight > height[neighborX][neighborY]) {
bestDownhillHeight = height[neighborX][neighborY];
bestDownhillX = neighborX;
bestDownhillY = neighborY;
}
}
}
if (x !== 0) {
// upwards neighbor exists
visit(-1,0);
}
if (x !== maxIndex) {
// downwards neighbor exists
visit(1,0);
}
if (y !== 0) {
// left-hand neighbor exists
visit(0,-1);
}
if (y !== maxIndex) {
// right-hand neighbor exists
visit(0,1);
}
downhillX[x][y] = bestDownhillX;
downhillY[x][y] = bestDownhillY;
return imaSink;
}
function exploreBasin(x,y,currentSize,basinName) {
// This cell is in the basin.
basin[x][y] = basinName;
currentSize++;
/*
Visit all neighbors that have this cell as the best downhill
path and add them to the basin.
*/
function visit(x,deltaX,y,deltaY) {
if ((downhillX[x+deltaX][y+deltaY] === x) && (downhillY[x+deltaX][y+deltaY] === y)) {
currentSize = exploreBasin(x+deltaX,y+deltaY,currentSize,basinName);
}
return 0;
}
if (x !== 0) {
// upwards neighbor exists
visit(x,-1,y,0);
}
if (x !== maxIndex) {
// downwards neighbor exists
visit(x,1,y,0);
}
if (y !== 0) {
// left-hand neighbor exists
visit(x,0,y,-1);
}
if (y !== maxIndex) {
// right-hand neighbor exists
visit(x,0,y,1);
}
return currentSize;
}
// Read map from file (1st argument).
var lines = $EXEC('cat "' + $ARG[0] + '"').split('\n');
maxIndex = lines.shift() - 1;
for (var i = 0; i<=maxIndex; i++) {
height[i] = lines.shift().split(' ').map(Number); // compare elevations as numbers, not strings
// Create all other 2D arrays.
sink[i] = [];
downhillX[i] = [];
downhillY[i] = [];
basin[i] = [];
}
for (var i = 0; i<=maxIndex; i++) { print(height[i]); }
// Everyone decides if they are a sink. Create list of sinks (i.e. roots).
for (var x=0; x<=maxIndex; x++) {
for (var y=0; y<=maxIndex; y++) {
if (sink[x][y] = isSink(x,y)) {
// This node is a root (AKA sink).
basinList.push([x,y]);
}
}
}
//for (var i = 0; i<=maxIndex; i++) { print(sink[i]); }
// Each root explores its basin.
var basinName = 'A';
for (var i=basinList.length-1; i>=0; --i) { // i-- makes Closure Compiler sad
var x = basinList[i][0];
var y = basinList[i][1];
basinSize.push(exploreBasin(x,y,0,basinName));
basinName = String.fromCharCode(basinName.charCodeAt() + 1);
}
for (var i = 0; i<=maxIndex; i++) { print(basin[i]); }
// Done.
print(basinSize.sort(function(a, b){return b-a}).join(' '));
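A more compact sketch of the same two phases in Python, with hypothetical names. It swaps the exploration strategy: instead of flooding outward from each sink, every cell follows its downhill pointers until it reaches a sink, and the result is cached for all cells on the way. Computing the downhill neighbor of a cell depends only on the static elevation data, so that part is the one that splits cleanly across threads.

```python
from collections import Counter


def basin_sizes(height):
    """Return basin sizes in descending order for a square elevation map."""
    n = len(height)

    def downhill(r, c):
        # Lowest strictly lower 4-neighbor, or the cell itself if it is a sink.
        best = (height[r][c], r, c)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and height[nr][nc] < best[0]:
                best = (height[nr][nc], nr, nc)
        return best[1], best[2]

    sink_of = {}                        # cell -> sink it drains into

    def find_sink(r, c):
        path = []
        while (r, c) not in sink_of:
            nxt = downhill(r, c)
            if nxt == (r, c):           # reached a sink
                sink_of[(r, c)] = (r, c)
                break
            path.append((r, c))
            r, c = nxt
        root = sink_of[(r, c)]
        for cell in path:               # cache the answer along the path
            sink_of[cell] = root
        return root

    counts = Counter(find_sink(r, c) for r in range(n) for c in range(n))
    return sorted(counts.values(), reverse=True)


example = [
    [1, 0, 2, 5, 8],
    [2, 3, 4, 7, 9],
    [3, 5, 7, 8, 9],
    [1, 2, 5, 4, 2],
    [3, 3, 5, 2, 1],
]
print(*basin_sizes(example))  # 11 7 7
```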
