Shortest hamiltonian path with dynamic programming and bitmasking

I've just read an article about how to find the shortest Hamiltonian path using dynamic programming: http://codeforces.com/blog/entry/337.
While the pseudocode works, I do not understand why I have to use the xor operator on the set and 2^i.
Why wouldn't you just subtract the currently visited city from the bitmask? What does the xor with the set do to make the algorithm work its magic?
To clarify, here is the piece of pseudocode written in Java:
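public int calculate(int set, int i) {
    // Base case: the set contains only city i.
    if (count(set) == 1 && (set & 1 << i) != 0) {
        return 0;
    }
    // Reuse a result that was already computed.
    if (dp[set][i] != infinity) {
        return dp[set][i];
    }
    for (int city = 0; city < n; city++) {
        // The previous city must be in the set and must differ from i.
        if (city == i || (set & 1 << city) == 0) continue;
        dp[set][i] = Math.min(dp[set][i], calculate(set ^ 1 << i, city) + dist[i][city]);
    }
    return dp[set][i];
}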

I found the solution to my problem: ^ flips a bit. If you xor a bitmask with 1<<i, you flip the bit at position i. E.g. 1010 ^ (1<<1) results in 1000.
Likewise, 1000 ^ (1<<1) = 1010.
Subtraction also works, but with the xor operator you know for certain that you touch only the bit at that position and no other. Imagine 1000 - (1<<1): the borrow gives 0110, which is something entirely different. So subtraction can be used only if you are 100% sure that there is a 1 at position i; xor is safer.
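To make the difference concrete, here is a small C++ sketch of the ways to drop bit i from a mask. The helper names are hypothetical and not part of the article's pseudocode, and the and-not variant is an extra option that the post above does not mention.

```cpp
#include <cassert>

// Hypothetical helper names, used only for illustration.

// Flips bit i: clears it if it was set, sets it if it was clear.
unsigned remove_by_xor(unsigned set, int i) { return set ^ (1u << i); }

// Equals the xor result only when bit i is set; otherwise the borrow
// changes other bits as well.
unsigned remove_by_subtract(unsigned set, int i) { return set - (1u << i); }

// Clears bit i whether or not it was set (not from the original post).
unsigned remove_by_and_not(unsigned set, int i) { return set & ~(1u << i); }
```

With the mask 1010 (0xA) and i = 1, all three give 1000. With the mask 1000 (0x8), xor gives 1010, subtraction gives 0110, and and-not leaves 1000 unchanged.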

Related

Find the number of ways you can form a string of size N, given an unlimited number of 0s and 1s

The question below was asked in an Atlassian online test. I don't have the test cases; I took the question from this link:
Find the number of ways you can form a string of size N, given an unlimited number of 0s and 1s. But
you cannot have D consecutive 0s or T consecutive 1s. N, D and T are given as inputs.
Please help me with this problem; any approach on how to proceed would be welcome.
My approach is simply recursion: I try all possibilities and memoize the results in a hash map.
But it seems to me there must be some combinatorial approach that solves this in less time and space. For debugging purposes I am also printing the strings generated during the recursion. If there is a flaw in my approach, please tell me.
#include <bits/stdc++.h>
using namespace std;
unordered_map<string,int>dp;
int recurse(int d,int t,int n,int oldd,int oldt,string s)
{
if(d<=0)
return 0;
if(t<=0)
return 0;
cout<<s<<"\n";
if(n==0&&d>0&&t>0)
return 1;
string h=to_string(d)+" "+to_string(t)+" "+to_string(n);
if(dp.find(h)!=dp.end())
return dp[h];
int ans=0;
ans+=recurse(d-1,oldt,n-1,oldd,oldt,s+'0')+recurse(oldd,t-1,n-1,oldd,oldt,s+'1');
return dp[h]=ans;
}
int main()
{
int n,d,t;
cin>>n>>d>>t;
dp.clear();
cout<<recurse(d,t,n,d,t,"")<<"\n";
return 0;
}

You are right: instead of generating strings, it is worth considering a combinatorial approach using (a kind of) dynamic programming.
A "good" sequence of length K ends with a run of 1..D-1 zeros or a run of 1..T-1 ones.
To make a good sequence of length K+1, you can append a zero to every sequence except those ending with D-1 zeros: sequences of the first kind then end with 2..D-1 zeros, and sequences of the second kind end with exactly 1 zero.
Similarly, you can append a one to every sequence of the first kind, and to every sequence of the second kind except those ending with T-1 ones: the first kind then ends with exactly 1 one, the second kind with 2..T-1 ones.
Make two tables:
Zeros[N][D] and Ones[N][T]
where Zeros[K][C] counts the good sequences of length K that end with exactly C zeros (and likewise for Ones).
Fill the first row with zeros, except for Zeros[1][1] = 1, Ones[1][1] = 1.
Fill row by row using the rules above:
Zeros[K][1] = Sum(Ones[K-1][C=1..T-1])
for C in 2..D-1:
    Zeros[K][C] = Zeros[K-1][C-1]
Ones[K][1] = Sum(Zeros[K-1][C=1..D-1])
for C in 2..T-1:
    Ones[K][C] = Ones[K-1][C-1]
The result is the sum of the last row in both tables.
Also note that you really need only two active rows of the table, so you can optimize size to Zeros[2][D] after debugging.
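The table rules above can be sketched in C++ roughly like this, keeping only the previous and the current row. This is a minimal sketch, assuming D >= 1 and T >= 1 and that the counts fit into 64 bits; count_strings is a hypothetical name, and returning 0 for D < 1 or T < 1 is my own convention.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Counts binary strings of length n that contain neither d consecutive
// zeros nor t consecutive ones. Assumption: d >= 1 and t >= 1.
std::uint64_t count_strings(int n, int d, int t) {
    if (n == 0) return 1;            // only the empty string
    if (d < 1 || t < 1) return 0;    // treated as "nothing allowed" here
    // zeros[c] / ones[c]: good strings of the current length ending in a
    // run of exactly c zeros / ones; index 0 is unused.
    std::vector<std::uint64_t> zeros(d, 0), ones(t, 0);
    if (d > 1) zeros[1] = 1;
    if (t > 1) ones[1] = 1;
    for (int k = 2; k <= n; ++k) {
        std::uint64_t sum_zeros = 0, sum_ones = 0;
        for (int c = 1; c < d; ++c) sum_zeros += zeros[c];
        for (int c = 1; c < t; ++c) sum_ones += ones[c];
        std::vector<std::uint64_t> next_zeros(d, 0), next_ones(t, 0);
        if (d > 1) next_zeros[1] = sum_ones;   // a zero after any run of ones
        for (int c = 2; c < d; ++c) next_zeros[c] = zeros[c - 1];
        if (t > 1) next_ones[1] = sum_zeros;   // a one after any run of zeros
        for (int c = 2; c < t; ++c) next_ones[c] = ones[c - 1];
        zeros = std::move(next_zeros);
        ones = std::move(next_ones);
    }
    std::uint64_t total = 0;
    for (int c = 1; c < d; ++c) total += zeros[c];
    for (int c = 1; c < t; ++c) total += ones[c];
    return total;
}
```

For N = 3, D = 3, T = 3 this gives 6: all 8 strings except 000 and 111. Each row costs O(D + T), so the whole computation is O(N * (D + T)).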

This can be solved using dynamic programming. I'll give a recursive solution; it is similar to generating a binary string.
The states are:
i: the index of the character we need to append next.
cnt: the number of consecutive equal characters immediately before i.
bit: the character that was repeated cnt times before i. The value of bit is either 0 or 1.
Base case: return 1 when i reaches n, since we start from 0 and end at n-1.
Define the size of the dp array accordingly. The time complexity is O(2 x N x max(D,T)).
#include<bits/stdc++.h>
using namespace std;
int dp[1000][1000][2];
int n, d, t;
int count(int i, int cnt, int bit) {
if (i == n) {
return 1;
}
int &ans = dp[i][cnt][bit];
if (ans != -1) return ans;
ans = 0;
if (bit == 0) {
ans += count(i+1, 1, 1);
if (cnt != d - 1) {
ans += count(i+1, cnt + 1, 0);
}
} else {
// bit == 1
ans += count(i+1, 1, 0);
if (cnt != t-1) {
ans += count(i+1, cnt + 1, 1);
}
}
return ans;
}
signed main() {
ios_base::sync_with_stdio(false), cin.tie(nullptr);
cin >> n >> d >> t;
memset(dp, -1, sizeof dp);
cout << count(0, 0, 0);
return 0;
}

Counter for two binary strings C++

I am trying to count two binary numbers given as strings. The numbers can have at most 253 digits. Short numbers work, but when I enter longer numbers, the output is wrong. An example that gives a bad result is "10100101010000111111" with "000011010110000101100010010011101010001101011100000000111000000000001000100101101111101000111001000101011010010111000110".
#include <iostream>
#include <stdlib.h>
using namespace std;
bool isBinary(string b1,string b2);
int main()
{
string b1,b2;
long binary1,binary2;
int i = 0, remainder = 0, sum[254];
cout<<"Get two binary numbers:"<<endl;
cin>>b1>>b2;
binary1=atol(b1.c_str());
binary2=atol(b2.c_str());
if(isBinary(b1,b2)==true){
while (binary1 != 0 || binary2 != 0){
sum[i++] =(binary1 % 10 + binary2 % 10 + remainder) % 2;
remainder =(binary1 % 10 + binary2 % 10 + remainder) / 2;
binary1 = binary1 / 10;
binary2 = binary2 / 10;
}
if (remainder != 0){
sum[i++] = remainder;
}
--i;
cout<<"Result: ";
while (i >= 0){
cout<<sum[i--];
}
cout<<endl;
}else cout<<"Wrong input"<<endl;
return 0;
}
bool isBinary(string b1,string b2){
bool rozhodnuti1,rozhodnuti2;
for (int i = 0; i < b1.length();i++) {
if (b1[i]!='0' && b1[i]!='1') {
rozhodnuti1=false;
break;
}else rozhodnuti1=true;
}
for (int k = 0; k < b2.length();k++) {
if (b2[k]!='0' && b2[k]!='1') {
rozhodnuti2=false;
break;
}else rozhodnuti2=true;
}
if(rozhodnuti1==false || rozhodnuti2==false){ return false;}
else{ return true;}
}

One of the problems might be here: sum[i++]
This expression, as written, first yields the current value of i and then increases it by one.
Did you do that on purpose?
If not, change it to ++i.
It would help if you could also post the "bad" output, so that we can try to work backward through the code starting from it.
EDIT 2015-11-07_17:10
Just to be sure everything was correct, I've added a cout to check what binary1 and binary2 contain after you assign them the result of the atol function: they contain the integers 547284487 and 18333230, which obviously don't represent the correct binary-to-integer conversion of the two 0/1 strings in your post.
They probably exceed the range atol can handle.
Also, your arithmetic then produces an even stranger result, 6011111101, which obviously doesn't make any sense.
What do you mean, exactly, when you say you want to count these two numbers? Maybe you want to compute their sum? I guess that's it.
But then, again, what you have there are two signed integers and not two binary numbers, which means those %10 and %2 operations are (probably) misused.
EDIT 2015-11-07_17:20
I've tried your program with small binary strings and it actually works, but only with small binary strings.
It's a fact(?), at this point, that atol can't handle numerical strings that long.
My suggestion: use char arrays instead of strings and replace the 0 and 1 characters with numerical values (if (bin1[i]=='1'){bin1[i]=1;}else{bin1[i]=0;}), with which you'll be able to perform all the math operations you want (you've already written a working sum function, after all).
Once done with the math, you can just convert the char array back to the actual characters for 0 and 1 and cout it on the screen.
EDIT 2015-11-07_17:30
Tested atol on my own: here it correctly converts only strings that are up to 10 characters long.
Anything beyond the 10th character overflows the long result, and the behavior of atol is undefined when the value is out of range.
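The digit-by-digit idea suggested above could look roughly like the following sketch. It works on std::string directly rather than on char arrays, and add_binary is a hypothetical name; the inputs are assumed to have passed the isBinary check already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Adds two binary numbers given as strings of '0' and '1'.
// Assumption: both strings were validated beforehand.
std::string add_binary(const std::string& a, const std::string& b) {
    std::string result;
    int carry = 0;
    std::size_t i = a.size(), j = b.size();
    // Walk both strings from the least significant digit (the right end).
    while (i > 0 || j > 0 || carry != 0) {
        int sum = carry;
        if (i > 0) sum += a[--i] - '0';
        if (j > 0) sum += b[--j] - '0';
        result.push_back(static_cast<char>('0' + sum % 2));
        carry = sum / 2;
    }
    // Digits were produced lowest first, so reverse them.
    std::reverse(result.begin(), result.end());
    return result.empty() ? "0" : result;
}
```

Since no digit is ever converted to a machine integer, the length of the inputs is limited only by memory, so 253 digits are no problem. Leading zeros of the longer input are kept in the result.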

Given length and sum of digits, find the minimum and maximum number that can be made?

As the title states, we are given a positive integer M and a non-negative integer S. We have to find the smallest and the largest of the numbers that have length M and sum of digits S.
Constraints:
(S>=0 and S<=900)
(M>=1 and M<=100)
I thought about it and came to the conclusion that it must be dynamic programming. However, I failed to build the DP state.
This is what I thought:
dp[i][j] = first 'i' digits having sum 'j'
I tried to write a program. This is how it looks:
/*
*** PATIENCE ABOVE PERFECTION ***
"When in doubt, use brute force. :D"
-Founder of alloj.wordpress.com
*/
#include<bits/stdc++.h>
using namespace std;
#define pb push_back
#define mp make_pair
#define nline cout<<"\n"
#define fast ios_base::sync_with_stdio(false),cin.tie(0)
#define ull unsigned long long int
#define ll long long int
#define pii pair<int,int>
#define MAXX 100009
#define fr(a,b,i) for(int i=a;i<b;i++)
vector<int>G[MAXX];
int main()
{
int m,s;
cin>>m>>s;
int dp[m+1][s+1];
fr(1,m+1,i)
fr(1,s+1,j)
fr(0,10,k)
dp[i][j]=min(dp[i-1][j-k]+k,dp[i][j]); //Tried for Minimum
cout<<dp[m][s]<<endl;
return 0;
}
Please guide me about this DP state and tell me what the time complexity of the program will be. This is my first try at DP.

dp solution goes here :-
#include<iostream>
using namespace std;
int dp[102][902][2] ;
void print_ans(int m , int s , int flag){
if(m==0)
return ;
cout<<dp[m][s][flag];
if(dp[m][s][flag]!=-1)
print_ans(m-1 , s-dp[m][s][flag] , flag );
return ;
}
int main(){
//freopen("problem.in","r",stdin);
//freopen("out.txt","w",stdout);
//int t;
//cin>>t;
//while(t--){
int m , s ;
cin>>m>>s;
if(s==0){
cout<<(m==1?"0 0":"-1 -1");
return 0;
}
for(int i = 0 ; i <=m ; i++){
for(int j=0 ; j<=s ;j++){
dp[i][j][0]=-1;
dp[i][j][1]=-1;
}
}
for(int i = 0 ; i < 10 ; i++){
dp[1][i][0]=i;
dp[1][i][1]=i;
}
for(int i = 2 ; i<=m ; i++){
for(int j = 0 ; j<=s ; j++){
int flag = -1;
int f = -1;
for(int k = 0 ; k <= 9 ; k++){
if(i==m&&k==0)
continue;
if( j>=k && flag==-1 && dp[i-1][j-k][0]!=-1)
flag = k;
}
for(int k = 9 ; k >=0 ;k--){
if(i==m&&k==0)
continue;
if( j>=k && f==-1 && dp[i-1][j-k][1]!=-1)
f = k;
}
dp[i][j][0]=flag;
dp[i][j][1]=f;
}
}
if(m!=0){
print_ans(m , s , 0);
cout<<" ";
print_ans(m,s,1);
}
else
cout<<"-1 -1";
cout<<endl;
// }
}

The DP state is (i,j). It can be thought of as the parameters of a mathematical function defined by a recurrence (smaller problems, hence subproblems).
More precisely,
the state is the set of parameters needed to identify a subproblem uniquely, so that we always know what we are computing.
Let us take your question as the example.
To define your problem we need the number of digits used so far plus the sum formed by those digits (note: you are accumulating the sum while traversing the digits).
I think that is enough for the state part.
Now,
the running time of dynamic programming is simple to work out.
First let us see how many subproblems exist in a problem:
you need to fill in each and every state, i.e. you have to cover all the unique subproblems smaller than or equal to the whole problem.
Which problem is smaller than another is determined by the recurrence relation.
For example:
Fibonacci sequence
F(n)=F(n-1)+F(n-2)
Note that the base case is always the smallest subproblem.
Here, for F(n) we have to calculate F(n-1) and F(n-2), and eventually we reach n=1, where the base case is returned.
Hence the total number of subproblems is the number of problems between the base case and the current problem.
Now,
in bottom-up DP we process each and every state between the base case and the full problem, in order of size.
This tells us that the running time is
O(number of subproblems * time per subproblem).
So how many subproblems exist in your solution? DP[0][0] to DP[M][S],
and for every subproblem you run a loop of 10 iterations:
O( M*S (subproblems) * 10 )
Chop that constant off!
But it is not necessarily always a constant.
Here is some code you might want to look at. Feel free to ask anything!
#include<bits/stdc++.h>
using namespace std;
bool DP[10][101];    // rows 0..9: the loops below read and write DP[9][...]
int Number[10][101]; // same bounds as DP
int main()
{
DP[0][0]=true; // It is possible to form 0 using NULL digits!!
int N=9,S=100,i,j,k;
for(i=1;i<=9;++i)
for(j=0;j<=100;++j)
{
if(DP[i-1][j])
{
for(k=0;k<=9;++k)
if(j+k<=100)
{
DP[i][j+k]=true;
Number[i][j+k]=Number[i-1][j]*10+k;
}
}
}
cout<<Number[9][81]<<"\n";
return 0;
}
Because your constraints are high, you should backtrack through the table rather than store the numbers directly.
DP[i][j] is true if it is possible to form the sum j using exactly i digits.
Number[i][j]
is my laziness to avoid typing a backtracking routine (sleepy, it's already 3 A.M.).
I am trying to add all the possible digits to extend the state.
It is essentially a forward DP style. You can read more about it at Topcoder.
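Since the constraints (M up to 100) rule out storing the numbers in an int, here is a rough sketch of the backtracking variant mentioned above: build the boolean table first, then pick the digits one by one. The names build_table and extreme_number are hypothetical, and returning "-1" for an impossible input is an assumption modeled on the other answer's output.

```cpp
#include <cassert>
#include <string>
#include <vector>

// feasible[i][j] is true when some sequence of exactly i digits (0..9,
// leading zeros allowed) has digit sum j.
static std::vector<std::vector<bool>> build_table(int m, int s) {
    std::vector<std::vector<bool>> feasible(m + 1, std::vector<bool>(s + 1, false));
    feasible[0][0] = true;
    for (int i = 1; i <= m; ++i)
        for (int j = 0; j <= s; ++j)
            for (int k = 0; k <= 9 && k <= j; ++k)
                if (feasible[i - 1][j - k]) { feasible[i][j] = true; break; }
    return feasible;
}

// Returns the smallest (largest == false) or largest (largest == true)
// m-digit number with digit sum s, or "-1" if none exists.
std::string extreme_number(int m, int s, bool largest) {
    if (m < 1 || s < 0) return "-1";
    const auto feasible = build_table(m, s);
    std::string out;
    int remaining = s;
    for (int pos = 0; pos < m; ++pos) {
        const int lowest = (pos == 0 && m > 1) ? 1 : 0;  // no leading zero
        int chosen = -1;
        for (int step = 0; step <= 9 - lowest; ++step) {
            const int k = largest ? 9 - step : lowest + step;
            // Take the first digit that still lets the rest be completed.
            if (k <= remaining && feasible[m - pos - 1][remaining - k]) {
                chosen = k;
                break;
            }
        }
        if (chosen < 0) return "-1";
        out.push_back(static_cast<char>('0' + chosen));
        remaining -= chosen;
    }
    return out;
}
```

For M = 2 and S = 15 this yields 69 as the smallest and 96 as the largest number. Building the table is O(M * S * 10), matching the estimate above, and the reconstruction adds only O(M * 10).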

How to find the longest continuous sub-string in a string?

For example, there is a given string which consists of 1s and 0s:
s = "00000000001111111111100001111111110000";
What is an efficient way to get the length of the longest substring of 1s in s? (11)
What is an efficient way to get the length of the longest substring of 0s in s? (10)
I would appreciate an answer from an algorithmic perspective.

I think the most straight-forward way is to walk through the bit-string while recording the max lengths for all 0 and all 1 sub-strings. This is of O (n) complexity as suggested by others.
If you can afford some sort of a data-parallel computation, you might want to look at parallel patterns as explained here. Specifically, take a look at parallel reduction. I think this problem can be implemented in O (log n) time if you can afford one of those methods.
I'm trying to think of a parallel reduction for this problem:
On the first level of the reduction, each thread will process chunks of 8 bit strings (depending on the number of threads you have and the length of the string) and produce a summary of the bit string like: 0 -> x, 1 -> y, 0 -> z, ....
On the next level each thread will merge two of these summaries into one, any possible joins will be performed at this phase (basically, if the previous summary ended with a 0 (1) and the next summary begins with a 0 (1), then the last entry and the first entry of the two summaries can be collapsed into one).
On the top level there will be just one structure with the overall summary of the bit string, which you'll have to step through to figure out the largest sequences (but this time they are all in summary form, so it should be faster). Or, you can make each summary structure keep track of the largest 0 and 1 sub-strings, which makes it unnecessary to walk through the final structure.
I guess this approach only makes sense in a very limited scope, but since you seem to be very keen on getting better than O (n)...
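A rough C++ sketch of the summary structure and the merge step, using the variant where each summary already tracks the longest runs. All names (RunSummary, leaf, merge_summaries, summarize) are hypothetical. The recursion here runs sequentially and only stands in for the parallel reduction; the two halves are independent, so they could be handed to separate threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

struct RunSummary {
    std::size_t length = 0;        // number of bits summarized
    char first = 0, last = 0;      // first and last bit ('0' or '1')
    std::size_t prefix = 0;        // length of the run at the left edge
    std::size_t suffix = 0;        // length of the run at the right edge
    std::size_t best[2] = {0, 0};  // longest run of '0' and of '1'
};

RunSummary leaf(char bit) {
    RunSummary s;
    s.length = s.prefix = s.suffix = 1;
    s.first = s.last = bit;
    s.best[bit - '0'] = 1;
    return s;
}

// Combines the summaries of two adjacent chunks (a to the left of b).
RunSummary merge_summaries(const RunSummary& a, const RunSummary& b) {
    if (a.length == 0) return b;
    if (b.length == 0) return a;
    RunSummary r;
    r.length = a.length + b.length;
    r.first = a.first;
    r.last = b.last;
    r.prefix = a.prefix;
    r.suffix = b.suffix;
    r.best[0] = std::max(a.best[0], b.best[0]);
    r.best[1] = std::max(a.best[1], b.best[1]);
    if (a.last == b.first) {  // the runs touching the boundary join up
        const std::size_t idx = static_cast<std::size_t>(a.last - '0');
        r.best[idx] = std::max(r.best[idx], a.suffix + b.prefix);
        if (a.prefix == a.length) r.prefix = a.length + b.prefix;
        if (b.suffix == b.length) r.suffix = b.length + a.suffix;
    }
    return r;
}

// Summarizes bits[lo, hi) by splitting the range in half.
RunSummary summarize(const std::string& bits, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return RunSummary{};
    if (hi - lo == 1) return leaf(bits[lo]);
    const std::size_t mid = lo + (hi - lo) / 2;
    return merge_summaries(summarize(bits, lo, mid), summarize(bits, mid, hi));
}
```

Calling summarize(s, 0, s.size()) and reading best[1] and best[0] gives the two answers. The total work is still O(n); only the depth of the reduction tree is O(log n).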

OK, here is one solution I came up with; I'm not sure whether it is bug-free. Correct me if you discover a bug, or suggest a better way to do it. Vote for it if you agree with this solution. Thanks!
#include <iostream>
using namespace std;
int main(){
int s[] = {0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0};
int length = sizeof(s) / sizeof(s[0]);
int one_start = 0;
int one_n = 0;
int max_one_n = 0;
int zero_start = 0;
int zero_n = 0;
int max_zero_n = 0;
for(int i=0; i<length; i++){
// Calculate 1s
if(one_start==0 && s[i]==1){
one_start = 1;
one_n++;
}
else if(one_start==1 && s[i]==1){
one_n++;
}
else if(one_start==1 && s[i]==0){
one_start = 0;
if(one_n > max_one_n){
max_one_n = one_n;
}
one_n = 0; // Reset
}
// Calculate 0s
if(zero_start==0 && s[i]==0){
zero_start = 1;
zero_n++;
}
else if(zero_start==1 && s[i]==0){
zero_n++;
}
else if(zero_start==1 && s[i]==1){
zero_start = 0;
if(zero_n > max_zero_n){
max_zero_n = zero_n;
}
zero_n = 0; // Reset
}
}
if(one_n > max_one_n){
max_one_n = one_n;
}
if(zero_n > max_zero_n){
max_zero_n = zero_n;
}
cout << "max_one_n: " << max_one_n << endl;
cout << "max_zero_n: " << max_zero_n << endl;
return 0;
}

The worst case is always O(n): you can always find an input that forces the algorithm to check every bit.
But you can probably get the average slightly better than that (more simply if you scan just for 0 or 1, not both), because you can skip ahead by the length of the longest sequence found so far and scan backwards. At the very least this will reduce the constant factor of O(n), and with random input, more items also means longer sequences, and thus longer and longer skips. But the difference to O(n) will not be much...

Efficient string sorting algorithm

Sorting strings by comparisons (e.g. standard QuickSort + strcmp-like function) may be a bit slow, especially for long strings sharing a common prefix (the comparison function takes O(s) time, where s is the length of string), thus a standard solution has the complexity of O(s * nlog n). Are there any known faster algorithms?

If you know that the strings consist only of certain characters (which is almost always the case), you can use a variant of BucketSort or RadixSort.

You could build a trie, which should be O(s*n), I believe.
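A minimal sketch of the trie idea in C++, assuming byte-wise ordering of std::string. TrieNode, collect and trie_sort are hypothetical names, and the fixed table of 256 children per node trades memory for simplicity.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

struct TrieNode {
    std::array<int, 256> child;  // index of the child per byte, -1 if absent
    int count = 0;               // number of input strings ending here
    TrieNode() { child.fill(-1); }
};

// Depth-first walk in byte order; emits each stored string count times.
static void collect(const std::vector<TrieNode>& nodes, int cur,
                    std::string& prefix, std::vector<std::string>& out) {
    for (int k = 0; k < nodes[cur].count; ++k) out.push_back(prefix);
    for (int c = 0; c < 256; ++c) {
        const int next = nodes[cur].child[c];
        if (next < 0) continue;
        prefix.push_back(static_cast<char>(c));
        collect(nodes, next, prefix, out);
        prefix.pop_back();
    }
}

std::vector<std::string> trie_sort(const std::vector<std::string>& input) {
    std::vector<TrieNode> nodes(1);  // node 0 is the root
    for (const std::string& s : input) {
        int cur = 0;
        for (unsigned char c : s) {
            if (nodes[cur].child[c] < 0) {
                // Record the new index before the vector may reallocate.
                nodes[cur].child[c] = static_cast<int>(nodes.size());
                nodes.emplace_back();
            }
            cur = nodes[cur].child[c];
        }
        ++nodes[cur].count;
    }
    std::vector<std::string> out;
    std::string prefix;
    collect(nodes, 0, prefix, out);
    return out;
}
```

Insertion touches every character once, which is the O(s*n) part. The traversal adds a factor of 256 per trie node, so this pays off mainly when many strings share prefixes.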

Please search for "Sedgewick Multikey quick sort" (Sedgewick wrote famous algorithms textbooks in C and Java). His algorithm is relatively easy to implement and quite fast. It avoids the problem you are talking about. There is also the burst sort algorithm, which claims to be faster, but I don't know of any implementation.
There is an article Fast String Sort in C# and F# that describes the algorithm and has a reference to Sedgewick's code as well as to C# code. (disclosure: it's an article and code that I wrote based on Sedgewick's paper).

Summary
I found the string_sorting repo by Tommi Rantala comprehensive: it includes many known efficient (string) sorting algorithms, e.g. MSD radix sort, burstsort and multi-key-quicksort. In addition, most of them are also cache efficient.
My Experience
It appears to me three-way radix/string quicksort is one of the fastest string sorting algorithms. Also, MSD radix sort is a good one. They are introduced in Sedgewick's excellent Algorithms book.
Here are some results to sort leipzig1M.txt taken from here:
$ wc leipzig1M.txt
# lines words characters
1'000'000 21'191'455 129'644'797 leipzig1M.txt
Method               Time
Hoare                7.8792s
Quick3Way            7.5074s
Fast3Way             5.78015s
RadixSort            4.86149s
Quick3String         4.3685s
Heapsort             32.8318s
MergeSort            16.94s
std::sort/introsort  6.10666s
MSD+Q3S              3.74214s
The charming thing about three-way radix/string quicksort is it is really simple to implement, effectively only about ten source lines of code.
template<typename RandomIt>
void insertion_sort(RandomIt first, RandomIt last, size_t d)
{
const int len = last - first;
for (int i = 1; i < len; ++i) {
// insert a[i] into the sorted sequence a[0..i-1]
for (int j = i; j > 0 && std::strcmp(&(*(first+j))[d], &(*(first+j-1))[d]) < 0; --j)
iter_swap(first + j, first + j - 1);
}
}
template<typename RandomIt>
void quick3string(RandomIt first, RandomIt last, size_t d)
{
if (last - first < 2) return;
#if 0 // seems not to help much
if (last - first <= 8) { // change the threshold as you like
insertion_sort(first, last, d);
return;
}
#endif
typedef typename std::iterator_traits<RandomIt>::value_type String;
typedef typename string_traits<String>::value_type CharT;
typedef std::make_unsigned_t<CharT> UCharT;
RandomIt lt = first, i = first + 1, gt = last - 1;
/* make lo = median of {lo, mid, hi} */
RandomIt mid = lt + ((gt - lt) >> 1);
if ((*mid)[d] < (*lt)[d]) iter_swap(lt, mid);
if ((*mid)[d] < (*gt)[d]) iter_swap(gt, mid);
// now mid is the largest of the three, then make lo the median
if ((*lt)[d] < (*gt)[d]) iter_swap(lt, gt);
UCharT pivot = (*first)[d];
while (i <= gt) {
int diff = (UCharT) (*i)[d] - pivot;
if (diff < 0) iter_swap(lt++, i++);
else if (diff > 0) iter_swap(i, gt--);
else ++i;
}
// Now a[lo..lt-1] < pivot = a[lt..gt] < a[gt+1..hi].
quick3string(first, lt, d); // sort a[lo..lt-1]
if (pivot != '\0')
quick3string(lt, gt+1, d+1); // sort a[lt..gt] on following character
quick3string(gt+1, last, d); // sort a[gt+1..hi]
}
/*
* Three-way string quicksort.
* Similar to MSD radix sort, we first sort the array on the leading character
* (using quicksort), then apply this method recursively on the subarrays. On
* first sorting, a pivot v is chosen, then partition it in 3 parts, strings
* whose first character are less than v, equal to v, and greater than v. Just
* like the partitioning in classic quicksort but with comparing only the 1st
* character instead of the whole string. After partitioning, only the middle
* (equal-to-v) part can sort on the following character (index of d+1). The
* other two recursively sort on the same depth (index of d) because these two
* haven't been sorted on the dth character (just partitioned them: <v or >v).
*
* Time complexity: between O(N) and O(N*lgN), space complexity: O(lgN).
* Explanation: N * string length (for partitioning, find equal-to-v part) +
* O(N*lgN) (to do the quicksort thing)
* character comparisons (instead of string comparisons in normal quicksort).
*/
template<typename RandomIt>
void str_qsort(RandomIt first, RandomIt last)
{
quick3string(first, last, 0);
}
NOTE: But if you, like me, search Google for "fastest string sorting algorithm", chances are the answer is burstsort, a cache-aware MSD radix sort variant (paper). I also found this paper by Bentley and Sedgewick helpful; it describes Multikey Quicksort.
