Why does the word ladder problem work differently on GFG and LeetCode? - string

In this question we have to print all the minimum-length sequences that reach the target word. When I solve this question on GFG, it runs fine, but not on LeetCode.
Here is my code:
class Solution {
public:
    vector<vector<string>> findSequences(string beginWord, string endWord, vector<string>& wordList) {
        unordered_set<string> st(wordList.begin(), wordList.end());
        queue<vector<string>> p;
        p.push({beginWord});
        vector<string> usedOnLevel;
        usedOnLevel.push_back(beginWord);
        int level = 0;
        vector<vector<string>> ans;
        while (!p.empty()) {
            vector<string> vec = p.front();
            p.pop();
            if ((int)vec.size() > level) {
                level++;
                for (auto it : usedOnLevel) {
                    st.erase(it);
                }
            }
            string word = vec.back();
            if (word == endWord) {
                if (ans.size() == 0) {
                    ans.push_back(vec);
                } else if (ans[0].size() == vec.size()) {
                    ans.push_back(vec);
                }
            }
            for (int i = 0; i < (int)word.length(); i++) {
                char original = word[i];
                for (char ch = 'a'; ch <= 'z'; ch++) {
                    word[i] = ch;
                    if (st.find(word) != st.end()) {
                        usedOnLevel.push_back(word);
                        vec.push_back(word);
                        p.push(vec);
                        vec.pop_back();
                    }
                }
                word[i] = original;
            }
        }
        return ans;
    }
};

The difference is that LeetCode throws bigger problems at you, and so correct code with poor performance is going to break. And your code has poor performance.
Why? Well, for a start, for each word you find a path to, for each possible substitution, you're looking through the whole word list for a match. So suppose I start with all of the 5-letter words in the official Scrabble dictionary. There are about 9000 of those. For each word you reach you're going to generate 26*5 = 130 candidate words, and checking each candidate against a 9000-word list costs about 130 * 9000 = 1,170,000 word comparisons per word expanded, mostly to find nothing. Your algorithm wanted to do more than just that, but it has already timed out.
How could you make that faster? Here is one idea. Create a data structure to answer the following question:
by position of the deleted letter:
    by the resulting string:
        list of words that matched
For the entire Scrabble dictionary this data structure only has around 45,000 entries, and it makes it easy to find all words next to a given word in the word ladder.
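As a sketch in C++, with '*' standing in for the deleted letter (an arbitrary marker, not part of any word), the structure might be built like this:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Map each "pattern" (a word with one letter blanked out) to the words
// that match it. Two words are neighbors in the word ladder exactly
// when they produce the same pattern for some position.
std::unordered_map<std::string, std::vector<std::string>>
buildPatternMap(const std::vector<std::string>& words) {
    std::unordered_map<std::string, std::vector<std::string>> patterns;
    for (const auto& w : words) {
        for (std::size_t i = 0; i < w.size(); i++) {
            std::string p = w;
            p[i] = '*';              // "delete" the letter at position i
            patterns[p].push_back(w);
        }
    }
    return patterns;
}
```

The neighbors of "hot", for example, are found by looking up "*ot", "h*t" and "ho*", with no scan of the full word list.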
OK, great! Is that enough? Well...probably not. You're starting from startWord and finding all chains of words you can find from there. Most of which are going nowhere near endWord and represent wasted work. If the minimum length chain is fairly long, this can easily be an exponential amount of wasted effort. How can we avoid it?
The answer is to do a breadth-first search from endWord to find out how far away each word is from endWord. In this search we can also record for each word which words moved you closer. Again, even for all of the Scrabble dictionary, this data structure will be of manageable size. And you can break it off as soon as you've found how to get to startWord.
But now with this pre-processing, it is easy to start with startWord and recursively find all solutions. Because all of the work you'll be doing is enumerating paths that you already know will work.
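Concretely, the whole pipeline might look like this in C++. This is a sketch, not a drop-in LeetCode submission: the helper names are mine, and neighbors() uses the simple 26-way substitution scan for brevity rather than the faster pattern lookup described earlier.

```cpp
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Words one letter away from `word` that are in the dictionary.
std::vector<std::string> neighbors(const std::string& word,
                                   const std::unordered_set<std::string>& dict) {
    std::vector<std::string> out;
    std::string w = word;
    for (std::size_t i = 0; i < w.size(); i++) {
        char original = w[i];
        for (char c = 'a'; c <= 'z'; c++) {
            w[i] = c;
            if (c != original && dict.count(w)) out.push_back(w);
        }
        w[i] = original;
    }
    return out;
}

// Phase 2: walk from `word` toward endWord, only stepping to neighbors
// whose recorded distance is exactly one less, so every emitted path is
// a shortest one -- no wasted exploration.
void enumerate(const std::string& word, const std::string& endWord,
               const std::unordered_set<std::string>& dict,
               const std::unordered_map<std::string, int>& dist,
               std::vector<std::string>& path,
               std::vector<std::vector<std::string>>& ans) {
    path.push_back(word);
    if (word == endWord) {
        ans.push_back(path);
    } else {
        for (const auto& nb : neighbors(word, dict)) {
            auto it = dist.find(nb);
            if (it != dist.end() && it->second + 1 == dist.at(word)) {
                enumerate(nb, endWord, dict, dist, path, ans);
            }
        }
    }
    path.pop_back();
}

std::vector<std::vector<std::string>>
findSequences(const std::string& beginWord, const std::string& endWord,
              const std::vector<std::string>& wordList) {
    std::unordered_set<std::string> dict(wordList.begin(), wordList.end());
    if (!dict.count(endWord)) return {};
    dict.insert(beginWord);

    // Phase 1: BFS from endWord; dist[w] = length of a shortest ladder
    // from w to endWord. Break off once beginWord has been reached.
    std::unordered_map<std::string, int> dist;
    std::queue<std::string> q;
    q.push(endWord);
    dist[endWord] = 0;
    while (!q.empty()) {
        std::string w = q.front();
        q.pop();
        if (w == beginWord) break;
        for (const auto& nb : neighbors(w, dict)) {
            if (!dist.count(nb)) {
                dist[nb] = dist[w] + 1;
                q.push(nb);
            }
        }
    }

    std::vector<std::vector<std::string>> ans;
    if (!dist.count(beginWord)) return ans;   // no ladder exists
    std::vector<std::string> path;
    enumerate(beginWord, endWord, dict, dist, path, ans);
    return ans;
}
```

On the classic example (hit to cog through hot/dot/dog/lot/log), this finds exactly the two shortest ladders.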


How do I simply remove duplicates in my vector?

I am new to coding and struggling with a section in my code. I am at the part where I want to remove duplicate int values from my vector.
My duplicated vector contains: 1 1 2 1 4
My goal is to get a deduplicated vector: 1, 2, 4.
This is what I have so far. It also needs to be a rather simple solution: no pointers and fancy stuff, as I still need to study those in the future.
for (int i = 0; i < duplicatedVector.size(); i++) {
    int temp = duplicatedVector.at(i);
    int counter = 0;
    if (temp == duplicatedVector.at(i)) {
        counter++;
        if (counter > 1) {
            deduplicatedVector.push_back(temp);
        }
    }
}
Could anyone tell me what I do wrong? I genuinely am trying to iterate through the vector and delete the duplicated ints, in the given order.
Your algorithm is not well-enough thought out.
Break it up:
for each element of the original vector:
    is it in the result vector?
        yes: do nothing
        no: add it to the result vector
You have the outer loop, but the membership test is confused. The result vector is not the same as the original vector, and is not to be indexed the same.
To determine whether an element is in a vector, you need a loop. Loop through your result vector to see if the element is in it. If you find it, it is, so break the inner loop. If you do not, you don't.
You can tell whether or not you found a duplicate by the final value of your inner loop index (the index into the result vector). If it equals result.size() then no duplicate was found.
Clearer variable naming might help as well. You are calling your original/source vector duplicatedVector, and your result vector deduplicatedVector. Even hasDuplicates and noDuplicates would be easier to mentally parse.
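A minimal sketch of the outline above in C++ (the function name dedup is just illustrative):

```cpp
#include <vector>

// For each element of the source, scan the result built so far and
// append the element only if the scan reached the end without a match.
std::vector<int> dedup(const std::vector<int>& source) {
    std::vector<int> result;
    for (std::size_t i = 0; i < source.size(); i++) {
        std::size_t j = 0;
        for (; j < result.size(); j++) {
            if (result[j] == source[i]) break;   // found a duplicate
        }
        if (j == result.size()) {                // scanned to the end: new value
            result.push_back(source[i]);
        }
    }
    return result;
}
```

With the input from the question, {1, 1, 2, 1, 4}, this produces {1, 2, 4}, preserving first-seen order.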
You could use a set since it eliminates duplicates:
#include <iostream>
#include <set>
#include <vector>
using namespace std;

int main() {
    vector<int> vec = {2, 4, 2, 7, 34, 34};
    vector<int> dedupl;
    set<int> mySet;
    for (int i = 0; i < (int)vec.size(); i++) {
        mySet.insert(vec[i]);
    }
    // note: a std::set stores elements in sorted order, not insertion order
    for (int elem : mySet) {
        dedupl.push_back(elem);
    }
    for (int elem : dedupl) {
        cout << elem << " ";
    }
}

O(N) Simple Diffing Algorithm Implementation -- is this right?

I just posted this on HN but it doesn't seem to be getting much uptake. I had a question about diffing: I wanted to know if an implementation I'm using is alright. It seems a little too simple, and the literature on diffing is dense.
Background: I've been building a rendering engine for a code editor the past couple of days. Rendering huge chunks of highlighted syntax can get laggy. It's not worth switching to React at this stage, so I wanted to just write a quick diff algorithm that would selectively update only changed lines.
I found this article:
https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm-part-1/
With a link to this paper, the initial Git diff implementation:
http://www.xmailserver.org/diff2.pdf
I couldn't find the PDF to start with, but read "edit graph" and immediately thought — why don't I just use a hashtable to store lines from LEFT_TEXT and references to where they are, then iterate over RIGHT_TEXT and return matches one by one, also making sure that I keep track of the last match to prevent jumbling?
The algorithm I produced is only a few lines and seems accurate. It's O(N) time complexity, whereas the paper above gives O(ND), where D is the minimum edit distance.
function lineDiff (left, right) {
  left = left.split('\n');
  right = right.split('\n');
  let lookup = {};
  // Store line numbers from LEFT in a lookup table
  left.forEach(function (line, i) {
    lookup[line] = lookup[line] || [];
    lookup[line].push(i);
  });
  // Last line we matched
  var minLine = -1;
  return right.map(function (line) {
    lookup[line] = lookup[line] || [];
    var lineNumber = -1;
    if (lookup[line].length) {
      lineNumber = lookup[line].shift();
      // Make sure we're looking ahead
      if (lineNumber > minLine) {
        minLine = lineNumber;
      } else {
        lineNumber = -1;
      }
    }
    return {
      value: line,
      from: lineNumber
    };
  });
}
RunKit link: https://runkit.com/keithwhor/line-diff
What am I missing? I can't find other references to doing diffing like this. Everything just links back to that one paper.

LeetCode 3 -- find the longest substring without repeating characters

The target is simple: find the longest substring without repeating characters. Here is the code:
class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        int ans = 0;
        int dic[256];
        memset(dic, -1, sizeof(dic));
        int len = s.size();
        int idx = -1;
        for (int i = 0; i < len; i++) {
            unsigned char c = s[i];  // unsigned, so non-ASCII chars can't index negatively
            if (dic[c] > idx)
                idx = dic[c];
            ans = max(ans, i - idx);
            dic[c] = i;
        }
        return ans;
    }
};
From its concise expression, I think this is a high-performance method, and its time complexity is just O(n). But I'm confused about how this method works; though I came up with some examples to understand it, can anyone give me some tips or ideas?
What it is doing is recording the position where each character was last seen.
As you step through, each newly encountered character's non-repeating stretch can extend back at most to just past the position where that character was last seen, and for all later indices the window can never reach further back, because we have now seen a duplicate.
So idx holds the boundary: the most recent last-seen position of any character repeated inside the current window, marking the position just before the start of the longest candidate non-repeating substring ending at i.
I'm certain that the ans = max() code could be optimised slightly, as after encountering a new duplicate, you have to go forward at least ans chars from the start of that duplicate before ans can be improved again. You still need to do the rest of the work maintaining dic and idx, but you could avoid that particular test for ans for a few iterations. You would have to do a lot of unrolling to benefit, though.
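A standalone copy of the function makes the behaviour easy to check on small inputs; "abba" is the instructive case, since the second 'a' has a last-seen position behind idx and must not move the window start backwards:

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Same algorithm as in the question, standalone. idx is the index just
// before the current window; the window (idx, i] never contains a repeat.
int lengthOfLongestSubstring(const std::string& s) {
    int dic[256];
    std::memset(dic, -1, sizeof(dic));
    int ans = 0, idx = -1;
    for (int i = 0; i < (int)s.size(); i++) {
        unsigned char c = s[i];
        if (dic[c] > idx) idx = dic[c];   // repeat inside the window: shrink
        ans = std::max(ans, i - idx);
        dic[c] = i;                       // record last-seen position
    }
    return ans;
}
```

For "abba": at the second 'b', idx moves to 1; at the second 'a', dic['a'] = 0 is not greater than idx, so idx stays at 1 and the answer remains 2.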

Iterative deepening search : Is it recursive?

I've searched the internet about the IDS algorithm and I keep finding examples, but they are all with recursion, and as I understood it, iterative is not recursive...
So can you please give me some examples of the IDS algorithm? (An implementation would be great, and without recursion.)
Thanks in advance! You will be a life saver!
The iterative part is not recursive: at the top it is more or less:
int limit = 0;
Solution sol;
do {
    limit++;
    sol = search(problem, limit);
} while (sol == null);
// do something with the solution.
This said, in most cases searching for a solution is indeed implemented recursively:
Solution search(Problem problem, int limit) {
    return search(problem, 0, limit);
}

Solution search(Problem problem, int price, int limit) {
    if (problem.solved) {
        return problem.getSolution();
    }
    for (int value = 0; value < valuerange; value++) {
        problem.assignVariable(value);
        int newprice = price + problem.price();
        if (price < limit) {
            Solution solution = search(problem, newprice, limit);
            if (solution != null) {
                return solution;
            }
        }
        problem.backtrackVariable();
    }
    return null;
}
But there exists an automatic procedure to turn any recursive program into a non-recursive one.
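For example, the depth-limited search itself can be run with an explicit stack instead of recursion. This is a hedged sketch on a plain adjacency-list graph; the function name and graph representation are my own choices, not from the question:

```cpp
#include <utility>
#include <vector>

// Iterative deepening without recursion: the outer loop raises the depth
// limit; the inner depth-limited DFS uses an explicit stack of
// (node, depth) pairs. Returns the depth at which target is first found,
// or -1 if it is unreachable.
int idsDepth(const std::vector<std::vector<int>>& adj, int start, int target) {
    int n = (int)adj.size();
    for (int limit = 0; limit <= n; limit++) {          // the iterative part
        std::vector<std::pair<int, int>> stack;
        stack.push_back({start, 0});
        while (!stack.empty()) {
            auto [node, depth] = stack.back();
            stack.pop_back();
            if (node == target) return depth;
            if (depth < limit) {
                for (int nb : adj[node]) {
                    stack.push_back({nb, depth + 1});
                }
            }
        }
    }
    return -1;                                          // never reached
}
```

Because the limit grows one level at a time, the first limit at which the target appears is its shortest depth, which is what gives IDS its BFS-like optimality with DFS-like memory use.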
If you are thinking in algorithm terms (not just implementation), this would mean applying iteration at all nodes of the search tree, instead of just at the root node.
In the case of chess programs, this does have some benefits. It improves move ordering, even in the case where a branch that was previously pruned by alpha-beta is later included. The cost of the extra search is kept low by using a transposition table.
https://www.chessprogramming.org/Internal_Iterative_Deepening

Most efficient way to find the common prefix of many strings

What is the most efficient way to find the common prefix of many strings?
For example:
For this set of strings
/home/texai/www/app/application/cron/logCron.log
/home/texai/www/app/application/jobs/logCron.log
/home/texai/www/app/var/log/application.log
/home/texai/www/app/public/imagick.log
/home/texai/www/app/public/status.log
I wanna get /home/texai/www/app/
I want to avoid char-by-char comparisons.
You cannot avoid going through at least the common parts to find the common prefix.
I don't think this needs any fancy algorithm. Just keep track of the current common prefix, then shorten the prefix by comparing the current prefix with the next string.
Since this is common prefix of all strings, you may end up with empty string (no common prefix).
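A sketch of that shortening loop, in C++ here (the same shape works in any language; commonPrefix is an illustrative name):

```cpp
#include <string>
#include <vector>

// Start with the first string as the candidate prefix and shorten it
// against each subsequent string; stop early if it becomes empty.
std::string commonPrefix(const std::vector<std::string>& strs) {
    if (strs.empty()) return "";
    std::string prefix = strs[0];
    for (std::size_t i = 1; i < strs.size() && !prefix.empty(); i++) {
        std::size_t j = 0;
        while (j < prefix.size() && j < strs[i].size() && prefix[j] == strs[i][j]) {
            j++;
        }
        prefix.resize(j);   // keep only the part still shared
    }
    return prefix;
}
```

Each string is read only up to the point where it deviates from the current prefix, so the total work is bounded by the input size.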
I'm not sure what you mean by avoiding char-by-char comparisons, but you at least need to read the common prefix from each of the strings, so the following is the best you can achieve: just iterate over the strings until they deviate or until the current longest-prefix count is reached.
List<string> list = new List<string>()
{
    "/home/texai/www/app/application/cron/logCron.log",
    "/home/texai/www/app/application/jobs/logCron.log",
    "/home/texai/www/app/var/log/application.log",
    "/home/texai/www/app/public/imagick.log",
    "/home/texai/www/app/public/status.log"
};
int maxPrefix = list[0].Length;
for (int i = 1; i < list.Count; i++)
{
    int pos = 0;
    for (; pos < maxPrefix && pos < list[i].Length && list[0][pos] == list[i][pos]; pos++) ;
    maxPrefix = pos;
}
// this is the common prefix
string prefix = list[0].Substring(0, maxPrefix);
