Implement phone directory using two tries - string

I encountered this interview question:
“Implement a phone directory using data structures.”
I want to solve it using tries. I tried using two tries, one for names and another for phone numbers,
but I ran into a difficulty.
Suppose I have to add three entries (AB “112”, BC “124”, CD “225”).
If I then query the name for the number “225”, how do I return CD?
That is, how will these two tries be linked?
One approach I was considering is to keep two pointers in each trie.
These pointers would point to the first and last node of the corresponding word in the other trie.
For example, if the structures are as follows:
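struct nametrie
{
    struct nametrie *child[26];      /* one child per letter A-Z */
    struct phonetrie *head, *tail;   /* first and last node of the number in the phone trie */
    struct phonetrie *root;
    /* ----------- */
};
struct phonetrie
{
    struct phonetrie *child[10];     /* one child per digit 0-9 */
    struct nametrie *head, *tail;    /* first and last node of the name in the name trie */
    struct nametrie *root;
    /* ----------- */
};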
Then for AB “112”,
the name trie will store head (1) and tail (2).
But I think this approach will not work for duplicate entries (one name with multiple numbers).
Can someone please explain a good approach? I am not looking for code but for a good understanding of the approach, maybe via a diagram or an algorithm.

I don't know C, so I can't comment on your code.
The idea of using tries is valid.
You seem to be missing what data the nodes of a trie can hold.
A node in a tree has two main components:
the data it holds, which can be of any type
a list of children (or left and right children, or any combination of children)
What we will do here is add another field to each node and call it "theValue".
So the trie node will look like this:
class TrieNode {
    public char theChar;
    public String theValue;
    public List<TrieNode> children;
}
So for forward lookup (name to phone) you construct one trie, and on the node that matches an entry in the directory you set theValue to that entry's phone number.
You will need to create a second trie to do the same for reverse lookup (phone to name).
As an example, this is how it will look for this data:
(AB “112”, AC “124”, ACD “225”)
//create nodes
TrieNode root = new TrieNode();
TrieNode A = new TrieNode();
A.theChar = 'A';
TrieNode B = new TrieNode();
B.theChar = 'B';
TrieNode C = new TrieNode();
C.theChar = 'C';
TrieNode D = new TrieNode();
D.theChar = 'D';
//link nodes together
root.children = new ArrayList<>();
root.children.add(A);
A.children = new ArrayList<>();
A.children.add(B); //path A -> B spells "AB"
A.children.add(C); //path A -> C spells "AC"
C.children = new ArrayList<>();
C.children.add(D); //path A -> C -> D spells "ACD"
//fill the data
B.theValue = "112";
C.theValue = "124";
D.theValue = "225";
Now you can easily traverse this trie, and when you reach a node and want to check its value, just read theValue.
I hope that is clear.
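To make the "value on the node" idea concrete, here is a minimal Python sketch of the two-trie directory. It is an illustration under assumptions, not the answerer's code: the names `Trie`, `PhoneDirectory`, `by_name` and `by_number` are hypothetical, and each node keeps a list of values (instead of a single `theValue`) so that one name with several numbers, the duplicate case raised in the question, is handled as well.

```python
class TrieNode:
    """One trie node: child links keyed by character, plus the values stored here."""

    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.values = []    # every value stored for the key that ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        """Walk (and create) the path for `key`, then attach `value` to its last node."""
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.values.append(value)

    def lookup(self, key):
        """Return all values stored under `key`, or an empty list if the key is absent."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return []
        return list(node.values)


class PhoneDirectory:
    """Hypothetical wrapper: one trie per lookup direction, filled together."""

    def __init__(self):
        self.by_name = Trie()    # name -> numbers
        self.by_number = Trie()  # number -> names

    def add(self, name, number):
        self.by_name.insert(name, number)
        self.by_number.insert(number, name)


directory = PhoneDirectory()
for name, number in [("AB", "112"), ("AC", "124"), ("ACD", "225"), ("AB", "113")]:
    directory.add(name, number)

print(directory.by_number.lookup("225"))  # ['ACD']
print(directory.by_name.lookup("AB"))     # ['112', '113']
```

With this layout the two tries never need pointers into each other: every entry is simply inserted twice, once per direction.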

Related

AS3 "Advanced" string manipulation

I'm making an air dictionary and I have a(nother) problem. The main app is ready to go and works perfectly but when I tested it I noticed that it could be better. A bit of context: the language (ancient egyptian) I'm translating from does not use punctuation so a phrase canlooklikethis. Add to that the sheer complexity of the glyph system (6000+ glyphs).
Right now my app works like this:
The user chooses the glyphs composing his/her word.
The app transforms those glyphs into alphanumerical values (A1 - D36 - X1A, etc.).
The code compares the resulting code (say: A5AD36) to a list of XML values.
If the word is found (A5AD36 = priestess of Bast), the user gets the translation. If not, s/he gets all the possible words corresponding to the two glyphs (A5A & D36).
If the user knows the string is a word, no problem. But if s/he enters a few words, s/he'll get a few more choices than hoped (example: query = A1A5AD36 gets A1 - A5A - D36 - A5AD36).
What I would like to do is this:
query = A1A5AD36 //word/phrase to be translated;
varArray = [A1, A5A, D36] //variables containing the value of the glyphs.
Corresponding possible words from the xml : A1, A5A, D36, A5AD36.
Possible phrases: A1 A5A D36 / A1 A5AD36 / A1A5A D36 / A1A5AD36.
Possible phrases with only legal words: A1 A5A D36 / A1 A5AD36.
I'm not sure I'm being really clear, but to keep things simple, I'd like to get all the possible phrases containing only legal words and filter out the other ones
(example in English: TOBREAKFAST. Legal = to break fast / to breakfast. Illegal = tobreak fast).
I've managed to get all the possible words, but not the rest. Right now, when I run my app, I have an array containing A1 - A5A - D36 - A5AD36. But I'm stuck going forward.
Does anyone have an idea ? Thank you :)
function fnSearch(e: Event): void {
    var val: int = sp.length; //sp is an array filled with variables containing the code for each used glyph.
    for (var i: int = 0; i < val; i++) { //repeat for every glyph used.
        var X: String = ""; //variable created to compare with xml dictionary
        for (var i2: int = 0; i2 < val - i; i2++) { // if it's the first time, use the first glyph-code, else the one after last used (bounded so sp[i2 + i] always exists).
            if (X == "") {
                X = sp[i];
            } else {
                X = X + sp[i2 + i];
            }
            xmlresult = myXML.mot.cd; //xmlresult = alphanumerical codes corresponding to words from XMLList already imported
            trad = myXML.mot.td; //same with translations.
            for (var i3: int = 0; i3 < xmlresult.length(); i3++) { //check if element X is in dictionary
                var codeElement: XML = xmlresult[i3]; //variable to compare with X
                var tradElement: XML = trad[i3]; //variable corresponding to codeElement
                if (X == codeElement.toString()) { //if codeElement[i3] is legal, add it to array of legal words.
                    checkArray.push(codeElement); //checkArray is an array filled with legal words.
                }
            }
        }
    }
    var iT2: int = -1; //iT2 set to an impossible index for next lines.
    for (var iT: int = 0; iT < checkArray.length; iT++) { //check if the word searched by user is in the results.
        if (checkArray[iT] == query) {
            iT2 = iT;
        }
    }
    if (iT2 != -1) { //if complete query is found, put it on top of the array so it appears on top of the results.
        var oldFirst: String = checkArray[0];
        checkArray[0] = checkArray[iT2];
        checkArray[iT2] = oldFirst;
    }
    results.visible = true; //make result list visible
    loadingResults.visible = false; //loading screen
    fnPossibleResults(null); //update result list.
}
I end up with an array of variables containing the glyph codes (sp) and another with all the possible legal words (checkArray). What I don't know how to do is combine those two to make legal phrases as described above.
If there were only three glyphs, I could probably find a way, but the user can enter up to 60 glyphs.
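One possible way to build the legal phrases is a recursive split over the glyph list: at each position, try every run of consecutive glyphs that forms a legal word, then recurse on the rest. The sketch below is in Python rather than AS3 and only illustrates the idea; `legal_phrases` is a hypothetical name, and `sp` and `check_array` stand in for the question's `sp` and `checkArray`.

```python
def legal_phrases(glyphs, legal_words):
    """Return every split of `glyphs` into consecutive groups whose
    concatenated codes are all members of `legal_words`."""
    memo = {}  # start index -> list of phrases (each a list of words) covering glyphs[start:]

    def split_from(start):
        if start == len(glyphs):
            return [[]]  # exactly one way to split nothing: the empty phrase
        if start in memo:
            return memo[start]
        phrases = []
        word = ""
        for end in range(start, len(glyphs)):
            word += glyphs[end]  # grow the candidate word one glyph at a time
            if word in legal_words:
                for rest in split_from(end + 1):
                    phrases.append([word] + rest)
        memo[start] = phrases
        return phrases

    return [" ".join(phrase) for phrase in split_from(0)]


sp = ["A1", "A5A", "D36"]
check_array = {"A1", "A5A", "D36", "A5AD36"}
print(legal_phrases(sp, check_array))  # ['A1 A5A D36', 'A1 A5AD36']
```

The memo table keeps each suffix from being split more than once, which matters with up to 60 glyphs. The number of legal phrases can still grow quickly when many overlapping words are legal, so the result list itself may become large.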

Threadsafe mutable collection with fast elements removal and random get

I need a thread-safe data structure with three operations: remove, getRandom, reset.
I have only two ideas so far.
First: a Seq in a synchronized var.
import scala.util.Random

val all: Array[String] = ... //all possible.
var current: Array[String] = Array.empty[String]
def getRandom(): String = {
  val currentAvailable = current
  currentAvailable(Random.nextInt(currentAvailable.length))
}
def remove(s: String) = {
  this.synchronized {
    current = current diff Seq(s)
  }
}
def reset() = {
  this.synchronized {
    current = all
  }
}
Second:
Maintain some Map[String, Boolean], where the Boolean is true when the element is currently present. The main problem is making getRandom fast (not something like O(n) in the worst case).
Is there a better way to implement this?
Scala's TrieMap (scala.collection.concurrent.TrieMap) is a lock-free data structure that supports snapshots (like your currentAvailable) and fast removals.
I'm not a Scala expert, so this answer is general; I used Java for the example.
In short, the answer is yes.
If you use two maps such as:
Map<Integer,String> map=new HashMap<Integer,String>(); //is used to get random in constant time
Map<String,Integer> map1=new HashMap<String,Integer>(); //is used to remove in constant time
to store the data,
the main idea is to keep the keys (in this case the integers) contiguous, covering {0 ... size of map - 1}.
For example, to fill this structure, you need something like this:
int counter=0; //this is a global variable
for(String s : all){ //all your strings
    map.put(counter++, s);
}
//then, if you want the removal to be in constant time you need to fill the second map
for(Map.Entry<Integer,String> e : map.entrySet()){
    map1.put(e.getValue(),e.getKey());
}
The above code is the initialization; you need to run it again every time you reset the structure.
Then you can get a random value in O(1):
String getRandom(){
    int i = new Random().nextInt(counter); //random index from 0 (inclusive) to counter (exclusive); needs java.util.Random
    return map.get(i);
}
Now, to remove things, you use map1 to achieve it in constant time, O(1):
void remove(String s){
    if(!map1.containsKey(s))
        return; //s doesn't exist
    int last=counter-1; //index of the last element
    String val=map.get(last); //value of the last element
    int thisCounter=map1.get(s); //index of the element being removed
    map.put(thisCounter,val); //move the last element into the freed slot
    map1.put(val,thisCounter); //record the moved element's new index
    map.remove(last); //drop the old last slot
    map1.remove(s); //remove s from map1
    counter--; //reducing the counter by one
}
Obviously the main remaining issue is ensuring thread safety (synchronization), but by carefully analyzing the code you should be able to do that.
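The swap-with-last idea from the answer above can be sketched compactly in Python with one lock guarding both structures. This is an illustration under assumptions rather than a tuned implementation: `RandomPool` is a hypothetical name, the items are assumed to be unique, and a list plus a dict play the roles of `map` and `map1`.

```python
import random
import threading


class RandomPool:
    """Thread-safe pool with O(1) remove and get_random, and O(n) reset."""

    def __init__(self, all_items):
        self._all = list(all_items)    # the full set restored by reset()
        self._lock = threading.Lock()  # guards _items and _index together
        self._items = []               # current elements, densely packed
        self._index = {}               # element -> its position in _items
        self.reset()

    def reset(self):
        with self._lock:
            self._items = list(self._all)
            self._index = {s: i for i, s in enumerate(self._items)}

    def get_random(self):
        with self._lock:
            if not self._items:
                return None  # nothing left to pick from
            return random.choice(self._items)

    def remove(self, s):
        with self._lock:
            i = self._index.pop(s, None)
            if i is None:
                return False  # s is not currently present
            last = self._items.pop()  # take the last element off the end
            if i < len(self._items):  # s was not the last element:
                self._items[i] = last  # move the last element into the hole
                self._index[last] = i
            return True


pool = RandomPool(["a", "b", "c"])
pool.remove("a")
print(pool.get_random() in {"b", "c"})  # True
```

A single coarse lock keeps the list and the dict consistent with each other, which is the synchronization concern mentioned above. Whether that is fast enough under heavy contention depends on the workload.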

Linq to split/analyse substrings

I have got a List of strings like:
String1
String1.String2
String1.String2.String3
Other1
Other1.Other2
Test1
Stuff1.Stuff1
Text1.Text2.Text3
Folder1.Folder2.FolderA
Folder1.Folder2.FolderB
Folder1.Folder2.FolderB.FolderC
Now I would like to group this into:
String1.String2.String3
Other1.Other2
Test1
Stuff1.Stuff1
Text1.Text2.Text3
Folder1.Folder2.FolderA
Folder1.Folder2.FolderB.FolderC
If "String1" is contained in the next item, "String1.String2", I will ignore the first one,
and if the second item is contained in the third, I will take only the third, "String1.String2.String3",
and so on (n items). Each string is structured like a node/path and can be split on the dot.
As you can see in the Folder example, Folder2 has two different subfolder items, so I would need both strings.
Do you know how to handle this with Linq? I would prefer VB.Net but C# is also ok.
Regards Athu
Dim r = input.Where(Function(e, i) i = input.Count - 1 OrElse Not input(i + 1).StartsWith(e + ".")).ToList()
The condition inside the Where method checks whether the element is the last one in input, or is not followed by an element that starts with the current one plus a dot.
This solution relies on input being a List(Of String), so Count and input(i + 1) are available in O(1) time.
LINQ isn't really the correct approach here, because you need to access more than one item at a time.
I would go with something like this:
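// Extension methods must be declared inside a static class.
public static IEnumerable<string> Filter(this IEnumerable<string> source)
{
    string previous = null;
    foreach(var current in source)
    {
        // Emit the previous item unless the current one extends it as a path.
        if(previous != null && !current.StartsWith(previous + "."))
            yield return previous;
        previous = current;
    }
    // The last item has no successor, so it is always kept (if there is one).
    if(previous != null)
        yield return previous;
}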
Usage:
var result = strings.Filter();
Pretty simple one. Try this:
var lst = new List<string> { /*...*/ };
var sorted =
from item in lst
where lst.Last() == item || !lst[lst.IndexOf(item) + 1].Contains(item)
select item;
The following simple line can do the trick; I'm not sure about the performance cost, though (Count scans the whole list for every element, so it is O(n²)):
List<string> someStuff = new List<string>();
//Code to add the strings here; omitted for brevity
IEnumerable<string> result = someStuff.Where(s => someStuff.Count(x => x.StartsWith(s)) == 1);
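The adjacent-element check used by the answers above can also be written out step by step. The following Python sketch mirrors that logic on the question's sample data: an item is kept when it is the last one or when the next item does not start with it plus a dot. `keep_leaf_paths` is a hypothetical name, and the sketch assumes, like the answers, that a parent path always appears directly before its children.

```python
def keep_leaf_paths(paths):
    """Keep each path unless the item right after it extends it by a '.'-separated segment."""
    result = []
    for i, current in enumerate(paths):
        is_last = i == len(paths) - 1
        if is_last or not paths[i + 1].startswith(current + "."):
            result.append(current)
    return result


paths = [
    "String1",
    "String1.String2",
    "String1.String2.String3",
    "Other1",
    "Other1.Other2",
    "Test1",
    "Stuff1.Stuff1",
    "Text1.Text2.Text3",
    "Folder1.Folder2.FolderA",
    "Folder1.Folder2.FolderB",
    "Folder1.Folder2.FolderB.FolderC",
]
for path in keep_leaf_paths(paths):
    print(path)
```

Appending the dot before comparing avoids treating "Folder1" as a parent of an unrelated sibling such as "Folder10", which a plain substring or prefix test would do.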

Count of nodes in BST

I am trying to count the number of nodes in a Binary Search Tree and was wondering what the most efficient means was. These are the options that I have found:
1. store int count in the BST class
2. store int children in each node of the tree, holding the number of nodes under it
3. write a method that counts the number of nodes in the BST
If using option 3, I've written:
int InOrder() {
    Node *cur = root;
    int count = 0;
    Stack s; //stack of Node*; an actual object rather than a null pointer
    bool done = false;
    while(!done) {
        if(cur != NULL) {
            s.push(cur);
            cur = cur->left;
        }
        else {
            if(!s.IsEmpty()) {
                cur = s.pop();
                count++;
                cur = cur->right;
            }
            else {
                done = true;
            }
        }
    }
    return count;
}
but from looking at it, it seems like it would get stuck in an infinite loop between cur = cur->left; and cur = cur->right;
So which option is the most efficient and if it is option 3, then will this method work?
I think the first option is the quickest, and it requires only O(1) extra space. However, whenever you insert or delete an item, you need to update this value.
It will take O(1) time to get the number of all the nodes.
The second option would make this program way too complicated since deleting/inserting a node somewhere would have to update all of its ancestors. Either you add a parent pointer so you can adequately update each one of the ancestors, or you need to go through all the nodes in the tree and update the numbers again. Anyway I think this would be the worst option of all three.
The third option is good if you don't call it many times, since the first option, at O(1), is a lot quicker. This one takes O(n), because you need to visit every single node to count it.
In terms of your code, I think it's easier to write in a recursive way like below:
int getCount(Node* n)
{
    if (!n)
        return 0;
    return 1 + getCount(n->left) + getCount(n->right);
}
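For comparison, here is a small Python sketch of option 1 (a counter maintained on insert) next to option 3 written iteratively with an explicit stack, the same shape as the loop in the question. The class and function names are hypothetical, and duplicate keys are assumed to be ignored on insert.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class BST:
    def __init__(self):
        self.root = None
        self.count = 0  # option 1: kept up to date, so reading it is O(1)

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
            self.count += 1
            return
        cur = self.root
        while True:
            if key == cur.key:
                return  # assumption: duplicates are ignored, count unchanged
            side = "left" if key < cur.key else "right"
            child = getattr(cur, side)
            if child is None:
                setattr(cur, side, Node(key))
                self.count += 1
                return
            cur = child


def count_nodes(node):
    """Option 3: O(n) in-order walk with an explicit stack."""
    count = 0
    stack = []
    cur = node
    while cur is not None or stack:
        if cur is not None:
            stack.append(cur)  # descend left, remembering the path
            cur = cur.left
        else:
            cur = stack.pop()  # visit the node, then go right
            count += 1
            cur = cur.right
    return count


tree = BST()
for key in [5, 3, 8, 1, 4, 8]:
    tree.insert(key)
print(tree.count, count_nodes(tree.root))  # 5 5
```

The iterative walk does not loop forever: each node is pushed once and popped once, and the loop ends when both the current pointer and the stack are empty.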
Hope this helps!

Hadoop... Text.toString() conversion problems

I'm writing a simple program for enumerating triangles in directed graphs for my project. First, for each input arc (e.g. a b, b c, c a; note: a tab symbol serves as the delimiter) I want my map function to output the following pairs ([a, to_b], [b, from_a], [a_b, -1]):
public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output,
        Reporter reporter) throws IOException {
    String line = value.toString();
    String [] tokens = line.split(" ");
    output.collect(new Text(tokens[0]), new Text("to_"+tokens[1]));
    output.collect(new Text(tokens[1]), new Text("from_"+tokens[0]));
    output.collect(new Text(tokens[0]+"_"+tokens[1]), new Text("-1"));
}
Now my reduce function is supposed to cross join all pairs that have both to_'s and from_'s
and to simply emit any other pairs whose keys contain "_".
public void reduce(Text key, Iterator<Text> values,
        OutputCollector<Text, Text> output,
        Reporter reporter) throws IOException {
    String key_s = key.toString();
    if (key_s.indexOf("_")>0)
        output.collect(key, new Text("completed"));
    else {
        HashMap <String, ArrayList<String>> lists = new HashMap <String, ArrayList<String>> ();
        while (values.hasNext()) {
            String line = values.next().toString();
            String[] tokens = line.split("_");
            if (!lists.containsKey(tokens[0])) {
                lists.put(tokens[0], new ArrayList<String>());
            }
            lists.get(tokens[0]).add(tokens[1]);
        }
        for (String t : lists.get("to"))
            for (String f : lists.get("from"))
                output.collect(new Text(t+"_"+f), key);
    }
}
And this is where the most exciting stuff happens. tokens[1] throws an ArrayIndexOutOfBoundsException. If you scroll up, you can see that by this point the iterator should give values like "to_a", "from_b", "to_b", etc. When I just output these values, everything looks OK and I get "to_a", "from_b". But split() doesn't work at all; moreover, line.length() is always 1 and indexOf("_") returns -1! The very same indexOf WORKS PERFECTLY for keys, where we have pairs whose keys contain "_" and look like "a_b", "b_c".
I'm really puzzled by all this. MapReduce is supposed to save lives by making everything simple. Instead I spent several hours just localizing this.
Not sure if that's the problem, but try changing this:
String [] tokens = line.split(" ");
to this:
String [] tokens = line.split("\t");
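To see why the delimiter matters, here is a small Python sketch (not Hadoop code) of what the map step does with a tab-separated arc. `map_arc` is a hypothetical helper that mirrors the three `output.collect` calls. It only illustrates the delimiter mismatch and does not claim to explain every symptom described in the question.

```python
def map_arc(line, delimiter):
    """Emit the three (key, value) pairs for one input arc, as the map function intends."""
    tokens = line.split(delimiter)
    if len(tokens) < 2:
        # Same situation in which tokens[1] fails in the Java version.
        raise ValueError("line did not split into two tokens: %r" % line)
    src, dst = tokens[0], tokens[1]
    return [(src, "to_" + dst), (dst, "from_" + src), (src + "_" + dst, "-1")]


line = "a\tb"  # one arc, tab-delimited as stated in the question

print(line.split(" "))      # ['a\tb']
print(map_arc(line, "\t"))  # [('a', 'to_b'), ('b', 'from_a'), ('a_b', '-1')]
```

Splitting on a space leaves the whole line as a single token, so there is no second element to read. Splitting on the tab yields the two node names.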
