String unique characters [duplicate] - string

Implement an algorithm to determine if a string has all unique characters. What if you cannot use additional data structures?

If you can use a little auxiliary memory, then a small array of bits (indexed by the numerical code of the character) is all you need (if your characters are 4-byte Unicode ones you'll probably want a hashmap instead;-). Start with all bits at 0: scan the string from the start -- each time, you've found a duplicate if the bit corresponding to the current character is already 1 -- otherwise, no duplicates yet, set that bit to 1. This is O(N).
If you can't allocate any extra memory, but can alter the string, sorting the string then doing a pass to check for adjacent duplicates is the best you can do, O(N log N).
If you can't allocate extra memory and cannot alter the string, you need an O(N squared) check where each character is checked vs all the following ones.
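The three strategies above can be sketched in Python roughly as follows. The function names are made up for illustration, a `set` stands in for the bit array, and `sorted()` makes a copy, so the second function only shows the logic of the sort-based check rather than a true in-place version.

```python
def has_unique_chars_seen(s):
    """O(N) time: remember every character already seen."""
    seen = set()  # stands in for the bit array; works for any Unicode character
    for ch in s:
        if ch in seen:
            return False  # this character was seen before: duplicate
        seen.add(ch)
    return True


def has_unique_chars_sorted(s):
    """O(N log N): sort, then compare each character with its neighbour."""
    chars = sorted(s)  # sorted() copies; an in-place sort would avoid that
    return all(a != b for a, b in zip(chars, chars[1:]))


def has_unique_chars_quadratic(s):
    """O(N^2), no extra storage: compare each character with all later ones."""
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            if s[i] == s[j]:
                return False
    return True
```

All three should agree on any input; they differ only in the time and memory trade-off described above.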

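We can do it by assigning a prime number to every character and keeping a running product of the primes for the characters found so far. For each new character, first check whether the product is already divisible by that character's prime: if it is, the character is a duplicate; otherwise multiply the product by the prime.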
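A minimal Python sketch of this idea, assuming the input contains only the letters 'a' to 'z'. The helper names `first_primes` and `is_unique_by_primes` are invented for this example. Python integers grow as needed, which matters here because the product of many distinct primes does not fit in a 64-bit integer.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


PRIMES = first_primes(26)  # one prime per lowercase letter


def is_unique_by_primes(s):
    """Assumes s contains only 'a'..'z'; other characters are rejected."""
    product = 1
    for ch in s:
        offset = ord(ch) - ord("a")
        if not 0 <= offset < 26:
            raise ValueError("only 'a'..'z' are supported: %r" % ch)
        prime = PRIMES[offset]
        if product % prime == 0:
            return False  # this letter's prime is already a factor: duplicate
        product *= prime
    return True
```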

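Answer as a C program (assumes the string contains only the lowercase letters 'a' to 'z'):
#include <string.h>
/* Returns 1 if all characters in str are distinct, 0 otherwise. */
int is_uniq(const char *str)
{
    unsigned int flag = 0;       /* bit k is set once the letter 'a' + k has been seen */
    size_t i, len = strlen(str); /* measure the string once, not on every iteration */
    for (i = 0; i < len; i++) {
        int value = str[i] - 'a';
        if (flag & (1u << value)) {
            return 0;
        }
        flag |= (1u << value);
    }
    return 1;
}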

for each character in the string
    if any subsequent character matches it
        fail
succeed

One possible solution - You could extract the string into an array of characters, sort the array, and then loop over it, checking to see if the next character is equal to the current one.

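The prime method described by partik (based on this theorem). It's O(N) if each multiplication and modulo counts as one step. The product outgrows a 64-bit integer once many distinct letters have been seen, so this relies on Ruby's arbitrary-precision integers.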
# one prime per letter
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
starting_byte = ?a.ord
primes_product = 1
ARGV[0].each_byte do |byte|
  current_prime = primes[byte - starting_byte]
  if primes_product % current_prime == 0
    puts "Not unique"
    exit
  else
    primes_product = primes_product * current_prime
  end
end
puts "Unique"

I came across this thread while looking into a similar question and ended up with the following solution in C#, which counts how often each character occurs:
var data = "ABCDEFGADFGHETFASAJUTE";
var hash = new Dictionary<char, int>();
foreach (char c in data)
{
if (hash.ContainsKey(c))
{
hash[c] += 1;
}
else
{
hash.Add(c, 1);
}
}
var Characters = hash.Keys.ToArray();
var Frequencies = hash.Values.ToArray();

import java.io.*;
public class uniqueChar
{
boolean checkUniqueChar(String strin)
{
char[] str = strin.toCharArray();
java.util.Arrays.sort(str);
for(int i=0;i<str.length-1;i++)
{
if(str[i]==str[i+1])
return false;
}
return true;
}
public static void main(String argv[]) throws IOException
{
String str;
System.out.println("enter the string\n");
InputStreamReader in=new InputStreamReader(System.in);
BufferedReader bin=new BufferedReader(in);
str=bin.readLine();
System.out.println(new uniqueChar().checkUniqueChar(str));
}
}

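// Assumes str contains only the letters 'a'..'z', so each letter maps to one of 26 bits of an int.
public static boolean isUniqueChars(String str) {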
int checker = 0;
for (int i = 0; i < str.length(); ++i) {
int val = str.charAt(i) - 'a';
if ((checker & (1 << val)) > 0)
return false;
checker |= (1 << val);
}
return true;
}

Related

Dynamic character generator; Generate all possible strings from a character set

I want to make a dynamic string generator that will generate all possible unique strings of a given length from a character set.
I can do this very easily using for loops, but then the length is fixed rather than dynamic.
// Prints all possible strings with the length of 3
for a in allowedCharacters {
for b in allowedCharacters {
for c in allowedCharacters {
println(a+b+c)
}
}
}
But when I try to make the length dynamic, so that I can just call generate(length: 5), I get confused.
I found this Stack Overflow question, but the accepted answer generates strings of every length from 1 to maxLength, and I want every string to have length maxLength.
As noted above, use recursion. Here is how it can be done with C#:
static IEnumerable<string> Generate(int length, char[] allowed_chars)
{
if (length == 1)
{
foreach (char c in allowed_chars)
yield return c.ToString();
}
else
{
var sub_strings = Generate(length - 1, allowed_chars);
foreach (char c in allowed_chars)
{
foreach (string sub in sub_strings)
{
yield return c + sub;
}
}
}
}
private static void Main(string[] args)
{
string chars = "abc";
List<string> result = Generate(3, chars.ToCharArray()).ToList();
}
Please note that the run time of this algorithm and the amount of data it returns is exponential as the length increases which means that if you have large lengths, you should expect the code to take a long time and to return a huge amount of data.
Translation of @YacoubMassad's C# code to Swift:
func generate(length: Int, allowedChars: [String]) -> [String] {
if length == 1 {
return allowedChars
}
else {
let subStrings = generate(length - 1, allowedChars: allowedChars)
var arr = [String]()
for c in allowedChars {
for sub in subStrings {
arr.append(c + sub)
}
}
return arr
}
}
println(generate(3, allowedChars: ["a", "b", "c"]))
Prints:
aaa, aab, aac, aba, abb, abc, aca, acb, acc, baa, bab, bac, bba, bbb, bbc, bca, bcb, bcc, caa, cab, cac, cba, cbb, cbc, cca, ccb, ccc
While you can (obviously enough) use recursion to solve this problem, it is quite an inefficient way to do the job.
What you're really doing is just counting. In your example, with "a", "b" and "c" as the allowed characters, you're counting in base 3, and since you're allowing three character strings, they're three digit numbers.
An N-digit number in base M can represent M^N different possible values, going from 0 through M^N - 1. So, for your case, that's limit = pow(3, 3) - 1;. To generate all those values, you just count from 0 through the limit, and convert each number to base M, using the specified characters as the "digits". For example, in C++ the code can look like this:
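#include <cstddef>
#include <iostream>
#include <string>
int main() {
    std::string letters = "abc";
    std::size_t base = letters.length();
    std::size_t digits = 3;
    // Number of strings = base^digits, computed with integers to avoid
    // floating-point rounding from pow().
    std::size_t limit = 1;
    for (std::size_t d = 0; d < digits; d++) {
        limit *= base;
    }
    for (std::size_t i = 0; i < limit; i++) {
        std::size_t in = i;
        // Emit `digits` base-`base` digits of i, least significant first.
        for (std::size_t j = 0; j < digits; j++) {
            std::cout << letters[in % base];
            in /= base;
        }
        std::cout << "\t";
    }
}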
One minor note: as I've written it here, this produces the output in basically a little-endian format. That is, the "digit" that varies the fastest is on the left, and the one that changes the slowest is on the right.
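For comparison, here is a rough Python sketch of the same counting idea that emits the strings with the slowest-changing "digit" on the left. The function name `generate` and its parameters are chosen for this example only.

```python
def generate(length, alphabet):
    """Yield every string of exactly `length` characters drawn from `alphabet`."""
    base = len(alphabet)
    for number in range(base ** length):
        digits = []
        remaining = number
        for _ in range(length):
            # peel off the least significant base-`base` digit
            remaining, digit = divmod(remaining, base)
            digits.append(alphabet[digit])
        # digits were collected least significant first; reverse them so the
        # fastest-changing character ends up on the right
        yield "".join(reversed(digits))


if __name__ == "__main__":
    print(", ".join(generate(3, "abc")))
```

Because it is a generator, the strings are produced one at a time and the full list never has to be held in memory, although the number of strings still grows exponentially with the length.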

interview riddle (string manipulation) - explanation needed

I am studying for an interview and encountered a question and its solution.
I am having a problem with one line in the solution and was hoping someone here could explain it.
The question:
Write a method to replace all spaces in a string with '%20'.
The solution:
public static void ReplaceFun(char[] str, int length) {
    int spaceCount = 0, newLength, i = 0;
    for (i = 0; i < length; i++) {
        if (str[i] == ' ') {
            spaceCount++;
        }
    }
    newLength = length + spaceCount * 2;
    str[newLength] = '\0';
    for (i = length - 1; i >= 0; i--) {
        if (str[i] == ' ') {
            str[newLength - 1] = '0';
            str[newLength - 2] = '2';
            str[newLength - 3] = '%';
            newLength = newLength - 3;
        } else {
            str[newLength - 1] = str[i];
            newLength = newLength - 1;
        }
    }
}
My problem is with line number 9. How can he just set str[newLength] to '\0'? In other words, how can he take over the needed amount of memory without allocating it first?
Isn't he overrunning memory?!
Assuming this is actually meant to be in C (public static is not valid C or C++), they can't, as it's written. They're never allocating a new str that is long enough to hold the old string plus the %20 expansion.
I suspect there's an additional part to the question, which is that str is already long enough to hold the expanded %20 data, and that length is the length of the string in str, not counting the zero terminator.
This is valid code, but it's not good code. You are completely correct in your assessment that it writes past the bounds of the initial str[] unless the caller has supplied a larger buffer. This could cause some rather unwanted side effects, depending on what gets overwritten.
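To make the buffer assumption explicit, here is a small Python sketch of the same backwards-filling technique. It is an illustration under stated assumptions, not the book's code: `replace_spaces` is a made-up name, `buf` is a list of single characters that the caller has already padded with enough spare slots, and the function refuses to run if the padding is too small.

```python
def replace_spaces(buf, length):
    """Replace spaces in buf[:length] with '%20' in place; return the new length.

    buf must already have room for the expanded text, i.e.
    len(buf) >= length + 2 * (number of spaces in buf[:length]).
    """
    space_count = sum(1 for i in range(length) if buf[i] == " ")
    new_length = length + 2 * space_count
    if new_length > len(buf):
        raise ValueError("buffer too small for the expanded string")
    write = new_length  # one past the next slot to fill, moving right to left
    for read in range(length - 1, -1, -1):
        if buf[read] == " ":
            buf[write - 3:write] = ["%", "2", "0"]
            write -= 3
        else:
            buf[write - 1] = buf[read]
            write -= 1
    return new_length


if __name__ == "__main__":
    text = "a b c"
    buf = list(text) + [""] * 4  # two spaces need four extra slots
    n = replace_spaces(buf, len(text))
    print("".join(buf[:n]))
```

Filling from the right means no character is overwritten before it has been read, which is why the spare capacity has to sit at the end of the buffer.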

What does following code means int val = str.charAt(i) - 'a';?

The code is taken from career cup book
public static boolean isUniqueChars(String str) {
if (str.length() > 256) {
return false;
}
int checker = 0;
for (int i = 0; i < str.length(); i++) {
int val = str.charAt(i) - 'a';
if ((checker & (1 << val)) > 0) return false;
checker |= (1 << val);
}
return true;
}
Thank you for the explanation, but I am still not sure I understand. Let's look at the following code:
public class ConvertAscii {
public static void main(String args[]){
String str ="Hello How are you";
int i =0;
for(i=0;i<str.length();i++){
System.out.println(str.charAt(i)-'a');
}
}
}
It gives me the following output:
-25
4
11
11
etc
Also, in the example given above:
For example, if str is "fbhsdsbfid" and i is 4, then val is equal to 3.
What does subtracting the ASCII value of the character 'a' from another character result in? Please explain further.
It takes the character at index i in str and subtracts the ASCII value of the character 'a'.
For example, if str is "fbhsdsbfid" and i is 4, then val is equal to 3.
To answer your question for index i = 4: the character at index 4 is 'd', and its ASCII value is 100 (0x64).
The ASCII value of 'a' is 97 (0x61). Therefore, str.charAt(i) - 'a' gives 100 - 97 = 3.
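The same arithmetic can be tried out in Python, where `ord` returns a character's code. This is only a sketch: `letter_index` and `is_unique_lowercase` are names invented here, and the bitmask check is only meaningful for the letters 'a' to 'z', so other characters are rejected.

```python
def letter_index(ch):
    """Offset of ch from 'a': 'a' -> 0, 'b' -> 1, ..., 'z' -> 25."""
    return ord(ch) - ord("a")


def is_unique_lowercase(s):
    """Bitmask check: bit k of checker records that the letter 'a' + k was seen."""
    checker = 0
    for ch in s:
        val = letter_index(ch)
        if not 0 <= val < 26:
            raise ValueError("only 'a'..'z' are supported: %r" % ch)
        if checker & (1 << val):
            return False  # bit already set: this letter occurred before
        checker |= 1 << val
    return True


if __name__ == "__main__":
    for ch in "fbhsd":
        print(ch, ord(ch), letter_index(ch))
```

Here `letter_index("d")` is 100 - 97 = 3, while an uppercase letter such as 'H' gives a negative offset (72 - 97 = -25), which is why the bitmask version only works for lowercase input.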

Sorting a string using another sorting order string [closed]

I saw this in an interview question:
Given a sorting order string, you are asked to sort the input string based on the given sorting order string.
For example, if the sorting order string is dfbcae
and the input string is abcdeeabc,
the output should be dbbccaaee.
Any ideas on how to do this in an efficient way?
The Counting Sort option is pretty cool, and fast when the string to be sorted is long compared to the sort order string.
Create an array where each index corresponds to a letter in the alphabet; this is the count array.
For each letter in the sort target, increment the entry in the count array that corresponds to that letter.
For each letter in the sort order string:
    add that letter to the end of the output string a number of times equal to its count in the count array.
Algorithmic complexity is O(n) where n is the length of the string to be sorted. As the Wikipedia article explains we're able to beat the lower bound on standard comparison based sorting because this isn't a comparison based sort.
Here's some pseudocode.
int[26] countArray; // one counter per letter, all initialised to 0
foreach(char c in sortTarget)
{
countArray[c - 'a']++;
}
int head = 0;
foreach(char c in sortOrder)
{
while(countArray[c - 'a'] > 0)
{
sortTarget[head] = c;
head++;
countArray[c - 'a']--;
}
}
Note: this implementation requires that both strings contain only lowercase characters.
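A runnable Python sketch of the counting approach described above, under the same assumption that both strings contain only lowercase letters. The name `sort_by_order` is chosen for this example, and letters of the target that do not occur in the order string are simply dropped, which the original problem statement does not address.

```python
def sort_by_order(target, order):
    """Counting sort of `target` using the letter order given by `order`."""
    counts = [0] * 26  # counts[k] = occurrences of the letter 'a' + k in target
    for ch in target:
        counts[ord(ch) - ord("a")] += 1
    pieces = []
    for ch in order:
        idx = ord(ch) - ord("a")
        pieces.append(ch * counts[idx])  # emit the letter `count` times
        counts[idx] = 0  # guards against a letter repeated in `order`
    return "".join(pieces)


if __name__ == "__main__":
    print(sort_by_order("abcdeeabc", "dfbcae"))
```

The work is one pass over the target plus one pass over the order string, so no comparisons between characters are needed.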
Here's a nice easy to understand algorithm that has decent algorithmic complexity.
For each character in the sort order string
scan string to be sorted, starting at first non-ordered character (you can keep track of this character with an index or pointer)
when you find an occurrence of the specified character, swap it with the first non-ordered character
increment the index for the first non-ordered character
This is O(n*m), where n is the length of the string to be sorted and m is the length of the sort order string. We're able to beat the lower bound on comparison based sorting because this algorithm doesn't really use comparisons. Like Counting Sort it relies on the fact that you have a predefined finite external ordering set.
Here's some pseudocode:
int head = 0;
foreach(char c in sortOrder)
{
for(int i = head; i < sortTarget.length; i++)
{
if(sortTarget[i] == c)
{
// swap i with head
char temp = sortTarget[head];
sortTarget[head] = sortTarget[i];
sortTarget[i] = temp;
head++;
}
}
}
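The swap-based pseudocode translates to Python fairly directly. In this sketch (the function name is invented for the example) the string is first copied into a list, since Python strings are immutable; characters that do not appear in the order string stay at the end of the result.

```python
def sort_by_order_swaps(target, order):
    """O(n*m) in-place style sort: pull each ordered letter to the front in turn."""
    chars = list(target)  # mutable copy of the string
    head = 0  # index of the first character not yet in its final position
    for c in order:
        for i in range(head, len(chars)):
            if chars[i] == c:
                # swap the match into the first non-ordered slot
                chars[head], chars[i] = chars[i], chars[head]
                head += 1
    return "".join(chars)


if __name__ == "__main__":
    print(sort_by_order_swaps("abcdeeabc", "dfbcae"))
```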
In Python, you can just create an index and use it as the sort key:
order = 'dfbcae'
text = 'abcdeeabc'
# map each character to its position in the sort order
index = {ch: pos for pos, ch in enumerate(order)}
output = sorted(text, key=lambda ch: index[ch])
print('input=', text)
print('output=', ''.join(output))
gives this output:
input= abcdeeabc
output= dbbccaaee
Use binary search to find all the "split points" between different letters, then use the length of each segment directly. This presupposes that equal letters are already grouped together (for example, the input is in alphabetical order). In that case it is asymptotically faster than naive counting sort, but harder to implement:
Use an array of size 26*2 to store the begin and end of each letter;
Inspect the middle element and see if it is different from the element to its left. If so, this is the begin for the middle element and the end for the element before it;
Throw away the segment with identical begin and end (if there are any), and recursively apply this algorithm.
Since there are at most 25 "split"s, you won't have to do the search for more than 25 segments, and for each segment it is O(log n). Since this is constant * O(log n), the search is O(log n).
And of course, just using counting sort will be easier to implement:
Use an array of size 26 to record the number of different letters;
Scan the input string;
Output the string in the given sorting order.
This is O(n), n being the length of the string.
Interview questions are generally about thought process and don't usually care too much about language features, but I couldn't resist posting a VB.Net 4.0 version anyway.
"Efficient" can mean two different things. The first is "what's the fastest way to make a computer execute a task" and the second is "what's the fastest that we can get a task done". They might sound the same but the first can mean micro-optimizations like int vs short, running timers to compare execution times and spending a week tweaking every millisecond out of an algorithm. The second definition is about how much human time would it take to create the code that does the task (hopefully in a reasonable amount of time). If code A runs 20 times faster than code B but code B took 1/20th of the time to write, depending on the granularity of the timer (1ms vs 20ms, 1 week vs 20 weeks), each version could be considered "efficient".
Dim input = "abcdeeabc"
Dim sort = "dfbcae"
Dim SortChars = sort.ToList()
Dim output = New String((From c In input.ToList() Select c Order By SortChars.IndexOf(c)).ToArray())
Trace.WriteLine(output)
Here is my solution to the question
import java.util.*;
import java.io.*;
class SortString
{
public static void main(String arg[])throws IOException
{
BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
// System.out.println("Enter 1st String :");
// System.out.println("Enter 1st String :");
// String s1=br.readLine();
// System.out.println("Enter 2nd String :");
// String s2=br.readLine();
String s1="tracctor";
String s2="car";
String com="";
String uncom="";
for(int i=0;i<s2.length();i++)
{
if(s1.contains(""+s2.charAt(i)))
{
com=com+s2.charAt(i);
}
}
System.out.println("Com :"+com);
for(int i=0;i<s1.length();i++)
if(!com.contains(""+s1.charAt(i)))
uncom=uncom+s1.charAt(i);
System.out.println("Uncom "+uncom);
System.out.println("Combined "+(com+uncom));
HashMap<String,Integer> h1=new HashMap<String,Integer>();
for(int i=0;i<s1.length();i++)
{
String m=""+s1.charAt(i);
if(h1.containsKey(m))
{
int val=(int)h1.get(m);
val=val+1;
h1.put(m,val);
}
else
{
h1.put(m,new Integer(1));
}
}
StringBuilder x=new StringBuilder();
for(int i=0;i<com.length();i++)
{
if(h1.containsKey(""+com.charAt(i)))
{
int count=(int)h1.get(""+com.charAt(i));
while(count!=0)
{x.append(""+com.charAt(i));count--;}
}
}
x.append(uncom);
System.out.println("Sort "+x);
}
}
Here is my version, which is O(n) in time. Instead of unordered_map, I could have just used a char array of constant size, i.e. char char_count[256] (and done ++char_count[ch - 'a']), assuming the input strings contain only lowercase ASCII characters.
string SortOrder(const string& input, const string& sort_order) {
unordered_map<char, int> char_count;
for (auto ch : input) {
++char_count[ch];
}
string res = "";
for (auto ch : sort_order) {
unordered_map<char, int>::iterator it = char_count.find(ch);
if (it != char_count.end()) {
string s(it->second, it->first);
res += s;
}
}
return res;
}
private static String sort(String target, String reference) {
final Map<Character, Integer> referencesMap = new HashMap<Character, Integer>();
for (int i = 0; i < reference.length(); i++) {
char key = reference.charAt(i);
if (!referencesMap.containsKey(key)) {
referencesMap.put(key, i);
}
}
List<Character> chars = new ArrayList<Character>(target.length());
for (int i = 0; i < target.length(); i++) {
chars.add(target.charAt(i));
}
Collections.sort(chars, new Comparator<Character>() {
@Override
public int compare(Character o1, Character o2) {
return referencesMap.get(o1).compareTo(referencesMap.get(o2));
}
});
StringBuilder sb = new StringBuilder();
for (Character c : chars) {
sb.append(c);
}
return sb.toString();
}
In C# I would just use the IComparer Interface and leave it to Array.Sort
void Main()
{
// define the IComparer implementation that encodes the sort order
var sortOrder = new SortOrder("dfbcae");
var testOrder = "abcdeeabc".ToCharArray();
// sort the array using Array.Sort
Array.Sort(testOrder, sortOrder);
// char[].ToString() would print the type name, so build a string instead
Console.WriteLine(new string(testOrder));
}
public class SortOrder : IComparer
{
string sortOrder;
public SortOrder(string sortOrder)
{
this.sortOrder = sortOrder;
}
public int Compare(object obj1, object obj2)
{
var obj1Index = sortOrder.IndexOf((char)obj1);
var obj2Index = sortOrder.IndexOf((char)obj2);
if(obj1Index == -1 || obj2Index == -1)
{
throw new Exception("character not found");
}
if(obj1Index > obj2Index)
{
return 1;
}
else if (obj1Index == obj2Index)
{
return 0;
}
else
{
return -1;
}
}
}

How do I convert a string version of a number in an arbitrary base to an integer?

How do I convert a string to an integer?
For example:
"5328764" to an int, base 10
"AB3F3A" to an int, base 16
Any code will be helpful.
Assuming an arbitrary base (not just 16, 10, 8, or 2):
In C (C++), use strtol
return strtol("AB3F3A", NULL, 16);
In Javascript, use parseInt.
return parseInt("AB3F3A", 16);
In Python, use int(string, base).
return int("AB3F3A", 16)
In Java, use Integer.parseInt (thanks Michael.)
return Integer.parseInt("AB3F3A", 16);
In PHP, use base_convert.
return intval(base_convert('AB3F3A', 16, 10));
In Ruby, use to_i
"AB3F3A".to_i(16)
In C#, write one yourself.
In C#, I think it is Convert.ToInt64(value, base),
and the base must be 2, 8, 10, or 16.
9999 is really 9000 + 900 + 90 + 9
So, start at the right hand side of the string, and pick off the numbers one at a time.
Each character number has an ASCII code, which can be translated to the number, and multiplied by the appropriate amount.
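The right-to-left procedure described above might look like this in Python. It is a hand-rolled sketch for illustration (Python's built-in `int(text, base)` already does this); the name `parse_int` and the digit alphabet `DIGITS` are assumptions of this example, and negative numbers are not handled.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"  # digit values 0..35


def parse_int(text, base):
    """Convert `text`, written in the given base (2..36), to an integer."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if not text:
        raise ValueError("empty string")
    total = 0
    place = 1  # value of the current position: 1, base, base^2, ...
    for ch in reversed(text.lower()):  # start at the right-hand side
        value = DIGITS.find(ch)
        if value < 0 or value >= base:
            raise ValueError("invalid digit %r for base %d" % (ch, base))
        total += value * place
        place *= base
    return total


if __name__ == "__main__":
    print(parse_int("5328764", 10))
    print(parse_int("AB3F3A", 16))
```

For "9999" in base 10 the loop adds 9, 90, 900 and 9000 in turn, which is exactly the decomposition given above.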
Two functions in Java, one for each direction. The "code" parameter represents the numeral system: "01" for base 2, "0123456789" for base 10, "0123456789abcdef" for hexadecimal, and so on.
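public String convert(long num, String code) {
    final int base = code.length();
    // zero would otherwise produce an empty string; negative numbers are not handled
    if (num == 0) {
        return String.valueOf(code.charAt(0));
    }
    String text = "";
    while (num > 0) {
        text = code.charAt((int) (num % base)) + text;
        num /= base;
    }
    return text;
}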
public long toLong(String text, String code) {
final long base = code.length();
long num = 0;
long pow = 1;
int len = text.length();
for(int i = 0; i < len; i++) {
num += code.indexOf(text.charAt(len - i - 1)) * pow;
pow *= base;
}
return num;
}
println(convert(9223372036854775807L,"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"));
println(convert(9223372036854775807L,"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@=-+*/^%$#&()!?.,:;[]"));
println(toLong("Ns8T$87=uh","0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@=-+*/^%$#&()!?.,:;[]"));
in your example:
toLong("5328764", "0123456789") = 5328764
toLong("AB3F3A", "0123456789ABCDEF") = 11222842
