Algorithm to delete duplicate characters from a String

Suppose we have the string "abbashbhqa". We have to remove the duplicate characters so that the output is "abshq". One possible solution is to compare each character with every other character in the string and remove the repeats, but this takes O(n^2) time. Is there a more optimised approach?

O(n):
Define an array L[26] of booleans. Set all to FALSE.
Construct a new empty string.
Walk over the string and for each letter x check if L[x] is FALSE. If so, append x to the new string and set L[x] to TRUE.
Copy the new string to the old one.
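The steps above can be sketched in Python roughly as follows. This is only an illustration: the function name remove_duplicates is made up here, and the code assumes the input contains nothing but lowercase letters a to z, as in the example.

```python
def remove_duplicates(s):
    """Keep the first occurrence of each lowercase letter a-z (assumed alphabet)."""
    seen = [False] * 26          # the boolean array L from the steps above
    out = []                     # pieces of the new string
    for ch in s:
        idx = ord(ch) - ord('a')
        if not seen[idx]:
            seen[idx] = True     # first time we meet this letter: keep it
            out.append(ch)
    return ''.join(out)

print(remove_duplicates("abbashbhqa"))  # abshq
```

Collecting the characters in a list and joining once keeps the whole pass linear.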

As you iterate over the string, build a set (or hash set) of the characters seen so far. If the alphabet is limited (English letters, as in your example), you can simply create a boolean array of 256 entries and use the ASCII code of each character as the index. Initialise every entry to false. On each iteration, check array[c] for the current character c. If it is false, the symbol is not a duplicate: set array[c] = true, keep the character in the string and go on. If it is true, the symbol is a duplicate and should be removed.
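A possible in-place variant of this idea, sketched in Python on a list of characters (Python strings are immutable, so a list stands in for the buffer). The name dedupe_in_place is hypothetical, and the 256-entry table assumes every character code is below 256.

```python
def dedupe_in_place(chars):
    """Compact a list of single-byte characters in place; return the new length."""
    seen = [False] * 256         # one flag per character code (assumed range 0..255)
    write = 0                    # next free slot for a character we keep
    for ch in chars:
        code = ord(ch)
        if not seen[code]:
            seen[code] = True
            chars[write] = ch    # write never overtakes the read position
            write += 1
    del chars[write:]            # drop the leftover tail
    return write

buf = list("abbashbhqa")
dedupe_in_place(buf)
print(''.join(buf))  # abshq
```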

This is a possible implementation of the approach above:
public class String_Duplicate_Removal
{
    // Assumes the input contains only lowercase letters 'a'..'z'.
    public static String duplicate_removal(String s)
    {
        if (s.length() < 2)
            return s;
        boolean[] arr = new boolean[26];
        // Appending to a StringBuilder keeps the pass O(n);
        // deleting characters from the middle of the string would not.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++)
        {
            char c = s.charAt(i);
            if (!arr[c - 'a'])
            {
                arr[c - 'a'] = true;   // first occurrence: keep it
                sb.append(c);
            }
            // otherwise c is a duplicate and is skipped
        }
        return sb.toString();
    }
    public static void main(String[] args)
    {
        String s = "abbashbhqa";
        System.out.println(duplicate_removal(s));
    }
}

I am solving this in Python (Python 2 syntax below); it works in O(n) time and O(n) space.
I am using set(), as a set does not allow duplicates.
In this case the order of the elements may change.
If you want the order to remain the same, you can use OrderedDict(), which also works in O(n) time.
def remove_duplicates(s, ans_set):
    for i in s:            # O(n)
        ans_set.add(i)     # O(1)
    ans = ''
    for char in ans_set:
        ans += char
    print ans
s = raw_input()
ans_set = set()
remove_duplicates(s, ans_set)
from collections import OrderedDict
def remove_duplicates_maintain_order(a):
    ans_dict = OrderedDict()
    for i in a:                                # O(n)
        ans_dict[i] = ans_dict.get(i, 0) + 1   # O(1)
    ans = ''
    for char in ans_dict:
        ans += char
    print ans
s = raw_input()
remove_duplicates_maintain_order(s)

Related

Find the even number using given number

I have to find the greatest even number possible using the digits of given number
Input : 7876541
Desired output : 8776514
Can anyone help me with the logic?

How about this?
convert it into a string
sort the digits in reverse order
join them and convert the result back to a number
def n = 7876541
def newN = (n.toString().split('').findAll{it}.sort().reverse().join()) as Integer
println newN
EDIT: Based on the OP's comments, updating the answer.
Here is what you can do:
- find the permutations of the number
- keep only the even numbers
- take the maximum of those.
There is already a thread on finding the permutations, so I am re-using it with small changes. Credits to JavaHopper.
Of course, it could be simplified and made more idiomatic Groovy.
class Permutations {
static def list = []
public static void printPermutation(char[] a, int startIndex, int endIndex) {
if (startIndex == endIndex)
list << ((new String(a)) as Integer)
else {
for (int x = startIndex; x < endIndex; x++) {
swap(a, startIndex, x)
printPermutation(a, startIndex + 1, endIndex)
swap(a, startIndex, x)
}
}
}
private static void swap(char[] a, int i, int x) {
char t = a[i]
a[i] = a[x]
a[x] = t
}
}
def n = 7876541
def cArray = n.toString().toCharArray()
Permutations.printPermutation(cArray, 0, cArray.size())
println Permutations.list.findAll { it.mod(2) == 0}?.max()

There is no need to create permutations.
Try this solution:
convert the source number into a string.
split the string into an array,
sort the numbers, for the time being, in ascending order,
find the index of the first even digit,
remove this number from the array (storing it in a variable),
reverse the array and add the removed number,
join the digits from the array and convert them into integer.
So the whole script looks like below:
def inp = 7876541
def chars1 = inp.toString().split('')
// findAll{it} drops an empty starting element from the split result
def chars2 = chars1.findAll{it}.sort()
// Find index of the 1st even digit
def n = chars2.findIndexOf{it.toInteger() % 2 == 0}
def dig = chars2[n] // Store this digit
chars2.remove(n) // Remove from the array
def chars3 = chars2.reverse() // Descending order
chars3.add(dig) // Add the temporarily deleted number
def out = (chars3.join()) as Integer // result
println out
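For comparison, the same steps can be sketched in Python. The function name greatest_even is made up for this illustration, and returning None when the number has no even digit is an assumption; the Groovy script above does not handle that case.

```python
def greatest_even(n):
    """Largest even number formed from the digits of n, or None if n has no even digit."""
    digits = sorted(str(n))                        # digits as strings, ascending
    evens = [d for d in digits if int(d) % 2 == 0]
    if not evens:
        return None                                # assumption: None signals "impossible"
    smallest_even = evens[0]
    digits.remove(smallest_even)                   # removes a single occurrence
    digits.reverse()                               # remaining digits, descending
    digits.append(smallest_even)                   # the even digit goes last
    return int(''.join(digits))

print(greatest_even(7876541))  # 8776514
```

Putting the smallest even digit last costs the least, since every other position is filled with the largest digits still available.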

Grails convert String to Map with comma in string values

I want to convert a string to a Map in Grails. I already have a function for string-to-map conversion. Here's the code:
static def StringToMap(String reportValues){
Map result=[:]
result=reportValues.replace('[','').replace(']','').replace(' ','').split(',').inject([:]){map,token ->
List tokenizeStr=token.split(':');
tokenizeStr.size()>1?tokenizeStr?.with {map[it[0]?.toString()?.trim()]=it[1]?.toString()?.trim()}:tokenizeStr?.with {map[it[0]?.toString()?.trim()]=''}
map
}
return result
}
But I have a String with commas in the values, so the above function doesn't work for me. Here's my String:
[program_type:, subsidiary_code:, groupName:, termination_date:, effective_date:, subsidiary_name:ABC, INC]
My function returns only ABC, not ABC, INC. I googled it but couldn't find any concrete help.

Generally speaking, if I have to convert a stringified Map to a Map object, I try to make use of Eval.me. Your example String isn't quite in the right form for that, though; if you had the following, it would "just work":
// Note I have added '' around the values.
String a = "[program_type:'', subsidiary_code:'', groupName:'', termination_date:'', effective_date:'', subsidiary_name:'ABC']"
Map b = Eval.me(a)
// returns b = [program_type:, subsidiary_code:, groupName:, termination_date:, effective_date:, subsidiary_name:ABC]
If you have control over how the String is created and can produce it following this kind of pattern, I suspect that would be the easiest solution.

In case it is not possible to change the input parameter, this might be a not-so-clean and not-so-short option. It relies on the colons rather than the commas to separate the entries.
String reportValues = "[program_type:, subsidiary_code:, groupName:, termination_date:, effective_date:, subsidiary_name:ABC, INC]"
reportValues = reportValues[1..-2]
def m = reportValues.split(":")
def map = [:]
def length = m.size()
m.eachWithIndex { v, i ->
if(i != 0) {
List l = m[i].split(",")
if (i == length-1) {
map.put(m[i-1].split(",")[-1], l.join(","))
} else {
map.put(m[i-1].split(",")[-1], l[0..-2].join(","))
}
}
}
map.each {key, value -> println "key: " + key + " value: " + value}
BTW: Only use eval on trusted input, AFAIK it executes everything.
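The colon-based idea can also be sketched in Python with a regular expression. This is only an illustration under stated assumptions: string_to_map is a made-up name, keys are assumed to consist of word characters only, and a value is assumed never to contain text of the form ", word:".

```python
import re

def string_to_map(text):
    """Parse '[k1:v1, k2:v2 with, commas]' by locating the 'key:' markers."""
    body = text.strip()[1:-1]                      # drop the surrounding [ ]
    # A key is a run of word characters followed by ':', at the start or after a comma.
    keys = list(re.finditer(r'(?:^|,\s*)(\w+):', body))
    result = {}
    for idx, m in enumerate(keys):
        # The value runs up to the start of the next key marker (or the end).
        end = keys[idx + 1].start() if idx + 1 < len(keys) else len(body)
        result[m.group(1)] = body[m.end():end].strip()
    return result

s = "[program_type:, subsidiary_code:, groupName:, termination_date:, effective_date:, subsidiary_name:ABC, INC]"
print(string_to_map(s)["subsidiary_name"])  # ABC, INC
```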

You could try messing around with this bit of code:
String tempString = "[program_type:11, 'aa':'bb', subsidiary_code:, groupName:, termination_date:, effective_date:, subsidiary_name:ABC, INC]"
List StringasList = tempString.tokenize('[],')
def finalMap=[:]
StringasList?.each { e->
def f = e?.split(':')
finalMap."${f[0]}"= f.size()>1 ? f[1] : null
}
println """-- tempString: ${tempString.getClass()} StringasList: ${StringasList.getClass()}
finalMap: ${finalMap.getClass()} \n Results\n finalMap ${finalMap}
"""
Above produces:
-- tempString: class java.lang.String StringasList: class java.util.ArrayList
finalMap: class java.util.LinkedHashMap
Results
finalMap [program_type:11, 'aa':'bb', subsidiary_code:null, groupName:null, termination_date:null, effective_date:null, subsidiary_name:ABC, INC:null]
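It tokenizes the String, then iterates through the resulting ArrayList, splitting each element on : and putting the result into a map. It also has to check that the size is greater than 1, otherwise it would break on f[1]. Note that in the output above INC still ends up as a separate key rather than as part of the subsidiary_name value.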

D: how to remove last char in string?

I need to remove the last char of a string; in my case it's a comma (","):
foreach(line; fcontent.splitLines)
{
string row = line.split.map!(a=>format("'%s', ", a)).join;
writeln(row.chop.chop);
}
I have found only one way: calling chop two times. The first call removes \r\n and the second removes the last char.
Is there a better way?

import std.array;
if (!row.empty)
row.popBack();

As usually happens with string processing, it depends on how much you care about Unicode.
If you only work with ASCII, it is very simple:
import std.encoding;
// no "nice" ASCII literals, D really encourages Unicode
auto str1 = cast(AsciiString) "abcde";
str1 = str1[0 .. $-1]; // get slice of everything but last byte
auto str2 = cast(AsciiString) "abcde\n\r";
str2 = str2[0 .. $-3]; // same principle
If "last char" actually means a Unicode code point (http://unicode.org/glossary/#code_point), it gets a bit more complicated. The easy way is to just rely on D's automatic decoding and algorithms:
import std.range, std.stdio;
auto range = "кириллица".retro.drop(1).retro();
writeln(range);
Here retro (http://dlang.org/phobos/std_range.html#.retro) is a lazy reverse iteration function. It takes any range (a Unicode string is a valid range) and returns a wrapper that is capable of iterating it backwards.
drop (http://dlang.org/phobos/std_range.html#.drop), called with 1, simply pops a single range element and discards it. Calling retro again reverses the iteration order back to normal, but now with the last element dropped.
The reason this differs from the ASCII version is the nature of Unicode (specifically UTF-8, which D defaults to): it does not allow random access to an arbitrary code point. You actually need to decode the code points one by one to get to any desired index. Fortunately, D takes care of all the decoding for you, hiding it behind a convenient range interface.
For those who want even more Unicode correctness, it should be possible to operate on graphemes (http://unicode.org/glossary/#grapheme):
import std.range, std.uni, std.stdio;
auto range = "abcde".byGrapheme.retro.drop(1).retro();
writeln(range);
Sadly, it looks like this specific pattern is not currently supported because of a bug in Phobos. I have created an issue about it: https://issues.dlang.org/show_bug.cgi?id=14394

NOTE: Updated my answer to be a bit cleaner and removed the lambda function in 'map!' as it was a little ugly.
import std.algorithm, std.stdio;
import std.string;
void main(){
string fcontent = "I am a test\nFile\nwith some,\nCommas here and\nthere,\n";
auto data = fcontent
.splitLines
.map!(a => a.replaceLast(","))
.join("\n");
writefln("%s", data);
}
auto replaceLast(string line, string toReplace){
auto o = line.lastIndexOf(toReplace);
return o >= 0 ? line[0..o] : line;
}

module main;
import std.stdio : writeln;
import std.string : lineSplitter, join;
import std.algorithm : map, splitter, each;
enum fcontent = "some text\r\nnext line\r\n";
void main()
{
fcontent.lineSplitter.map!(a=>a.splitter(' ')
.map!(b=>"'" ~ b ~ "'")
.join(", "))
.each!writeln;
}

Take a look; I use this extension method to replace any last character or substring, for example:
string testStr = "Happy holiday!";
Console.Write(testStr.ReplaceVeryLast("holiday!", "Easter!"));
public static class StringExtensions
{
public static string ReplaceVeryLast(this string sStr, string sSearch, string sReplace = "")
{
int pos = 0;
sStr = sStr.Trim();
do
{
pos = sStr.LastIndexOf(sSearch, StringComparison.CurrentCultureIgnoreCase);
if (pos >= 0 && pos + sSearch.Length == sStr.Length)
sStr = sStr.Substring(0, pos) + sReplace;
} while (pos == (sStr.Length - sSearch.Length + 1));
return sStr;
}
}

Remove all the occurences of substrings from a string

Given a string S and a set of n substrings. Remove every instance of those n substrings from S so that S is of the minimum length and output this minimum length.
Example 1
S = ccdaabcdbb
n = 2
substrings = ab, cd
Output
2
Explanation:
ccdaabcdbb -> ccdacdbb -> cabb -> cb (length=2)
Example 2
S = abcd
n = 2
substrings = ab,bcd
Output
1
How do I solve this problem ?

A simple brute-force search algorithm is:
For each substring, try all possible ways to remove it from the string, then recurse.
In pseudocode:
def min_final_length (input, substrings):
    best = len(input)
    for substr in substrings:
        beg = 0
        // find all occurrences of substr in input and recurse
        while (found = find_substring(input, substr, from=beg)):
            input_without_substr = input[0:found]+input[found+len(substr):len(input)]
            best = min(best, min_final_length(input_without_substr,substrings))
            beg = found+1
    return best
Let the complexity be F(S,n,l), where S is the length of the input string, n is the cardinality of the set of substrings and l is the "characteristic length" of the substrings. Then
F(S,n,l) ~ n * ( S * l + F(S-l,n,l) )
Unrolled, this recurrence grows like n^(S/l), so the plain recursion is exponential in the worst case rather than O(S^2*n*l). Caching the result for each intermediate string avoids recomputing repeated states.
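A runnable Python sketch of this brute-force search, with memoization added on top (the caching is an addition for illustration, not part of the pseudocode). It assumes every substring is non-empty.

```python
from functools import lru_cache

def min_final_length(s, substrings):
    """Minimum length reachable by repeatedly deleting any of the given substrings."""
    subs = tuple(substrings)           # assumption: no empty strings in here

    @lru_cache(maxsize=None)           # cache: each intermediate string is solved once
    def solve(cur):
        best = len(cur)
        for sub in subs:
            pos = cur.find(sub)
            while pos != -1:           # try every occurrence of sub
                best = min(best, solve(cur[:pos] + cur[pos + len(sub):]))
                pos = cur.find(sub, pos + 1)
        return best

    return solve(s)

print(min_final_length("ccdaabcdbb", ["ab", "cd"]))  # 2
print(min_final_length("abcd", ["ab", "bcd"]))       # 1
```

The number of distinct intermediate strings can still be large, so this only tames the repeated work; it does not make the search polynomial in general.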

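The following solution has a complexity of O(m * n), where m = len(S) and n is the number of substrings. Note that it is greedy (it removes the first match it finds), so it does not always reach the minimum length: for Example 2 it removes ab first and ends with cd.
def foo(S, sub):
    i = 0
    while i < len(S):
        for e in sub:
            if S[i:].startswith(e):
                S = S[:i] + S[i+len(e):]
                i = max(i - 1, 0)  # step back one position, but never below 0
                break
        else:
            i += 1
    return S, i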

If you are after raw performance and your string is very large, you can do better than brute force. Use a suffix tree (e.g. one built with Ukkonen's algorithm) to store your string. Then find each substring (which is done in O(m) time, m being the substring length), and store the offsets and lengths of the matches in an array.
Then use the offset and length info to actually remove the substrings by filling these areas with \0 (in C) or another placeholder character. By counting all non-null characters you will get the minimal length of the string.
This will also handle overlapping substrings, e.g. say your string is "abcd" and you have two substrings "ab" and "abcd".

I solved it using trie + DP.
First insert your substrings into a trie. Then define the DP state to be a string: walk through that string and consider each i (for i = 0 .. s.length()) as the start of some substring. Let j = i and increment j as long as the characters still follow a path in the trie (which can land you on one substring, or more if some substrings share a common prefix, for example "abce" and "abdd"). Whenever you reach the end of some substring, solve the new sub-problem and take the minimum over all substring reductions.
Here is my code for it. Don't worry about its length: just read the solve function and ignore path, which I included only to print the resulting string.
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
struct node{
node* c[26];
bool str_end;
node(){
for(int i= 0;i<26;i++){
c[i]=NULL;
}
str_end= false;
}
};
class Trie{
public:
node* root;
Trie(){
root = new node();
}
~Trie(){
delete root;
}
};
class Solution{
public:
typedef pair<int,int>ii;
string get_str(string& s,map<string,ii>&path){
if(!path.count(s)){
return s;
}
int i= path[s].first;
int j= path[s].second;
string new_str =(s.substr(0,i)+s.substr(j+1));
return get_str(new_str,path);
}
int solve(string& s,Trie* &t, map<string,int>&dp,map<string,ii>&path){
if(dp.count(s)){
return dp[s];
}
int mn= (int)s.length();
for(int i =0;i<s.length();i++){
string left = s.substr(0,i);
node* cur = t->root->c[s[i]-97];
int j=i;
while(j<s.length()&&cur!=NULL){
if(cur->str_end){
string new_str =left+s.substr(j+1);
int ret= solve(new_str,t,dp,path);
if(ret<mn){
mn = ret; // record the improved minimum, otherwise mn never changes
path[s]={i,j};
}
}
++j;
// only follow the trie while j is still inside the string
cur = (j<s.length()) ? cur->c[s[j]-97] : NULL;
}
}
return dp[s]=mn;
}
string removeSubstrings(vector<string>& substrs, string s){
map<string,ii>path;
map<string,int>dp;
Trie*t = new Trie();
for(int i =0;i<substrs.size();i++){
node* cur = t->root;
for(int j=0;j<substrs[i].length();j++){
if(cur->c[substrs[i][j]-97]==NULL){
cur->c[substrs[i][j]-97]= new node();
}
cur = cur->c[substrs[i][j]-97];
if(j==substrs[i].length()-1){
cur->str_end= true;
}
}
}
solve(s,t,dp,path);
return get_str(s, path);
}
};
int main(){
vector<string>substrs;
substrs.push_back("ab");
substrs.push_back("cd");
Solution s;
cout << s.removeSubstrings(substrs,"ccdaabcdbb")<<endl;
return 0;
}

Remove single character occurrence from String

I want an algorithm to remove all occurrences of a given character from a string in O(n) complexity or lower? (It should be INPLACE editing original string only)
eg.
String="aadecabaaab";
removeCharacter='a'
Output:"decbb"

Enjoy algo:
j = 0
for i in 0 .. length(a) - 1:
    if a[i] != symbol:
        a[j] = a[i]
        j = j + 1
finalize:
length(a) = j
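The pseudocode above might look like this in Python, using a list of characters as the mutable buffer (Python strings cannot be edited in place). The name remove_char_in_place is hypothetical.

```python
def remove_char_in_place(chars, symbol):
    """Remove every occurrence of symbol from the list chars; return the new length."""
    write = 0                          # j in the pseudocode
    for read in range(len(chars)):     # i in the pseudocode
        if chars[read] != symbol:
            chars[write] = chars[read]
            write += 1
    del chars[write:]                  # the "finalize" step: shrink to length j
    return write

buf = list("aadecabaaab")
remove_char_in_place(buf, 'a')
print(''.join(buf))  # decbb
```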

You can't do it in place with a String because it's immutable, but here's an O(n) algorithm to do it in place with a char[]:
char[] chars = "aadecabaaab".toCharArray();
char removeCharacter = 'a';
int next = 0;
for (int cur = 0; cur < chars.length; ++cur) {
if (chars[cur] != removeCharacter) {
chars[next++] = chars[cur];
}
}
// chars[0] through chars[4] will have {d, e, c, b, b} and next will be 5
System.out.println(new String(chars, 0, next));

Strictly speaking, you can't remove anything from a String because the String class is immutable. But you can construct another String that has all characters from the original String except for the "character to remove".
Create a StringBuilder. Loop through all characters in the original String. If the current character is not the character to remove, then append it to the StringBuilder. After the loop ends, convert the StringBuilder to a String.

Yep. In linear time: iterate over the String and check each character with .charAt(). If it is removeCharacter, don't copy it to the new String; otherwise, copy it. That's it.

This probably shouldn't have the "java" tag since in Java, a String is immutable and you can't edit it in place. For a more general case, if you have an array of characters (in any programming language) and you want to modify the array "in place" without creating another array, it's easy enough to do with two indexes. One goes through every character in the array, and the other starts at the beginning and is incremented only when you see a character that isn't removeCharacter. Since I assume this is a homework assignment, I'll leave it at that and let you figure out the details.

import java.util.*;
import java.io.*;
public class removeA{
public static void main(String[] args){
String text = "This is a test string! Wow abcdefg.";
System.out.println(text.replaceAll("a",""));
}
}

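Use a lookup table to hold the characters you want to remove. With std::map (an ordered, tree-based map rather than a hash table) each lookup is O(log k), k being the number of characters to remove.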
std::string toRemove = "ad";
std::map<char, int> table;
size_t maxR = toRemove.size();
for (size_t n = 0; n < maxR; ++n)
{
table[toRemove[n]] = 0;
}
Then parse the whole string and remove when you get a hit (thestring is an array):
size_t counter = 0;
while(thestring[counter] != 0)
{
std::map<char,int>::iterator iter = table.find(thestring[counter]);
if (iter == table.end()) // we found a valid character!
{
++counter;
}
else
{
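// move the rest of the string (including the terminating 0) left by one;
// memmove (from <cstring>) is used because source and destination overlap
memmove(&thestring[counter], &thestring[counter+1], strlen(&thestring[counter+1]) + 1);
// dont increment counter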
}
}
EDIT: I hope this is not a technical test or something like that. =S
