HashMap value is not incremented when a duplicate key is found

I am new to Java. Please help me get the required output from the code below.
The map value should be incremented whenever a month value is repeated.
The issue occurs because the two sublists are processed on different threads. Please advise whether this code needs to be modified so that the map values are counted correctly.
int size = inputList.size();
int listSize = size/numberOfThreads;
List<String> tmplist1 = inputList.subList(0, listSize);
int count = listSize;
for (int i=0;i<numberOfThreads;i++)
{
if(listSize <= size) {
t1 = new TotalOrderThread();
t1.setInput(tmplist1);
Thread thread = new Thread(t1);
thread.start();
thread.join();
if(listSize < size)
tmplist1 = inputList.subList(listSize, listSize+count);
listSize=listSize+count;
}
}
Input:
123,03/04/2005
234,04/05/2005
567,03/04/2005
789,01/01/2005
Output (month 4 appears twice, but its value is not counted as 2). Please help me find the mistake. Also, is there any way to print the month in "MMM" format while iterating over the map?
4 1
5 1
1 1
4 1
Map<Integer,Integer> orderMap = new HashMap<>();
for(int i=0;i<this.input.size();i++)
{
String details = input.get(i);
String[] detailsarr = details.split(",");
DateTimeFormatter f = DateTimeFormatter.ofPattern( "dd/MM/uuuu" ) ;
LocalDate id = LocalDate.parse(detailsarr[1], f);
int month = id.getMonthValue();
if(orderMap.containsKey(month))
{
int count = orderMap.get(month);
orderMap.put(month, count+1);
}
else
{
orderMap.put(month, 1);
}
}
for(Map.Entry<Integer,Integer> entry : orderMap.entrySet())
{
int month3 = entry.getKey();
int value = entry.getValue();
System.out.println(month3+ " " +value);
}
}

I tried your code as shown below, with the whole input in a single loop, and this version works.
Map<Integer,Integer> orderMap = new HashMap<>();
String[] input = {"123,03/04/2005", "234,04/05/2005", "567,03/04/2005", "789,01/01/2005"};
for(int i=0;i<input.length;i++)
{
String details = input[i];
String[] detailsarr = details.split(",");
DateTimeFormatter f = DateTimeFormatter.ofPattern( "dd/MM/uuuu" ) ;
LocalDate id = LocalDate.parse(detailsarr[1], f);
int month = id.getMonthValue();
if(orderMap.containsKey(month))
{
int count = orderMap.get(month);
orderMap.put(month, count+1);
}
else
{
orderMap.put(month, 1);
}
}
for(Map.Entry<Integer,Integer> entry : orderMap.entrySet())
{
int month3 = entry.getKey();
int value = entry.getValue();
System.out.println(month3+ " " +value);
}
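The loop above sees all four lines at once, which is why month 4 reaches 2. In the threaded version each TotalOrderThread appears to build and print its own map for its own sublist, so the partial counts are never combined. Below is a minimal sketch of the idea, written in Python only to keep it language-neutral: count each chunk separately, then merge the partial counts into one total. The names count_months and count_in_chunks are hypothetical, and the date format is assumed to be dd/MM/yyyy as in the sample input. The month abbreviation stands in for the "MMM" output asked about.

```python
from collections import Counter
from datetime import datetime
import calendar


def count_months(lines):
    """Count orders per month for one chunk of 'id,dd/MM/yyyy' lines."""
    counts = Counter()
    for line in lines:
        _, date_text = line.split(",")
        counts[datetime.strptime(date_text, "%d/%m/%Y").month] += 1
    return counts


def count_in_chunks(lines, chunks):
    """Split the input into chunks, count each one, and merge the results."""
    size = max(1, len(lines) // chunks)
    total = Counter()
    for start in range(0, len(lines), size):
        # Each chunk yields its own partial map; update() adds the counts together.
        total.update(count_months(lines[start:start + size]))
    return total


orders = ["123,03/04/2005", "234,04/05/2005", "567,03/04/2005", "789,01/01/2005"]
totals = count_in_chunks(orders, 2)
for month, count in sorted(totals.items()):
    # calendar.month_abbr gives the short month name in the current locale.
    print(calendar.month_abbr[month], count)
```

In Java the same merge can be done by sharing one map between the workers and updating it with Map.merge(month, 1, Integer::sum), using a ConcurrentHashMap if the threads really run in parallel.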

Related

Find maximum deviation of all substrings

Given a string, find the maximum deviation among all substrings. The maximum deviation is defined as the difference between the maximum frequency of a character and the minimum frequency of a character.
For example, in abcaba, a has a frequency of 3, b has a frequency of 2, and c has a frequency of 1. So a has the maximum frequency, 3, whereas c has the minimum frequency, 1. Therefore the deviation of this string is 3 - 1 = 2. We also need to find the deviations of all the other substrings of abcaba; the maximum among them is the answer.
I couldn't think of a better way rather than the obvious brute force approach. Thanks in advance!
Enumerating all substrings takes O(n^2). See this post for more details. You can optimize it slightly by stopping once the substring length becomes smaller than the current maximum deviation.
maxDeviation = 0;
n = strlen(str);
for i = 0 to n
{
if(n-i < maxDeviation) break; //this is new stop point to improve
sub1 = substring(str,i,n);
sub2 = substring(str,0,n-i); // you can use if(i!=0) to avoid duplication of first sentence
a = findMaxDeviation(sub1); // This is O(n)
b = findMaxDeviation(sub2); // This is O(n)
maxDeviation = max(maxDeviation, a, b); // keep the best value seen so far
}
print maxDeviation
Pay attention to the line if(n-i < maxDeviation) break; it is valid because a string shorter than maxDeviation cannot have a deviation larger than maxDeviation.
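For reference, the obvious brute force over every substring can be sketched as follows. This is only a baseline under the definition given in the question (maximum frequency minus minimum frequency among the characters present in the substring); the function name max_deviation_bruteforce is hypothetical. It extends each substring one character at a time so the frequency table does not have to be rebuilt.

```python
def max_deviation_bruteforce(s):
    """Return the largest (max frequency - min frequency) over all substrings of s."""
    best = 0
    for start in range(len(s)):
        counts = {}  # frequencies of the characters in s[start:end + 1]
        for end in range(start, len(s)):
            counts[s[end]] = counts.get(s[end], 0) + 1
            # Only characters that occur in the substring take part in the minimum.
            best = max(best, max(counts.values()) - min(counts.values()))
    return best


print(max_deviation_bruteforce("abcaba"))
```

The two nested loops visit O(n^2) substrings and each step scans at most one entry per distinct character, so this is practical only for short inputs, but it is handy for checking faster solutions.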
public static int getDev(Map<String, Integer> devEntries){
List<Integer> entries = devEntries.entrySet().stream()
.map(x->x.getValue())
.collect(Collectors.toList());
Comparator<Integer> collect = Comparator.naturalOrder();
Collections.sort(entries,collect.reversed());
return entries.get(0) - entries.get( entries.size()-1);
}
public static int getMaxFreqDeviation(String s, Set<Integer> deviations ) {
for (int x=0;x<s.length();x++) {
for (int g=x;g<s.length()+1;g++){
String su =s.substring(x,g);
Map<String, Integer> map = Arrays.asList(su.split(""))
.stream()
.collect(Collectors.groupingBy(v->v,Collectors.summingInt(v->1)));
if (map.entrySet().size()==1){
deviations.add(abs(0));
}else {
int devcount = getDev(map);
deviations.add(abs(devcount));
}
}
}
return deviations.stream().collect(Collectors.toList()).get(deviations.size()-1);
}
public static void main(String[] args){
String se = "abcaba";
Set<Integer> deviations = new TreeSet<>();
int ans = getMaxFreqDeviation(se,deviations);
System.out.println(ans);
}
}
I faced a similar question in a test and used C#. I failed during the challenge, but picked it up again the next day and came up with something like the code below.
var holdDict = new Dictionary<char, int>();
var sArray = s.ToCharArray();
var currentCharCount = 1;
//Add the first element
holdDict.Add(sArray[0],1);
for (int i = 1; i < s.Length-1; i++)
{
if (sArray[i] == sArray[i - 1])
{
currentCharCount += 1;
}
else
{
currentCharCount = 1;
}
holdDict.TryGetValue(sArray[i], out var keyValue);
if (keyValue < currentCharCount) holdDict[sArray[i]] = currentCharCount;
}
var myQueue = new PriorityQueue<string, int>();
foreach (var rec in holdDict)
{
myQueue.Enqueue($"{rec.Key}#{rec.Value}", rec.Value);
}
int highest = 0, lowest = 0, queueCount=myQueue.Count;
while (myQueue.Count > 0)
{
int currentValue = int.Parse(myQueue.Peek().Split('#')[1]);
if (myQueue.Count == queueCount) lowest = currentValue;
highest = currentValue;
myQueue.Dequeue();
}
return highest - lowest;
O(n) algorithm (26*26*N)
import string
def maxSubarray(s, ch1, ch2):
    """Largest count(ch1) - count(ch2) over substrings that contain ch2 at least once."""
    # Variant of Kadane's algorithm, see https://en.wikipedia.org/wiki/Maximum_subarray_problem
    best_sum = 0
    # Scan forwards and backwards so that windows starting with ch2 are also covered.
    for t in (s, s[::-1]):
        count1 = count2 = 0
        for x in t:
            if x == ch1:
                count1 += 1
            elif x == ch2:
                count2 += 1
            if count2 > count1:
                # A negative running balance can never help; restart the window.
                count1 = count2 = 0
            elif count2 > 0:
                # Only windows that actually contain ch2 define a deviation.
                best_sum = max(best_sum, count1 - count2)
    return best_sum
def findMaxDiv(s):
    '''Algo from https://discuss.codechef.com/t/help-coding-strings/99427/4'''
    maxDiv = 0
    for ch1 in string.ascii_lowercase:
        for ch2 in string.ascii_lowercase:
            if ch1 == ch2:
                continue
            curDiv = maxSubarray(s, ch1, ch2)
            if curDiv > maxDiv:
                maxDiv = curDiv
    return maxDiv

Palindrome operations on a string

You are given a string S initially and some Q queries. For each query you will have 2 integers L and R. For each query, you have to perform the following operations:
Arrange the letters from L to R inclusive to make a Palindrome. If you can form many such palindromes, then take the one that is lexicographically minimum. Ignore the query if no palindrome is possible on rearranging the letters.
You have to find the final string after all the queries.
Constraints:
1 <= length(S) <= 10^5
1 <= Q <= 10^5
1<= L <= R <= length(S)
Sample Input :
4
mmcs 1
1 3
Sample Output:
mcms
Explanation:
The initial string is mmcs. There is 1 query, which asks to make a palindrome from positions 1 to 3, so the palindrome will be mcm. Therefore the string becomes mcms.
If each query takes O(N) time, the overall time complexity would be O(NQ) which will give TLE. So each query should take around O(logn) time. But I am not able to think of anything which will solve this question. I think since we only need to find the final string rather than what every query result into, I guess there must be some other way to approach this question. Can anybody help me?
We can solve this problem using Lazy Segment Tree with range updates.
We will make Segment Tree for each character , so there will be a total of 26 segment trees.
In each node of segment tree we will store the frequency of that character over the range of that node and also keep a track of whether to update that range or not.
So for each query do the following ->
We are given a range L to R
So first we will find frequency of each character over L to R (this will take O(26*log(n)) time )
Now from above frequencies count number of characters who have odd frequency.
If count > 1 , we cannot form palindrome, otherwise we can form palindrome
If we can form a palindrome, first assign 0 over L to R for every character in its segment tree. Then, starting from the smallest character, assign it over (L, L+count/2-1) and (R-count/2+1, R), and update L += count/2 and R -= count/2. The character with an odd count, if any, takes the single middle position that remains.
So each query takes O(26 log n) and building the segment trees takes O(n log n), for an overall time complexity of O(n log n + 26 q log n).
For a better understanding please see my code,
#include <bits/stdc++.h>
using namespace std;
#define enl '\n'
#define int long long
#define sz(s) (int)s.size()
#define all(v) (v).begin(),(v).end()
#define input(vec) for (auto &el : vec) cin >> el;
#define print(vec) for (auto &el : vec) cout << el << " "; cout << "\n";
const int mod = 1e9+7;
const int inf = 1e18;
struct SegTree {
vector<pair<bool,int>>lazy;
vector<int>cnt;
SegTree () {}
SegTree(int n) {
lazy.assign(4*n,{false,0});
cnt.assign(4*n,0);
}
int query(int l,int r,int st,int en,int node) {
int mid = (st+en)/2;
if(st!=en and lazy[node].first) {
if(lazy[node].second) {
cnt[2*node] = mid - st + 1;
cnt[2*node+1] = en - mid;
}
else {
cnt[2*node] = cnt[2*node+1] = 0;
}
lazy[2*node] = lazy[2*node+1] = lazy[node];
lazy[node] = {false,0};
}
if(st>r or en<l) return 0;
if(st>=l and en<=r) return cnt[node];
return query(l,r,st,mid,2*node) + query(l,r,mid+1,en,2*node+1);
}
void update(int l,int r,int val,int st,int en,int node) {
int mid = (st+en)/2;
if(st!=en and lazy[node].first) {
if(lazy[node].second) {
cnt[2*node] = mid - st + 1;
cnt[2*node+1] = en - mid;
}
else {
cnt[2*node] = cnt[2*node+1] = 0;
}
lazy[2*node] = lazy[2*node+1] = lazy[node];
lazy[node] = {false,0};
}
if(st>r or en<l) return;
if(st>=l and en<=r) {
cnt[node] = (en - st + 1)*val;
lazy[node] = {true,val};
return;
}
update(l,r,val,st,mid,2*node);
update(l,r,val,mid+1,en,2*node+1);
cnt[node] = cnt[2*node] + cnt[2*node+1];
}
};
void solve() {
int n;
cin>>n;
string s;
cin>>s;
vector<SegTree>tr(26,SegTree(n));
for(int i=0;i<n;i++) {
tr[s[i]-'a'].update(i,i,1,0,n-1,1);
}
int q;
cin>>q;
while(q--) {
int l,r;
cin>>l>>r;
l--; r--; // queries are 1-based, the segment trees are 0-based
vector<int>cnt(26);
for(int i=0;i<26;i++) {
cnt[i] = tr[i].query(l,r,0,n-1,1);
}
int odd = 0;
for(auto u:cnt) odd += u%2;
if(odd>1) continue;
for(int i=0;i<26;i++) {
tr[i].update(l,r,0,0,n-1,1);
}
int x = l,y = r;
for(int i=0;i<26;i++) {
if(cnt[i]/2) {
tr[i].update(x,x+cnt[i]/2-1,1,0,n-1,1);
tr[i].update(y-cnt[i]/2+1,y,1,0,n-1,1);
x += cnt[i]/2;
y -= cnt[i]/2;
cnt[i]%=2;
}
}
for(int i=0;i<26;i++) {
if(cnt[i]) {
tr[i].update(x,x,1,0,n-1,1);
}
}
}
string ans(n,'a');
for(int i=0;i<26;i++) {
for(int j=0;j<n;j++) {
if(tr[i].query(j,j,0,n-1,1)) {
ans[j] = (char)('a'+i);
}
}
}
cout<<ans<<enl;
}
signed main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);cout.tie(nullptr);
int testcases = 1; // the sample input has no test-case count, so none is read
while(testcases--) solve();
return 0;
}
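As a cross-check for the segment-tree solution, the per-query operation can be simulated directly. The sketch below is a naive reference that costs O(N) per query, so it is far too slow for the stated constraints; it is only meant to show how the lexicographically smallest palindrome is assembled and to validate faster code on small inputs. The function names are hypothetical, and queries are assumed to be 1-based and inclusive as in the problem statement.

```python
from collections import Counter


def smallest_palindrome(letters):
    """Return the lexicographically smallest palindrome using all letters, or None."""
    counts = Counter(letters)
    odd = [ch for ch, c in counts.items() if c % 2]
    if len(odd) > 1:
        return None  # more than one odd count: no palindrome exists
    # Put half of each letter's occurrences on the left, smallest letters first.
    half = "".join(ch * (counts[ch] // 2) for ch in sorted(counts))
    middle = odd[0] if odd else ""
    return half + middle + half[::-1]


def apply_queries(s, queries):
    """Apply each (L, R) query in order; impossible queries are ignored."""
    chars = list(s)
    for left, right in queries:
        rearranged = smallest_palindrome(chars[left - 1:right])
        if rearranged is not None:
            chars[left - 1:right] = rearranged
    return "".join(chars)


print(apply_queries("mmcs", [(1, 3)]))
```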

JXL Get Cell Address

I am using the Java jxl API v2.6.16 to generate an Excel spreadsheet. As the title says, how do I get the address of a Cell, or more specifically of the WritableCell that I am writing to, if all I have is the cell's column and row? Or do I have to write an algorithm that generates it?
Thanks in advance.
You can use the code below; I hope it helps. Call it as
cellAddress(cell.getRow() + 1, cell.getColumn()), where cell is defined like Cell cell = someCell;
private String cellAddress(Integer rowNumber, Integer colNumber){
return "$"+columnName(colNumber)+"$"+rowNumber;
}
private String columnName(Integer colNumber) {
Base columns = new Base(colNumber,26);
columns.transform();
return columns.getResult();
}
class Base {
String[] colNames = "A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z".split(",");
String equalTo;
int position;
int number;
int base;
int[] digits;
int[] auxiliar;
public Base(int n, int b) {
position = 0;
equalTo = "";
base = b;
number = n;
digits = new int[1];
}
public void transform() {
if (number < base) {
digits[position] = number;
size();
} else {
digits[position] = number % base;
size();
position++;
number = number / base;
transform();
}
}
public String getResult() {
for (int j = digits.length - 2; j >= 0; j--) {
equalTo += colNames[j>0?digits[j]-1:digits[j]];
}
return equalTo;
}
private void size() {
auxiliar = digits;
digits = new int[auxiliar.length + 1];
System.arraycopy(auxiliar, 0, digits, 0, auxiliar.length);
}
}
The address is made up of the column and the row.
In A1 notation it is written like Range("A1"), where column 1 is represented by the letter "A" and the row is 1.
In R1C1 notation it is written like R1C1, where the column is 1 and the row is 1.
They would be used like this:
Cells(1,1).font.bold=true ' row 1, column 1
range("A1").font.bold=true
To get the address from a reference, retrieve the Address property of the Cells or Range object as below:
sAddress=cells(1,1).address
which would return $A$1
and in JXL, String rangeAddress = range.getAddress();
You can get the column name from its numeric index by using one of the methods below. The simple method just casts an integer to an ASCII character. The second method allows the use of a custom alphabet.
Simple Method
public class LookupUtil {
public static String cellAddress(int row, int col) {
return String.format("$%s$%d", columnName(col), row);
}
public static String columnName(int index) {
int div = index + 1;
StringBuffer result = new StringBuffer();
int mod = 0;
while (div > 0) {
mod = (div - 1) % 26;
result.insert(0, (char) (65 + mod));
div = (int) ((div - mod) / 26);
}
return result.toString();
}
}
Advanced Method
public class LookupUtil {
private static final char[] ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();
public static String cellAddress(int row, int col) {
return String.format("$%s$%d", columnName(col, ALPHA), row);
}
public static String columnName(int index, char[] alphabet) {
int div = index + 1;
StringBuffer result = new StringBuffer();
int mod = 0;
while (div > 0) {
mod = (div - 1) % alphabet.length;
result.insert(0, alphabet[mod]);
div = (int) ((div - mod) / alphabet.length);
}
return result.toString();
}
}
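To sanity-check the column conversion on a few boundary values, here is the same bijective base-26 idea as a short Python sketch. It mirrors the Simple Method above rather than any jxl API; column_name and cell_address are hypothetical names, and both row and column are assumed to be 0-based, as returned by getRow() and getColumn().

```python
def column_name(index):
    """Convert a 0-based column index to spreadsheet letters (0 -> A, 26 -> AA)."""
    n = index + 1  # work with a 1-based number: A=1 ... Z=26, AA=27
    letters = []
    while n > 0:
        # Subtracting 1 first makes the digits run 0..25 with no "zero" letter.
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


def cell_address(row, col):
    """Build an absolute A1-style address from 0-based row and column indexes."""
    return "$%s$%d" % (column_name(col), row + 1)


for col in (0, 25, 26, 701, 702):
    print(col, column_name(col))
print(cell_address(0, 0))
```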

How to format a Date variable in LWUIT?

I have an object whose class has a getter method, and this getter returns a Date value. I want to show this value in a Label in the format DD/MM/YYYY.
How can I achieve that with LWUIT?
Thank you very much indeed.
You can use this code to convert the date to a string and pass that string to the label.
public static String dateToString (long date)
{
Calendar c = Calendar.getInstance();
c.setTime(new Date(date));
int y = c.get(Calendar.YEAR);
int m = c.get(Calendar.MONTH) + 1;
int d = c.get(Calendar.DATE);
String t = (d<10? "0": "")+d+"/"+(m<10? "0": "")+m+"/"+(y<10? "0": "")+y;
return t;
}
As a thank-you, here is some code of mine that formats a number:
public static String formatNombre(int trivialNombre, String separateur)
{
String pNombre, sNombreLeads, sNombre, argNombre, resultat;
int leadingBits;
int nbBit;
pNombre = String.valueOf(trivialNombre);
if (pNombre.length() > 3)
{
leadingBits = (pNombre.length())%3;
if (leadingBits != 0)
sNombreLeads = pNombre.substring(0, leadingBits).concat(separateur);
else
sNombreLeads = "";
nbBit = 0;
sNombre = "";
argNombre = pNombre.substring(leadingBits);
for (int i=0;i<argNombre.length();i++)
{
sNombre = sNombre.concat(String.valueOf(argNombre.charAt(i)));
nbBit++;
if (nbBit%3 == 0)
sNombre = sNombre.concat(separateur);
}
sNombre = sNombre.substring(0, sNombre.length() - separateur.length()); // drop the trailing separator
resultat = sNombreLeads.concat(sNombre);
return resultat;
}
else
return pNombre;
}

How to find smallest substring which contains all characters from a given string?

I have recently come across an interesting question on strings. Suppose you are given following:
Input string1: "this is a test string"
Input string2: "tist"
Output string: "t stri"
So, given above, how can I approach towards finding smallest substring of string1 that contains all the characters from string 2?
To see more details including working code, check my blog post at:
http://www.leetcode.com/2010/11/finding-minimum-window-in-s-which.html
To help illustrate this approach, I use an example: string1 = "acbbaca" and string2 = "aba". Here, we also use the term "window", which means a contiguous block of characters from string1 (could be interchanged with the term substring).
i) string1 = "acbbaca" and string2 = "aba".
ii) The first minimum window is found. Notice that we cannot advance begin pointer as hasFound['a'] == needToFind['a'] == 2. Advancing would mean breaking the constraint.
iii) The second window is found. begin pointer still points to the first element 'a'. hasFound['a'] (3) is greater than needToFind['a'] (2). We decrement hasFound['a'] by one and advance begin pointer to the right.
iv) We skip 'c' since it is not found in string2. Begin pointer now points to 'b'. hasFound['b'] (2) is greater than needToFind['b'] (1). We decrement hasFound['b'] by one and advance begin pointer to the right.
v) Begin pointer now points to the next 'b'. hasFound['b'] (1) is equal to needToFind['b'] (1). We stop immediately and this is our newly found minimum window.
The idea is mainly based on the help of two pointers (begin and end position of the window) and two tables (needToFind and hasFound) while traversing string1. needToFind stores the total count of a character in string2 and hasFound stores the total count of a character met so far. We also use a count variable to store the total characters in string2 that's met so far (not counting characters where hasFound[x] exceeds needToFind[x]). When count equals string2's length, we know a valid window is found.
Each time we advance the end pointer (pointing to an element x), we increment hasFound[x] by one. We also increment count by one if hasFound[x] is less than or equal to needToFind[x]. Why? When the constraint is met (that is, count equals to string2's size), we immediately advance begin pointer as far right as possible while maintaining the constraint.
How do we check if it is maintaining the constraint? Assume that begin points to an element x, we check if hasFound[x] is greater than needToFind[x]. If it is, we can decrement hasFound[x] by one and advancing begin pointer without breaking the constraint. On the other hand, if it is not, we stop immediately as advancing begin pointer breaks the window constraint.
Finally, we check if the minimum window length is less than the current minimum. Update the current minimum if a new minimum is found.
Essentially, the algorithm finds the first window that satisfies the constraint, then continues to maintain the constraint throughout.
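The two-pointer procedure described above can be sketched in Python as follows. It reuses the names needToFind, hasFound and count from the explanation (in snake_case); the function name min_window and the choice to return an empty string when no window exists are assumptions made for this illustration.

```python
from collections import Counter


def min_window(string1, string2):
    """Return the shortest substring of string1 containing all characters of string2."""
    if not string2:
        return ""
    need_to_find = Counter(string2)  # required count of each character
    has_found = Counter()            # counts inside the current window
    count = 0                        # characters of string2 matched so far
    begin = 0
    best = ""
    for end, ch in enumerate(string1):
        if need_to_find[ch] == 0:
            continue  # character is irrelevant to the constraint
        has_found[ch] += 1
        if has_found[ch] <= need_to_find[ch]:
            count += 1
        if count == len(string2):
            # Advance begin as far right as possible without breaking the constraint.
            while (need_to_find[string1[begin]] == 0
                   or has_found[string1[begin]] > need_to_find[string1[begin]]):
                if has_found[string1[begin]] > need_to_find[string1[begin]]:
                    has_found[string1[begin]] -= 1
                begin += 1
            if not best or end - begin + 1 < len(best):
                best = string1[begin:end + 1]
    return best


print(min_window("this is a test string", "tist"))
```

Each character is visited at most once by end and at most once by begin, which is where the linear running time comes from.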
You can do a histogram sweep in O(N+M) time and O(1) space where N is the number of characters in the first string and M is the number of characters in the second.
It works like this:
Make a histogram of the second string's characters (key operation is hist2[ s2[i] ]++).
Make a cumulative histogram of the first string's characters until that histogram contains every character that the second string's histogram contains (which I will call "the histogram condition").
Then move forwards on the first string, subtracting from the histogram, until it fails to meet the histogram condition. Mark that bit of the first string (before the final move) as your tentative substring.
Move the front of the substring forwards again until you meet the histogram condition again. Move the end forwards until it fails again. If this is a shorter substring than the first, mark that as your tentative substring.
Repeat until you've passed through the entire first string.
The marked substring is your answer.
Note that by varying the check you use on the histogram condition, you can choose either to have the same set of characters as the second string, or at least as many characters of each type. (It's just the difference between a[i]>0 && b[i]>0 and a[i]>=b[i].)
You can speed up the histogram checks if you keep a track of which condition is not satisfied when you're trying to satisfy it, and checking only the thing that you decrement when you're trying to break it. (On the initial buildup, you count how many items you've satisfied, and increment that count every time you add a new character that takes the condition from false to true.)
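The two variants of the histogram condition mentioned above can be written as small predicates. This is only an illustration of the difference between the checks, not the full sweep; covers_set and covers_counts are hypothetical names, and the histograms are assumed to be Counter objects so that missing characters read as 0.

```python
from collections import Counter


def covers_set(window_hist, target_hist):
    """True if every character type of the target occurs at least once in the window."""
    return all(window_hist[ch] > 0 for ch in target_hist)


def covers_counts(window_hist, target_hist):
    """True if the window holds at least as many of each character as the target."""
    return all(window_hist[ch] >= needed for ch, needed in target_hist.items())


target = Counter("tist")
window = Counter("this")
print(covers_set(window, target), covers_counts(window, target))
```

With the target "tist", the window "this" passes the set check but fails the count check, because it contains only one 't'.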
Here's an O(n) solution. The basic idea is simple: for each starting index, find the least ending index such that the substring contains all of the necessary letters. The trick is that the least ending index increases over the course of the function, so with a little data structure support, we consider each character at most twice.
In Python:
from collections import defaultdict
def smallest(s1, s2):
    assert s2 != ''
    d = defaultdict(int)
    nneg = [0]  # number of negative entries in d
    def incr(c):
        d[c] += 1
        if d[c] == 0:
            nneg[0] -= 1
    def decr(c):
        if d[c] == 0:
            nneg[0] += 1
        d[c] -= 1
    for c in s2:
        decr(c)
    minlen = len(s1) + 1
    j = 0
    for i in range(len(s1)):
        while nneg[0] > 0:
            if j >= len(s1):
                return minlen
            incr(s1[j])
            j += 1
        minlen = min(minlen, j - i)
        decr(s1[i])
    return minlen
I received the same interview question. I am a C++ candidate but I was in a position to code relatively fast in JAVA.
Java [Courtesy : Sumod Mathilakath]
import java.io.*;
import java.util.*;
class UserMainCode
{
public String GetSubString(String input1,String input2){
// Write code here...
return find(input1, input2);
}
private static boolean containsPatternChar(int[] sCount, int[] pCount) {
for(int i=0;i<256;i++) {
if(pCount[i]>sCount[i])
return false;
}
return true;
}
    public static String find(String s, String p) {
        if (p.length() > s.length())
            return null;
        int[] pCount = new int[256];
        int[] sCount = new int[256];
        // Time: O(p.length)
        for (int i = 0; i < p.length(); i++) {
            pCount[(int) (p.charAt(i))]++;
            sCount[(int) (s.charAt(i))]++;
        }
        int i = 0, j = p.length(), min = Integer.MAX_VALUE;
        String res = null;
        // Time: O(s.length); the current window is s[i, j)
        while (true) {
            if (containsPatternChar(sCount, pCount)) {
                if ((j - i) < min) {
                    min = j - i;
                    res = s.substring(i, j);
                    // This is the smallest possible substring.
                    if (min == p.length())
                        break;
                }
                // Reduce the window size, whether or not a new minimum was found.
                sCount[(int) (s.charAt(i))]--;
                i++;
            } else {
                // Nothing left to add: every window has been examined.
                if (j == s.length())
                    break;
                // Increase the window size.
                sCount[(int) (s.charAt(j))]++;
                j++;
            }
        }
        System.out.println(res);
        return res;
    }
}
C++ [Courtesy : sundeepblue]
#include <iostream>
#include <vector>
#include <string>
#include <climits>
using namespace std;
string find_minimum_window(string s, string t) {
if(s.empty() || t.empty()) return ""; // the function returns a string, so return an empty one
int ns = s.size(), nt = t.size();
vector<int> total(256, 0);
vector<int> sofar(256, 0);
for(int i=0; i<nt; i++)
total[t[i]]++;
int L = 0, R;
int minL = 0; //gist2
int count = 0;
int min_win_len = INT_MAX;
for(R=0; R<ns; R++) { // gist0, a big for loop
if(total[s[R]] == 0) continue;
else sofar[s[R]]++;
if(sofar[s[R]] <= total[s[R]]) // gist1, <= not <
count++;
if(count == nt) { // POS1
while(true) {
char c = s[L];
if(total[c] == 0) { L++; }
else if(sofar[c] > total[c]) {
sofar[c]--;
L++;
}
else break;
}
if(R - L + 1 < min_win_len) { // this judge should be inside POS1
min_win_len = R - L + 1;
minL = L;
}
}
}
string res;
if(count == nt) // gist3, cannot forget this.
res = s.substr(minL, min_win_len); // gist4, start from "minL" not "L"
return res;
}
int main() {
string s = "abdccdedca";
cout << find_minimum_window(s, "acd");
}
Erlang [Courtesy : wardbekker]
-module(leetcode).
-export([min_window/0]).
%% Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).
%% For example,
%% S = "ADOBECODEBANC"
%% T = "ABC"
%% Minimum window is "BANC".
%% Note:
%% If there is no such window in S that covers all characters in T, return the emtpy string "".
%% If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.
min_window() ->
"eca" = min_window("cabeca", "cae"),
"eca" = min_window("cfabeca", "cae"),
"aec" = min_window("cabefgecdaecf", "cae"),
"cwae" = min_window("cabwefgewcwaefcf", "cae"),
"BANC" = min_window("ADOBECODEBANC", "ABC"),
ok.
min_window(T, S) ->
min_window(T, S, []).
min_window([], _T, MinWindow) ->
MinWindow;
min_window([H | Rest], T, MinWindow) ->
NewMinWindow = case lists:member(H, T) of
true ->
MinWindowFound = fullfill_window(Rest, lists:delete(H, T), [H]),
case length(MinWindow) == 0 orelse (length(MinWindow) > length(MinWindowFound)
andalso length(MinWindowFound) > 0) of
true ->
MinWindowFound;
false ->
MinWindow
end;
false ->
MinWindow
end,
min_window(Rest, T, NewMinWindow).
fullfill_window(_, [], Acc) ->
%% window completed
Acc;
fullfill_window([], _T, _Acc) ->
%% no window found
"";
fullfill_window([H | Rest], T, Acc) ->
%% completing window
case lists:member(H, T) of
true ->
fullfill_window(Rest, lists:delete(H, T), Acc ++ [H]);
false ->
fullfill_window(Rest, T, Acc ++ [H])
end.
REF:
http://articles.leetcode.com/finding-minimum-window-in-s-which/#comment-511216
http://www.mif.vu.lt/~valdas/ALGORITMAI/LITERATURA/Cormen/Cormen.pdf
Please have a look at this as well:
//-----------------------------------------------------------------------
bool IsInSet(char ch, char* cSet)
{
char* cSetptr = cSet;
int index = 0;
while (*(cSet+ index) != '\0')
{
if(ch == *(cSet+ index))
{
return true;
}
++index;
}
return false;
}
void removeChar(char ch, char* cSet)
{
bool bShift = false;
int index = 0;
while (*(cSet + index) != '\0')
{
if( (ch == *(cSet + index)) || bShift)
{
*(cSet + index) = *(cSet + index + 1);
bShift = true;
}
++index;
}
}
typedef struct subStr
{
short iStart;
short iEnd;
short szStr;
}ss;
char* subStringSmallest(char* testStr, char* cSet)
{
char* subString = NULL;
int iSzSet = strlen(cSet) + 1;
int iSzString = strlen(testStr)+ 1;
char* cSetBackUp = new char[iSzSet];
memcpy((void*)cSetBackUp, (void*)cSet, iSzSet);
int iStartIndx = -1;
int iEndIndx = -1;
int iIndexStartNext = -1;
std::vector<ss> subStrVec;
int index = 0;
while( *(testStr+index) != '\0' )
{
if (IsInSet(*(testStr+index), cSetBackUp))
{
removeChar(*(testStr+index), cSetBackUp);
if(iStartIndx < 0)
{
iStartIndx = index;
}
else if( iIndexStartNext < 0)
iIndexStartNext = index;
else
;
if (strlen(cSetBackUp) == 0 )
{
iEndIndx = index;
if( iIndexStartNext == -1)
break;
else
{
index = iIndexStartNext;
ss stemp = {iStartIndx, iEndIndx, (iEndIndx-iStartIndx + 1)};
subStrVec.push_back(stemp);
iStartIndx = iEndIndx = iIndexStartNext = -1;
memcpy((void*)cSetBackUp, (void*)cSet, iSzSet);
continue;
}
}
}
else
{
if (IsInSet(*(testStr+index), cSet))
{
if(iIndexStartNext < 0)
iIndexStartNext = index;
}
}
++index;
}
int indexSmallest = 0;
for(int indexVec = 0; indexVec < subStrVec.size(); ++indexVec)
{
if(subStrVec[indexSmallest].szStr > subStrVec[indexVec].szStr)
indexSmallest = indexVec;
}
subString = new char[(subStrVec[indexSmallest].szStr) + 1];
memcpy((void*)subString, (void*)(testStr+ subStrVec[indexSmallest].iStart), subStrVec[indexSmallest].szStr);
memset((void*)(subString + subStrVec[indexSmallest].szStr), 0, 1);
delete[] cSetBackUp;
return subString;
}
//--------------------------------------------------------------------
Edit: apparently there's an O(n) algorithm (cf. algorithmist's answer). Obviously that will beat the [naive] baseline described below!
Too bad I gotta go... I'm a bit suspicious that we can get O(n). I'll check in tomorrow to see the winner ;-) Have fun!
Tentative algorithm:
The general idea is to sequentially try and use a character from str2 found in str1 as the start of a search (in either/both directions) of all the other letters of str2. By keeping a "length of best match so far" value, we can abort searches when they exceed this. Other heuristics can probably be used to further abort suboptimal (so far) solutions. The choice of the order of the starting letters in str1 matters much; it is suggested to start with the letter(s) of str1 which have the lowest count and to try with the other letters, of an increasing count, in subsequent attempts.
[loose pseudo-code]
- get count for each letter/character in str1 (number of As, Bs etc.)
- get count for each letter in str2
- minLen = length(str1) + 1 (the +1 indicates you're not sure all chars of
str2 are in str1)
- Starting with the letter from string2 which is found the least in string1,
look for other letters of Str2, in either direction of str1, until you've
found them all (or not, in which case response = impossible => done!).
set x = length(corresponding substring of str1).
- if (x < minLen),
set minlen = x,
also memorize the start/len of the str1 substring.
- continue trying with other letters of str1 (going up the frequency
list in str1), but abort search as soon as length(substring of str1)
reaches or exceeds minLen.
We can find a few other heuristics that would allow aborting a
particular search, based on [pre-calculated ?] distance between a given
letter in str1 and some (all?) of the letters in str2.
- the overall search terminates when minLen = length(str2) or when
we've used all letters of str1 (which match one letter of str2)
as a starting point for the search
Here is a Java implementation
public static String shortestSubstrContainingAllChars(String input, String target) {
int needToFind[] = new int[256];
int hasFound[] = new int[256];
int totalCharCount = 0;
String result = null;
char[] targetCharArray = target.toCharArray();
for (int i = 0; i < targetCharArray.length; i++) {
needToFind[targetCharArray[i]]++;
}
char[] inputCharArray = input.toCharArray();
for (int begin = 0, end = 0; end < inputCharArray.length; end++) {
if (needToFind[inputCharArray[end]] == 0) {
continue;
}
hasFound[inputCharArray[end]]++;
if (hasFound[inputCharArray[end]] <= needToFind[inputCharArray[end]]) {
totalCharCount ++;
}
if (totalCharCount == target.length()) {
while (needToFind[inputCharArray[begin]] == 0
|| hasFound[inputCharArray[begin]] > needToFind[inputCharArray[begin]]) {
if (hasFound[inputCharArray[begin]] > needToFind[inputCharArray[begin]]) {
hasFound[inputCharArray[begin]]--;
}
begin++;
}
String substring = input.substring(begin, end + 1);
if (result == null || result.length() > substring.length()) {
result = substring;
}
}
}
return result;
}
Here is the JUnit test
@Test
public void shortestSubstringContainingAllCharsTest() {
String result = StringUtil.shortestSubstrContainingAllChars("acbbaca", "aba");
assertThat(result, equalTo("baca"));
result = StringUtil.shortestSubstrContainingAllChars("acbbADOBECODEBANCaca", "ABC");
assertThat(result, equalTo("BANC"));
result = StringUtil.shortestSubstrContainingAllChars("this is a test string", "tist");
assertThat(result, equalTo("t stri"));
}
//[ShortestSubstring.java][1]
public class ShortestSubstring {
public static void main(String[] args) {
String input1 = "My name is Fran";
String input2 = "rim";
System.out.println(getShortestSubstring(input1, input2));
}
private static String getShortestSubstring(String mainString, String toBeSearched) {
int mainStringLength = mainString.length();
int toBeSearchedLength = toBeSearched.length();
if (toBeSearchedLength > mainStringLength) {
throw new IllegalArgumentException("search string cannot be larger than main string");
}
for (int j = 0; j < mainStringLength; j++) {
for (int i = 0; i <= mainStringLength - toBeSearchedLength; i++) {
String substring = mainString.substring(i, i + toBeSearchedLength);
if (checkIfMatchFound(substring, toBeSearched)) {
return substring;
}
}
toBeSearchedLength++;
}
return null;
}
    private static boolean checkIfMatchFound(String substring, String toBeSearched) {
        // Count the characters of the candidate window (case-insensitively) ...
        java.util.Map<Character, Integer> counts = new java.util.HashMap<>();
        for (char c : substring.toLowerCase().toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        // ... then consume one occurrence per character of the search string.
        for (char c : toBeSearched.toLowerCase().toCharArray()) {
            int remaining = counts.getOrDefault(c, 0);
            if (remaining == 0) {
                return false;
            }
            counts.put(c, remaining - 1);
        }
        return true;
    }
}
}
This approach uses prime numbers to replace one loop with multiplications. Several other minor optimizations can be made.
Assign a unique prime number to each character that you want to find, and 1 to the uninteresting characters.
Compute the target product of a matching string by raising each character's prime to the number of occurrences it should have and multiplying the results. Because prime factorization is unique, a window has this product only if it contains exactly those characters with exactly those counts (uninteresting characters only contribute a factor of 1).
Search the string from the beginning, multiplying each character's prime into a running product as you go.
If the running product is greater than the target product, remove the first character and divide its prime out of the running product.
If the running product is less than the target product, include the next character and multiply its prime into the running product.
If the running product equals the target product, you have found a match; slide the beginning and end to the next character and continue searching for other matches.
Decide which of the matches is the shortest.
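As a minimal sketch of the prime-product idea in Python, the variation below swaps the greater-than/less-than comparison for a divisibility test: a window contains every required character (with multiplicity) exactly when its running product is divisible by the target product. This is an illustration under assumptions, not the code from the gist; the names `first_primes` and `shortest_window_by_primes` are hypothetical, and it relies on Python's arbitrary-precision integers so the product cannot overflow.

```python
def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def shortest_window_by_primes(text, required):
    """Shortest substring of text containing every char of required.

    Hypothetical helper: repeated characters in required must appear
    at least that many times. Returns None if no window exists.
    """
    if not required:
        return ""
    # Assign one prime per distinct required character; others count as 1.
    distinct = sorted(set(required))
    prime_of = dict(zip(distinct, first_primes(len(distinct))))

    # Target product: each prime raised to its required multiplicity.
    target = 1
    for ch in required:
        target *= prime_of[ch]

    product = 1  # product of primes for text[start:end + 1]
    start = 0
    best = None
    for end, ch in enumerate(text):
        product *= prime_of.get(ch, 1)
        # Divisible by target <=> the window holds all required characters.
        while product % target == 0:
            if best is None or end - start + 1 < len(best):
                best = text[start:end + 1]
            # Shrink from the left, dividing the dropped prime back out.
            product //= prime_of.get(text[start], 1)
            start += 1
    return best
```

For example, `shortest_window_by_primes("ADOBECODEBANC", "ABC")` returns `"BANC"`. Integer floor division (`//`) is used on purpose so the running product stays an exact integer instead of drifting into floating point.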
Gist
charcount = {'a': 3, 'b': 1}
text = "kjhdfsbabasdadaaaaasdkaaajbajerhhayeom"

def find(c, s):
    Ns = len(s)
    C = list(c.keys())
    D = list(c.values())
    # prime numbers assigned to the 26 lowercase letters a-z
    prmsi = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
    # primes used in the key, all other set to 1
    prms = []
    Cord = [ord(ch) - ord('a') for ch in C]
    for e, p in enumerate(prmsi):
        if e in Cord:
            prms.append(p)
        else:
            prms.append(1)
    # Product of match
    T = 1
    for ch, d in zip(C, D):
        p = prms[ord(ch) - ord('a')]
        T *= p**d
    print("T=", T)
    t = 1  # product of current string
    f = 0
    i = 0
    matches = []
    mi = 0
    mn = Ns
    mm = 0
    while i < Ns:
        k = prms[ord(s[i]) - ord('a')]
        t *= k
        print("testing:", s[f:i+1])
        if t > T:
            # included too many chars: move start
            # (integer division keeps the running product exact)
            t //= prms[ord(s[f]) - ord('a')]  # remove first char, usually division by 1
            f += 1  # increment start position
            t //= k  # will be retested, could be replaced with bool
        elif t == T:
            # found match
            print("FOUND match:", s[f:i+1])
            matches.append(s[f:i+1])
            if (i - f) < mn:
                mm = mi
                mn = i - f
            mi += 1
            t //= prms[ord(s[f]) - ord('a')]  # remove first matching char
            # look for next match
            i += 1
            f += 1
        else:
            # no match yet, keep searching
            i += 1
    return (mm, matches)

print(find(charcount, text))
(Note: this answer was originally posted to a duplicate question; the original answer has since been deleted.)
C# Implementation:
public static Tuple<int, int> FindMinSubstringWindow(string input, string pattern)
{
Tuple<int, int> windowCoords = new Tuple<int, int>(0, input.Length - 1);
int[] patternHist = new int[256];
for (int i = 0; i < pattern.Length; i++)
{
patternHist[pattern[i]]++;
}
int[] inputHist = new int[256];
int minWindowLength = int.MaxValue;
int count = 0;
for (int begin = 0, end = 0; end < input.Length; end++)
{
// Skip what's not in pattern.
if (patternHist[input[end]] == 0)
{
continue;
}
inputHist[input[end]]++;
// Count letters that are in pattern.
if (inputHist[input[end]] <= patternHist[input[end]])
{
count++;
}
// Window found.
if (count == pattern.Length)
{
// Remove extra instances of letters from pattern
// or just letters that aren't part of the pattern
// from the beginning.
while (patternHist[input[begin]] == 0 ||
inputHist[input[begin]] > patternHist[input[begin]])
{
if (inputHist[input[begin]] > patternHist[input[begin]])
{
inputHist[input[begin]]--;
}
begin++;
}
// Current window found.
int windowLength = end - begin + 1;
if (windowLength < minWindowLength)
{
windowCoords = new Tuple<int, int>(begin, end);
minWindowLength = windowLength;
}
}
}
if (count == pattern.Length)
{
return windowCoords;
}
return null;
}
I've implemented it using Python 3 at O(N) efficiency:
def get(s, alphabet="abc"):
    seen = {}
    for c in alphabet:
        seen[c] = 0
    seen[s[0]] = 1
    start = 0
    end = 0
    shortest_s = 0
    shortest_e = 99999
    while end + 1 < len(s):
        while seen[s[start]] > 1:
            seen[s[start]] -= 1
            start += 1
        # Constant time check:
        if sum(seen.values()) == len(alphabet) and all(v == 1 for v in seen.values()) and \
                shortest_e - shortest_s > end - start:
            shortest_s = start
            shortest_e = end
        end += 1
        seen[s[end]] += 1
    return s[shortest_s: shortest_e + 1]

print(get("abbcac"))  # Expected to return "bca"
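// Requires java.util.HashMap and java.util.Map.
String s = "xyyzyzyx";
String s1 = "xyz";
String finalString = "";
Map<Character,Integer> hm = new HashMap<>();
if (s1 != null && s != null && s.length() >= s1.length()) {
    // Count how often each character of s1 is required.
    for (int i = 0; i < s1.length(); i++) {
        hm.merge(s1.charAt(i), 1, Integer::sum);
    }
    Map<Character,Integer> t = new HashMap<>();
    int start = -1;
    for (int j = 0; j < s.length(); j++) {
        if (hm.get(s.charAt(j)) != null) {
            if (t.get(s.charAt(j)) != null) {
                // Compare Integer objects with equals(), not != (reference comparison).
                if (!t.get(s.charAt(j)).equals(hm.get(s.charAt(j)))) {
                    int k = t.get(s.charAt(j)) + 1;
                    t.put(s.charAt(j), k);
                }
            } else {
                t.put(s.charAt(j), 1);
                if (start == -1) {
                    if (j + s1.length() > s.length()) {
                        break;
                    }
                    start = j;
                }
            }
            if (hm.equals(t)) {
                t = new HashMap<>();
                // Keep the shortest window found so far (the stray ';' after the
                // original if made the assignment unconditional).
                String candidate = s.substring(start, j + 1);
                if (finalString.isEmpty() || candidate.length() < finalString.length()) {
                    finalString = candidate;
                }
                // Restart the scan just after the previous window start.
                j = start;
                start = -1;
            }
        }
    }
}
System.out.println(finalString);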
JavaScript solution using brute force:
function shortestSubStringOfUniqueChars(s){
    // Collect the distinct characters of s.
    var uniqueArr = [];
    for(let i=0; i<s.length; i++){
        if(uniqueArr.indexOf(s.charAt(i)) <0){
            uniqueArr.push(s.charAt(i));
        }
    }
    // Try window sizes from the smallest possible up to the whole string.
    let windoww = uniqueArr.length;
    while(windoww <= s.length){
        for(let i=0; i<=s.length - windoww; i++){
            let checkStr = s.substring(i, i + windoww);
            let match = true;
            // The window matches only if it contains every distinct character.
            for(let j=0; j<uniqueArr.length; j++){
                if(checkStr.indexOf(uniqueArr[j])<0){
                    match = false;
                    break;
                }
            }
            if(match){
                return checkStr;
            }
        }
        windoww = windoww + 1;
    }
}
console.log(shortestSubStringOfUniqueChars("ABA"));
# Python implementation
s = input('Enter the string : ')
s1 = input('Enter the substring to search : ')
l = []  # List to record all the matching combinations
check = all(char in s for char in s1)
if check:
    # Try window lengths i in increasing order, at every start position j.
    for i in range(len(s1), len(s) + 1):
        for j in range(0, len(s) - i + 1):
            window = s[j:i + j]
            if all(char in window for char in s1):
                l.append(window)
    print('The smallest substring containing', s1, 'is', l[0])
else:
    print('Please enter a valid substring')
Java code for the approach discussed above:
private static Map<Character, Integer> frequency;
private static Set<Character> charsCovered;
private static Map<Character, Integer> encountered;
/**
* To set the first match index as an initial start point
*/
private static boolean hasStarted = false;
private static int currentStartIndex = 0;
private static int finalStartIndex = 0;
private static int finalEndIndex = 0;
private static int minLen = Integer.MAX_VALUE;
private static int currentLen = 0;
/**
* Whether we have already found the match and now looking for other
* alternatives.
*/
private static boolean isFound = false;
private static char currentChar;
public static String findSmallestSubStringWithAllChars(String big, String small) {
if (null == big || null == small || big.isEmpty() || small.isEmpty()) {
return null;
}
frequency = new HashMap<Character, Integer>();
instantiateFrequencyMap(small);
charsCovered = new HashSet<Character>();
int charsToBeCovered = frequency.size();
encountered = new HashMap<Character, Integer>();
for (int i = 0; i < big.length(); i++) {
currentChar = big.charAt(i);
if (frequency.containsKey(currentChar) && !isFound) {
if (!hasStarted && !isFound) {
hasStarted = true;
currentStartIndex = i;
}
updateEncounteredMapAndCharsCoveredSet(currentChar);
if (charsCovered.size() == charsToBeCovered) {
currentLen = i - currentStartIndex;
isFound = true;
updateMinLength(i);
}
} else if (frequency.containsKey(currentChar) && isFound) {
updateEncounteredMapAndCharsCoveredSet(currentChar);
if (currentChar == big.charAt(currentStartIndex)) {
encountered.put(currentChar, encountered.get(currentChar) - 1);
currentStartIndex++;
while (currentStartIndex < i) {
if (encountered.containsKey(big.charAt(currentStartIndex))
&& encountered.get(big.charAt(currentStartIndex)) > frequency.get(big
.charAt(currentStartIndex))) {
encountered.put(big.charAt(currentStartIndex),
encountered.get(big.charAt(currentStartIndex)) - 1);
} else if (encountered.containsKey(big.charAt(currentStartIndex))) {
break;
}
currentStartIndex++;
}
}
currentLen = i - currentStartIndex;
updateMinLength(i);
}
}
System.out.println("start: " + finalStartIndex + " finalEnd : " + finalEndIndex);
return big.substring(finalStartIndex, finalEndIndex + 1);
}
private static void updateMinLength(int index) {
if (minLen > currentLen) {
minLen = currentLen;
finalStartIndex = currentStartIndex;
finalEndIndex = index;
}
}
private static void updateEncounteredMapAndCharsCoveredSet(Character currentChar) {
if (encountered.containsKey(currentChar)) {
encountered.put(currentChar, encountered.get(currentChar) + 1);
} else {
encountered.put(currentChar, 1);
}
if (encountered.get(currentChar) >= frequency.get(currentChar)) {
charsCovered.add(currentChar);
}
}
private static void instantiateFrequencyMap(String str) {
for (char c : str.toCharArray()) {
if (frequency.containsKey(c)) {
frequency.put(c, frequency.get(c) + 1);
} else {
frequency.put(c, 1);
}
}
}
public static void main(String[] args) {
String big = "this is a test string";
String small = "tist";
System.out.println("len: " + big.length());
System.out.println(findSmallestSubStringWithAllChars(big, small));
}
def minimum_window(s, t, min_length=100000):
    # Count how many times each character of t occurs.
    d = {}
    for x in t:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    tot = sum(d.values())
    l = []
    ind = 0
    for i, x in enumerate(s):
        if ind == 1:
            l = l + [x]
        if x in d:
            tot -= 1
            if not l:
                ind = 1
                l = [x]
        if tot == 0:
            if len(l) < min_length:
                min_length = len(l)
            min_length = minimum_window(s[i+1:], t, min_length)
    return min_length


l_s = "ADOBECODEBANC"
t_s = "ABC"
min_length = minimum_window(l_s, t_s)
if min_length == 100000:
    print("Not found")
else:
    print(min_length)

Resources