Compare a converted TreeSet from a String with another TreeSet Java 7 - string

Using Java 7 I am looking for a way to compare a string that I converted to a TreeSet with another defined TreeSet.
Here is an example:
public static <T> Set<T> intersection(Set<T> setA, Set<T> setB) {
Set<T> tmp = new TreeSet<T>();
for (T x : setA)
if (setB.contains(x))
tmp.add(x);
return tmp;
}
public static void main(String args[]) {
String str ="A, B, C, D";
Set<String> set3 = new TreeSet(Arrays.asList(str));
TreeSet<String> set1 = new TreeSet<String>();
TreeSet<String> set2 = new TreeSet<String>();
set1.add("A");
set1.add("B");
set1.add("C");
set1.add("D");
set2.add("A");
set2.add("B");
set2.add("C");
set2.add("D");
System.out.println("Intersection: Set 1 & 2: " + intersection(set1, set2));
System.out.println("Intersection: Set 1 & 3: " + intersection(set1, set3));
}
Comparing a set to another set whose elements I added with set.add works as expected:
Intersection: Set 1 & 2: [A, B, C, D]
The issue that I can't get my head around is why I don't have a match when I compare a converted TreeSet from a String with another TreeSet.
Intersection: Set 1 & 3: []
But when I print set1 and set3 I find them to be exactly the same??
System.out.println("set1: " + set1);
System.out.println("set3: " + set3);
set1: [A, B, C, D]
set3: [A, B, C, D]
Then I tried looping through the string and added each element to the set but that didn't work either.
System.out.println("Intersection: (set1 - set3)" + intersection(set1, convertString2Set(str)));
private static Set convertString2Set(String line) {
List<String> linesList = new ArrayList<String>(Arrays.asList(line));
Set<String> set = new TreeSet<String>();
for (String elme : linesList) {
set.add(elme);
}
return set;
}

set3 in your code contains only one element: a string equal to "A, B, C, D". This string is not equal to any of the four strings in set1, which are "A", "B", "C" and "D". The toString() representation of the two sets happens to be the same, but that does not mean they have the same content.
Arrays.asList() on a String does not split anything; it will always create a single-element list. If you want to split a string into multiple substrings, use the String.split() method.
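As an illustration of the String.split() suggestion, here is a minimal sketch that compiles on Java 7. The class name SplitToSet and the helper toSet are made up for this example, and the regex assumes the elements are separated by a comma with optional surrounding whitespace.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SplitToSet {
    // Hypothetical helper: splits "A, B, C, D" into the four elements A, B, C, D.
    static Set<String> toSet(String line) {
        // "\\s*,\\s*" matches a comma together with any whitespace around it.
        return new TreeSet<String>(Arrays.asList(line.trim().split("\\s*,\\s*")));
    }

    public static void main(String[] args) {
        Set<String> set1 = new TreeSet<String>(Arrays.asList("A", "B", "C", "D"));
        Set<String> set3 = toSet("A, B, C, D");

        // retainAll keeps only the elements that are also in set3.
        Set<String> common = new TreeSet<String>(set1);
        common.retainAll(set3);
        System.out.println("Intersection: Set 1 & 3: " + common);
        // prints: Intersection: Set 1 & 3: [A, B, C, D]
    }
}
```

The same change applies to convertString2Set in the question: wrapping the unsplit string in Arrays.asList still yields a one-element list, so the loop adds a single element.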

Related

How to find missing character in the second string when we compare two strings? - Coding Question

If String a = "abbc" and String b="abc", we have to print that character 'b' is missing in the second string.
I want to do this in Java. I can do it when string 2 has a character that is not present in string 1 (for example s1 = abc and s2 = abk), but not when both strings contain the same characters, as in the example above.
public class Program
{
public static void main(String[] args) {
String str1 = "abbc";
String str2 = "abc";
char first[] = str1.toCharArray();
char second[] = str2.toCharArray();
HashMap <Character, Integer> map1 = new HashMap<Character,Integer>();
for(char a: first){
if(!map1.containsKey(a)){
map1.put(a,1);
}else{
map1.put(a,map1.get(a)+1);
}
}
System.out.println(map1);
HashMap <Character, Integer> map2 = new HashMap<Character,Integer>();
for(char b: second){
if(!map2.containsKey(b)){
map2.put(b,1);
}else{
map2.put(b,map2.get(b)+1);
}
}
System.out.println(map2);
}
}
I have two hashmaps here one for the longer string and one for the shorter string, map1 {a=1,b=2,c=1} and map2 {a=1,b=1,c=1}. What should I do after this?
Let's assume that we have two strings a and b.
1. (optional) Compare lengths to find the longer one.
2. Iterate over them char by char and compare the letters at the current indexes.
3. If both letters are the same, ignore them. If they differ, add the letter from the longer string to the result and advance only the index of the longer string by 1.
4. Any letters left over at the end of the longer string also belong to the result.
Pseudocode:
const a = "aabbccc"
const b = "aabcc"
let res = ""
for (let i = 0, j = 0; i < a.length; i++, j++) {
  if (a[i] !== b[j]) {
    res += a[i]
    j-- // on a mismatch only the longer string advances, so keep j where it was
  }
}
console.log(res)
A more modern and elegant way, using higher-order functions:
const a = "aabbccc"
const b = "aabcc"
const res = [...a].reduce((r, e, i) => e === b[i - r.length] ? r : r + e, "")
console.log(res)
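For the Java code in the question, one possible continuation is to compare the two character counts directly instead of walking both strings by index. This is a minimal sketch, assuming Java 8 or later (for Map.merge) and that the first argument is the longer string; the class name MissingChars and the method missing are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MissingChars {
    // Returns the characters of `longer` that have no counterpart in `shorter`,
    // in order of first appearance in `longer`.
    static String missing(String longer, String shorter) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : longer.toCharArray()) {
            counts.merge(c, 1, Integer::sum);   // count characters of the longer string
        }
        for (char c : shorter.toCharArray()) {
            counts.merge(c, -1, Integer::sum);  // cancel out characters of the shorter one
        }
        StringBuilder result = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            // A positive remainder means that many copies are missing in `shorter`.
            for (int k = 0; k < e.getValue(); k++) {
                result.append(e.getKey());
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(missing("abbc", "abc"));      // prints: b
        System.out.println(missing("aabbccc", "aabcc")); // prints: bc
    }
}
```

Unlike the index-based approach, this ignores the order of characters, which is enough if only the missing letters and their multiplicity matter.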

Shortest string containing the most occurrence of strings in a set

Define the degree of a string M to be the number of times it appears in another string S. For example, for M = "aba" and S = "ababa", the degree of M is 2. Given a set of strings and an integer N, find the string of minimum length such that the sum of the degrees of all strings in the set is at least N.
For example, for the set {"ab", "bd", "abd", "babd", "abc"} and N = 4, the answer is "babd". It contains "ab", "abd", "babd" and "bd" one time each.
N <= 100, M <= 100, length of every string in the set <= 100. Strings in the set only consist of uppercase and lowercase letters.
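To make the definition concrete, here is a small sketch of how the degree could be counted. Note that occurrences may overlap ("aba" appears twice in "ababa"), so the search restarts one character after each hit. The class name Degree and the methods degree and totalDegree are hypothetical names chosen for this example.

```java
class Degree {
    // Number of (possibly overlapping) occurrences of m in s.
    static int degree(String m, String s) {
        int count = 0;
        // Restart the search one position after each match so overlaps are counted.
        for (int i = s.indexOf(m); i >= 0; i = s.indexOf(m, i + 1)) {
            count++;
        }
        return count;
    }

    // Sum of the degrees of all strings of the set in the candidate string s.
    static int totalDegree(String[] set, String s) {
        int sum = 0;
        for (String m : set) {
            sum += degree(m, s);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(degree("aba", "ababa")); // prints: 2
        String[] set = {"ab", "bd", "abd", "babd", "abc"};
        System.out.println(totalDegree(set, "babd")); // prints: 4
    }
}
```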
How can this problem be solved? It looks similar to the shortest superstring problem, which has a dynamic programming solution with exponential complexity. However, the constraints in this problem are much larger, so the same idea won't work here. Is there some string data structure that can be applied here?
I have a polynomial time algorithm, which I'm too lazy to code. But I'll describe it for you.
First, make each string in the set plus the empty string be the nodes of a graph. The empty string is connected to each other string, and vice versa. If the end of one string overlaps with the start of another, they also connect. If two can overlap by different amounts, they get multiple edges. (So it is not exactly a graph...)
Each edge gets a cost and a value. The cost is how many characters you have to extend the string you are building by to move from having the old string at its end to having the new one (in other words, the length of the second string minus the length of the overlap). The value is how many new strings you completed that cross the barrier between the former and the latter string.
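The edge weights described above might be computed along these lines. This is only a sketch under my reading of the description: the value is taken to be the number of set-string occurrences that end inside the newly appended characters, and a non-empty string only connects to another string through a real overlap. The class name OverlapEdges and both method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class OverlapEdges {
    // Counts occurrences of set strings in s that end after position `boundary`,
    // i.e. that are completed by the newly appended characters.
    static int completedAfter(String s, int boundary, String[] set) {
        int count = 0;
        for (String m : set) {
            for (int i = s.indexOf(m); i >= 0; i = s.indexOf(m, i + 1)) {
                if (i + m.length() > boundary) {
                    count++;
                }
            }
        }
        return count;
    }

    // Returns one {value, cost} pair per possible overlap between the end of
    // `from` and the start of `to`. The empty string connects with overlap 0.
    static List<int[]> edges(String from, String to, String[] set) {
        List<int[]> result = new ArrayList<>();
        int maxOverlap = Math.min(from.length(), to.length());
        for (int k = from.isEmpty() ? 0 : 1; k <= maxOverlap; k++) {
            if (from.endsWith(to.substring(0, k))) {
                String combined = from + to.substring(k);
                int cost = to.length() - k; // characters actually appended
                int value = completedAfter(combined, from.length(), set);
                result.add(new int[] {value, cost});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] set = {"ab", "bd", "abd", "babd", "abc"};
        int[] e1 = edges("ab", "bd", set).get(0);
        System.out.println("(" + e1[0] + ", " + e1[1] + ")"); // prints: (2, 1)
        int[] e2 = edges("", "babd", set).get(0);
        System.out.println("(" + e2[0] + ", " + e2[1] + ")"); // prints: (4, 4)
    }
}
```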
Your example was {"ab", "bd", "abd", "babd", "abc"}. Here are the (value, cost) pairs for each transition.
from -> to : (value, cost)
"" -> "ab": ( 1, 2)
"" -> "bd": ( 1, 2)
"" -> "abd": ( 3, 3) # we added "ab", "bd" and "abd"
"" -> "babd": ( 4, 4) # we get "ab", "bd", "abd" and "babd"
"" -> "abc": ( 2, 3) # we get "ab" and "abc"
"ab" -> "": ( 0, 0)
"ab" -> "bd": ( 2, 1) # we added "abd" and "bd" for 1 character
"ab" -> "abd": ( 2, 1) # ditto
"ab" -> "abc": ( 1, 1) # we only added "abc"
"bd" -> "": ( 0, 0) # only empty, nothing else starts "bd"
"abd" -> "": ( 0, 0)
"babd" -> "": ( 0, 0)
"babd" -> "abd": ( 0, 0) # overlapped, but added nothing.
"abc" -> "": ( 0, 0)
OK, all of that is setup. Why did we want this graph?
Well note that if we start at "" with a cost of 0 and a value of 0, then take a path through the graph, that constructs a string. It correctly states the cost, and provides a lower bound on the value. The value can be higher. For example if your set were {"ab", "bc", "cd", "abcd"} then the path "" -> "ab" -> "bc" -> "cd" would lead to the string "abcd" with a cost of 4 and a predicted value of 3. But that value estimate missed the fact that we matched "abcd".
However for any given string made up only of substrings from the set, there is a path through the graph that has the correct cost and the correct value. (At each choice you want to pick the earliest starting matching string that you have not yet counted, and of those pick the longest of them. Then you never miss any matches.)
So we've turned our problem from constructing strings to constructing paths through a graph. What we want to do is build up the following data structure:
for each (value, node) combination:
(best cost, previous node, previous value)
Filling in that data structure is a dynamic programming problem. Once filled in we can just trace back through it to find what path in the graph got us to that value with that cost. Given that path, we can figure out the string that did it.
How fast is it? If our set has K strings then we only need to fill in K * N values, each of which we can give a maximum of K candidates for new values. That makes the path finding an O(K^2 * N) problem.
So here is my approach. At first iteration we construct a pool out of the initial strings.
After that:
1. We select from the pool the string of minimal length whose sum of degrees is at least N. If we find such a string we just return it.
2. We filter out of the pool all strings whose degree is less than the maximum, so we work only with the best possible string combinations.
3. We construct all variants out of the current pool and the initial strings. Here we need to take into consideration that strings can overlap. Say the strings "aba" and "ab" (from the initial strings) could produce: ababa, abab, abaab (we do not include "aba" because we already had it in our pool and we need to move further).
4. We filter out duplicates, and this is our next pool.
5. Repeat everything from step 1.
The FindTarget() method accepts the target sum as a parameter. FindTarget(4) will solve the sample task.
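using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Solution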
{
/// <summary>
/// The initial strings.
/// </summary>
string[] stringsSet;
Tuple<string, string>[][] splits;
public Solution(string[] strings)
{
stringsSet = strings;
splits = stringsSet.Select(s => ProduceItemSplits(s)).ToArray();
}
/// <summary>
/// Find the optimal string.
/// </summary>
/// <param name="N">Target degree.</param>
/// <returns></returns>
public string FindTarget(int N)
{
var pool = stringsSet;
while (true)
{
var poolWithDegree = pool.Select(s => new { str = s, degree = GetN(s) })
.ToArray();
var maxDegree = poolWithDegree.Max(m => m.degree);
var optimalString = poolWithDegree
.Where(w => w.degree >= N)
.OrderBy(od => od.str.Length)
.FirstOrDefault();
if (optimalString != null) return optimalString.str; // We found it
var nextPool = poolWithDegree.Where(w => w.degree == maxDegree)
.SelectMany(sm => ExpandString(sm.str))
.Distinct()
.ToArray();
pool = nextPool;
}
}
/// <summary>
/// Get degree.
/// </summary>
/// <param name="candidate"></param>
/// <returns></returns>
public int GetN(string candidate)
{
var N = stringsSet.Select(s =>
{
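// The lookahead makes each match zero-width, so overlapping occurrences are counted too.
var c = Regex.Matches(candidate, "(?=" + Regex.Escape(s) + ")").Count;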
return c;
}).Sum();
return N;
}
public Tuple<string, string>[] ProduceItemSplits(string item)
{
var substings = Enumerable.Range(0, item.Length + 1)
.Select((i) => new Tuple<string, string>(item.Substring(0, i), item.Substring(i, item.Length - i))).ToArray();
return substings;
}
private IEnumerable<string> ExpandStringWithOneItem(string str, int index)
{
var item = stringsSet[index];
var itemSplits = splits[index];
var startAttachments = itemSplits.Where(w => str.StartsWith(w.Item2) && w.Item1.Length > 0)
.Select(s => s.Item1 + str);
var endAttachments = itemSplits.Where(w => str.EndsWith(w.Item1) && w.Item2.Length > 0)
.Select(s => str + s.Item2);
return startAttachments.Union(endAttachments);
}
public IEnumerable<string> ExpandString(string str)
{
var r = Enumerable.Range(0, splits.Length) // the second argument is a count, so this covers every initial string
.Select(s => ExpandStringWithOneItem(str, s))
.SelectMany(s => s);
return r;
}
}
static void Main(string[] args)
{
var solution = new Solution(new string[] { "ab", "bd", "abd", "babd", "abc" });
var s = solution.FindTarget(150);
Console.WriteLine(s);
}

Valid binary tree from array of string

I have an array of strings. Each character in a string can only be r or l.
I have to check whether the array describes a valid tree, for example:
1. {rlr,l,r,lr, rl}
    *
   / \
  l   r
   \ /
   r l
      \
       r
A valid tree as all nodes are present.
2. {ll, r, rl, rr}
      *
     / \
    -   r
   /   / \
  l   l   r
Invalid tree as there is no l node.
From a given input I have to determine whether it creates a valid tree or not.
I have come up with two solutions.
1. Use a trie to store the input, marking each node as valid or not during insertion.
2. Sort the input array by length, so the first case becomes {l, r, lr, rl, rlr}, and put every input string into a set. For each string longer than 1, take every prefix starting at index 0 (for rlr: r, rl) and look it up in the set. If any prefix is not present in the set, return false.
I am wondering if there is a more optimal solution or any modification in the above methods.
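The second idea can be simplified a little: if every string's immediate parent (the string without its last character) is in the set, then all shorter prefixes are present as well, so no sorting is needed. A minimal sketch of that variant follows; the class name PrefixCheck is made up, and the code assumes the input contains no empty string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PrefixCheck {
    // Valid if every node deeper than level 1 has its parent in the input.
    static boolean isValid(String[] paths) {
        Set<String> seen = new HashSet<>(Arrays.asList(paths));
        for (String p : paths) {
            if (p.length() > 1 && !seen.contains(p.substring(0, p.length() - 1))) {
                return false; // the parent node is missing
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(new String[] {"rlr", "l", "r", "lr", "rl"})); // prints: true
        System.out.println(isValid(new String[] {"ll", "r", "rl", "rr"}));       // prints: false
    }
}
```

Building the set and checking each parent takes time proportional to the total length of the input strings, because of the substring and hash computations.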
A recursive approach with test cases:
public static void main(String[] args) {
System.out.println(new Main().isValid(new String[]{"LRL", "LRR", "LL", "LR"}));
System.out.println(new Main().isValid(new String[]{"LRL", "LRR", "LL", "LR", "L"}));
System.out.println(new Main().isValid(new String[]{"LR", "L"}));
System.out.println(new Main().isValid(new String[]{"L", "R", "LL", "LR"}));
}
public boolean isValid(String[] strs) {
Set<String> set = new HashSet<>();
int maxLength = 0;
for (String str : strs) {
set.add(str);
maxLength = Math.max(str.length(), maxLength);
}
helper(set, "L", 1, maxLength);
helper(set, "R", 1, maxLength);
return set.isEmpty();
}
private void helper(Set<String> set, String current, int len, int maxLength) {
if (!set.contains(current) || current.length() > maxLength) {
return;
}
if (set.contains(current))
set.remove(current);
helper(set, current + "L", len + 1, maxLength);
helper(set, current + "R", len + 1, maxLength);
}
Another possible solution is actually building the tree (or trie) and maintaining a set of nodes that are still incomplete.
If you finish iterating over the list and you still have incomplete nodes then the tree isn't valid.
If the set is empty then the tree is valid.
For example, in the second tree you gave, for node ll you will also create node l, but you will add it to the incomplete set. If one of the later nodes is l then you will erase it from the set. If not, you will end the iteration with a non-empty set that contains your missing nodes.
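A sketch of this bookkeeping is shown below. As a simplification it tracks nodes by their path strings in two sets rather than building real trie nodes; the class name IncompleteNodes and the method missingNodes are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class IncompleteNodes {
    // Returns the nodes that were implied by some path but never given explicitly.
    static Set<String> missingNodes(String[] paths) {
        Set<String> present = new HashSet<>();
        Set<String> incomplete = new TreeSet<>();
        for (String p : paths) {
            present.add(p);
            incomplete.remove(p); // this node is now complete
            // Every proper prefix is an ancestor that must also be given explicitly.
            for (int len = 1; len < p.length(); len++) {
                String ancestor = p.substring(0, len);
                if (!present.contains(ancestor)) {
                    incomplete.add(ancestor);
                }
            }
        }
        return incomplete;
    }

    public static void main(String[] args) {
        System.out.println(missingNodes(new String[] {"ll", "r", "rl", "rr"}));    // prints: [l]
        System.out.println(missingNodes(new String[] {"rlr", "l", "r", "lr", "rl"})); // prints: []
    }
}
```

The tree is valid exactly when the returned set is empty, and a non-empty result names the missing nodes.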

Chance for this hole in Groovy static typing to be fixed

When I run the following Groovy snippet, it prints ",a,b,c" as expected:
@CompileStatic
public static void main(String[] args) {
def inList = ["a", "b", "c"]
def outList = inList.inject("", { a, b -> a + "," + b })
println(outList)
}
Now I change the first parameter in inject from an empty string to the number 0:
@CompileStatic
public static void main(String[] args) {
def inList = ["a", "b", "c"]
def outList = inList.inject(0, { a, b -> a + "," + b })
println(outList)
}
This doesn't work and produces the exception "Cannot cast object '0,a' with class 'java.lang.String' to class 'java.lang.Number'". The problem is that the compiler did not complain. I tried this in Scala and Kotlin (where inject is called fold) and the respective compilers complain about the mismatch as expected. The counterpart in Java 8 does not compile either (it says found int, required: java.lang.String):
List<String> list = Arrays.asList("a", "b", "c");
Object obj = list.stream().reduce(0, (x, y) -> x + y);
System.out.println(obj);
The question now is whether this can be fixed in Groovy or whether it is a general problem caused by static typing being introduced into the language later.
I think (I'm not 100% sure) that it is a bug in Groovy, very probably somewhere in type inference, and that it can be fixed. Try to file an issue in the bug tracker.
If you want to see compilation error, you can give types to closure parameters
inList.inject(0, { String a, String b -> a + "," + b })
Gives an error:
Expected parameter of type java.lang.Integer but got java.lang.String
@ line 7, column 38.
def outList = inList.inject(0, { String a, String b -> a + "," + b})
^

Convert string representation of an array of int to a groovy list

If I am given a string that is the representation of an array of int, like below:
String d = "[2,3,4,5]"
how do I convert it to an array of String?
String[] f = convert d to array of String
And how do I convert it to an array of int?
int[] f = convert d to array of int
I was looking for other solutions, since my values contained strings including dots in them and found this.
This code should work for you:
"[1,2,3]".tokenize(',[]')*.toInteger()
You can use Eval.me like so:
String[] f = Eval.me( d )*.toString()
Or
int[] i = Eval.me( d )
Be careful though, as if this String is entered by a third party, it could do nasty things and is a huge security risk... To get round that, you'd need to parse it yourself with something like:
def simplisticParse( String input, Class requiredType ) {
input.dropWhile { it != '[' }
.drop( 1 )
.takeWhile { it != ']' }
.split( ',' )*.asType( requiredType )
}
String[] s = simplisticParse( d, String )
int[] i = simplisticParse( d, Integer )
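For comparison, the same parse-it-yourself idea can be written without any evaluation of the input. The sketch below is plain Java rather than Groovy, and it assumes the input is a bracketed, comma-separated list of integers; the class name IntListParser and the method parse are made up for this example.

```java
import java.util.Arrays;

class IntListParser {
    // Parses a string such as "[2,3,4,5]" into an int array without evaluating it as code.
    static int[] parse(String input) {
        String s = input.trim();
        if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
            throw new IllegalArgumentException("expected a list like [1,2,3]: " + input);
        }
        String inner = s.substring(1, s.length() - 1).trim();
        if (inner.isEmpty()) {
            return new int[0]; // "[]" is an empty list
        }
        String[] parts = inner.split(",");
        int[] result = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // parseInt throws NumberFormatException on anything that is not an integer.
            result[i] = Integer.parseInt(parts[i].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("[2,3,4,5]"))); // prints: [2, 3, 4, 5]
    }
}
```

Malformed input is rejected with an exception instead of being executed, which avoids the security problem mentioned above.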

Resources