How to convert map keys to uppercase using Java 8 streams? - hashmap

I have a method as below
private Map<String,List<String>> createTableColumnListMap(List<Map<String,String>> nqColumnMapList){
Map<String, List<String>> targetTableColumnListMap =
nqColumnMapList.stream()
.flatMap(m -> m.entrySet().stream())
.collect(groupingBy(Map.Entry::getKey, mapping(Map.Entry::getValue, toList())));
return targetTableColumnListMap;
}
I want to uppercase the map keys but couldn't find a way to do it. Is there a Java 8 way to achieve this?

This doesn't require any fancy manipulation of Collectors. Let's say you have this map:
Map<String, Integer> imap = new HashMap<>();
imap.put("One", 1);
imap.put("Two", 2);
Just get a stream for the keySet() and collect into a new map where the keys you insert are uppercased:
Map<String, Integer> newMap = imap.keySet().stream()
.collect(Collectors.toMap(key -> key.toUpperCase(), key -> imap.get(key)));
// ONE - 1
// TWO - 2
Edit:
@Holger's comment is correct: it would be better (and cleaner) to just use the entry set, so here is the updated solution:
Map<String, Integer> newMap = imap.entrySet().stream()
.collect(Collectors.toMap(entry -> entry.getKey().toUpperCase(), entry -> entry.getValue()));

Answer to your question (which you can copy and paste):
Map<String, List<String>> targetTableColumnListMap = nqColumnMapList.stream().flatMap(m -> m.entrySet().stream())
.collect(Collectors.groupingBy(e -> e.getKey().toUpperCase(), Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
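To see what happens when two keys differ only in case, here is a small self-contained sketch built around the groupingBy version above. The class name and the sample data are made up for illustration. With groupingBy the values of colliding keys end up in the same list, whereas Collectors.toMap without a merge function throws an IllegalStateException on duplicate keys.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class UppercaseKeysDemo {

    // Groups the values of all input maps under the uppercased key.
    static Map<String, List<String>> createTableColumnListMap(List<Map<String, String>> nqColumnMapList) {
        return nqColumnMapList.stream()
                .flatMap(m -> m.entrySet().stream())
                .collect(Collectors.groupingBy(
                        e -> e.getKey().toUpperCase(),
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the question.
        Map<String, String> first = new HashMap<>();
        first.put("orders", "id");
        Map<String, String> second = new HashMap<>();
        second.put("Orders", "total");

        // "orders" and "Orders" both become "ORDERS", so their values share one list.
        System.out.println(createTableColumnListMap(Arrays.asList(first, second)));
    }
}
```

Note that toUpperCase() uses the default locale; toUpperCase(Locale.ROOT) is a common choice when the keys must not depend on the machine's locale.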

Related

Optimization (in terms of speed )

Is there any other way to optimize this code? Can anyone come up with a better way? This is taking a lot of time in the main code. Thanks a lot ;)
HashMap<String, Integer> hmap = new HashMap<String, Integer>();
List<String> dup = new ArrayList<String>();
List<String> nondup = new ArrayList<String>();
for (String num : nums) {
String x= num;
String result = x.toLowerCase();
if (hmap.containsKey(result)) {
hmap.put(result, hmap.get(result) + 1);
}
else {
hmap.put(result,1);
}
}
for(String num:nums){
int count= hmap.get(num.toLowerCase());
if (count == 1){
nondup.add(num);
}
else{
dup.add(num);
}
}
output:
[A/tea, C/SEA.java, C/clock, aep, aeP, C/SEA.java]
Dups: [C/SEA.java, aep, aeP, C/SEA.java]
nondups: [A/tea, C/clock]
How much time is "a lot of time"? Is your input bigger than what you've actually shown us?
You could parallelize this with something like Arrays.stream(nums).parallel().collect(Collectors.groupingByConcurrent(k -> k, Collectors.counting())), which would get you a Map<String, Long>, but that would only speed up your code if you have a lot of input, which it doesn't look like you have right now.
You could parallelize the next step too, if you liked, like so:
Map<String, Long> counts = Arrays.stream(nums).parallel()
.collect(Collectors.groupingByConcurrent(k -> k, Collectors.counting()));
Map<Boolean, List<String>> hasDup =
counts.entrySet().parallelStream()
.collect(Collectors.partitioningBy(
entry -> entry.getValue() > 1,
Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
List<String> dup = hasDup.get(true);
List<String> nodup = hasDup.get(false);
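If the goal is exactly the output shown in the question (case-insensitive comparison, original spelling and order kept), a sequential two-pass stream version might look like the sketch below. The class and method names are made up for illustration. Calling .parallel() on either stream is possible, but it is only likely to pay off for large inputs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DuplicateSplitDemo {

    // Splits nums into duplicates (key true) and non-duplicates (key false),
    // comparing case-insensitively but keeping the original spelling.
    static Map<Boolean, List<String>> splitByDuplicate(String[] nums) {
        // Pass 1: count the occurrences of each lowercased string.
        Map<String, Long> counts = Arrays.stream(nums)
                .collect(Collectors.groupingBy(s -> s.toLowerCase(), Collectors.counting()));
        // Pass 2: partition the original elements by that count.
        return Arrays.stream(nums)
                .collect(Collectors.partitioningBy(s -> counts.get(s.toLowerCase()) > 1));
    }

    public static void main(String[] args) {
        String[] nums = {"A/tea", "C/SEA.java", "C/clock", "aep", "aeP", "C/SEA.java"};
        Map<Boolean, List<String>> result = splitByDuplicate(nums);
        System.out.println("Dups: " + result.get(true));
        System.out.println("nondups: " + result.get(false));
    }
}
```

Both passes are still O(N), same as the original loops; the stream form mostly saves boilerplate.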
The algorithms in the other answers can speed up execution using multiple threads.
This can theoretically reduce the processing time by a factor of M, where M is the maximum number of threads that your system can run concurrently. However, as M is a constant, this does not change the order of complexity, which therefore remains O(N).
At a glance, I do not see a way to solve your problem in less than O(N), I am afraid.

Java 11 Array of strings to Map of array position, string [duplicate]

I have an array of strings that I want to convert into a Map, without using AtomicInteger or a third-party API.
sample:
final String[] strings = new String[]{"Arsenal", "Chelsea", "Liverpool"};
final Map<Integer, String> map = new HashMap<>();
for (int i = 0; i < strings.length; i++) {
map.put(i, strings[i]);
}
System.out.println(map);
What is the best and most concise way to achieve this using the Streams API?
After looking into it I found 2 possible solutions:
final Map<Integer, String> map = IntStream.rangeClosed(0, strings.length - 1)
.mapToObj(i -> new SimpleEntry<>(i + 1, strings[i]))
.collect(Collectors.toMap(SimpleEntry::getKey, SimpleEntry::getValue));
And:
final Map<Integer, String> map = IntStream.rangeClosed(0, strings.length - 1)
.boxed()
.collect(toMap(i -> i + 1, i -> strings[i]));
The second one does not need to instantiate AbstractMap.SimpleEntry.
Is there any better solution? Is there any advice on which one is the best way?
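For comparison, here is a minimal runnable sketch of the second variant using IntStream.range, which avoids the length - 1 arithmetic. It keeps the 0-based keys of the for loop (the two snippets above use i + 1 instead). The class and method names are made up for illustration.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ArrayToIndexMapDemo {

    // Maps each array position (0-based, as in the for loop) to its element.
    static Map<Integer, String> toIndexMap(String[] strings) {
        return IntStream.range(0, strings.length)
                .boxed()
                .collect(Collectors.toMap(i -> i, i -> strings[i]));
    }

    public static void main(String[] args) {
        String[] strings = {"Arsenal", "Chelsea", "Liverpool"};
        System.out.println(toIndexMap(strings));
    }
}
```

Since the indices are unique, toMap never sees a duplicate key here, so no merge function is needed.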

Implement phone directory using two tries

I encountered this interview question:
"Implement a phone directory using data structures"
I want to solve it using tries. I tried using two tries, one for names and another for phone numbers, but I ran into a difficulty.
Suppose I have to add three entries (AB "112", BC "124", CD "225").
Then if I query the name for number "225", how do I return CD? That is, how will these two tries be linked?
One approach I was considering was keeping two pointers in both tries.
These pointers would point to the first and last word in the other trie.
For example, if the structures are as follows:
struct nametrie
{
    struct nametrie *child[26];      /* one slot per letter A-Z */
    struct phonetrie *head, *tail;
    struct phonetrie *root;
    /* ... */
};
struct phonetrie
{
    struct phonetrie *child[10];     /* one slot per digit 0-9 */
    struct nametrie *head, *tail;
    struct nametrie *root;
    /* ... */
};
Then for AB "112", the name trie will store head (1) and tail (2).
But I think this approach will not work for duplicate entries (one name with multiple numbers).
Can someone please explain a good approach? I am not looking for code but for a good understanding of the approach, maybe via a diagram or an algorithm.
I don't know C, so I can't comment on your code.
The idea of using tries is valid.
You seem to be missing what data the nodes of a trie can hold.
A node in a tree has two main components:
the data it holds, which can be of any type
a list of children (or left and right children), or any combination of children
What we will do here is add another field to each node and call it "theValue".
So the trie node will look like this:
class TrieNode {
    public char theChar;
    public String theValue;
    public List<TrieNode> children;
}
So for forward lookup (name to phone) you construct one trie, and on the node that matches an entry in the directory you set theValue to that entry's phone number.
You will need to create a second trie to do the same for reverse lookup (phone to name).
To give you an example, this is how the forward trie will look for this data:
(AB "112", AC "124", ACD "225")
// create nodes
TrieNode root = new TrieNode();
TrieNode A = new TrieNode();
A.theChar = 'A';
TrieNode B = new TrieNode();
B.theChar = 'B';
TrieNode C = new TrieNode();
C.theChar = 'C';
TrieNode D = new TrieNode();
D.theChar = 'D';
// link nodes together: root -> A, A -> B, A -> C, C -> D
root.children = new ArrayList<>();
root.children.add(A);
A.children = new ArrayList<>();
A.children.add(B);
A.children.add(C);
C.children = new ArrayList<>();
C.children.add(D);
// fill the data
B.theValue = "112"; // AB
C.theValue = "124"; // AC
D.theValue = "225"; // ACD
Now you can easily traverse this trie, and when you reach a node and want to check the value, just read theValue.
I hope that is clear.
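The hand-built nodes above can be generalized into insert and lookup methods. The following is only a sketch of one way to do it, with a few assumptions of my own: children are kept in a HashMap instead of a list, and each node stores a list of values rather than a single theValue so that one name can have several numbers (the duplicate case from the question). All class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PhoneDirectory {

    // One trie node; values holds every entry stored under the key that ends here.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        final List<String> values = new ArrayList<>();
    }

    private final Node byName = new Node();   // forward trie: name -> numbers
    private final Node byNumber = new Node(); // reverse trie: number -> names

    // Every entry is inserted into both tries, so no cross-pointers are needed.
    void add(String name, String number) {
        insert(byName, name, number);
        insert(byNumber, number, name);
    }

    List<String> numbersFor(String name) {
        return lookup(byName, name);
    }

    List<String> namesFor(String number) {
        return lookup(byNumber, number);
    }

    private static void insert(Node root, String key, String value) {
        Node node = root;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.values.add(value);
    }

    private static List<String> lookup(Node root, String key) {
        Node node = root;
        for (char c : key.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return Collections.emptyList(); // key was never inserted
            }
        }
        return node.values;
    }

    public static void main(String[] args) {
        PhoneDirectory directory = new PhoneDirectory();
        directory.add("AB", "112");
        directory.add("BC", "124");
        directory.add("CD", "225");
        System.out.println(directory.namesFor("225"));
        System.out.println(directory.numbersFor("AB"));
    }
}
```

Storing the value directly on the terminal node is what links the two tries: the reverse trie does not point into the forward trie, it simply carries the name as its payload.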

Reduce a string using grammar-like rules

I'm trying to find a suitable DP algorithm for simplifying a string. For example I have a string a b a b and a list of rules
a b -> b
a b -> c
b a -> a
c c -> b
The goal is to get all single chars that can be obtained from the given string using these rules. For this example it will be b, c. The length of the given string can be up to 200 symbols. Could you please suggest an efficient algorithm?
Rules are always 2 -> 1. I've got an idea of creating a tree where the root is the given string and each child is a string after one transformation, but I'm not sure if it's the best way.
If you read those rules from right to left, they look exactly like the rules of a context-free grammar, and have basically the same meaning. You could apply a general context-free parsing algorithm like the Earley algorithm to your data, along with a suitable starting rule; something like
start <- start a
| start b
| start c
and then just examine the parse forest for the shortest chain of starts. The worst case remains O(n^3) of course, but Earley is fairly effective, these days.
You can also produce parse forests when parsing with derivatives. You might be able to efficiently check them for short chains of starts.
For a DP problem, you always need to understand how you can construct the answer for a big problem in terms of smaller sub-problems. Assume you have your function simplify which is called with an input of length n. There are n-1 ways to split the input in a first and a last part. For each of these splits, you should recursively call your simplify function on both the first part and the last part. The final answer for the input of length n is the set of all possible combinations of answers for the first and for the last part, which are allowed by the rules.
In Python, this can be implemented like so:
from functools import lru_cache

rules = {'ab': set('bc'), 'ba': set('a'), 'cc': set('b')}
all_chars = set(c for cc in rules.values() for c in cc)

@lru_cache(maxsize=None)  # memoize: each distinct substring is solved once
def simplify(s):
    if len(s) == 1:  # base case to end recursion
        return set(s)
    possible_chars = set()
    # iterate over all the possible splits of s
    for i in range(1, len(s)):
        head = s[:i]
        tail = s[i:]
        # check all possible combinations of answers of sub-problems
        for c1 in simplify(head):
            for c2 in simplify(tail):
                possible_chars.update(rules.get(c1+c2, set()))
        # speed hack
        if possible_chars == all_chars:  # won't get any bigger
            return all_chars
    return possible_chars
Quick check:
In [53]: simplify('abab')
Out[53]: {'b', 'c'}
To make this fast enough for large strings (to avoid exponential behavior), you should use a memoize decorator such as functools.lru_cache. This is a critical step in solving DP problems; otherwise you are just doing a brute-force calculation. A further tiny speedup can be obtained by returning from the function as soon as possible_chars == set('abc'), since at that point you are already sure that you can generate all possible outcomes.
Analysis of running time: for an input of length n, there are 2 substrings of length n-1, 3 substrings of length n-2, ... n substrings of length 1, for a total of O(n^2) subproblems. Due to the memoization, the function body is evaluated at most once for every subproblem. Maximum running time for a single sub-problem is O(n) due to the for i in range(1, len(s)) loop, so the overall running time is at most O(n^3).
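The same recurrence can also be filled in bottom-up, from short substrings to long ones, which is essentially the CYK table known from parsing. The sketch below is my own illustration of that idea in Java, not code from the answer: the class name is made up, and rules are assumed to be given as a map from a two-character left side to a string of possible result characters.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CykReduceDemo {

    // Returns every single char that s can be reduced to.
    // table[i][len] holds the chars that s.substring(i, i + len) can reduce to.
    static Set<Character> reduce(String s, Map<String, String> rules) {
        int n = s.length();
        if (n == 0) {
            return new HashSet<>();
        }
        @SuppressWarnings("unchecked")
        Set<Character>[][] table = new Set[n][n + 1];
        for (int i = 0; i < n; i++) {
            table[i][1] = new HashSet<>();
            table[i][1].add(s.charAt(i)); // a single char reduces to itself
        }
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len <= n; i++) {
                Set<Character> cell = new HashSet<>();
                for (int k = 1; k < len; k++) { // split into lengths k and len - k
                    for (char left : table[i][k]) {
                        for (char right : table[i + k][len - k]) {
                            String produced = rules.get("" + left + right);
                            if (produced != null) {
                                for (char c : produced.toCharArray()) {
                                    cell.add(c);
                                }
                            }
                        }
                    }
                }
                table[i][len] = cell;
            }
        }
        return table[0][n];
    }

    public static void main(String[] args) {
        Map<String, String> rules = new HashMap<>();
        rules.put("ab", "bc"); // a b -> b and a b -> c
        rules.put("ba", "a");
        rules.put("cc", "b");
        System.out.println(reduce("abab", rules));
    }
}
```

There are O(n^2) cells and O(n) split points per cell, so this is the same O(n^3) bound (times a constant that depends on the alphabet size), without recursion or string slicing.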
Let N be the length of the given string and R the number of rules.
Expanding a tree in a top-down manner without memoization takes more than exponential time in the worst case (input string of type aaa... and rules aa -> a).
Proof:
A string of length k has at most (k-1)R children, one for each adjacent pair and each rule that could apply to it. So the root has up to (N-1)R children, each of which has up to (N-2)R children, and so on down to strings of length 1 (leaves). The number of leaves is therefore at most (N-1)R * (N-2)R * ... * 1R = (N-1)! * R^(N-1). For the input aaa... with the single rule aa -> a this bound is reached with R = 1, giving (N-1)! leaves.
Recursive Java implementation of this naive approach:
public static void main(String[] args) {
Map<String, Character[]> rules = new HashMap<String, Character[]>() {{
put("ab", new Character[]{'b', 'c'});
put("ba", new Character[]{'a'});
put("cc", new Character[]{'b'});
}};
System.out.println(simplify("abab", rules));
}
public static Set<String> simplify(String in, Map<String, Character[]> rules) {
Set<String> result = new HashSet<String>();
simplify(in, rules, result);
return result;
}
private static void simplify(String in, Map<String, Character[]> rules, Set<String> result) {
if (in.length() == 1) {
result.add(in);
}
for (int i = 0; i < in.length() - 1; i++) {
String two = in.substring(i, i + 2);
Character[] rep = rules.get(two);
if (rep != null) {
for (Character c : rep) {
simplify(in.substring(0, i) + c + in.substring(i + 2, in.length()), rules, result);
}
}
}
}
Bas Swinckels's O(RN^3) Java implementation (with HashMap as a memoization cache):
public static Set<String> simplify2(final String in, Map<String, Character[]> rules) {
Map<String, Set<String>> cache = new HashMap<String, Set<String>>();
return simplify2(in, rules, cache);
}
private static Set<String> simplify2(final String in, Map<String, Character[]> rules, Map<String, Set<String>> cache) {
final Set<String> cached = cache.get(in);
if (cached != null) {
return cached;
}
Set<String> ret = new HashSet<String>();
if (in.length() == 1) {
ret.add(in);
return ret;
}
for (int i = 1; i < in.length(); i++) {
String head = in.substring(0, i);
String tail = in.substring(i, in.length());
for (String c1 : simplify2(head, rules, cache)) {
for (String c2 : simplify2(tail, rules, cache)) {
Character[] rep = rules.get(c1 + c2);
if (rep != null) {
for (Character c : rep) {
ret.add(c.toString());
}
}
}
}
}
cache.put(in, ret);
return ret;
}
Output in both approaches:
[b, c]

Hadoop... Text.toString() conversion problems

I'm writing a simple program for enumerating triangles in directed graphs for my project. First, for each input arc (e.g. a b, b c, c a; note: a tab symbol serves as a delimiter) I want my map function to output the following pairs ([a, to_b], [b, from_a], [a_b, -1]):
public void map(LongWritable key, Text value,
OutputCollector<Text, Text> output,
Reporter reporter) throws IOException {
String line = value.toString();
String [] tokens = line.split(" ");
output.collect(new Text(tokens[0]), new Text("to_"+tokens[1]));
output.collect(new Text(tokens[1]), new Text("from_"+tokens[0]));
output.collect(new Text(tokens[0]+"_"+tokens[1]), new Text("-1"));
}
Now my reduce function is supposed to cross join all pairs that have both to_'s and from_'s
and to simply emit any other pairs whose keys contain "_".
public void reduce(Text key, Iterator<Text> values,
OutputCollector<Text, Text> output,
Reporter reporter) throws IOException {
String key_s = key.toString();
if (key_s.indexOf("_")>0)
output.collect(key, new Text("completed"));
else {
HashMap <String, ArrayList<String>> lists = new HashMap <String, ArrayList<String>> ();
while (values.hasNext()) {
String line = values.next().toString();
String[] tokens = line.split("_");
if (!lists.containsKey(tokens[0])) {
lists.put(tokens[0], new ArrayList<String>());
}
lists.get(tokens[0]).add(tokens[1]);
}
for (String t : lists.get("to"))
for (String f : lists.get("from"))
output.collect(new Text(t+"_"+f), key);
}
}
And this is where the most exciting stuff happens. tokens[1] throws an ArrayIndexOutOfBoundsException. If you scroll up, you can see that by this point the iterator should give values like "to_a", "from_b", "to_b", etc. When I just output these values, everything looks OK and I have "to_a", "from_b". But split() doesn't work at all; moreover, line.length() is always 1 and indexOf("_") returns -1! The very same indexOf WORKS PERFECTLY for keys, where we have pairs whose keys contain "_" and look like "a_b", "b_c".
I'm really puzzled by all this. MapReduce is supposed to save lives by making everything simple. Instead I spent several hours just localizing this.
Not sure if that's the problem, but try changing this:
String [] tokens = line.split(" ");
to this:
String [] tokens = line.split("\t");
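To see the difference outside Hadoop, here is a tiny plain-Java sketch (class and method names made up) that splits a tab-delimited arc both ways. Whether this is the whole cause of the reducer problem is not confirmed by the question; it only shows why split(" ") cannot separate the two endpoints of a tab-delimited line.

```java
class TabSplitDemo {

    // Splits one input arc of the form "source<TAB>target".
    static String[] parseArc(String line) {
        return line.split("\t");
    }

    public static void main(String[] args) {
        String line = "a\tb";
        // The line contains no space, so split(" ") returns the whole line as one token.
        System.out.println(line.split(" ").length);  // prints 1
        // Splitting on the tab yields both endpoints.
        System.out.println(parseArc(line).length);   // prints 2
    }
}
```

With a single token, tokens[1] in the mapper does not exist, so it is worth checking tokens.length before indexing.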

Resources