Approach to find duplicate - Kafka and queue - hashmap

Question asked in an interview: suppose there are two Kafka topics, or let's say queues, Q1 and Q2.
Both hold some messages, say 10 messages each.
The condition is: if both queues hold exactly the same messages, that's fine, but if there is even one odd or non-matching message, we need to error out or notify.
The approaches I suggested for this problem:
1- Using a HashSet: we add the first queue's messages to a set, and while adding the second queue's messages, the return value of the add method tells us whether each message was already there.
2- Using a HashMap and storing the messages in key-value form: while adding, I check whether the key (the message) is already there.
But he was not satisfied, and he did not share the right answer or the problem with the above approaches.
Let me know if a better solution exists and what the problem with this approach is.

He may have been aiming to get into a discussion about the difficulties of balancing in a real-time streaming situation. Let's say there is a continuous stream of messages going through the two topics: how do you know if things are balanced?
There is no single answer; it depends on the situation, but generally you have to consider some kind of time window.
My guess is that the interviewer's dissatisfaction (if any) would be because he was looking to talk about options rather than going to one specific solution to a specific situation.
We can't know what he was thinking without asking (which I would always recommend) but when I'm interviewing I always look for candidates who can consider and discuss problems and trade-offs, not necessarily ones who have the 'right' solution.
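As one illustration of the time-window idea, here is a minimal sketch in plain Python. It does not use any Kafka client; the function name `window_mismatches` and the `(timestamp, payload)` message shape are assumptions made for the example. It counts payloads per window rather than storing them in a set, because a set alone cannot tell "b once" from "b twice".

```python
from collections import Counter

def window_mismatches(q1_messages, q2_messages, window_start, window_seconds):
    """Compare what arrived on two queues within one time window.

    Each input is an iterable of (timestamp, payload) pairs (an assumed shape).
    Returns a dict of payloads whose counts differ: a positive number means
    extra copies on Q1, a negative number means extra copies on Q2.
    An empty dict means the window is balanced.
    """
    window_end = window_start + window_seconds

    def counts_in_window(messages):
        return Counter(
            payload for ts, payload in messages
            if window_start <= ts < window_end
        )

    diff = counts_in_window(q1_messages)
    diff.subtract(counts_in_window(q2_messages))  # keeps negative counts
    return {payload: n for payload, n in diff.items() if n != 0}
```

A caller would error out or notify whenever the returned dict is non-empty. How wide the window should be, and how to treat messages that arrive late, is exactly the kind of trade-off the answer above refers to.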

Related

Moving to next question when failing to receive a response

We had a developer come up with a prototype of a bot for bookkeeping questions, and we understand that the bot is not perfect. Our biggest challenge was to ensure that after 2-3 failed attempts to get an appropriate response, the bot moves on to the next question. Our previous developer claimed that it's not possible; is that true or not? Currently the bot just gives up after a couple of attempts and that's it.
I am not a tech person and I would really appreciate some help on this.
Hypothetical example of the ideal scenario:
Q: What accounting software do you use?
A: askdnjsajld
Q: Didn't get that. What accounting software do you use?
A: asdkjnajksdn
Q: I am sorry, didn't get that. Let's move on to the next question... When would you like to receive your financials?
A: month-end
Thank you for your help!
Yes, this is possible, although the exact details of how you do this depend on how you're handling responses from the user in the first place.
The most important thing to keep in mind to handle this, however, is to remember that Intents represent what the user says and not how you handle it or how you reply.
That last bit gives the simplest answer to your question: you can, of course, reply however you want the bot to reply after each round. It sounds like you were able to implement logic that detects when the bot got a result it didn't want; you can extend that with a counter so that, after a number of tries, the bot just uses the next question as its reply.
The more difficult part is to make sure that you know what question the user is replying to, and to make sure you capture the right value in case they try to go back and answer a previous question.
The general solution to this problem is twofold:
1- Have one Intent that accepts the direct answer to the question you're currently on, but have it triggerable only with certain Input Contexts being set. Then set that Context to be valid when you ask the question, and remove the Context when the question is answered.
2- Have other Intents that specifically respond to how the user might phrase the question if they were going back to answer it.
For example, if you have asked "What software do you use?" you might set the Context "ask-software-used". You would then have two Intents:
1- One with an Input Context of "ask-software-used" that accepts just valid software names.
2- Another with no Input Context, but which might have training phrases such as
"I'm using XXXX software"
"We are working with the XXXX package"
Once the user does answer the question, clear the "ask-software-used" Context, ask the next question, and set its Context.
You can also use this to keep track of how many times you've had to repeat the question, waiting for an answer. If that counter hits the limit, do the same thing: clear the Context, ask the next question, and set its Context.
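The counter logic can be sketched independently of any bot platform. The following is a plain-Python stand-in, not the Dialogflow API: the class `Questionnaire`, the validator functions, and the default limit of two attempts are all assumptions for illustration. Each question carries a context name, and the "active context" plays the role of the Input Context described above.

```python
class Questionnaire:
    """Asks questions in order and moves on after too many failed attempts."""

    def __init__(self, questions, max_attempts=2):
        # questions: list of (context_name, prompt, is_valid) tuples,
        # where is_valid is a function taking the user's text.
        self.questions = questions
        self.max_attempts = max_attempts  # hypothetical limit
        self.index = 0       # which question is currently being asked
        self.attempts = 0    # failed attempts for the current question
        self.answers = {}

    @property
    def active_context(self):
        """Context of the question currently being asked, or None when done."""
        if self.index < len(self.questions):
            return self.questions[self.index][0]
        return None

    def _advance(self, prefix=""):
        # Clear the current context, reset the counter, ask the next question.
        self.index += 1
        self.attempts = 0
        if self.index < len(self.questions):
            return prefix + self.questions[self.index][1]
        return prefix + "Thank you, that's all."

    def handle(self, user_text):
        """Return the bot's reply to one user message."""
        if self.active_context is None:
            return "Thank you, that's all."
        context, prompt, is_valid = self.questions[self.index]
        if is_valid(user_text):
            self.answers[context] = user_text
            return self._advance()
        self.attempts += 1
        if self.attempts >= self.max_attempts:
            return self._advance(
                "I am sorry, didn't get that. Let's move on to the next question... "
            )
        return "Didn't get that. " + prompt
```

With two questions and two garbled replies to the first one, this reproduces the conversation from the question: one retry, then a move to the next question. Handling a user who goes back to answer an earlier question would still need the second kind of Intent described above.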

Akka/ZeroMQ Messaging Patterns by Example

I'm interested in trying to see how I might leverage the Akka/ZeroMQ module in my project.
In that document, 4 so-called "messaging patterns" are identified, but only 1 (Pub-Sub) is explained in detail. They are:
Pub-Sub
Router-Dealer
Push-Pull
Rep-Req
As a messaging greenhorn, I don't understand how there could be anything more than Pub-Sub: you have a message, you publish it to a broker, and another process (subscriber) consumes it from the broker.
So my specific question is: what are some concrete use cases for each message ZeroMQ pattern, and why would I ever want to utilize each pattern if Akka already has a mechanism for communicating between threads?
I ask this because the documentation linked above simply states "More documentation and examples will follow soon." for all patterns except Pub-Sub.
Before going into more detail on your question, kindly check another answer to an almost identical one >>> https://stackoverflow.com/a/25742744/3666197
Q: What are some concrete use cases for each message ZeroMQ pattern
A: Best proceed with the book, you will find many indispensable comments and remarks there
Q: .. don't understand how there could be anything more than Pub-Sub
A: Oh yes, there is a whole new Universe behind that. ZeroMQ is broker-less, zero-copy and incredibly fast, to name just a few properties ( read below )
Q: why would I ever want to utilize each pattern if Akka already has a mechanism for communicating between threads?
A: Well, it depends. If you are happy with message passing performance for just a few localhost threads ( not much above a few tens ), no need to invest your time into ZeroMQ. If you are going for high performance, distribution, (almost) linear scalability and heterogeneous portability, well, then it might be the right time to start reading into ZMQ.
Several links to a few must-reads from the ZeroMQ evangelists Pieter Hintjens & Martin Sústrik, worth reading to shape one's mind before moving into details:
An initial view on PUB/SUB from http://250bpm.com/blog:39 ( check and do not miss Martin's cool notes on unit-testing & other gems in his collection )
A very in-depth must-have & must-read is the book ( available as pdf ) "Code Connected, Volume 1". If you are going in seriously for messaging, this is a basis to work with.
A collection of good whitepapers is on http://zeromq.org/area:whitepapers

fast parallel picking/checking of some ID (think non-game application of loot lag)

So I am verifying new operational management systems, and one of these systems sends pick lists to a scalable number of handheld devices. It sends these using messages, and the pick lists may contain overlapping jobs. So in my virtual world, I need to make sure that two simulated humans don't pick the same job. Whenever someone picks a job, all the job lists get refreshed so that the picked job no longer appears on anyone else's handheld, but for me the message is still in the queue being handled, so I have to make sure to discard that option.
Basically I have this giant list with a mutex, and the more "people" hitting it faster, the slower I can handle messages, to the point where I'm no longer at real-time, which is bad, because I can't validate the system if I can't keep up with the messages. (Two guys in the same aisle will recognize that one is going to pick one object and the next guy should pick the 2nd item, but I need to check every single job I'm about to pick and see if it has been claimed by someone else already.)
I've considered localized binning of the lists, but it doesn't actually solve the problem in the stupid case that breaks it anyway: tons of people working on the same row. Now granted, this would probably be confusing for the real people as well, as in real life they need to do the same resolution, but I'm curious what the currently accepted "best" solution to this problem is.
PS - I am already implementing this in C++ and it's fast, fast enough that in any practical test I don't "need" this question answered; I'm asking more because I'm curious.
Thanks in advance!
I see a problem in the design "giant list with (one) mutex". You simply can't provide the whole list in synchronized fashion if the list size and/or access rate is unlimited. Basic math works against you. So what I would do is put a mutexed flag on each job. You can't prevent a job from being displayed on someone's screen, but you can ensure that he gets a graceful "no longer available" error and THEN the updated list. If you have ever tried to reserve a seat for a highly popular gig, you may have witnessed this solution.
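A minimal sketch of the "mutexed flag on each job" idea follows, in Python for brevity even though the question is implemented in C++. The class `Job`, its `try_claim` method and the helper `race_for_job` are illustrative names, not part of any system mentioned above. Each job has its own small lock, so pickers contending for different jobs never block each other.

```python
import threading

class Job:
    """A pickable job guarded by its own lock instead of one global mutex."""

    def __init__(self, job_id):
        self.job_id = job_id
        self._lock = threading.Lock()
        self._claimed_by = None

    def try_claim(self, picker):
        """Claim the job for picker; return False if someone already has it."""
        with self._lock:
            if self._claimed_by is not None:
                return False  # caller shows "no longer available", then refreshes
            self._claimed_by = picker
            return True

    @property
    def claimed_by(self):
        return self._claimed_by


def race_for_job(job, pickers):
    """Let several pickers try to claim one job concurrently; return the winners."""
    winners = []

    def attempt(picker):
        if job.try_claim(picker):
            winners.append(picker)

    threads = [threading.Thread(target=attempt, args=(p,)) for p in pickers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

However many pickers race for the same job, `race_for_job` returns exactly one winner; the losers are the ones who would receive the graceful error and the refreshed list.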

Java : Use Disruptor or Not . .

Hi,
Currently I am developing a program that takes 2 values from an amq queue and performs a series of mathematical calculations on them. A topic has been created on the amq server to which my program subscribes, and it receives messages via callbacks (listeners).
Now whenever a message arrives, the two values are taken out of it and added to the SynchronizedDescriptiveStatistics object. After each addition to the list of values, the whole sequence of calculations is performed all over again (this is actually part of the requirement).
The problem I am facing right now is that, since I am using listeners, sometimes one or more messages are received in the middle of the calculations. Although SynchronizedDescriptiveStatistics takes care of all the thread-related issues itself, it seems to add all the waiting values to its list of numbers at once when it comes out of the lock. What I need instead is to add one value, perform the calculations on it, then add the second value, and so on.
The solution I came up with is to use job queues in my program (not amq queues). In this way, whenever the calculations are over, the program would look for further jobs in the queue and go on accordingly.
Since I am also looking for efficiency and speed, I thought the Disruptor framework might be good for this problem, as it is optimized for threaded situations. But I am not sure if it's worth the trouble of implementing Disruptor in my application, because a regular standard queue might be enough for what I am trying to do.
Let me also tell you that there is a lot of data on which the calculations need to be performed, it will keep on coming, and the whole set of calculations will need to be performed all over again for each addition of a single value, in a continuous fashion. So keeping in mind the efficiency and the huge volume of data, what do you think will be useful in the long run?
Waiting for a reply. . .
Regards.
I'll give our typical answer to this question: test first, and make your decision based on your results.
Although you talk about efficiency, you don't specifically say that performance is a fundamental requirement. If you have an idea of your performance requirements, you could mock up a simple prototype using queues versus a basic implementation of the Disruptor, and take measurements of the performance of both.
If one comes off substantially better than the other, that's your answer. If, however, one is much more effort to implement, especially if it's also not giving you the efficiency you require, or you don't have any hard performance requirements, then that suggests that solution is not the right one.
Measure first, and decide based on your results.
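For the plain-queue side of such a comparison, the job-queue idea from the question can be sketched as follows. This is a Python illustration rather than Java, and `run_worker`, `recalculate` and the `None` stop marker are assumed names and conventions: the listener callback only enqueues values, and a single worker thread takes them one at a time, so each value gets its own full recalculation.

```python
import queue
import threading

def run_worker(jobs, recalculate, results):
    """Consume values one at a time and recalculate after each addition.

    jobs: a queue.Queue that listener callbacks put values into.
    recalculate: a function applied to all values seen so far.
    results: a list that receives one result per consumed value.
    A None value is used here as an assumed stop marker.
    """
    values = []
    while True:
        value = jobs.get()  # blocks until the listener enqueues something
        if value is None:
            break
        values.append(value)
        results.append(recalculate(list(values)))
```

A usage example, with `sum` standing in for the real series of calculations:

```python
jobs = queue.Queue()
results = []
worker = threading.Thread(target=run_worker, args=(jobs, sum, results))
worker.start()
for value in (1, 2, 3):  # what the message listener would do on each callback
    jobs.put(value)
jobs.put(None)
worker.join()
```

Timing this baseline under a realistic message rate, and then doing the same with a Disruptor-based prototype, gives the measurements the answer asks for.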

DDD - Identifying entities, roots and services

I want to apologize for my bad English first. I am trying to implement DDD in my project, but I have some problems that I will try to describe. I am developing a chat application. The front-end is quite simple: no rooms, just one window showing the last n messages.
Here comes the first problem. I don't see any entity here (ChatRoom would be OK, but I have only one room). Message seems like a value object to me. So I don't know how to save the state of the chat (I think that having repositories for value objects is a bad practice).
This chat should have one specific feature: it should group and store related messages together (create something like a ChatSegment). These segments will be divided by blocks of m off-topic messages (off-topic messages are messages unrelated to the messages in the current segment).
I cannot imagine a way to make this work without using a stateful service. This behavior does not fit into any Entity (not even into the hypothetical ChatRoom entity). A Segmenter entity does not seem right either. How would you solve this problem?
Maybe my thoughts are totally incorrect, but I would need to make them clear.
Thank you,
Brano.
In an IM application I'd (OTTOMH) treat Message as an Entity with a unique Id.
Combining DDD with DNC could be very powerful here if you treat your Message type as a «moment» archetype, and perhaps also use «role» archetypes (e.g. Chatter) in order for the users of the system to take on more than one «role» if needed.
Furthermore, the concept of "next/prior moment (i.e. message)" seems to be a perfect match.
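As a rough illustration of "Message as an Entity with a unique Id", here is a minimal Python sketch. The field names (`author`, `text`, `previous_id`) are assumptions and not taken from the question; the point is only that equality follows the Id rather than the content, and that a link to the prior message models the "next/prior moment" idea.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Message:
    """Chat message modelled as an entity: identity comes from its id."""
    author: str
    text: str
    previous_id: Optional[uuid.UUID] = None  # link to the prior message, if any
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def __eq__(self, other):
        # Two messages are the same entity only if their ids match,
        # regardless of author or text.
        return isinstance(other, Message) and self.id == other.id

    def __hash__(self):
        return hash(self.id)
```

Two messages with identical text are therefore still different entities, which is what makes a repository for them reasonable, unlike for value objects.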

Resources