How to unit test method using Stream.generate?

I've been spending the last couple of hours trying to figure out what's the best way to test this piece of code.
void consume() {
    executorService.execute(() -> Stream.generate(this::takeFromQueue)
            .filter(Optional::isPresent)
            .map(Optional::get)
            .forEach(messageSender::send));
}

private Optional<Message> takeFromQueue() {
    try {
        return Optional.ofNullable(queue.take());
    } catch (InterruptedException e) {
        log.error("Queue consumer interrupted.");
        return Optional.empty();
    }
}
The idea of the code is to provide a consumer of a BlockingQueue that runs in a separate thread until the application ends.
In this situation I cannot mock the executorService, because the test would hang waiting for the thread to finish an infinite stream. If I let the execution run in a separate thread, the test won't be deterministic, and I would need to rely on a Thread.sleep to give it time to consume at least one message from the queue.
Any ideas?
Thanks.

I would suggest changing the method consume() to consume(Stream)
and calling it in the test with a Stream of fixed size.
In your real code, then call it with consume(Stream.generate(this::takeFromQueue))
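
A minimal sketch of this suggestion, under simplifying assumptions: Message is replaced by String, messageSender by a plain Consumer<String>, and the class name QueueConsumer and its constructor are hypothetical. Passing Runnable::run as the executor makes the work run on the calling thread, so a finite stream completes deterministically without Thread.sleep.

```java
import java.util.Optional;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical, simplified version of the class under test.
class QueueConsumer {
    private final Executor executor;
    private final Consumer<String> messageSender;

    QueueConsumer(Executor executor, Consumer<String> messageSender) {
        this.executor = executor;
        this.messageSender = messageSender;
    }

    // The stream is injected: production code can pass an infinite
    // Stream.generate(...), while a test passes a finite Stream.of(...).
    void consume(Stream<Optional<String>> source) {
        executor.execute(() -> source
                .filter(Optional::isPresent)
                .map(Optional::get)
                .forEach(messageSender));
    }
}
```

In a test, constructing new QueueConsumer(Runnable::run, sent::add) with a List<String> sent and calling consume on a stream of two present values and one empty Optional should leave exactly the two present values in sent, in order.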

Related

Kotlin: Why isn't job.invokeOnCompletion() block running on main thread?

In my Android application I have code that should run periodically in its own coroutine and should be cancelable.
For this I have the following functions:
startJob(): Initializes the job, sets up invokeOnCompletion() and starts the work loop in the respective scope
private fun startJob() {
    if (::myJob.isInitialized && myJob.isActive) {
        return
    }
    myJob = Job()
    myJob.invokeOnCompletion {
        it?.message.let {
            var msg = it
            if (msg.isNullOrBlank()) {
                msg = "Job stopped. Reason unknown"
            }
            myJobCompleted(msg)
        }
    }
    CoroutineScope(Dispatchers.IO + myJob).launch {
        workloop()
    }
}
workloop(): The main work loop. Do some work in a loop with a set delay in each iteration:
private suspend fun workloop() {
    while (true) {
        // doing some stuff here
        delay(setDelayInMilliseconds)
    }
}
myJobCompleted: do some finalizing. For now simply log a message for testing.
private fun myJobCompleted(msg: String) {
    try {
        mainActivityReference.logToGUI(msg)
    } catch (e: Exception) {
        println("debug: " + e.message)
    }
}
Running this and calling myJob.cancel() will throw the following exception in myJobCompleted():
debug: Only the original thread that created a view hierarchy can touch its views.
I'm curious as to why this code isn't running on the main thread, since startJob() IS called from the main thread?
Furthermore: is there an option similar to using a CancellationTokenSource in C#, where the job is not cancelled immediately, but a cancellation request can be checked in each iteration of the while loop?
Immediately breaking off the job, regardless of what it is doing (although it will pretty much always be waiting in the delay when cancelled), doesn't seem like a good idea to me.

It is not the contract of Job.invokeOnCompletion to run on the same thread where Job is created. Moreover, such a contract would be impossible to implement.
You can't expect an arbitrary piece of code to run on an arbitrary thread just because there was some earlier method invocation on that thread. The ability of the Android main GUI thread to execute code submitted from the outside is special, and involves the existence of a top-level event loop.
In the world of coroutines, what controls thread assignment is the coroutine context, while clearly you are outside of any context when creating the job. So the way to fix it is to explicitly launch(Dispatchers.Main) a coroutine from within invokeOnCompletion.
About your question on cancellation: you can use withContext(NonCancellable) to surround the part of the code you want to protect from cancellation.

Interrupt parallel Stream execution

Consider this code:
Thread thread = new Thread(() -> tasks.parallelStream().forEach(Runnable::run));
tasks is a list of Runnables that should be executed in parallel.
When we start this thread and it begins executing, we may need, depending on some calculations, to interrupt (cancel) all of those tasks.
Interrupting the thread only stops one of the executions. How do we handle the others? Or maybe streams should not be used this way? Or do you know a better solution?

You can use a ForkJoinPool to interrupt the threads:
@Test
public void testInterruptParallelStream() throws Exception {
    final AtomicReference<InterruptedException> exc = new AtomicReference<>();
    final ForkJoinPool forkJoinPool = new ForkJoinPool(4);
    // use the pool with a parallel stream to execute some tasks
    forkJoinPool.submit(() -> {
        Stream.generate(Object::new).parallel().forEach(obj -> {
            synchronized (obj) {
                try {
                    // task that is blocking
                    obj.wait();
                } catch (final InterruptedException e) {
                    exc.set(e);
                }
            }
        });
    });
    // wait until the stream got started
    Thread.sleep(500);
    // now we want to interrupt the task execution
    forkJoinPool.shutdownNow();
    // wait for the interrupt to occur
    Thread.sleep(500);
    // check that we really got an interruption in the parallel stream threads
    assertTrue(exc.get() instanceof InterruptedException);
}
The worker threads do really get interrupted, terminating a blocking operation. You can also call shutdown() within the Consumer.
Note that these sleeps are not tuned for a proper unit test; you may have a better way to wait only as long as necessary. But they are enough to show that it works.
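
One way to "wait as necessary" is to replace the sleeps with latches. The following is a sketch under assumptions, not the answer's original test: the class and method names are hypothetical, a finite IntStream replaces the infinite stream so that workers can finish, and a latch that is never released stands in for the blocking work. It relies on the same behavior as above, namely that a parallel stream started from inside a ForkJoinPool task runs on that pool's workers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.IntStream;

class InterruptParallelStream {
    // Returns the InterruptedException seen by a worker, or null if none
    // arrived within the bounded wait.
    static InterruptedException interruptBlockedWorkers() throws InterruptedException {
        AtomicReference<InterruptedException> exc = new AtomicReference<>();
        CountDownLatch started = new CountDownLatch(1);     // a task is about to block
        CountDownLatch interrupted = new CountDownLatch(1); // a task saw the interrupt
        CountDownLatch never = new CountDownLatch(1);       // never released: stand-in for blocking work
        ForkJoinPool pool = new ForkJoinPool(4);

        pool.submit(() -> IntStream.range(0, 4).parallel().forEach(i -> {
            started.countDown();
            try {
                never.await();
            } catch (InterruptedException e) {
                exc.set(e);
                interrupted.countDown();
            }
        }));

        started.await();                         // replaces the first Thread.sleep(500)
        pool.shutdownNow();                      // interrupts the pool's worker threads
        interrupted.await(5, TimeUnit.SECONDS);  // bounded wait replaces the second sleep
        return exc.get();
    }
}
```

If the interrupt arrives before a worker reaches never.await(), the await throws immediately because the interrupt flag is already set, so the ordering between countDown and shutdownNow does not need extra synchronization.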

You aren't actually running the Runnables on the Thread you are creating. You are running a thread which will submit to a pool, so:
Thread thread = new Thread(() -> tasks.parallelStream().forEach(Runnable::run));
Roughly speaking, in this example you are doing the following:
List<Runnable> tasks = ...;
Thread thread = new Thread(new Runnable() {
    public void run() {
        for (Runnable r : tasks) {
            ForkJoinPool.commonPool().submit(r);
        }
    }
});
This is because you are using a parallelStream, which delegates to the common pool when handling parallel executions. The calling thread also executes some of the tasks itself, which would explain why interrupting it affects only one of the executions.
As far as I know, you cannot get a handle on the threads that are executing your tasks with a parallelStream, so you may be out of luck. You can always do tricky stuff to get hold of the threads, but it probably isn't the best idea.

Something like the following should work for you:
AtomicBoolean shouldCancel = new AtomicBoolean();
...
tasks.parallelStream().allMatch(task -> {
    task.run();
    return !shouldCancel.get();
});
The documentation for the method allMatch specifically says that it "may not evaluate the predicate on all elements if not necessary for determining the result." So if the predicate doesn't match when you want to cancel, then it doesn't need to evaluate any more. Additionally, you can check the return result to see if the loop was cancelled or not.
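
A self-contained version of the same idea, as a sketch; the class name CancellableTasks and the method runAll are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class CancellableTasks {
    // Runs the tasks on a parallel stream. Returns true if every task ran
    // without the flag being observed, false if the flag was seen after
    // some task finished (the stream may then skip the remaining tasks).
    static boolean runAll(List<Runnable> tasks, AtomicBoolean shouldCancel) {
        return tasks.parallelStream().allMatch(task -> {
            task.run();
            return !shouldCancel.get();
        });
    }
}
```

Note that this only prevents tasks that have not started yet from running. A task that is already executing is not interrupted, and because short-circuiting is best-effort in a parallel stream, a few additional tasks may still start after the flag is set.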

Kill Spring scheduled thread

Is there any way to time out a scheduled task (kill its thread) in Spring if the task takes too long or even hangs because a remote resource is unavailable?
In my case, the tasks are based on an HtmlUnitDriver (Selenium) sequence of steps that hangs from time to time, so I would like to set a time limit for the thread's execution, something like 1 minute at most.
I set up a fixed-rate execution of 5 minutes with an initial delay of 1 minute.
Thanks in advance

I did the same thing some time ago by following an example.
The basic idea is to put your code in a class implementing Callable or Runnable, then create a FutureTask, wherever you are going to invoke your thread, with the Callable or Runnable instance as its parameter. Define an executor and submit your FutureTask to it; you can then wait for the result for a limited time inside a try/catch block. If the wait ends with a TimeoutException, you know that the task took too long.
Here is my code:
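CallableServiceExecutor callableServiceExecutor = new CallableServiceExecutor();
// Result is a placeholder for whatever type CallableServiceExecutor.call() returns.
FutureTask<Result> task = new FutureTask<>(callableServiceExecutor);
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.execute(task);
boolean exito = true;
Result result = null;
try {
    // Wait at most getTimeoutValidacion() seconds for the task to finish.
    result = task.get(getTimeoutValidacion(), TimeUnit.SECONDS);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // preserve the interrupt status
    exito = false;
} catch (ExecutionException | TimeoutException e) {
    exito = false;
}
// Interrupts the worker if it is still running; has no effect if it already finished.
task.cancel(true);
executor.shutdown();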

See: How to timeout a thread
The short answer is that there is no easy or reliable way to kill a thread, due to the limitations of Java's thread implementation. ExecutorService#shutdown() is sort of a hack, and heavy. It's best to deal with this in the task itself, e.g. at the network request level: if you're making a REST request, set a timeout on the socket.
Or, better, if you do some sort of message passing à la the Actor model (see Akka), you can send a message from a "supervisor" telling the Actor to die. Avoiding blocking by using something like Netty will also help.
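
To illustrate the "deal with it in the task itself" advice, here is a hedged sketch that uses only the JDK's HttpURLConnection. It does not cover Selenium or HtmlUnitDriver, which have their own timeout settings. The class name TimedRequest and the helper open are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class TimedRequest {
    // Prepares (but does not yet open) an HTTP connection whose connect and
    // read phases each give up after timeoutMillis instead of hanging.
    static HttpURLConnection open(String url, int timeoutMillis) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(timeoutMillis); // max time to establish the connection
        conn.setReadTimeout(timeoutMillis);    // max time to block waiting for data on a read
        return conn;
    }
}
```

With these set, a stalled remote resource surfaces as a java.net.SocketTimeoutException inside the task, which the scheduled method can catch and handle, rather than as a thread that has to be killed from outside.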

Async Logger. Can I lose/delay log entries?

I'm implementing my own logging framework. The following is my BaseLogger, which receives log entries and pushes them to the actual logger, which implements the abstract Log method.
I originally used the C# TPL to log asynchronously, but I now use threads instead. (A TPL task doesn't hold a real thread, so if all threads of the application end, the tasks stop as well, which causes all 'waiting' log entries to be lost.)
public abstract class BaseLogger
{
    // ... Omitted properties, constructor, etc. ... //

    public virtual void AddLogEntry(LogEntry entry)
    {
        if (!AsyncSupported)
        {
            // The underlying logger doesn't support async.
            // Simply call the Log method and return.
            Log(entry);
            return;
        }
        // Logger supports async.
        LogAsync(entry);
    }

    private void LogAsync(LogEntry entry)
    {
        lock (LogQueueSyncRoot) // Make sure we have a lock before accessing the queue.
        {
            LogQueue.Enqueue(entry);
        }
        if (LogThread == null || LogThread.ThreadState == ThreadState.Stopped)
        { // Either the thread is completed, or this is the first time we're logging to this logger.
            LogThread = new Thread(new ThreadStart(() =>
            {
                while (true)
                {
                    LogEntry logEntry;
                    lock (LogQueueSyncRoot)
                    {
                        if (LogQueue.Count > 0)
                        {
                            logEntry = LogQueue.Dequeue();
                        }
                        else
                        {
                            break;
                            // Is it possible for a message to be added
                            // right after the break, when I leave the lock {} but
                            // before I exit the loop and the thread gets 'completed'?
                        }
                    }
                    Log(logEntry);
                }
            }));
            LogThread.Start();
        }
    }

    // Actual logger implementations will implement this method.
    protected abstract void Log(LogEntry entry);
}
Note that AddLogEntry can be called from multiple threads at the same time.
My question is: is it possible for this implementation to lose log entries?
I'm worried that a log entry could be added to the queue right after my thread leaves the loop via the break statement in the else clause and exits the lock block, while the thread is still in the 'Running' state.
I do realize that, because I'm using a queue, even if I miss an entry, the next request to log will push the missed entry as well. But this is not acceptable, especially if it happens for the last log entry of the application.
Also, please let me know whether and how I can implement the same thing more cleanly using the new C# 5.0 async and await keywords. I don't mind requiring .NET 4.5.
Thanks in Advance.

While you could likely get this to work, in my experience I'd recommend using an existing logging framework if possible :) For instance, there are various options for async logging/appenders with log4net, such as this async appender wrapper thingy.
Otherwise, IMHO, since you're going to be blocking a threadpool thread during your logging operation anyway, I would instead just start a dedicated thread for your logging. You seem to be kind of going for that approach already, just via Task so that you don't hold a threadpool thread when nothing is being logged. However, I think the simpler implementation you get from a dedicated thread is worth it.
Once you have a dedicated logging thread, you only need an intermediate ConcurrentQueue. At that point, your log method just adds to the queue, and your dedicated logging thread just runs the while loop you already have. You can wrap the queue in a BlockingCollection if you need blocking/bounded behavior.
Having the dedicated thread as the only writer eliminates any possibility of multiple threads/tasks pulling entries off the queue and trying to write them at the same time (a painful race condition). Since the log method now just adds to a collection, it doesn't need to be async and you don't need to deal with the TPL at all, making it simpler and easier to reason about (and hopefully in the category of 'obviously correct' or thereabouts :) ).
This 'dedicated logging thread' approach is what I believe the log4net appender I linked to does as well, FWIW, in case that helps serve as an example.

I see two race conditions off the top of my head:
You can spin up more than one Thread if multiple threads call AddLogEntry. This won't cause lost events but is inefficient.
Yes, an event can be queued while the Thread is exiting, and in that case it would be "lost".
Also, there's a serious performance issue here: unless you're logging constantly (thousands of times a second), you're going to be spinning up a new Thread for each log entry. That will get expensive quickly.
Like James, I agree that you should use an established logging library. Logging is not as trivial as it seems, and there are already many solutions.
That said, if you want a nice .NET 4.5-based approach, it's pretty easy:
public abstract class BaseLogger
{
    private readonly ActionBlock<LogEntry> block;

    protected BaseLogger(int maxDegreeOfParallelism = 1)
    {
        block = new ActionBlock<LogEntry>(
            entry =>
            {
                Log(entry);
            },
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxDegreeOfParallelism,
            });
    }

    public virtual void AddLogEntry(LogEntry entry)
    {
        block.Post(entry);
    }

    protected abstract void Log(LogEntry entry);
}

Regarding losing waiting messages when the app crashes because of an unhandled exception: I've bound a handler to the AppDomain.CurrentDomain.DomainUnload event. It goes like this:
protected ManualResetEvent flushing = new ManualResetEvent(true);

protected AsyncLogger() // ctor of logger
{
    AppDomain.CurrentDomain.DomainUnload += CurrentDomain_DomainUnload;
}

protected void CurrentDomain_DomainUnload(object sender, EventArgs e)
{
    if (!IsEmpty)
    {
        flushing.WaitOne();
    }
}
Maybe not too clean, but works.

Question about thread synchronisation

I have a question about a threading situation.
Suppose I have three threads: producer, helper, and consumer.
The producer thread is running (the other two are waiting), and when it is done it calls invoke. The problem is that it must invoke only the helper thread, not the consumer. How can it make sure that the resources it releases are picked up by the helper thread first, and only then by the consumer thread?
Thanks in advance.

Have you considered that having separate threads is sometimes more of a problem than a solution?
If you really want the operations in one thread to be strictly serialized with the operations in another thread, perhaps the simpler solution is to discard the second thread and structure the code so the first thread does the operations in the order desired.
This may not always be possible, but it's something to bear in mind.

You could have, for instance, two mutexes (or whatever you are using): one for producer and helper, and another for producer and consumer.
Producer:
//lock helper
while true
{
    //lock consumer
    //do stuff
    //release and invoke helper
    //wait for helper to release
    //lock helper again
    //unlock consumer
    //wait consumer
}
The others just lock and unlock normally.
Another possible approach (maybe better) is to use one mutex for producer / helper and another for helper / consumer; or maybe distribute the helper thread's tasks between the other two threads. Could you give more details?

The helper thread is really just a consumer/producer thread itself. Write some code for the helper like you would for any other consumer to take the result of the producer. Once that's complete write some code for the helper like you would for any other producer and hook it up to your consumer thread.

You might be able to use queues, with locks around them, to help you with this.
Producer works on something, produces it, and puts it on the helper queue.
Helper takes it, does something with it, and then puts it on the consumer queue.
Consumer takes it, consumes it, and goes on.
Something like this:
Queue<MyDataType> helperQ, consumerQ;
object hqLock = new object();
object cqLock = new object();

// producer thread
private void ProducerThreadFunc()
{
    while (true)
    {
        MyDataType data = ProduceNewData();
        lock (hqLock)
        {
            helperQ.Enqueue(data);
        }
    }
}

// helper thread
private void HelperThreadFunc()
{
    while (true)
    {
        MyDataType data;
        lock (hqLock)
        {
            data = helperQ.Dequeue();
        }
        data = HelpData(data);
        lock (cqLock)
        {
            consumerQ.Enqueue(data);
        }
    }
}

// consumer thread
private void ConsumerThreadFunc()
{
    while (true)
    {
        MyDataType data;
        lock (cqLock)
        {
            data = consumerQ.Dequeue();
        }
        Consume(data);
    }
}
NOTE: You will need to add more logic to this example to make it usable. Don't expect it to work as-is. Mainly, use signals so that one thread can let the other know that data is available in its queue (or, as a worst case, poll the size of the queue and sleep while it is 0; signals are cleaner and more efficient).
This approach would let you process data at different rates (which can lead to memory issues).
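
The question does not name a language, so here is the same producer, helper, consumer pipeline sketched in Java with java.util.concurrent.BlockingQueue, which supplies the signalling the note above asks for. Everything here is an illustrative assumption: the class name Pipeline, the STOP marker, Integer as the data type, and doubling as the helper's work. Bounded queues (capacity 16, chosen arbitrarily) make a fast producer block instead of growing memory without limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class Pipeline {
    private static final int STOP = -1; // hypothetical end-of-data marker; inputs must be >= 0

    // Runs producer -> helper -> consumer and returns what the consumer received.
    static List<Integer> run(List<Integer> inputs) throws InterruptedException {
        BlockingQueue<Integer> helperQ = new LinkedBlockingQueue<>(16);
        BlockingQueue<Integer> consumerQ = new LinkedBlockingQueue<>(16);
        List<Integer> consumed = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int value : inputs) helperQ.put(value); // blocks while the queue is full
                helperQ.put(STOP);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread helper = new Thread(() -> {
            try {
                // take() blocks until the producer has supplied an item
                for (int v = helperQ.take(); v != STOP; v = helperQ.take()) {
                    consumerQ.put(v * 2); // stand-in for the helper's processing
                }
                consumerQ.put(STOP);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int v = consumerQ.take(); v != STOP; v = consumerQ.take()) {
                    consumed.add(v);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start(); helper.start(); consumer.start();
        producer.join(); helper.join(); consumer.join();
        return consumed; // safe to read: join() orders the consumer's writes before this point
    }
}
```

Because the consumer only ever reads from consumerQ, and only the helper writes to it, each item necessarily passes through the helper before the consumer sees it, without any explicit lock ordering.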
