I wanted to learn Rust and so I decided to use it for a real-world project.
The idea is to have a server that (a minimal sketch of this shape follows the list):
from the main thread A, spawns a new thread B that performs some async task producing a stream of values through time
receives client websocket connections [c, d, e, ...] asynchronously and handles them concurrently, spawning new threads [C, D, E, ...]
sends the values produced in thread B to threads [C, D, E, ...]
each thread in [C, D, E, ...] publishes the values to its respective client in [c, d, e, ...]
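To make that concrete, here is a hypothetical sketch of the fan-out, using tokio::sync::broadcast purely for illustration (my real code uses tokio mpsc unbounded channels instead, this assumes tokio 1.x, and every name below is made up):

use std::time::Duration;
use tokio::sync::broadcast;

#[tokio::main]
async fn main() {
    // One producer (task B), many consumers ([C, D, E, ...]).
    let (tx, _) = broadcast::channel::<u64>(16);

    // Task B: produces a stream of values through time.
    let producer = tx.clone();
    tokio::spawn(async move {
        let mut i = 0u64;
        loop {
            let _ = producer.send(i); // ignore "no receivers yet" errors
            i += 1;
            tokio::time::sleep(Duration::from_secs(1)).await;
        }
    });

    // Each accepted client [c, d, e, ...] would get its own task; a
    // receiver created by subscribe() only sees values sent after it.
    for client in 0..3 {
        let mut rx = tx.subscribe();
        tokio::spawn(async move {
            while let Ok(v) = rx.recv().await {
                println!("client {} got {}", client, v);
            }
        });
    }

    tokio::time::sleep(Duration::from_secs(5)).await;
}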
I am using
tokio to spawn new threads and tokio::sync::mpsc::unbounded_channel to send the values computed in B to the other threads
tokio_tungstenite to manage websocket connections and send values to the clients
I managed to get a working example where thread B produces integers at fixed time intervals. When the server starts, B starts producing a stream of values [0, 1, 2, 3, ...].
When a new websocket connection is opened, the client will receive the stream of data, starting from the value produced after the connection is opened (so that if the connection starts after the value 3 is produced by B, then the client will receive values from 4 onward).
Here is the catch.
The only way I found for the receiving part of the channel in C to receive values asynchronously (and therefore prevent it from buffering the values and sending them to c only when B is completely done) is to use a loop that I believe consumes 100% of a CPU core.
I noticed that, because of this, every websocket connection consumes 100% of a core (so if there are two connections open, CPU usage will be 200%, and so on).
Here is the loop:
loop {
    while let Ok(v) = rx.try_recv() {
        println!("PRINTER ID [{}] | RECEIVED: {:#?}", addr, v);
        println!("PRINTER ID [{}] | SENDING TO WS: {:#?}", addr, v);
        let mess = Message::Text(v.to_string());
        ws_sender.send(mess).await?;
    }
}
If I use recv() (instead of try_recv()), the values are buffered and released to the websocket only when B is done.
I tried to use futures_channel::unbounded instead of the tokio channel but I have the same buffer problem.
QUESTION: how can I rewrite the above loop to avoid using 100% CPU and stream values to the websocket without blocking?
You can see the tokio server here: https://github.com/ceikit/async_data/blob/master/src/bin/tokio_server.rs
You can test it by spinning up a websocket connection in another terminal window running the client.
I needed to change thread::sleep to futures-timer and std::sync::Mutex to futures::lock::Mutex; then a while-let with recv() works perfectly.
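For reference, a minimal sketch of the resulting receive loop, reusing rx and ws_sender from the snippet above:

// recv().await suspends this task until a value arrives, so the
// runtime schedules other tasks instead of spinning at 100% CPU.
// It only drains values promptly once nothing on the producer side
// blocks the executor (hence futures-timer and futures::lock::Mutex).
while let Some(v) = rx.recv().await {
    let mess = Message::Text(v.to_string());
    ws_sender.send(mess).await?;
}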
Related
I'm having trouble understanding the point of a blocking Observable, specifically blockingForEach()
What is the point of applying a function to an Observable that we will never see? Below, I'm attempting to have my console output in the following order:
this is the integer multiplied by two:2
this is the integer multiplied by two:4
this is the integer multiplied by two:6
Statement comes after multiplication
My current method prints the statement before the multiplication
fun rxTest() {
    val observer1 = Observable.just(1, 2, 3).observeOn(AndroidSchedulers.mainThread())
    val observer2 = observer1.map { response -> response * 2 }
    observer2
        .observeOn(AndroidSchedulers.mainThread())
        .subscribeOn(AndroidSchedulers.mainThread())
        .subscribe { it -> System.out.println("this is the integer multiplied by two:" + it) }
    System.out.println("Statement comes after multiplication ")
}
Now I have changed my method to include blockingForEach():
fun rxTest() {
    val observer1 = Observable.just(1, 2, 3).observeOn(AndroidSchedulers.mainThread())
    val observer2 = observer1.map { response -> response * 2 }
    observer2
        .observeOn(AndroidSchedulers.mainThread())
        .subscribeOn(AndroidSchedulers.mainThread())
        .blockingForEach { it -> System.out.println("this is the integer multiplied by two:" + it) }
    System.out.println("Statement comes after multiplication ")
}
1.) What happens to the transformed observables once they are no longer blocking? Wasn't that just unnecessary work, since we never see those Observables?
2.) Why does my System.out.println("Statement...") appear before my observables when I'm subscribing? It's like observer2 skips its blocking method, makes the System.out call, and then resumes its subscription.
It's not clear what you mean by your statement that you will "never see" values emitted by an observer chain. Each value emitted in the observer chain is seen by the observers downstream of the point where it is emitted. The point where you subscribe to the observer chain is the usual place to perform a side effect, such as printing a value or storing it into a variable. Thus, the values are always seen.
In your examples, you are getting confused by how the schedulers work. When you use the observeOn() or subscribeOn() operators, you are telling the observer chain to emit values on a different thread. When you move data between threads, the destination thread has to be able to process the data. If your main code is running on that same thread, you can lock yourself out, or operations will be re-ordered (which is why your print statement runs before the queued emissions).
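To see the scheduler's effect in isolation, compare with a version that never hops threads; a plain-JVM RxJava sketch (no Android schedulers, so everything runs synchronously on the calling thread):

fun rxTestSync() {
    // Without observeOn()/subscribeOn(), Observable.just emits
    // synchronously, so subscribe() completes every emission before
    // the next statement runs and the output order is as expected.
    Observable.just(1, 2, 3)
        .map { response -> response * 2 }
        .subscribe { it -> System.out.println("this is the integer multiplied by two:" + it) }
    System.out.println("Statement comes after multiplication ")
}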
Normally, the use of blocking operations is strongly discouraged. Blocking operations can often be used when testing, because you have full control of the consequences. There are a couple of other situations where blocking may make sense. An example would be an application that requires access to a database or other resource; the application has no purpose without that resource, so it blocks until it becomes available or a timeout occurs, kicking it out.
I have a very long list of data; let's assume it looks like this:
[(a, a, 1),
(b, b, 1),
(c, c, 1),
(d, d, 1),
(e, e, 1),
(f, f, 1),
(g, g, 1),
(h, h, 1),
(i, i, 1),]
I am trying to use multithreading as follows:
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
pool.starmap(help_func, data)
help_func is as follows:
def help_func(in_vala, in_valb, in_valc):
    print("asking for " + str(in_vala) + " asking for " + str(in_valb))
    receiver(in_vala)
and receiver is a simple test function, like so:
def receiver(group):
    print(group)
When I run my program, I can see that the output from help_func is correct, i.e., it enumerates the values of data.
However, when I look at the values generated at the receiver(), I notice some weird prints which look like:
a
b
c
de
e
f
gh
i
I am struggling to see why this might be the case. Something goes wrong when calling receiver, perhaps because receiver is non-blocking?
How should I get around this issue?
Also, when I use ThreadPool(1), I do not see this issue. My actual problem has a much larger function that is called from help_func, so I would like to run it under multiple threads ideally.
You are encountering a classic concurrency problem: everything you think is atomic is not. The print function actually prints two things: the data you pass to it and the end argument, which defaults to "\n".
So that concatenation is the result of one thread writing its data, then another thread writing its data, then both writing their newlines.
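A minimal sketch of one common fix, reusing receiver from the question (the lock name is mine): serialize access to stdout so a value and its trailing newline are always written together.

import threading

print_lock = threading.Lock()  # illustrative name, not from the question

def receiver(group):
    # While the lock is held, no other worker can interleave its own
    # output between this value and its newline.
    with print_lock:
        print(group)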
It is all explained much better in this Raymond Hettinger talk.
P.S.: I hope that you're aware of the Python GIL. In short: only one thread can execute Python bytecode at a time. If you want to speed up execution of your function, use multiprocessing; multithreading is useful when your threads are blocked most of the time (for example, networking is mostly waiting for packets to arrive, so threads are fine for that).
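Because multiprocessing.dummy deliberately mirrors the multiprocessing API, switching to real processes is nearly a one-line change; a sketch, assuming data and help_func from the question:

from multiprocessing import Pool  # real processes, not threads

if __name__ == "__main__":  # required where workers are spawned, e.g. Windows
    with Pool(4) as pool:
        pool.starmap(help_func, data)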
The application works simply: when it receives a request, it spawns thread_listener, loops 10 times, and passes the index (i) to it.
ndc_thread takes this data (i) and returns it to core.
Core loops, waiting for a message from the thread; when it receives one, it sends a chunk containing the message returned by the thread.
The problem is that after the 1..10 loop executes, the function sends the "End" chunk and closes the connection.
So the output from curl http://localhost:4000 is: "StartEnd"
The desired result is: "Start12345678910End"
Is there any way to keep the connection open and wait, either until a custom timeout or until the processes have finished executing?
defmodule AlivePlug do
  import Plug.Conn

  def init(opts) do
    opts
  end

  def call(conn, _opts) do
    conn = send_chunked(conn, 200)
    chunk(conn, "Start")

    core_pid = spawn_link(fn -> core_listener(conn) end)
    thread = spawn_link(fn -> thread_listener() end)
    1..10 |> Enum.each(fn i -> send thread, {core_pid, i} end)

    chunk(conn, "End")
    conn
  end

  defp core_listener(conn) do
    receive do
      {_status, i} ->
        _conn = chunk(conn, Integer.to_string(i))
        core_listener(conn)
    end
  end

  defp thread_listener do
    receive do
      {core_pid, i} ->
        send core_pid, {:ok, i}
        thread_listener()
      _ ->
        thread_listener()
    end
  end
end
This is a working application; just run it and test with Postman or curl http://localhost:4000:
https://github.com/programisti/keep-alive-elixir
Well, the problem is that you spawn the thread process and send it some data, but you never wait for it to get a chance to work, and then your function that's holding the connection ends.
Let's start with a very naive solution:
1..10 |> Enum.each(fn i -> send thread, {core_pid, i} end)
Process.sleep 5000
chunk(conn, "End")
If we sleep for 5 seconds, that should be enough time for thread to get all its work done. This is a bad solution: if you loop to 10000, 5 seconds might be too short, and for 1..10, 5 seconds is probably way too long. Moreover, you shouldn't trust blind timing: in theory the system might wait 30 seconds one time and 10 ms another time before scheduling thread on the CPU.
Now let's do a real solution, based on the fact that process mailboxes are FIFOs.
First, we'll add another type of message that thread_listener understands:
defp thread_listener do
  receive do
    {core_pid, i} ->
      send core_pid, {:ok, i}
      thread_listener()
    {:please_ack, pid} ->
      send pid, :ack
      thread_listener()
    _ ->
      thread_listener()
  end
end
Next, let's replace that Process.sleep with something smarter:
1..10 |> Enum.each(fn i -> send thread, {core_pid, i} end)
send thread, {:please_ack, self()}
receive do
  :ack ->
    :ok
end
chunk(conn, "End")
Now we're putting an 11th message into thread's mailbox, and then waiting for an ack. Once all 11 of our messages have been processed, we'll get a message back and be able to continue. It doesn't matter if thread handles each message the moment it's sent or if there's a 10 second pause before it even starts. (These numbers are hyperbolic, but I'm trying to emphasize that in multithreaded programming you can't rely on timing.)
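If you also want the custom timeout the question asks about, receive accepts an after clause; a sketch (the 5 second figure is arbitrary):

receive do
  :ack ->
    :ok
after
  5_000 ->
    # Give up if the worker hasn't drained its mailbox in time.
    :timeout
end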
You may now notice that what we've implemented does not actually use processes in a useful way. You spawn two processes off your main process, and they both sit idle until your main process sends them data, whereupon your main process sits idle until the spawned processes are done. It would have been clearer to just do the work in the main process. This example is too trivial to actually need separate processes, and I'm not able to spot how you plan to extend it, but I hope this has given you some general help on programming with processes.
We're creating a chain of actors for every (small) incoming group of messages to guarantee their sequential processing and piping (groups are differentiated by a common id). The problem is that our chain has forks, like A1 -> (A2 -> A3 | A4 -> A5), and we must guarantee no races between messages going through A2 -> A3 and A4 -> A5. The current legacy solution is to block the A1 actor until the current message is fully processed (in one of the sub-chains):
def receive: Receive = { // pseudocode
  case x =>
    // ...
    val f = A2orA4 ? msg
    Await.result(f, timeout) // blocks the actor's thread until the sub-chain is done
}
As a result, the number of threads in the application is directly proportional to the number of messages being processed, no matter whether those messages are active or just asynchronously waiting for some response from an outer service. It has worked for about two years with a fork-join (or any other dynamic) pool, but of course it can't work with a fixed pool, and performance degrades badly under high load. More than that, it affects GC, as every blocked fork-actor holds the previous message's redundant state inside.
Even with backpressure it creates N times more threads than messages received (as there are N sequential forks in the flow), which is still bad, as processing one message takes a long time but not much CPU. So we should process as many messages as we have memory for. The first solution I came up with is to linearize the chain, like A1 -> A2 -> A3 -> A4 -> A5. Is there anything better?
The simpler solution is to store a future for the last received message in the actor's state and chain it with the previous future:
def receive: Receive = process(Future.successful(new Ack)) // start from a completed future

def process(prevAck: Future[Ack]): Receive = { // pseudocode
  case x =>
    // ...
    context become process(prevAck.flatMap(_ => (A2orA4 ? msg).mapTo[Ack]))
}
So it creates a chain of futures without any blocking. The chain is garbage-collected as the futures complete (except the last one).
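A self-contained sketch of the technique (Ack, A1, and next are illustrative names; assumes classic Akka actors and the ask pattern):

import akka.actor.{Actor, ActorRef}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

case class Ack()

// A1 forwards each message to a sub-chain (A2 or A4 in the question)
// and guarantees the next message is sent only after the previous
// one is acked, without ever blocking a thread.
class A1(next: ActorRef) extends Actor {
  import context.dispatcher // ExecutionContext for flatMap
  implicit val timeout: Timeout = Timeout(30.seconds)

  def receive: Receive = process(Future.successful(Ack()))

  def process(prevAck: Future[Ack]): Receive = {
    case msg =>
      // Fire the ask only after the previous message's ack resolves.
      context become process(prevAck.flatMap(_ => (next ? msg).mapTo[Ack]))
  }
}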
In my J2ME app I have an array of doubles containing 5 coordinate values. This array is used inside a thread that continuously checks whether the same values are reported by the GPS.
Once it gets a correct match, I want to pause the thread, remove the matched value from the array, and resume the thread. This should keep happening as long as the array contains coordinate values. Once the array is empty, I want to pause the thread until it gets a new value; when the array gets values again, it should start again.
How should I implement this logic in code?
If it were me, I wouldn't bother pausing the thread. I'd just have it running all the time.
while (true) {
    // Check each remaining coordinate against the current GPS fix;
    // iterate backwards so removal doesn't skip elements.
    for (int i = arrayOfCoordinates.length - 1; i >= 0; i--) {
        if (checkLocation(arrayOfCoordinates[i])) removeFromArray(arrayOfCoordinates[i]);
    }
    try { Thread.sleep(5000); } catch (InterruptedException e) {}
}
I don't see any reason to pause the thread when it's only making the small check you describe.