Do a synchronous http client fetch within an actix thread - rust

I have an actix endpoint, and I need to do a synchronous http client fetch to get some results, and return some data. My endpoints cannot use async, so I can't use any .await methods.
I've tried using reqwest's blocking client in my endpoint like so:
{
    // ...
    let res = reqwest::blocking::get(&fetch_url)?
        .json::<MyResp>()?;
    // ...
}
But it gives me the error:
thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.9/src/runtime/enter.rs:19:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

You should try creating a new thread for that:
std::thread::spawn(move || {
    reqwest::blocking::get(&url).unwrap().json().unwrap()
})
.join()
.unwrap()
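For context, here is a minimal self-contained sketch of that idea (MyResp and fetch_url come from the question; this assumes reqwest with its blocking and json features plus serde's derive feature, and the helper name fetch_sync is mine):
use serde::Deserialize;

#[derive(Deserialize)]
struct MyResp {
    // fields omitted
}

// Run the blocking client on a plain OS thread so it never runs on
// (or nests a runtime inside) the async worker thread driving actix.
fn fetch_sync(fetch_url: String) -> Result<MyResp, reqwest::Error> {
    std::thread::spawn(move || -> Result<MyResp, reqwest::Error> {
        reqwest::blocking::get(&fetch_url)?.json::<MyResp>()
    })
    .join()
    .expect("fetch thread panicked")
}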

I couldn't figure out how to get it working with reqwest (it must have some weird conflicts with actix), but for some reason it worked fine with chttp.
chttp::get(&fetch_url)?.text()?;

You cannot use blocking functions inside async functions.
Instead of reqwest::blocking::get(), use reqwest::get().await.
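If the handler can be made async after all, the non-blocking client sidesteps the nested-runtime panic entirely. A sketch, reusing MyResp from the question (again assuming a serde::Deserialize impl on it and reqwest's json feature):
async fn fetch(fetch_url: &str) -> Result<MyResp, reqwest::Error> {
    reqwest::get(fetch_url).await?.json::<MyResp>().await
}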

Related

In a Rust Tonic gRPC server, how to close the network connection after receiving a malicious request?

Rust Tonic generates the following interface for a simple "hello-world" application:
pub trait HelloworldService: Send + Sync + 'static {
    async fn sayhello(
        &self,
        request: tonic::Request<super::UserInput>,
    ) -> Result<tonic::Response<super::UserInputResponse>, tonic::Status>;
}
After implementing function sayhello and starting a tonic server, everything works as expected.
My question is:
If I check the input UserInput object and decide that the current user input is malicious (say, it contains an empty security token), I'd like to immediately close the network connection without sending any response (not even an error message/code) to the client side. How can I do that?
Tonic doesn't seem to have an API to access the underlying network connection. I had to dig all the way down to hyper's h2 server (hyper/src/proto/h2/server.rs). In the implementation of struct H2Stream (impl<F, B, E> H2Stream<F, B> where ...), there is a function poll2(), which is called to poll the state of the H2Stream. That function contains the following piece of code, which apparently inserts the current time into the response header before sending it over to the client.
// set Date header if it isn't already set...
res.headers_mut()
    .entry(::http::header::DATE)
    .or_insert_with(date::update_and_header_value);
This leads to the following idea: add a flag to the response returned from HelloworldService::sayhello() when a malicious request is detected, then check that flag in H2Stream::poll2(). If the flag is present, return immediately from H2Stream::poll2() with return Poll::Ready(Ok(())); this drops the stream and hence the connection.
Http response Extensions are used to carry that flag. It is set by sayhello() in the Extensions field of tonic::Response, and that field is carried over to the Extensions field of http::response::Response, which is checked by H2Stream::poll2().
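For illustration, a minimal self-contained sketch of that flag mechanism using the http crate directly (MaliciousRequest is a hypothetical marker type; the corresponding check would live in the patched H2Stream::poll2()):
// Hypothetical marker type carried in the response Extensions.
#[derive(Clone, Copy)]
struct MaliciousRequest;

fn main() {
    let mut res = http::Response::new(());

    // Producer side (sayhello): mark the response when the request looks malicious.
    res.extensions_mut().insert(MaliciousRequest);

    // Consumer side (the patched H2Stream::poll2()): drop the stream instead of
    // sending the response if the marker is present.
    if res.extensions().get::<MaliciousRequest>().is_some() {
        // return Poll::Ready(Ok(()));
        println!("would drop the connection here");
    }
}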
This apparent hack seems to work. But is there a better idea?

Tokio thread is not starting / spawning

I'm trying to start a new task to read from a socket client. I'm using the same method, shown below, on both the websocket server and the client to receive from the connection.
The problem is that on the server side the task starts (both log lines are printed), but on the client side it does not start (only the first line is printed).
If I await on the spawn(), I can receive from the client. But then the parent task cannot proceed.
Any pointers for solving this problem?
pub async fn receive_message_from_peer(
    mut receiver: PeerReceiver,
    sender: Sender<IoEvent>,
    peer_index: u64,
) {
    debug!("starting new task for reading from peer : {:?}", peer_index);
    tokio::task::spawn(async move {
        debug!("new thread started for peer receiving");
        // ....
    }); // not awaiting or join!()
}
Do you use tokio::net::TcpListener::from_std(...) to create the listener object? I had the same problem as you: my std_listener object was created based on net2, so there was a scheduling incompatibility problem.
From the description in the newer official documentation https://docs.rs/tokio/latest/tokio/net/struct.TcpListener.html#method.from_std, it seems that tokio currently has better support for socket2.
So I think the issue was that I was using std::thread::sleep() in async code in some places. After switching to tokio::time::sleep(), I didn't need to yield the thread.
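A minimal sketch of the difference (assuming a tokio runtime with the rt, macros and time features enabled):
use std::time::Duration;

#[tokio::main]
async fn main() {
    let handle = tokio::task::spawn(async {
        println!("spawned task started");
    });

    // std::thread::sleep(Duration::from_secs(1)); // blocks the worker thread,
    // so on a current-thread runtime the spawned task may never get to run.

    // tokio::time::sleep yields to the scheduler, so other tasks make progress.
    tokio::time::sleep(Duration::from_secs(1)).await;

    handle.await.unwrap();
}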

Sequentially execute webhooks received in node application

I have a node application using koa. It receives webhooks from an external application about specific resources.
To illustrate, let's say the webhook sends me, via a POST request, an object of this type:
{
  'resource_id': '<SomeID>',
  'resource_origin': '<SomeResourceOrigin>',
  'value': '<SomeValue>'
}
I would like to process webhooks coming from the same origin sequentially, to avoid the related resources getting out of sync during my processing.
I was thinking of using the database as a lock and a cron job to sequentially execute my process for each origin's resources.
But I'm not sure it's the most efficient method.
So my question is:
Do you know of a method/package/service providing per-origin queues, so that resources from the same origin are processed sequentially without forcing all webhooks to be processed sequentially? If it doesn't rely on a database, even better.
If I were you I would start by serializing the handling of all your webhooks. In other words, I suggest you handle them one at a time no matter their origin. Use a simple queue inside your nodejs application.
(Once you've convinced yourself that works correctly, you can then serialize them based on origin.)
First, structure your function (let's call it handleOneWebhook()) for handling incoming webhooks as a Promise or an async function. Then you can invoke it using code with this outline.
let busy = false

async function handleManyWebhooks (queue) {
  if (busy) return
  busy = true
  while (queue.length > 0) {
    const item = queue.shift()
    await handleOneWebhook (item)
  }
  busy = false
}
The queue you pass to handleManyWebhooks is a simple array, where each element is the object from a POST request. You use it as a queue: push() each object to put it into the queue, and shift() to remove it.
Then, whenever you receive a webhook POST object you use code with this outline.
const queue = []
...
function handlePostObject (postObject) {
  queue.push(postObject)
  handleManyWebhooks (queue)
}
Even though you call handleManyWebhooks once for each incoming object, the busy flag makes sure it handles only one at a time.
Notice this is a very simple solution. Once you have it working correctly, two possible refinements suggest themselves.
Use something more efficient for your queue than a simple array; shift() is not very fast.
Create a separate queue object with its own busy flag for each separate origin, as sketched below. Then you will be able to parallelize the handling of webhooks from different origins while still serializing the stream of webhooks from each origin.
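A minimal sketch of that per-origin refinement (handleOneWebhook is the same assumed function as above, and resource_origin is the field from the question's payload):
const queues = new Map() // origin -> { items: [], busy: false }

async function handleOriginQueue (q) {
  if (q.busy) return
  q.busy = true
  while (q.items.length > 0) {
    const item = q.items.shift()
    await handleOneWebhook (item)
  }
  q.busy = false
}

function handlePostObject (postObject) {
  const origin = postObject.resource_origin
  if (!queues.has(origin)) queues.set(origin, { items: [], busy: false })
  const q = queues.get(origin)
  q.items.push(postObject)
  handleOriginQueue(q)
}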
Solution I decided to use
Brief summary of the discussion
As Ivan Rubinson pointed out, my problem is just a producer-consumer problem.
So I finally chose to use RabbitMQ because I have a huge number of webhooks to process. For people with a small number of requests to process who don't want to use external tools, O. Jones's answer is a really good way to solve the problem.
Solution design
I finally installed and configured a RabbitMQ server, then created one queue for each origin of my webhooks.
Producer
On the producer side, when I receive the webhook data I send a message to the queue corresponding to the webhook's origin, containing only the serialized information needed for processing (in fact, the id of the row in the database), to keep messages as light as possible.
Consumer
On the consumer side, I create a consumer function for each origin queue and set the prefetch to one, so messages are processed one by one in each queue. Finally, I configure the channel to wait for an acknowledgement before delivering the next message. With this configuration, consumers proceed message by message, which solves the initial problem.
Implementation
Producer
const amqp = require('amqplib');

async function create() {
    const conn = await amqp.connect(RBMQ_CONNECTION_STRING);
    global.channel_publisher = await conn.createChannel();
}

async function sendtask(queue, task) {
    if (!global.channel_publisher) {
        await create();
    }
    await global.channel_publisher.assertQueue(queue);
    global.channel_publisher.sendToQueue(queue, Buffer.from(task));
}
I call the sendtask(queue, task) function at the place where I receive my webhook.
Consumer
const amqp = require('amqplib');

async function create() {
    const conn = await amqp.connect(RBMQ_CONNECTION_STRING);
    const ch = await conn.createChannel();
    ch.prefetch(1);
    global.channel_consumer = ch;
}

async function consumeTask(queue) {
    if (!global.channel_consumer) {
        await create();
    }
    await global.channel_consumer.assertQueue(queue);
    global.channel_consumer.consume(queue, async (message) => {
        const args = message.content.toString().split(';');
        await processWebhooks(args);
        global.channel_consumer.ack(message);
    });
}
I call consumeTask(queue) whenever I have to process webhooks from a new origin. I also use it to initialize my application with all the origins already known in the database.

Get the output of the remote execution of an SSH command

I work with Express.js, and to execute a remote SSH command I use 'simple-ssh'. This code executes the command, but I could not get at the output outside this block.
ssh.exec('ls Documents/versions', {
    out: function(stdout) {
        arrayOfVersion = stdout.split("\n");
    }
}).start();
How can I get the content of arrayOfVersion and manipulate it afterwards?
Your function creates arrayOfVersion asynchronously; you won't be able to access it outside that scope without some sort of waiting mechanism that waits until the variable has a value.
You can do this in a few ways, to begin with I would recommend researching how nodejs handles async functions as this is a big part of nodejs. Generally you would use one of the following: callbacks, promises, or async/await.
With any of those techniques, you should be able to run your SSH code and then continue on with the result of the stdout.
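For example, here is a sketch that wraps the call in a Promise so the result can be awaited (it assumes simple-ssh's exec options also accept err and exit callbacks, which is worth checking against the library's documentation):
// Resolve with the parsed output once the remote command finishes.
function listVersions(ssh) {
    return new Promise((resolve, reject) => {
        let output = '';
        ssh.exec('ls Documents/versions', {
            out: (stdout) => { output += stdout; },
            err: (stderr) => { reject(new Error(stderr)); },
            exit: () => { resolve(output.split('\n')); }
        }).start();
    });
}

// Usage inside an async route handler:
// const arrayOfVersion = await listVersions(ssh);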

Is there any risk to read/write the same file content from different 'sessions' in Node JS?

I'm new to Node JS and I wonder whether the code snippets below have a multi-session problem.
Consider I have Node JS server (express) and I listen on some POST request:
app.post('/sync/:method', onPostRequest);

var onPostRequest = function(req, res) {
    // parse request and fetch email list
    var emails = [....]; // pseudocode
    doJob(emails);
    res.status(200).end('OK');
};
function doJob(_emails) {
    try {
        let emailsFromFile = fs.readFileSync(FILE_PATH, "utf8") || {};
        if (_.isString(emailsFromFile)) {
            emailsFromFile = JSON.parse(emailsFromFile);
        }
        _emails.forEach(function(_email) {
            if (!emailsFromFile[_email]) {
                emailsFromFile[_email] = 0;
            } else {
                emailsFromFile[_email] += 1;
            }
        });
        // write the object back
        fs.writeFileSync(FILE_PATH, JSON.stringify(emailsFromFile));
    } catch (e) {
        console.error(e);
    }
}
So the doJob method receives the _emails list, and I update (counter + 1) these emails in the emailsFromFile object loaded from the file.
Suppose I get 2 requests at the same time and doJob is triggered twice. I'm afraid that while one request has loaded emailsFromFile from the file, the second request might change the file's content.
Can anybody shed some light on this issue?
Because the code in the doJob() function is all synchronous, there is no risk of multiple requests causing a concurrency problem.
If you were using async IO in that function, then there would be possible concurrency issues.
To explain, Javascript in node.js is single threaded. So, there is only one thread of Javascript execution running at a time and that thread of execution runs until it returns back to the event loop. So, any sequence of entirely synchronous code like you have in doJob() will run to completion without interruption.
If, on the other hand, you use any asynchronous operations such as fs.readFile() instead of fs.readFileSync(), then that thread of execution will return back to the event loop at the point you call fs.readFile(), and another request can run while the file is being read. If that were the case, then you could end up with two requests conflicting over the same file. In that case, you would have to implement some form of concurrency protection (some sort of flag or queue). This is the type of thing that databases offer lots of features for.
I have a node.js app running on a Raspberry Pi that uses lots of async file I/O, and I can get conflicts in that code from multiple requests. I solved it by setting a flag any time I'm writing to a specific file; any other request that wants to write to that file first checks that flag, and if it is set, the request goes into my own queue and is served when the prior request finishes its write operation. There are many other ways to solve this too. If it happens in a lot of places, it is probably worth just getting a database that offers features for this type of write contention.
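For illustration, a minimal sketch of that flag-plus-queue idea applied to an async version of the question's doJob (naming and update logic mirror the question; error handling is kept simple):
const fsp = require('fs/promises');

let writing = false;
const pending = []; // email lists waiting for their turn at the file

async function doJobAsync(_emails) {
    if (writing) {
        pending.push(_emails); // another request is using the file; queue this one
        return;
    }
    writing = true;
    try {
        let emailsFromFile = {};
        try {
            emailsFromFile = JSON.parse(await fsp.readFile(FILE_PATH, "utf8"));
        } catch (e) {
            // missing or empty file: start from an empty object
        }
        _emails.forEach(function(_email) {
            if (!emailsFromFile[_email]) {
                emailsFromFile[_email] = 0;
            } else {
                emailsFromFile[_email] += 1;
            }
        });
        await fsp.writeFile(FILE_PATH, JSON.stringify(emailsFromFile));
    } finally {
        writing = false;
        if (pending.length > 0) {
            doJobAsync(pending.shift()); // serve the next queued request
        }
    }
}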
