How to speed up writing points to InfluxDB?

I'm writing points (60k per batch) to InfluxDB.
Each point contains a time (i64) and a value (f64).
The code looks like this:
use influx_db_client::{
    Client, Point, Points, Value, Precision
};
use tokio;
use url::Url; // needed for Url::parse below (assumes the url crate is a dependency)
fn main() {
    let client = Client::new(Url::parse("http://localhost:8086").expect("Cannot parse url"), "test");
    let data: Vec<Point> = get_data();
    tokio::runtime::Runtime::new().unwrap().block_on(async move {
        // Write one batch of 60k points at a time, awaiting each write before starting the next.
        for chunk in data.chunks(60000) {
            let points = Points::create_new(chunk.to_vec());
            client.write_points(points, Some(Precision::Milliseconds), None).await.expect("Cannot write points");
        }
    });
}
It works and takes 6 seconds, but I think it could be faster.
I tried to use a thread pool, but it doesn't work because I can't create a lot of clients.
use influx_db_client::{
    Client, Point, Points, Value, Precision
};
use tokio;
use threadpool::ThreadPool;
use url::Url; // needed for Url::parse below (assumes the url crate is a dependency)
fn main() {
    let n_workers = 4;
    let data: Vec<Point> = get_data();
    let pool = ThreadPool::new(n_workers);
    for chunk in data.chunks(60000) {
        let points = Points::create_new(chunk.to_vec());
        pool.execute(move || {
            // Every job builds its own runtime and its own client.
            tokio::runtime::Runtime::new().unwrap().block_on(async move {
                let client = Client::new(Url::parse("http://localhost:8086").expect("Cannot parse url"), "test");
                client.write_points(points, Some(Precision::Milliseconds), None).await.expect("Cannot write points");
            });
        });
    }
}
So I don't know how to make it faster.
Can I use async for this?
Do you know any optimization tricks?
data contains 1,400,000 points. They are created quickly, in about 1.3 s: get_data() takes 0.627 s and the chunk loop (without calling client.write_points) takes 0.678 s.
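
One possible direction, sketched with the standard library only: keep a single client and let a fixed number of workers pull batches from a shared index, so several writes are in flight at once without creating a client per batch. CountingClient, write_batch and write_parallel are hypothetical names used for illustration; they are not part of influx_db_client, and write_batch only counts points where a real write call would go. Whether this actually helps depends on how the server handles concurrent writes, which is not measured here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for a database client. It only counts points;
/// a real client would send the batch over the network in `write_batch`.
pub struct CountingClient {
    points_written: AtomicUsize,
}

impl CountingClient {
    pub fn new() -> Self {
        CountingClient { points_written: AtomicUsize::new(0) }
    }

    /// Placeholder for the real write call.
    pub fn write_batch(&self, batch: &[(i64, f64)]) {
        self.points_written.fetch_add(batch.len(), Ordering::Relaxed);
    }

    pub fn written(&self) -> usize {
        self.points_written.load(Ordering::Relaxed)
    }
}

/// Splits `data` into batches of `batch_size` (must be > 0) and writes them
/// from `n_workers` threads that all borrow the same client.
pub fn write_parallel(
    client: &CountingClient,
    data: &[(i64, f64)],
    batch_size: usize,
    n_workers: usize,
) {
    let batches: Vec<&[(i64, f64)]> = data.chunks(batch_size).collect();
    // Index of the next batch that no worker has claimed yet.
    let next = AtomicUsize::new(0);
    // Scoped threads may borrow `client`, `batches` and `next`;
    // the scope joins every worker before returning.
    thread::scope(|s| {
        for _ in 0..n_workers {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                match batches.get(i) {
                    Some(batch) => client.write_batch(batch),
                    None => break, // all batches claimed
                }
            });
        }
    });
}
```

Calling write_parallel(&client, &data, 60_000, 4) would mirror the batch size and worker count from the question. In async code the same idea is a bounded number of concurrent write futures on one runtime rather than one runtime per batch.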

Related

How to select a file as bytes or text in Rust WASM?

I am trying to get the Vec<u8> or String (or more ideally a Blob ObjectURL) of a file uploaded as triggered by a button click.
I am guessing this will require an invisible <input> somewhere in the DOM but I can't figure out how to leverage web_sys and/or gloo to either get the contents nor a Blob ObjectURL.

A JS-triggered input probably won't do the trick, as many browsers won't let you trigger a file input from JS, for good reasons. You can use labels to hide the input if you think it is ugly. Other than that, you need to work your way through the files API of HtmlInputElement. Pretty painful, that:
use js_sys::{Object, Reflect, Uint8Array};
use wasm_bindgen::{prelude::*, JsCast};
use wasm_bindgen_futures::JsFuture;
use web_sys::*;
#[wasm_bindgen(start)]
pub fn init() {
// Just some setup for the example
std::panic::set_hook(Box::new(console_error_panic_hook::hook));
let window = window().unwrap();
let document = window.document().unwrap();
let body = document.body().unwrap();
while let Some(child) = body.first_child() {
body.remove_child(&child).unwrap();
}
// Create the actual input element
let input = document
.create_element("input")
.expect_throw("Create input")
.dyn_into::<HtmlInputElement>()
.unwrap();
input
.set_attribute("type", "file")
.expect_throw("Set input type file");
let recv_file = {
let input = input.clone();
Closure::<dyn FnMut()>::wrap(Box::new(move || {
let input = input.clone();
wasm_bindgen_futures::spawn_local(async move {
file_callback(input.files()).await;
})
}))
};
input
.add_event_listener_with_callback("change", recv_file.as_ref().dyn_ref().unwrap())
.expect_throw("Listen for file upload");
recv_file.forget(); // TODO: this leaks. I forgot how to get around that.
body.append_child(&input).unwrap();
}
async fn file_callback(files: Option<FileList>) {
let files = match files {
Some(files) => files,
None => return,
};
for i in 0..files.length() {
let file = match files.item(i) {
Some(file) => file,
None => continue,
};
console::log_2(&"File:".into(), &file.name().into());
let reader = file
.stream()
.get_reader()
.dyn_into::<ReadableStreamDefaultReader>()
.expect_throw("Reader is reader");
let mut data = Vec::new();
loop {
let chunk = JsFuture::from(reader.read())
.await
.expect_throw("Read")
.dyn_into::<Object>()
.unwrap();
// ReadableStreamReadResult is somehow wrong. So go by hand. Might be a web-sys bug.
let done = Reflect::get(&chunk, &"done".into()).expect_throw("Get done");
if done.is_truthy() {
break;
}
let chunk = Reflect::get(&chunk, &"value".into())
.expect_throw("Get chunk")
.dyn_into::<Uint8Array>()
.expect_throw("bytes are bytes");
let data_len = data.len();
data.resize(data_len + chunk.length() as usize, 255);
chunk.copy_to(&mut data[data_len..]);
}
console::log_2(
&"Got data".into(),
&String::from_utf8_lossy(&data).into_owned().into(),
);
}
}
(If you've got questions about the code, ask. But it's too much to explain it in detail.)
In addition, these are the web-sys features you need for this to work:
[dependencies.web-sys]
version = "0.3.60"
features = ["Window", "Navigator", "console", "Document", "HtmlInputElement", "Event", "EventTarget", "FileList", "File", "Blob", "ReadableStream", "ReadableStreamDefaultReader", "ReadableStreamReadResult"]

Thanks to Caesar I ended up with this code for use with dominator as the Dom crate.
pub fn upload_file_input(mimes: &str, mutable: Mutable<Vec<u8>>) -> Dom {
input(|i| {
i.class("file-input")
.prop("type", "file")
.prop("accept", mimes)
.apply(|el| {
let element: HtmlInputElement = el.__internal_element();
let recv_file = {
let input = element.clone();
Closure::<dyn FnMut()>::wrap(Box::new(move || {
let input = input.clone();
let mutable = mutable.clone();
wasm_bindgen_futures::spawn_local(async move {
file_callback(input.files(), mutable.clone()).await;
})
}))
};
element
.add_event_listener_with_callback(
"change",
recv_file.as_ref().dyn_ref().unwrap(),
)
.expect("Listen for file upload");
recv_file.forget();
el
})
})
}
async fn file_callback(files: Option<FileList>, mutable: Mutable<Vec<u8>>) {
let files = match files {
Some(files) => files,
None => return,
};
for i in 0..files.length() {
let file = match files.item(i) {
Some(file) => file,
None => continue,
};
// gloo::console::console_dbg!("File:", &file.name());
let reader = file
.stream()
.get_reader()
.dyn_into::<ReadableStreamDefaultReader>()
.expect("Reader is reader");
let mut data = Vec::new();
loop {
let chunk = JsFuture::from(reader.read())
.await
.expect("Read")
.dyn_into::<Object>()
.unwrap();
// ReadableStreamReadResult is somehow wrong. So go by hand. Might be a web-sys bug.
let done = Reflect::get(&chunk, &"done".into()).expect("Get done");
if done.is_truthy() {
break;
}
let chunk = Reflect::get(&chunk, &"value".into())
.expect("Get chunk")
.dyn_into::<Uint8Array>()
.expect("bytes are bytes");
let data_len = data.len();
data.resize(data_len + chunk.length() as usize, 255);
chunk.copy_to(&mut data[data_len..]);
}
mutable.set(data);
// gloo::console::console_dbg!(
// "Got data",
// &String::from_utf8_lossy(&data).into_owned(),
// );
}
}

Turning tokio_postgres client into a variable for reuse

I am trying to figure out a way to make my tokio_postgres client a variable that I can reuse in different parts of my app. Ideally, I'm trying to achieve something similar to the Prisma ORM in the Node world:
const prisma = new PrismaClient()
...
const user = await prisma.user.create({
  data: {
    name: 'Alice',
    email: 'alice@prisma.io',
  },
})
The code I have so far is:
async fn connect() -> Result<P::Client, PgError> {
// Connect to the database.
let (client, connection) =
tokio_postgres::connect("host=localhost user=postgres", NoTls).await?;
// The connection object performs the actual communication with the database,
// so spawn it off to run on its own.
tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
}
});
// Now we can execute a simple statement that just returns its parameter.
let rows = client
.query("SELECT $1::TEXT", &[&"hello world"])
.await?;
// And then check that we got back the same string we sent over.
let value: &str = rows[0].get(0);
assert_eq!(value, "hello world");
return client;
}
However, I am getting the error:
expected type Result<tokio_postgres::Client, tokio_postgres::Error>
found struct tokio_postgres::Client
Any idea what I could be doing wrong here? I'm new to Rust and maybe I'm just bringing baggage from Node, but I haven't found any documentation on this and figured it would be good to have.
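
The error message says the function promises a Result but the last statement hands back a bare Client. A minimal sketch of the usual fix, wrapping the success value in Ok, is shown below with the standard library only. Client and ConnectError here are hypothetical stand-ins for tokio_postgres::Client and tokio_postgres::Error, and the function is synchronous to keep the example self-contained; the same wrapping applies inside an async fn.

```rust
/// Hypothetical stand-in for `tokio_postgres::Client`.
#[derive(Debug, PartialEq)]
pub struct Client {
    pub conninfo: String,
}

/// Hypothetical stand-in for `tokio_postgres::Error`.
#[derive(Debug, PartialEq)]
pub struct ConnectError(pub String);

/// Returns a client on success. The declared return type is a `Result`,
/// so the success value has to be wrapped in `Ok`.
pub fn connect(conninfo: &str) -> Result<Client, ConnectError> {
    if conninfo.is_empty() {
        return Err(ConnectError("empty connection string".to_string()));
    }
    let client = Client { conninfo: conninfo.to_string() };
    // `return client;` would fail with "expected Result, found Client".
    Ok(client)
}
```

The caller then gets the client out with `?` or a match, and can pass a reference to it (or share it behind an Arc) to the parts of the app that need it.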

How to translate JS Promises to Rust

At the moment I'm writing a pure Rust MQTT5 library (I know there are existing ones out there, but I'm mainly trying to learn Rust) and I stumbled upon this problem.
I'm using the latest stable Rust with tokio 1.0.1.
When I send out a packet over the wire, I often expect a response from the server (example below: PingReq/PingAck, Ping/Pong).
Leaving out a lot of logic regarding timeouts and packet clashes, I wrote a simplified version of the logic in JavaScript (since I know that fairly well).
How would this logic translate to Rust and its futures?
Or to be more clear: Can I somehow recreate the resolve() callback function behavior of awaitPackage + onIncomingPacket?
class Client {
awaitedPacketTypes = {};
/**
* a ping consists of sending a Ping and receiving a Pong
*/
async ping(){
await this.sendPacket("Ping");
return await this.awaitPackage("Pong");
}
async sendPacket(packetType) { /*...*/ }
/**
* This expects a specific packet type to be received in the future
* @param {*} packetType
*/
awaitPackage(packetType) {
return new Promise((resolve, reject) => {
this.awaitedPacketTypes[packetType] = {
resolve,
reject
};
});
}
/**
* This gets called for every packet from the network side and calls the correct resolver if something waits for this packet type
* @param {*} packet
*/
onIncomingPacket(packet) {
if(this.awaitedPacketTypes[packet.type]) {
this.awaitedPacketTypes[packet.type].resolve(packet);
this.awaitedPacketTypes[packet.type] = undefined;
} else {
/*...*/
}
}
}

Or to be more clear: Can I somehow recreate the resolve() callback function behavior of awaitPackage + onIncomingPacket?
Kinda? A Rust Future is only "something which can be polled for readiness"; it's a much lower-level concept than a JS promise.
There are libraries which claim to provide JS-style promises, but most async libraries provide a similar object under a different name. In Tokio, you'd probably want a oneshot channel, that is, a channel on which a single value can be sent, resulting in something along the lines of:
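use std::collections::HashMap;
use std::sync::Mutex;
use tokio::sync::oneshot::{channel, Receiver, Sender};
struct Packet { r#type: &'static str }
struct Client {
    // One pending sender per packet type that somebody is waiting for.
    awaited: Mutex<HashMap<&'static str, Sender<Packet>>>
}
impl Client {
    async fn ping(&self) -> Packet {
        // Send a Ping, then wait for the matching Pong.
        self.send_packet("Ping").await;
        self.await_package("Pong").await.unwrap()
    }
    async fn send_packet(&self, _: &'static str) {}
    fn await_package(&self, packet_type: &'static str) -> Receiver<Packet> {
        let (tx, rx) = channel();
        self.awaited.lock().unwrap().insert(packet_type, tx);
        rx
    }
    fn on_incoming_packet(&self, packet: Packet) {
        if let Some(tx) = self.awaited.lock().unwrap().remove(packet.r#type) {
            // send() fails only if the receiver was dropped; ignore that case here.
            let _ = tx.send(packet);
        }
    }
}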
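
To see the register-then-resolve flow in isolation, here is a rough standard-library sketch of the same pattern. It swaps Tokio's oneshot channel for std::sync::mpsc, so waiting is a blocking recv() rather than an .await, and it is meant only to illustrate the bookkeeping. Pending, await_packet and the kind field are names made up for this sketch.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub struct Packet {
    pub kind: &'static str,
}

/// Keeps one sender per packet type that somebody is waiting for.
#[derive(Default)]
pub struct Pending {
    awaited: Mutex<HashMap<&'static str, Sender<Packet>>>,
}

impl Pending {
    /// Registers interest in `kind` and returns the receiving half,
    /// which plays the role of the JS promise.
    pub fn await_packet(&self, kind: &'static str) -> Receiver<Packet> {
        let (tx, rx) = channel();
        self.awaited.lock().unwrap().insert(kind, tx);
        rx
    }

    /// Plays the role of `resolve`: hands the packet to whoever is waiting.
    /// Returns true if a waiter existed and received the packet.
    pub fn on_incoming_packet(&self, packet: Packet) -> bool {
        // Take the sender out first so the lock is released before sending.
        let tx = self.awaited.lock().unwrap().remove(packet.kind);
        match tx {
            Some(tx) => tx.send(packet).is_ok(),
            None => false,
        }
    }
}
```

Because the sender is removed from the map when a packet arrives, each registration resolves at most once, which matches the JS version setting the entry to undefined.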

from a stream "FramedRead" how to "do something" in every chunk

I would like to display the upload progress of a file using the crate indicatif. I am uploading the file asynchronously using reqwest with something like this:
use tokio::fs::File;
use tokio_util::codec::{BytesCodec, FramedRead};
let file = File::open(file_path).await?;
let stream = FramedRead::new(file, BytesCodec::new());
let body = Body::wrap_stream(stream);
client.put(url).body(body)
The progress bar is implemented like this:
use indicatif::ProgressBar;
let bar = ProgressBar::new(1000);
for _ in 0..1000 {
bar.inc(1);
// ...
}
bar.finish();
Given the stream:
let stream = FramedRead::new(file, BytesCodec::new());
// how on every chunk do X ?
let body = Body::wrap_stream(stream);
how could I call bar.inc(1) for every chunk?
From the docs I see that there is a read_buffer, but how do I iterate over it in a way that lets me call a custom function, or count the bytes sent so that I could display "bytes sent" so far, for example?

You can use TryStreamExt::inspect_ok, for instance, which will call a closure with a reference to every Ok(item) in the stream when that item is consumed.
use futures::stream::TryStreamExt;
use tokio_util::codec::{BytesCodec, FramedRead};
let stream = FramedRead::new(file, BytesCodec::new())
.inspect_ok(|chunk| {
// do X with chunk...
});
let body = Body::wrap_stream(stream);
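
The same "peek at each item as it passes through" idea can be tried out on a plain iterator with the standard library's Iterator::inspect, which is the synchronous cousin of inspect_ok. The sketch below is only an analogy under that assumption: forward_with_progress is a hypothetical helper, the chunks are plain Vec<u8> values rather than the stream's byte chunks, and the callback is where a progress bar update such as bar.inc(chunk.len() as u64) would go.

```rust
/// Passes every item through unchanged, calling `on_chunk` for each
/// successful chunk as it is consumed. Errors are forwarded untouched.
pub fn forward_with_progress(
    chunks: Vec<Result<Vec<u8>, String>>,
    mut on_chunk: impl FnMut(&[u8]),
) -> Vec<Result<Vec<u8>, String>> {
    chunks
        .into_iter()
        .inspect(|item| {
            // Only look at Ok items, like inspect_ok does on a stream.
            if let Ok(chunk) = item {
                on_chunk(chunk.as_slice());
            }
        })
        .collect()
}
```

Counting chunk.len() instead of incrementing by 1 gives a "bytes sent so far" figure, provided the progress bar length was set to the file size in bytes.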

Fastest way to send many groups of HTTP requests using new async/await syntax and control the amount of workers

Most recent threads I have read are saying async is the better way to perform lots of I/O bound work such as sending HTTP requests and the like. I have tried to pick up async recently but am struggling with understanding how to send many groups of requests in parallel, for example:
let client = reqwest::Client::new();
let mut requests = 0;
let get = client.get("https://somesite.com").send().await?;
let response = get.text().await?;
if response.contains("some stuff") {
let get = client.get("https://somesite.com/something").send().await?;
let response = get.text().await?;
if response.contains("some new stuff") {
requests += 1;
println!("Got response {}", requests)
This does what I want, but how can I run it in parallel and control the amount of "worker threads" or whatever the equivalent is to a thread pool in async?
I understand it is similar to this question, but mine is strictly talking about the nightly Rust async/await syntax and a more specific use case where groups of requests/tasks need to be done. I also find using combinators for these situations a bit confusing, was hoping the newer style would help make it a bit more readable.

Not sure if this is the fastest way, as I am just experimenting myself, but here is my solution:
use futures::{future, StreamExt}; // future::ready plus the stream combinators used below
let client = reqwest::Client::new();
let links = vec![ // A vec of strings representing links
"example.net/a".to_owned(),
"example.net/b".to_owned(),
"example.net/c".to_owned(),
"example.net/d".to_owned(),
];
let ref_client = &client; // Need this to prevent client from being moved into the first map
futures::stream::iter(links)
.map(async move |link: String| {
let res = ref_client.get(&link).send().await;
// res.map(|res| res.text().await.unwrap().to_vec())
match res { // This is where I would usually use `map`, but not sure how to await for a future inside a result
Ok(res) => Ok(res.text().await.unwrap()),
Err(err) => Err(err),
}
})
.buffer_unordered(10) // Number of connections at the same time
.filter_map(|c| future::ready(c.ok())) // Throw errors out, do your own error handling here
.filter_map(|item| {
if item.contains("abc") {
future::ready(Some(item))
} else {
future::ready(None)
}
})
.map(async move |sec_link| {
let res = ref_client.get(&sec_link).send().await;
match res {
Ok(res) => Ok(res.text().await.unwrap()),
Err(err) => Err(err),
}
})
.buffer_unordered(10) // Number of connections for the secondary requests (so max 20 connections concurrently)
.filter_map(|c| future::ready(c.ok()))
.for_each(|item| {
println!("File received: {}", item);
future::ready(())
})
.await;
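As written, this requires the #![feature(async_closure)] feature. Writing each closure as move |link| async move { ... } instead of async move |link| { ... } avoids the feature gate.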
