Diameter: Suppose a server sends a Diameter request to host1.com (e.g. an AA request over Rx). Can it receive the response from another host, say host2.com? Is this a valid scenario?
Unfortunately, you can see this happening a lot in the Diameter world.
You send a request with Destination-Host X and receive a response with Origin-Host Y.
It sometimes happens because there is a pool of servers, sometimes because another peer “stole” the request, and there are probably more reasons.
But I started with "unfortunately" because some Diameter peers then take Y and use it as the Destination-Host AVP in the next request (instead of using X, as they should).
This is why answering from a different host is not recommended.
I need to make a POST HTTP request at an exact timestamp in the future, as accurately as possible, down to the millisecond. But there is network latency as well. How can I achieve this?
setTimeout alone is not enough here: the request always takes some time in transit, so it arrives late by an amount that varies with network latency. Firing the request before the target timestamp, on the other hand, may make it arrive early.
My goal is to guarantee that the request reaches the server after the target timestamp, but as soon as possible after it. Could you suggest any solutions with Node.js?
The best you can do in nodejs (which is not a real-time system) is to do the following:
Premeasure the expected latency so you know roughly how early to send the request.
Use setTimeout() to schedule the send at the one-way latency time before your target time. There is no other mechanism in nodejs that would be more precise.
If your request involves a DNS lookup, you can prefetch the IP address for your hostname and take the DNS lookup time out of your request cycle, or at least prime the local DNS cache.
Create a dedicated nodejs program that does nothing else - so its event loop will not be doing anything else at the time the setTimeout() needs to run. You could run this as a child_process from your larger program if desired.
Run a number of tests to see how the timing works and, if you are consistently off by some margin, then adjust your latency offset.
You can develop a regular latency test to determine if the latency changes with time.
As others have said, there is no way to predict what the natural response time will be of the target server (how long it takes to start processing your request from the moment your network packets arrive there). If lots of incoming requests are all racing for the same time slot, then your request will get interleaved in among all the others and served in some order that you do not control.
Another thing you can consider: if the target server supports recent HTTP versions, you can keep a pre-established HTTP connection to the host (perhaps opened against some other endpoint) alive and send your precisely timed request on it. It would take some experimentation to figure out what the target host supports and whether this works.
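The scheduling arithmetic from the steps above can be sketched as follows. This is only an illustration, written in Python rather than Node.js; the function name send_delay_ms, its parameters, and the assumption that one-way latency is half of the median measured round-trip time are mine, not part of any library. The returned value is the delay you would hand to setTimeout().

```python
import statistics


def send_delay_ms(now_ms, target_ms, rtt_samples_ms, margin_ms=0.0):
    """Return how long to wait before sending so the request lands just after target_ms.

    Assumption: the network path is symmetric, so one-way latency is about RTT / 2.
    margin_ms is a hypothetical safety offset, tuned from repeated test runs.
    """
    one_way_ms = statistics.median(rtt_samples_ms) / 2
    delay = (target_ms - one_way_ms + margin_ms) - now_ms
    return max(0.0, delay)  # never schedule in the past


# Target is 1000 ms away and the median RTT is 50 ms, so send 25 ms early.
delay = send_delay_ms(now_ms=1_000, target_ms=2_000, rtt_samples_ms=[40, 50, 60])
print(delay)  # 975.0
```

Using the median rather than the mean keeps one slow sample from skewing the offset; a positive margin_ms trades a slightly later arrival for a lower risk of arriving early.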
I'm wondering how modern DNS servers deal with millions of queries per second, given that the txnid field is a uint16.
Let me explain. There is an intermediate server: on one side, clients send DNS requests to it, and on the other side the server itself sends requests to an upstream DNS server (8.8.8.8, for example). According to the DNS protocol, there is a txnid field in the DNS header, which must stay unchanged between request and response. Obviously, an intermediate DNS server with multiple clients replaces this value with its own txnid value (which is a counter), sends the request to the external DNS server, and after resolution replaces the value with the client's original one. All of this works fine for up to 65536 simultaneous requests, since the field is a uint16. But what if we have hundreds of millions of them, as Google's DNS servers do?
Going from your Google DNS server example:
In mid-2018 their servers were handling 1.2 trillion queries per day; extrapolating that growth suggests their service is currently handling ~20 million queries per second
They say that successful resolution of a cache-miss takes ~130ms, but taking timeouts into account pushes the average time up to ~400ms
I can't find any numbers on what their cache-hit rates are like, but I'd assume it's more than 90%. And presumably it increases with the popularity of their service
Putting the above together (2e7 * 0.4 * (1-0.9)) we get ~1M transactions active at any one time. So you have to find at least 20 bits of state somewhere. 16 bits comes for free because of the txnid field. As Steffen points out you can also use port numbers, which might give you another ~15 bits of state. Just these two sources give you more than enough state to run something orders of magnitude bigger than Google's DNS system.
That said, you could also just relegate transaction IDs to preventing any cache-poisoning attacks, i.e. reject any answers where the txnid doesn't match the inflight query for that question. If this check passes, then add the answer to the cache and resume any waiting clients.
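The estimate above can be checked with a few lines of Python. The three input figures are the ones quoted in this answer (the cache-hit rate is an assumption there too); the pending table, its example entries, and match_reply are hypothetical names used only to show how a (source port, txnid) pair can key in-flight queries.

```python
import math

queries_per_second = 2e7   # ~20 million qps, the extrapolated figure above
avg_resolution_s = 0.4     # ~400 ms on average, timeouts included
cache_hit_rate = 0.9       # assumed, as in the answer

# Only cache misses are in flight towards upstream servers.
in_flight = queries_per_second * avg_resolution_s * (1 - cache_hit_rate)
bits_needed = math.ceil(math.log2(in_flight))
bits_available = 16 + 15   # txnid plus roughly 15 bits of source port

print(round(in_flight), bits_needed, bits_available)  # 800000 20 31

# Hypothetical in-flight table keyed by (source port, txnid).
pending = {
    (40001, 0x1A2B): "client A, example.com",
    (40002, 0x1A2B): "client B, example.org",  # same txnid, different port
}


def match_reply(port, txnid):
    """Return and remove the waiting query for a reply, or None if nothing matches."""
    return pending.pop((port, txnid), None)
```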
Let us say a gRPC client makes two requests, R1 and R2, to a gRPC server, one after the other (assume there is no significant time gap, i.e. R2 is made while R1 has still not been served). Also, assume that R1 takes much more time than R2.
In this case, should I expect R2's response first as it takes less time, or should I expect R1's response first as this request is made prior to R2? What will happen and why?
From what I have observed, I think requests are served in FCFS fashion, so R1's response will be received by the client first and then R2's, but I am not sure.
In theory, nothing prevents the server and the client from processing gRPC requests in parallel. A gRPC connection runs over an HTTP/2 connection, which can carry multiple requests at once. So yes: if the server doesn't use specific synchronization or limiting mechanisms, the requests will be processed with overlap. If server resources or policy don't allow that, they will be processed one by one. I can also add that a request can have a timeout after which it is cancelled, so a long wait can lead to cancellation and the request not being processed at all.
All requests should be processed in parallel. The gRPC architecture of the Java implementation, for example, is divided into two "parts":
The event loop runs in a worker thread group, similar to what we have in reactive implementations: one thread per core to handle the incoming requests.
The request processing is done in a dedicated thread, which by default is created using the CachedThreadPool system.
For single-threaded languages like JavaScript, I am not sure how they do it, but I would guess it is done in the same thread and would therefore end up queuing the requests.
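The effect on response order can be illustrated without gRPC at all. The sketch below is only an analogy using Python's standard thread pool, not gRPC code: handle is a hypothetical stand-in for a request handler, and the pool size plays the role of the server's concurrency limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def handle(name, seconds):
    """Stand-in for a request handler that takes `seconds` to finish."""
    time.sleep(seconds)
    return name


# Two workers: R1 and R2 overlap, so the shorter R2 completes first.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(handle, "R1", 0.3), pool.submit(handle, "R2", 0.05)]
    parallel_order = [f.result() for f in as_completed(futures)]

# One worker: requests are served one by one, so R1 completes first (FCFS).
with ThreadPoolExecutor(max_workers=1) as pool:
    futures = [pool.submit(handle, "R1", 0.3), pool.submit(handle, "R2", 0.05)]
    serial_order = [f.result() for f in as_completed(futures)]

print(parallel_order, serial_order)
```

Which of the two behaviours you observe with a real gRPC server depends on how that server is configured, as described above.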
The getstream.io documentation says that one should expect retrieving a feed to take approximately 60ms. When I retrieve my feeds they contain a field named 'duration', which I take to be the calculated server-side processing time. This value is steadily around 10-40ms, with an average around 15ms.
The problem is that I seldom get my feeds in less than 150ms; the average time is around 200-250ms, and sometimes it is up to 300-400ms. This is the time for getting the feed alone, with no enrichment etc., and I have verified with tcpdump that the network roundtrip is low (around 25ms) and that the time is actually spent waiting for the server to respond.
I've tried to move around my application (eu-west and eu-central) but that doesn't seem to affect things much (again, network roundtrip is steadily around 25ms).
My question is: should I really expect 60ms and continue investigating, or is 200-400ms normal? On the getstream.io site it is explained that developer accounts receive "Low Priority Processing". What does this mean in practice? How much difference could I expect with another plan?
I'm using the Node.js low-level API.
Stream APIs use SSL to encrypt traffic. Unfortunately, SSL introduces additional network I/O. Usually you pay for the increased latency only once, because Stream's HTTP APIs support HTTP persistent connections (aka keep-alive).
Here's a Wireshark screenshot of the TCP traffic of 2 sequential API requests with keep alive disabled client side:
The 4 lines in red highlight that the TCP connection is getting closed each time. Another interesting thing is that the handshaking takes almost 100ms and it's done twice (the first bunch of lines).
After some investigation, it turns out that the library used to make API requests to Stream's APIs (request) does not have keep-alive enabled by default. This change will be part of the library soon and is available on a development branch.
Here's a screenshot of the same two requests with keep-alive enabled (using the code from that branch):
This time there is no connection reset, and the second HTTP request does not perform an SSL handshake.
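The difference between the two captures can be reproduced locally. The following is a minimal sketch using only Python's standard library against a throwaway local server; it is not Stream's client code and it uses plain HTTP, so it counts TCP connections only. With SSL, every extra connection would also pay the handshake described above. The names stats and Handler are illustrative.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

stats = {"connections": 0}  # TCP connections accepted by the local test server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default

    def setup(self):
        super().setup()
        stats["connections"] += 1  # one handler instance per TCP connection

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Keep-alive: two sequential requests share one connection.
conn = http.client.HTTPConnection("127.0.0.1", port)
for _ in range(2):
    conn.request("GET", "/")
    conn.getresponse().read()
conn.close()
with_keep_alive = stats["connections"]

# No keep-alive: each request opens and closes its own connection.
stats["connections"] = 0
for _ in range(2):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/", headers={"Connection": "close"})
    conn.getresponse().read()
    conn.close()
without_keep_alive = stats["connections"]

server.shutdown()
server.server_close()
print(with_keep_alive, without_keep_alive)  # 1 2
```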
I wonder how I can "abort" a message after it has remained unsent for some time.
The scenario is simple:
1) Client connects to server
2) The server goes down
3) The client sends a message; there's no issue here, as ZeroMQ queues the message locally (so the "send" operation is successful)
4) Assuming I've set RCVTIMEO, I get the timeout
5) After I get the timeout I no longer wish to send the message, but once the server comes up again ZeroMQ will transmit it. How can I prevent that?
The reason I want to prevent this is that once I get the timeout I respond to my customer with a failure message (e.g. "the request could not be processed due to timeout"), and it would be a real issue if the request were eventually transmitted and processed...
Hope my question is clear... Thx!
Step 1) Set the ZMQ_LINGER option on the client socket to 0, so that no time is spent trying to deliver still-queued messages while the socket is being torn down. (This is in line with how modern high-performance, low-latency distributed messaging systems are designed: things go wrong, and quite often, so treat that as a fact rather than lose time fighting the unavoidable.)
and
Step 2) Force a .close() to discard the socket, together with anything still queued on it.
and
Step 3) If needed, create a new socket, or switch to another means of communication prepared in advance (such as the Binary Star fault-tolerance pattern), to carry on the intended application processing once the connection to the primary peer has timed out.
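The three steps could look roughly like this with the pyzmq binding. This is a sketch under assumptions: a REQ socket, a 200 ms receive timeout, and a hypothetical endpoint on which nothing is listening; adapt the socket type and values to your setup.

```python
import zmq  # pyzmq binding

ctx = zmq.Context()
sock = ctx.socket(zmq.REQ)
sock.setsockopt(zmq.RCVTIMEO, 200)     # give up waiting for a reply after 200 ms
sock.setsockopt(zmq.LINGER, 0)         # Step 1: drop queued messages on close
sock.connect("tcp://127.0.0.1:5599")   # hypothetical endpoint, server is down

sock.send(b"request")                  # succeeds: the message is queued locally
try:
    sock.recv()
    timed_out = False
except zmq.Again:                      # raised when RCVTIMEO expires
    timed_out = True

# Step 2: discard the socket; with LINGER at 0 the queued request is dropped
# instead of being delivered when the server comes back.
sock.close()
ctx.term()

# Step 3: if the application wants to retry, create a fresh socket here.
print(timed_out)
```

Because the request is thrown away with the socket, the failure already reported to the customer stays consistent with what the server eventually sees.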