Hazelcast High Response Times - hazelcast

We have a Java 1.6 application that uses Hazelcast 3.7.4 version,
with a topology of two nodes. The application operates mainly with 3 maps.
In normal application working, response times when consulting the maps are
generally in values around some milliseconds tens.
I have observed that in some circumstances such as for example with network
cuts, the response time increases to huge values such as for example, 20 or 30 seconds!!
And this is impacting the application performance.
I would like to know if this kind of situation with network micro-cuts can lead
to increase searches response time in this manner. I do not know if some concrete configuration can be done to minimize this, and also which other elements can provoke so high times.
I provide some examples of some executed consults
Example 1:
String sqlPredicate = "acui='"+acui+"'";
Collection<Agent> agents =
(Collection<Agent>) data.getMapAgents().values(new SqlPredicate(sqlPredicate));
Example 2:
boolean exist = data.getMapAgents().containsKey(agent);
Thanks so much for your help.
Best Regards,
Jorge

The Map operations are all TCP Socket based and thus are subject to your Operating Systems TCP Driver implementation.
See TCP_NODELAY

Related

How to pass efficiently a huge sized parameter to a function?

thank you reading. Please forgive my English, I'm french. Is it a justification ?
The aim of my function is to scan many IPv4:PORT via multiprocessing/threading (do not, I forsee you, do not !) in Python3, send a SYN and register any result. Multisomething is called here because of the timeout needed to wait to any potential reply.
So.
This function is called a producer. It stores potential TCP connection into a database. Consumers
then read the database and do some pentesting indepently. So.
I prepared an IPv4 list. It's a random valid IP list of 10's K elements. Remember we have 65K ports to test per IPv4.
My method is then to suffle the list port with a new seed for each producer launched. I have many. Each have a valid IPv4 list, but, if we consider 65K ports with 100K IP's, we have a 6.5G elements to pass.
Why ?
IP[] are random.shuffle()-like, by construction. Ports[] are too.
If I read p in Ports and for each p, join IPv4:Ports and append into params[], I can launch the parallelized job via scan(u) for u in params[]
I launch it via run(rounds) like this :
def run(rounds):
for r in rounds:
scan(r)
But, the problem is size(rounds) = size(params) ~ 6.5G elements
I cannot find a efficient (memory talking) way to pass such a huge parameter (big list) to a parallelized job. I'm running out of memory.
I'm not asking how to manage it on a mind-blowing-capable workstation, while having designed this function on a paper, it doesn't fit into the raspi2 (1GB mem).
I do not need Python3 at all for myself, it a teaching product. Thus, i'm stuck.
My question is : could you help me to find a efficient way to attack a huge list by a parallelized function that pop's the list without sending it via a parameter ?
I googled, followed forums and threads I'm aware of, but, as I refactor the programm, the constant problem stays, laughing at me, at the same place in the main().
What I dont want to reproduce :
MAX_IPV4 = i.IPv4Address._ALL_ONES
MAX_PORT = 65535
def r_ipv4():
while True:
a=i.IPv4Address._string_from_ip_int(r.randint(0, MAX_IPV4))
b=i.IPv4Address(a)
if b.is_multicast or b.is_private or b.is_loopback or b.is_reserved or b.is_link_local :
continue
else :
return a
break
def generate_target():
return r_ipv4()
def generate_r_targets(n):
ip=[]
for i in range(0,n):
ip.append(generate_target())
ports=[]
for p in range(1,65535):
ports.append(p)
r.shuffle(ports)
targets=[]
for p in ports:
for z in ip:
targets.append((str(z)+","+str(p)))
It's not my optimal method, but the way I can explain and show the problem at best.
I dont want to lost the (shuffled ports) iterable, it is my principal vector and it is at the heart.
I'm testing a dynamic configuration of knockd.
Say. Computer_one register a sequence in knockd.
Computer_one is the only one to know the seed.
The seed is the key of the sequence of keys of knockd, that's done.
We passed it via a plain file :)

KRPC Protocol act weird in BEP-05

According to BEP-05 , when you start a find_node or get_peers request, you will receive the query message or K (8) good nodes closest to the target/infohash.
However, in my case ,with the bootstrap node router.utorrent.com:6881, the remote returned the 8 nodes which closest to self's nodeId. And if it is a get_peers request, it always returned 8 nodes closest to self and 7 invalid peers. But if access to some special node which redirect to near the infohash, the protocol acts normal.
weird wireshark dump
success wireshark dump
Any help would be appreciated!
You shouldn't pay too much attention to what the bootstrap nodes do as long as they allow you to populate your routing table, since that is their primary purpose.
They receive a disproportionate amount of traffic and to avoid directing undue amounts of traffic to any particular node they may deviate from the specification in a few ways that are harmless as long as only a vanishingly small fraction of the network behaves that way. There is only a single-digit number of bootstrap nodes among millions, so their behavior is negligible and should not be taken as a reference point.
It does not make sense to contact a bootstrap node via get peers either. find node queries would be the correct choice to populate your routing table. And it is only necessary to contact them in the relatively rare case where other mechanisms were not successful.

What are the most recent bittorrent DHT implementation recommendations?

I'm working on implementing yet another bittorrent client and at this time struggling with DHT. It is implemented accordingly to this specification http://www.bittorrent.org/beps/bep_0005.html but starting debugging it I noticed that other nodes' responses on the network vary.
For example, find_node is supposed to return either target node info or 8 closest nodes. Most of the nodes reply with 34 closest nodes and usually only 1 - 3 nodes from those 34 successfully reply to the consequent ping request.
Is there another document with better implementation recommendation? May be it is already proved that using 15 minutes interval to change the nodes state to questionable is not efficient and I have to use 10 or other number? Where can I find the best up to date suggestions?
There is another strange thing. Bootstrap nodes like router.bittorrent.com reply with even more closest nodes and usually the "nodes" BDictionary property buffer length is not divisible to 6 (compact node info: 4 for IP and 2 for port). For now, I simply cut off the buffer at the closest divisible to 6 length but all that is strange. Does anybody know why that might happen?
the spec says (emphasis mine):
When a node receives a find_node query, it should respond with a key "nodes" and value of a string containing the compact node info for [...]
Further down:
Contact information for nodes is encoded as a 26-byte string. Also known as "Compact node info" the 20-byte Node ID in network byte order has the compact IP-address/port info concatenated to the end.
Additionally you should read the original Kademlia paper since the bittorrent BEP builds on the concepts described therein and omits deeper explanations of those concepts.
You might also want to read for a few few extensions that are more or less de-facto standard for most implementations now http://libtorrent.org/dht_extensions.html
And read the other DHT-related BEPs, some are fairly widely adopted and modify/clarify BEP-5-specified behavior, but generally in a backward-compatible way.
For example, find_node is supposed to return either target node info or 8 closest nodes
Nodes will return a variable amount of entries. Could be more than 8. Or fewer.

What is the best way to implement the x.224 OSI COTS protocol on Linux

I need to make an old Linux box running 2.6.12.1 kernel communicate with an older computer that is using:
ISO 8602 Datagram (connectionless service) 1987 12 15 (1st Edition)
ISO 8073 Class 4 (connection oriented service)
These are using "Inactive Network Layer" subset. (I am pretty sure this means I do not have to worry about routing. The two end points are hitting each other with their mac addresses.)
I have a kernel module that implements the connectionless part. In order to get the connection oriented service operational, what is the best approach? I have been taking the approach of adding in the struct proto_ops .connect, .accept, .listen functions to my existing connectionless driver by referring to the tcp implementation.
Maybe there is a better approach? I am spending a lot of time trying to decide what the tcp code is doing and then deciding if that is relevant to my needs. For example, the Nagle algorithm isn't needed because I don't have small bits of data being transmitted. In addition, there are probably a lot of error recovery and flow control stuff I don't need because I know the data that the two endpoints are transmitting and how frequently they transmit it. My plan is to implement this first with whatever simplistic (if any) packet retransmission, sequencing, etc.. to the point where my wireshark looks similar to the wireshark capture I have from the live system. Then try mine against the real thing and then add in whatever error recovery/retransmit stuff seems necessary. In other words, it is a pain in the rear trying to determine what is the guts of the tcp/stream implementation that I want to copy vs the extra error correction/flow control stuff that I might never need.
I found \net\core\stream.c which says:
* Generic stream handling routines. These are generic for most
* protocols. Even IP. Tonight 8-).
* This is used because TCP, LLC (others too) layer all have mostly
* identical sendmsg() and recvmsg() code.
* So we (will) share it here.
This suggested to me that maybe there might be a simpler stream thingy that I can start from. Can someone recommend a more basic streams driver that I should start from instead of tcp?
Is there any example code that provides a basic stream implementation?
I made a user level library to implement the protocol providing my own versions of open/read/write/select etc. If anyone else cares, you can find me at http://pnwsoft.com
Do not attempt to use openss7. It is a total waste of time.

Give reads priority over writes in Elasticsearch

I have an EC2 server running Elasticsearch 0.9 with a nginx server for read/write access. My index has about 750k small-medium documents. I have a pretty continuous stream of minimal writes (mainly updates) to the content. The speeds/consistency I receive with search is fine with me, but I have some sporadic timeout issues with multi-get (/_mget).
On some pages in my app, our server will request a multi-get of a dozen to a few thousand documents (this usually takes less than 1-2 seconds). The requests that fail, fail with a 30,000 millisecond timeout from the nginx server. I am assuming this happens because the index was temporarily locked for writing/optimizing purposes. Does anyone have any ideas on what I can do here?
A temporary solution would be to lower the timeout and return a user friendly message saying documents couldn't be retrieved (however they still would have to wait ~10 seconds to see an error message).
Some of my other thoughts were to give read priority over writes. Anytime someone is trying to read a part of the index, don't allow any writes/locks to that section. I don't think this would be scalable and it may not even be possible?
Finally, I was thinking I could have a read-only alias and a write-only alias. I can figure out how to set this up through the documentation, but I am not sure if it will actually work like I expect it to (and I'm not sure how I can reliably test it in a local environment). If I set up aliases like this, would the read-only alias still have moments where the index was locked due to information being written through the write-only alias?
I'm sure someone else has come across this before, what is the typical solution to make sure a user can always read data from the index with a higher priority over writes. I would consider increasing our server power, if required. Currently we have 2 m2x-large EC2 instances. One is the primary and the replica, each with 4 shards.
An example dump of cURL info from a failed request (with an error of Operation timed out after 30000 milliseconds with 0 bytes received):
{
"url":"127.0.0.1:9200\/_mget",
"content_type":null,
"http_code":100,
"header_size":25,
"request_size":221,
"filetime":-1,
"ssl_verify_result":0,
"redirect_count":0,
"total_time":30.391506,
"namelookup_time":7.5e-5,
"connect_time":0.0593,
"pretransfer_time":0.059303,
"size_upload":167002,
"size_download":0,
"speed_download":0,
"speed_upload":5495,
"download_content_length":-1,
"upload_content_length":167002,
"starttransfer_time":0.119166,
"redirect_time":0,
"certinfo":[
],
"primary_ip":"127.0.0.1",
"redirect_url":""
}
After more monitoring using the Paramedic plugin, I noticed that I would get timeouts when my CPU would hit ~80-98% (no obvious spikes in indexing/searching traffic). I finally stumbled across a helpful thread on the Elasticsearch forum. It seems this happens when the index is doing a refresh and large merges are occurring.
Merges can be throttled at a cluster or index level and I've updated them from the indicies.store.throttle.max_bytes_per_sec from the default 20mb to 5mb. This can be done during runtime with the cluster update settings API.
PUT /_cluster/settings HTTP/1.1
Host: 127.0.0.1:9200
{
"persistent" : {
"indices.store.throttle.max_bytes_per_sec" : "5mb"
}
}
So far Parmedic is showing a decrease in CPU usage. From an average of ~5-25% down to an average of ~1-5%. Hopefully this can help me avoid the 90%+ spikes I was having lock up my queries before, I'll report back by selecting this answer if I don't have any more problems.
As a side note, I guess I could have opted for more balanced EC2 instances (rather than memory-optimized). I think I'm happy with my current choice, but my next purchase will also take more CPU into account.

Resources