Hazelcast 3.7 Eviction Algorithm

Reading the docs on the new eviction algorithm, available from Hazelcast 3.7, it is not clear to me how the parameters mentioned in that section relate to the actual Map eviction policy parameters.
Namely, the algorithm explanation uses:
GlobalCapacity: User-defined maximum cache size (cluster-wide).
PartitionCount: Number of partitions in the cluster (defaults to 271).
BalancedPartitionSize: Number of elements in a balanced partition state, BalancedPartitionSize := GlobalCapacity / PartitionCount.
Deviation: An approximated standard deviation (tests showed it to be quite accurate), Deviation := sqrt(BalancedPartitionSize).
Whereas the eviction policy configuration mentions (amongst some others):
<hazelcast>
  <map name="default">
    ...
    <time-to-live-seconds>0</time-to-live-seconds>
    <max-idle-seconds>0</max-idle-seconds>
    <eviction-policy>LRU</eviction-policy>
    <max-size policy="PER_NODE">5000</max-size>
    ...
  </map>
</hazelcast>
My assumption is that GlobalCapacity is somehow linked to the max-size property. Is that correct?
Any help clarifying this is most welcome! : )

GlobalCapacity: User-defined maximum cache size (cluster-wide).
PartitionCount: Number of partitions in the cluster (defaults to 271).
BalancedPartitionSize: Number of elements in a balanced partition state, BalancedPartitionSize := GlobalCapacity / PartitionCount.
Deviation: An approximated standard deviation (tests showed it to be quite accurate), Deviation := sqrt(BalancedPartitionSize).
The above are the variables used to explain the algorithm in the reference manual; they are not API parameters.
But to your specific question: yes, GlobalCapacity is the equivalent and can be defined by the user with the max-size setting inside the map configuration.
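For concreteness, here is a sketch using the programmatic config API (Hazelcast 3.7) that mirrors the XML above, plus the manual's arithmetic under an assumed 2-node cluster (the node count is purely for illustration):

import com.hazelcast.config.Config;
import com.hazelcast.config.EvictionPolicy;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.MaxSizeConfig;

public class EvictionConfigSketch {
    public static void main(String[] args) {
        // Programmatic equivalent of the XML map config in the question.
        MapConfig mapConfig = new MapConfig("default")
                .setTimeToLiveSeconds(0)
                .setMaxIdleSeconds(0)
                .setEvictionPolicy(EvictionPolicy.LRU)
                .setMaxSizeConfig(new MaxSizeConfig(
                        5000, MaxSizeConfig.MaxSizePolicy.PER_NODE));
        Config config = new Config().addMapConfig(mapConfig);

        // The manual's variables, assuming a 2-node cluster for illustration.
        // max-size is PER_NODE here, so the cluster-wide capacity scales
        // with the number of members.
        int nodes = 2;
        int globalCapacity = 5000 * nodes;                            // 10000
        int partitionCount = 271;                                     // default
        int balancedPartitionSize = globalCapacity / partitionCount;  // ~36
        double deviation = Math.sqrt(balancedPartitionSize);          // ~6.0
        System.out.println(balancedPartitionSize + " +/- " + deviation);
    }
}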

infinispan 9 '<eviction strategy="LRU" />' isn't an allowed element

WildFly 18's eviction tag is not parsing, giving a "Failed to parse configuration" error. This appeared when I upgraded WildFly 11 to 18. In WildFly 11 (Infinispan 4) it works fine:
<subsystem xmlns="urn:jboss:domain:infinispan:4.0">
  <cache-container name="security" default-cache="auth-cache">
    <local-cache name="auth-cache">
      <locking acquire-timeout="${infinispan.cache-container.security.auth-cache.locking.acquire-timeout}"/>
      <eviction strategy="LRU" max-entries="${infinispan.cache-container.security.auth-cache.eviction.max-entries}"/>
      <expiration max-idle="-1"/>
    </local-cache>
  </cache-container>
</subsystem>
In WildFly 18 I have the below section (NOT WORKING):
<subsystem xmlns="urn:jboss:domain:infinispan:9.0">
  <cache-container name="security" default-cache="auth-cache">
    <local-cache name="auth-cache">
      <locking acquire-timeout="${infinispan.cache-container.security.auth-cache.locking.acquire-timeout}"/>
      <eviction strategy="LRU" max-entries="${infinispan.cache-container.security.auth-cache.eviction.max-entries}"/>
      <expiration max-idle="-1"/>
    </local-cache>
  </cache-container>
</subsystem>
It gives: 'eviction' isn't an allowed element here. The Infinispan 9.4 docs say eviction is configured by adding the <memory/> element, but even that gives an "unrecognized tag: memory" error.
How do I add eviction strategy="LRU", or what is its replacement?
According to the docs, in Infinispan 9.0 eviction is configured by adding the <memory/> element to your <*-cache/> configuration sections. Eviction is handled by Caffeine utilizing the TinyLFU algorithm with an additional admission window. This was chosen as it provides a high hit rate while also requiring low memory overhead: a better hit ratio than LRU while requiring less memory than LIRS.
In general there are two types (see the Caffeine sketch below for intuition):
COUNT - This type of eviction removes entries based on how many there are in the cache. Once the count of entries grows larger than the configured size, an entry is removed to make room.
MEMORY - This type of eviction estimates how much memory each entry takes up and removes an entry when the total size of all entries is larger than the configured size. This type only works with primitive wrappers, String and byte[] types, so if custom types are desired you must enable storeAsBinary. Also, MEMORY-based eviction only works with the LRU policy.
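For intuition, here is roughly how those two bounds look in Caffeine itself, the library Infinispan 9 delegates to (illustrative only; in Infinispan you configure this through the <memory/> element rather than using Caffeine directly):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class EvictionTypesSketch {
    public static void main(String[] args) {
        // COUNT-style bound: evict once the number of entries exceeds the cap.
        Cache<String, byte[]> byCount = Caffeine.newBuilder()
                .maximumSize(10_000)
                .build();

        // MEMORY-style bound: weigh each entry (here: its byte length)
        // and evict once the total weight exceeds the cap.
        Cache<String, byte[]> byMemory = Caffeine.newBuilder()
                .maximumWeight(64L * 1024 * 1024)
                .weigher((String key, byte[] value) -> value.length)
                .build();

        byCount.put("a", new byte[16]);
        byMemory.put("b", new byte[16]);
    }
}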
So I would think you define it like this:
<cache-container name="security" default-cache="auth-cache">
  <local-cache name="auth-cache">
    <!-- ...your other options... -->
    <object-memory/>
  </local-cache>
</cache-container>
OR
<binary-memory eviction-type="MEMORY"/>  <!-- or COUNT -->
OR
<off-heap-memory eviction-type="MEMORY"/>  <!-- or COUNT -->
AND you can always specify the size:
size="${infinispan.cache-container.security.auth-cache.eviction.max-entries}"
Storage types:
object-memory (Stores entries as objects in the Java heap. This is the default storage type. It supports COUNT only, so you do not need to explicitly set the eviction type. Policy: TinyLFU.)
binary-memory (Stores entries as byte[] in the Java heap. Eviction type: COUNT or MEMORY. Policy: TinyLFU.)
off-heap-memory (Stores entries as byte[] in native memory outside the Java heap. Eviction type: COUNT or MEMORY. Policy: LRU.)
Lonzak's response is correct.
Additionally, you can just use your "urn:jboss:domain:infinispan:4.0" configuration from WildFly 9 in WildFly 19. WildFly will automatically update the configuration to its equivalent in the current schema version.

How are partitions in Azure Event Hub related to the PARTITION BY keyword in Azure Stream Analytics?

https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parallelization#calculate-the-maximum-streaming-units-of-a-job
As per the documentation, e.g.:
Query:
* The input data stream is partitioned by 16.
* The query contains one step.
* The step is partitioned.
Max SUs for the job: 96 (6 * 16 partitions)
What does that mean?
Welcome to Stack Overflow!
To understand the example in the document, first understand how many Streaming Units are required for a specific job:
If a step doesn't contain PARTITION BY, the maximum for the job is 6 SUs.
If a step contains PARTITION BY [N], the maximum for the job is 6 * [N] SUs.
How many Streaming Units are required for a job?
Choosing the number of required SUs for a particular job depends on the partition configuration for the inputs and the query that's defined within the job. The Scale page allows you to set the right number of SUs. It is a best practice to allocate more SUs than needed. The Stream Analytics processing engine optimizes for latency and throughput at the cost of allocating additional memory.
In general, the best practice is to start with 6 SUs for queries that don't use PARTITION BY. Then determine the sweet spot by using a trial and error method in which you modify the number of SUs after you pass representative amounts of data and examine the SU% Utilization metric. The maximum number of streaming units that can be used by a Stream Analytics job depends on the number of steps in the query defined for the job and the number of partitions in each step.
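A small sketch of the arithmetic described above (the helper function is illustrative, not part of any Azure SDK; the 6-SU ceiling per step comes from the rules quoted earlier):

public class MaxStreamingUnitsSketch {
    // Max SUs for one query step: 6 if not partitioned,
    // 6 * N if the step uses PARTITION BY over N partitions.
    static int maxSusForStep(boolean partitioned, int partitionCount) {
        return partitioned ? 6 * partitionCount : 6;
    }

    public static void main(String[] args) {
        // The documentation's example: one partitioned step,
        // input partitioned by 16 -> 6 * 16 = 96 SUs.
        System.out.println(maxSusForStep(true, 16));   // 96
        // A step without PARTITION BY caps at 6 SUs.
        System.out.println(maxSusForStep(false, 16));  // 6
    }
}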
Hope this helps.

What factors influence the "Avoids Enormous Network Payloads" message?

According to the official documentation, the target is 1600KB:
https://developers.google.com/web/tools/lighthouse/audits/network-payloads
However, even when the payload is larger than the target, it sometimes still shows as a passed audit:
[screenshot: a payload of 2405KB still passes]
What are the other conditions that allow large payloads to pass the audit?
Lighthouse scores are 0-100, based on log-normal distribution math.
1600KB is not a pass/fail threshold; it is approximately the largest payload that still earns the maximum score of 100.
As of right now, the values used for the distribution calculation are a 2500KB point of diminishing returns and a 4000KB median, which correspond to scores of about 93 and 50 respectively.
That puts a 2405KB result at a score of ~94, which is sufficient to pass.
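A sketch of that curve (the exact constants live in Lighthouse's source; sigma below is hand-tuned just to reproduce the control points mentioned above, so treat it as illustrative):

public class PayloadScoreSketch {
    // Abramowitz & Stegun 7.1.26 approximation of erf(x).
    static double erf(double x) {
        double sign = Math.signum(x);
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * ax);
        double y = 1.0 - (((((1.061405429 * t - 1.453152027) * t
                + 1.421413741) * t - 0.284496736) * t + 0.254829592)
                * t * Math.exp(-ax * ax));
        return sign * y;
    }

    // Complementary CDF of a log-normal distribution: P(X > value).
    // Lighthouse's scoring follows this general shape.
    static double score(double valueKb, double medianKb, double sigma) {
        double z = Math.log(valueKb / medianKb) / (sigma * Math.sqrt(2));
        return 0.5 * (1.0 - erf(z));
    }

    public static void main(String[] args) {
        double median = 4000; // KB; maps to a score of 50
        double sigma = 0.324; // illustrative spread, not Lighthouse's constant
        System.out.printf("2500KB -> %.0f%n", 100 * score(2500, median, sigma)); // ~93
        System.out.printf("2405KB -> %.0f%n", 100 * score(2405, median, sigma)); // ~94
        System.out.printf("4000KB -> %.0f%n", 100 * score(4000, median, sigma)); // 50
    }
}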

Finding the central 95% of non-normally distributed points around the mean/median

I asked users to tap a location repeatedly. To calculate the size of a target at that location, such that 95% of users will hit it successfully, I usually take 2 standard deviations of the tap offsets from the centroid. That works if the tap offsets are normally distributed, but my data is not normally distributed. How can I figure out the equivalent of 2 standard deviations around the mean/median?
If you're only measuring in one dimension, the region encompassed by +/-2 std in a Normal distribution corresponds fairly well to the central 95% of the distribution. Perhaps it's worth working with quantiles instead: take the interval between the 2.5th and 97.5th percentiles. This will be robust to skew or any other departure from normality.
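A minimal sketch of the quantile approach (the offsets array is hypothetical one-dimensional tap data; any skewed, non-normal sample works the same way):

import java.util.Arrays;

public class CentralIntervalSketch {
    // Simple empirical quantile with linear interpolation between ranks.
    static double quantile(double[] sorted, double q) {
        double pos = q * (sorted.length - 1);
        int lo = (int) Math.floor(pos), hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        // Hypothetical tap offsets (e.g. mm from the target centroid).
        double[] offsets = {-4.1, -2.0, -1.2, -0.8, -0.3, 0.1, 0.4,
                            0.9, 1.5, 2.2, 3.0, 4.8, 7.5, 11.0};
        Arrays.sort(offsets);
        double lo = quantile(offsets, 0.025);
        double hi = quantile(offsets, 0.975);
        // This interval covers the central 95% of taps regardless of the
        // distribution's shape, unlike mean +/- 2 std.
        System.out.printf("central 95%%: [%.2f, %.2f]%n", lo, hi);
    }
}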

Cassandra metrics: difference between latency and total latency

I'm using Cassandra 2.2 and sending the Cassandra metrics to Graphite using pluggable metrics.
I've looked in org.apache.cassandra.metrics.ColumnFamily and saw there is a "count" attribute in both ReadLatency and ReadTotalLatency.
What is the difference between the two count attributes?
My main goal is to get the latency per read/write. How do you advise me to get it?
Thanks!
org.apache.cassandra.metrics.ColumnFamily.ReadTotalLatency is a Counter which gives the sum of all read latencies.
org.apache.cassandra.metrics.ColumnFamily.ReadLatency is a Timer which gives insight into how long the reads are taking; it reports attributes like min, max, mean, 75percentile, 90percentile, 99percentile.
For your purpose you should use ReadLatency and WriteLatency.
Difference between the two "count" attributes:
org.apache.cassandra.metrics.ColumnFamily.ReadTotalLatency is a Counter. Its "count" attribute provides the sum of all read latencies.
org.apache.cassandra.metrics.ColumnFamily.ReadLatency is a Timer. Its "count" attribute provides the number of Timer#update calls.
To get the recent latency per read/write:
Use attributes like "min", "max", "mean", "75percentile", "90percentile", "99percentile". Cassandra 2.2.7 uses DecayingEstimatedHistogramReservoir for the Timer's reservoir, which makes recent values more significant.
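A small sketch of deriving an average latency per read from those two "count" attributes, sampled at two points in time (the snapshot values are hypothetical; Cassandra reports total latency in microseconds):

public class ReadLatencySketch {
    // Snapshot of the two counters at one point in time.
    static final class Snapshot {
        final long totalLatencyMicros; // ReadTotalLatency.count (sum of latencies)
        final long readCount;          // ReadLatency.count (number of reads)
        Snapshot(long totalLatencyMicros, long readCount) {
            this.totalLatencyMicros = totalLatencyMicros;
            this.readCount = readCount;
        }
    }

    // Average latency per read between two snapshots, in microseconds.
    static double avgReadLatencyMicros(Snapshot before, Snapshot after) {
        long reads = after.readCount - before.readCount;
        long micros = after.totalLatencyMicros - before.totalLatencyMicros;
        return reads == 0 ? 0.0 : (double) micros / reads;
    }

    public static void main(String[] args) {
        // Hypothetical values as they might arrive in Graphite.
        Snapshot before = new Snapshot(1_000_000, 2_000);
        Snapshot after  = new Snapshot(1_450_000, 2_900);
        System.out.printf("avg read latency: %.1f us%n",
                avgReadLatencyMicros(before, after)); // 500.0 us
    }
}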
