memcached statistics: total_items is less than cmd_set

The statistics on my memcached server show a strange relationship: total_items is less than cmd_set. The only operations run on this server are "set" and "get"; nothing else like "add", "replace", "delete", or CAS-based operations.
When I go through the memcached source code, I see that any "normal" set (without CAS) will either replace or link the item, and both paths increment total_items:
do_store_item() {
    // ...
    if (old_it != NULL)
        item_replace(old_it, it, hv);
    else
        do_item_link(it, hv);
    // ...
}
do_item_link() increases total_items, and item_replace() also calls do_item_link(). Then how can total_items be less than cmd_set?
Excerpt from memcached stats:
STAT cmd_set 12827359728
STAT total_items 4237422053
STAT curr_items 60745375
STAT expired_unfetched 9898430934
STAT evicted_unfetched 30415090
STAT evictions 30421532
STAT reclaimed 9900995350

The statistics look so strange because they are wrong. It is a memcached bug: total_items was kept in a 32-bit counter and overflowed at 2^32. The memcached developers have acknowledged and fixed it; for details see https://github.com/memcached/memcached/issues/161.
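The overflow is easy to check against the numbers in the question: reducing cmd_set modulo 2^32 lands within a few thousand of the reported total_items (the small remainder plausibly being sets that stored nothing):

```shell
# total_items wrapped at 2^32; cmd_set (64-bit) did not.
cmd_set=12827359728
total_items=4237422053
wrapped=$(( cmd_set % (1 << 32) ))
echo "$wrapped"                      # 4237425136
echo $(( wrapped - total_items ))    # 3083 -- close to total_items, as expected
```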


Force lshosts command to return megabytes for "maxmem" and "maxswp" parameters

When I type "lshosts" I am given:
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
server1 X86_64 Intel_EM 60.0 12 191.9G 159.7G Yes ()
server2 X86_64 Intel_EM 60.0 12 191.9G 191.2G Yes ()
server3 X86_64 Intel_EM 60.0 12 191.9G 191.2G Yes ()
I am trying to get lshosts to report maxmem and maxswp in megabytes, not gigabytes. I am sending Xilinx ISE jobs to my LSF cluster, but the software expects integer megabyte values for maxmem and maxswp. Through debugging, it appears that the software grabs these parameters using the lshosts command.
I have already checked in my lsf.conf file that:
LSF_UNIT_FOR_LIMITS=MB
I have tried searching the IBM Knowledge Base, but to no avail.
Do you use a specific command to specify maxmem and maxswp units within the lsf.conf, lsf.shared, or other config files?
Or does LSF force return the most practical unit?
Any way to override this?
LSF_UNIT_FOR_LIMITS should work, provided you have completely drained the cluster of all running, pending, and finished jobs. According to the docs, MB is the default, so I'm surprised it isn't working.
That said, you can use something like this to transform the results:
$ cat to_mb.awk
function to_mb(s) {
    # last character is the unit letter: K, M, or G
    e = index("KMG", substr(s, length(s)))
    # numeric part: everything before the unit (awk's substr is 1-indexed)
    m = substr(s, 1, length(s) - 1)
    return m * 10^((e - 2) * 3)
}
{ print $1 " " to_mb($6) " " to_mb($7) }
$ lshosts | tail -n +2 | awk -f to_mb.awk
server1 191900 159700
server2 191900 191200
server3 191900 191200
The to_mb function should also handle 'K' or 'M' units, should those pop up.
If LSF_UNIT_FOR_LIMITS is defined in lsf.conf, lshosts will always print the output as a floating-point number, and in some versions of LSF the parameter is set to 'KB' in lsf.conf at installation.
Try searching for any definitions of the parameter in lsf.conf and commenting them all out so that the parameter is left undefined, I think in that case it defaults to printing it out as an integer in megabytes.
(Don't ask me why it works this way)

Mnesia pagination with fragmented table

I have a mnesia table configured as follows:
-record(space, {id, t, s, a, l}).
mnesia:create_table(space, [ {disc_only_copies, nodes()},
{frag_properties, [ {n_fragments, 400}, {n_disc_copies, 1}]},
{attributes, record_info(fields, space)}]),
I have at least 4 million records in this table for test purposes.
I have implemented something like this: Pagination search in Erlang Mnesia
fetch_paged() ->
MatchSpec = {'_',[],['$_']},
{Record, Continuation} = mnesia:activity(async_dirty, fun mnesia:select/4, [space, [MatchSpec], 10000, read], mnesia_frag).
next_page(Cont) ->
mnesia:activity(async_dirty, fun() -> mnesia:select(Cont) end, mnesia_frag).
When I execute the pagination methods, each batch contains between 3,000 and 8,000 records, but never 10,000.
What do I have to do to get consistent batch sizes?
The problem is that you expect mnesia:select/4, which is documented as:
select(Tab, MatchSpec, NObjects, Lock) -> transaction abort | {[Object],Cont} | '$end_of_table'
to return NObjects results, NObjects being 10,000 in your example.
But the same documentation also says:
For efficiency the NObjects is a recommendation only and the result may contain anything from an empty list to all available results.
and that's the reason you are not getting consistent batches of 10,000 records: NObjects is not a limit but a recommended batch size.
If you want exactly 10,000 records, you will have no option other than writing your own function; but select/4 is written this way for optimization purposes, so the code you write will most probably be slower than the original.
BTW, you can find the mnesia source code on https://github.com/erlang/otp/tree/master/lib/mnesia/src
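If exact batch sizes matter more than speed, such a wrapper can keep calling select until it has accumulated the requested number of records. A minimal sketch (function names are hypothetical; note that records trimmed off an oversized batch are skipped by the continuation, so a production version would need to carry them over to the next page):

```erlang
%% Hypothetical wrapper: keep selecting until N records are accumulated
%% or the table is exhausted. Slower than a single select/4 call.
select_n(Tab, MatchSpec, N) ->
    F = fun() -> collect(mnesia:select(Tab, [MatchSpec], N, read), [], N) end,
    mnesia:activity(async_dirty, F, [], mnesia_frag).

collect('$end_of_table', Acc, _Left) ->
    {Acc, '$end_of_table'};
collect({Recs, Cont}, Acc, Left) when length(Recs) >= Left ->
    {Taken, _Dropped} = lists:split(Left, Recs),
    {Acc ++ Taken, Cont};   %% _Dropped records are lost to paging here
collect({Recs, Cont}, Acc, Left) ->
    collect(mnesia:select(Cont), Acc ++ Recs, Left - length(Recs)).
```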

NodeJS FS Write to Read inconsistent data without overflow (Solved as Buffer.toString)

I'm using the fs.createWriteStream(path[, options]) function to create a write stream of text lines, each ending with \n.
But when the process has ended, if I check the stream later it seems to be corrupted, showing a few corrupted lines (roughly 0.05% of the lines look partially cut, like a buffer-overflow error).
If I grow the internal stream buffer from 16k to 4M with the highWaterMark option at stream creation, the error rate changes but does not disappear!
It's due to an error in reading, not in writing.
I was doing something like:
var lines = [],
    line = '',
    buf = new Buffer(8192);
while ((fs.readSync(fd, buf, 0, buf.length)) != 0) {
    lines = buf.toString().split("\n");
    lines[0] = line + lines[0];
    line = lines.pop();
}
but this method, which you can find here and there on the web, is really, really wrong!
You have to use the real number of bytes read when you convert the buffer to a string, via buf.toString(null, 0, read_len)!
var lines = [],
    line = '',
    buf = new Buffer(8192),   // Buffer.alloc(8192) on modern Node
    read_len;
while ((read_len = fs.readSync(fd, buf, 0, buf.length)) != 0) {
    lines = buf.toString(null, 0, read_len).split("\n");
    lines[0] = line + lines[0];
    line = lines.pop();
}

Compare items in two files and output the combined result to a new file using AWK

Greetings!
I have pairs of files taken from two nodes in a network; each file has records of TCP segment send/receive times, IP id number, segment type, seq number, and so on.
For the same TCP flow, it looks like this on the sender side:
1420862364.778332 50369 seq 17400:18848
1420862364.780798 50370 seq 18848:20296
1420862364.780810 50371 seq 20296:21744
....
or like this on the receiver side (1 second delay; the segment with IP id 50371 was lost):
1420862364.778332 50369 seq 17400:18848
1420862364.780798 50370 seq 18848:20296
....
I want to compare the IP identification numbers in the two files and output to a new file like this:
1420862364.778332 1420862365.778332 50369 seq 17400:18848 o
1420862364.780798 1420862365.780798 50370 seq 18848:20296 o
1420862364.780810 1420862365.780810 50371 seq 20296:21744 x
which has the time of arrival on the receiver side; by comparing the id field, when the same value is not found on the receiver side (packet loss), an x is added, otherwise an o.
I already have code like this,
awk 'ARGIND==1 {w[$2]=$1}
ARGIND==2 {
flag=0;
for(a in w)
if($2==a) {
flag=1;
print $1,w[a],$2,$3,$4;
break;
}
if(!flag)
print $1,"x",$2,$3,$4;
}' file2 file1 >file3
but it doesn't work on Linux: it exits right after I press Enter and leaves only an empty file.
The shell script containing this code has been through chmod +x.
Please help. My code is not well organized; any one-liner will be appreciated.
Thank you for your time.
ARGIND is gawk-specific btw so check your awk version. – Ed Morton
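As that comment notes, ARGIND is gawk-only, which would explain the script dying under a non-GNU awk. The inner loop over w is also unnecessary; an array membership test does the same job. A portable sketch (FNR==NR is true only while reading the first file; as in the original script, "x" stands in for the missing arrival time of a lost segment):

```shell
awk 'FNR == NR { w[$2] = $1; next }              # receiver file: id -> arrival time
     $2 in w   { print $1, w[$2], $2, $3, $4, "o"; next }
               { print $1, "x", $2, $3, $4, "x" }' file2 file1 > file3
```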

Counting TCP retransmissions

I would like to know if there is a way to count the number of TCP retransmissions that occurred in a flow, in LINUX. Either on the client side or the server side.
Looks like netstat -s solves my purpose.
You can see TCP retransmissions for a single TCP flow using Wireshark: the "Follow TCP Stream" feature shows a single TCP stream, and the tcp.analysis.retransmission display filter shows retransmissions.
For more details, this serverfault question may be useful: https://serverfault.com/questions/318909/how-passively-monitor-for-tcp-packet-loss-linux
The Linux kernel exposes retransmission counters through the proc pseudo-filesystem: the total RetransSegs counter lives in /proc/net/snmp, and finer-grained counters such as TCPSynRetrans live in /proc/net/netstat.
For example, to print the RetransSegs column (the 13th field of the Tcp: lines):
awk '$1 ~ "Tcp:" { print $13 }' /proc/net/snmp
Per the kernel documentation:
TCPSynRetrans: this counter is explained by kernel commit f19c29e3e391 as the number of SYN and SYN/ACK retransmits, added to break down retransmissions into SYN, fast retransmits, timeout retransmits, etc.
Related tunables can also be adjusted through procfs, under /proc/sys; the sysctl utility is a handy shorthand for reading and writing them:
sysctl -a | grep retrans
net.ipv4.neigh.default.retrans_time_ms = 1000
net.ipv4.neigh.docker0.retrans_time_ms = 1000
net.ipv4.neigh.enp1s0.retrans_time_ms = 1000
net.ipv4.neigh.lo.retrans_time_ms = 1000
net.ipv4.neigh.wlp6s0.retrans_time_ms = 1000
net.ipv4.tcp_early_retrans = 3
net.ipv4.tcp_retrans_collapse = 1
net.ipv6.neigh.default.retrans_time_ms = 1000
net.ipv6.neigh.docker0.retrans_time_ms = 1000
net.ipv6.neigh.enp1s0.retrans_time_ms = 1000
net.ipv6.neigh.lo.retrans_time_ms = 1000
net.ipv6.neigh.wlp6s0.retrans_time_ms = 1000
net.netfilter.nf_conntrack_tcp_max_retrans = 3
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
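To turn the raw counter into something readable, the header line of /proc/net/snmp can be used to locate columns by name, for example reporting RetransSegs as a fraction of OutSegs (a rough retransmission rate; the column names are taken from the file's own Tcp: header line):

```shell
# The first Tcp: line names the columns, the second carries the values.
awk '$1 == "Tcp:" {
       if (!header) {
         for (i = 2; i <= NF; i++) col[$i] = i
         header = 1
       } else {
         printf "%.4f%% retransmitted\n",
                100 * $(col["RetransSegs"]) / $(col["OutSegs"])
       }
     }' /proc/net/snmp
```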
