Mysql seconds_behind master very high - mysql-5.7

Hi we have mysql master slave replication, master is mysql 5.6 and slave is mysql 5.7, seconds behind master is 245000, how I make it catch up faster. Right now it is taking more than 6 hours to copy 100 000 seconds.
My slave ram is 128 GB. Below is my my.cnf
[mysqld]
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
innodb_buffer_pool_size = 110G
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
# These are commonly set, remove the # and set as required.
basedir = /usr/local/mysql
datadir = /disk1/mysqldata
port = 3306
#server_id = 3
socket = /var/run/mysqld/mysqld.sock
user=mysql
log_error = /var/log/mysql/error.log
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
join_buffer_size = 256M
sort_buffer_size = 128M
read_rnd_buffer_size = 2M
#copied from old config
#key_buffer = 16M
max_allowed_packet = 256M
thread_stack = 192K
thread_cache_size = 8
query_cache_limit = 1M
#disabling query_cache_size and type, for replication purpose, need to enable it when going live
query_cache_size = 0
#query_cache_size = 64M
#query_cache_type = 1
query_cache_type = OFF
#GroupBy
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
#sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
enforce-gtid-consistency
gtid-mode = ON
log_slave_updates=0
slave_transaction_retries = 100
#replication related changes
server-id = 2
relay-log = /disk1/mysqllog/mysql-relay-bin.log
log_bin = /disk1/mysqllog/binlog/mysql-bin.log
binlog_do_db = brandmanagement
#replicate_wild_do_table=brandmanagement.%
replicate-wild-ignore-table=brandmanagement.t\_gnip\_data\_recent
replicate-wild-ignore-table=brandmanagement.t\_gnip\_data
replicate-wild-ignore-table=brandmanagement.t\_fb\_rt\_data
replicate-wild-ignore-table=brandmanagement.t\_keyword\_tweets
replicate-wild-ignore-table=brandmanagement.t\_gnip\_data\_old
replicate-wild-ignore-table=brandmanagement.t\_gnip\_data\_new
binlog_format=row
report-host=10.125.133.220
report-port=3306
#sync-master-info=1
read-only=1
net_read_timeout = 7200
net_write_timeout = 7200
innodb_flush_log_at_trx_commit = 2
sync_binlog=0
sync_relay_log_info=0
max_relay_log_size=268435456

Lots of possible solutions. But I'll go with the simplest one. Have you got enough network bandwidth to send all changes over the network? You're using "row" binlog, which may be good in case of random, unindexed updates. But if you're changing a lot of data using indexes only, then "mixed" binlog may be better.

Related

snakemake allocates memory twice

I am noticing that all my rules request memory twice, one at a lower maximum than what I requested (mem_mb) and then what I actually requested (mem_gb). If I run the rules as localrules they do run faster. How can I make sure the default settings do not interfere?
resources: mem_mb=100, disk_mb=8620, tmpdir=/tmp/pop071.54835, partition=h24, qos=normal, mem_gb=100, time=120:00:00
The rules are as follows:
rule bwa_mem2_mem:
input:
R1 = "data/results/qc/{species}.{population}.{individual}_1.fq.gz",
R2 = "data/results/qc/{species}.{population}.{individual}_2.fq.gz",
R1_unp = "data/results/qc/{species}.{population}.{individual}_1_unp.fq.gz",
R2_unp = "data/results/qc/{species}.{population}.{individual}_2_unp.fq.gz",
idx= "data/results/genome/genome",
ref = "data/results/genome/genome.fa"
output:
bam = "data/results/mapped_reads/{species}.{population}.{individual}.bam",
log:
bwa ="logs/bwa_mem2/{species}.{population}.{individual}.log",
sam ="logs/samtools_view/{species}.{population}.{individual}.log",
benchmark:
"benchmark/bwa_mem2_mem/{species}.{population}.{individual}.tsv",
resources:
time = parameters["bwa_mem2"]["time"],
mem_gb = parameters["bwa_mem2"]["mem_gb"],
params:
extra = parameters["bwa_mem2"]["extra"],
tag = compose_rg_tag,
threads:
parameters["bwa_mem2"]["threads"],
shell:
"bwa-mem2 mem -t {threads} -R '{params.tag}' {params.extra} {input.idx} {input.R1} {input.R2} | "
"samtools sort -l 9 -o {output.bam} --reference {input.ref} --output-fmt CRAM -# {threads} /dev/stdin 2> {log.sam}"
and the config is:
cluster:
mkdir -p logs/{rule} && # change the log file to logs/slurm/{rule}
sbatch
--partition={resources.partition}
--time={resources.time}
--qos={resources.qos}
--cpus-per-task={threads}
--mem={resources.mem_gb}
--job-name=smk-{rule}-{wildcards}
--output=logs/{rule}/{rule}-{wildcards}-%j.out
--parsable # Required to pass job IDs to scancel
default-resources:
- partition=h24
- qos=normal
- mem_gb=100
- time="04:00:00"
restart-times: 3
max-jobs-per-second: 10
max-status-checks-per-second: 1
local-cores: 1
latency-wait: 60
jobs: 100
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-conda: True # Required to run with local conda enviroment
cluster-status: status-sacct.sh # Required to monitor the status of the submitted jobs
cluster-cancel: scancel # Required to cancel the jobs with Ctrl + C
cluster-cancel-nargs: 50
Cheers,
Angel
Right now there are two separate memory resource requirements:
mem_mb
mem_gb
From the perspective of snakemake these are different, so both will be passed to the cluster. A quick fix is to use the same units, e.g. if the resource really requires only 100 mb, then the default resource should be changed to:
default-resources:
- partition=h24
- qos=normal
- mem_mb=100

Submitted metrics not showing up on prometheus endpoint

I have a code which looks like this, it is supposed to collect some custom metrics and expose it over prometheus.
def collect_metrics():
registry = prometheus_client.CollectorRegistry()
label_names = ['parent', 'namespace','team', 'name', 'status']
sib = Gauge(f'disk_sizeInBytes','Gets the size of the disk in bytes.', label_names, registry=registry)
msib = Gauge(f'disk_maxSizeInMegabytes', 'Gets or sets the maximum size of the disk in megabytes, which is the size of memory allocated for the disk.', label_names, registry=registry)
...
sib.labels(parent=parent_name, namespace=namespace_name, team=team, name=disk_name, status=disk_status).set(disk_list[dp]["sizeInBytes"])
msib.labels(parent=parent_name, namespace=namespace_name, team=team, name=disk_name, status=disk_status).set(disk_list[dp]["maxSizeInMegabytes"])
print(f'{datetime.datetime.now()} | disk_name: {disk_name} | sib: {disk_list[dp]["sizeInBytes"]} | msib: {disk_list[dp]["maxSizeInMegabytes"]}')
...
if __name__ == '__main__':
...
start_htdp_server(8005)
collect_metrics()
The code works fine without any errors, however I don’t see anything being shown over endpoint http://localhost:8005/, though i see some default metrics being shown such as:
# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 403.0
python_gc_objects_collected_total{generation="1"} 0.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 39.0
python_gc_collections_total{generation="1"} 3.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="4",version="3.10.4"} 1.0
Can someone point me, what is the issue here?
Couple of things:
Remove registry = prometheus_client.CollectorRegistry()
Remove registry=registry from the Gauge declarations
Add a loop to keep the process running.
import datetime
import re
import time
from prometheus_client import CollectorRegistry,Gauge
from prometheus_client import start_http_server
def collect_metrics():
label_names = ['parent', 'namespace','team', 'name', 'status']
sib = Gauge(
'disk_sizeInBytes',
'Gets the size of the disk in bytes.',
label_names,
)
msib = Gauge(
'disk_maxSizeInMegabytes',
'Gets or sets the maximum size of the disk in megabytes, which is the size of memory allocated for the disk.',
label_names,
)
sib.labels(
parent="parent_name",
namespace="namespace_name",
team="team",
name="disk_name",
status="disk_status",
).set(10.0)
msib.labels(
parent="parent_name",
namespace="namespace_name",
team="team",
name="disk_name",
status="disk_status",
).set(5.0)
if __name__ == '__main__':
...
start_http_server(8005)
collect_metrics()
while True:
time.sleep(5)
# HELP disk_sizeInBytes Gets the size of the disk in bytes.
# TYPE disk_sizeInBytes gauge
disk_sizeInBytes{name="disk_name",namespace="namespace_name",parent="parent_name",status="disk_status",team="team"} 10.0
# HELP disk_maxSizeInMegabytes Gets or sets the maximum size of the disk in megabytes, which is the size of memory allocated for the disk.
# TYPE disk_maxSizeInMegabytes gauge
disk_maxSizeInMegabytes{name="disk_name",namespace="namespace_name",parent="parent_name",status="disk_status",team="team"} 5.0

Linux memory allocation on Rstudio/Rstudio Server

I am trying to do clustering with CLARA using Rstudio on Linux and I have a very large dataset.
However, it seemed that the memory is not enough for the whole dataset?
## Estimating the number of clusters ----
fviz_nbclust(df, clara, method = "silhouette", k.max = 15)
It showed me this:
Error: cannot allocate vector of size 339.8 GB
So I tried all of this and it still didn't work. memory.limit is also specific for Windows only (I still gave it a try tho).
# devtools::install_github("krlmlr/ulimit")
# gc()
# memory.limit(9999999999)
#
#
# install.packages("devtools", dependencies = TRUE)
# devtools::install_github("krlmlr/ulimit")
# ulimit::memory_limit(2000)
#
# devtools::install_github("jeroen/unix")
#
#
# if(.Platform$OS.type == "windows") withAutoprint({
# memory.size()
# memory.size(TRUE)
# memory.limit()
# })
# memory.limit(size=56000)
# memory.size(max = FALSE)
Can somebody help me with this?
Any help would be appreciated!
The error simply means that it cannot allocate 339.8 GB to your RAM. Do you have 360GB of RAM?
If not, you will just have to dplyr::nsample() and just run the function on a subset of your dataset.

SOLVED - Debug Darkice to understand why is not connecting to shoutcast

I'm trying to connect to a shoutcast server from a darkice client using Ubuntu. This is my configuration:
#this section describes general aspects of the live streaming session
[general]
duration = 0 # duration of encoding, in seconds. 0 means forever
bufferSecs = 10 # size of internal slip buffer, in seconds
reconnect = yes # reconnect to the server(s) if disconnected
realtime = no # run the encoder with POSIX realtime priority
rtprio = 3 # scheduling priority for the realtime threads
# this section describes the audio input that will be streamed
[input]
device = hw:CARD=PCH,DEV=0
sampleRate = 44100 # sample rate in Hz. try 11025, 22050 or 44100
bitsPerSample = 16 # bits per sample. try 16
channel = 2 # channels. 1 = mono, 2 = stereo
# this section describes a streaming connection to an IceCast2 server
# there may be up to 8 of these sections, named [icecast2-0] ... [icecast2-7]
# these can be mixed with [icecast-x] and [shoutcast-x] sections
[shoutcast-0]
bitrateMode = cbr
format = mp3
bitrate = 96
quality = 1.0
server = xxxxxxxxxxxxxxx
port = 8020
password = xxxxxxxxxxxxxxx
name = Radio website
url = https://www.mywebsite.it
genre = live
public = no
But when I run
darkice -v 10 -c /etc/darkice-shoutcast.cfg
It only shows this, without errors or similar, but there is no streaming at the url. Using BUTT it works. I've also tested with 8021 instead of 8020 for port (8020 it's the port number given by the provider) but no luck.
DarkIce 1.4 live audio streamer, http://code.google.com/p/darkice/
Copyright (c) 2000-2007, Tyrell Hungary, http://tyrell.hu/
Copyright (c) 2008-2013, Akos Maroy and Rafael Diniz
This is free software, and you are welcome to redistribute it
under the terms of The GNU General Public License version 3 or
any later version.
Using config file: /etc/darkice-shoutcast.cfg
18-May-2021 12:02:28 Using ALSA DSP input device: hw:CARD=PCH,DEV=0
18-May-2021 12:02:28 buffer size: 1764000
18-May-2021 12:02:28 encoding
18-May-2021 12:02:28 MultiThreadedConnector :: transfer, bytes 0
18-May-2021 12:02:28 MultiThreadedConnector :: ThreadData :: threadFunction, was (thread, priority, type): 0x5568a502c010 0 SCHED_OTHER
18-May-2021 12:02:28 MultiThreadedConnector :: ThreadData :: threadFunction, now is (thread, priority, type): 0x5568a502c010 0 SCHED_OTHER
ADDENDUM
I've used tcpdump to understand what could be and I just see something similar to "invalid password"
: Flags [P.], cksum 0xc379 (correct), seq 1:19, ack 3090, win 294, options [nop,nop,TS val 3348978428 ecr 531576376], length 18
E..F1O#.1.3'.}.......T.Jq.k.D......&.y.....
..Z...68Invalid Passwor
Suggestions on how to better debug or fix this?
SOLVED
It seems the error is related to the password and wrong parsing of the config file, so I've written it without spaces
[shoutcast-0]
bitrateMode = cbr
format = mp3
bitrate = 96
quality = 1.0
server = xxxxxxxxxxxxxxx
port = 8020
password=xxxxxxxxxxxxxxx
name = Radio website
url = https://www.mywebsite.it
genre = live
public = no

Counting TCP retransmissions

I would like to know if there is a way to count the number of TCP retransmissions that occurred in a flow, in LINUX. Either on the client side or the server side.
Looks like netstat -s solves my purpose.
You can see TCP retransmissions for a single TCP flow using Wireshark. The "follow TCP stream" filter will allow you to see a single TCP stream. And the tcp.analysis.retransmission one will show retransmissions.
For more details, this serverfault question may be useful: https://serverfault.com/questions/318909/how-passively-monitor-for-tcp-packet-loss-linux
The Linux kernel provides an interface through the pseudo-filesystem proc for counters to track the TCPSynRetrans
For example:
awk '$1 ~ "Tcp:" { print $13 }' /proc/net/snmp
Per documentation:
* TCPSynRetrans
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
explanation below::
--
TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
retransmissions into SYN, fast-retransmits, timeout retransmits, etc.
You can also adjust these settings also through the pseudo-filesystem procfs but under the sys directory. There is a handy utility that does this short-hand for you.
sysctl -a | grep retrans
net.ipv4.neigh.default.retrans_time_ms = 1000
net.ipv4.neigh.docker0.retrans_time_ms = 1000
net.ipv4.neigh.enp1s0.retrans_time_ms = 1000
net.ipv4.neigh.lo.retrans_time_ms = 1000
net.ipv4.neigh.wlp6s0.retrans_time_ms = 1000
net.ipv4.tcp_early_retrans = 3
net.ipv4.tcp_retrans_collapse = 1
net.ipv6.neigh.default.retrans_time_ms = 1000
net.ipv6.neigh.docker0.retrans_time_ms = 1000
net.ipv6.neigh.enp1s0.retrans_time_ms = 1000
net.ipv6.neigh.lo.retrans_time_ms = 1000
net.ipv6.neigh.wlp6s0.retrans_time_ms = 1000
net.netfilter.nf_conntrack_tcp_max_retrans = 3
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300

Resources