Maximum number of concurrent LOAD FROM S3 requests in Aurora MySQL

I've been using the LOAD FROM S3 feature of Aurora MySQL. To max out the box and the bandwidth, I submitted 30 concurrent S3 load requests. Checking the processlist in MySQL, I see only 4 S3 LOAD queries running at any point in time. Is that a hard limit, or am I doing something incorrect?
Here are my scripts for submitting the loads:
Single LOAD from S3:
cat load-one-table.sh
export MYSQL_PWD=[REDACTED]
echo "Shard #$1"
mysql -u [REDACTED] -B -e "load data from s3 's3://A/B/C/D' into table DB.table_"$1;
Concurrent LOADS:
cat parallel-load.sh
for i in $(seq 1 $1); do
    echo "Starting Task #$i"
    nohup ./load-one-table.sh $i &
    echo "Submitted Task #$i"
done
Trigger:
./parallel-load.sh 30
I do see all 30 requests submitted in the nohup logs.
Explicitly checking if all loads are running on the client:
ps auxww | grep -c "load-one-table"
31
# The result shows 31, since there is one extra process match for the grep
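A count that does not pick up the grep process itself (a sketch; pgrep -f matches against the full command line and never reports the pgrep process as a match):
pgrep -fc "load-one-table"
# prints 30 while all thirty load-one-table.sh processes are alive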
Checking accepted requests on the server:
show full processlist;
+-----+----------+---------------------+--------------------+---------+------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+----------+---------------------+--------------------+---------+------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| 606 | REDACTED | localhost | NULL | Query | 60 | executing | load data from s3 's3://REDACTED' into table REDACTED_2 |
| 607 | REDACTED | localhost | NULL | Query | 60 | executing | load data from s3 's3://REDACTED' into table REDACTED_4 |
| 608 | REDACTED | localhost | NULL | Query | 60 | executing | load data from s3 's3://REDACTED' into table REDACTED_1 |
| 609 | REDACTED | localhost | NULL | Query | 60 | executing | load data from s3 's3://REDACTED' into table REDACTED_3 |
| 610 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 611 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 612 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 613 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 614 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 615 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 616 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 617 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 618 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 619 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 620 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 621 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 622 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 623 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 624 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 625 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 626 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 627 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 628 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 629 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 630 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 631 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 632 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 633 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 634 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
| 635 | REDACTED | localhost | NULL | Sleep | 60 | cleaning up | NULL |
+-----+----------+---------------------+--------------------+---------+------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
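For reference, the server-side count of loads that are actually executing can be polled like this (a sketch only: the loop is illustrative, and it assumes the same MYSQL_PWD/user setup as load-one-table.sh and that the user may read information_schema.processlist):
export MYSQL_PWD=[REDACTED]
while true; do
    # count sessions currently executing a LOAD DATA FROM S3 statement
    mysql -u [REDACTED] -N -B -e "SELECT COUNT(*) FROM information_schema.processlist WHERE info LIKE 'load data from s3%' AND state = 'executing'"
    sleep 5
done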

Related

Creating a table of schedules in Excel from 3 references

I'm trying to create a table that makes it easy to view the schedules from raw data.
The raw data looks like this:
Jobname: deletearchive
Start_time: 00:00,01:00,23:45
Days_of_week: su,mo,sa
I would like to put them into columns with a header row for the day (su, mo, ...) and the times from 00:00 to 23:00 for each day. One challenge I see is that I need to consider 3 criteria - the day (su to mo), the time (some are not at 00 minutes and need to be rounded down), and the job name. I have 500 job names with different schedules.
+---------------+-------+-------+-------+-------+-------+-------+--+--+--+--+--+--+--+
| JOBNAME | su | su | su | mo | mo | mo | | | | | | | |
+---------------+-------+-------+-------+-------+-------+-------+--+--+--+--+--+--+--+
| | 00:00 | 01:00 | 23:00 | 00:00 | 01:00 | 23:00 | | | | | | | |
| deletearchive | yes | yes | 23:45 | yes | yes | 23:45 | | | | | | | |
| JOB2 | | | | | | | | | | | | | |
| JOB3 | | | | | | | | | | | | | |
+---------------+-------+-------+-------+-------+-------+-------+--+--+--+--+--+--+--+

MemSQL is taking a lot of time to execute a query

I am having a problem with my MemSQL cluster: when I run the query to fetch the 51M records, it returns the result in 5 minutes, but it takes more than 15 minutes when data insertion runs in parallel with the read.
I measured disk I/O and it is OK; the disk is an HDD.
There are no other connections to MemSQL, and the CPU is only about 15% utilized on a 64-core machine.
Below are my variables:
| Variable_name | Value |
+----------------------------------------------+------------------------------------------------------------------------------+
| aggregator_failure_detection | ON |
| auto_replicate | OFF |
| autocommit | ON |
| basedir | /data/master-3306 |
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_sets_dir | /data/master-3306/share/charsets/ |
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
| columnar_segment_rows | 102400 |
| columnstore_window_size | 2147483648 |
| compile_only | OFF |
| connect_timeout | 10 |
| core_file | ON |
| core_file_mode | PARTIAL |
| critical_diagnostics | ON |
| datadir | /data/master-3306/data |
| default_partitions_per_leaf | 16 |
| enable_experimental_metrics | OFF |
| error_count | 0 |
| explain_expression_limit | 500 |
| external_user | |
| flush_before_replicate | OFF |
| general_log | OFF |
| geo_sphere_radius | 6367444.657120 |
| hostname | **** |
| identity | 0 |
| kerberos_server_keytab | |
| lc_messages | en_US |
| lc_messages_dir | /data/master-3306/share |
| leaf_failure_detection | ON |
| load_data_max_buffer_size | 1073741823 |
| load_data_read_size | 8192 |
| load_data_write_size | 8192 |
| lock_wait_timeout | 60 |
| master_aggregator | self |
| max_allowed_packet | 104857600 |
| max_connection_threads | 192 |
| max_connections | 100000 |
| max_pooled_connections | 4096 |
| max_prefetch_threads | 1 |
| max_prepared_stmt_count | 16382 |
| max_user_connections | 0 |
| maximum_memory | 506602 |
| maximum_table_memory | 455941 |
| memsql_id | ** |
| memsql_version | 5.7.2 |
| memsql_version_date | Thu Jan 26 12:34:22 2017 -0800 |
| memsql_version_hash | 03e5e3581e96d65caa30756f191323437a3840f0 |
| minimal_disk_space | 100 |
| multi_insert_tuple_count | 20000 |
| net_buffer_length | 102400 |
| net_read_timeout | 3600 |
| net_write_timeout | 3600 |
| pid_file | /data/master-3306/data/memsqld.pid |
| pipelines_batches_metadata_to_keep | 1000 |
| pipelines_extractor_debug_logging | OFF |
| pipelines_kafka_version | 0.8.2.2 |
| pipelines_max_errors_per_partition | 1000 |
| pipelines_max_offsets_per_batch_partition | 1000000 |
| pipelines_max_retries_per_batch_partition | 4 |
| pipelines_stderr_bufsize | 65535 |
| pipelines_stop_on_error | ON |
| plan_expiration_minutes | 720 |
| port | 3306 |
| protocol_version | 10 |
| proxy_user | |
| query_parallelism | 0 |
| redundancy_level | 1 |
| reported_hostname | |
| secure_file_priv | |
| show_query_parameters | ON |
| skip_name_resolve | AUTO |
| snapshot_trigger_size | 268435456 |
| snapshots_to_keep | 2 |
| socket | /data/master-3306/data/memsql.sock |
| sql_quote_show_create | ON |
| ssl_ca | |
| ssl_capath | |
| ssl_cert | |
| ssl_cipher | |
| ssl_key | |
| sync_slave_timeout | 20000 |
| system_time_zone | UTC |
| thread_cache_size | 0 |
| thread_handling | one-thread-per-connection |
| thread_stack | 1048576 |
| time_zone | SYSTEM |
| timestamp | 1504799067.127069 |
| tls_version | TLSv1,TLSv1.1,TLSv1.2 |
| tmpdir | . |
| transaction_buffer | 67108864 |
| tx_isolation | READ-COMMITTED |
| use_join_bucket_bit_vector | ON |
| use_vectorized_join | ON |
| version | 5.5.8 |
| version_comment | MemSQL source distribution (compatible; MySQL Enterprise & MySQL Commercial) |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
| warn_level | WARNINGS |
| warning_count | 0 |
| workload_management | ON |
| workload_management_expected_aggregators | 1 |
| workload_management_max_connections_per_leaf | 1024 |
| workload_management_max_queue_depth | 100 |
| workload_management_max_threads_per_leaf | 8192 |
| workload_management_queue_time_warning_ratio | 0.500000 |
| workload_management_queue_timeout | 3600 |
Some thoughts:
Workload profiling (https://docs.memsql.com/concepts/v5.8/workload-profiling-overview/) can help you understand what resources are limiting the speed of this query - if it's not CPU and not disk I/O, maybe it's network I/O or something else.
Query profiling (https://docs.memsql.com/sql-reference/v5.8/profile/) will help indicate which operators within the query are expensive; a sketch of that workflow follows these notes.
You mentioned your machine has 64 cores - is your cluster properly NUMA-optimized? Run memsql-ops memsql-optimize to check.
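To illustrate the query-profiling suggestion, the PROFILE workflow from the linked docs can be driven from any MySQL-protocol client against the aggregator. This is only a sketch: the host and user are placeholders, and the SELECT is a stand-in to be replaced with the actual slow query.
mysql -h aggregator-host -P 3306 -u root datawarehouse -e "
    PROFILE SELECT COUNT(*) FROM some_table;  -- replace with the actual slow SELECT
    SHOW PROFILE;                             -- per-operator timings for the query just profiled
"
Both statements run in one connection here, which matters because SHOW PROFILE reports on the query profiled in the current session.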
Jack, I did the workload profiling and found that LOCK_TIME_MS is quite high when NETWORK_LOGICAL_SEND_B is high:
LAST_FINISHED_TIMESTAMP, LOCK_TIME_MS, LOCK_ROW_TIME_MS, ACTIVITY_NAME, NETWORK_LOGICAL_RECV_B, NETWORK_LOGICAL_SEND_B, activity_name, database_name, partition_id, left(q.query_text, 50)
'2017-09-11 10:41:31', '39988', '0', 'InsertSelect_AggregatedHourly_temp_7aug_KING__et_al_c44dc7ab56d56280', '0', '538154753', 'InsertSelect_AggregatedHourly_temp_7aug_KING__et_al_c44dc7ab56d56280', 'datawarehouse', '7', 'SELECT \n combined.day AS day,\n `lineitem'

SSIS: convert columns to rows from an Excel sheet

I have an Excel sheet with a table structured like this:
+------------+-----+----------+----------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+
| date | Day | StoreDdg | StoreR/H | DbgCategory1Dpt1 | R/HCategory1Dpt1 | DbgCategory2Dpt1 | R/HCategory2Dpt1 | DbgCategory3Dpt1 | R/HCategory2Dpt1 | DbgDepartment1 | R/HDepartment1 | DbgCategory1Dpt2 | R/HCategory1Dpt2 | DbgCategory2Dpt2 | R/HCategory2Dpt2 | DbgCategory3Dpt2 | R/HCategory2Dpt2 | DbgDepartment2 | R/HDepartment2 |
+------------+-----+----------+----------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+
| 1-Jan-2017 | Sun | 138,894 | 133% | 500 | 44% | 12,420 | 146% | | | | 11,920 | 104% | #DIV/0! | 13,580 | 113% | 9,250 | 92% | 6,530 | 147% |
| 2-Jan-2017 | Mon | 138,894 | 270% | 500 | 136% | 12,420 | 277% | 11,920 | | | | 193% | #DIV/0! | 13,580 | 299% | 9,250 | 225% | 6,530 | 181% |
+------------+-----+----------+----------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+
I would like to convert this into
+------------+-----+--------+-------------+---------------+---------+------+
| date | Day | Store | Department | Category | Dpt | R/H |
+------------+-----+--------+-------------+---------------+---------+------+
| 1-Jan-2017 | Sun | Store1 | Department1 | Category1Dpt1 | 138,894 | 133% |
| 1-Jan-2017 | Sun | Store1 | Department1 | Category2Dpt1 | 500 | 44% |
| 1-Jan-2017 | Sun | Store1 | Department1 | Category3Dpt1 | 12,420 | 146% |
| 1-Jan-2017 | Sun | Store1 | Department2 | Category1Dpt2 | 11,920 | 104% |
| 1-Jan-2017 | Sun | Store1 | Department2 | Category2Dpt2 | 13,580 | 44% |
| 1-Jan-2017 | Sun | Store1 | Department2 | Category3Dpt2 | 9,250 | 92% |
| 2-Jan-2017 | Mon | Store1 | Department1 | Category1Dpt1 | 138,894 | 270% |
| 2-Jan-2017 | Mon | Store1 | Department1 | Category2Dpt1 | 500 | 136% |
| 2-Jan-2017 | Mon | Store1 | Department1 | Category3Dpt1 | 12,420 | 277% |
| 2-Jan-2017 | Mon | Store1 | Department2 | Category1Dpt2 | 13,580 | 299% |
| 2-Jan-2017 | Mon | Store1 | Department2 | Category2Dpt2 | 9,250 | 225% |
| 2-Jan-2017 | Mon | Store1 | Department2 | Category3Dpt2 | 6,530 | 181% |
+------------+-----+--------+-------------+---------------+---------+------+
Any recommendations on how to do this?
You can do this by taking the Excel file as the source. You might have to save the Excel file in 2005 or 2007 format, depending on the version of Visual Studio you are using; if it is already in 2007 format, then you're good.
To extract the data for DbgDepartment1 and DbgDepartment2, you may create 2 different sources in the DFT. In one, select the columns related to DbgDepartment1, and in the second, choose those for DbgDepartment2. You might have to use a Derived Column transformation, depending on the logic you apply afterwards. Then you can use a Union All transformation, since the source file is the same, and load the data into the destination. Try it; you will get a solution.
I used the R language to solve this issue, with the data-tidying packages "tidyr" and "devtools".
For more info, check the link: http://garrettgman.github.io/tidying/

How to join two files with awk/sed/grep/bash, similar to a SQL JOIN

How can I join two files with awk/sed/grep/bash, similar to a SQL JOIN?
I have a file, and another one (both were shown as images in the original post).
Here is a text version of the second file:
+----------+------------------+------+------------+----+---------------------------------------------------+---------------------------------------------------+-----+-----+-----+------+-------+-------+--------------+------------+--+--+---+---+----+--+---+---+----+------------+------------+------------+------------+
| 21548598 | DSND001906102.2 | 0107 | 001906102 | 02 | FROZEN / O.S.T. | FROZEN / O.S.T. | 001 | 024 | | | 11.49 | 13.95 | 050087295745 | 11/25/2013 | | | N | N | 30 | | 1 | E | 1 | 10/07/2013 | 02/27/2014 | 10/07/2013 | 10/07/2013 |
| 25584998 | WD1194190DVD | 0819 | 1194190 | 18 | FROZEN / (WS DOL DTS) | FROZEN / (WS DOL DTS) | 050 | 110 | | G | 21.25 | 29.99 | 786936838961 | 03/18/2014 | | | N | N | 0 | | 1 | A | 2 | 12/20/2013 | 03/13/2014 | 12/20/2013 | 12/20/2013 |
| 25812794 | WHV1000292717BR | 0526 | 1000292717 | BR | GRAVITY / (UVDC) | GRAVITY / (UVDC) | 050 | 093 | | PG13 | 29.49 | 35.99 | 883929244577 | 02/25/2014 | | | N | N | 30 | | 1 | E | 3 | 01/16/2014 | 02/11/2014 | 01/16/2014 | 01/16/2014 |
| 24475594 | SNY303251.2 | 0085 | 303251 | 02 | BEYONCE | BEYONCE | 001 | 004 | | | 14.99 | 17.97 | 888430325128 | 12/20/2013 | | | N | N | 30 | | 1 | A | 4 | 12/19/2013 | 01/02/2014 | 12/19/2013 | 12/19/2013 |
| 25812787 | WHV1000284958DVD | 0526 | 1000284958 | 18 | GRAVITY (2PC) / (UVDC SPEC 2PK) | GRAVITY (2PC) / (UVDC SPEC 2PK) | 050 | 093 | | PG13 | 21.25 | 28.98 | 883929242528 | 02/25/2014 | | | N | N | 30 | | 1 | E | 5 | 01/16/2014 | 02/11/2014 | 01/16/2014 | 01/16/2014 |
| 21425462 | PBSDMST64400DVD | E349 | 64400 | 18 | MASTERPIECE CLASSIC: DOWNTON ABBEY SEASON 4 (3PC) | MASTERPIECE CLASSIC: DOWNTON ABBEY SEASON 4 (3PC) | 050 | 095 | 094 | | 30.49 | 49.99 | 841887019705 | 01/28/2014 | | | N | N | 30 | | 1 | A | 6 | 09/06/2013 | 01/15/2014 | 09/06/2013 | 09/06/2013 |
| 25584974 | WD1194170BR | 0819 | 1194170 | BR | FROZEN (2PC) (W/DVD) / (WS AC3 DTS 2PK DIGC) | FROZEN (2PC) (W/DVD) / (WS AC3 DTS 2PK DIGC) | 050 | 110 | | G | 27.75 | 39.99 | 786936838923 | 03/18/2014 | | | N | N | 0 | | 2 | A | 7 | 12/20/2013 | 03/13/2014 | 01/15/2014 | 01/15/2014 |
| 21388262 | HBO1000394029DVD | 0203 | 1000394029 | 18 | GAME OF THRONES: SEASON 3 | GAME OF THRONES: SEASON 3 | 050 | 095 | 093 | | 47.99 | 59.98 | 883929330713 | 02/18/2014 | | | N | N | 30 | | 1 | E | 8 | 08/29/2013 | 02/28/2014 | 08/29/2013 | 08/29/2013 |
| 25688450 | WD11955700DVD | 0819 | 11955700 | 18 | THOR: THE DARK WORLD / (AC3 DOL) | THOR: THE DARK WORLD / (AC3 DOL) | 050 | 093 | | PG13 | 21.25 | 29.99 | 786936839500 | 02/25/2014 | | | N | N | 30 | | 1 | A | 9 | 12/24/2013 | 02/20/2014 | 12/24/2013 | 12/24/2013 |
| 23061316 | PRT359054DVD | 0818 | 359054 | 18 | JACKASS PRESENTS: BAD GRANDPA / (WS DUB SUB AC3) | JACKASS PRESENTS: BAD GRANDPA / (WS DUB SUB AC3) | 050 | 110 | | R | 21.75 | 29.98 | 097363590545 | 01/28/2014 | | | N | N | 30 | | 1 | E | 10 | 12/06/2013 | 03/12/2014 | 12/06/2013 | 12/06/2013 |
| 21548611 | DSND001942202.2 | 0107 | 001942202 | 02 | FROZEN / O.S.T. (BONUS CD) (DLX) | FROZEN / O.S.T. (BONUS CD) (DLX) | 001 | 024 | | | 14.09 | 19.99 | 050087299439 | 11/25/2013 | | | N | N | 30 | | 1 | E | 11 | 10/07/2013 | 02/06/2014 | 10/07/2013 | 10/07/2013 |
+----------+------------------+------+------------+----+---------------------------------------------------+---------------------------------------------------+-----+-----+-----+------+-------+-------+--------------+------------+--+--+---+---+----+--+---+---+----+------------+------------+------------+------------+
The 2nd column from the first file can be joined to the 14th column of the second file!
Here's what I've been trying to do:
join <(sort awk -F"\t" '{print $14,$12}' aecprda12.tab) <(sort awk -F"\t" '{print $2,$1}' output1.csv)
but I am getting these errors:
$ join <(sort awk -F"\t" '{print $14,$12}' aecprda12.tab) <(sort awk -F"\t" '{print $2,$1}' output1.csv)
sort: unknown option -- F
Try `sort --help' for more information.
sort: unknown option -- F
Try `sort --help' for more information.
-700476409 [waitproc] -bash 10336 sig_send: error sending signal 20 to pid 10336, pipe handle 0x84, Win32 error 109
The output I would like would be something like this:
+-------+-------+---------------+
| 12.99 | 14.77 | 3383510002151 |
| 13.97 | 17.96 | 3383510002175 |
| 13.2 | 13 | 3383510002267 |
| 13.74 | 14.19 | 3399240165349 |
| 9.43 | 9.52 | 3399240165363 |
| 12.99 | 4.97 | 3399240165479 |
| 7.16 | 7.48 | 3399240165677 |
| 11.24 | 9.43 | 4011550620286 |
| 13.86 | 13.43 | 4260182980316 |
| 13.98 | 12.99 | 4260182980507 |
| 10.97 | 13.97 | 4260182980514 |
| 11.96 | 13.2 | 4260182980545 |
| 15.88 | 13.74 | 4260182980552 |
+-------+-------+---------------+
What am I doing wrong?
You can do all the work in join and sort:
join -1 2 -2 14 -t $'\t' -o 2.12,1.1,0 \
<( sort -t $'\t' -k 2,2 output1.csv ) \
<( sort -t $'\t' -k 14,14 aecprda12.tab )
Notes:
$'\t' is a bash ANSI-C quoted string containing a tab character: neither join nor sort seems to recognize the 2-character string "\t" as a tab
-k col,col sorts the file on the specified column
join has several options to control how it works; see the join(1) man page.
sort awk -F...
is not a valid command; it tells sort to sort a file named awk, and, as the error message says, sort has no -F option. The syntax you are looking for is
awk -F ... | sort
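Applied to the two files from the question, the corrected pipeline would look something like this (a sketch, assuming tab-separated input as in the original attempt):
join <( awk -F'\t' '{print $14, $12}' aecprda12.tab | sort ) \
     <( awk -F'\t' '{print $2, $1}' output1.csv | sort )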
However, you might be better off doing the joining in Awk directly.
awk -F"\t" 'NR==FNR{k[$14]=$12; next}
k[$2] { print $2, $1, k[$2] }' aecprda12.tab output1.csv
I am assuming that you don't know whether every item in the first file has a corresponding item in the second file - and that you want only "matching" items. There is indeed a good way to do this in awk. Create the following script (as a text file, call it myJoin.txt):
BEGIN {
    FS = "\t"
}
# Loop around as long as the total number of records read
# is equal to the number of records read in this file -
# in other words, loop around the first file only.
NR == FNR {
    a[$2] = $1   # create one array element for each $1/$2 pair
    next
}
# Loop around all the records of the second file,
# since we're done processing the first file.
{
    # See if the associative array element exists:
    gsub(/ /, "", $14)   # strip any spaces so the key matches exactly
    if (a[$14]) {        # was the value in $14 seen in the first file?
        # print out the three values you care about:
        print $12 " " a[$14] " " $14
    }
}
Now execute this with
awk -f myJoin.txt output1.csv aecprda12.tab
Seems to work for me...

Severe mysqldump performance degradation using CentOS Linux, 8GB PAE and MySQL 5.0.77

We use MySQL 5.0.77 on CentOS 5.5 on VMWare:
Linux dev.ic.soschildrensvillages.org.uk 2.6.18-194.11.4.el5PAE #1 SMP Tue Sep 21 05:48:23 EDT 2010 i686 i686 i386 GNU/Linux
We have recently upgraded from 4GB RAM to 8GB. When we did this, the time of our overnight mysqldump backup jumped from under 10 minutes to over 2 hours. It also caused unresponsiveness on our Plone-based web site due to database load. The dump uses the optimized mysqldump format and is spooled directly through a socket to another server.
Any ideas on what we could do to fix this would be gratefully appreciated. Would a MySQL upgrade help? Is there anything we can do to the MySQL config or to the Linux config? Or do we have to add another server or go to 64-bit?
We ran a previous (non-virtual) server with 6GB PAE and didn't notice a similar issue. That was on the same MySQL version, but CentOS 4.4.
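For reference, the nightly dump is of this general shape (a sketch only: the user, password, remote host and target path below are placeholders, not the real job; --opt is simply the default "optimized" option set):
# nightly dump piped straight to another server (all names are placeholders)
mysqldump --opt --socket=/tmp/mysql_live.sock -u backup_user -p'********' --all-databases \
  | ssh backup@backuphost 'cat > /backups/nightly.sql'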
Server config file:
[mysqld]
port=3307
socket=/tmp/mysql_live.sock
wait_timeout=31536000
interactive_timeout=31536000
datadir=/var/mysql/live/data
user=mysql
max_connections = 200
max_allowed_packet = 64M
table_cache = 2048
binlog_cache_size = 128K
max_heap_table_size = 32M
sort_buffer_size = 2M
join_buffer_size = 2M
lower_case_table_names = 1
innodb_data_file_path = ibdata1:10M:autoextend
innodb_buffer_pool_size=1G
innodb_log_file_size=300M
innodb_log_buffer_size=8M
innodb_flush_log_at_trx_commit=1
innodb_file_per_table
[mysqldump]
# Do not buffer the whole result set in memory before writing it to
# file. Required for dumping very large tables
quick
max_allowed_packet = 64M
[mysqld_safe]
# Increase the amount of open files allowed per process. Warning: Make
# sure you have set the global system limit high enough! The high value
# is required for a large number of opened tables
open-files-limit = 8192
Server variables:
mysql> show variables;
+---------------------------------+------------------------------------------------------------------+
| Variable_name | Value |
+---------------------------------+------------------------------------------------------------------+
| auto_increment_increment | 1 |
| auto_increment_offset | 1 |
| automatic_sp_privileges | ON |
| back_log | 50 |
| basedir | /usr/local/mysql-5.0.77-linux-i686-glibc23/ |
| binlog_cache_size | 131072 |
| bulk_insert_buffer_size | 8388608 |
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql-5.0.77-linux-i686-glibc23/share/mysql/charsets/ |
| collation_connection | latin1_swedish_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
| completion_type | 0 |
| concurrent_insert | 1 |
| connect_timeout | 10 |
| datadir | /var/mysql/live/data/ |
| date_format | %Y-%m-%d |
| datetime_format | %Y-%m-%d %H:%i:%s |
| default_week_format | 0 |
| delay_key_write | ON |
| delayed_insert_limit | 100 |
| delayed_insert_timeout | 300 |
| delayed_queue_size | 1000 |
| div_precision_increment | 4 |
| keep_files_on_create | OFF |
| engine_condition_pushdown | OFF |
| expire_logs_days | 0 |
| flush | OFF |
| flush_time | 0 |
| ft_boolean_syntax | + -><()~*:""&| |
| ft_max_word_len | 84 |
| ft_min_word_len | 4 |
| ft_query_expansion_limit | 20 |
| ft_stopword_file | (built-in) |
| group_concat_max_len | 1024 |
| have_archive | YES |
| have_bdb | NO |
| have_blackhole_engine | YES |
| have_compress | YES |
| have_crypt | YES |
| have_csv | YES |
| have_dynamic_loading | YES |
| have_example_engine | NO |
| have_federated_engine | YES |
| have_geometry | YES |
| have_innodb | YES |
| have_isam | NO |
| have_merge_engine | YES |
| have_ndbcluster | DISABLED |
| have_openssl | DISABLED |
| have_ssl | DISABLED |
| have_query_cache | YES |
| have_raid | NO |
| have_rtree_keys | YES |
| have_symlink | YES |
| hostname | app.ic.soschildrensvillages.org.uk |
| init_connect | |
| init_file | |
| init_slave | |
| innodb_additional_mem_pool_size | 1048576 |
| innodb_autoextend_increment | 8 |
| innodb_buffer_pool_awe_mem_mb | 0 |
| innodb_buffer_pool_size | 1073741824 |
| innodb_checksums | ON |
| innodb_commit_concurrency | 0 |
| innodb_concurrency_tickets | 500 |
| innodb_data_file_path | ibdata1:10M:autoextend |
| innodb_data_home_dir | |
| innodb_adaptive_hash_index | ON |
| innodb_doublewrite | ON |
| innodb_fast_shutdown | 1 |
| innodb_file_io_threads | 4 |
| innodb_file_per_table | ON |
| innodb_flush_log_at_trx_commit | 1 |
| innodb_flush_method | |
| innodb_force_recovery | 0 |
| innodb_lock_wait_timeout | 50 |
| innodb_locks_unsafe_for_binlog | OFF |
| innodb_log_arch_dir | |
| innodb_log_archive | OFF |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 314572800 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
| innodb_max_dirty_pages_pct | 90 |
| innodb_max_purge_lag | 0 |
| innodb_mirrored_log_groups | 1 |
| innodb_open_files | 300 |
| innodb_rollback_on_timeout | OFF |
| innodb_support_xa | ON |
| innodb_sync_spin_loops | 20 |
| innodb_table_locks | ON |
| innodb_thread_concurrency | 8 |
| innodb_thread_sleep_delay | 10000 |
| interactive_timeout | 31536000 |
| join_buffer_size | 2097152 |
| key_buffer_size | 8384512 |
| key_cache_age_threshold | 300 |
| key_cache_block_size | 1024 |
| key_cache_division_limit | 100 |
| language | /usr/local/mysql-5.0.77-linux-i686-glibc23/share/mysql/english/ |
| large_files_support | ON |
| large_page_size | 0 |
| large_pages | OFF |
| lc_time_names | en_US |
| license | GPL |
| local_infile | ON |
| locked_in_memory | OFF |
| log | OFF |
| log_bin | OFF |
| log_bin_trust_function_creators | OFF |
| log_error | |
| log_queries_not_using_indexes | OFF |
| log_slave_updates | OFF |
| log_slow_queries | OFF |
| log_warnings | 1 |
| long_query_time | 10 |
| low_priority_updates | OFF |
| lower_case_file_system | OFF |
| lower_case_table_names | 1 |
| max_allowed_packet | 67108864 |
| max_binlog_cache_size | 4294963200 |
| max_binlog_size | 1073741824 |
| max_connect_errors | 10 |
| max_connections | 200 |
| max_delayed_threads | 20 |
| max_error_count | 64 |
| max_heap_table_size | 33554432 |
| max_insert_delayed_threads | 20 |
| max_join_size | 18446744073709551615 |
| max_length_for_sort_data | 1024 |
| max_prepared_stmt_count | 16382 |
| max_relay_log_size | 0 |
| max_seeks_for_key | 4294967295 |
| max_sort_length | 1024 |
| max_sp_recursion_depth | 0 |
| max_tmp_tables | 32 |
| max_user_connections | 0 |
| max_write_lock_count | 4294967295 |
| multi_range_count | 256 |
| myisam_data_pointer_size | 6 |
| myisam_max_sort_file_size | 2146435072 |
| myisam_recover_options | OFF |
| myisam_repair_threads | 1 |
| myisam_sort_buffer_size | 8388608 |
| myisam_stats_method | nulls_unequal |
| ndb_autoincrement_prefetch_sz | 1 |
| ndb_force_send | ON |
| ndb_use_exact_count | ON |
| ndb_use_transactions | ON |
| ndb_cache_check_time | 0 |
| ndb_connectstring | |
| net_buffer_length | 16384 |
| net_read_timeout | 30 |
| net_retry_count | 10 |
| net_write_timeout | 60 |
| new | OFF |
| old_passwords | OFF |
| open_files_limit | 8192 |
| optimizer_prune_level | 1 |
| optimizer_search_depth | 62 |
| pid_file | /var/mysql/live/mysqld.pid |
| plugin_dir | |
| port | 3307 |
| preload_buffer_size | 32768 |
| profiling | OFF |
| profiling_history_size | 15 |
| protocol_version | 10 |
| query_alloc_block_size | 8192 |
| query_cache_limit | 1048576 |
| query_cache_min_res_unit | 4096 |
| query_cache_size | 0 |
| query_cache_type | ON |
| query_cache_wlock_invalidate | OFF |
| query_prealloc_size | 8192 |
| range_alloc_block_size | 4096 |
| read_buffer_size | 131072 |
| read_only | OFF |
| read_rnd_buffer_size | 262144 |
| relay_log | |
| relay_log_index | |
| relay_log_info_file | relay-log.info |
| relay_log_purge | ON |
| relay_log_space_limit | 0 |
| rpl_recovery_rank | 0 |
| secure_auth | OFF |
| secure_file_priv | |
| server_id | 0 |
| skip_external_locking | ON |
| skip_networking | OFF |
| skip_show_database | OFF |
| slave_compressed_protocol | OFF |
| slave_load_tmpdir | /tmp/ |
| slave_net_timeout | 3600 |
| slave_skip_errors | OFF |
| slave_transaction_retries | 10 |
| slow_launch_time | 2 |
| socket | /tmp/mysql_live.sock |
| sort_buffer_size | 2097152 |
| sql_big_selects | ON |
| sql_mode | |
| sql_notes | ON |
| sql_warnings | OFF |
| ssl_ca | |
| ssl_capath | |
| ssl_cert | |
| ssl_cipher | |
| ssl_key | |
| storage_engine | MyISAM |
| sync_binlog | 0 |
| sync_frm | ON |
| system_time_zone | GMT |
| table_cache | 2048 |
| table_lock_wait_timeout | 50 |
| table_type | MyISAM |
| thread_cache_size | 0 |
| thread_stack | 196608 |
| time_format | %H:%i:%s |
| time_zone | SYSTEM |
| timed_mutexes | OFF |
| tmp_table_size | 33554432 |
| tmpdir | /tmp/ |
| transaction_alloc_block_size | 8192 |
| transaction_prealloc_size | 4096 |
| tx_isolation | REPEATABLE-READ |
| updatable_views_with_limit | YES |
| version | 5.0.77 |
| version_comment | MySQL Community Server (GPL) |
| version_compile_machine | i686 |
| version_compile_os | pc-linux-gnu |
| wait_timeout | 31536000 |
+---------------------------------+------------------------------------------------------------------+
237 rows in set (0.00 sec)
If you're sending data over sockets, you might as well use MySQL replication with the binlog. Use binlog_format=ROW so that your slaves don't waste time with triggers and so on (row-based logging requires MySQL 5.1 or later).
Failing that, you could try a "hotcopy" script such as http://dev.mysql.com/doc/refman/5.0/en/mysqlhotcopy.html
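For the MyISAM tables (storage_engine defaults to MyISAM in your variables), a mysqlhotcopy run looks roughly like this (a sketch: the database name, credentials and target directory are placeholders, and note that mysqlhotcopy does not copy InnoDB tables):
# copy the raw MyISAM table files under a table lock (placeholders throughout)
mysqlhotcopy --socket=/tmp/mysql_live.sock --user=backup_user --password='********' plone_db /var/backups/mysql/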
Above all, I think you are running outdated software. Upgrade to CentOS 5.5 64-bit and Percona Server (a 100% MySQL-compatible database on steroids): http://www.percona.com/software/percona-server/
