I'm using the spark-cloudant library 1.6.3 that is installed by default with the Spark service. I'm trying to save some data to Cloudant:
import java.util.Date

// _1 holds the user id; keep only users above 6035
val df = getTopXRecommendationsForAllUsers().toDF.filter($"_1" > 6035)
println(s"Saving ${df.count()} ratings to Cloudant: " + new Date())
df.show(5)

val timestamp: Long = System.currentTimeMillis / 1000
val dbName: String = s"${destDB.database}_${timestamp}"

// Also dump the same DataFrame to JSON files as a sanity check
df.write.mode("append").json(s"${dbName}.json")

val dfWriter = df.write.format("com.cloudant.spark")
dfWriter.option("cloudant.host", destDB.host)
if (destDB.username.isDefined && destDB.username.get.nonEmpty) dfWriter.option("cloudant.username", destDB.username.get)
if (destDB.password.isDefined && destDB.password.get.nonEmpty) dfWriter.option("cloudant.password", destDB.password.get)
dfWriter.save(dbName)
However, I hit the error:
Starting getTopXRecommendationsForAllUsers: Sat Dec 24 08:50:11 CST 2016
Finished getTopXRecommendationsForAllUsers: Sat Dec 24 08:50:11 CST 2016
Saving 6 ratings to Cloudant: Sat Dec 24 08:50:17 CST 2016
+----+--------------------+
| _1| _2|
+----+--------------------+
|6036|[[6036,2503,4.395...|
|6037|[[6037,572,4.5785...|
|6038|[[6038,1696,4.894...|
|6039|[[6039,572,4.6854...|
|6040|[[6040,670,4.6820...|
+----+--------------------+
only showing top 5 rows
Use connectorVersion=1.6.3, dbName=recommendationdb_1482591017, indexName=null, viewName=null, jsonstore.rdd.partitions=5, jsonstore.rdd.maxInPartition=-1, jsonstore.rdd.minInPartition=10, jsonstore.rdd.requestTimeout=900000, bulkSize=20, schemaSampleSize=1
Name: org.apache.spark.SparkException
Message: Job aborted due to stage failure: Task 2 in stage 642.0 failed 10 times, most recent failure: Lost task 2.9 in stage 642.0 (TID 409, yp-spark-dal09-env5-0049): java.lang.RuntimeException: Database recommendationdb_1482591017: nothing was saved because the number of records was 0!
at com.cloudant.spark.common.JsonStoreDataAccess.saveAll(JsonStoreDataAccess.scala:187)
I know there is data because I also save it to files:
! cat recommendationdb_1482591017.json/*
{"_1":6036,"_2":[{"user":6036,"product":2503,"rating":4.3957030284620355},{"user":6036,"product":2019,"rating":4.351395783537379},{"user":6036,"product":1178,"rating":4.3373212302468165},{"user":6036,"product":923,"rating":4.3328207761734605},{"user":6036,"product":922,"rating":4.320787353937724},{"user":6036,"product":750,"rating":4.307312349612301},{"user":6036,"product":53,"rating":4.304341611330176},{"user":6036,"product":858,"rating":4.297961629128419},{"user":6036,"product":1212,"rating":4.285360675560061},{"user":6036,"product":1423,"rating":4.275255129149407}]}
{"_1":6037,"_2":[{"user":6037,"product":572,"rating":4.578508339835482},{"user":6037,"product":858,"rating":4.247809350206506},{"user":6037,"product":904,"rating":4.1222486445799404},{"user":6037,"product":527,"rating":4.117342524702621},{"user":6037,"product":787,"rating":4.115781026855997},{"user":6037,"product":2503,"rating":4.109861422105844},{"user":6037,"product":1193,"rating":4.088453520710152},{"user":6037,"product":912,"rating":4.085139017248665},{"user":6037,"product":1221,"rating":4.084368219857013},{"user":6037,"product":1207,"rating":4.082536396283374}]}
{"_1":6038,"_2":[{"user":6038,"product":1696,"rating":4.894442132848873},{"user":6038,"product":2998,"rating":4.887752985607918},{"user":6038,"product":2562,"rating":4.740442462948304},{"user":6038,"product":3245,"rating":4.7366090605162094},{"user":6038,"product":2609,"rating":4.736125582066063},{"user":6038,"product":1669,"rating":4.678373819044571},{"user":6038,"product":572,"rating":4.606132758047402},{"user":6038,"product":1493,"rating":4.577140478430046},{"user":6038,"product":745,"rating":4.56568047928448},{"user":6038,"product":213,"rating":4.546054686400765}]}
{"_1":6039,"_2":[{"user":6039,"product":572,"rating":4.685425482619273},{"user":6039,"product":527,"rating":4.291256016077275},{"user":6039,"product":904,"rating":4.27766400846558},{"user":6039,"product":2019,"rating":4.273486883864949},{"user":6039,"product":2905,"rating":4.266371181044469},{"user":6039,"product":912,"rating":4.26006044096224},{"user":6039,"product":1207,"rating":4.259935289367192},{"user":6039,"product":2503,"rating":4.250370780277651},{"user":6039,"product":1148,"rating":4.247288578998062},{"user":6039,"product":745,"rating":4.223697008637559}]}
{"_1":6040,"_2":[{"user":6040,"product":670,"rating":4.682008703927743},{"user":6040,"product":3134,"rating":4.603656534071515},{"user":6040,"product":2503,"rating":4.571906881428182},{"user":6040,"product":3415,"rating":4.523567737705732},{"user":6040,"product":3808,"rating":4.516778146579665},{"user":6040,"product":3245,"rating":4.496176019230939},{"user":6040,"product":53,"rating":4.491020821805015},{"user":6040,"product":668,"rating":4.471757243976877},{"user":6040,"product":3030,"rating":4.464674231353673},{"user":6040,"product":923,"rating":4.446195112198678}]}
{"_1":6042,"_2":[{"user":6042,"product":3389,"rating":3.331488167984286},{"user":6042,"product":572,"rating":3.3312810949271903},{"user":6042,"product":231,"rating":3.2622287749148926},{"user":6042,"product":1439,"rating":3.0988533259613944},{"user":6042,"product":333,"rating":3.0859809743588706},{"user":6042,"product":404,"rating":3.0573976830913203},{"user":6042,"product":216,"rating":3.044620107397873},{"user":6042,"product":408,"rating":3.038302525994588},{"user":6042,"product":2411,"rating":3.0190834747311244},{"user":6042,"product":875,"rating":2.9860048032439095}]}
This is a defect in spark-cloudant 1.6.3 that is fixed in 1.6.4. The pull request is https://github.com/cloudant-labs/spark-cloudant/pull/61
The answer is to upgrade to spark-cloudant 1.6.4. See this answer if you are trying to do that on the IBM Bluemix Spark Service: Spark-cloudant package 1.6.4 loaded by %AddJar does not get used by notebook
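Once 1.6.4 is on the classpath, the original write code works unchanged. As a minimal sketch (the package coordinates, host, and credentials below are assumptions and placeholders, not values from the question):

// Load the connector first, e.g. with
//   --packages cloudant-labs:spark-cloudant:1.6.4-s_2.10
// (coordinates assumed from the spark-packages release naming).
val writer = df.write
  .format("com.cloudant.spark")
  .option("cloudant.host", "myaccount.cloudant.com")
  .option("cloudant.username", "myusername")
  .option("cloudant.password", "mypassword")
writer.save("recommendationdb")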
We have a Percona XtraDB Cluster with 5 nodes and an arbitrator. One of our PHP developers ran a bad query on the cluster, crashing all the nodes. After the crash we could not collect any error log to tell us what really went wrong, as the entire cluster crashed without performing any logging.
I have always thought that when a single query is executed on the cluster, it is processed by only one of the nodes. So if the query is bad (to the point of killing a DB server), it should crash only the one node that is processing it, leaving the cluster running with the remaining 4 nodes.
This behavior has puzzled us, and we would like to understand what is really going on, especially as this is the second time it has happened. Why would a query that is processed by one node cause the other nodes in the cluster to crash when something goes wrong during processing?
Below is our my.cnf config:
#
# Default values.
[mysqld_safe]
flush_caches
numa_interleave
#
#
[mysqld]
back_log = 65535
binlog_format = ROW
character_set_server = utf8
collation_server = utf8_general_ci
datadir = /var/lib/mysql
default_storage_engine = InnoDB
expand_fast_index_creation = 1
expire_logs_days = 7
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_instances = 16
innodb_buffer_pool_populate = 1
innodb_buffer_pool_size = 32G # XXX 64GB RAM, 80%
innodb_data_file_path = ibdata1:64M;ibdata2:64M:autoextend
innodb_file_format = Barracuda
innodb_file_per_table
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_io_capacity = 1600
innodb_large_prefix
innodb_locks_unsafe_for_binlog = 1
innodb_log_file_size = 64M
innodb_print_all_deadlocks = 1
innodb_read_io_threads = 64
innodb_stats_on_metadata = FALSE
innodb_support_xa = FALSE
innodb_write_io_threads = 64
log-bin = mysqld-bin
log-queries-not-using-indexes
log-slave-updates
long_query_time = 1
max_allowed_packet = 64M
max_connect_errors = 4294967295
max_connections = 4096
min_examined_row_limit = 1000
port = 3306
relay-log-recovery = TRUE
skip-name-resolve
slow_query_log = 1
slow_query_log_timestamp_always = 1
table_open_cache = 4096
thread_cache = 1024
tmpdir = /db/tmp
transaction_isolation = REPEATABLE-READ
updatable_views_with_limit = 0
user = mysql
wait_timeout = 60
#
# Galera Variable config
wsrep_cluster_address = gcomm://ip_1,ip_2,ip_3,ip_4,ip_5
wsrep_cluster_name = cluster_db
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_provider_options = "gcache.size=4G"
wsrep_slave_threads = 32
wsrep_sst_auth = "user:password"
wsrep_sst_donor = "db1"
#wsrep_sst_method = xtrabackup_throttle
wsrep_sst_method = xtrabackup-v2
#
# XXX You *MUST* change!
server-id = 1
Can you post the query? SELECT queries only execute on a single node, but all write queries will execute on every node. What's in your error log?
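To make that concrete: Galera replicates every committed writeset to all nodes, and each node applies it locally, so a writeset that crashes the applier can take the whole cluster down at once. A minimal sketch of the behavior (node addresses, credentials, and table t are hypothetical; assumes the MySQL JDBC driver is on the classpath):

import java.sql.DriverManager

// Hypothetical addresses and credentials for two cluster nodes.
val node1 = DriverManager.getConnection("jdbc:mysql://ip_1:3306/test", "user", "password")
val node2 = DriverManager.getConnection("jdbc:mysql://ip_2:3306/test", "user", "password")

// A write committed on node 1... (table t is assumed to exist)
node1.createStatement().executeUpdate("INSERT INTO t (id, v) VALUES (1, 'x')")

// ...is certified and applied on every other node, so it shows up
// on node 2 as well. Only plain SELECTs stay local to the node
// that received them.
val rs = node2.createStatement().executeQuery("SELECT v FROM t WHERE id = 1")
while (rs.next()) println(rs.getString("v")) // prints "x"

node1.close(); node2.close()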
I am trying to connect to an Oracle database (PHK01200_SECCOMPAS_APPL.WORLD) from a Linux box (RHEL 6) using OCCI (Oracle Instant Client version 12.1, the latest). I am getting a TNS error on connecting, although tnsping works fine. Could you please help me set up the correct configuration? What am I missing here?
Output
[m499757@hkl20030996 bin]$ ./sqlplus toolkit/******@PHK01200_SECCOMPAS_APPL.WORLD
SQL*Plus: Release 10.2.0.3.0 - Production on Thu May 22 12:32:56 2014
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
ERROR:
ORA-12505: TNS:listener does not currently know of SID given in connect
descriptor
Config details:
tnsnames.ora
PHK01200_SECCOMPAS_APPL.WORLD =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = PHKLOD2001-xxxx.xx.hedani.net)(PORT = 1522))
)
(CONNECT_DATA =
(SID = PHK01200_SECCOMPAS_APPL)
)
)
odbc.ini
[PHK01200_SECCOMPAS_APPL]
Driver = OracleODBC-12g
DSN = OracleODBC-12g
ServerName = PHK01200_SECCOMPAS_APPL.WORLD
UserID = toolkit
Password = ******
odbcinst.ini
[OracleODBC-12g]
Description = Oracle ODBC driver for Oracle 12g
Driver = /cs/gat/share/oracle/64/instantclient/libsqora.so.12.1
Driver64 = /cs/gat/share/oracle/64/instantclient/libsqora.so.12.1
FileUsage = 1
Driver Logging = 7
ldap.ora
# LDAP.ORA Configuration
# Generated by Oracle configuration tools.
DEFAULT_ADMIN_CONTEXT = "dc=uk,dc=csfb,dc=com"
#DEFAULT_ADMIN_CONTEXT = "dc=corpny,dc=csfb,dc=com"
DIRECTORY_SERVERS= (oid_ldap_server_sg.sg.csfb.com:1522:1524,oid_ldap_server_ny.corpny.csfb.com:1522:1524,oid_ldap_server_ln.csfp.co.uk:1522:1524)
DIRECTORY_SERVER_TYPE = OID
sqlnet.ora
AUTOMATIC_IPC = OFF
TRACE_LEVEL_CLIENT = OFF
TCP.NODELAY = YES
NAMES.DIRECTORY_PATH= (TNSNAMES,LDAP,ONAMES,HOSTNAME)
names.default_domain = world
name.default_zone = world
The below worked:
./sqlplus toolkit/******@PHK01200
And in my Python script I used the following connection string:
"DRIVER={OracleODBC-12g}; Dbq=PHKLOD2001-scan.ap.hedani.net:1522/PHK01200_SECCOMPAS_APPL.WORLD"