I am having trouble connecting to Cassandra.
I'm trying to connect to s-cassandra (a stubbed Cassandra, as can be reviewed here) with the DataStax Node.js Cassandra driver.
For some reason, passing "127.0.0.1:8042" as a contact point to the driver results in a DriverInternalError (though sometimes it works randomly, and I still haven't figured out why it sometimes works and sometimes doesn't).
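For context, a minimal sketch of how I'm connecting (assuming the standard cassandra-driver options; depending on the driver version the port may need to go into protocolOptions instead of the contact point string):

const cassandra = require('cassandra-driver');

// Minimal sketch: connect to the stubbed Cassandra on 127.0.0.1:8042
const client = new cassandra.Client({
  contactPoints: ['127.0.0.1:8042']
});

client.connect(function (err) {
  if (err) {
    // This is where the DriverInternalError below shows up
    console.error(err);
    return;
  }
  console.log('connected to s-cassandra');
});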
The DriverInternalError I get:
{"name": "DriverInternalError",
"stack": "...",
"message": "Local datacenter could not be
determined",
"info": "Represents a bug inside the driver or in a
Cassandra host." }
This is what I see in the Cassandra driver's log:
log event: info -- Adding host 127.0.0.1:8042
log event: info -- Getting first connection
log event: info -- Connecting to 127.0.0.1:8042
log event: verbose -- Socket connected to 127.0.0.1:8042
log event: info -- Trying to use protocol version 4
log event: verbose -- Sending stream #0
log event: verbose -- Sent stream #0 to 127.0.0.1:8042
{"name":"application-storage","hostname":"Yuris-MacBook-Pro.local","pid":1338,"level":30,"msg":"Kafka producer is initialized","time":"2016-08-05T12:53:53.124Z","v":0}
log event: verbose -- Received frame #0 from 127.0.0.1:8042
log event: info -- Protocol v4 not supported, using v2
log event: verbose -- Done receiving frame #0
log event: verbose -- disconnecting
log event: info -- Connection to 127.0.0.1:8042 closed
log event: info -- Connecting to 127.0.0.1:8042
log event: verbose -- Socket connected to 127.0.0.1:8042
log event: info -- Trying to use protocol version 2
log event: verbose -- Sending stream #0
log event: verbose -- Sent stream #0 to 127.0.0.1:8042
log event: verbose -- Received frame #0 from 127.0.0.1:8042
log event: info -- Connection to 127.0.0.1:8042 opened successfully
log event: info -- Connection pool to host 127.0.0.1:8042 created with 1 connection(s)
log event: info -- Control connection using protocol version 2
log event: info -- Connection acquired to 127.0.0.1:8042, refreshing nodes list
log event: info -- Refreshing local and peers info
log event: verbose -- Sending stream #1
log event: verbose -- Done receiving frame #0
log event: verbose -- Sent stream #1 to 127.0.0.1:8042
log event: verbose -- Received frame #1 from 127.0.0.1:8042
log event: warning -- No local info provided
log event: verbose -- Sending stream #0
log event: verbose -- Done receiving frame #1
log event: verbose -- Sent stream #0 to 127.0.0.1:8042
log event: verbose -- Received frame #0 from 127.0.0.1:8042
log event: info -- Peers info retrieved
log event: error -- Tokenizer could not be determined
log event: info -- Retrieving keyspaces metadata
log event: verbose -- Sending stream #1
log event: verbose -- Done receiving frame #0
log event: verbose -- Sent stream #1 to 127.0.0.1:8042
log event: verbose -- Received frame #1 from 127.0.0.1:8042
log event: verbose -- Sending stream #0
log event: verbose -- Done receiving frame #1
log event: verbose -- Sent stream #0 to 127.0.0.1:8042
log event: verbose -- Received frame #0 from 127.0.0.1:8042
log event: info -- ControlConnection connected to 127.0.0.1:8042 and is up to date
I've tried playing with the firewall and allowing the application through it, but that didn't help. As mentioned above, it sometimes works randomly, and I still haven't figured out why.
I'm on Mac OS X El Capitan.
The solution that helped me:
I needed to prime the system.local table with a prime-query-single request:
{
query: 'prime-query-single',
header: {'Content-Type': 'application/json'},
body: {
"when": {
"query": "SELECT * FROM system.local WHERE key='local'"
},
"then": {
"rows": [
{
"cluster_name": "custom cluster name",
"partitioner": "org.apache.cassandra.dht.Murmur3Partitioner",
"data_center": "dc1",
"rack": "rc1",
"tokens": [
"1743244960790844724"
],
"release_version": "2.0.1"
}
],
"result": "success",
"column_types": {
"tokens": "set<text>"
}
}
}
}
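For completeness, this is roughly how the prime above can be sent to s-cassandra's admin API (a minimal sketch; it assumes the admin interface is listening on its default port 8043 and the /prime-query-single endpoint, so adjust if your setup differs):

const http = require('http');

// The priming payload from above, serialized for the admin API
const prime = JSON.stringify({
  when: { query: "SELECT * FROM system.local WHERE key='local'" },
  then: {
    rows: [{
      cluster_name: 'custom cluster name',
      partitioner: 'org.apache.cassandra.dht.Murmur3Partitioner',
      data_center: 'dc1',
      rack: 'rc1',
      tokens: ['1743244960790844724'],
      release_version: '2.0.1'
    }],
    result: 'success',
    column_types: { tokens: 'set<text>' }
  }
});

const req = http.request({
  host: '127.0.0.1',
  port: 8043,                       // assumed admin port
  path: '/prime-query-single',
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(prime)
  }
}, res => console.log('prime response status:', res.statusCode));

req.on('error', console.error);
req.end(prime);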
Related
I'm not able to connect the iOS client app to the server.
Android is working fine.
The socket client versions I'm using are listed below.
We tried the following links:
https://socket.io/docs/v4/troubleshooting-connection-issues/
https://github.com/socketio/socket.io/issues/3794
Socket server version (Node):
"socket.io": "^4.5.2"
iOS socket client version:
pod 'Socket.IO-Client-Swift', '16.0.0'
Android socket client version:
'io.socket:socket.io-client:2.0.1'
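For what it's worth, the 4{"message":"invalid"} packet that appears in the log below is what a socket.io 4.x server sends when a namespace middleware rejects the handshake. A hypothetical sketch of a server-side check that would produce exactly that packet (this is not our actual server code, just an illustration of the mechanism):

const { Server } = require('socket.io');

const io = new Server(3000, { cors: { origin: '*' } });

io.use((socket, next) => {
  // Hypothetical token check; when it fails, the error message
  // is forwarded to the client as a CONNECT_ERROR packet
  if (!socket.handshake.auth || !socket.handshake.auth.token) {
    return next(new Error('invalid'));
  }
  next();
});

io.on('connection', socket => {
  console.log('client connected', socket.id);
});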
Here is the error log.
2022-11-09 15:36:13.806244+0530 Chat[11058:250537] LOG SocketManager: Manager is being released
2022-11-09 15:36:13.809821+0530 Chat[11058:250537] LOG SocketIOClient{/}: Handling event: statusChange with data: [connecting, 2]
2022-11-09 15:36:13.810822+0530 Chat[11058:250537] LOG SocketIOClient{/}: Joining namespace /
2022-11-09 15:36:13.811809+0530 Chat[11058:250537] LOG SocketManager: Tried connecting socket when engine isn't open. Connecting
2022-11-09 15:36:13.812649+0530 Chat[11058:250537] LOG SocketManager: Adding engine
2022-11-09 15:36:13.816521+0530 Chat[11058:250965] LOG SocketEngine: Starting engine. Server: https://socketserver.com/
2022-11-09 15:36:13.817360+0530 Chat[11058:250965] LOG SocketEngine: Handshaking
2022-11-09 15:36:13.891828+0530 Chat[11058:250537] [UICollectionViewRecursion] cv == 0x7fc7f28afe00 Disabling recursion trigger logging
2022-11-09 15:36:14.011073+0530 Chat[11058:250979] [boringssl] boringssl_metrics_log_metric_block_invoke(144) Failed to log metrics
2022-11-09 15:36:14.031883+0530 Chat[11058:250936] LOG SocketEngine: Got message: 0{"sid":"MZvaOLNP7rHRyHPmAACE","upgrades":[],"pingInterval":25000,"pingTimeout":20000,"maxPayload":1000000}
2022-11-09 15:36:14.066076+0530 Chat[11058:250537] LOG SocketIOClient{/}: Handling event: websocketUpgrade with data: [["Connection": "upgrade", "Sec-WebSocket-Accept": "7Pq2JXa4nY+2HM4/iTfQnlr36zI=", "Server": "nginx/1.20.1", "Upgrade": "websocket", "Date": "Wed, 09 Nov 2022 10:06:14 GMT"]]
SocketAnyEvent: Event: websocketUpgrade items: Optional([["Connection": "upgrade", "Sec-WebSocket-Accept": "7Pq2JXa4nY+2HM4/iTfQnlr36zI=", "Server": "nginx/1.20.1", "Upgrade": "websocket", "Date": "Wed, 09 Nov 2022 10:06:14 GMT"]])
2022-11-09 15:36:14.068315+0530 Chat[11058:250537] LOG SocketManager: Engine opened Connect
2022-11-09 15:36:14.069540+0530 Chat[11058:250936] LOG SocketEngine: Writing ws: 0/, has data: false
2022-11-09 15:36:14.070357+0530 Chat[11058:250936] LOG SocketEngineWebSocket: Sending ws: 0/, as type: 4
2022-11-09 15:36:14.092916+0530 Chat[11058:250936] LOG SocketEngine: Got message: 44{"message":"invalid"}
2022-11-09 15:36:14.094201+0530 Chat[11058:250537] LOG SocketParser: Parsing 4{"message":"invalid"}
2022-11-09 15:36:14.098295+0530 Chat[11058:250537] LOG SocketParser: Decoded packet as: SocketPacket {type: 4; data: [{
message = invalid;
}]; id: -1; placeholders: -1; nsp: /}
2022-11-09 15:36:14.099612+0530 Chat[11058:250537] LOG SocketIOClient{/}: Handling event: error with data: [{
message = invalid;
}]
SocketAnyEvent: Event: error items: Optional([{
message = invalid;
}])
2022-11-09 15:36:39.094504+0530 Chat[11058:250935] LOG SocketEngine: Got message: 2
2022-11-09 15:36:39.094918+0530 Chat[11058:250935] LOG SocketEngine: Writing ws: has data: false
2022-11-09 15:36:39.094992+0530 Chat[11058:250537] LOG SocketIOClient{/}: Handling event: ping with data: []
SocketAnyEvent: Event: ping items: Optional([])
2022-11-09 15:36:39.095224+0530 Chat[11058:250935] LOG SocketEngineWebSocket: Sending ws: as type: 3
I have a Spark (2.1) streaming job that writes processed data to Azure Blob Storage every batch (with a 1-minute batch interval). Every now and then (once every couple of hours) I get a 'java.net.ConnectException' with a connection-timeout message. The operation does get retried and eventually succeeds, but the retry delays the completion of the 1-minute streaming batch, stretching it to 2 to 3 minutes when the error occurs.
Below is the executor log snippet with the error message. I have spark.executor.cores=5.
Is there some kind of limit on the number of connections that might be causing this?
17/10/11 16:09:02 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Starting operation.}
17/10/11 16:09:02 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Starting operation with location 'PRIMARY' per location mode 'PRIMARY_ONLY'.}
17/10/11 16:09:02 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Starting request to 'http://<name>.blob.core.windows.net/rawData/2017/10/11/16/data1.json' at 'Wed, 11 Oct 2017 16:09:02 GMT'.}
17/10/11 16:09:02 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Waiting for response.}
..
17/10/11 16:11:09 WARN root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Retryable exception thrown. Class = 'java.net.ConnectException', Message = 'Connection timed out (Connection timed out)'.}
17/10/11 16:11:09 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Checking if the operation should be retried. Retry count = '0', HTTP status code = '-1', Error Message = 'An unknown failure occurred : Connection timed out (Connection timed out)'.}
17/10/11 16:11:09 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {The next location has been set to 'PRIMARY', per location mode 'PRIMARY_ONLY'.}
17/10/11 16:11:09 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {The retry policy set the next location to 'PRIMARY' and updated the location mode to 'PRIMARY_ONLY'.}
17/10/11 16:11:09 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Operation will be retried after '0'ms.}
17/10/11 16:11:09 INFO root: {89f867cc-cbd3-4fa9-a549-4e07be3f69b0}: {Retrying failed operation.}
The consumer stops fetching after getting the following error:
{ Error: Local: Broker transport failure at Error (native) origin: 'local', message: 'broker transport failure', code: -1, errno: -1, stack: 'Error: Local: Broker transport failure\n at Error (native)' }
All Kafka brokers are healthy, yet I am still getting the transport failure.
After this error, the consumer stops fetching any further messages.
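A minimal consumer sketch, assuming node-rdkafka (which the error format above suggests; group id, broker list, and topic are placeholders). Transport-level errors surface on the 'event.error' event, so logging them there helps show whether the client keeps polling or silently stops after the failure:

const Kafka = require('node-rdkafka');

const consumer = new Kafka.KafkaConsumer({
  'group.id': 'my-group',                   // hypothetical group id
  'metadata.broker.list': 'broker1:9092',   // hypothetical broker list
  'socket.keepalive.enable': true
}, {});

consumer.on('event.error', err => {
  // e.g. Local: Broker transport failure
  console.error('librdkafka error:', err);
});

consumer.on('ready', () => {
  consumer.subscribe(['my-topic']);          // hypothetical topic
  consumer.consume();
});

consumer.on('data', msg => {
  console.log(msg.value.toString());
});

consumer.connect();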
I'm making HTTP requests using request.js, and I have run my program for a very long time; this is the first time I've gotten this kind of error. It seems related to an OpenSSL/TLS issue. Does anybody have an idea?
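The call is essentially this (a minimal sketch, assuming the request package; the URL and options are placeholders, not the real ones):

const request = require('request');

request({
  url: 'https://example.com/api',   // hypothetical endpoint
  method: 'GET',
  json: true,
  timeout: 30000
}, function (err, res, body) {
  if (err) {
    // The EPROTO / wrong version number error surfaces here
    console.error(err);
    return;
  }
  console.log(res.statusCode, body);
});

The error logged is: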
2017-02-20T14:27:34.330Z - error: Error: write EPROTO 140125135710016:error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number:../deps/openssl/openssl/ssl/s3_pkt.c:365:
at exports._errnoException (util.js:1022:11)
at WriteWrap.afterWrite (net.js:801:14) 2017-02-20T14:27:34.332Z - warn: worker warning, msg={"code":"EPROTO","errno":"EPROTO","syscall":"write"}
2017-02-20T14:27:34.348Z - error: uncaughtException: Cannot read property 'getNegotiatedProtocol' of null ,
version=v6.9.5,
argv=[/usr/bin/nodejs, /home/bda/tm/worker.js],rss=234672128, heapTotal=135876608, heapUsed=107730496, loadavg=[0.43798828125, 0.5263671875, 0.53857421875], uptime=906959,
trace=[column=36, file=_tls_wrap.js, function=TLSSocket._finishInit, line=588, method=_finishInit, native=false, column=38, file=_tls_wrap.js, function=TLSWrap.ssl.onhandshakedone, line=433, method=ssl.onhandshakedone, native=false], stack=[TypeError: Cannot read property 'getNegotiatedProtocol' of null, at TLSSocket._finishInit (_tls_wrap.js:588:36), at TLSWrap.ssl.onhandshakedone (_tls_wrap.js:433:38)]
I am using Spark Job Server (SJS) to create context and submit jobs.
My cluster includes 4 servers.
master1: 10.197.0.3
master2: 10.197.0.4
master3: 10.197.0.5
master4: 10.197.0.6
But only master1 has a public IP.
First of all, I set up ZooKeeper on master1, master2, and master3, with ZooKeeper IDs from 1 to 3.
I intend to use master1, master2, and master3 as the masters of the cluster.
That means I set quorum=2 for the 3 masters.
The ZooKeeper connect string is zk://master1:2181,master2:2181,master3:2181/mesos.
On each server I also start mesos-slave, so I have 4 slaves and 3 masters.
As you can see, all slaves are connected.
But the strange thing is that when I create a job to run, it cannot acquire resources.
From the logs I saw that it keeps declining the offers. These logs are from the master:
I0523 15:01:00.116981 32513 master.cpp:3641] Processing DECLINE call for offers: [ dc18c89f-d802-404b-9221-71f0f15b096f-O4264 ] for framework dc18c89f-d802-404b-9221-71f0f15b096f-0001 (sql_context-1) at scheduler-f5196abd-f420-48c6-b2fe-0306595601d4#10.197.0.3:28765
I0523 15:01:00.117086 32513 master.cpp:3641] Processing DECLINE call for offers: [ dc18c89f-d802-404b-9221-71f0f15b096f-O4265 ] for framework dc18c89f-d802-404b-9221-71f0f15b096f-0001 (sql_context-1) at scheduler-f5196abd-f420-48c6-b2fe-0306595601d4#10.197.0.3:28765
I0523 15:01:01.460502 32508 replica.cpp:673] Replica in VOTING status received a broadcasted recover request from (914)#127.0.0.1:5050
I0523 15:01:02.117753 32510 master.cpp:5324] Sending 1 offers to framework dc18c89f-d802-404b-9221-71f0f15b096f-0000 (sql_context) at scheduler-9b4637cf-4b27-4629-9a73-6019443ed30b#10.197.0.3:28765
I0523 15:01:02.118099 32510 master.cpp:5324] Sending 1 offers to framework dc18c89f-d802-404b-9221-71f0f15b096f-0001 (sql_context-1) at scheduler-f5196abd-f420-48c6-b2fe-0306595601d4#10.197.0.3:28765
I0523 15:01:02.119299 32508 master.cpp:3641] Processing DECLINE call for offers: [ dc18c89f-d802-404b-9221-71f0f15b096f-O4266 ] for framework dc18c89f-d802-404b-9221-71f0f15b096f-0000 (sql_context) at scheduler-9b4637cf-4b27-4629-9a73-6019443ed30b#10.197.0.3:28765
I0523 15:01:02.119858 32515 master.cpp:3641] Processing DECLINE call for offers: [ dc18c89f-d802-404b-9221-71f0f15b096f-O4267 ] for framework dc18c89f-d802-404b-9221-71f0f15b096f-0001 (sql_context-1) at scheduler-f5196abd-f420-48c6-b2fe-0306595601d4#10.197.0.3:28765
I0523 15:01:02.900946 32509 http.cpp:312] HTTP GET for /master/state from 10.197.0.3:35778 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36' with X-Forwarded-For='113.161.38.181'
I0523 15:01:03.118147 32514 master.cpp:5324] Sending 1 offers to framework dc18c89f-d802-404b-9221-71f0f15b096f-0001 (sql_context-1) at scheduler-f5196abd-f420-48c6-b2fe-0306595601d4#10.197.0.3:28765
On one of my slaves I checked:
W0523 14:53:15.487599 32681 status_update_manager.cpp:475] Resending status update TASK_FAILED (UUID: 3c3a022c-2032-4da1-bbab-c367d46e07de) for task driver-20160523111535-0003 of framework a9871c4b-ab0c-4ddc-8d96-c52faf0e66f7-0019
W0523 14:53:15.487773 32681 status_update_manager.cpp:475] Resending status update TASK_FAILED (UUID: cfb494b3-6484-4394-bd94-80abf2e11ee8) for task driver-20160523112724-0001 of framework a9871c4b-ab0c-4ddc-8d96-c52faf0e66f7-0020
I0523 14:53:15.487820 32680 slave.cpp:3400] Forwarding the update TASK_FAILED (UUID: 3c3a022c-2032-4da1-bbab-c367d46e07de) for task driver-20160523111535-0003 of framework a9871c4b-ab0c-4ddc-8d96-c52faf0e66f7-0019 to master#10.197.0.3:5050
I0523 14:53:15.488008 32680 slave.cpp:3400] Forwarding the update TASK_FAILED (UUID: cfb494b3-6484-4394-bd94-80abf2e11ee8) for task driver-20160523112724-0001 of framework a9871c4b-ab0c-4ddc-8d96-c52faf0e66f7-0020 to master#10.197.0.3:5050
I0523 15:02:24.120436 32680 http.cpp:190] HTTP GET for /slave(1)/state from 113.161.38.181:63097 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
W0523 15:02:24.165690 32685 slave.cpp:4979] Failed to get resource statistics for executor 'driver-20160523111535-0003' of framework a9871c4b-ab0c-4ddc-8d96-c52faf0e66f7-0019: Container 'cac7667c-3309-4380-9f95-07d9b888e44e' not found
W0523 15:02:24.165771 32685 slave.cpp:4979] Failed to get resource statistics for executor 'driver-20160523112724-0001' of framework a9871c4b-ab0c-4ddc-8d96-c52faf0e66f7-0020: Container '9c661311-bf7f-4ea6-9348-ce8c7f6cfbcb' not found
From the SJS logs:
[2016-05-23 15:04:10,305] DEBUG oarseMesosSchedulerBackend [] [] - Declining offer: dc18c89f-d802-404b-9221-71f0f15b096f-O4565 with attributes: Map() mem: 63403.0 cpu: 8
[2016-05-23 15:04:10,305] DEBUG oarseMesosSchedulerBackend [] [] - Declining offer: dc18c89f-d802-404b-9221-71f0f15b096f-O4566 with attributes: Map() mem: 47244.0 cpu: 8
[2016-05-23 15:04:10,305] DEBUG oarseMesosSchedulerBackend [] [] - Declining offer: dc18c89f-d802-404b-9221-71f0f15b096f-O4567 with attributes: Map() mem: 47244.0 cpu: 8
[2016-05-23 15:04:10,366] WARN cheduler.TaskSchedulerImpl [] [akka://JobServer/user/context-supervisor/sql_context] - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
[2016-05-23 15:04:10,505] DEBUG cheduler.TaskSchedulerImpl [] [akka://JobServer/user/context-supervisor/sql_context] - parentName: , name: TaskSet_0, runningTasks: 0
[2016-05-23 15:04:11,306] DEBUG oarseMesosSchedulerBackend [] [] - Declining offer: dc18c89f-d802-404b-9221-71f0f15b096f-O4568 with attributes: Map() mem: 47244.0 cpu: 8
[2016-05-23 15:04:11,306] DEBUG oarseMesosSchedulerBackend [] [] - Declining offer: dc18c89f-d802-404b-9221-71f0f15b096f-O4569 with attributes: Map() mem: 63403.0 cpu: 8
[2016-05-23 15:04:11,505] DEBUG cheduler.TaskSchedulerImpl [] [akka://JobServer/user/context-supervisor/sql_context] - parentName: , name: TaskSet_0, runningTasks: 0
[2016-05-23 15:04:12,308] DEBUG oarseMesosSchedulerBackend [] [] - Declining offer: dc18c89f-d802-404b-9221-71f0f15b096f-O4570 with attributes: Map() mem: 47244.0 cpu: 8
[2016-05-23 15:04:12,505] DEBUG cheduler.TaskSchedulerImpl [] [akka://JobServer/user/context-supervisor/sql_context] - parentName: , name: TaskSet_0, runningTasks: 0
In master2's logs:
May 23 08:19:44 ants-vps mesos-master[1866]: E0523 08:19:44.273349 1902 process.cpp:1958] Failed to shutdown socket with fd 28: Transport endpoint is not connected
May 23 08:19:54 ants-vps mesos-master[1866]: I0523 08:19:54.274245 1899 replica.cpp:673] Replica in VOTING status received a broadcasted recover request from (1257)#127.0.0.1:5050
May 23 08:19:54 ants-vps mesos-master[1866]: E0523 08:19:54.274533 1902 process.cpp:1958] Failed to shutdown socket with fd 28: Transport endpoint is not connected
May 23 08:20:04 ants-vps mesos-master[1866]: I0523 08:20:04.275291 1897 replica.cpp:673] Replica in VOTING status received a broadcasted recover request from (1260)#127.0.0.1:5050
May 23 08:20:04 ants-vps mesos-master[1866]: E0523 08:20:04.275512 1902 process.cpp:1958] Failed to shutdown socket with fd 28: Transport endpoint is not connected
From master3:
May 23 08:21:05 ants-vps mesos-master[22023]: I0523 08:21:05.994082 22042 recover.cpp:193] Received a recover response from a replica in EMPTY status
May 23 08:21:15 ants-vps mesos-master[22023]: I0523 08:21:15.994051 22043 recover.cpp:109] Unable to finish the recover protocol in 10secs, retrying
May 23 08:21:15 ants-vps mesos-master[22023]: I0523 08:21:15.994529 22036 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (1282)#127.0.0.1:5050
How can I find the reason for these issues and fix them?