Using CCM with OpsCenter and manual agent installation (Cassandra)
Using CCM, I installed a four-node local cluster (127.0.0.1-127.0.0.4):
ccm create -v 2.1.5 -n 4 gp
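A note on ports, since it matters later: CCM gives every node its own loopback address, but by default it assigns JMX ports sequentially (7100 for node1 up to 7400 for node4). If the cluster is running, the assignments can be verified with CCM itself; a minimal sketch, assuming the cluster name gp used above:
ccm start
ccm status
ccm node4 show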
I'm trying to install the agents manually in order to use OpsCenter (5.2.3.2015121015). I started OpsCenter with the JMX port removed from its cluster configuration, so the configuration is:
[jmx]
username =
password =
[agents]
[cassandra]
username =
seed_hosts = 127.0.0.1,127.0.0.2,127.0.0.3,127.0.0.4
password =
cql_port = 9042
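(In a tarball install of OpsCenter 5.x this per-cluster file normally lives at conf/clusters/<cluster_name>.conf, here conf/clusters/gp.conf; that path is an assumption based on the default layout.) A quick way to confirm that the seed hosts really answer on the CQL port, assuming a cqlsh recent enough to support -e:
for ip in 127.0.0.1 127.0.0.2 127.0.0.3 127.0.0.4; do cqlsh $ip 9042 -e "SELECT release_version FROM system.local"; done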
I've configured all four agents. For example, agent3's configuration is:
[Giampaolo]: ~/opscenter/> cat agent3/conf/address.yaml
stomp_interface: "127.0.0.1"
agent_rpc_interface: 127.0.0.3
jmx_host: 127.0.0.3
jmx_port: 7300
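All four files follow the same pattern (stomp_interface pointing at the host running opscenterd, per-node RPC and JMX settings), so they can be generated in one shot; a sketch mirroring the values above (as the edit at the end of this post shows, the jmx_host line is the one that needs care):
for i in 1 2 3 4; do
cat > agent$i/conf/address.yaml <<EOF
stomp_interface: "127.0.0.1"
agent_rpc_interface: 127.0.0.$i
jmx_host: 127.0.0.$i
jmx_port: 7${i}00
EOF
done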
I started OpsCenter in the foreground; as I start agent4, I get this error:
2015-12-30 17:46:21+0100 [gp] ERROR: The state of the following nodes could not be determined, most likely due to agents on those nodes not being properly connected: [<Node 127.0.0.4='4611686018427387904'>, <Node 127.0.0.3='0'>, <Node 127.0.0.2='-4611686018427387904'>, <Node 127.0.0.1='-9223372036854775808'>]
2015-12-30 17:46:24+0100 [gp] INFO: Agent for ip 127.0.0.4 is version None
2015-12-30 17:46:24+0100 [gp] INFO: Agent for ip 127.0.0.4 is version u'5.2.3'
2015-12-30 17:46:37+0100 [gp] INFO: Nodes without agents: 127.0.0.3, 127.0.0.2, 127.0.0.1
2015-12-30 17:46:51+0100 [gp] ERROR: The state of the following nodes could not be determined, most likely due to agents on those nodes not being properly connected: [<Node 127.0.0.4='4611686018427387904'>, <Node 127.0.0.3='0'>, <Node 127.0.0.2='-4611686018427387904'>, <Node 127.0.0.1='-9223372036854775808'>]
I'm a bit confused: the log first tells me that an agent was found for node 4, but shortly afterwards the error reappears on the console.
The agent4 log shows no errors at the beginning (only INFO severity):
[Giampaolo]: ~/opscenter/>agent4/bin/datastax-agent -f
INFO [main] 2015-12-30 17:46:20,399 Loading conf files: ./conf/address.yaml
INFO [main] 2015-12-30 17:46:20,461 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.8.0_40
INFO [main] 2015-12-30 17:46:20,462 DataStax Agent version: 5.2.3
INFO [main] 2015-12-30 17:46:20,489 Default config values: {:cassandra_port 9042, :rollups300_ttl 2419200, :finished-request-cache-size 100, :settings_cf "settings", :agent_rpc_interface "127.0.0.4", :restore_req_update_period 60, :my_channel_prefix "/agent", :poll_period 60, :monitored_cassandra_pass "*REDACTED*", :thrift_conn_timeout 10000, :cassandra_pass "*REDACTED*", :rollups60_ttl 604800, :stomp_port 61620, :shorttime_interval 10, :longtime_interval 300, :max-seconds-to-sleep 25, :private-conf-props {:cassandra.yaml #{"broadcast_address" "rpc_address" "broadcast_rpc_address" "listen_address" "initial_token"}, :cassandra-rackdc.properties #{}}, :thrift_port 9160, :agent-conf-group "global-cluster-agent-group", :jmx_host "127.0.0.4", :ec2_metadata_api_host "169.254.169.254", :metrics_enabled 1, :async_queue_size 5000, :backup_staging_dir nil, :remote_verify_max 30000, :disk_usage_update_period 60, :throttle-bytes-per-second 500000, :rollups7200_ttl 31536000, :trace_delay 300, :remote_backup_retries 3, :cassandra_user "*REDACTED*", :ssl_keystore nil, :rollup_snapshot_period 300, :is_package false, :monitor_command "/usr/share/datastax-agent/bin/datastax_agent_monitor", :thrift_socket_timeout 5000, :remote_verify_initial_delay 1000, :cassandra_log_location "/var/log/cassandra/system.log", :restore_on_transfer_failure false, :ssl_keystore_password "*REDACTED*", :tmp_dir "/var/lib/datastax-agent/tmp/", :monitored_thrift_port 9160, :config_md5 nil, :jmx_port 7400, :jmx_metrics_threadpool_size 4, :use_ssl 0, :max_pending_repairs 5, :rollups86400_ttl 0, :monitored_cassandra_user "*REDACTED*", :nodedetails_threadpool_size 3, :api_port 61621, :monitored_ssl_keystore nil, :slow_query_fetch_size 2000, :kerberos_service nil, :backup_file_queue_max 10000, :jmx_thread_pool_size 5, :production 1, :monitored_cassandra_port 9042, :runs_sudo 1, :max_file_transfer_attempts 30, :config_encryption_active false, :running-request-cache-size 500, :monitored_ssl_keystore_password "*REDACTED*", :stomp_interface "127.0.0.1", :storage_keyspace "OpsCenter", :hosts ["127.0.0.1"], :rollup_snapshot_threshold 300, :jmx_retry_timeout 30, :unthrottled-default 10000000000, :multipart-chunk-size 5000000, :remote_backup_retry_delay 5000, :sstableloader_max_heap_size nil, :jmx_operations_pool_size 4, :slow_query_refresh 5, :remote_backup_timeout 1000, :slow_query_ignore ["OpsCenter" "dse_perf"], :max_reconnect_time 15000, :seconds-to-read-kill-channel 0.005, :slow_query_past 3600000, :realtime_interval 5, :pdps_ttl 259200}
INFO [main] 2015-12-30 17:46:20,492 Waiting for the config from OpsCenter
INFO [main] 2015-12-30 17:46:20,493 Attempting to determine Cassandra's broadcast address through JMX
INFO [main] 2015-12-30 17:46:20,494 Starting Stomp
INFO [main] 2015-12-30 17:46:20,495 Starting up agent communcation with OpsCenter.
INFO [main] 2015-12-30 17:46:24,595 Reconnecting to a backup OpsCenter instance
INFO [main] 2015-12-30 17:46:24,597 SSL communication is disabled
INFO [main] 2015-12-30 17:46:24,597 Creating stomp connection to 127.0.0.1:61620
INFO [StompConnection receiver] 2015-12-30 17:46:24,603 Reconnecting in 0s.
INFO [StompConnection receiver] 2015-12-30 17:46:24,608 Connected to 127.0.0.1:61620
INFO [main] 2015-12-30 17:46:24,609 Starting Jetty server: {:join? false, :ssl? false, :host "127.0.0.4", :port 61621}
INFO [Initialization] 2015-12-30 17:46:24,608 Sleeping for 2s before trying to determine IP over JMX again
INFO [StompConnection receiver] 2015-12-30 17:46:24,681 Got new config from OpsCenter [note values in address.yaml override those from OpsCenter]: {:cassandra_port 9042, :rollups300_ttl 2419200, :destinations [], :restore_req_update_period 1, :monitored_cassandra_pass "*REDACTED*", :cassandra_pass "*REDACTED*", :cassandra_rpc_interface "127.0.0.4", :rollups60_ttl 604800, :jmx_pass "*REDACTED*", :thrift_port 9160, :ec2_metadata_api_host "169.254.169.254", :metrics_enabled 1, :backup_staging_dir "", :rollups7200_ttl 31536000, :cassandra_user "*REDACTED*", :jmx_user "*REDACTED*", :metrics_ignored_column_families "", :cassandra_log_location "/var/log/cassandra/system.log", :monitored_thrift_port 9160, :config_md5 "e78e9aaea4de0b15ec94b11c6b2788d5", :provisioning 0, :use_ssl 0, :max_pending_repairs 5, :rollups86400_ttl -1, :monitored_cassandra_user "*REDACTED*", :api_port "61621", :monitored_cassandra_port 9042, :storage_keyspace "OpsCenter", :hosts ["127.0.0.4"], :metrics_ignored_solr_cores "", :metrics_ignored_keyspaces "system, system_traces, system_auth, dse_auth, OpsCenter", :rollup_subscriptions [], :jmx_operations_pool_size 4, :cassandra_install_location ""}
INFO [StompConnection receiver] 2015-12-30 17:46:24,693 Couldn't get broadcast address, will retry in five seconds.
INFO [Jetty] 2015-12-30 17:46:24,715 Jetty server started
INFO [Initialization] 2015-12-30 17:46:26,615 Sleeping for 4s before trying to determine IP over JMX again
INFO [StompConnection receiver] 2015-12-30 17:46:29,696 Couldn't get broadcast address, will retry in five seconds.
However, after a while:
INFO [qtp153482676-24] 2015-12-30 17:49:07,057 HTTP: :get /cassandra/conf {:private_props "True"} - 500
ERROR [qtp153482676-24] 2015-12-30 17:49:09,084 Unhandled route Exception (:bad-permissions): Unable to locate the cassandra.yaml configuration file. If your configuration file is not located with the Cassandra install, please set the 'conf_location' option in the Cassandra section of the OpsCenter cluster configuration file and restart opscenterd. Checked the following locations: /etc/dse/cassandra/cassandra.yaml, /etc/cassandra/conf/cassandra.yaml, /etc/cassandra/cassandra.yaml
INFO [qtp153482676-24] 2015-12-30 17:49:09,085 HTTP: :get /cassandra/conf {:private_props "True"} - 500
INFO [StompConnection receiver] 2015-12-30 17:49:09,845 Couldn't get broadcast address, will retry in five seconds.
ERROR [qtp153482676-19] 2015-12-30 17:49:11,102 Unhandled route Exception (:bad-permissions): Unable to locate the cassandra.yaml configuration file. If your configuration file is not located with the Cassandra install, please set the 'conf_location' option in the Cassandra section of the OpsCenter cluster configuration file and restart opscenterd. Checked the following locations: /etc/dse/cassandra/cassandra.yaml, /etc/cassandra/conf/cassandra.yaml, /etc/cassandra/cassandra.yaml
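The message itself hints at a workaround: the agent only probes the standard package locations for cassandra.yaml, while a CCM node keeps its configuration under ~/.ccm. A sketch of what the error suggests, added to the [cassandra] section of the OpsCenter per-cluster configuration file (the path follows the CCM layout shown in the edit below; ~ may need expanding to an absolute path, and with CCM each node has its own conf directory, so a single value can only point at one of them):
[cassandra]
conf_location = ~/.ccm/gp/node1/conf/cassandra.yaml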
After this I tried to add the other agents, but I got strange results: it looked like two agents were attaching to the same C* node, so I stopped trying, thinking this error was causing the others.
Questions:
What is the error in the OpsCenter log?
Is it somehow related to the error in the agent log?
Am I missing something in the configuration (are more details needed)?
Why does OpsCenter complain about a missing cassandra.yaml file? Shouldn't the agent be deployable on any host, even one without a local C* installation?
Thanks in advance
Edit
I've added some additional information:
[Giampaolo]: ~/> netstat -an | grep 7400
tcp4 0 0 127.0.0.1.7400 *.* LISTEN
Should it be 127.0.0.4?
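Extending the same check to the other nodes' JMX ports shows where each one is actually bound (a quick loop over the CCM default ports):
for p in 7100 7200 7300 7400; do netstat -an | grep $p | grep LISTEN; done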
This is (part of) the configuration for node4:
[Giampaolo]: ~/.ccm/gp/node4/conf/> cat cassandra.yaml | grep 127
listen_address: 127.0.0.4
rpc_address: 127.0.0.4
- seeds: 127.0.0.1,127.0.0.2,127.0.0.3,127.0.0.4
and this is the address.yaml for node4:
[Giampaolo]: ~/opscenter/agent4/conf/> cat address.yaml
stomp_interface: "127.0.0.1"
agent_rpc_interface: 127.0.0.4
jmx_host: 127.0.0.4
jmx_port: 7400
alias: "the4node"
The configuration was OK, except for one little detail: in every agent's address.yaml, localhost must be used instead of the node's IP address in the jmx_host parameter, as suggested on the cassandra-user mailing list.
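For completeness, the working address.yaml for node4's agent differs from the one shown above only in that line:
stomp_interface: "127.0.0.1"
agent_rpc_interface: 127.0.0.4
jmx_host: localhost
jmx_port: 7400
alias: "the4node"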