Deploying cassandra-mesos framework with marathon - cassandra

I'm using the latest jar files of the cassandra-mesos framework (via this JSON file: https://teamcity.mesosphere.io/repository/download/Oss_Mesos_Cassandra_CassandraFramework/97399:id/marathon.json), but I'm getting the following errors:
I0310 13:19:34.699774 16389 sched.cpp:264] No credentials provided. Attempting to register without authentication
I0310 13:19:34.701026 16389 sched.cpp:819] Got error 'Completed framework attempted to re-register'
I0310 13:19:34.701038 16389 sched.cpp:1625] Asked to abort the driver
I0310 13:19:34.701364 16389 sched.cpp:861] Aborting framework '20160309-183453-2497969674-5050-19271-0001'
I0310 13:19:34.719744 16373 sched.cpp:1591] Asked to stop the driver
I0310 13:19:34.719784 16389 sched.cpp:835] Stopping framework '20160309-183453-2497969674-5050-19271-0001'
Any idea?

The error says 'Completed framework attempted to re-register', which means the framework keeps its state somewhere (probably in ZooKeeper, but I cannot access the URL of your marathon.json to verify) and therefore tries to start with the framework ID stored in that state. However, that framework ID has already been deregistered, and Mesos does not allow a framework to register again with the ID of a completed framework.
The solution is either to pick a different znode for the framework's storage or to remove the existing znode before starting the framework.
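In case it helps, here is a rough sketch of the second option (removing the existing znode). It is only an illustration under a few assumptions: the helper name remove_framework_state and the InMemoryZk stand-in are made up for this example, and the client is assumed to expose exists(path) and delete(path, recursive=True) the way kazoo's KazooClient does. The real znode path is whatever your marathon.json configures, so double-check it before deleting anything.

```python
def remove_framework_state(zk, znode):
    """Delete stored framework state so the framework registers with a new ID.

    `zk` is assumed to be an already started client exposing
    exists(path) and delete(path, recursive=True), as kazoo's
    KazooClient does. Returns True if a znode was removed.
    """
    if zk.exists(znode) is None:
        return False  # nothing stored, nothing to clean up
    zk.delete(znode, recursive=True)
    return True


class InMemoryZk:
    """Hypothetical stand-in for a ZooKeeper client, for dry runs only."""

    def __init__(self, paths):
        self.paths = set(paths)

    def exists(self, path):
        # Mimic a "stat or None" style return value.
        return object() if path in self.paths else None

    def delete(self, path, recursive=False):
        children = {p for p in self.paths if p.startswith(path + "/")}
        if children and not recursive:
            raise RuntimeError("znode has children")
        self.paths -= children | {path}


# Against a real cluster (assumes the third-party kazoo package is installed
# and that the host and path match your setup):
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="mesos-master-2:2181")
#   zk.start()
#   remove_framework_state(zk, "/cassandra-mesos/cassandra-mesos-fw")
#   zk.stop()
```

Stop the framework in Marathon before removing the znode, otherwise the running scheduler may simply write its state back.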

Thanks a lot :-). It's working now, but when I tried to check ZooKeeper for cassandra-mesos with the command below, I got the following error:
mesos-resolve zk://mesos-master-2:2181/cassandra-mesos/cassandra-mesos-fw
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#716: Client environment:host.name=mesos-slave-1
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#723: Client environment:os.name=Linux
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#724: Client environment:os.arch=3.10.0-327.4.4.el7.x86_64
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#725: Client environment:os.version=#1 SMP Tue Jan 5 16:07:00 UTC 2016
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#733: Client environment:user.name=root
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#741: Client environment:user.home=/root
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#log_env#753: Client environment:user.dir=/ephemeral/cassandra-mesos
2016-03-13 12:46:22,428:26613(0x7fa4fa843700):ZOO_INFO#zookeeper_init#786: Initiating client connection, host=mesos-master-2:2181 sessionTimeout=10000 watcher=0x7fa5023200b0 sessionId=0 sessionPasswd= context=0x7fa4d8001ec0 flags=0
2016-03-13 12:46:22,429:26613(0x7fa4f6628700):ZOO_INFO#check_events#1703: initiated connection to server [10.254.227.148:2181]
2016-03-13 12:46:22,434:26613(0x7fa4f6628700):ZOO_INFO#check_events#1750: session establishment complete on server [10.254.227.148:2181], sessionId=0x25364fb9f3a0020, negotiated timeout=10000
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0313 12:46:22.434587 26616 group.cpp:313] Group process (group(1)#10.254.235.46:56890) connected to ZooKeeper
I0313 12:46:22.434659 26616 group.cpp:787] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0313 12:46:22.434670 26616 group.cpp:385] Trying to create path '/cassandra-mesos/cassandra-mesos-fw' in ZooKeeper
Failed to detect master from 'zk://mesos-master-2:2181/cassandra-mesos/cassandra-mesos-fw' within 5secs

Related

Error on running integration tests while building hono from source - certificate expired

I followed the steps for building hono from source from this page https://www.eclipse.org/hono/docs/dev-guide/building_hono/
The build completes without errors, but when running the integration tests, I receive lots of errors related to timeouts and expired certificates. Here's an excerpt of the log:
HTTP11:59:07.040 [main] INFO o.e.h.adapter.http.impl.Application - The following profiles are active: prod
ARTEMIS2020-10-29 11:59:09,455 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) while handshaking with /172.19.0.5:38356 has occurred.
QPID2020-10-29 11:59:09.827479 +0000 SERVER (info) [C4] Connection to hono-artemis.hono:5671 failed: amqp:connection:framing-error SSL Failure: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed
Is it possible to update the certificates or is this not the main problem?
Here's the link to the complete log file
You are probably building from one of the previously released tags rather than from the master branch, right?
If so, the demo certificates included in the source code may have expired since the time of the release. You can re-create the certificates by running the demo-certs/create_certs.sh script. Note that this only needs to be done once; it is not necessary to do it before each build.
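If you want to confirm that expiry really is the cause before re-creating anything, a small check along these lines might help. It is only a sketch: the helper name cert_expired is made up for this example, and it assumes you feed it the line printed by `openssl x509 -enddate -noout -in <cert>` for one of the demo certificates.

```python
import ssl
import time


def cert_expired(not_after, now=None):
    """Return True if a certificate's notAfter timestamp is in the past.

    `not_after` is expected in the format printed by
    `openssl x509 -enddate -noout -in <cert>`, with or without the
    leading "notAfter=" prefix, e.g. "Jan  1 00:00:00 2020 GMT".
    """
    prefix = "notAfter="
    if not_after.startswith(prefix):
        not_after = not_after[len(prefix):]
    # ssl.cert_time_to_seconds parses the GMT timestamp into epoch seconds.
    expires_at = ssl.cert_time_to_seconds(not_after.strip())
    if now is None:
        now = time.time()
    return now >= expires_at
```

If this returns True for the certificates from the release tag you checked out, re-running create_certs.sh as described above should be the fix.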

Client stuck at 'Authenticating' phase

I've set up AzerothCore on a VPS following this tutorial. I've created an account, but when I try to log in, my client gets stuck at the authenticating phase.
I followed the tutorial completely except I had to download the data files from the recommended link in the AzerothCore Wiki, because my worldserver did not recognize the files provided in the tutorial.
I've checked the config files and database and everything seems ok. The address is what it should be (my VPS address) and ports seem to be ok, too. I've tried redownloading the client (WoWmane WotLK 3.3.5a client, with WoD models), checking my firewall (exceptions added for the WoW client) and checking the realmlist.wtf file and config file, to no avail. My folder is not read-only and I'm really lost now.
EDIT: I've now managed to reproduce the 'malformed packet' error. I started the auth server, then the world server, then tried to log in, and shut down both servers after the client got stuck again. Here is the relevant portion of the server log file:
2019-08-25 03:03:49 ERROR: WORLD: World initialized in 0 minutes 13 seconds
2019-08-25 03:03:49
2019-08-25 03:03:49 worldserver process priority class set to -15
2019-08-25 03:03:49 Max allowed socket connections 1024
2019-08-25 03:03:49 Starting up Auction House Listing thread...
2019-08-25 03:03:49 AzerothCore rev. 2f74802d03d5 2019-08-23 22:22:26 +0200 (master branch) (Unix, Release) (worldserver-daemon) ready...
2019-08-25 03:04:16 ERROR: WorldSocket::handle_input_header(): client (account: 0, char [GUID: 0, name: <none>]) sent malformed packet (size: 8, cmd: 1867972643)
2019-08-25 03:04:42 Auction House Listing thread exiting without problems.
2019-08-25 03:04:42 Halting process...

Kafka Zookeeper Security Authentication & Authorization(JAAS) Using SASL

Regarding Kafka-ZooKeeper security using DIGEST-MD5 authentication, I am trying to rotate (change) the credentials/password in the JAAS config files of both the server (ZooKeeper) and the client (Kafka).
We have a cluster of 3 ZooKeeper nodes and 3 Kafka broker nodes with the JAAS configuration files below.
kafka.conf
Client {
org.apache.zookeeper.server.auth.DigestLoginModule required
username="super"
password="password";
};
zookeeper.conf
Server {
org.apache.zookeeper.server.auth.DigestLoginModule required
user_super="password";
};
To rotate, we update the credential (password) and do a rolling restart of the server (ZooKeeper) instances. During that rolling restart, after updating the same password for the super user on the clients (Kafka instances) one at a time, we notice these INFO-level messages in the server logs:
[2019-06-15 17:17:38,929] INFO [ZooKeeperClient] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2019-06-15 17:17:38,929] INFO [ZooKeeperClient] Connected. (kafka.zookeeper.ZooKeeperClient)
This eventually results in an unclean shutdown and restart of the broker, which impacts writes and reads for longer than expected. I have tried commenting out requireClientAuthScheme=sasl in ZooKeeper's zoo.cfg (https://cwiki.apache.org/confluence/display/ZOOKEEPER/Client-Server+mutual+authentication) to allow any client to authenticate to ZooKeeper, but without success.
As an alternative approach, I tried to update the credential (password) in the JAAS config dynamically using sasl.jaas.config, but I get the same exception documented in this Jira issue (reference: https://issues.apache.org/jira/browse/KAFKA-8010).
Does anyone have any suggestions? Thanks in advance.

prestodb| worker not found & v1/collector/general is not returning any values

Hi, I have configured PrestoDB with one coordinator and one worker.
When I run the worker, I get messages like these:
Discovery server connect succeeded for refresh (presto/general)
Discovery server connect succeeded for refresh (collector/general)
io.airlift.discovery.client.Announcer Discovery server connect succeeded for announce
However, when I run a query, it says the worker is not available.
Also, when I check whether the URLs below work:
http://<master>/v1/service/presto/general - works (I can see both nodes)
However, when I use
http://<master>//v1/service/collector/general - it doesn't work; the result is below:
{"environment":"dev","services":[]}
Server config.properties
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8000
query.max-memory=50GB
query.max-memory-per-node=3GB
discovery-server.enabled=true
discovery.uri=http://gdcrtdev01.[domain]:8000
Worker config.properties
coordinator=false
http-server.http.port=8000
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=http://gdcrtdev01.[domain]:8000

We are running a MapReduce/Spark job to bulk load HBase data in one of the environments

We are running a MapReduce/Spark job to bulk load HBase data in one of the environments.
While it runs, the connection to the HBase ZooKeeper cannot be initialized, and the following error is thrown.
16/05/10 06:36:10 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=c321shu.int.westgroup.com:2181,c149jub.int.westgroup.com:2181,c167rvm.int.westgroup.com:2181 sessionTimeout=90000 watcher=hconnection-0x74b47a30, quorum=c321shu.int.westgroup.com:2181,c149jub.int.westgroup.com:2181,c167rvm.int.westgroup.com:2181, baseZNode=/hbase
16/05/10 06:36:10 INFO zookeeper.ClientCnxn: Opening socket connection to server c321shu.int.westgroup.com/10.204.152.28:2181. Will not attempt to authenticate using SASL (unknown error)
16/05/10 06:36:10 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /10.204.24.16:35740, server: c321shu.int.westgroup.com/10.204.152.28:2181
16/05/10 06:36:10 INFO zookeeper.ClientCnxn: Session establishment complete on server c321shu.int.westgroup.com/10.204.152.28:2181, sessionid = 0x5534bebb441bd3f, negotiated timeout = 60000
16/05/10 06:36:11 INFO mapreduce.HFileOutputFormat2: Looking up current regions for table ecpdevv1patents:NormNovusDemo
Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Tue May 10 06:36:11 CDT 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller#3927df20, java.io.IOException: Call to c873gpv.int.westgroup.com/10.204.67.9:60020 failed on local exception: java.io.EOFException
We have executed the same job in Titan DEV too, but we are facing the same problem. Please let us know if anyone has faced this problem before.
Details:
• Earlier, the job was failing to connect to localhost/127.0.0.1:2181. Hence only the property hbase.zookeeper.quorum has been set in the MapReduce code, to c149jub.int.westgroup.com,c321shu.int.westgroup.com,c167rvm.int.westgroup.com, which we got from hbase-site.xml.
• We are using the CDH 5.3.3 jars.
