Cannot add members to MongoDB Replica Set - linux

I'm trying to configure a MongoDB Replica Set but every time I try to add another member it fails.
I have 3 members I'm trying to configure. Their mongod.conf files all look like this:
# mongo.conf
#where to log
logpath=/log/mongod.log
logappend=true
# fork and run in background
fork = true
smallfiles=true
rest=true
port = 27017
replSet=KidzpaceReplSet
dbpath=/data
With the exception of the ports, which are 27017 (primary), 27018 (secondary) and 27019 (arbiter) respectively.
I have verified that the members can see each other:
[ec2-user@domU-12-31-39-06-C4-74 ~]$ mongo --host 174.129.232.170 --port 27018
MongoDB shell version: 2.4.3
connecting to: 174.129.232.170:27018/test
>
[ec2-user@domU-12-31-39-0A-30-E8 ~]$ mongo --host 174.129.230.20 --port 27017
MongoDB shell version: 2.4.3
connecting to: 174.129.230.20:27017/test
>
When adding the second member to the set it returns OK:
KidzpaceReplSet:PRIMARY> rs.add("174.129.232.170:27018")
{ "ok" : 1 }
However, whatever command I run next (in this case, adding my arbiter), the set fails with this error:
KidzpaceReplSet:PRIMARY> rs.add("174.129.232.177:27019", true)
Tue May 28 20:24:07.139 DBClientCursor::init call() failed
Tue May 28 20:24:07.140 trying reconnect to 127.0.0.1:27017
Tue May 28 20:24:07.141 reconnect 127.0.0.1:27017 ok
reconnected to server after rs command (which is normal)
This is the log file:
Tue May 28 20:44:06.173 [rsStart] replSet I am domU-12-31-39-06-C4-74:27017
Tue May 28 20:44:06.173 [rsStart] replSet STARTUP2
Tue May 28 20:44:07.175 [rsSync] replSet SECONDARY
Tue May 28 20:44:07.175 [rsMgr] replSet info electSelf 0
Tue May 28 20:44:08.174 [rsMgr] replSet PRIMARY
Tue May 28 20:44:29.813 [conn1] replSet replSetReconfig config object parses ok, 2 members specified
Tue May 28 20:44:29.817 [conn1] replSet replSetReconfig [2]
Tue May 28 20:44:29.817 [conn1] replSet info saving a newer config version to local.system.replset
Tue May 28 20:44:29.834 [conn1] replSet saveConfigLocally done
Tue May 28 20:44:29.834 [conn1] replSet info : additive change to configuration
Tue May 28 20:44:29.834 [conn1] replSet replSetReconfig new config saved locally
Tue May 28 20:44:39.835 [rsHealthPoll] DBClientCursor::init call() failed
Tue May 28 20:44:39.835 [rsHealthPoll] replset info 174.129.232.170:27018 heartbeat failed, retrying
Tue May 28 20:44:40.834 [rsHealthPoll] DBClientCursor::init call() failed
Tue May 28 20:44:40.834 [rsHealthPoll] replSet info 174.129.232.170:27018 is down (or slow to respond):
Tue May 28 20:44:40.835 [rsHealthPoll] replSet member 174.129.232.170:27018 is now in state DOWN
Tue May 28 20:44:40.835 [rsMgr] replSet total number of votes is even - add arbiter or give one member an extra vote
Tue May 28 20:44:40.835 [rsMgr] can't see a majority of the set, relinquishing primary
Tue May 28 20:44:40.835 [rsMgr] replSet relinquishing primary state
Tue May 28 20:44:40.835 [rsMgr] replSet SECONDARY
Tue May 28 20:44:40.835 [rsMgr] replSet closing client sockets after relinquishing primary
Tue May 28 20:44:42.044 [conn1] end connection 127.0.0.1:58727 (0 connections now open)
Tue May 28 20:44:46.150 [rsHealthPoll] replSet member 174.129.232.170:27018 is up
Tue May 28 20:44:46.151 [rsMgr] replSet not electing self, not all members up and we have been up less than 5 minutes
Tue May 28 20:44:52.156 [rsMgr] replSet not electing self, not all members up and we have been up less than 5 minutes
UPDATE
I'm wondering if maybe the problem is when I run rs.initiate(). It gives me this output:
{
    "set" : "KidzpaceReplSet",
    "date" : ISODate("2013-05-28T20:59:05Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "domU-12-31-39-06-C4-74:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 23,
            "optime" : {
                "t" : 1369774732,
                "i" : 1
            },
            "optimeDate" : ISODate("2013-05-28T20:58:52Z"),
            "self" : true
        }
    ],
    "ok" : 1
}
Notice the name of the member? "name" : "domU-12-31-39-06-C4-74:27017" Where does this name come from? It's not my IP Address. I'm not sure but maybe this could be the source of the problem.

So it turns out rs.initiate() might give the member that launches it some kind of internal alias for its IP address. In my case it was: domU-12-31-39-06-C4-74.
The initial connection to the secondary is fine because the primary instigates it. However, the secondary is then given this alias as the primary's address, and when it tries to talk back to the primary using that name, it fails (apparently because the secondary cannot resolve or reach it).
The solution was to copy the existing configuration:
cfg = rs.conf()
manually change the name (host) of the primary node to an address the other members can reach (the IP below is a placeholder):
cfg.members[0].host = "666.666.666.666:27017"
And reconfigure the replica set:
rs.reconfig(cfg)
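For illustration only, the same edit can be sketched in Python as a plain transformation of the config document. This is not part of the original answer: the helper name `with_member_host` is hypothetical, the config dict is a trimmed stand-in for what `rs.conf()` returns, and the replacement address is the primary's IP taken from the connection test in the question. The version bump mirrors what the `rs.reconfig()` shell helper does for you; it only matters if the document is sent with the raw `replSetReconfig` command.

```python
import copy


def with_member_host(cfg, member_id, new_host):
    """Return a copy of a replica set config with one member's host replaced."""
    new_cfg = copy.deepcopy(cfg)
    for member in new_cfg["members"]:
        if member["_id"] == member_id:
            member["host"] = new_host
            break
    else:
        raise KeyError(f"no member with _id {member_id}")
    # A reconfig is expected to carry a higher version than the current config.
    new_cfg["version"] = cfg.get("version", 0) + 1
    return new_cfg


# Trimmed stand-in for the document returned by rs.conf() (assumed shape).
cfg = {
    "_id": "KidzpaceReplSet",
    "version": 1,
    "members": [{"_id": 0, "host": "domU-12-31-39-06-C4-74:27017"}],
}

fixed = with_member_host(cfg, 0, "174.129.230.20:27017")
print(fixed["members"][0]["host"])
print(fixed["version"])
```

The original dict is left untouched, so the old and new configurations can be compared before anything is submitted.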

Related

How to see python script errors being run from rsyslog action

This is my action in rsyslog.conf:
module(load="omprog")
if( $msg contains "UPDOWN") then {
action(type="omprog" binary="/etc/rsyslog.d/netmiko.py" template="RSYSLOG_TraditionalFileFormat")
}
This is the python script I am working on:
pattern = re.compile('GigabitEthernet0\/\d{1,2}')
def process_line(line):
    state = ''
    if 'to up' in line:
        state = f'UP\n'
    elif 'to down' in line:
        state = f'DOWN\n'
    file = open("/home/blinky/python.log","a")
    result = re.findall(pattern, line)
    if len(result) > 0:
        file.write(f'{result} - {state}')
    file.close()
try:
    msg = sys.stdin.readline()
    file = open("/home/blinky/python.log","a")
    file.write(line)
    file.close()
    process_line(msg)
except Exception as e:
    file = open("/etc/rsyslog.d/python_error.log","a")
    file.write(e)
    file.close()
My problem is debugging the Python script: I cannot see any of the errors it produces. As you can see, I try to write the exception to a file, but nothing shows up there either. This is what I get in the log file after doing a shut / no shut on the switch port:
Nov 20 21:50:39 10.0.0.254 1281: Nov 20 21:50:38.013: %LINK-5-CHANGED: Interface GigabitEthernet0/14, changed state to administratively down
Nov 20 21:50:39 10.0.0.254 1282: Nov 20 21:50:39.013: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/14, changed state to down
Nov 20 21:50:39 repperio rsyslogd: omprog: program '/etc/rsyslog.d/netmiko.py' (pid 2006160) terminated; will be restarted [v8.2112.0 try https://www.rsyslog.com/e/2119 ]
Nov 20 21:50:39 repperio rsyslogd: action 'action-1-omprog' suspended (module 'omprog'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2112.0 try https://www.rsyslog.com/e/2007 ]
Nov 20 21:50:40 repperio rsyslogd: action 'action-1-omprog' resumed (module 'omprog') [v8.2112.0 try https://www.rsyslog.com/e/2359 ]
Nov 20 21:50:43 10.0.0.254 1283: Nov 20 21:50:42.756: %LINK-3-UPDOWN: Interface GigabitEthernet0/14, changed state to up
Nov 20 21:50:43 10.0.0.254 1284: Nov 20 21:50:43.756: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/14, changed state to up
Nov 20 21:50:43 repperio rsyslogd: child process (pid 2006317) exited with status 1 [v8.2112.0]
Nov 20 21:50:43 repperio rsyslogd: omprog: program '/etc/rsyslog.d/netmiko.py' (pid 2006317) terminated; will be restarted [v8.2112.0 try https://www.rsyslog.com/e/2119 ]
Nov 20 21:50:43 repperio rsyslogd: action 'action-1-omprog' suspended (module 'omprog'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2112.0 try https://www.rsyslog.com/e/2007 ]
Nov 20 21:50:44 repperio rsyslogd: action 'action-1-omprog' resumed (module 'omprog') [v8.2112.0 try https://www.rsyslog.com/e/2359 ]
The setup monitors the Cisco switch for interfaces going up and down and triggers the Python script, which in turn will alter the configuration of the switch port using Netmiko. Without the ability to debug the Python script I am scuppered. Any ideas?
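Two things in the posted script would hide errors on their own: `file.write(line)` at module level refers to a name that only exists inside `process_line`, which raises a NameError, and the handler then calls `file.write(e)` with an exception object rather than a string, which raises a TypeError before anything reaches the error log. A minimal sketch of a more debuggable shape follows, assuming the same interface pattern as the question. The function name `handle_stream` and the in-memory demo are illustrative; under omprog the input would be `sys.stdin` and the two outputs would be files opened in append mode in a directory the rsyslog user can write to.

```python
import io
import re
import traceback

PATTERN = re.compile(r"GigabitEthernet0/\d{1,2}")


def process_line(line):
    """Return e.g. 'GigabitEthernet0/14 - DOWN', or None if the line is not relevant."""
    match = PATTERN.search(line)
    if match is None:
        return None
    if "to up" in line:
        return f"{match.group(0)} - UP"
    if "to down" in line:
        return f"{match.group(0)} - DOWN"
    return None


def handle_stream(stream, out, err, handler=process_line):
    """Process every line from stream; write full tracebacks to err instead of dying."""
    for line in stream:
        try:
            result = handler(line)
            if result is not None:
                out.write(result + "\n")
                out.flush()
        except Exception:
            # format_exc() returns a string, so it can be written to a text file.
            err.write(traceback.format_exc())
            err.flush()


# Demo with in-memory streams standing in for stdin and the two log files.
sample = io.StringIO(
    "%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/14, "
    "changed state to down\n"
)
out, err = io.StringIO(), io.StringIO()
handle_stream(sample, out, err)
print(out.getvalue(), end="")
```

Reading in a loop rather than a single `readline()` also keeps the process alive between messages, which may be related to the "terminated; will be restarted" lines in the rsyslog output above.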

How can I use grok instead of if/else conditions?

I have the following log lines, for example:
Fri Jul 24 01:48:47.572 2020 Failed to fetch database name
Fri Jul 24 01:48:47.572 2020 Failed to fetch database name
Fri Jul 24 01:48:47.572 2020 Unable to connect with database
Now I want to distinguish between "Failed to fetch database" and "Unable to connect with database". In the first case I want to add the field "Severity = high" and in the other case "Severity = low". But I don't want to do it with multiple if/else conditions, because the performance won't be good (I have many other cases, not only these two). So I wanted to do it with multiple groks like:
grok {
tag_on_failure => []
match => {"errormessage" => "^%{DATA}Failed to fetch database name%{DATA}" }
}
But this pattern isn't working. Can anyone help me?
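The underlying idea, one table of patterns mapped to severities instead of a hand-written if/else chain, can be sketched outside Logstash. The Python below is only an illustration of that lookup, not Logstash configuration: the `RULES` table and the `severity_for` helper are hypothetical names, and the two patterns are the messages from the question.

```python
import re

# Hypothetical rule table: each entry pairs a message pattern with a severity.
RULES = [
    (re.compile(r"Failed to fetch database name"), "high"),
    (re.compile(r"Unable to connect with database"), "low"),
]


def severity_for(message, default=None):
    """Return the severity of the first rule whose pattern occurs in message."""
    for pattern, severity in RULES:
        if pattern.search(message):
            return severity
    return default


lines = [
    "Fri Jul 24 01:48:47.572 2020 Failed to fetch database name",
    "Fri Jul 24 01:48:47.572 2020 Unable to connect with database",
]
for line in lines:
    print(severity_for(line))
```

Adding another case then means adding one row to the table rather than another branch.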

Running background tasks through dramatiq does not work

I'm trying to run background task processing with dramatiq. Redis and RabbitMQ run in separate Docker containers.
@dramatiq.actor(store_results=True)
def count_words(url):
    try:
        response = requests.get(url)
        count = len(response.text.split(" "))
        print(f"There are {count} words at {url!r}.")
    except requests.exceptions.MissingSchema:
        print(f"Message dropped due to invalid url: {url!r}")
result_backend = RedisBackend(host="172.17.0.2", port=6379)
result_broker = RabbitmqBroker(host="172.17.0.5", port=5672)
result_broker.add_middleware(Results(backend=result_backend))
dramatiq.set_broker(result_broker)
message = count_words.send('https://github.com/Bogdanp/dramatiq')
print(message.get_result(block=True))
RabbitMQ:
{"queue_name":"default","actor_name":"count_words","args":["https://github.com/Bogdanp/dramatiq"],"kwargs":{},"options":{},"message_id":"8e10b6ef-dfef-47dc-9f28-c6e07493efe4","message_timestamp":1608877514655}
Redis
1:C 22 Dec 2020 13:38:15.415 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 22 Dec 2020 13:38:15.417 * Running mode=standalone, port=6379.
1:M 22 Dec 2020 13:38:15.417 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 22 Dec 2020 13:38:15.417 # Server initialized
1:M 22 Dec 2020 13:38:15.417 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 22 Dec 2020 13:38:15.417 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
1:M 25 Dec 2020 10:08:12.274 * Background saving terminated with success
1:M 26 Dec 2020 19:23:59.445 * 1 changes in 3600 seconds. Saving...
1:M 26 Dec 2020 19:23:59.660 * Background saving started by pid 24
24:C 26 Dec 2020 19:23:59.890 * DB saved on disk
24:C 26 Dec 2020 19:23:59.905 * RDB: 4 MB of memory used by copy-on-write
1:M 26 Dec 2020 19:23:59.961 * Background saving terminated with success
Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/dramatiq/message.py", line 147, in get_result
return backend.get_result(self, block=block, timeout=timeout)
File "/usr/local/lib/python3.6/dist-packages/dramatiq/results/backends/redis.py", line 81, in get_result
raise ResultTimeout(message)
dramatiq.results.errors.ResultTimeout: count_words('https://github.com/Bogdanp/dramatiq')

SoapUI Groovy || For loop is looping 50 times although the condition does not use this number anywhere

Here I am just taking an integer value from the properties file and using it in a for loop.
Note: if I use a literal number instead of the "getTestCasePropertyValue" value, it works as expected. I don't understand how the loop ends up running 50 times.
Groovy script:
def getTestCasePropertyValue = testRunner.testCase.getPropertyValue( "NumOfPayments" )
log.info(getTestCasePropertyValue )
for(i=0; i<=getTestCasePropertyValue; i++)
{
log.info("Test Print"+i)
}
Output:
Fri Mar 06 12:58:47 IST 2020:INFO:2
Fri Mar 06 12:58:47 IST 2020:INFO:Test Print0
Fri Mar 06 12:58:47 IST 2020:INFO:Test Print1
Fri Mar 06 12:58:47 IST 2020:INFO:Test Print2
Fri Mar 06 12:58:47 IST 2020:INFO:Test Print3
...
Fri Mar 06 12:58:47 IST 2020:INFO:Test Print50
Your value from the properties is a String. You will detect problems like this more easily if you use .inspect() to log things.
Also, the character '2' is 50 as an integer, and that is what the for loop condition casts it to.
def getTestCasePropertyValue = "2"
println(getTestCasePropertyValue.inspect())
// → '2'
println(getTestCasePropertyValue as char as int)
// → 50
So it is best to explicitly convert the string to a number, e.g. with .toLong():
println(getTestCasePropertyValue.toLong().inspect())
// → 2
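The difference between a character's code and the number it spells can also be seen in Python. Python does not do the implicit coercion that bites here, so this only illustrates the two conversions and the loop bounds they would produce; the variable names are made up for the example.

```python
value = "2"  # property values arrive as text

as_code = ord(value)    # character code of the single character '2'
as_number = int(value)  # the number the text spells

print(repr(value))
print(as_code)
print(as_number)

# Number of iterations of a loop running from 0 up to and including the bound.
iterations_with_code = len(range(0, as_code + 1))
iterations_with_number = len(range(0, as_number + 1))
print(iterations_with_code, iterations_with_number)
```

With the character code as the bound the loop runs from 0 through 50, which matches the "Test Print50" line in the output above.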

Rerun a test case in ReadyAPI using a teardown script

I have a test case "Login" which intermittently fails due to login issues.
I would like to implement a tear down script to get the status of the script and rerun if it failed.
Here is what I implemented and it doesn't work as expected.
testRunner.testCase.setPropertyValue("LoginStatus", testRunner.getStatus().toString())
def loginStatus = context.expand( '${#TestCase#LoginStatus}' )
int retryAttempts = context.expand( '${#Project#RetryAttempts}' ).toInteger()
def myContext = (com.eviware.soapui.support.types.StringToObjectMap)context
while ( loginStatus == "FAIL" && retryAttempts <= 1) {
    retryAttempts = retryAttempts+1
    log.info "increment retry attempts-" + retryAttempts
    testRunner.testCase.testSuite.project.setPropertyValue( "RetryAttempts", retryAttempts.toString() )
    testCase.run(myContext, false)
    log.info "after run statement-"+retryAttempts
}
log.info "before final statement"
testRunner.testCase.testSuite.project.setPropertyValue( "RetryAttempts", "0" )
The script runs 3 times even though it is configured to rerun once. The logs:
Fri May 18 13:55:15 EDT 2018:INFO:increment retry attempts-1
Fri May 18 13:55:16 EDT 2018:INFO:increment retry attempts-2
Fri May 18 13:55:16 EDT 2018:INFO:before final statement
Fri May 18 13:55:16 EDT 2018:INFO:after run statement-2
Fri May 18 13:55:16 EDT 2018:INFO:before final statement
Fri May 18 13:55:16 EDT 2018:INFO:after run statement-1
Fri May 18 13:55:16 EDT 2018:INFO:increment retry attempts-2
Fri May 18 13:55:17 EDT 2018:INFO:before final statement
Fri May 18 13:55:17 EDT 2018:INFO:after run statement-2
Fri May 18 13:55:17 EDT 2018:INFO:before final statement
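The interleaving of "before final statement" and "after run statement" in these logs looks consistent with each nested run executing the teardown script again, although the post does not confirm that. Independent of the tool, a bounded retry that keeps its counter in one place and checks the outcome of every attempt can be sketched as follows. This is Python rather than Groovy, and `run_with_retries` and `run_once` are hypothetical names standing in for "run the Login test case and report whether it passed".

```python
def run_with_retries(run_once, max_retries=1):
    """Call run_once() until it returns True or the retries are used up.

    Returns (passed, calls): whether any attempt passed and how many were made.
    """
    calls = 0
    while True:
        calls += 1
        if run_once():
            return True, calls
        if calls > max_retries:
            return False, calls


# Demo: the first attempt fails, the single allowed retry passes.
outcomes = iter([False, True])
result = run_with_retries(lambda: next(outcomes), max_retries=1)
print(result)
```

Because the loop is flat and the counter is local, one configured retry can never turn into more than two attempts in total.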

Resources