Does gearmand with libdrizzle work while mysql-database is down for a while? - gearman

Use-Case:
gearmand is fully operational with libdrizzle as the persistence layer to a MySQL database.
The drizzle connection breaks (e.g. the gearmand database is locked for a few minutes during nightly backups, the MySQL server crashes, or there are network problems reaching the database server).
Question:
Does gearmand keep working without the MySQL persistence layer during that time and catch up later?

Answer
No.
Details
Debian 6
gearmand 1.1.8 (via https://launchpad.net/gearmand)
exactly 5000 jobs to be created via doBackground
persist the jobs into mysql
/usr/local/sbin/gearmand -q mysql --mysql-user user1 --mysql-password pass1 --mysql-db gearmand
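For reference, a rough sketch of how the 5000 background jobs could be submitted with the gearman command-line client (the original test used PHP's GearmanClient::doBackground(); the function name "reverse" is just a placeholder):
# Submit 5000 background jobs against the gearmand started above.
for i in $(seq 1 5000); do
    gearman -b -f reverse "job $i"
done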
Scenario #1
Scenario:
Enable READ lock for gearman queue table
Result:
The script that creates the background tasks is put on hold.
After removing the READ lock, the script continues and creates all 5000 jobs successfully.
Note: I only tested the lock for a few seconds. With a longer lock the script might fail with a timeout.
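For reference, the READ lock can be taken roughly like this (the table name gearman_queue is assumed to be the default used by the MySQL queue module; adjust it to whatever --mysql-table is set to):
# Hold a READ lock on the queue table for 30 seconds, then release it.
mysql -u user1 -ppass1 gearmand -e "LOCK TABLES gearman_queue READ; SELECT SLEEP(30); UNLOCK TABLES;"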
Scenario #2
Scenario:
Stop the entire mysql server instance (with the gearman queue)
Result:
While mysqld is down, no jobs can be created.
Only 3974 of the 5000 jobs were created.
gearmand output:
mysql_stmt_prepare failed: Can't connect to local MySQL server through socket X
PHP script output:
PHP Warning: GearmanClient::doBackground(): gearman_client_run_tasks:QUEUE_ERROR:QUEUE_ERROR
Unfortunately, in my test scenarios gearmand stops working if the MySQL persistence layer is unavailable.

Related

Nodejs stuck on processing whenever the app is restarted

I have a Node.js application running on Linux. As we all know, whenever I restart the Node.js app it gets a new PID. Suppose that while the app is running, a client connects to it and starts some process whose status is "processing". If the Node.js app restarts (on the server side) at that point, how can we make sure the client connects back to the previous processing state?
What happens now is that whenever the server restarts, the process stays stuck in "processing" forever.
Please just direct me to a sample of how this scenario is handled in real life.
Thank You.
If I'm understanding you correctly, then the answer is you can't...
The reason for this is that when you restart the process, the event loop is restarted, meaning any tasks that were running or waiting in the event loop are gone. You are essentially clearing out the event loop when you restart.
I would say, though, that if you know the process is crashing node, you probably want to look into that process and see why it is crashing, and place it in a try/catch so it won't kill the server.
Now, with that said (and without knowing what "processing state" really means), you could set a flag in your DB for, say, 'job1' and have a status column set to, say, 'running' when it is kicked off. When the node server restarts it can read the job status for 'running' jobs; if a job is in a 'running' state you can fire off the job again and, once complete, update the table to 'completed'.
This is probably not the most efficient way, as it's much better to figure out why the process is crashing, but as a fallback it could work. In a clustered environment it could cause issues, though, because server 1 may re-fire a job while server 2 is still processing it, since server 1 does not know what server 2 is doing. More details about the use case, environment, etc. would probably allow for a better answer.

Keep connection alive using --sysctl with Docker run

I currently have a container that is running node services. The code running in it creates a subscription to the Salesforce Change Data Capture event bus using CometD. This has been working well, but after some time the service will stop receiving events from Salesforce.
I am thinking this is happening because the Alpine Linux container could be marking the connection as broken after no data is received for a while. I have verified that the CometD libraries are creating a connection with keep-alive set to true.
Right now I am trying to increase the keep-alive time by running the container with the command:
docker run --sysctl net.ipv4.tcp_keepalive_time=10800 --sysctl net.ipv4.tcp_keepalive_intvl=60 --sysctl net.ipv4.tcp_keepalive_probes=20 -p 80:3000 <imageid>
My thinking behind this is:
net.ipv4.tcp_keepalive_time=10800
This means that the keepalive routines wait for three hours (10800 secs) before sending the first keepalive probe
net.ipv4.tcp_keepalive_intvl=60
Resend the probe every 60 seconds
net.ipv4.tcp_keepalive_probes=20
If no ACK response is received for 20 consecutive probes, the connection is marked as broken.
I guess what I am asking is whether this is the correct way to run the Docker container so that it applies the sysctl settings I have passed in.
I am new to Docker, so I'm sure I did something that doesn't make sense. Thank you for any suggestions.
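For what it's worth, one way to confirm the --sysctl values were applied is to read them back from inside the running container (the container id is a placeholder):
# Read the keepalive settings back from the container's /proc to verify them.
docker exec <containerid> cat /proc/sys/net/ipv4/tcp_keepalive_time
docker exec <containerid> cat /proc/sys/net/ipv4/tcp_keepalive_intvl
docker exec <containerid> cat /proc/sys/net/ipv4/tcp_keepalive_probes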

Start redis in the foreground and a node server with one command in parallel but in series

There are tools like npm-run-all that allow persistent processes to run in parallel in one process. I am interested in doing this with redis and a node server.
However I am looking for a way to run the two in parallel, but only run the node process when the redis process is verifiably successful.
Is there any unix / bash tool that can achieve what I want?
I can see this working in two ways:
Option 1
A tool that checks for specific stdout from a process. For instance, redis writes "Ready to accept connections" to stdout; the tool would watch for this with a regular expression. When it matches, an internal event would fire and the node server would be run.
Option 2
A tool that checks if/when an HTTP connection is available for a specific server; when it receives a proper health-check response, the internal event is fired and the node server is then run. There would also need to be a timeout involved. The con with this is that it only works for processes that spin up servers and endpoints on a consistent local port.
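A rough sketch of what Option 2 could look like (the health-check URL, the 30-second timeout, and server.js are placeholders):
# Poll a dependency's health endpoint until it answers, then start node.
for i in $(seq 1 30); do
    if curl -sf http://localhost:8080/health > /dev/null; then
        exec node server.js
    fi
    sleep 1
done
echo "dependency never became healthy" >&2
exit 1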
How about a script that waits for a response to the PING command?
#!/bin/bash
# Keep pinging redis until it answers PONG, then start node.
X="$(redis-cli ping)"
echo "${X}"
while [ "${X}" != "PONG" ]; do
    echo "redis not yet ready"
    echo "${X}"
    sleep 50
    X="$(redis-cli ping)"
done
echo 'Lets start node'
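Assuming the script above is saved as something like wait-for-redis.sh, the two processes could then be tied together roughly like this (server.js is a placeholder entry point):
# Start redis in the background, block until it answers PONG, then start node.
redis-server &
./wait-for-redis.sh && node server.js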

How can I prevent similar queues from running at the same time?

We currently process a set of tasks using queue workers in Laravel. When I run multiple php artisan queue:work processes, jobs end up running together (async). We are using Beanstalkd as the queue driver.
The issue is that in the queue worker we are polling an API that only allows one concurrent session for a particular agent_id. That is, only one API call with the same agent_id can run at a time.
We thought of spinning up multiple php artisan queue:work processes with a filter on the queue_name matching the agent_id, but we have over 500 agents, so we would need 500 workers, which is not ideal.
Is there any way to implement a lock-style feature for each agent_id so that if a job is already running for a particular agent_id it will be sent back to the queue? Or are there any features of Beanstalkd that would allow for this?
The other option could also be to gracefully handle the rejection from the API when the user is already logged in (and send the job back to the queue). But this could get messy and could clutter the logs.
You could either run only a single worker that is capable of running the fetch-from-API job, or use some sort of external marshalling/lock service.
The options for that may be either an internal rate-limiting system, or some kind of shared atomic locking system: a memcached or redis server where a worker tries to set a lock key, and only the worker that successfully sets it gets to work on the task. An advantage of that is that as soon as the API request has been completed you can remove the lock, and then while that worker processes the results, a different worker can make a new request.
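As a rough sketch of the atomic-lock idea, using redis-cli (the key name, the 30-second TTL, and call_api are placeholders; a Laravel worker would do the same through its Redis or cache client):
# Try to take a per-agent lock; SET ... NX EX succeeds for only one caller.
if [ "$(redis-cli set "lock:agent:${AGENT_ID}" 1 NX EX 30)" = "OK" ]; then
    call_api "${AGENT_ID}"
    redis-cli del "lock:agent:${AGENT_ID}" > /dev/null
else
    echo "agent ${AGENT_ID} is busy, releasing the job back to the queue"
fi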

To stop the EC2 instance after the execution of a script

I configured an Ubuntu server (AWS EC2 instance) as a cron server; 9 cron jobs run between 4:15-7:15 and 21:00-23:00. I wrote a cron job on another system (EC2 instance) to stop this cron server after 7:15 and start it again at 21:00. I want the cron server to stop by itself after the execution of the last script. Is it possible to write such a script?
When you start the temporary instance, specify
--instance-initiated-shutdown-behavior terminate
Then, when the instance has completed all its tasks, simply run the equivalent of
sudo halt
or
sudo shutdown -h now
With the above flag, this will tell the instance that shutting down from inside the instance should terminate the instance (instead of just stopping it).
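For example, with the AWS CLI the launch could look roughly like this (the AMI id and instance type are placeholders):
# Launch the temporary instance so that an in-instance shutdown terminates it.
aws ec2 run-instances \
    --image-id ami-xxxxxxxx \
    --instance-type t2.micro \
    --instance-initiated-shutdown-behavior terminate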
Yes, you can add an EC2 stop command to the end of the last script, as sketched below.
You'll need to:
install the EC2 API tools
put your AWS credentials on the instance, or create IAM credentials that have authority to stop instances
get the instance id, perhaps from the instance metadata
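A sketch of that self-stop step, assuming the modern AWS CLI rather than the old EC2 API tools (the region is a placeholder, and the metadata URL assumes IMDSv1):
# Appended to the end of the last cron script: look up this instance's id
# from the metadata service, then stop the instance.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
aws ec2 stop-instances --instance-ids "${INSTANCE_ID}" --region us-east-1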
Another option is to run the cron jobs as commands from the controlling instance. The main cron job might look like this:
run processing instance
wait for sshd to accept connections
ssh to processing instance, running each processing script
stop processing instance
This approach gets all the processing jobs done back to back, leaving your instance up for the least amount of time, and you don't have to put the credentials on the instance.
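A rough sketch of that main cron job, assuming the AWS CLI (the instance id, hostname, and script paths are placeholders):
# Start the processing instance, wait for sshd, run the scripts, stop it again.
aws ec2 start-instances --instance-ids i-0123456789abcdef0
until ssh -o ConnectTimeout=5 ubuntu@processing-host true; do sleep 10; done
ssh ubuntu@processing-host '/opt/jobs/script1.sh && /opt/jobs/script2.sh'
aws ec2 stop-instances --instance-ids i-0123456789abcdef0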
If your use case allows for the instance to be terminated instead of stopped, then you might be able to replace the start/stop cron jobs with EC2 autoscaling. It now sports schedules for running instances.
http://docs.amazonwebservices.com/AutoScaling/latest/DeveloperGuide/index.html?scaling_plan.html
