scheduling a job to download files from ftp - linux

I see there are many threads about job scheduling, and it's taking me a long time to find the one that matches my case.
So let me describe what I need; if there is a related thread, I'd be grateful if you could point me to it.
I'm planning to create a scheduled task on a Debian machine that runs every 15 minutes, say, and downloads files from an FTP server to a local folder.
What tools/programs will I need for this?
I assume I'll need to write some code for the download logic, so the question is how to run that program as a scheduled task.
Please edit my question if you find it poorly expressed.
Thanks in advance.

Use cron.
Run crontab -e and add this line:
*/15 * * * * /path/to/file
(If you edit the system-wide /etc/crontab instead, the entry also needs a user field: */15 * * * * root /path/to/file.)
Then put your FTP commands in /path/to/file; note that the interactive ftp client is awkward to script, so a non-interactive tool such as wget, curl or lftp is a better fit.
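As a minimal sketch of what that cron-invoked script could contain, here is a downloader using wget's FTP mirroring; the host, credentials and directories are placeholders, and the final line only checks the script's syntax (no network access needed):

```shell
# Write the downloader script (everything between the EOF markers is the script body).
cat > /tmp/fetch-ftp.sh <<'EOF'
#!/bin/sh
# Mirror a remote FTP directory into a local folder.
# -q quiet, -m mirror, -nH drop the hostname directory level, -P local prefix
wget -q -m -nH -P /srv/incoming 'ftp://user:password@ftp.example.com/data/'
EOF
chmod +x /tmp/fetch-ftp.sh

# Syntax-check only, so this can be verified without a live FTP server:
sh -n /tmp/fetch-ftp.sh && echo "script parses OK"
```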

Related

Cron to read from multiple crontab files for each system

I have a large landscape of servers, with logical groupings for some of them (clusters). I'd like to find a way to manage crontabs on specific clusters; specifically, I'd like a centralized location where I can edit their crontabs all at the same time.
Currently, I'm accessing each server and editing its crontab the manual way.
Thanks
Maybe you should consider a tool like Ansible, which would allow you to manage the cron entries or cron files locally and distribute them to your servers.
Ansible is quite easy to learn, and all you need is Ansible, Python and a working SSH connection.
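A sketch of that workflow: keep one cron file per cluster under version control, then push it out with an Ansible ad-hoc command. The cluster name "web", the script path and the file names below are all invented for illustration:

```shell
# Files in /etc/cron.d need a user field after the time spec.
cat > /tmp/web-cluster.cron <<'EOF'
*/15 * * * * root /usr/local/bin/sync-assets.sh
EOF

# Distribute to every host in the [web] inventory group (requires Ansible + SSH);
# shown as a comment here since it needs a live inventory:
# ansible web -b -m copy -a 'src=/tmp/web-cluster.cron dest=/etc/cron.d/web-cluster mode=0644'
echo "cron file staged"
```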

How to track back where a script is scheduled in crontab

I have a shell script scheduled in cron that sends emails depending on a condition.
I have modified the sender email address.
Now the issue is that while I tested the script in different test environments, somewhere it is apparently still active and sending emails from a test environment. I have checked the crontabs in the test environments, but nowhere could I find where it is scheduled.
Can you help me track down where those emails are being triggered from? Which instance, which cron entry, etc.?
Thanks in advance.
I suggest consulting the cron log file. This file records when and what programs cron starts on behalf of which user.
On FreeBSD this is /var/log/cron. On Linux, as always, it may depend on your distro, the cron implementation, the phase of the moon :-) Running man cron might point you at the right file in its FILES section.
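To make the log format concrete: each entry names the user and the command cron ran. The sample line below is fabricated (the host and script name are made up) so the grep pattern is visible without root access; run the same grep against the real log on each test environment:

```shell
# Fabricated sample of a typical cron log line:
printf 'Jun  1 10:15:01 testbox CRON[4242]: (appuser) CMD (/home/appuser/mail_alert.sh)\n' \
    > /tmp/cron_sample.log

# Find every entry mentioning the suspect script:
grep 'mail_alert.sh' /tmp/cron_sample.log

# Also worth checking: the crontabs of *all* users, not just your own (as root):
# for u in $(cut -d: -f1 /etc/passwd); do crontab -l -u "$u" 2>/dev/null | grep mail_alert.sh && echo "user: $u"; done
```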

setting up a cron job on cPanel without the cron job option

Can anyone give me a clue on how to set up a cron job to execute my API callback URL once every 15 minutes? My control panel does not have that ability. I need specifics with examples, because I'm still at an amateur level. For example: I want to set up a cron job to execute the API callback URL http://shop.site.com/modules/cashenvoy/validation.php every 15 minutes, but my control panel doesn't have an option for setting up cron jobs directly. How do I go about this? Thanks, your suggestions are appreciated.
Try to use crontab:
# to list current cron jobs
sudo crontab -u username -l
# edit the cron list
sudo crontab -u username -e
This opens an editor where you can modify the cron jobs.
Insert the following:
*/15 * * * * curl -s http://shop.site.com/modules/cashenvoy/validation.php
This will silently call the URL every 15 minutes.
A third party cron job provider may help: http://www.easycron.com.
Disclaimer: I work for easycron.com.
This demo assumes you've already logged in to cPanel
Now let's learn how to setup a cron job
1) Click the Cron Jobs icon
2) Enter the email address where you want the cron job results sent after each time it runs
3) Now you have to define exactly when and how often you want the cron job to run. This is made easier by using one of the pre-defined or common settings
Notice that by choosing a common setting, all fields are filled in automatically. This also helps you understand what each field means
If you have a shared hosting account, cron jobs should not be scheduled more often than once every 10 minutes.
4) Let's choose Once a week
5) Next, enter the command of the script you want to run, including the path (from root). If you are on shared hosting, prefix the command with nice -n 15 (nice must come before the command it wraps, not after it). This ensures that the server gives the cron job a lower priority than critical system processes, helping maintain stability and server uptime.
Remember to prefix your command with "nice -n 15"!
6) When ready, click Add New Cron Job
That's it! The cron job has been set as you can see here. You can create additional cron jobs, and edit or delete existing ones
This is the end of the tutorial. You now know how to setup cron jobs in cPanel
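A quick demonstration of how nice wraps a command, since its placement matters; the cron command line in the comment is a made-up example path:

```shell
# nice lowers the scheduling priority of the command it precedes.
nice -n 15 sh -c 'echo "low-priority job ran"'

# In the cPanel command field this would look like (path is a placeholder):
# nice -n 15 php /home/username/public_html/cron/weekly_report.php
```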

Run program in background, constantly, on a web server - preferably in PHP

I want to create a website application, that will allow our members to get text message/email alerts every day 1 hour before their lesson (a reminder).
My server-side language background is strictly in PHP (although I tinkered with some C++ back in the day). For this to work, obviously, I'll need to somehow run a program constantly on my server.
Can this be accomplished in PHP?
If yes, is php efficient at this?
If not, how can I do this?
Or, maybe, this an entirely wrong approach, and there's a better way of creating such a service.
Yes, you can consider running PHP as a daemon,
or check this out: php execute a background process
or simply use cron - http://en.wikipedia.org/wiki/Cron
But you should NOT create a web service/application just to run background PHP processes; reserve a dedicated daemon for genuinely complex jobs.
Sure, you can use PHP as a scripting language on the server. It's just as capable as any other.
Write a PHP script that checks your database for what members need to be alerted, then sends the message. Add a crontab to run this script every minute/hour/whatever. To run a php script from the command line, you run the php interpreter and give it the script name to run.
$ php /path/to/script.php
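The matching crontab entry for that command might look like this (the interval, script path and log file are placeholders; install it with crontab -e):

```
# every 5 minutes, append the script's output to a log
*/5 * * * * php /path/to/script.php >> /tmp/reminders.log 2>&1
```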
You would have to start a service on the server itself or create a CRON job to run at any given interval. If you don't have admin privileges you will have to do a CRON job, which can usually be setup in your host's cpanel.
For instance, you could create a small PHP script that
1) Searches for all lessons that start in the hour following the current hour. So if the script runs at 5pm, it would find lessons that start between 6:00pm and 6:59pm.
2) Send an email to those members.
It wouldn't be exactly 1 hour though.
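The window in step 1 is just "current hour + 1"; a tiny sketch of the arithmetic, using an assumed 17:00 (5pm) run time:

```shell
# Script runs at 17:00 -> alert members whose lessons start in hour 18 (6:00pm-6:59pm).
run_hour=17
window_hour=$(( (run_hour + 1) % 24 ))   # % 24 handles the 11pm -> midnight wrap
echo "notify members with lessons starting in hour $window_hour"
```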

Process text files ftp'ed into a set of directories in a hosted server

The situation is as follows:
A series of remote workstations collect field data and upload it to a server via FTP. The data is sent as a CSV file, stored in a unique directory for each workstation on the FTP server.
Each workstation sends a new update every 10 minutes, causing the previous data to be overwritten. We would like to somehow concatenate or store this data automatically. The workstation's processing is limited and cannot be extended as it's an embedded system.
One suggestion offered was to run a cron job on the FTP server; however, the Terms of Service restrict cron jobs to 30-minute intervals, as it's shared hosting. Given the number of workstations uploading and the 10-minute interval between uploads, the 30-minute limit between runs looks like it might be a problem.
Is there any other approach that might be suggested? The available server-side scripting languages are perl, php and python.
Upgrading to a dedicated server might be necessary, but I'd still like to get input on how to solve this problem in the most elegant manner.
Most modern Linux systems support inotify, which lets your process know when the contents of a directory have changed, so you don't even need to poll.
Edit: With regard to the comment below from Mark Baker :
"Be careful though, as you'll be notified as soon as the file is created, not when it's closed. So you'll need some way to make sure you don't pick up partial files."
That will happen with the inotify watch you set on the directory level - the way to make sure you then don't pick up the partial file is to set a further inotify watch on the new file and look for the IN_CLOSE event so that you know the file has been written to completely.
Once your process has seen this, you can delete the inotify watch on this new file, and process it at your leisure.
You might consider a persistent daemon that keeps polling the target directories (pseudocode; grab_lockfile(), new_files() and process_new_files() are site-specific helpers you would write):
grab_lockfile() or exit();
while (1) {
    if (new_files()) {
        process_new_files();
    }
    sleep(60);
}
Then your cron job can just try to start the daemon every 30 minutes. If the daemon can't grab the lockfile, it just dies, so there's no worry about multiple daemons running.
Another approach to consider would be to submit the files via HTTP POST and then process them via a CGI. This way, you guarantee that they've been dealt with properly at the time of submission.
The 30 minute limitation is pretty silly really. Starting processes in linux is not an expensive operation, so if all you're doing is checking for new files there's no good reason not to do it more often than that. We have cron jobs that run every minute and they don't have any noticeable effect on performance. However, I realise it's not your rule and if you're going to stick with that hosting provider you don't have a choice.
You'll need a long-running daemon of some kind. The easy way is to just poll regularly, and that's probably what I'd do. A better option is inotify, which notifies you as soon as a file is created.
You can use inotify from perl with Linux::Inotify, or from python with pyinotify.
Be careful though, as you'll be notified as soon as the file is created, not when it's closed. So you'll need some way to make sure you don't pick up partial files.
With polling it's less likely you'll see partial files, but it will happen eventually and will be a nasty hard-to-reproduce bug when it does happen, so better to deal with the problem now.
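One cheap way to avoid processing half-uploaded files when polling is to only pick up a file once its size stops changing between two checks. The directory, file name and contents below are demo placeholders:

```shell
# Simulate an uploaded CSV in a demo directory.
mkdir -p /tmp/ftp-demo
printf 'station1,42.0\n' > /tmp/ftp-demo/ws1.csv

# Compare the size across a short settle interval before processing.
size1=$(wc -c < /tmp/ftp-demo/ws1.csv)
sleep 1
size2=$(wc -c < /tmp/ftp-demo/ws1.csv)
if [ "$size1" -eq "$size2" ]; then
    echo "size stable, safe to process /tmp/ftp-demo/ws1.csv"
fi
```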
If you're looking to stay with your existing FTP server setup, then I'd advise using something like inotify or a daemonized process to watch the upload directories. If you're OK with moving to a different FTP server, you might take a look at pyftpdlib, which is a Python FTP server library.
I've been part of the dev team for pyftpdlib for a while, and one of the more common requests was for a way to "process" files once they've finished uploading. Because of that we created an on_file_received() callback method that's triggered on completion of an upload (see issue #79 on our issue tracker for details).
If you're comfortable in Python then it might work out well for you to run pyftpdlib as your FTP server and run your processing code from the callback method. Note that pyftpdlib is asynchronous and not multi-threaded, so your callback method can't be blocking. If you need to run long-running tasks I would recommend a separate Python process or thread be used for the actual processing work.