How do I curl a URL with an unknown filename at the end? - linux

I'm talking to a server that creates a new zip file daily, e.g. data-1234.zip. Every day the previous zip is removed and a new one is created with an incremented number, e.g. data-1235.zip. The script will be run sporadically throughout the week, but it's on a lab system where the user can't manually update the name to match what's on the server.
The server only has one zip file in that directory, so it's just a matter of getting the correct naming convention. There is, however, a "data.ini" file in the folder as well, so simply matching on the start of the name wouldn't necessarily work. I've seen posts similar to this question using regex, but the number is currently at 10,609 and I'd rather not rely on URL expansion for potentially thousands of requests, depending on whether I get access to modify the script in the coming years. I've been looking for something along the lines of "data-*.zip" but haven't had any luck.

The question was solved by switching commands and running
lftp https://download.companyname.com/product/data/ -e "mget data-*.zip; bye"
since lftp allows wildcards in the filename, unlike curl.
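For reference, a similar effect can be had with curl alone by scraping the directory index first. This is only a sketch and assumes the listing at that URL is browsable over HTTPS and mentions the file name:
# sketch: pull the directory listing, extract the current zip name, then fetch it
base="https://download.companyname.com/product/data/"
zipname=$(curl -s "$base" | grep -Eo 'data-[0-9]+\.zip' | head -n 1)
curl -O "${base}${zipname}"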

Related

Generate hash of newly downloaded file

I'd like my bash script to perform an action every time a new file is downloaded to /Downloads (generate a hash of the downloaded file and send it to an API). So far I've been trying to make use of "inotify-tools", but it works only for newly created files and that won't do.
Script should work like this:
I download a file via browser (normal way)
Script notices new file and is executed automatically
Thanks in advance for help :D
You can use /etc/crontab to check the ~/Downloads folder at startup and every n minutes. The script that runs every nth minute can do either of the following:
Keep a count of the files. If the count decreases, the script updates its cache. If the count increases, it takes the most recently created (or modified) file and sends that file's hash to the API via curl.
Keep the names of the files. If a file no longer exists, the script updates its cache of file names. If a new file appears, it hashes it and sends the hash to the API via curl.
You can keep the cache of files under /tmp.
If you can provide an example scenario I can write a simple script.
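As a rough sketch of the second variant (a file-name cache under /tmp), with a placeholder API endpoint and paths you would adjust:
#!/bin/bash
# Sketch: compare ~/Downloads against a cached list of file names,
# hash anything new with sha256sum and POST it to an API via curl.
watch_dir="$HOME/Downloads"
cache="/tmp/downloads_cache.txt"
api_url="https://example.com/api/hashes"   # placeholder endpoint

touch "$cache"
for file in "$watch_dir"/*; do
    [ -f "$file" ] || continue
    if ! grep -Fxq "$file" "$cache"; then
        hash=$(sha256sum "$file" | awk '{print $1}')
        curl -s -X POST -d "name=$(basename "$file")&hash=$hash" "$api_url"
    fi
done
# rebuild the cache so names of deleted files drop out
ls -1d "$watch_dir"/* > "$cache" 2>/dev/null
A line such as */5 * * * * youruser /path/to/check_downloads.sh in /etc/crontab would run it every five minutes.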

Want to create a specific script for my Raspberry Pi to watch directory and do some actions

I am a complete newbie in writing scripts, I have just started a few days ago, and was already able to create simple scripts to find files, move them, delete them, etc...
I have a Raspberry Pi 4 with Raspberry Pi OS installed on it.
Now I want to create a better script, using "inotify" to monitor a specific directory and perform some actions if specific files are found. Aaaaaand, I am a bit lost.
Here is what I have found and tested :
MONITORDIR="/my_dir"
inotifywait -m -r -e create --format '%w%f' "${MONITORDIR}" | while read -r NEWFILE
do
    # ... act on "$NEWFILE" here
done
With this, I can generate an action whenever any new file appears in my folder.
What I want :
If a new file whose name contains a specific string (not the complete name, just a part of it) and has a .pdf extension is detected in the directory,
Then, move this file to another directory
And send an email using postfix, including the name of this new file, without the complete path of the file
Any help with this will be great for me; since I am a beginner I know I have a lot to learn, and I am sure I will.
Thank you !
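Not a complete answer, but one way this could be wired together is sketched below; the name fragment, destination directory, mail recipient and the mail(1) command (e.g. from mailutils, handing the message to postfix) are all assumptions you would replace:
#!/bin/bash
MONITORDIR="/my_dir"
DESTDIR="/my_dir/processed"   # hypothetical destination
PATTERN="report"              # hypothetical part of the file name
MAILTO="me@example.com"       # hypothetical recipient

inotifywait -m -r -e create --format '%w%f' "${MONITORDIR}" | while read -r NEWFILE
do
    name=$(basename "$NEWFILE")
    case "$name" in
        *"$PATTERN"*.pdf)
            mv "$NEWFILE" "$DESTDIR/"
            # mail(1) hands the message to the local MTA (postfix);
            # only the file name, not the full path, goes into the mail
            echo "New file received: $name" | mail -s "New PDF: $name" "$MAILTO"
            ;;
    esac
done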

Log-in only once using wget multiple times on same ftp-server

Basically, I am using wget on a file containing multiple URLs. The command I use is:
wget -i list_of_urls
and I notice that for each row in "list_of_urls" wget does a log-in step to the FTP server that I'm downloading from. It does the log-in step automatically, without me entering any username or password. Each line produces the output
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.13|.21... connected.
Logging in as anonymous ... Logged in!
followed by the file downloading.
Is there any way to log in only for the first row and then use that login to download all the following rows? Since the URLs point to the same FTP server, just different files, it feels like logging in for each row is wasteful.
Edit: changed from "website" to "FTP server" since that was what I actually meant, thanks. Added a sample output of the log-in message.
After some fiddling around I think using the rsync protocol solved the problem. This works in this case since the file host has both ftp and rsync servers containing the same files. I then simply (for small file sizes) use
rsync $(tr '\n' ' ' <list_of_urls) /usrpath/
which was much faster than using wget over FTP. I had to include the $(tr '\n' ' ' <list_of_urls) since the list of URLs has one URL per line, while rsync takes space-separated files on the command line. It seems like the rsync protocol in this case only logs in once and then downloads all the files, since it went much faster.
Another problem arises with this method when list_of_urls is very long, which I haven't solved yet.
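One workaround that might help (a sketch only, assuming GNU xargs, not tested against this server) is to let xargs split the list into as few rsync invocations as the argument-length limit allows; each batch still logs in only once:
# xargs reads the URLs from the file and appends as many as fit per rsync call
xargs -a list_of_urls sh -c 'rsync "$@" /usrpath/' _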

How to retrieve files generated in the past 120 minutes in Linux and also move them to another location

For one of my projects I have a certain challenge where I need to take all the reports generated in a certain path, and I want this to be an automated process in Linux. I know how to get the file names which have been updated in the past 120 mins, but not the files directly. My requirements are as follows:
1. Take the files that have been updated in the past 120 mins from the path
/source/folder/which/contains/files
2. Do some business logic on these generated files, which I can take care of
3. Move these files to
/destination/folder/where/files/should/go
I know how to achieve #2 and #3 but I'm not sure of #1. Can someone help me achieve this?
Thanks in Advance.
Write a shell script. Sample below. I haven't provided the commands to get the actual list of file names as you said you know how to do that.
#!/bin/sh
files="<my file list>"
for file in $files; do
    cp "$file" <destination_directory>
done
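For step #1, in case it is useful, a possible way to build that file list is GNU find's -mmin test (this simple form assumes the file names contain no spaces):
# regular files under the source path modified within the last 120 minutes
files=$(find /source/folder/which/contains/files -type f -mmin -120)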

How to call a bash script automatically when directory contents change

My goal is to run a bash script automatically whenever any new file is added to a particular directory or any subdirectory of that particular directory.
Detail Scenario:
I am creating an automated process for file submission from teachers to students and vice versa. The sender will upload a file and it will be stored inside the Uploads directory on the LAMP server in the format "name_course-name_filename.pdf". I want some method so that whenever a file is stored inside the Uploads folder, a script is called at the same time to send that file to the list of receivers.
From the database I can find the list of receivers for that particular course and student.
My only concern is how to call a script automatically and make it work on each individual file whenever the content of the directory changes. Cron will do it at intervals, but that is not real-time.
Linux provides a nice mechanism for that purpose, which is called inotify. inotify is primarily available as a C API, but shell utilities have been developed on top of it as well. You should use inotifywait from inotify-tools (the package name in Debian) for this. Here is a basic example:
#!/bin/bash
directory="/tmp" # or whatever you are interested in
inotifywait -m -e create "$directory" |
while read folder eventlist eventfile
do
    echo "the following events happened in folder $folder:"
    echo "$eventlist $eventfile"
done
Update:
If the problem gets more complicated, for example if you have to monitor recursive, dynamic directory structures, you should have a look at incron. It's a cron-like daemon which executes scripts on certain events, but the events are file system events rather than timer events.
There is another option to 'inotifywait':
-d --daemon
Same as --monitor, except run in the background logging events to a file
that must be specified by --outfile. Implies --syslog.
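For example, a quick sketch of that mode (the watch path and log file here are chosen arbitrarily):
inotifywait -d -e create -o /tmp/inotify-events.log /tmp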
For completeness:
-m --monitor
Instead of exiting after receiving a single event, execute indefinitely.
The default behaviour is to exit after the first event occurs.
Within the do-done block of your 'while' statement, you might parse each event report for interesting details then use 'case-esac' to take action based on each event that you care about.
For something that you plan to rely on for your operations, you might also consider replacing the hard-coded '$directory' with some sort of configuration file. Such a file might include the path and filename, the interesting events for that path and file, and a script to run when those events happened.
The script might take the list of events as parameters and then 'case-esac' again.
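As a rough sketch of that idea, building on the loop above (the events and echo actions are placeholders for whatever you care about):
inotifywait -m -e create,delete,modify "$directory" |
while read -r folder eventlist eventfile
do
    case "$eventlist" in
        *CREATE*) echo "created: ${folder}${eventfile}" ;;   # run your create handler here
        *DELETE*) echo "deleted: ${folder}${eventfile}" ;;
        *MODIFY*) echo "modified: ${folder}${eventfile}" ;;
    esac
done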
Just one man's ramblins,
~~~ 8d;-Dan
