List all RSYNCed folders in GCP GAE Linux - linux

I set up some folders in GAE to be synced using the command -
gsutil rsync -r gs://sample1bucket1 ./sample1;
But I have forgotten all the places where I did this. How can I list them?

As I understand your question, all your GAE folders are in the Cloud Storage bucket “sample1bucket1”, and you are trying to sync them into the directory “sample1”. If so, note that every rsync command has to name both a source and a destination, so the destination is whatever path you gave on the command line, as per the public documentation.
However, you can list the folders in the current directory using the “ls” command to look for your destination folder, and then cd into it (“cd sample1” in your case) to see whether the content has been copied from your bucket into the directory.
You can also count the rsync processes that are currently running (the bracket in the pattern keeps grep from counting itself):
ps -ef | grep '[r]sync' | wc -l
I am leaving some information regarding the commands, in case you need them :
You can list all objects in a bucket using :
gsutil ls -r gs://bucket
You can list a directory with detailed information using:
rsync --list-only username@servername:/directoryname
You can list the folder's contents (note the trailing slash) using:
rsync --list-only username@servername:/directoryname/
You can also add the -i (--itemize-changes) option to get a per-file summary of changes, which you can parse to pick out exactly what you need:
rsync -i
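If the sync commands were typed in an interactive shell, the shell history may still contain them. The following is a minimal sketch, assuming bash with its default history file (~/.bash_history) and assuming the destination is the last argument of each command. list_rsync_destinations is a hypothetical helper name, not part of gsutil.

```shell
# Hypothetical helper: print the destination argument of every
# `gsutil rsync` command recorded in a shell history file.
# Assumption: the destination is the last word on the command line.
list_rsync_destinations() {
    histfile="${1:-$HOME/.bash_history}"
    # Keep lines that invoke `gsutil [options] rsync ...`,
    # strip a trailing semicolon or whitespace,
    # then print the last field and de-duplicate.
    grep -E 'gsutil( +-[^ ]+)* +rsync ' "$histfile" \
        | sed 's/[;[:space:]]*$//' \
        | awk '{ print $NF }' \
        | sort -u
}
```

Calling list_rsync_destinations with no argument reads ~/.bash_history. Commands run from scripts or cron jobs will not appear there, so this can only be a partial answer.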

Related

How to download files which are created in last 24 hours using gsutil in GCP console?

I have a directory in a gcp storage bucket. And there are 2 subdirectories in that bucket.
Is there a way to download files which are created in last 24 hours in those subdirectories using gsutil command from console?
gsutil does not support filtering by date.
An option is to create a list of files to download via another tool or script, one object name per line.
Use stdin to specify a list of files or objects to copy. You can use
gsutil in a pipeline to upload or download objects as generated by a
program. For example:
cat filelist | gsutil -m cp -I gs://my-bucket
or:
cat filelist | gsutil -m cp -I ./download_dir
where the output of cat filelist is a one-per-line list of files,
cloud URLs, and wildcards of files and cloud URLs.
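One way to build such a list is to filter the output of gsutil ls -l, which prints a size, a UTC timestamp and the object URL on each line. This is only a sketch: filter_newer_than is a hypothetical helper, it assumes object names contain no whitespace, and it relies on ISO 8601 UTC timestamps sorting correctly as plain strings.

```shell
# Hypothetical helper: read `gsutil ls -l` style lines on stdin and print
# the URLs whose timestamp is at or after the cutoff given as $1
# (format: YYYY-MM-DDTHH:MM:SSZ). Header and TOTAL lines are skipped
# because their first field is not a number.
filter_newer_than() {
    cutoff="$1"
    awk -v cutoff="$cutoff" '
        NF >= 3 && $1 ~ /^[0-9]+$/ && ($2 "") >= (cutoff "") { print $3 }
    '
}

# Usage (not run here; assumes GNU date and the bucket layout from the question):
# cutoff=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)
# gsutil ls -l -r gs://my-bucket/dir | filter_newer_than "$cutoff" | gsutil -m cp -I ./download_dir
```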
I was able to achieve part of it using gcp console and shell.
Steps:
Go to storage directory in browser gcp console.
Click on filter and you'll get options to filter based on created before, created after etc.
Provide the date and apply filter
Click on Download button
Copy the command, Open the gcp shell and run it. The required files will be downloaded there.
Run the zip command in shell and archive the downloaded files.
Select the Download from shell options and provide file path to download.

Is there a way to download files matching a pattern through SFTP in a shell script?

I'm trying to download multiple files through SFTP on a Linux server using
sftp -o IdentityFile=key <user>@<server> <<END
get -r folder
exit
END
which downloads the entire contents of a folder. It appears that find and grep are invalid commands inside sftp, and so are for loops.
I need to download files having a name containing a string e.g.
test_0.txt
test_1.txt
but no file.txt
Do you really need the -r switch? Are there really any subdirectories in the folder? You do not mention that.
If there are no subdirectories, you can use a simple get with a file mask:
cd folder
get *test*
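For use in a script, the same two commands can be fed to sftp in batch mode. This is a sketch under assumptions: make_sftp_batch is a hypothetical helper, and the folder name, key file and login are the placeholders from the question.

```shell
# Hypothetical helper: print an sftp batch that changes into a remote
# folder and fetches every file matching a mask.
# $1 = remote folder, $2 = file mask (quote it so the local shell
# does not expand it before sftp sees it).
make_sftp_batch() {
    printf 'cd %s\nget %s\nexit\n' "$1" "$2"
}

# Usage (not run here): `-b -` makes sftp read the batch from stdin.
# make_sftp_batch folder '*test*' | sftp -o IdentityFile=key -b - '<user>@<server>'
```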
Are you required to use sftp? A tool like rsync that operates over ssh has flexible include/exclude options. For example:
rsync -a <user>@<server>:folder/ folder/ \
--include='test_*.txt' --exclude='*.txt'
This requires rsync to be installed on the remote system, but that's very common these days. If rsync isn't available, you could do something similar using tar:
ssh <user>@<server> tar -cf- folder/ | tar -xvf- --wildcards '*/test_*.txt'
This tars up all the files remotely, but then only extracts files matching your target pattern on the receiving side.
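The tar pipeline can be tried locally before pointing it at a server. The sketch below assumes GNU tar (for --wildcards) and replaces the ssh half with a local tar; extract_matching is a hypothetical helper name.

```shell
# Hypothetical local stand-in for the ssh pipeline: archive a folder and
# extract only the members matching a pattern on the receiving side.
# $1 = parent directory of the folder, $2 = folder name,
# $3 = destination directory, $4 = member pattern (for example '*/test_*.txt').
# Assumes GNU tar, which needs --wildcards to treat member names as patterns.
extract_matching() {
    mkdir -p "$3" || return 1
    tar -C "$1" -cf - "$2" | tar -C "$3" -xf - --wildcards "$4"
}
```

For the remote case, the first tar would run under ssh as in the command above, while the extracting side stays the same.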

How can I download all the files from a remote directory to my local directory?

I want to download all the files in a specific directory of my site.
Let's say I have 3 files in my remote SFTP directory
www.site.com/files/phone/2017-09-19-20-39-15
a.txt
b.txt
c.txt
My goal is to create a local folder on my desktop with ONLY those downloaded files. No parent files or parent directories needed. I am trying to get a clean report.
I've tried
wget -m --no-parent -l1 -nH -P ~/Desktop/phone/ www.site.com/files/phone/2017-09-19-20-39-15 --reject=index.html* -e robots=off
I got
I want to get
How do I tweak my wget command to get something like that?
Should I use anything else other than wget ?
Ihue,
Taking a shell programmatic perspective, I would recommend trying the following command. Note that I also added the citation so you can see the original thread.
wget -r -P ~/Desktop/phone/ -A txt www.site.com/files/phone/2017-09-19-20-39-15 --reject=index.html* -e robots=off
-r enables recursive retrieval. See Recursive Download for more information.
-P sets the directory prefix where all files and directories are saved to.
-A sets a whitelist for retrieving only certain file types. Strings and patterns are accepted, and both can be used in a comma separated list. See Types of Files for more information.
Ref: @don-joey
https://askubuntu.com/questions/373047/i-used-wget-to-download-html-files-where-are-the-images-in-the-file-stored

linux server create symbolic links from filenames

I need to write a shell script to run as a cron task, or preferably on creation of a file in a certain folder.
I have an incoming and an outgoing folder (they will be used to log mail). There will be files created with codes as follows...
bmo-001-012-dfd-11 for outgoing and 012-dfd-003-11 for incoming. I need to filter the project/client code (012-dfd) and then place it in a folder in the specific project folder.
Project folders are located in /projects and follow the format 012-dfd. I need to create symbolic links inside the incoming or outgoing folders of the projects, that leads to the correct file in the general incoming and outgoing folders.
/incoming/012-dfd-003-11.pdf -> /projects/012-dfd/incoming/012-dfd-003-11.pdf
/outgoing/bmo-001-012-dfd-11.pdf -> /projects/012-dfd/outgoing/bmo-001-012-dfd-11.pdf
So my questions
How would I make my script run when a file is added to either incoming or outgoing folder
Additionally, are there any disadvantages to running upon file modification compared with running as a cron task every 5 mins
How would I get the filename of recent (since script last run) files
How would I extract the code from the filename
How would I use the code to create a symlink in the desired folder
EDIT: What I ended up doing...
while inotifywait outgoing; do find -L . -type l -delete; ls outgoing | php -R '
if(
preg_match("/^\w{3}-\d{3}-(\d{3}-\w{3})-\d{2}(.+)$/", $argn, $m)
&& $m[1] && (file_exists("projects/$m[1]/outgoing/$argn") != TRUE)
){
`ln -s $(pwd)/outgoing/$argn projects/$m[1]/outgoing/$argn;`;
}
'; done;
This works quite well - cleaning up deleted symlinks also (with find -L . -type l -delete) but I would prefer to do it without the overhead of calling php. I just don't know bash well enough yet.
Some near-answers for your task breakdown:
On linux, use inotify, possibly through one of its command-line tools, or script language bindings.
See above
Assuming the project code can be extracted by position, as in your examples (meaning that not only does the project code follow a strict 7-character format, but so does whatever precedes it in the outgoing filename):
basename /incoming/012-dfd-003-11.pdf | cut -c 1-7
012-dfd
basename /outgoing/bmo-001-012-dfd-11.pdf | cut -c 9-15
012-dfd
mkdir -p /projects/$i/incoming/ creates the directory /projects/012-dfd/incoming/ if i=012-dfd, and
ln -s /incoming/foo /projects/$i/incoming/foo creates a symbolic link at the second path that points to the existing file /incoming/foo.
How would I make my script run when a file is added to either incoming or outgoing folder
Additionally, is there any associated disadvantages with running upon file modification compared with running as cron task every 5 mins
If a 5-minute delay isn't an issue, I would go for the cron job (it's easier and, IMHO, more flexible).
How would I get the filename of recent (since script last run) files
If your script runs every 5 minutes, then every file created in the last 5 minutes is new, so you can list those files using the ls or find command.
How would I extract the code from the filename
You can use the sed command
How would I use the code to create a symlink in the desired folder
Once you have the desired file names, you can use the ln -s command to create the symbolic link.
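As a rough illustration of the find and sed suggestions, here is a sketch with two hypothetical helpers. It assumes a find that supports -mmin (GNU and BSD find do, POSIX does not require it) and the two filename layouts shown in the question.

```shell
# Hypothetical helper: list regular files modified within the last N minutes.
# $1 = directory, $2 = minutes (defaults to 5, matching the cron interval).
recent_files() {
    find "$1" -type f -mmin -"${2:-5}"
}

# Hypothetical helper: extract the project code (NNN-xxx) with sed.
# The optional first group skips an outgoing prefix such as "bmo-001-".
extract_code() {
    basename "$1" \
        | sed -E 's/^([a-z]{3}-[0-9]{3}-)?([0-9]{3}-[a-z]{3})-.*$/\2/'
}
```

Compared with the positional cut approach, the sed pattern handles both naming schemes with one expression, at the cost of being harder to read.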

How do I mirror a directory with wget without creating parent directories?

I want to mirror a folder via FTP, like this:
wget --mirror --user=x --password=x ftp://ftp.site.com/folder/subfolder/evendeeper
But I do not want to create a directory structure like this:
ftp.site.com -> folder -> subfolder -> evendeeper
I just want:
evendeeper
And anything below it to be the resulting structure. It would also be acceptable for the contents of evendeeper to wind up in the current directory as long as subdirectories are created for subdirectories of evendeeper on the server.
I am aware of the -np option, but according to the documentation it just keeps wget from following links to parent pages (a non-issue for the binary files I'm mirroring via FTP). I am also aware of the -nd option, but this prevents creating any directory structure at all, even for subdirectories of evendeeper.
I would consider alternatives as long as they are command-line-based, readily available as Ubuntu packages and easily automated like wget.
For a path like: ftp.site.com/a/b/c/d
-nH would download all files to the directory a/b/c/d in the current directory, and -nH --cut-dirs=3 would download all files to the directory d in the current directory.
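The value for --cut-dirs is the number of directories above the one you want to keep. A small sketch for computing it from a URL follows; cut_dirs_for is a hypothetical helper, and it assumes the URL contains a host followed by at least one path component.

```shell
# Hypothetical helper: print how many leading directories --cut-dirs must
# drop so that only the last path component remains.
cut_dirs_for() {
    path="${1#*://}"   # drop the scheme, if any
    path="${path#*/}"  # drop the host name
    path="${path%/}"   # drop a trailing slash
    # Number of path components minus the one to keep.
    printf '%s\n' "$path" | awk -F/ '{ print NF - 1 }'
}

# Usage (not run here; URL and credentials are the placeholders from the question):
# url=ftp://ftp.site.com/folder/subfolder/evendeeper
# wget --mirror -nH --cut-dirs="$(cut_dirs_for "$url")" --user=x --password=x "$url"
```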
I had a similar requirement and the following combination seems to be the perfect choice:
In the below example, all the files in http://url/dir1/dir2 (alone) are downloaded to local directory /dest/dir
wget -nd -np -P /dest/dir --recursive http://url/dir1/dir2
Thanks @ffledgling for the hint on "-nd"
For the above example:
wget -nd -np --mirror --user=x --password=x ftp://ftp.site.com/folder/subfolder/evendeeper
Snippets from manual:
-nd
--no-directories
Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will get saved to the current directory, without clobbering (if a name shows up more than once, the
filenames will get extensions .n).
-np
--no-parent
Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded.
The -np (no parent) option will probably do what you want, combined with -l 1 (I think; I don't have a wget install in front of me), which limits the recursion to one level.
EDIT: OK, maybe I should wait until I've had coffee. There is a --cut-dirs option, which lets you "cut" a specified number of directories from the output path, so for /a/b/c/d, a cut of 2 would make wget create c/d on your local machine.
Instead of using:
-nH --cut-dirs=1
use:
-nH --cut-dirs=100
This cuts more directories than the path contains, so no folders will be created.
Note: 100 is the number of leading directories to skip creating.
You can change 100 to any number that is at least as large as the depth of the deepest path.
