SFTP: set upload order by local file date - Linux

Reading the SFTP command-line shell documentation, there seems to be no flag for uploading entire
folders to a remote server while sending local files in order of their last-modification date, so that the most recently modified local files are the last to be uploaded remotely.
I tried changing the dates with touch -t, but sftp always seems to follow alphabetical order.
I cannot create an ordered list and send it through a batch file, because I need to send whole folders, not single files (i.e. put foldername).
What I need now is to upload an entire folder that contains some XML files, where only one of them, with a random name that I know in advance, must be sent to the remote server last.
Thanks in advance.

What I need now is to upload an entire folder that contains some XML files, where only one of them, with a random name that I know in advance, must be sent to the remote server last.
Move the one specific local file away (to a temporary folder)
Upload the rest of the files
Upload the specific local file explicitly
Move the specific local file back.
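If you want to script those steps rather than run them by hand in the sftp shell, here is a minimal sketch using Python's paramiko library (my own illustration, not part of the original answer; the host, folder names and the special file name are placeholders):

import os
import paramiko

LOCAL_DIR = "/data/outbox"        # hypothetical local folder
REMOTE_DIR = "/upload/outbox"     # hypothetical remote folder
LAST_FILE = "trigger-1234.xml"    # the one file that must arrive last

transport = paramiko.Transport(("sftp.example.com", 22))
transport.connect(username="user", password="secret")
sftp = paramiko.SFTPClient.from_transport(transport)

try:
    sftp.mkdir(REMOTE_DIR)
except IOError:
    pass  # remote folder already exists

# Upload every file except the special one.
for name in sorted(os.listdir(LOCAL_DIR)):
    if name == LAST_FILE:
        continue
    sftp.put(os.path.join(LOCAL_DIR, name), REMOTE_DIR + "/" + name)

# Upload the special file last, so the remote side only sees it
# once everything else is already in place.
sftp.put(os.path.join(LOCAL_DIR, LAST_FILE), REMOTE_DIR + "/" + LAST_FILE)

sftp.close()
transport.close()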

Related

How can I attach all content files from folders in my blob container with a Logic App?

I have a blob container "image-blob", and I create a folder blob with the OCR text of an image and the image itself (two files: image.txt, with the text of the image, and image.png). The container has multiple folders, and each folder contains both files. How can I make a Logic App that sends an email with both files of every folder? (This would be one email per folder, each with the 2 files.) The folder name is generated randomly, and every file has the name of the folder plus its extension.
I've tried making a condition with an if and the isFolder() method, but nothing happens.
This is what my container looks like:
These are the files each folder has:
You could try the List blobs in root folder action if your folders are in the root of the container; if not, you could use List blobs.
If you try List blobs in root folder, your flow would look like the picture below. After List blobs you will get all the blob info, and you can then add an action such as Get blob content using path.
If you use List blobs instead, only the first step is different: you need to specify the container path. The other steps are just like the List blobs in root folder flow.
In my test, I added the Get blob content using path action, and here is the result.
It did get all the blobs; however, because of the For each action you can only get them one by one, so in your situation you may need to store the info you need into a file and then get the whole information sheet from that file.
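If you end up assembling the per-folder pairs outside the designer, a rough sketch of the same grouping with the Azure Storage Python SDK (my own illustration under assumed names, not part of the original answer) would be:

from collections import defaultdict
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    "<connection-string>", container_name="image-blob")

# Group the two files (image.png and image.txt) by their folder prefix.
folders = defaultdict(list)
for blob in container.list_blobs():
    folder, _, name = blob.name.partition("/")   # "<random-folder>/<file>"
    folders[folder].append(name)

for folder, files in folders.items():
    # Here you would download both blobs of this folder and attach them
    # to a single email for that folder.
    print(folder, files)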
Hope this helps; if you still have other questions, please let me know.
How can I make a Logic App that sends an email with both files of every folder?
It's hard to put two files into one email. The following snapshot shows how to send an email for each file of every folder.
If you still have any problems, please feel free to let me know.

azure logic app how to check a specific file in a sftp is changed

I need to check an SFTP site where multiple files will be uploaded to a folder. I am writing Logic Apps to process the files; each Logic App handles one file, because each file format is different. The problem is that the SFTP trigger can only detect a change to ANY file in the folder, so if a file changes, the Logic App for that file runs, but the OTHER Logic Apps run as well, which is not desired.
I have tried using a Recurrence trigger followed by an SFTP Get file content using path action, but that fails if the specified file does not exist. What I want is for the Logic App to just quit, or better, not be triggered at all.
How can I trigger the Logic App only if a particular file is updated/uploaded?
In your Logic App you can use Dynamic Content and Expressions to do the following:
decodeBase64(triggerOutputs()['headers']['x-ms-file-name-encoded'])
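For example (abc.txt is just a placeholder for your file name), a condition that only lets the run continue for one specific file could use an expression like:
equals(decodeBase64(triggerOutputs()['headers']['x-ms-file-name-encoded']), 'abc.txt')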
Hope it helps!
I tried my Azure web FTP site with a condition checking whether the file name equals abc.txt and got the same input. The expression result was always false.
Then I checked the run details and found that the file name in OUTPUTS wasn't abc.txt; it is encoded with Base64.
In my test abc.txt was encoded as YWJjLnR4dA==; after I changed the file name in the Logic App condition to YWJjLnR4dA==, it worked.
So you could check your run history to get the encoded file name, or use a Base64 encoder to encode your file name yourself.
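For example, a quick way to compute the encoded name yourself (an illustration in Python, outside Logic Apps):

import base64
print(base64.b64encode(b"abc.txt").decode())   # prints YWJjLnR4dA==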
Hope this helps; if you still have other questions, please let me know.

WinSCP - ignore single directory during sync

I have a directory on a server whose contents need to be synchronized to a client. The sync is set to delete files on the client that have also been deleted from the server directory being synced.
I want to ignore a specific directory, so that its contents are not deleted on the client.
The following script (located on the client) currently deletes the contents of /files/synced/oss/test/, but I want that directory to keep its contents on the client.
option exclude "Thumbs.db; /files/synced/oss/test/"
synchronize local -delete "D:\files" "/files/synced"
If I understand your question correctly, you do not want to exclude the remote folder /files/synced/oss/test/.
You want to exclude the local folder D:\files\oss\test\.
Also note that option exclude has been deprecated; use the -filemask switch instead:
synchronize local -delete -filemask="| Thumbs.db; D:\files\oss\test\" "D:\files" "/files/synced"

Azure DML Data slices?

I have 40 million blobs totalling 10 TB in blob storage. I am using DML CopyDirectory to copy these into another storage account for backup purposes. It took nearly 2 weeks to complete. Now I am wondering up to which date the blobs have been copied to the target directory: is it the date the job started or the date the job finished?
Does DML use anything like data slices?
Now I am wondering up to which date the blobs have been copied to the target directory: is it the date the job started or the date the job finished?
As far as I know, when you start the CopyDirectory method it just sends requests telling the Azure storage account to copy files from another storage account. All the actual copying is done by Azure Storage itself.
When we run the method to start copying the directory, Azure Storage first creates each target file with size 0, as below:
After the job finishes, you will find the size has changed, as below:
So the result is: when the job starts, the file is created in the target directory, but with a size of 0 (see the first image's last-modified time).
Azure Storage then continues copying the file content to the target directory.
When the copy finishes, it updates the file's last-modified time.
So the DML SDK just tells the storage to copy the files, and then keeps sending requests to Azure Storage to check each file's copy status.
Like below:
Thanks. But what happens if files are added to the source directory during this copy operation? Do the new files get copied to the target directory as well?
In short: yes.
DML does not get the whole blob list and send requests to copy all the files at one time.
It first gets a part of your file name list and sends requests telling the storage to copy those files.
The list is sorted by file name.
For example:
Suppose DML has already copied the files whose names start with 0, as below.
This is the target blob folder:
If you then add a file whose name starts with 0 to your source folder, it will not be copied.
This is the source blob folder being copied from:
The completely copied blob folder:
If you add a file at the end (in name order) of your blob folder and DML has not scanned that far yet, it will be copied to the new folder.
So during those 2 weeks at least a million blobs must have been added to the container with very random names. Does that mean DML doesn't work for large containers?
As far as I know, DML is designed for high-performance uploading, downloading and copying of Azure Storage blobs and files.
When you use DML CopyDirectoryAsync to copy blob files, it first sends a request to list the folder's current files, then sends requests to copy them.
By default, each list request returns 250 file names.
After getting a list, DML generates a marker, which points to the next blob file names to search. It then lists the next batch of file names in the folder and starts copying again.
And by default, the .NET HTTP connection limit is 2, which implies that only two concurrent connections can be maintained.
This means that if you don't raise the .NET HTTP connection limit, CopyDirectoryAsync will only fetch 500 records before it starts copying.
After those are copied completely, the operation starts copying the next files.
You can see this in these images:
The marker:
I suggest you first raise the max HTTP connection limit so that more blob files are listed per round:
ServicePointManager.DefaultConnectionLimit = Environment.ProcessorCount * 8;
Besides that, I suggest you create multiple folders to store the files.
For example, you could create a folder that stores one week's files.
The next week, you could start a new folder.
Then you can back up the old folder's files without new files being stored into that folder.
Finally, you could also write your own code to meet your requirement: you first need to get the list of the folder's files.
The maximum number of results per list request is 5000.
Then you send requests telling the storage to copy each file.
If a file is uploaded to the folder after you get the list, it will not be copied to the new folder.
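If you go down the write-your-own-code route, a minimal sketch with the azure-storage-blob Python SDK (my own illustration, not DML and not the answer's code; connection strings and container names are placeholders, and the source blobs must be readable by the copy operation, e.g. via a SAS) could look like this:

from azure.storage.blob import ContainerClient

source = ContainerClient.from_connection_string("<source-conn-str>", "backup-src")
target = ContainerClient.from_connection_string("<target-conn-str>", "backup-dst")

# list_blobs pages through the container for us (the service returns
# at most 5000 names per underlying request).
for blob in source.list_blobs():
    src = source.get_blob_client(blob.name)
    dst = target.get_blob_client(blob.name)
    # Start an asynchronous server-side copy; poll
    # dst.get_blob_properties().copy.status if you need to wait for it.
    dst.start_copy_from_url(src.url)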

Avoid over-writing blobs AZURE

If I upload a file to an Azure blob container where a file with the same name already exists, it overwrites the existing file. How can I avoid overwriting it? Below I describe the scenario:
step 1 - upload a file "abc.jpg" to Azure in a container called, say, "filecontainer"
step 2 - once it is uploaded, try uploading a different file with the same name to the same container
Output - it overwrites the existing file with the latest upload
My requirement - I want to avoid this overwrite, as different people may upload files having the same name to my container.
Please help.
P.S.
- I do not want to create different containers for different users
- I am using the REST API with Java
Windows Azure Blob Storage supports conditional headers, which you can use to prevent overwriting of blobs. You can read more about conditional headers here: http://msdn.microsoft.com/en-us/library/windowsazure/dd179371.aspx.
Since you want a blob not to be overwritten, you would specify the If-None-Match conditional header and set its value to *. This causes the upload operation to fail with a Precondition Failed (412) error if the blob already exists.
Another idea would be to check for the blob's existence just before uploading (by fetching its properties); however, I would not recommend this approach, as it may lead to concurrency issues.
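As an illustration of the conditional-header approach (the question uses Java with the REST API; this sketch uses the Python SDK, where overwrite=False applies the If-None-Match: * header for you):

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobClient

blob = BlobClient.from_connection_string(
    "<connection-string>", container_name="filecontainer", blob_name="abc.jpg")

try:
    with open("abc.jpg", "rb") as data:
        blob.upload_blob(data, overwrite=False)  # fails if the blob already exists
except ResourceExistsError:
    print("abc.jpg already exists in the container; upload rejected (412).")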
You have no control over the names your users upload their files with. You do, however, have control over the names you store those files with. The standard approach is to generate a Guid and name each file accordingly; the chance of a conflict is almost zero.
Simple pseudocode looks like this:
//generate a Guid and rename the file the user uploaded with the generated Guid
//store the name of the file in a dbase or what-have-you with the Guid
//upload the file to the blob storage using the name you generated above
Hope that helps.
Let me put it this way:
step one - user X uploads a file "abc1.jpg" and you save it to a local folder XYZ
step two - user Y uploads another file with the same name "abc1.jpg", and now you save it again in the local folder XYZ
What do you do now?
With this I am illustrating that your question does not really relate to Azure at all.
Just do not rely on the original file names when saving files, wherever you are saving them. Generate random names (GUIDs, for example) and "attach" the original name as metadata.
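A minimal sketch of that idea (my own illustration with the Python SDK; the question uses Java, but the approach is the same): store the blob under a generated GUID and keep the user's original file name as metadata.

import uuid
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    "<connection-string>", container_name="filecontainer")

original_name = "abc1.jpg"                  # the name the user uploaded
stored_name = str(uuid.uuid4()) + ".jpg"    # the name we actually store

with open(original_name, "rb") as data:
    container.upload_blob(
        name=stored_name,
        data=data,
        metadata={"original_filename": original_name})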
