Azure Logic Apps: Check for file type - azure

I set up an Azure Logic App that checks for newly created files in a OneDrive folder and then sends these (images) to the MS Vision API for tagging. This flow works fine.
How can I set up a condition to react only to a specific file type (images) or, even better, only when the file has a certain extension, like ".jpg", ".png" etc.?
I tried to set up a condition on the "File content type" but couldn't figure out the appropriate value for the condition ("image" doesn't work).
I couldn't find any hints on the web or on SO. Any help is very much appreciated.

When reading file attachments using the Gmail action, I had to use a "starts with" condition because the Content-Type property contained the MIME type followed by the file name.
The following example is for checking if the file is an Excel file (.xlsx, not .xls):
I also used http://mime.ritey.com/ to upload my files and ensure I had the MIME type correct.
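The same "starts with" idea can be sketched outside the Logic Apps designer. The following Python snippet is only an illustration, not the original Logic Apps condition: the function name is hypothetical, and the exact shape of the Content-Type value (MIME type followed by a name parameter) is an assumption based on the description above. The MIME type itself is the standard one for .xlsx files.

```python
# Standard MIME type for .xlsx workbooks (the older .xls format uses application/vnd.ms-excel).
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def is_xlsx(content_type: str) -> bool:
    """Return True if the Content-Type value starts with the .xlsx MIME type.

    A prefix match tolerates trailing data such as '; name="report.xlsx"',
    which is why an exact equality check would fail here.
    """
    return content_type.strip().lower().startswith(XLSX_MIME)


# Assumed example values; the real connector output may be formatted differently.
print(is_xlsx(XLSX_MIME + '; name="report.xlsx"'))
print(is_xlsx('application/vnd.ms-excel; name="old.xls"'))
```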

File name is part of the metadata provided by the OneDrive Connector.
Using that, you can apply conditions/filters based on the extension. File content type is probably pretty reliable but in practice, the extension might be better.
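As a rough sketch of the extension-based filter, written in Python rather than as a Logic Apps condition: the allow-list and the function name below are illustrative assumptions, and the file name would come from the connector's metadata.

```python
from pathlib import PurePath

# Illustrative allow-list; adjust to the extensions you want to process.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def has_image_extension(file_name: str) -> bool:
    """Return True if the file name ends in one of the allowed extensions."""
    # suffix yields the last extension including the dot, or "" if there is none.
    return PurePath(file_name).suffix.lower() in IMAGE_EXTENSIONS


for name in ("holiday.JPG", "scan.png", "notes.txt", "README"):
    print(name, has_image_extension(name))
```

Lower-casing the suffix keeps the check from missing files such as "holiday.JPG".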

I think I found a solution. I was able to kind of reverse engineer the file types by setting up an app that is triggered by new files and writes the file content type to a text file in a different folder.
image/jpg and image/png are image files
application/x-zip-compressed is a zipped file
So it seems that Azure uses standard MIME types to identify the file type (which very much makes sense... :0)

Related

Is there a way to upload files without a file extension suffix to Flask-Dropzone?

I'm trying to use Flask-Dropzone as part of a web-app to upload files for processing. These files typically don't have a file extension due to a quirk of the export process that generates these files.
I've consulted both the Flask-Dropzone docs and the Dropzone.js docs and both seem to imply that if DROPZONE_ALLOWED_FILE_CUSTOM = False then every upload of all file types should be accepted. However, when navigating the file upload window, the filter defaults to "All Supported Types" and seems to only accept images. I can toggle this to "All Files" but when trying to upload anything else the dropzone gives the default error message about the file not being allowed.
I am able to set custom allowed file types such as .pdf, .xlsx, etc. However, this isn't useful as the files in question don't have a declared file type extension.

Get the actual file extension using NiFi

I am using Apache NiFi to ingest data from Azure Storage. The file I want to read is huge (100+ GB) and can have any extension, so I want to read the file's header to get its actual extension.
I found the python-magic package, which uses libmagic to read the file's header to fetch the extension, but this requires the file to be present locally.
The NiFi pipeline to ingest the data looks like this
I need a way to get the file extension in this NiFi pipeline. Is there a way to read the file's header from the Content Repo? If yes, how do we do it? FlowFile has only the metadata which says the content-type as text/plain for a CSV.
There is no such thing as a generic 'header' that all files have that gives you its "real" extension. A file is just a collection of bits, and we sometimes choose to give extensions/headers/footers/etc so that we know how to interpret those bits.
We tend to add that 'type' information in two ways, via a file extension e.g. .mp4 and/or via some metadata that accompanies the file - this is sometimes a header, which is sometimes plaintext and easily readable, but this is not always true. Additionally, it is up to the user and/or the application to set this information, and up to the user and/or application to read it - neither of which is a given.
If you do not trust that the file has the proper extension applied (e.g. video.txt when it's actually an mp4) then you could also try to interrogate the metadata that is held in Azure Blob Storage (ContentType) and see what that says - however, this is also up to the user/application to set when the file is uploaded to ABS, so there is no guarantee that it is any more accurate than the file extension.
text/plain is not invalid for a plaintext CSV, as CSVs are just formatted plaintext - similar to JSON. However, you can be more specific and use e.g. text/csv for CSV and application/json for JSON.
NiFi does have IdentifyMimeType, which can try to work it out for you by interrogating the file, but it is more complex than just accessing some 'header'. This processor uses Apache Tika for the detection, and adds a mime.type attribute to the FlowFile.
If your file is some kind of custom format, then this processor likely won't help you. If you know your files have a specific header, then you'll need to provide more information for your exact situation.
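To illustrate why content-based detection is a best-effort lookup rather than reading a universal header, here is a minimal Python sketch of magic-number sniffing. It is not how Tika or IdentifyMimeType work internally (their detection is far more elaborate); the signature table and function names are illustrative assumptions. Only the first few bytes are read, so the size of the file does not matter.

```python
from typing import Optional

# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),  # also the container for .xlsx/.docx
]


def sniff_mime(prefix: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or None if unknown."""
    for magic, mime in SIGNATURES:
        if prefix.startswith(magic):
            return mime
    return None


def sniff_file(path: str, n: int = 16) -> Optional[str]:
    """Read only the first n bytes of the file at path and sniff them."""
    with open(path, "rb") as fh:
        return sniff_mime(fh.read(n))


print(sniff_mime(b"%PDF-1.7\n"))
print(sniff_mime(b"id,name\n1,alice\n"))
```

Plain-text formats such as CSV have no signature at all, so a lookup like this returns nothing for them, which matches the point above that not every file carries identifying information.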

Acumatica - Attachments File Extension Filter

Good Day!
I am uploading some files to a form as attachments and I am trying to filter the uploaded file extensions, but I have not found a solution for this.
Is there any way to filter the extensions of the files being uploaded in file attachments?
Thank you so much for the help.
If you mean blocking file upload based on the extension you can do so in File Upload Preference Screen (SM202550):
If you mean the filter for the native open file dialog, it seems hardcoded for a handful of common web files and I don't think it can be changed easily.
If you mean the file upload dialog grid, unlike most grids in the system it is not filterable through the column headers:
Maybe the Search in Files page (SM202520) is what you're looking for. You can search files based on extension here:

How do I discover the Content Types available on a Document Library?

I have a user that has requested the ability to add files from a custom web site that will upload a file and populate the content types. I have the first part done, uploading the file. I do not know how to read the possible content types and how to update the content types for the specific file being uploaded.
Your question is not very clear - files external to SharePoint do not have a predictable content type. It's not like file extension associations, where .exe is always an executable, and .gif is always an image. Within SharePoint, the only limitation for files' content types is that the content type inherits from the Document content type. The association you make with any given type of file must be invented by you.
As for finding out what content types exist on a document library, examine the SPList instance's .RootFolder.ContentTypes property.
Secondly, to set the content type on a file that has been uploaded you will most likely have to develop an Event Receiver, which is a class derived from SPItemEventReceiver. You can trap the ItemAdded event and set the file's content type programmatically. This is done by setting one of its internal properties to the ID of one of the SPContentType objects retrieved in the earlier step.
-Oisin

What security issues do we incur if we publish a form that lets users upload any type of file into our database?

I am trying to assess our security risk if we put a form on our public website that lets the user upload any type of file and stores it in the database.
I am worried about the following:
Robots uploading information
A huge increase in the size of the database
The form is a resume upload, so HR people will be downloading those files in what looks like JPEG, DOC or PDF format but could actually be getting a virus.
You can use captchas for dealing with robots
Set a reasonable file size limit for each upload
You can do multiple checking for your file upload control.
1) Check the extension of the file (.wmv, .exe, .doc). This can be implemented with a regular expression.
2) Actually check the file header or type definition (e.g. GIF, Word, Excel, image). Sometimes the file extension is not sufficient.
3) Limit the file size (e.g. 20 MB).
4) Never accept the filename provided by the user. Always rename the file to some GUID according to your specifications. This way an attacker won't be able to predict the actual name of the file that is stored on the server.
5) Store all the files out of web virtual directory. Preferably store in separate File Server.
6) Also implement the Captcha for File upload.
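Checks 1 to 4 from the list above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a complete defense: the allow-list, the 20 MiB limit, the signature table and the function name are all hypothetical choices, and the extension check uses a suffix lookup instead of the regular expression mentioned above. Storage outside the web root and the captcha (points 5 and 6) are not covered.

```python
import uuid
from pathlib import PurePath

# Hypothetical allow-list for a resume upload form.
ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".jpg", ".jpeg"}
MAX_BYTES = 20 * 1024 * 1024  # assumed limit of 20 MiB

# Leading bytes expected for each allowed extension.
MAGIC = {
    ".pdf": (b"%PDF-",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".docx": (b"PK\x03\x04",),  # ZIP container
    ".doc": (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",),  # OLE2 compound file
}


def validate_upload(client_name: str, data: bytes) -> str:
    """Validate an upload and return a server-generated storage name.

    Raises ValueError if any check fails.
    """
    ext = PurePath(client_name).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:  # check 1: extension
        raise ValueError("extension not allowed")
    if len(data) > MAX_BYTES:  # check 3: size limit
        raise ValueError("file too large")
    if not data.startswith(MAGIC[ext]):  # check 2: header matches extension
        raise ValueError("content does not match extension")
    # Check 4: discard the client-supplied name, keep only the vetted extension.
    return uuid.uuid4().hex + ext


print(validate_upload("my_cv.pdf", b"%PDF-1.7\n..."))
```

A matching signature only shows that the file starts like the claimed type; it does not show that the file is harmless, so a virus scan is still worthwhile.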
In general, if you really mean to allow any kind of file to be uploaded, I'd recommend:
A minimal type check using MIME magic numbers to confirm that the file's content corresponds to its extension (though this doesn't solve much if you are not going to limit the kinds of files that can be uploaded).
Better yet, have an antivirus (free clamav for example) check the file after uploading.
On storage, I always prefer to use the filesystem for what it was created for: storing files. I would not recommend storing files in the database (supposing a relational database). You can store the metadata of the file in the database along with a pointer to the file on the file system.
Generate a unique id for the file and you can use a 2-level directory structure to store the data: E.g: Id=123456 => /path/to/store/12/34/123456.data
That said, this can vary depending on what you want to store and how you want to manage it. Serving a document repository, an image gallery and a simple "shared directory" are not the same thing.
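The 2-level directory layout from the example (Id=123456 => /path/to/store/12/34/123456.data) could be computed like this. It is a small Python sketch; the function name and the zero-padding of short ids to six digits are assumptions not stated in the answer.

```python
from pathlib import PurePosixPath


def storage_path(file_id: int, root: str = "/path/to/store") -> PurePosixPath:
    """Map a numeric file id to a two-level directory path under root."""
    # Assumption: ids shorter than six digits are left-padded with zeros
    # so that the first four digits always name the two directory levels.
    digits = f"{file_id:06d}"
    return PurePosixPath(root) / digits[0:2] / digits[2:4] / f"{digits}.data"


print(storage_path(123456))  # /path/to/store/12/34/123456.data
```

Spreading files across subdirectories this way keeps any single directory from growing too large.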

Resources