Programmatically determine file types in SharePoint

Is there a way to programmatically determine a file type in SharePoint? I want to limit the types of files that can be uploaded to a document library. I have written an EventReceiver that, on ItemAdding, performs the following check:
if (!(properties.AfterUrl.Contains(".docx") || properties.AfterUrl.Contains(".pptx") || properties.AfterUrl.Contains(".xlsx")))
Surely there's a better way to do this?

Blocking file types is only possible at the farm level (through the central admin).
An Event Handler checking the file's extension is the only way to go if you want to be able to administer this at a document library level.
So no, there is no better way of doing this.
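If you do stay with an extension check, it can at least be made stricter than a substring match. Here is a minimal sketch of the idea, written in Python for brevity rather than as SharePoint event receiver code; the names `ALLOWED_EXTENSIONS` and `is_allowed_upload` are hypothetical. It compares only the final extension, case-insensitively, so a name like `report.docx.exe` (which a `Contains(".docx")` test would accept) is rejected.

```python
from pathlib import PurePosixPath

# Hypothetical allow-list; adjust to the library's policy.
ALLOWED_EXTENSIONS = {".docx", ".pptx", ".xlsx"}


def is_allowed_upload(url: str) -> bool:
    """Return True if the last path segment ends in an allowed extension."""
    # .suffix is the final extension only, e.g. ".exe" for "a.docx.exe".
    return PurePosixPath(url).suffix.lower() in ALLOWED_EXTENSIONS


print(is_allowed_upload("Shared Documents/Report.DOCX"))      # True
print(is_allowed_upload("Shared Documents/report.docx.exe"))  # False
```

In the event receiver itself the same comparison would be applied to the URL of the item being added.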

If you really want to restrict certain types of files, I would recommend going beyond the file's extension or MIME type and inspecting the file's content to determine its nature, which is what IE and Firefox do.
(BTW, there's an IE API, whose name I cannot remember right now, that gives you the MIME type of a file after inspecting it.)
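As a rough illustration of content inspection (a sketch in Python, not the IE API mentioned above), the leading bytes of a file already say a lot. Legacy Office files are OLE compound files with a fixed 8-byte signature, and the newer .docx/.pptx/.xlsx files are ZIP archives that contain a `[Content_Types].xml` part. The function name `sniff_office_kind` and its return labels are made up for this example, and other ZIP-based package formats would also report as "ooxml-like".

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .doc/.xls/.ppt container
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header


def sniff_office_kind(data: bytes) -> str:
    """Classify file content by signature, ignoring the file name entirely."""
    if data.startswith(OLE_MAGIC):
        return "ole-compound"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                if "[Content_Types].xml" in archive.namelist():
                    return "ooxml-like"
        except zipfile.BadZipFile:
            pass  # starts like a ZIP but is truncated or corrupt
        return "zip"
    return "unknown"


# Build a tiny in-memory package to try it out.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("[Content_Types].xml", "<Types/>")

print(sniff_office_kind(buffer.getvalue()))         # ooxml-like
print(sniff_office_kind(OLE_MAGIC + b"\x00" * 16))  # ole-compound
print(sniff_office_kind(b"MZ\x90\x00"))             # unknown
```

A renamed executable fails this check no matter what extension it carries.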

Related

Azure Logic Apps: Check for file type

I set up an Azure Logic App that checks for newly created files in a OneDrive folder and then sends these (images) to the MS Vision API for tagging. This flow works fine.
How can I set up a condition to react only to a specific file type (images) or, even better, only when the file has a certain extension, like ".jpg", ".png", etc.?
I tried to set up a condition on the "File content type" but couldn't figure out the appropriate value for the condition ("image" doesn't work).
I couldn't find any hints on the web or on SO. Any help is very much appreciated.

When reading file attachments using the Gmail action, I had to use "starts with" because the Content-Type property contained the MIME type followed by the file name.
In my case the condition checked whether the file was an Excel file (.xlsx, not .xls).
I also used http://mime.ritey.com/ to upload my files and ensure I had the MIME type correct.

File name is part of the metadata provided by the OneDrive Connector.
Using that, you can apply conditions/filters based on the extension. File content type is probably pretty reliable, but in practice the extension might be better.

I think I found a solution. I was able to reverse engineer the file types by setting up an app that is triggered by new files and writes the file content type to a text file in a different folder.
image/jpg and image/png are image files
application/x-zip-compressed is a zipped file
So it seems that Azure uses standard MIME types to identify the file type (which very much makes sense... :0)
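The checks discussed in this thread boil down to a prefix match on the content type and a suffix match on the file name. The following is only a sketch of that logic in Python, not Logic Apps expression syntax; the function names and the `IMAGE_EXTENSIONS` tuple are hypothetical. A bare comparison against "image" fails because the actual values look like `image/png`, and an exact comparison can fail when extra parameters (such as the file name) follow the MIME type.

```python
# Hypothetical list of extensions to treat as images.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

# Standard MIME type for .xlsx workbooks.
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def is_image_by_content_type(content_type: str) -> bool:
    """Prefix match: accepts image/png, image/jpg, image/jpeg, ..."""
    return content_type.strip().lower().startswith("image/")


def is_image_by_name(file_name: str) -> bool:
    """Suffix match on the file name, ignoring case."""
    return file_name.lower().endswith(IMAGE_EXTENSIONS)


def is_xlsx(content_type: str) -> bool:
    """'Starts with' tolerates trailing parameters after the MIME type."""
    return content_type.strip().lower().startswith(XLSX_MIME)


print(is_image_by_content_type("image/png"))        # True
print(is_image_by_content_type("image"))            # False
print(is_image_by_name("Holiday.JPG"))              # True
print(is_xlsx(XLSX_MIME + '; name="budget.xlsx"'))  # True
```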

How do I discover the Content Types available on a Document Library?

I have a user who has requested the ability to add files from a custom web site that will upload a file and populate the content types. I have the first part done, uploading the file. I do not know how to read the possible content types or how to update the content type for the specific file being uploaded.

Your question is not very clear: files external to SharePoint do not have a predictable content type. It's not like file extension associations, where .exe is always an executable and .gif is always an image. Within SharePoint, the only limitation on a file's content type is that it must inherit from the Document content type. The association you make with any given type of file must be invented by you.
As for finding out what content types exist on a document library, examine the SPList instance's .RootFolder.ContentTypes property.
Secondly, to set the content type on a file that has been uploaded, you will most likely have to develop an event receiver, which is a class derived from SPItemEventReceiver. You can trap the ItemAdded event and set the file's content type programmatically. This is done by setting one of its internal properties to the ID of one of the SPContentType objects retrieved in the earlier step.
-Oisin

When is SPFile.Properties != to SPFile.Item.Properties in SharePoint?

One of our customers has a problem that we cannot reproduce. We programmatically copy a document's properties to a destination file using SPFile.Properties. However, for some reason the file's properties do not match the metadata specified on the list the file is stored in.
Now, we can probably solve this by copying SPFile.Item.Properties (not tested yet), but I am just wondering under what circumstances SPFile.Properties differs from SPFile.Item.Properties.
Update: We have just received an update from our customer. Using SPFile.Item.Properties always returns the up-to-date information. However, we would still like to understand the original question.

There is a slight difference between SPFile.Properties and the SPFile.Item fields, and the first one is much, much slower to call.
You have most probably seen the "properties" window of a Microsoft Office document (this one - http://dradisframework.org/images/tutorial/custom_document_properties.png). These are the properties that are read when you access SPFile.Properties. Reading them is slow because some code infrastructure has to parse the binary DOC file and find the properties (it takes up to 30 or so milliseconds for every property access). See more here: http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.spfile.properties.aspx
In SharePoint, every item is an SPListItem, and its field values (I am deliberately not using the word "properties" here) are stored in SharePoint's content database. So, when you access SPFile.Item.Properties, you are actually looking at the SPListItem to which the file is attached and reading its properties from SharePoint's content database.
What happens behind the scenes when you upload a file that has some "Office properties" set is that SharePoint copies them to same-named fields in the SPListItem. (Some information about it here: http://weblogs.asp.net/bsimser/archive/2004/11/22/267846.aspx)
This is why these properties typically have the same value, BUT it only happens if SharePoint knows how to read metadata from your file and write it back. So, if you put a .txt file in your SharePoint store, you will not get any SPFile.Properties back.

The user will always see the ListItem properties and not the SPFile properties in a document library. So using the ListItem properties in the copy is the way to go.

I believe this issue is related to the SharePoint property promotion/demotion feature, which enables document properties to be embedded in the physical MS Office file and travel with it to the client, etc. However, this is currently only supported for Office file types (to my knowledge).
Jonathan

Trying to find the "official documented" anything for SharePoint is pretty much undoable. :-D The online docs suck; you are better off using blog entries, etc.
P.S. I agree with Alex here. Although an SPFile never exists in a list without an accompanying SPListItem, the connection between the two can get corrupted (e.g. you can edit the list item but the file cannot be opened). This to me indicates that information about the two is stored in different locations in the content DB. I have had this happen before.
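To make the "properties travel inside the file" point concrete, here is a small sketch in Python. It is not how SharePoint implements promotion/demotion; it only shows, under the assumption of an Office Open XML file (a ZIP archive with a `docProps/core.xml` part), that the title lives in the file itself, whereas a plain text file has nowhere to keep it. The function name `embedded_title` and the sample values are made up.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

DC_NS = "http://purl.org/dc/elements/1.1/"
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"


def embedded_title(data: bytes):
    """Return dc:title from docProps/core.xml, or None if the file has none."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            core = archive.read("docProps/core.xml")
    except (zipfile.BadZipFile, KeyError):
        return None  # not a ZIP package, or no core properties part
    title = ET.fromstring(core).find(f"{{{DC_NS}}}title")
    return title.text if title is not None else None


# Synthetic stand-in for a .docx: only the core properties part is present.
core_xml = (
    f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
    "<dc:title>Quarterly report</dc:title>"
    "</cp:coreProperties>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("docProps/core.xml", core_xml)

print(embedded_title(buffer.getvalue()))     # Quarterly report
print(embedded_title(b"just a text file"))   # None
```

The list item fields, by contrast, exist only in the content database, which is why the two sets of values can drift apart for file types that cannot carry embedded metadata.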

MS Office file extensions

I made a discovery some time back. Just follow these steps:
Create a .doc/.xls/.ppt file in Office 2003. Put some test data in it and close the file. Now rename the file to change its extension to a random string that is not associated with any application, like test.asdfghjkl.
Double click the file and it opens seamlessly in the parent application.
Now AFAIK, Windows checks the file's extension and uses it to choose an action, viz. open an application and pass the file to it. So how does the Office suite manage to do this?
EDIT: What about the case when the extension is changed to one that is associated with another application? Is there a priority algorithm in place for handling that?

Do you have the "View extensions for known types" option on?
EDIT: #Comments....
Yes, it's a stupid/insulting question, but when troubleshooting a problem I have learned to assume nothing and trust the users 0%.
BUT, I tried it, and you're right. It's stupid that MS has this kind of behavior, and it can only lead to security vulnerabilities, which led me on a search for your answer.
From the posts at http://seclists.org/fulldisclosure/2007/Jan/0444.html
"You have stumbled on an age-old quirky behavior of Windows. Office document formats are based on a standard Windows container format, OLE structured storage files, also known as "docfiles". A docfile's name and extension are irrelevant - the file is, conceptually, a serialization of an OLE object, and like all serialization formats it contains the identifier of the application that produced it, in the form of an OLE class id (in GUID format) in this case. You can easily verify that it doesn't work with the newer Office XML formats"
Indeed, it doesn't work for the 2007 *X file types, but 2K3 is still a problem. To solve this problem... Upgrade! =)
And here at security focus under TOC point 2.
So, there you go.

I can't seem to make this happen now, but I know I saw Windows reading XML processing instructions a few years back. Maybe that is what's going on?
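The class id mentioned in the quoted post can be read straight out of the file. Below is a minimal sketch in Python, assuming the compound file layout as I understand it: a header with the sector shift at byte 30 and the first directory sector number at byte 48, and a root directory entry whose CLSID sits 80 bytes into the entry. The names `read_root_clsid` and `build_sample` are hypothetical, the sample file is synthetic (header plus one directory sector, not a real document), and the GUID used is the one commonly associated with Word 97-2003 documents.

```python
import struct
import uuid

OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def read_root_clsid(data: bytes):
    """Return the CLSID stored on the root entry of a compound file, or None."""
    if len(data) < 512 or not data.startswith(OLE_SIGNATURE):
        return None
    (sector_shift,) = struct.unpack_from("<H", data, 30)      # 9 means 512-byte sectors
    (first_dir_sector,) = struct.unpack_from("<I", data, 48)
    if sector_shift not in (9, 12):
        return None
    # Sector n starts at (n + 1) * sector_size because the header comes first.
    entry = (first_dir_sector + 1) * (1 << sector_shift)
    raw = data[entry + 80 : entry + 96]  # CLSID field of the first (root) entry
    return uuid.UUID(bytes_le=raw) if len(raw) == 16 else None


def build_sample(clsid: uuid.UUID) -> bytes:
    """Synthetic stand-in: a header plus one directory sector, nothing else."""
    header = bytearray(512)
    header[:8] = OLE_SIGNATURE
    struct.pack_into("<H", header, 30, 9)  # sector shift
    struct.pack_into("<I", header, 48, 0)  # directory starts in sector 0
    root_entry = bytearray(512)
    root_entry[80:96] = clsid.bytes_le
    return bytes(header + root_entry)


sample_clsid = uuid.UUID("00020906-0000-0000-c000-000000000046")
print(read_root_clsid(build_sample(sample_clsid)))  # 00020906-0000-0000-c000-000000000046
print(read_root_clsid(b"renamed text file"))        # None
```

Nothing here looks at the file name, which matches the observation that renaming the file does not change which application opens it.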

WSS 3.0: change parent type for a content type

I have created a hierarchy of content types. The root of my hierarchy has the content type "Document" as a parent. There are about 20 other content types derived from my root.
Now, I want to change the parent from "Document" to something else. Is it possible? Either in the web interface or in code? Can the definition of content types be dumped to a text file and then recreated? Or any other trick?

If you can create a feature that contains all your custom content types, you will be able to change the XML that defines each content type and its columns.
This will give you the ability to change the content types for your site by removing the feature and installing it again with the changes (using a Solution is best).
Note that any content using the older content types will still use them after updating the feature (content types are stored at the site level, list level and on the actual item).
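One detail that is not stated in the thread but helps explain why a reinstall is needed: as far as I know, a content type's ID encodes its ancestry. A child ID is the parent ID followed either by two hex digits (other than 00) or by 00 plus a 32-digit GUID, with Document being 0x0101. The following Python sketch, with a hypothetical function name and a made-up GUID, splits an ID into that chain; changing the parent therefore means defining the content type again under a new ID.

```python
def content_type_ancestry(ct_id: str) -> list:
    """Split a content type ID into the chain of IDs from the root to itself."""
    if not ct_id.lower().startswith("0x"):
        raise ValueError("content type IDs start with 0x")
    body = ct_id[2:]
    chain = ["0x"]
    pos = 0
    while pos < len(body):
        # "00" introduces a 32-digit GUID segment; anything else is a 2-digit segment.
        step = 34 if body[pos:pos + 2] == "00" else 2
        if pos + step > len(body):
            raise ValueError("truncated content type ID")
        pos += step
        chain.append("0x" + body[:pos])
    return chain


# Made-up GUID for a custom type derived from Document (0x0101).
custom = "0x0101" + "00" + "A1B2C3D4E5F60718293A4B5C6D7E8F90"
chain = content_type_ancestry(custom)
print(chain[-2])                        # 0x0101
print(content_type_ancestry("0x0101"))  # ['0x', '0x01', '0x0101']
```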
