How can I set a file upload limit for a specific content type in Crafter CMS 3.0? For example, content type ABC must have a file upload limit of 25 MB, whereas content type XYZ must have a file upload limit of 10 MB.
You can build your own Data Source that imposes the desired limits.
Here is how to build your own DS: https://docs.craftercms.org/en/3.0/developers/extending-studio/build-a-data-source.html
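The gist is a per-content-type lookup consulted before the file is accepted. Below is a minimal illustration of that idea only - it is not Crafter's actual data source API, and the type names and limits are made up:

    import java.util.Map;

    public class UploadLimitCheck {
        // Hypothetical per-content-type limits, in bytes (25 MB for ABC, 10 MB for XYZ).
        private static final Map<String, Long> MAX_BYTES = Map.of(
                "/component/abc", 25L * 1024 * 1024,
                "/component/xyz", 10L * 1024 * 1024);

        // Reject the upload before storing it if it exceeds the limit for its content type.
        public static void validate(String contentType, long fileSizeBytes) {
            long limit = MAX_BYTES.getOrDefault(contentType, Long.MAX_VALUE);
            if (fileSizeBytes > limit) {
                throw new IllegalArgumentException("File exceeds the "
                        + (limit / (1024 * 1024)) + " MB limit for " + contentType);
            }
        }
    }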
I am using Apache NiFi to ingest data from Azure Storage. The file I want to read can be huge (100+ GB) and can have any extension, and I want to read the file's header to get its actual extension.
I found the python-magic package, which uses libmagic to read the file's header and determine the extension, but this requires the file to be present locally.
The NiFi pipeline to ingest the data looks like this
I need a way to get the file extension in this NiFi pipeline. Is there a way to read the file's header from the Content Repo? If yes, how do we do it? The FlowFile only has metadata, which reports the content type as text/plain for a CSV.
There is no such thing as a generic 'header' that all files have that gives you their "real" extension. A file is just a collection of bits, and we sometimes choose to add extensions/headers/footers/etc. so that we know how to interpret those bits.
We tend to add that 'type' information in two ways: via a file extension, e.g. .mp4, and/or via some metadata that accompanies the file - this is sometimes a header, which is sometimes plaintext and easily readable, but this is not always true. Additionally, it is up to the user and/or the application to set this information, and up to the user and/or application to read it - neither of which is a given.
If you do not trust that the file has the proper extension applied (e.g. video.txt when it's actually an mp4) then you could also try to interrogate the metadata that is held in Azure Blob Storage (ContentType) and see what that says - however, this is also up to the user/application to set when the file is uploaded to ABS, so there is no guarantee that it is any more accurate than the file extension.
text/plain is not invalid for a plaintext CSV, as CSVs are just formatted plaintext - similar to JSON. However, you can be more specific and use e.g. text/csv for CSV and application/json for JSON.
NiFi does have IdentifyMimeType, which can try to work it out for you by interrogating the file, but it is more complex than just accessing some 'header'. This processor uses Apache Tika for the detection and adds a mime.type attribute to the FlowFile.
If your file is some kind of custom format, then this processor likely won't help you. If you know your files have a specific header, then you'll need to provide more information for your exact situation.
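If you want to try the same detection outside of NiFi (for example, when prototyping), the Tika library that IdentifyMimeType uses can be called directly; a minimal sketch:

    import java.io.File;
    import java.io.IOException;
    import org.apache.tika.Tika;

    public class DetectType {
        public static void main(String[] args) throws IOException {
            Tika tika = new Tika();
            // Tika sniffs the leading bytes (and the file name, if available) and
            // returns a MIME type such as "text/csv", not a bare extension.
            String mimeType = tika.detect(new File(args[0]));
            System.out.println(mimeType);
        }
    }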
We are using Kentico 10
Question 1
We are using DirectUploadControl to upload an image in a page form.
We have a column PDFImage in the page type. I can see that the value of this field is a GUID.
Where is the image stored on disk when uploaded by this control? Which table is updated with the file name?
I tried the page type table, the CMS document table, and the media file table, but couldn't find it.
Question 2
We need to process the image when it is uploaded. Is there an event? Right now we are doing this in DocumentEvents.SaveVersion.After.
This depends on how the system is configured to store files: it can be in the DB, on disk, or both. When on disk, it depends on which folder is set to store the attachments.
In the document events there is the SaveAttachment event, so maybe you can try using that one. Or it might be better to create a custom uploader form control, depending on your needs.
Trying to upload a file of around 3 GB gives an Error 500:
Log.nsf doesn't contain any useful info about the error and refers to trace-log-0.xml, which contains the following exception:
CLFAD0169E: Error writing to persisted content to response 30EC97558DFA85F701A8264A917629CAF0A0329A/DominoDoc-738E-preview/Videoclip_2017_InternationalMarketing_preview.wmv/{3}
java.io.IOException: HTTP: Internal error:
at com.ibm.domino.xsp.bridge.http.servlet.XspCmdHttpServletResponse.write(XspCmdHttpServletResponse.java:860)
at com.ibm.domino.xsp.bridge.http.servlet.XspCmdServletOutputStream.write(XspCmdServletOutputStream.java:72)
at com.ibm.commons.util.io.StreamUtil.copyStream(StreamUtil.java:137)
at com.ibm.commons.util.io.StreamUtil.copyStream(StreamUtil.java:118)
at com.ibm.xsp.webapp.PersistenceServiceResourceProvider$PersistenceServiceResource.write(PersistenceServiceResourceProvider.java:116)
at com.ibm.xsp.webapp.FacesResourceServlet.doGet(FacesResourceServlet.java:110)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:693)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:806)
at com.ibm.xsp.webapp.FacesModuleResourceServlet.service(FacesModuleResourceServlet.java:85)
at com.ibm.designer.runtime.domino.adapter.ComponentModule.invokeServlet(ComponentModule.java:588)
at com.ibm.domino.xsp.module.nsf.NSFComponentModule.invokeServlet(NSFComponentModule.java:1335)
at com.ibm.designer.runtime.domino.adapter.ComponentModule$AdapterInvoker.invokeServlet(ComponentModule.java:865)
at com.ibm.designer.runtime.domino.adapter.ComponentModule$ServletInvoker.doService(ComponentModule.java:808)
at com.ibm.designer.runtime.domino.adapter.ComponentModule.doService(ComponentModule.java:577)
at com.ibm.domino.xsp.module.nsf.NSFComponentModule.doService(NSFComponentModule.java:1319)
at com.ibm.domino.xsp.module.nsf.NSFService.doServiceInternal(NSFService.java:662)
at com.ibm.domino.xsp.module.nsf.NSFService.doService(NSFService.java:482)
at com.ibm.designer.runtime.domino.adapter.LCDEnvironment.doService(LCDEnvironment.java:357)
at com.ibm.designer.runtime.domino.adapter.LCDEnvironment.service(LCDEnvironment.java:313)
at com.ibm.domino.xsp.bridge.http.engine.XspCmdManager.service(XspCmdManager.java:272)
Files with smaller sizes (~900 MB) upload fine. What can be the reason for the issue? Is it related to Domino limitations, like:
Rich text field size: Limited only by available disk space up to 1GB
mentioned here: Table of Notes and Domino known limits? I ask because the upload from XPages is actually performed into a Domino document field defined as MIME by the following code snippet:
doc.createMIMEEntity("preview");
Notes:
The target NSF database where the file is being uploaded is DAOS-enabled.
Max POST size in the Domino settings document is set to 0 (unlimited).
Max POST size in the Web Site document is set to 0 (unlimited).
The NSF itself doesn't set a limit for max size in File Upload Options.
Thanks for any thoughts!
Check "Maximum size of request content". As this technote says, that also affects it http://www-01.ibm.com/support/docview.wss?uid=swg21096111.
Also, file uploads get written to the xspupload folder, as specified in xsp.properties, or to the default temp file location. The drive hosting that folder could be short on usable space, which would also cause errors in serialization.
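From memory, the relevant xsp.properties entries look roughly like this - treat both the key names and the units as something to verify against your Domino version's documentation:

    # Maximum allowed upload size (reported to be in kilobytes)
    xsp.upload.maximumsize=4194304
    # Directory for temporary upload files (the "xspupload folder")
    xsp.persistence.dir.xspupload=D:/xspupload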
But I'd agree with Frantisek that uploading 3 GB via HTTP is not advisable. Beyond timeout and performance issues, the limit is per attachment, so if many people upload large files at the same time, the drive holding the xspupload folder may run out of space. Of course, once any of those Notes documents is saved, the temp file gets removed, so I'm not sure how you would be able to diagnose that scenario. I'm not an expert in this area, but possibly FTP to a file system may be a better approach, or administrative intervention to manually store occasional very large files in a specific area.
I have 150 GB of JPGs in around 30 folders. I am trying to import them into the media library of a CMS. The CMS will accept a bulk import of images in a zip file, but there is a limit of 500 MB on the size of the zip (and it won't accept multi-volume zips).
I need to go into each folder and zip the images into a small number of ~500 MB zip files. I am using WinRAR, but it doesn't seem to have the facility to do what I want.
Is there another product that will do what I want?
Thanks
David
It is possible with WinRAR as well. Please see this guide: Create Multi-part Archives to Split Large Files for Emailing, Writing to CD [How To]
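If a scripted approach is acceptable instead, the grouping itself is straightforward: walk each folder and start a new zip whenever the next file would push the current archive past ~500 MB. A hypothetical Java sketch (it ignores zip overhead and assumes no single file exceeds the limit):

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.stream.Stream;
    import java.util.zip.*;

    public class ZipSplitter {
        static final long LIMIT = 500L * 1024 * 1024; // ~500 MB per archive

        public static void main(String[] args) throws IOException {
            Path root = Paths.get(args[0]);   // folder containing the images
            Path outDir = Paths.get(args[1]); // where the zips are written
            long used = 0;
            int part = 0;
            ZipOutputStream zip = null;
            try (Stream<Path> files = Files.walk(root).filter(Files::isRegularFile)) {
                for (Path file : (Iterable<Path>) files::iterator) {
                    long size = Files.size(file);
                    // Start a new archive when the next file would exceed the limit.
                    if (zip == null || used + size > LIMIT) {
                        if (zip != null) zip.close();
                        zip = new ZipOutputStream(Files.newOutputStream(
                                outDir.resolve("images-" + (++part) + ".zip")));
                        used = 0;
                    }
                    zip.putNextEntry(new ZipEntry(
                            root.relativize(file).toString().replace('\\', '/')));
                    Files.copy(file, zip);
                    zip.closeEntry();
                    used += size;
                }
            } finally {
                if (zip != null) zip.close();
            }
        }
    }

JPEGs are already compressed, so each archive's size will track the raw file sizes closely.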
I am trying to assess our security risk if we allow a form on our public website that lets the user upload any type of file and have it stored in the database.
I am worried about the following:
Robots uploading information
A huge increase in the size of the database
The form is a resume upload, so HR people will be downloading those files expecting a JPEG, DOC, or PDF but may actually be getting a virus.
You can use CAPTCHAs to deal with robots.
Set a reasonable file size limit for each upload.
You can do multiple checks for your file upload control (a sketch of some of these follows the list).
1) Check the file extension (.wmv, .exe, .doc). This can be implemented with a regex.
2) Actually check the file header or magic bytes (e.g. GIF, Word, XLS); the file extension alone is sometimes not sufficient.
3) Limit the file size (e.g. 20 MB).
4) Never accept the filename provided by the user. Always rename the file to some GUID according to your specifications. This way a hacker won't be able to predict the actual name of the file stored on the server.
5) Store all files outside the web virtual directory, preferably on a separate file server.
6) Also implement a CAPTCHA for the file upload.
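As a sketch of checks 1, 3 and 4 above - the allowed extensions and the 20 MB limit are illustrative, not a recommendation:

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.UUID;
    import java.util.regex.Pattern;

    public class UploadGuard {
        // 1) Whitelist of allowed extensions, enforced with a regex.
        private static final Pattern ALLOWED = Pattern.compile("(?i).+\\.(pdf|doc|docx|jpe?g)$");
        // 3) Hypothetical 20 MB size limit.
        private static final long MAX_BYTES = 20L * 1024 * 1024;

        public static Path store(String originalName, byte[] content, Path storageRoot)
                throws IOException {
            if (!ALLOWED.matcher(originalName).matches())
                throw new IllegalArgumentException("Extension not allowed: " + originalName);
            if (content.length > MAX_BYTES)
                throw new IllegalArgumentException("File too large");
            // 4) Discard the user-supplied name; store under a fresh GUID instead.
            String ext = originalName.substring(originalName.lastIndexOf('.'));
            return Files.write(storageRoot.resolve(UUID.randomUUID() + ext), content);
        }
    }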
In general, if you really mean to allow any kind of file to be uploaded, I'd recommend:
A minimal type check using MIME magic numbers to verify that the file's extension corresponds to its actual content (though this doesn't solve much if you are not going to limit the kinds of files that can be uploaded); see the sketch below.
Better yet, have an antivirus (the free ClamAV, for example) check the file after uploading.
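The magic-number check boils down to comparing the first few bytes of the file against known signatures; a tiny sketch covering just three types (a real detector such as libmagic or Apache Tika knows far more):

    import java.util.Arrays;

    public class MagicSniffer {
        // A few well-known file signatures (magic numbers).
        private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
        private static final byte[] PDF  = {'%', 'P', 'D', 'F'};
        private static final byte[] PNG  = {(byte) 0x89, 'P', 'N', 'G'};

        // head: the first bytes of the uploaded file.
        public static String sniff(byte[] head) {
            if (startsWith(head, JPEG)) return "image/jpeg";
            if (startsWith(head, PDF))  return "application/pdf";
            if (startsWith(head, PNG))  return "image/png";
            return "unknown";
        }

        private static boolean startsWith(byte[] data, byte[] prefix) {
            return data.length >= prefix.length
                    && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
        }
    }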
On storage, I always prefer to use the filesystem for what it was created for: storing files. I would not recommend storing files in the database (supposing a relational database). You can store the file's metadata in the database along with a pointer to the file on the file system.
Generate a unique id for the file, and you can use a 2-level directory structure to store the data, e.g. id=123456 => /path/to/store/12/34/123456.data
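That two-level split keeps any single directory from accumulating millions of entries; deriving the path from the id is trivial (sketch, assuming ids of at least four characters):

    import java.nio.file.Path;

    public class FileLocator {
        // id=123456 => <root>/12/34/123456.data
        public static Path locate(Path root, String id) {
            return root.resolve(id.substring(0, 2))
                       .resolve(id.substring(2, 4))
                       .resolve(id + ".data");
        }
    }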
That said, this can vary depending on what you want to store and how you want to manage it. Serving a document repository is not the same as serving an image gallery or a simple "shared directory".