Snap binary files upload - haskell-snap-framework

I need to understand the file uploading process with Snap.
Given this form:
<form id="form" action="/files/upload" method="POST" enctype="multipart/form-data">
<input type="file" id="files" name="files[]" multiple />
<button type="submit" onclick="handleFiles(e)">Upload</button>
</form>
Do I use the same functions, such as getPostParams, to process binary files, or do I use functions from Snap.Util.FileUploads?
I need to upload and save binary files like PDF in the database. The database driver will accept a ByteString to store a binary file.
I went through Snap.Util.FileUploads, but it does not look like what I need.
So I am not sure how to process this in the handler.
Thanks.
EDIT
With some help from IRC I managed to come up with the construct below. I think it should be close to correct? Well, it compiles and dumps the file to mongodb, and I can also read it back. Although I don't quite understand the enumerator and Iteratee stuff ...
handleFiles :: AppHandler ()
handleFiles = do
    [file] <- handleMultipart defaultUploadPolicy $ \part ->
        liftM BS.concat EL.consume
    let b = ["file" =: Binary file]
    r <- eitherWithDB $ insert "tests" b
    either (error . show) (const $ return ()) r

Use Snap.Util.FileUploads. It's quite tricky to get file uploading right without leaving yourself open to security vulnerabilities. That FileUploads module has been carefully designed with this in mind.
The docs describe handleFileUploads pretty well: it "reads uploaded files into a temporary directory and calls a user handler to process them." You supply it with a handler of type:
[(PartInfo, Either PolicyViolationException FilePath)] -> m a
handleFileUploads stores all the incoming files on disk subject to the policy you specify. Then it calls your handler and passes it the list of files handled.
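To make that concrete, here is a minimal sketch of such a handler, assuming the snap-core signature described above and the question's AppHandler monad. The "tmpdir" directory, the 10 MB limit, and saveToDatabase are illustrative; saveToDatabase stands in for the mongodb insert from the question.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad          (forM_)
import           Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString        as BS
import           Snap.Util.FileUploads

handleFiles :: AppHandler ()
handleFiles =
    handleFileUploads "tmpdir" defaultUploadPolicy
        (const $ allowWithMaximumSize (10 * 1024 * 1024))  -- cap each part at 10 MB
        $ \parts -> forM_ parts $ \(_info, result) ->
            case result of
              Left violation ->
                  -- this part violated the policy; no file was stored
                  liftIO $ putStrLn ("rejected: " ++ show violation)
              Right tmpPath -> do
                  -- the temp file is deleted after this handler returns,
                  -- so read it into a strict ByteString now
                  bytes <- liftIO $ BS.readFile tmpPath
                  saveToDatabase bytes  -- hypothetical: e.g. the mongodb insert above
```

Note that the per-part policy (here, allowWithMaximumSize) is what protects you from unbounded uploads, which is the security concern mentioned above.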

Related

Need Help Understanding Node JS Syntax Error

Can anyone please help me to understand the syntax error based on the attached screenshot below?
My script is supposed to access a given JSON and return the specific value, but somehow it's returning this error.
Edit 1
I tested my script with a dummy JSON file and it didn't return any error, so I suspect my original JSON might be the problem. Here's my JSON.
{
  "og_object": {
    "id": "1192199560899293",
    "description": "Hi everyone I have an important announcement to make. So ever since Penny started school, I've been having mixed feelings. \u00a0Besides having a bit of space to myself to breathe and rest my brain/legs, I'm actually a bit bittersweet cos my little baby, who used to sleep at weird hours and gobble puree",
    "title": "Fighter and Penny's new sibling",
    "type": "article",
    "updated_time": "2017-04-12T01:17:57+0000"
  },
  "share": {
    "comment_count": 0,
    "share_count": 109
  },
  "id": "http://fourfeetnine.com/2017/03/05/fighter-and-pennys-new-sibling/"
}
Edit 2
Here's my script that I run that produces the error.
var objects = require('./output.txt');
console.log(objects);
output.txt is the file that contains the JSON that I pasted in Edit 1
var objects = require('./output.txt');
The require() function belongs to the module loading system. Despite the name, it can load several types of files and directories, not only Node modules. Per the high-level pseudocode algorithm in the docs:
require(X)
If X begins with './' or '/' or '../'
a. LOAD_AS_FILE(Y + X)
[...]
LOAD_AS_FILE(X)
If X is a file, load X as JavaScript text. STOP
If X.js is a file, load X.js as JavaScript text. STOP
If X.json is a file, parse X.json to a JavaScript Object. STOP
If X.node is a file, load X.node as binary addon. STOP
Since you get a SyntaxError, output.txt does not contain valid JavaScript code.
If you really want to load JSON, you need to trigger rule 3 of LOAD_AS_FILE by renaming the file to output.json.
Thanks to @Jordan's suggestion, the fault was indeed the wrong file extension. After changing it from .txt to .json, the syntax error disappeared.

tail -f implementation in node.js

I have created an implementation of tail -f in node.js using socket.io and fs.watch function.
I read the file using fs.readFile, convert it into an array of lines, and return it to the client, storing the current length in a variable.
Then whenever the file-change event fires, I re-read the whole file, convert it into an array of lines, compare the old and new lengths, and slice it like
fileContent.slice(oldLength, fileContent.length)
This gives me the changed content, and it runs perfectly fine.
Problem: I am reading the whole file every time it changes, which is inefficient if the file is large. Is there any way to read a file once and then get only the changed content when it changes?
I have also tried, spawning child process for "tail -f"
var spawn = require('child_process').spawn;
var child = spawn('tail', ['-f', logfile]);
child.stdout.on('data', function (data) {
    linesArray = data.toString().split("\n");
    console.log("Data sent: " + linesArray[0]);
    io.emit('changed', {
        data: linesArray,
    });
});
The problems with this are:
The "data" event fires multiple times when I save the logfile after writing some content.
On first load, it correctly returns the last ten lines of the file, but on each change it returns the whole content again and again.
So if you have any idea of solving this problem let me know. Till then I will dig the internet.
So, I got the solution by reading someone else's code. The solution was to use fs.open to open the file and then, instead of reading the whole file, read a particular block from it with the fs.read() function.
To learn about fs.open/fs.read, read this: nodejs-file-system.
Official doc : fs.read

Clojure agents append to excel file

I've been using docjure to write to excel files. Mostly I want to append rows to already existing files, usually one at a time. When I do this without agents/future, I load the file, use add-rows to add the data, and then rewrite the file like this:
(defn append
  "data is in the same format as create-workbook, i.e. [[\"n\" \"m\"] [1 2] [3 4]]"
  [filename data]
  (let [workbook (load-workbook filename)
        sheet    (select-sheet "Sheet1" workbook)]
    (add-rows! sheet data)
    (save-workbook! filename workbook)))
I make a lot of calls to append, so I found this: http://blakesmith.me/2012/05/25/understanding-clojure-concurrency-part-2.html, which shows you how to use agents to write to a file using future.
First of all, I'm using FileOutputStream instead of FileWriter, which should still work. But whereas the tutorial's example just appends strings to the end of the file with .write and then closes, I need to rewrite the whole file every time I "append" (I think?), since a .xlsx workbook is more than just appended characters.
I don't really know how to set this up since with the tutorial's logging example, write-out returns the updated instance of the BufferedWriter and I don't know what the equivalent of that would be.
My other option would be to add the data to the vector concurrently (load the file once and keep returning new vectors [[\"n\" \"m\"] [1 2] [3 4]] with the data added) but I'm planning on doing ~10000-100000 of these calls and that seems like a lot to keep track of... although to be fair reading and writing all the data that many times is probably not that great either.
If you have any suggestions on how I can do this, I'd appreciate it. I'd be willing to make calls to the Apache POI itself too, if there's a better way to append with that. Thanks.
--- UPDATE ---
I just rewrote the logger example with the file as an agent instead of the output stream, and it seems to work. I'll let you know if it ends up working with docjure/Apache POI.
(def logfile (agent (File. "blah.txt")))

(defn write-out [file msg]
  (with-open [out (BufferedWriter. (FileWriter. file true))]
    (.write out msg))
  file)
--- UPDATE 2 ---
I got an analogous version written with docjure, but unfortunately, because the file is opened inside write-out and that happens in each future (I don't see a way around this if I use a File as the agent, and I don't see another way to do it), most of the futures read the empty file and write their row to that. Since they all run in parallel, the end result is that most of them overwrite each other.
Ultimately I decided to just add each row vector to an overall data vector and write once. I can do that with plain pmap, so it's a lot neater. The one downside is that if something goes wrong, none of the data is written to the file at all; the upside is that the time spent writing is reduced, since there's only one write call. Also, I would otherwise have been loading the large amount of data into memory every time, which takes time. Memory usage is the same either way.
If anyone still wants to answer this, I'd still be interested, but the method in my first update does not work (each future reads in an empty file and uses that to append to). I'll post that code in case it helps anyone, though: a docjure version of the aforementioned tutorial.
(def file (agent (File. "blah.xlsx")))

(defn write-out [file workbook]
  (with-open [out (FileOutputStream. file)]
    (.write workbook out))
  file)

(defn write-workbook [file data]
  (let [filename (.getPath @file)
        workbook (try (load-workbook filename)
                      (catch Exception e (create-workbook "Sheet1" [])))
        sheet    (select-sheet "Sheet1" workbook)]
    (add-rows! sheet data)
    (send file write-out workbook)))

(defn test [file]
  (write-workbook file [["n" "m"]])
  (dotimes [i 5]
    (future (write-workbook file [[i (inc i)]]))))
Thanks

How to handle multiple file upload in hunchentoot?

I know how to handle a single file upload in hunchentoot using hunchentoot:post-parameter, but when I add the multiple attribute, i.e. <input name="file" type="file" multiple="multiple"/>, (hunchentoot:post-parameter "file") gives me only one of the files. Is there (and what is) a mechanism for receiving all the files chosen by the user?
The Hunchentoot API does not directly give you access to multiple uploaded files, but you can use (hunchentoot:post-parameters *request*) to retrieve the list of all POST parameters (which includes the uploaded files). This will be an alist, and you can get a list of all uploaded files using standard alist techniques (e.g. (remove "file" (hunchentoot:post-parameters hunchentoot:*request*) :test (complement #'equal) :key #'car)).
This is a rather straightforward task in hunchentoot. Assuming you have an HTML <input> element with name="files" and multiple="multiple", you could access all the files associated with the "files" input like this:
(loop for post-parameter in (hunchentoot:post-parameters*)
      if (equal (car post-parameter) "files")
      collect post-parameter)
This will give you a list whose length should match the number of uploaded files associated with the name "files". Each of the elements will be a list that looks like this:
("files" #P"/temporary/file1" "name of file" "file type")
More information can be found in the very well-documented reference.

Running a main function inside another main function on a .hs file changes the behaviour IO?

I'm currently finishing my first Haskell project and, in the final step of the work, my I/O function seems to behave strangely after I connected the different Haskell files.
I have a main file (f1.hs) which loads some info from a multimedia library and saves it into variables in a new .hs file (f2.hs). I also have a "data processing and user interface" file (f3.hs), which reads those variables and, depending on what the user orders, sorts them and displays them on screen. This f3.hs file works with menus driven by the values of the keyboard input (getLine).
In order to make the work sequence automatic, I made a main function in the f1.hs file which creates the f2.hs file and then, with the System.Cmd module, calls system "runhaskell f3.hs". This routes the user from the f1.hs file to the main function of f3.hs.
The strange thing is that, after I did that, every getLine seems to run before the last line of the prompt is printed.
What it should appear would be:
Question One.....
Answer: (cursor's place)
but what I get is:
Question One.....
(cursor's place)
Answer:
This only happens if I runhaskell f1.hs. If I runhaskell f3.hs directly, it works correctly (though I can't do that in the final version, as the f2.hs file needs to be created first). Am I doing something wrong with this sequence?
I'm sorry for the lack of code, but I thought that it wouldn't be any help for the understanding of the problem...
This is typically caused by line buffering, meaning the text does not actually get printed to the console until a newline is printed. The solution is to manually flush the buffer, i.e. something like this:
import System.IO

main = do
    ...
    putStr "Answer: "
    hFlush stdout
    ...
Alternatively, you can disable buffering by using hSetBuffering stdout NoBuffering, though this comes at a slight performance cost, so I recommend doing the flushing manually when you need to print a partial line.
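A complete toy program showing the fix in context (the prompt text and variable names are illustrative; only base is needed):

```haskell
import System.IO

main :: IO ()
main = do
    putStr "Answer: "
    hFlush stdout          -- force the partial line out before blocking on input
    answer <- getLine      -- without the flush, this may run before "Answer: " shows
    putStrLn ("You said: " ++ answer)
```

Run interactively (or piped through runhaskell when invoked from another process, as in the question), the prompt now reliably appears before the cursor waits for input.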
