Yesod scaffolded site slow to release database pool connection - haskell

UPDATE
I have simplified the demonstration of this with an actual project created from the scaffold - you can check it out here: https://github.com/tetigi/yesod-bug-test
Follow the README to set up the repo and replicate the issue! Thanks :)
ORIGINAL POST
I've recently been trying to create a simple website using yesod - in one particular handler, it makes a couple of runDB calls (selecting and inserting some values into a ~200 item DB). However, on medium load, such as reloading the page rapidly in a browser, the page starts to hang.
Doing some debugging, I found that it seems the yesod app is not releasing the connections to the DB pool in a timely fashion and ends up waiting for them to release. To correborate this, I found the other following things:
Reducing the DB pool to 2 gave me a freeze after only a couple of clicks
The default (10) froze after about 5 seconds of clicking
Increasing the DB pool to 100 gave me a much longer click period, up to about 10-15 seconds of rapid clicking
The issue is the same whether I'm using postgres or sqlite
In postgres, it was possible to see the 'COMMIT' transactions stacking up over time
These transactions would eventually dissappear over time and the website would be responsive again
Is there something I'm missing here? The webpage does not do anything complicated, as the snippet below will show. Any ideas? As it stands, the website will be unuseable for multiple users until I find a way to fix this!
I'm using the standard scaffolded yesod application via stack as is recommended in the documentation.
Cheers!
Luke
Example handler code (abridged)
getCompareR :: Handler Html
getCompareR = do
-- Get all entities from the db. Throws error if < 2 elems in the DB.
entities <- fmap (\xs -> assert (length xs >= 2) xs) $ runDB $ selectList [] []
-- Pick an entity at random
Entity _ thisThingEntity <- liftIO $ runRVar (choice entities) DevRandom
-- Pull out everything NOT the thing we just picked
otherEntities <- runDB $ selectList [ComparisonHash !=. (comparisonHash thisThingEntity)] []
-- Pick one at random
Entity _ thatThingEntity <- liftIO $ runRVar (choice otherEntities) DevRandom
-- Some stuff including some inserts
-- ...
-- ...
runDB $ sequence [update thisId [ComparisonElo =. thisElo], update thatId [ComparisonElo =. thatElo]]
-- Start laying out the webpage
defaultLayout $ do
-- Fill in the rest with compare.hamlet
$(widgetFile "compare")

The issue lies within Data.Random - replacing the choice call with something like:
import System.Random (randomRIO)
...
-- Pick an entity at random
randomInt1 <- liftIO $ randomRIO (0, length entities -1)
let Entity _ thisThingEntity = entities !! randomInt1
Fixed everything and we no longer get slow down. Not really sure why Data.Random is doing this, but at least it works now!
Another interesting thing to note - the issue is NOT present on Mac OS X, only on Linux flavours (CentOS, Arch, Ubuntu being the ones we tried)

Related

Is possible to use GUI with littler?

I want to write small script with simple GUI using Rscript or littler.
In the example I use gWidget2RGtk2.
For example, helloworld.R
#!/usr/bin/r
library(gWidgets2RGtk2)
W <- gwindow("Window", visible=FALSE)
L <- glabel("Hello World!", container=W)
visible(W) <- TRUE
This works well if it run in a R session, but get an error when it run from shell:
Error in UseMethod(".gwindow") :
no applicable method for '.gwindow' applied to an object of class "NULL"
In the case of graphics, I know that is required X11() before use plot().
Is possible fix this script to allow render widgets from shell?
(I only need run the script on linux machine)
EDIT: This is an example that works well on Linux. (includes suggestions received in the answer and comment.)
#!/usr/bin/r
require(RGtk2) # required for gtkMain()
require(gWidgets2)
options(guiToolkit="RGtk2")
W <- gwindow("Window", visible=FALSE,
handler = function(h, ...) {
gtkMainQuit() # stop main loop when windows is closed.
}
)
L <- glabel("Hello Word!", container=W)
visible(W) <- TRUE
gtkMain() # start main loop to keep the script alive.
Yes, I have done that in the past. You have to make sure you have a GUI event loop running to keep the app alive by waiting.

How to get the process ID of a created process in Haskell?

Maybe I'm just missing something obvious in the System.Process API (http://hackage.haskell.org/package/process), but it doesn't appear to support getting the raw PID of a process created. The API usually returns a ProcessHandle which can be used easily enough, but this doesn't appear to fulfill a deployment need I have.
I have a case where I want to spawn a long-running process, log the PID it's using, and be able to automatically come back at a later time (days, weeks, months) and kill the old process and re-start with a new process. I'm sure there are several ways to do this auto-deploy-and-restart, but PIDs seemed like the simplest way to do so without too much platform-dependent code.
I'm open to other suggestions about my underlying problem, but it seems odd to me that I can't find any direct PID references (or a way to convert to them) in the process API. This seems like an oversight of the API.
Here is some example code:
import System.Process
import System.Process.Internals
-- | returns Just pid or Nothing if process has already exited
getPid ph = withProcessHandle ph go
where
go ph_ = case ph_ of
OpenHandle x -> return $ Just x
ClosedHandle _ -> return Nothing
main = do
(_,_,_,ph) <- createProcess $ shell "echo $$"
getPid ph >>= print
Note: I haven't tested this under Windows, but it works on OSX and, presumably, Linux.
For Windows, the Win32 package has a getProcessId function in the module System.Win32.Process, and according to code I've read, this should work:
import System.Win32.Process (getProcessId)
main = do
(_,_,_,ph) <- createProcess $ shell "echo $$"
pid <- withProcessHandle ph go
print pid
where go (OpenHandle x) = fmap Just $ getProcessId x
go (ClosedHandle _) = return Nothing
The code I am basing this on is the code for interruptProcessGroupOf (link)
Looks like interruptProcessGroupOf in System.Process calls either System.Posix.Process.getProcessGroupIDOf (POSIX/not Windows) or System.Win32.Process.getProcessId (Windows) to get the pid: http://git.haskell.org/packages/process.git/blob/HEAD:/System/Process.hs
If everything else fails,
import System.Process.Internals
and then dig inside the ProcessHandle abstraction. You probably want to extract the PHANDLE from the MVar.
Note that this breaks the abstraction layer, which is meant to make the code portable across OSs. Use it with extreme care, and be prepared for it to break in new versions of the library.

Mysterious serial port behavior

I'm trying to write a command line utility in Haskell (first serious program in Haskell) to query a sensor connected to an arduino over a serial port. The relevant portion of the code is as follows:
-- Read a single reading from the serial port
recursiveread :: SerialPort -> B.ByteString -> IO B.ByteString
recursiveread s acc sent = do
recd <- recv s 1000
if ((B.pack "Done\r\n") `B.isSuffixOf` acc)
then return acc
else recursiveread s $ B.append acc recd
-- Read a single reading from the serial port
getSingleRead :: FilePath -> IO [String]
getSingleRead path = do
s <- openSerial path defaultSerialSettings
send s $ B.pack "1"
acc <- recursiveread s B.empty
closeSerial s
return $ (lines . B.unpack) acc
-- Checks if the filepath exists, and prints a single reading
readspr :: [FilePath] -> IO ()
readspr [path] = do
exists <- doesFileExist path
case exists of
False -> putStrLn "No device found"
True -> getSingleRead path >>= (mapM_ putStrLn)
The dispatch function (not shown) calls readspr and uses a "/dev/cu.modem1421" as a parameter.
The problem I'm having I suspect has to do with order of evaluation. The expected behavior is to receive a frame of data like this, which I check for "Done" as the terminator:
T: 33697
Data
0:3.2967772483
1:3.2967772483
2:3.2967772483
...
126:3.2967772483
127:3.2967772483
Done
Here is the problem:
1) When I run this code by itself after compiling, from Bash - the program computes for a second and hangs - no output.
2) However when I go into ghci and open a serial port with the openSerial :: FilePath -> SerialPort command, and then run the program in Bash I do see the expected output. (Shouldn't I get resource busy here or something? If I open the connection using screen I do get that error)
This behavior is repeatable - I can closeSerial in ghci and go back to no output in Bash.
Prior research / additional background:
I'm using the System.Hardware.SerialPort library (serialport-0.4.7: Cross platform serial port library in cabal) on an OS X machine. There is an issue pertaining to the blocking / non-blocking nature of the serial port on OS X that is at odds with the UNIX standard - Issue 13 on Github for serialport-0.4.7. However, I'm unable to make much sense of it. Any help on making sense of this problem is highly appreciated. Is this an issue with my misreading of Haskell's laziness, or am I missing something on the order of evaluation, where the port is being closed before I can read it fully, or with the library? (in which case I should post it on the Github page?)
Further research: (case solved)
Here's the problem - Arduino resets when a serial connection is opened (can be disabled). When I opened a connection with the program - it would receive the communication from the host before it booted up. I noticed this when observing the LEDs on the Arduino in the two separate cases - opening (and holding) the port in ghc allowed for enough time to elapse between the second open-and-read.
The solution was to add a delay of a couple seconds after opening the port - now "it just works"!
Answer updated for posterity:
Further research: (case solved)
Here's the problem - Arduino resets when a serial connection is opened (can be disabled). When I opened a connection with the program - it would receive the communication from the host before it booted up. I noticed this when observing the LEDs on the Arduino in the two separate cases - opening (and holding) the port in ghc allowed for enough time to elapse between the second open-and-read.
The solution was to add a delay of a couple seconds after opening the port - now "it just works"!

GTK and INotify don't work together

Using GHC, on Ubuntu 13.10, iNotify works-
import Control.Concurrent
import System.INotify
main = do
n <- initINotify
addWatch n [Modify] "/home/fred/" $ \event -> do
putStrLn $ "file changed: " ++ show event
threadDelay 10000000
and GTK2HS works-
import Graphics.UI.Gtk
main = do
initGUI
{-Add your widgets here.... or don't, the bug appears either way.-}
mainGUI
But if I put the two together, inotify never triggers. (it compiles and runs though....)
main = do
n <- initINotify
addWatch n [Modify] "/home/fred/" $ \event -> do
putStrLn $ "file changed: " ++ show event
initGUI
mainGUI
I've tried putting the inotify and GTK stuff in separate threads, it made no difference. I suspect something like a signal collision between the libs....
Oh, and in case it matters, I am trying to build a small tool that runs in the background, watches for file changes, and displays some info in the application indicator when this happens.
Note-
To trigger iNotify, just create or modify a file in the directory given in addWatch....
echo "abcd" > /home/fred/aFile
touch doesn't seem to work.
Compile (actually, technically, link) with -threaded. This way the inotify thread will be evacuated from the main execution context before the mainGUI loop goes into C-land and stops cooperatively switching to the GHC runtime. More details on multi-threading and gtk are available at this post I wrote a while ago.

Selecting rows referenced by another table in Yesod

Suppose I have the following Yesod (Database.Persist) database schema:
File
path Text
Tag
name Text
FileTag
file FileId
tag TagId
UniqueFileTag file tag
What is the most convenient way in Yesod to select File records that are referenced by a given Tag record? Do I need to resort to custom SQL? I'm using PostgreSQL as the database backend.
You can use custom SQL to solve this problem; I don't think Persistent offers a different solution, since it's not an ORM because it has to support non-relational backends like MongoDB.
You can implement a basic join like this:
let tagFileStatement =
Text.concat
[ "SELECT ?? "
, "FROM file, file_tag "
, "WHERE file.id = file_tag.file "
, "AND ? = file_tag.tag"
]
files <- runDB $ rawSql tagFileStatement
[toPersistValue theTagIdThatYouWantToLookupFilesFor]
files :: [Entity File]
There is a module for handling one-to-many relationships, either as application-level joins or as a proper SQL join. These modules probably count as the most poorly documented aspect in the entire Yesod project, unfortunately. Longer term, we're hoping to improve the state of more complex SQL queries, but we don't have any actual code to show for this right now.
There are a few threads on the Yesod mailing list covering how to use these, here's one thread that popped up: https://groups.google.com/forum/#!msg/yesodweb/a4EAvPS8wFA/ClPuS94TRFwJ%5B1-25%5D . We really need to either improve the Haddocks or write a wiki page detailing the behavior better.

Resources