I am trying to write a solution for one of the Hackerrank problems. The challenge is to count elements in a list, the elements vary from 0 to 99, so it is possible to count them in linear time. Here is what I got:
{-# LANGUAGE BangPatterns #-}
{-# OPTIONS_GHC -O3 #-}
module Main where
import Data.STRef
import Data.Foldable
import Control.Monad
import Control.Monad.ST
main = do
line1 <- getLine
line2 <- getLine
let
!ns = map read $ words line2 :: [Int]
res = runST $ do
refs <- forM [0..99] $ \i ->
newSTRef (0 :: Int)
traverse_ (\x -> modifySTRef' (refs !! x) (+1) ) ns
mapM (\ref -> readSTRef ref) refs
putStrLn . unwords . map show $ res
This code works but not fast enough to pass the last test case. Can someone recommend an improvement to it? (link to the problem)
This can be done as a one-liner using accumArray from Data.Array. Something like accumArray (+) 0 (0,99) . zip values $ repeat 1 where values is the input.
It appears to still not be fast enough, which is somewhat vexing. accumArray is more or less as efficient as possible for what it does. Testing on my system reveals the time for processing 1,000,000 input values to be about 1 second, even without compiling it, and that time is dominated by generating the random inputs. That's a far cry from the 5 seconds on the test site.. I have to wonder how overloaded that system is.
One problem you have is that you're looking up your STRefs in a list which means that you'll have to traverse O(n) steps for every lookup and modification. This can be alleviated by using something like Data.Map.Map which has O(log(n)) lookup and modification time.
You could also use a mutable Array or Vector for O(1) lookup/modification time in the ST monad. This is probably the fastest method.
Related
I'm looking to try to run a moderately expensive function on a large list of inputs, using part of the output of that function as one of its inputs. The code runs as expected, unfortunately it consumes a large amount of memory in the process (just under 22GiB on the heap, just over 1GiB maximum residency). Here is a simplified example of what I mean:
{-# LANGUAGE OverloadedStrings #-}
import Data.List (foldl')
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TL
import qualified Data.Text.Lazy.Builder as TB
main :: IO ()
main = TL.putStr $ TB.toLazyText showInts
showInts :: TB.Builder
showInts = foldMap fst shownLines
where
shownLines = map (showInt maxwidth) [0..10^7]
maxwidth = foldl' (\n -> max n . snd) 0 shownLines
showInt :: Int -> Int -> (TB.Builder, Int)
showInt maxwidth n = (builder, len)
where
builder = TB.fromText "This number: "
<> TB.fromText (T.replicate (maxwidth - len) " ") <> thisText
<> TB.singleton '\n'
(thisText, len) = expensiveShow n
expensiveShow :: Int -> (TB.Builder, Int)
expensiveShow n = (TB.fromText text, T.length text)
where text = T.pack (show n)
Note that in the where clause of showInts, showInt takes maxwidth as an argument, where maxwidth itself depends on the output of running showInt maxwidth on the whole list.
If, on the other hand, I do the naìˆve thing and replace the definition of maxwidth with foldl' max 0 $ map (snd . expensiveShow) [0..10^7], then maximum residency falls to just 44KiB. I would hope that performance like this would be achievable without workarounds like precomputing expensiveShow and then zipping it with the list [0..10^7].
I tried consuming the list strictly (using the foldl package), but this did not improve the situation.
I'm trying to have my cake and eat it too: exploiting laziness, while also making things strict enough that we don't build up a mountain of thunks. Is this possible to do? Or is there a better technique for accomplishing this?
You can't do it like this.
The problem is that your showInts has to traverse the list twice, first to find the longest number, second to print the numbers with the necessary format. That means the list has to be held in memory between the first and second passes. This isn't a problem with unevaluated thunks; it is simply that the whole list, completely evaluated, is being traversed twice.
The only solution is to generate the same list twice. In this case it is trivial; just have two [0..10^7] values, one for the maximum length and the second to format them. I suspect in your real application you are reading them from a file or something, in which case you need to read the file twice.
I beg for your help, speeding up the following program:
main = do
jobsToProcess <- fmap read getLine
forM_ [1..jobsToProcess] $ \_ -> do
[r, k] <- fmap (map read . words) getLine :: IO [Int]
putStrLn $ doSomeReallyLongWorkingJob r k
There could(!) be a lot of identical jobs to do, but it's not up to me modifying the inputs, so I tried to use Data.HashMap for backing up already processed jobs. I already optimized the algorithms in the doSomeReallyLongWorkingJob function, but now it seems, it's quite as fast as C.
But unfortunately it seems, I'm not able to implement a simple cache without producing a lot of errors. I need a simple cache of Type HashMap (Int, Int) Int, but everytime I have too much or too few brackets. And IF I manage to define the cache, I'm stuck in putting data into or retrieving data from the cache cause of lots of errors.
I already Googled for some hours but it seems I'm stuck. BTW: The result of the longrunner is an Int as well.
It's pretty simple to make a stateful action that caches operations. First some boilerplate:
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import Data.Map (Map)
import qualified Data.Map as M
import Debug.Trace
I'll use Data.Map, but of course you can substitute in a hash map or any similar data structure without much trouble. My long-running computation will just add up its arguments. I'll use trace to show when this computation is executed; we'll hope not to see the output of the trace when we enter a duplicate input.
reallyLongRunningComputation :: [Int] -> Int
reallyLongRunningComputation args = traceShow args $ sum args
Now the caching operation will just look up whether we've seen a given input before. If we have, we'll return the precomputed answer; otherwise we'll compute the answer now and store it.
cache :: (MonadState (Map a b) m, Ord a) => (a -> b) -> a -> m b
cache f x = do
mCached <- gets (M.lookup x)
case mCached of
-- depending on your goals, you may wish to force `result` here
Nothing -> modify (M.insert x result) >> return result
Just cached -> return cached
where
result = f x
The main function now just consists of calling cache reallyLongRunningComputation on appropriate inputs.
main = do
iterations <- readLn
flip evalStateT M.empty . replicateM_ iterations
$ liftIO getLine
>>= liftIO . mapM readIO . words
>>= cache reallyLongRunningComputation
>>= liftIO . print
Let's try it in ghci!
> main
5
1 2 3
[1,2,3]
6
4 5
[4,5]
9
1 2
[1,2]
3
1 2
3
1 2 3
6
As you can see by the bracketed outputs, reallyLongRunningComputation was called the first time we entered 1 2 3 and the first time we entered 1 2, but not the second time we entered these inputs.
I hope i'm not too far off base, but first you need a way to carry around the past jobs with you. Easiest would be to use a foldM instead of a forM.
import Control.Monad
import Data.Maybe
main = do
jobsToProcess <- fmap read getLine
foldM doJobAcc acc0 [1..jobsToProcess]
where
acc0 = --initial value of some type of accumulator, i.e. hash map
doJobAcc acc _ = do
[r, k] <- fmap (map read . words) getLine :: IO [Int]
case getFromHash acc (r,k) of
Nothing -> do
i <- doSomeReallyLongWorkingJob r k
return $ insertNew acc (r,k) i
Just i -> do
return acc
Note, I don't actually use the interface for putting and getting the hash table key. It doesn't actually have to be a hash table, Data.Map from containers could work. Or even a list if its going to be a small one.
Another way to carry around the hash table would be to use a State transformer monad.
I am just adding this answer since I feel like the other answers are diverging a bit from the original question, namely using hashtable constructs in Main function (inside IO monad).
Here is a minimal hashtable example using hashtables module. To install the module with cabal, simply use
cabal install hashtables
In this example, we simply put some values in a hashtable and use lookup to print a value retrieved from the table.
import qualified Data.HashTable.IO as H
main :: IO ()
main = do
t <- H.new :: IO (H.CuckooHashTable Int String)
H.insert t 22 "Hello world"
H.insert t 5 "No problem"
msg <- H.lookup t 5
print msg
Notice that we need to use explicit type annotation to specify which implementation of the hashtable we wish to use.
I use haskell for line-based data processing, i.e. tasks where you can apply sed, awk and similar tools. As a trivial example, let's prepend 000 to every line from standard input.
I have three alternative ways to do the task:
Lazy IO with lazy ByteStrings
Line-based conduit.
Chunk-based conduit with pure strict ByteString processing inside.
example.hs:
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE FlexibleContexts #-}
import ClassyPrelude.Conduit
import qualified Data.ByteString.Char8 as B8
import qualified Data.ByteString.Lazy.Char8 as BL8
import qualified Data.Conduit.Binary as CB
main = do
[arg] <- getArgs
case arg of
"lazy" -> BL8.getContents >>= BL8.putStr . BL8.unlines . map ("000" ++) . BL8.lines
"lines" -> runConduitRes $ stdinC .| CB.lines .|
mapC ("000" ++) .| mapC (`snoc` 10) .| stdoutC
"chunks" -> runConduitRes $ stdinC .| lineChunksC .|
mapC (B8.unlines . (map ("000" ++)) . B8.lines) .| stdoutC
lineChunksC :: Monad m => Conduit ByteString m ByteString
lineChunksC = await >>= maybe (return ()) go
where
go acc = if
| Just (_, 10) <- unsnoc acc -> yield acc >> lineChunksC
| otherwise -> await >>= maybe (yield acc) (go' . breakAfterEOL)
where
go' (this, next) = let acc' = acc ++ this in if null next then go acc' else yield acc' >> go next
breakAfterEOL :: ByteString -> (ByteString, ByteString)
breakAfterEOL = uncurry (\x -> maybe (x, "") (first (snoc x)) . uncons) . break (== 10)
$ stack ghc --package={classy-prelude-conduit,conduit-extra} -- -O2 example.hs -o example
$ for cmd in lazy lines chunks; do echo $cmd; time -p seq 10000000 | ./example $cmd > /dev/null; echo; done
lazy
real 2.99
user 3.06
sys 0.07
lines
real 3.30
user 3.36
sys 0.06
chunks
real 1.83
user 1.95
sys 0.06
(The results are consistent across multiple runs, and also hold for lines with several numbers).
So chunks is 1.6x faster than lines which is a bit faster than lazy. This means that conduits can be faster than plain bytestrings, but the overhead of conduit pipes is too heavy when you split chunks into short lines.
What I don't like about chunks approach is that it mixes both conduit and pure worlds, and it makes it harder to use it for more complex tasks.
The question is, did I miss a simple and elegant solution which would allow me to write efficient code in same fashion as with lines approach?
EDIT1: Per #Michael's suggestion I joined two mapC into one mapC (("000" ++). (snoc10)) in lines solution, to make number of pipes (.|) same between lines and chunks. This made it perform a bit better (down from 3.3s to 2.8s), but still significantly slower than chunks.
Also I tried older Conduit.Binary.lines which Michael suggested in comments, and it also improves performance a bit, by ~0.1s.
EDIT2: Fixed lineChunksC so it works with very small chunks, e.g.
> runConduitPure $ yield ("\nr\n\n"::ByteString) .| concatC .| mapC singleton .| lineChunksC .| sinkList
["\n","r\n","\n"]
My guess is that, for "lines", the mapC ("000" ++) .| mapC (`snoc` 10) part is doing a lot of work.
Concatenating several strict ByteStrings into another strict ByteString is expensive. Concatenating them into a lazy ByteString tends to be more efficient.
To avoid this cost, you can yield each part individually downstream as a strict ByteString (but be aware that then we aren't talking about "lines" anymore).
Alternatively, yield each transformed line as a lazy ByteString downstream.
The question is, did I miss a simple and elegant solution which would
allow me to write efficient code in same fashion as with lines
approach?
Some streaming libraries have an interesting feature: you can delimit lines in the stream, and manipulate them, without the need to materialize entire lines in memory at any point.
Here I'm using the streaming and streaming-bytestring packages, because I'm more familiar with them.
In module Data.ByteString.Streaming.Char8 of streaming-bytestring, we have the lines function:
lines :: Monad m => ByteString m r -> Stream (ByteString m) m r
lines turns a ByteString into a connected stream of ByteStrings at
divide at newline characters. The resulting strings do not contain
newlines. This is the genuinely streaming lines which only breaks
chunks, and thus never increases the use of memory.
The gist of it is that ByteString m r is already a streaming type! So this version of lines transforms a stream into a "stream of streams". And we can only reach the "next stream" (next line) by exhausting the "current stream" (current line).
Your "lines" example could be written as:
{-# language OverloadedStrings #-}
module Main where
import Control.Applicative ((*>))
import Streaming
import qualified Streaming.Prelude as S
import qualified Data.ByteString.Streaming.Char8 as Q
main :: IO ()
main = Q.stdout
. Q.unlines
. S.maps (\line -> "000" *> line)
. Q.lines
$ Q.stdin
I'm trying to make a conduit that's sort of a cross between takeWhile and isolate. That is, it will consume from the input and yield to the output until either the predicate no longer holds or it has reached the byte limit. I know the type signature will be
isolateWhile :: (Monad m) => Int -> (Word8 -> Bool) -> Conduit ByteString m ByteString
As an example of its use:
{-# LANGUAGE OverloadedStrings #-}
import Data.Conduit
import qualified Data.Conduit.List as CL
import qualified Data.Conduit.Binary as CB
import Control.Monad.Trans.Class
charToWord = fromIntegral . fromEnum
example :: Int -> Char -> IO ()
example limit upTo = do
untaken <- CB.sourceLbs "Hello, world!" $= conduit $$ CB.sinkLbs
putStrLn $ "Left " ++ show untaken
where
conduit = do
taken <- toConsumer $ isolateWhile limit (/= charToWord upTo) =$ CB.sinkLbs
lift $ putStrLn $ "Took " ++ show taken
CL.map id -- pass the rest through untouched
I expect that
ghci> example 5 'l'
Took "He"
Left "llo, world!"
ghci> example 5 'w'
Took "Hello"
Left ", world!"
However, the simplest possible definition of isolateWhile:
isolateWhile limit pred = CB.isolate limit =$= CB.takeWhile pred
yields
ghci> example 5 'l'
Took "He"
Left ", world!"
ghci> example 5 'w'
Took "Hello"
Left ", world!"
In other words, isolate will eat up the entire Hello, leaving He to takeWhile and discarding the llo. This data loss is undesirable for my application. However, it is notable that the second case yields the expected result.
If I swap the operands of =$= like so:
isolateWhile limit pred = CB.takeWhile pred =$= CB.isolate limit
Then
ghci> example 5 'l'
Took "He"
Left ", world!"
ghci> example 5 'w'
Took "Hello"
Left ""
Now I've fixed the first test, but broken the second one! This time, takeWhile will take whatever it needs and isolate will take a subset of that; but whatever takeWhile uses that isolate doesn't will be discarded, and this is undesirable.
Lastly, I tried:
isolateWhile limit pred = do
untaken <- CB.isolate limit =$= (CB.takeWhile pred >> CL.consume)
mapM_ leftover $ reverse untaken
This actually works! Whatever isolate accepts and takeWhile doesn't is consumed by the CL.consume and placed back into the stream with leftover. Unfortunately, this seems like a horrible kludge, and undesirably (although not unusably so) it will buffer up to limit bytes in memory only to put it back with leftover. That seems like a waste.
The only solution I can think of is to write it in terms of the primitives await, yield and leftover as takeWhile and isolate are themselves written. While this would solve all the problems without wasting much, it seems like there must be a better way.
Am I missing something, or is there really no better way to write this?
There's a known limitation in the current version of conduit: fusion always discards downstream leftovers, which is exactly what you're running into here. There are some discussions right now about an architecture to resolve this, but for the moment, writing your function in terms of the primitives is likely your best option.
I am trying to parse an input stream where the first line tells me how many lines of data there are. I'm ending up with the following code, and it works, but I think there is a better way. Is there?
main = do
numCases <- getLine
proc $ read numCases
proc :: Integer -> IO ()
proc numCases
| numCases == 0 = return ()
| otherwise = do
str <- getLine
putStrLn $ findNextPalin str
proc (numCases - 1)
Note: The code solves the Sphere problem https://www.spoj.pl/problems/PALIN/ but I didn't think posting the rest of the code would impact the discussion of what to do here.
Use replicate and sequence_.
main, proc :: IO ()
main = do numCases <- getLine
sequence_ $ replicate (read numCases) proc
proc = do str <- getLine
putStrLn $ findNextPalin str
sequence_ takes a list of actions, and runs them one after the other, in sequence. (Then it throws away the results; if you were interested in the return values from the actions, you'd use sequence.)
replicate n x makes a list of length n, with each element being x. So we use it to build up the list of actions we want to run.
Dave Hinton's answer is correct, but as an aside here's another way of writing the same code:
import Control.Applicative
main = (sequence_ . proc) =<< (read <$> getLine)
proc x = replicate x (putStrLn =<< (findNextPalin <$> getLine))
Just to remind everyone that do blocks aren't necessary! Note that in the above, both =<< and <$> stand in for plain old function application. If you ignore both operators, the code reads exactly the same as similarly-structured pure functions would. I've added some gratuitous parentheses to make things more explicit.
Their purpose is that <$> applies a regular function inside a monad, while =<< does the same but then compresses an extra layer of the monad (e.g., turning IO (IO a) into IO a).
The interesting part of looking at code this way is that you can mostly ignore where the monads and such are; typically there's very few ways to place the "function application" operators to make the types work.
You (and the previous answers) should work harder to divide up the IO from the logic. Make main gather the input and separately (purely, if possible) do the work.
import Control.Monad -- not needed, but cleans some things up
main = do
numCases <- liftM read getLine
lines <- replicateM numCases getLine
let results = map findNextPalin lines
mapM_ putStrLn results
When solving SPOJ problems in Haskell, try not to use standard strings at all. ByteStrings are much faster, and I've found you can usually ignore the number of tests and just run a map over everything but the first line, like so:
{-# OPTIONS_GHC -O2 -optc-O2 #-}
import qualified Data.ByteString.Lazy.Char8 as BS
main :: IO ()
main = do
(l:ls) <- BS.lines `fmap` BS.getContents
mapM_ findNextPalin ls
The SPOJ page in the Haskell Wiki gives a lot of good pointers about how to read Ints from ByteStrings, as well as how to deal with a large quantities of input. It'll help you avoid exceeding the time limit.