What's preventing IO flush in my code? - haskell

I tried solving this problem, and the following is my attempt.
When I test it in ghci with hSetBuffering stdout NoBuffering, running solveAct 1 with the input 15 10, ghci shows a few lines of results, blocks for a long time, and then shows the rest of the results all at once.
How can I see the intermediate results in real time?
import Control.Monad
import Data.List
import Data.Maybe
import System.IO
readInts = fmap read . words <$> getLine :: IO [Int]
main = do
    t <- readLn :: IO Int
    hSetBuffering stdout NoBuffering
    sequence_ $ solveAct <$> [1..t]
showTable x = intercalate "\n" $ intercalate " " . fmap show <$> x
solveAct i = do
    [j, n] <- readInts
    putStrLn $ "Case #" ++ show i ++ ":"
    putStrLn $ showTable (take n $ solve (j-1))
digits n = [[x ^ y | y <- [1..n-1]] | x <- [2..10]]
primes = 2 : [x | x <- [3,5..], all (\y -> x `rem` y /= 0) $ takeWhile (<= intSqrt x) primes]
intSqrt = floor . sqrt . fromIntegral
getNDivisor n = listToMaybe [x | x <- takeWhile (<= intSqrt n) primes, n `rem` x == 0]
casesOfMat = subsequences . transpose . digits
casesOfJam n = fmap ([1 + x^n | x <- [2..10]]:) $ casesOfMat n
eachBaseReps n = fmap sum . transpose <$> casesOfJam n
solve :: Int -> [[Int]]
solve n = do
    decimals <- eachBaseReps n
    let divs = getNDivisor <$> decimals
    guard $ all isJust divs
    return $ last decimals : catMaybes divs

You are seeing the results in real time. It's just that the computation of all isJust . map getNDivisor takes a long time for the third element of eachBaseReps 14.
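The effect can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical slowCheck in place of getNDivisor (it is not taken from the question): with NoBuffering each line appears as soon as it has been computed, so a pause between lines is time spent evaluating the next list element, not a delayed flush.

```haskell
import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

-- Hypothetical slow test (an assumption for illustration): counts
-- divisors by brute force, so large inputs take a while to evaluate.
slowCheck :: Int -> Bool
slowCheck n = length [d | d <- [1 .. n], n `rem` d == 0] > 2

main :: IO ()
main = do
  hSetBuffering stdout NoBuffering
  -- 4 and 6 are printed at once; the pause that follows is the time
  -- spent evaluating slowCheck 5000001, not output waiting in a buffer.
  mapM_ print (filter slowCheck [4, 6, 5000001, 8])
```

How long the pause lasts depends on the machine; the point is only that the list is produced lazily, one element at a time.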

Related

De-sugaring Do-Notation Function

I am trying to de-sugar a function that uses do notation, but I'm struggling to convert it to use only >>= and lambdas. Any help appreciated.
This function takes a filepath, reads it, turns it into a list, then it takes the resulting list, splits it in half, based on its length, and lastly writes two files one taking the first half of the list, and one file consisting of the second half of the list. That is the idea anyway.
splitFile :: FilePath -> IO ()
splitFile file = do
    x <- readFileUTF8 file
    let y = splitAt (div (length $ lines x) 2) $ lines x
    writeFile "/tmp/foo.txt-part1" $ unlines $ fst y
    writeFile "/tmp/foo.txt-part2" $ unlines $ snd y
splitFile :: FilePath -> IO ()
splitFile file =
    readFileUTF8 file >>= \x ->
    let y = splitAt (div (length $ lines x) 2) $ lines x in
    (writeFile "/tmp/foo.txt-part1" $ unlines $ fst y) >>
    (writeFile "/tmp/foo.txt-part2" $ unlines $ snd y)
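This version still uses >>, but you can replace >> .. with >>= \_ -> ...
You can also turn y into a pattern match. A pattern bound with let is already lazy; if you bind it with a lambda or case instead, you need a lazy pattern ~(a, b) to be completely faithful to the original.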
..
    let (a, b) = splitAt (div (length $ lines x) 2) $ lines x in
    writeFile "/tmp/foo.txt-part1" (unlines a) >>
    writeFile "/tmp/foo.txt-part2" (unlines b)
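For completeness, here is a sketch that uses only >>= and lambdas, with no >> and no let. It makes two assumptions: readFile from the Prelude stands in for readFileUTF8, and halves and splitFile' are hypothetical names introduced for this example. The lambda takes a lazy pattern, which matches the laziness of the original let binding.

```haskell
-- Hypothetical pure helper: split text into two halves by line count.
halves :: String -> (String, String)
halves x = (unlines a, unlines b)
  where
    ls = lines x
    (a, b) = splitAt (length ls `div` 2) ls

-- Desugared form using only >>= and lambdas.
-- readFile is an assumed stand-in for readFileUTF8.
splitFile' :: FilePath -> IO ()
splitFile' file =
  readFile file >>= \x ->
    (\ ~(a, b) ->
        writeFile "/tmp/foo.txt-part1" a >>= \_ ->
        writeFile "/tmp/foo.txt-part2" b
    ) (halves x)
```

Keeping the split in a pure function also makes it easy to check in ghci without touching the file system.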

Parallelizing evaluation and pointwise addition of lists

I have the following code:
normalHands :: [[[Char]]]
normalHands = -- a very long list
sevenHeads :: [[[Char]]]
sevenHeads = -- a very long list
countYakus :: [Int]
countYakus = foldr countYaku [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] (normalHands ++ sevenHeads) where
    countYaku hand [p,oP,dT,aS,cS,fS,sO,cF,cT,dM,aO,tOAK,tP,sT,sF,f] =
        -- involves pointwise addition of lists. The length of the accumulator is not changed.
How should I parallelize this? I tried the following:
import Control.Concurrent
import Control.Concurrent.QSem
import Control.Monad
import Data.List
--...
main :: IO ()
main = let
    [pinfu, onePair, dragonTriplet, allSimples,
        colorfulSequences, fullSequence, semiOrphans, concealedFour, colorfulTriplets, dragonsMinor, allOrphans, threeOfAKind,
        twoPairs, semiTerminals, semiFlush, flush] = countYakus
    in do
        qSem <- newQSem 0
        forkIO $ putStrLn ("0: " ++ show pinfu ++ ' ' : show onePair) >> signalQSem qSem
        forkIO $ putStrLn ("1: " ++ show dragonTriplet ++ ' ' : show allSimples) >> signalQSem qSem
        forkIO $ putStrLn ("2: " ++ show colorfulSequences ++ ' ' : show fullSequence) >> signalQSem qSem
        forkIO $ putStrLn ("3: " ++ show semiOrphans ++ ' ' : show concealedFour) >> signalQSem qSem
        forkIO $ putStrLn ("4: " ++ show colorfulTriplets ++ ' ' : show dragonsMinor) >> signalQSem qSem
        forkIO $ putStrLn ("5: " ++ show allOrphans ++ ' ' : show threeOfAKind) >> signalQSem qSem
        forkIO $ putStrLn ("6: " ++ show twoPairs ++ ' ' : show semiTerminals) >> signalQSem qSem
        forkIO $ putStrLn ("7: " ++ show semiFlush ++ ' ' : show flush) >> signalQSem qSem
        sequence_ $ replicate 8 (waitQSem qSem)
I compiled this with -O2 +RTS -N8, but although I have 8 cores, my system monitor clearly shows that this code runs on only one core. I guess it's because of normalHands ++ sevenHeads. So what is the correct way to parallelize this?
EDIT: Exploiting the associativity and commutativity of pointwise addition, I tried this:
countYakus :: [[[Char]]] -> [Int]
countYakus = foldl' countYaku [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] where
    -- ...
divideList :: Int -> [a] -> [[a]]
divideList 0 _ = []
divideList n xs = let
    (ys,zs) = splitAt n xs
    in if null zs
        then coZip ys (replicate n [])
        else coZip ys (divideList n zs)
    where
        coZip :: [a] -> [[a]] -> [[a]]
        coZip [] yss = yss
        coZip (x:xs) (ys:yss) = (x:ys) : coZip xs yss
main :: IO ()
main = do
    acc <- newIORef [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    mapM_ (\hands -> forkIO . atomicModifyIORef' acc $ \cs -> (zipWith (+) cs $ countYakus hands, ())) $ divideList 8 (normalHands ++ sevenHeads)
    cs <- readIORef acc
    mapM_ (putStrLn . show) cs
But it still runs only on one core.
EDIT 2: I tried using MVar:
main :: IO ()
main = do
    acc <- newIORef [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    mVar <- newEmptyMVar
    mapM_ (\hands -> forkIO $ putMVar mVar (countYakus hands)) $ divideList 8 (normalHands ++ sevenHeads)
    replicateM_ 8 $ do
        xs <- takeMVar mVar
        ys <- readIORef acc
        writeIORef acc (zipWith (+) xs ys)
    cs <- readIORef acc
    mapM_ (putStrLn . show) cs
But it still runs on only one core.
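Two things may be worth checking here, offered as assumptions about the setup rather than a confirmed diagnosis. First, GHC only uses several cores when the program is linked with -threaded and -N is passed to the program itself at run time; +RTS -N8 on the compiler command line configures the compiler, not the resulting binary. Second, putMVar mVar (countYakus hands) stores an unevaluated thunk, so the counting may end up being done by the main thread when it adds the results. The sketch below forces each result inside its worker thread before handing it over. It uses only base, and countChunk, forceList and parCount are hypothetical stand-ins for the real code.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Control.Monad (forM_, replicateM)
import Data.List (foldl')

-- Force every element of a list to weak head normal form.
forceList :: [a] -> ()
forceList = foldr seq ()

-- Hypothetical stand-in for countYakus: pointwise tallies over a chunk,
-- forcing the accumulator at each step so no thunks pile up.
countChunk :: [[Int]] -> [Int]
countChunk = foldl' step [0, 0, 0]
  where
    step acc hand =
      let acc' = zipWith (+) acc hand
      in forceList acc' `seq` acc'

-- Count each chunk in its own thread. The worker evaluates its result
-- before putMVar, so the main thread only adds finished tallies.
parCount :: [[[Int]]] -> IO [Int]
parCount chunks = do
  mVar <- newEmptyMVar
  forM_ chunks $ \chunk -> forkIO $ do
    let r = countChunk chunk
    _ <- evaluate (forceList r)
    putMVar mVar r
  results <- replicateM (length chunks) (takeMVar mVar)
  return (foldl' (zipWith (+)) [0, 0, 0] results)

main :: IO ()
main = parCount [[[1, 0, 2]], [[0, 3, 1], [1, 1, 1]]] >>= print
```

To try it on several cores, build with something like ghc -O2 -threaded and run the binary with +RTS -N8. Results arrive in arbitrary order, which is fine because pointwise addition is commutative.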

Print List of Lists without brackets

I am trying to print Pascal's triangle up to some arbitrary row. After some thought I came up with this solution:
next xs = zipWith (+) ([0] ++ xs) (xs ++ [0])
pascal n = take n (iterate next [1])
main = do
    n <- readLn :: IO Int
    mapM_ putStrLn $ map show $ pascal n
Which works quite well, except for the printing. When I apply pascal 4 I get:
[1]
[1,1]
[1,2,1]
[1,3,3,1]
When what I really want is this:
1
1 1
1 2 1
1 3 3 1
Is there any way I can do this?
Define your own pretty-printing function:
import Data.List (intercalate)
show' :: Show a => [a] -> String
show' = intercalate " " . map show
You could use unwords and unlines:
import Data.List
...
putStr $ unlines $ map (unwords . map show) $ pascal n
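Putting the pieces together, a small self-contained sketch might look like this. The name render is introduced here for illustration, and the row count is fixed at 4 instead of being read from stdin.

```haskell
-- Next row of Pascal's triangle from the current one.
next :: [Integer] -> [Integer]
next xs = zipWith (+) (0 : xs) (xs ++ [0])

-- The first n rows.
pascal :: Int -> [[Integer]]
pascal n = take n (iterate next [1])

-- Render rows as space-separated numbers, one row per line.
-- (render is a hypothetical helper name, not from the answers.)
render :: [[Integer]] -> String
render = unlines . map (unwords . map show)

main :: IO ()
main = putStr (render (pascal 4))
```

Because unlines already ends every row with a newline, putStr is used instead of putStrLn to avoid a trailing blank line.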

Why does my program use so much memory?

For just a 25mb file the memory usage is constant at 792mb! I thought it had to do with my usage of lists, but moving certain parts of the code to vectors (the arrays where fft is applied, for example) didn't change how much memory is being used at all!
{-# LANGUAGE OverloadedStrings,BangPatterns #-}
import qualified Data.Attoparsec.Char8 as Ap
import Data.Attoparsec
import Control.Monad
import Control.Applicative
--import Control.DeepSeq (force)
import System.IO
import System.Environment
import Data.List (zipWith4,unzip4,zip4,foldl')
import Data.Bits
import Data.Complex
import Data.String (fromString)
import Data.ByteString.Internal
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as Bl
import qualified Data.Vector.Unboxed as Vu
import qualified Statistics.Transform as St
{-
I run a test on a collection of data from a file
[(1,t),(2,t),(3,t),(4,t),(5,t)]
- - -
| - - -
| | - - -
| | |
[y++t, n, y++t]
To do that, I use splitN to create a list of list
[[(1,t),(2,t),(3,t)],[(2,t),(3,t),(4,t)],[(3,t),(4,t),(5,t)]]
Map a serie of functions to determine a value for each inner collection,
and return when an event happened.
-}
data FourD b a = FourD a a a b
instance Functor (FourD c) where
fmap f (FourD x y z d) = FourD (f x) (f y) (f z) d
mgrav_per_bit = [ 18, 36, 71, 143, 286, 571, 1142 ]
--Converting raw data to mg
aToG :: Int -> Double
aToG a = fromIntegral . sign $ uresult
    where
        twocomp = if a>128
            then 256-a
            else a
        uresult = sum $ zipWith (*) mgrav_per_bit (map (fromEnum . testBit twocomp) [0..7])
        sign = if a > 128
            then negate
            else id
--Data is (int,int,int,time)
--Converted to (St.CD^3,Bytestring) in place of maping afterwards.
parseAcc :: Parser (FourD B.ByteString St.CD)
parseAcc = do Ap.char '('
              x <- fmap ((:+0) . aToG) Ap.decimal
              Ap.char ','
              y <- fmap ((:+0) . aToG) Ap.decimal
              Ap.char ','
              z <- fmap ((:+0) . aToG) Ap.decimal
              Ap.char ','
              time <- takeTill (== 41)
              Ap.char ')'
              return $! FourD x y z time
--applies parseAcc to many lines, fails at the end of file (Need to add a newline)
parseFile = many $ parseAcc <* (Ap.endOfInput <|> Ap.endOfLine)
readExpr input = case parse parseFile input of
    Done b val -> val
    Partial p -> undefined
    Fail a b c -> undefined
unType (FourD x y d z) = (x ,y ,d ,z)
-- Breaks a list of FourD into smaller lists, apply f and g to those lists, then filter the result based if an even happened or not
amap :: (Num c, Ord c) => ([a] -> [c]) -> ([d] -> [ByteString]) -> [FourD d a] -> [Bl.ByteString]
amap f g = (uncurry4 (zipWith4 (filterAcc))). map4 f g . unzip4 . map (unType)
    where map4 f g (a,b,c,d) = (f a,f b,f c,g d)
          uncurry4 f (a,b,c,d) = f a b c d
-- before i had map filterAcc,outside amap. Tried to fuse everything to eliminate intermediaries
-- An event is detected if x > 50
filterAcc x y z t = if x > 50
    then (Bl.pack . B.unpack) $ "yes: " `B.append` t
    else ""
-- split [St.CD] in [(Vector St.CD)], apply fft to each, and compress to a single value.
-- Core of the application
fftAcross :: [St.CD] -> [Int]
fftAcross = map (floor . noiseEnergy . St.fft) . splitN 32
-- how the value is determined (sum of all magnitudes but the first one)
noiseEnergy :: (RealFloat a, Vu.Unbox a) => Vu.Vector (Complex a) -> a
noiseEnergy x = (Vu.foldl' (\b a-> b+(magnitude a)) 0 (Vu.drop 1 x))/32
-- how the values are split in (Vector St.CD), if lenght > 32, takes 32, otherwhise I'm done
splitN :: Vu.Unbox a => Int -> [a] -> [Vu.Vector a]
splitN n x = helper x
    where
        helper x = if atLeast n x
            then (Vu.take n (Vu.fromList x)) : (helper (drop 1 x) )
            else []
-- Replacing the test by atLeast in place of a counter (that compared to length x,calculated once) reduced the behaviour that memory usage was constant.
-- this is replicated so the behaviour of splitN happens on the time part of FourD, Can't use the same since there is no Vector Bytestring instance
splitN2 n x = helper x
    where
        helper x = if atLeast n x
            then (head x) : (helper (drop 1 x))
            else []
atLeast :: Int -> [a] -> Bool
atLeast 0 _ = True
atLeast _ [] = False
atLeast n (_:ys) = atLeast (n-1) ys
main = do
    filename <- liftM head getArgs
    filehandle <- openFile "results.txt" WriteMode
    contents <- liftM readExpr $ B.readFile filename
    Bl.hPutStr (filehandle) . Bl.unlines . splitAndApplyAndFilter $ contents where
        splitAndApplyAndFilter = amap fftAcross (splitN2 32)
Edit: after some refactoring, fusing some maps, and reducing length, I got this down to about 400mb with a 25mb input file. Still, on a 100mb file, it takes 1.5gb.
The program is intended to determine whether a certain event happened at a point in time. For that it requires a collection of values (I'm using 32 at the moment), runs an FFT on it, sums the resulting values, and checks whether the sum passes a threshold. If it does, it prints the time to a file.
http://db.tt/fT8kXPKz for a 25mb testfile
I found the solution thanks to a topic on reddit about the same problem!
Parsing with Haskell and Attoparsec
The great majority of my problem was caused by the fact that attoparsec is strict and Haskell data is rather large (so a 100mb text file can actually take up much more at run time).
The rest was that profiling doubles the memory use, and I didn't account for that.
After changing the parser to be lazy, my program uses 120mb in place of 800mb (when input size is 116mb), so success!
In case this interests someone, here is the relevant code change:
readExpr input = case parse (parseAcc<*(Ap.endOfLine<*Ap.endOfInput<|>Ap.endOfLine)) input of
    Done b val -> val : readExpr b
    Partial e -> []
    Fail _ _ c -> error c
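The idea behind this change, parsing one record at a time and returning the results as a lazily built list, can be sketched without attoparsec. The example below uses only base and String, so it is an analogue of the technique and not the author's parser; Rec, parseRec and parseAll are hypothetical names.

```haskell
-- A record of three readings and a timestamp, mirroring (int,int,int,time).
data Rec = Rec Int Int Int String deriving (Eq, Show)

-- Parse one line such as "(1,2,3,12:00)"; Nothing on malformed input.
parseRec :: String -> Maybe Rec
parseRec s0 = do
  ('(' : s1) <- Just s0
  (x, ',' : s2) <- firstInt s1
  (y, ',' : s3) <- firstInt s2
  (z, ',' : s4) <- firstInt s3
  (t, ")") <- Just (break (== ')') s4)
  Just (Rec x y z t)
  where
    -- Read a leading Int and return it with the unparsed remainder.
    firstInt :: String -> Maybe (Int, String)
    firstInt s = case reads s of
      [(v, rest)] -> Just (v, rest)
      _ -> Nothing

-- Parse records one at a time. Each record is produced only when the
-- consumer asks for it, and parsing stops at the first bad line.
parseAll :: String -> [Rec]
parseAll input = go (lines input)
  where
    go (l : ls) | Just r <- parseRec l = r : go ls
    go _ = []
```

Since the result list is built on demand, take 2 (parseAll (cycle "(1,2,3,12:00)\n")) terminates even though the input is infinite, which is the property that lets a consumer process records without holding the whole parsed file in memory.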
The full code:
{-# LANGUAGE OverloadedStrings,BangPatterns #-}
import qualified Data.Attoparsec.Char8 as Ap
import Data.Attoparsec
import Control.Monad
import Control.Applicative
--import Control.DeepSeq (force)
import System.IO
import System.Environment
import Data.List (zipWith4,unzip4,zip4,foldl')
import Data.Bits
import Data.Complex
import Data.String (fromString)
import Data.ByteString.Internal
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as Bl
import qualified Data.Vector.Unboxed as Vu
import qualified Statistics.Transform as St
{-
I run a test on a collection of data from a file
[(1,t),(2,t),(3,t),(4,t),(5,t)]
- - -
| - - -
| | - - -
| | |
[y++t, n, y++t]
To do that, I use splitN to create a list of list
[[(1,t),(2,t),(3,t)],[(2,t),(3,t),(4,t)],[(3,t),(4,t),(5,t)]]
Map a serie of functions to determine a value for each inner collection,
and return when an event happened.
-}
data FourD b a = FourD a a a b
instance Functor (FourD c) where
fmap f (FourD x y z d) = FourD (f x) (f y) (f z) d
mgrav_per_bit = [ 18, 36, 71, 143, 286, 571, 1142 ]
--Converting raw data to mg
aToG :: Int -> Double
aToG a = fromIntegral . sign $ uresult
    where
        twocomp
            | a>128 = 256-a
            | otherwise = a
        uresult = sum $ zipWith (*) mgrav_per_bit (map (fromEnum . testBit twocomp) [0..7])
        sign
            | a > 128 = negate
            | otherwise = id
--Data is (int,int,int,time)
--Converted to (St.CD^3,Bytestring) in place of maping afterwards.
parseAcc :: Parser (FourD B.ByteString St.CD)
parseAcc = do Ap.char '('
              x <- fmap ((:+0) . aToG) Ap.decimal -- Parse, transform to mg, convert to complex
              Ap.char ','
              y <- fmap ((:+0) . aToG) Ap.decimal
              Ap.char ','
              z <- fmap ((:+0) . aToG) Ap.decimal
              Ap.char ','
              time <- takeTill (== 41)
              Ap.char ')'
              return $! FourD x y z time
--applies parseAcc to many lines, fails at the end of file (Need to add a newline)
parseFile = many $ parseAcc <* (Ap.endOfInput <|> Ap.endOfLine)
readExpr input = case parse (parseAcc<*(Ap.endOfLine<*Ap.endOfInput<|>Ap.endOfLine)) input of
    Done b val -> val : readExpr b
    Partial e -> []
    Fail _ _ c -> error c
unType (FourD x y d z) = (x ,y ,d ,z)
-- Breaks a list of FourD into smaller lists, apply f and g to those lists, then filter the result based if an even happened or not
amap :: (Num c, Ord c) => ([a] -> [c]) -> ([d] -> [ByteString]) -> [FourD d a] -> [ByteString]
amap f g = (uncurry4 (zipWith4 (filterAcc))). map4 f g . unzip4 . map (unType)
    where map4 f g (a,b,c,d) = (f a,f b,f c,g d)
          uncurry4 f (a,b,c,d) = f a b c d
-- before i had map filterAcc,outside amap. Tried to fuse everything to eliminate intermediaries
-- An event is detected if x > 50
filterAcc x y z t
    | x > 50 = t
    | otherwise = ""
-- split [St.CD] in [(Vector St.CD)], apply fft to each, and compress to a single value.
-- Core of the application
fftAcross :: [St.CD] -> [Int]
fftAcross = map (floor . noiseEnergy . St.fft) . splitN 32
-- how the value is determined (sum of all magnitudes but the first one)
noiseEnergy :: (RealFloat a, Vu.Unbox a) => Vu.Vector (Complex a) -> a
noiseEnergy x = (Vu.foldl' (\b a-> b+(magnitude a)) 0 (Vu.drop 1 x))/32
-- how the values are split in (Vector St.CD), if lenght > 32, takes 32, otherwhise I'm done
splitN :: Vu.Unbox a => Int -> [a] -> [Vu.Vector a]
splitN n x = helper x
    where
        helper x
            | atLeast n x = (Vu.take n (Vu.fromList x)) : (helper (drop 1 x) )
            | otherwise = []
-- Replacing the test by atLeast in place of a counter (that compared to length x,calculated once) reduced the behaviour that memory usage was constant.
-- this is replicated so the behaviour of splitN happens on the time part of FourD, Can't use the same since there is no Vector Bytestring instance
splitN2 n x = helper x
    where
        helper x
            | atLeast n x = (head x) : (helper (drop 1 x))
            | otherwise = []
atLeast :: Int -> [a] -> Bool
atLeast 0 _ = True
atLeast _ [] = False
atLeast n (_:ys) = atLeast (n-1) ys
intervalFinder :: [ByteString]->[B.ByteString]
intervalFinder x = helper x ""
    where
        helper (x:xs) ""
            | x /= "" = ("Start Time: " `B.append` x `B.append` "\n"):(helper xs x)
            | otherwise = helper xs ""
        helper (x:xs) y
            | x == "" = ( "End Time: "`B.append` y `B.append` "\n\n" ):(helper xs "")
            | otherwise = helper xs x
        helper _ _ = []
main = do
    filename <- liftM head getArgs
    filehandle <- openFile "results.txt" WriteMode
    contents <- liftM readExpr $ B.readFile filename
    Bl.hPutStr (filehandle) . Bl.fromChunks . intervalFinder . splitAndApplyAndFilter $ contents
    hClose filehandle
  where
    splitAndApplyAndFilter = amap fftAcross (splitN2 32)
--contents <- liftM ((map ( readExpr )) . B.lines) $ B.readFile filename
{- *Main> let g = liftM ((amap fftAcross (splitN2 32)) . readExpr) $ B.readFile "testpattern2.txt"
-}
-- B.hPutStrLn (filehandle) . B.unlines . map (B.pack . show ) . amap (map (floor .quare) . (filter (/=[])) . map ( (drop 1) . (map (/32)) . fft ) . splitN 32) . map ( fmap(fromIntegral . aToG)) . map readExpr $ contents

Concatenating a list of numbers into one integer in haskell

I am doing yet another Project Euler problem - Problem 38.
I have this function which returns a list of numbers but what I need is that list of numbers to be one number. It calculates the concatenated product of an integer.
f (a,b) = a*b
conProInt x n = map f (zip (replicate n x) ([1..n]))
prob38 = maximum [ (conProInt (x) (n)) | x <- [100..500], n <- [1..9], (sort $ nub $ (decToList $ (conProInt x n) )) == (sort $ (decToList $ (conProInt x n) )), (sort $ nub $ (decToList $ (conProInt x n))) == [1..9] ]
eg:
conProInt 192 3
returns:
[192,384,576]
what I need returned is:
192384576
I have searched around and can't find a function or think of a function I could construct that would deliver what I need. How would I go about this?
EDIT:
I have updated the script to incorporate faster concatenation, but it doesn't return the correct result:
f (a,b) = a*b
conProInt x n =( combine (map f (zip (replicate n x) ([1..n]))))
prob38 = maximum [ (conProInt (x) (n)) | x <- [1..50000], n <- [2..40], (sort $ nub $ (decToList $ (conProInt x n) )) == (sort $ (decToList $ (conProInt x n) )), (sort $ nub $ (decToList $ (conProInt x n))) == [1..9] ]
I'm pretty sure the pandigital test
(sort $ nub $ (decToList $ (conProInt x n) )) == (sort $ (decToList $ (conProInt x n) )), (sort $ nub $ (decToList $ (conProInt x n))) == [1..9]
won't fail. I tried to make the search as large as possible, but the maximum 9-pandigital I got was 986315724. Any suggestions? Is the range of values for n a very large one?
Going via Strings is probably easiest:
read $ concat $ map (show) [192,384,576]
Though you'll probably need to add a type signature:
Prelude> (read $ concat $ map (show) [192,384,576]) :: Int
192384576
You can use this function to concatenate a list of numbers:
concatNumbers :: [Int] -> String
concatNumbers = concat . map show
If you want the function to return the concatenation as a number, you can use read.
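As a sketch of that last step, the two pieces can be combined into one function. The name concatToInt is introduced here for illustration, and the result is assumed to fit in an Int.

```haskell
-- Concatenate the decimal digits of each number, then read the result
-- back as a number. concatToInt is a hypothetical name; read fails on
-- an empty list, and large results would need Integer instead of Int.
concatToInt :: [Int] -> Int
concatToInt = read . concatMap show

main :: IO ()
main = print (concatToInt [192, 384, 576])
```

Giving the function an explicit type signature removes the need for the :: Int annotation at each call site.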
Here's an example of how to concatenate digits without converting to and from character strings.
-- foldl1' is a strict fold. "foldl1" would also work...
import Data.List (foldl1')
-- Combine two numbers such that their digits are concatenated.
-- op 1 23 = 123, op 0 12 = 12, op 12345 67 = 1234567
op :: Int -> Int -> Int
op a b = a * power 10 (numDigits b) + b
-- How many digits does a positive number have?
numDigits :: Int -> Int
numDigits x = length . takeWhile (>= 1) . iterate (`div` 10) $ x
-- Take a positive number and raise it to a positive power.
-- power 5 2 = 25, power 10 3 = 1000
power :: Int -> Int -> Int
power x y = foldl1' (*) . take y $ repeat x
-- Take a list of numbers, and concatenate all their digits.
combine :: [Int] -> Int
combine xs = foldl1' op xs
example run:
Prelude> :m +Data.List
Prelude Data.List> let power x y = foldl1' (*) . take y $ repeat x
Prelude Data.List> let numDigits = length . takeWhile (>=1) . iterate (`div` 10)
Prelude Data.List> let op a b = a * power 10 (numDigits b) + b
Prelude Data.List> let combine xs = foldl1' op xs
Prelude Data.List> combine [192, 384, 576]
192384576
Prelude Data.List>
