Thought I'd try to implement SHA1 in Haskell myself. I came up with an implementation that compiles and returns the right answer for the null string (""), but nothing else. I can't figure out what might be wrong. Can someone familiar with the algorithm and SHA1 point it out?
import Data.Bits
import Data.Int
import Data.List
import Data.Word
import Text.Printf
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as C
h0 = 0x67452301 :: Word32
h1 = 0xEFCDAB89 :: Word32
h2 = 0x98BADCFE :: Word32
h3 = 0x10325476 :: Word32
h4 = 0xC3D2E1F0 :: Word32
sha1string :: String -> String
sha1string s = concat $ map (printf "%02x") $ sha1 . C.pack $ s
sha1 :: L.ByteString -> [Word8]
sha1 msg = concat [w32ToComps a, w32ToComps b, w32ToComps c, w32ToComps d, w32ToComps e]
  where (a, b, c, d, e) = sha1' msg 0 h0 h1 h2 h3 h4
sha1' msg sz a b c d e
  | L.length m1 < 64 = sha1'last (padded msg sz) a b c d e
  | otherwise = uncurry5 (sha1' m2 (sz + 64)) $ whole a b c d e m1
  where (m1, m2) = L.splitAt 64 msg
sha1'last msg a b c d e
  | m1 == L.empty = (a, b, c, d, e)
  | otherwise = uncurry5 (sha1'last m2) $ whole a b c d e m1
  where (m1, m2) = L.splitAt 64 msg
whole a b c d e msg = partcd (partab msg) a b c d e
partcd ws a b c d e = (h0 + a', h1 + b', h2 + c', h3 + d', h4 + e')
  where
    (a', b', c', d', e') = go ws a b c d e 0
    go ws a b c d e 80 = (a, b, c, d, e)
    go (w:ws) a b c d e t = go ws temp a (rotate b 30) c d (t+1)
      where temp = (rotate a 5) + f t b c d + e + w + k t
partab chunk = take 80 ns
  where
    ns = initial ++ zipWith4 g (drop 13 ns) (drop 8 ns) (drop 2 ns) ns
    g a b c d = rotate (a `xor` b `xor` c `xor` d) 1
    initial = map (L.foldl (\a b -> (a * 256) + fromIntegral b) 0) $ paginate 4 chunk
f t b c d
  | t >= 0 && t <= 19 = (b .&. c) .|. ((complement b) .&. d)
  | t >= 20 && t <= 39 = b `xor` c `xor` d
  | t >= 40 && t <= 59 = (b .&. c) .|. (b .&. d) .|. (c .&. d)
  | t >= 60 && t <= 79 = b `xor` c `xor` d
k t
  | t >= 0 && t <= 19 = 0x5A827999
  | t >= 20 && t <= 39 = 0x6ED9EBA1
  | t >= 40 && t <= 59 = 0x8F1BBCDC
  | t >= 60 && t <= 79 = 0xCA62C1D6
padded msg prevsz = L.append msg (L.pack pad)
  where
    sz = L.length msg
    totalsz = prevsz + sz
    padsz = fromIntegral $ (128 - 9 - sz) `mod` 64
    pad = [0x80] ++ (replicate padsz 0) ++ int64ToComps totalsz
uncurry5 f (a, b, c, d, e) = f a b c d e
paginate n xs
  | xs == L.empty = []
  | otherwise = let (a, b) = L.splitAt n xs in a : paginate n b
w32ToComps :: Word32 -> [Word8]
w32ToComps = integerToComps [24, 16 .. 0]
int64ToComps :: Int64 -> [Word8]
int64ToComps = integerToComps [56, 48 .. 0]
integerToComps :: (Integral a, Bits a) => [Int] -> a -> [Word8]
integerToComps bits x = map f bits
  where f n = fromIntegral ((x `shiftR` n) .&. 0xff) :: Word8
For starters, you appear to be keeping a size count in bytes (see sz + 64), but the count that gets appended should be in bits so you need to multiply by 8 somewhere (incidentally, I suggest you use cereal or binary instead of rolling your own Integer to big endian Word64). This isn't the only problem though.
EDIT: Found It
Ah-ha! Never forget, Wikipedia is written by a bunch of imperative, mutable-world unenlighteneds! You finish each chunk with h0 + a', h1 + b', ... but that should be the old context plus your new values: a + a', b + b', .... Everything checks out after that fix (and the size fix above).
The test code completes now with 5 property tests and 129 KATs succeeding.
End Edit
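For comparison, here is a minimal sketch of SHA-1 with both fixes applied: the appended length is counted in bits, and each block's result is added to the incoming state rather than to the initial constants. It is not the asker's code. The names (sha1Hex, compress, pad, schedule, roundF, roundK) are made up for this sketch, it works on plain lists rather than lazy ByteStrings, and it assumes every Char of the input fits in one byte.

```haskell
import Data.Bits (complement, rotateL, shiftL, shiftR, xor, (.&.), (.|.))
import Data.Char (ord)
import Data.List (foldl', zipWith4)
import Data.Word (Word32, Word64, Word8)
import Numeric (showHex)

type State = (Word32, Word32, Word32, Word32, Word32)

initialState :: State
initialState = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

-- Append 0x80, then zeros up to 56 (mod 64), then the message length in BITS.
pad :: [Word8] -> [Word8]
pad bytes = bytes ++ [0x80] ++ replicate zeros 0 ++ lenBytes
  where
    n = length bytes
    zeros = (55 - n) `mod` 64
    bitLen = 8 * fromIntegral n :: Word64
    lenBytes = [fromIntegral (bitLen `shiftR` s) | s <- [56, 48 .. 0]]

chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf k xs = let (h, t) = splitAt k xs in h : chunksOf k t

-- Big-endian bytes to a 32-bit word.
toWord32 :: [Word8] -> Word32
toWord32 = foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Expand one 64-byte block into the 80-word message schedule.
schedule :: [Word8] -> [Word32]
schedule block = take 80 ws
  where
    ws = map toWord32 (chunksOf 4 block)
           ++ zipWith4 mix (drop 13 ws) (drop 8 ws) (drop 2 ws) ws
    mix a b c d = rotateL (a `xor` b `xor` c `xor` d) 1

roundF :: Int -> Word32 -> Word32 -> Word32 -> Word32
roundF t b c d
  | t < 20 = (b .&. c) .|. (complement b .&. d)
  | t < 40 = b `xor` c `xor` d
  | t < 60 = (b .&. c) .|. (b .&. d) .|. (c .&. d)
  | otherwise = b `xor` c `xor` d

roundK :: Int -> Word32
roundK t
  | t < 20 = 0x5A827999
  | t < 40 = 0x6ED9EBA1
  | t < 60 = 0x8F1BBCDC
  | otherwise = 0xCA62C1D6

-- One block: run 80 rounds from the INCOMING state, then add the result to it.
compress :: State -> [Word8] -> State
compress (s0, s1, s2, s3, s4) block = (s0 + a, s1 + b, s2 + c, s3 + d, s4 + e)
  where
    (a, b, c, d, e) = foldl' step (s0, s1, s2, s3, s4) (zip [0 ..] (schedule block))
    step (a', b', c', d', e') (t, w) =
      (rotateL a' 5 + roundF t b' c' d' + e' + w + roundK t, a', rotateL b' 30, c', d')

-- Zero-padded 8-digit hex for one word.
hex32 :: Word32 -> String
hex32 w = replicate (8 - length h) '0' ++ h
  where h = showHex w ""

sha1Hex :: String -> String
sha1Hex s = concatMap hex32 [a, b, c, d, e]
  where
    bytes = map (fromIntegral . ord) s
    (a, b, c, d, e) = foldl' compress initialState (chunksOf 64 (pad bytes))
```

Because compress takes and returns the whole state, the intermediate values after each block can be printed and compared against another implementation.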
It would help you out a lot if you divided your implementation into the normal initial, update, finalize operations. That way you could compare intermediate results with other implementations.
I just built test code for your implementation using crypto-api-tests. The additional code is below if you're interested, don't forget to install crypto-api-tests.
import Test.SHA
import Test.Crypto
import Crypto.Classes
import Data.Serialize
import Data.Tagged
import Control.Monad
main = defaultMain =<< makeSHA1Tests (undefined :: SHA1)
data SHA1 = SHA1 [Word8]
  deriving (Eq, Ord, Show)
data CTX = CTX L.ByteString
instance Serialize SHA1 where
  get = liftM SHA1 (mapM (const get) [1..20])
  put (SHA1 x) = mapM_ put x
instance Hash CTX SHA1 where
  outputLength = Tagged 160
  blockLength = Tagged (64*8)
  initialCtx = CTX L.empty
  updateCtx (CTX m) x = CTX (L.append m (L.fromChunks [x]))
  finalize (CTX m) b = SHA1 $ sha1 (L.append m (L.fromChunks [b]))
I am trying to calculate 2^(n-1) mod n, but I ran into a problem with Haskell:
parse error (possibly incorrect indentation or mismatched brackets)
|
5 | | (mod e 2 == 0) = md (mod b*b m) (div e 2) m r
| ^
But I don't know what the problem is.
modf :: Int -> Int
modf n = md 2 (n-1) n r
  where
    md b e m r
    | (mod e 2 == 0) = md (mod b*b m) (div e 2) m r
    | otherwise = md (mod b*b m) (div e 2) m (mod r*b m)
The main problem is that the guards are at the same indentation level as the md function itself. You need to indent them by at least one more space:
modf :: Int -> Int
modf n = md 2 (n-1) n r
  where
    md b e m r
      | (mod e 2 == 0) = md (mod b*b m) (div e 2) m r
      | otherwise = md (mod b*b m) (div e 2) m (mod r*b m)
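Now the syntax error has been resolved, but the compiler will still complain that you use r without ever defining it. Note also that function application binds tighter than *, so mod b*b m parses as (mod b) * (b m); you need mod (b*b) m.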
We can implement this more elegantly by:
import Data.Bits(shiftR)
modf :: Int -> Int
modf m = go 2 (m-1)
  where go k 1 = k
        go k n | even n = go2
               | otherwise = mod (go2 * k) m
          where go2 = go (mod (k*k) m) (shiftR n 1)
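As a variation on the same square-and-multiply idea, here is a sketch that also terminates for an exponent of 0 and uses Integer so that k*k cannot overflow. The names powMod and fermat2 are made up for this example, and the sketch assumes a non-negative exponent and a positive modulus.

```haskell
-- | powMod b e m computes b^e `mod` m, assuming e >= 0 and m > 0.
powMod :: Integer -> Integer -> Integer -> Integer
powMod b e m = go (b `mod` m) e 1
  where
    -- acc collects the factors for the odd bits of the exponent.
    go _ 0 acc = acc `mod` m
    go base ex acc
      | even ex   = go base' (ex `div` 2) acc
      | otherwise = go base' (ex `div` 2) ((acc * base) `mod` m)
      where base' = (base * base) `mod` m

-- | 2^(n-1) mod n, the quantity from the question.
fermat2 :: Integer -> Integer
fermat2 n = powMod 2 (n - 1) n
```

For example, fermat2 7 is 1 because 2^6 = 64 = 9*7 + 1, and fermat2 9 is 4 because 2^8 = 256 = 28*9 + 4.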
I implemented the Winograd algorithm in Haskell and, while trying to improve it, decided to make it stricter. That worked, but I do not understand why it became faster. Yesterday I asked a similar question, but the code I posted was not quite correct.
Since that code did not show the problem, I am posting the complete code here.
module Main where
import qualified Data.Vector as V
import qualified Data.Matrix as M
import Control.DeepSeq
import Control.Exception
import System.Clock
import System.Mem
import Data.Time
matrixCtor :: Int -> Int -> Int -> M.Matrix Int
matrixCtor x y size = M.matrix size size $ \(i,j) -> x*i+y*j
winogradLazy :: M.Matrix Int -> M.Matrix Int -> M.Matrix Int
winogradLazy a b = c
  where
    n = M.nrows a
    p = M.ncols a
    m = M.ncols b
    -- Translate into vectors, since indexing in matrices takes longer.
    -- Matrix b is also transposed
    a' = V.generate n $ \i -> M.getRow (i+1) a
    bt' = V.generate m $ \j -> M.getCol (j+1) b
    rows = V.generate n $ \i -> group $ V.unsafeIndex a' i
    cols = V.generate m $ \j -> group $ V.unsafeIndex bt' j
    group x = foldl (groupHelper x) 0 [0,2..p-1]
    groupHelper x acc i = let
        x1 = V.unsafeIndex x (i)
        x2 = V.unsafeIndex x (i+1)
      in
        acc - x1 * x2
    c = M.matrix n m $ \(i,j) ->
      let
        a = V.unsafeIndex rows (i-1) + V.unsafeIndex cols (j-1)
        b = wsum (V.unsafeIndex a' (i-1)) (V.unsafeIndex bt' (j-1))
      in
        a + b
    wsum r c = foldl (wsumHelper r c) 0 [0,2..p-1]
    wsumHelper r c acc i = let
        x1 = V.unsafeIndex r (i)
        x2 = V.unsafeIndex r (i+1)
        y1 = V.unsafeIndex c (i)
        y2 = V.unsafeIndex c (i+1)
      in
        acc +(x1+y2)*(x2+y1)
winogradStrict :: M.Matrix Int -> M.Matrix Int -> M.Matrix Int
winogradStrict a b = c
  where
    n = M.nrows a
    p = M.ncols a
    m = M.ncols b
    -- Translate into vectors, since indexing in matrices takes longer.
    -- Matrix b is also transposed
    a' = V.generate n $ \i -> M.getRow (i+1) a
    bt' = V.generate m $ \j -> M.getCol (j+1) b
    rows = V.generate n $ \i -> group $ V.unsafeIndex a' i
    cols = V.generate m $ \j -> group $ V.unsafeIndex bt' j
    group x = foldl (groupHelper x) 0 [0,2..p-1]
    groupHelper x acc i = let
        x1 = V.unsafeIndex x (i)
        x2 = V.unsafeIndex x (i+1)
      in
        acc - x1 * x2
    c = a' `deepseq` bt' `deepseq` M.matrix n m $ \(i,j) ->
      let
        a = V.unsafeIndex rows (i-1) + V.unsafeIndex cols (j-1)
        b = wsum (V.unsafeIndex a' (i-1)) (V.unsafeIndex bt' (j-1))
      in
        a + b
    wsum r c = foldl (wsumHelper r c) 0 [0,2..p-1]
    wsumHelper r c acc i = let
        x1 = V.unsafeIndex r (i)
        x2 = V.unsafeIndex r (i+1)
        y1 = V.unsafeIndex c (i)
        y2 = V.unsafeIndex c (i+1)
      in
        acc + (x1+y2)*(x2+y1)
lazyTest :: IO ()
lazyTest = do
  let a = matrixCtor 2 (-1) 500
  let b = matrixCtor 2 (-3) 500
  evaluate $ force a
  evaluate $ force b
  start <- getCurrentTime
  let c = winogradLazy a b
  evaluate $ force c
  end <- getCurrentTime
  print (diffUTCTime end start)
strictTest :: IO ()
strictTest = do
  let a = matrixCtor 2 (-1) 500
  let b = matrixCtor 2 (-3) 500
  evaluate $ force a
  evaluate $ force b
  start <- getCurrentTime
  let c = winogradStrict a b
  evaluate $ force c
  end <- getCurrentTime
  print (diffUTCTime end start)
main :: IO ()
main = do
  performMajorGC
  lazyTest
  performMajorGC
  strictTest
In the strict version, before computing the matrix c, I do the following:
a' `deepseq` bt' `deepseq`
and because of this I get the following results:
2.083201s --lazyTest
0.613508s --strictTest
I'm new to Haskell and I don't understand why the compiler won't accept my guard. Here's my code. The guard should fire when b is a divisor of a.
gCF :: Integer -> Integer -> Integer;
gCF n p
  | (p <= 0 || n <= 0) = error "Input should be positive"
  | (p > n) = f p n
  | otherwise = f n p
  where
    f :: Integer -> Integer -> Integer;
    f a b
      | (fromInteger (a `div` b) / 1 == a / b) = b
      | otherwise = f a (b - 1)
Here's the error shown.
testscript.hs:168:28: error:
• No instance for (Fractional Integer) arising from a use of ‘/’
• In the first argument of ‘(==)’, namely
‘fromInteger (a `div` b) / 1’
In the expression: (fromInteger (a `div` b) / 1 == a / b)
In a stmt of a pattern guard for
an equation for ‘f’:
(fromInteger (a `div` b) / 1 == a / b)
|
168 | | (fromInteger (a `div` b) / 1 == a / b) = b
    |   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
I think you are making the function more complex than necessary. Converting numbers between the Integer and floating-point worlds can be dangerous, since it introduces rounding problems.
If I understand it correctly, you want to check whether a is divisible by b. You can check this by verifying that mod a b == 0, which keeps us in the integer world. So we can rewrite the program to:
gCF :: Integer -> Integer -> Integer
gCF n p | p <= 0 || n <= 0 = error "Input should be positive"
        | p > n = f p n
        | otherwise = f n p
    where f a b | mod a b == 0 = b
                | otherwise = f a (b-1)
Since a does not change in the recursive calls, we can factor that out:
gCF :: Integer -> Integer -> Integer
gCF n p | p <= 0 || n <= 0 = error "Input should be positive"
        | otherwise = f (min p n)
    where a = max p n
          f b | mod a b == 0 = b
              | otherwise = f (b-1)
We can also generalize the signature to let it work with any Integral type:
gCF :: Integral i => i -> i -> i
gCF n p | p <= 0 || n <= 0 = error "Input should be positive"
        | otherwise = f (min p n)
    where a = max p n
          f b | mod a b == 0 = b
              | otherwise = f (b-1)
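One caveat: the versions above only test whether b divides the larger argument, exactly as the guard in the question does. If the goal is the greatest common factor (an assumption based on the name gCF), the candidate has to divide both arguments. A minimal sketch of that variant follows; gcfSearch is a made-up name, and for positive inputs it should agree with the Prelude's gcd.

```haskell
-- Downward search for the largest b that divides BOTH n and p.
gcfSearch :: Integral i => i -> i -> i
gcfSearch n p
  | p <= 0 || n <= 0 = error "Input should be positive"
  | otherwise = go (min n p)
  where
    -- Terminates at the latest when b reaches 1, which divides everything.
    go b
      | n `mod` b == 0 && p `mod` b == 0 = b
      | otherwise = go (b - 1)
```

For example, gcfSearch 6 4 gives 2, whereas checking only 6 against candidates starting at 4 stops at 3.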
I found a solution!
gCF :: Integer -> Integer -> Integer;
gCF n p
  | (p <= 0 || n <= 0) = error "Input should be positive"
  | (p > n) = floor (f (fromInteger p) (fromInteger n) (fromInteger n))
  | otherwise = floor (f (fromInteger n) (fromInteger p) (fromInteger p))
  where
    f :: Float -> Float -> Float -> Float;
    f a b c
      | (fromInteger (floor (a / c)) == a / c) && (fromInteger (floor (b / c)) == b / c) = c
      | otherwise = f a b (c - 1)
I wrote a program to perform SHA-1 in Haskell, and while it does produce hashes, they do not match the ones produced by other SHA-1 programs.
Example: cat
hashes to: b5be86bc8bccfc24b01b093228ebb96fc92fa804 but is supposed to hash to 9d989e8d27dc9e0ec3389fc855f142c3d40f0c50
My code is:
(old code omitted)
I have no idea what is wrong. Can someone tell me where I made a mistake?
Edit:
I fixed the stuff that was pointed out, however it is still not working. It works correctly up until the inner loop.
I cleaned up the code so the functions for the inner loop are available as f1, f2 and f3
cat now interestingly hashes to ebe6c9fa1afa0ef5a0ca80bab251fd41cc29127e.
Code:
import Data.Word
import Data.Bits
import Data.Char (ord, intToDigit)
import Data.Binary (encode, decode)
import Numeric (showHex, showIntAtBase)
import System.IO (stdin)
import Data.Sequence ((<|), (|>))
import qualified Data.Sequence as S
import qualified Data.ByteString.Lazy as B
type Quintuple32 = (Word32, Word32, Word32, Word32, Word32)
addQuintuple (a, b, c, d, e) (f, g, h, i, j) =
  (a + f, b + g, c + h, d + i, e + j)
shower :: Quintuple32 -> String
shower (a, b, c, d, e) = concatMap (`showHex` "") [a, b, c, d, e]
hash :: Int -> S.Seq Word32 -> Quintuple32 -> Quintuple32
hash i w h@(a, b, c, d, e)
  | i < 20 = hash (i + 1) w (newhash (f1 h + k1))
  | i < 40 = hash (i + 1) w (newhash (f2 h + k2))
  | i < 60 = hash (i + 1) w (newhash (f3 h + k3))
  | i < 80 = hash (i + 1) w (newhash (f2 h + k4))
  | otherwise = h
  where (k1, k2, k3, k4) = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)
        newhash a' = (rotate a 5 + a' + e + (w `S.index` i), a, rotate b 30, c, d)
f1 :: Quintuple32 -> Word32
f1 (_, b, c, _, _) = (b .&. c) .|. (complement b .&. c)
f2 :: Quintuple32 -> Word32
f2 (_, b, c, d, _) = b `xor` c `xor` d
f3 :: Quintuple32 -> Word32
f3 (_, b, c, d, _) = (b .&. c) .|. (b .&. d) .|. (c .&. d)
starting :: Quintuple32
starting = (0x67452301
           , 0xEFCDAB89
           , 0x98BADCFE
           , 0x10325476
           , 0xC3D2E1F0)
hasher :: Quintuple32 -> S.Seq Word32 -> Quintuple32
hasher acc x = addQuintuple acc (hash 0 (extend x) acc)
process :: B.ByteString -> Quintuple32
process = foldl hasher starting . chunks . pad
extend :: S.Seq Word32 -> S.Seq Word32
extend = extend' 16
extend' :: Int -> S.Seq Word32 -> S.Seq Word32
extend' 80 a = a
extend' i a = extend' (i + 1) (a |> xored)
  where xored = rotate ((a `S.index` (i - 3)) `xor`
                        (a `S.index` (i - 8)) `xor`
                        (a `S.index` (i - 14)) `xor`
                        (a `S.index` (i - 16))) 1
toBytes :: String -> B.ByteString
toBytes = B.pack . map (fromIntegral . ord)
splitEvery n xs
  | B.null xs = S.empty
  | otherwise = B.take n xs <| splitEvery n (B.drop n xs)
chunks :: B.ByteString -> [S.Seq Word32]
chunks xs
  | B.null xs = []
  | otherwise = x : chunks (B.drop 64 xs)
  where x = fmap decode (splitEvery 4 (B.take 64 xs))
pad :: B.ByteString -> B.ByteString
pad xs = B.append (add0 $ add1 xs) length64
  where length64 = encode (fromIntegral (8 * B.length xs) :: Word64)
add1 :: B.ByteString -> B.ByteString
add1 = flip B.append (B.singleton 128)
add0 :: B.ByteString -> B.ByteString
add0 xs
  | modulo /= 448 = add0 $ B.append xs (B.singleton 0)
  | otherwise = xs
  where modulo = (B.length xs * 8) `rem` 512
Also, a small question: is something like (a, b) = (8, 9) an acceptable thing to do to set multiple variables?
Oh, another one of these!
Two errors jump out at me immediately:
pad :: B.ByteString -> B.ByteString
pad xs = B.append (add0 $ add1 xs) length64
  where length64 = encode (fromIntegral (B.length xs) :: Word64)
Notice the length you append is supposed to be the bit length, not the byte length.
add1 :: B.ByteString -> B.ByteString
add1 = flip B.append (B.singleton 255)
Notice 255 /= 0b10000000, and the pad is supposed to be the latter.
In general you debug these by 1) going over the spec again and again, and 2) comparing to another implementation, such as Adam Wick's SHA package, checking for equality at as fine-grained a level as possible.
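One way to compare at a fine-grained level is to check the padding stage on its own before looking at the rounds. The sketch below is a hypothetical checker (padOk and decodeBE are made-up names, and it works on plain lists of bytes rather than lazy ByteStrings). It tests the points raised above: a single 0x80 byte after the message, zero fill, and a trailing 64-bit big-endian length counted in bits.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word64, Word8)

-- Decode a big-endian byte list into a 64-bit word.
decodeBE :: [Word8] -> Word64
decodeBE = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- padOk msg padded: does 'padded' look like valid SHA-1 padding of 'msg'?
padOk :: [Word8] -> [Word8] -> Bool
padOk msg padded =
  length padded == expectedLen                -- smallest multiple of 64 that fits
    && take n padded == msg                   -- message is unchanged
    && take 1 (drop n padded) == [0x80]       -- a single 1 bit, then zeros
    && all (== 0) zeros
    && decodeBE lenBytes == 8 * fromIntegral n  -- length in bits, not bytes
  where
    n = length msg
    expectedLen = 64 * ((n + 9 + 63) `div` 64)
    (body, lenBytes) = splitAt (length padded - 8) padded
    zeros = drop (n + 1) body
```

To use it with the code from the question, the padded ByteString would first be unpacked, for example with B.unpack.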
EDIT: There are two more bugs, basically transcription errors. Look around a bit and shout if you're still stuck.
I have a bunch of QuickCheck properties defined as follows:
...
prop_scaleData3 d n = n > 1 ⇛ length (scaleData d n) ≡ n
prop_scaleData4 d n = n > 1 ⇛ head (scaleData d n) ≡ -d
prop_scaleData5 d n = n > 1 ⇛ last (scaleData d n) ≡ d
...
That's a lot of repetition. What would be the right way to DRY it up?
What about something like
gt1 :: (Integer -> Prop) -> Prop
gt1 f = forAll $ \(Positive n) -> f $ n + 1
Then your properties become
prop_scaleData3 d = gt1 $ \n -> length (scaleData d n) ≡ n
prop_scaleData4 d = gt1 $ (≡ -d) . head . scaleData d
prop_scaleData5 d = gt1 $ (≡d) . last . scaleData d
This avoids the duplicated logic. Whether or not you like the pointfree stuff is up to you :)
prop_scaleData3 d n = n > 1 ==> length (scaleData d n) == n
prop_scaleData4 d n = n > 1 ==> head (scaleData d n) == -d
prop_scaleData5 d n = n > 1 ==> last (scaleData d n) == d
Just observe what is similar about these three functions and make a new helper function that extracts out the commonalities. For example:
scaleProp :: Int -> Int -> ([Int] -> Int) -> Int -> Bool
scaleProp d n op res = n > 1 ==> op (scaleData d n) == res
Then you can express your original props in terms of the helper:
prop_scaleData3 d n = scaleProp d n length n
prop_scaleData4 d n = scaleProp d n head (-d)
prop_scaleData5 d n = scaleProp d n last d
At this point the repetition isn't about logic so much as syntax (naming functions and applying arguments). In such cases I don't feel the DRY principle is really helpful - you can reduce the syntactic repetition, but you'll lose readability or modularity. For example, Toxaris combined the solutions into one function; we can do the same, but let's do it in a simpler way with just lists of booleans:
prop_scaleData345 d n =
  let sp = scaleProp d n
  in and [sp length n, sp head (-d), sp last d]
  -- or instead:
  -- in all (uncurry sp) [(length, n), (head, negate d), (last, d)]
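One detail to watch when trying this with QuickCheck: (==>) returns a Property, not a Bool, so a helper that uses it needs Property as its result type, and a list of such results is combined with conjoin rather than and. A minimal sketch under those assumptions follows. The scaleData here is only a hypothetical stand-in so that the example compiles, since the real function is not shown in the question.

```haskell
import Test.QuickCheck

-- Hypothetical stand-in for the question's scaleData: n elements from -d to d.
scaleData :: Int -> Int -> [Int]
scaleData d n = [-d] ++ replicate (n - 2) 0 ++ [d]

-- Shared shape of the three properties; (===) reports both sides on failure.
scaleProp :: Int -> Int -> ([Int] -> Int) -> Int -> Property
scaleProp d n op res = n > 1 ==> op (scaleData d n) === res

-- All three checks in one property.
prop_scaleData345 :: Int -> Int -> Property
prop_scaleData345 d n = conjoin [sp length n, sp head (-d), sp last d]
  where sp = scaleProp d n
```

Running quickCheck prop_scaleData345 then discards the cases with n <= 1 and checks the rest.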
If you put this pragma at the top of your file:
{-# LANGUAGE ParallelListComp #-}
You might be able to do something like this with GHC:
prop_scaleData345 d n = n > 1 ==> conjoin
  [ f (scaleData d n) == x
  | f <- [length, head, last]
  | x <- [n     , -d  , d   ]
  ]
This should generate a list of three properties, and then say that all of them have to be true. The first property uses f = length and x = n, the second property uses f = head and x = -d, and the last property uses f = last and x = d.