SHA-1 in Haskell producing wrong hashes

I wrote a program to perform SHA-1 in haskell, and while it does produce hashes, they do not match with the ones produced by other SHA-1 programs
Example: cat hashes to b5be86bc8bccfc24b01b093228ebb96fc92fa804, but it is supposed to hash to 9d989e8d27dc9e0ec3389fc855f142c3d40f0c50.
My code is:
(old code omitted)
I have no idea what is wrong. Can someone tell me where I made a mistake?
Edit:
I fixed the stuff that was pointed out, however it is still not working. It works correctly up until the inner loop.
I cleaned up the code so the functions for the inner loop are available as f1, f2 and f3
cat now interestingly hashes to ebe6c9fa1afa0ef5a0ca80bab251fd41cc29127e.
Code:
import Data.Word
import Data.Bits
import Data.Char (ord, intToDigit)
import Data.Binary (encode, decode)
import Numeric (showHex, showIntAtBase)
import System.IO (stdin)
import Data.Sequence ((<|), (|>))
import qualified Data.Sequence as S
import qualified Data.ByteString.Lazy as B
type Quintuple32 = (Word32, Word32, Word32, Word32, Word32)
addQuintuple (a, b, c, d, e) (f, g, h, i, j) =
(a + f, b + g, c + h, d + i, e + j)
shower :: Quintuple32 -> String
shower (a, b, c, d, e) = concatMap (`showHex` "") [a, b, c, d, e]
hash :: Int -> S.Seq Word32 -> Quintuple32 -> Quintuple32
hash i w h@(a, b, c, d, e)
| i < 20 = hash (i + 1) w (newhash (f1 h + k1))
| i < 40 = hash (i + 1) w (newhash (f2 h + k2))
| i < 60 = hash (i + 1) w (newhash (f3 h + k3))
| i < 80 = hash (i + 1) w (newhash (f2 h + k4))
| otherwise = h
where (k1, k2, k3, k4) = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)
newhash a' = (rotate a 5 + a' + e + (w `S.index` i), a, rotate b 30, c, d)
f1 :: Quintuple32 -> Word32
f1 (_, b, c, _, _) = (b .&. c) .|. (complement b .&. c)
f2 :: Quintuple32 -> Word32
f2 (_, b, c, d, _) = b `xor` c `xor` d
f3 :: Quintuple32 -> Word32
f3 (_, b, c, d, _) = (b .&. c) .|. (b .&. d) .|. (c .&. d)
starting :: Quintuple32
starting = (0x67452301
, 0xEFCDAB89
, 0x98BADCFE
, 0x10325476
, 0xC3D2E1F0)
hasher :: Quintuple32 -> S.Seq Word32 -> Quintuple32
hasher acc x = addQuintuple acc (hash 0 (extend x) acc)
process :: B.ByteString -> Quintuple32
process = foldl hasher starting . chunks . pad
extend :: S.Seq Word32 -> S.Seq Word32
extend = extend' 16
extend' :: Int -> S.Seq Word32 -> S.Seq Word32
extend' 80 a = a
extend' i a = extend' (i + 1) (a |> xored)
where xored = rotate ((a `S.index` (i - 3)) `xor`
(a `S.index` (i - 8)) `xor`
(a `S.index` (i - 14)) `xor`
(a `S.index` (i - 16))) 1
toBytes :: String -> B.ByteString
toBytes = B.pack . map (fromIntegral . ord)
splitEvery n xs
| B.null xs = S.empty
| otherwise = B.take n xs <| splitEvery n (B.drop n xs)
chunks :: B.ByteString -> [S.Seq Word32]
chunks xs
| B.null xs = []
| otherwise = x : chunks (B.drop 64 xs)
where x = fmap decode (splitEvery 4 (B.take 64 xs))
pad :: B.ByteString -> B.ByteString
pad xs = B.append (add0 $ add1 xs) length64
where length64 = encode (fromIntegral (8 * B.length xs) :: Word64)
add1 :: B.ByteString -> B.ByteString
add1 = flip B.append (B.singleton 128)
add0 :: B.ByteString -> B.ByteString
add0 xs
| modulo /= 448 = add0 $ B.append xs (B.singleton 0)
| otherwise = xs
where modulo = (B.length xs * 8) `rem` 512
Also, a small question: is something like (a, b) = (8, 9) an acceptable thing to do to set multiple variables?

Oh, another one of these!
Two errors jump out at me immediately:
pad :: B.ByteString -> B.ByteString
pad xs = B.append (add0 $ add1 xs) length64
where length64 = encode (fromIntegral (B.length xs) :: Word64)
Notice the length you append is supposed to be the bit length, not the byte length.
add1 :: B.ByteString -> B.ByteString
add1 = flip B.append (B.singleton 255)
Notice 255 /= 0b10000000, and the pad is supposed to be the latter.
In general you debug these by 1) going over the spec again and again, and 2) comparing against another implementation, such as Adam Wick's SHA package, checking for equality at as fine-grained a level as possible.
EDIT: There are two more bugs, basically transcription errors. Look around a bit and shout if you're still stuck.
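For reference, here is a minimal sketch of the padding rule discussed above: append the byte 0x80, then zero bytes until the length is 56 mod 64, then the message length in bits as a 64-bit big-endian number. The name sha1Pad is made up for this example, and it works on a plain list of bytes rather than a ByteString to keep it short.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word8, Word64)

-- | SHA-1 style padding on a list of bytes (sha1Pad is a hypothetical name).
sha1Pad :: [Word8] -> [Word8]
sha1Pad msg = msg ++ [0x80] ++ replicate zeros 0 ++ lengthBytes
  where
    n = length msg
    -- zero bytes needed so that message + 0x80 + zeros is 56 (mod 64) bytes long
    zeros = (55 - n) `mod` 64
    -- the appended length is counted in bits, not bytes
    bitLen = fromIntegral n * 8 :: Word64
    -- 64-bit big-endian encoding of the bit length
    lengthBytes = [fromIntegral (bitLen `shiftR` s) | s <- [56, 48 .. 0]]
```

Comparing the output of something like this against your own pad for a few short inputs is one way to check that stage in isolation.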


Lagrange Interpolation for a schema based on Shamir's Secret Sharing

I'm trying to debug an issue with an implementation of a threshold encryption scheme. I've posted this question on crypto to get some help with the actual scheme but was hoping to get a sanity check on the simplified code I am using.
Essentially the crypto system uses Shamir's Secret Sharing to combine the shares of a key. The polynomial is each member of the list 'a' multiplied by an increasing power of the parameter of the polynomial. I've left out the mod by prime to simplify the code, as the actual implementation uses PBC via a Haskell wrapper.
I have for the polynomial
poly :: [Integer] -> Integer -> Integer
poly as xi = (f 1 as)
where
f _ [] = 0
f 0 _ = 0
f s (a:as) = (a * s) + f (s * xi) as
The Lagrange interpolation is:
interp0 :: [(Integer, Integer)] -> Integer
interp0 xys = round (sum $ zipWith (*) ys $ fmap (f xs) xs)
where
xs = map (fromIntegral .fst) xys
ys = map (fromIntegral .snd) xys
f :: (Eq a, Fractional a) => [a] -> a -> a
f xs xj = product $ map (p xj) xs
p :: (Eq a, Fractional a) => a -> a -> a
p xj xm = if xj == xm then 1 else negate (xm / (xj - xm))
and the split and combination code is
execPoly as@(a0:_) = do
let xs = zipWith (,) [0..] (fmap (poly as) [0..100])
let t = length as + 1
let offset = 1
let shares = take t (drop offset xs)
let sm2 = interp0 shares
putText ("poly and interp over " <> show as <> " = " <> show sm2 <> ". Should be " <> show a0)
main :: IO ()
main = do
execPoly [10,20,30,40,50,60,70,80,90,100,110,120,130,140,150] --1
execPoly [10,20,30,40,50,60,70,80] -- 2
execPoly (1) fails to combine to 10, but execPoly (2) combines correctly. The magic threshold seems to be 8.
Is my code correct? Am I missing something in the implementation that limits the threshold size to 8?
As MathematicalOrchid said, it was a precision problem.
Updated the code to:
import Data.Ratio (Ratio, (%))
f :: (Eq a, Integral a) => [a] -> a -> Ratio a
f xs xj = product $ map (p xj) xs
p :: (Eq a, Integral a)=> a -> a -> Ratio a
p xj xm = if xj == xm then (1 % 1) else (negate xm) % (xj - xm)
And it works as expected.
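For anyone landing here later, this is a self-contained sketch of the same idea using Rational throughout, so nothing gets rounded until the very end. The names evalPoly and interpolateAtZero are made up for this example, and it leaves out the mod-by-prime step just like the code above. My guess is that the Double version breaks once the intermediate products no longer fit in the 53-bit mantissa.

```haskell
import Data.Ratio ((%))

-- | Evaluate a polynomial with coefficients [a0, a1, ...] at x (Horner form).
evalPoly :: [Integer] -> Integer -> Integer
evalPoly as x = foldr (\a acc -> a + x * acc) 0 as

-- | Lagrange interpolation evaluated at x = 0, in exact rational arithmetic.
interpolateAtZero :: [(Integer, Integer)] -> Rational
interpolateAtZero xys = sum [fromInteger yj * basis xj | (xj, yj) <- xys]
  where
    xs = map fst xys
    -- basis polynomial for xj evaluated at 0: product of (-xm) / (xj - xm)
    basis xj = product [negate xm % (xj - xm) | xm <- xs, xm /= xj]
```

With 15 coefficients you need at least 15 distinct shares, for example interpolateAtZero [(x, evalPoly [10,20..150] x) | x <- [1..15]].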

How to parametrize a constant (in this particular recursive function)?

test1 correctly produces the following structure from the string "abcdef":
(a,(1,[0])) -- type 'a' occur 1 time in position 0
(b,(1,[1])) -- type 'b' occur 1 time in position 1
(c,(1,[2]))
(d,(1,[3]))
(e,(1,[4]))
(f*,(1,[5])) -- type 'f' is the last of the list
But this result depends on the number 6, that is the length of a very particular class of string, invalid for general case.
So if the string in test1 is instead "abc" the result is wrong:
(a,(1,[0]))
(b,(1,[7]))
(c*,(1,[8]))
If the string in test1 is instead "abcdefgh" the result is also wrong:
(a,(1,[0]))
(b,(1,[2])) -- Should be [1]
(c,(1,[3])) -- Should be [2]
(d,(1,[4])) -- ...
(e,(1,[5]))
(f,(1,[6]))
(g,(1,[7]))
(h*,(1,[8]))
In addTrieWithCounter I'm not able to substitute this constant (6) with a parameterized function of the length of the word.
The CONTEXT of this function: addTrieWithCounter will be placed in a special "loop" so that "al alts" becomes: addTrieWithCounter ... "al" 0 -> "drop the space" -> addTrieWithCounter ... "alts" 3. That way the occurrences stay aligned with the initial string.
-- analyzing "all alts" should give this result:
(a,(2,[4,0])) -- type 'a' occur 2 times in positions 4 and 0 (reversed order)
(l,(2,[5,1])) -- type 'l' (of seq "al") occur 2 times in positions 5 and 1 (reversed order)
(l*,(1,[2])) -- type 'l' (of seq "all") occur 1 time in positions 2
(t,(1,[6])) -- type 't' (of seq "alt") occur 1 time in positions 6
(s*,(1,[7])) -- type 's' (of seq "alts") occur 1 time in positions 7
It will be a trivial thing, but I have no idea.
Thanks in advance for your suggestions.
import qualified Data.Map as M
import Text.PrettyPrint as TP
import Data.Either (either)
data Trie a b = Nil | Trie (M.Map (Either a a) (b, Trie a b)) deriving Show
-- (Just a note: Trie will be a Monoid's instance. So with "Either" it is possible to distinguish the following cases: "all" and "alliance")
-- add an element to a Trie
addTrieWithCounter
:: Ord a =>
(Trie a (Int, [t1]), Int)
-> ((Int, [t1]) -> Int -> (Int, [t1]))
-> [a]
-> (Trie a (Int, [t1]), Int)
addTrieWithCounter (t,st) f [] = (t,st)
addTrieWithCounter (Nil,st) f xs = addTrieWithCounter (Trie M.empty, st) f xs
addTrieWithCounter (Trie m,st) f [x] =
(Trie $ M.insertWith (\(c,_) _ -> (f c st,Nil)) (Left x) (f (0,[]) st,Nil) m,st + 1)
addTrieWithCounter (Trie m, st) f (x:xs) =
case M.lookup (Right x) m of -- !!!!! PROBLEM IN THE FOLLOWING LINE !!!!!
Nothing -> let (t',st') = addTrieWithCounter (Nil, 6 - length xs ) f xs
in (Trie $ M.insert (Right x) (f (0,[]) st,t') m,st + 1)
Just (c,t) -> let (t',st') = addTrieWithCounter (t,st) f xs -- TO CHANGE
in (Trie $ M.insert (Right x) (f c st',t') m,st')
showTrieS f (t,_) = showTrie f t
showTrie :: Show a => (Either t t -> String) -> Trie t a -> Doc
showTrie _ Nil = empty
showTrie f (Trie m)
| M.null m = empty
| otherwise =
vcat $
do (k,(count,t)) <- M.assocs m
return $
vcat [ lparen TP.<> text (f k) TP.<> comma TP.<> (text . show $ count) TP.<> rparen
, nest 4 (showTrie f t)
]
test1 = showTrieS f1 t
where
f1 = (either (:"*") (:""))
t = addTrieWithCounter (Trie M.empty,0) f2 "abcdef"
f2 (cr,poss) st = ((cr + 1),(st : poss))
This will get you most of the way there. It doesn't solve your
exact problem, but shows how to remove the hard-coded length value.
import qualified Data.Map.Strict as M
import qualified Data.IntSet as S
import Data.Monoid
import Text.PrettyPrint hiding ((<>))
data GenTrie a b = Trie (M.Map a (b, GenTrie a b))
deriving (Show)
emptyTrie = Trie M.empty
data Info = Info { _count :: Int, _positions :: S.IntSet }
deriving (Show)
type Trie = GenTrie Char Info
addString :: Int -> String -> Trie -> Trie
addString i cs t = go t i cs
where
go :: Trie -> Int -> String -> Trie
go t i [] = t
go t i (c:cs) =
let Trie m = t
pair =
case M.lookup c m of
Nothing ->
let t2 = go emptyTrie (i+1) cs
val = Info 1 (S.singleton i)
in (val, t2)
Just (info,t1) ->
let t2 = go t1 (i+1) cs
val = info { _count = _count info+1
, _positions = S.insert i (_positions info)
}
in (val, t2)
in Trie (M.insert c pair m)
printTrie = putStrLn . showTrie
showTrie = render . trieToDoc
trieToDoc :: Trie -> Doc
trieToDoc (Trie m)
| M.null m = empty
| otherwise =
vcat $
do (ch, (info,t)) <- M.assocs m
let count = show (_count info)
pos = show (S.toList (_positions info))
return $
vcat [ text [ch] <> space <> text count <> space <> text pos
, nest 4 (trieToDoc t)
]
test1 = printTrie $ addString 0 "abc" emptyTrie
test2 = printTrie $ addString 4 "alts" $ addString 0 "all" emptyTrie
addTrieWithCounter (Trie m,st) f (x:xs) =
case M.lookup (Right x) m of
Nothing -> let (t',st') = addTrieWithCounter (Nil, st + 1 ) f xs
in (Trie $ M.insert (Right x) (f (0,[]) st,t') m, st')
Just (c,t) -> let (t',st') = addTrieWithCounter (t,st + 1) f xs
in (Trie $ M.insert (Right x) (f c st,t') m,st')
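The "loop" from the question (add each word at the offset where it starts in the original string, as test2 in the first answer does by hand) could be driven by a helper along these lines. This is only a sketch, and wordsWithOffsets is a made-up name, not something from either answer.

```haskell
import Data.Char (isSpace)

-- | Split a string into words, pairing each word with the index of its
--   first character in the original string (hypothetical helper).
wordsWithOffsets :: String -> [(Int, String)]
wordsWithOffsets = go 0
  where
    go _ [] = []
    go i s@(c:cs)
      | isSpace c = go (i + 1) cs          -- skip the space, keep counting
      | otherwise =
          let (w, rest) = break isSpace s  -- take one whole word
          in (i, w) : go (i + length w) rest
```

Each (offset, word) pair can then be folded into the trie, with the offset as the starting position for that word.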

Can't match tuple inside of foldl

I have the following code, which should™ convert an Excel column name to its corresponding number, for example AA into 27 and AB into 28:
import Data.Char (ord)
import Data.List (foldl1')
columnToNumber :: String -> Int
columnToNumber s = foldl1' (\acc (i, v) -> acc + 26 ^ i * v) (values s)
where values s = zip (reverse [0..(length s)]) ((\c -> ord c - 64) <$> s)
The idea is to take the string "AA" convert it to the corresponding numbers
["A", "A"] -> [1, 1]
and zip it with the base so from right to left 26^0, 26^1, 26^2, and so on.
zip [1, 0] [1, 1] -> [(1, 1), (0, 1)]
That way the result of the fold would be
26^1 * 1 + 26^0 * 1 = 27
Unfortunately, I'm getting the following errors and I'm not sure why:
ExcelSheetColumn.hs:7:34:
Couldn't match expected type ‘Int’
with actual type ‘(Integer, Int)’
In the pattern: (i, v)
In the first argument of ‘foldl1'’, namely
‘(\ acc (i, v) -> acc + 26 ^ i * v)’
In the expression:
foldl1' (\ acc (i, v) -> acc + 26 ^ i * v) (values s)
ExcelSheetColumn.hs:7:63:
Couldn't match type ‘(Int, Int)’ with ‘Int’
Expected type: [Int]
Actual type: [(Int, Int)]
In the second argument of ‘foldl1'’, namely ‘(values s)’
In the expression:
foldl1' (\ acc (i, v) -> acc + 26 ^ i * v) (values s)
Could someone help me out?
To get it compiling, you just have to switch foldl1' to foldl' and add the starting accumulator:
import Data.Char (ord)
import Data.List (foldl')
columnToNumber :: String -> Int
columnToNumber s = foldl' (\acc (i, v) -> acc + 26 ^ i * v) 0 (values s)
where values s = zip (reverse [0..(length s)]) ((\c -> ord c - 64) <$> s)
If you add the suggestion Free_D made (start at length s - 1):
columnToNumber :: String -> Int
columnToNumber s = foldl' (\acc (i, v) -> acc + 26 ^ i * v) 0 (values s)
where values s = zip (reverse [0..(length s -1)]) ((\c -> ord c - 64) <$> s)
you get the desired results:
λ> columnToNumber "AA"
27
λ> columnToNumber "AB"
28
I don't know if you're actually going to need this, but hey, why not: what you probably don't like is that columnToNumber "A1" is 11. To fix this you need to treat digits differently from letters:
columnToNumber :: String -> Int
columnToNumber s = foldl' (\acc (i, v) -> acc + 26 ^ i * v) 0 (values s)
where values s = zip (reverse [0..(length s -1)]) (parse <$> s)
parse c
| c >= '0' && c <= '9' = ord c - ord '0'
| otherwise = ord c - 64
Looking at the definition of foldl1', it has to take two things that are of the same type and produce something similar
*Main Data.List> :t foldl1'
foldl1' :: (a -> a -> a) -> [a] -> a
But foldl is what you want:
*Main Data.List> :t foldl
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b
So essentially this:
import Data.Char (ord, toUpper)
columnToNumber :: String -> Int
columnToNumber s = foldl (\acc (i, v) -> acc + 26 ^ i * v) 0 $ values s where
values s = zip [l - 1, l - 2 ..0] ((\c -> (ord.toUpper) c - 64) <$> s) where
l = length s
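As a side note, the zip with exponents can be avoided entirely by multiplying the accumulator by 26 at each step (Horner's rule). A minimal sketch, assuming the input contains only letters; the primed name is just to avoid clashing with the versions above:

```haskell
import Data.Char (ord, toUpper)
import Data.List (foldl')

-- | Excel column name to number, shifting the accumulator one base-26
--   "digit" to the left for every letter consumed.
columnToNumber' :: String -> Int
columnToNumber' = foldl' step 0
  where
    step acc c = acc * 26 + (ord (toUpper c) - ord 'A' + 1)
```

This needs neither length nor reverse, and the accumulator and element types are allowed to differ because foldl' takes an explicit starting value.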

Rsync's rolling checksum in haskell

I'm starting to learn both Haskell and remote delta compression. My first step is to implement rsync's version of the rolling checksum in Haskell. Does a chunk equal X(i) in those formulas? If so, I'm confused.
Haskell can turn a ByteString into a byte array.
How do I turn an array of Word8s into that big chunk? A Word32768? I mean, what if X(i) is a list of Word8s?
After that, how do I do arithmetic on those 4KB unsigned ints?
Also, my current implementation just slides by 1 byte (Word8) each time.
It's very easy to turn a ByteString into [Word8] using unpack, which should be sufficient for performing this algorithm (although not necessarily the most efficient).
Why do you need to turn Word8s into a Word32768? Why do you need a 2^15 bit number? That would be very difficult to represent, but you could use a list or array of Word8s, which is very easy to represent in memory and is equivalent.
For performing arithmetic, functions like map, zipWith, folds, and scans are very useful. For example, performing the first step of the algorithm:
import qualified Data.ByteString as BS
a :: Int -> Int -> BS.ByteString -> Int
a k l x
= (`mod` m)
$ sum
$ map fromIntegral
$ take (l - k)
$ drop k
$ BS.unpack x
where m = 2 ^ 16
And implementing the function b is only marginally more difficult, you just have to figure out the sequence of l - i + 1 for i = k to l, then use zipWith (*) between the map fromIntegral and take (l - k). After that, it's quite trivial to implement s, although it certainly could be performed more efficiently if you factor out the common steps of take (l - k) $ drop k $ BS.unpack x.
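To make that concrete, here is a rough sketch of all three functions, assuming the usual definitions a(k,l) = sum of X_i mod M, b(k,l) = sum of (l - i + 1) * X_i mod M, and s = a + 2^16 * b, with i running from k to l inclusive. The names window, sumA, sumB and checksumS are made up here. Note that it takes l - k + 1 bytes so that index l is included, which is my reading of the formula; the snippet above takes l - k.

```haskell
import qualified Data.ByteString as BS

modulus :: Int
modulus = 2 ^ (16 :: Int)

-- | Bytes X_k .. X_l (inclusive) as Ints.
window :: Int -> Int -> BS.ByteString -> [Int]
window k l = map fromIntegral . BS.unpack . BS.take (l - k + 1) . BS.drop k

-- | a(k,l): plain sum of the window, mod 2^16.
sumA :: Int -> Int -> BS.ByteString -> Int
sumA k l x = sum (window k l x) `mod` modulus

-- | b(k,l): weighted sum, the first byte weighted l-k+1, the last weighted 1.
sumB :: Int -> Int -> BS.ByteString -> Int
sumB k l x = sum (zipWith (*) weights (window k l x)) `mod` modulus
  where
    weights = [l - k + 1, l - k .. 1]

-- | s(k,l): a in the low 16 bits, b in the high 16 bits.
checksumS :: Int -> Int -> BS.ByteString -> Int
checksumS k l x = sumA k l x + modulus * sumB k l x
```

This recomputes everything for each window, so it is only useful as a reference to test a real rolling version against.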
A chunk is not the same thing as X(i) in the formula from the linked document. Chunking is mostly related to data deduplication. A rolling checksum can also be used to create chunks, identify chunk boundaries, and so on.
My current implementation of rsync's rolling checksum is below. Next I'm going to implement a cyclic polynomial rolling checksum, then read some books on data deduplication.
import qualified Data.ByteString.Lazy as B
import qualified Data.ByteString.Lazy.Char8 as B8
import Data.Word
import Data.Bits
import Data.Int
type CheckSumPartial = Word16
type CheckSumA = CheckSumPartial
type CheckSumB = CheckSumPartial
type WindowSize = Int64
type CheckSum = Word32
type Byte = Word8
main:: IO ()
main = do
let str = B8.pack "abcdef"
let s1 = roll 3 str
let s2 = withoutRoll 3 str
print s1
print s2
return ()
roll :: WindowSize -> B.ByteString -> [CheckSum]
roll w str =
let
(a,b,s) = newABS w str
h = B.head str
t = B.tail str
in if fromIntegral (B.length t) < w
then [s]
else s : rollNext w t h a b
withoutRoll :: WindowSize -> B.ByteString -> [CheckSum]
withoutRoll w str =
let
(_,_,s) = newABS w str
t = B.tail str
in if fromIntegral (B.length t) < w
then [s]
else s : withoutRoll w t
newA :: WindowSize -> B.ByteString -> CheckSumA
newA w str =
let block = B.take w str
in B.foldr aSum (0::CheckSumA) block
where
aSum x acc = acc + (fromIntegral x :: CheckSumA)
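newB :: WindowSize -> B.ByteString -> CheckSumB
newB w str =
  let block = B.take w str
  in fst $ B.foldl bSum (0::CheckSumB, w) block
  where
    -- fold from the left so the first byte gets weight w and the last gets 1,
    -- which is the weighting rollB assumes when it subtracts w * prevHead
    bSum (acc,l) x = (acc + fromIntegral l * (fromIntegral x :: CheckSumB), l-1)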
rollA :: CheckSumA -> Byte -> Byte -> CheckSumA
rollA prevA prevHead curLast = prevA - fromIntegral prevHead + fromIntegral curLast
rollB :: CheckSumA -> Byte -> WindowSize -> CheckSumB -> CheckSumB
rollB curA prevHead w prevB = prevB - fromIntegral w * fromIntegral prevHead + curA
calculateS :: CheckSumA -> CheckSumB -> CheckSum
calculateS a b = (fromIntegral a :: Word32) .|. shift (fromIntegral b :: Word32) 16
rollNext :: WindowSize ->B.ByteString -> Byte -> CheckSumA -> CheckSumB -> [CheckSum]
rollNext w str prevHead prevA prevB =
let
curBlock = B.take (fromIntegral w) str
curLast = B.last curBlock
h = B.head str
t = B.tail str
a = rollA prevA prevHead curLast
b = rollB a prevHead w prevB
s = calculateS a b
in if fromIntegral (B.length t) < w
then [s]
else s : rollNext w t h a b
newABS :: WindowSize -> B.ByteString -> (CheckSumA, CheckSumB, CheckSum)
newABS w str =
let a = newA w str
b = newB w str
s = calculateS a b
in (a,b,s)

SHA1 in Haskell -- something wrong with my implementation

Thought I'd try to implement SHA1 in Haskell myself. I came up with an implementation that compiles and returns the right answer for the null string (""), but nothing else. I can't figure out what might be wrong. Can someone familiar with the algorithm and SHA1 point it out?
import Data.Bits
import Data.Int
import Data.List
import Data.Word
import Text.Printf
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as C
h0 = 0x67452301 :: Word32
h1 = 0xEFCDAB89 :: Word32
h2 = 0x98BADCFE :: Word32
h3 = 0x10325476 :: Word32
h4 = 0xC3D2E1F0 :: Word32
sha1string :: String -> String
sha1string s = concat $ map (printf "%02x") $ sha1 . C.pack $ s
sha1 :: L.ByteString -> [Word8]
sha1 msg = concat [w32ToComps a, w32ToComps b, w32ToComps c, w32ToComps d, w32ToComps e]
where (a, b, c, d, e) = sha1' msg 0 h0 h1 h2 h3 h4
sha1' msg sz a b c d e
| L.length m1 < 64 = sha1'last (padded msg sz) a b c d e
| otherwise = uncurry5 (sha1' m2 (sz + 64)) $ whole a b c d e m1
where (m1, m2) = L.splitAt 64 msg
sha1'last msg a b c d e
| m1 == L.empty = (a, b, c, d, e)
| otherwise = uncurry5 (sha1'last m2) $ whole a b c d e m1
where (m1, m2) = L.splitAt 64 msg
whole a b c d e msg = partcd (partab msg) a b c d e
partcd ws a b c d e = (h0 + a', h1 + b', h2 + c', h3 + d', h4 + e')
where
(a', b', c', d', e') = go ws a b c d e 0
go ws a b c d e 80 = (a, b, c, d, e)
go (w:ws) a b c d e t = go ws temp a (rotate b 30) c d (t+1)
where temp = (rotate a 5) + f t b c d + e + w + k t
partab chunk = take 80 ns
where
ns = initial ++ zipWith4 g (drop 13 ns) (drop 8 ns) (drop 2 ns) ns
g a b c d = rotate (a `xor` b `xor` c `xor` d) 1
initial = map (L.foldl (\a b -> (a * 256) + fromIntegral b) 0) $ paginate 4 chunk
f t b c d
| t >= 0 && t <= 19 = (b .&. c) .|. ((complement b) .&. d)
| t >= 20 && t <= 39 = b `xor` c `xor` d
| t >= 40 && t <= 59 = (b .&. c) .|. (b .&. d) .|. (c .&. d)
| t >= 60 && t <= 79 = b `xor` c `xor` d
k t
| t >= 0 && t <= 19 = 0x5A827999
| t >= 20 && t <= 39 = 0x6ED9EBA1
| t >= 40 && t <= 59 = 0x8F1BBCDC
| t >= 60 && t <= 79 = 0xCA62C1D6
padded msg prevsz = L.append msg (L.pack pad)
where
sz = L.length msg
totalsz = prevsz + sz
padsz = fromIntegral $ (128 - 9 - sz) `mod` 64
pad = [0x80] ++ (replicate padsz 0) ++ int64ToComps totalsz
uncurry5 f (a, b, c, d, e) = f a b c d e
paginate n xs
| xs == L.empty = []
| otherwise = let (a, b) = L.splitAt n xs in a : paginate n b
w32ToComps :: Word32 -> [Word8]
w32ToComps = integerToComps [24, 16 .. 0]
int64ToComps :: Int64 -> [Word8]
int64ToComps = integerToComps [56, 48 .. 0]
integerToComps :: (Integral a, Bits a) => [Int] -> a -> [Word8]
integerToComps bits x = map f bits
where f n = fromIntegral ((x `shiftR` n) .&. 0xff) :: Word8
For starters, you appear to be keeping a size count in bytes (see sz + 64), but the count that gets appended should be in bits so you need to multiply by 8 somewhere (incidentally, I suggest you use cereal or binary instead of rolling your own Integer to big endian Word64). This isn't the only problem though.
EDIT: Found It
Ah-ha! Never forget, Wikipedia is written by a bunch of imperative, mutable-world unenlighteneds! You finish each chunk with h0 + a', h1 + b', ... but that should be the old context plus your new values: a + a', b + b', .... Everything checks out after that fix (and the size fix above).
The test code completes now with 5 property tests and 129 KATs succeeding.
End Edit
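To spell out the fix in isolation: the per-chunk step should add the round output to the context that went into that chunk, and then pass the sum on to the next chunk. Below is a small sketch of just that chaining. Ctx, addCtx and chain are names made up for the illustration, and the actual 80 rounds are passed in as a parameter rather than implemented.

```haskell
import Data.Word (Word32)

type Ctx = (Word32, Word32, Word32, Word32, Word32)

-- | Component-wise addition (wraps mod 2^32, as Word32 arithmetic does).
addCtx :: Ctx -> Ctx -> Ctx
addCtx (a, b, c, d, e) (a', b', c', d', e') =
  (a + a', b + b', c + c', d + d', e + e')

-- | Chain a round function over all chunks. For every chunk the result is
--   the context that entered the chunk plus the round output, never the
--   fixed initial constants plus the round output.
chain :: (Ctx -> chunk -> Ctx) -> Ctx -> [chunk] -> Ctx
chain rounds = foldl step
  where
    step ctx chunk = ctx `addCtx` rounds ctx chunk
```

With a single chunk the two variants coincide, because the incoming context is the initial constants; that would explain why the empty string hashed correctly.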
It would help you out a lot if you divided your implementation into the normal initial, update, finalize operations. That way you could compare intermediate results with other implementations.
I just built test code for your implementation using crypto-api-tests. The additional code is below if you're interested, don't forget to install crypto-api-tests.
import Test.SHA
import Test.Crypto
import Crypto.Classes
import Data.Serialize
import Data.Tagged
import Control.Monad
main = defaultMain =<< makeSHA1Tests (undefined :: SHA1)
data SHA1 = SHA1 [Word8]
deriving (Eq, Ord, Show)
data CTX = CTX L.ByteString
instance Serialize SHA1 where
get = liftM SHA1 (mapM (const get) [1..20])
put (SHA1 x) = mapM_ put x
instance Hash CTX SHA1 where
outputLength = Tagged 160
blockLength = Tagged (64*8)
initialCtx = CTX L.empty
updateCtx (CTX m) x = CTX (L.append m (L.fromChunks [x]))
finalize (CTX m) b = SHA1 $ sha1 (L.append m (L.fromChunks [b]))
