sicp 5.3 garbage collection - garbage-collection

I am currently reading garbage collection realization at sicp and resulting code at the book have no sense for me.
begin-garbage-collection
(assign free (const 0))
(assign scan (const 0))
(assign old (reg root))
(assign relocate-continue (label reassign-root))
(goto (label relocate-old-result-in-new))
reassign-root
(assign root (reg new))
(goto (label gc-loop))
gc-loop
(test (op =) (reg scan) (reg free))
(branch (label gc-flip))
(assign old (op vector-ref) (reg new-cars) (reg scan))
(assign relocate-continue (label update-car))
(goto (label relocate-old-result-in-new))
update-car
(perform (op vector-set!) (reg new-cars) (reg scan) (reg new))
(assign old (op vector-ref) (reg new-cdrs) (reg scan))
(assign relocate-continue (label update-cdr))
(goto (label relocate-old-result-in-new))
update-cdr
(perform (op vector-set!) (reg new-cdrs) (reg scan) (reg new))
(assign scan (op +) (reg scan) (const 1))
(goto (label gc-loop))
relocate-old-result-in-new
(test (op pointer-to-pair?) (reg old))
(branch (label pair))
(assign new (reg old))
(goto (reg relocate-continue))
pair
(assign oldcr (op vector-ref) (reg the-cars) (reg old))
(test (op broken-heart?) (reg oldcr))
(branch (label already-moved))
(assign new (reg free)) ;new location for pair
;; update free pointer
(assign free (op +) (reg free) (const 1))
;; Copy the car and cdr to new memory.
(perform (op vector-set!) (reg new-cars) (reg new) (reg oldcr))
(assign oldcr (op vector-ref) (reg the-cdrs) (reg old))
(perform (op vector-set!) (reg new-cdrs) (reg new) (reg oldcr))
;; Construct the broken heart.
(perform (op vector-set!) (reg the-cars) (reg old) (const broken-heart))
(perform (op vector-set!) (reg the-cdrs) (reg old) (reg new))
(goto (reg relocate-continue))
already-moved
(assign new (op vector-ref) (reg the-cdrs) (reg old))
(goto (reg relocate-continue))
gc-flip
(assign temp (reg the-cdrs))
(assign the-cdrs (reg new-cdrs))
(assign new-cdrs (reg temp))
(assign temp (reg the-cars))
(assign the-cars (reg new-cars))
(assign new-cars (reg temp))
First of all registers "free", "scan", "old", "new", "root" appears to be just indexes of values stored in "the-cars", "the-cdrs", "new-cars", "new-cdrs" for me.
The book says:
The state of the garbage-collection process is controlled by maintaining two pointers: free and scan. These are initialized to point to the beginning of the new memory. The algorithm begins by relocating the pair pointed at by root to the beginning of the new memory.
So assuming that root is not a pair leads us straight to gc-loop point without changing free and scan which ends program.
Assuming that root is a pair makes new=0, free=1, moves first element to the new memory and adjust old memory (which is only sane part for me), then it goes to reassign-root where root=0 (i see root as index of first element at the-car/cdr so root=0 at the very beginning for me), then we get to gc-loop (free!=scan) and
(assign old (op vector-ref) (reg new-cars) (reg scan))
which is pointless for me because we push value to a indexer. And later this code has no sense:
(assign oldcr (op vector-ref) (reg the-cars) (reg old))
What do I get wrong?

Let's use tables for this:
At the start:
index
0
1
2
3
4
the-cars
p1
p3
the-cdrs
p2
p4
index
5
6
new-cars
new-cdrs
After you move the root to the new location (x sign means broken-heart):
index
0
1
2
3
4
the-cars
x
p3
the-cdrs
p5
p4
index
5
6
new-cars
p1
new-cdrs
p2
So when it runs to (assign old (op vector-ref) (reg new-cars) (reg scan))
The scan is currently pointing to p5, so this assigns old to the car in p5, which is p1, which means the relocate-old-result-in-new will continue to work on p1.
Inside relocate-old-result-in-new, it checks if p1 has been moved, if not, then it moves the car & cdr of p1 to a new location, which is p6. And set up broken heart signal for p1.
so the state becomes:
index
0
1
2
3
4
the-cars
x
x
the-cdrs
p5
p6
index
5
6
new-cars
p1
p3
new-cdrs
p2
p4
Then it moves to update-car label, which updates the car of scan, which is p1, to the current new, which is p6, and then assign the cdr of scan, which is p2, to old, and continue working on that. After p2 has been moved, it moves to update-cdr label, which sets up the cdr of scan, which is p2, to point to a new location, then increase scan, and continue the gc-loop, which will start at p6.
For (assign oldcr (op vector-ref) (reg the-cars) (reg old)), it is a way to hold the car & cdr of the object pointed by old, to ease the check for broken heart, the moves... . So if the above makes sense to you, I guess understand oldcr would not be a problem.

Related

Pattern matching in Racket

I have this code snippet in Haskell:
matchEx :: Expr -> [Expr]
matchEx (Number n) = undefined
matchEx (Boolean b) = undefined
matchEx (If condStatement thenStatement elseStatement) = undefined
What would the equivalent pattern matching looks like in racket? After looking at the documentation, this is what I have. Thanks.
(define (matchEx-calls expr)
(match expr
[(literal n) ([])]
[(id ident) ([])]
[(If condStatement thenStatement elseStatement) ([])]
)
)
Here is an example to get you started:
#lang racket
(struct Expr () #:transparent)
(struct Number Expr (n) #:transparent)
(struct Boolean Expr (b) #:transparent)
(struct Or Expr (exprs) #:transparent)
(define (format-Expr e)
(define ~ format-Expr)
(match e
[(Number n) (~a n)]
[(Boolean b) (if b "true" "false")]
[(Or es) (match es
['() "true"]
[(list e) (~ e)]
[(list e1 e2) (~a (~ e1) " or " (~ e2))]
[(list e1 es ...) (~a (~ e1) " or " (~ (Or es)))])]
[(? Expr? e)
; we get here if we forgot to implement a case
(error 'format-expr (~a "internal error, got: " e))]
[_
; this on the other hand is an user error
(error 'format-expr (~a "expected an Expr, got: " e))]))
(format-Expr (Or (list (Boolean #t) (Boolean #f))))

Agents and threads synchronization in Clojure

I have to rewrite this code using agents in someway the result of x is 0 (It means that each thread is executed one after one). But I have problems because I do not have enough knowledge about agents use.
The original code is:
(def x 0)
(let [t1 (Thread. #(dotimes [_ 10000] (def x (inc x))))
t2 (Thread. #(dotimes [_ 10000] (def x (dec x))))]
(.start t1)
(.start t2)
(.join t1)
(.join t2)
(println x))
When I want to use an agent with await(agent_name) to make each thread run separately, it does not work, the result is always different from zero.
Please any suggestions about this?
I gave this a try and it prints 0 as expected:
(ns agent-demo.core
(:gen-class))
(def counter
(agent 0))
(defn -main [& args]
(let [t1 (Thread. #(dotimes [_ 10000]
(send counter inc)))
t2 (Thread. #(dotimes [_ 10000]
(send counter dec)))]
(.start t1)
(.start t2)
(.join t1)
(.join t2)
(await counter)
(println #counter)
(shutdown-agents)))

How do I define a recursive struct in Racket?

I'm trying to imitate recursive types in OCaml in untyped Racket, but I can't seem to find documentation on defining recursive structs. How would I go about mapping this:
type example =
| Red
| Blue of example
| Yellow of example * example;;
into something in Racket?
I've tried the following:
(struct Red ())
(struct Blue (e1))
(struct Yellow (e1 e2))
(struct example/c
(or/c (Red)
(Blue example/c)
(Yellow example/c example/c)))
However, it doesn't work as expected when I put example/c in a contract, because it claims it's a procedure. Any help?
I have changed the example.
Here is an example where the variable e has the contract example/c.
#lang racket
(struct Red () #:transparent)
(struct Blue (e1) #:transparent)
(struct Yellow (e1 e2) #:transparent)
(define example/c
(flat-murec-contract ([red/c (struct/c Red)]
[blue/c (struct/c Blue example/c)]
[yellow/c (struct/c Yellow example/c example/c)]
[example/c (or/c red/c blue/c yellow/c)])
example/c))
(define r (Red))
(define b (Blue r))
(define y (Yellow r b))
(define e y)
(provide (contract-out [e example/c]))
(match e
[(Red) (list "Red")]
[(Blue e1) (list "Blue" e1)]
[(Yellow e1 e2) (list "Yellow" e1 e2)]
[else "huh"])
If you change, say, (Yellow r b) to (Yellow r 42) then you get an error.

Can xmonad make our often used functions available anytime?

We have some functions needed to be called at any time and places. For example:
(defun winG (qty pb ps)
(- (* qty ps (- 1 SXF YHS)) (* 2 GHF) (* qty pb (+ 1 SXF))))
(defun winQ (qty pb ps)
(- (* qty ps (- 1 SXF)) (* 2 GHF) (* qty pb (+ 1 SXF))))
(defun stopLoss (qty pb &optional (lossRate 0.02))
(let ((tot (* qty pb (+ 1 SXF))))
(format t "Stop Loss at:~$~%" (- pb (/ (* tot lossRate) qty)))
(format t "Lost Money:~$(~d%)~%" (* tot lossRate) (* 100 lossRate))))
(defun div618 (p1 p2)
(let ((ratio '(0. 0.191 0.236 0.382 0.5 0.618 0.809 1.))
(price #'(lambda (r) (if (<= p1 p2) (+ p1 (* (- p2 p1) r)) (- p1 (* (- p1 p2) r))))))
(if (<= p1 p2)
(dolist (r (reverse ratio)) (format t "-------~3$ ~$-------~%" r (funcall price r)))
(dolist (r ratio) (format t "-------~3$ ~$-------~%" r (funcall price r))))))
Now we use stumpwm which can load our functions once started and we can call those functions just by striking hot key to open its eval window at any time and places. It is VERY convenient. However, the stumpwm is not VERY steady as xmonad. So we want to use xmonad instead of stumpwm and we donot mind to implement those Common Lisp functions using haskell.
Any suggestion is appreciated!
you should be able to do this via something like this
myKeys conf#(XConfig {XMonad.modMask = modm}) = M.fromList $
[ ((modm, xK_F1 ), spawn $ XMonad.terminal conf) --starts the terminal
, ((mod4Mask,xK_F1 ), spawn "gvim" >> -- you can concatenate commands too
spawn "notify-send -t 1000 gVim" >>
focusUrgent) -- or haskell functions
, ((mod4Mask,xK_F2 ), spawn "lisp evaluate lispy function")
, ((modm, xK_F3 ), haskellFunctionPortedFromLisp )]
hope this helps.

Kernighan & Ritchie word count example program in a functional language

I have been reading a little bit about functional programming on the web lately and I think I got a basic idea about the concepts behind it.
I'm curious how everyday programming problems which involve some kind of state are solved in a pure functional programing language.
For example: how would the word count program from the book 'The C programming Language' be implemented in a pure functional language?
Any contributions are welcome as long as the solution is in a pure functional style.
Here's the word count C code from the book:
#include <stdio.h>
#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */
/* count lines, words, and characters in input */
main()
{
int c, nl, nw, nc, state;
state = OUT;
nl = nw = nc = 0;
while ((c = getchar()) != EOF) {
++nc;
if (c == '\n')
++nl;
if (c == ' ' || c == '\n' || c = '\t')
state = OUT;
else if (state == OUT) {
state = IN;
++nw;
}
}
printf("%d %d %d\n", nl, nw, nc);
}
Basically, in a functional styly you'll want to divide the IO operation of getting your stream of data from the pure operation of some stateful transistion based on the current character and the current state.
The Haskell solution from Tikhon is very clean but performs three passes on your input data and will result in the entire input being contained in memory until the result is computed. You can process data incrementally, which I do below using the Text package but no other advanced Haskell tools (which could clean this up at the expense of understandability by non-Haskellers).
First we have our preamble:
{-# LANGUAGE BangPatterns #-}
import Data.Text.Lazy as T
import Data.Text.Lazy.IO as TIO
Then we define our data structure to hold the state of the process (number of characters, words, and lines along with the State IN/OUT):
data Counts = Cnt { nc, nl, nw :: !Int
, state :: State }
deriving (Eq, Ord, Show)
data State = IN | OUT
deriving (Eq, Ord, Show)
Now I define a "zero" state just for easy use. I'd normally make a number of helper functions or use a package like lense to make incrementing each field in the Counts structure simple, but will go without for this answer:
zeros :: Counts
zeros = Cnt 0 0 0 OUT
And now I translate your series of if/else statements into a pure state machine:
op :: Counts -> Char -> Counts
op c '\n' = c { nc = nc c + 1, nw = nw c + 1, nl = nl c + 1, state=OUT }
op c ch | ch == ' ' || ch == '\t' = c { nc = nc c + 1, state=OUT }
| state c == OUT = c { nc = nc c + 1, nw = nw c + 1, state = IN }
| otherwise = c { nc = nc c + 1 }
Finally the main function just gets the input stream and folds our operation over the characters:
main = do
contents <- TIO.getContents
print $ T.foldl' op zeros contents
EDIT: You mentioned not understanding the syntax. Here is an even simpler version that I will explain:
import Data.Text.Lazy as T
import Data.Text.Lazy.IO as TIO
op (nc, nw, nl, st) ch
| ch == '\n' = (nc + 1, nw + 1 , nl + 1, True)
| ch == ' ' || ch == '\t' = (nc + 1, nw , nl , True)
| st = (nc + 1, nw + 1 , nl , False)
| otherwise = (nc + 1, nw , nl , st)
main = do
contents <- TIO.getContents
print $ T.foldl' op (0,0,0,True) contents
The import statements give us access to the getContents and foldl' functions we use.
The op function uses a bunch of guards - parts like | ch = '\n' - which is basically like a C if/elseif/else series.
The tuples ( ... , ... , ... , ... ) contain all our state. Haskell variables are immutable, so we make new tuples by adding one (or not) to the values of the previous variables.
A simple way to do it would be to read in the input and then use some simple functions to get the line/word/character count. Something like this would work:
count :: String -> (Int, Int, Int)
count str = (length $ lines str, length $ words str, length str)
main :: IO ()
main = fmap count getContents >>= print
This isn't exactly the same, but it's close.
This works really simply. Given a string, we can turn it into a list of lines with the standard lines function and a list of words with the standard words function. Since String is just [Char], length returns the number of characters. This is how we get the three counts. (For reference, length $ lines str is the same as length (lines str).)
The important idea is how the IO--reading the input and printing it out--is separated from the actual logic.
Also, instead of going through the input character by character keeping track of some state, we get the actual numbers by applying simple functions to the input. These functions are all just compositions of standard library functions.
In your loop there are four state variables, nc, nw, nl and state, plus the next character c. The loop remembers nc, nw, nl and state from the last time through the loop, and c changes each iteration through the loop. Imagine instead that you take those variables out of the loop and put them in a vector: [state, nc, nw, nl]. Then you change your loop construct into a function that takes two arguments, the first being a vector [state, nc, nw, nl], and the second being c, and returns a new vector with the updated values of nc, nw, nl and state. In C-ish pseudocode:
f([state, nc, nw, nl], c) {
++nc;
if (c == '\n')
++nl;
if (c == ' ' || c == '\n' || c = '\t')
state = OUT;
else if (state == OUT) {
state = IN;
++nw;
}
return [state, nc, nw, nl];
}
Now you can call that function with the vector [OUT, 0, 0, 0] and the first character in the string ("hello, world", say), and it will return a new vector [IN, 1, 0, 0]. Call f again with this new vector and the second character 'e', and it returns [IN, 2, 0, 0]. Repeat for the rest of the characters in the string, and the last call will return [IN, 12, 2, 0], identical to the values printed by the C code.
The basic idea is that you take the state variables out of the loop, turn the guts of the loop into a function, and pass the vector of state variables and the next input in as arguments to that function, and return a new state vector as a result. There is a function called reduce that does this.
Here's how you would do it in Clojure (formatted to emphasize the vectors returned):
(defn f [[state nc nw nl] c]
(let [nl (if (= c \n)(inc nl) nl)]
(cond
(or (= c \space)(= c \n)(= c \t)) [:out (inc nc) nw nl]
(= state :out) [:in (inc nc) (inc nw) nl]
true [state (inc nc) nw nl]
)))
(defn wc [s] (reduce f [:out 0 0 0] s))
(wc "hello, world")
which returns (and prints in the repl) [:in 12 2 0]
Here's my shot at a purely functional, strict, single-pass, tail-recursive solution in Scheme:
(define (word-count input-port)
(let loop ((c (read-char input-port))
(nl 0)
(nw 0)
(nc 0)
(state 'out))
(cond ((eof-object? c)
(printf "nl: ~s, nw: ~s, nc: ~s\n" nl nw nc))
((char=? c #\newline)
(loop (read-char input-port) (add1 nl) nw (add1 nc) 'out))
((char-whitespace? c)
(loop (read-char input-port) nl nw (add1 nc) 'out))
((eq? state 'out)
(loop (read-char input-port) nl (add1 nw) (add1 nc) 'in))
(else
(loop (read-char input-port) nl nw (add1 nc) state)))))
word-count receives an input port as a parameter; notice that no additional data structures are created (structs, tuples, vectors, etc.) instead, all state is kept in parameters. As an example, for counting the words in a file containing this:
hello, world
Call the procedure like this:
(call-with-input-file "/path/to/file" word-count)
> nl: 0, nw: 2, nc: 12
Common Lisp is mentioned, but it is not a pure functional programming language and it does not support TCO in its standard. Individual implementations do.
Tail recursive version, if the compiler supports it:
(defun word-count (&optional (stream *standard-input*))
(labels ((word-count-aux (in-p chars words lines)
(case (read-char stream nil :eof)
(:eof (values chars words lines))
(#\newline (word-count-aux nil (1+ chars) words (1+ lines)))
((#\space #\tab) (word-count-aux nil (1+ chars) words lines))
(otherwise (word-count-aux t
(1+ chars)
(if in-p words (1+ words))
lines)))))
(word-count-aux nil 0 0 0)))
But since TCO is not in the standard, a portable version would look more like this:
(defun word-count (&optional (stream *standard-input*)
&aux (in-p nil) (chars 0) (words 0) (lines 0) char)
(loop while (setf char (read-char stream nil nil)) do
(case char
(#\newline (setf in-p nil) (incf lines))
((#\space #\tab) (setf in-p nil))
(otherwise (unless in-p (incf words)) (setf in-p t)))
(incf chars))
(values chars words lines))
Above is no longer Functional.
We can replace the loop with a higher-order stream-map:
(defun stream-map (function stream)
(loop for char = (read-char stream nil nil)
while char do (funcall function char)))
(defun word-count (&optional (stream *standard-input*)
&aux (in-p nil) (chars 0) (words 0) (lines 0) char)
(stream-map (lambda (char)
(incf chars)
(when (eql char #\newline)
(incf lines))
(if (member char '(#\space #\newline #\tab))
(setf in-p nil)
(unless in-p
(incf words)
(setf in-p t))))
stream)
(values chars words lines))
The state is modified by the closure. To get rid of that we can implement a stream-reduce.
(defun stream-reduce (function stream &key initial-value)
(let ((value initial-value))
(loop for char = (read-char stream nil nil)
while char
do (setf value (funcall function value char)))
value))
(defun word-count (&optional (stream *standard-input*))
(rest (stream-reduce
(lambda (state char)
(destructuring-bind (in-p chars words lines) state
(case char
(#\newline (list nil (1+ chars) words (1+ lines)))
((#\space #\tab) (list nil (1+ chars) words lines))
(otherwise (list t
(1+ chars)
(if in-p words (1+ words))
lines)))))
stream
:initial-value (list nil 0 0 0))))
Here is a Scheme version of the program, from my blog, which implements the entire Unix word count program, including argument- and file-handling. The key function is wc, which is purely functional. It moves all local variables into the arguments of a local function (defined via named-let), which is the standard idiom for converting an imperative loop to functional style. The man page and code appear below:
NAME
wc -- word count
SYNOPSIS
wc [ -lwc ] [ name ... ]
DESCRIPTION
Wc counts lines, words and characters in the named files,
or in the standard input if no name appears. A word is a
maximal string of characters delimited by spaces, tabs or
newlines.
If the optional argument is present, just the specified
counts (lines, words, or characters) are selected by the
letters l, w or c.
#! /usr/bin/scheme --script
(define l-flag #t)
(define w-flag #t)
(define c-flag #t)
(define (update-flags fs)
(if (not (member #\l fs)) (set! l-flag #f))
(if (not (member #\w fs)) (set! w-flag #f))
(if (not (member #\c fs)) (set! c-flag #f)))
(define (put-dec n width)
(let* ((n-str (number->string n)))
(display (make-string (- width (string-length n-str)) #\space))
(display n-str)))
(define (wc)
(let loop ((inword #f) (c (read-char)) (ls 0) (ws 0) (cs 0))
(cond ((eof-object? c) (values ls ws cs))
((char=? c #\newline)
(loop #f (read-char) (add1 ls) ws (add1 cs)))
((not (member c '(#\space #\newline #\tab)))
(if inword
(loop #t (read-char) ls ws (add1 cs))
(loop #t (read-char) ls (add1 ws) (add1 cs))))
(else (loop #f (read-char) ls ws (add1 cs))))))
(define (main args)
(when (and (pair? args) (char=? (string-ref (car args) 0) #\-))
(update-flags (cdr (string->list (car args))))
(set! args (cdr args)))
(if (null? args)
(let-values (((ls ws cs) (wc)))
(when l-flag (display ls) (display " "))
(when w-flag (display ws) (display " "))
(when c-flag (display cs) (display " "))
(newline))
(let loop ((args args) (l-tot 0) (w-tot 0) (c-tot 0))
(if (null? args)
(begin (when l-flag (put-dec l-tot 12))
(when w-flag (put-dec w-tot 12))
(when c-flag (put-dec c-tot 12)))
(with-input-from-file (car args)
(lambda ()
(let-values (((ls ws cs) (wc)))
(when l-flag (put-dec ls 12))
(when w-flag (put-dec ws 12))
(when c-flag (put-dec cs 12))
(display " ") (display (car args)) (newline)
(loop (cdr args) (+ l-tot ls) (+ w-tot ws) (+ c-tot cs)))))))))
(main (cdr (command-line)))
Here's a solution based on the Clojure example posted here but in CL using recursion.
(defstruct (state (:constructor make-state (state chars words lines)))
state chars words lines)
(defun wc (state stream)
(symbol-macrolet ((s (state-state state))
(c (state-chars state))
(w (state-words state))
(l (state-lines state)))
(case (read-char stream nil :eof)
(:eof state)
(#\Newline (wc (make-state :out (1+ c) w (1+ l)) stream))
(#\Space (wc (make-state :out (1+ c) w l) stream))
(t (if (eq s :out)
(wc (make-state :in (1+ c) (1+ w) l) stream)
(wc (make-state :in (1+ c) w l) stream))))))
(with-input-from-string (stream "Hello Functional Programming World")
(wc (make-state :out 0 0 0) stream))
;;; => #S(STATE :STATE :IN :CHARS 34 :WORDS 4 :LINES 0)
I believe you could write this somewhat more elegantly while still only iterating over the input once, but you'll need to make GHC do more work, certainly use -O2.
I have not yet compiled this code, much less compared it's speed vs. Thomas DuBuisson's answer, but this should indicate the basic direction.
{-# LANGUAGE BangPatterns #-}
import Data.List
wordcount = snd . foldl' go (False,0)
where go (!b,!n) !c = if elem c [' ','\t','\n'] then (False,n)
else (True, n + if b then 0 else 1)
linecount = foldl' go 0
where go !n !c = n + if c == '\n' then 1 else 0
main = interact $ show . go
where go x = (linecount x, wordcount x, foldl' (\!n _ ->n+1) 0 x)
If I understand stream fusion correctly, then GHC should inline wordcount and linecount into main, merge the three foldl' commands into one, well hopefully, and start rearranging the if checks. I'd hope it'd inlined elem and foldl' too of course.
If not, you could certainly force inlining and probably create a simple fusion rule, but maybe the defaults suffice. Or maybe some simple alterations produce the desired effect.
Btw, I have written foldl' (\n _ ->n+1) 0 x only because I've heard bad storied about length, but maybe length works fine, another change worth profiling.
In Haskell using strict IO rather than lazy. Does words only but you can easily implement characters and lines on top of this. Requires the text and conduit packages:
module Main
where
import Control.Applicative
import Data.Conduit
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL
import qualified Data.Conduit.Text as CT
import qualified Data.Text as T
import System.Environment
main :: IO ()
main = do args <- getArgs
print <$> (runResourceT $
CB.sourceFile (args !! 0)
$$ CB.lines
=$= CT.decode CT.utf8
=$= CL.map T.words
=$ CL.fold (\acc words -> acc + length words) 0)
Here's a version in Typed Racket using match and the for loop macros:
(: word-count : Input-Port -> Void)
(define (word-count in)
(define-values (nl nw nc st)
(for/fold: ([nl : Integer 0] [nw : Integer 0] [nc : Integer 0]
[state : (U 'in 'out) 'out])
([c (in-input-port-chars in)])
(match* (c state)
[(#\newline _) (values (add1 nl) nw (add1 nc) 'out)]
[((? char-whitespace?) _)
(values (add1 nl) nw (add1 nc) 'out)]
[(_ 'out) (values nl (add1 nw) (add1 nc) 'in)]
[(_ _) (values nl nw (add1 nc) state)])))
(printf "nl: ~s, nw: ~s, nc: ~s\n" nl nw nc))
Here is a Haskell implementation, where I have tried to stay close to the approach followed by the original C program. Iterations often become fold operations, with state-containing variables ending up as the first argument to the operation passed to fold.
-- Count characters, words, and lines in an input string.
wordCount::String->(Int, Int, Int)
wordCount str = (c,w,l)
where (inWord,c,w,l) = foldl op (False,0,0,1) str
where op (inWord,c,w,l) next | next == '\n' = (False,c+1,w,l+1)
| next == '\t' || next == ' ' = (False,c+1,w,l)
| inWord == False = (True,c+1,w+1,l)
| otherwise = (True,c+1,w,l)
main = interact $ show . wordCount

Resources