What are structures with "subtraction" but no inverse? - haskell

A group extends the idea of a monoid to allow for inverses. This allows for:
gremove :: (Group a) => a -> a -> a
gremove x y = x `mappend` (invert y)
But what about structures like natural numbers, where there is no inverse? I'm thinking about:
class (Monoid a) => MRemove a where
    mremove :: a -> a -> a
with laws:
x `mremove` x = mempty
x `mremove` mempty = x
(x `mappend` y) `mremove` y = x
And additionally:
class (MRemove a) => Group a where
    invert :: a -> a
    invert x = mempty `mremove` x
-- | For defining MRemove in terms of Group
defaultMRemove :: (Group a) => a -> a -> a
defaultMRemove x y = x `mappend` (invert y)
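For concreteness, here is a sketch of an instance for natural numbers (using Numeric.Natural and a hypothetical Nat wrapper), where mremove is truncated subtraction ("monus"):
import Numeric.Natural (Natural)

newtype Nat = Nat Natural deriving (Eq, Show)

instance Semigroup Nat where
  Nat x <> Nat y = Nat (x + y)

instance Monoid Nat where
  mempty = Nat 0

instance MRemove Nat where
  -- truncated subtraction: results below zero are clamped to zero
  Nat x `mremove` Nat y
    | x >= y    = Nat (x - y)
    | otherwise = Nat 0
All three laws hold, but there is no invert: mempty `mremove` x is mempty for every x.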
So, my question is: what is MRemove?

The closest common structure I can think of is a torsor, but it doesn't really apply to naturals in an obvious way. Think of the operations you can perform on time values:
"Subtract" two times, yielding an interval of time (a different type)
Add an interval of time to a time to get another time
Add or subtract intervals of time to get another interval
Very few other operations on pairs of time values make sense. You can't add times, or multiply them, or anything we're used to in algebra. On the other hand, the interval type is much more flexible, supporting addition, subtraction, inversion, and so on. A torsor could thus be defined in Haskell as:
class Group (Diff a) => Torsor a where
    type Diff a
    subtract :: a -> a -> Diff a
    add :: a -> Diff a -> a
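A rough sketch of an instance (with made-up Time and Interval types; assuming a Group class that provides invert, plus the TypeFamilies extension; subtract would clash with the Prelude in a real module):
newtype Time     = Time Double     deriving (Eq, Show)
newtype Interval = Interval Double deriving (Eq, Show)

instance Semigroup Interval where
  Interval a <> Interval b = Interval (a + b)

instance Monoid Interval where
  mempty = Interval 0

instance Group Interval where
  invert (Interval a) = Interval (negate a)

instance Torsor Time where
  type Diff Time = Interval
  subtract (Time a) (Time b) = Interval (a - b)
  add (Time a) (Interval d)  = Time (a + d)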
Anyway, that's an attempt at answering your direct question (you can find more at John Baez's excellent page on them), even though it doesn't cover your natural example.
The only other thing that comes close to answering your question, as far as I know, is the solution to code reuse in Coq's (semi)ring solver tactic. They introduce a notion of an "almost ring" with axioms similar to the ones you describe, to allow them to reuse most of their code for naturals as well as full rings. I don't think the idea is very widespread, though.

The name you're looking for is cancellative monoid, though strictly speaking a cancellative semigroup is enough to capture the concept of subtraction. I was wondering about the very same question a year or so ago, and I found the answer by digging through mathematical jargon. Have a look at the CancellativeMonoid class in the incremental-parser package. I'm currently preparing a new package that would contain only the monoid subclasses and a few of their instances, and I hope to release it soon.
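The key property is cancellation: a <> b == a <> c implies b == c, which is exactly what makes a (partial) subtraction well defined. A hand-rolled sketch of the idea (only an illustration, not the incremental-parser API):
import Data.List (stripPrefix)

class Monoid a => LeftCancellative a where
  -- law: (p <> s) `minusPrefix` p == Just s
  minusPrefix :: a -> a -> Maybe a

instance Eq x => LeftCancellative [x] where
  minusPrefix whole prefix = stripPrefix prefix whole
Lists form a cancellative monoid under (++) even though no non-empty list has an inverse.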

A similar question has been asked here. The answer given there is a commutative monoid with monus.

EDIT: This answer is wrong. See my comment below. I'm preserving the answer in case it is interesting.
Take a look at subtraction semigroups. It's a semigroup with a subtraction operator that obeys these laws:
x - (y - x) = x
x - (x - y) = y - (y - x)
(x - y) - z = (x - z) - y
x <> (y - z) = (x <> y) - (x <> z)
(y - z) <> x = (y <> x) - (z <> x)
Sadly, I cannot find resources that discuss a "subtraction monoid", but I assume it would need to obey the following additional law:
x - x = 0

Related

Using recursion schemes in Haskell for solving change making problem

I'm trying to understand histomorphisms from this blog on recursion schemes. I'm facing a problem when I'm running the example to solve the change making problem as mentioned in the blog.
Change making problem takes the denominations for a currency and tries to find the minimum number of coins required to create a given sum of money. The code below is taken from the blog and should compute the answer.
{-# LANGUAGE DeriveFunctor #-}
module Main where

import Control.Arrow ( (>>>) )
import Data.List ( partition )
import Prelude hiding (lookup)

newtype Term f = In {out :: f (Term f)}

data Attr f a = Attr
  { attribute :: a
  , hole      :: f (Attr f a)
  }

type CVAlgebra f a = f (Attr f a) -> a

histo :: Functor f => CVAlgebra f a -> Term f -> a
histo h = out >>> fmap worker >>> h
  where
    worker t = Attr (histo h t) (fmap worker (out t))

type Cent = Int

coins :: [Cent]
coins = [50, 25, 10, 5, 1]

data Nat a
  = Zero
  | Next a
  deriving (Functor)

-- Convert from a natural number to its foldable equivalent, and vice versa.
expand :: Int -> Term Nat
expand 0 = In Zero
expand n = In (Next (expand (n - 1)))

compress :: Nat (Attr Nat a) -> Int
compress Zero              = 0
compress (Next (Attr _ x)) = 1 + compress x

change :: Cent -> Int
change amt = histo go (expand amt)
  where
    go :: Nat (Attr Nat Int) -> Int
    go Zero = 1
    go curr@(Next attr) =
      let given               = compress curr
          validCoins          = filter (<= given) coins
          remaining           = map (given -) validCoins
          (zeroes, toProcess) = partition (== 0) remaining
          results             = sum (map (lookup attr) toProcess)
      in length zeroes + results

lookup :: Attr Nat a -> Int -> a
lookup cache 0 = attribute cache
lookup cache n = lookup inner (n - 1) where (Next inner) = hole cache
Now if you evaluate change 10 it will give you 3.
Which is... incorrect because you can make 10 using 1 coin of value 10.
So I considered that maybe it's solving the coin change counting problem, which finds the number of ways in which you can make the given sum of money. For example, you can make 10 in 4 ways: { 1, 1, ... 10 times }, { 1, 1, 1, 1, 1, 5 }, { 5, 5 }, { 10 }.
So what is wrong with this piece of code? Where is it going wrong in solving the problem?
TLDR
The above piece of code from this blog on recursion schemes is not finding minimum or maximum ways to change a sum of money. Why is it not working?
I put some more thought into encoding this problem with recursion schemes. Maybe there's a good way to solve the unordered problem (i.e., considering 5c + 1c to be the same as 1c + 5c) using a histomorphism to cache the undirected recursive calls, but I don't know what it is. Instead, I looked for a way to use recursion schemes to implement the dynamic-programming algorithm, where the search tree is probed in a specific order so that you're sure you never visit any node more than once.
The tool that I used is the hylomorphism, which comes up a bit later in the article series you're reading. It composes an unfold (anamorphism) with a fold (catamorphism). A hylomorphism uses ana to build up an intermediate structure, and then cata to tear it down into a final result. In this case, the intermediate structure I used describes a subproblem. It has two constructors: either the subproblem is solved already, or there is some amount of money left to make change for, and a pool of coin denominations to use:
data ChangePuzzle a = Solved Int
                    | Pending {spend, forget :: a}
                    deriving Functor
type Cent = Int
type ChangePuzzleArgs = ([Cent], Cent)
We need a coalgebra that turns a single problem into subproblems:
divide :: Coalgebra ChangePuzzle ChangePuzzleArgs
divide (_, 0) = Solved 1
divide ([], _) = Solved 0
divide (coins@(x:xs), n)
  | n < 0     = Solved 0
  | otherwise = Pending (coins, n - x) (xs, n)
I hope the first three cases are obvious. The last case is the only one with multiple subproblems. We can either use one coin of the first listed denomination, and continue to make change for that smaller amount, or we can leave the amount the same but reduce the list of coin denominations we're willing to use.
The algebra for combining subproblem results is much simpler: we simply add them up.
conquer :: Algebra ChangePuzzle Int
conquer (Solved n) = n
conquer (Pending a b) = a + b
I originally tried to write conquer = sum (with the appropriate Foldable instance), but this is incorrect. We're not summing up the a types in the subproblem; rather, all the interesting values are in the Int field of the Solved constructor, and sum doesn't look at those because they're not of type a.
Finally, we let recursion schemes do the actual recursion for us with a simple hylo call:
waysToMakeChange :: ChangePuzzleArgs -> Int
waysToMakeChange = hylo conquer divide
And we can confirm it works in GHCI:
*Main> waysToMakeChange (coins, 10)
4
*Main> waysToMakeChange (coins, 100)
292
Whether you think this is worth the effort is up to you. Recursion schemes have saved us very little work here, as this problem is easy to solve by hand. But you may find reifying the intermediate states makes the recursive structure explicit, instead of implicit in the call graph. Anyway it's an interesting exercise if you want to practice recursion schemes in preparation for more complicated tasks.
The full, working file is included below for convenience.
{-# LANGUAGE DeriveFunctor #-}

import Control.Arrow ( (>>>), (<<<) )

newtype Term f = In {out :: f (Term f)}

type Algebra f a = f a -> a
type Coalgebra f a = a -> f a

cata :: (Functor f) => Algebra f a -> Term f -> a
cata fn = out >>> fmap (cata fn) >>> fn

ana :: (Functor f) => Coalgebra f a -> a -> Term f
ana f = In <<< fmap (ana f) <<< f

hylo :: Functor f => Algebra f b -> Coalgebra f a -> a -> b
hylo alg coalg = ana coalg >>> cata alg

data ChangePuzzle a = Solved Int
                    | Pending {spend, forget :: a}
                    deriving Functor

type Cent = Int
type ChangePuzzleArgs = ([Cent], Cent)

coins :: [Cent]
coins = [50, 25, 10, 5, 1]

divide :: Coalgebra ChangePuzzle ChangePuzzleArgs
divide (_, 0) = Solved 1
divide ([], _) = Solved 0
divide (coins@(x:xs), n)
  | n < 0     = Solved 0
  | otherwise = Pending (coins, n - x) (xs, n)

conquer :: Algebra ChangePuzzle Int
conquer (Solved n) = n
conquer (Pending a b) = a + b

waysToMakeChange :: ChangePuzzleArgs -> Int
waysToMakeChange = hylo conquer divide
The initial confusion with the blog post arose because it was pointing to a different problem in the Wikipedia link.
Taking another look at change, it's trying to find the number of "ordered" ways of making change for a given value. This means that the ordering of coins matters. The correct value of change 10 should be 9.
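As a quick sanity check, here is a direct definition of that ordered count, written without recursion schemes (only to confirm the expected number):
orderedWays :: Int -> Int
orderedWays 0 = 1
orderedWays n = sum [orderedWays (n - c) | c <- [50, 25, 10, 5, 1], c <= n]
orderedWays 10 indeed evaluates to 9.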
Coming back to the problem, the main issue is with the implementation of the lookup method. The key point to note is that lookup is backwards, i.e. to calculate the contribution of a denomination to the sum, the denomination itself should be passed as the argument to lookup, not its difference from the given value.
-- To find the contribution of 5 to the number of ways we can
-- change 15, we should pass the cache of 15 and 5 as the
-- parameters. The cache will then be unrolled 5 times to
-- get to the value cached for 10.
lookup :: Attr Nat a -- ^ cache
       -> Int        -- ^ how much to roll back
       -> a
lookup cache 1 = attribute cache
lookup cache n = lookup inner (n - 1) where (Next inner) = hole cache
The complete solution is described in this issue by @howsiwei.
Edit: Based on discussion in the comments, this can be solved using histomorphisms, but with a few challenges
It can be solved using histomorphisms but the cache and functor types will need to be more complex to hold more state. Namely -
The cache will need to keep a list of permitted denominations for a particular amount; this will allow us to eliminate overlap
The harder challenge is to come up with a functor that can order all the information. Nat will not be sufficient because it cannot distinguish between different values of a complex cache type.
I see two problems with this program. One of them I know how to fix, but the other apparently requires more knowledge of recursion schemes than I have.
The one I can fix is that it's looking up the wrong values in its cache. When given = 10, of course validCoins = [10,5,1], and so we find (zeroes, toProcess) = ([0], [5,9]). So far so good: we can give a dime directly, or give a nickel and then make change for the remaining five cents, or we can give a penny and change the remaining nine cents. But then when we write lookup attr 9, we're saying "look 9 steps in history to when curr = 1", where what we meant was "look 1 step into history to when curr = 9". As a result we drastically undercount in pretty much all cases: even change 100 is only 16, while a Google search claims the right result is 292 (I haven't verified this today by implementing it myself).
There are a few equivalent ways to fix this; the smallest diff would be to replace
results = sum (map (lookup attr) toProcess)
with
results = sum (map (lookup attr . (given -)) toProcess)
The second problem is: the values in the cache are wrong. As I mentioned in a comment on the question, this counts different orderings of the same denominations as separate answers to the question. After I fix the first problem, the lowest input where this second problem manifests is 7, with the incorrect result change 7 = 3. If you try change 100 I don't know how long it takes to compute: much longer than it should, probably a very long time. But even a modest value like change 30 yields a number that's much larger than it should be.
I don't see a way to fix this without a substantial algorithm rework. Traditional dynamic-programming solutions to this problem involve producing the solutions in a specific order so you can avoid double-counting. i.e., they first decide how many dimes to use (here, 0 or 1), then compute how to make change for the remaining amounts without using any dimes. I don't know how to work that idea in here - your cache key would need to be larger, including both the target amount and also the allowed set of coins.

Is the following code really currying in haskell?

I am trying to understand currying by reading various blogs and Stack Overflow answers, and I think I understand it somewhat. In Haskell, every function is curried; that means when you have a function like f x y = x + y
it really is ((f x) y)
Here the function first takes the parameter x; partially applying f to x returns a function that takes just y, a single parameter, and applying that function produces the result. In both cases each function takes only one parameter, and the process of reducing a function to one that takes a single parameter at a time is called 'currying'. Correct me if my understanding is wrong here.
So if it is correct, could you please tell me if the functions 'two' and 'three' are curried functions?
three x y z = x + y + z
two = three 1
same = two 1
In this case, I have two specialized functions, 'two' and 'same', each obtained by fixing one parameter - so are they curried?
Let's look at two first.
It has a signature of
two :: Num a => a -> a -> a
forget the Num a for now (it's only a constraint on a - you can read Int here).
Surely this too is a curried function.
The next one is more interesting:
same :: Num a => a -> a
(btw: nice name - it's the same but not exactly id ^^)
TBH: I don't know for sure.
The best definition I know of a curried function is this:
A curried function is a function of N arguments returning another function of (N-1) arguments.
(if you want you can extend this to fully curried functions of course)
This will only fit if you define constants as functions with 0 parameters - which you surely can.
So I would say yes(?) this too is a curried function but only in a mathy borderline way (like the sum of 0 numbers is defined to be 0)
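To make the partial applications concrete, here is what those definitions compute (a hypothetical GHCi session, assuming the definitions from the question):
*Main> three 1 2 3
6
*Main> two 2 3
6
*Main> same 3
5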
Best just think about this equationally. The following are all equivalent definitions:
f x y z = x+y+z
f x y = \z -> x+y+z
f x = \y -> (\z -> x+y+z)
f = \x -> (\y -> (\z -> x+y+z))
Partial application is only tangentially relevant here. Most often you don't want the actual partial application to be performed and the actual lambda object to be created in memory - hoping instead that the compiler will employ - and optimize better - the full definition at the final point of full application.
The presence of the functions curry/uncurry is yet another confusing issue. Both f (x,y) = ... and f x y = ... are curried in Haskell, of course, but in our heads we tend to think about the first as a function of two arguments, so the functions translating between the two forms are named curry and uncurry, as a mnemonic.
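For reference, those two Prelude functions have the types
curry   :: ((a, b) -> c) -> a -> b -> c
uncurry :: (a -> b -> c) -> (a, b) -> c
so, for example, uncurry (+) (1, 2) is 3 and curry fst 1 2 is 1.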
You could think of the three function, written with anonymous functions, as:
three = \x -> (\y -> (\z -> x + y + z))

How to compare two functions for equivalence, as in (λx.2*x) == (λx.x+x)?

Is there a way to compare two functions for equality? For example, (λx.2*x) == (λx.x+x) should return true, because those are obviously equivalent.
It's pretty well known that function equality is undecidable in general, so you'll have to pick a subset of the problem that you're interested in. You might consider some of these partial solutions:
Presburger arithmetic is a decidable fragment of first-order logic + arithmetic.
The universe package offers function equality tests for total functions with finite domain.
You can check that your functions are equal on a whole bunch of inputs and treat that as evidence for equality on the untested inputs; check out QuickCheck (there's a small sketch after this list).
SMT solvers make a best effort, sometimes responding "don't know" instead of "equal" or "not equal". There are several bindings to SMT solvers on Hackage; I don't have enough experience to suggest a best one, but Thomas M. DuBuisson suggests sbv.
There's a fun line of research on deciding function equality and other things on compact functions; the basics of this research is described in the blog post Seemingly impossible functional programs. (Note that compactness is a very strong and very subtle condition! It's not one that most Haskell functions satisfy.)
If you know your functions are linear, you can find a basis for the source space; then every function has a unique matrix representation.
You could attempt to define your own expression language, prove that equivalence is decidable for this language, and then embed that language in Haskell. This is the most flexible but also the most difficult way to make progress.
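For the QuickCheck route mentioned above, a minimal sketch looks like this (it only gathers evidence on sampled inputs; it proves nothing):
import Test.QuickCheck (quickCheck)

prop_doubleIsAddSelf :: Integer -> Bool
prop_doubleIsAddSelf x = 2 * x == x + x

main :: IO ()
main = quickCheck prop_doubleIsAddSelf  -- typically reports: +++ OK, passed 100 tests.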
This is undecidable in general, but for a suitable subset, you can indeed do it today effectively using SMT solvers:
$ ghci
GHCi, version 8.0.1: http://www.haskell.org/ghc/ :? for help
Prelude> :m Data.SBV
Prelude Data.SBV> (\x -> 2 * x) === (\x -> x + x :: SInteger)
Q.E.D.
Prelude Data.SBV> (\x -> 2 * x) === (\x -> 1 + x + x :: SInteger)
Falsifiable. Counter-example:
s0 = 0 :: Integer
For details, see: https://hackage.haskell.org/package/sbv
In addition to practical examples given in the other answer, let us pick the subset of functions expressible in typed lambda calculus; we can also allow product and sum types. Although checking whether two functions are equal can be as simple as applying them to a variable and comparing results, we cannot build the equality function within the programming language itself.
ETA: λProlog is a logic programming language for manipulating (typed lambda calculus) functions.
2 years have passed, but I want to add a little remark to this question. Originally, I asked if there is any way to tell if (λx.2*x) is equal to (λx.x+x). Addition and multiplication on the λ-calculus can be defined as:
add = (λa b c d -> (a c (b c d)))
mul = (λa b c -> (a (b c)))
Now, if you normalize the following terms:
add_x_x = (λx . (add x x))
mul_x_2 = (mul (λf x . (f (f x))))
You get:
result = (a b c -> (a b (a b c)))
For both programs. Since their normal forms are equal, both programs are obviously equal. While this doesn't work in general, it does work for many terms in practice. (λx.(mul 2 (mul 3 x))) and (λx.(mul 6 x)) both have the same normal form, for example.
In a language with symbolic computation like Mathematica, or in C# with a computer algebra library:
MathObject f(MathObject x) => x + x;
MathObject g(MathObject x) => 2 * x;
{
var x = new Symbol("x");
Console.WriteLine(f(x) == g(x));
}
The above displays 'True' at the console.
Proving two functions equal is undecidable in general but one can still prove functional equality in special cases as in your question.
Here's a sample proof in Lean
def foo : (λ x, 2 * x) = (λ x, x + x) :=
begin
  apply funext, intro x,
  cases x,
  { refl },
  { simp,
    dsimp [has_mul.mul, nat.mul],
    have zz : ∀ a : nat, 0 + a = a := by simp,
    rw zz }
end
One can do the same in other dependently typed language such as Coq, Agda, Idris.
The above is a tactic style proof. The actual definition of foo (the proof) that gets generated is quite a mouthful to be written by hand:
def foo : (λ (x : ℕ), 2 * x) = λ (x : ℕ), x + x :=
funext
(λ (x : ℕ),
nat.cases_on x (eq.refl (2 * 0))
(λ (a : ℕ),
eq.mpr
(id_locked
((λ (a a_1 : ℕ) (e_1 : a = a_1) (a_2 a_3 : ℕ) (e_2 : a_2 = a_3), congr (congr_arg eq e_1) e_2)
(2 * nat.succ a)
(nat.succ a * 2)
(mul_comm 2 (nat.succ a))
(nat.succ a + nat.succ a)
(nat.succ a + nat.succ a)
(eq.refl (nat.succ a + nat.succ a))))
(id_locked
(eq.mpr
(id_locked
(eq.rec (eq.refl (0 + nat.succ a + nat.succ a = nat.succ a + nat.succ a))
(eq.mpr
(id_locked
(eq.trans
(forall_congr_eq
(λ (a : ℕ),
eq.trans
((λ (a a_1 : ℕ) (e_1 : a = a_1) (a_2 a_3 : ℕ) (e_2 : a_2 = a_3),
congr (congr_arg eq e_1) e_2)
(0 + a)
a
(zero_add a)
a
a
(eq.refl a))
(propext (eq_self_iff_true a))))
(propext (implies_true_iff ℕ))))
trivial
(nat.succ a))))
(eq.refl (nat.succ a + nat.succ a))))))

Implementing Iota in Haskell

Iota is a ridiculously small "programming language" using only one combinator. I'm interested in understanding how it works, but it would be helpful to see the implementation in a language I'm familiar with.
I found an implementation of the Iota programming language written in Scheme. I've been having a little trouble translating it to Haskell though. It's rather simple, but I'm relatively new to both Haskell and Scheme.
How would you write an equivalent Iota implementation in Haskell?
(let iota ()
  (if (eq? #\* (read-char)) ((iota) (iota))
      (lambda (c) ((c (lambda (x) (lambda (y) (lambda (z) ((x z) (y z))))))
                   (lambda (x) (lambda (y) x))))))
I've been teaching myself some of this stuff, so I sure hope I get the following right...
As n.m. mentions, the fact that Haskell is typed is of enormous importance to this question; type systems restrict what expressions can be formed, and in particular the most basic type systems for the lambda calculus forbid self-application, which ends up giving you a non-Turing complete language. Turing completeness is added on top of the basic type system as an extra feature to the language (either a fix :: (a -> a) -> a operator or recursive types).
This doesn't mean you can't implement this at all in Haskell, but rather that such an implementation is not going to have just one operator.
Approach #1: implement the second example one-point combinatory logic basis from here, and add a fix function:
iota' :: ((t1 -> t2 -> t1)
          -> ((t5 -> t4 -> t3) -> (t5 -> t4) -> t5 -> t3)
          -> (t6 -> t7 -> t6)
          -> t)
      -> t
iota' x = x k s k
  where k x y = x
        s x y z = x z (y z)

fix :: (a -> a) -> a
fix f = let result = f result in result
Now you can write any program in terms of iota' and fix. Explaining how this works is a bit involved. (EDIT: note that this iota' is not the same as the λx.x S K in the original question; it's λx.x K S K, which is also Turing-complete. It is the case that iota' programs are going to be different from iota programs. I've tried the iota = λx.x S K definition in Haskell; it typechecks, but when you try k = iota (iota (iota iota)) and s = iota (iota (iota (iota iota))) you get type errors.)
Approach #2: Untyped lambda calculus denotations can be embedded into Haskell using this recursive type:
newtype D = In { out :: D -> D }
D is basically a type whose elements are functions from D to D. We have In :: (D -> D) -> D to convert a D -> D function into a plain D, and out :: D -> (D -> D) to do the opposite. So if we have x :: D, we can self-apply it by doing out x x :: D.
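For instance, the untyped self-application combinator ω = λx. x x and the looping term Ω = ω ω can be written as (a sketch; forcing bigOmega loops forever):
omega :: D
omega = In (\x -> out x x)

bigOmega :: D
bigOmega = out omega omega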
Given that, now we can write:
iota :: D
iota = In $ \x -> out (out x s) k
  where k = In $ \x -> In $ \y -> x
        s = In $ \x -> In $ \y -> In $ \z -> out (out x z) (out y z)
This requires some "noise" from the In and out; Haskell still forbids you to apply a D to a D, but we can use In and out to get around this. You can't actually do anything useful with values of type D, but you could design a useful type around the same pattern.
EDIT: iota is basically λx.x S K, where K = λx.λy.x and S = λx.λy.λz.x z (y z). I.e., iota takes a two-argument function and applies it to S and K; so by passing a function that returns its first argument you get S, and by passing a function that returns its second argument you get K. So if you can write the "return first argument" and the "return second argument" with iota, you can write S and K with iota. But S and K are enough to get Turing completeness, so you also get Turing completeness in the bargain. It does turn out that you can write the requisite selector functions with iota, so iota is enough for Turing completeness.
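To make that concrete in the D embedding (a sketch, with a small hypothetical app helper for out): the standard identities K = ι(ι(ιι)) and S = ι(ι(ι(ιι))) become
app :: D -> D -> D
app = out

kIota, sIota :: D
kIota = iota `app` (iota `app` (iota `app` iota))
sIota = iota `app` (iota `app` (iota `app` (iota `app` iota)))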
So this reduces the problem of understanding iota to understanding the SK calculus.

CPS in curried languages

How does CPS in curried languages like lambda calculus or OCaml even make sense? Technically, all functions have one argument. So say we have a CPS version of addition in one such language:
cps-add k n m = k ((+) n m)
And we call it like
(cps-add random-continuation 1 2)
This is then the same as:
(((cps-add random-continuation) 1) 2)
I already see two calls there that aren't tail calls, and in reality a complexly nested expression: (cps-add random-continuation) returns a value, namely a function that consumes a number, which then returns a function that consumes another number and then delivers the sum of both to that random-continuation. But we can't work around this value-returning by simply translating it into CPS again, because we can only give each function one argument. We need at least two to make room for the continuation and the 'actual' argument.
Or am I missing something completely?
Since you've tagged this with Haskell, I'll answer in that regard: In Haskell, the equivalent of doing a CPS transform is working in the Cont monad, which transforms a value x into a higher-order function that takes one argument and applies it to x.
So, to start with, here's 1 + 2 in regular Haskell: (1 + 2) And here it is in the continuation monad:
contAdd x y = do x' <- x
                 y' <- y
                 return $ x' + y'
...not terribly informative. To see what's going on, let's disassemble the monad. First, removing the do notation:
contAdd x y = x >>= (\x' -> y >>= (\y' -> return $ x' + y'))
The return function lifts a value into the monad, and in this case is implemented as \x k -> k x, or using an infix operator section as \x -> ($ x).
contAdd x y = x >>= (\x' -> y >>= (\y' -> ($ x' + y')))
The (>>=) operator (read "bind") chains together computations in the monad, and in this case is implemented as \m f k -> m (\x -> f x k). Changing the bind function to prefix form and substituting in the lambda, plus some renaming for clarity:
contAdd x y = (\m1 f1 k1 -> m1 (\a1 -> f1 a1 k1)) x (\x' -> (\m2 f2 k2 -> m2 (\a2 -> f2 a2 k2)) y (\y' -> ($ x' + y')))
Reducing some function applications:
contAdd x y = (\k1 -> x (\a1 -> (\x' -> (\k2 -> y (\a2 -> (\y' -> ($ x' + y')) a2 k2))) a1 k1))
contAdd x y = (\k1 -> x (\a1 -> y (\a2 -> ($ a1 + a2) k1)))
And a bit of final rearranging and renaming:
contAdd x y = \k -> x (\x' -> y (\y' -> k $ x' + y'))
In other words: The arguments to the function have been changed from numbers, into functions that take a number and return the final result of the entire expression, just as you'd expect.
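As a quick check of that final form (a sketch, written under a separate name contAdd', with ($ 1) and ($ 2) as trivially CPS-wrapped numbers):
contAdd' :: ((Int -> r) -> r) -> ((Int -> r) -> r) -> (Int -> r) -> r
contAdd' x y = \k -> x (\x' -> y (\y' -> k (x' + y')))
contAdd' ($ 1) ($ 2) id evaluates to 3, the ordinary sum.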
Edit: A commenter points out that contAdd itself still takes two arguments in curried style. This is sensible because it doesn't use the continuation directly, but not necessary. To do otherwise, you'd need to first break the function apart between the arguments:
contAdd x = x >>= (\x' -> return (\y -> y >>= (\y' -> return $ x' + y')))
And then use it like this:
foo = do f <- contAdd (return 1)
         r <- f (return 2)
         return r
Note that this is really no different from the earlier version; it's simply packaging the result of each partial application as taking a continuation, not just the final result. Since functions are first-class values, there's no significant difference between a CPS expression holding a number vs. one holding a function.
Keep in mind that I'm writing things out in very verbose fashion here to make explicit all the steps where something is in continuation-passing style.
Addendum: You may notice that the final expression looks very similar to the de-sugared version of the monadic expression. This is not a coincidence, as the inward-nesting nature of monadic expressions that lets them change the structure of the computation based on previous values is closely related to continuation-passing style; in both cases, you have in some sense reified a notion of causality.
Short answer: of course it makes sense; you can apply a CPS transform directly, you will just get lots of cruft because each argument will have, as you noticed, its own attached continuation.
In your example, I will consider that there is a +(x,y) uncurried primitive, and that you're asking what is the translation of
let add x y = +(x,y)
(This add faithfully represents OCaml's (+) operator)
add is syntactically equivalent to
let add = fun x -> (fun y -> +(x, y))
So you apply a CPS transform¹ and get
let add_cps = fun x kx -> kx (fun y ky -> ky +(x,y))
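For comparison, the same shape in Haskell (a sketch; the type just spells out that each curried argument carries its own continuation):
addCps :: Int -> ((Int -> (Int -> r) -> r) -> r) -> r
addCps x kx = kx (\y ky -> ky (x + y))
addCps 1 (\f -> f 2 id) evaluates to 3.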
If you want translated code that looks more like something you could have willingly written, you can devise a finer transformation that actually considers known-arity functions as non-curried functions, and treats all parameters as a whole (as you have in non-curried languages, and as functional compilers already do for obvious performance reasons).
¹: I wrote "a CPS transform" because there is no "one true CPS translation". Different translations have been devised, producing more or less continuation-related garbage. The formal CPS translations are usually defined directly on lambda-calculus, so I suppose you're having a less formal, more hand-made CPS transform in mind.
The good properties of CPS (as a style that program respect, and not a specific transformation into this style) are that the order of evaluation is completely explicit, and that all calls are tail-calls. As long as you respect those, you're relatively free in what you can do. Handling curryfied functions specifically is thus perfectly fine.
Remark: Your (cps-add k 1 2) version can also be considered tail-recursive if you assume the compiler detects and optimizes the fact that cps-add actually always takes 3 arguments, and doesn't build intermediate closures. That may seem far-fetched, but it's the exact same assumption we use when reasoning about tail calls in non-CPS programs in those languages.
Yes, technically all functions can be decomposed into functions taking one argument. However, when you use CPS, all you are doing is saying that at a certain point of the computation, you run the continuation.
Using your example, lets have a look. To make things a little easier, let's deconstruct cps-add into its normal form where it is a function only taking one argument.
(cps-add k) -> n -> m = k ((+) n m)
Note at this point that the continuation, k, is not being evaluated (Could this be the point of confusion for you?).
Here we have a function, cps-add, that receives a function k as an argument and then returns a function that takes another argument, n.
((cps-add k) n) -> m = k ((+) n m)
Now we have a function that takes an argument, m.
So I suppose what I am trying to point out is that currying does not get in the way of CPS style programming. Hope that helps in some way.
