
It was brought to my attention that the cost of type inference in a functional language like OCaml can be very high. The claim is that there is a sequence of expressions such that, for each expression, the length of the corresponding type is exponential in the length of the expression.

I devised the sequence below. My question is: do you know of a sequence with more concise expressions that achieves the same types?

# fun a -> a;;
- : 'a -> 'a = <fun>
# fun b a -> b a;;
- : ('a -> 'b) -> 'a -> 'b = <fun>
# fun c b a -> c b (b a);;
- : (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'c = <fun>
# fun d c b a -> d c b (c b (b a));;
- : ((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
   (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'd
= <fun>
# fun e d c b a -> e d c b (d c b (c b (b a)));;
- : (((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
    (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'd -> 'e) ->
   ((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
   (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'e
= <fun>
# fun f e d c b a -> f e d c b (e d c b (d c b (c b (b a))));;
- : ((((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
     (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'd -> 'e) ->
    ((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
    (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'e -> 'f) ->
   (((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
    (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'd -> 'e) ->
   ((('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'c -> 'd) ->
   (('a -> 'b) -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'f
= <fun>
mrrusof

1 Answer


In this answer, I'll stick to a core ML fragment of the language, with just lambda-calculus and polymorphic let following Hindley-Milner. The full OCaml language has additional features such as row polymorphism (which if I recall correctly doesn't change the theoretical complexity, but with which real programs tend to have larger types) and a module system (which if you poke hard enough can be non-terminating in pathological cases involving abstract signatures).

The worst-case time complexity for deciding whether a core ML program has a type is a simple exponential in the size of the program. The classical references for this result are [KTU90] and [M90]. A more elementary but less complete treatment is given in [S95].

The maximum size of the type of a core ML program is in fact doubly exponential in the size of the program. If the type checker must print the type of the program, the time may therefore be doubly exponential; it can be brought back to a simple exponential by defining abbreviations for repeated parts of the tree. This can correspond to sharing parts of the type tree in an implementation.

Your example shows exponential growth of the type. Note, however, that it is possible to give a linear-size representation of the type by using abbreviations for repeated parts of the type. For example:

# fun d c b a -> d c b (c b (b a));;
(t2 -> t1 -> 'c -> 'd) -> t2 -> t1 -> 'a -> 'd
where t2 = t1 -> 'b -> 'c
where t1 = 'a -> 'b
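The effect of abbreviations can be checked mechanically. The following is a minimal sketch, not how the OCaml type checker represents types: `ty`, `tree_size` and `shared_size` are names made up for this illustration. It builds the type that the toplevel prints for `fun d c b a -> d c b (c b (b a))` in the question, then compares the size of the fully written-out tree with the number of distinct subterms, which is what remains once every repeated part is named once.

```ocaml
(* Toy type syntax (hypothetical): type variables and arrows only. *)
type ty =
  | Var of string
  | Arrow of ty * ty

(* Size of the type written out in full: every node of the tree. *)
let rec tree_size = function
  | Var _ -> 1
  | Arrow (a, b) -> 1 + tree_size a + tree_size b

(* Number of distinct subterms: the size once each repeated subterm
   is named once and referred to by its abbreviation. *)
let shared_size t =
  let seen = Hashtbl.create 16 in
  let rec visit t =
    if not (Hashtbl.mem seen t) then begin
      Hashtbl.add seen t ();
      match t with
      | Var _ -> ()
      | Arrow (a, b) -> visit a; visit b
    end
  in
  visit t;
  Hashtbl.length seen

(* The type printed for [fun d c b a -> d c b (c b (b a))]. *)
let example =
  let a, b, c, d = Var "a", Var "b", Var "c", Var "d" in
  (* Operators starting with '@' are right-associative, like '->'. *)
  let ( @-> ) x y = Arrow (x, y) in
  let t1 = a @-> b in
  let t2 = t1 @-> b @-> c in
  let t3 = t2 @-> t1 @-> c @-> d in
  t3 @-> t2 @-> t1 @-> a @-> d
```

Here `tree_size example` is 31 while `shared_size example` is 14, and the gap widens with each further expression in the sequence.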

Here is a conceptually simpler example: the size of the type of the pair (x,x) is twice the size of the type of x, so if you compose the pair function $N$ times, you get a type of size $\Theta(2^N)$.

# let pair x f = f x x;;
# let pairN x = pair (pair (pair … (pair x)…));;
'a -> tN
where tN = (tN-1 -> tN-1 -> 'bN) -> 'bN
…
where t2 = (t1 -> t1 -> 'b2) -> 'b2
where t1 = ('a -> 'a -> 'b1) -> 'b1
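To make the doubling concrete, here is a small sketch for $N = 3$. The names `pair3`, `sum1`, `sum2`, `sum3` and `leaves3` are invented for this illustration. The value built by three nested calls to `pair` is a complete binary tree encoded with continuations, and counting its leaves mirrors the way the type doubles at each level.

```ocaml
let pair x f = f x x

(* Three nested applications of [pair]: a complete binary tree of
   depth 3, encoded with continuations. *)
let pair3 x = pair (pair (pair x))

(* One consumer per nesting level; each sums the leaves below it. *)
let sum1 t = t (fun l r -> l + r)
let sum2 t = t (fun l r -> sum1 l + sum1 r)
let sum3 t = t (fun l r -> sum2 l + sum2 r)

(* Sum of all leaves of the depth-3 tree whose leaves are all [x]. *)
let leaves3 x = sum3 (pair3 x)
```

`leaves3 1` evaluates to 8, that is $2^3$: each extra `pair` needs one more consumer level and doubles the number of leaves, just as it doubles the written-out type.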

By introducing nested polymorphic let definitions, the size of the type increases again exponentially; this time, no amount of sharing can shave off the exponential growth.

# let pair x f = f x x;;
# let f1 x = pair x in
  let f2 x = f1 (f1 x) in
  let f3 x = f2 (f2 x) in
  fun z -> f3 (fun x -> x) z;;
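The growth can be estimated with a little arithmetic. This is a rough sketch under simplifying assumptions: it counts nodes of the `tN` types from the `pair` example, treating the innermost argument type as a single variable, and the function names are invented here. Since `f1` applies `pair` once and each further `let` doubles the number of applications, `fk` nests `pair` $2^{k-1}$ times.

```ocaml
(* Nodes of tN written out in full, where
   tN = (tN-1 -> tN-1 -> 'bN) -> 'bN and t0 is a single variable:
   three arrows, two occurrences of 'bN, and two copies of tN-1. *)
let rec tree_size n = if n = 0 then 1 else 2 * tree_size (n - 1) + 5

(* Nodes of tN with maximal sharing: each level adds one variable
   and three arrows on top of the single shared copy of tN-1. *)
let shared_size n = 1 + 4 * n

(* Number of nested [pair] applications performed by fk. *)
let depth k = 1 lsl (k - 1)
```

For the program above, `depth 3` is 4, so the written-out type has `tree_size 4 = 91` nodes and the shared form has `shared_size 4 = 17`. The shared size is linear in the depth, but the depth itself is exponential in the number of `let` definitions, which is why sharing cannot remove this exponential.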

References

[KTU90] Kfoury, A. J.; Tiuryn, J.; Urzyczyn, P. (1990). "ML typability is DEXPTIME-complete". Lecture Notes in Computer Science. CAAP '90 431: 206–220.

[M90] Mairson, Harry G. (1990). "Deciding ML typability is complete for deterministic exponential time". Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. POPL '90 (ACM): 382–401.

[P04] Benjamin C. Pierce. Advanced Topics in Types and Programming Languages. MIT Press, 2004.

[PR04] François Pottier and Didier Rémy. "The Essence of ML Type Inference". Chapter 10 in [P04].

[S95] Michael I. Schwartzbach. Polymorphic Type Inference. BRICS LS-95-3, June 1995.

Gilles 'SO- stop being evil'