'How can you define a function who's return type depends on the value of its parameter?

Let's say I want to define a function called zero that takes a string like "f" or "i" and returns the zero value for either float or int.

So we could do something like this (I have no real implementation, this is just mocking up what I want):

utop # zero "f"
- : float = 0.0
utop # zero "i"
- : int = 0

What language features in OCaml make this sort of thing possible? And how do you actually use these features to accomplish something like that? (I.e. can someome show/explain how to do that?).

PS: In just about any other language I wouldn't expect something like this to even be possible. But I think in OCaml it is! I think so because Printf.printf from the Stdlib does something very similar (similar in spirit, but a lot more complicated in its details). I.e. the essence these examples have in common is that, Printf.printf "..." also has a static type that depends/varies based on the value passed as its first argument.

I tried to understand how that works by looking at source code for Printf but it is a bit too complex and deep a rabbit hole for me to understand. I am hoping this simpler but similar example of the zero function above would allow someone to explain the essence of the solution without all the complexity of printf's 'formatting syntax'.



Solution 1:[1]

While the format type from octachrons answer explains why printf can have a return type depending on the value of the format string I don't believe that is the right thing for the question. Mainly because the format string isn't limited to only one value. What should zero "%d%s" return?

So let me give an alternative using GADT:

# type 'a t =
| Int : int t
| Float : float t;;
type 'a t = Int : int t | Float : float t

The type 'a t is a GADT (Generalized algebraic data type) where each variant has a different type. A function, like zero, can then have the return type depend on the type of the GADT.

You might be tempted to just write this:

# let zero = function
  | Int -> 0
  | Float -> 0.0;;
Error: This pattern matches values of type float t
       but a pattern was expected which matches values of type int t
       Type float is not compatible with type int 

Unfortunately Ocamls type inference does not automatically understand GADTs and needs a little help. You have to always annotate a GADT for the type system to understand what is going on. A GADT type must be specified like this:

# let zero : type a . a t -> a = function
| Int -> 0
| Float -> 0.0;;
val zero : 'a t -> 'a = <fun>
# zero Int;;
- : int = 0
# zero Float;;
- : float = 0.

Solution 2:[2]

Having the return type of a function depends on the value requires the type system to be dependently typed. This is not the case of OCaml core language (the module language is dependently typed).

What happens with Printf is that the first argument of Printf is not a string but a format string. There is a bit of magic in the compiler to make format strings appear as strings in the syntax, but format strings are in a fact a Generalized Algebraic Data Type (GADT).

You can observe it in the toplevel if we know where to look. For instance,

open CamlinternalFormatBasics
let s: _ format6 = "%s";;

will print

val s : (string -> 'a, 'b, 'c, 'd, 'd, 'a) format6 =
  Format (String (No_padding, End_of_format), "%s")

In other words "%s" is a syntactic sugar for Format (String (No_padding, End_of_format), "%s").

In term of implementation, it is a bit simpler to look at a simplified version of format string with only string holes and literals:

type 'args fmt =
  | End: unit fmt
  | String_hole: 'inner_args fmt -> (string -> 'inner_args) fmt
  | Lit: string * 'args fmt -> 'args fmt
let s = Lit("Hello ", String_hole End)

Here, the value s is the simplified equivalent to "Hello %s" and is of type (string -> unit) fmt. Similarly,

let s2 = String_hole (String_hole End)

(aka "%s%s") is a value of type (string->string->unit) fmt. In other words, the type 'a of a value of type 'a fmt tracks how many strings are needed to fill all the string hole in the format string.

Since, the information exists in the type, we can describe the type of printf as: 'a fmt -> 'a (without any dependent types). The function printf returns a function that requires as many arguments as there are holes in the format string before returning unit. And indeed, it is possible to write down such printf function:

let rec printf: type a.  a fmt -> a = fun fmt ->
  match fmt with
  | End -> print_newline ()
  | Lit (lit,inner_fmt) -> print_string s; printf inner_fmt
  | String_hole inner_fmt -> fun s -> print_string s; printf inner_fmt

let () = printf s "world"
let () = printf s2 "Hello" "universe" 

This works because GADTs allow to express a finer relationship between variant constructors and the result type of an algebraic data types.

For instance, if I observe a constructor End in a branch of a pattern matching, I know that the type a was in fact unit, and I can return without doing anything.

Similarly, if I a see a Lit(lit,inner_fmt), I know that the inner_fmt has the same type a fmt as its parent. In other word, it expects as many arguments as its parent, and I can print the string lit and go on with evaluating my format string.

And the really interesting case, is the string hole String_hole fmt, which tells me that my type is (string -> $inner_args) fmt. In other words, the parent format string requires one more argument that its children. I must thus returns a function that takes a string argument, print its argument before resuming the evaluation of the inner format string.

OCaml implementation starts with this idea and adds support for many hole types in the format strings.

Thus printf as implemented in OCaml does not require dependent types, but only GADTs.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Goswin von Brederlow
Solution 2 octachron