Engineering YOCaml - Pretty Errors; A YOCaml Format Fix

A telling of my redesign of YOCaml’s provider error output into a clearer, prettier format.

Imagine you’ve just finished setting up your YOCaml blog. You’re excited and finally ready to publish your article on pretty printing. You run your server, expecting everything to work… and bam. You get hit with an error.

Fail with Invalid record: {errors =
 Invalid subrecord
   Fail with Invalid record:
     {errors =
       Invalid field = name
         {error =
           Fail with Invalid shape:
             { expected = string;
               given = 1; };
          given = 1; };
      given =
        name = 1; };
given =
  author = {"name": 1}; }]

At first, you stare at the output and try to figure out what you broke so badly. You brace yourself. Terminal fullscreen. Docs open in 12 tabs. …then you actually read the error and realize it’s just a simple validation issue. Turns out the only thing that was really wrong was how the error looked. Walk with me as we fix error printing in YOCaml’s data validation.

The Makeover

One of YOCaml’s strengths is that it can parse and verify your metadata against a model you provide. For example, you might have an article model that requires a date and a title. When one of your articles doesn’t fit that schema, YOCaml raises provider errors while reading the metadata. They usually fall into three categories:

Required metadata errors (an entire metadata entry is missing)
Parsing errors (the metadata could not be parsed)
Validation errors (the metadata parsed, but the shape or values were wrong)

Let’s look at the before and after of each of them.

1) Required metadata

This is what you got when an entire metadata entry was missing:

--- Oh dear, an error has occurred ---  
Required metadata: `post`  
---

This case was already in a pretty good place! The printer uses OCaml’s Format module. Format provides printf-like functions, but instead of writing straight to the terminal, they write to a formatter, which handles spacing and layout based on the directives you give it.

Format.fprintf ppf "Required metadata: %s" entity

Here, we don’t ask much of the formatter. We just use it to dynamically insert the model/entity that was violated. For this case, the code was already doing the right thing, so I left it unchanged.
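To make the "write to a formatter" idea concrete, here is a minimal sketch. The name pp_required_metadata is hypothetical and is not taken from YOCaml's source; only Format.fprintf, Format.asprintf and Format.printf are real standard-library calls.

```ocaml
(* Hypothetical printer in the usual Format style: formatter first. *)
let pp_required_metadata ppf entity =
  Format.fprintf ppf "Required metadata: %s" entity

(* The same printer can render into a string... *)
let as_string = Format.asprintf "%a" pp_required_metadata "post"

(* ...or onto standard output, without touching the printer itself. *)
let () = Format.printf "%a@." pp_required_metadata "post"
```

Because the printer only knows about ppf, the caller decides where the text ends up and which boxes surround it.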

2) Parsing errors

Here’s what a parsing failure looked like before:

--- Oh dear, an error has occurred ---
Parsing error: given: `author linda
age: 21
`
message:
                                                             `Yaml: error calling parser: could not find expected ':' character`
---

Now things are starting to get a littttle unreadable. Somehow, there was both missing indentation and unnecessary indentation at the same time. This came from mixing raw newlines (\n) with Format printing:

Format.fprintf ppf
  "Parsing error: @[given: @[`%s`@]\nmessage:@[`%s`@]@]"
  given message

At a glance, this looks reasonable. There’s an outer Format box, and nested boxes for given and message, with the idea that each would land nicely on its own line. But the raw newline forces a newline immediately, bypassing Format’s layout logic entirely. Format doesn’t know the line changed, so it keeps computing columns as if the text were still on the same line. Once you mix the two, the formatter loses control, and the layout goes sideways.

So the fix wasn’t complicated. First, I stuck to Format only, just like in the required metadata case. Second, I dropped Format boxes (more on this later). Parsing errors are flat and predictable. I just needed one thing per line and didn’t need the dimensionality provided by boxes.
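The effect of a raw newline is easy to reproduce in isolation. The sketch below is not YOCaml's code: it prints the same three lines twice inside a vertical box, once with break hints only and once with a "\n" hidden in the data.

```ocaml
(* Every line break is a break hint, so Format tracks each one. *)
let with_hints =
  Format.asprintf "@[<v 2>given:@,author linda@,age: 21@]"

(* Same text, but one line break is a raw "\n" inside the string. *)
let with_raw_newline =
  Format.asprintf "@[<v 2>given:@,%s@]" "author linda\nage: 21"

let () =
  print_endline with_hints;
  (* given:
       author linda
       age: 21 *)
  print_endline "---";
  print_endline with_raw_newline
  (* given:
       author linda
     age: 21      <- the raw newline skipped the box's indentation *)
```

The second result has the same shape as the "age: 21" line in the old parsing error: the line after the raw newline starts at column 0 because Format never got the chance to indent it.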

Format.fprintf ppf
  "Parsing error:@,Given: `%s`@,Message: `%s`"
  given message

That @, is a cut break hint. It tells Format it’s allowed to break the line there. If it does, the new line starts at the enclosing box’s indentation; if it doesn’t, nothing is printed. With that change, a parsing error now looks like this:

--- Oh dear, an error has occurred ---
Parsing error:
Given: `author linda
age: 21`
Message: `Yaml: error calling parser: could not find expected ':' character`
---

Same information. Much easier to read.
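Whether a cut actually breaks depends on the box around it. The sketch below reuses the format string from above and wraps it in two different boxes. The wrapper boxes and the name pp_parsing_error are assumptions for illustration; the excerpt doesn't show which box YOCaml opens around this message.

```ocaml
(* Hypothetical printer reusing the new format string. *)
let pp_parsing_error ppf (given, message) =
  Format.fprintf ppf "Parsing error:@,Given: `%s`@,Message: `%s`"
    given message

let sample = ("author linda", "missing ':'")

(* Inside a vertical box, every cut becomes a line break. *)
let in_vbox = Format.asprintf "@[<v>%a@]" pp_parsing_error sample

(* Inside a horizontal box, a cut never breaks and prints nothing. *)
let in_hbox = Format.asprintf "@[<h>%a@]" pp_parsing_error sample

let () =
  print_endline in_vbox;
  print_endline in_hbox
```

Only the vertical variant produces the one-item-per-line layout, so a printer written this way relies on its caller to open a suitable box.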

3) Validation errors

Validation errors were the worst offenders. They can be deeply nested, and that nesting was being printed almost directly from the internal validation structure. At first glance, this is overwhelming.

Fail with Invalid record: {errors =
 Invalid subrecord
   Fail with Invalid record:
     {errors =
       Invalid field = name
         {error =
           Fail with Invalid shape:
             { expected = string;
               given = 1; };
          given = 1; };
      given =
        name = 1; };
given =
  author = {"name": 1}; }]

But let’s break it down...

Invalid record
│
├─ Errors (1)
│  │
│  └─ Invalid subrecord
│     │
│     └─ Invalid record
│        │
│        ├─ Errors (1)
│        │  │
│        │  └─ Invalid field `name`
│        │     │
│        │     └─ Invalid shape
│        │        ├─ Expected: string
│        │        └─ Given: `1`
│        │
│        └─ Given record
│           └─ name = `1`
│
└─ Given record
   └─ author = `{"name": 1}`

At the top level, YOCaml is telling us the metadata is an invalid record. Inside that record, there’s an invalid subrecord. Since the subrecord is itself a record, validation continues one level deeper, where a specific field fails validation: name.

And now we reach the actual error. The name field failed because of an invalid shape. A string was expected, but the value provided was 1.

From there, the error starts carrying context back up the structure. You see given = 1, then given = name = 1, and finally given = author = {"name": 1}. This is the internal validation structure being printed almost as-is. It’s technically correct, but it reads more like a data structure than an error message.

The Fix

Validation errors can be nested in other ways too: a missing field inside a subrecord, a list with a single invalid element, or multiple failures across different branches.

Granted, not all validation errors are this deeply nested. Some are as simple as a single field having the wrong value. But nested errors were my stress test. If I could make those readable, the simpler cases would fall out naturally. And so for this blog as well, we’ll go through the nested error together and give it a whole new look!

Curly braces and symbols

One of the first things that gave me (and probably you too) a headache was the heavy use of curly braces. They mirrored OCaml record syntax, but for readability, they had to go. The change was simple too: I just went through the code removing curly braces. The equal signs, backticks and semicolons didn’t make the cut either, which lets our minds focus on the error itself.

(* old *)
"Fail with Invalid record: { errors = …; given = … }"

(* new *)
"Invalid record:@,"

And voilà! Much easier to read.

Count and numbering

I thought that at each nested level of a record or a list, it would be quite helpful if someone debugging could know how many errors they were dealing with. That also naturally led to numbering the errors themselves when they were being written. In Invalid_record (and similarly in Invalid_list), the new code does:

Format.fprintf ppf "Errors (%d):@," (Nel.length errors);

Here, errors is of type Nel.t, and Nel.length errors counts only the direct children at that level, not any nested errors below. Numbering comes from Nel.iteri:

Nel.iteri
  (fun i err ->
    if i > 0 then pp_blankline ppf ();
    Format.fprintf ppf "%d) %a" (i + 1)
      (pp_record_error custom_error)
      err)
  errors;

Nel.iteri provides a zero-based index, which I convert into human-friendly numbering using (i + 1). This gives us output like:

Errors (1):
1) Invalid subrecord:
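The count-then-number pattern can be tried without YOCaml. In this sketch a plain list stands in for Nel.t and List.iteri for Nel.iteri. Both substitutions are assumptions so that the block runs on the standard library alone, and pp_errors is a hypothetical name.

```ocaml
(* Hypothetical helper: a header with the number of direct children,
   then each child numbered from 1. *)
let pp_errors pp_err ppf errors =
  Format.fprintf ppf "@[<v>Errors (%d):" (List.length errors);
  List.iteri
    (fun i err ->
      (* List.iteri is zero-based, so shift by one for display. *)
      Format.fprintf ppf "@,%d) %a" (i + 1) pp_err err)
    errors;
  Format.fprintf ppf "@]"

let rendered =
  Format.asprintf "%a"
    (pp_errors Format.pp_print_string)
    [ "Invalid subrecord"; "Missing field" ]

let () = print_endline rendered
(* Errors (2):
   1) Invalid subrecord
   2) Missing field *)
```

As in the real printer, the count covers only the elements of the list handed in. Anything nested inside an element is counted when that element prints its own header.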

Indentation

And now the big one. To fix this I had to pause first and get a better understanding of format boxes, and now I’ll share a bit of that with you… The old printer used this kind of box:

@[<2> ... @]

This is the default box type in Format. It packs as much as it can onto the current line and only breaks at a hint when the rest would not fit. In OCaml’s Format, break hints are things like @, (cut), "@ " (space, written as @ followed by a space), and @; (full break). With this kind of box, break hints are treated as optional suggestions.

The 2 means: if the box breaks onto a new line, indent by 2 spaces. One important thing to know is that a box indents relative to the column where it was opened, not relative to the left margin. So as boxes nest, each one opened partway along a line, indentation keeps increasing. From this alone, you can see why the indentation was getting out of hand. The new printer uses a vertical box instead:

@[<v 2> ... @]

Vertical boxes format their contents vertically by design. When you use a break hint inside them, the break always happens. The 2 still means “indent by 2 spaces”. Like any box, a vertical box indents relative to the column where it was opened, so what matters is where it gets opened: a box opened at the start of a line stays anchored there, while one opened after a long prefix starts that much further right. Opening the boxes right after a break keeps the output from drifting further and further to the right.

And with all these changes combined, the same validation error now looks like this:

Invalid record:
Errors (1):
1) Invalid subrecord:


Invalid record:
Errors (1):
1) Invalid field `name`:


Invalid shape:
Expected: string
Given: `1`


Given record:
name = `1`


Given record:
     author = `{"name": 1}`
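The two box types from the Indentation section can be compared side by side. This is a standalone sketch on made-up strings, not the YOCaml printer. The third case shows what happens when a box is opened after some text is already on the line.

```ocaml
(* Default box: everything fits on one line, so no hint breaks. *)
let default_box =
  Format.asprintf "@[<2>Invalid shape:@ expected: string@ given: 1@]"

(* Vertical box: every hint breaks, and new lines are indented by 2. *)
let vertical_box =
  Format.asprintf "@[<v 2>Invalid shape:@ expected: string@ given: 1@]"

(* A box opened at column 5 indents relative to column 5. *)
let opened_mid_line = Format.asprintf "key: @[<v 2>a@,b@]"

let () =
  print_endline default_box;
  (* Invalid shape: expected: string given: 1 *)
  print_endline vertical_box;
  (* Invalid shape:
       expected: string
       given: 1 *)
  print_endline opened_mid_line
  (* key: a
            b *)
```

The last case is the drift in miniature: the longer the prefix before the box, the further right its continuation lines land.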

And with that, we come to the end of this blog. There’s still one really important thing missing from the error output; something you’d expect to see the moment an error happens. I’ll talk about that in the next one.