Linda

Engineering YOCaml

Pretty Errors: Source and Target Path Context

A telling of my redesign of YOCaml’s provider error to reflect source and target paths

“Where did this come from?” is most probably the first question you ask yourself when an error pops up in your terminal. The source path is arguably one of the most important parts of an error message, if not the most important. And in the case of a static site generator, it is equally important to know which target file failed to write.

So let’s walk through Part II of improving error reporting for validation errors: adding source and target path context.

Prologue

Before I landed on the final approach, I had a first attempt that seemed straightforward but was actually very invasive.

It hinged on changing the structure of the provider error from this:

type provider_error =
| Parsing_error of { given : string; message : string }
| Validation_error of { entity : string; error : Data.Validation.value_error }
| Required_metadata of { entity : string }

To this:

type provider_error =
| Parsing_error of { given : string; message : string; source : Path.t; target : Path.t }
| Validation_error of { entity : string; error : Data.Validation.value_error; source : Path.t; target : Path.t }
| Required_metadata of { entity : string; source : Path.t; target : Path.t }

The idea being that every provider_error would now carry the source and target paths needed to produce better error diagnostics.

Simple, right? But it was much more than that.

Adding fields (source and target) to the constructors changes their signatures, which changes the overall shape of the type. That means:

  1. every place constructing a provider_error must now provide source and target
  2. every pattern match destructuring the error must account for the new fields
  3. every function returning provider_error must adapt to the new structure

In short: a lot of the codebase suddenly breaks.
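To make the breakage concrete, here is a minimal sketch of the abandoned design. It is not YOCaml’s code: path is a stand-in string type for Path.t, only two constructors are kept, and missing_metadata and describe are hypothetical helpers.

```ocaml
(* Stand-in for YOCaml's Path.t; an assumption made for this sketch. *)
type path = string

(* Simplified version of the abandoned design, with two constructors. *)
type provider_error =
  | Parsing_error of
      { given : string; message : string; source : path; target : path }
  | Required_metadata of { entity : string; source : path; target : path }

(* Every construction site must now supply both paths, even when the
   caller does not know the target yet. Omitting a field, as in
   [Required_metadata { entity }], no longer compiles. *)
let missing_metadata ~source ~target entity =
  Required_metadata { entity; source; target }

(* Pattern matches that list the fields have to be revisited as well. *)
let describe = function
  | Parsing_error { given = _; message; source; target } ->
    Printf.sprintf "%s -> %s: parse error: %s" source target message
  | Required_metadata { entity; source; target } ->
    Printf.sprintf "%s -> %s: missing metadata for %s" source target entity
```

Note how missing_metadata is forced to take a target even though, at reading time, nothing has been written yet.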

Safe to say, this approach was abandoned. And so the next question became: how do we attach source and target context without modifying the existing error type?

Second time’s the charm

The answer came in the form of a wrapper exception:

exception Provider_error of {
  source : Path.t option;
  target : Path.t option;
  error : Required.provider_error;
}

The wrapper gives us a way to attach additional context without touching the provider_error type. Conceptually, think of it like this:

Provider_error
 ├─ source : Path.t option
 ├─ target : Path.t option
 └─ error  : provider_error

Instead of modifying the error itself, we wrap it and attach the paths alongside it.

Notice also that source and target are optional. This reflects the reality of the build pipeline: different stages know different pieces of information. Sometimes we know both paths, sometimes only one.
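A small self-contained sketch of the wrapper idea, under a few assumptions: path is a stand-in string type for Path.t, provider_error is cut down to two constructors, and context_of is a hypothetical helper that only exists to show the three states the optional fields allow.

```ocaml
(* Stand-in for YOCaml's Path.t; an assumption made for this sketch. *)
type path = string

(* The original error type stays exactly as it was (simplified here). *)
type provider_error =
  | Parsing_error of { given : string; message : string }
  | Required_metadata of { entity : string }

(* The wrapper carries the paths next to the untouched error. *)
exception
  Provider_error of
    { source : path option
    ; target : path option
    ; error : provider_error
    }

(* Hypothetical helper: describe how much context a wrapped error has. *)
let context_of = function
  | Provider_error { source; target; _ } ->
    (match source, target with
     | Some s, Some t -> Printf.sprintf "%s -> %s" s t
     | Some s, None -> Printf.sprintf "%s -> (target unknown)" s
     | None, Some t -> Printf.sprintf "(source unknown) -> %s" t
     | None, None -> "no path context")
  | _ -> "not a provider error"
```

Existing code that builds or matches on provider_error keeps compiling, because only the new exception knows about paths.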

With the structure in place, the next task was to actually attach the paths. As things stand right now in the implementation, we don’t yet have the paths themselves, only the mechanism to carry them alongside a provider_error.

Source

The source path represents the input file that triggered the error. In the context of a static site generator, this is usually the content file being processed, for example a Markdown post or page.

This brings us to two functions used in the YOCaml pipeline to read and validate metadata from an input file: read_file_with_metadata and read_file_as_metadata.

Both functions follow the same pattern. They read the file, attempt to validate its metadata, and produce a Result value. If validation succeeds, the metadata is returned. If it fails, a provider_error is produced.

This is the point where we have access to the input file path and where attaching the source becomes possible.

|> Result.fold
   ~error:(fun err ->
     raise
     @@ Provider_error { source = Some path; target = None; error = err })

On failure, the error is wrapped in the Provider_error exception from earlier, with the source set to the current file path. The target is set to None simply because we don’t know it yet at this stage of the pipeline. We can see the earlier design choice of making both fields optional starting to pay off. We attach what we know now and leave the rest for later.
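The same pattern in a runnable, stripped-down form. Everything here is illustrative rather than YOCaml’s actual reader: validate is a toy check, read_validated is a hypothetical name, and path is a stand-in for Path.t.

```ocaml
(* Stand-in types; assumptions made for this sketch. *)
type path = string
type provider_error = Required_metadata of { entity : string }

exception
  Provider_error of
    { source : path option
    ; target : path option
    ; error : provider_error
    }

(* Toy validator: the content must start with "title:". *)
let validate content =
  if String.length content >= 6 && String.sub content 0 6 = "title:"
  then Ok content
  else Error (Required_metadata { entity = "post" })

(* On success return the metadata; on failure wrap the error with the
   only path known at this stage, the source. *)
let read_validated path content =
  validate content
  |> Result.fold
       ~ok:(fun metadata -> metadata)
       ~error:(fun err ->
         raise
           (Provider_error { source = Some path; target = None; error = err }))
```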

Target

Just as we know the source at the point of reading the input file, we know the target at the point of writing the output file. Enter write_dynamic_file:

let write_dynamic_file target task =
 perform target task
   ~when_creation:(fun now target eff cache ->
     let open Eff.Syntax in
     let* fc, dynamic_deps = eff () in
     let* hc = Eff.hash fc in
     perform_writing now target cache fc hc dynamic_deps)
   ~when_update:perform_update

This is where the task is actually executed to produce content, and where the system decides whether to create, update, or skip writing the target file.

As the task runs, it eventually calls the same metadata reading and validation functions we saw earlier in the source section. If validation fails, the error is raised there, and that is the point where the source gets attached. From there, the error bubbles up, passing through write_dynamic_file, and continues up the call stack.

Nothing here adds the target. The error simply passes through with only the source attached and continues upward unchanged. This makes write_dynamic_file the right place to intercept the error and attach the target before letting it continue its journey up the stack, now carrying both source and target context.

We could rewrite write_dynamic_file so that it handles a Provider_error directly and attaches the target there. But since we can achieve the same result with a handler and without that level of invasiveness, that’s the approach I chose.

let propagate_target target program cache =
 let handler =
   Effect.Deep.
     {
       exnc = (fun exn -> raise exn)
     ; retc = Eff.return
     ; effc =
         (fun (type a) (eff : a Effect.t) ->
           match eff with
           | Eff.Yocaml_failwith (Eff.Provider_error e) ->
               Some
                 (fun (k : (a, _) continuation) ->
                   let open Eff in
                   let new_exn =
                     Eff.Provider_error { e with target = Some target }
                   in
                   let* x = raise new_exn in
                   continue k x)
           | _ -> None)
     }
 in
 Eff.run handler (fun cache -> program cache) cache

This wrapper intercepts a Provider_error as it passes through, attaches the target, and raises the updated error so it can continue its journey up the call stack, now with both source and target attached. This is how the wrapper is used in write_dynamic_file:

let write_dynamic_file target task =
 propagate_target target
   (perform target task ...)
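Stripped of the effect machinery, the interception idea can be modelled with plain exceptions. This is a simplified analogue, not the handler YOCaml uses: with_target is a hypothetical name, the task is an ordinary thunk, and path is a stand-in for Path.t.

```ocaml
(* Stand-in types; assumptions made for this sketch. *)
type path = string
type provider_error = Required_metadata of { entity : string }

exception
  Provider_error of
    { source : path option
    ; target : path option
    ; error : provider_error
    }

(* Run [task]. If it fails with a Provider_error, raise a copy of it
   with the target filled in; the source and the inner error are kept. *)
let with_target target task =
  try task () with
  | Provider_error e -> raise (Provider_error { e with target = Some target })
```

Any other exception, and any successful result, passes through with_target untouched.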

And with that, we can now follow the error back to our diagnostics and print both paths!

Conclusion

And now this leads us to the easy part: making the printer aware of the new context. The entry point changes from:

| Eff.Provider_error error ->
   glob_pp (pp_provider_error custom_error) error

to:

| Eff.Provider_error { source; target; error } ->
   glob_pp (pp_provider_error custom_error ~source ~target) error

And after a little OCaml Format magic, we finally reveal:

--- Oh dear, an error has occurred ---
Unable to write to target ./_www/posts/example.html:
Validation error in: ./content/posts/example.md (entity: `post`):
Invalid shape:
 Expected: string
 Given: `42`
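The first lines of a report like this could be produced along the following lines. This is only a sketch using the standard Format module: pp_with_context and render are hypothetical names, paths are plain strings, and the real pp_provider_error does more (entity names, nested validation errors).

```ocaml
(* Print whatever path context is available, then the message.
   [source] and [target] are optional, as in the wrapper exception. *)
let pp_with_context ~source ~target ppf message =
  Option.iter
    (fun t -> Format.fprintf ppf "Unable to write to target %s:@\n" t)
    target;
  (match source with
   | Some s -> Format.fprintf ppf "Validation error in: %s:@\n" s
   | None -> Format.fprintf ppf "Validation error:@\n");
  Format.pp_print_string ppf message

(* Convenience wrapper that renders the report to a string. *)
let render ~source ~target message =
  Format.asprintf "%a" (pp_with_context ~source ~target) message
```

When the target is None, the first line is simply omitted, so errors raised before the writing stage still print cleanly.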

And with that, we reach the end of this two-part series on pretty-printing errors!

Epilogue

With the changes in both Parts I and II, the terminal was our main focus. But the work wouldn’t feel complete without seeing these changes reflected on the server as well.

YOCaml itself is runtime-agnostic, but two runtimes are currently available as plugins: yocaml_unix and yocaml_eio. In both servers, errors would previously escape without being formatted, resulting in unhelpful 500 responses. By introducing a try/with block in the server run functions, I catch the raised exception and route it through the same diagnostic printing function we have been working with so far.
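The shape of that try/with can be sketched as follows. None of these names come from yocaml_unix or yocaml_eio: response, render_diagnostic and guarded are hypothetical, the exception is flattened to carry a plain message, and keeping the 500 status is an assumption of this sketch.

```ocaml
(* Simplified exception: the inner error is reduced to a message. *)
exception
  Provider_error of
    { source : string option
    ; target : string option
    ; message : string
    }

(* Hypothetical response type for a toy server. *)
type response = { status : int; body : string }

(* Turn an exception into the text shown to the user. *)
let render_diagnostic = function
  | Provider_error { source; target; message } ->
    let show = Option.value ~default:"<unknown>" in
    Printf.sprintf "Unable to write to target %s:\nError in: %s:\n%s"
      (show target) (show source) message
  | exn -> Printexc.to_string exn

(* Wrap a request handler so that a raised exception becomes a
   formatted error page instead of escaping unformatted. *)
let guarded handler request =
  try handler request with
  | exn -> { status = 500; body = render_diagnostic exn }
```

Because guarded reuses the same rendering function as the terminal, both outputs stay in sync by construction.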

With that, the error message in the terminal is now mirrored on the server as well, bringing everything together. The end. 🎉