Following along at home, should you run the code snippets? - prealpha; Formal Verification Agent Recipes

Skip to article frontmatter Skip to article content

This book should be valuable if you halfassedly skim the code snippets.

Some of you may opt to start a fresh repo and copy over certain snippets and fill in the gaps yourself, really treat it as a textbook. That should go pretty well.

Others will look at the book repo and poke it. If you clone it and open up claude code, you can “chat with the repo”. Ask claude to give you exercises by breaking it so you can fix it. If you interact with the book code in this way, do fork it so you can send me PRs with any errors you find.

Others still will ignore the code and barely read the excerpts that are in the text of this website.

All are welcome! As I said on the prior page, I expect all levels of engagement to be able to get something out of this.

Exercises¶

The exercises are all stretch goals, none of them are essential for following along with the book. They’re also all claude-code prompts, I expect it to be instructive to just watch Claude implement/solve them.

Installation¶

If you’re not using uv yet: it’s the correct way to interact with python in year of our lord. Install it. No, I don’t care if you want to follow along with requirements.txt or poetry. It’s year of our lord, we don’t do that anymore.

You’ll need to install dafny and elan (the Lean toolchain manager). In principle I could’ve dockerized so you wouldn’t need to, but that would’ve made the tutorial code a hair more complicated. So we’re going to rely on local installs.

elan is very nice to use, especially if you’re used to Rocq^[1]. It’s inspired by the rust ecosystem’s rustup, much of its API/CLI is the same. Similarly, lake is Lean’s cargo.

dafny is a part of the dotnet ecosystem, but seems to install pretty easily via your operating system’s package index.

A pattern¶

I’d like to draw your attention to some lines of the .gitignore

.gitignore

1
2
logs/
artifacts/

.gitignore ing two common things

Most agentic synthetic data projects, like formal verification agents, will have something like this. logs/ are used to understanding trajectories through the conversation space that led to a piece of synthetic data, as well as anything else you’d like to look back at. artifacts is used to persist outputs, or sometimes intermediate pieces of code.

Footnotes¶

As in, its much better than the Rocq experience.
↩

Introduction

The working proof engineer’s guide to agents

Introduction

Defining terms