We’ve now seen two different ways to build a verification agent: one with a framework (inspect-ai) and one without. So, which approach is better? The answer, as always, is “it depends.” ## TODO: audit the prose to make sure its not too LLMy
As a general rule of thumb, I recommend using a framework for any project you expect to work on for less than three months. Frameworks are great for research, rapid prototyping, and standard use cases. They provide a lot of boilerplate and infrastructure out of the box, which can save you a lot of time and effort.
But for any project you expect to maintain for more than three months, I recommend using plain Python. As we’ve seen in this chapter, frameworks can hide a lot of complexity, which can make it difficult to debug and customize your agent. When you’re working on a long-term project, that flexibility and control is worth the extra up-front investment.
Here’s a table that summarizes the trade-offs:
| Aspect | Plain Python | Frameworks (Inspect) |
|---|---|---|
| Setup time | Higher (implement loop) | Lower (batteries included) |
| Flexibility | Complete control | Constrained by abstractions |
| Maintenance | Direct, no surprises | Framework updates may break |
| Learning curve | Steep (must understand APIs) | Gentle (guided patterns) |
| Debugging | Explicit, transparent | Abstract, harder to trace |
| Customization | Trivial (just code) | May hit framework limits |
| Infrastructure | DIY (logging, dashboards) | Provided (inspect view) |
Ultimately, the choice between a framework and plain Python is a matter of trade-offs. But with the increasing power and reliability of modern language models and their APIs, the case for “rawdogging” it with plain Python is getting stronger every day.