Inspect AI is an open-source framework developed by the UK AI Safety Institute (AISI), with contributions from Meridian Labs. It enables systematic evaluation of large language models (LLMs), giving researchers, developers, and policymakers the tools to conduct rigorous, repeatable assessments of AI capabilities and behaviors.
The inspect_swe package makes software engineering agents like Claude Code and Codex CLI available as standard Inspect Agents. This allows researchers to easily evaluate the performance of these agents on a wide variety of tasks using the Inspect AI framework.
The Inspect AI Visual Studio Code extension makes it simple and productive to use Inspect AI directly in the VS Code IDE. It includes an integrated log viewer, a task browser, panels for configuring .env files and CLI options, and commands for running and debugging tasks.
Inspect Scout is a tool for in-depth analysis of AI agent transcripts. Scout can scan full transcripts of agent interactions, individual messages, or events using high-performance parallel processing. Use Scout View to see rich visualizations of scan results.
Inspect Viz is a data visualisation library for Inspect AI. It provides flexible tools for creating high-quality interactive visualisations from Inspect evaluations. Inspect Viz includes a variety of built-in visualisations to help understand evaluation results, as well as components that make it easy to create custom visualisations from Inspect data.