An experiment in the spatio-textual reasoning capabilities of LLMs. The agents are given a tool call that renders screenshots of their models, so they can iteratively improve the design. Feel free to click on the individual images to interactively explore the resulting models and read the entire input prompts.
See the GitHub repository for the source code.
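The render-and-refine loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `refine`, `Step`, `fake_propose`, and `fake_render` are hypothetical names, the model call and the renderer are replaced by stand-in functions, and the fixed step count is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    source: str        # model source proposed by the agent at this step
    screenshot: bytes  # image data returned by the render tool


def refine(
    propose: Callable[[str, List[Step]], str],
    render: Callable[[str], bytes],
    prompt: str,
    max_steps: int = 3,
) -> List[Step]:
    """Run a propose -> render -> inspect loop for a fixed number of steps."""
    history: List[Step] = []
    for _ in range(max_steps):
        # The agent sees the prompt plus every earlier attempt and its screenshot.
        source = propose(prompt, history)
        # The tool call turns the proposed source into a screenshot.
        screenshot = render(source)
        history.append(Step(source, screenshot))
    return history


# Stand-ins so the sketch runs without a real model or renderer (hypothetical).
def fake_propose(prompt: str, history: List[Step]) -> str:
    return f"{prompt}, revision {len(history) + 1}"


def fake_render(source: str) -> bytes:
    return source.encode("utf-8")


if __name__ == "__main__":
    steps = refine(fake_propose, fake_render, "towel hook")
    print(len(steps), steps[-1].source)
```

In a real setup, `propose` would wrap a call to one of the agents listed below and `render` would wrap the screenshot tool; both are left abstract here because their interfaces are not described on this page.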
| Agent | Battery Case (`battery_case`) | Chess Rook (`chess_rook`) | Citronhaj Stand (`citronhaj_stand`) | Planetary Gearbox (`planetary_gearbox`) | 3D Printer Torture Test (`torture_test`) | Towel Hook (`towel_hook`) |
|---|---|---|---|---|---|---|
| anthropic-claude-opus-4.5 | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| anthropic-claude-opus-4.6 | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| anthropic-claude-sonnet-4.5 | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| google-gemini-3-flash-preview | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| google-gemini-3-pro-preview | #1 #2 | #1 | #1 | #1 | #1 ⚠ | #1 #2 |
| openai-gpt-5.1-codex-max | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| openai-gpt-5.1-codex-mini | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| openai-gpt-5.2 | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |
| openai-gpt-5.2-codex | #1 #2 | #1 | #1 | #1 | #1 | #1 #2 |