This research introduces Complex-Edit, a benchmark for evaluating how well image editing models follow instructions of varying complexity. The benchmark was built with GPT-4o, which generates atomic editing tasks that are then simplified and composed into progressively more intricate instructions. The authors also present a suite of metrics and a VLM-based evaluation pipeline that assesses instruction following, identity preservation, and the perceptual quality of edited images. Experiments on Complex-Edit show that open-source models lag behind proprietary ones, especially on more complex instructions, and that higher complexity degrades both the preservation of original image elements and overall aesthetic quality. The study also examines sequential editing and a Best-of-N strategy as ways to handle complex edits, and notes that models trained on synthetic data, including advanced ones, produce increasingly synthetic-looking results as instruction complexity rises.
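The Best-of-N strategy mentioned above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: `edit_image` stands in for a diffusion-based editor and `score_edit` for the VLM-based evaluator, both hypothetical names.

```python
import random

def edit_image(image, instruction, seed):
    """Hypothetical stand-in for an instruction-based image editor.

    A real editor would return an edited image; here we return a mock
    result with a seeded pseudo-random quality proxy so the sketch runs.
    """
    random.seed(seed)
    return {"image": f"{image}-edit-{seed}", "noise": random.random()}

def score_edit(candidate):
    """Hypothetical stand-in for the VLM-based evaluator, which in the
    paper scores instruction following, identity preservation, and
    perceptual quality. Lower mock noise -> higher score."""
    return 1.0 - candidate["noise"]

def best_of_n(image, instruction, n=8):
    """Generate n candidate edits and keep the highest-scoring one."""
    candidates = [edit_image(image, instruction, seed) for seed in range(n)]
    return max(candidates, key=score_edit)

best = best_of_n("photo.png", "make the sky stormy", n=8)
print(best["image"])
```

The design point is simply that sampling several candidate edits and selecting with an automatic evaluator can mask individual failures on complex instructions, at the cost of N times the inference compute.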

The podcast and its cover image on this page belong to Neural Intelligence Network. The podcast's content is created by Neural Intelligence Network and not by, or in collaboration with, Poddtoppen.

Neural intel Pod

Complex Instruction-Based Image Editing Benchmark
