Optimize and Reduce: A Top-Down Approach for Image Vectorization

Abstract

Vector image representation is a popular choice when editability and flexibility in resolution are desired. However, most images are only available in raster form, making raster-to-vector image conversion (vectorization) an important task. Classical methods for vectorization are either domain-specific or yield an abundance of shapes, which limits editability and interpretability. Learning-based methods that use differentiable rendering have revolutionized vectorization, but they generalize poorly to domains outside the training distribution, while their optimization-based counterparts are either slow or produce non-editable and redundant shapes. In this work, we propose Optimize & Reduce (O&R), a top-down approach to vectorization that is both fast and domain-agnostic. O&R aims to attain a compact representation of input images by iteratively optimizing Bézier curve parameters and significantly reducing the number of shapes, guided by an importance measure that we devise. We contribute a benchmark of five datasets comprising images across a broad spectrum of complexities, from emojis to natural-like images. Through extensive experiments on hundreds of images, we demonstrate that our method is domain-agnostic and outperforms existing works in both reconstruction and perceptual quality for a fixed number of shapes. Moreover, we show that our algorithm is 10× faster than the state-of-the-art optimization-based method.
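
The top-down loop described in the abstract (optimize all shapes, rank them by importance, keep the most important ones, repeat) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the "image" is a 1-D list of intensities, each "shape" is a flat segment standing in for a Bézier shape, `optimize` is a closed-form stand-in for gradient-based optimization through a differentiable renderer, and `importance` is simply the increase in reconstruction error when a shape is removed, which may differ from the measure used in the paper.

```python
from typing import List, Tuple

# Hypothetical toy shape: (start, end, value), a flat 1-D segment.
Shape = Tuple[int, int, float]


def render(shapes: List[Shape], width: int) -> List[float]:
    """Paint shapes in order onto a zero canvas; later shapes overwrite earlier ones."""
    canvas = [0.0] * width
    for start, end, value in shapes:
        for x in range(start, end):
            canvas[x] = value
    return canvas


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error between two equally sized signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def optimize(shapes: List[Shape], target: List[float]) -> List[Shape]:
    """Stand-in for the optimization step: set each value to the target mean over its span."""
    return [(s, e, sum(target[s:e]) / (e - s)) for s, e, _ in shapes]


def importance(shapes: List[Shape], target: List[float], i: int) -> float:
    """Assumed importance score: how much the error grows if shape i is removed."""
    base = mse(render(shapes, len(target)), target)
    without = shapes[:i] + shapes[i + 1:]
    return mse(render(without, len(target)), target) - base


def reduce_shapes(shapes: List[Shape], target: List[float], keep: int) -> List[Shape]:
    """Keep the `keep` most important shapes, preserving their drawing order."""
    ranked = sorted(range(len(shapes)),
                    key=lambda i: importance(shapes, target, i),
                    reverse=True)
    return [shapes[i] for i in sorted(ranked[:keep])]


def optimize_and_reduce(target: List[float], schedule: List[int]) -> List[Shape]:
    """Start with schedule[0] shapes, then alternate reduce/optimize down the schedule.

    Assumes len(target) is divisible by schedule[0].
    """
    step = len(target) // schedule[0]
    shapes = [(i * step, (i + 1) * step, 0.0) for i in range(schedule[0])]
    shapes = optimize(shapes, target)
    for keep in schedule[1:]:
        shapes = reduce_shapes(shapes, target, keep)
        shapes = optimize(shapes, target)
    return shapes


target = [0.0] * 4 + [1.0] * 4 + [0.0] * 4 + [0.5] * 4
result = optimize_and_reduce(target, schedule=[8, 4])
print(len(result), mse(render(result, len(target)), target))
```

In this toy setting the segments that coincide with the background contribute nothing to the reconstruction, so the reduce step discards them and the remaining four segments still reproduce the target. A real vectorizer would replace `render` and `optimize` with a differentiable rasterizer and gradient descent over Bézier control points and colors.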

Results

Qualitative Results

Given an input image and a skeleton, our method performs structure-consistent pose estimation on images from unseen categories.

Image Abstraction

We demonstrate different levels of abstraction for vectorization tasks, by varying the number of shapes in the output image and by incorporating a CLIP loss as a reconstruction objective.

Image Interpolation

Our approach does not rely on a GAN trained to generate raster images; instead, it interpolates directly in the shape domain between the source and target images. This setting reflects a realistic scenario in which a user wants to interpolate between two emoji images they already possess.
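
As a rough illustration of interpolating in the shape domain, the sketch below linearly blends control points and fill colors between two vectorized images. It rests on assumptions not stated on this page: both images are represented with the same number of shapes, each pair of shapes has the same number of control points, and shapes are matched by index. The `VectorShape` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Color = Tuple[float, float, float, float]  # RGBA in [0, 1]


@dataclass
class VectorShape:
    """Hypothetical container for one closed Bézier shape."""
    points: List[Point]  # control points of the Bézier path
    color: Color         # fill color


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation: returns a at t=0 and b at t=1."""
    return (1.0 - t) * a + t * b


def interpolate_shape(src: VectorShape, dst: VectorShape, t: float) -> VectorShape:
    """Blend control points and color of two matched shapes."""
    if len(src.points) != len(dst.points):
        raise ValueError("matched shapes need the same number of control points")
    points = [(lerp(sx, dx, t), lerp(sy, dy, t))
              for (sx, sy), (dx, dy) in zip(src.points, dst.points)]
    color = tuple(lerp(sc, dc, t) for sc, dc in zip(src.color, dst.color))
    return VectorShape(points=points, color=color)


def interpolate_image(src: List[VectorShape], dst: List[VectorShape],
                      t: float) -> List[VectorShape]:
    """Blend two vector images shape by shape (shapes matched by index)."""
    if len(src) != len(dst):
        raise ValueError("both images need the same number of shapes")
    return [interpolate_shape(a, b, t) for a, b in zip(src, dst)]


source = [VectorShape(points=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
                      color=(1.0, 0.0, 0.0, 1.0))]
target = [VectorShape(points=[(2.0, 2.0), (3.0, 2.0), (3.0, 3.0), (2.0, 3.0)],
                      color=(0.0, 0.0, 1.0, 1.0))]
halfway = interpolate_image(source, target, t=0.5)
print(halfway[0].points[0], halfway[0].color)
```

Sweeping `t` from 0 to 1 and rasterizing each intermediate shape list would yield an interpolation sequence. How shapes are matched between the two images strongly affects the result, and index matching is only the simplest possible choice.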

BibTeX

If you find this research useful, please cite the following:


@inproceedings{DBLP:conf/aaai/OptimizeReduce,
  author       = {Or Hirschorn and
                  Amir Jevnisek and
                  Shai Avidan},
  title        = {Optimize and Reduce: A Top-Down Approach for Image Vectorization},
  booktitle    = {{AAAI}},
  publisher    = {{AAAI} Press},
  year         = {2024}
}