Deep Image Prior

by Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky – CVPR 2018

Abstract

  • Deep CNN – able to learn image priors from a large number of example images
  • Contribution: Structure of a generator network is sufficient to capture a great deal of image statistics prior to any learning. 
  • A handcrafted prior in the form of a randomly initialized NN gives excellent results in standard inverse problems such as denoising, super-resolution and inpainting
  • The same prior can also be used to invert deep neural representations and to restore images from flash/no-flash input pairs
  • This approach highlights the inductive bias captured by standard generator network architectures

Introduction

  • Image reconstruction problems – denoising, single-image super-resolution – are typically approached with GANs, VAEs, or direct pixel-wise error minimization
  • Learning from a large dataset alone is insufficient: prior work showed that the same image classification network that generalizes well on real data can also overfit when presented with random labels. 
  • So, generalization requires the network structure to resonate with the structure of the data 
  • This paper shows that a great deal of image statistics can be captured by the structure of the network alone, without any learning.
  • Untrained ConvNets: fit a generator network to a single degraded image. 
  • The network weights are the parametrization of the restored image.
  • Weights are randomly initialized and fitted to maximize their likelihood given the degraded image and a task-dependent observation model.
  • Reconstruction is cast as a conditional image generation problem (denoising, inpainting, super-resolution).

Method

  • Image generation – learn a generator network x = f_θ(z) that maps a random code vector z to an image x, where x ∈ R^(3×H×W), z ∈ R^(C′×H′×W′), and θ are the network parameters
  • U-Net-type “hourglass” architecture with skip connections; z and x have the same spatial size
  • x – clean image
  • x0 – corrupted image
  • x* – restored image: x* = argmin_x E(x; x0) + R(x), where R(x) is a regularizer
    • Denoising: E(x; x0) = ||x − x0||² – needs early stopping
    • Inpainting: E(x; x0) = ||(x − x0) ⊙ m||², where m is the binary mask
    • Super-resolution: E(x; x0) = ||d(x) − x0||², where d(·) is a downsampling operator
    • Feature inversion: E(x; x0) = ||Φ(x) − Φ(x0)||², where Φ(·) extracts network features
  • The choice of regularizer, which usually captures a generic prior on natural images, is more difficult and is the subject of much research. This paper replaces the explicit regularizer with the implicit prior captured by the neural network, as follows:
    • θ* = argmin_θ E(f_θ(z); x0), then x* = f_θ*(z)
    • The minimizer θ* is obtained with an optimizer such as SGD, starting from a random initialization of the parameters
  • Explicit prior: min_x ||d(x) − x0||² s.t. x is a face, a natural image, etc.
  • Deep image prior: min_x ||d(x) − x0||² s.t. x is the output of a CNN 
  • MAP: x* = argmax_x P(x | x0)
  • P(x | x0) = P(x0 | x) P(x) / P(x0) ∝ P(x0 | x) P(x)
  • P(x0 | x) – likelihood
  • P(x) – prior
  • In degradation, x0 = x + ε with ε ~ N(0, σ²), so P(x0 | x) = N(x0; x, σ²)
  • In restoration: x* = argmax_x P(x | x0)

= argmax_x P(x0 | x) P(x)

= argmax_x P(x0 | x), since the prior is a constant (we have no preference)

= argmax_x N(x0; x, σ²) = x0 – we will not restore anything

  • That means the MAP estimate is the same as the ML estimate when the prior is uniform (i.e., no prior), so a non-trivial prior is essential for restoration.
  • Parametrization:
    • Regular: argmin_x E(x; x0) + R(x) – search in image space
    • Parametrized: argmin_θ E(g(θ); x0) + R(g(θ)) – search in parameter space
    • If g is surjective (for each x there exists a θ with g(θ) = x), the two problems are equivalent
    • In practice, even for a surjective g, the solutions will differ
    • So treat g as a hyperparameter and tune it
    • g itself acts as a prior, and it may be sufficient to optimize only the data term: argmin_θ E(g(θ); x0)
    • g(θ) ≡ f_θ(z) – a convolutional network f with parameters θ and a fixed input z
  • Deep Image Prior – Step by Step (a minimal sketch in code follows this list):
    • 1. Initialize z: fill it with uniform noise U(−1, 1)
    • 2. Solve argmin_θ E(f_θ(z); x0) with a gradient-based method: θ_{k+1} = θ_k − α ∂E(f_θ(z); x0)/∂θ
    • 3. Take the solution x* = f_θ*(z)
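
Below is a minimal PyTorch sketch of these three steps for denoising. It is an illustrative approximation, not the authors' code: TinyGenerator is a shallow stand-in for the hourglass/U-Net architecture, and the step count and learning rate are placeholder values.

```python
import torch
import torch.nn as nn

# A small convolutional generator standing in for the paper's hourglass
# (U-Net) architecture; depth and widths here are illustrative only.
class TinyGenerator(nn.Module):
    def __init__(self, code_channels=32, out_channels=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(code_channels, width, 3, padding=1),
            nn.BatchNorm2d(width),
            nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1),
            nn.BatchNorm2d(width),
            nn.ReLU(),
            nn.Conv2d(width, out_channels, 3, padding=1),
            nn.Sigmoid(),  # keep the output image in [0, 1]
        )

    def forward(self, z):
        return self.net(z)


def deep_image_prior(x0, num_steps=2000, lr=0.01):
    """Restore the corrupted image x0 (tensor of shape 1x3xHxW in [0, 1])."""
    _, _, h, w = x0.shape
    f = TinyGenerator()                           # random initialization of theta
    z = torch.empty(1, 32, h, w).uniform_(-1, 1)  # step 1: z ~ U(-1, 1), kept fixed
    opt = torch.optim.Adam(f.parameters(), lr=lr)
    for _ in range(num_steps):                    # step 2: fit theta by gradient descent
        opt.zero_grad()
        loss = ((f(z) - x0) ** 2).mean()          # E(f_theta(z); x0) = ||f_theta(z) - x0||^2
        loss.backward()
        opt.step()
    # Early stopping is the implicit regularizer for denoising: the network
    # fits natural image structure faster than noise, so num_steps must be
    # chosen (or monitored) to stop before the noise is reproduced.
    with torch.no_grad():
        return f(z)                               # step 3: x* = f_theta*(z)
```

Note that optimizing the pixels of x directly with the same data term would simply return x0; it is the reparametrization of x as f_θ(z) that injects the prior.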

Applications

  • Denoising and generic reconstruction
    • x0 = x + ε, where ε follows a particular noise distribution.
    • But in blind denoising the noise model is unknown.
  • Super-resolution
    • Given an LR image x0 ∈ R^(3×H×W) and an upsampling factor t, generate the corresponding HR image x ∈ R^(3×tH×tW)
    • E(x; x0) = ||d(x) − x0||², where d(·) is a downsampling operator that resizes an image by a factor of t (see the sketch after this list)
  • Inpainting
  • Natural pre-image
  • Flash-no flash reconstruction
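
The other applications reuse the same optimization loop with a different data term E. The sketch below shows plausible implementations (function names are mine, not the paper's): bilinear resizing stands in for the downsampler d(·), and mask is a binary tensor that is 1 at observed pixels and 0 at missing ones.

```python
import torch.nn.functional as F

def inpainting_loss(x, x0, mask):
    # E(x; x0) = ||(x - x0) .* m||^2: penalize only the observed pixels,
    # so the network must fill the masked region from its prior alone.
    return (((x - x0) * mask) ** 2).mean()

def super_resolution_loss(x, x0_lr):
    # E(x; x0) = ||d(x) - x0||^2: the HR estimate must match the LR
    # observation after downsampling (z then has the HR spatial size tH x tW).
    d_x = F.interpolate(x, size=x0_lr.shape[-2:], mode='bilinear',
                        align_corners=False)
    return ((d_x - x0_lr) ** 2).mean()
```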

Related work

  • Highly related to self-similarity-based and dictionary-based priors.
  • Even single-layer convolutional sparse coding has been proposed for reconstruction.

Link: https://arxiv.org/abs/1711.10925
