Deep Image Prior

by Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky – CVPR 2018

Abstract

  • Deep CNNs – usually credited with learning image priors from a large number of example images
  • Contribution: the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning.
  • A randomly initialized neural network used as a handcrafted prior gives excellent results in standard inverse problems such as denoising, super-resolution and inpainting
  • The same prior can be used to invert deep neural representations and to restore images from flash/no-flash input pairs
  • This approach highlights the inductive bias captured by standard generator network architectures

Introduction

  • Image reconstruction problems (denoising, single-image super-resolution) – commonly tackled with GAN, VAE, or direct pixel-wise error minimization
  • Learning from a large dataset alone does not explain this performance: some authors showed that an image classification network that generalizes well when trained on genuine data can also overfit when presented with random labels.
  • So, generalization requires the network structure to resonate with the structure of the data.
  • This paper shows that plenty of image statistics can be captured by the network structure alone, without any learning.
  • Approach: take an untrained ConvNet and fit the generator network to a single degraded image.
  • The network weights are the parametrization of the restored image.
  • Weights are randomly initialized and fitted to maximize their likelihood given the degraded image.
  • Reconstruction is posed as a conditional image generation problem (denoising, inpainting, super-resolution).

Method

  • Image generation – learn a generator network $x = f_\theta(z)$ that maps a random code vector $z$ to an image $x$, where $x \in \mathbb{R}^{3 \times H \times W}$, $z \in \mathbb{R}^{C' \times H' \times W'}$ and $\theta$ are the network parameters
  • U-Net type “hourglass” architecture with skip connections; $z$ and $x$ have the same spatial size
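  • $x$ – clean image
  • $x_0$ – corrupted image
  • $x^*$ – restored image, $x^* = \arg\min_x E(x; x_0) + R(x)$, where $E(x; x_0)$ is the task-dependent data term and $R(x)$ is the regularizer
    • Denoising: $E(x; x_0) = \|x - x_0\|^2$ – needs early stopping
    • Inpainting: $E(x; x_0) = \|(x - x_0) \odot m\|^2$ – $m$ is the binary mask
    • Super-resolution: $E(x; x_0) = \|d(x) - x_0\|^2$
    • Feature inversion: $E(x; x_0) = \|\Phi(x) - \Phi(x_0)\|^2$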
  • The choice of regularizer, which usually captures a generic prior on natural images, is more difficult and is the subject of much research. This paper replaces the regularizer with the implicit prior captured by the neural network, as follows
    • $\theta^* = \arg\min_\theta E(f_\theta(z); x_0)$, $x^* = f_{\theta^*}(z)$
    • The minimizer $\theta^*$ is obtained with an optimizer such as SGD, starting from a random initialization of the parameters
  • Explicit prior: $\min_x \|d(x) - x_0\|^2$ s.t. $x$ is a face, a natural image, etc.
  • Deep Image Prior: $\min_x \|d(x) - x_0\|^2$ s.t. $x$ is an output of a CNN
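  • MAP: $x^* = \arg\max_x P(x \mid x_0)$
  • Bayes' rule: $P(x \mid x_0) = \frac{P(x_0 \mid x)\, P(x)}{P(x_0)} \propto P(x_0 \mid x)\, P(x)$
  • $P(x_0 \mid x)$ – likelihood
  • $P(x)$ – prior
  • In degradation, $x_0 = x + \epsilon$, $\epsilon \sim \mathcal{N}(0, \sigma^2)$ and $P(x_0 \mid x) = \mathcal{N}(x_0; x, \sigma^2)$
  • In restoration: $x^* = \arg\max_x P(x \mid x_0)$

$= \arg\max_x P(x_0 \mid x)\, P(x)$

$= \arg\max_x P(x_0 \mid x)$, since the prior is a constant (we don’t have any preference)

$= \arg\max_x \mathcal{N}(x_0; x, \sigma^2) = x_0$ – we will not restore anything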
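The last step can be made explicit by taking the negative logarithm of the Gaussian likelihood. The short derivation below is a sketch under the noise model stated above, with $x_0$ the observed image and $\sigma^2$ the noise variance:

```latex
\begin{aligned}
\arg\max_x \mathcal{N}(x_0; x, \sigma^2)
  &= \arg\max_x \exp\!\left(-\frac{\|x_0 - x\|^2}{2\sigma^2}\right) \\
  &= \arg\min_x \frac{\|x - x_0\|^2}{2\sigma^2} \\
  &= \arg\min_x \|x - x_0\|^2 = x_0
\end{aligned}
```

This is the denoising data term with no regularizer, so an unconstrained search over images simply returns the noisy observation.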

  • That means the MAP estimate is the same as the ML estimate if the prior is uniform (i.e., no prior).
  • Parametrization:
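    • Regular: $\arg\min_x E(x; x_0) + R(x)$ – search in image space
    • Parametrized: $\arg\min_\theta E(g(\theta); x_0) + R(g(\theta))$ – search in the space of parameters $\theta$
    • If $g$ is surjective (for each $x$ there exists $\theta$ such that $g(\theta) = x$), then the two problems are equivalent
    • In practice, even for surjective $g$, the solution will be different, because $g$ changes how the optimizer searches the image space
    • Let’s treat $g$ as a hyperparameter and tune it.
    • $g$ itself is a prior, and maybe it’s sufficient to optimize only the data term $\arg\min_\theta E(g(\theta); x_0)$
    • $g(\theta) \equiv f_\theta(z)$ – convolutional network with parameters $\theta$ and a fixed input $z$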
  • Deep Image Prior – Step by Step
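    • 1. Initialize $z$ (fill it with uniform noise $U(-1, 1)$)
    • 2. Solve $\arg\min_\theta E(f_\theta(z); x_0)$ using a gradient-based method: $\theta^{k+1} = \theta^k - \alpha\, \partial E(f_\theta(z); x_0) / \partial \theta$
    • 3. Get the solution $x^* = f_{\theta^*}(z)$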
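The three steps above can be sketched in plain Python. This is only a toy illustration of the mechanics: the "generator" is a hypothetical linear map f_theta(z) = theta · z on a 1-D signal, standing in for the hourglass CNN, so it carries no image prior of its own. The function name `fit_toy_generator`, the learning rate and the step count are all assumptions chosen for the example.

```python
import random
from typing import List, Tuple


def fit_toy_generator(x0: List[float], code_dim: int = 4, steps: int = 200,
                      lr: float = 0.05, seed: int = 0) -> Tuple[List[float], List[float]]:
    """Fit a toy linear generator f_theta(z) = theta @ z to one observation x0.

    The linear map is an assumption standing in for the CNN of the paper.
    Returns the final generator output and the energy recorded before each update.
    """
    rng = random.Random(seed)
    n = len(x0)

    # Step 1: fixed code z filled with uniform noise U(-1, 1).
    z = [rng.uniform(-1.0, 1.0) for _ in range(code_dim)]
    # Random initialization of the parameters theta (an n x code_dim matrix).
    theta = [[rng.gauss(0.0, 0.1) for _ in range(code_dim)] for _ in range(n)]

    def generate(params: List[List[float]]) -> List[float]:
        return [sum(w * zj for w, zj in zip(row, z)) for row in params]

    energies = []
    for _ in range(steps):
        x = generate(theta)
        residual = [xi - x0i for xi, x0i in zip(x, x0)]
        # Data term E(f_theta(z); x0) = ||f_theta(z) - x0||^2.
        energies.append(sum(r * r for r in residual))
        # Step 2: theta <- theta - lr * dE/dtheta,
        # where dE/dtheta[i][j] = 2 * residual[i] * z[j] for this linear map.
        for i in range(n):
            for j in range(code_dim):
                theta[i][j] -= lr * 2.0 * residual[i] * z[j]

    # Step 3: the solution is the generator output at the final parameters.
    return generate(theta), energies


if __name__ == "__main__":
    restored, energies = fit_toy_generator([1.0, 2.0, 3.0])
    print("first energy:", energies[0], "last energy:", energies[-1])
```

The `steps` argument is where early stopping would enter: running fewer iterations leaves a larger residual. In the actual method the restoring effect comes from the convolutional architecture, which this linear stand-in does not reproduce.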

Applications

  • Denoising and generic reconstruction
    • $x_0 = x + \epsilon$, where $\epsilon$ follows a particular distribution.
    • But in blind denoising the noise model is unknown.
  • Super-resolution
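    • Given an LR image $x_0 \in \mathbb{R}^{3 \times H \times W}$ and an upsampling factor $t$, generate the corresponding HR image $x \in \mathbb{R}^{3 \times tH \times tW}$
    • $E(x; x_0) = \|d(x) - x_0\|^2$, where $d(\cdot)$ is a downsampling operator that resizes an image by a factor of $t$.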
  • Inpainting
  • Natural pre-image
  • Flash-no flash reconstruction
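The data terms for denoising, inpainting and super-resolution can be written out as a minimal sketch in plain Python. Two simplifications are assumptions made for the example: images are single-channel lists of rows rather than 3 x H x W tensors, and the downsampling operator `downsample` uses t x t block averaging, which is only one possible choice of d(.).

```python
from typing import List

Image = List[List[float]]  # single-channel image as rows of pixels (an assumption)


def denoising_energy(x: Image, x0: Image) -> float:
    """E(x; x0) = ||x - x0||^2, summed over all pixels."""
    return sum((a - b) ** 2 for rx, r0 in zip(x, x0) for a, b in zip(rx, r0))


def inpainting_energy(x: Image, x0: Image, m: List[List[int]]) -> float:
    """E(x; x0) = ||(x - x0) * m||^2, with m = 1 on known pixels and 0 on missing ones."""
    return sum(((a - b) * k) ** 2
               for rx, r0, rm in zip(x, x0, m)
               for a, b, k in zip(rx, r0, rm))


def downsample(x: Image, t: int) -> Image:
    """Hypothetical d(.): average every t x t block into one pixel."""
    h, w = len(x) // t, len(x[0]) // t
    return [[sum(x[i * t + di][j * t + dj] for di in range(t) for dj in range(t)) / (t * t)
             for j in range(w)]
            for i in range(h)]


def super_resolution_energy(x: Image, x0: Image, t: int) -> float:
    """E(x; x0) = ||d(x) - x0||^2, comparing the downsampled candidate with the LR image."""
    return denoising_energy(downsample(x, t), x0)


if __name__ == "__main__":
    x = [[1.0, 2.0], [3.0, 4.0]]
    x0 = [[1.0, 0.0], [3.0, 0.0]]
    mask = [[1, 0], [1, 0]]  # right column is treated as missing
    print(denoising_energy(x, x0))         # mismatch over every pixel
    print(inpainting_energy(x, x0, mask))  # mismatch over known pixels only
    print(super_resolution_energy(x, [[2.5]], 2))
```

The inpainting term ignores the masked-out pixels entirely, so whatever ends up in the holes comes from the generator's structure and not from the data.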

Related work

  • Highly related to self-similarity-based and dictionary-based priors.
  • Even single-layer convolutional sparse coding has been proposed for reconstruction.

Link: https://arxiv.org/abs/1711.10925
