Simultanous Super Resolution of Underwater Images using Cycle Generative Adversarial Networks

Enhancing underwater imagery using GANs

Computer vision is an attractive option when it comes to underwater monitoring, especially at shallower depths. But, light-distortion factors due to refraction, absorption, and suspended particles lose many quality features in an image, with the resulting image being noisy and distorted. In this blog, we talk about using Generative Adversarial Networks (GANs) to improve underwater images’ resolution. We discuss recent breakthroughs in GANs in underwater imagery from scientific papers such as Deep SESR (we would reference all discussed papers below). We also discuss the possibility of using GANs to generate full-fledged datasets for underwater image restoration.

Lets talk about Ground Truth

When we talk about automating the process of colorizing greyscale images, we usually have paired training data because any color image can be converted to black and white. With underwater images, we lack the ground truth, which hinders us from using a similar approach.

Dataset Contruction

The major problem is acquiring training data. Those of you doing vision in underwater environments know the massive lack of open and annotated datasets. To bypass this problem, we use a type of GAN called the CycleGAN. CycleGANs are extremely useful in generating a dataset of image pairs. CycleGAN learns a mapping G: X-> Y such that images from G(X) are indistinguishable from images in Y, as well as a mapping F: Y -> X such that images sampled in from F(Y) are indistinguishable from images sampled in X.

Dataset Construction
Dataset Construction

  • Generator-A -> Domain-A
  • Generator-B -> Domain-B

Here the Generator models perform image translation where the image generation process is conditional on some input image (precisely image from the other domain). Generator-A takes an image from Domain B as input, and Generator-B takes an image from Domain-A as input.

  • Domain-B -> Generator-A -> Domain-A
  • Domain-A -> Generator-B -> Domain-B

Each generator has a corresponding discriminator model. The first discriminator model (Discriminator-A) takes real images from Domain-A and generates Generator-A images, and predicts whether they are real or fake. The second discriminator model (Discriminator-B) takes real images from Domain-B and generated Generator-B images and indicates whether they are real or fake.

  • Domain-A -> Discriminator-A -> [Real/Fake]
  • Domain-B -> Generator-A -> Discriminator-A -> [Real/Fake]
  • Domain-B -> Discriminator-B -> [Real/Fake]
  • Domain-A -> Generator-B -> Discriminator-B -> [Real/Fake]

So we construct two domains, A and B, where A contains the distorted images and B contains non-distorted images taken from select subsets in ImageNet. CycleGAN is then used to generate sets of image pairs such that they appear to be underwater.

Loss Function

There are two components to the CycleGAN objective function: An adversarial loss and cycle consistency loss. The adversarial loss is precisely like it sounds: Both generators attempt to “Trick” each other’s discriminator into being less able to distinguish between real-versions. For the adversarial loss, we can use least square loss :

Adversarial Loss
Adversarial Loss

However, the adversarial loss itself cannot produce acceptable images since it leaves the model constrained. It enforces the generated output to be of an appropriate domain but does not enforce input and output are recognizably the same. To address this issue, we use cycle consistency loss. It relies on the fact that if you convert an image to others and back again by successively feeding it through both generators, you should be able to get something similar to what you put in.

Cycle Consistent Loss
Cycle Consistent Loss


We compare patches on local images as one source of evaluation, as the original image has no specific ground truth.

Evaluation of the model
Evaluation of the model

I write about ML, Web Dev, and more topics. Subscribe to get new posts by email!

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

This blog and the website is open-source on Github.

At least this isn't a full screen popup

Subscribe to my newsletter to get new posts by email! I write about DL, Climate x Tech, and more topics.

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.