Image Segmentation

  • Different types of segmentation
    • Semantic segmentation (every pixel is classified)
    • Instance segmentation (object-level: each individual object instance gets its own mask)
    • Panoptic segmentation (combination of semantic and instance segmentation)

Latent space representation

  • After the convolutions and pooling we are left with many feature maps at a reduced spatial resolution. This compressed latent space representation can be used for many tasks, such as classification, upsampling, etc.
  • Auto-encoders use this structure: an encoder compresses the image into the latent space and a decoder reconstructs it (see the sketch below)
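A minimal PyTorch sketch of this encoder / latent space / decoder idea (the layer sizes and channel counts are arbitrary assumptions, not from the notes):

import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: convolutions + pooling shrink the spatial size,
        # producing a compressed latent-space representation
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                      # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                      # 32x32 -> 16x16
        )
        # Decoder: upsampling layers bring the latent maps back to the image size
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2),  # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, kernel_size=2, stride=2),   # 32x32 -> 64x64
        )

    def forward(self, x):
        latent = self.encoder(x)   # compressed latent-space representation
        return self.decoder(latent)

x = torch.randn(1, 3, 64, 64)
print(TinyAutoencoder()(x).shape)  # torch.Size([1, 3, 64, 64])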

Upsampling

  • Upsampling goes from the latent space representation back to the original size of the image
  • Upsampling happens in the decoder
  • Upsampling is used for image generation, enhancement, mapping and more
    • Nearest Neighbor
    • Bed of Nails
    • Bilinear upsampling
    • Transposed convolutions
    • Max unpooling

Nearest Neighbor

  • Each output pixel copies the value of its nearest input pixel, so every input value is simply repeated over a block of output pixels
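For example (toy values, assumed), a 2x2 input becomes a 4x4 output by repeating every value:

import torch

x = torch.tensor([[1., 2.],
                  [3., 4.]])
# Repeat every row and every column twice: 2x2 -> 4x4
up = x.repeat_interleave(2, dim=0).repeat_interleave(2, dim=1)
# tensor([[1., 1., 2., 2.],
#         [1., 1., 2., 2.],
#         [3., 3., 4., 4.],
#         [3., 3., 4., 4.]])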

Bed of nails

  • Each input value is placed at one fixed position (e.g. the top-left) of its output block and the remaining positions are filled with zeros
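A toy sketch (values assumed): each input value lands in one fixed corner of its 2x2 output block and everything else stays zero:

import torch

x = torch.tensor([[1., 2.],
                  [3., 4.]])
up = torch.zeros(4, 4)
up[::2, ::2] = x   # top-left position of each 2x2 block gets the input value
# tensor([[1., 0., 2., 0.],
#         [0., 0., 0., 0.],
#         [3., 0., 4., 0.],
#         [0., 0., 0., 0.]])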

Bilinear interpolation

  • The most popular upsampling method
  • Each output pixel is a distance-weighted average of the four nearest input pixels

Max unpooling

  • In max unpooling we memorize the position of the maximum value during max pooling. During upsampling, the value is placed back at exactly that position and the remaining positions are filled with zeros
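A minimal PyTorch sketch using the built-in pooling/unpooling pair (shapes are arbitrary assumptions):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 1, 4, 4)
pool = nn.MaxPool2d(2, stride=2, return_indices=True)   # also returns the argmax positions
pooled, indices = pool(x)
# Put each max value back at its remembered position; all other positions are zero
unpooled = F.max_unpool2d(pooled, indices, kernel_size=2, stride=2)
print(unpooled.shape)   # torch.Size([1, 1, 4, 4])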

Transposed convolution

  • This is a learnable upsampling method

  • Each input value is multiplied by the filter weights, and the weighted filters are summed into overlapping regions of the larger output

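A hedged sketch of a learnable 2x upsampling step with a transposed convolution in PyTorch (the channel counts are assumptions):

import torch
import torch.nn as nn

x = torch.randn(1, 32, 16, 16)
# Learnable upsampling: each input value scales the kernel weights,
# and the scaled kernels are summed into the larger output
up = nn.ConvTranspose2d(in_channels=32, out_channels=16, kernel_size=2, stride=2)
print(up(x).shape)   # torch.Size([1, 16, 32, 32])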

Interpolate in PyTorch

import torch
import torch.nn.functional as F

input = torch.randn(1, 3, 32, 32)   # (batch, channels, height, width)

nearest_neighbour_out = F.interpolate(input, scale_factor=2, mode='nearest')

bilinear_interp_out = F.interpolate(input, scale_factor=2, mode='bilinear', align_corners=False)

Skip Connections

  • When we decompress back to the original image size, it is difficult for the network to recover fine detail because a lot of spatial information is lost during compression. To help with this, skip connections connect the encoder and the decoder.


  • There are two types of skip connections: additive and concatenating

  • ResNet uses addition

  • DenseNet uses concatenation


In segmentation, skip connections are used to pass features from the encoder path to the decoder path in order to recover spatial information lost during downsampling.
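A minimal sketch of the two variants (tensor shapes are assumptions): addition needs matching channel counts, while concatenation stacks the channels for a following convolution to merge:

import torch

decoder_feat = torch.randn(1, 64, 32, 32)   # upsampled decoder features
encoder_feat = torch.randn(1, 64, 32, 32)   # features saved from the encoder path

additive_skip = decoder_feat + encoder_feat                    # ResNet-style
concat_skip = torch.cat([decoder_feat, encoder_feat], dim=1)   # DenseNet / U-Net-style
print(additive_skip.shape, concat_skip.shape)
# torch.Size([1, 64, 32, 32]) torch.Size([1, 128, 32, 32])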

  • A 1x1 convolution is used to reduce the number of filters (channels): at every spatial position it computes a learned weighted combination of all input channels, so the spatial size is unchanged while the channel dimension shrinks
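A small sketch of channel reduction with a 1x1 convolution (the channel counts are assumptions):

import torch
import torch.nn as nn

x = torch.randn(1, 256, 32, 32)
# At every spatial position, a learned weighted combination of the 256 input
# channels produces 64 output channels; height and width are unchanged
reduce_channels = nn.Conv2d(256, 64, kernel_size=1)
print(reduce_channels(x).shape)   # torch.Size([1, 64, 32, 32])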

Metrics and loss functions

  • The main metric is Intersection over Union (IoU), also known as the Jaccard index
  • A common loss function is focal loss: a weighted cross-entropy loss with an extra focusing parameter gamma that down-weights easy examples and helps with class imbalance
  • Dice loss is also used
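A minimal sketch of IoU and Dice loss for binary masks (the helper names and the toy tensors are my own, not from the notes):

import torch

def iou(pred_mask, true_mask, eps=1e-6):
    # Intersection over Union (Jaccard index) for binary masks
    intersection = (pred_mask * true_mask).sum()
    union = pred_mask.sum() + true_mask.sum() - intersection
    return intersection / (union + eps)

def dice_loss(pred_probs, true_mask, eps=1e-6):
    # Dice loss = 1 - Dice coefficient
    intersection = (pred_probs * true_mask).sum()
    dice = (2 * intersection + eps) / (pred_probs.sum() + true_mask.sum() + eps)
    return 1 - dice

pred = torch.tensor([[0., 1.], [1., 1.]])
true = torch.tensor([[0., 1.], [0., 1.]])
print(iou(pred, true))        # tensor(0.6667)
print(dice_loss(pred, true))  # tensor(0.2000)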