FIFA World Cup 2018 Round of 16 flag detection using CNN

Building a convolutional neural network that can predict whether a given image shows the flag of one of the countries that qualified for the FIFA World Cup 2018 round of 16.

Hello everyone. This is part one of a two-part tutorial series on live flag detection using OpenCV and CNN. In this part, we will learn how to prepare a dataset and train a CNN to classify images. With the FIFA World Cup being the buzzword everywhere right now, we are going to apply a CNN to a rather interesting problem: classifying the flags of the countries that qualified for the FIFA World Cup 2018 round of 16. The countries (in alphabetical order) are:

  1. Argentina
  2. Belgium
  3. Brazil
  4. Colombia
  5. Croatia
  6. Denmark
  7. England
  8. France
  9. Japan
  10. Mexico
  11. Portugal
  12. Russia
  13. Spain
  14. Sweden
  15. Switzerland
  16. Uruguay
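
For a classifier, these 16 names become the class labels. Here is a minimal sketch of one way to map them to integer indices and one-hot vectors, assuming the alphabetical order listed above. The names `CLASS_NAMES`, `label_to_index` and `one_hot` are illustrative and not taken from the project code.

```python
# Class names in the same alphabetical order as the list above.
CLASS_NAMES = [
    "Argentina", "Belgium", "Brazil", "Colombia",
    "Croatia", "Denmark", "England", "France",
    "Japan", "Mexico", "Portugal", "Russia",
    "Spain", "Sweden", "Switzerland", "Uruguay",
]

# Map each country name to an integer class index (0..15).
label_to_index = {name: i for i, name in enumerate(CLASS_NAMES)}


def one_hot(name):
    """Return a 16-element one-hot list for the given country name."""
    vector = [0] * len(CLASS_NAMES)
    vector[label_to_index[name]] = 1
    return vector


print(label_to_index["Spain"])  # 12
print(one_hot("Belgium"))       # a 1 at position 1, zeros elsewhere
```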

The key thing to understand is that the model we are building can be trained on any kind of classes and any number of labels you want. We are using flags only to keep things interesting. For example, a doctor who completes this article could build and train a neural network that takes a brain scan as input and predicts which type of tumor the scan contains. Likewise, a botanist could build and train a neural network that takes an image of a leaf as input and predicts which type of plant it came from. The possibilities are limited only by your imagination. So, let’s begin.

We are going to use the Keras library on top of TensorFlow to build our CNN model. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. For our purpose, we will use the TensorFlow backend on Python 3.6. We already have a bunch of tutorials on TensorFlow, so if you want to check those, you can follow this link.

First of all, we need to gather a training image dataset, i.e., images of the flags of the countries. For this, we will use a Chrome extension called Bulk Image Downloader, which allows us to download multiple images at once. Add the extension to Chrome. Afterward, search for the image you need (e.g., Spain National Flag) on Google image search and click the Bulk Image Downloader icon in the top right-hand corner. Click on “Current tab” and select the images that you want to download. Repeat this for all 16 teams. A sample image dataset can be downloaded from this Google Drive link. It contains roughly 35-40 flag images for each team. A larger training set is generally recommended, but as we will see later, even this amount is enough for our purpose.
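
Once the images are downloaded, it helps to confirm that every team has roughly the expected number of files. The sketch below assumes a hypothetical layout with one subfolder per country (for example `dataset/Spain/001.jpg`). The folder name `dataset`, the extension list and the helper `count_images_per_class` are assumptions for illustration, not part of the sample dataset's documented structure.

```python
import os

# Assumed file extensions; adjust to whatever the downloader produced.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_class(dataset_dir):
    """Count image files in each immediate subfolder of dataset_dir.

    Assumes a hypothetical layout of one subfolder per country,
    e.g. dataset/Spain/001.jpg.
    """
    counts = {}
    for entry in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files at the top level
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


if __name__ == "__main__":
    dataset_dir = "dataset"  # hypothetical path
    if os.path.isdir(dataset_dir):
        for country, n in count_images_per_class(dataset_dir).items():
            print("{}: {} images".format(country, n))
```

A folder with far fewer images than the others is worth topping up before training, since an unbalanced dataset tends to bias the classifier toward the larger classes.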

Next, let’s create a virtual environment and install all the necessary dependencies.

Create Virtualenv and install necessary dependencies

Why a Python virtual environment is needed has already been discussed in another post here, so I’m not going to repeat it. Let’s look at how to set up virtualenv and install the necessary dependencies in Python 3.6. The easiest way to install virtualenv is through pip:

pip3 install --upgrade virtualenv

Once the virtualenv is installed, you can create separate virtual environments for each of your projects. Simply go to the project directory and type:

virtualenv kerasenv

You will see a message in your terminal like:

Installing setuptools, pip, wheel…done.

A newly created virtualenv contains an activate shell script in its bin/ directory, so you can run:

source kerasenv/bin/activate
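
To confirm that the activation worked, you can ask Python itself. This is a small sketch; the helper name `in_virtualenv` is made up for illustration. It relies on the fact that older virtualenv releases set `sys.real_prefix`, while newer releases and the built-in venv module make `sys.base_prefix` differ from `sys.prefix`.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter is running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer virtualenv and
    # the built-in venv module make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


print("Inside a virtual environment:", in_virtualenv())
print("Interpreter:", sys.executable)
```

If the environment is active, the interpreter path should point inside the kerasenv directory.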

Now we are ready to install the necessary dependencies. The dependencies we need for our project are as follows:

  1. tensorflow (1.5.0)
  2. Keras (2.1.4)
  3. OpenCV (3.4.1)
  4. scikit-learn (0.19.1), imported as sklearn

You can install these all at the same time using the command:

pip3 install tensorflow keras opencv-python scikit-learn

Computation is much faster if you have a GPU, but you’ll need to use the GPU version of TensorFlow. If you plan on using tensorflow-gpu instead, you can follow our other article here to learn how to install it.

Our other required dependencies, such as scipy and numpy, should be installed automatically along with these.
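
A quick way to check what actually got installed is to import each package and print its version. The sketch below is only a convenience check; the helper `report_versions` and the module list are illustrative. Note that the import names differ from the pip names for two packages: opencv-python is imported as `cv2` and scikit-learn as `sklearn`.

```python
import importlib

# Import names differ from pip names: opencv-python -> cv2, scikit-learn -> sklearn.
MODULES = ["tensorflow", "keras", "cv2", "sklearn", "numpy", "scipy"]


def report_versions(module_names):
    """Return {module name: version string, or None if the import fails}."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in report_versions(MODULES).items():
        print("{:<12} {}".format(name, version or "NOT INSTALLED"))
```

If the printed versions differ a lot from the ones listed above, some of the code in this series may need small adjustments.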

Introduction to Convolutional Neural Network (CNN)

Now we are ready to build a Convolutional Neural Network (CNN) to classify flag images. But first, we must understand what a CNN is. We will only cover the basic theory of CNNs in this article. I highly recommend the materials of the course CS231n if you want a deeper understanding of how CNNs work.

In machine learning, a Convolutional Neural Network (CNN, or ConvNet) is a class of deep, feed-forward artificial neural networks that have been applied successfully to analyzing visual imagery. Convolutional Neural Networks make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. There are three main types of layers used to build ConvNet architectures: the Convolutional Layer, the Pooling Layer, and the Fully-Connected Layer. We will stack these layers to form a full ConvNet architecture.
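
When layers are stacked, each one changes the spatial size of the data passing through it. The usual rule of thumb is output = (input - window + 2 * padding) / stride + 1. The sketch below applies it to a made-up stack; the 64x64 input and the layer settings are assumptions chosen for illustration, not the architecture used in this series.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size - kernel + 2 * padding) // stride + 1


# Hypothetical stack on a 64x64 input (not this series' architecture).
size = 64
size = conv_output_size(size, kernel=3, padding=1)  # CONV 3x3, pad 1 -> 64
size = conv_output_size(size, kernel=2, stride=2)   # POOL 2x2        -> 32
size = conv_output_size(size, kernel=3, padding=1)  # CONV 3x3, pad 1 -> 32
size = conv_output_size(size, kernel=2, stride=2)   # POOL 2x2        -> 16
print(size)  # 16
```

With these settings the convolutions keep the width and height unchanged, and each pooling layer halves them.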

CNN architecture

Image source: Wikipedia

The CONV layer computes the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume. The POOL layer performs a downsampling operation along the spatial dimensions (width, height). The FC (i.e., fully-connected) layer computes the class scores, resulting in a volume of size [1x1x16] in our case, where each of the 16 numbers corresponds to the score for one of the 16 flag labels. All this may seem confusing right now, which is why the CS231n course materials are worth a look if you want a deeper understanding. For now, all we need to understand is that CNNs are one of the best available tools for machine vision, and we will use one to classify the FIFA World Cup 2018 Round of 16 flags.
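
To make the three operations concrete, here is a toy sketch in plain Python (lists instead of NumPy or Keras, to keep it dependency-free). It is not how Keras implements these layers. The image, kernel and weights are made-up placeholder values, and the final softmax step is an addition commonly used to turn class scores into probabilities.

```python
import math


def conv2d(image, kernel):
    """Slide the kernel over the image, taking a dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Downsample by taking the max over non-overlapping size x size windows."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def fully_connected(inputs, weights, biases):
    """One score per class: dot(weights[k], inputs) + biases[k]."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 5x5 "image" and a 2x2 kernel; the values are made up.
image = [[float((r * 5 + c) % 7) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0], [0.0, -1.0]]

features = conv2d(image, kernel)   # 4x4 feature map
pooled = max_pool(features)        # 2x2 after 2x2 pooling
flat = [v for row in pooled for v in row]

# Placeholder weights for 16 classes (a trained network learns these).
weights = [[0.01 * (k + 1)] * len(flat) for k in range(16)]
biases = [0.0] * 16
probs = softmax(fully_connected(flat, weights, biases))
print(len(probs), round(sum(probs), 6))
```

The index of the largest probability is the predicted class, which maps back to one of the 16 flag labels.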
