Machine Learning Diaries I, Computer Vision

Ever since I attended a lecture on machine learning by Paige Bailey, my interest in the field has grown, pushing me to research its different uses. I came across a course that I found very interesting. It briefly introduces deep learning and then goes deeper into the theory, using many examples of how TensorFlow (Google’s deep learning and machine learning framework) works. After some discussions with my colleagues from Solid GEAR Projects, I started working on a facial recognition system that combines machine learning and computer vision, as a way to teach myself.

The programming languages most used in projects like this are R and Python, and since I have extensive experience with the latter, I decided to use Python. The language is also easy to write, which lets me focus on the algorithms and change them rapidly. Furthermore, the project can easily be exposed through a public API, which can be built quickly with any of the several frameworks available for Python: Django, Flask, and so on.

Flow in a machine learning system

The first thing we need to know is the flow the system follows to recognize a face. In this post we will only deal with the first part, which covers getting the images and preprocessing them so that the machine learning system can understand them:

1. Get the images we need, either from the Internet or from a camera. This post shows how to do the latter.
2. Detect the faces within the image (we will only pick the biggest one).
3. Frame the face and detect the points of reference: there are 68 of them, and together they describe a face’s features.
4. Correct the face’s position within the image so that its features line up with the points of reference.

Once these steps are done, the images are ready to be used to train the system.
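To make the "pick the biggest face" step concrete, here is a minimal sketch in plain Python. The `Box` type and the `box_area` and `biggest_box` helpers are hypothetical names made up for this illustration; a real detector returns its own rectangle objects, but the idea is the same: compare the areas and keep the largest.

```python
from collections import namedtuple

# Hypothetical stand-in for a detector result: a bounding box in pixel coordinates.
Box = namedtuple("Box", ["left", "top", "right", "bottom"])


def box_area(box):
    """Area of a bounding box in square pixels (0 for degenerate boxes)."""
    return max(0, box.right - box.left) * max(0, box.bottom - box.top)


def biggest_box(boxes):
    """Return the box with the largest area, or None when nothing was detected."""
    if not boxes:
        return None
    return max(boxes, key=box_area)


# Three made-up detections; the second one covers the most pixels.
detections = [Box(10, 10, 60, 60), Box(100, 40, 260, 220), Box(300, 5, 330, 45)]
print(biggest_box(detections))  # Box(left=100, top=40, right=260, bottom=220)
```

Returning None for an empty list lets the caller simply skip frames in which no face was found.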

So let’s get down to business: we are going to use a Python library called OpenCV, which will help us obtain images from our camera, and another library called Dlib to process the images and make the changes needed to correct the position and size of the face. The script also imports openface, whose AlignDlib helper performs the alignment itself. We will also need the .dat file that contains the 68 points of reference of a face’s features; you can find it here.

import cv2
import sys
import time
import dlib
import openface


def main():
    """Method executed when the script is launched."""
    
    # This will allow the script to capture video from the camera.
    # Index 0 selects the default camera.
    cam_output = cv2.VideoCapture(0)

    # Let the camera warm up (there is usually no image output during the first 2-3 seconds).
    # Note that time.sleep() takes seconds, not milliseconds.
    time.sleep(3)
    
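    # Start reading from the camera
    while True:
        success, image = cam_output.read()
        if not success:
            # No frame could be read (camera missing or busy), so stop.
            break

        # Show the frame: cv2.waitKey() below only receives key presses while a window is open.
        cv2.imshow('camera', image)

        # At this point we already have the image, so we can start processing it
        # and get the aligned face.
        aligned_face = get_face_in_image_aligned(image)

        # Now that we have the cropped image, we may want to save it.
        # imwrite() takes the full file name (the extension selects the format), then the image.
        if aligned_face is not None:
            cv2.imwrite('path/where/images/are/saved/image_name.png', aligned_face)

        # Press esc to finish reading from the camera.
        if cv2.waitKey(1) == 27:
            break

    # Free the camera and close the preview window.
    cam_output.release()
    cv2.destroyAllWindows()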

def get_face_in_image_aligned(image):
    """
    Gets the biggest face in the image and aligns the image to focus on that face.

    :param image: The image where the face will be found.

    :return: An image focusing on the face, None if no face is detected.
    """

    # Loading the models on every call is slow; in a longer script, create them once and reuse them.
    face_detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor('path/to/shape_predictor_68_face_landmarks.dat')
    face_aligner = openface.AlignDlib('path/to/shape_predictor_68_face_landmarks.dat')

    # There may be more than one face and we only need one, so we keep the biggest.
    detected_faces = face_detector(image, 1)
    if len(detected_faces) == 0:
        return None
    detected_face_rect = max(detected_faces, key=lambda rect: rect.width() * rect.height())

    # Detect the 68 facial landmarks with the help of the downloaded .dat file.
    detected_face_landmarks = predictor(image, detected_face_rect)
    landmarks = [(point.x, point.y) for point in detected_face_landmarks.parts()]

    # Align the face using those landmarks and return the result.
    return face_aligner.align(
        534,  # output image size in pixels
        image,
        detected_face_rect,
        landmarks=landmarks,
        landmarkIndices=openface.AlignDlib.OUTER_EYES_AND_NOSE
    )

if __name__ == '__main__':
    main()
    sys.exit(0)
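
To give some intuition for what the alignment step does, here is a rough sketch in plain Python of the underlying idea: given three reference points on the detected face (the outer corners of the eyes and the nose, as the OUTER_EYES_AND_NOSE name suggests) and the positions where we want them to end up, we can solve for an affine transform that moves one set onto the other. This is only an illustration under my own assumptions and not openface's actual implementation; the helper names and all coordinates below are made up.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 linear system m * x = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three source points are collinear")
    solution = []
    for col in range(3):
        # Replace one column of m with the right-hand side.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_points(src, dst):
    """Return ((a, b, c), (d, e, f)) with x' = a*x + b*y + c and y' = d*x + e*y + f."""
    m = [[float(x), float(y), 1.0] for x, y in src]
    row_x = solve3(m, [u for u, _ in dst])
    row_y = solve3(m, [v for _, v in dst])
    return tuple(row_x), tuple(row_y)


def apply_affine(transform, point):
    """Map a single (x, y) point through the affine transform."""
    (a, b, c), (d, e, f) = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Made-up landmark positions on a slightly tilted face: left eye, right eye, nose.
detected = [(120, 150), (200, 130), (165, 200)]
# Made-up target positions inside a 96x96 output image.
target = [(20, 30), (76, 30), (48, 60)]

transform = affine_from_points(detected, target)
for point in detected:
    x, y = apply_affine(transform, point)
    print(point, "->", (round(x, 3), round(y, 3)))
```

Applying the same transform to every pixel of the image, instead of only to the three points, is what produces the straightened and rescaled face.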

In the next post I will explain how to train the model and how to make the system recognize faces.
