How to write your own face recognition system using Python and facenet


The first time I saw how Apple's Face ID worked, I thought it would be hard to implement something similar. In general that is true: if you write everything from scratch, it is a difficult problem. Nowadays, though, we have plenty of tools that let us build such a system much faster. You just need good libraries.

One such library is facenet. There are others, but we will not discuss them in this text. If you are interested, just google them or PM me.

To build this program we will use Python 3.6, tensorflow, opencv, facenet and a little bit of magic. First of all you need Python 3.6 installed on your computer (it should also work with Python 3.7, but I did not test it). If you do not have Python 3.6 installed, please follow the instructions on the official Python site.

1. Here we go. First of all, create a virtual environment for Python and activate it (assuming you have created a folder for your new project and are already in it):

python3 -m venv env
source env/bin/activate

2. After that, install all requirements with this command:

pip3 install facenet opencv-python

That's all we need. 

3. Next, download one of the pre-trained models from the facenet project page (section Pre-trained models). I can not say which model is better; both work fine, so you can use whichever you want.

Let's assume you selected model 20180402-114759. Just download the zip file and unzip it into your project folder.

4. Take a photo of your face and place it in the folder images. The extension should be .jpg, and the file name (without the extension) will be used as the name shown for that face.
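
To make the naming rule concrete, here is a small sketch of how file names in a folder turn into labels. The helper labels_from_folder is a hypothetical name used only for this illustration, and the demo runs on a temporary folder with empty files instead of real photos.

```python
import os
import tempfile


def labels_from_folder(folder, extension=".jpg"):
    """Return the file names (without extension) of all matching images."""
    labels = []
    for entry in sorted(os.listdir(folder)):
        name, ext = os.path.splitext(entry)
        # compare case-insensitively so photo.JPG is accepted too
        if ext.lower() == extension:
            labels.append(name)
    return labels


# Demo on a throwaway folder with empty placeholder files
with tempfile.TemporaryDirectory() as folder:
    for entry in ("alice.jpg", "Bob.JPG", "notes.txt"):
        open(os.path.join(folder, entry), "w").close()
    found = labels_from_folder(folder)

print(found)
```

So a photo saved as images/alice.jpg would be shown as "alice" on the video frame.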

5. In this step we are going to start writing the script. I will explain in general what each block of code does, and after the last block we will combine it all together.

Import the things that we need and init some constants:

import os
import fnmatch
import re

import numpy as np
import cv2
from facenet.src.align import detect_face
from facenet.src import facenet
import tensorflow as tf

MINSIZE = 20
THRESHOLD = [0.6, 0.7, 0.7]
FACTOR = 0.709
MARGIN = 44
SCALE = 0.25
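
MINSIZE and FACTOR control the image pyramid that the MTCNN face detector scans. The sketch below mirrors how, as far as I understand facenet's detect_face, the list of scales is derived from them. The function name pyramid_scales and the 12 pixel cell size are written out here for illustration only, so treat it as an approximation and not the library's exact code.

```python
def pyramid_scales(height, width, minsize=20, factor=0.709, cell=12):
    """List the scales at which the detector would rescale the image.

    minsize: smallest face (in pixels) we want to detect
    factor:  how much the image shrinks between pyramid levels
    cell:    input size of the first detector network (assumed to be 12)
    """
    first_scale = cell / minsize
    min_side = min(height, width) * first_scale
    scales = []
    while min_side >= cell:
        scales.append(first_scale * factor ** len(scales))
        min_side *= factor
    return scales


# A 640x480 webcam frame reduced by SCALE = 0.25 is 160x120 pixels
scales = pyramid_scales(120, 160)
print(len(scales), [round(s, 3) for s in scales])
```

A bigger MINSIZE or a smaller FACTOR means fewer pyramid levels, so detection gets faster but small faces are more likely to be missed.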

Init facenet model (this code uses tf.Session, so it needs TensorFlow 1.x):

# init model
sess = tf.Session()
with sess.as_default():
    facenet.load_model('20180402-114759')

    images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
    embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
    phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
    embedding_size = embeddings.get_shape()
    input_image_size = images_placeholder.get_shape()[1]
pnet, rnet, onet = detect_face.create_mtcnn(sess, None)

def get_embedding(resized):
    reshaped = resized.reshape(-1, input_image_size, input_image_size, 3)
    feed_dict = {
        images_placeholder: reshaped, 
        phase_train_placeholder: False,
    }
    embedding = sess.run(embeddings, feed_dict=feed_dict)
    return embedding

def prewhiten(x):
    mean = np.mean(x)
    std = np.std(x)
    std_adj = np.maximum(std, 1.0/np.sqrt(x.size))
    y = np.multiply(np.subtract(x, mean), 1/std_adj)
    return y

Here 20180402-114759 is the name of the folder where we extracted the pre-trained model.
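
The prewhiten helper is worth a quick look on its own. It shifts the pixel values to zero mean and scales them to unit standard deviation, and the std_adj guard keeps it from dividing by zero on a completely flat image. A minimal check with numpy only, using random pixels in place of a real face crop (the 160x160 size is just an example):

```python
import numpy as np


def prewhiten(x):
    # same normalization as in the script: zero mean, unit standard deviation
    mean = np.mean(x)
    std = np.std(x)
    # lower bound on std so that a uniform image does not divide by zero
    std_adj = np.maximum(std, 1.0 / np.sqrt(x.size))
    return np.multiply(np.subtract(x, mean), 1 / std_adj)


rng = np.random.default_rng(0)
fake_face = rng.integers(0, 256, size=(160, 160, 3)).astype(np.uint8)
whitened = prewhiten(fake_face)
print(whitened.dtype, float(whitened.mean()), float(whitened.std()))

# a single-color image comes out as all zeros instead of NaN
flat = prewhiten(np.full((160, 160, 3), 128, dtype=np.uint8))
print(bool(np.all(flat == 0)))
```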

Load the images from the folder, find the face in each one (there should be only one face per image) and create an embedding for each face:

def findfiles(which, where='.'):
    rule = re.compile(fnmatch.translate(which), re.IGNORECASE)
    return [os.path.join(where, name) for name in os.listdir(where) if rule.match(name)]

# load images
EMBEDDINGS = {}

for filename in findfiles('*.jpg', 'images'):
    img = cv2.imread(filename)
    # OpenCV loads images as BGR; convert to RGB so these embeddings
    # are computed the same way as the ones from the video frames
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    bounding_boxes, _ = detect_face.detect_face(img, MINSIZE, pnet, rnet, onet, THRESHOLD, FACTOR)
    if bounding_boxes.any():
        assert bounding_boxes.shape[0] == 1, 'Found too many faces in the image'
        box = bounding_boxes[0]
        x, y, x2, y2, accuracy = box
        if accuracy > 0.7:
            cropped = img[int(y):int(y2), int(x):int(x2), :]
            resized = cv2.resize(cropped, (input_image_size, input_image_size), interpolation=cv2.INTER_CUBIC)
            prewhitened = prewhiten(resized)
            # the file name without extension becomes the person's name
            name, _ = os.path.splitext(os.path.basename(filename))
            EMBEDDINGS[name] = get_embedding(prewhitened)
    else:
        raise Exception('Can not find a face in the image')
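
Note that the MARGIN constant is defined above but the script never uses it. In facenet's own alignment script, as far as I remember, the margin widens the detected box a little before cropping and the result is clamped to the image borders. Below is a hedged sketch of that idea. The helper name expand_box is hypothetical, and whether a 44 pixel margin suits a downscaled frame is something you would need to try for yourself.

```python
def expand_box(box, frame_shape, margin=44):
    """Grow (x1, y1, x2, y2) by margin/2 on each side, clamped to the frame.

    box may contain extra values (such as the confidence); only the first
    four are used. frame_shape is (height, width, ...) as in img.shape.
    """
    height, width = frame_shape[:2]
    x1, y1, x2, y2 = box[:4]
    half = margin / 2
    left = int(max(x1 - half, 0))
    top = int(max(y1 - half, 0))
    right = int(min(x2 + half, width))
    bottom = int(min(y2 + half, height))
    return left, top, right, bottom


# a box that sticks out of the top-left corner of a 160x120 frame
print(expand_box((-5.3, 10.2, 60.7, 90.1, 0.99), (120, 160, 3)))
# a box close to the bottom-right corner
print(expand_box((100, 80, 158, 118, 0.98), (120, 160, 3)))
```

The clamping has a useful side effect: the detector sometimes returns coordinates slightly outside the image, and slicing a numpy array with a negative start silently gives an empty or wrong crop.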

Init opencv, detect faces using facenet and search for them in our database:

# init video
cap = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        # stop if the camera did not return a frame
        break
    # Resize frame of video to 1/4 size for faster face recognition processing
    img = cv2.resize(frame, (0, 0), fx=SCALE, fy=SCALE)
    # Convert the image from BGR color (which OpenCV uses) to RGB color (which facenet expects)
    img = img[:, :, ::-1]

    # detect bounding boxes
    bounding_boxes, list_points = detect_face.detect_face(img, MINSIZE, pnet, rnet, onet, THRESHOLD, FACTOR)
    if bounding_boxes.any():
        for index, face in enumerate(bounding_boxes):
            x, y, x2, y2, accuracy = face
            if accuracy > 0.5:
                cropped = img[int(y):int(y2), int(x):int(x2), :]
                if cropped.size == 0:
                    # a box that starts outside the frame gives an empty crop, skip it
                    continue
                resized = cv2.resize(cropped, (input_image_size, input_image_size), interpolation=cv2.INTER_CUBIC)
                prewhitened = prewhiten(resized)
                guest_embedding = get_embedding(prewhitened)

                # try to find guest face in our database
                min_distance = None
                min_name = None
                for name, embedding in EMBEDDINGS.items():
                    distance = facenet.distance(guest_embedding, embedding, 0)
                    if min_distance is None or min_distance > distance:
                        min_name = name
                        min_distance = distance

                # if we found face in database and distance is not too big
                if min_distance is not None and min_distance < 1.1:
                    font = cv2.FONT_HERSHEY_SIMPLEX
                    # coordinates were found on the small image, scale them back
                    x_text = x / SCALE
                    y_text = y / SCALE - 10
                    point = (int(x_text), int(y_text))
                    cv2.putText(frame, min_name, point, font, 1, (255, 255, 255), 2, cv2.LINE_AA)

                # show rectangle for face at image
                point = (int(x/SCALE), int(y/SCALE))
                point2 = (int(x2/SCALE), int(y2/SCALE))
                cv2.rectangle(frame, point, point2, (0, 255, 0), 2)

    # Display the resulting frame
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
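
The database search in the loop above is a plain nearest-neighbour lookup with a cut-off. Here is the same logic pulled out into a standalone sketch that you can try without a camera or a model. It assumes the distance is the squared Euclidean distance, which is what I understand facenet.distance to compute when the metric argument is 0. The names find_match and unit and the tiny 3-dimensional "embeddings" are made up for the example.

```python
import numpy as np


def squared_distance(a, b):
    # assumed equivalent of facenet.distance(a, b, 0) for a single pair
    diff = np.subtract(a, b)
    return float(np.sum(np.square(diff)))


def find_match(guest, database, max_distance=1.1):
    """Return (name, distance) of the closest known face, or (None, distance)."""
    best_name, best_distance = None, None
    for name, embedding in database.items():
        distance = squared_distance(guest, embedding)
        if best_distance is None or distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance is not None and best_distance < max_distance:
        return best_name, best_distance
    return None, best_distance


def unit(values):
    # real embeddings are normalized to length 1, so the toy ones are too
    vector = np.asarray(values, dtype=np.float64)
    return (vector / np.linalg.norm(vector)).reshape(1, -1)


database = {"alice": unit([1.0, 0.0, 0.0]), "bob": unit([0.0, 1.0, 0.0])}
guest = unit([0.9, 0.1, 0.0])      # close to alice
stranger = unit([0.0, 0.0, 1.0])   # far from everyone

print(find_match(guest, database))
print(find_match(stranger, database))
```

The 1.1 threshold is the value used in this post. A lower value gives fewer false matches but rejects the right person more often, so it is worth tuning on your own photos.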

Now combine all of the blocks above into one script. You can also check the full script on gist.

