How Facial Recognition Works


Facial recognition is the process of using computer vision to detect and recognize a human face in a photograph or video. Using biometrics, the facial recognition system maps facial features such as the location and shape of the eyes, nose, and mouth, distinguishable landmarks unique to the person, and other geometric aspects of the face.

Complex mathematical algorithms are used to produce a numerical sequence that represents the face in a language a computer can understand. This “faceprint” is as unique as a human fingerprint and can be analyzed in real time to identify people as they walk past a CCTV security camera.

The facial recognition industry is expected to grow to over $7.7 billion annually in 2022, nearly doubling from $4 billion in 2017.

If you are privacy-minded, you will want to pay close attention to how, and by whom, your faceprint data is used. The technology itself is not dangerous, and people do not need to be paranoid about “big brother” watching. Cautious? Yes. Paranoid about facial recognition technology? No.

Just like all technology, those who are criminally minded can abuse facial recognition. Luckily, there are also good uses, including: unlocking your iPhone with Face ID, fighting back against revenge porn, scrubbing the internet of child pornography, airport security, automatically tagging your friends in social media photos, casinos protecting visitors from thieves, protecting online daters from fraudulent profile scams, and mobile apps for law enforcement, among so many other great uses.

There are even some porn search engines jumping into the facial recognition industry allowing users to search for adult models based upon a photo of their face.

In the past, most systems required the person to face the camera and have their face correctly framed within the photo. This may work for mug shots taken by the local Sheriff’s Office, but for real-time analysis of CCTV security camera footage of people as they walk by, newer technology had to be invented.

Now, many systems can build a 3D representation of the human face based upon multiple photos known to be of that individual, giving the system a more detailed understanding of that person’s unique facial attributes.

All uses of this technology, for good and evil, require advanced neural networks trained to work together to provide the user with the desired service.



Breaking Down the Steps

Humans are great at recognizing people’s faces; we have more than 2 million years of evolutionary adaptation behind us. Computers, even with all of their power, cannot natively look at a person’s face and say, “That’s Aunt Betty.”

Computer vision research started in the 1960s, with scientists teaching computers how to detect and recognize a human face. Computers have come a long way since the ’60s, but they still must be trained on what to look for and how to distill that information down to the binary computer language of 1s and 0s.

Before a computer can recognize a face in a photo, it must first determine whether a human face is even in the picture. Facial recognition is very expensive in terms of computing power, whereas the simpler “is there a face in this photo?” check is very inexpensive. For this reason, systems first verify that a face is present before attempting the more complicated task of matching the photo to an actual person’s identity. This process is called Face Detection.

1) Face Detection

A photo is analyzed to find the human face(s) within the image. Each face detected is isolated and processed separately. Open source libraries such as OpenCV and dlib are standard; of course, many proprietary systems are in use as well.
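
As a rough illustration, here is what face detection looks like with the open-source face_recognition library (a Python wrapper around dlib that we will meet again later in this article). The filename is a hypothetical example:
——————————————————————–
# Minimal face detection sketch using the open-source face_recognition
# library (a Python wrapper around dlib). "group_photo.jpg" is a
# hypothetical example input.
import face_recognition

image = face_recognition.load_image_file("group_photo.jpg")

# One (top, right, bottom, left) bounding box per detected face.
face_locations = face_recognition.face_locations(image)

print("Found {} face(s) in this photo.".format(len(face_locations)))
for top, right, bottom, left in face_locations:
    print("Face at top={}, right={}, bottom={}, left={}".format(top, right, bottom, left))
——————————————————————–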

2) Alignment

The face is cropped from the original picture, with the eyes and mouth aligned to a common location across all photos. With these attributes in the same general area for every image, the next steps are easier for the computer.
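
To make the idea concrete, here is a hedged sketch that locates the eyes with face_recognition’s landmark detector and computes the roll angle needed to level them; a real alignment pipeline would also crop and warp the face. The filename is a hypothetical example:
——————————————————————–
# Sketch: estimate the rotation needed to level the eyes. A production
# aligner would also crop and warp; this only computes the angle.
import face_recognition
import numpy as np

image = face_recognition.load_image_file("group_photo.jpg")  # hypothetical input
landmarks = face_recognition.face_landmarks(image)[0]        # first face only

# Average each eye's landmark points to find its center.
left_eye = np.mean(landmarks["left_eye"], axis=0)
right_eye = np.mean(landmarks["right_eye"], axis=0)

# Roll angle (in degrees) that would place both eyes on a level line.
dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
angle = np.degrees(np.arctan2(dy, dx))
print("Rotate by {:.1f} degrees to level the eyes".format(angle))
——————————————————————–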

3) Representation

The facial recognition software then generates a unique, proprietary identifier for this face based upon the facial landmarks detected. This identifier is usually represented by a series of floating point numbers (computer geek speak for numbers with decimal points). In this step, the system detects and extracts all of the meaningful data that it will use to generate your faceprint.

128D array ‘faceprint’

Of course, these numbers do not mean much to us mere mortals, but to a computer this is what a human face looks like. Each number in the array is a numerical representation of a facial attribute such as the location of the eyes, nose, mouth, etc.
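
Generating such a faceprint is nearly a one-liner with the face_recognition wrapper discussed later in this article; the filename is a hypothetical example:
——————————————————————–
# Extract the 128-dimension "faceprint" for each face in a photo.
import face_recognition

image = face_recognition.load_image_file("betty.jpg")  # hypothetical input
encodings = face_recognition.face_encodings(image)     # one 128-d vector per face

faceprint = encodings[0]
print(faceprint.shape)  # (128,)
print(faceprint[:5])    # the first few of the 128 floating point values
——————————————————————–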

4) Matching

With this unique identifier, the facial recognition software will then query a database of hundreds, thousands, or even millions of known identities and return a list of possible matches.

This query is performed by computing the Euclidean distance between the candidate’s 128-element faceprint array and the array stored for each known person. If the Euclidean distance across all 128 points is less than 0.6, it is a strong candidate for a match.
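
In code, that test is a straightforward vector distance. Here is a minimal sketch using placeholder faceprints (real ones would come from the representation step above):
——————————————————————–
# Euclidean distance between two 128-d faceprints, using the 0.6
# threshold described above. The vectors here are random placeholders.
import numpy as np

def is_match(known, candidate, threshold=0.6):
    """True if the Euclidean distance between the faceprints is under the threshold."""
    return float(np.linalg.norm(known - candidate)) < threshold

known = np.random.rand(128)      # placeholder for a database faceprint
candidate = np.random.rand(128)  # placeholder for a freshly extracted faceprint
print(is_match(known, candidate))
——————————————————————–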

Euclidean Distance

Depending on how powerful the system is, in most cases the above four steps are processed and the results returned to the user in milliseconds. A human can then manually verify the results returned by the system.

In our example, the girl from ‘Face 1’ is already known to the database, and the system returns her information:


Training

We have covered the basics of how to detect a face in a photo and search our database to see if that person is known, but we skipped a very important step.

Before you can run Face Detection or Face Recognition, you must create and train computer vision models so that the computer knows what a human face “looks like.” There are some great Python wrappers around dlib, like face_recognition, that are open source and already trained. If you are more adventurous, you can train your own models; I will direct you to an excellent article by Dr. Adrian Rosebrock on how to train a facial recognition model.
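
With those pre-trained models, end-to-end recognition fits in a few lines; both filenames here are hypothetical examples:
——————————————————————–
# Compare an unknown face against a known one using the pre-trained
# models that ship with face_recognition (no training required).
import face_recognition

known = face_recognition.face_encodings(
    face_recognition.load_image_file("betty_known.jpg"))[0]    # hypothetical file
unknown = face_recognition.face_encodings(
    face_recognition.load_image_file("mystery_photo.jpg"))[0]  # hypothetical file

# compare_faces applies the same 0.6 distance threshold by default.
print(face_recognition.compare_faces([known], unknown))  # [True] or [False]
——————————————————————–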

Face Clustering

An essential piece of a facial recognition system is the ability to intelligently group photographs of the same person into groups, or clusters. This can be accomplished through manual, semi-automated, or fully automated processes. By clustering these photos together, the system can better handle occlusions that partially block the face in an individual photo, such as sunglasses, hairstyles that cover part of the face, etc.

Clustering can also help by grouping photos of a person from dozens of different angles, giving a complete representation of the person’s unique facial structure in 3-dimensional space.
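
One common fully automated approach is density-based clustering, which does not need to know the number of people in advance. A sketch, assuming scikit-learn and faceprints from the earlier steps:
——————————————————————–
# Group 128-d faceprints with DBSCAN so photos of the same person land
# in the same cluster. The encodings here are random placeholders.
import numpy as np
from sklearn.cluster import DBSCAN

encodings = np.random.rand(10, 128)  # placeholder faceprints

# eps mirrors the 0.6 Euclidean distance threshold used for matching.
labels = DBSCAN(metric="euclidean", eps=0.6, min_samples=2).fit(encodings).labels_
print(labels)  # same label = same (presumed) person; -1 = unclustered
——————————————————————–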


Law Enforcement Use of Facial Recognition

MorphoTrust Idemia is one of the largest vendors of face recognition and other biometric identification technology in the United States. It has designed systems for state DMVs, federal and state law enforcement agencies, border control and airports (including TSA PreCheck), and the State Department.

Common vendors include 3M, Susped.ID, Cognitec, DataWorks Plus, stockNum Systems, FaceFirst, and NEC Global.

Suspect Photo Matching

Back in the day, police departments had to manually search filing cabinets and mug books filled with photographs of suspects. Now officers can use smartphone apps to take a quick photo of a suspect and instantly have the suspect’s criminal record and personal information displayed on screen for review.

Many of these systems work together with social networking and other integrated surveillance systems to provide a complete look into a person’s identity.

Fighting Revenge Porn

Tech services assist law enforcement and private-sector litigators in searching for and combating online revenge porn uploaded by ex-boyfriends and ex-girlfriends.

Battling Child Porn

While not technically “Facial Recognition” per se, companies like Microsoft use computer vision algorithms to detect known child pornography online. They offer a free service called PhotoDNA to law enforcement agencies.

PhotoDNA creates a unique digital signature (known as a “hash”) of an image which is then compared against signatures (hashes) of other photos to find copies of the same image. When matched with a database containing hashes of previously identified illegal images, PhotoDNA is an incredible tool to help detect, disrupt and report the distribution of child exploitation material. PhotoDNA is not facial recognition software and cannot be used to identify a person or object in an image. A PhotoDNA hash is not reversible, and therefore cannot be used to recreate an image.

Microsoft, PhotoDNA (https://www.microsoft.com/en-us/PhotoDNA)
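
PhotoDNA itself is proprietary, but the general idea of a compact, comparable image signature can be sketched with a simple “average hash.” This is purely illustrative and far less robust than PhotoDNA:
——————————————————————–
# Conceptual stand-in for a robust image hash: shrink the image to an
# 8x8 grayscale thumbnail and record which pixels are above the mean.
# Near-duplicate images yield signatures with a small Hamming distance.
from PIL import Image

def average_hash(path, size=8):
    pixels = list(Image.open(path).convert("L").resize((size, size)).getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | int(p > mean)
    return bits

def hamming(a, b):
    # Number of differing bits between two signatures.
    return bin(a ^ b).count("1")
——————————————————————–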

Bringing Facial Recognition to the Masses

This face recognition technology is used in more products and services than you may realize.

Online/Mobile Device Banking

Many national banks, like Wells Fargo and Bank of America, are using facial recognition in their mobile apps and ATMs to better verify and authenticate legitimate users.

Installing dlib on AWS Lambda

Thanks to the amazing assistance of AWS Support, I was finally able to install dlib on AWS Lambda:


1. Launch a new EC2 instance:
   1) On the “Choose AMI” screen, select “Amazon Linux AMI 2018.03.0 (HVM), SSD Volume Type” as your AMI.
   2) On the “Step 2: Choose an Instance Type” screen, I suggest selecting a medium or large instance type (mine is t2.large). I tried t2.micro, but its performance is not good enough for compiling dlib.
   3) Click “Review and Launch”, then click “Launch”.
   4) Use your key pair to finalize the configuration.

2. Use ssh to connect to your new EC2 instance with your private key pair.

3. On your new EC2 Amazon Linux instance, do the following:
   1) Install gcc-c++: sudo yum install gcc-c++ -y
   2) Install cmake: sudo yum install cmake -y
   3) Install python36: sudo yum install python36 -y
   4) Install python36-devel: sudo yum install python36-devel -y


4. Install Python dependencies:
   1) Install Pillow: sudo python3 -m pip install Pillow
       It will also install “PIL” for you.
   2) Install face_recognition: sudo python3 -m pip install face_recognition
       It will also install “face_recognition_models”, “numpy”, and “dlib”.
   The above modules will be installed to “/usr/local/lib64/python3.6/site-packages/”. Here is my file list for your reference:
   —————————————————————————————————
   $ ls /usr/local/lib64/python3.6/site-packages/ -l
   total 9652
   drwxr-xr-x  2 root root    4096 Feb 22 08:08 dlib-19.16.0.egg-info
   -rwxr-xr-x  1 root root 9852368 Feb 22 08:08 dlib.cpython-36m-x86_64-linux-gnu.so
   drwxr-xr-x 18 root root    4096 Feb 22 08:08 numpy
   drwxr-xr-x  2 root root    4096 Feb 22 08:08 numpy-1.16.1.dist-info
   drwxr-xr-x  4 root root    4096 Feb 22 08:00 PIL
   drwxr-xr-x  2 root root    4096 Feb 22 08:00 Pillow-5.4.1.dist-info
   —————————————————————————————————

5. Zip the above-mentioned “/site-packages” directory and download your zip file.

6. On your local machine, unzip the above-mentioned zip file. Then make new folders for each module.
   Here is my file structure for your reference:
   dlib
      └ python
               └ dlib.so (I renamed “dlib.cpython-36m-x86_64-linux-gnu.so” to “dlib.so”)

   face_recognition
      └ python
               ├ face_recognition (directory and its contents)
               └ face_recognition-1.2.3.dist-info (directory and its contents)

   face_recognition_models
      └ python
               ├ face_recognition_models (directory and its contents)
               └ face_recognition_models-0.3.0.egg-info (directory and its contents)

   numpy
      └ python
               ├ numpy (directory and its contents)
               └ numpy-1.16.1.dist-info (directory and its contents)

   PILPillow
      └ python
               ├ PIL (directory and its contents)
               └ Pillow-5.4.1.dist-info (directory and its contents)

7. Zip each “python” directory under the above five folders. You will get 5 “python.zip” files. These files are for the Lambda layers.

8. Open the AWS Lambda console. Create 5 new Lambda layers for the above zip files, uploading each zip file to its corresponding layer. Here are some details:
   ——————————————————————–
   Name                                                    Runtime
   ——————————————————————–
   PILPillow                                              Python 3.6
   dlib                                                       Not specified
   face_recognition                                 Python 3.6
   face_recognition_models                   Python 3.6
   numpy                                                  Python 3.6
   ——————————————————————–
   Please note, the “python.zip” for “face_recognition_models” is too large (approx. 100.6 MB) to be uploaded from the console webpage; you have to upload it to an S3 bucket first, mark it public, and then load it into the Lambda layer from your S3 bucket.

9. Create a new Lambda function, add the above 5 layers to it, and select Python 3.6 as the runtime. Then upload your Python code with a test JPEG picture. Here is my “lambda_function.py” for your reference:
   ——————————————————————–
   import face_recognition
   import json

   print("cold start")

   def lambda_handler(event, context):
       fileName = "1.jpeg"

       print("start face recognition...")

       image = face_recognition.load_image_file(str(fileName))
       # face_locations returns one (top, right, bottom, left) box per face
       face_locations = face_recognition.face_locations(image)

       print("face recognition finish...")

       return {
           'statusCode': 200,
           'body': json.dumps(face_locations[0])
       }
   ——————————————————————–
   Please note, in my case, I use “1.jpeg” for testing only.

10. Increase your Lambda function’s memory to about 512 MB. The default 3s timeout is enough for a simple request. In my case, the run statistics were:
   ——————————————————————–
   Duration: 174.37 ms
   Billed Duration: 200 ms
   Memory Size: 512 MB
   Max Memory Used: 400 MB
   ——————————————————————–
   Please note, the memory and timeout should be adjusted to match your functionality’s needs.

Explanations:
————————————————————————–
1. When installing “dlib” and “Pillow”, they will compile some library files locally. To make sure these executables work, we need an environment similar to Lambda’s, so the EC2 instance serves this purpose.
2. I first built and installed everything on my EC2 instance with its pre-installed Python 2, but some libraries could not be invoked by Lambda. I then switched to Python 3, which is what steps 3 and 4 do.
3. To build dlib and the other libraries, you should install the Python development tools (“python36-devel”). Because dlib is written in C++, we also need to install gcc-c++ and cmake.
4. After including all 5 of these layers, you can verify your /opt file list against mine (as attachment “Opt_file_list.txt”). You can get the file list by:
   ——————————————————————————————
   import os

   # Run this inside your lambda_handler; it lists directories under /opt.
   directories = os.popen("find /opt/* -maxdepth 4 -type d").read().split("\n")
   return {
       'directories': directories
   }
   ——————————————————————————————
5. Increasing the memory allocation is important; otherwise you will encounter timeout issues.
6. Be aware of Lambda’s limit on total unzipped deployment size (including your code and dependencies), which is 250 MB. If your Lambda function has multiple layers, the total deployment size is the sum of all referenced layers plus the Lambda function itself. Please make sure the total size is within this limit.
7. In production, the image file should be stored in a storage service like S3. The JPEG file I uploaded to my Lambda function is just for testing.
8. Your functionality may vary from my POC, so you may need to increase the timeout and memory for more advanced processing.