Comparing the top face recognition APIs.

AWS Rekognition vs Microsoft Cognitive Services vs Kairos

With the launch of the iPhone X, the long-established technology of facial ID has gained mass appeal, and many new apps and services are building applications around it. Some of these use cases are mundane, while others are serious ones where much depends on the accuracy of the face recognition, such as access to your private details, banking information and more.

Face recognition is a complex deep learning problem, and the technological progress of the last 5 years has made it mainstream. Still, because of the complexity of the technology, most new services that use facial ID do not build it themselves but rely on an API provided by a big tech company such as Amazon or Microsoft.

Since most of these API vendors offer similar functionality at comparable prices, the main challenge is evaluating how the providers stack up against each other. We compared the performance of the face recognition APIs of AWS Rekognition, Microsoft Azure Cognitive Services and Kairos.

Neither Dataturks nor I have any affiliation with these providers (except that I worked at Microsoft and Amazon in the past). We have tried to act as a completely unbiased third party that simply wanted to evaluate independently how these APIs stack up.

Target use case:

The use case we focused on is one where only a few training images are available for a user, similar to setting up Face ID on an iPhone X: the setup captures a few images, and these training images show the user’s face from slightly different angles against a mostly static background. Below is an example of a set of user images used for training.

Verification uses a similar setup: we take a picture of a user’s face and check whether the face belongs to the user in question. Example test pictures:

Setup:

We used the Face Recognition Dataset from the University of Essex, UK, to test the performance of the APIs for 50 different users. For each user we randomly selected 5 training images and used them to train the API. Then, for each of these users, we tested 3 positive examples (3 different images of the same user, which are expected to match) and 3 negative examples (3 images of random other users, which are expected not to match the given user).

That gives 6 tests for each of the 50 users, or 300 tests in total for each of the 3 API providers.
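
The sampling described above can be sketched in a few lines of Python. This is only an illustration, not the code we ran: the function name `build_trials`, the `images_by_user` mapping and the fixed seed are assumptions, and the sketch assumes every user has at least 8 images so that training and positive test images never overlap.

```python
import random

def build_trials(images_by_user, n_train=5, n_pos=3, n_neg=3, seed=0):
    """Split each user's images into training, positive and negative sets.

    images_by_user is a hypothetical mapping {user_id: [image paths]}.
    """
    rng = random.Random(seed)
    trials = {}
    for user, images in images_by_user.items():
        # Shuffle a copy so training and positive images are disjoint.
        shuffled = rng.sample(images, len(images))
        train = shuffled[:n_train]
        positive = shuffled[n_train:n_train + n_pos]
        # Negative examples are drawn from every other user's images.
        others = [
            img
            for other, imgs in images_by_user.items()
            if other != user
            for img in imgs
        ]
        negative = rng.sample(others, n_neg)
        trials[user] = {"train": train, "positive": positive, "negative": negative}
    return trials
```

With 50 users this yields 50 × (3 + 3) = 300 test images per provider, matching the count above.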

Each of these APIs provides a way to create a collection/gallery for a given user, to which the training images are added. The gallery is then trained, after which a new image can be tested against it to obtain the confidence value of a match. This confidence value indicates how likely it is that the test image matches the user in question.
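
To keep the comparison uniform, the enroll-then-verify flow can be hidden behind one small interface per provider. The sketch below is hypothetical: `FaceProvider`, `enroll`, `verify` and `FilenameProvider` are names invented for illustration and are not part of any vendor SDK. A real adapter would call the vendor's API inside these methods, including any explicit training step.

```python
from typing import Protocol

class FaceProvider(Protocol):
    """Hypothetical adapter interface; each real API needs its own wrapper."""
    def enroll(self, user_id: str, image: str) -> None: ...
    def verify(self, user_id: str, image: str) -> float: ...

def run_trials(provider, trials):
    """Enroll training images, then score every test image.

    trials maps user_id -> {"train": [...], "positive": [...], "negative": [...]}.
    Returns a list of (is_same_user, confidence) pairs.
    """
    scores = []
    for user, split in trials.items():
        for image in split["train"]:
            provider.enroll(user, image)
        for image in split["positive"]:
            scores.append((True, provider.verify(user, image)))
        for image in split["negative"]:
            scores.append((False, provider.verify(user, image)))
    return scores

class FilenameProvider:
    """Toy stand-in for dry runs: 'recognises' a user by filename prefix."""
    def __init__(self):
        self.galleries = {}

    def enroll(self, user_id, image):
        self.galleries.setdefault(user_id, []).append(image)

    def verify(self, user_id, image):
        return 100.0 if image.startswith(user_id + "_") else 0.0
```

The toy provider only exists so that the harness can be exercised without network calls or API keys.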

Results:

While testing each of these APIs, we looked primarily at 2 values: (1) True Positive (TP): given a different photo of the same person, the API correctly says that the photo matches; (2) False Positive (FP): given a photo of a different user, the API incorrectly says that the photo matches.

Ideally one would want 100% TP and 0% FP. For any serious application, any FP at all is unacceptable, since it can cause a serious security issue. A low TP rate, in turn, would be really frustrating for users, since they might need several attempts to pass the authentication.

As stated above, these APIs return a ‘confidence’ value for each tested image, and the cutoff you choose to trust determines when a tested image is considered a match. A higher cutoff can mean fewer FPs but also fewer TPs. The exact ‘confidence’ values are not comparable across APIs, but the cutoffs we chose for each API reflect a similar strictness of decision.

So we ran the tests with several different values of this cutoff.
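
A minimal sketch of how a cutoff turns confidence values into TP/FP/TN/FN counts is shown below. The function name, the `(is_same_user, confidence)` pair format and the example scores are made up for illustration, and treating `confidence >= cutoff` as a match is an assumption.

```python
def confusion_counts(scores, cutoff):
    """Count TP, FP, TN and FN for (is_same_user, confidence) pairs."""
    tp = fp = tn = fn = 0
    for is_same_user, confidence in scores:
        predicted_match = confidence >= cutoff  # assumed decision rule
        if predicted_match and is_same_user:
            tp += 1
        elif predicted_match:
            fp += 1
        elif is_same_user:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

# Illustrative scores only, not measured values.
scores = [(True, 99.5), (True, 82.0), (False, 78.2), (False, 12.0)]
for cutoff in (95, 70, 50):
    print(cutoff, confusion_counts(scores, cutoff))
```

In this toy data, lowering the cutoff from 95 to 70 recovers the missed positive at 82.0 but also lets the negative at 78.2 through, which is the trade-off described above.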


Higher accuracy scenarios:

Confidence cutoffs: AWS Rekognition 95%, Microsoft 80%, Kairos 95%

| Provider | True +ve (TP) | False +ve (FP) | True -ve (TN) | False -ve (FN) | Precision (TP/(TP+FP)) | Recall (TP/(TP+FN)) |
| --- | --- | --- | --- | --- | --- | --- |
| AWS Rekognition | 149 | 0 | 150 | 1 | 100% | 99.3% |
| Microsoft Cognitive Services | 131 | 0 | 150 | 19 | 100% | 87.3% |
| Kairos | 108 | 0 | 150 | 42 | 100% | 72% |

Medium accuracy scenarios:

Confidence cutoffs: AWS Rekognition 70%, Microsoft 50%, Kairos 70%

| Provider | True +ve (TP) | False +ve (FP) | True -ve (TN) | False -ve (FN) | Precision (TP/(TP+FP)) | Recall (TP/(TP+FN)) |
| --- | --- | --- | --- | --- | --- | --- |
| AWS Rekognition | 150 | 2 | 148 | 0 | 98.7% | 100% |
| Microsoft Cognitive Services | 137 | 0 | 150 | 13 | 100% | 91.3% |
| Kairos | 148 | 0 | 150 | 2 | 100% | 98.7% |

Low accuracy scenarios:

Confidence cutoffs: AWS Rekognition 50%, Microsoft 30%, Kairos 50%

| Provider | True +ve (TP) | False +ve (FP) | True -ve (TN) | False -ve (FN) | Precision (TP/(TP+FP)) | Recall (TP/(TP+FN)) |
| --- | --- | --- | --- | --- | --- | --- |
| AWS Rekognition | 150 | 2 | 148 | 0 | 98.7% | 100% |
| Microsoft Cognitive Services | 137 | 10 | 140 | 13 | 93.2% | 91.3% |
| Kairos | 150 | 19 | 131 | 0 | 88.8% | 100% |
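
The precision and recall columns follow directly from the counts. As a quick check, here is a small sketch (the function name is ours, and returning 0.0 for an empty denominator is just a convention) applied to the Microsoft row of the low accuracy scenario:

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    # Returning 0.0 when a denominator is zero is a convention, not a rule.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Microsoft Cognitive Services, low accuracy scenario: TP=137, FP=10, FN=13.
precision, recall = precision_recall(tp=137, fp=10, fn=13)
print(f"precision={precision:.1%} recall={recall:.1%}")
# prints "precision=93.2% recall=91.3%"
```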

We have made the code and dataset freely available for anyone to validate the results.

Examples

User training images: 5 images for a user.

Confidence as returned by each API for two test images of the same user.

| Provider | Test image 1 | Test image 2 |
| --- | --- | --- |
| AWS Rekognition confidence | 99.52% | 99.67% |
| Microsoft Cognitive Services confidence | 92.6% | 92.6% |
| Kairos confidence | 93.4% | 96.7% |

False Positive example for AWS Rekognition:

User training images: 5 images for a user.

Test image: Image of a random user.

AWS Rekognition confidence = 78.2%

False Positive example for Microsoft Cognitive Services:

User training images: 5 images for a user.

Test image: Image of a random user.

Microsoft Cognitive Services confidence = 43%

False Positive example for Kairos:

User training images: 5 images for a user.

Test image: Image of a random user.

Kairos confidence = 54%

Ease of use:

API-wise, Kairos and AWS Rekognition were the more pleasant to use and the easier to integrate with apps. A major drawback of AWS Rekognition, however, is that it does not accept image URLs: images have to be passed as S3 objects (or uploaded as raw bytes in the request), unlike Kairos or Microsoft Cognitive Services, which work gracefully with images stored anywhere on the web when you simply pass a URL to the image.

The Microsoft Cognitive Services APIs are a little more complicated, with a few extra steps in between, such as explicitly training a model and waiting for the training to complete, or passing the image to be verified through a face detect API call to generate a face ID. We also hit multiple "rate limit exceeded" errors while using the Microsoft APIs (none with the other two).

Pricing:

Kairos is the cheapest of the three at around $0.3 per 1000 API calls, whereas AWS Rekognition costs around $1 per 1000 API calls plus charges for storing images on S3. Microsoft Cognitive Services is the costliest at around $1.5 per 1000 API calls.

These calls include both the training and the verification API calls.
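
As a rough illustration of what these prices mean in practice, the sketch below estimates the API cost of an experiment the size of ours. It assumes exactly one call per training image and one per test image, which is a lower bound (Microsoft, for instance, needs extra detect and train calls), and it ignores S3 storage charges. The prices are the approximate figures quoted above.

```python
# Approximate prices in USD per 1000 API calls, as quoted above.
PRICE_PER_1000_CALLS = {
    "Kairos": 0.3,
    "AWS Rekognition": 1.0,
    "Microsoft Cognitive Services": 1.5,
}

def estimate_cost(calls, price_per_1000):
    """Cost in USD for a number of API calls at a per-1000-calls price."""
    return calls / 1000 * price_per_1000

# 50 users, each with 5 training calls and 6 verification calls (assumed).
calls = 50 * (5 + 6)
for provider, price in PRICE_PER_1000_CALLS.items():
    print(f"{provider}: {calls} calls, about ${estimate_cost(calls, price):.3f}")
```

Under these assumptions the 550 calls cost well under a dollar with any of the three providers, so the price difference only starts to matter at much higher volumes.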

If you liked this, here is our blog post on the comparison of image text detection APIs.

If you have any queries or suggestions, I would love to hear them. Please write to me at mohan@dataturks.com.