Photo by Arno Senoner on Unsplash

As the world moves online, the threat of sensitive data leaks grows as well. According to IBM, 20% of data breaches are caused by compromised credentials, forcing companies to spend more time improving their digital identity and authentication systems. Artificial intelligence has proven to be one of the most effective technologies in this area.

This MobiDev case study presents the implementation of face and voice recognition technologies for authentication to allow secure access to ultra-sensitive data.

Face and Voice Recognition Market Insights

Progress in neural networks, big data, and graphics processing units (GPUs) has played a significant role in the widespread use of face recognition. The most popular use cases for this technology are security and marketing. For example, some advertising companies embed software in their billboards to detect the gender, age, and ethnicity of passersby in order to deliver targeted ads.

Many enterprises include face recognition in their cybersecurity business plans. In this field, it is used to improve verification and authentication mechanisms. The technology enables secure access to a system without a password, or it can serve as an additional authentication factor for extra security. Face recognition can also be used in the video KYC process for digital customer authentication. This approach can reduce the customer onboarding cost by up to 90%.

According to Mordor Intelligence, the Facial Recognition Market was valued at $3.72 billion in 2020 and is projected to reach $11.62 billion by 2026.

The voice recognition market is booming as well. According to Statista, it’s expected to grow to $27.16 billion by 2026. COVID-19 has had a big impact on this growth. In particular, the use of voice recognition technology has proliferated in the entertainment, communications, and medical fields for search and assistance as a way of non-contact human interaction. There is also strong demand for this technology in retail, banking, smart home, automotive, and security.

For example, by identifying the unique vocal patterns of the user, the technology can automatically authenticate a bank client within the first few seconds of calling customer support.

The total Face and Voice Biometrics market size is projected to reach $10,760 million by 2027, according to Valuates Reports.

Project Overview

The main goal of the MobiDev team’s project was to create an efficient and highly secure authentication system for accessing sensitive data. The system was to be based on biometric technology.

The project was designed as a microservice and uses WebRTC. In a microservice architecture, each piece of functionality lives in its own separate subproject. This separation makes a project easier to code, support, and enhance. It also allows developers to choose the frameworks best suited to the goals and required performance of each microservice.

The microservice for this project provides single sign-on (SSO) software that uses biometric authentication. It works with both face recognition and voice recognition. For this system, there is no need to send captured media data to a server, since an off-the-shelf WebRTC service was sufficient for the project’s needs.

Solutions Summary

The solution is based on voice recognition, facial characteristics, and answers to key questions. Biometric recognition continually improves through machine learning.
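To make the idea of combining three factors concrete, here is a minimal sketch of one possible decision rule. The score ranges, threshold values, field names, and the rule that every factor must pass are all illustrative assumptions, not details of the system described in this case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorScores:
    """Hypothetical per-attempt scores; names and ranges are assumptions."""
    face_similarity: float    # 0.0 to 1.0, higher means a closer match
    voice_similarity: float   # 0.0 to 1.0, higher means a closer match
    questions_correct: int    # key questions answered correctly
    questions_asked: int      # key questions asked in this attempt


# Illustrative thresholds only; real values would come from testing.
FACE_THRESHOLD = 0.80
VOICE_THRESHOLD = 0.75
MIN_QUESTION_RATIO = 1.0  # every key question must be answered correctly


def is_authenticated(scores: FactorScores) -> bool:
    """Grant access only when every factor clears its own threshold."""
    if scores.questions_asked <= 0:
        return False
    question_ratio = scores.questions_correct / scores.questions_asked
    return (
        scores.face_similarity >= FACE_THRESHOLD
        and scores.voice_similarity >= VOICE_THRESHOLD
        and question_ratio >= MIN_QUESTION_RATIO
    )


if __name__ == "__main__":
    attempt = FactorScores(0.91, 0.83, 2, 2)
    print(is_authenticated(attempt))
```

Requiring all factors to pass favors security over convenience. A weighted score would be a more lenient alternative, at the cost of letting one strong factor compensate for a weak one.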

The system needed to be trained on existing datasets. The initial datasets were the collected voice samples and facial photos. Additionally, the MobiDev development team evaluated ten alternatives for validating U.S. drivers’ licenses and for the OCR module, which helps convert different types of documents into searchable data.

The alternative solutions had to pass tests in real-world environments for accuracy. Of the options, the data science engineers decided that Google Vision was the best solution for integration in this project.

Advanced Facial Recognition

One of the key challenges of introducing biometrics into the security system is its vulnerability to spoofing attacks. That’s why the software developers introduced anti-spoofing protection for the facial recognition routines to improve the accuracy of authentication.

Anti-spoofing protection for security authentication has to effectively deal with presentation attacks (PA), which come in the following forms:

  • 2D Static PA: This method of attack uses photographs or a flat paper/plastic mask.
  • 2D Dynamic PA: This spoof uses a video display on a screen or many photos presented in sequence.
  • 3D Static PA: This method of attack uses a 3D print, a 3D sculpture, or a 3D mask.
  • 3D Dynamic PA: This spoof uses a robot that can produce expressions or a person wearing well-prepared makeup as a disguise.

The key to implementing successful anti-spoofing techniques is both accuracy and speed. The primary goals are twofold. First, authorized users should have a user-friendly experience and not become frustrated with the system. Second, and just as important, the system must effectively block all unauthorized users from gaining access.
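These two goals are commonly quantified as the false rejection rate (FRR, genuine users turned away) and the false acceptance rate (FAR, impostors let in). The sketch below shows how both rates could be computed for a given acceptance threshold. The function name and the sample scores are made up for illustration and do not come from the project.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FRR, FAR) for a given acceptance threshold.

    FRR: share of genuine attempts rejected (score below threshold).
    FAR: share of impostor attempts accepted (score at or above threshold).
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(impostor_scores)
    return frr, far


if __name__ == "__main__":
    # Made-up match scores, purely for illustration.
    genuine = [0.95, 0.90, 0.85, 0.70]
    impostor = [0.20, 0.40, 0.60, 0.80]
    for threshold in (0.5, 0.75, 0.9):
        frr, far = error_rates(genuine, impostor, threshold)
        print(f"threshold={threshold}: FRR={frr:.2f}, FAR={far:.2f}")
```

Raising the threshold lowers FAR but raises FRR, which is the trade-off between blocking unauthorized users and keeping the experience smooth for authorized ones.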

Advanced Anti-Spoofing Techniques and Deep Learning

Advanced techniques for anti-spoofing include blink detection. The system measures how often the eyes blink and how long, in milliseconds, they stay shut. Facial contours change during a blink. An average person blinks 15 to 30 times each minute. A static PA does not blink at all, and a dynamic PA cannot easily replicate how a known authorized user’s facial contours change during a blink.
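One widely used way to measure blinks is the eye aspect ratio (EAR), computed from six landmarks around each eye. The case study does not say which method the team used, so the sketch below is an assumption: it presumes the landmarks come from some face landmark detector, and the threshold and minimum frame count are illustrative defaults.

```python
import math


def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the eye corners; p2 and p3 lie on the upper lid,
    p6 and p5 on the lower lid. The ratio drops toward 0 as the eye closes.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def detect_blinks(ear_values, fps, closed_threshold=0.2, min_frames=2):
    """Return a list of blink durations in milliseconds.

    A blink is a run of at least `min_frames` consecutive frames whose
    EAR is below `closed_threshold`. Both defaults are assumptions.
    """
    durations_ms = []
    run = 0
    # The trailing sentinel closes a run that lasts until the final frame.
    for ear in list(ear_values) + [float("inf")]:
        if ear < closed_threshold:
            run += 1
        else:
            if run >= min_frames:
                durations_ms.append(run * 1000.0 / fps)
            run = 0
    return durations_ms


def blinks_per_minute(ear_values, fps):
    """Blink frequency over the clip covered by `ear_values`."""
    minutes = len(ear_values) / fps / 60.0
    return len(detect_blinks(ear_values, fps)) / minutes if minutes else 0.0
```

A printed photo held in front of the camera would produce a nearly constant EAR and therefore an empty list of blinks, while a live user should fall somewhere near the typical blink frequency.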

Deep learning using Convolutional Neural Networks (CNNs) does not limit face recognition to a specific set of features. Instead, using a CNN creates trained convolution kernels that can detect things that are not visible to the human eye.
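To illustrate what a convolution kernel does, the sketch below slides a small kernel over a tiny grayscale patch and applies a ReLU activation. In a trained CNN the kernel values are learned from data. Here the kernel is hand-picked to respond to vertical edges, and the patch values are invented, so this shows only the underlying operation and not the project's model.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers apply."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and zero out negative ones."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hand-picked kernel that responds to dark-to-bright vertical edges.
VERTICAL_EDGE = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]

if __name__ == "__main__":
    # Tiny grayscale patch: dark on the left, bright on the right.
    patch = [
        [0, 0, 0, 9, 9],
        [0, 0, 0, 9, 9],
        [0, 0, 0, 9, 9],
        [0, 0, 0, 9, 9],
    ]
    for row in relu(convolve2d(patch, VERTICAL_EDGE)):
        print(row)
```

The feature map is zero where the patch is uniform and large where the brightness changes. A CNN stacks many such learned kernels, which is how it can pick up patterns that nobody specified by hand.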

Additional Security Measures

The generation of security questions uses natural language processing (NLP) or extra voice. The decision of which photo datasets to use for training came from the development team.

The project was highly customizable, with all the software complexity running in the background while the user sees a simple graphical user interface (GUI). An added benefit was that the client’s business partners were able to integrate this authentication solution with their systems using APIs.

This single sign-on design permits single authentication across multiple enterprise systems, which is convenient for users. Yet, at the same time, it has to be robust and not be defeated by spoofing.
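The core idea of single sign-on is that one successful authentication yields a credential that several systems can verify on their own. The sketch below illustrates that idea with a simple HMAC-signed token built from the Python standard library. It is not OAuth2 and not the project's implementation: the shared secret, token layout, and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical shared secret; a real deployment would never hard-code it.
SHARED_SECRET = b"demo-secret-change-me"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(user_id, ttl_seconds=300, now=None):
    """Issue a signed token after a successful biometric check."""
    issued_at = int(time.time() if now is None else now)
    payload = json.dumps(
        {"sub": user_id, "exp": issued_at + ttl_seconds}, sort_keys=True
    ).encode("utf-8")
    signature = hmac.new(SHARED_SECRET, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def verify_token(token, now=None):
    """Return the user id if the token is authentic and unexpired, else None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SHARED_SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims["sub"] if current < claims["exp"] else None
```

Any service holding the secret can call `verify_token` without asking the user to authenticate again. The short expiry limits how long a stolen token remains useful.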

Applied Technologies

Here is the list of technology and methods used for this project:

Front End

  • React, WebRTC (OpenTok)

Back End

  • RDS: PostgreSQL
  • Data Science and Machine Learning: TensorFlow, dlib, Keras, OpenCV, Tesseract OCR, and Google Vision
  • Storage: Amazon S3
  • Deployment (CI/CD): Docker, Jenkins, and Docker Swarm
  • Microservices: Python 3, Flask, Django and Django REST Framework (DRF), and Java
  • Single Sign-On: OAuth2
  • Web Server: nginx
  • Cache: Redis
  • Queue: Celery
  • Periodic Tasks: Celery Beat
  • MoviePy
  • FFmpeg
  • JavaScript
  • BarCode Readers
  • UI/UX Design
  • Manual and Automated API Testing

Outcomes and Final Thoughts

The developed authentication system has proven the effectiveness of using face and voice recognition for secure access to sensitive information. The use of advanced anti-spoofing techniques helped reduce potential vulnerabilities and system inaccuracies. Machine learning algorithms allow the software to improve itself, and a single sign-on design supports secure single authentication across multiple enterprise systems.

Designed with advanced algorithms, biometric authentication systems can greatly enhance enterprise security. At the same time, it is important to calculate potential risks and test hypotheses in order to choose the option that best meets the business objectives of the project.