Microsoft Kinect: Smart Customer Engagement Solutions

 

Introduction

Most path-breaking technological advancements are futuristic, and innovations in the Information Technology (IT) world are often powered by multiple advancements at once. Computing has developed incessantly: devices were first expected to become smaller, then faster, and now smarter. As computers and small handheld devices have penetrated every corner of everyday life, it is imperative that they be not only sleeker and faster but also able to interact with us intuitively. As new technologies emerge, the user experience of the digital world advances with them. Over the last decade, the digital world has moved from keyboard interactions to touch-based controls. Touch-based interactions, although an improvement, still require the user to be close to the machine. Advancements in motion computing, which detects, processes and digitizes the position and/or velocity of people and objects in order to interact with software systems, have made touchless interaction with computers possible. This allows the development of a new kind of human-machine interface that lets the user interact in a natural way (Natural User Interface, NUI) through gesture and voice commands instead of keyboard and mouse.

Detecting human presence and recognizing gestures and speech has been a remarkable achievement. It has paved the way for a new generation of software systems that are innovative, easy to use and, of course, touchless. Although cameras and microphones have been around for a long time, sensors that can not only detect human presence but also capture subtle movements are a recent innovation. Most gesture recognition technologies are either 2D- or 3D-based and work together with camera-enabled devices. The camera-enabled device beams invisible infrared light at the individual; the light is reflected back to the camera and onto the gesture recognition integrated circuit (IC), which calculates the subtle movements in its focus area.

Many devices on the market (e.g., Microsoft Kinect, Leap Motion, Creative Senz3D, and DUO3D) can understand gestures and voice commands and act on them. In this paper, we discuss the Microsoft Kinect and some of the interesting applications that can potentially be built using its capabilities: camera, gesture recognition and voice recognition.

What is Kinect?

Kinect is a motion sensor device from Microsoft that works with the Xbox One video game consoles and Windows PCs. It supports the Natural User Interface (NUI) and allows users to interact with it through gestures and voice commands. It recognizes the person in front of the device and doesn’t need any intermediary device, such as a controller, to connect the user with it.

The physical device is a horizontal box with a small base and a motorized pivot. It is designed to be positioned lengthwise above or below the video display. The device has a built-in camera, depth sensors, a multi-array microphone, an accelerometer and tilt motors. Powered by these components, it provides full-body 3D motion capture along with facial and voice recognition capabilities.

Kinect Features

The Microsoft Kinect combines the following subsystems and software technologies:

  • RGB Camera - Kinect has a video camera that provides color streams as frames and supports multiple formats, including Bayer and YUV. The camera can be controlled programmatically to adjust brightness, contrast, exposure time, sharpness etc. Apart from providing video output, the color stream is also used for facial recognition.
  • Depth & Motion sensor - The depth sensor consists of an infrared laser projector and a monochrome CMOS sensor. Because it relies on its own infrared illumination, it can capture video data in 3D under any ambient light conditions. The sensing range of the depth sensor can be adjusted, and the Kinect software can automatically calibrate the sensor based on the user’s physical environment, accommodating furniture or other obstacles in gaming scenarios. The sensor provides per-pixel depth information, and it is possible to map between the color and depth inputs, which is important when building applications.
  • Multi Array Microphone - The Kinect device comes with a set of microphones that can not only capture voice input but also detect its direction and suppress noise, and that can be accessed programmatically. It enables headset-free, voice-based interaction with the device.
  • Microsoft Software - Microsoft provides an SDK that comes with the necessary .NET APIs, documentation and sample programs. Developers can use it to build Kinect-based .NET applications. There is also an “OpenKinect” initiative that allows working with Kinect on non-Windows platforms.

Kinect Version 1 vs. Version 2

So far Microsoft has released two types of Kinect devices, one for Xbox and another for Windows. The first Kinect for Windows version was released in February 2012 and instantly became a hit. The second version, called Kinect v2, was released in the summer of 2014. The features and capabilities of the two types have come closer in the v2 release. Kinect v2 has significantly improved capabilities, as the comparison table below shows:

Kinect Specifications

Specification | Kinect for Windows v1 | Kinect for Windows v2
Color Camera | 640 x 480 @ 30 fps | 1920 x 1080 @ 30 fps
Depth Camera | 320 x 240 | 512 x 424
Max Depth Distance | ~4.5 m | ~4.5 m
Min Depth Distance | 40 cm in near mode | 50 cm
Horizontal Field of View | 57 degrees | 70 degrees
Vertical Field of View | 43 degrees | 60 degrees
Tilt Motor | Yes | No
Skeleton Joints Defined | 20 joints | 26 joints
Full Skeletons Tracked | 2 | 6
USB Standard | 2.0 | 3.0
Supported OS | Win 7, Win 8 | Win 7, Win 8
Price (approx.) | $99.99 | $149.99
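
The field-of-view figures in the specifications translate directly into how much of a room the sensor can see at a given distance. The following is a minimal sketch, assuming a simple pinhole camera model in which the visible width at distance d is 2·d·tan(FOV/2). The function name `coverage_m` and the dictionary `KINECT_FOV` are illustrative and are not part of the Kinect SDK.

```python
import math


def coverage_m(distance_m: float, fov_degrees: float) -> float:
    """Extent (width or height) of the visible area at a given distance.

    Assumes an ideal pinhole model: extent = 2 * d * tan(FOV / 2).
    """
    return 2.0 * distance_m * math.tan(math.radians(fov_degrees) / 2.0)


# (horizontal, vertical) field of view in degrees, from the specification table.
KINECT_FOV = {"v1": (57.0, 43.0), "v2": (70.0, 60.0)}

for version, (h_fov, v_fov) in KINECT_FOV.items():
    width = coverage_m(2.0, h_fov)
    height = coverage_m(2.0, v_fov)
    print(f"Kinect {version} at 2 m: {width:.2f} m wide x {height:.2f} m high")
```

Under this model the wider field of view of v2 covers a noticeably larger area at the same distance, which is one reason it can track more people at once.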

How Kinect Works

Kinect can be used in two modes:

  • Skeletal Tracking Mode: The device can detect people and track the movements of their joints
  • Face Tracking Mode: The device can recognize a human face and track the movements of different points on it

Kinect v1 can track the full skeletal movements of 2 people and detect the presence of 4 others. Kinect v2 can track the full skeletal movements of up to 6 people. Kinect v2 can detect up to 26 body joints, including the head, shoulders, elbows, hips, knees, ankles and feet. In Face Tracking mode, it tracks the user’s face and returns two sets of values: 3D Shape Points (121 points) and Tracked Points (87 visible and 13 hidden). The 3D shape points are the mesh vertices that make up a 3D face model and its wireframe. Kinect face tracking can also detect head movements such as pitch, roll and yaw. The vertices tracked by the Kinect can be enumerated by name using the "Feature Point" list, e.g., “TopSkull”, “LeftCornerMouth” or “OuterTopRightPupil”.
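
Once joint positions are available, many gestures reduce to simple geometric rules over those positions. The sketch below illustrates the idea with a "hand raised" check. It assumes each tracked person arrives as a dictionary of joint name to (x, y, z) coordinates in metres with the y axis pointing up; the joint names, the data layout and the 10 cm margin are assumptions made for illustration and do not reproduce the SDK's own types.

```python
from typing import Dict, Tuple

# (x, y, z) in metres; y is assumed to point upwards.
Point3D = Tuple[float, float, float]


def is_hand_raised(joints: Dict[str, Point3D], margin_m: float = 0.10) -> bool:
    """Return True when either hand is at least margin_m above the head."""
    head_y = joints["head"][1]
    return any(
        joints[name][1] > head_y + margin_m
        for name in ("hand_left", "hand_right")
        if name in joints  # a hand may be missing when it is occluded
    )


# Hypothetical frame: the right hand is 25 cm above the head.
frame = {
    "head": (0.0, 1.60, 2.0),
    "hand_left": (-0.3, 1.00, 2.0),
    "hand_right": (0.3, 1.85, 1.9),
}
print(is_hand_raised(frame))
```

The margin keeps small tracking jitter around head height from triggering the gesture.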

Smart Solutions Possible with Kinect

Microsoft Kinect was first released for the Xbox gaming platform. It was a phenomenal success there, and Microsoft then launched a Windows version along with SDKs for developers. With its camera, sensors and other features, the device is no longer limited to gaming; it can be used to build innovative solutions for both personal and business use.

The IGATE R&I team has explored Microsoft Kinect as a technology and developed gesture-based applications that provide a rich user experience. The following sections discuss a few of the interesting possibilities that could be realized on the Kinect platform. The IGATE Kinect team has worked on these concepts and assessed the possibility of leveraging Kinect capabilities for building innovative business applications.

Smart Shopping

Drive-through Ordering

This is a touchless ordering system that can be implemented in restaurants. It allows customers to choose items from a digital menu card through hand gestures or voice commands. They can scroll through the items and select them for ordering.
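
Scrolling a menu by gesture can be approximated by watching how far a tracked hand moves sideways within a short time window. The following is a minimal sketch of that idea, assuming the application already receives timestamped horizontal hand positions from the sensor. The function names, the 30 cm distance and the 0.5 s window are illustrative assumptions, not values from the Kinect SDK or from the system described here.

```python
from typing import List, Optional, Tuple


def detect_swipe(
    samples: List[Tuple[float, float]],
    min_distance_m: float = 0.30,
    max_duration_s: float = 0.5,
) -> Optional[str]:
    """Detect a horizontal swipe ending at the newest sample.

    samples holds (timestamp_s, hand_x_m) pairs, oldest first.
    Returns "left", "right", or None when no swipe is found.
    """
    if len(samples) < 2:
        return None
    t_end, x_end = samples[-1]
    for t_start, x_start in samples[:-1]:
        if t_end - t_start <= max_duration_s:
            dx = x_end - x_start
            if dx >= min_distance_m:
                return "right"
            if dx <= -min_distance_m:
                return "left"
    return None


def scroll(index: int, swipe: Optional[str], item_count: int) -> int:
    """Move the menu selection: swipe left advances, swipe right goes back."""
    if swipe == "left":
        return min(index + 1, item_count - 1)
    if swipe == "right":
        return max(index - 1, 0)
    return index


menu = ["Burger", "Fries", "Coffee"]
hand_track = [(0.0, 0.50), (0.1, 0.35), (0.2, 0.10)]  # hand moving left
selected = scroll(0, detect_swipe(hand_track), len(menu))
print(menu[selected])
```

Requiring both a minimum distance and a maximum duration helps distinguish a deliberate swipe from a customer slowly shifting position in front of the screen.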

Tools Requisition System

This is a solution that helps engineers working in the field or on industrial machines to request the equipment required for their task. Their hands might be occupied with the work or too dirty to touch a keyboard or a touch screen. Using voice commands or hand movements, they can browse the equipment catalogue and place a request.

Smart Mirror

Smart Mirror is a Kinect-based in-store virtual trial room (virtual mirror) application that works with Kinect gesture recognition. When a customer walks into the store and stands in front of the screen with the Kinect sensor, the application automatically detects the person’s presence and allows them to choose garments and accessories with hand gestures or voice commands and virtually try them on. They can choose different combinations from the catalog, take snapshots and share them on social networks.

This solution makes shopping a more engaging experience. It can integrate with the online inventory, so customers can buy an item that is not immediately available in the store and have it shipped to them. It can also reduce the time spent queuing to try on an outfit and make rack space management easier.

Specs Mirror

Specs Mirror is a Kinect-based in-store spectacles try-on application that works with Kinect face recognition. When a customer walks into the store and stands in front of the screen with the Kinect sensor, the application automatically detects the customer’s face, determines its type (e.g., round or oval) and populates a shortlist of recommended spectacle frame types that suit that face. Customers can choose spectacle frames with hand gestures and virtually try them on. They can also take a snapshot and share it on social networks.

Smart Banking

Digital Signage

Digital Signage is an interactive signboard that can be used in exhibitions, malls, kiosk marketing and outdoor advertising. In the banking sector, gesture- and voice-controlled digital signboard applications can be developed that operate through the Kinect. Through these digital boards, banking and finance companies can advertise their services interactively. This can improve customer engagement and make advertising or information dissemination more effective.

Intrusion Detection for ATMs

Kinect v2 can detect up to 6 people when they come within its range. This can help build a security system for ATMs that detects the presence of more than one person and immediately generates a security alert.
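
The alerting rule can be sketched as a small state machine over the per-frame list of tracked people. The example below assumes the application receives, for each frame, the distance in metres of every tracked person from the sensor. The class name, the 1.5 m zone radius and the requirement of several consecutive frames (to avoid alerting on a passer-by or a single noisy frame) are illustrative assumptions rather than part of the described system.

```python
from typing import Iterable


class IntrusionMonitor:
    """Raise an alert when more than one person stays inside the ATM zone."""

    def __init__(self, zone_radius_m: float = 1.5, frames_required: int = 30):
        self.zone_radius_m = zone_radius_m
        self.frames_required = frames_required
        self._streak = 0  # consecutive frames with more than one person

    def update(self, body_distances_m: Iterable[float]) -> bool:
        """Process one frame; return True when an alert should be raised."""
        people_in_zone = sum(1 for d in body_distances_m if d <= self.zone_radius_m)
        self._streak = self._streak + 1 if people_in_zone > 1 else 0
        return self._streak >= self.frames_required


monitor = IntrusionMonitor(frames_required=3)
frames = [[1.0], [1.0, 1.2], [1.0, 1.1], [0.9, 1.0], [1.0, 3.0]]
print([monitor.update(frame) for frame in frames])
```

At 30 frames per second, the default of 30 required frames corresponds to roughly one second of continuous presence before an alert fires.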

Smart Healthcare

Touch-less Integration in Medical Imaging

Surgeons need various types of information to be available rapidly, efficiently and safely during surgical procedures. To control an application in a sterile environment, however, they would have to free their hands during the surgery to reach for a mouse. In addition, they are required to record audio and video files and to enter and save data. Such contact-based interactions introduce the possibility of contaminated material being transferred between sterile and non-sterile areas, which creates difficulties for surgical staff.

This solution gives surgeons touchless interaction within surgical settings, allowing images to be viewed, controlled and manipulated without contact through camera-based gesture recognition technology.

More Interesting Possibilities

Apart from the solutions discussed above, which the IGATE team has worked on, there are a few more interesting possibilities that enthusiastic Kinect developers are venturing into. A few of them are discussed here.

Smart Physiotherapy

Kinect provides an innovative way to play games by controlling game objects with body movements and gestures. Similarly, applications can be developed to provide a smart physiotherapy solution in which users perform physiotherapy exercises in an enjoyable way and with less involvement from a personal trainer.
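
A physiotherapy application typically needs to know how far a limb is bent, which can be derived from three tracked joint positions. The sketch below computes the angle at a middle joint (for example the elbow, between shoulder and wrist) using the dot product of the two limb vectors. The function name and the sample coordinates are hypothetical; only the geometry is standard.

```python
import math
from typing import Sequence


def joint_angle_deg(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(v1, v2))
    n1 = math.sqrt(sum(p * p for p in v1))
    n2 = math.sqrt(sum(q * q for q in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("joints must not coincide")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cosine))


# Hypothetical joint positions in metres: upper arm vertical, forearm horizontal.
shoulder, elbow, wrist = (0.2, 1.4, 2.0), (0.2, 1.1, 2.0), (0.5, 1.1, 2.0)
print(f"Elbow angle: {joint_angle_deg(shoulder, elbow, wrist):.1f} degrees")
```

Comparing this angle against a target range for each repetition is one simple way an application could give feedback without a trainer watching every movement.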

Smart Home System

Microsoft is aiming to make Kinect a crucial piece of the smart home. It is possible to develop applications that control connected home appliances such as smart lights, smart thermostats, smart plugs or even connected door locks. Kinect’s voice control feature can also be used for home automation. In the future, the Kinect sensor could also be integrated with the Cortana personal assistant, which was recently announced as part of Windows 10.

Future of Kinect

Microsoft has a great vision for the future of Kinect. The device, an amalgam of multiple path-breaking technologies, has much hidden potential, and we have only started to witness it. To name a few examples: companies like Fitnect and FaceCake have come up with interactive virtual fitting room applications, kscan3d allows creating 3D scans of people and objects over the web using Kinect, and Ubi Interactive offers a Kinect-powered touch screen solution for education. Kinect v2 is a substantial improvement over v1, and future versions can be expected to be richer in features and more powerful, suitable for developing many types of smart, user-centric and interactive solutions. An exciting future awaits the device, in which it becomes a meaningful part of our day-to-day activities.

Conclusion

As computing becomes a consumer-centric process and user experience drives human-machine interaction, Kinect is a powerful platform for developing gesture- and voice-based customer engagement solutions and other types of applications. Although Kinect was initially intended as a gaming device, many innovations have made it suitable for building business applications as well. The need for touchless computing has also made Kinect a preferred choice for many applications. As discussed in this paper, the diversity of solutions demonstrates how wide the range of possibilities is. Enterprises stand to benefit considerably from leveraging these opportunities through smart Kinect solutions.

References