How AI is Transforming Pro AV


For many years, the AV/IT industry has been using artificial intelligence (AI) to develop products and solutions. Control and automation companies designed drag-and-drop room configurators that eliminated programming, and deployed in-room sensors that, when triggered, turn on the lights, set the HVAC, and launch a presentation. Today, many are delivering intelligent, touchless environments using AI and facial detection.
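To make that kind of trigger-to-action automation concrete, here is a minimal sketch in Python; the device-control functions and room name are hypothetical stand-ins, not any vendor's actual API.

    # Minimal sketch of sensor-triggered room automation (illustrative only).
    # The device-control functions are hypothetical stand-ins for a real
    # control system's lighting, HVAC, and presentation drivers.

    def turn_on_lights(room):
        print(f"[{room}] lights on")

    def set_hvac(room, temp_f):
        print(f"[{room}] HVAC set to {temp_f} F")

    def launch_presentation(room):
        print(f"[{room}] presentation source launched")

    def on_occupancy_detected(room):
        """Runs when an occupancy sensor in the room is triggered."""
        turn_on_lights(room)
        set_hvac(room, temp_f=71)
        launch_presentation(room)

    if __name__ == "__main__":
        on_occupancy_detected("Conference Room 4B")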

We can often look toward consumer electronics trends to predict technologies that will emerge in the enterprise and other commercial markets. Some kernels of the near future could be gleaned from a virtual conference session held during CES 2021 between New York Times columnist and author Thomas Friedman and renowned AI expert Amnon Shashua, president and CEO of Mobileye. Together, they discussed fundamental questions related to the ethics and values governing today's technology, the challenges posed by a rapid pace of change and automation, and solutions for maximizing opportunities in a world that is fast, fused, and deep.

Shashua explained that AI is narrow; it is software that is optimized to solve a single problem. "What Mobileye is doing is all about pattern recognition; cameras and other sensors understanding the visual world and then using that interpretation to drive decisions to do self-driving."
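The pattern Shashua describes, perception feeding a decision, can be sketched in a few lines; the detection labels and rules below are invented purely for illustration and are not Mobileye's system.

    # Minimal sketch of narrow AI's perceive-then-decide loop (illustrative only).

    def decide(detections):
        """Map objects perceived by cameras/sensors to a single driving decision."""
        if "pedestrian_ahead" in detections:
            return "brake"
        if "red_light" in detections:
            return "stop"
        return "proceed"

    print(decide(["red_light", "vehicle_left"]))  # -> "stop"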

Shashua declared AI as the frontier of yesterday. "The next frontier is language," he said. This is where artificial general intelligence (AGI) comes in. An example would be software that can read and comprehend text to a degree that a person can have a conversation with the computer. "Once a computer can do that, it means it can understand context, understands common sense, understands temporal dimension," Shashua said. Taking that to the next step, the computer can also write a story. "If you summarize the main ideas, then you can let the computer write everything, and you can interact with the computer and refine it," he said.

Because of huge leaps in the progress of language comprehension, Shashua predicts that within the next two to five years, we will see computers that can pass comprehension tests. "If you have more and more compute, and more and more data, you can do things that a few years ago would be considered science fiction," he said.

Relating

Mark Peterson, Shen Milsom & Wilke

Mark Peterson, a principal at Shen Milsom & Wilke (SM&W), has had his eye on AI and AGI as it applies to the AV/IT industry. "AGI is the litmus test of being able to think, and be able to ask questions and then make responses without having it be pre-programmed and predetermined," he said. Computers are at another stage of intelligence. "Recently, we saw the announcement that Alexa now has the ability to anticipate when people want their lights shut off: whether they say the command or not, it will draw a conclusion based upon preexisting conditions," Peterson said.
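A hedged sketch of how that kind of anticipation can be drawn from preexisting conditions: log what the user typically does after a trigger, and act only when the pattern is strong. The events and threshold below are invented for illustration and are not Amazon's implementation.

    # Illustrative sketch: infer a routine from historical events and act on it.
    # The event log and threshold are invented; this is not Alexa's actual logic.

    history = [
        ("door_locked_22:00", True),   # trigger event, lights turned off afterward?
        ("door_locked_22:00", True),
        ("door_locked_22:00", False),
        ("door_locked_22:00", True),
    ]

    def should_turn_off_lights(trigger, log, threshold=0.7):
        """Anticipate the action if the user almost always performs it after this trigger."""
        outcomes = [followed for event, followed in log if event == trigger]
        return bool(outcomes) and sum(outcomes) / len(outcomes) >= threshold

    if should_turn_off_lights("door_locked_22:00", history):
        print("Lights off (anticipated from past behavior)")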

When clients ask SM&W for the latest in smart building technologies, Peterson emphasized that it still begins with usage scenarios, no matter how advanced the technology. "I create a narrative," he said. "'When I walk into the room, what do I want to do?' The usage scenario narrative describes the automation process. And that starts to put people in the mindset of visualizing and understanding kind of where the automation can help them."

Facial detection has been used in the security industry for a while, but since the pandemic, companies are embedding that technology into more products, such as building entry kiosks. In these cases, the person entering the building has "opted in" and knows their facial profile is being used for their benefit. "I want to be able to have a frictionless experience entering the building and going upstairs," Peterson said. "So, therefore, I want it to be able to help people recognize me."
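For readers curious about the mechanics, a minimal face-detection sketch using OpenCV's stock Haar cascade follows. It shows detection only (is a face present?); matching the face against an enrolled, opted-in profile would be a separate recognition step.

    # Illustrative face-detection sketch for an opt-in entry kiosk (OpenCV).
    # Detection only; matching to an enrolled profile is a separate step.

    import cv2

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    cap = cv2.VideoCapture(0)  # kiosk camera
    ok, frame = cap.read()
    cap.release()

    if ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces):
            print(f"{len(faces)} face(s) detected; hand off to the enrolled-profile check")
        else:
            print("No face detected")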

With facial detection and other technologies that are collecting more personal data, Peterson cautions, "There should be foundational rules, and we should be able to apply them."

Automating Intelligence

Rich Ventura, Sony Electronics Professional Solutions Americas

From automated, intelligent presenter tracking embedded in video cameras, to displays that report the number of people entering a building, to facial detection that provides data such as sentiment, to audio solutions with automatic dynamic beamforming technology, to handwriting recognition, there's no shortage of innovative AV/IT solutions being brought to market every day.

The pandemic has increased the awareness of screens that capture data, such as people's temperature, as they enter a building. Rich Ventura, vice president of B2B at Sony Electronics Professional Solutions Americas, said, "I'm sure the individuals at World Privacy Forum and some of the other privacy groups are going to disagree with what I say here. But I'm hoping we're at the point that people realize that tools like this become very valuable and can help you."

In 2019, Sony Pro introduced the Edge Analytics Appliance, a device engineered with AI technology. "It is designed to do everything from following a presenter as they're speaking with audio tracking, to handwriting extraction so the audience can see the content being written on the board, and REA-C1000 technology that can detect and react to gestures from audience members," Ventura said.

"These types of devices are an area where I see more and more organizations spending and investing money," he added. "They minimize the number of staff needed to run a presentation, as well as have the ability to share content."

Recognition

Joel Hagberg, Intel

For more than 10 years, Intel has been invested in computer vision, enabling partner companies to build devices in the digital display space. "Our product is collecting data, analyzing it very quickly in an ASIC, and then being able to contribute it to very exciting developments with machine learning, artificial intelligence," said Joel Hagberg, head of product management and marketing for Intel RealSense at Intel. "You can truly utilize computer vision to improve the performance of interaction, of products to make the world safer, more productive, and make it a more immersive environment."

Intel announced its Intel RealSense Touchless Control Software (TCS) in January, a cost-effective solution that transforms a touch-based system into a touchless one. "You don't need to replace your components; you're just mounting a camera, loading our software, and it's a plug-and-play change for those kiosk manufacturers," Hagberg said. With the Intel ASIC doing the vision processing, "we've been able to take our Intel RealSense Depth Camera D435 with the TCS software and provide a seamless conversion from touch to touchless."
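As a rough idea of how a depth camera can stand in for a touch surface (a conceptual sketch, not Intel's TCS implementation), the pyrealsense2 SDK reports per-pixel distance, and anything that crosses an assumed virtual plane in front of the screen can be treated as a touch. The plane distance and tolerance below are made-up values.

    # Conceptual sketch of touch-to-touchless with a depth camera (not Intel's
    # TCS product). Assumed value: a virtual touch plane 0.40 m from the camera.

    import pyrealsense2 as rs

    TOUCH_PLANE_M = 0.40   # assumed distance to the virtual touch plane
    TOLERANCE_M = 0.03     # how close to the plane counts as a "touch"

    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    pipeline.start(config)

    try:
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        nearest = None
        # Sample the depth image and find the point closest to the camera.
        for y in range(0, 480, 8):
            for x in range(0, 640, 8):
                d = depth.get_distance(x, y)
                if d > 0 and (nearest is None or d < nearest[0]):
                    nearest = (d, x, y)
        if nearest and nearest[0] < TOUCH_PLANE_M + TOLERANCE_M:
            print(f"Touch at pixel ({nearest[1]}, {nearest[2]}), {nearest[0]:.2f} m away")
        else:
            print("No touch detected")
    finally:
        pipeline.stop()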

Answering the need for touchless access control, Intel also launched Intel RealSense ID, a secure facial authentication device. "There has been a significant spike in interest in the facial authentication space to move away from punching in codes with fingers," Hagberg said.

Intel takes privacy very seriously. Intel RealSense ID is not a surveillance camera and does not randomly scan faces. "It does not do a facial scan and store an image," Hagberg said. "We're not storing your facial image as a biometric; we're storing facial landmarks as the biometric signature." The device is also attention aware, meaning it only authenticates an enrolled user who is actively facing it, so the user knows they are being authenticated.
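The general idea of storing a signature rather than a picture can be sketched simply; this illustrates the concept only, not Intel's implementation, and the vectors and threshold are invented. Enrollment saves a numeric landmark vector, and authentication compares a fresh vector against it.

    # Illustrative sketch of landmark-based matching: only a numeric signature is
    # stored, never an image. Vectors and threshold are invented for illustration.

    import math

    def distance(a, b):
        """Euclidean distance between two landmark vectors."""
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    enrolled_signature = [0.12, 0.87, 0.45, 0.33]  # stored at enrollment (hypothetical)
    live_signature = [0.13, 0.85, 0.47, 0.31]      # derived from the live camera
    THRESHOLD = 0.10                               # match tolerance (hypothetical)

    if distance(enrolled_signature, live_signature) < THRESHOLD:
        print("Authenticated")
    else:
        print("Not recognized")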

"There's a lot of intelligence that we're gathering, so our feeling is that if you can enable a robot, a computer, to become aware of its surroundings, we're making a truly smart device that can see and understand and interact and make decisions," Hagberg added.

"In the AV world, there are a lot of opportunities to be able to use the digital signage or voice devices to pull information,” Hagberg said. “We're giving the computer system the ability to get that visual data. I think it's a perfect platform for artificial intelligence and machine learning applications to be able to take the data, analyze it, and make it a safer environment, a more productive environment, or a more immersive experience for customers."

Connecting the Vectors

Christopher Jaynes, Mersive

According to Christopher Jaynes, founder and CTO of Mersive, "If you look at how people assume technology will emerge, often they don't think about the ubiquitous infrastructural implications." Using automated cars as an example, "you think about, 'How would I build a smart car?'" But, Jaynes points out, the real evolution is a combination of the car plus a smart infrastructure that's talking to it.

Supportive assistance in the workplace and education is where Jaynes is focusing his attention. "We have 190,000 rooms worldwide already with little boxes that are very, very smart. And when I walk into the room, I have a mobile device. So, when I say 'assistant,' I don't think of my assistant talking to me. The infrastructure itself is looking for ways to help my day get better."

For nearly two years, Mersive has been collecting meeting ratings from users. A random set of questions is asked to glean information about productivity, engagement, or sentiment. "It's not about the tech; it's the outcome," Jaynes said. "Those words are vectors into all the data we've collected for an AI to mine."

With this information, an employee whose meeting doesn't demand a high level of productivity can ask, "Where are my most enjoyable meetings?" The assistant would then direct the employee to the ideal available meeting space.

"So that, I think, is a big role of AI," Jaynes said. "This idea of assisting you through your day in ways that the AI itself knows based on the current status of where you are, what you need to get done, and where the infrastructure can support those needs through pattern recognition. That's coming real soon. That's a big deal."

Cindy Davis
Brand and content director of AV Technology

Cindy Davis is the brand and content director of AV Technology (AVT). She was a critical member of the AVT editorial team when the title won the “Best Media Brand” laurel in the 2018 SIIA Jesse H. Neal Awards. Davis moderates several monthly AV/IT roundtables and enjoys facilitating and engaging in deeper conversations about the complex topics shaping the ever-evolving AV/IT industry. She explores the ethos of collaboration, hybrid workplaces, experiential spaces, and artificial intelligence to share with readers. Previously, she developed the TechDecisions brand of content sites for EH Publishing, named one of the “10 Great Business Media Websites” by B2B Media Business magazine. For more than 25 years, Davis has developed and delivered multiplatform content for AV/IT B2B and consumer electronics B2C publications, associations, and companies. A lifelong New Englander, Davis makes time for coastal hikes with her husband, Gary, and their Vizsla rescue, Dixie, sailing on one of Gloucester’s great schooners and sampling local IPAs.