President of Univention North America, making sure you stay in control of your data, your company and your future.
Let’s face it—the average consumer never looks at a data privacy agreement. For most of us, the convenience of just using a cloud service, app or software trumps the rights and responsibilities that companies might restrict in their terms and conditions, usually well hidden inside pages upon pages of fine print that invite hasty scrolling to the “accept” button.
However, according to the latest research, this attitude is changing. Data privacy is becoming a concern, with 53% of Americans saying they’re worried AI will hurt people who want to keep their data private.
This number should warn companies trying to implement AI. Consumers are starting to care, especially if data around their identity is concerned. Thus, when implementing AI programs, organizations must carefully consider data ownership and protection to avoid alienating or angering their customers.
Let’s dive deeper into the evolving topic of AI and the havoc it can wreak with identity data. As it turns out, some simple considerations can help set consumers’ minds at ease.
Supercharged Scams
When AI gained high visibility in the mainstream media, there were numerous predictions about new AI-based cyberattacks. Now that we’ve adjusted to living with AI for over a year, it turns out that artificial intelligence hasn’t really introduced any new attacks into the mix of threat vectors. Instead, it has supercharged existing scams in ways that were unimaginable before. From voice-spoofed scam calls to perfectly faked business emails, AI constantly challenges us to detect and deflect these attacks.
All those scams have one thing in common: The more AI “knows,” the better and potentially more successful the attacks. Training data is precious to the bad guys. If you want to spoof someone’s voice successfully, you need voice recordings, and faking a bank’s email requires multiple samples to “learn” the wording and data included to make the scam look and sound as authentic as possible.
The Pitfalls Of AI-Guided Training And Analysis
Calling a customer service hotline today starts with a friendly reminder that this call is recorded for training and quality assurance purposes. During the interaction, we might recite or key in our Social Security number, address and account number. Although it used to be expensive to sift through the massive number of calls the average bank or airline receives daily, AI is changing the metrics. Using AI, it’s now possible to provide real-time feedback to agents and supervisors in a call center.
The recorded calls provide the training data for the models. However, recordings contain all the identifying information provided during the call unless carefully sanitized.
Theoretically, an AI model should be designed to abstract from its training data rather than regurgitate a precise copy of our inputs. However, as the infamous example of GitHub’s Copilot shows, AI isn’t infallible. Sometimes, the model returns exactly the training data it was fed.
Although that was just a copyright issue for GitHub, leaked identity data can create expensive real-world problems. Changing Social Security numbers or addresses is tough, and it’s impossible to change our voice patterns or dates of birth. Outsourcing core AI features to service providers can worsen matters, as it suddenly becomes possible to aggregate our identity data from multiple sources, creating a toxic data lake.
Data Ownership: The Cornerstone Of AI Development
Consequently, data ownership and control over the AI model are two key business questions when considering your AI strategy. The backlash over Zoom’s change to its terms of service, which centered on the company’s intent to use transcriptions to train its models, clearly shows that consumers and businesses aren’t willing to be used as a quarry of training data for a company’s AI endeavors. They’re even less inclined to serve as training data if it’s unclear what happens to their data and who can access the results.
We must take data ownership and protection seriously to harness the coming AI revolution for good outcomes. Data privacy policies can’t simply exist to protect companies from lawsuits and regulatory action. They have to start living up to their names and outline the protections in place for each consumer’s data once it is collected.
We must also question whether we can outsource data analytics and AI to third-party providers. Especially in protected settings like healthcare and education, data leakage and cross-account contaminations carry tremendous legal and reputational risks. Luckily, a growing number of new entrants are developing models and AI systems that allow enterprises to handle sensitive data behind their firewalls and with compartmentalized models.
Lastly, we must emphasize sanitizing records before they are analyzed. If a recording or data point doesn’t contain any personal identity data, there’s no risk of that data being preserved inside the model. Yet, that’s not the case in all circumstances. Hence, the design and implementation of AI applications must pay close attention to what can and should be scrubbed and prevented from entering the training set before it’s too late to remove. Even AI experts admit it’s hard, if not impossible, to delete data from machine learning models.
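To make the idea of scrubbing records concrete, here is a minimal Python sketch that masks two kinds of identifiers in a call transcript before it could enter a training set. The patterns, placeholder tags and the function name scrub_transcript are assumptions made for illustration only. A real pipeline would need far broader coverage (names, addresses, dates of birth, spoken digits) and would also have to deal with the audio itself.

```python
import re

# Illustrative patterns only (assumptions, not a complete PII catalog).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")   # e.g. 123-45-6789
ACCOUNT_PATTERN = re.compile(r"\b\d{10,16}\b")       # long digit runs, e.g. account numbers


def scrub_transcript(text: str) -> str:
    """Replace Social Security and account numbers with placeholder tags."""
    text = SSN_PATTERN.sub("[SSN]", text)
    text = ACCOUNT_PATTERN.sub("[ACCOUNT]", text)
    return text


if __name__ == "__main__":
    sample = "My SSN is 123-45-6789 and account 1234567890."
    print(scrub_transcript(sample))
```

The point of the sketch is the ordering: redaction happens before training, because removing the data from a trained model afterward may not be possible.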
Say It Loud: Allow Opt-Out
Overall, we must start taking data ownership and protection seriously in an AI-assisted world. That means clearly telling users what data is collected, who stores and processes the data, and what data can be retrieved. Furthermore, we must give end users better and more straightforward opportunities to opt out of the collection process. That’s especially true when personally identifying data, video or voice recordings are concerned.
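One way an opt-out could be honored in practice is to filter recordings before any training job sees them. The Python sketch below assumes a hypothetical Recording record with a training_opt_out flag. Both names are invented for this example and do not describe any particular vendor’s system.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Recording:
    """Hypothetical record of one customer interaction."""
    customer_id: str
    transcript: str
    training_opt_out: bool  # True if the customer declined training use


def select_for_training(recordings: Iterable[Recording]) -> List[Recording]:
    """Keep only recordings whose owners have not opted out."""
    return [r for r in recordings if not r.training_opt_out]


if __name__ == "__main__":
    calls = [
        Recording("c1", "hello", training_opt_out=False),
        Recording("c2", "hi there", training_opt_out=True),
    ]
    print([r.customer_id for r in select_for_training(calls)])
```

Placing the check at the point where training data is assembled means a customer’s choice is enforced in one spot, rather than relying on each downstream system to remember it.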
None of us wants to scramble to change parts of our identity, with all the worries, woes and work it entails, simply because a poorly designed AI malfunctions. And neither should unclear data ownership or leaking models force us to do so. Machines and algorithms are there to serve us and make our lives easier, not the other way around.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.