An Engadget report today, “Hangouts eavesdrops on your chats to offer ‘smart suggestions’”, describes a new “spy/valet” feature being added to Google’s popular video chat tool.
It’s sad that this sort of thing still gets meekly labeled as “creepy”. The privacy implications are serious and pretty easy to see.
Google is evidently processing the text of Hangouts conversations as they pass through its system: extracting linguistic cues, interpreting what’s being said using Artificial Intelligence, deriving new meaning and insights, and offering suggestions.
We need some clarification about whether any covert tests of this technology were undertaken during the R&D phase. A company obviously doesn’t launch a new product like this without a lot of research, feasibility testing and prototyping. Serious work on ‘smart suggestions’ would not start without first testing how it works in real life. So I wonder whether any of this evaluation was done covertly on live data. Are Google researchers routinely eavesdropping on Hangouts to develop the ‘smart suggestions’ technology?
Many people have said to me I’m jumping the gun, and that Google would probably test the new Hangouts feature on its own employees. Perhaps, but given that scanning Gmail is situation normal for Google, and they have a “privacy” culture that joins up all their business units so that data may be re-purposed almost without limit, I feel sure that running AI algorithms on text without telling people would be par for the course.
In development and in operation, we need to know what steps are taken to protect the privacy of Hangouts data. What personally identifiable data and metadata are retained for other purposes? Who inside Google is granted access to the data and especially the synthesised insights? How long does any secondary usage persist? Are particularly sensitive matters (like health data, financial details, corporate intellectual property etc.) filtered out?
This is well beyond “creepy”. Hangouts and similar video chat tools are certainly wonderful technologies. We’re using them routinely for teaching, education, video conferencing, collaboration and consultation. The tools may become entrenched in corporate meetings, telecommuting, healthcare and the professions. But if I am talking with my doctor, or discussing patents with my legal team, or having a clandestine chat with a lover, I clearly do not want any unsolicited contributions from the service provider. More fundamentally, I want assurance that no machine is ever tapping into these sorts of communications, running AI algorithms, and creating new insights. If I’m wrong about covert testing on live data, then Google could do what Apple did and publish an Open Letter clarifying their data usage practices and strategies.
Come to think of it, if Google is running natural language processing algorithms over the Hangouts stream, might they be augmenting their Gmail scanning the same way? Their business model is to extract insights about users from any data they get their hands on. Until now it’s been a crude business of picking out keywords and using them to profile users’ interests and feed targeted advertising. But what if they could get deeper information about us through AI? Is there any sign from their historical business practices that Google would not do this? And what if they can extract sensitive information like mental health indications? Even with good intent and transparency, predicting healthcare from social media is highly problematic, as shown by the “Samaritans Radar” experience.
Artificial Intelligence is one of the new frontiers. Hot on the heels of the successes of IBM Watson, we’re seeing Natural Language Processing and analytics rapidly penetrate business and now consumer applications. Commentators are alternately telling us that AI will end humanity, and not to worry about it. For now, I call on people to simply think clearly through the implications, such as for privacy. If AI programs are clever enough to draw deep insights about us from what we say, then the “datapreneurs” in charge of those algorithms need to remember they are just as accountable for privacy as if they had asked us to reveal all by filling out a questionnaire.