“Machine Learning” is one of those buzzwords thrown around a lot these days, mentioned in the same breath as “Artificial Intelligence” and “Computer Vision.” Yet despite the prominence of machine learning, most Americans do not know what it actually is, let alone its implications for individual privacy in cyberspace. Data in the 21st century is a bit like what oil was in the 20th: whichever party controls the flow of data controls the world economy. The amount of data created every day is staggering, and it will take increasingly sophisticated AI technologies, such as machine learning, to leverage that data well. Just what is machine learning, exactly, and what impact does it have on how our data is used?
Before continuing, it may be helpful to consider briefly what machine learning actually is. While there is some debate about a proper definition, the bare-bones version is this: computer systems that are able to learn from data rather than simply operate within a rules-based framework. Consider the example of video games: as long as you keep the level the same, the AI opponents are not going to be significantly more challenging with each playthrough. On the other hand, if the game were leveraging machine learning, the AI opponent would “learn” your playing style, counter it, and the game would become more and more difficult until you chucked your controller at the wall in frustration.
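To make the distinction concrete, here is a minimal, purely illustrative Python sketch; the function names and game moves are invented for this example. The rules-based opponent responds the same way no matter how you play, while the learning-based one adjusts to your play history.

```python
# Illustrative sketch only: rules-based vs. learning-based game opponents.
import random
from collections import Counter

# Rules-based: follows the same fixed script on every playthrough.
def rules_based_opponent(player_history):
    return "block_left"  # always the same response, regardless of how you play

# Learning-based: counts which attack you favor and counters it.
def learning_opponent(player_history):
    if not player_history:
        return random.choice(["block_left", "block_right"])
    most_common_attack, _ = Counter(player_history).most_common(1)[0]
    return "block_left" if most_common_attack == "attack_left" else "block_right"

player_history = ["attack_left", "attack_left", "attack_right", "attack_left"]
print(rules_based_opponent(player_history))  # block_left, always
print(learning_opponent(player_history))     # block_left, because you favor attack_left
```

The point of the toy example is simply that the second opponent's behavior is driven by data about you, not by a script written in advance, which is the core of what makes machine learning different.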
Beyond the implications for your poor Xbox, some of the current applications of machine learning surround us every day, and not all of them are as benign as they appear. Virtual assistants such as the Amazon Echo, Apple HomePod and Google Home leverage machine learning to provide you with customized responses. However, there is the potential for this service to go wrong; you may remember that Amazon took some heat recently after a well-publicized story of a family whose conversations were unknowingly sent to Amazon via their device.
Machine learning could also be a godsend for hackers running a scam as old as email itself: phishing. Often sent in the form of emails, phishing attacks appear to come from legitimate institutions asking you to verify sensitive personal information such as Social Security or bank account numbers. In the good old days of hacking, you could usually spot a phishing attack simply by the poor grammar of the email. With the advent of advanced machine learning, hackers could not only clean up their wording but also tailor it to the individual target. Imagine receiving an email from a friend asking you to click on a “cool link they saw.” You think nothing of it because not only does the email seem to come from the friend, it is also written in your friend’s style, complete with the same phrases, typos and even emojis. A scam of this type was recently uncovered in India, and hackers are also taking the tactic to Twitter.
While national laws will always be playing catch-up to technology, governments are starting to take the threat seriously. In May, the European Union (EU) released its new General Data Protection Regulation (GDPR), which replaced its predecessor, the Data Protection Directive (DPD), first adopted in 1995. Under the GDPR, consumers who are targeted by legitimate machine learning software have a “right to explanation,” meaning that individuals in the EU may demand an explanation of (or opt out of) decisions made by machine learning software that have legal effects on them, such as automated e-recruiting or credit applications. While debate continues between those who favor the GDPR and those who feel it will stifle innovation in the marketplace, it is very likely that the law will need to be continually redefined, clarified and updated as the technology advances.
Should the public be terrified of machine learning technology? Should we be tossing our HomePods into the nearest lake? Are all our personality traits going to be copied and turned into a bot somewhere? Not quite. There are two key reasons to take a breather: First, in addition to the speech and image recognition used by everyday apps such as Instagram and Snapchat, machine learning has some pretty cool applications in medicine, from spotting fetal heart problems to helping identify patients with schizophrenia.
Second, there are limits to what machine learning is actually able to accomplish. Though machine learning can recognize words, phrases and the like, it is still largely unable to put them into context or interpret emphasis. Consider the phrase “I didn’t say you stole the money”; depending on which word is emphasized, the phrase takes on entirely different meanings. Though machine learning software is able to recognize the phrase itself, most platforms continue to struggle with emphasis, a critical aspect of communication. Machine learning also remains limited by human biases, as a recent application in the criminal justice system demonstrates.
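As a toy illustration (not any particular product’s pipeline), here is a short Python sketch of why emphasis is so hard for text-based systems: once speech is transcribed, every stressed variant of the sentence collapses into the same sequence of words, so the distinction is simply gone before the software ever sees it.

```python
# Illustrative sketch: text-only processing discards spoken emphasis.
sentence = "I didn't say you stole the money"

# Imagine seven recordings, each stressing a different word.
spoken_variants = [(sentence, stressed_word) for stressed_word in sentence.split()]

# A text-only pipeline sees only the words, not the stress.
tokenized = [variant.lower().split() for variant, _ in spoken_variants]

# All seven variants collapse into one identical representation.
print(all(tokens == tokenized[0] for tokens in tokenized))  # True
```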
In short, machine learning is like any emerging tech: we’re still testing the boundaries and working out the kinks. So unless you want to give up Alexa telling you the most relevant traffic patterns before you head into work, stick with it; the game has only just begun.