The ear speakers built into mobile phones are becoming more and more powerful. The downside is that the tiny vibrations they create give away more information.
The more powerful ear speakers now used in many smartphones can be eavesdropped on relatively easily via the built-in motion sensors. A team of security researchers from five US universities has developed and tested an attack of this kind, called EarSpy. The researchers take advantage of the fact that manufacturers equip current mobile phones not only with stereo sound delivered directly to the ear but also with more sensitive motion sensors and gyroscopes. These can pick up even small vibrations and resonances of the speakers.
Side-channel attack on ear speakers
Numerous studies already deal with eavesdropping based on the vibrations generated by a phone's main loudspeakers, which make it possible to hold a conversation without putting the phone to the ear. So far, however, there has been very little research into eavesdropping on the ear speakers, which are also built in but traditionally smaller. According to the scientists, this is nevertheless the “most practical attack vector”, since most people do not use the speakerphone function to make calls in public places.
According to the study, which has now been published as a preprint, eavesdropping on mobile phones is a known threat and a major security problem for users. Recording calls is the easiest way for an attacker to listen in. However, current smartphone operating systems restrict third-party apps from recording via the microphone, which thwarts many attacks of this type.
In principle, it is still possible to extract speech information from motion sensors via a so-called side-channel attack, since reading these sensors does not require the user’s permission, the researchers write. Despite the hurdles that the latest Android versions place on collecting information from sensors, this is a “significant data protection problem” that many users are not aware of. In addition to motion sensors, keystrokes on touch screens, writing with a pen, and the use of external devices and light sensors have also been exploited for such attacks.
Countermeasures recommended to manufacturers
The team then set out to analyze data from the motion sensors and to extract sensitive speech and speaker information from a recording played through the ear speaker. They benefited from the fact that one device used in the experiments, the OnePlus 7T, a Chinese Android model from 2019 with stereo ear speakers, produced measurable vibrations far more readily than the OnePlus 3T, which is three years older, and thus supplied more information that EarSpy could use.
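The kind of analysis involved can be sketched in a few lines. The following Python example is an illustration only and not the researchers' code: it assumes a single accelerometer axis, a hypothetical sampling rate of 400 Hz and a synthetic trace, and all function names are made up for this sketch. It computes simple time-domain statistics and a basic spectrogram, then reads off the dominant vibration frequency.

```python
import cmath
import math


def time_features(samples):
    """Return simple time-domain statistics of one accelerometer axis."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    peak = max(abs(s - mean) for s in samples)
    return {"mean": mean, "std": math.sqrt(variance), "peak": peak}


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_len=64, hop=32):
    """Remove the constant offset, then transform overlapping Hann-windowed frames."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]  # drops the gravity component
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1)) for i in range(frame_len)]
    frames = []
    for start in range(0, len(centered) - frame_len + 1, hop):
        frame = [centered[start + i] * window[i] for i in range(frame_len)]
        frames.append(magnitude_spectrum(frame))
    return frames


def dominant_frequency(samples, sample_rate, frame_len=64, hop=32):
    """Frequency in Hz of the strongest non-DC bin, averaged over all frames."""
    frames = spectrogram(samples, frame_len, hop)
    n_bins = len(frames[0])
    avg = [sum(f[k] for f in frames) / len(frames) for k in range(n_bins)]
    best = max(range(1, n_bins), key=lambda k: avg[k])
    return best * sample_rate / frame_len


# Synthetic stand-in for a sensor trace: gravity offset plus a weak 100 Hz vibration.
SAMPLE_RATE = 400  # Hz, assumed for illustration only
trace = [9.81 + 0.02 * math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(512)]

print(time_features(trace))
print(dominant_frequency(trace, SAMPLE_RATE))  # 100.0
```

Even though the vibration in this synthetic trace is roughly 500 times weaker than the constant gravity reading, it stands out clearly once the signal is viewed in the frequency domain. A real attack would feed features of this kind into a classifier instead of merely printing them.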
The scientists then studied the reverberation effect of the ear speakers on the sensor by extracting features and spectrograms in the time and frequency domains. They evaluated this data using classic machine learning algorithms and a specially developed neural network model. The result: “We found an accuracy of up to 98.6 percent for gender recognition, up to 92.6 percent for speaker recognition, and up to 56.42 percent for speech recognition.” It is not yet possible to eavesdrop on conversations in full, but the threat posed by such attacks is clearly recognizable. As a defensive measure, the researchers recommend that manufacturers place the motion sensors in such a way that “