This report originally appeared in the Journal of Digital Media & Policy on 7 July 2020
Advances in deepfake technology have led to the emergence of a new picture of how doctored material will be used in disinformation campaigns. While safeguards ensure that manipulated videos may not be such a problem at the highest levels of security and defence, lower levels – such as local elections – remain vulnerable to malign actors. At such levels, deepfakes can be distributed through social media channels to target unsuspecting victims. Current solutions only protect individuals who are prominent enough to be covered by the mainstream media, and not enough is being done by governments or social media companies to protect ordinary users from coordinated inauthentic activity online. With more images and videos of ourselves online than ever before, however, anyone can be the victim of a disinformation campaign. As deepfakes become easier to make, no one is safe – hyper-localized manipulation will create problems for democratic institutions that are not yet fully understood.
The most commonly cited example of the potentially disastrous impact of a deepfake is the circulation of a manipulated video that appears to show President Trump declaring a nuclear strike. In this scenario, Russia believes the video and, in a matter of moments, a real nuclear attack is triggered. Deepfakes risk ‘global nuclear war’, headlines remind us.
The reality is that as the technology has progressed, a different picture of how deepfakes will be used has emerged. They may not be such a problem at the highest levels of security and defence. Instead, the problems that deepfakes pose lie at the lowest levels.
WHAT ARE DEEPFAKES?
A deepfake, a form of synthetic media, is a video in which a programme has been used to replace one face with another. Some are highly complex and can imitate voices too. The original video can be anything: a Hollywood movie, a press briefing or a clip shot on a phone.
Programmes such as the open-source DeepFaceLab take an original video and track the face, frame by frame. They then take another face and superimpose it onto the original. This process uses a form of artificial intelligence (AI): a neural network called an autoencoder.
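The autoencoder idea can be illustrated with a toy sketch. The code below is a hypothetical illustration of the shared-encoder/per-identity-decoder structure used by face-swap tools, not DeepFaceLab’s actual implementation: random vectors stand in for cropped face frames, one linear ‘encoder’ is trained on both identities, each identity gets its own linear ‘decoder’, and the swap decodes identity A’s encoding with identity B’s decoder.

```python
# Toy sketch (not DeepFaceLab's real code): a shared linear encoder with one
# linear decoder per identity. Random vectors stand in for face frames.
import numpy as np

rng = np.random.default_rng(0)
DIM, LATENT, STEPS, LR = 32, 8, 500, 0.01

faces_a = rng.normal(size=(100, DIM))  # stand-in frames of identity A
faces_b = rng.normal(size=(100, DIM))  # stand-in frames of identity B

E = rng.normal(scale=0.1, size=(DIM, LATENT))   # shared encoder weights
Da = rng.normal(scale=0.1, size=(LATENT, DIM))  # decoder for identity A
Db = rng.normal(scale=0.1, size=(LATENT, DIM))  # decoder for identity B

def train_step(X, E, D, lr):
    """One gradient-descent step on the reconstruction error ||X E D - X||^2."""
    Z = X @ E                      # encode
    R = Z @ D - X                  # reconstruction residual
    gD = Z.T @ R / len(X)          # gradient w.r.t. the decoder
    gE = X.T @ (R @ D.T) / len(X)  # gradient w.r.t. the shared encoder
    D -= lr * gD
    E -= lr * gE
    return float(np.mean(R ** 2))

loss0 = float(np.mean((faces_a @ E @ Da - faces_a) ** 2))  # loss before training

for _ in range(STEPS):             # the shared encoder sees both identities
    loss_a = train_step(faces_a, E, Da, LR)
    loss_b = train_step(faces_b, E, Db, LR)

# The "swap": encode A's frames with the shared encoder, decode with B's decoder.
swapped = faces_a @ E @ Db
```

In real tools the encoder and decoders are deep convolutional networks trained on thousands of aligned face crops, and the decoded face is blended back into each frame, but the shared-encoder/two-decoder structure is the core idea.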
One of the big advances in deepfake technology in recent years is the time it takes to make the mask, which has shrunk from days to mere seconds. Another is how well programmes can compensate for changes in lighting and angle in the original video. Early versions needed a straight-to-camera shot with consistent lighting to produce the best results. Now, videos can be convincing even if the subject is moving or the light and shadows change.
Commercial applications of deepfakes can be found in Instagram and Snapchat. Both apps scan a user’s face and can, in real time, project images onto it, such as puppy ears and a nose; modify face shapes by giving users bigger eyes or smoother skin; or project the face into a different image. Although it is a security concern to have someone’s face mapped onto the body of a head of state, it is a viral marketing strategy to have a user’s face mapped onto the protagonist of a summer blockbuster, or onto the mascot of a franchise.
These commercial uses continue to push the boundaries of what is possible. For the 2016 film Rogue One, a deepfake-like technology was developed by Industrial Light & Magic, a leading CGI company, to put the late actor Peter Cushing back on screen as his younger self. Likewise, a Snoop Dogg music video used a deepfake to map the face of the late rapper Tupac Shakur onto an actor.
The increasing commercial use of this technology indicates that, irrespective of speculative investment or academic research, the free market will continue to develop more convincing deepfakes – especially those that can be made faster and with less input data. Deepfakes are here to stay.
The security problem posed by deepfakes must not be overlooked: they enable bad actors to act fraudulently, and there are a variety of ways they can be exploited. The most common is in disinformation campaigns (e.g. imitating Putin). While traditional tools, such as text-based posts and digital image manipulation, remain the easiest ways to spread disinformation, deepfakes offer a more convincing alternative. It is easy to write a fake news report or post a misleading tweet, but it is more rewarding for bad actors to spread deepfakes, especially when they can be convincing enough to show a ‘white police officer shouting racial slurs or a Black Lives Matter activist calling for violence’. This was a common disinformation tactic in the build-up to and during the 2016 US election, when organizations, often those promoting social justice initiatives, were infiltrated by Russian trolls.
The crucial problem is, therefore, not what would happen if someone deepfakes an announcement by Trump or Putin to launch nuclear weapons. There are plenty of safeguards against such a threat, including the difficulty with which such a message could be posted from an authoritative source. Even if someone made a highly convincing deepfake of a nuclear strike authorization, uploading such a video to any Twitter account would immediately cast doubts on its authenticity.
In order to be convincing, the context would have to be right. The video would have to come from a more reliable and authentic source. During the 2016 US election, however, many accounts were set up to act in a trustworthy manner and build support before being used for manipulation. Even sources, given enough time, can be faked.
Another issue with such a high-profile deepfake is that media coverage would likely be quick to highlight the video as fake. Media outlets already make public service announcements to help people spot deepfakes, and there is a back-and-forth arms race between deepfakes and the algorithms that can detect them. The Partnership on AI, in collaboration with AWS, Facebook and Microsoft, recently established a deepfake detection challenge. As these programmes continue to develop ways of uncovering such technology, high-level fakes face a problem: any video that purported to show a head of state ordering a nuclear strike would quickly be debunked. It is the less obviously falsified material, however, that remains a challenge.
Deepfakes can also offer bad actors a way around both of these obstacles: media scrutiny and the need for a credible source. Ainikki Riikonen suggests in the article ‘Decide, Disrupt, Destroy’ that China was behind a series of fake profiles on LinkedIn whose faces had been made with a generative adversarial network (GAN), the same technology used in deepfakes. This allows whole networks to be set up, potentially of journalists, reporters and editors who do not exist but who can have rich lives online. Since an unlimited number of photos can be generated showing these personas socializing with each other, they can leverage that history to connect with others on social networks. The result is a synthetic social network of virtual professionals that can inject synthetic media into high-level discourse without detection. In 2019, the Associated Press reported how these bots were able to integrate successfully on LinkedIn with ‘senior US government officials and think tank experts’.
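The adversarial training behind a GAN can be sketched in miniature. The toy below is purely illustrative and bears no relation to the networks behind any real fake profiles: the ‘real data’ is a one-dimensional Gaussian standing in for photographs, the generator is an affine map of noise, and the discriminator is a logistic regression; trained against each other, the generator’s output drifts toward the real distribution.

```python
# Toy one-dimensional GAN. A hypothetical sketch: "real data" is a Gaussian
# around 4.0, the generator maps noise z to a*z + b, and the discriminator is
# a logistic regression. Each side takes one gradient step per round.
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def real_batch(n):
    return rng.normal(4.0, 1.0, size=n)   # stand-in for real photographs

a, b = 1.0, 0.0        # generator parameters: z -> a * z + b
w, c = 0.0, 0.0        # discriminator parameters: x -> sigmoid(w * x + c)
LR, STEPS, BATCH = 0.01, 5000, 64

for _ in range(STEPS):
    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    x_real = real_batch(BATCH)
    x_fake = a * rng.normal(size=BATCH) + b
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    w += LR * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += LR * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator step: ascend log D(fake), the "non-saturating" loss.
    z = rng.normal(size=BATCH)
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    grad_x = (1 - d_fake) * w          # d log D / dx at each fake sample
    a += LR * np.mean(grad_x * z)
    b += LR * np.mean(grad_x)

# After training, the generator's samples have drifted toward the real data.
samples = a * rng.normal(size=1000) + b
```

Real profile-photo generators use deep convolutional generators and discriminators trained on large face datasets, but the adversarial loop, a generator trying to fool a discriminator that is simultaneously learning to catch it, has this same structure.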
The weakest link in the chain is not at the very highest level. But if those working in the policy field, in security or at think tanks can be influenced by deepfakes, by being repeatedly exposed to them from a trusted source, then deepfakes can have a significant impact. The earlier and longer individuals are exposed, the more influence the deepfakes can have. This means that early-stage professionals and master’s or doctoral students would be prime targets for this kind of manipulation.
The problem lies in the smaller issues, like local elections. National elections fall under much higher levels of scrutiny, and content put out by campaigns, as well as content that could potentially be deepfaked, will be analysed. For example, a video that fractionally slowed down Nancy Pelosi to make her appear to be slurring her words was quickly revealed to have been manipulated. It is unlikely that in a smaller election, with significantly fewer resources and less attention, the manipulation would have been revealed.
When deepfakes were harder to make, it made sense to target high-profile individuals for several reasons. The more well known the target, the more context there is: a viewer has to have some understanding of the victim of the deepfake. Such targets would also have more video content of themselves available, making it easier to create a high-quality deepfake mask.
With smaller civic leaders and local politicians needing to promote themselves constantly on social media to build their platforms, less high-profile targets now also have far more video content of themselves online. This means there is ample footage for deepfake creation, and it emphasizes that local and civic elections are a potential weak point for democracies. If this weak point is not protected, there will likely be a proliferation of deepfakes. While each individual fake may well be detected by algorithms designed to counter their spread, together they represent a granular and attritional threat.
Unlike deepfakes of heads of state and other high-profile individuals, small-scale fakes are easier to distribute. If Putin were to declare nuclear war, such an announcement would be expected to come through an official Kremlin channel or state media. However, if a fake video circulated in small group chats or Facebook groups showed a UK ward councillor saying they oppose a local bypass or cycling lanes when they in fact support them, many social media users would find it believable. It is much easier for trolls to run fake accounts and engage in small-scale fakery like this. In India, a deepfake was used to appeal to voters in both English and Haryanvi in legislative assembly elections.
As deepfakes become easier to make and distribute at this level, they will create an ‘illusory truth effect’. This effect describes the phenomenon whereby a statement that is frequently repeated comes to be perceived as true. The effect would have little bearing on one-off, high-impact deepfakes. For frequent low-level deepfakes, however, it means that the repetition of a believable lie will make it true for those who repeatedly consume it, especially if there is no counternarrative to combat it.
The level of exposure to deepfakes will only increase as technologies such as 5G are rolled out, making video more common and accessible in all areas of life. As apps replace photo content with video content, footage of more individuals will become widely available.
This creates a vicious cycle: civic leaders create more video content of themselves, which makes it easier for deepfakes of them to be made, and to dispute a deepfake, they put out yet more videos of themselves.
Unfortunately, relying on citizens to stop the spread of localized deepfakes is unlikely to work. As citizens are exposed to more deepfakes, they become less likely to see sharing them as immoral or unethical. Writing in Psychological Science, Effron and Raj found:
Seeing a fake-news headline one or four times reduced how unethical participants thought it was to publish and share that headline when they saw it again – even when it was clearly labeled as false and participants disbelieved it, and even after we statistically accounted for judgments of how likeable and popular it was.
This study provides compelling evidence that any attempt to dissuade citizens from sharing deepfakes, or any other kind of fake news, will run into a serious problem. If citizens will still share a video even when they know it is fake, then governments will have to look beyond grassroots solutions to curb the spread of deepfakes online.
If they cannot be stopped, then these hyper-localized fakes present problems at the lowest and smallest levels of civic accountability. If such issues are not tackled at their core, however, they risk growing more serious: deepfakes could have a tangible impact on voting patterns. While they do not pose a threat in the form of large, one-off heists of public trust, they present an ongoing problem for community trust and local understanding, with constant small-scale attacks.
Social media platforms already use microtargeting, the practice of subdividing an audience into very small groups in order to serve them specific, targeted content. This practice produces small, niche groups with similar interests. As the groups get smaller and their interests overlap more, the content served is more likely to keep them on the platform for longer. The practice is completely legal, but it can be manipulated by bad actors: if they can create engaging content (i.e. deepfakes) for small groups, social media platforms will serve it to users of their own accord. This helps hyper-localized political deepfakes find and keep an engaged audience over long periods of time.
When considering solutions, it must be remembered that there are still plenty of viable commercial uses for video and audio manipulation software, and that any policies that stifle innovation will have a serious impact on the creative industries.
The Centre for Data Ethics and Innovation has made several recommendations to the UK Department for Digital, Culture, Media and Sport. The best policy solution for helping those who are most vulnerable is for ‘technology companies, particularly those running social media platforms, [to] include audio and visual manipulation within their anti-disinformation strategies’.
Other solutions, such as ensuring media outlets are more conscious of deepfakes, will only eradicate the risks from deepfakes for those who are prominent enough to be covered by the mainstream media. Even then, the standard of identification needs to be meticulously high as this kind of moderation ‘falls within the ethically grey area between censorship and freedom of expression’.
Media literacy is also not a particularly viable solution. With time, fakes will become indistinguishable to the human eye from real, unedited videos. There will then be two main ways to spot them.
One will be the delivery method – namely, whether users can trust the source of the video. Unfortunately, as videos are constantly reshared and reposted, it can require a lot of research to track down the origin of a video. This heightens the challenges facing moderators and individual users in identifying (and removing) falsified material.
The second way to spot a fake is context: users will have to judge how believable what they are seeing is. Much as with photoshopped images now, the more far-fetched the image, the harder it is to find credible, regardless of how well it has been faked. At the more local levels of government this will be difficult, as people often simply do not know what the positions of their local representatives are.
Ascott, Tom (2020), ‘Microfake: How small-scale deepfakes can undermine society’, Journal of Digital Media & Policy, 11:2, pp. 215–222, doi: https://doi.org/10.1386/jdmp_00018_1