
Deepfake-as-a-Service and Whaling: Fraudsters' New Weapon Against Companies


[Image: A businessman in a modern office on a video conference; the three colleagues on his monitor appear attentive and real, but are in fact deceptively realistic deepfakes.]

The new era of digital deception

In early 2024, a finance employee of a multinational corporation in Hong Kong transferred $25 million to fraudsters. The reason? He participated in a video conference in which he believed he was speaking with the chief financial officer (CFO) and other colleagues. In reality, all the other participants were deceptively realistic digital forgeries, so-called deepfakes, generated by artificial intelligence (AI). This was not an isolated case but an alarming symptom of a growing trend: the deliberate use of AI, especially generative AI, for sophisticated deception. The technology is advancing rapidly, and distinguishing real from fake is becoming increasingly difficult.


Two particularly worrying developments in this area are "deepfake-as-a-service" (DFaaS) and "deepfake whaling." DFaaS describes the commercialization of deepfake creation, making the technology accessible to less sophisticated criminals. Deepfake whaling, on the other hand, is the targeted use of this technology to impersonate high-ranking individuals, so-called "whales," escalating attacks such as CEO fraud to a new level.


This article analyzes the technology behind deepfakes, examines the mechanisms and dangers of DFaaS and deepfake whaling, assesses the risks for companies, and outlines effective defense strategies. Finally, it presents how specialized solutions like VAARHAFT can help restore trust in digital content and protect companies from AI-powered fraud.


What are deepfakes? A look under the hood of AI deception

Deepfakes are synthetic media content—be it videos, audio files, or images—that are created or manipulated using deep learning, a sub-discipline of artificial intelligence, to appear deceptively real. The word "deepfake" itself is a combination of "deep learning" and "fake." Their convincing nature often makes them extremely difficult to distinguish from authentic content, even for trained eyes and ears.

The technological foundation for modern deepfakes was laid in 2014 by Ian Goodfellow and colleagues with the introduction of Generative Adversarial Networks (GANs). Since then, the quality of deepfakes has improved exponentially—from early, often flawed experiments to the hyperrealistic fakes possible today. This development has gone hand in hand with increasing accessibility of the technology beyond specialized research labs.


At the heart of most deepfake creation systems are the aforementioned GANs. A GAN can be thought of as a competition between two neural networks:


  • The generator attempts to create new data (e.g., images) that are as similar to the real data as possible. It often starts with random noise and learns to form plausible content from it.

  • The discriminator tries to detect whether the data presented to it is real (from the original training data set) or generated by the generator (fake).


These two networks train each other in an adversarial process: The generator gets better at fooling the discriminator, while the discriminator gets better at detecting fakes. This iterative race results in the generator producing increasingly realistic fakes. This process requires large amounts of training data (images, videos, audio recordings of the target person or object) and significant computing power, often using specialized hardware such as GPUs.
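The following minimal sketch makes this adversarial loop concrete. It is an illustration only, assuming PyTorch (no framework is named in this article) and two-dimensional toy data instead of images:

```python
import torch
import torch.nn as nn

# Generator: turns random noise into fake samples (here: 2-D points, not images).
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
# Discriminator: scores samples as real (1) or fake (0).
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 2.0   # stand-in for "real" training data
    fake = G(torch.randn(64, 16))           # generator output from random noise

    # 1) Train the discriminator to separate real from fake.
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))  # generator wants fakes scored "real"
    loss_g.backward()
    opt_g.step()
```

With real deepfakes, the same race plays out with convolutional networks and face images; the principle is identical.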


In addition to GANs, other AI technologies are also used in deepfake creation, including autoencoders (especially for face swapping), convolutional neural networks (CNNs) for image analysis, recurrent neural networks (RNNs) for sequences such as speech or lip movements, and natural language processing (NLP) for analyzing and synthesizing speech for audio deepfakes.
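For the face-swapping autoencoders mentioned above, the typical construction is a shared encoder with one decoder per identity: the encoder learns pose and expression features, each decoder learns to render one specific face, and the swap happens by decoding person A's features with person B's decoder. A deliberately simplified sketch, again assuming PyTorch and random stand-in tensors instead of real face crops:

```python
import torch
import torch.nn as nn

# One shared encoder, one decoder per identity (64x64 grayscale, flattened).
encoder   = nn.Sequential(nn.Linear(64 * 64, 256), nn.ReLU())
decoder_a = nn.Sequential(nn.Linear(256, 64 * 64), nn.Sigmoid())  # renders person A
decoder_b = nn.Sequential(nn.Linear(256, 64 * 64), nn.Sigmoid())  # renders person B

# Training objective: each decoder reconstructs its own person's faces.
faces_a = torch.rand(8, 64 * 64)   # placeholder batch for person A
faces_b = torch.rand(8, 64 * 64)   # placeholder batch for person B
mse = nn.MSELoss()
loss = mse(decoder_a(encoder(faces_a)), faces_a) + mse(decoder_b(encoder(faces_b)), faces_b)

# The swap: encode a face of A, decode it with B's decoder. B's appearance
# is rendered with A's pose and expression.
swapped = decoder_b(encoder(faces_a))
```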


Deepfakes can be roughly divided into different categories:


  • Video deepfakes: These include techniques such as face swapping, face reenactment (manipulating a person's facial expressions, head and lip movements based on the movements of another person) and lip syncing (adapting lip movements to a new audio track).


  • Audio deepfakes: This includes voice cloning/conversion, allowing an AI to imitate the voice of a target person, as well as text-to-speech (TTS) synthesis, where arbitrary text is rendered in that person's voice.


  • Image deepfakes: AI can not only manipulate existing images, but also generate deceptively real images of people or objects that never existed.


The rapid development and increasing availability of tools for deepfake creation have led to a kind of "democratization of deception." While the creation of GANs was originally an academic feat, some modern techniques, particularly for certain types of face swapping or voice cloning, require only a few minutes of video footage or even single images as a starting point. This means that the ability to create convincing fakes is no longer limited to state actors or highly specialized research institutions. It is increasingly accessible to criminals with less technical skills. This development significantly lowers the barrier to entry for sophisticated fraud and manipulation, leading to an increase in the volume and variety of threats facing organizations. The deepfake problem thus scales far beyond niche applications.


Deepfake-as-a-Service (DFaaS): Criminal innovation on the Darknet

The commercialization of deepfake technology has led to the emergence of "deepfake-as-a-service" (DFaaS). Here, specialized actors offer the creation of deepfakes as a paid service, often through marketplaces on the dark web or in closed forums. This model makes sophisticated fakes accessible to a wider range of criminals who lack the technical know-how or resources to create them themselves.

The process is typically simple: a client provides the source material (images, audio, or video clips of the target) and specifies the desired output. The DFaaS provider then uses its expertise and specialized software to generate the deepfake. In some cases, user-friendly interfaces such as Telegram bots further simplify access to these services.

The offerings on these platforms are diverse. They range from simple face swaps and the creation of fake pornographic content to high-quality, customized productions for specific fraud scenarios. These include, for example, the creation of fake crypto promotional videos with celebrity impersonations or the targeted impersonation of high-profile personalities for social engineering attacks.

Prices for DFaaS vary widely depending on the quality, complexity, and prominence of the person being impersonated. They can range from a few hundred dollars to $20,000 or more per minute of video footage. A particularly high-profile example was the offer to create a high-quality deepfake of Ethereum co-founder Vitalik Buterin, including a synthesized voice, for $20,000 per minute, with the provider promising complete production "according to the customer's imagination."


The existence of DFaaS has far-reaching implications for the threat landscape:


  • Lower barriers to entry: Criminals no longer need in-depth AI knowledge to use sophisticated deepfakes.

  • Increased scalability: DFaaS enables attackers to launch deepfake-based campaigns on a larger scale. Criminal networks can share methods and tools and potentially automate attacks.

  • Higher quality and sophistication: Even less experienced actors can draw on the expertise of specialists and thus deploy more convincing, harder-to-detect fakes.


The development of DFaaS points to the emergence of a specialized criminal ecosystem. The existence of dedicated providers, tiered pricing models, and specific tools such as bots demonstrates that this is no longer a hobbyist code-sharing activity but a growing business area within cybercrime. iProov aptly calls this "Deepfake Crime-as-a-Service." This specialization likely accelerates innovation in the development and application of malicious deepfakes and complicates defense. It also suggests a criminal supply chain in which various actors specialize in data theft, deepfake creation, and the execution of the actual attacks.


Deepfake Whaling: When the CEO becomes a digital doppelganger

One of the most dangerous uses of deepfake technology in a corporate context is so-called "deepfake whaling." This is a highly sophisticated form of spear phishing in which deepfakes are used to impersonate high-ranking executives ("whales") such as CEOs, CFOs, or other board members. Deepfake whaling represents a significant evolution of traditional CEO fraud (also known as business email compromise, BEC) because it goes beyond mere text impersonation.


The attackers use deceptively real audio or video deepfakes, often during phone calls or video conferences, to manipulate employees. This method bypasses traditional verification steps that might be based on voice or facial recognition. The core of the attack is social engineering: The deepfake lends strong authenticity to the deception and exploits existing trust relationships to persuade employees to take actions such as making urgent money transfers or disclosing sensitive data. To create the deepfakes, attackers first collect audio and video footage of the target person, often from public sources such as company websites, news articles, conference appearances, or social media, or captured through previous cyberattacks.


The primary goal of deepfake whaling is often direct financial fraud through the initiation of unauthorized transfers. However, it can also be used for industrial espionage to gain access to confidential information or systems, or to damage the reputation of a company or executive.


Several sensational cases illustrate the danger:

  • The voice of the CEO of a British energy company was cloned to convince a manager to transfer $243,000.


  • The aforementioned case in Hong Kong, in which an employee was tricked into transferring $25 million through a deepfake video conference with the supposed CFO and colleagues.


  • A Ferrari executive received phone calls from a deepfake of the CEO that attempted to arrange a large transfer in connection with a supposed takeover. The scam failed when the executive became suspicious and asked a personal question.


  • The Chief Communications Officer (CCO) of the crypto exchange Binance was impersonated via deepfake in Zoom calls to deceive representatives of other crypto projects.


  • Even high-ranking central bankers such as the President of the European Central Bank (ECB) and the Chairman of the US Federal Reserve (Fed) have been targeted by deepfake calls in which the callers pretended to be the Ukrainian President.


The following table illustrates the difference and escalation compared to traditional CEO fraud:


Table 1: Traditional CEO Fraud vs. Deepfake Whaling

| Feature | Traditional CEO fraud | Deepfake whaling |
| --- | --- | --- |
| Primary channel | Email | Email, phone call, video conference |
| Deception method | Text-based imitation (fake email) | Text, voice, and/or video imitation (deepfakes) |
| Challenge for victims | Verifying email legitimacy | Verifying audiovisual authenticity |
| Required attacker skills | Low to medium | Medium to high (or use of DFaaS) |
| Possible effects | Financial losses | Financial losses, espionage, reputational damage |
| Typical defense | Email security, process audits | Advanced detection, multi-channel verification, training |

This comparison makes it clear why deepfake whaling poses such a serious threat. It not only attacks email systems but also compromises communication channels that were previously considered more trustworthy—telephone and video conferencing.

The effectiveness of deepfake whaling relies on exploiting a fundamental aspect of human and business communication: trust in auditory and visual cues. We're conditioned to trust a familiar voice on the phone or a familiar face on a video call. Deepfake whaling targets precisely this "trust gap" by convincingly falsifying these signals. Standard verifications, such as checking a sender's email address, are no longer sufficient when the core of the communication medium itself—the voice or image—is manipulated.

The $25 million fraud in Hong Kong is a dramatic example of how the seemingly authentic presence of "peers" can overcome even initial skepticism. This forces organizations to fundamentally rethink their verification processes: they must move away from sole reliance on single channels or easily forged identifiers. Multi-channel verification via an independent, pre-agreed communication channel ("out-of-band") becomes essential, even for high-risk requests that appear routine.


The threat situation: Concrete risks from deepfakes for companies

The increasing prevalence and sophistication of deepfakes, driven by DFaaS and targeted attacks such as whaling, creates a complex threat landscape with diverse risks for organizations:


  • Direct financial losses: The most obvious risk is the theft of funds through fraudulent transfers, as the whaling examples clearly demonstrate. Added to this are the costs of incident response, system recovery, and potential litigation, for example over employee liability. Other forms of fraud are also amplified by deepfakes, including cryptocurrency fraud, insurance fraud through fake claim images, and the submission of fake expense reports. There is also a risk of market manipulation through targeted disinformation.


  • Reputational damage: Deepfakes can be used to spread misinformation about a company or its executives, which can cause significant damage. These include fake public statements, compromising videos, or fake product advertisements featuring celebrity deepfakes. Such campaigns can permanently undermine a brand's credibility.


  • Erosion of trust: The ubiquity of deepfakes threatens fundamental trust in digital communications, both internally and externally. Successful impersonations of executives can weaken trust in company leadership. If customers fall victim to deepfake-based scams, customer trust suffers. In general, the technology can undermine trust in media and evidence.


  • Security breaches: Deepfakes can be used to bypass security systems, including biometric authentication methods such as voice or facial recognition. They serve as a powerful tool for social engineering attacks to obtain credentials or trick employees into disclosing sensitive information. Creating fake identities for job applications to gain entry into the company as an insider threat is also a realistic scenario.


Particularly vulnerable sectors are:

  • Financial sector: Due to the direct financial incentives for attackers and the heavy reliance on identity verification (KYC - Know Your Customer).

  • Insurance industry: Vulnerable to fraud through falsified claims and documents.

  • Media and politics: Main targets for disinformation campaigns and reputational damage.

  • Human Resources (HR): Risk from fake applicant profiles and documents.


Statistics underline the growing threat:

  • A study by Signicat shows a 2,137% increase in deepfake fraud in the financial sector over three years.

  • The KPMG Cybersecurity Study 2024 reports a 119% increase in deepfake attacks in Austria over the previous year, with similar trends in Germany and Switzerland.

  • FinCEN is observing an increase in suspicious activity related to deepfakes.


The nature of deepfake risk is inherently asymmetric. While creating truly high-quality, undetectable deepfakes still requires significant resources or access to specialized DFaaS providers, the damage caused by a single, even moderately convincing deepfake in a targeted whaling attack can be enormous—as the $25 million case demonstrates. At the same time, the costs of defense—investments in technology, employee training, and process adjustments—are substantial and ongoing for all potential targets. The attacker, on the other hand, only needs to succeed once. The relative ease of generating any type of deepfake compared to the difficulty of universal and reliable detection creates an imbalance that favors attackers, especially in targeted attacks. Therefore, organizations cannot afford to ignore this threat, even if they don't see themselves as a target for the most sophisticated forgeries. The potential damage from a single successful attack requires proactive, multi-layered defenses, representing a significant and growing cost to cybersecurity.


Shield against counterfeiting: Defensive strategies for companies

Given the complexity and dynamic nature of the deepfake threat, there is no single, one-size-fits-all solution. Effective protection requires a multi-layered defense strategy that combines technology, processes, and human factors. Since no single method provides 100% protection, an integrated approach is essential.


A. Organizational measures (human firewall & processes):

Humans are often the weakest link, but also the first line of defense.


  • Raise awareness and train: Regular, mandatory training is crucial. All employees, especially those in finance, management, and customer service, must be educated about deepfake technologies, social engineering tactics (especially whaling), and identifying features. Realistic examples and simulations increase effectiveness.

  • Implement verification processes: For sensitive requests, especially those involving financial transactions or access to critical data, strict, mandatory multi-channel verification procedures must be implemented. Confirmation must be obtained through a known, trusted, and independent channel (e.g., a direct callback to a known phone number), never through channels named in the suspicious request itself. The dual-control principle should be applied consistently for critical actions (a minimal sketch of such a verification gate follows this list).

  • Clear policies and communication channels: Companies should define clear policies that prohibit instructions for high-risk actions (e.g., large transfers) via individual channels such as email, telephone, or video conferencing alone. Clear escalation paths and procedures for handling suspicious inquiries must be established. Furthermore, the publication of internal information about employees and organizational structures should be limited to what is absolutely necessary.

  • Incident Response Plan: A specific emergency plan for dealing with suspected or successful deepfake attacks is necessary. This should include internal reporting channels, investigation processes, and external communication strategies. Legal and communications departments should be involved early on.

  • Foster a culture of skepticism: Employees must be encouraged to critically question unusual requests, even if they appear to be from superiors. Gut instincts ("listen to your gut") should be taken seriously, and reporting of suspicious cases should be encouraged.
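As flagged in the verification item above, here is a minimal sketch of how such a gate could be expressed in code. All names and thresholds (`KNOWN_CONTACTS`, `HIGH_RISK_THRESHOLD`) are illustrative assumptions, not a reference implementation of any specific policy:

```python
from dataclasses import dataclass
from typing import Optional

# Pre-vetted contact directory: callback numbers come from here,
# never from the suspicious request itself.
KNOWN_CONTACTS = {"cfo": "+1-555-0100"}
HIGH_RISK_THRESHOLD = 10_000  # illustrative: transfers above this are high-risk

@dataclass
class TransferRequest:
    requester_role: str
    amount: float
    callback_confirmed: bool        # True only after a callback via the directory
    second_approver: Optional[str]  # dual control: independent second approver

def may_execute(req: TransferRequest) -> bool:
    """Allow a transfer only if out-of-band callback and dual control passed."""
    if req.amount < HIGH_RISK_THRESHOLD:
        return True
    if req.requester_role not in KNOWN_CONTACTS:
        return False  # no trusted channel on file: escalate, do not execute
    if not req.callback_confirmed:
        return False  # out-of-band confirmation is mandatory
    return req.second_approver is not None

# A "CFO" video call demanding an urgent transfer is blocked until someone
# calls back on the known number AND a second approver signs off.
urgent = TransferRequest("cfo", 250_000.0, callback_confirmed=False, second_approver=None)
assert may_execute(urgent) is False
```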


B. Technical measures:

Technology can help detect counterfeits and make attacks more difficult.


  • AI-based detection tools: Use of software solutions that analyze audio, video, or image files for signs of manipulation or synthetic creation. VAARHAFT currently offers the Fraud Scanner for analyzing images and documents and is already working on additional technologies.

  • Liveness detection: Particularly relevant for biometric verification processes (e.g., account opening or authentication), this technology checks whether a real person is actually present or whether it is facing an attempt at deception (e.g., a photo, a video replay, a mask, or potentially a digitally injected deepfake).

  • Strengthen authentication methods:

    • Multi-factor authentication (MFA): Phishing-resistant MFA (e.g., FIDO2-based passkeys) in particular is crucial to prevent account takeovers, which are often part of or preparation for whaling attacks.

    • Biometric authentication: Can be part of MFA, but requires robust liveness detection to defend against deepfake attacks. Voice biometrics is considered particularly vulnerable.

    • Behavioral biometrics: Analysis of user behavior patterns (typing speed, mouse movements) as an additional security layer.

  • Digital watermarking / fingerprinting: Embedding invisible markings in media files to prove their authenticity or detect tampering (see the toy sketch after this list).

  • Secure communication channels: Use of end-to-end encryption for sensitive communications. However, it should be noted that encryption can make real-time analysis by detection systems more difficult.

  • Blockchain technology: Offers potential for the tamper-proof anchoring of media and metadata for authenticity verification.
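To make the watermarking idea from this list tangible, here is a toy least-significant-bit scheme in Python (NumPy assumed). Production watermarking and fingerprinting are far more robust against compression and editing; this only illustrates the embed-and-verify principle:

```python
import numpy as np

def embed(pixels: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Hide watermark bits in the lowest bit of the first len(bits) pixels."""
    out = pixels.copy().ravel()
    out[: bits.size] = (out[: bits.size] & 0xFE) | bits
    return out.reshape(pixels.shape)

def extract(pixels: np.ndarray, n: int) -> np.ndarray:
    """Read the first n watermark bits back out of the image."""
    return pixels.ravel()[:n] & 1

image = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in image
mark  = np.random.randint(0, 2, 128, dtype=np.uint8)         # 128-bit watermark
assert np.array_equal(extract(embed(image, mark), 128), mark)  # mark survives intact
```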


The goal is not perfect automated detection, but rather the use of technology to augment human capabilities and established processes to intercept even sophisticated attacks. Companies should therefore invest equally in technology and in strengthening the human factor. Overreliance on just one area is risky; a holistic, integrated approach is essential.


VAARHAFT: Creating trust through image authentication

In the complex landscape of deepfake defense, VAARHAFT positions itself as a reliable and specialized provider with focused solutions. The goal is to restore trust in digital media within critical business processes.


VAARHAFT's core technology is based on AI algorithms that analyze digital images and documents to detect manipulation or complete synthetic creation (deepfakes).


The Fraud Scanner solution offers the following features, among others:

  • AI-generated image & document recognition: Reliably identifies images or documents generated entirely by AI.

  • Detection of (AI-)edited images & documents: Detects images and documents that have been edited using AI or other software.

  • Marking of edited areas in the image & document: Locates and marks the manipulated areas within an image or document for full transparency.

  • Privacy-compliant reverse internet search: Performs an anonymized search on the internet to determine if the image already exists online.

  • Duplicate check (internal/external): Checks whether identical or similar images have already been used within your own company or by other companies (e.g. across insurance companies).

  • Metadata analysis: Performs basic analysis of file metadata to gain additional insights and clues to inconsistencies (illustrated in general terms below).
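As a general illustration of what such a metadata check can involve (a generic sketch using the Pillow library, not VAARHAFT's actual implementation), EXIF tags can be read out and suspicious gaps flagged:

```python
from PIL import Image
from PIL.ExifTags import TAGS

def exif_report(path: str) -> dict:
    """Collect readable EXIF tags; missing or contradictory tags are clues."""
    exif = Image.open(path).getexif()
    return {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

report = exif_report("claim_photo.jpg")  # placeholder file name
# Example heuristics: editing software recorded, or no camera model at all,
# can both warrant a closer look.
if "Software" in report:
    print("Created/edited with:", report["Software"])
if "Model" not in report:
    print("No camera model recorded - possibly synthetic or stripped metadata.")
```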

The analysis is quick (“in seconds”) and provides a detailed credibility assessment. The functionality can be seamlessly integrated into existing business processes via an API and is GDPR compliant.
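An API integration could then look roughly like the following. The endpoint, parameter names, and response fields are purely hypothetical placeholders for illustration, not VAARHAFT's documented interface:

```python
import requests

API_URL = "https://api.example.com/v1/scan"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"                     # placeholder credential

# Upload an image for analysis; field names below are assumptions.
with open("claim_photo.jpg", "rb") as f:
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"image": f},
        timeout=30,
    )
resp.raise_for_status()
result = resp.json()

# A response might carry a credibility score plus flagged regions; low-score
# files can then be routed automatically to manual review.
if result.get("credibility_score", 1.0) < 0.5:
    print("Flag for manual review:", result.get("flags"))
```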


Direct use cases: VAARHAFT directly targets fraud scenarios based on manipulated or forged images and documents. Specific use cases include:

  • Insurance fraud: Detection of fake or doctored photos of damages during claims settlement.

  • Fake online profiles: Detection of fake profile pictures on online platforms, e.g. in the dating sector.

  • E-commerce fraud: Identification of manipulated images, for example in fraudulent returns (covered by general image-manipulation detection).

  • Invoice and receipt fraud: Detection of fake or AI-generated invoices and receipts, relevant for HR departments (e.g. expense reports) and government departments.


The threat posed by deepfakes is multimodal – it encompasses video, audio, images, and text. Fake videos and voices play a central role, particularly in deepfake whaling and deepfake-as-a-service scenarios, supported by manipulated documents or images.

This is precisely where VAARHAFT comes in: Currently specializing in the reliable detection of fake images and forged documents, our Fraud Scanner already effectively protects companies from image-based fraud. But it doesn't stop there – we are continuously working on expanding our technology to include video and audio authentication in order to also protect against more complex attack scenarios such as deepfake whaling. Our goal is to provide comprehensive security for companies and, in doing so, raise awareness of the growing dangers posed by deepfake-as-a-service. Only a multimodal approach can comprehensively defend against these modern threats. VAARHAFT sees itself as an important component within a defense strategy that reliably protects images, documents, and soon also video and audio content.


Conclusion: Vigilance in the age of synthetic reality

The threat posed by deepfake technology is real and growing. Its easy accessibility via deepfake-as-a-service (DFaaS) and its targeted use in sophisticated attacks such as deepfake whaling pose significant challenges for companies. The potential damage ranges from massive financial losses and reputational damage to the fundamental erosion of trust in digital communications.

Effective defense requires a proactive and holistic approach. It is not enough to rely on isolated technological solutions or on employee vigilance alone. Rather, a combination of robust technical tools, clearly defined and strictly followed organizational processes, and continuous employee awareness and training is essential. Given the constant arms race between attackers and defenders, these measures must be regularly reviewed and adapted to the evolving threat landscape.

Specialized solutions like those from VAARHAFT play an important role in this defense strategy. By focusing on detecting tampering and forgery in digital images and documents, they address a critical attack vector relevant to many fraud scenarios – from claims settlement and expense reports to elements of identity fraud. They help secure automated processes and restore trust in visual information.

Companies are urged to assess their specific risks associated with deepfakes, critically review their existing defenses, and consider implementing advanced detection solutions. Only through a combination of technological innovation, procedural rigor, and human vigilance can resilience against AI-enabled fraud be strengthened and trust maintained in the digital age.

