Deep Fake Voice Detection And Extraction Using Deep review 1.pptx
Deep Fake Voice Detection And Extraction Using Deep Learning Presented by 1. 2. 3. Guided by,
Abstract The technique of artificially producing lifelike-sounding audio recordings of people saying or doing things that they never actually said or did is known as "deepfake audio," also referred to as "voice cloning." Deepfake audio can be used for a number of nefarious activities, including fraud, reputation damage, and the dissemination of false information. The process of gathering, examining, and evaluating digital evidence is known as digital investigation.
Abstract Digital investigators can identify and examine deepfake audio using a range of tools and methods. Deepfake voice technology makes it difficult to distinguish between authentic and fake audio. Using modern machine learning techniques and data generated by text-to-speech (TTS) systems, this work presents a novel method for detecting deepfake voices. The proposed approach improves detection accuracy by 26% over baseline techniques.
Literature Survey 1. Deepfake audio and CNN architecture. An image may be worth a thousand words, but images fall short in comparison to an audio or video clip of an event. Recorded audio and video allow people to become first-hand observers of an event, without having to rely on another person's account, which may be colored by that person's interpretation. Developments in technology, however, also bring the risk of misuse: with the rise of "deepfakes," manipulated audio or video can be highly convincing and extremely difficult to detect.
Literature Survey 2. Development of deepfakes. Deepfakes would not be as credible without GANs, which consist of two networks: a generator and a discriminator. The first network, the generator, tries to reproduce the distribution of the dataset it is fed and produces new, fake video or audio. The second network, the discriminator, is then fed both the original dataset and the newly generated deepfakes; its role is to determine which videos or audio clips in the data set (which now contains deepfakes) are legitimate.
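To make the generator/discriminator interplay concrete, below is a minimal PyTorch sketch of one GAN training step on audio feature vectors. The layer sizes, dimensions, and variable names are illustrative assumptions and are not taken from the surveyed work.

import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the surveyed paper)
latent_dim, feat_dim = 64, 128

# Generator: maps random noise to a fake audio feature vector
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, feat_dim), nn.Tanh(),
)

# Discriminator: scores a feature vector as real (1) or fake (0)
discriminator = nn.Sequential(
    nn.Linear(feat_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_batch):
    """One adversarial update: the discriminator learns to separate real
    from generated samples, then the generator learns to fool it."""
    b = real_batch.size(0)
    noise = torch.randn(b, latent_dim)
    fake_batch = generator(noise)

    # Discriminator update: real samples labeled 1, fake samples labeled 0
    opt_d.zero_grad()
    loss_d = bce(discriminator(real_batch), torch.ones(b, 1)) + \
             bce(discriminator(fake_batch.detach()), torch.zeros(b, 1))
    loss_d.backward()
    opt_d.step()

    # Generator update: try to make the discriminator output "real"
    opt_g.zero_grad()
    loss_g = bce(discriminator(fake_batch), torch.ones(b, 1))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()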
Literature Survey 3. Deep convolutional neural networks. The CNN architecture was inspired by the structure of the visual cortex in the human brain. A CNN is essentially a deep learning algorithm that takes images or spectrograms, assigns learnable weights to the features within them, and performs a given task such as image classification. In contrast to hard-coded, hand-crafted techniques, CNNs can be trained to learn the filters required for the task. Over the years, a variety of architectural developments have been made to address concerns about computational efficiency, error rate, and other changes in the domain.
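As a concrete illustration of a CNN classifying spectrograms as real or fake, here is a minimal PyTorch sketch; the 1x128x128 input size, layer widths, and class count are assumptions for illustration only, not the architecture from the surveyed literature.

import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    """Small CNN that classifies a 1 x 128 x 128 spectrogram as real or fake."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # -> 16 x 64 x 64
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # -> 32 x 32 x 32
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 32 * 32, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SpectrogramCNN()
dummy = torch.randn(4, 1, 128, 128)   # batch of 4 spectrograms
print(model(dummy).shape)             # torch.Size([4, 2])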
Existing System The authenticity of audio and video content has become a major concern with the introduction of deepfake technology. Although deepfake generation has significantly improved, deepfake detection accuracy remains a problem, with current detection methods reaching only about 82.56% accuracy. This gap in performance, together with the growing accessibility of data and computing power, has contributed to the spread of false media.
Drawbacks in Existing System Low accuracy; difficult to determine whether audio is genuine or fake.
Explanation of Proposed System This work addresses these issues by emphasizing the interpretability of detection approaches and concentrating on deepfake voice detection. Targeting non-experts in artificial intelligence and linguistics, explainable artificial intelligence (XAI) techniques are used to improve interpretability. To ensure interpretability, the study uses deliberately simple model architectures that combine convolutional neural networks (CNNs) and long short-term memory networks (LSTMs). The research uses audio datasets such as LJ Speech and ASVspoof 2019 Logical Access, which represent a limited dataset and a near-real-world dataset, respectively.
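A minimal sketch of how a simple CNN + LSTM detector of the kind described here could be wired up in PyTorch is shown below; the mel-feature dimensions, layer sizes, and class labels are assumptions, not the exact architecture used in this work.

import torch
import torch.nn as nn

class CNNLSTMDetector(nn.Module):
    """Sketch of a simple CNN + LSTM detector: a 1-D CNN extracts local
    patterns from a mel-spectrogram frame sequence, an LSTM models the
    temporal context, and a linear head outputs human vs. deepfake."""
    def __init__(self, n_mels=80, hidden=64, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_mels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, n_mels, time)
        z = self.cnn(x)              # (batch, 64, time/2)
        z = z.transpose(1, 2)        # (batch, time/2, 64) for the LSTM
        _, (h, _) = self.lstm(z)     # h: (1, batch, hidden)
        return self.head(h[-1])      # (batch, n_classes)

model = CNNLSTMDetector()
logits = model(torch.randn(8, 80, 200))   # 8 clips, 80 mel bins, 200 frames
print(logits.shape)                        # torch.Size([8, 2])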
Advantages of Proposed System For classification purposes, real human voices are labeled "human voice" and deepfake voices are labeled "deepfake." Popular XAI techniques, including layer-wise relevance propagation (LRP), integrated gradients, and Deep Taylor decomposition, are used to examine the trained models and achieve interpretability. The findings demonstrate that XAI techniques can shed light on the characteristics that set human and deepfake voices apart.
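Of the XAI techniques listed, integrated gradients is the most straightforward to sketch: attributions are obtained by averaging the gradients of the target class score along a path from a baseline input (e.g., silence) to the actual input. The following Python sketch is a generic approximation and does not reproduce the exact XAI setup of this work; the step count, baseline choice, and input shape are assumptions.

import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """Approximate integrated gradients: average the gradient of the
    target logit along a straight path from the baseline to x, then
    scale by (x - baseline)."""
    if baseline is None:
        baseline = torch.zeros_like(x)   # all-zero "silence" baseline
    total_grad = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        score = model(point)[0, target]
        grad, = torch.autograd.grad(score, point)
        total_grad += grad
    return (x - baseline) * total_grad / steps

# Usage with the spectrogram CNN sketched earlier (assumed available):
# the attributions highlight which time-frequency regions drove the
# "deepfake" decision for a single input clip.
# attr = integrated_gradients(model, torch.randn(1, 1, 128, 128), target=1)
# print(attr.shape)   # torch.Size([1, 1, 128, 128])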