PDGAN
This project presents the first coherent-signal demodulation method based on generative adversarial networks (GANs), called the phase demodulation generative adversarial network (PDGAN). We apply the GAN method to Doppler-signal demodulation for laser voice detection. Demodulation of the Doppler signal from the coherent signal is accomplished through layer-by-layer unsupervised learning within the PDGAN, with global supervised feedback learning for fine-tuning.

Drawing on the adversarial principle of GANs, the coherent signal Z serves as the input to the generator G, while the generated demodulated Doppler signal G(Z) and the clean Doppler signal X serve as the inputs to the discriminator D. Through alternating training and optimization, G learns the mapping from the coherent signal Z to the Doppler signal X, thereby demodulating the coherent signal.

This project mainly includes the Doppler-signal datasets and the corresponding coherent-signal datasets. The structure and a detailed description of the network are planned for publication in the journal Optical Engineering; the author can also be contacted at wangyahui@aoe.ac.cn. We provide the programs used to prepare the dataset, which generate the data according to the principle of interferometry (these programs are written in MATLAB). The PDGAN network is written in Python, and we will publish it after further collation.
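The interferometry principle mentioned above can be sketched as follows: a phase signal derived from speech modulates a carrier, producing the coherent (interference) signal. This is a minimal NumPy illustration, not the repository's actual MATLAB code; the amplitudes `A`, `B` and the phase scaling are illustrative assumptions, while the 200 kHz sampling rate and 50 kHz carrier follow the description in this README.

```python
import numpy as np

fs = 200_000   # sampling rate, Hz (the LDV's 200 kHz A/D rate)
fc = 50_000    # carrier frequency, Hz (coherent signal concentrated at 50 kHz)
t = np.arange(0, 0.01, 1 / fs)   # 10 ms of signal

# Toy "Doppler" phase signal standing in for a speech-derived waveform in [-1, 1]
doppler = 0.5 * np.sin(2 * np.pi * 300 * t)

# Interference model (assumed form): I(t) = A + B * cos(2*pi*fc*t + phi(t)),
# where phi(t) is proportional to the Doppler signal
A, B = 1.0, 1.0
coherent = A + B * np.cos(2 * np.pi * fc * t + np.pi * doppler)
```

The pair (`coherent`, `doppler`) corresponds to one training pair (Z, X) in the PDGAN setup.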
The original experimental data are available at http://www.openslr.org/12/. The clean speech comes from the "train-clean-100.tar.gz" archive.
Because the dataset is large, we upload it in batches; the first upload was on February 4, 2021. The dataset contains the three folders needed to train PDGAN:

- clean: clean Doppler signals and the corresponding coherent signals
- babble: Doppler signals with babble noise and the corresponding coherent signals
- white: Doppler signals with white noise and the corresponding coherent signals
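Each folder pairs coherent signals with their Doppler counterparts. A hypothetical sketch of collecting such pairs with `pathlib` is shown below; the `coherent_*/doppler_*` file-naming scheme is an assumption for illustration (the uploaded dataset's actual names may differ), and the dummy directory is created only so the snippet is self-contained.

```python
import tempfile
from pathlib import Path

# Create a dummy "clean" folder with a few placeholder file pairs
root = Path(tempfile.mkdtemp()) / "clean"
root.mkdir(parents=True)
for i in range(3):
    (root / f"coherent_{i:04d}.wav").touch()
    (root / f"doppler_{i:04d}.wav").touch()

# Pair each coherent file with its Doppler counterpart by name
pairs = [
    (c, c.with_name(c.name.replace("coherent", "doppler")))
    for c in sorted(root.glob("coherent_*.wav"))
]
```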
In our experiments, we used LibriSpeech as the voice library for laser voice detection. White noise and babble noise from the NOISEX-92 dataset were used to emulate outside perturbations. Since the purpose of the trained PDGAN model is to demodulate the enhanced Doppler signal from a noisy coherent signal, we separately computed the clean Doppler signals from the clean voices and the coherent signals from the noisy voices, according to the principle of interferometry. Because the A/D conversion frequency of the LDV is 200 kHz, we resampled the Doppler-signal and interference-signal datasets to obtain the final dataset. In the end, we obtained paired datasets of noisy coherent signals and corresponding clean Doppler signals for training PDGAN. We chose the LibriSpeech voice library and the NOISEX-92 noise dataset because they are fully open source.
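The resampling step to the LDV's 200 kHz rate can be sketched with plain NumPy linear interpolation; the 16 kHz input rate matches LibriSpeech's native sampling rate, but production code would likely use a proper polyphase or band-limited resampler rather than `np.interp`.

```python
import numpy as np

fs_in, fs_out = 16_000, 200_000          # LibriSpeech rate -> LDV A/D rate
x = np.sin(2 * np.pi * 440 * np.arange(fs_in) / fs_in)   # 1 s, 440 Hz test tone

# Resample by evaluating the input on the output time grid
t_in = np.arange(len(x)) / fs_in
t_out = np.arange(int(len(x) * fs_out / fs_in)) / fs_out
y = np.interp(t_out, t_in, x)
```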
Because the Doppler-signal dataset is generated by direct conversion of speech signals, its statistical characteristics are consistent with those of the LibriSpeech corpus. LibriSpeech is derived from audiobooks in the LibriVox project and contains 1000 hours of English speech from 20 male and 20 female speakers. We randomly chose 10,000 samples from it. The generated Doppler signals behave like the speech signals: their distribution follows a Gaussian mixture model, their range is [-1, 1], and their mean is 0. After adding white and babble noise, we obtain noisy speech and then generate the interference signals. Owing to the carrier modulation, the frequency of the coherent signal is concentrated at 50 kHz. In summary, we prepared coherent-signal and corresponding clean-Doppler-signal datasets at SNRs of {10 dB, 0 dB, -10 dB, -20 dB}; each dataset contains 10,000 samples, with each sample having a random length of 5-10 s. The dataset is split into 70% for training and 30% for testing, a common ratio when the dataset capacity is not too large.
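Mixing noise at a target SNR and the 70/30 split can be sketched as follows. This is a standard SNR-scaling formulation and a random index split, assumed for illustration; the repository's MATLAB data-prep scripts may implement these steps differently.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise_at_snr(clean, noise, snr_db):
    """Scale `noise` so that mixing it with `clean` yields the requested SNR in dB."""
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

# Toy clean signal and white noise, mixed at 0 dB SNR
clean = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 8000))
noise = rng.standard_normal(8000)
noisy = add_noise_at_snr(clean, noise, snr_db=0)

# 70/30 train/test split over the 10,000 sample indices
idx = rng.permutation(10_000)
train_idx, test_idx = idx[:7_000], idx[7_000:]
```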
