👩‍🏫➡👩‍🎓
Presented by Christabella Irwanto
Source domain | Target domain | How to simulate? |
---|---|---|
Clean speech | Noisy speech | Add noise |
Close-talk speech | Far-field speech | Apply RIR, add noise |
Adults | Children | Voice morphing |
Original speech | Compressed speech | Apply codec |
Wideband speech | Narrowband speech | Downsample/filter |
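
As a rough illustration of the "Add noise" row, the sketch below mixes a noise signal into clean speech at a chosen signal-to-noise ratio, which yields a parallel clean/noisy pair. The function name `add_noise_at_snr`, the synthetic sine "speech", and the Gaussian noise are all assumptions made for this example; they are not taken from the cited papers.

```python
import math
import random


def add_noise_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the given SNR in dB.

    `clean` and `noise` are equal-length lists of float samples (hypothetical
    inputs; a real pipeline would read audio files instead).
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # Choose scale so that clean_power / (scale**2 * noise_power) == 10**(snr_db / 10).
    scale = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(clean, noise)]


# Toy parallel pair: one second of a 440 Hz tone at 16 kHz plus white noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = add_noise_at_snr(clean, noise, snr_db=10.0)
```

The pair `(clean, noisy)` stays sample-aligned, which is what lets the teacher see the source signal while the student sees the simulated target signal.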
Domain adaptation of a (teacher) acoustic model, well trained on transcribed source-domain data, to a target domain li17_large
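
A minimal sketch of the T/S objective, assuming the usual formulation in which the teacher sees the source-domain frame and the student sees the parallel target-domain frame, and the student minimizes the cross-entropy against the teacher's posteriors (equal to the KL divergence up to a constant that does not depend on the student). The helper names `softmax` and `ts_loss` and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ts_loss(teacher_logits, student_logits):
    """Frame-averaged -sum_i P_T(i | x_src) * log P_S(i | x_tgt).

    teacher_logits: per-frame teacher outputs on source-domain frames.
    student_logits: per-frame student outputs on the parallel target frames.
    No transcription is used; the teacher posteriors are the only target.
    """
    loss = 0.0
    for t_frame, s_frame in zip(teacher_logits, student_logits):
        p_teacher = softmax(t_frame)
        p_student = softmax(s_frame)
        loss -= sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
    return loss / len(teacher_logits)


# Toy example with one frame and three output classes.
teacher = [[2.0, 0.5, -1.0]]
student_close = [[2.0, 0.5, -1.0]]   # student matches the teacher
student_far = [[-1.0, 0.5, 2.0]]     # student disagrees with the teacher
loss_close = ts_loss(teacher, student_close)
loss_far = ts_loss(teacher, student_far)
```

The loss is smallest when the student's posteriors match the teacher's, so `loss_close` is lower than `loss_far`.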
In conditional T/S learning, the student network uses only the teacher's soft posteriors as the training target when the teacher's prediction is correct, and uses the hard (ground-truth) label instead when the teacher is wrong meng19_condit.
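
The selection rule can be sketched per frame as follows. This is an illustrative reading of the sentence above, not the paper's implementation: "the teacher is correct" is taken to mean that the teacher's most probable class equals the ground-truth label, and the function name `conditional_ts_loss` and the toy posteriors are hypothetical. Posteriors are assumed to be strictly positive.

```python
import math


def conditional_ts_loss(teacher_post, student_post, labels):
    """Frame-averaged conditional T/S loss.

    teacher_post, student_post: per-frame posterior lists (strictly positive).
    labels: per-frame ground-truth class indices.
    """
    total = 0.0
    for p_t, p_s, y in zip(teacher_post, student_post, labels):
        teacher_pred = max(range(len(p_t)), key=p_t.__getitem__)
        if teacher_pred == y:
            # Teacher is correct: learn from its soft posteriors only.
            total -= sum(pt * math.log(ps) for pt, ps in zip(p_t, p_s))
        else:
            # Teacher is wrong: fall back to the one-hot ground-truth label.
            total -= math.log(p_s[y])
    return total / len(labels)


# Frame 0: teacher predicts class 0, label is 0 (soft target is used).
# Frame 1: teacher predicts class 0, label is 1 (hard label is used).
teacher_post = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
student_post = [[0.5, 0.3, 0.2], [0.2, 0.7, 0.1]]
labels = [0, 1]
loss = conditional_ts_loss(teacher_post, student_post, labels)
```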
T/S learning for unsupervised domain adaptation of AED model for E2E ASR. The two orange lines signify the two-level knowledge transfer. meng19_domain
Adaptive T/S (AT/S) learning for supervised domain adaptation of an AED model for E2E ASR meng19_domain.
ASR WER (%) on the HK speaker test set for far-field AED models trained with CE, and for AED models adapted to 3400 hours of far-field Microsoft Cortana data by various T/S learning methods, for E2E ASR. meng19_domain