ESPnet is an end-to-end speech processing toolkit, mainly focuses on end-to-end speech recognition and end-to-end text-to-speech. ESPnet useschainerandpytorchas a main deep learning engine, and also followsKaldistyle data processing, feature extraction/format, and recipes to provide a complete...