Moreover, the models in the system use the prompt-tuning method based on the discriminant pre-training language models. Eventually, according to the gating network, the prediction of the multiple models with the same structure but different initialization are weighted as the output. The ...