The unit tests included in soft_dtw_cuda.py verify the results against the CPU implementation. Some limitations are: All sequences in the same batch should have the same length / number of features. Inputs cannot have lengths longer than 1024 (due to CUDA limitations on the maximum block size...
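The limitations above can be checked before a batch is sent to the GPU. The following is a minimal sketch in plain Python, not code from soft_dtw_cuda.py: the names `check_batch` and `MAX_CUDA_SEQ_LEN` are hypothetical, and the nested-list layout `batch[i][t][d]` (sequence, time step, feature) is an assumption made for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical constant mirroring the length limit stated above.
MAX_CUDA_SEQ_LEN = 1024


def check_batch(batch: Sequence[Sequence[Sequence[float]]]) -> Tuple[int, int, int]:
    """Validate a batch laid out as batch[i][t][d].

    Returns (batch_size, length, features) if every sequence has the same
    length and feature count and the length does not exceed the limit.
    Raises ValueError otherwise.
    """
    if len(batch) == 0:
        raise ValueError("batch is empty")
    length = len(batch[0])
    if length == 0:
        raise ValueError("sequences must contain at least one time step")
    features = len(batch[0][0])

    for i, seq in enumerate(batch):
        # Every sequence in the batch must have the same length.
        if len(seq) != length:
            raise ValueError(
                f"sequence {i} has length {len(seq)}, expected {length}"
            )
        # Every time step must have the same number of features.
        for t, frame in enumerate(seq):
            if len(frame) != features:
                raise ValueError(
                    f"sequence {i}, step {t} has {len(frame)} features, "
                    f"expected {features}"
                )

    # Lengths above the limit cannot be handled by the CUDA path.
    if length > MAX_CUDA_SEQ_LEN:
        raise ValueError(
            f"sequence length {length} exceeds the limit of {MAX_CUDA_SEQ_LEN}"
        )
    return len(batch), length, features
```

A caller could run this check first and fall back to the CPU implementation when it raises, rather than letting the CUDA path fail.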