Applied Sciences, Vol. 14, Pages 1459: Improving the Robustness of DTW to Global Time Warping Conditions in Audio Synchronization



Applied Sciences doi: 10.3390/app14041459

Authors: Jittisa Kraprayoon, Austin Pham, Timothy J. Tsai

Dynamic time warping (DTW) estimates the alignment between two sequences and is designed to handle a variable amount of time warping. In many contexts, it performs poorly when confronted with two sequences of different scale, in which the average slope of the true alignment path in the pairwise cost matrix deviates significantly from one. This paper investigates ways to improve the robustness of DTW to such global time warping conditions, using an audio–audio alignment task as a motivating scenario of interest. We modify a dataset commonly used for studying audio–audio synchronization in order to construct a benchmark in which the global time warping conditions are carefully controlled, and we evaluate the effectiveness of several strategies designed to handle global time warping. Among the strategies tested, there is a clear winner: performing sequence length normalization via downsampling before invoking DTW. This method achieves the best alignment accuracy across a wide range of global time warping conditions, and it maintains or reduces the runtime compared to standard uses of DTW. We present experiments and analyses to demonstrate its effectiveness in both controlled and realistic scenarios.
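
The idea of sequence length normalization before DTW can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`downsample_to_length`, `dtw_path`, `length_normalized_dtw`), the use of linear interpolation for downsampling, the Euclidean frame cost, and the basic (1,1), (1,0), (0,1) step set are all assumptions made for illustration. The longer feature sequence is resampled to the length of the shorter one, so that the expected alignment path in the cost matrix has an average slope near one, and the resulting path is then mapped back to the original frame indices.

```python
import numpy as np


def downsample_to_length(features, target_len):
    """Linearly resample a (n_frames, n_dims) array to target_len frames.

    Linear interpolation is an assumption of this sketch; other
    downsampling schemes could be substituted.
    """
    n = features.shape[0]
    if n == target_len:
        return features.copy()
    src = np.linspace(0.0, n - 1, num=target_len)  # fractional source indices
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    frac = (src - lo)[:, None]
    return (1.0 - frac) * features[lo] + frac * features[hi]


def dtw_path(x, y):
    """Plain DTW with steps (1,1), (1,0), (0,1) and Euclidean frame cost."""
    n, m = len(x), len(y)
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost, 1-padded
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the end cell to the start cell.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return path[::-1]


def length_normalized_dtw(x, y):
    """Resample both sequences to the shorter length, align, map indices back."""
    target = min(len(x), len(y))
    xs = downsample_to_length(x, target)
    ys = downsample_to_length(y, target)
    path = dtw_path(xs, ys)
    # Scale factors from normalized frame indices to original frame indices.
    sx = (len(x) - 1) / max(target - 1, 1)
    sy = (len(y) - 1) / max(target - 1, 1)
    return [(int(round(i * sx)), int(round(j * sy))) for i, j in path]


def half_circle(n_frames):
    """Toy 2-D feature trajectory used only for demonstration."""
    t = np.linspace(0.0, 1.0, n_frames)
    return np.stack([np.sin(np.pi * t), np.cos(np.pi * t)], axis=1)


# Same trajectory at two global time scales (roughly a factor of 3).
path = length_normalized_dtw(half_circle(50), half_circle(150))
```

Because both inputs are reduced to the shorter length before alignment, the cost matrix in this sketch has min(N, M)² cells rather than N·M, which is consistent with the abstract's observation that the approach maintains or reduces runtime.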
