Public Library of Science
Feature fusion flow chart.

posted on 2025-02-19, 18:29 authored by Bing Lu, Qianxue Zhang, Yi Guo, Fuqiang Hu, Xuejun Xiong

To address the insufficient audio feature representation and limited model generalization of existing music structure analysis methods, a music structure analysis method based on beat feature fusion and an improved residual network was designed. Music structure analysis (MSA) comprises two tasks, boundary detection and segment labeling, which are expected to accurately divide a piece into segments and identify each segment's function. This paper studies a method that accomplishes both tasks. First, the music structure labels are reorganized into 9 types and refined to the beat level. Second, a beat-wise feature extraction scheme is studied that segments the music according to its beats and fuses multiple acoustic features to achieve highly accurate segmentation. Then, a ResNet-34 with a self-attention mechanism predicts the category of each beat. Finally, a post-processing step filters the predicted labels. The method is evaluated on the SALAMI-IA dataset, and experiments show that it outperforms the current best method by 3 percentage points on HR3F and also surpasses state-of-the-art methods on PWF and Sf.
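The two computational steps the abstract names, beat-wise feature fusion and label post-processing, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes frame-level feature matrices (e.g. chroma and MFCC) and detected beat frame indices are already available, mean-pools frames within each beat interval, stacks the pooled features, and then smooths per-beat label predictions with a simple majority-vote filter as a stand-in for the paper's post-processing step. All function names are hypothetical.

```python
import numpy as np

def beat_wise_fusion(feature_maps, beat_frames, n_frames):
    """Aggregate several frame-level feature matrices into one
    beat-synchronous feature matrix.

    feature_maps : list of (d_i, n_frames) arrays, e.g. chroma, MFCC
    beat_frames  : sorted frame indices of detected beats
    Returns a (sum(d_i), n_beats) fused matrix: frames inside each
    beat interval are mean-pooled, then the pooled features of all
    input matrices are concatenated along the feature axis.
    """
    # Beat intervals are [b_k, b_{k+1}); the last runs to n_frames.
    edges = list(beat_frames) + [n_frames]
    pooled = []
    for feat in feature_maps:
        cols = [feat[:, s:e].mean(axis=1)
                for s, e in zip(edges[:-1], edges[1:])]
        pooled.append(np.stack(cols, axis=1))
    return np.concatenate(pooled, axis=0)

def smooth_labels(labels, width=5):
    """Majority-vote filter over a sliding window of `width` beats,
    removing isolated label flips in the per-beat predictions."""
    half = width // 2
    out = []
    for i in range(len(labels)):
        window = labels[max(0, i - half): i + half + 1]
        vals, counts = np.unique(window, return_counts=True)
        out.append(vals[np.argmax(counts)])
    return np.array(out)
```

For example, fusing a 12-dimensional chroma matrix and a 20-dimensional MFCC matrix over 4 beats yields a (32, 4) beat-synchronous matrix, and `smooth_labels` removes a one-beat label spike while keeping genuine segment changes.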
