Establishing a Research Base for Computer-based Musical Analysis of Uta
Leader: ITŌ Katsunobu
Members: AKAISHI Mina, YAMANAKA Reiko
Music Information Retrieval (MIR), which uses computer technology to process, analyze, and generate acoustic signals, has developed rapidly. The techniques of MIR, however, are implicitly based on Western music, which causes problems when they are applied to non-Western music such as utai, the vocal music of Noh. Our aim is to develop a research base that can be used to analyze and process utai. Our focus is on melody, the most important element in music information processing. One challenge is that, unlike Western music, in which the pitches written in a score correspond to what is actually sung or played, utai shows a greater difference between the information in the utaibon (score or vocal book) and the acoustics of the performance; the pitches of the notes are not absolute and can change even within a single phrase. To help bridge the gap between the acoustic signal and the score, in the previous year’s project we proposed a framework for transcribing the melody of the utaibon and developed a method to infer the melody from the acoustic signals using information obtained from the utaibon. This year we will further develop the proposed method to establish a research base that enables the application of MIR through: 1) a systematic evaluation and revision of the melody estimation method, which can detect pitch changes within a utai phrase; and 2) an application of the melody estimation method to the analysis and quantification of utai expressions, which vary among performers, schools, modes (yowagin and tsuyogin), and roles. By separating the elements of utai expression, we hope to establish a model for each performer and objective criteria for differentiating schools.
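Because the pitches in utai are not absolute, comparing a sung contour with a score is easier on a relative scale. The following is a minimal sketch, not the project's actual method: it assumes an F0 track (in Hz, with None for unvoiced frames) has already been extracted by some pitch tracker, and it expresses each frame in cents relative to the median of the phrase. The function names `hz_to_cents` and `relative_contour`, and the choice of the phrase median as the reference, are illustrative assumptions.

```python
import math
from statistics import median


def hz_to_cents(f0_hz, ref_hz):
    """Convert a frequency in Hz to cents relative to ref_hz (1200 cents = 1 octave)."""
    return 1200.0 * math.log2(f0_hz / ref_hz)


def relative_contour(f0_track):
    """Express a phrase's F0 track in cents relative to its own median.

    Assumption for this sketch: unvoiced frames are given as None (or a
    non-positive value) and are returned as None.
    """
    voiced = [f for f in f0_track if f is not None and f > 0]
    if not voiced:
        return [None] * len(f0_track)
    ref = median(voiced)  # hypothetical choice of reference pitch
    return [
        None if (f is None or f <= 0) else hz_to_cents(f, ref)
        for f in f0_track
    ]
```

With this representation, two renditions of the same phrase sung at different overall pitch levels yield the same contour, so only the relative movement of the melody is compared.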
We have established a semiautomatic method for analyzing the acoustic signals of Noh singing. Data-processing procedures were established for the solo parts on commercial compact discs. In the process, we found that analysis precision is low in chorus parts and in segments where multiple singers overlap.
To investigate the cause of the low precision, and also to evaluate the validity of the framework established last year, we recorded singing by three Noh musicians of the Tessen-kai of the Kanze school. In addition, the main singer was interviewed about the intentions behind the singing expression. The recorded data comprised 29 phrases from 4 songs, chosen as characteristic examples in the commentary. The chorus parts were sung at the same time but were recorded in multitrack with separate microphones.
Analysis of the multitrack chorus data showed that the timings of melodic changes and the pitches of the melody were almost the same across the three singers. For vibrato, the timings were also almost the same, but the way its size and type changed differed from singer to singer. Using these findings and data to establish a method for analyzing commercially available stereo recordings remains a future issue.
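One way to quantify such singer-to-singer differences in vibrato is to measure its rate and size on each singer's track. The sketch below is only an illustration under stated assumptions, not the procedure used in the project: it assumes an F0 contour in cents at a fixed frame rate, removes slow melodic movement with a centered moving average, and then estimates the rate from zero crossings and the size (extent) from half the peak-to-peak range. The names `vibrato_stats` and `synth_vibrato`, and the 0.4 s smoothing window, are hypothetical choices.

```python
import math


def moving_average(values, window):
    """Centered moving average, returned only where the full window fits."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


def vibrato_stats(cents, frame_rate, smooth_sec=0.4):
    """Return (rate_hz, extent_cents) for an F0 contour given in cents.

    Assumptions of this sketch: the contour is fully voiced, sampled at
    frame_rate frames per second, and contains a single vibrato segment.
    """
    window = int(round(smooth_sec * frame_rate)) | 1  # force an odd window
    half = window // 2
    if len(cents) < window + 2:
        return 0.0, 0.0
    trend = moving_average(cents, window)
    core = cents[half:len(cents) - half]
    resid = [c - t for c, t in zip(core, trend)]  # detrended oscillation
    # Frame indices where the detrended contour changes sign.
    crossings = [i for i in range(1, len(resid)) if resid[i - 1] * resid[i] < 0]
    if len(crossings) < 2:
        return 0.0, 0.0
    span_sec = (crossings[-1] - crossings[0]) / frame_rate
    rate_hz = (len(crossings) - 1) / (2.0 * span_sec)  # two crossings per cycle
    extent_cents = (max(resid) - min(resid)) / 2.0
    return rate_hz, extent_cents


def synth_vibrato(rate_hz, extent_cents, frame_rate, seconds, phase=0.3):
    """Synthetic sinusoidal vibrato contour in cents, for trying the sketch."""
    n = int(round(frame_rate * seconds))
    return [
        extent_cents * math.sin(2 * math.pi * rate_hz * k / frame_rate + phase)
        for k in range(n)
    ]
```

Calling `vibrato_stats(synth_vibrato(5.0, 50.0, 100, 2.0), 100)` on synthetic data is a quick sanity check before applying the function to real tracks. The moving average slightly attenuates the oscillation, so the estimated extent is a little below the true value, and vibrato type is not captured at all by these two numbers.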
The interview yielded valuable knowledge about the tsuyogin mode, which is difficult to interpret. This knowledge should be useful for interpreting the commentary. During the interview, the singer also listened to phrases synthesized with Vocaloid from the melody estimated by the proposed method and judged that they sounded like melodies of the Kanze school. We will prepare transcripts to make the interviews widely available.
For schools other than the Kanze school, we collected sound sources from the Hosho, Kongo, and Kita schools. For the Hosho school recordings, we compared the acoustic signals in sections that can be regarded as the same as those of the Kanze school. As a result, we were able to confirm differences in iro and vibrato.