Music Information Processing Using Continuous Speech Recognition Methods

Basic Information

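Grant number:
14380156
Principal investigator:
SAGAYAMA Shigeki
Amount:
$108,200
Host institution:
The University of Tokyo
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (B)
Fiscal year:
2002
Funding country:
Japan
Project period:
2002 to 2004
Project status:
Completed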

Project Abstract

We formulated music rhythm recognition for transcribing MIDI data into a music score as a Viterbi path search problem in an HMM, where hidden states and output probabilities represent the intended note values and the actually played note lengths, respectively. We also solved rhythm recognition of polyphonic music by reducing polyphony into monophony. Tempo modeling and tempo change detection were enabled with the segmental k-means algorithm used in speech recognition.

Harmonization (chord finding) of given melodies was formulated as a problem isomorphic to continuous speech recognition, by defining the output as the given melody, the hidden states as the chords behind the melody, and the stochastic language model as chord sequences. Automatic counterpoint was developed with a two-step maximum likelihood approach consisting of rhythm design and pitch allocation, solved by dynamic programming.

In polyphonic signal analysis, an algorithm named Harmonic-structured Clustering was developed based on the k-means clustering algorithm under a harmonic constraint, by modeling the framewise observed spectrum as overlapped harmonic structures and considering that the distributed energy in a harmonic structure belongs to a single cluster. Furthermore, by introducing probabilistic assignment to clusters, k-means was generalized into the EM algorithm and attained higher performance in multi-pitch estimation. Utilizing an information criterion such as AIC, estimation of the number of sources and octave location was also enabled.

"Specmurt analysis" was proposed for polyphonic signal analysis. The inverse Fourier transform of the linear spectrum with log-frequency was called "specmurt". Along log-scaled frequency, the observed linear spectrum is regarded as a convolution of the distribution density of fundamental frequencies and the harmonic structure of multiple tones, which is assumed identical for all tones. This idea opened up new signal processing capabilities.
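
The rhythm-recognition formulation in the abstract (hidden states as intended note values, observations as played note lengths) can be illustrated with a minimal Viterbi sketch. Everything specific here is an assumption for illustration rather than the project's actual model: the four-entry vocabulary `NOTE_VALUES`, the fixed tempo `beat_sec`, the Gaussian duration model with relative spread `rel_std`, and the toy transition probabilities controlled by `stay_prob`.

```python
import math

# Hypothetical note-value vocabulary in beats (an assumption, not from the project).
NOTE_VALUES = [0.25, 0.5, 1.0, 2.0]


def log_gaussian(x, mean, std):
    """Log density of a normal distribution with the given mean and std."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi_rhythm(durations, beat_sec=0.5, rel_std=0.15, stay_prob=0.4):
    """Return the most likely note value (in beats) for each played duration (in seconds)."""
    n = len(NOTE_VALUES)
    # Toy transition model: repeating the previous note value is slightly preferred.
    move_prob = (1.0 - stay_prob) / (n - 1)
    log_trans = [
        [math.log(stay_prob if i == j else move_prob) for j in range(n)]
        for i in range(n)
    ]

    def log_emit(state, duration):
        # Played length is modelled as Gaussian around the nominal length.
        mean = NOTE_VALUES[state] * beat_sec
        return log_gaussian(duration, mean, rel_std * mean)

    # Initialisation: uniform prior over note values.
    score = [math.log(1.0 / n) + log_emit(s, durations[0]) for s in range(n)]
    backpointers = []

    # Recursion: keep the best predecessor for every state at every step.
    for duration in durations[1:]:
        prev = score
        best_prev = [
            max(range(n), key=lambda i: prev[i] + log_trans[i][j]) for j in range(n)
        ]
        score = [
            prev[best_prev[j]] + log_trans[best_prev[j]][j] + log_emit(j, duration)
            for j in range(n)
        ]
        backpointers.append(best_prev)

    # Backtracking from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for best_prev in reversed(backpointers):
        state = best_prev[state]
        path.append(state)
    path.reverse()
    return [NOTE_VALUES[s] for s in path]
```

With these toy settings, `viterbi_rhythm([0.52, 0.47, 0.26, 0.24, 1.05])` returns `[1.0, 1.0, 0.5, 0.5, 2.0]`. The sketch assumes a constant tempo; the tempo modeling and tempo change detection mentioned in the abstract are not reproduced here.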

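
The convolution model behind specmurt analysis, as described in the abstract, can be sketched as a deconvolution: if the spectrum on a log-frequency axis is the convolution of a fundamental-frequency distribution with one common harmonic pattern, dividing in the Fourier-transformed domain recovers the distribution. This is a hedged illustration under assumptions, not the project's implementation: the function names, the four harmonic weights, the bin resolution, and the use of a naive forward DFT followed by its inverse (rather than any particular transform convention) are all choices made for this sketch.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, kept dependency-free for the sketch."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def harmonic_pattern(n_bins, bins_per_octave, weights):
    """Common harmonic structure on a log-frequency axis.

    The k-th harmonic lies log2(k) octaves above the fundamental, so its
    offset does not depend on the fundamental frequency itself.
    """
    h = [0.0] * n_bins
    for k, weight in enumerate(weights, start=1):
        h[round(bins_per_octave * math.log2(k))] += weight
    return h


def circular_convolve(u, h):
    """Circular convolution, used here only to synthesise a test spectrum."""
    n = len(u)
    return [sum(u[j] * h[(i - j) % n] for j in range(n)) for i in range(n)]


def specmurt_deconvolve(v, h):
    """Recover u from v = u * h by division in the transformed domain.

    Assumes the transform of h has no zeros; the weights used below
    guarantee this because the fundamental outweighs all overtones combined.
    """
    v_hat, h_hat = dft(v), dft(h)
    u_hat = [a / b for a, b in zip(v_hat, h_hat)]
    return [c.real for c in dft(u_hat, inverse=True)]


# Illustrative setup: two tones at bins 3 and 10, 12 bins per octave.
n_bins = 48
true_u = [0.0] * n_bins
true_u[3], true_u[10] = 1.0, 0.7
pattern = harmonic_pattern(n_bins, 12, [1.0, 0.4, 0.25, 0.15])
observed = circular_convolve(true_u, pattern)
recovered = specmurt_deconvolve(observed, pattern)
```

In this synthetic setup `recovered` matches `true_u` up to floating-point rounding. Real spectra would not follow a single fixed harmonic pattern exactly, which is why the pattern itself generally has to be estimated.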

Project Outputs

Journal articles (223)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Specmurtにおける準最適共通調波構造パターンの反復推定による多声音楽信号の可視化とMIDI変換

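Visualization and MIDI conversion of polyphonic music signals by iterative estimation of a quasi-optimal common harmonic structure pattern in Specmurt

DOI:
Published:
2004
Journal:
情報処理学会研究報告(MUS) 2004-MUS-56
Impact factor:
0
Authors:
亀岡弘和;齊藤翔一郎;西本卓也;嵯峨山茂樹
Corresponding author:
嵯峨山茂樹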

リズム語彙を用いたHMMによるMIDI演奏のリズムとテンポ推定

Rhythm and tempo estimation of MIDI performances by an HMM using a rhythm vocabulary

DOI:
Published:
2004
Journal:
情報処理学会研究報告 2004-MUS-54
Impact factor:
0
Authors:
武田晴登;西本卓也;嵯峨山茂樹
Corresponding author:
嵯峨山茂樹

Rhythm and Tempo Recognition of Music Performance from a Probabilistic Approach

DOI:
Published:
2004
Journal:
Proc. 5th International Conference on Music Information Retrieval (ISMIR) (Barcelona, Spain)
Impact factor:
0
Authors:
Haruto TAKEDA;Takuya Nishimoto;Shigeki Sagayama
Corresponding author:
Shigeki Sagayama

Rhythm Recognition of Multiphonic MIDI Signals Using Probabilistic Models

DOI:
Published:
2004
Journal:
IPSJ Journal Vol.45, No.3
Impact factor:
0
Authors:
Haruto Takeda;Takuya Nishimoto;Shigeki Sagayama
Corresponding author:
Shigeki Sagayama

Time-Space Clustering for Multi-pitch Spectral Segregation Using Kernel Audio Stream Model

DOI:
Published:
2005
Journal:
The 2005 Spring Meeting of the Acoustical Society of Japan 3-7-19
Impact factor:
0
Authors:
Hirokazu Kameoka;Takuya Nishimoto;Shigeki Sagayama
Corresponding author:
Shigeki Sagayama

Other publications by SAGAYAMA Shigeki

DNN-Based Full-Band Speech Synthesis Using GMM Approximation of Spectral Envelope

DOI:
10.1587/transinf.2020edp7075
Published:
2020
Journal:
IEICE Transactions on Information and Systems
Impact factor:
0.7
Authors:
KOGUCHI Junya;TAKAMICHI Shinnosuke;MORISE Masanori;SARUWATARI Hiroshi;SAGAYAMA Shigeki
Corresponding author:
SAGAYAMA Shigeki