音の三要素に基づく生成過程を考慮した深層ベイズ自動採譜

Deep Bayesian automatic music transcription considering the generative process based on the three elements of sound

Basic information

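Grant number:
22KJ2959
Principal investigator:
田中啓太郎
Amount:
$16,000
Host institution:
Waseda University
Host institution country:
Japan
Project category:
Grant-in-Aid for JSPS Fellows
Fiscal year:
2023
Funding country:
Japan
Period:
2023-03-08 to 2025-03-31
Project status:
Ongoing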

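Source:
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-22KJ2959/
Keywords:
music information processing; automatic music transcription; multi-instrument transcription; pitch-timbre separation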

Project abstract

This research addresses multi-instrument automatic music transcription, which estimates a score for every instrument that makes up a music audio signal. Because modern music uses a wide variety of instruments, multi-instrument transcription is an important technology for preserving and reproducing musical pieces. Conventionally, several target instrument types were specified in advance, and a separate, instrument-dependent transcription model was built to estimate each score. As a result, many instruments, including those absent from the training data, fell outside the scope of score estimation, and general music could not be transcribed. As a first step toward a highly versatile transcription method, this fiscal year we conducted the following research from two viewpoints: 1) analysis of a wide range of instrument sounds based on the separation of pitch and timbre, and 2) consideration of playing technique in disentangled representation learning of instrument sounds.

(1) Using a deep Bayesian model that handles recognition and generation jointly, we developed a method that analyzes the pitch and timbre of instrument sounds without restricting the target instruments. A variational autoencoder with time-varying and time-invariant features recognizes pitch and timbre from instrument sounds and also enables generation and editing. By focusing on perturbations applied to the pitch and timbre of the input sound and on the invariance of each latent representation, we removed the constraint on target instruments that stems from label information. However, one problem remained: the same instrument can be recognized as different instruments depending on the playing technique (vibrato, pizzicato, and so on).

(2) In disentangled representation learning for instrument sound recognition, we extended the conventional two-factor separation and developed a three-factor separation method that explicitly accounts for differences in playing technique. This makes it possible to assign multiple playing techniques to a single instrument. We also developed a method that converts instrument, pitch, and playing technique by swapping the corresponding factors between different instrument sounds.
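The factor swapping described in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: the `NoteFactors` container, its field names, and the `swap_factor` helper are hypothetical, and the factors are plain tuples standing in for latent representations that, in a real system, would presumably come from an encoder and be passed to a decoder (neither is shown here).

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Hypothetical names; the source does not specify any data structures.
FACTOR_NAMES = ("pitch", "timbre", "technique")


@dataclass(frozen=True)
class NoteFactors:
    """Assumed three-factor description of one instrument sound."""
    pitch: Tuple[float, ...]      # stand-in for the pitch factor
    timbre: Tuple[float, ...]     # stand-in for the timbre (instrument) factor
    technique: Tuple[float, ...]  # stand-in for the playing-technique factor


def swap_factor(a: NoteFactors, b: NoteFactors, name: str):
    """Return copies of a and b with the factor `name` exchanged."""
    if name not in FACTOR_NAMES:
        raise ValueError(f"unknown factor: {name!r}")
    new_a = replace(a, **{name: getattr(b, name)})
    new_b = replace(b, **{name: getattr(a, name)})
    return new_a, new_b
```

Under these assumptions, `swap_factor(x, y, "technique")` would leave the pitch and timbre factors of each sound untouched and exchange only the playing-technique factor, which mirrors the conversion by factor replacement mentioned in (2).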

Project outputs

Journal articles (11)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Patch-based Memory Efficient Diffusion Probabilistic Models

DOI:
Year published:
2022
Journal:
Impact factor:
0
Authors:
Aida Kazuhiro;Hirao Marina;Funabashi Aiko;Sugimura Natsuhiko;Ota Eisuke;Yamaguchi Junichiro;Shinei Arakawa
Corresponding author:
Shinei Arakawa

覚醒度と感情価に基づく音楽による画像スタイル変換

Music-driven image style transfer based on arousal and valence