Research on Learning Neural Implicit Representations for Wide-Area 3D Environments and Their Applications

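Basic information

Grant number:
22K12166
Principal investigator:
Minoru Maruyama (丸山稔)
Amount:
$22,500
Host institution:
Shinshu University
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (C)
Fiscal year:
2022
Funding country:
Japan
Project period:
2022-04-01 to 2025-03-31
Project status:
Not yet concluded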

Project abstract

The aim of this research is to establish a method for learning 3D implicit representations that can also be applied to wide-area 3D environments. The basic approach assumed for this is encoder-decoder processing. This fiscal year's work covered three items: devising an overall method for the case where multiple images are used as input; examining various encoder designs, including the vision transformer; and investigating whether the tensor obtained as the encoder output, regarded as a feature of the input, can be applied to shape-similarity retrieval.

For building 3D models from multiple input images, we devised a method that performs no matching between images and, although an upper limit on the number of images is set, allows the number of inputs to be chosen freely. As encoders for these inputs we used ResNet, whose high capability in image classification is well established, and the (vision) transformer, which has been widely used in natural language processing in recent years and is increasingly applied to images and other data, and we compared their capabilities. Because this method performs no matching between images, no ordering can be imposed on them. To guarantee invariance to such reordering, we used max-pooling and, in the case of the transformer, deliberately omitted positional encoding. The 3D models built with these methods were compared with those of DISN (deep implicit surface network), one of the existing methods, and were confirmed to have equal or better capability. In encoder-decoder processing of this kind, the tensor representation obtained from the encoder can be regarded as a feature of the input. We have begun a preliminary study of whether these representations can be applied to shape-similarity retrieval.
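
The order-invariance idea in the abstract can be illustrated with a small sketch. It is not the project's implementation. The toy encoder `encode_view`, the limit `MAX_VIEWS`, and the feature size are hypothetical stand-ins chosen for illustration. The sketch only shows that encoding each view independently and taking an element-wise maximum gives the same result for any ordering and any permitted number of views.

```python
import random
from typing import List, Sequence

MAX_VIEWS = 8  # hypothetical upper limit on the number of input images


def encode_view(image: Sequence[float], dim: int = 4) -> List[float]:
    """Toy stand-in for a per-image encoder (ResNet or a transformer in the abstract).

    Maps an image (here just a flat list of numbers) to a `dim`-length
    feature vector using fixed, arbitrary integer weights.
    """
    return [
        sum(((i + 1) * (k + 1) % 5 - 2) * px for i, px in enumerate(image))
        for k in range(dim)
    ]


def pool_views(images: Sequence[Sequence[float]]) -> List[float]:
    """Encode each view independently, then take the element-wise maximum."""
    if not 1 <= len(images) <= MAX_VIEWS:
        raise ValueError(f"expected between 1 and {MAX_VIEWS} views")
    features = [encode_view(img) for img in images]
    # Max over the view axis: the result does not depend on view order.
    return [max(column) for column in zip(*features)]


if __name__ == "__main__":
    rng = random.Random(0)
    views = [[rng.random() for _ in range(6)] for _ in range(3)]
    shuffled = views[:]
    rng.shuffle(shuffled)
    # The pooled feature is identical for both orderings.
    print(pool_views(views) == pool_views(shuffled))
```

A transformer encoder without positional encoding plays a similar role: self-attention treats its inputs as an unordered set, so reordering the views only reorders the outputs, and a symmetric pooling step then removes the remaining dependence on order.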