Development of efficient machine discovery system based on data compression and pattern matching
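Basic information
- Grant number: 15300049
- Principal investigator:
- Amount: $92,200
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (B)
- Fiscal year: 2003
- Funding country: Japan
- Period: 2003 to 2005
- Project status: Completed
- Source:
- Keywords:
Project abstract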
We studied the following three topics to build efficient machine discovery systems.
(1) Text compression and pattern matching. We focused on grammar-based compression and developed efficient compression algorithms. Using them, we addressed the compressed pattern matching problem and obtained efficient algorithms.
(2) Time-efficient processing of text and semi-structured text. We developed text index structures to accelerate text processing. Suffix trees and DAWGs (directed acyclic word graphs) are well-known index structures for substring pattern matching. We focused on the CDAWG (compact DAWG), a hybrid of the two, and devised an online linear-time construction algorithm for CDAWGs. We then devised a construction algorithm for CDAWGs over sliding windows, which has an application to text data compression. We also proposed a new index structure for large alphabets (such as those of Japanese texts) and demonstrated its efficiency experimentally. In addition, we analyzed the properties of subsequence automata, which are index structures for subsequence pattern matching, in order to accelerate subsequence pattern discovery. We solved the problem of online linear-time construction of word suffix trees, which had been open for over 10 years. We also developed a fast tree pattern matching algorithm based on a bit-parallel technique for efficient processing of semi-structured text data.
(3) Pattern discovery and information extraction. We developed efficient pattern discovery algorithms for various classes of patterns, implemented them, and evaluated their performance experimentally. We integrated the techniques developed into a knowledge discovery system, applied it to linguistic and literary data, and obtained good results in cooperation with linguists and literary scholars.
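To make the idea of an index for subsequence pattern matching concrete, the following is a minimal sketch of a next-occurrence table, a simple (non-minimal) form of subsequence automaton. It is not the project's construction: the function names `build_subsequence_automaton` and `is_subsequence` are hypothetical, and the table uses space proportional to text length times alphabet size, which the structures studied in the project may well improve on.

```python
def build_subsequence_automaton(text):
    """Build a next-occurrence table for `text` (illustrative sketch).

    nxt[i][c] is the state reached from state i on character c, i.e. one
    past the first position j >= i with text[j] == c. A missing key means
    c does not occur at or after position i.
    """
    n = len(text)
    nxt = [dict() for _ in range(n + 1)]
    # Fill from right to left: state i inherits the transitions of state
    # i + 1, then overrides the transition for the character at position i.
    for i in range(n - 1, -1, -1):
        nxt[i] = dict(nxt[i + 1])
        nxt[i][text[i]] = i + 1
    return nxt


def is_subsequence(nxt, pattern):
    """Return True if `pattern` is a subsequence of the indexed text."""
    state = 0
    for c in pattern:
        state = nxt[state].get(c)
        if state is None:
            return False
    return True


table = build_subsequence_automaton("abcabc")
print(is_subsequence(table, "acb"))
print(is_subsequence(table, "cca"))
```

After the table is built, each query takes time proportional to the pattern length, independent of the text length, which is the property that makes such an index useful for repeated subsequence queries.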
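Project outcomes
Journal articles (112)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
DEVELOPING DYNAMIC GAITS FOR FOUR LEGGED ROBOTS
- DOI:
- Published: 2003
- Journal:
- Impact factor: 0
- Authors: Makoto Toyomasu; A. Shinohara
- Corresponding authors: Makoto Toyomasu; A. Shinohara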
On-line Linear-time Construction of Word Suffix Trees
- DOI:
- Published: 2006
- Journal:
- Impact factor: 0
- Authors: Shunsuke Inenaga and 1 other
- Corresponding author: 1 other
The Size of Subsequence Automaton
- DOI:
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Kazuhito Hagio; Shuichi Mitarai; Akira Ishino; Masayuki Takeda; Hideo Bannai and 3 others; Hisashi Tsuji and 2 others; Shunsuke Inenaga and 2 others; Zdenek Tronicek and 1 other
- Corresponding author: Zdenek Tronicek and 1 other
部分構造を統合的に活用するためのXMLデータ管理システム.
An XML data management system for integrated use of substructures.
- DOI:
- Published: 2006
- Journal:
- Impact factor: 0
- Authors: Toshiyuki Kochi et al.; Jun Inoue et al.; Jun Inoue et al.; Hayato Kobayashi et al.; Shunsuke Inenaga et al.; 福田 智子; 福田 智子 et al.; 今井 明 et al.; Jun Inoue et al.; Jun Inoue et al.; Hayato Kobayashi et al.; 黒木 香 et al.; 杉本 典子 et al.
- Corresponding author: 杉本 典子 et al.
<恵慶百首>秋部試注.
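Tentative annotations on the autumn section of <Egyō Hyakushu>.
- DOI:
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Hisashi Tsuji et al.; 石野 明 et al.; 黒木 香 et al.
- Corresponding author: 黒木 香 et al.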
Other grants by TAKEDA Masayuki
Mechanism of driver mutation positive lung cancer
- Grant number: 15K21525
- Fiscal year: 2015
- Funding amount: $92,200
- Project category: Grant-in-Aid for Young Scientists (B)
Studies on lower urinary tract dysfunction pathogenesis by complex systems network and dynamic homeostasis collapse
- Grant number: 15H04972
- Fiscal year: 2015
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (B)
Research on vesicular type transporter and refractory lower urinary tract dysfunction
- Grant number: 26670699
- Fiscal year: 2014
- Funding amount: $92,200
- Project category: Grant-in-Aid for Challenging Exploratory Research
Mechanisms And Possible Novel Treatments for Nocturia related to Abnormal Circadian Rhythm
- Grant number: 23659754
- Fiscal year: 2011
- Funding amount: $92,200
- Project category: Grant-in-Aid for Challenging Exploratory Research
Teaching Materials and Tools for Education of Information Science for Elementary, Junior High and High School Students
- Grant number: 23650515
- Fiscal year: 2011
- Funding amount: $92,200
- Project category: Grant-in-Aid for Challenging Exploratory Research
Research on the afferent transduction in the lower urinary tract
- Grant number: 23390381
- Fiscal year: 2011
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (B)
Foundational technology for light-weight XML-DBMS based on very fast compressed data stream processing
- Grant number: 22300010
- Fiscal year: 2010
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (B)
Research on the function and development of new therapy for ion channels in the lower urinary tract
- Grant number: 20390423
- Fiscal year: 2008
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (B)
Key Technology for XML DB in Embedded Device Based on Efficient Compressed Pattern Matching
- Grant number: 19300008
- Fiscal year: 2007
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (B)
Elucidation on mechanisms of vanilloid and cannabinoid functions in the urogenital afferent neurotransmissions
- Grant number: 14370508
- Fiscal year: 2002
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (B)
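Similar NSFC (National Natural Science Foundation of China) grants
Understanding structural evolution of galaxies with machine learning
- Grant number: n/a
- Approval year: 2022
- Funding amount: 100,000 CNY
- Project category: Provincial/municipal-level project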
Similar overseas grants
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Continuing Grant
RII Track-4:NSF: Physics-Informed Machine Learning with Organ-on-a-Chip Data for an In-Depth Understanding of Disease Progression and Drug Delivery Dynamics
- Grant number: 2327473
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
CC* Campus Compute: UTEP Cyberinfrastructure for Scientific and Machine Learning Applications
- Grant number: 2346717
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
Learning to create Intelligent Solutions with Machine Learning and Computer Vision: A Pathway to AI Careers for Diverse High School Students
- Grant number: 2342574
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
- Grant number: 2342498
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
Excellence in Research: Towards Data and Machine Learning Fairness in Smart Mobility
- Grant number: 2401655
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
I-Corps: Translation potential of using machine learning to predict oxaliplatin chemotherapy benefit in early colon cancer
- Grant number: 2425300
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
- Grant number: 2339216
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Continuing Grant
Postdoctoral Fellowship: OPP-PRF: Leveraging Community Structure Data and Machine Learning Techniques to Improve Microbial Functional Diversity in an Arctic Ocean Ecosystem Model
- Grant number: 2317681
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Standard Grant
Accelerated discovery of ultra-fast ionic conductors with machine learning
- Grant number: 24K08582
- Fiscal year: 2024
- Funding amount: $92,200
- Project category: Grant-in-Aid for Scientific Research (C)