Kolmogorov complexity and its applications

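Basic information

  • Grant number:
    RGPIN-2016-03687
  • Principal investigator:
  • Amount:
    $45,900
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2018
  • Funding country:
    Canada
  • Project period:
    2018-01-01 to 2019-12-31
  • Project status:
    Completed

Project abstract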

I am interested in developing a compelling theory of big data. Such a theory will depend on Kolmogorov complexity and information distance. Kolmogorov complexity is defined on one object. Information distance [C. Bennett, P. Gacs, M. Li, P. Vitanyi, W. Zurek, Information distance, IEEE Trans. Inform. Theory, 44:4 (1998)] is defined on two objects, and the concept can be generalized to many objects. With such a theory it is possible, in general, to optimally approximate the intuitive concept of "semantic distance", or closeness, between two pieces of data. The key to this theory is compressing the data. Many ways of compressing data will be studied, including error encoding, clustering, and especially deep neural networks, which can be regarded as a way of compressing data, big data in particular. The following short-term goals are in line with this main theme:

1) Deep learning in natural language processing (NLP). My group has trained a convolutional neural network (CNN) to map natural language questions to structured database queries with a limited number of relations. This work will continue. My group has also trained a recurrent neural network (RNN) for conversation or chatting; this work will be extended to context-sensitive chatting. This work has two implications for the long-term goal: a) neural networks will be studied as one way to approximate semantic distance; and b) this approach is practically useful only with big data from the internet.

2) Bioinformatics. A CNN has also been trained for protein identification and for peak-picking in mass spectrometry protein quantitation. These studies and methodologies will be extended to protein quantitation. This work again depends on the huge amount of training data I have obtained from industry.

These deep learning approaches will not be studied in isolation. They will be studied together with my theory of approximating semantic distance by information distance, experimenting with the efficiency of deep neural networks as compression methods for big data when there are no clear rules for compressing. I will also spend 8 months full time revising my research book with Paul Vitanyi, "An Introduction to Kolmogorov Complexity and Its Applications", which will include these new results.

Several other short-term topics in bioinformatics will be studied. One is an antibody sequencing algorithm: I plan to design a new algorithm using linear programming to solve an industrial bioinformatics problem in antibody sequencing. Another is to apply ideas from bioinformatics to other fields. Optimal spaced seeds were invented by my group for homology search, and this has been considered one of the most influential innovations in bioinformatics during the last 15 years. I plan to use the optimal spaced seed idea to develop an observation theory for detecting trends in time series. Initial experiments were performed successfully.
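
The idea of approximating information distance by compression can be illustrated with a small sketch. Kolmogorov complexity is not computable, so a real compressor has to stand in for it; the example below uses the normalized compression distance from the wider literature with Python's `zlib` as the compressor. Both the choice of `zlib` and the function names `compressed_size` and `ncd` are assumptions made for illustration; the abstract does not specify a compressor or an implementation.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input, a crude stand-in for K(x)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Smaller values suggest more shared structure.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical sample data: two near-identical texts and one unrelated byte pattern.
    a = b"the quick brown fox jumps over the lazy dog " * 20
    b = b"the quick brown fox leaps over the lazy cat " * 20
    c = bytes(range(256)) * 4
    print("ncd(a, b) =", ncd(a, b))
    print("ncd(a, c) =", ncd(a, c))
```

With this toy data the two similar texts should come out closer to each other than either is to the unrelated byte pattern. A general-purpose compressor such as `zlib` is only a rough approximation, which is one reason better compressors matter for this line of work.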
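
The spaced seed idea mentioned in the abstract can also be sketched briefly. A seed is a 0/1 pattern: two windows count as a hit when they agree at every position marked `1`, while positions marked `0` are ignored, so a hit can survive a mismatch there. The seed `1101`, the helper names, and the sample sequences below are hypothetical and chosen only to keep the example small; this is not an optimal seed and not the group's implementation.

```python
from collections import defaultdict


def spaced_kmers(seq: str, seed: str):
    """Yield (position, key), where key keeps only the characters under the seed's '1's."""
    care = [k for k, bit in enumerate(seed) if bit == "1"]
    for i in range(len(seq) - len(seed) + 1):
        yield i, "".join(seq[i + k] for k in care)


def seed_hits(query: str, target: str, seed: str):
    """Return sorted (query_pos, target_pos) pairs that agree at every '1' position."""
    index = defaultdict(list)
    for j, key in spaced_kmers(target, seed):
        index[key].append(j)
    return sorted(
        (i, j)
        for i, key in spaced_kmers(query, seed)
        for j in index.get(key, [])
    )


if __name__ == "__main__":
    query, target = "ACGTAC", "ACTTAC"  # differ at one position
    print(seed_hits(query, target, "1101"))  # spaced seed of weight 3
    print(seed_hits(query, target, "1111"))  # contiguous seed of the same span
```

In this toy case the spaced seed finds a hit at the start of both sequences because the mismatch falls under its `0` position, while the contiguous seed of the same span finds none.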

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Li, Ming

Prognostic nutritional index with postoperative complications and 2-year mortality in hip fracture patients: an observational cohort study.
  • DOI:
    10.1097/js9.0000000000000614
  • Published:
    2023-11-01
  • Journal:
  • Impact factor:
    15.3
  • Authors:
    Wang, Yilin;Jiang, Yu;Luo, Yan;Lin, Xisheng;Song, Mi;Li, Jia;Zhao, Jingxin;Li, Ming;Jiang, Yuheng;Yin, Pengbin;Tang, Peifu;Lyu, Houchen;Zhang, Licheng
  • Corresponding author:
    Zhang, Licheng
Conotruncal heart defects and common variants in maternal and fetal genes in folate, homocysteine, and transsulfuration pathways.
A Narrow-Passband and Frequency-Tunable Microwave Photonic Filter Based on Phase-Modulation to Intensity-Modulation Conversion Using a Phase-Shifted Fiber Bragg Grating
Effect of Public Deliberation on Attitudes toward Return of Secondary Results in Genomic Sequencing.
  • DOI:
    10.1007/s10897-016-9987-0
  • Published:
    2017-02
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Gornick, Michele C.;Scherer, Aaron M.;Sutton, Erica J.;Ryan, Kerry A.;Exe, Nicole L.;Li, Ming;Uhlmann, Wendy R.;Kim, Scott Y. H.;Roberts, J. Scott;De Vries, Raymond G.
  • Corresponding author:
    De Vries, Raymond G.
Automatic speaker age and gender recognition using acoustic and prosodic level information fusion
  • DOI:
    10.1016/j.csl.2012.01.008
  • Published:
    2013-01-01
  • Journal:
  • Impact factor:
    4.3
  • Authors:
    Li, Ming;Han, Kyu J.;Narayanan, Shrikanth
  • Corresponding author:
    Narayanan, Shrikanth

Other grants held by Li, Ming

Bioinformatics
  • Grant number:
    CRC-2015-00208
  • Fiscal year:
    2022
  • Funding amount:
    $45,900
  • Project category:
    Canada Research Chairs
Kolmogorov complexity and algorithms for immunopeptidomics
  • Grant number:
    RGPIN-2022-02942
  • Fiscal year:
    2022
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Bioinformatics
  • Grant number:
    CRC-2015-00208
  • Fiscal year:
    2021
  • Funding amount:
    $45,900
  • Project category:
    Canada Research Chairs
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2021
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2020
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Bioinformatics
  • Grant number:
    CRC-2015-00208
  • Fiscal year:
    2020
  • Funding amount:
    $45,900
  • Project category:
    Canada Research Chairs
Bioinformatics
  • Grant number:
    CRC-2015-00208
  • Fiscal year:
    2019
  • Funding amount:
    $45,900
  • Project category:
    Canada Research Chairs
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2019
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Bioinformatics
  • Grant number:
    CRC-2015-00208
  • Fiscal year:
    2018
  • Funding amount:
    $45,900
  • Project category:
    Canada Research Chairs
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2017
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual

Similar overseas grants

Unravelling snowscapes complexity and its impacts on modelled snow cover evolution in northern regions
  • Grant number:
    RGPNS-2022-04631
  • Fiscal year:
    2022
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Northern Research Supplement
Unravelling snowscapes complexity and its impacts on modelled snow cover evolution in northern regions
  • Grant number:
    RGPIN-2022-04631
  • Fiscal year:
    2022
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Low-complexity research for next-generation VVC standard and its neural network extension
  • Grant number:
    21K17770
  • Fiscal year:
    2021
  • Funding amount:
    $45,900
  • Project category:
    Grant-in-Aid for Early-Career Scientists
CSR: Small: Evolution of Computer Vision for Low Power Devices, Breaking its Power Wall and Computational Complexity
  • Grant number:
    2146726
  • Fiscal year:
    2021
  • Funding amount:
    $45,900
  • Project category:
    Standard Grant
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2021
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Intrinsic Complexity of Random Fields and Its Connections to Random Matrices and Stochastic Differential Equations
  • Grant number:
    2048877
  • Fiscal year:
    2020
  • Funding amount:
    $45,900
  • Project category:
    Standard Grant
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2020
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Research on the Basic Properties of Chaotic Loewner Evolution and Its Application to Neuronal Morphology
  • Grant number:
    20J20867
  • Fiscal year:
    2020
  • Funding amount:
    $45,900
  • Project category:
    Grant-in-Aid for JSPS Fellows
The complexity of the constraint satisfaction problem and its variants
  • Grant number:
    RGPIN-2015-04656
  • Fiscal year:
    2019
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual
Kolmogorov complexity and its applications
  • Grant number:
    RGPIN-2016-03687
  • Fiscal year:
    2019
  • Funding amount:
    $45,900
  • Project category:
    Discovery Grants Program - Individual