Graph Grammars for Molecular Structure Search and Classification
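Basic information
- Grant number: 416768284
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: Germany
- Project category: Research Grants
- Fiscal year: 2019
- Funding country: Germany
- Duration: 2018-12-31 to 2022-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract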
Numerous fields of study focus on small molecules. A prominent example is drug design, where small molecules are used to inhibit or activate proteins to achieve a desired biological function. In these fields, we often want to scan databases for molecules containing certain substructures. Traditionally, these substructures are modelled in chemical description languages such as Daylight's SMARTS. These languages tend to be very complex, yet they are very restricted in their ability to describe the topological patterns of the underlying graphs. Moreover, parsing and matching patterns against a database of molecules is NP-complete. To circumvent these problems, we propose a simple graph grammar to describe substructures. Even very simple graph rewriting systems offer expressive power that almost reaches that of SMARTS. To use these graph grammars for molecular structure search, we have to solve the subgraph matching problem. Although this problem remains NP-complete in general, it becomes polynomial if each minimal cut of the query graph has bounded size, which we empirically find to be true for most molecules contained in the standard databases. We will investigate the complexity of the problem for further known graph parameters and try to relate the maximal size of a minimal cut to other parameters, focusing on parameters that are typically small for molecular graphs. We will also make our basic algorithm more efficient in practice. Furthermore, we want to derive over-approximations of the class of graphs generated by a grammar for which the subgraph matching problem can be solved more efficiently.
As a second research direction, we will develop and implement efficient algorithms for learning graph grammars from positive and negative examples. We aim to find a graph grammar that is as simple as possible and that matches the positive examples but not the negative examples for a chemical group. A trivial grammar that interpolates the positive and negative examples is one that generates exactly the positive examples, which clearly overfits them. The underlying idea behind this learning task is to automatically identify aspects of the pharmacophore of these molecules. The challenge here is to simultaneously prevent overfitting and overgeneralization. We plan to develop constructive algorithms, i.e. algorithms that compute a simple graph grammar that interpolates the positive and negative examples, and improvement algorithms, i.e. algorithms that try to simplify a graph grammar while preserving its interpolating property.
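The subgraph matching problem named in the abstract can be illustrated with a minimal Python sketch. This is a generic textbook backtracking search, not the project's algorithm: it does not exploit bounded minimal cuts, it ignores bond orders, and its worst-case running time is exponential. The graph encoding and the names `make_graph` and `find_substructure` are assumptions introduced only for this illustration.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical encoding: a molecule or query is a labelled undirected graph.
# atoms maps a node id to an element symbol; bonds is a set of unordered pairs.
Graph = Tuple[Dict[int, str], Set[frozenset]]


def make_graph(atoms: Dict[int, str], bonds: Iterable[Tuple[int, int]]) -> Graph:
    """Build a graph from an atom table and a list of bonded node pairs."""
    return atoms, {frozenset(b) for b in bonds}


def find_substructure(query: Graph, target: Graph) -> Optional[Dict[int, int]]:
    """Return an injective map from query nodes to target nodes that preserves
    atom labels and all query bonds, or None if no such map exists."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    order = list(q_atoms)          # fixed order in which query nodes are placed
    mapping: Dict[int, int] = {}   # partial assignment: query node -> target node

    def extend(i: int) -> bool:
        if i == len(order):
            return True            # every query node has been placed
        q = order[i]
        for t in t_atoms:
            # The target node must be unused and carry the same element label.
            if t in mapping.values() or t_atoms[t] != q_atoms[q]:
                continue
            # Each query bond from q to an already placed node must also
            # exist between the corresponding target nodes.
            if all(frozenset((t, mapping[p])) in t_bonds
                   for p in mapping if frozenset((q, p)) in q_bonds):
                mapping[q] = t
                if extend(i + 1):
                    return True
                del mapping[q]     # undo the choice and try the next candidate
        return False

    return dict(mapping) if extend(0) else None


# Heavy-atom skeleton of ethanol (C-C-O) and a carbon-oxygen bond as the query.
ethanol = make_graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
c_o_bond = make_graph({0: "C", 1: "O"}, [(0, 1)])
match = find_substructure(c_o_bond, ethanol)
```

Scanning a database then amounts to calling `find_substructure` once per molecule and keeping those for which the result is not `None`. The sketch searches for a non-induced match, so extra bonds in the target do not prevent a hit.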
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants of Professor Dr. Ernst Althaus
Einfache und schnelle Implementierung von exakten Optimierungsalgorithmen mit SCIL
Simple and fast implementation of exact optimization algorithms with SCIL
- Grant number: 48021572
- Fiscal year: 2007
- Funding amount: --
- Project category: Priority Programmes
Similar overseas grants
The Emergence and Refinement of Grammars: perspectives from syntax and phonology
- Grant number: 2890509
- Fiscal year: 2023
- Funding amount: --
- Project category: Studentship
EAGER: Building Language Technologies by Machine Reading Grammars
- Grant number: 2327143
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Doctoral Dissertation Research: How flexible are grammars past puberty? Evidence from heritage language returnees
- Grant number: 2234698
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Algorithms and Inference of Grammars and Natural Computing Models
- Grant number: RGPIN-2022-05092
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual
MIM: Elucidating the Rules of Cooperation and Resiliency in Microbial Communities through Stochastic Graph Grammars
- Grant number: 2125965
- Fiscal year: 2021
- Funding amount: --
- Project category: Standard Grant
MIM: Elucidating the Rules of Cooperation and Resiliency in Microbial Communities through Stochastic Graph Grammars
- Grant number: 2126387
- Fiscal year: 2021
- Funding amount: --
- Project category: Standard Grant
CRII: III: Toward the Compression of Pangenomic DNA Sequence Data Using Context-Free Grammars
- Grant number: 2105391
- Fiscal year: 2021
- Funding amount: --
- Project category: Standard Grant
Vulnerable native grammars: the effects of limited input in native language attrition
- Grant number: AH/T005157/1
- Fiscal year: 2020
- Funding amount: --
- Project category: Research Grant
Integrating prosodic structure into computational grammars
- Grant number: 447093200
- Fiscal year: 2020
- Funding amount: --
- Project category: WBP Fellowship
Natural Language Acquisition for Machines - Reinforcement Learning of Minimalist Grammars
- Grant number: 432615119
- Fiscal year: 2020
- Funding amount: --
- Project category: Research Grants













