Synthetically Accessible Virtual Inventory (SAVI)
Basic Information
- Grant number: 10926263
- Principal investigator:
- Amount: $369,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Ongoing
- Source:
- Keywords: 2019-nCoV; Address; Behavior; Businesses; Chemicals; Chemistry; Clinical; Code; Collaborations; Custom; Data; Databases; Development; Docking; Drug Design; Equipment and supply inventories; Future; Generations; Germany; Goals; Growth; HIV-1; Informatics; Intelligence; International; Internet; Intuition; Knowledge; Language; Link; Logic; Malignant Neoplasms; Methyltransferase; Modeling; Molecular; Nature; Nucleocapsid; Outcome; Paper; Pattern; Peer Review; Pilot Projects; Predictive Cancer Model; Price; Printing; Process; Productivity; Programming Languages; Property; Publications; Publishing; Reaction; Research; Resources; Route; Running; Sampling; Services; Structure; Systems Biology; Technology; Testing; Work; Writing; cancer initiation; computer generated; cost effective; design; drug candidate; drug development; endonuclease; experience; functional group; in silico; knowledgebase; manufacturing run; member; new therapeutic target; novel; novel therapeutics; screening; success; tumor; user-friendly; virtual; web server
Project Abstract
The SAVI Project is based on:
- (a) a set of transforms with rich chemical context annotation, including functional group reactivity data (LHASA, LLC, U.S.; and Lhasa Limited, UK);
- (b) a set of highly annotated building blocks (Sigma-Aldrich, Global Strategic Services);
- (c) the chemoinformatics toolkit CACTVS with custom development (Xemistry GmbH, Germany).
The transforms are a set of more than 1,500 rules described in the CHMTRN/PATRAN language for encoding chemical transformations, with chemical context and quality criteria added, based ultimately on the pioneering work of E. J. Corey. These rules, in contrast to simple SMIRKS transforms, allow/provide:
- Computation of whether a reaction, depending on the overall structural features of the target, will work at all.
- Scoring: if the reaction works, how robust it is, taking into account overall structural features.
- Whether protection of interfering groups is required; these can then be integrated in the final starting material queries to prioritize pre-protected starting materials.
- Proposal of suitable context-dependent reaction conditions.
- Textual warnings in specific circumstances, such as the potential for multiple products, borderline conditions, etc.
Ancillary information to the rules is a set of functional group reactivity data, i.e. a table describing whether any of the standard functional groups in the rule set is unstable under any of the standard conditions.
The building blocks are a set of several hundred thousand compounds available in gram quantities, and with high reliability, from, or through, Sigma-Aldrich. This set has been annotated with pricing information and other business-intelligence-type data useful for this project.
The chemoinformatics toolkit CACTVS has been expanded in various ways, e.g. with the capability to read the CHMTRN/PATRAN transforms.
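
As a rough picture of the functional group reactivity data described above, the table can be thought of as a lookup from functional group to the conditions under which it is unstable. The sketch below is only an illustration in Python: the group names, condition names, table entries, and the function `groups_needing_protection` are all hypothetical and are not taken from the LHASA knowledgebase or from CACTVS.

```python
# Hypothetical, hand-written entries for illustration only; the real
# reactivity table and its group/condition vocabulary are not shown here.
UNSTABLE_UNDER = {
    "aldehyde": {"strong_reducing_agent", "strong_nucleophile"},
    "ester": {"aqueous_strong_base", "strong_reducing_agent"},
    "boc_amine": {"strong_acid"},
    "aryl_chloride": set(),
}


def groups_needing_protection(groups_present, condition):
    """Return, in sorted order, the functional groups that the table marks
    as unstable under the given reaction condition."""
    return sorted(
        g for g in set(groups_present)
        if condition in UNSTABLE_UNDER.get(g, set())
    )


# Bystander groups on a hypothetical starting material.
bystanders = ["ester", "boc_amine", "aryl_chloride"]
print(groups_needing_protection(bystanders, "strong_acid"))
print(groups_needing_protection(bystanders, "aqueous_strong_base"))
```

A non-empty result is the kind of signal that could lead a rule to request protection or to prefer pre-protected starting materials; groups absent from the table are simply treated as stable in this sketch.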
An important feature that needed to be implemented was the handling of the reversal of the original LHASA transform direction, without rewriting rules, for the strictly forward-synthetic SAVI project. Another important capability was the initial and final starting material (SM) query handling, which comprises four steps: initial SM query extraction from the 2D patterns in the rules; forward reaction from the 2D patterns; scoring (which is the only original LHASA functionality); and final SM query expansion (R-groups, protecting groups, etc.).
To filter out structures with less-than-desirable attributes in the drug development context, several additional computed properties regarded as important in current drug design have been implemented, such as the demerit scores based on 275 rules for identifying potentially reactive or promiscuous compounds, published by Bruns and Watson (J. Med. Chem. 2012, 55, 9763-9772; dx.doi.org/10.1021/jm301008n).
In the current, very early alpha stage of this project, only 11 of the possible 1,500 transforms were used, applied to approx. 230,000 building blocks, in one-step reactions only. The 610,000 resulting products have been annotated but not yet filtered with any of the computed or associated molecular properties. To limit the file size, only on the order of one percent of the theoretically possible products (of one-step reactions) was sampled. We have addressed the task of generating schematic graphical representations of the transforms.
We are ultimately aiming at creating a database of one billion high-quality screening samples that should be easily and cheaply synthesizable. Our first full production run, using 14 transforms and about 377,000 building blocks, has resulted in more than 236 million products. These novel molecules are all annotated with a proposed simple and high-yield synthetic route, as well as with 50 molecular properties generally recognized as important in cutting-edge drug design that we have implemented.
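
The demerit-based filtering mentioned above can be pictured as summing a penalty for every structural rule a product matches and rejecting products whose total reaches a cutoff. The following Python sketch assumes that substructure matching has already been done elsewhere; the rule names, demerit weights, the cutoff of 100, and the helper names are hypothetical placeholders, not the published rule set.

```python
# Hypothetical rule names and weights; the published set has 275 rules
# and is not reproduced here.
DEMERITS = {
    "michael_acceptor": 50,
    "alkyl_halide": 100,
    "long_alkyl_chain": 20,
    "nitro": 30,
}
CUTOFF = 100  # assumed rejection threshold for this sketch


def total_demerits(matched_rules):
    """Sum the demerits of every rule a product matched (each counted once)."""
    return sum(DEMERITS[name] for name in set(matched_rules))


def keep_product(matched_rules, cutoff=CUTOFF):
    """A product is kept only if its demerit total stays below the cutoff."""
    return total_demerits(matched_rules) < cutoff


# Products are represented only by the rules they matched (made-up data).
products = {
    "P1": {"nitro"},
    "P2": {"michael_acceptor", "nitro", "long_alkyl_chain"},
    "P3": set(),
}
kept = sorted(pid for pid, hits in products.items() if keep_product(hits))
print(kept)
```

Storing the total as an annotation rather than discarding products outright would match the text's description of products that are annotated first and filtered later.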
We are developing an approach intended primarily to allow the rapid determination of the targeted chunks of the current, or a future, much larger, SAVI database that are optimal for finding active molecules for a given target ("a SAVI a la carte"). This technology is called SLICE (Smarts and Logic In ChEmistry). SLICE is designed to (a) be a simple, powerful and open language that allows chemists to encode chemistry knowledge with no/low code and complete SMARTS by reasoning directly on a molecule, with a UI that should permit one to enter and test a new transform quickly; (b) be integrated in a newly developed no/low-code platform that allows users to graphically encode chemistry knowledge without experience with programming languages; (c) be fast in the execution of SLICE-enabled transforms on building blocks to generate products; and (d) allow for reactant-based filtering of the possible SAVI space to find the right "a la carte" SAVI menu for the given target.
An intuitive, user-friendly web GUI has been launched and is currently being used by team members for writing transforms in the new SLICE language. The GUI will allow users free access to this database via searches by various criteria, including substructure searches. It will also present links to pages where users can place requests for having the molecule(s) synthesized by commercial entities.
Additional novel transforms for chemistry heretofore not in the knowledgebase have been written, yielding a total of over 70 productive and drafted transforms. After a change in the business model of Sigma-Aldrich, we decided to change the set of building blocks to Enamine, from whom we got about 151,000 possible structures. With 143,000 of those matching the 53 productive transforms used, we finished calculating 1.75 billion SAVI products in early 2020. We have made them publicly available for download on the CADD Group's web server at https://doi.org/10.35115/37N9-5738.
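
The idea of reactant-based filtering in point (d) can be illustrated with a toy enumeration: if building blocks are screened before they are combined, only the relevant slice of product space is ever generated. The Python sketch below is not SLICE syntax; the building blocks, the reactant roles, the molecular-weight criterion, and the function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical building blocks: (id, reactant role, molecular weight).
BLOCKS = [
    ("BB1", "amine", 101.2),
    ("BB2", "amine", 243.3),
    ("BB3", "acid", 122.1),
    ("BB4", "acid", 310.4),
    ("BB5", "acid", 88.1),
]


def by_role(blocks, role, keep=lambda block: True):
    """Select the building blocks that play a role and pass a reactant filter."""
    return [b for b in blocks if b[1] == role and keep(b)]


def enumerate_pairs(amines, acids):
    """Pair every amine with every acid for a two-component transform."""
    return [(a[0], c[0]) for a, c in product(amines, acids)]


# Full space: every amine with every acid.
full = enumerate_pairs(by_role(BLOCKS, "amine"), by_role(BLOCKS, "acid"))


# "A la carte" slice: filter the reactants first (assumed criterion).
def is_light(block):
    return block[2] < 200.0


sliced = enumerate_pairs(
    by_role(BLOCKS, "amine", is_light),
    by_role(BLOCKS, "acid", is_light),
)
print(len(full), len(sliced))
```

Because the filter runs on a few building blocks rather than on every product, its cost grows with the number of reactants while the saving grows with the number of products.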
A publication about the SAVI project is available as a preprint at https://doi.org/10.26434/chemrxiv.12185559.v1, with the peer-reviewed paper published in Nature Scientific Data. CR colleagues are using the SAVI database in docking screens against the following SARS-CoV-2 targets: NSP7, NSP8, NSP9, NSP10, NSP15 (endonuclease), NSP16 (methyltransferase), Spike RBD, and Nucleocapsid. Syntheses of SAVI molecules have been extraordinarily successful, with a 97% success rate using the SAVI-predicted route. Out of about 170 synthesized molecules tested against cancer, HIV-1, and SARS-CoV-2 targets, tens have shown activity. New transforms have been written,
Project Outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods.
- DOI: 10.1021/acs.jcim.0c01164
- Published: 2021-02-22
- Journal:
- Impact factor: 5.6
- Authors: Jain S; Siramshetty VB; Alves VM; Muratov EN; Kleinstreuer N; Tropsha A; Nicklaus MC; Simeonov A; Zakharov AV
- Corresponding author: Zakharov AV
ReactionCode: format for reaction searching, analysis, classification, transform, and encoding/decoding.
- DOI: 10.1186/s13321-020-00476-x
- Published: 2020-12-03
- Journal:
- Impact factor: 8.6
- Authors: Delannée V; Nicklaus MC
- Corresponding author: Nicklaus MC
[Discovering new antiretroviral compounds in "Big Data" chemical space of the SAVI library].
- DOI: 10.18097/pbmc20196502073
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Savosina PI; Stolbov LA; Druzhilovskiy DS; Filimonov DA; Nicklaus MC; Poroikov VV
- Corresponding author: Poroikov VV
Special Issue on Reaction Informatics and Chemical Space.
- DOI: 10.1021/acs.jcim.2c00390
- Published: 2022
- Journal:
- Impact factor: 5.6
- Authors: Rarey, Matthias; Nicklaus, Marc C; Warr, Wendy
- Corresponding author: Warr, Wendy
Other grants by MARC NICKLAUS
HIV Integrase Modeling and Computer-Aided Inhibitor Deve
- Grant number: 7291875
- Fiscal year:
- Funding amount: $369,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor Development
- Grant number: 7965392
- Fiscal year:
- Funding amount: $369,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor and Microbicide Development
- Grant number: 10702372
- Fiscal year:
- Funding amount: $369,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor Development
- Grant number: 7733068
- Fiscal year:
- Funding amount: $369,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10926595
- Fiscal year:
- Funding amount: $369,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10703018
- Fiscal year:
- Funding amount: $369,400
- Project category:
Similar Overseas Grants
Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
- Grant number: MR/S03398X/2
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
- Grant number: EP/Y001486/1
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
- Grant number: 2338423
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
- Grant number: MR/X03657X/1
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
- Grant number: 2348066
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
- Grant number: AH/Z505481/1
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10107647
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
- Grant number: 2341402
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10106221
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
- Grant number: AH/Z505341/1
- Fiscal year: 2024
- Funding amount: $369,400
- Project category: Research Grant