Centralized assay datasets for modelling support of small drug discovery organizations

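Basic information

Grant number:
10474479
Principal investigator:
SEAN EKINS
Amount:
$854,700
Host institution:
COLLABORATIONS PHARMACEUTICALS, INC.
Host institution country:
United States
Project category:
Fiscal year:
2017
Funding country:
United States
Project period:
2017-01-01 to 2024-07-31
Project status:
Completed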

Source:
https://reporter.nih.gov/project-details/10474479
Keywords:
Acetylcholinesterase Acetylcholinesterase Inhibitors Algorithms Alzheimer's Disease Artificial Intelligence Back Bayesian Modeling Biological Biological Assay Biological Testing CCR5 gene CXCR4 gene Cells Chemistry Client Collaborations Collection Complement Complex Computer software Consult Data Data Discovery Data Set Data Sources Databases Descriptor Development Disease Disease Pathway Docking Drug Design Drug usage Employment Ensure Event Fee-for-Service Plans Foundations Future Generations Growth HIV HIV Envelope Protein gp120 Human In Vitro Industrialization Industry Standard Integrase Lead Legal patent Literature Machine Learning Manuals Marketing Measurable Modeling Molecular National Institute of Allergy and Infectious Disease National Institute of General Medical Sciences Organism Outcome Output Paper Pathway interactions Peptide Hydrolases Pharmaceutical Preparations Pharmacologic Substance Phase Phenotype Population Privatization Process Production Property PubChem Public Domains Publications Publishing RNA-Directed DNA Polymerase Rare Diseases Research Sales Service delivery model Structure Structure-Activity Relationship Technology Testing Toxic effect Toxicology Trademark United States National Institutes of Health Validation Virus Visualization software Work adverse outcome analog base commercialization consumer product data curation data visualization design diverse data drug discovery improved in vivo inhibitor interest machine learning algorithm machine learning model model building neglect novel outcome prediction pre-clinical prospective prospective test prototype public database screening software development technology development tool

Project abstract

Collaborations Pharmaceuticals, Inc. was formed after identifying a need for software to help academics and smaller companies curate their data and discover new hits or optimize leads. Over the past two years, the continued importance of artificial intelligence (AI) has been apparent from the explosive growth in the number of AI companies and the increasing number of multi-million dollar deals with pharma that use machine learning (ML) to assist drug discovery. These companies focus heavily on the modeling aspect of drug discovery, but there remains an unmet need, and a bottleneck, in curating quality in vitro and in vivo ADME/Tox data for ML, as well as in prospective testing to validate the technologies.
In Phase I, we developed a prototype of the Assay Central® software and used it with a wide variety of structure-activity data from public and private sources, formatted and unformatted, with ~14 collaborators working on neglected, rare, or common disease targets, as well as for our internal drug discovery projects. In Phase I we also created error-checking and correction software, built and validated Bayesian models with the collected and cleaned datasets, and developed new data visualization tools. The software can be used to create selections of these models for sharing with collaborators as needed, to score new molecules, and to visualize the multiple outputs in various formats.
In Phase II, we developed Assay Central® into a production tool that is easy to deploy, is built on industry-standard technologies, and provides graphical display of models and information on model applicability. Importantly, we identified that customers wanted us to provide them with the results. We developed our fee-for-service consulting model using Assay Central® to solve their problems, and this has expanded our revenues annually.
In Phase II we also evaluated additional ML algorithms and molecular descriptors with manually curated datasets, and compared algorithms across more than 5000 auto-curated datasets from ChEMBL. This illustrated the utility of access to multiple algorithms and showed that the Bayesian algorithm was generally comparable to the other ML algorithms. It also motivated us to develop new software to integrate these algorithms. We have also explored finding rare disease datasets and applying our data curation and ML approach to them. With these and additional collaborations, as well as internal projects on Alzheimer’s disease (through an NIH NIGMS supplement), we have been able to repurpose already approved drugs for several targets for this and other diseases. For multiple projects we have performed several rounds of model building and fed data back into the models to enable improved predictions. Finally, we have developed prototype tools that enable automated molecule design, assessment of synthesizability, and retrosynthetic analysis. These combined efforts dramatically increased the number of projects we were able to work on (and ultimately publish to raise our visibility), created new spin-off products as collections of models (MegaTrans®, MegaTox® and MegaPredict®), produced molecule-related IP, and generated employment.
In Phase IIB we now propose to focus on steps to aid commercialization and further development of these technologies. We have identified that developing auto-curation software for dealing with complex biological data in unstructured databases will be a competitive advantage. We have also recognized that for many diseases we can have a complete or near-complete collection of targets, which may enable us to understand from structure alone how a molecule may interfere with biological pathways; this can be applied to complex diseases and “adverse outcome pathways” in toxicology.
We also propose integrating state-of-the-art multi-objective generative models for molecule design into our Assay Central® computational software, to complement the analog generation and retrosynthesis tools created in Phase II and to aid molecule optimization. We will validate this capability using some of the hit molecules identified in Phase II for different targets, including human acetylcholinesterase. Assay Central® would then have a full suite of integrated capabilities, from data curation through to molecule design and retrosynthetic analysis, which will enable us to attract larger deals with companies.
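The abstract refers to building Bayesian models on curated structure-activity data and using them to score new molecules, but it gives no implementation details. As a rough illustration only, and not a description of how Assay Central itself works, the following Python sketch fits a Bernoulli naive Bayes classifier to binary fingerprint bits and scores a new molecule. The fingerprints, labels, and function names are hypothetical toy values chosen for this example.

```python
import math


def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on equal-length 0/1 fingerprints.

    Returns, per class, the log prior and the Laplace-smoothed
    probability that each fingerprint bit is set.
    """
    n_bits = len(fingerprints[0])
    model = {}
    for cls in sorted(set(labels)):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Smoothed probability that each bit is on within this class.
        p_on = [
            (sum(r[i] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for i in range(n_bits)
        ]
        model[cls] = (math.log(len(rows) / len(labels)), p_on)
    return model


def score(model, fp):
    """Return the posterior probability of each class for one fingerprint."""
    log_post = {}
    for cls, (log_prior, p_on) in model.items():
        ll = log_prior
        for bit, p in zip(fp, p_on):
            ll += math.log(p if bit else 1.0 - p)
        log_post[cls] = ll
    # Normalize in log space (log-sum-exp) for numerical stability.
    m = max(log_post.values())
    total = sum(math.exp(v - m) for v in log_post.values())
    return {cls: math.exp(v - m) / total for cls, v in log_post.items()}


# Hypothetical toy data: 4-bit fingerprints, label 1 = active, 0 = inactive.
train_fps = [
    [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1],
    [0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 1],
]
train_labels = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(train_fps, train_labels)
probs = score(model, [1, 1, 0, 0])
print(probs)
```

In practice the fingerprints would come from a cheminformatics toolkit and the model would be validated on held-out data; both steps are omitted here to keep the sketch self-contained.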

Project summary

Project outcomes

Journal articles (43)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Multiple Machine Learning Comparisons of HIV Cell-based and Reverse Transcriptase Data Sets.

DOI:
10.1021/acs.molpharmaceut.8b01297
Publication date:
2019-04-01
Journal:
Molecular Pharmaceutics
Impact factor:
4.9
Authors:
Zorn KM; Lane TR; Russo DP; Clark AM; Makarov V; Ekins S
Corresponding author:
Ekins S

Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction.

DOI:
10.1021/acs.molpharmaceut.8b00546
Publication date:
2018-10-01
Journal:
Molecular Pharmaceutics
Impact factor:
4.9
Authors:
Russo DP; Zorn KM; Clark AM; Zhu H; Ekins S
Corresponding author:
Ekins S

Validation of Acetylcholinesterase Inhibition Machine Learning Models for Multiple Species.

DOI:
10.1021/acs.chemrestox.2c00283
Publication date:
2023-02-20
Journal:
Chemical Research in Toxicology
Impact factor:
4.1
Authors:
Vignaux, Patricia A.; Lane, Thomas R.; Urbina, Fabio; Gerlach, Jacob; Puhl, Ana C.; Snyder, Scott H.; Ekins, Sean
Corresponding author:
Ekins, Sean

Using Bibliometric Analysis and Machine Learning to Identify Compounds Binding to Sialidase-1.

DOI:
10.1021/acsomega.0c05591
Publication date:
2021-02-02
Journal:
ACS Omega
Impact factor:
4.1
Authors:
Klein JJ; Baker NC; Foil DH; Zorn KM; Urbina F; Puhl AC; Ekins S
Corresponding author:
Ekins S

Development of Machine Learning Models and the Discovery of a New Antiviral Compound against Yellow Fever Virus.

DOI:
10.1021/acs.jcim.1c00460
Publication date:
2021-08-23
Journal:
Journal of Chemical Information and Modeling
Impact factor:
5.6
Authors:
Gawriljuk VO; Foil DH; Puhl AC; Zorn KM; Lane TR; Riabova O; Makarov V; Godoy AS; Oliva G; Ekins S
Corresponding author:
Ekins S