IAA-NCEA/EPA-0001; “Chemoinformatics of EPA’s EDSP Universe of Chemicals”
Basic information
- Grant number: 10379865
- Principal investigator:
- Amount: $175,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-03-24 to 2022-03-23
- Project status: Completed
- Source:
- Keywords: Artificial Intelligence; Biological Assay; Chemicals; Client; Collaborations; Consumption; Contractor; Custom; Data; Databases; Descriptor; Exclusion; Exercise; Feedback; Fingerprint; Growth and Development function; Life; Literature; Manuals; Measurement; Methods; Names; National Institute of Environmental Health Sciences; Physical Chemistry; Plants; Process; Pythons; Review Literature; Screening procedure; Software Tools; Structure; Supervision; Testing; Time; Toxic effect; base; dashboard; data cleaning; data curation; data harmonization; design; environmental chemical; experimental study; knowledge base; machine learning method
Project abstract
Chemoinformatic support was provided for two IAA/USEPA projects with DNTP: the first with EPA's ECOTOX database and the second with EPA's NCEA (National Center for Environmental Assessment).
(1) The ECOTOXicology Knowledgebase (ECOTOX) is a comprehensive, publicly available knowledgebase providing single-chemical environmental toxicity data on aquatic life, terrestrial plants, and wildlife. EPA has manually curated ~45,000 references in the ECOTOX database over the last thirty years of database growth and development, and the current data curation process consists of several manual, time-consuming steps. To streamline this labor-intensive process, Sciome evaluated the feasibility of using machine learning methods to automatically classify documents according to relevance and to identify the exclusion rationale for references that are excluded. The process could be greatly shortened and made more consistent by the use of AI (artificial intelligence) software tools, and work was initiated to develop an ECOTOX lexicon and a strategy for automated curation.
(2) For the second project, with NCEA, the EDSP Universe of Chemicals was analyzed with a view to building a database portal. An initial clustering exercise was completed on ~700 compounds that were fully and unambiguously curated. Clustering was then extended to the union of compounds (~14K) obtained by searching the CompTox Dashboard for the CAS numbers and names provided in the file shared by EPA. Each clustering experiment involved complete data cleaning and harmonization and structure quantification by several fingerprints and descriptors, followed by an investigation of several distance measures and clustering methods. Some class-specific clustering experiments were also pursued, using methods to identify a parsimonious set of clusters.
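The abstract does not say which fingerprints, distance measures, or clustering methods were used. As a rough illustration of the general pattern only (fingerprint, then pairwise distance, then grouping), the sketch below assumes binary fingerprints stored as sets of "on" bit indices, uses the Tanimoto distance, and groups compounds by single linkage at a fixed cutoff. The compound names, bit indices, and cutoff are hypothetical.

```python
from itertools import combinations


def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """One minus (intersection size / union size) of two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def cluster_by_cutoff(fingerprints: dict, cutoff: float) -> list:
    """Single-linkage grouping: any pair within `cutoff` ends up in one cluster."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if tanimoto_distance(fingerprints[a], fingerprints[b]) <= cutoff:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in fingerprints:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy fingerprints; the bit indices are made up, not real chemistry.
toy = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 3, 5}),
    "cmpd_C": frozenset({10, 11, 12}),
    "cmpd_D": frozenset({10, 11, 13}),
    "cmpd_E": frozenset({20, 21}),
}
clusters = cluster_by_cutoff(toy, cutoff=0.6)
print(clusters)
```

With these toy inputs, A and B share three of five bits and C and D share two of four, so each pair falls within the cutoff, while E stays alone. In practice a chemoinformatics toolkit would generate the fingerprints, and the cutoff and linkage rule are choices that have to be tuned per data set.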
After weighing the pros and cons of several available platforms for developing the EDSP Portal, EPA and Sciome agreed to develop it on the Python/Django platform. The Portal would initially house some physical-chemistry data and Tier-1 bioassay data for the compounds that have so far been unambiguously curated. Its functionality could then be enhanced continually through a regular show-and-feedback cycle with the clients.
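For the ECOTOX reference-screening task in part (1) of the abstract, the machine learning methods that were evaluated are not named. The following is a minimal sketch of what relevance classification with an exclusion reason could look like, assuming a multinomial naive Bayes model over title words as a stand-in technique. The labels, training titles, and class name `NaiveBayes` are hypothetical and are not taken from ECOTOX.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training titles; each label is either "relevant" or an exclusion reason.
train = [
    ("Acute toxicity of copper to rainbow trout", "relevant"),
    ("Chronic effects of cadmium on Daphnia reproduction", "relevant"),
    ("A review of pesticide regulation and policy", "excluded:review"),
    ("Literature review of monitoring methods", "excluded:review"),
    ("Occupational exposure of workers to solvents", "excluded:human_health"),
    ("Clinical case report of patient poisoning", "excluded:human_health"),
]
model = NaiveBayes().fit([t for t, _ in train], [label for _, label in train])
print(model.predict("Toxicity of zinc to trout"))
print(model.predict("Review of sampling policy"))
```

Folding the exclusion reason into the class label lets one model answer both questions (relevant or not, and why not). A real screening workflow would need far more training data, held-out evaluation, and probably a stronger text representation than raw title words.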
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by RUCHIR SHAH
SCIOME - SPECIALIZED TECHNICAL AND SCIENTIFIC SERVICE
- Grant number: 10497910
- Fiscal year: 2021
- Funding amount: $175,000
- Project category:
Support for New Bioinformatics Methods Development
- Grant number: 10281443
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
IGF::OT::IGF BIOINFORMATICS SUPPORT FOR THE NIEHS IN DIR & DNTP
- Grant number: 10379864
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Bioinformatics Support for Toxicogenomics Technologies
- Grant number: 10596964
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Support for New Bioinformatics Methods Development
- Grant number: 10596966
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Bioinformatics Support for High Throughput Transcriptomics
- Grant number: 10379867
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Bioinformatics Support for High Throughput Transcriptomics
- Grant number: 10596965
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
IGF::OT::IGF BIOINFORMATICS SUPPORT FOR THE NIEHS IN DIR & DNTP
- Grant number: 9915697
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Bioinformatics Support for Toxicogenomics Technologies
- Grant number: 10379866
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Similar overseas grants
Establishment of a new biological assay using Hydra nematocyst deployment
- Grant number: 520728-2017
- Fiscal year: 2017
- Funding amount: $175,000
- Project category: University Undergraduate Student Research Awards
POINT-OF-CARE BIOLOGICAL ASSAY FOR DETERMINING TISSUE-SPECIFIC ABSORBED IONIZING RADIATION DOSE (BIODOSIMETER) AFTER RADIOLOGICAL AND NUCLEAR EVENTS.
- Grant number: 10368760
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
POINT-OF-CARE BIOLOGICAL ASSAY FOR DETERMINING TISSUE-SPECIFIC ABSORBED IONIZING RADIATION DOSE (BIODOSIMETER) AFTER RADIOLOGICAL AND NUCLEAR EVENTS.
- Grant number: 10669539
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
POINT-OF-CARE BIOLOGICAL ASSAY FOR DETERMINING TISSUE-SPECIFIC ABSORBED IONIZING RADIATION DOSE (BIODOSIMETER) AFTER RADIOLOGICAL AND NUCLEAR EVENTS.
- Grant number: 9570142
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
POINT-OF-CARE BIOLOGICAL ASSAY FOR DETERMINING TISSUE-SPECIFIC ABSORBED IONIZING RADIATION DOSE (BIODOSIMETER) AFTER RADIOLOGICAL AND NUCLEAR EVENTS.
- Grant number: 9915803
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
COVID-19 Supplemental work: POINT-OF-CARE BIOLOGICAL ASSAY FOR DETERMINING TISSUE-SPECIFIC ABSORBED IONIZING RADIATION DOSE (BIODOSIMETER).
- Grant number: 10259999
- Fiscal year: 2017
- Funding amount: $175,000
- Project category:
Drug discovery based on a new biological assay system using Yeast knock-out strain collection
- Grant number: 21580130
- Fiscal year: 2009
- Funding amount: $175,000
- Project category: Grant-in-Aid for Scientific Research (C)
Machine learning for automatic gene annotation using high-throughput biological assay data
- Grant number: 300985-2004
- Fiscal year: 2005
- Funding amount: $175,000
- Project category: Postdoctoral Fellowships
Machine learning for automatic gene annotation using high-throughput biological assay data
- Grant number: 300985-2004
- Fiscal year: 2004
- Funding amount: $175,000
- Project category: Postdoctoral Fellowships