DATA SCIENCE RESEARCH
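Basic information
- Grant number: 8935856
- Principal investigator:
- Amount: $1.8746 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Ongoing
- Source: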
- Keywords: Address; Adoption; Algorithms; Astronomy; Big Data; Biological; Biological Phenomena; Biology; Communities; Consensus; Data; Data Analyses; Data Set; Data Sources; Development; Engineering; Fostering; Foundations; Funding Agency; Generations; Genes; Genomics; Graph; Growth; Informatics; Knowledge; Machine Learning; Physics; Proteomics; Research; Schedule; Science; Skeleton; System; Techniques; Update; User-Computer Interface; built environment; data integration; data mining; genome-wide; high throughput technology; meetings; next generation sequencing; operation; tool
Project abstract
DATA SCIENCE RESEARCH
BACKGROUND AND SIGNIFICANCE
Biology in the 21st century has emerged as a "big data" science on par with physics or astronomy. Beginning with the landmark sequencing projects over a decade ago [1, 2], there have been successive waves of technological breakthroughs in probing cellular information on a genome-wide scale: microarrays [3], next generation sequencing [4], large-scale proteomics [5] and their many derivatives [6, 7]. Quick and widespread adoption of high throughput technologies has created massive amounts of data, yet there is a consensus that the floodgates have only barely opened [8]. The explosive growth of data volume has fostered intense research in the development of informatics tools to store, manage and analyze such data [9]. However, the scale and efficiency of the analysis is lagging behind the generation of data, a fact recognized by the major national funding agencies, with the result that the true potential of the data to accelerate biological discovery is not being realized.
Analysis of biological data today is hampered by two major bottlenecks: (1) Integration: Different biotechnological tools record different kinds of cellular activities that provide complementary views of the same underlying biological phenomena. However, it has proved extremely difficult to integrate those partial descriptions into a well-organized whole, even though the advantages of such an integrative analysis of diverse data types are well recognized [10]. (2) Scalability: The challenge of data integration is generally met with the most heavy-duty machine learning techniques of the day [10], which typically do not scale well with data size. Biology needs analysis tools that can handle the data deluge of its modern "omics" era.
We propose to develop an E-science framework that will address the issues of integrative analysis and scalability associated with big data analysis in biology. We will build this environment from the ground up, laying its algorithmic foundations, engineering the scalable systems that form its skeleton frame, and creating the human-computer interface that makes it hospitable.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Saurabh Sinha
Quantitative regulatory genomics: networks, cis-regulatory codes, and phenotypic variation
- Grant number: 10021007
- Fiscal year: 2019
- Funding amount: $1.8746 million
- Project category:
Quantitative regulatory genomics: networks, cis-regulatory codes, and phenotypic variation
- Grant number: 10267176
- Fiscal year: 2019
- Funding amount: $1.8746 million
- Project category:
Quantitative Modeling of Sequence-to-Expression Relationship
- Grant number: 8864340
- Fiscal year: 2015
- Funding amount: $1.8746 million
- Project category:
Similar overseas grants
WELL-CALF: optimising accuracy for commercial adoption
- Grant number: 10093543
- Fiscal year: 2024
- Funding amount: $1.8746 million
- Project category: Collaborative R&D
Investigating the Adoption, Actual Usage, and Outcomes of Enterprise Collaboration Systems in Remote Work Settings.
- Grant number: 24K16436
- Fiscal year: 2024
- Funding amount: $1.8746 million
- Project category: Grant-in-Aid for Early-Career Scientists
Unraveling the Dynamics of International Accounting: Exploring the Impact of IFRS Adoption on Firms' Financial Reporting and Business Strategies
- Grant number: 24K16488
- Fiscal year: 2024
- Funding amount: $1.8746 million
- Project category: Grant-in-Aid for Early-Career Scientists
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10107647
- Fiscal year: 2024
- Funding amount: $1.8746 million
- Project category: EU-Funded
Assessing the Coordination of Electric Vehicle Adoption on Urban Energy Transition: A Geospatial Machine Learning Framework
- Grant number: 24K20973
- Fiscal year: 2024
- Funding amount: $1.8746 million
- Project category: Grant-in-Aid for Early-Career Scientists
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10106221
- Fiscal year: 2024
- Funding amount: $1.8746 million
- Project category: EU-Funded
Our focus for this project is accelerating the development and adoption of resource efficient solutions like fashion rental through technological advancement, addressing longer in use and reuse
- Grant number: 10075502
- Fiscal year: 2023
- Funding amount: $1.8746 million
- Project category: Grant for R&D
Engage2innovate – Enhancing security solution design, adoption and impact through effective engagement and social innovation (E2i)
- Grant number: 10089082
- Fiscal year: 2023
- Funding amount: $1.8746 million
- Project category: EU-Funded
De-Adoption Beta-Blockers in patients with stable ischemic heart disease without REduced LV ejection fraction, ongoing Ischemia, or Arrhythmias: a randomized Trial with blinded Endpoints (ABbreviate)
- Grant number: 481560
- Fiscal year: 2023
- Funding amount: $1.8746 million
- Project category: Operating Grants
Collaborative Research: SCIPE: CyberInfrastructure Professionals InnoVating and brOadening the adoption of advanced Technologies (CI PIVOT)
- Grant number: 2321091
- Fiscal year: 2023
- Funding amount: $1.8746 million
- Project category: Standard Grant