A massive study of data science to address the scientific reproducibility crisis

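Basic information

  • Grant number:
    9244046
  • Principal investigator:
  • Amount:
    $364,500
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Project period:
    2016-04-01 to 2020-03-31
  • Project status:
    Completed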

Project abstract

DESCRIPTION (provided by applicant): There is a crisis of reproducibility and replicability of scientific results. This crisis is an increasing source of concern in both the scientific and popular press. The crisis is so acute that the United States Congress is currently investigating reproducibility of the scientific process. At the heart of the crisis is a shortage of data analytic skill throughout the scientific enterprise. There is an emerging consensus that the best way to address the crisis is to increase data analytic training, particularly around reproducibility and replicability. In this application we (1) propose the first formal statistical model for reproducibility and replicability and then use data and experiments from the largest massive online open program in data science in the world to (2) perform randomized studies to improve our knowledge about which statistical methods and protocols lead to increased reproducibility and replicability in the hands of average users and (3) analyze learner, course, and content characteristics that increase learner success and throughput to increase the number of trained data analysts worldwide. To accomplish goals (2) and (3) we will use the largest and highest throughput data science program in the world: the Johns Hopkins Data Science Specialization. This specialization, developed by the investigators of this project, consists of nine courses that are offered every month. Since the launch of this program in April 2014, these classes have seen more than two million enrollments and nearly all their experiences have been recorded as data. Furthermore, the MOOC platform for this series permits random assignment of quiz questions and content. We will disseminate our results through open source software, analysis protocols, our popular blog, and the Data Science Specialization to maximally improve data science training and reduce the scientific replication and reproducibility problem.
The size of this program means that by increasing the quality of the program and the number of completing students by even a small percentage we can affect global data analytic behavior.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Jeffrey T. Leek

Tackling the widespread and critical impact of batch effects in high-throughput data
  • DOI:
    10.1038/nrg2825
  • Publication date:
    2010-09-14
  • Journal:
  • Impact factor:
    52.000
  • Authors:
    Jeffrey T. Leek;Robert B. Scharpf;Héctor Corrada Bravo;David Simcha;Benjamin Langmead;W. Evan Johnson;Donald Geman;Keith Baggerly;Rafael A. Irizarry
  • Corresponding author:
    Rafael A. Irizarry
Transparency and reproducibility in artificial intelligence
  • DOI:
    10.1038/s41586-020-2766-y
  • Publication date:
    2020-10-14
  • Journal:
  • Impact factor:
    48.500
  • Authors:
    Benjamin Haibe-Kains;George Alexandru Adam;Ahmed Hosny;Farnoosh Khodakarami;Levi Waldron;Bo Wang;Chris McIntosh;Anna Goldenberg;Anshul Kundaje;Casey S. Greene;Tamara Broderick;Michael M. Hoffman;Jeffrey T. Leek;Keegan Korthauer;Wolfgang Huber;Alvis Brazma;Joelle Pineau;Robert Tibshirani;Trevor Hastie;John P. A. Ioannidis;John Quackenbush;Hugo J. W. L. Aerts
  • Corresponding author:
    Hugo J. W. L. Aerts
Erratum to: Practical impacts of genomic data “cleaning” on biological discovery using surrogate variable analysis
  • DOI:
    10.1186/s12859-016-1152-0
  • Publication date:
    2016-08-10
  • Journal:
  • Impact factor:
    3.300
  • Authors:
    Andrew E. Jaffe;Thomas Hyde;Joel Kleinman;Daniel R. Weinberger;Joshua G. Chenoweth;Ronald D. McKay;Jeffrey T. Leek;Carlo Colantuoni
  • Corresponding author:
    Carlo Colantuoni

Other grants by Jeffrey T. Leek

Data analysis tools for leveraging massive public data to improve hypothesis-driven research
  • Grant number:
    10598130
  • Fiscal year:
    2022
  • Funding amount:
    $364,500
  • Project category:
Data analysis tools for leveraging massive public data to improve hypothesis-driven research
  • Grant number:
    10330636
  • Fiscal year:
    2022
  • Funding amount:
    $364,500
  • Project category:
Data analysis tools for leveraging massive public data to improve hypothesis-driven research
  • Grant number:
    10654376
  • Fiscal year:
    2022
  • Funding amount:
    $364,500
  • Project category:
A massive study of data science to address the scientific reproducibility crisis
  • Grant number:
    9100338
  • Fiscal year:
    2016
  • Funding amount:
    $364,500
  • Project category:
Statistical models for biological and technical variation in RNA sequencing
  • Grant number:
    8593469
  • Fiscal year:
    2013
  • Funding amount:
    $364,500
  • Project category:
Statistical models for biological and technical variation in RNA sequencing
  • Grant number:
    9264553
  • Fiscal year:
    2013
  • Funding amount:
    $364,500
  • Project category:
Statistical models for biological and technical variation in RNA sequencing
  • Grant number:
    8722575
  • Fiscal year:
    2013
  • Funding amount:
    $364,500
  • Project category:
Core B
  • Grant number:
    9978143
  • Fiscal year:
    2011
  • Funding amount:
    $364,500
  • Project category:
Core B
  • Grant number:
    9304366
  • Fiscal year:
  • Funding amount:
    $364,500
  • Project category:
Core B
  • Grant number:
    9759993
  • Fiscal year:
  • Funding amount:
    $364,500
  • Project category:

Similar overseas grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number:
    MR/S03398X/2
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Fellowship
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number:
    2338423
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Continuing Grant
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number:
    EP/Y001486/1
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Research Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number:
    MR/X03657X/1
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number:
    2348066
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number:
    AH/Z505481/1
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10107647
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number:
    2341402
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10106221
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number:
    AH/Z505341/1
  • Fiscal year:
    2024
  • Funding amount:
    $364,500
  • Project category:
    Research Grant