Collaborative Research: EnCORE: Institute for Emerging CORE Methods in Data Science
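Basic information
- Award number: 2217069
- Principal investigator:
- Amount: $2.5723 million
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-01 to 2027-08-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract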
The proliferation of data-driven decision making, and its increased popularity, has fueled rapid emergence of data science as a new scientific discipline. Data science is seen as a key enabler of future businesses, technologies, and healthcare that can transform all aspects of socioeconomic lives. Its fast adoption, however, often comes with ad hoc implementation of techniques with suboptimal, and sometimes unfair and potentially harmful, results. The time is ripe to develop principled approaches to lay solid foundations of data science. This is particularly challenging as real-world data is highly complex with intricate structures, unprecedented scale, rapidly evolving characteristics, noise, and implicit biases. Addressing these challenges requires a concerted effort across multiple scientific disciplines such as statistics for robust decision making under uncertainty; mathematics and electrical engineering for enabling data-driven optimization beyond worst case; theoretical computer science and machine learning for new algorithmic paradigms to deal with dynamic and sensitive data in an ethical way; and basic sciences to bring the technical developments to the forefront of health sciences and society. The proposed institute for emerging CORE methods in data science (EnCORE) brings together a diverse team of researchers spanning the aforementioned disciplines from the University of California San Diego, University of Texas Austin, University of Pennsylvania, and the University of California Los Angeles. It presents an ambitious vision to transform the landscape of the four CORE pillars of data science: C for complexities of data, O for optimization, R for responsible learning, and E for education and engagement. Along with its transformative research vision, the institute fosters a bold plan for outreach and broadening participation by engaging students of diverse backgrounds at all levels from K-12 to postdocs and junior faculty. The project aims to impact a wide demography of students by offering collaborative courses across its partner universities and a flexible co-mentorship plan for truly multidisciplinary research. With regular organization of workshops, summer schools, and seminars, the project aims to engage the entire scientific community to become the new nexus of research and education on foundations of data science. To bring the fruit of theoretical development to practice, EnCORE will continuously work with industry partners and domain scientists, and will forge strong connections with other National Science Foundation Harnessing the Data Revolution institutes across the nation.
EnCORE as an institute embodies intellectual merit that has the potential to lead ground-breaking research to shape the foundations of data science in the United States. Its research mission is organized around three themes. The first theme on data complexity addresses the complex characteristics of data such as massive size, huge feature space, rapid changes, variety of sources, implicit dependence structures, arbitrary outliers, and noise. A major overhaul of the core concepts of algorithm design is needed with a holistic view of different computational complexity measures. Faced with noise and outliers, uncertainty estimation is both necessary and, at the same time, difficult due to dynamic and changing data. Data heterogeneity poses major challenges even in basic classification tasks. The structural relationships hidden inside such data are crucial in understanding and processing it, and for downstream data analysis tasks such as in visualization and neuroscience. The second theme of EnCORE aims to transform the classical area of optimization where adaptive methods and human intervention can lead to major advances. It plans to revisit the foundations of distributed optimization to include heterogeneity, robustness, safety, and communication; and address statistical uncertainty due to distributional shift in dynamic data in control and reinforcement learning. The third and final theme of EnCORE proposes to build the foundations of responsible learning. Applications of machine learning in human-facing systems are severely hampered when the learned models are hard for users to understand and reproduce, may give biased outcomes, are easily changeable by an adversary, and reveal sensitive information. Thus, interpretability, reproducibility, fairness, privacy, and robustness must be incorporated in any data-driven decision making. The experience and dedication to mentoring and outreach, collaborative curriculum design, socially aware responsible research program, extensive institute activities, and industrial partnerships would pave the way for a substantial broader impact for EnCORE. Summer schools with year-long mentoring will take place in three states involving a large demography. Joint courses with hybrid and fully online offerings will be developed. Utilizing prior experience of running Thinkabit lab that has impacted over 74,000 K-12 students so far, EnCORE will embark on an ambitious and thoughtful outreach program to improve the representation of under-represented groups and help create a future generation of workforce that is diverse, responsible, and has solid foundations in data science.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Sujay Sanghavi
Stratospheric chlorine activation in the Arctic winters 1995/96–2001/02 derived from GOME OClO measurements
- DOI: 10.1016/j.asr.2003.08.069
- Published: 2004
- Journal:
- Impact factor: 2.6
- Authors: S. Kühl; W. Wilms; S. Beirle; C. Frankenberg; M. Grzegorski; J. Hollwedel; F. Khokhar; Sarit Kraus; U. Platt; Sujay Sanghavi; C. V. Friedeburg; T. Wagner
- Corresponding author: T. Wagner
Geometric Median (GM) Matching for Robust Data Pruning
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Anish Acharya; I. Dhillon; Sujay Sanghavi
- Corresponding author: Sujay Sanghavi
Serving content with unknown demand: the high-dimensional regime
- DOI: 10.1007/s11134-015-9443-0
- Published: 2015-04-12
- Journal:
- Impact factor: 0.700
- Authors: Sharayu Moharir; Javad Ghaderi; Sujay Sanghavi; Sanjay Shakkottai
- Corresponding author: Sanjay Shakkottai
Learning Graphical Models for Hypothesis Testing
- DOI:
- Published: 2007
- Journal:
- Impact factor: 0
- Authors: Sujay Sanghavi; V. Tan; A. Willsky
- Corresponding author: A. Willsky
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
- DOI: 10.48550/arxiv.2402.11639
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Liam Collins; Advait Parulekar; Aryan Mokhtari; Sujay Sanghavi; Sanjay Shakkottai
- Corresponding author: Sanjay Shakkottai
Other grants held by Sujay Sanghavi
HDR TRIPODS: UT Austin Institute on the Foundations of Data Science
- Award number: 1934932
- Fiscal year: 2019
- Funding amount: $2.5723 million
- Award type: Continuing Grant
AF: Medium: Dropping Convexity: New Algorithms, Statistical Guarantees and Scalable Software for Non-convex Matrix Estimation
- Award number: 1564000
- Fiscal year: 2016
- Funding amount: $2.5723 million
- Award type: Continuing Grant
CIF: Medium: Collaborative Research: New Approaches to Robustness in High-Dimensions
- Award number: 1302435
- Fiscal year: 2013
- Funding amount: $2.5723 million
- Award type: Continuing Grant
CAREER: Networks and Statistical Inference: New Connections and Algorithms
- Award number: 0954059
- Fiscal year: 2010
- Funding amount: $2.5723 million
- Award type: Continuing Grant
NetSE: Small: Social Networks in the Real World: From Sensing to Structure Analysis
- Award number: 1017525
- Fiscal year: 2010
- Funding amount: $2.5723 million
- Award type: Standard Grant
NeTS: Medium: Collaborative Research: Shaping, Learning and Optimizing Dynamic Networks
- Award number: 0964391
- Fiscal year: 2010
- Funding amount: $2.5723 million
- Award type: Continuing Grant
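Similar National Natural Science Foundation of China grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Funding amount: CNY 0
- Project type: Provincial or municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Funding amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Funding amount: CNY 240,000
- Project type: Special fund project
Cell Research (细胞研究)
- Award number: 30824808
- Approval year: 2008
- Funding amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Funding amount: CNY 450,000
- Project type: General Program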
Similar overseas grants
Collaborative Research: REU Site: Earth and Planetary Science and Astrophysics REU at the American Museum of Natural History in Collaboration with the City University of New York
- Award number: 2348998
- Fiscal year: 2025
- Funding amount: $2.5723 million
- Award type: Standard Grant
Collaborative Research: REU Site: Earth and Planetary Science and Astrophysics REU at the American Museum of Natural History in Collaboration with the City University of New York
- Award number: 2348999
- Fiscal year: 2025
- Funding amount: $2.5723 million
- Award type: Standard Grant
Collaborative Research: Investigating Southern Ocean Sea Surface Temperatures and Freshening during the Late Pliocene and Pleistocene along the Antarctic Margin
- Award number: 2313120
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Standard Grant
NSF Engines Development Award: Utilizing space research, development and manufacturing to improve the human condition (OH)
- Award number: 2314750
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Cooperative Agreement
Doctoral Dissertation Research: How New Legal Doctrine Shapes Human-Environment Relations
- Award number: 2315219
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Standard Grant
Collaborative Research: Non-Linearity and Feedbacks in the Atmospheric Circulation Response to Increased Carbon Dioxide (CO2)
- Award number: 2335762
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Standard Grant
Collaborative Research: Using Adaptive Lessons to Enhance Motivation, Cognitive Engagement, And Achievement Through Equitable Classroom Preparation
- Award number: 2335802
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Standard Grant
Collaborative Research: Using Adaptive Lessons to Enhance Motivation, Cognitive Engagement, And Achievement Through Equitable Classroom Preparation
- Award number: 2335801
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Standard Grant
Collaborative Research: Holocene biogeochemical evolution of Earth's largest lake system
- Award number: 2336132
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Standard Grant
CyberCorps Scholarship for Service: Building Research-minded Cyber Leaders
- Award number: 2336409
- Fiscal year: 2024
- Funding amount: $2.5723 million
- Award type: Continuing Grant