Collaborative Research: EnCORE: Institute for Emerging CORE Methods in Data Science
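Basic information
- Award number: 2217062
- Principal investigator:
- Amount: $1.8399 million
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Duration: 2022-09-01 to 2027-08-31
- Status: Active (not yet concluded)
- Source:
- Keywords:
Project abstract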
The proliferation of data-driven decision making, and its increased popularity, has fueled the rapid emergence of data science as a new scientific discipline. Data science is seen as a key enabler of future businesses, technologies, and healthcare that can transform all aspects of socioeconomic lives. Its fast adoption, however, often comes with ad hoc implementation of techniques with suboptimal, and sometimes unfair and potentially harmful, results. The time is ripe to develop principled approaches to lay solid foundations of data science. This is particularly challenging as real-world data is highly complex with intricate structures, unprecedented scale, rapidly evolving characteristics, noise, and implicit biases. Addressing these challenges requires a concerted effort across multiple scientific disciplines such as statistics for robust decision making under uncertainty; mathematics and electrical engineering for enabling data-driven optimization beyond worst case; theoretical computer science and machine learning for new algorithmic paradigms to deal with dynamic and sensitive data in an ethical way; and basic sciences to bring the technical developments to the forefront of health sciences and society. The proposed institute for emerging CORE methods in data science (EnCORE) brings together a diverse team of researchers spanning the aforementioned disciplines from the University of California San Diego, University of Texas Austin, University of Pennsylvania, and the University of California Los Angeles. It presents an ambitious vision to transform the landscape of the four CORE pillars of data science: C for complexities of data, O for optimization, R for responsible learning, and E for education and engagement. Along with its transformative research vision, the institute fosters a bold plan for outreach and broadening participation by engaging students of diverse backgrounds at all levels from K-12 to postdocs and junior faculty.
The project aims to impact a wide demography of students by offering collaborative courses across its partner universities and a flexible co-mentorship plan for truly multidisciplinary research. With regular organization of workshops, summer schools, and seminars, the project aims to engage the entire scientific community to become the new nexus of research and education on foundations of data science. To bring the fruit of theoretical development to practice, EnCORE will continuously work with industry partners and domain scientists, and will forge strong connections with other National Science Foundation Harnessing the Data Revolution institutes across the nation.

EnCORE as an institute embodies intellectual merit that has the potential to lead ground-breaking research to shape the foundations of data science in the United States. Its research mission is organized around three themes. The first theme on data complexity addresses the complex characteristics of data such as massive size, huge feature space, rapid changes, variety of sources, implicit dependence structures, arbitrary outliers, and noise. A major overhaul of the core concepts of algorithm design is needed with a holistic view of different computational complexity measures. Faced with noise and outliers, uncertainty estimation is both necessary and, at the same time, difficult because the data is dynamic and changing. Data heterogeneity poses major challenges even in basic classification tasks. The structural relationships hidden inside such data are crucial for understanding and processing it, and for downstream data analysis tasks such as those in visualization and neuroscience. The second theme of EnCORE aims to transform the classical area of optimization where adaptive methods and human intervention can lead to major advances.
It plans to revisit the foundations of distributed optimization to include heterogeneity, robustness, safety, and communication; and address statistical uncertainty due to distributional shift in dynamic data in control and reinforcement learning. The third and final theme of EnCORE proposes to build the foundations of responsible learning. Applications of machine learning in human-facing systems are severely hampered when the learned models are hard for users to understand and reproduce, may give biased outcomes, are easily changeable by an adversary, and reveal sensitive information. Thus, interpretability, reproducibility, fairness, privacy, and robustness must be incorporated in any data-driven decision making.

The experience and dedication to mentoring and outreach, collaborative curriculum design, a socially aware responsible research program, extensive institute activities, and industrial partnerships would pave the way for a substantial broader impact for EnCORE. Summer schools with year-long mentoring will take place in three states involving a large demography. Joint courses with hybrid and fully online offerings will be developed. Drawing on prior experience of running the Thinkabit Lab, which has reached over 74,000 K-12 students so far, EnCORE will embark on an ambitious and thoughtful outreach program to improve the representation of under-represented groups and help create a future generation of workforce that is diverse, responsible, and has solid foundations in data science.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Practical Adversarial Multivalid Conformal Prediction
- DOI: 10.48550/arxiv.2206.01067
- Published: 2022-06
- Journal:
- Impact factor: 0
- Authors: O. Bastani; Varun Gupta; Christopher Jung; Georgy Noarov; Ramya Ramalingam; Aaron Roth
- Corresponding authors: O. Bastani; Varun Gupta; Christopher Jung; Georgy Noarov; Ramya Ramalingam; Aaron Roth
Online Minimax Multiobjective Optimization: Multicalibeating and Other Applications
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Lee, Daniel; Noarov, Georgy; Pai, Mallesh; Roth, Aaron
- Corresponding author: Roth, Aaron
Reconciling Individual Probability Forecasts
- DOI: 10.1145/3593013.3593980
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Roth, Aaron; Tolbert, Alexander; Weinstein, Scott
- Corresponding author: Weinstein, Scott
Multicalibration as Boosting for Regression
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Globus-Harris, Ira; Harrison, Declan; Kearns, Michael; Roth, Aaron; Sorrell, Jessica
- Corresponding author: Sorrell, Jessica
Wealth Dynamics Over Generations: Analysis and Interventions
- DOI: 10.1109/satml54575.2023.00013
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Acharya, Krishna; Arunachaleswaran, Eshwar Ram; Kannan, Sampath; Roth, Aaron; Ziani, Juba
- Corresponding author: Ziani, Juba
Other publications by Hamed Hassani
Length Optimization in Conformal Prediction
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Shayan Kiyani; George J Pappas; Hamed Hassani
- Corresponding author: Hamed Hassani
Neural Collaborative Filtering to Predict Human Contact with Large-Scale GPS data
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Jorge F. Barreras; Bethany Hsiao; Hamed Hassani; Duncan J Watts
- Corresponding author: Duncan J Watts
Non-asymptotic Coded Slotted ALOHA
- DOI: 10.1109/isit.2019.8849696
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Mohammad Fereydounian; Xingran Chen; Hamed Hassani; S. S. Bidokhti
- Corresponding author: S. S. Bidokhti
Learning Q-network for Active Information Acquisition
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Heejin Jeong; Brent Schlotfeldt; Hamed Hassani; M. Morari; Daniel D. Lee; George Pappas
- Corresponding author: George Pappas
On a Relation Between the Rate-Distortion Function and Optimal Transport
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: E. Lei; Hamed Hassani; S. S. Bidokhti
- Corresponding author: S. S. Bidokhti
Other grants held by Hamed Hassani
Travel: NSF Student Travel Grant for 2023 IEEE North American School for Information Theory
- Award number: 2320167
- Fiscal year: 2023
- Amount: $1.8399 million
- Award type: Standard Grant
CAREER: Submodular Optimization in Complex Environments: Theory, Algorithms, and Applications
- Award number: 1943064
- Fiscal year: 2020
- Amount: $1.8399 million
- Award type: Continuing Grant
CIF: Small: Collaborative Research: Communications in Ultra-Low-Rate Regime: Fundamental Limits, Code Constructions, and Applications
- Award number: 1910056
- Fiscal year: 2019
- Amount: $1.8399 million
- Award type: Standard Grant
CRII: CCF: Low-Complexity Coding at Optimal Length
- Award number: 1755707
- Fiscal year: 2018
- Amount: $1.8399 million
- Award type: Standard Grant
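Similar grants from the National Natural Science Foundation of China
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Amount: CNY 0
- Award type: Provincial/municipal-level project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 30824808
- Year approved: 2008
- Amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Amount: CNY 450,000
- Award type: General Program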
Similar overseas grants
Collaborative Research: REU Site: Earth and Planetary Science and Astrophysics REU at the American Museum of Natural History in Collaboration with the City University of New York
- Award number: 2348998
- Fiscal year: 2025
- Amount: $1.8399 million
- Award type: Standard Grant
Collaborative Research: REU Site: Earth and Planetary Science and Astrophysics REU at the American Museum of Natural History in Collaboration with the City University of New York
- Award number: 2348999
- Fiscal year: 2025
- Amount: $1.8399 million
- Award type: Standard Grant
"Small performances": investigating the typographic punches of John Baskerville (1707-75) through heritage science and practice-based research
- Award number: AH/X011747/1
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Research Grant
Democratizing HIV science beyond community-based research
- Award number: 502555
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type:
Translational Design: Product Development for Research Commercialisation
- Award number: DE240100161
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Discovery Early Career Researcher Award
Understanding the experiences of UK-based peer/community-based researchers navigating co-production within academically-led health research.
- Award number: 2902365
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Studentship
XMaS: The National Material Science Beamline Research Facility at the ESRF
- Award number: EP/Y031962/1
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Research Grant
FCEO-UKRI Senior Research Fellowship - conflict
- Award number: EP/Y033124/1
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Research Grant
UKRI FCDO Senior Research Fellowships (Non-ODA): Critical minerals and supply chains
- Award number: EP/Y033183/1
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Research Grant
TARGET Mineral Resources - Training And Research Group for Energy Transition Mineral Resources
- Award number: NE/Y005457/1
- Fiscal year: 2024
- Amount: $1.8399 million
- Award type: Training Grant