III: Medium: Collaborative Research: Scaling Machine Learning to Massive Datasets---A Logic Based Approach
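Basic information
- Award number: 1302690
- Principal investigator:
- Amount: $333,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2013
- Funding country: United States
- Duration: 2013-09-01 to 2017-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract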
Machine learning (ML) algorithms have become ubiquitous across applications as diverse as science, engineering, business, finance, education, and healthcare. However, developing ML software that scales to massive datasets and is also easy to use remains a challenge, in part because building an ML tool currently requires implementing a deep software stack, from the runtime (i.e., how an ML algorithm is executed) to the API exposed to users.
This project aims to develop DeML, a system that supports the authoring and execution of ML tools. Specifically, DeML would allow ML algorithms to be formulated as declarative queries over the training dataset. DeML optimizes the execution of the query over a computing platform (e.g., Amazon EC2 or SQL Azure), taking into account the characteristics of the algorithm, the data, and the available computational resources. Adoption of DeML would greatly reduce the effort required to develop scalable implementations of ML algorithms. The project is organized around three thrusts: (i) development of a declarative query language based on extensions of Datalog; (ii) analysis of the runtime of DeML queries; and (iii) optimization of the dataflow of DeML queries based on the characteristics of the data sources and the capabilities of the underlying execution platform. The resulting open-source DeML prototype will be made freely available to the community through the project web page at http://deml.cs.ucla.edu.
The availability of DeML could greatly lower the effort needed to author scalable implementations of ML algorithms for the analysis of massive datasets, which in turn would increase the availability of such tools to the broader community. Experience gained by implementing and deploying ML algorithms at scale on modern cloud-computing platforms could help inform critical design choices in the development of future cloud computing platforms for big data analytics, and hence affect a broad range of scientific, engineering, national security, healthcare, and business applications of big data analytics. The project offers enhanced opportunities for research-based advanced training of graduate and undergraduate students, including members of groups currently under-represented in computer science, in databases, machine learning, and cloud computing.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Neoklis Polyzotis
III: Small: Novel Paradigms for Automated Index Tuning
- Award number: 1018914
- Fiscal year: 2010
- Award amount: $333,000
- Award type: Continuing Grant
CAREER: Novel Summarization Techniques for Semi-Structured Data
- Award number: 0447966
- Fiscal year: 2005
- Award amount: $333,000
- Award type: Continuing Grant
Similar overseas grants
III: Medium: Collaborative Research: From Open Data to Open Data Curation
- Award number: 2420691
- Fiscal year: 2024
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: Medium: Designing AI Systems with Steerable Long-Term Dynamics
- Award number: 2312865
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: MEDIUM: Responsible Design and Validation of Algorithmic Rankers
- Award number: 2312932
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
III: Medium: Collaborative Research: Integrating Large-Scale Machine Learning and Edge Computing for Collaborative Autonomous Vehicles
- Award number: 2348169
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Continuing Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
- Award number: 2415562
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: Medium: VirtualLab: Integrating Deep Graph Learning and Causal Inference for Multi-Agent Dynamical Systems
- Award number: 2312501
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: Medium: Knowledge discovery from highly heterogeneous, sparse and private data in biomedical informatics
- Award number: 2312862
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: MEDIUM: Responsible Design and Validation of Algorithmic Rankers
- Award number: 2312930
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: Medium: New Machine Learning Empowered Nanoinformatics System for Advancing Nanomaterial Design
- Award number: 2347592
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant
Collaborative Research: III: Medium: Graph Neural Networks for Heterophilous Data: Advancing the Theory, Models, and Applications
- Award number: 2406648
- Fiscal year: 2023
- Award amount: $333,000
- Award type: Standard Grant