III: Small: Collection Construction Methodologies for Learning-to-Rank
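Basic information
- Award number: 1017903
- Principal investigator:
- Amount: $488.7K
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2010
- Funding country: United States
- Period: 2010-09-01 to 2013-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract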
Modern search engines, especially those designed for the World Wide Web, commonly analyze and combine hundreds of features extracted from the submitted query and the underlying documents (e.g., web pages) in order to assess the relative relevance of a document to a given query and thus rank the underlying collection. The sheer size of this problem has led to the development of learning-to-rank algorithms that automate the construction of such ranking functions: given a training set of (feature vector, relevance) pairs, a machine learning procedure learns how to combine the query and document features so as to assess the relevance of any document to any query and thus rank a collection in response to a user input. Much thought and research have gone into feature extraction and the development of sophisticated learning-to-rank algorithms. However, relatively little research has been conducted on the choice of documents and queries for learning-to-rank data sets, or on the effect of these choices on the ability of a learning-to-rank algorithm to "learn" effectively and efficiently.
The proposed work investigates the effect of query, document, and feature selection on the ability of learning-to-rank algorithms to learn ranking functions efficiently and effectively. In preliminary results on document selection, a pilot study has already determined that training sets as small as 2 to 5% of the size typically used are just as effective for learning-to-rank purposes. Thus, one can train more efficiently over a much smaller (though effectively equivalent) data set, or, at equal cost, one can train over a far "larger" and more representative data set. In addition to formally characterizing this phenomenon for document selection, the proposed work investigates it for query and feature selection as well, with the end goals of (1) understanding the effect of document, query, and feature selection on learning-to-rank algorithms and (2) developing collection construction methodologies that are efficient and effective for learning-to-rank purposes.
In addition to characterizing and developing collection construction methodologies, the project plan includes the development and release of new, efficient, and effective learning-to-rank data sets for use by academia and industry. In support of this effort, the project team has close ties with the National Institute of Standards and Technology (NIST) and Microsoft Research, two of the premier organizations that develop and release Information Retrieval data sets. All research results and data sets developed as part of this project will be made available at the project website (http://www.ccs.neu.edu/home/jaa/IIS-1017903/). The project provides an educational and training experience for students.
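The training setup described in the abstract can be illustrated with a minimal sketch. It is not the project's method: the abstract does not say how documents are selected, so uniform random sampling per query is assumed here, and a simple pointwise linear model stands in for a real learning-to-rank algorithm. The function names (`sample_documents`, `fit_pointwise`, `rank`) and the synthetic data are hypothetical and exist only for illustration.

```python
import random


def score(weights, features):
    """Linear ranking function: weighted sum of query-document features."""
    return sum(w * x for w, x in zip(weights, features))


def sample_documents(training_set, fraction, seed=0):
    """Keep a uniform random fraction of each query's judged documents.

    training_set maps query id -> list of (feature_vector, relevance) pairs.
    Uniform sampling is an assumption, not the selection method of the project.
    """
    rng = random.Random(seed)
    sampled = {}
    for query_id, docs in training_set.items():
        k = max(1, round(fraction * len(docs)))
        sampled[query_id] = rng.sample(docs, k)
    return sampled


def fit_pointwise(training_set, epochs=50, lr=0.1):
    """Fit linear weights by stochastic gradient descent on squared error."""
    pairs = [pair for docs in training_set.values() for pair in docs]
    weights = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        for features, relevance in pairs:
            error = score(weights, features) - relevance
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights


def rank(weights, docs):
    """Order (feature_vector, relevance) pairs by descending model score."""
    return sorted(docs, key=lambda d: score(weights, d[0]), reverse=True)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic toy data: 20 queries x 100 documents with 2 features each.
    # Relevance is set equal to the first feature purely for illustration.
    data = {}
    for query_id in range(20):
        docs = []
        for _ in range(100):
            x = [rng.random(), rng.random()]
            docs.append((x, x[0]))
        data[query_id] = docs

    full_weights = fit_pointwise(data)
    small_weights = fit_pointwise(sample_documents(data, 0.05))
    print("weights trained on all documents:", full_weights)
    print("weights trained on a 5% sample:  ", small_weights)
```

On noise-free toy data like this, both models should recover similar weights, which says nothing about real collections; the sketch only shows where a document-selection step would sit in a training pipeline.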
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Javed Aslam
Blockchain solution for supply chains & logistics challenges: An empirical investigation
- DOI: 10.1016/j.tre.2025.104134
- Published: 2025-06-01
- Journal:
- Impact factor: 8.800
- Authors: Javed Aslam; Kee-hung Lai; Ahmad Al Hanbali; Nokhaiz Tariq Khan
- Corresponding author: Nokhaiz Tariq Khan
Critical successes factors for the adoption of additive manufacturing: Integrated impact for circular economy model
- DOI: 10.1016/j.techfore.2025.124041
- Published: 2025-04-01
- Journal:
- Impact factor: 13.300
- Authors: Javed Aslam; Aqeela Saleem; Kee-hung Lai; Yun Bae Kim
- Corresponding author: Yun Bae Kim
Other grants by Javed Aslam
III: Small: Optimal Allocation of Crowdsourced Resources for IR Evaluation
- Award number: 1421399
- Fiscal year: 2014
- Award amount: $488.7K
- Project type: Standard Grant
EAGER: A Nugget-Based Information Retrieval Evaluation Paradigm
- Award number: 1256172
- Fiscal year: 2012
- Award amount: $488.7K
- Project type: Standard Grant
Analysis and Evaluation of Measures of Information Retrieval Performance
- Award number: 0534482
- Fiscal year: 2006
- Award amount: $488.7K
- Project type: Continuing Grant
CAREER: An Information-Theoretic Approach to Computational Learning with Applications
- Award number: 0418390
- Fiscal year: 2003
- Award amount: $488.7K
- Project type: Continuing Grant
CAREER: An Information-Theoretic Approach to Computational Learning with Applications
- Award number: 0093131
- Fiscal year: 2001
- Award amount: $488.7K
- Project type: Continuing Grant
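Similar grants from the National Natural Science Foundation of China
Forensic application of circadian-rhythm small RNAs in estimating bloodstain deposition time
- Award number:
- Approval year: 2024
- Award amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number:
- Approval year: 2022
- Award amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Approval year: 2020
- Award amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Award amount: CNY 580,000
- Project type: General Program
Mechanisms by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Approval year: 2019
- Award amount: CNY 210,000
- Project type: Young Scientists Fund
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Award amount: CNY 560,000
- Project type: General Program
Molecular mechanism of pigeon crop milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Award amount: CNY 260,000
- Project type: Young Scientists Fund
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Approval year: 2017
- Award amount: CNY 600,000
- Project type: General Program
Immune regulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Award amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biosynthesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
- Award number: 91640114
- Approval year: 2016
- Award amount: CNY 850,000
- Project type: Major Research Plan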
Similar overseas grants
HCC: Small: Toward Computational Modeling of Autism Spectrum Disorder: Multimodal Data Collection, Fusion, and Phenotyping
- Award number: 2401748
- Fiscal year: 2023
- Award amount: $488.7K
- Project type: Standard Grant
III: Small: Statistical Inference through Data-Collection and Expert-Knowledge Incorporation
- Award number: 2311969
- Fiscal year: 2023
- Award amount: $488.7K
- Project type: Standard Grant
HCC: Small: Toward Computational Modeling of Autism Spectrum Disorder: Multimodal Data Collection, Fusion, and Phenotyping
- Award number: 2114644
- Fiscal year: 2021
- Award amount: $488.7K
- Project type: Standard Grant
Visualization and spatiotemporal analysis of the collection trends of small mammals specimens in Hokkaido.
- Award number: 20K13255
- Fiscal year: 2020
- Award amount: $488.7K
- Project type: Grant-in-Aid for Early-Career Scientists
CPS: Small: Collaborative Research: RUI: Towards Efficient and Secure Agricultural Information Collection Using a Multi-Robot System
- Award number: 1932300
- Fiscal year: 2020
- Award amount: $488.7K
- Project type: Standard Grant
CPS: Small: Collaborative Research: RUI: Towards Efficient and Secure Agricultural Information Collection Using a Multi-Robot System
- Award number: 1931767
- Fiscal year: 2020
- Award amount: $488.7K
- Project type: Standard Grant
SHF: Small: Distributed DNA Computations Operating on a Collection of Cell Membranes
- Award number: 1909848
- Fiscal year: 2019
- Award amount: $488.7K
- Project type: Standard Grant
NeTS: Small: RUI: Automating Active Measurement Metadata Collection and Analysis
- Award number: 1814537
- Fiscal year: 2018
- Award amount: $488.7K
- Project type: Standard Grant
NeTS: Small: Geometric and Topological Analysis on Trajectory Sensing: Collection, Classification and Anonymization
- Award number: 1618391
- Fiscal year: 2016
- Award amount: $488.7K
- Project type: Standard Grant
CSBR: Natural History: Making a large impact on a small herbarium: Integration of un-accessioned and orphaned specimens to secure and promote wider use of the collection
- Award number: 1561280
- Fiscal year: 2016
- Award amount: $488.7K
- Project type: Continuing Grant


















