III: Small: Interactive Construction of Complex Query Models
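Basic information
- Award number: 1617408
- Principal investigator:
- Amount: $516,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Period: 2016-07-15 to 2020-06-30
- Project status: Completed
- Source:
- Keywords:
Project abstract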
This research program will investigate and implement SearchIE, a search-based approach to information "extraction." SearchIE will allow rapid, personalized, situational identification of types of objects or actions in text, where those types are likely to be useful for a complex search task. Modern search engines often provide some mechanism to indicate that a query keyword matches a document only if it occurs in the name of a person or in a location. To make that possible, annotators found and marked a large number of people names (for example) in text, a machine learning algorithm was applied to learn which low-level features are indicative of the name type, and then a resulting classifier for that type is run across the collection of documents. It is then possible to write a query that means "paris used as a person's name rather than a location." Unfortunately, the existing approaches do not serve searchers interested in novel, unanticipated types - for example, names of whaling ships, officers in Queen Victoria's navy, local watering holes. Such examples cannot be handled currently because the classifiers need to be trained and run ahead of time, an expensive data labeling process that is too daunting for many search tasks. Since on-line information gathering almost always starts with search and frequently involves identifying items of interest in the found text, bringing these two together has the potential to change both substantially. The SearchIE approach makes it possible for someone to build personalized extractors contextualized by their topical interests. The result is that the technology can radically improve online searching for lay persons as well as professionals by significantly reducing the time needed to focus queries into relevant information. It does not appear that the information extraction task has ever been approached directly as a search task. 
SearchIE is unique in bringing an information retrieval (search) mindset to the extraction problem, providing new capabilities that are either impossible or extremely difficult in the traditional "annotate then detect" model of the problem. This project will investigate the fundamental issues raised by the SearchIE approach. What models can best integrate extraction and search in new settings where they can truly happen simultaneously? How can a searcher describe and edit a model for the types of interest? Can an interactively developed model be a springboard into a machine learned model, and when is there enough information to do that? Does using topical context to limit the scope of extraction provide the expected accuracy gains using SearchIE's approach? What data structure modifications are needed to fully implement SearchIE so that it is efficient as well as effective? How well does this approach fare on additional standard test collections? These systems and algorithmic issues are fundamental problems, and addressing them has the potential to greatly impact both search and extraction. For further information, see the project's web site at http://ciir.cs.umass.edu/research/searchie.
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Sentence Retrieval for Entity List Extraction with a Seed, Context, and Topic
- DOI: 10.1145/3341981.3344250
- Published: 2019-09
- Journal:
- Impact factor: 0
- Authors: Sheikh Muhammad Sarwar; John Foley; Liu Yang; J. Allan
- Corresponding author: Sheikh Muhammad Sarwar; John Foley; Liu Yang; J. Allan
A Reinforcement Learning Framework for Relevance Feedback
- DOI: 10.1145/3397271.3401099
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Ali Montazeralghaem; Hamed Zamani; J. Allan
- Corresponding author: Ali Montazeralghaem; Hamed Zamani; J. Allan
Term Discrimination Value for Cross-Language Information Retrieval
- DOI: 10.1145/3341981.3344252
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Montazeralghaem, Ali; Rahimi, Razieh; Allan, James
- Corresponding author: Allan, James
SearchIE: A Retrieval Approach for Information Extraction
- DOI: 10.1145/3341981.3344248
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Sarwar, Sheikh Muhammad; Allan, James
- Corresponding author: Allan, James
Other publications by James Allan
A Single Nucleotide Resolution Model for Large-Scale Simulations of Double Stranded DNA
- DOI: 10.1101/069310
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Y. G. Fosado; D. Michieletto; James Allan; C. Brackley; O. Henrich; D. Marenduzzo
- Corresponding author: D. Marenduzzo
Introduction to topic detection and tracking
- DOI: 10.1007/978-1-4615-0933-2_1
- Published: 2002
- Journal:
- Impact factor: 0
- Authors: James Allan
- Corresponding author: James Allan
A semantic data framework to support data-driven demand forecasting
- DOI: 10.1088/1742-6596/2600/2/022001
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: James Allan; Francesca Mangili; Marco Derboni; Luis Gisler; A. Hainoun; A. Rizzoli; Luca Ventriglia; M. Sulzer
- Corresponding author: M. Sulzer
Using CrowdLogger for in situ information retrieval system evaluation
- DOI: 10.1145/2513150.2513164
- Published: 2013
- Journal:
- Impact factor: 0
- Authors: H. Feild; James Allan
- Corresponding author: James Allan
Reranking search results for sparse queries
- DOI: 10.1145/2063576.2063606
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: Elif Aktolga; James Allan
- Corresponding author: James Allan
Other grants by James Allan
CondensabLe AeRosol from non Ideal Stove Emissions (CLARISE)
- Award number: NE/X000923/1
- Fiscal year: 2023
- Amount: $516,000
- Project type: Research Grant
III: Medium: Collaborative Research: Athena: Learning-oriented Search with Personalized Learning Flows
- Award number: 2106282
- Fiscal year: 2021
- Amount: $516,000
- Project type: Continuing Grant
EAGER: Dynamic Contextual Explanation of Search Results
- Award number: 2039449
- Fiscal year: 2020
- Amount: $516,000
- Project type: Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: Sustaining Lemur Project Resources for the Long-Term
- Award number: 1822986
- Fiscal year: 2018
- Amount: $516,000
- Project type: Standard Grant
Soot Aerodynamic Size Selection for Optical properties (SASSO)
- Award number: NE/S00212X/1
- Fiscal year: 2018
- Amount: $516,000
- Project type: Research Grant
III: Small: Mirador: Explainable Computational Models for Recognizing and Understanding Controversial Topics Encountered Online
- Award number: 1813662
- Fiscal year: 2018
- Amount: $516,000
- Project type: Standard Grant
I-Corps: Probabilistically Detecting Controversy
- Award number: 1721069
- Fiscal year: 2017
- Amount: $516,000
- Project type: Standard Grant
Megacity Delhi atmospheric emission quantification, assessment and impacts (DelhiFlux) - Manchester
- Award number: NE/P016472/1
- Fiscal year: 2016
- Amount: $516,000
- Project type: Research Grant
Sources and Emissions of Air Pollutants in Beijing (Manchester)
- Award number: NE/N007123/1
- Fiscal year: 2016
- Amount: $516,000
- Project type: Research Grant
III: Small: Topical Positioning System (TPS) for Informed Reading of Web Pages
- Award number: 1217281
- Fiscal year: 2012
- Amount: $516,000
- Project type: Standard Grant
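Similar NSFC (National Natural Science Foundation of China) grants
Forensic application of circadian small RNAs in estimating bloodstain deposition time
- Award number:
- Year approved: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Year approved: 2022
- Amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Year approved: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanism of small RNA regulation of the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Year approved: 2019
- Amount: CNY 580,000
- Project type: General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Year approved: 2019
- Amount: CNY 210,000
- Project type: Young Scientists Fund
Molecular mechanism of pigeon milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Year approved: 2018
- Amount: CNY 260,000
- Project type: Young Scientists Fund
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Year approved: 2018
- Amount: CNY 560,000
- Project type: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Year approved: 2017
- Amount: CNY 600,000
- Project type: General Program
Immune regulation mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
- Award number: 81704176
- Year approved: 2017
- Amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
- Award number: 91640114
- Year approved: 2016
- Amount: CNY 850,000
- Project type: Major Research Plan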
Similar overseas grants
III: Small: Deep Interactive Reinforcement Learning for Self-optimizing Feature Selection
- Award number: 2152030
- Fiscal year: 2022
- Amount: $516,000
- Project type: Standard Grant
Collaborative Research: RI: III: SHF: Small: Multi-Stakeholder Decision Making: Qualitative Preference Languages, Interactive Reasoning, and Explanation
- Award number: 2225824
- Fiscal year: 2022
- Amount: $516,000
- Project type: Standard Grant
Collaborative Research: RI: III: SHF: Small: Multi-Stakeholder Decision Making: Qualitative Preference Languages, Interactive Reasoning, and Explanation
- Award number: 2225823
- Fiscal year: 2022
- Amount: $516,000
- Project type: Standard Grant
III: Small: Fair Decision Making by Consensus: Interactive Bias Mitigation Technology
- Award number: 2007932
- Fiscal year: 2020
- Amount: $516,000
- Project type: Standard Grant
III: Small: An end-to-end pipeline for interactive visual analysis of big data
- Award number: 1815238
- Fiscal year: 2018
- Amount: $516,000
- Project type: Standard Grant
III: Small: Towards a Database Engine for Interactive and Online Sampling and Analytics
- Award number: 1619287
- Fiscal year: 2016
- Amount: $516,000
- Project type: Standard Grant
III: Small: Collaborative Research: Towards Interactive Data Visualization Management Systems
- Award number: 1527779
- Fiscal year: 2015
- Amount: $516,000
- Project type: Standard Grant
III: Small: Collaborative Research: Towards Interactive Data Visualization Management Systems
- Award number: 1527765
- Fiscal year: 2015
- Amount: $516,000
- Project type: Standard Grant
III: Small: Characterizing and Evaluating Whole Session Interactive Information Retrieval
- Award number: 1423239
- Fiscal year: 2014
- Amount: $516,000
- Project type: Continuing Grant
HCC: III: Small: MyDome - Defining the Computational and Cognitive Potential of Interactive Simulations in an Immersive Dome Environment
- Award number: 0916098
- Fiscal year: 2009
- Amount: $516,000
- Project type: Standard Grant