ITR: Unified Graphical Models of Information Extraction and Data Mining with Application to Social Network Analysis
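Basic Information
- Award number: 0326249
- Principal investigator:
- Amount: $2.9447 million
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2003
- Funding country: United States
- Period: 2003-09-15 to 2011-08-31
- Project status: Completed
- Source:
- Keywords: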
Project Abstract
This project aims to improve our ability to data mine information previously locked in unstructured natural language text. It focuses on developing novel statistical models for information extraction and data mining that are so tightly integrated that the boundaries between them disappear, resulting in a powerful unified framework for extraction and mining. Current information extraction methods populate slots in a database by identifying relevant subsequences of text, but they are usually unaware of the emerging patterns and regularities in the database. Current data mining methods begin from a populated database, and they are often unaware of where the data came from or of its inherent uncertainties. The result is that the accuracy of both suffers, and significant mining of complex text sources is beyond reach. This project uses probabilistic graphical models that make extraction and mining decisions in the same probabilistic currency, with a common inference procedure. Such models promise significant gains in accuracy and capability, as well as an opportunity for deeper understanding of the role of high-level, top-down patterns in natural language processing, and the role of low-level, bottom-up language data in symbolic processing. The project grounds this work in two real-world application domains: scientific research and government information. The extraction and mining of large-scale databases in these domains will have broad impacts by providing useful, constantly updated Web resources, by enabling insights into government efficiency and the flow of scientific ideas, and by making databases, analyses, and source code publicly available. http://kdl.cs.umass.edu/projects/unified-graphical-models.html
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Andrew McCallum
An Interoperable Multimedia Catalog System for Electronic Commerce.
- DOI:
- Year published: 2000
- Journal:
- Impact factor: 0
- Authors: William W. Cohen; Andrew McCallum; D. Quass
- Corresponding author: D. Quass
Scaling Within Document Coreference to Long Texts
- DOI:
- Year published: 2021
- Journal:
- Impact factor: 0
- Authors: Raghuveer Thirukovalluru; Nicholas Monath; K. Shridhar; M. Zaheer; Mrinmaya Sachan; Andrew McCallum
- Corresponding author: Andrew McCallum
ezCoref: A Scalable Approach for Collecting Crowdsourced Annotations for Coreference Resolution
- DOI:
- Year published: 2022
- Journal:
- Impact factor: 0
- Authors: A. Crowdsourced; David Bamman; Olivia Lewke; Rachel Bawden; Rico Sennrich; Alexandra Birch; Ari Bornstein; Arie Cattan; Ido Dagan; Hong Chen; Zhenhua Fan; Hao Lu; Alan Yuille; Eduard Hovy; Mitch Marcus; M. Palmer; Lance; Rodney Huddleston. 2002; Frédéric Landragin; T. Poibeau; Bernard Vic; Belinda Z. Li; Gabriel Stanovsky; Robert L Logan; Andrew McCallum; Sameer Singh
- Corresponding author: Sameer Singh
PaRaDe: Passage Ranking using Demonstrations with Large Language Models
- DOI: 10.48550/arxiv.2310.14408
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Andrew Drozdov; Honglei Zhuang; Zhuyun Dai; Zhen Qin; Razieh Rahimi; Xuanhui Wang; Dana Alon; Mohit Iyyer; Andrew McCallum; Donald Metzler; Kai Hui
- Corresponding author: Kai Hui
Every Answer Matters: Evaluating Commonsense with Probabilistic Measures
- DOI:
- Year published: 2024
- Journal:
- Impact factor: 0
- Authors: Qi Cheng; Michael Boratko; Pranay Kumar Yelugam; T. O’Gorman; Nalini Singh; Andrew McCallum; X. Li
- Corresponding author: X. Li
Other grants by Andrew McCallum
Collaborative Research: SOS-DCI / HNDS-R: Advancing Semantic Network Analysis to Better Understand How Evaluative Exchanges Shape Scientific Arguments
- Award number: 2244805
- Fiscal year: 2023
- Funding amount: $2.9447 million
- Project type: Standard Grant
RI: Medium: Probabilistic Box Embeddings
- Award number: 2106391
- Fiscal year: 2021
- Funding amount: $2.9447 million
- Project type: Standard Grant
DMREF: Collaborative Research: The Synthesis Genome: Data Mining for Synthesis of New Materials
- Award number: 1922090
- Fiscal year: 2019
- Funding amount: $2.9447 million
- Project type: Standard Grant
DMREF: Collaborative Research: The Synthesis Genome: Data Mining for Synthesis of New Materials
- Award number: 1534431
- Fiscal year: 2015
- Funding amount: $2.9447 million
- Project type: Standard Grant
III: Medium: Constructing Knowledge Bases by Extracting Entity-Relations and Meanings from Natural Language via "Universal Schema"
- Award number: 1514053
- Fiscal year: 2015
- Funding amount: $2.9447 million
- Project type: Continuing Grant
The Fourth Northeast Student Colloquium on Artificial Intelligence
- Award number: 1036017
- Fiscal year: 2010
- Funding amount: $2.9447 million
- Project type: Standard Grant
CI-ADDO-EN: Flexible Machine Learning for Natural Language in the MALLET Toolkit
- Award number: 0958392
- Fiscal year: 2010
- Funding amount: $2.9447 million
- Project type: Continuing Grant
RI-Medium: Collaborative Research: Dynamically-Structured Conditional Random Fields for Complex, Natural Domains
- Award number: 0803847
- Fiscal year: 2008
- Funding amount: $2.9447 million
- Project type: Continuing Grant
CRI: Collaborative Research: Improving Experimental Computer Science with a Searchable Web Portal for Data Sets
- Award number: 0551597
- Fiscal year: 2006
- Funding amount: $2.9447 million
- Project type: Continuing Grant
Similar international grants
A Unified Understanding of the Earth's Radiation Environment
- Award number: NE/Z000157/1
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Research Grant
Thinking about possibilities: Towards a unified cognitive framework
- Award number: FT230100010
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: ARC Future Fellowships
A Cyber-Physical System for Unified Diagnosis and Treatment of Lung Disease
- Award number: MR/Y011694/1
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Fellowship
CAREER: A Unified Theory of Private Control Systems
- Award number: 2422260
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Continuing Grant
CAREER: Enhancing Organizational Learning: Leveraging Unified Diversity through Human Resource Management
- Award number: 2336679
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Continuing Grant
PILOT - a unified platform for integrated and auditable end-to-end financial planning that seamlessly integrates intelligence to support compliance with best practices and regulations
- Award number: 10097590
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Collaborative R&D
CAS-Climate: CAREER: A Unified Zero-Carbon-Driven Design Framework for Accelerating Power Grid Deep Decarbonization (ZERO-ACCELERATOR)
- Award number: 2338158
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Continuing Grant
SensorGROW - an intuitive, cost effective and scalable precision growing platform, powered by a network of unified agri-sensor nodes
- Award number: 10095990
- Fiscal year: 2024
- Funding amount: $2.9447 million
- Project type: Collaborative R&D
Collaborative Research: Elements: ProDM: Developing A Unified Progressive Data Management Library for Exascale Computational Science
- Award number: 2311757
- Fiscal year: 2023
- Funding amount: $2.9447 million
- Project type: Standard Grant
Product structures theorems and unified methods of algorithm design for geometrically constructed graphs
- Award number: 23K10982
- Fiscal year: 2023
- Funding amount: $2.9447 million
- Project type: Grant-in-Aid for Scientific Research (C)