Exploiting Semantic Analysis of Documents

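Basic Information

  • Grant number:
    RGPIN-2015-06183
  • Principal investigator:
  • Amount:
    $31,300
  • Host institution:
  • Host institution country:
    Canada
  • Program category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Start and end dates:
    2019-01-01 to 2020-12-31
  • Project status:
    Completed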

Project Abstract

Consider the tasks of organizing a collection of research papers for the purpose of writing a thesis; organizing the set of accepted papers at a conference into meaningful and coherent sessions; searching a corpus of customer-service incident reports to locate the cases most relevant to a new case at hand, along with their resolutions; or discovering novel treatments for diseases through implicit connections in the biomedical literature.

A core problem underlying such tasks is that of semantic relatedness of documents. Semantic relatedness of documents should not be limited to the sharing of words, as two documents may be about the same topic but use different vocabulary (for example, a medical document for experts versus a medical document for the layperson). Given a domain-specific corpus, topic models have been fit to documents and terms, leading to the representation of documents as instances of generative probabilistic models of mixtures of topics. Topic models require corpora and documents of sufficient size to be robust. In real life, documents may be short (e.g. titles or abstracts) and document corpora may contain a small number of documents (tens or hundreds instead of thousands), rendering topic models unreliable. The proposed research program will investigate semantic relatedness measures that are applicable to any domain and rely on readily available external knowledge sources, such as the Google n-gram corpus and Wikipedia.

Organizing document collections into semantically coherent clusters has typically relied on bag-of-words document representations, with a focus more on mathematical sophistication than on the interpretability of the document representation by the user. In the proposed research program we will seek algorithms and processes that support the human user in her sense-making process, helping her to interactively steer the document representation and clustering process to fit her objectives.

In collaboration with industrial partners, we will test the proposed methods in different applications of practical significance, such as interactive clustering of corporate document sets, automatic ranking of resumes against job ads, expertise mapping and matchmaking, paper referee assignment, and content-based recommendation of news to digital newspaper subscribers. A long-term objective is to support document-based discovery in the majority of scientific fields that lack the sophistication of terminological and ontological resources currently available in the biomedical field.
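
The abstract's point that word sharing alone is a poor proxy for semantic relatedness can be illustrated with a minimal sketch. The function name `bow_cosine` and the three example sentences are hypothetical and are not taken from the project; the code only shows a plain bag-of-words cosine similarity, not any method developed under this grant.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    va = Counter(a.lower().split())
    vb = Counter(b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences: the first two describe the same topic
# in expert and lay vocabulary; the third is about something else.
expert = "myocardial infarction requires immediate reperfusion therapy"
layperson = "a heart attack needs urgent treatment to restore blood flow"
unrelated = "a heart shaped card needs urgent delivery"

same_topic_score = bow_cosine(expert, layperson)
off_topic_score = bow_cosine(layperson, unrelated)
print(same_topic_score, off_topic_score)
```

Because the expert and lay sentences share no words, their bag-of-words score is zero, while the off-topic sentence scores higher simply through incidental word overlap. This is the gap that relatedness measures drawing on external knowledge sources are meant to close.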

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Milios, Evangelos

Information retrieval by semantic similarity
Causal graph extraction from news: a comparative study of time-series causality learning techniques.
  • DOI:
    10.7717/peerj-cs.1066
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    3.8
  • Authors:
    Maisonnave, Mariano;Delbianco, Fernando;Tohme, Fernando;Milios, Evangelos;Maguitman, Ana G.
  • Corresponding author:
    Maguitman, Ana G.
Improving the performance of focused web crawlers
  • DOI:
    10.1016/j.datak.2009.04.002
  • Publication date:
    2009-10-01
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Batsakis, Sotiris;Petrakis, Euripides G. M.;Milios, Evangelos
  • Corresponding author:
    Milios, Evangelos
Statistical learning for OCR error correction
  • DOI:
    10.1016/j.ipm.2018.06.001
  • Publication date:
    2018-11-01
  • Journal:
  • Impact factor:
    8.6
  • Authors:
    Mei, Jie;Islam, Aminul;Milios, Evangelos
  • Corresponding author:
    Milios, Evangelos
Topic-based web site summarization

Other grants held by Milios, Evangelos

Semantic Representations for Interactive Text Mining
  • Grant number:
    RGPIN-2020-04834
  • Fiscal year:
    2022
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
Semantic Representations for Interactive Text Mining
  • Grant number:
    RGPIN-2020-04834
  • Fiscal year:
    2021
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
How is Canadians' mental health affected by COVID-19: visual analytics of social media text
  • Grant number:
    554657-2020
  • Fiscal year:
    2020
  • Funding amount:
    $31,300
  • Program category:
    Alliance Grants
Semantic Representations for Interactive Text Mining
  • Grant number:
    RGPIN-2020-04834
  • Fiscal year:
    2020
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
Semantic search using deep networks
  • Grant number:
    531051-2018
  • Fiscal year:
    2018
  • Funding amount:
    $31,300
  • Program category:
    Engage Grants Program
Exploiting Semantic Analysis of Documents
  • Grant number:
    RGPIN-2015-06183
  • Fiscal year:
    2018
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
Visual text analytics for total recall information retrieval in large noisy text datasets
  • Grant number:
    499941-2016
  • Fiscal year:
    2017
  • Funding amount:
    $31,300
  • Program category:
    Collaborative Research and Development Grants
Exploiting Semantic Analysis of Documents
  • Grant number:
    RGPIN-2015-06183
  • Fiscal year:
    2017
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
Trajectory-based localization using WiFi signal strength
  • Grant number:
    507295-2016
  • Fiscal year:
    2016
  • Funding amount:
    $31,300
  • Program category:
    Engage Grants Program
Automation and Evaluation of Business Intelligence
  • Grant number:
    492547-2015
  • Fiscal year:
    2016
  • Funding amount:
    $31,300
  • Program category:
    Engage Grants Program

Similar overseas grants

CASCADE: Computational Analysis of Semantic Change Across Different Environments
  • Grant number:
    EP/Y031075/1
  • Fiscal year:
    2024
  • Funding amount:
    $31,300
  • Program category:
    Research Grant
Mixed-methods Digital Oral History: Enfolding semantic web technologies and historical-interpretative analysis
  • Grant number:
    AH/Y007557/1
  • Fiscal year:
    2024
  • Funding amount:
    $31,300
  • Program category:
    Research Grant
Collaborative Research: SOS-DCI / HNDS-R: Advancing Semantic Network Analysis to Better Understand How Evaluative Exchanges Shape Scientific Arguments
  • Grant number:
    2244805
  • Fiscal year:
    2023
  • Funding amount:
    $31,300
  • Program category:
    Standard Grant
Discovering clinical endpoints of toxicity via graph machine learning and semantic data analysis
  • Grant number:
    10745593
  • Fiscal year:
    2023
  • Funding amount:
    $31,300
  • Program category:
Collaborative Research: SOS-DCI / HNDS-R: Advancing Semantic Network Analysis to Better Understand How Evaluative Exchanges Shape Scientific Arguments
  • Grant number:
    2244804
  • Fiscal year:
    2023
  • Funding amount:
    $31,300
  • Program category:
    Standard Grant
A corpus-based computational analysis of Hungarian negative emotive elements from the viewpoint of semantic changes
  • Grant number:
    23KF0028
  • Fiscal year:
    2023
  • Funding amount:
    $31,300
  • Program category:
    Grant-in-Aid for JSPS Fellows
Definiteness in Gothic & Old Church Slavonic biblical translation: a semantic analysis of form and use
  • Grant number:
    2877225
  • Fiscal year:
    2023
  • Funding amount:
    $31,300
  • Program category:
    Studentship
Semantic Web Analysis over the Nova Scotia Open Data
  • Grant number:
    RGPIN-2020-05869
  • Fiscal year:
    2022
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
Leveraging Patterns and Interesting Anomalies through Scalable Semantic Analysis
  • Grant number:
    RGPIN-2018-03749
  • Fiscal year:
    2022
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual
Leveraging Patterns and Interesting Anomalies through Scalable Semantic Analysis
  • Grant number:
    RGPIN-2018-03749
  • Fiscal year:
    2021
  • Funding amount:
    $31,300
  • Program category:
    Discovery Grants Program - Individual