Author Name Disambiguation in Medline
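Basic information
- Grant number: 6807897
- Principal investigator:
- Amount: $199,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2005
- Funding country: United States
- Project period: 2005-01-15 to 2007-01-14
- Project status: Completed
- Source:
- Keywords:
Project abstract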
DESCRIPTION (provided by applicant):
The current inability to identify which papers bearing the same author name (last name, first initial) are written by different individuals is an impediment to user retrieval of health-related information as well as research devoted to understanding the publication and collaboration behavior of biomedical scientists. Disambiguation of author names will help in scientometrics and health policy studies, as well as everyday scientific tasks of numerous kinds: for example, choosing referees and conference attendees. We have created a probabilistic model of how the attributes of Medline articles vary across authors, and hypothesize that this can serve as the basis for disambiguating author names in Medline. In this exploratory two-year study, it is proposed:
1. To create and evaluate a database of "author-individuals" that lists all of the papers in Medline and assigns the great majority of them to one or more specific author-individuals with high confidence. A probabilistic model based on Medline record fields will be refined which estimates, for any two papers bearing the same name, the probability that they were written by the same individual, including supplementary information such as author first names and affiliations for all authors. Then, clustering algorithms will be optimized and applied to form author-individual clusters for all names in Medline.
2. To update the author-individual database (weekly) and underlying probabilistic model (yearly), and to create and evaluate a free, public, multi-user query interface. The database will also be made available to academic researchers for bibliometric, scientometric and policy studies.
This research will set the stage for more in-depth studies of publication and collaboration behavior in the future that should give valuable insights into ways to increase scientific productivity in biomedical sciences.
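The two-stage approach described in the abstract (a pairwise probability that two same-name papers share an author, followed by clustering) can be illustrated with a minimal Python sketch. This is not the applicants' model: the features, the likelihood ratios, the prior, the single-link clustering rule, and the sample records (including the `pmid` values) are all hypothetical assumptions chosen only to show the shape of the computation.

```python
from itertools import combinations

# Hypothetical likelihood ratios: P(feature | same person) / P(feature | different people).
# These values are illustrative assumptions, not estimates derived from Medline.
LIKELIHOOD_RATIOS = {
    "shared_coauthor": 20.0,
    "same_affiliation": 8.0,
    "same_first_name": 5.0,
}
PRIOR_SAME = 0.1  # assumed prior that two same-name papers share an author


def pair_probability(a, b):
    """Estimate P(same individual) for two paper records via a naive Bayes odds update."""
    odds = PRIOR_SAME / (1.0 - PRIOR_SAME)
    if set(a["coauthors"]) & set(b["coauthors"]):
        odds *= LIKELIHOOD_RATIOS["shared_coauthor"]
    if a["affiliation"] and a["affiliation"] == b["affiliation"]:
        odds *= LIKELIHOOD_RATIOS["same_affiliation"]
    if a["first_name"] and a["first_name"] == b["first_name"]:
        odds *= LIKELIHOOD_RATIOS["same_first_name"]
    return odds / (1.0 + odds)


def cluster_papers(papers, threshold=0.5):
    """Single-link clustering: merge any two papers whose pair probability exceeds threshold."""
    parent = list(range(len(papers)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(papers)), 2):
        if pair_probability(papers[i], papers[j]) > threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(papers)):
        clusters.setdefault(find(i), []).append(papers[i]["pmid"])
    return sorted(clusters.values())


# Made-up records that all carry the same ambiguous author name (last name, first initial).
papers = [
    {"pmid": 1, "first_name": "John", "affiliation": "Univ A", "coauthors": ["Lee K"]},
    {"pmid": 2, "first_name": "", "affiliation": "Univ A", "coauthors": ["Lee K", "Wu T"]},
    {"pmid": 3, "first_name": "Jane", "affiliation": "Univ B", "coauthors": ["Roy P"]},
]

print(cluster_papers(papers))  # [[1, 2], [3]]
```

In this toy data, papers 1 and 2 share a coauthor and an affiliation, so their estimated probability is well above the threshold and they form one author-individual cluster, while paper 3 stays on its own. A real system would estimate the ratios from data and would likely need a clustering rule that is less prone to chaining than single-link merging.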
Project outcomes
Journal articles: 0
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
Other grants by NEIL R SMALHEISER
Automated Indexing for Publication Types and Study Designs
- Grant number: 10715907
- Fiscal year: 2023
- Funding amount: $199,700
- Project category:
RNAi-Mediated Gene Suppression in Adult Mammalian CNS
- Grant number: 6668489
- Fiscal year: 2002
- Funding amount: $199,700
- Project category:
RNAi-Mediated Gene Suppression in Adult Mammalian CNS
- Grant number: 6531730
- Fiscal year: 2002
- Funding amount: $199,700
- Project category:
Arrowsmith Data Mining Techniques in Neuro-Informatics
- Grant number: 6333330
- Fiscal year: 2001
- Funding amount: $199,700
- Project category:
Arrowsmith Data Mining Techniques in Neuro-Informatics
- Grant number: 6608110
- Fiscal year: 2001
- Funding amount: $199,700
- Project category: