PubMed Query Log Analysis and Use in Access Enhancement
Basic information
- Grant number: 7969244
- Principal investigator:
- Amount: $774,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period: to
- Project status: Not concluded
- Source:
- Keywords: Arts; Categories; Characteristics; Computational Technique; Computers; Data; Databases; Development; Disease; Drug Formulations; Educational process of instructing; Gene Proteins; Goals; Habits; Information Services; Investigation; Lead; Learning; Length; Link; Literature; Methods; Names; Pharmaceutical Preparations; Play; Population; PubMed; Relative (related person); Research; Resources; Role; Services; Specificity; Suggestion; Time; Work; abstracting; falls; improved; meetings; research study; response; sensor; success; text searching; tool
Project abstract
Biomedical literature search is the main entry point for an ever-increasing range of information. PubMed/MEDLINE is the most widely used service for this purpose. However, finding citations relevant to a user's information need is not always easy in PubMed. Improving our understanding of the growing population of PubMed users, their information needs, and the way in which they meet these needs opens opportunities to improve the information services and information access provided by PubMed. One resource for understanding and characterizing patrons of search engines is the transaction log. Our investigation of user interactions through one month of PubMed logs focused on analyzing user needs, different aspects of queries (e.g., length), and user search habits. Building on the query log analysis, we further developed an automatic aid for query formulation, namely Related Queries (RQ). RQ focuses on finding popular queries that contain the initial user search term, with the goal of helping users describe their information needs more precisely (i.e., with greater specificity relative to the user input). This aid has been integrated into PubMed since January 2009. Automatic assessment using clickthrough data shows that, each day, the new feature is used consistently between 6% and 10% of the time when it is shown, suggesting that it has quickly become a popular new feature in PubMed. Inspired by its success, we are currently experimenting with other state-of-the-art methods for further improving the quality of query suggestions and expanding beyond query specification.
In addition, we are focusing on developing computational techniques in response to queries that return zero results in PubMed. As shown in our log analysis, about 15% of PubMed searches fall into this category. In some cases there really is no document or abstract that will satisfy a particular query. However, in analyzing one month of queries submitted to PubMed, we find that, more often than not, queries that retrieved no results would have retrieved something relevant had they been constructed differently. We are currently identifying some of the characteristics of unsuccessful queries and teaching computers to automatically learn the changes that users most often apply in constructing new, corrected queries.
Not only can log analysis help PubMed search as a whole, it can play an important role in developing tools for improving the links between different Entrez databases. Through our analysis of PubMed logs, we learn that people search certain biomedical concepts more often than others and that there exist strong associations between different concepts. For example, a disease name often co-occurs with gene/protein and drug names. As a result, we have worked towards the development of different PubMed sensors: automatic tools for recognizing biomedical concepts and building links to related data outside of biomedical literature. This will allow PubMed users to be more readily drawn to related data that could lead to serendipitous discoveries.
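The Related Queries idea described in the abstract (suggesting popular logged queries that contain the user's initial search term and are more specific than it) can be sketched roughly as follows. This is only an illustration and not the PubMed implementation: the function name `related_queries`, the toy log, the whitespace and case normalization, and the whole-word matching rule are all assumptions made for the example.

```python
from collections import Counter


def related_queries(log, term, k=5):
    """Return up to k popular logged queries that contain `term` as a
    contiguous run of whole words and are longer (more specific) than it.

    Hypothetical helper for illustration; not PubMed's actual algorithm.
    """
    term_tokens = term.lower().split()
    n = len(term_tokens)
    if n == 0:
        return []

    # Normalize case and whitespace so trivially different spellings
    # of the same query are counted together.
    counts = Counter(" ".join(q.lower().split()) for q in log)

    candidates = []
    for query, freq in counts.items():
        tokens = query.split()
        if len(tokens) <= n:
            continue  # not more specific than the user's input
        # Keep the query only if the term appears as consecutive tokens.
        if any(tokens[i:i + n] == term_tokens
               for i in range(len(tokens) - n + 1)):
            candidates.append((query, freq))

    # Most frequent first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in candidates[:k]]


# Toy log (assumed data, for demonstration only).
log = [
    "asthma", "asthma treatment", "asthma treatment", "asthma in children",
    "Asthma  Treatment", "breast cancer", "asthma genetics",
    "asthma in children",
]
suggestions = related_queries(log, "asthma", k=2)
print(suggestions)
```

Ranking by raw frequency is the simplest reading of "popular"; a production system would likely also filter noisy or rare queries, which this sketch does not attempt.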
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Willy Wilbur
General and Semi-supervised Machine Learning Applied to Bioinformatics
- Grant number: 8558105
- Fiscal year:
- Funding amount: $774,000
- Project category:
Natural Language Processing Techniques To Enhance Information Access.
- Grant number: 8943224
- Fiscal year:
- Funding amount: $774,000
- Project category:
Automatic Analysis and Annotation of Document Keywords in Biomedical Literature
- Grant number: 8344960
- Fiscal year:
- Funding amount: $774,000
- Project category:
General and Semi-supervised Machine Learning Applied to Bioinformatics
- Grant number: 8149602
- Fiscal year:
- Funding amount: $774,000
- Project category:
General and Semi-supervised Machine Learning Applied to Bioinformatics
- Grant number: 8344948
- Fiscal year:
- Funding amount: $774,000
- Project category:
Similar overseas grants
Quantum Groups, W-algebras, and Brauer-Kauffmann Categories
- Grant number: 2401351
- Fiscal year: 2024
- Funding amount: $774,000
- Project category: Standard Grant
The geometry of braids and triangulated categories
- Grant number: DE240100447
- Fiscal year: 2024
- Funding amount: $774,000
- Project category: Discovery Early Career Researcher Award
Constructing and Classifying Pre-Tannakian Categories
- Grant number: 2401515
- Fiscal year: 2024
- Funding amount: $774,000
- Project category: Standard Grant
Representation Theory and Geometry in Monoidal Categories
- Grant number: 2401184
- Fiscal year: 2024
- Funding amount: $774,000
- Project category: Continuing Grant
Deformation of singularities through Hodge theory and derived categories
- Grant number: DP240101934
- Fiscal year: 2024
- Funding amount: $774,000
- Project category: Discovery Projects
Migrant Youth and the Sociolegal Construction of Child and Adult Categories
- Grant number: 2341428
- Fiscal year: 2024
- Funding amount: $774,000
- Project category: Standard Grant
Postdoctoral Fellowship: MPS-Ascend: Understanding Fukaya categories through Homological Mirror Symmetry
- Grant number: 2316538
- Fiscal year: 2023
- Funding amount: $774,000
- Project category: Fellowship Award
Collaborative Research: Derived Categories in Birational Geometry, Enumerative Geometry, and Non-commutative Algebra
- Grant number: 2302262
- Fiscal year: 2023
- Funding amount: $774,000
- Project category: Standard Grant
Eco Warrior: Expanding to new Plastic Free Categories
- Grant number: 10055912
- Fiscal year: 2023
- Funding amount: $774,000
- Project category: Grant for R&D