Mining Social Media Big Data for Toxicovigilance: Studying Substance Use via Natural Language Processing and Machine Learning Methods

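Basic Information

  • Grant number:
    10588855
  • Principal investigator:
  • Amount:
    $1.2724 million
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-09-30 to 2025-09-29
  • Project status:
    Not yet concluded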

Project Abstract

The epidemic of substance use (SU) and substance use disorder (SUD) in the United States has been evolving for decades. Both prescription and illicit drugs have been involved in overdose deaths over the years, with notable increases in synthetic opioids (e.g., fentanyl and its analogs) and psychostimulants (e.g., methamphetamine) in recent years. The emergence of high-potency novel psychoactive substances (NPSs), such as fentanyl analogs, has contributed drastically to rising deaths and has adversely impacted treatment engagement and response. The COVID-19 pandemic has further exacerbated the crisis, and recent studies have also highlighted that substantial disparities exist in SUD treatment, research, interest, and response across different subpopulations, with racial/ethnic minorities being disproportionately impacted. A key element in tackling the crisis is improved surveillance. Specifically, there is a need to establish novel approaches that provide timely insights into the trends, distributions, and trajectories of the SUD epidemic, as traditional surveillance approaches involve considerable lags. Many recent studies have identified social media (SM) as a useful resource for conducting SU/SUD surveillance. Many people use SM to discuss personal experiences, provide advice, or seek answers to questions regarding SU/SUD, generating an abundance of information. Such information can be characterized, aggregated, and analyzed to obtain population- or subpopulation-level insights at low cost and in near real time. However, converting SM data into timely, actionable knowledge is non-trivial because the data are big, complex, and noisy, requiring the development of advanced, automated artificial intelligence methods. Funded by the National Institute on Drug Abuse, our past work focused specifically on prescription medications (PMs) and established the most sophisticated SM-based data mining pipeline available to date.
In response to the evolution of the SUD epidemic, the proposed project will extend our capabilities to include illicit substances and develop novel methods to conduct surveillance. Specifically, we will (i) extend our machine learning and natural language processing (NLP) classification pipeline to automatically classify all SU-related chatter from Twitter and Reddit (rather than PMs only), (ii) collect and analyze longitudinal timelines of cohorts self-reporting SU/SUD, (iii) characterize the cohorts in terms of demographic details such as age group, gender identity, race, and geolocation, (iv) develop advanced NLP-driven methods for detecting NPSs and impacts of SU/SUD, (v) study short-term and long-term trends and trajectories of the epidemic, (vi) conduct observational studies on targeted population subsets, including studies focusing on SU and SUD treatment disparities and stigma, and (vii) disseminate developed methodologies via open-source code and aggregated findings publicly via a web-based dashboard. Implementation of our data-centric methods and successful execution of the project have the potential to transform SU/SUD surveillance and to complement traditional surveillance measures by providing near-real-time statistics and insights, including those for targeted subpopulations.
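As an illustration only, aims (i) and (v) can be read as a classify-then-aggregate step, sketched below in Python. The abstract does not specify the project's actual models, so a simple multinomial Naive Bayes classifier is assumed here as a stand-in. The class `NaiveBayes`, the function `monthly_counts`, the labels `use` and `other`, and every example post are hypothetical and invented for this sketch.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed stand-in model)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            n_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (n_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


def monthly_counts(posts, model, target="use"):
    """Count posts per YYYY-MM that the model assigns to `target`."""
    counts = Counter()
    for date, text in posts:
        if model.predict(text) == target:
            counts[date[:7]] += 1
    return dict(sorted(counts.items()))


# Invented toy training data; real work would need annotated posts.
train_texts = [
    "took two pills last night to get high",
    "i have been using fentanyl again this week",
    "new report on overdose statistics released today",
    "the pharmacy on main street closes at nine",
]
train_labels = ["use", "use", "other", "other"]
model = NaiveBayes().fit(train_texts, train_labels)

# Invented (date, text) posts to aggregate into a monthly trend.
posts = [
    ("2022-10-03", "got high on pills again last night"),
    ("2022-10-17", "city released new overdose report today"),
    ("2022-11-02", "using fentanyl this week"),
]
trend = monthly_counts(posts, model)
print(trend)  # {'2022-10': 1, '2022-11': 1}
```

The monthly counts are the kind of aggregate that a surveillance dashboard could plot over time; a production pipeline would presumably replace the toy classifier with models trained on annotated data.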

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Abeed H Sarker

Mining Social Media Big Data for Toxicovigilance: Automating the Monitoring of Prescription Medication Abuse via Natural Language Processing and Machine Learning Methods
  • Grant number:
    10001871
  • Fiscal year:
    2019
  • Funding amount:
    $1.2724 million
  • Project category:

Similar overseas grants

ADVANCED DEVELOPMENT OF LQ A LIPOSOME-BASED SAPONIN-CONTAINING ADJUVANT FOR USE IN PANSARBECOVIRUS VACCINES
  • Grant number:
    10935820
  • Fiscal year:
    2023
  • Funding amount:
    $1.2724 million
  • Project category:
ADVANCED DEVELOPMENT OF BBT-059 AS A RADIATION MEDICAL COUNTERMEASURE FOR DOSING UP TO 48H POST EXPOSURE
  • Grant number:
    10932514
  • Fiscal year:
    2023
  • Funding amount:
    $1.2724 million
  • Project category:
Advanced Development of a Combined Shigella-ETEC Vaccine
  • Grant number:
    10704845
  • Fiscal year:
    2023
  • Funding amount:
    $1.2724 million
  • Project category:
Advanced development of composite gene delivery and CAR engineering systems
  • Grant number:
    10709085
  • Fiscal year:
    2023
  • Funding amount:
    $1.2724 million
  • Project category:
Advanced Development of Gemini-DHAP
  • Grant number:
    10760050
  • Fiscal year:
    2023
  • Funding amount:
    $1.2724 million
  • Project category:
Advanced development and validation of an in vitro platform to phenotype brain metastatic tumor cells using artificial intelligence
  • Grant number:
    10409385
  • Fiscal year:
    2022
  • Funding amount:
    $1.2724 million
  • Project category:
ADVANCED DEVELOPMENT OF A VACCINE FOR PANDEMIC AND PRE-EMERGENT CORONAVIRUSES
  • Grant number:
    10710595
  • Fiscal year:
    2022
  • Funding amount:
    $1.2724 million
  • Project category:
Advanced development and validation of an in vitro platform to phenotype brain metastatic tumor cells using artificial intelligence
  • Grant number:
    10630975
  • Fiscal year:
    2022
  • Funding amount:
    $1.2724 million
  • Project category:
ADVANCED DEVELOPMENT OF A VACCINE CANDIDATE FOR STAPHYLOCOCCUS AUREUS INFECTION
  • Grant number:
    10710588
  • Fiscal year:
    2022
  • Funding amount:
    $1.2724 million
  • Project category:
ADVANCED DEVELOPMENT OF A VACCINE FOR PANDEMIC AND PRE-EMERGENT CORONAVIRUSES
  • Grant number:
    10788051
  • Fiscal year:
    2022
  • Funding amount:
    $1.2724 million
  • Project category: