Learning from Social Media Texts
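Basic Information
- Grant number: RGPIN-2018-05181
- Principal investigator:
- Amount: $24.8K
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Period: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract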
Applications in the field of Natural Language Processing (NLP) and Machine Learning (ML) have become popular in recent years. This is due to the availability of data and evaluation benchmarks, to progress in the algorithms, and finally to an increased need for these applications in our daily life and in commercial product development.
I propose to apply NLP and ML techniques for user modelling in social media. There are three specific objectives: (1) Learn user characteristics from social media texts. These characteristics may include: age, gender, personality type, location, ethnicity, health issues, interests, and life events. (2) Learn population distributions in various social media (e.g., Twitter, forums, Facebook) for each of these characteristics. (3) Use the extracted information, as a proof of concept, in applications such as marketing research and health monitoring.
The scientific approach that I propose is based on machine learning, deep learning, automatic text classification, and information extraction techniques. We will focus our attention on social media texts, which are more challenging than regular texts due to non-standard spelling, lack of editing, abbreviations, jargon, and noise. The techniques need to be adapted to this kind of text. The various ways of adapting them include retraining, partial normalization of the messages, and adding features specific to each type of social media. In addition, I propose to integrate techniques that exploit the structure of the social network. Most of the previous work on related topics uses either the texts of the messages or the network structure. I believe that combining them may lead to an increase in the precision of the extracted information. We will also pay special attention to protecting the privacy of social media users.
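As a rough illustration of what "partial normalization" and "features specific to each type of social media" could look like in practice, here is a minimal Python sketch. It is not the project's method: the abbreviation table, the `<url>` and `<user>` placeholder tokens, and the function names `normalize` and `social_features` are all assumptions made for this example.

```python
import re
from typing import Dict, List

# Hypothetical abbreviation lexicon, for illustration only; a real system
# would need a much larger, platform-specific resource.
ABBREVIATIONS = {"u": "you", "gr8": "great", "thx": "thanks", "pls": "please"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
ELONGATION_RE = re.compile(r"(.)\1{2,}")  # a character repeated 3+ times
TOKEN_RE = re.compile(r"<\w+>|#?\w+|[^\w\s]")


def normalize(message: str) -> List[str]:
    """Partially normalize a message and return its tokens."""
    text = message.lower()
    text = URL_RE.sub("<url>", text)         # mask links
    text = MENTION_RE.sub("<user>", text)    # mask user names
    text = ELONGATION_RE.sub(r"\1\1", text)  # "sooooo" -> "soo"
    tokens = TOKEN_RE.findall(text)
    # Expand known abbreviations; leave every other token unchanged.
    return [ABBREVIATIONS.get(tok, tok) for tok in tokens]


def social_features(message: str) -> Dict[str, int]:
    """Count a few surface cues typical of Twitter-style messages."""
    return {
        "n_hashtags": message.count("#"),
        "n_mentions": len(MENTION_RE.findall(message)),
        "n_urls": len(URL_RE.findall(message)),
        "n_elongations": len(ELONGATION_RE.findall(message)),
    }
```

The normalization is deliberately partial: hashtags and elongated words are kept in reduced form rather than removed, since they may carry signal about the author. Masking user names with a placeholder also keeps identifying handles out of the feature space, in line with the privacy concern stated above.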
The novelty of the proposed work lies in a comprehensive investigation of the existing techniques, in increasing their sophistication, and in developing new techniques for the proposed tasks (particularly at the population level, a less-studied area), as well as in the development of several proof-of-concept applications that require information about users or about populations of users in social media.
We anticipate that the outcomes of the research will contribute to better understanding of human communication in social media, which will allow advances in mining the social media for market research purposes and other decision-making applications, and will strengthen Canada's position as a major player in the field of information technology.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Inkpen, Diana
A Machine Learning Approach for Identifying Disease-Treatment Relations in Short Texts
- DOI: 10.1109/tkde.2010.152
- Published: 2011-06-01
- Journal:
- Impact factor: 8.9
- Authors: Frunza, Oana; Inkpen, Diana; Tran, Thomas
- Corresponding author: Tran, Thomas
A survey of book recommender systems
- DOI: 10.1007/s10844-017-0489-9
- Published: 2018-08-01
- Journal:
- Impact factor: 3.4
- Authors: Alharthi, Haifa; Inkpen, Diana; Szpakowicz, Stan
- Corresponding author: Szpakowicz, Stan
Prior and contextual emotion of words in sentential context
- DOI: 10.1016/j.csl.2013.04.009
- Published: 2014-01-01
- Journal:
- Impact factor: 4.3
- Authors: Ghazi, Diman; Inkpen, Diana; Szpakowicz, Stan
- Corresponding author: Szpakowicz, Stan
Location detection and disambiguation from twitter messages
- DOI: 10.1007/s10844-017-0458-3
- Published: 2017-10-01
- Journal:
- Impact factor: 3.4
- Authors: Inkpen, Diana; Liu, Ji; Ghazi, Diman
- Corresponding author: Ghazi, Diman
Multi-task learning to detect suicide ideation and mental disorders among social media users.
- DOI: 10.3389/frma.2023.1152535
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Buddhitha, Prasadith; Inkpen, Diana
- Corresponding author: Inkpen, Diana
Other grants by Inkpen, Diana
Learning from Social Media Texts
- Grant number: RGPIN-2018-05181
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Learning from Social Media Texts
- Grant number: RGPIN-2018-05181
- Fiscal year: 2021
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Multi-modal and multi-lingual child safety application
- Grant number: 538430-2018
- Fiscal year: 2020
- Funding amount: $24.8K
- Project category: Collaborative Research and Development Grants
Learning from Social Media Texts
- Grant number: RGPIN-2018-05181
- Fiscal year: 2019
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Multi-modal and multi-lingual child safety application
- Grant number: 538430-2018
- Fiscal year: 2019
- Funding amount: $24.8K
- Project category: Collaborative Research and Development Grants
Identification and validation of performance indicators for SMEs
- Grant number: 530390-2018
- Fiscal year: 2018
- Funding amount: $24.8K
- Project category: Engage Grants Program
Learning from Social Media Texts
- Grant number: RGPIN-2018-05181
- Fiscal year: 2018
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Learning from Social Media Texts
- Grant number: RGPIN-2017-04323
- Fiscal year: 2017
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Social web mining and sentiment analysis for mental illness detection
- Grant number: 478857-2015
- Fiscal year: 2017
- Funding amount: $24.8K
- Project category: Strategic Projects - Group
Social media text mining for detecting behavioral and psychological conditions in children
- Grant number: 499383-2016
- Fiscal year: 2017
- Funding amount: $24.8K
- Project category: Collaborative Research and Development Grants
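Similar NSFC (National Natural Science Foundation of China) grants
A functional hypothesis for chorus rhythm in small apes: advertising social relationships (social bond advertising), a validation study
- Grant number:
- Approval year: 2025
- Funding amount: 100,000 CNY
- Project category: Provincial/municipal-level project
Behavioral Insights on Cooperation in Social Dilemmas
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Research Fund for International Excellent Young Scientists
Navigating Sustainability: Understanding Environment, Social and Governance Challenges and Solutions for Chinese Enterprises in Pakistan's CPEC Framework
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Research Fund for International Scientists
Research on the underlying mechanisms and application framework of social tagging in multilingual environments: a comparative perspective
- Grant number: 71103203
- Approval year: 2011
- Funding amount: 210,000 CNY
- Project category: Young Scientists Fund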
Similar overseas grants
Cross-modal Deep Learning of Sizzle Representation for Social Media Data
- Grant number: 23K11340
- Fiscal year: 2023
- Funding amount: $24.8K
- Project category: Grant-in-Aid for Scientific Research (C)
Collaborative Research: AF: Small: Promoting Social Learning Amid Interference in the Age of Social Media
- Grant number: 2208663
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Standard Grant
Collaborative Research: AF: Small: Promoting Social Learning Amid Interference in the Age of Social Media
- Grant number: 2208662
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Standard Grant
Mining Social Media Big Data for Toxicovigilance: Studying Substance Use via Natural Language Processing and Machine Learning Methods
- Grant number: 10588855
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category:
Unsupervised Representation Learning and Abstractive Summarization of Stances in Social Media: Towards Explainable Detection Systems for Emerging Rumours
- Grant number: RGPIN-2022-04789
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Learning from Social Media Texts
- Grant number: RGPIN-2018-05181
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual
Unsupervised Representation Learning and Abstractive Summarization of Stances in Social Media: Towards Explainable Detection Systems for Emerging Rumours
- Grant number: DGECR-2022-00410
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Discovery Launch Supplement
Collaborative Research: AF: Small: Promoting Social Learning Amid Interference in the Age of Social Media
- Grant number: 2208664
- Fiscal year: 2022
- Funding amount: $24.8K
- Project category: Standard Grant
Identifying drug and alcohol displays on social media using a machine learning approach, and mechanisms that impact adolescent substance use
- Grant number: 10314603
- Fiscal year: 2021
- Funding amount: $24.8K
- Project category:
Learning from Social Media Texts
- Grant number: RGPIN-2018-05181
- Fiscal year: 2021
- Funding amount: $24.8K
- Project category: Discovery Grants Program - Individual