SGER: Multilingual Online Stylometric Authorship Identification: An Exploratory Study

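Basic information

  • Award number:
    0646942
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2006
  • Funding country:
    United States
  • Duration:
    2006-09-01 to 2008-02-29
  • Project status:
    Completed

Project abstract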

Online communication media such as email, web sites, newsgroups, online forums, and chat rooms have become ubiquitous in everyday life. Unfortunately, online channels are also being misused to distribute unsolicited and inappropriate information (e.g., extremist propaganda, spam, and online gambling). The anonymous nature of these channels makes them an ideal means of communication for criminal groups and extremist organizations. Additionally, the evolution of the Internet into a major international communication medium has added a multilingual dimension. Authorship analysis has been used to analyze long, precise English texts such as the plays of Shakespeare (authorship identification) or students' class papers (plagiarism detection). Few past studies have addressed the multilingual issues of short online communications. The language-specific stylistic characteristics and the informal nature of online communications present unique research challenges. This exploratory project aims to develop a comprehensive framework and associated text mining techniques for multilingual online stylometric feature extraction and authorship classification, initially focusing on two languages, English and Arabic. The linguistic differences between these two languages will allow evaluation of common stylistic representations and exploration of language-specific problems. The goal is to develop scalable online authorship analysis techniques that can be used to analyze hundreds to thousands of anonymous authors (a common scenario for web communications). Novel feature (subset) selection techniques will help reduce the high dimensionality of online writing features.
The primary intellectual contributions of this research are expected to be: (a) development and evaluation of new text mining techniques that may be suitable for identity tracing in cyberspace; (b) creation of new representations of people's identities using online "Writeprints" (i.e., the representation of people's key online writing style features); and (c) evaluation of the effectiveness of different multilingual stylistic features and classification techniques for improving identification scalability and robustness. The anticipated broader impacts of this research include: building a foundation for further cyber trust research; improving intelligence and law enforcement agencies' abilities to detect, prevent, and respond to cyber crimes and terrorist events via the Internet; and providing a large-scale research corpus and feature extraction resources for information scientists, political and social scientists, and terrorism researchers. The project web site (http://ai.arizona.edu/authorship) will be used for broad dissemination of project results.
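
As a rough illustration of the kind of stylometric feature extraction and authorship classification the abstract describes, the following is a minimal sketch and not the project's own Writeprints method. The feature set (character bigrams, a word-length distribution, punctuation and digit rates), the nearest-centroid cosine classifier, and every function name are assumptions chosen for illustration. The features rely only on Unicode properties, so the same code accepts English and Arabic text.

```python
import math
import unicodedata
from collections import Counter


def style_features(text, n=2):
    """Map a message to a sparse dict of relative-frequency style features."""
    cleaned = " ".join(text.lower().split())  # normalize case and whitespace
    features = {}
    # Character n-grams capture spelling and punctuation habits in any script.
    grams = [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]
    for gram, count in Counter(grams).items():
        features["ng:" + gram] = count / len(grams)
    # Word-length distribution (lengths above 10 share one bucket).
    words = cleaned.split()
    for length, count in Counter(min(len(w), 10) for w in words).items():
        features["wl:%d" % length] = count / len(words)
    # Unicode category "P*" covers Latin and Arabic punctuation alike.
    chars = len(cleaned) or 1
    features["punct_rate"] = sum(
        unicodedata.category(c).startswith("P") for c in cleaned) / chars
    features["digit_rate"] = sum(c.isdigit() for c in cleaned) / chars
    return features


def centroid(vectors):
    """Average several sparse feature dicts into one author profile."""
    total = {}
    for vec in vectors:
        for key, value in vec.items():
            total[key] = total.get(key, 0.0) + value
    return {key: value / len(vectors) for key, value in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse feature dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def train(samples):
    """Build one profile per author from (author, text) pairs."""
    by_author = {}
    for author, text in samples:
        by_author.setdefault(author, []).append(style_features(text))
    return {author: centroid(vecs) for author, vecs in by_author.items()}


def predict(profiles, text):
    """Return the author whose profile is most similar to the text."""
    vec = style_features(text)
    return max(profiles, key=lambda author: cosine(profiles[author], vec))


if __name__ == "__main__":
    # Toy messages invented for this example; not project data.
    profiles = train([
        ("author_a", "hey!!! lol u there??? cya!!!"),
        ("author_b", "Dear colleague, I have reviewed the attached document carefully."),
    ])
    print(predict(profiles, "lol!!! u ok???"))
```

With hundreds or thousands of candidate authors, a sparse feature space like this grows quickly, which is where the feature (subset) selection mentioned in the abstract would come in; this sketch does not attempt it.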

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Hsinchun Chen

Chapter 7 Spatio-Temporal Data Analysis in Security Informatics
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    D. Zeng; Hsinchun Chen; Wei Chang
  • Corresponding author:
    Wei Chang
AI, E-government, and Politics 2.0
  • DOI:
    10.1109/mis.2009.91
  • Publication date:
    2009-09
  • Journal:
  • Impact factor:
    6.4
  • Authors:
    Hsinchun Chen
  • Corresponding author:
    Hsinchun Chen
Fostering Cybersecurity Big Data Research: A Case Study of the AZSecure Data System
  • DOI:
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Resha Shenandoah; Sagar Samtani; Mark W. Patton; Hsinchun Chen
  • Corresponding author:
    Hsinchun Chen
Approach on the Vocabulary Problem in Collaboration
  • DOI:
  • Publication date:
    1993
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hsinchun Chen
  • Corresponding author:
    Hsinchun Chen
Chapter 10 Social Network Analysis for Terrorism Research
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    E. Reid; Hsinchun Chen; J. Xu
  • Corresponding author:
    J. Xu

Other grants by Hsinchun Chen

CICI: UCSS: Enhancing the Usability of Vulnerability Assessment Results for Open-Source Software Technologies in Scientific Cyberinfrastructure: A Deep Learning Perspective
  • Award number:
    2319325
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
EAGER: SaTC-EDU: Artificial Intelligence and Cybersecurity Research and Education at Scale
  • Award number:
    2038483
  • Fiscal year:
    2020
  • Award amount:
    --
  • Project type:
    Standard Grant
SaTC: CORE: Small: Cybersecurity Big Data Research for Hacker Communities: A Topic and Language Modeling Approach
  • Award number:
    1936370
  • Fiscal year:
    2019
  • Award amount:
    --
  • Project type:
    Standard Grant
CICI: SSC: Proactive Cyber Threat Intelligence and Comprehensive Network Monitoring for Scientific Cyberinfrastructure: The AZSecure Framework
  • Award number:
    1917117
  • Fiscal year:
    2019
  • Award amount:
    --
  • Project type:
    Standard Grant
Cybersecurity Scholarship-for-Service Renewal at The University of Arizona: The AZSecure SFS Program
  • Award number:
    1921485
  • Fiscal year:
    2019
  • Award amount:
    --
  • Project type:
    Continuing Grant
EAGER: A Longitudinal Study of Knowledge Diffusion and Societal Impact of Nanomanufacturing Research & Development: Harnessing Data for Science and Engineering
  • Award number:
    1832926
  • Fiscal year:
    2018
  • Award amount:
    --
  • Project type:
    Continuing Grant
Cybersecurity Big Data and Analytics Sharing Platform
  • Award number:
    1719477
  • Fiscal year:
    2017
  • Award amount:
    --
  • Project type:
    Standard Grant
EAGER: A Systems Approach for Identification and Evaluation of Nanoscience and Nanomanufacturing Opportunities and Risks
  • Award number:
    1442116
  • Fiscal year:
    2014
  • Award amount:
    --
  • Project type:
    Standard Grant
CIF21 DIBBs: DIBBs for Intelligence and Security Informatics Research Community
  • Award number:
    1443019
  • Fiscal year:
    2014
  • Award amount:
    --
  • Project type:
    Standard Grant
SBE TTP: Medium: Securing Cyber Space: Understanding the Cyber Attackers and Attacks via Social Media Analytics
  • Award number:
    1314631
  • Fiscal year:
    2013
  • Award amount:
    --
  • Project type:
    Standard Grant

Similar overseas grants

Computational approach to security dilemma: understanding state rivalry through multilingual longitudinal analysis of foreign news
  • Award number:
    23K25490
  • Fiscal year:
    2024
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (B)
ELOQUENCE - Multilingual and Cross-cultural interactions for context-aware, and bias-controlled dialogue systems for safety-critical applications
  • Award number:
    10092660
  • Fiscal year:
    2024
  • Award amount:
    --
  • Project type:
    EU-Funded
Preparing Science Teachers To Engage Multilingual Learners in Scientific Argumentation Through Mixed-Reality Simulations
  • Award number:
    2321205
  • Fiscal year:
    2024
  • Award amount:
    --
  • Project type:
    Standard Grant
The role of English in multilingual online transgender communities
  • Award number:
    2873142
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Studentship
The polyglot writer: what multilingual texts reveal about writers' emotional attachment to the languages they speak
  • Award number:
    2887779
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Studentship
Expanding sixth-grade youth's understanding of engineering through critical multilingual journalism
  • Award number:
    2300726
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Continuing Grant
Unifying Pre-training and Multilingual Semantic Representation Learning for Low-resource Neural Machine Translation
  • Award number:
    22KJ1843
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for JSPS Fellows
Expanding Access to Care for Marginalized Caregivers through Innovative Methods for Multicultural and Multilingual Adaptation of AI-Based Health Technologies
  • Award number:
    10741177
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
Multilingual corpus construction and domain adaptation for low-resource machine translation
  • Award number:
    22KJ1724
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for JSPS Fellows
Research on multilingual data integration for digital archives of Japanese culture
  • Award number:
    23K11780
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)