IntelliText - Intelligent Tools for Creating and Analysing Electronic Text Corpora for Humanities Research

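Basic information

  • Grant number:
    AH/H037306/1
  • Principal investigator:
  • Amount:
    $203,000
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2010
  • Funding country:
    United Kingdom
  • Start and end dates:
    2010 to (no data)
  • Project status:
    Completed

Project abstract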

Much humanities research relies on or would benefit from analysis of electronic corpora: representative collections of texts (such as books, newspaper articles and technical manuals in computer-readable format), which may also be annotated with linguistic or domain information. The main advantage of using corpora over hand-picked examples is the ability to collect data systematically, to assess the centrality of certain features to the research material, and to establish experimentally potential trends in the data. Projects which rely on electronic corpora can be expected to have greater academic and social impact, thanks to increased consistency in data analysis.
However, the major difficulty faced by corpus-based studies in humanities research is that creating and annotating a new corpus and designing an appropriate search engine for textual analysis require complex technical support, e.g., expertise in programming and web development. Such a level of technical expertise is often unavailable to smaller humanities projects, but even larger corpus-based projects often miss opportunities for data analysis because of inadequate methodological or technological support for relevant computational aspects. Even when a corpus already exists, the task of building appropriate computational tools for analysing, intelligently searching and visualising the data still remains too challenging for many potential humanities projects.
Humanities researchers' lack of awareness of modern computational techniques for corpus-based studies can seriously limit the scope and the impact of any planned research projects. Moreover, computer scientists who design corpus-based tools frequently do not understand the specific needs of humanities research; their tools are often difficult to adapt to a specific project, or lack an intuitive interface and documentation. As a result, the existence of several non-trivial computational techniques with the power to collect and prepare corpus material and reveal new dependencies and patterns in the data has been overlooked in the humanities. Thus important potential synergies for research have been neglected.
IntelliText's novel contribution will be to tune advanced tools and methods from computer science to the needs of humanities researchers, integrating them into a single software application with a simple interface and good documentation. This will allow humanities researchers with no specialised background in computer science or corpus linguistics to take advantage of powerful methods of text collection and analysis. It will enable them to collect new project corpora from the web, have them enriched automatically with linguistic and other annotations, and then easily uncover interesting patterns of usage, starting either from their own intuitions and hypotheses, or from expressions and patterns identified as potentially noteworthy by the system.
The software will be designed and tested in novel applications by researchers interested in the stylistic features of translated text, in language learning and contrastive linguistics, and in detecting and describing shifts in sentiment and opinion. These will demonstrate its generalisability for addressing the needs of a wide spectrum of humanities researchers, including historians and specialists in literature, media and corporate or government communications, all of whom are represented on the Project Board.
IntelliText will be made freely available for research purposes as Open Source software, introducing these tools and methods into fresh areas and permitting further extensions by the user community after funding ends.
In short, the impact of IntelliText will be to strengthen the theoretical foundations of many humanities disciplines by enabling a much larger community of researchers than hitherto to make testable predictions, and then to verify them by reference to solid corpus evidence uncovered by advanced and automated analytical techniques.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Anthony Hartley

The Role of Scaffolding and Visualisation in Supporting Collaborative Translator Training: The Case of Minna no Hon'yaku for Translator Training (MNH-TT)
  • DOI:
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kyo Kageura;Takeshi Abekawa;Martin Thomas;Atsushi Fujita;Anthony Hartley;Kikuko Tanabe;Chiho Toyoshima;Masao Utiyama
  • Corresponding author:
    Masao Utiyama
Creation and exploitation of translation revision data in MNH-TT environment
  • DOI:
  • Publication year:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Masao Utiyama;Anthony Hartley;Kyo Kageura and Martin Thomas
  • Corresponding author:
    Kyo Kageura and Martin Thomas
みんなの翻訳実習~みんなの翻訳第5報~
Minna no Hon'yaku translation practicum: Minna no Hon'yaku, fifth report
  • DOI:
  • Publication year:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    内山将夫;影浦峡;Anthony Hartley;Martin Thomas
  • Corresponding author:
    Martin Thomas
NIRSを用いた人物の嗜好判断時の脳活動の分析
Analysis of brain activity during personal preference judgments using NIRS
  • DOI:
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Anthony Hartley;Martin Thomas;Masao Utiyama and Kyo Kageura;藤田和生(編著)日本動物心理学会(監修);谷田遥香,加藤俊一
  • Corresponding author:
    谷田遥香,加藤俊一
校閲カテゴリ体系に基づく翻訳学習者の誤り傾向の分析
Analysis of translation learners' error tendencies based on a revision category scheme
  • DOI:
  • Publication year:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    影浦峡;藤田篤;内山将夫;Anthony Hartley;山田優;阿辺川武;Martin Thomas;豊島知穂・藤田篤・田辺希久子・影浦峡・Anthony Hartley
  • Corresponding author:
    豊島知穂・藤田篤・田辺希久子・影浦峡・Anthony Hartley

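Similar NSFC grants

Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
  • Grant number:
  • Approval year:
    2024
  • Funding amount:
    (amount not given; unit: 10,000 CNY)
  • Project category:
    Research Fund for International Scientists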

Similar overseas grants

PFI-RP: Architectural design and intelligent control tools for decarbonizing space cooling and heating in buildings
  • Grant number:
    2234630
  • Fiscal year:
    2023
  • Funding amount:
    $203,000
  • Project category:
    Standard Grant
Sensor Hardware and Intelligent Tools for Assessing the Health Effects of Heat Exposure
  • Grant number:
    10522560
  • Fiscal year:
    2022
  • Funding amount:
    $203,000
  • Project category:
Sensor Hardware and Intelligent Tools for Assessing the Health Effects of Heat Exposure
  • Grant number:
    10703469
  • Fiscal year:
    2022
  • Funding amount:
    $203,000
  • Project category:
Intelligent architectures and tools for Internet of Vehicles
  • Grant number:
    RGPIN-2018-04507
  • Fiscal year:
    2022
  • Funding amount:
    $203,000
  • Project category:
    Discovery Grants Program - Individual
Protecting the EuRopean terrItory from organised enVironmentAl crime through inteLLigent threat detectiON tools (PERIVALLON)
  • Grant number:
    10040457
  • Fiscal year:
    2022
  • Funding amount:
    $203,000
  • Project category:
    EU-Funded
Can software tools enable improved configuration, operation and exploitation of distributed and intelligent sensors for marine industrial applications
  • Grant number:
    2582893
  • Fiscal year:
    2021
  • Funding amount:
    $203,000
  • Project category:
    Studentship
NSERC/SEASPAN Industrial Research Chairs in intelligent and green marine vessels (IGMVs): Advanced Tools and Techniques for Multiphysics Prediction and Design Optimization
  • Grant number:
    550069-2019
  • Fiscal year:
    2021
  • Funding amount:
    $203,000
  • Project category:
    Industrial Research Chairs
Intelligent architectures and tools for Internet of Vehicles
  • Grant number:
    RGPIN-2018-04507
  • Fiscal year:
    2021
  • Funding amount:
    $203,000
  • Project category:
    Discovery Grants Program - Individual
Dynamical Systems Diagnostics for Intelligent Machine Tools
  • Grant number:
    2053470
  • Fiscal year:
    2021
  • Funding amount:
    $203,000
  • Project category:
    Standard Grant
NSERC/SEASPAN Industrial Research Chairs in intelligent and green marine vessels (IGMVs): Advanced Tools and Techniques for Multiphysics Prediction and Design Optimization
  • Grant number:
    550071-2019
  • Fiscal year:
    2021
  • Funding amount:
    $203,000
  • Project category:
    Industrial Research Chairs