Natural Language Processing for Financial Market Modelling and Forecasting

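Basic information

  • Grant number:
    2094258
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Studentship
  • Fiscal year:
    2018
  • Funding country:
    United Kingdom
  • Start and end dates:
    2018 to (no data)
  • Project status:
    Completed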

Project abstract

1. Augmenting topic-sentiment models for financial forecasting

In the recent past, language analysis in finance has been approached from different research directions. One important dimension of language, sentiment, as analysed by Antweiler and Frank (2004), Hu and Liu (2004), Bollen (2011), Si et al. (2013) and Levenberg et al. (2014), focusses on the mood and emotion conveyed in text data sources such as online stock message boards, social media posts or financial news. The other major linguistic dimension, the conveyed story or narrative, can for example be approximated by estimating probabilistic topic models such as Latent Dirichlet Allocation (LDA), introduced by Blei et al. (2003). However, focusing on only one of these language dimensions can leave relevant linguistic information unused. Recently, more holistic modelling approaches have attempted to model the full dimensionality of language for financial forecasting by combining sentiment and topic modelling (Nguyen and Shirai, 2015).

While the authors measure an increased forecast performance of topic-sentiment models on financial-market-related indicators, I believe natural language processing for financial forecasting can be further adjusted to better match the actual time-series characteristics of financial and economic data. For instance, Latent Dirichlet Allocation assumes both that topics do not change over time and that topics are uncorrelated. These assumptions might turn out to be too strong when analysing textual time-series data in finance. I would be interested in extending such sentiment-topic models with features that allow for topic correlation (Blei and Lafferty, 2006a) or topic evolution (Blei and Lafferty, 2006b). Another potential limitation of the model in Nguyen and Shirai (2015) is its assumption of an exogenously determined number of topics. Teh et al. (2005) developed a hierarchical Dirichlet process, which endogenises this parameter.
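
The LDA assumptions discussed above can be made concrete with a minimal sketch. It fits plain LDA by collapsed Gibbs sampling, which is a common inference method chosen here for brevity and not necessarily the one used in the project or in Blei et al. (2003). The function name `fit_lda`, the hyperparameter values `alpha` and `beta`, and the toy corpus are all assumptions made for illustration. Note that `num_topics` has to be supplied by the caller (the exogenous number of topics) and that the documents carry no time index, so the topics cannot evolve.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents.

    Returns (doc_topic, vocab, topic_word): doc_topic[d] holds the smoothed
    topic proportions of document d, topic_word[k] the word counts of topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    n_dk = [[0] * num_topics for _ in docs]              # document-topic counts
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # topic-word counts
    n_k = [0] * num_topics                               # tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    doc_topic = [
        [(n_dk[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return doc_topic, vocab, n_kw


# Toy corpus (made up for illustration only).
docs = [
    "fed raises target rate as inflation climbs".split(),
    "inflation outlook pushes fed toward rate hike".split(),
    "tech stocks rally on strong earnings".split(),
    "earnings beat lifts stocks and tech shares".split(),
]
doc_topic, vocab, topic_word = fit_lda(docs, num_topics=2)
for row in doc_topic:
    print([round(p, 2) for p in row])
```

Each document's topic proportions come from a single Dirichlet prior with parameter `alpha`, which is why plain LDA has no way to express that two topics tend to occur together. The correlated, dynamic and hierarchical Dirichlet process variants cited above each relax one of these restrictions.
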
It would be interesting to test whether such specifications, alone or in combination, yield better financial time-series forecasting performance.

2. Application of topic-sentiment analysis to forecast monetary policy decisions

In financial and economic theory, fluctuations of markets are often explained by the occurrence of exogenous shocks to the economy or financial system. One class of such shocks, namely monetary policy shocks, represents central bank decisions about changing the target interest rate that cannot be explained by contemporaneous and forecasted values of the macroeconomic variables relevant for monetary policy decision making. I follow the methodology brought forward by Romer and Romer (2004) to estimate such monetary policy shocks. That is, I first regress monetary policy decisions on contemporaneous and forecasted macroeconomic data on inflation, real GDP growth and unemployment. The residuals of this regression represent movements in monetary policy that cannot be explained by the quantitative economic data underlying conventional monetary policy. I then assess whether narrative effects carry explanatory power for these monetary policy shocks (the regression residuals).

I use topic-sentiment models (as described earlier in my proposal) to identify whether changes in a) the narrative of central bank internal reports and b) newspaper articles on political, business, financial and economic events carry predictive power for these monetary policy shocks. Focusing on the period 2000-2011, I programme machine learning procedures in Python to estimate probabilistic topic models and topic-level sentiment scores on a dataset of over 500,000 articles from leading US newspapers as well as over 200 central bank reports. The central bank internal reports are created for each regularly held FOMC meeting of the US Federal Reserve board members. In each of these FOMC meetings, the members decide on the target interest rate.
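
The two-step procedure described above can be sketched as follows. This is a simplified illustration, not the specification of Romer and Romer (2004): it uses only the three regressors named in the abstract, all data are synthetic placeholders generated from an assumed rule, and the helper `ols` and the `narrative` feature are hypothetical names introduced here. The first regression yields the residuals that serve as monetary policy shocks, and the second regresses those shocks on a stand-in topic-sentiment feature.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows without a constant column; an intercept is added
    here. Returns (coefficients, residuals) with the intercept first.
    """
    rows = [[1.0] + [float(x) for x in r] for r in X]
    p = len(rows[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-12:
            raise ValueError("regressors are collinear")
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    coef = [A[i][p] / A[i][i] for i in range(p)]
    resid = [yi - sum(b * x for b, x in zip(coef, r)) for r, yi in zip(rows, y)]
    return coef, resid


rng = random.Random(0)

# Synthetic placeholder data, one row per policy meeting:
# [inflation forecast, real GDP growth forecast, unemployment rate].
macro = [[rng.uniform(1, 4), rng.uniform(-1, 4), rng.uniform(4, 9)]
         for _ in range(60)]
# Synthetic change in the target rate: an assumed linear rule plus noise.
rate_change = [0.1 + 0.3 * infl + 0.2 * gdp - 0.1 * unemp + rng.gauss(0, 0.25)
               for infl, gdp, unemp in macro]

# Step 1: the residuals are the estimated monetary policy shocks.
_, shocks = ols(macro, rate_change)

# Step 2: regress the shocks on a hypothetical topic-sentiment feature
# (random numbers here, so no relationship should be expected).
narrative = [[rng.gauss(0, 1)] for _ in macro]
coef, _ = ols(narrative, shocks)
print("narrative coefficient:", round(coef[1], 3))
```

In practice the second step would use features derived from the fitted topic-sentiment model, and a statistics library would supply standard errors for the coefficient.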

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants

An implantable biosensor microsystem for real-time measurement of circulating biomarkers
  • Grant number:
    2901954
  • Fiscal year:
    2028
  • Funding amount:
    --
  • Project category:
    Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
  • Grant number:
    2896097
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
A Robot that Swims Through Granular Materials
  • Grant number:
    2780268
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
  • Grant number:
    2908918
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
  • Grant number:
    2908693
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
  • Grant number:
    2908917
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
  • Grant number:
    2879438
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
  • Grant number:
    2890513
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
CDT year 1 so TBC in Oct 2024
  • Grant number:
    2879865
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
  • Grant number:
    2876993
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship

Similar overseas grants

Navigating Chemical Space with Natural Language Processing and Deep Learning
  • Grant number:
    EP/Y004167/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Research Grant
REU Site: Recent Advances in Natural Language Processing
  • Grant number:
    2349452
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Studies of speech, image and natural language processing for multimodal spoken document retrieval
  • Grant number:
    23K11216
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Efficient and Fair Language Modelling for Natural Language Processing, investigating lightweight language modelling approaches and aiming at fairness
  • Grant number:
    2894795
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Studentship
SBIR Phase I: Sown To Grow - Measuring Growth in Trusting Relationships between Students and Educators with Natural Language Processing and Machine Learning Technologies
  • Grant number:
    2322340
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Standard Grant
Collaborative Research: EAGER: Developing and Optimizing Reflection-Informed STEM Learning and Instruction by Integrating Learning Technologies with Natural Language Processing
  • Grant number:
    2329273
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Standard Grant
Harmony AI: Natural Language Processing Enabling Advanced Biomanufacturing
  • Grant number:
    10761082
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Collaborative Research: EAGER: Developing and Optimizing Reflection-Informed STEM Learning and Instruction by Integrating Learning Technologies with Natural Language Processing
  • Grant number:
    2329274
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Standard Grant
CAREER: Data-driven design of graphene oxide for environmental applications enabled by natural language processing and machine learning techniques
  • Grant number:
    2238415
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Continuing Grant
Applying Natural Language Processing to real-world patient data to optimise cancer care
  • Grant number:
    2897525
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Studentship