NECO: Architectural Support For Fault Management

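Basic information

  • Grant number:
    0831647
  • Principal investigator:
  • Amount:
    $225,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2008
  • Funding country:
    United States
  • Project period:
    2008-09-01 to 2012-08-31
  • Project status:
    Completed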

Project abstract

The Internet today is a global infrastructure used by one and all, and it is the lifeline of many businesses. Yet it still lacks the reliability and robustness we expect from critical infrastructure. While the network has built-in fault tolerance, it is typically restricted to hard failures; the network is equipped neither to detect nor to react to soft failures such as high delays or losses, which may stem from faulty implementations, hardware failures, or congestion. To support delay- and loss-sensitive applications, it is important to devise efficient mechanisms that quickly detect such failures and recover from them in an automated fashion.

Network devices provide very little support for debugging performance problems, let alone for automatically responding to them. Operators today inject active probes between pairs of provider edge routers to detect forwarding problems, such as high delays and losses, in their networks. They then typically rely on indirect inference algorithms, such as tomographic approaches, that infer root causes by joining active-probe results with network topology snapshots. Because the problem is fundamentally under-constrained, inference is approximate at best and can be quite slow. Automated response mechanisms, by contrast, require fast and accurate localization to be effective.

The main focus of this research is to create novel in-network fault management mechanisms that automatically detect, localize, report, and respond to failures and other performance degradations. The research revolves around three basic ideas: 1) equipping routers with specialized high-speed, low-complexity measurement primitives to measure delay and loss; 2) using this feedback to allow routing protocols to automatically respond to congestion or chronic failure conditions; and 3) integrating these mechanisms into NetFlow to provide per-flow delay and loss.
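
As a rough illustration of what "per-flow delay and loss" means, the following minimal Python sketch computes both quantities offline from packet observations at two measurement points. It is not the project's router primitive, which the abstract describes only at a high level; the function name `per_flow_stats` and the record layout `(flow_id, packet_id, timestamp_seconds)` are assumptions made for this example, and it further assumes the two points share a synchronized clock.

```python
from collections import defaultdict
from statistics import mean


def per_flow_stats(ingress, egress):
    """Compute per-flow loss rate and mean one-way delay.

    Hypothetical input format (an assumption of this sketch):
    each record is (flow_id, packet_id, timestamp_seconds), observed
    at an ingress point and an egress point with synchronized clocks.
    """
    # Index egress observations by (flow, packet) for constant-time lookup.
    seen_out = {(flow, pkt): ts for flow, pkt, ts in egress}

    delays = defaultdict(list)  # flow -> delays of packets that arrived
    sent = defaultdict(int)     # flow -> packets seen at ingress

    for flow, pkt, ts_in in ingress:
        sent[flow] += 1
        ts_out = seen_out.get((flow, pkt))
        if ts_out is not None:
            delays[flow].append(ts_out - ts_in)

    stats = {}
    for flow, n_sent in sent.items():
        flow_delays = delays[flow]
        stats[flow] = {
            # Fraction of ingress packets never observed at egress.
            "loss_rate": 1 - len(flow_delays) / n_sent,
            # Undefined (None) when every packet of the flow was lost.
            "mean_delay": mean(flow_delays) if flow_delays else None,
        }
    return stats
```

Calling `per_flow_stats(ingress_records, egress_records)` returns a dictionary keyed by flow identifier. Storing every packet, as this sketch does, would not scale to line rate; the abstract's emphasis on high-speed, low-complexity primitives points at exactly that constraint.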
Together, these mechanisms provide a rich set of tools for network operators and administrators to monitor and manage their networks efficiently and to improve the overall fault tolerance of these networks.

Intellectual Merit: This research will contribute to the established research area of network fault tolerance by allowing the network to automatically detect and respond to soft failures; it will open new areas of research on scalable router primitives for high-fidelity measurements; and it will significantly enhance the capabilities of routers by incorporating new definitions of flows commensurate with the latest innovations and advancements in the Internet. The research direction combines three disparate areas (scalable router primitives, fault-tolerant routing protocols, and measurement) to significantly improve the robustness of a critical infrastructure, the Internet. The work has the potential to provide the powerful fault management platform the Internet requires to sustain the next generation of delay-sensitive and interactive applications.

Broad Impact: The project will contribute to the education of the next generation of networking engineers and designers. The researchers will disseminate the results through traditional academic channels and will actively participate in industry forums, leveraging their collaborations with AT&T and Cisco. They will also design and revise courses in Networking and Router Architectures in the graduate and undergraduate curriculum, and supervise Ph.D., Master's, and undergraduate projects through the Honors program.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Ramana Kompella

CAREER: Towards a High-Fidelity Knowledge Plane for Data-Center Networks
  • Grant number:
    1054788
  • Fiscal year:
    2011
  • Funding amount:
    $225,000
  • Project type:
    Continuing Grant
TC: Small: Collaborative Research: Predictive Blacklisting for Detecting Phishing Attacks
  • Grant number:
    1017915
  • Fiscal year:
    2010
  • Funding amount:
    $225,000
  • Project type:
    Standard Grant

Similar overseas grants

Travel: NSF Student Travel Grant for 2023 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
  • Grant number:
    2311257
  • Fiscal year:
    2023
  • Funding amount:
    $225,000
  • Project type:
    Standard Grant
Travel: NSF Student Travel Grant for 2024 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
  • Grant number:
    2327889
  • Fiscal year:
    2023
  • Funding amount:
    $225,000
  • Project type:
    Standard Grant
Developing a new architectural framework for designing Digital Habit Formation Support System
  • Grant number:
    RGPIN-2021-04379
  • Fiscal year:
    2022
  • Funding amount:
    $225,000
  • Project type:
    Discovery Grants Program - Individual
Recruitment and Training Support for Diverse Populations in Mechanical and Architectural Manufacturing Technologies (RTS-MT)
  • Grant number:
    2201455
  • Fiscal year:
    2022
  • Funding amount:
    $225,000
  • Project type:
    Standard Grant
Developing a new architectural framework for designing Digital Habit Formation Support System
  • Grant number:
    RGPIN-2021-04379
  • Fiscal year:
    2021
  • Funding amount:
    $225,000
  • Project type:
    Discovery Grants Program - Individual
CAREER: Systems and Architectural Support for Accelerator-Level Parallelism
  • Grant number:
    2044963
  • Fiscal year:
    2021
  • Funding amount:
    $225,000
  • Project type:
    Continuing Grant
Mitigating Software Vulnerabilities with Architectural Support for Type-safety
  • Grant number:
    541942-2019
  • Fiscal year:
    2021
  • Funding amount:
    $225,000
  • Project type:
    Collaborative Research and Development Grants
Mitigating Software Vulnerabilities with Architectural Support for Type-safety
  • Grant number:
    541942-2019
  • Fiscal year:
    2020
  • Funding amount:
    $225,000
  • Project type:
    Collaborative Research and Development Grants
NSF Student Travel Grant for 2020 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
  • Grant number:
    1936025
  • Fiscal year:
    2020
  • Funding amount:
    $225,000
  • Project type:
    Standard Grant
Research on architectural requirements for support facilities for persons with disabilities as a community residence.
  • Grant number:
    19K04749
  • Fiscal year:
    2019
  • Funding amount:
    $225,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)