Analysis and Verification of Single Event Upset Mitigation - Semiwiki


The evolution of space-based applications continues to drive innovation across government and private entities. The new demands for advanced capabilities and feature sets have a direct impact on the underlying hardware, driving companies to migrate to smaller geometries to deliver the required performance, area, and power benefits.

Simultaneously, the application space is evolving, and mission parameters for these new applications are causing companies to evaluate non-traditional approaches. Commercial high-reliability processes (i.e., those developed for automotive designs) are being considered for aerospace because they both meet the survivability requirements of certain scenarios and reduce development timelines and cost.

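Unfortunately, the advantages of smaller geometries come at a cost. One drawback is that the underlying hardware is more susceptible to soft errors, commonly referred to as single event upsets (SEUs). The traditional approach of adding redundancy to, or triplicating, a significant share (if not all) of the functions within the chip quickly becomes cost-prohibitive.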
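
To make the triplication approach concrete, here is a minimal Python sketch of triple modular redundancy on a single register value. It models an SEU as one bit flip in one of three copies and recovers the value with a bitwise majority vote. The function names and the 8-bit value are illustrative assumptions and are not taken from the article or from any tool.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of a register value."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model an SEU as a single bit flip at position `bit`."""
    return value ^ (1 << bit)


# Hypothetical 8-bit register value stored three times.
stored = 0b1011_0010
copies = [stored, stored, stored]

# An upset hits exactly one of the three copies.
copies[1] = flip_bit(copies[1], 4)

# The voter outputs the value held by at least two copies per bit.
voted = majority_vote(*copies)
print(voted == stored)
```

The vote masks any upset confined to one copy, but it costs three storage elements plus voting logic for every protected bit, which is the overhead that makes blanket triplication so expensive.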

Fortunately, new flows and automation provide project teams insights into SEU mitigation and offer the ability to optimize the SEU mitigation architecture, also referred to as selective hardening.

Figure 1. Driving trends to selective radiation mitigation

First, let's review the challenges.

Selective Hardening Challenges

Feedback from the aerospace industry suggests that the traditional approach to SEU mitigation has many pitfalls and leaves two important questions unanswered.

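  1. For the design elements known to be mission critical, how effective is the implemented mitigation?
  2. How can potential failures caused by faults in unprotected design elements be identified?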

The traditional approach to SEU mitigation is best summarized in a three-step workflow.

  • Step 1: Identify failure points through expert-driven analysis
  • Step 2: Design engineers insert the mitigation (HW and/or SW)
  • Step 3: Verify the effectiveness of the mitigation
    • Simulation that uses functional regressions and force commands to inject SEUs
    • Post-silicon functional testing under heavy ion exposure

Figure 2: The traditional approach to SEU mitigation

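Unfortunately, the traditional approach has multiple drawbacks, including:

  • There is no common measurement (metric) for determining the effectiveness of SEU mitigation.
  • Expert-driven analysis is neither repeatable nor scalable as complexity rises.
  • Manually forcing faults in functional simulation requires substantial engineering effort.
  • The complete fault state space cannot be analyzed using functional simulation and force statements.
  • Failures are identified late, during testing in a beam environment, with limited debug visibility when they occur.

Automation and Workflows Supporting Selective Hardening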

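The overarching objective of selective hardening is to protect the design functions that are critical to the mission and to save cost (power and area) by leaving non-critical functions unprotected. Boiled down, the methodology has three aims:

  1. Provide confidence early in the design cycle that the mitigation is optimal.
  2. Provide empirical evidence that what is left unprotected cannot result in abnormal behavior.
  3. Deliver a quantitative assessment detailing the effectiveness of the implemented mitigation.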

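Siemens has developed a methodology and integrated workflow that provides a systematic way to measure the effectiveness of existing mitigation and to determine the criticality of unprotected logic. The workflow is broken into four phases.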

Figure 3. The Siemens SEU mitigation workflow

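Structural Partitioning: The first step in the flow uses structural analysis engines to evaluate design functions together with the hardware mitigation implemented to protect them. The output of structural partitioning is a report indicating the effectiveness of the existing hardware mitigation, along with insight into the gaps that remain.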

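Fault Injection Analysis: Mitigation that cannot be verified structurally is a candidate for fault injection. In this phase, SEUs are injected and propagated, and their impact is evaluated. The output of fault injection analysis is a fault classification report listing which faults were detected by hardware or software mitigation and which were not detected.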
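
The inject, propagate, and classify loop described above can be sketched on a toy model. The Python example below is only an illustration under stated assumptions: the design (an 8-bit payload protected by a parity bit plus an unprotected mode flag), the fault sites, and the category names `detected`, `undetected`, and `no_effect` are all hypothetical and do not represent the Siemens tooling or its report format.

```python
from collections import Counter
from dataclasses import dataclass, replace

DATA_BITS = 8


def parity(value: int) -> int:
    """Return 1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


@dataclass(frozen=True)
class State:
    data: int  # 8-bit payload, protected by `par`
    par: int   # parity bit stored alongside `data`
    mode: int  # 1-bit mode flag, deliberately left unprotected


def golden_state() -> State:
    data = 0b1100_0101
    return State(data=data, par=parity(data), mode=1)


def outputs(s: State):
    """Return (result, error_flag) for the toy design."""
    error = parity(s.data) != s.par      # hardware mitigation: parity checker
    result = s.data if s.mode else 0     # functional output
    return result, error


def fault_sites():
    """Enumerate every state bit as a single-bit-flip fault site."""
    sites = [("data", b) for b in range(DATA_BITS)]
    sites += [("par", 0), ("mode", 0)]
    return sites


def inject(s: State, site) -> State:
    """Return a copy of the state with one bit flipped at `site`."""
    name, bit = site
    return replace(s, **{name: getattr(s, name) ^ (1 << bit)})


def classify(site) -> str:
    golden_result, _ = outputs(golden_state())
    result, error = outputs(inject(golden_state(), site))
    if error:
        return "detected"
    if result != golden_result:
        return "undetected"
    return "no_effect"


report = {site: classify(site) for site in fault_sites()}
summary = Counter(report.values())
print(summary)
print([site for site, outcome in report.items() if outcome == "undetected"])
```

In this toy case every flip in the payload or the parity bit raises the error flag, while the flip in the unprotected mode flag silently corrupts the output. That kind of site is what a classification report would surface as an undetected fault.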

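Propagation Analysis: The SEU sites left unprotected are evaluated structurally, under the expected workload stimulus, to determine the criticality of each site and its probability of causing a functional failure. The output of propagation analysis is a list of currently unprotected faults that were identified as affecting functional behavior.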
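
As a rough illustration of the structural part of this phase, the Python sketch below walks a fan-out graph to find which unprotected sites can reach a mission-critical output at all. The graph, node names, and the `critical_sites` helper are hypothetical. A real analysis would also account for workload-dependent masking and assign a failure probability to each site, which this sketch does not attempt.

```python
from collections import deque

# Hypothetical fan-out graph: each node maps to the nodes it drives.
FANOUT = {
    "mode_flag": ["mux_sel"],
    "mux_sel": ["data_out"],
    "dbg_counter": ["dbg_port"],
    "scratch_reg": [],
    "data_out": [],
    "dbg_port": [],
}

# Assumed set of outputs whose corruption counts as a functional failure.
CRITICAL_OUTPUTS = {"data_out"}


def reachable(start, fanout):
    """Breadth-first search over the fan-out cone of `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in fanout.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def critical_sites(sites, fanout, critical_outputs):
    """Keep the sites whose fan-out cone touches a critical output."""
    return [s for s in sites if reachable(s, fanout) & critical_outputs]


unprotected = ["mode_flag", "dbg_counter", "scratch_reg"]
flagged = critical_sites(unprotected, FANOUT, CRITICAL_OUTPUTS)
print(flagged)
```

Sites that cannot reach a critical output structurally, such as the debug counter here, are natural candidates to leave unprotected, while the flagged sites deserve closer evaluation.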

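Metrics Computation: Data from the structural, injection, and propagation analyses feeds the metrics computation engine and the visualization cockpit. The cockpit provides visual insight into the failure rate, the effectiveness of the mitigation, and any gaps that exist.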
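
One way such per-site results could be rolled up into headline numbers is sketched below in Python. The article does not define the metrics, so everything here is an assumption made for illustration: the outcome labels, the per-site raw upset rates in FIT, and the effectiveness formula (mitigated rate divided by mitigated plus residual rate). None of it should be read as the cockpit's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteResult:
    name: str
    raw_fit: float  # assumed raw upset rate for the site, in FIT
    outcome: str    # "protected", "detected", "undetected", or "benign"


def summarize(results):
    """Aggregate per-site outcomes into illustrative summary metrics."""
    total = sum(r.raw_fit for r in results)
    mitigated = sum(
        r.raw_fit for r in results if r.outcome in ("protected", "detected")
    )
    residual = sum(r.raw_fit for r in results if r.outcome == "undetected")
    dangerous = mitigated + residual
    gaps = sorted(
        (r for r in results if r.outcome == "undetected"),
        key=lambda r: r.raw_fit,
        reverse=True,
    )
    return {
        "total_fit": total,
        "residual_fit": residual,
        # Share of the non-benign upset rate covered by mitigation.
        "mitigation_effectiveness": mitigated / dangerous if dangerous else 1.0,
        "gaps": [r.name for r in gaps],
    }


# Hypothetical inputs merged from the three analysis phases.
results = [
    SiteResult("tmr_regfile", 4.0, "protected"),
    SiteResult("parity_fifo", 2.0, "detected"),
    SiteResult("mode_flag", 1.0, "undetected"),
    SiteResult("cfg_shadow", 0.5, "undetected"),
    SiteResult("dbg_counter", 1.0, "benign"),
]

metrics = summarize(results)
print(metrics)
```

Sorting the gaps by their contribution to the residual rate gives a simple priority list for where additional hardening would buy the most.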

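Every semiconductor development program has unique characteristics. The methodology described above is flexible and highly configurable, allowing project teams to adjust it as needed.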

Conclusion

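Mitigating single event upsets continues to challenge even the most veteran project teams, and the challenge grows as design complexity rises and technology nodes shrink. New methodologies can provide quantitative results detailing the effectiveness of SEU mitigation.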

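For a more detailed look at the Siemens SEU methodology and the challenges it will help you overcome, please refer to the white paper, Selective radiation mitigation for integrated circuits, which is also available at Verification Academy: Selective Radiation Mitigation.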

Jacob Wiltgen is the Functional Safety Solutions Manager for Siemens EDA. Jacob is responsible for defining and aligning functional safety technologies across the portfolio of IC Verification Solutions. He holds a Bachelor of Science degree in Electrical and Computer Engineering from the University of Colorado Boulder. Prior to Mentor, Jacob has held various design, verification, and leadership roles performing IC and SoC development at Xilinx, Micron, and Broadcom.
