
Robust and Scalable On-Line NDP Designs and Applications to Semiconductor Process Optimization

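Basic Information

Award number:
0002098
Principal investigator:
Jennie Si
Amount:
$300,000
Host institution:
Arizona State University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2000
Funding country:
United States
Period:
2000-10-01 to 2004-09-30
Status:
Completed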

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0002098&HistoricalAwards=false
Keywords:
Robust Scalable Line NDP Designs

Project Abstract

Despite all the technical advances today in the fields of micro-processors and control system design, one key component is still missing, the design of a generic learning system (which will be referred to as a 'learner' hereafter in the proposal). The learner will have a final product in the form of either software or hardware that learns to improve its performance through interactions with the environment. In addition to the lack of an explicit model for the environment, explicit performance feedbacks are delayed, i.e., they are only available at the end of a long sequence of actions and consequences. A problem of this nature is beyond the scope of classical adaptive control theory.

In recent decades, new schools of thinking represented by Reinforcement Learning (RL) based on Neural Dynamic Programming (NDP) have surfaced. In this approach, the learner observes an input state (which can be the current state or a predicted future state) and then produces an 'action' or 'control' signal to apply back to the environment. Consequently, an 'evaluation' signal is created by a critic network to comment on the effectiveness of the action taken. The goal of learning is to generate optimal actions leading to a maximal reward. Layered neural networks are the key implementation blocks for the learner. Neural networks are used to provide both the action signal and the evaluation signal.

Learners have demonstrated their effectiveness in a number of difficult tasks. However, learners are usually neural networks performing predictive tasks such as generating action values or action evaluation values. When little is known about the environment or the task, the learner must acquire a system level knowledge to first produce the action and then the evaluation. This requires the components inside the learner to work together. Furthermore, how can one implement a human-machine interface for different applications without 'cheating' by letting the learner truly learn on its own and 'on-the-fly'?

This project will address these basic issues, using problems from semiconductor manufacturing as a testbed. It will seek reliable system designs in the form of mathematical learning algorithms. It will try to achieve more stability and quicker outcomes from the learner, namely higher success rates with fewer learning trials. Attention will be paid to the configuration, algorithm parameterization, system input-output and performance measure specification, and all other issues relevant to the learner design. The learner will develop input releases and queuing policies for an industrial scale semiconductor manufacturing facility. The purpose of this exercise is to examine the scalability, reliability, and generality of the learner design.

A successful implementation of this research would represent a significant step toward a truly human-like system that learns on its own and improves its performance over time.
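
The action and evaluation loop described in the abstract can be illustrated with a minimal Python sketch. It is not the project's design: the abstract specifies layered neural networks and a semiconductor testbed, whereas this sketch assumes a toy five-cell corridor, replaces both networks with lookup tables, and uses a plain temporal-difference actor-critic update. The names `train` and `softmax`, the learning rates, and the task itself are hypothetical choices made for illustration only.

```python
import math
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right


def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=500, gamma=0.9, alpha_actor=0.5, alpha_critic=0.2,
          max_steps=50, seed=0):
    """Tabular actor-critic; tables stand in for the action and critic networks."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor: action preferences per state
    value = [0.0] * N_STATES                       # critic: estimated return per state

    for _episode in range(episodes):
        s = 0
        for _step in range(max_steps):
            probs = softmax(theta[s])
            a = 0 if rng.random() < probs[0] else 1
            s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            done = s_next == N_STATES - 1
            reward = 1.0 if done else 0.0  # feedback arrives only at the goal

            # Critic's evaluation signal: the temporal-difference error.
            target = reward + (0.0 if done else gamma * value[s_next])
            delta = target - value[s]
            value[s] += alpha_critic * delta

            # Actor update: raise the preference of actions the critic rates well.
            for b in range(2):
                grad = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += alpha_actor * delta * grad

            if done:
                break
            s = s_next
    return theta, value


if __name__ == "__main__":
    theta, value = train()
    for s in range(N_STATES - 1):
        # state, probability of stepping right, critic's value estimate
        print(s, round(softmax(theta[s])[1], 3), round(value[s], 3))
```

Because the reward appears only when the goal cell is reached, the critic's estimates have to propagate backward from the goal before the actor in the starting cell receives a useful signal, which is the delayed-feedback difficulty the abstract refers to.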

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other Publications by Jennie Si

Evidence of a mechanism of neural adaptation in the closed loop control of directions

DOI:
Publication date:
2010
Journal:
International Journal of Intelligent Computing and Cybernetics
Impact factor:
4.3
Authors:
B. Olson; Jennie Si
Corresponding author:
Jennie Si

Approximate dynamic programming based supplementary reactive power control for DFIG wind farm to enhance power system stability

DOI:
10.1016/j.neucom.2015.03.089
Publication date:
2015-12
Journal:
Neurocomputing
Impact factor:
6
Authors:
Guo Wentao; Feng Liu; Jennie Si; Dawei He; Ronald Harley; Shengwei Mei
Corresponding author:
Shengwei Mei

Development of a Start-Stop Signal for a Directional BMI.

DOI:
Publication date:
2006
Journal:
The First IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics, 2006. BioRob 2006.
Impact factor:
0
Authors:
B. Olson; Jing Hu; Jennie Si; Jiping He
Corresponding author:
Jiping He

Scaling Up to the Real

DOI:
Publication date:
2003
Journal:
Impact factor:
0
Authors:
Jennie Si; Andrew G. Barto; Warren Powell; Don Wunsch; Silvia Ferrari; Robert F. Stengel
Corresponding author:
Robert F. Stengel