Resilient Design and Operation of Cyber Physical Systems with Emphasis on Unmanned Autonomous Systems

2018 
Autonomy and autonomous systems occupy center stage in the research community, as autonomous vehicles proliferate and their utility in all aspects of the military and civilian domains is increasing exponentially from one year to the next. The development and application of resiliency and safety technologies for autonomous systems is, unfortunately, not keeping pace with this growth. Several factors impede the deployment and adoption of autonomous systems. Among them are the absence of an adequately high level of autonomy that can be relied upon; significant challenges in the human-machine interface, which still requires substantial human intervention to operate the system and interpret sensor data; the need for emerging machine learning technologies; and, most importantly, the resilient design and operation of complex systems to assure their safety, reliability and availability when executing missions in unstructured and cluttered environments. Recent advances in the resiliency and safety of complex engineered systems have focused either on methods and tools that trade off system performance for increased time to failure, with the aim of completing the mission, or on trial-and-error methods that arrive at a suboptimal policy for system self-organization in the presence of a failure mode. This paper introduces a novel framework for the resilient design and operation of such complex systems via self-organization and control reconfiguration strategies that avoid empirical trial-and-error techniques and may be implemented and run in real time on-platform. The main theme is summarized as: “a healthy and resilient system is a safe system”. To accomplish this objective, we introduce an integrated and rigorous approach to resilient design, while safety considerations ensure that the targeted system remains within a safe envelope.
A resilient system robustly and flexibly monitors its internal and external environment; it can detect and anticipate disturbances that may affect its operational integrity and take appropriate action to compensate for them. Resilience enhances safety while mitigating risk factors, and assures that vehicles subjected to extreme disturbances remain within their safe envelope. The enabling technologies begin with graph spectral and epidemic spreading modeling tools that represent system behavior under normal and faulty conditions; a Markov Decision Process is the basic self-organization module. We introduce a novel approach to fault tolerance that treats the impact of severe fault modes on system performance as input to a Reinforcement Learning (RL) strategy, which trades off system performance against control activity in order to extend the Remaining Useful Life (RUL) of the unmanned system. Performance metrics are defined to support the algorithmic developments and their validation. We pursue an integrated and verifiable methodology for safety assurance that enables evaluation of the effectiveness of risk management strategies. Several unmanned autonomous systems are used for demonstration purposes.
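To make the performance-versus-life trade-off concrete, the following is a minimal sketch of a toy Markov Decision Process, not the framework described in this paper. It assumes a hypothetical chain of discrete health levels and two hypothetical control modes ("nominal" and "derated"), with made-up rewards and degradation probabilities. For simplicity it solves the known model with plain value iteration rather than a Reinforcement Learning method.

```python
# Toy illustration only: all states, rewards and probabilities are hypothetical.
HEALTH_LEVELS = 5  # state 0 = failed (absorbing, zero reward), 4 = fully healthy
ACTIONS = {
    # name: (performance reward per step, probability of losing one health level)
    "nominal": (1.0, 0.30),
    "derated": (0.6, 0.05),
}
GAMMA = 0.95  # discount factor


def q_value(values, state, action):
    """Expected discounted return of taking `action` in `state`."""
    reward, p_degrade = ACTIONS[action]
    expected_next = p_degrade * values[state - 1] + (1 - p_degrade) * values[state]
    return reward + GAMMA * expected_next


def value_iteration(tol=1e-9, max_iter=10_000):
    """Return (state values, greedy policy) for the toy degradation MDP."""
    values = [0.0] * HEALTH_LEVELS
    for _ in range(max_iter):
        new_values = [0.0] * HEALTH_LEVELS  # failed state stays at 0
        for s in range(1, HEALTH_LEVELS):
            new_values[s] = max(q_value(values, s, a) for a in ACTIONS)
        delta = max(abs(a - b) for a, b in zip(values, new_values))
        values = new_values
        if delta < tol:
            break
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(values, s, a))
        for s in range(1, HEALTH_LEVELS)
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    for s in range(1, HEALTH_LEVELS):
        print(f"health={s}  value={values[s]:.2f}  action={policy[s]}")
```

With these made-up numbers, the computed policy runs in the nominal mode while the system is healthy and switches to the derated mode at low health levels, giving up per-step performance to postpone failure. The actual state representation, fault models and learning strategy of the paper are not specified in this abstract.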