A recent experience managing the tenth-anniversary edition of a technical conference (www.iccve2022.org) triggered some reflection on the last decade of artificial intelligence technology. The previous article, Reflections On a Decade Of AI (forbes.com), explored the historical background of AI technology and the current success of Machine Learning (ML) methods. It also contrasted conventional algorithms with ML algorithms: ML algorithms are non-deterministic, not analyzable, and require no underlying theory. They can amplify random correlations just as easily as they discover deep causal relationships.
In applications such as search (Google, Bing) or recommender systems (Amazon, Netflix), this property of ML systems is unfortunate but not a safety issue. However, in high-risk applications such as autonomous vehicles or airborne systems, safety considerations are paramount.
So, how can safety-critical systems leverage the power of AI/ML while maintaining safety? This has been one of the primary challenges for safety-critical systems over the last decade.
Interestingly, this was not the first time safety-critical industries such as aviation have faced the challenge of a fundamentally new implementation paradigm. System design has moved from a paradigm primarily focused on hardware, to one that had to comprehend the unique consequences of software, and now to one that must comprehend AI/ML.
The trust framework to verify, validate, and certify airborne systems is a series of laws, orders, and best-practice guidelines used to demonstrate conformance with airworthiness standards. At a high level, critical aspects of the current environment are:
- System Design Process: Process-oriented, structured development assurance for these complex systems, with safety certification as part of the integrated development process.
- Formalization: Formal definitions of system operating conditions, functionality, expected behaviors, risks, and hazards that must be mitigated.
- Lifecycle: Lifecycle management of components and development systems.
Basically: very carefully and formally define the system design, formalize the expected behavior as well as the likely issues, and make sure you understand the impact over the lifetime of the product. For hardware, this thought process is encoded in a practice document (SAE Aerospace Recommended Practice (ARP) 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment). The standard introduces the very important idea of mapping hazards through Hazard Analysis and Risk Assessment (HARA), thereby providing a basis for correlated analyses and mitigations.
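To make the idea concrete, here is a minimal, hypothetical sketch of how a HARA-style mapping from hazards to severity, probability, and mitigations might look in code. The class names, severity levels, and probability thresholds are illustrative assumptions, not values taken from ARP 4761 or any regulatory guidance.

```python
# A minimal, hypothetical sketch of a HARA-style hazard mapping.
# Severity levels and probability limits are illustrative only and are
# not taken from ARP 4761 or any regulatory guidance.
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CATASTROPHIC = 1
    HAZARDOUS = 2
    MAJOR = 3
    MINOR = 4
    NO_EFFECT = 5


# Illustrative rule: the more severe the hazard, the lower the
# acceptable probability of occurrence (per operating hour).
ACCEPTABLE_PROBABILITY = {
    Severity.CATASTROPHIC: 1e-9,
    Severity.HAZARDOUS: 1e-7,
    Severity.MAJOR: 1e-5,
    Severity.MINOR: 1e-3,
    Severity.NO_EFFECT: 1.0,
}


@dataclass
class Hazard:
    description: str
    severity: Severity
    probability_per_hour: float
    mitigations: list[str] = field(default_factory=list)

    def acceptable(self) -> bool:
        # The assessed rate must fall under the limit for this severity.
        return self.probability_per_hour <= ACCEPTABLE_PROBABILITY[self.severity]


# Usage: each identified hazard maps to an assessment and mitigations,
# giving the correlated analysis the standard asks for.
loss_of_thrust = Hazard(
    description="Uncommanded loss of engine thrust",
    severity=Severity.CATASTROPHIC,
    probability_per_hour=1e-10,
    mitigations=["redundant control channels", "crew procedures"],
)
assert loss_of_thrust.acceptable()
```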
Software significantly disrupted the situation.
With the software paradigm, the system design leverages the processor abstraction (computer architecture) to gain access to massive amounts of functionality from the software ecosystem. The processor abstraction lets software developers maintain and leverage their investment across a large number of implementation platforms, and this critical property enables both deep developer investment and the availability of massive levels of functionality.
The software abstraction has its own ecosystem of tools such as compilers, operating systems, loaders, device drivers, and board support packages. In addition, a combination of commercial and open-source ecosystems can provide enormous capability while allowing for the crowdsourcing of innovation from a wide variety of sources. From a product management point of view, software is, well, "soft": it can be updated in the field, so it has the potential to run at a different cadence than the rest of the "hard" design from a product release perspective.
How does one handle software in safety-critical systems?
In aerospace, the approach was to maintain the system design paradigm, but now system components could also be software. These software components kept the same overall structure of fault analysis, lifecycle management, and system design hazard analysis. However, the underlying details had to be extended, so an associated standard (DO-178C: Software Considerations in Airborne Systems and Equipment Certification) was developed. DO-178C updated the notion of a hazard from physical failure mechanisms to functional bugs, which was necessary because software does not suffer from physical, process-driven reliability degradation. The concepts of lifecycle management were also updated to reflect the conventional software development process, and, most importantly, the HARA framework continued to be used with software components. Typically, the dynamic potential of software was muted by "snapshotting" the software stack, with the associated tradeoff of missing ongoing innovation.
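As an illustration of what "snapshotting" might mean in practice, here is a hypothetical sketch: freeze the exact content of every component of the stack into a manifest at certification time, and reject any configuration that deviates from it. The file paths and manifest format are invented for illustration; this is not a mechanism prescribed by DO-178C.

```python
# Hypothetical sketch of "snapshotting" a software stack: record the
# exact content hash of every component at certification time, then
# refuse to accept any configuration that deviates from the manifest.
# Paths and the manifest format are invented for illustration.
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def freeze_baseline(components: list[Path], manifest: Path) -> None:
    """Record the exact hash of every component in the certified stack."""
    snapshot = {str(p): sha256_of(p) for p in components}
    manifest.write_text(json.dumps(snapshot, indent=2))


def verify_baseline(manifest: Path) -> bool:
    """Return True only if every component still matches the snapshot."""
    snapshot = json.loads(manifest.read_text())
    return all(
        Path(p).exists() and sha256_of(Path(p)) == digest
        for p, digest in snapshot.items()
    )
```

The tradeoff noted above falls directly out of this design: any upstream improvement to a component changes its hash, so adopting it means re-freezing, and re-verifying, the baseline.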
The fundamental process of verifying a particular software component is quite challenging. Over the years, a vast number of techniques, ranging from code coverage to formal inspections, were developed to verify these components. The result: over the last 30 years, a combination of system design formalization, rigorous engineering processes, and advanced software component verification has built a stable methodology for software-based safety-critical systems.
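For a flavor of the simplest of these techniques, here is a deliberately tiny, hypothetical example of coverage-driven testing: the test cases are chosen so that every branch of the component is exercised. The function and its thresholds are invented; real airborne software at the highest assurance levels is verified against far stronger structural-coverage criteria such as MC/DC.

```python
# A deliberately tiny illustration of coverage-driven testing. The
# function and thresholds are invented; real airborne software at the
# highest assurance levels requires far stronger structural-coverage
# criteria (e.g., MC/DC under DO-178C Level A).

def select_mode(airspeed_knots: float, on_ground: bool) -> str:
    """Pick a (hypothetical) control mode from two inputs."""
    if on_ground:
        return "GROUND"
    if airspeed_knots < 80.0:
        return "LOW_SPEED"
    return "NORMAL"


def test_select_mode_covers_every_branch() -> None:
    # One case per branch, so branch coverage of select_mode is 100%.
    assert select_mode(0.0, on_ground=True) == "GROUND"
    assert select_mode(50.0, on_ground=False) == "LOW_SPEED"
    assert select_mode(250.0, on_ground=False) == "NORMAL"


if __name__ == "__main__":
    test_select_mode_covers_every_branch()
    print("all branches exercised")
```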
What about AI/ML systems? They are the next disruptive element for a number of reasons, and the topic of discussion in our next article.
Note: An article in the Air Traffic Control Association's Journal of Air Traffic Control (atca.org), September 2022 issue, provides a more detailed understanding of safety-critical methodologies in the airborne systems space for those interested in the gory details.
Source: https://www.forbes.com/sites/rahulrazdan/2022/03/13/reflections-on-a-decade-of-ai-part-2/