CEAC aerospace

Artificial Intelligence in Your Cockpit

1. Artificial Intelligence in Your Cockpit

Artificial intelligence is being adopted widely in many areas, including aviation. Artificial intelligence in the cockpit, whether cooperating with a pilot or even making autonomous decisions, opens new opportunities for functions that increase safety, reduce crew workload and fatigue, and optimize flight trajectories for fuel efficiency in crowded airspace. It also opens completely new business opportunities.

Besides the benefits that AI will bring to aviation, there are also challenges to address. AI-based functions need to be evaluated not just for their benefits, but also from the following perspectives:

Safety perspective – “How can we know that the decisions of an AI-based function are always safe?”
Ethical perspective – “How do we know that the AI-based function is free of cultural, racial, gender and other biases?”
Transparency perspective – “Can we determine what the AI function based a particular decision on?”
Human perception perspective – “How will pilots accept AI functionality, and what will they expect from it?”
Certification perspective – “How can we certify complex software functionality that is based on machine learning?”

All these questions must be asked when we think about bringing AI-based functionality into the cockpit. Once we know the answers, the technology revolution in the cockpit may begin.

1.1. AFI-X Prototype Story

The Artificial Instructor story began in 2015 with a simple wish: “I wish I had someone experienced on board with me when I’m flying solo.” In other words, the idea came from the community of pilots with low flight time: either new pilots with a fresh license or pilots returning after a long break from flying.

The second step was also simple: let’s build an application that runs in the cockpit, monitors pilot performance in real time, and detects pilot mistakes that may develop into a real problem if they are not mitigated. Once a mistake is detected, the system recommends a corrective action. The system may observe the pilot’s improvement over time and shift its focus from major errors, which may lead to unsafe situations, towards smaller errors, which it comments on to improve the pilot’s skills. In other words, the function does what a flight instructor does. And how can such functionality be implemented? One of the many definitions of artificial intelligence gives us the answer: “Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems.” (Definition taken from techtarget.com.) It means that a simulation of a human instructor’s intelligence is artificial intelligence: an Artificial Instructor.

The idea was clear and the desired AI functionality was well defined. However, a framework for developing an AI-based, real-time application that assists a pilot during flight was missing at that time, so we had to build the framework from scratch.

Note that the situation today is better than our starting position in 2015. In 2020, EASA published the EASA AI Roadmap 1.0, and at the end of 2021 it published the First Usable Guidance for Level 1 Machine Learning Applications. Because we asked the right questions back in 2015, the framework we developed while working on the AFI-X prototype is well aligned with the EASA guidance and the EASA roadmap.

Most importantly, the framework we developed is generic: we can tailor it to the development of various AI-based applications, and we can evolve it from its current stage towards more autonomous AI/ML frameworks. The figure below is taken from the EASA AI Roadmap 1.0. It is clear that we are just at the beginning of the technology revolution in aerospace.

Figure 1: EASA published roadmap for AI applications. Taken from https://www.easa.europa.eu/downloads/109668/en, page 13.

The AFI-X journey was successful: the project moved from the initial idea through the definition of the AI algorithm concept to simulator testing, and finally the prototype was flown in a real airplane for many hours, assisting pilots. This journey helped us validate the framework and develop a good prototype of an AI-based Level 1 application.

Test at simulator…

… and real flight test.

The developed framework and processes can be re-applied and customized for the development and certification of various AI-based avionics applications, regardless of the selected algorithms and target functionality.

Are you interested in tips and tricks for AI development and testing? Read more in the next article, AI Development – Tips & Tricks.

Would you like to know more about the AI development framework? Read more in the article Framework for AI Development and Certification.

AI Development - Tips & Tricks

1. AI Based Application Development – Tips & Tricks

At the beginning of a new AI-oriented project, besides defining the required functionality, it is critical to ask the questions listed in the article Artificial Intelligence in Your Cockpit. Finding the right answers is not an easy task, and the right answers may add work to the program. The temptation to ignore at least some of the questions may therefore be high. However, if any question is missed or not addressed in the project plan, correcting the resulting problems later may be expensive or even impossible.

1.1. Safety First

Obviously, your AI function needs to work. But since you will use it in the cockpit, it must also be safe. So: “How can we know that the decisions of an AI-based function are always safe?”

The core task is to understand the hazards related to the AI function. Make sure you review the defined hazards with pilots and other users of your AI application so that crew impacts are assessed correctly.

Another catch is situations that were not covered in the training data. If a situation is far from the scenarios covered by the data, the AI may struggle to provide good answers. Map the operational space and compare it with your data coverage. Understand the corner cases well. Check the system response at the operational boundaries and in corner-case scenarios with robust testing. Try what happens if the inputs are completely outside the data coverage. For example, if you trained the AI to recognize animal species from a photo, use a photo of an orange to test the response. The right AI answer in this case is: “No animal recognized.”
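Comparing an input with the data coverage can be sketched in a few lines of Python. This is only an illustration and not the AFI-X implementation: the function names, the simple per-feature min/max envelope and the sample values are all assumptions. A bounding box also accepts empty regions inside the envelope, so a real coverage analysis would likely need a distance-based or density-based measure.

```python
from typing import Callable, List, Sequence, Tuple

Envelope = List[Tuple[float, float]]


def fit_envelope(training_rows: Sequence[Sequence[float]]) -> Envelope:
    """Record the minimum and maximum seen in training for each input feature."""
    columns = list(zip(*training_rows))
    return [(min(col), max(col)) for col in columns]


def out_of_coverage(sample: Sequence[float], envelope: Envelope,
                    margin: float = 0.0) -> List[int]:
    """Return the indices of features that fall outside the training envelope."""
    outside = []
    for i, (value, (lo, hi)) in enumerate(zip(sample, envelope)):
        span = hi - lo
        if value < lo - margin * span or value > hi + margin * span:
            outside.append(i)
    return outside


def guarded_predict(sample: Sequence[float], envelope: Envelope,
                    model: Callable[[Sequence[float]], str]) -> str:
    """Call the model only when the sample lies inside the covered space."""
    if out_of_coverage(sample, envelope):
        return "no decision: input outside training coverage"
    return model(sample)


# Hypothetical training data: (airspeed in knots, altitude in feet)
training = [(60.0, 0.0), (75.0, 500.0), (90.0, 1000.0), (110.0, 2500.0)]
envelope = fit_envelope(training)


def dummy_model(sample: Sequence[float]) -> str:
    """Stand-in for a trained model (hypothetical)."""
    return "in pattern"


print(guarded_predict((80.0, 800.0), envelope, dummy_model))
print(guarded_predict((150.0, 800.0), envelope, dummy_model))
```

The guard plays the role of the “No animal recognized” answer: instead of forcing a classification, the system reports that the input is outside what it was trained on.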

Prepare a prototype or mockup and analyze how users interact with the AI application. Focus on unexpected interactions (those not considered at the beginning) and check whether they introduce new hazards.

Keep in mind that “always safe” means “the probability of a failure with safety impact is low enough”. Depending on the selected AI algorithms, use proper statistical or analytical methods to determine the probability of an incorrect output. Note that this differs from regular safety analysis. Here we are not looking for the failure of some component (e.g., a CPU hardware failure) and how it contributes to the analyzed hazard. Instead, we analyze the performance of a complex AI algorithm and look at the probability of a “wrong decision taken by AI”.
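One possible statistical method is an upper confidence bound on the error probability estimated from test results. The sketch below uses the Wilson score bound, plus the exact bound for the case where no error was observed. The choice of method, the function names and the test counts are assumptions made for illustration. Both bounds assume independent test cases, which consecutive samples from one flight typically are not.

```python
import math
from statistics import NormalDist


def wilson_upper_bound(errors: int, trials: int, confidence: float = 0.95) -> float:
    """One-sided Wilson score upper bound on the true error probability."""
    if trials <= 0 or not 0 <= errors <= trials:
        raise ValueError("need trials > 0 and 0 <= errors <= trials")
    z = NormalDist().inv_cdf(confidence)  # one-sided normal quantile
    p_hat = errors / trials
    denom = 1 + z * z / trials
    center = p_hat + z * z / (2 * trials)
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials ** 2))
    return (center + half) / denom


def zero_error_upper_bound(trials: int, confidence: float = 0.95) -> float:
    """Exact upper bound when no error occurred in independent trials."""
    return 1 - (1 - confidence) ** (1 / trials)


# Hypothetical test campaign: 12 wrong outputs in 4000 independent test cases
bound = wilson_upper_bound(errors=12, trials=4000)
print(f"observed rate {12 / 4000:.4f}, 95% upper bound {bound:.4f}")

# Hypothetical campaign with no observed error in 1000 test cases
print(f"95% upper bound with zero errors: {zero_error_upper_bound(1000):.4f}")
```

The second function shows why “no errors seen in testing” is not the same as “error probability is zero”: the bound shrinks only in proportion to the number of independent test cases.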

Keep order in the data: do not lose track of which data were used for what (training, testing, …) and when (initial training phase, informal validation, re-training to improve performance, …). Data management has to be established from the early phase of the program.
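A minimal sketch of such bookkeeping is shown below. It is an assumption about how tracking could be organized, not a description of the tooling used for AFI-X: the class names, the purpose and phase labels, the dates and the placeholder data are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Set


@dataclass(frozen=True)
class DatasetUse:
    content_hash: str  # fingerprint of the exact data that was used
    purpose: str       # e.g. "training" or "testing"
    phase: str         # e.g. "initial training" or "re-training"
    used_on: date


@dataclass
class DataLedger:
    entries: List[DatasetUse] = field(default_factory=list)

    def record(self, data: bytes, purpose: str, phase: str, used_on: date) -> str:
        """Log one use of a data set and return its fingerprint."""
        digest = hashlib.sha256(data).hexdigest()
        self.entries.append(DatasetUse(digest, purpose, phase, used_on))
        return digest

    def purposes_of(self, data: bytes) -> List[str]:
        """List every purpose this exact data set has been used for."""
        digest = hashlib.sha256(data).hexdigest()
        return [e.purpose for e in self.entries if e.content_hash == digest]

    def used_for_training_and_testing(self) -> List[str]:
        """Fingerprints of data sets used for both training and testing."""
        by_hash: Dict[str, Set[str]] = {}
        for e in self.entries:
            by_hash.setdefault(e.content_hash, set()).add(e.purpose)
        return [h for h, p in by_hash.items() if {"training", "testing"} <= p]


ledger = DataLedger()
flight_a = b"flight-001 recorded samples"  # placeholder for real file contents
flight_b = b"flight-002 recorded samples"
ledger.record(flight_a, "training", "initial training", date(2024, 1, 10))
ledger.record(flight_b, "testing", "informal validation", date(2024, 2, 5))
ledger.record(flight_a, "testing", "re-training", date(2024, 3, 20))
print(ledger.used_for_training_and_testing())
```

Fingerprinting the content rather than the file name means a renamed or copied file is still recognized as the same data, which is what matters when checking that test data did not leak into training.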

And of course, the whole regular safety assessment process for the system that hosts the AI application still applies.

1.2. Human Centric Design

Your AI function is supposed to assist a pilot during flight, and thus your design has to be oriented towards the pilot’s needs. At least the following areas should be considered in AI design:

The user interface design. This may not differ much from designing other avionics functions. However, if the provided functionality is novel to the industry, there may not yet be well-established ways of interacting with the user. The generic rules still apply: the UI should be intuitive, easy to use, not distracting and, of course, satisfying.
Another aspect, more specific to AI, is human perception of the AI. Typically, the first reaction to an AI function is “wow, that’s cool” (if it works). Then the user gets used to the function and the ‘wow effect’ is over. The initial enchantment gives way to sensitivity to errors. For example, if you have a new voice assistant, you may initially be excited about its great functionality. Later, you get used to the fact that it works well most of the time and start to pay attention to its errors instead. If the error rate is higher than ~5%, the function starts to be perceived as unreliable. If an advertisement recommendation system makes more than 5% errors, you will probably be surprised by a displayed advertisement from time to time, but that’s it. However, if your cockpit assistant makes weird recommendations 5% of the time or more, you stop trusting it, and the remaining 95% of good advice is compromised. This means that deploying an AI-based application into the field too early (even if there are no safety impacts) may compromise the function, even if the overall failure rate is relatively low and acceptable from a safety point of view.
Your function may acquire cultural, racial, gender or other biases. There are various sources of bias. For example, the Artificial Instructor’s responses to detected piloting problems are based on recommendations from skilled flight instructors, so the system is influenced by them. In other words, when designing an expert system, the selection of the human experts who help set the system’s behavior and responses may lead to bias. Another source of bias may be your training data. For example, if your AI application processes images of pilots’ faces, the selection of training images may cause bias. If your training set contains only images of white bald men with glasses, the AI application may respond incorrectly to someone who does not fit the training set well. The general advice for minimizing these biases is high variance in the inputs (for example, experts from various countries and cultures equally represented when the correct responses of the expert system are defined).
Unexpected interactions with AI. People may react to the AI function in ways the designers did not consider. It is important to prepare a representative mockup or prototype for early evaluation to discover these interactions, assess them and, if needed, adjust the user interface or functionality.

How can the need for quick action be indicated visually?

Maybe this way?

1.3. The More Data – The Better AI?

Many people think that the more data is used for machine learning, the better the AI will be. In general, yes and no. To train a good AI, we need the right amount of data. In other words, we need good coverage of the space in which the AI will operate. If you have a huge amount of data, most of which covers just a limited subset of the operational space, your AI may be over-trained for some scenarios and may fail to respond correctly in others. In such a case, the problem may not be solved just by adding more data. You need to add data that covers the gaps in the operational space coverage, and maybe reduce the previous data set so that the scenarios in your training set are correctly balanced. A good data management process supported by iterative analysis of the data sets may help reduce the risks related to improper usage or balance of data. The right data management may also help you avoid ‘over-collecting’ data. If your project depends on flight data, data collection may be an expensive process, and collecting unnecessary data is primarily a waste of money and time rather than a real contribution to the quality of the AI function.
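Checking and equalizing the balance of scenarios in a training set can be sketched as follows. The scenario labels, the sample counts and the cap of 100 samples per scenario are made up for illustration, and random downsampling is only one of several possible balancing strategies.

```python
import random
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[Tuple[float, ...], Hashable]  # (features, scenario label)


def scenario_counts(samples: Sequence[Sample]) -> Counter:
    """Count how many samples each scenario contributes."""
    return Counter(label for _, label in samples)


def imbalance_ratio(samples: Sequence[Sample]) -> float:
    """Ratio between the largest and the smallest scenario."""
    counts = scenario_counts(samples)
    return max(counts.values()) / min(counts.values())


def downsample_to_cap(samples: Sequence[Sample], cap: int, seed: int = 0) -> List[Sample]:
    """Randomly reduce over-represented scenarios to at most `cap` samples."""
    rng = random.Random(seed)
    by_label: Dict[Hashable, List[Sample]] = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    balanced: List[Sample] = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, cap) if len(group) > cap else group)
    return balanced


# Hypothetical data set dominated by one scenario
samples: List[Sample] = (
    [((float(i),), "cruise") for i in range(900)]
    + [((float(i),), "landing") for i in range(60)]
    + [((float(i),), "take off") for i in range(40)]
)
balanced = downsample_to_cap(samples, cap=100)
print(scenario_counts(samples), imbalance_ratio(samples))
print(scenario_counts(balanced), imbalance_ratio(balanced))
```

Downsampling reduces the imbalance but does not close coverage gaps: the rare scenarios still have only the samples that were collected, so the counts also indicate where further data collection is worth the cost.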

Another way to reduce the cost of data collection in the aerospace industry is to use simulation. This path may help speed up data collection and keep the cost reasonable, but the data must be checked for applicability. For example, during the AFI-1 program we discovered that simulator data did a great job for the flight phase detection AI function, while for some other modules (e.g., monitoring of approach correctness) the simulator data were not representative enough. Further analysis showed that the reason lay in minor differences between how pilots fly in a simulator and how they fly in a real airplane. The possible solutions were to build a high-fidelity simulator or to collect enough real flight data for approaches.

Another way to optimize data for AI algorithms concerns the number of inputs into the algorithm. Sometimes designers think that feeding many inputs into the training is a good idea: “Machine learning or the neural network will sort this out.” In fact, selecting the right inputs for your application may help reduce computational demand and also makes it easier to analyze why the algorithm took a particular decision. The goal should be to minimize the inputs to AI algorithms without removing anything critical. Watch the video to see how a small change in the input set may impact results in a critical phase of the flight. The video (5x faster than real flight time) demonstrates the ability of two neural networks to detect the pattern phase that the airplane is currently flying (shown together with the manual pattern phase tag created by an operator).

Both networks were trained on the same data set and both have the same architecture. The only difference is that one network uses information about engine RPM while the other does not. You can see that both networks behave very similarly most of the time. The only noticeable difference is at the end of the flight. The network without RPM changes state from landing to take off (for about 1-2 seconds of real flight time) and then back to landing (landing is the correct state), while the network with RPM continues to detect landing. The 1-2 second error in pattern phase detection looks like a negligible performance problem (the whole flight lasts 371 seconds), but its impact is significant. Why? The erroneous detection of take off in the landing phase (even for 1-2 seconds) may generate Artificial Instructor output that tries to correct errors typical of the take off phase, and such a recommendation is misleading in the landing phase.
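Brief phase flips like the one described above are easy to miss in aggregate metrics, but they can be located in recorded outputs with a simple offline check. The sketch below is a hypothetical diagnostic, not part of the AFI prototype: the phase names, the 1-second sample period and the 3-second threshold are assumptions.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def short_segments(phases: Sequence[str], sample_period_s: float,
                   min_duration_s: float) -> List[Tuple[int, str, float]]:
    """Return (start_index, phase, duration_s) for suspiciously short runs.

    A run is a stretch of consecutive identical phase labels. The first and
    last runs are skipped because the recording may have cut them short.
    """
    runs = []
    index = 0
    for phase, group in groupby(phases):
        length = len(list(group))
        runs.append((index, phase, length * sample_period_s))
        index += length
    return [run for run in runs[1:-1] if run[2] < min_duration_s]


# Hypothetical network output sampled once per second at the end of a flight
predicted = ["final"] * 10 + ["landing"] * 5 + ["take off"] * 2 + ["landing"] * 8
suspicious = short_segments(predicted, sample_period_s=1.0, min_duration_s=3.0)
print(suspicious)
```

Each flagged segment can then be reviewed against the operator's manual phase tag to decide whether it is a genuine transition or a misclassification.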

1.4. Performance Evaluation – Statistical vs. Subjective

Once you are done with AI training and implementation, it is time to verify the AI functionality. This verification is done at multiple levels and by various means. The very first step is validation of the results of the learning process. The proper validation method is selected based on the chosen AI algorithm. Let’s take a look at the two neural networks for pattern phase detection that were compared in the previous part (you remember the video, right?).

As discussed previously, both networks have the same architecture and the same inputs, except that the second network also uses information about engine RPM. And as we have seen in the video, both networks behave almost identically, except in the landing phase. During landing, the network without information about the actual engine setting may confuse landing with take off. Now let’s see whether we could have recognized, in an early phase of the project, that the first network is prone to this problem.

The figures below show the confusion matrices for the two networks (captured after the initial training phase). Class 1 corresponds to take off and Class 8 corresponds to landing. If you check the overall performance of both networks after the initial training, you will find that they deliver outputs of similar quality: the network without the engine RPM input reached 96.5% correct classifications, while the network with the engine RPM input reached 97.2%. This difference seems minor, and the additional cost of making this input available to the AI application may not seem worth the investment. However, if we check the confusion matrix for the first network carefully, we can see that Class 1 and Class 8 have approximately 10% erroneous detections. After adding the engine RPM information, the error ratio drops to approximately 5%. And if you check the confusion matrix really carefully, you will see that the scenario in which class 1 was classified as class 8, or vice versa, is almost completely removed, while the confusion between class 1 and 9 remains almost the same.

And a bonus question: what can you say about the initial data set by looking at the confusion matrix? It can be seen that the number of data points differs significantly between pattern phases, and better balancing of the data set may improve performance as well.

Confusion matrix for the neural network without the engine RPM input.

Confusion matrix for the neural network with the engine RPM input.
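The way a high overall score can hide per-class problems can be reproduced with a small confusion matrix. The numbers below are made up and are not the AFI results. The sketch also assumes that rows hold the true class and columns the predicted class, which the figures above do not state.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def overall_accuracy(cm: Matrix) -> float:
    """Share of all samples that lie on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_recall(cm: Matrix) -> List[float]:
    """For each true class, the share of its samples classified correctly."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def class_share(cm: Matrix) -> List[float]:
    """Share of the data set that belongs to each true class."""
    total = sum(sum(row) for row in cm)
    return [sum(row) / total for row in cm]


# Made-up 3-class example: take off, cruise, landing
cm = [
    [90, 0, 10],   # true take off: 10 samples confused with landing
    [5, 990, 5],   # true cruise
    [10, 0, 90],   # true landing: 10 samples confused with take off
]
print(f"overall accuracy: {overall_accuracy(cm):.3f}")
print("per-class recall:", [round(r, 3) for r in per_class_recall(cm)])
print("class share:", [round(s, 3) for s in class_share(cm)])
```

In this example the dominant cruise class pulls the overall accuracy up to 97.5%, while the two short, safety-relevant phases each have a 10% error rate. The class shares answer the bonus question in the same way: they show directly how unbalanced the data set is.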

Once the model seems to be performing well, the next step is implementation of the model and its verification. Let’s skip the software-level verification, which ensures the correctness of the software implementation of the model, and take a look at system-level testing of the final system. At the system level, statistical evaluation and data analysis may help to understand the algorithm performance and the probability of an incorrect decision, and may also support explainability at the algorithm level. Statistical evaluation and analysis of the delivered outputs help verify design objectives and safety measures. However, they do not address how a pilot or end user perceives interaction with the AI during real-time operational scenarios. For this reason, it is important to perform a set of tests focused on human interaction with the AI and to understand whether the designed algorithms and their user interface help build the user’s trust in the AI outputs.

System-level testing of AI-based functionality (both kinds described above) may be demanding in terms of flight hours. Whether simulation may help reduce the cost and/or time of testing is a valid question. The answer is: it depends. When we tested the AFI-1 prototype, we considered using the simulator as well as real flight tests. During the tests, test pilots subjectively evaluated the AFI-1 recommendations, both in scenarios where a pilot error was intentionally inserted and in flights where no error was intentionally made. After the initial set of tests, we statistically compared the subjective evaluations from the simulator and from the real airplane. We found that AFI-1 corrections presented to the test pilots in situations where no fault was intentionally inserted were evaluated differently. In the simulator (a no-motion simulator), pilots typically accepted the corrections, while in the real airplane they more frequently considered them ‘too strict’ or ‘incorrect’. This difference was statistically significant. The table below shows the pilots’ subjective evaluation of AFI-1 outputs after the initial tests.

AI Output Evaluation	Correct	Sooner	Too Soon	Later	Too Late	Too Strict	Incorrect	Missing
Simulator Data (count)	584	0	5	0	4	28	70	0
Simulator Data (share)	84.52%	0.00%	0.72%	0.00%	0.58%	4.05%	10.13%	0.00%
Airplane Data (count)	184	1	0	2	1	35	43	3
Airplane Data (share)	68.40%	0.37%	0.00%	0.74%	0.37%	13.01%	15.99%	1.12%

Comparison of pilots’ subjective evaluation of AI outputs at simulator and in real airplane.
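The significance of the difference can be checked from the counts in the table. The article does not say which test was used or how the categories were grouped, so the sketch below makes its own assumptions: it groups the evaluations into “Correct” versus everything else and applies a chi-square test on the resulting 2x2 table, without continuity correction. The row totals (691 and 269) are the sums of the counts in each row.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the table: "Correct" versus all other evaluations
sim_correct, sim_total = 584, 691
air_correct, air_total = 184, 269

chi2, p = chi_square_2x2(sim_correct, sim_total - sim_correct,
                         air_correct, air_total - air_correct)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

Under these assumptions the statistic is far above 3.84, the 5% critical value for one degree of freedom, which is consistent with the statement that the difference between simulator and airplane evaluations was statistically significant.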

The major contributor to this difference in judgment was that in the no-motion simulator pilots had a ‘worse feeling’ for the flight and were more willing to accept recommendations and corrections. A slight reduction of the AFI-1 ‘strictness’ level improved perception of the AI function in real flight tests as well. Why does a small reduction of ‘strictness’ improve subjective perception? Because after the reduction, test pilots start to see the developing error themselves, right after the AFI-1 prototype’s recommendation. One more note: the number of outputs classified as ‘Incorrect’ was primarily tied to the assessment of errors during the approach. The root cause of the relatively high number of ‘Incorrect’ evaluations in this phase during the initial round of testing was an incomplete assessment of airplane energy.

In general, we learnt two main things:

A pilot is willing to accept AI recommendations if they understand why and how the AI came to its conclusion. This does not mean that pilots need to be experts on AI algorithms. It means that the content of the AI output for a specific situation must be understandable to the human user.
Data collection and explainability at the engineering level are critical for corrections and changes in the AI logic, not just because of regulatory rules or to support safety analysis, but also during the development phase. Without algorithm explainability, the engineering team will not be able to tweak algorithms and/or training data effectively to improve performance and/or user perception of the AI functionality.

Framework for AI Development and Certification

1. Framework for AI Development and Certification

Developing an AI function for aerospace has some specifics, and the right framework for AI development may help accelerate deployment of certifiable AI functionality in your cockpit. In general, the following main steps need to be taken:

Define your functionality and operational space well.
Select the algorithm type that best fits the target functionality.
Establish data management and start collecting data. Ensure you have enough data even for corner cases.
Test your algorithm iteratively at various levels during development.
Do not omit the human factors aspect, including subjective evaluation of the AI functionality.

As the whole process depends on data availability, generating data in a simulator, collecting data in the field and auto-generating data are all interesting options from a cost-saving point of view. Investment in infrastructure (e.g., a cloud solution for in-field data collection) may pay back soon and may also support future growth of the application. Once data collection and the supporting infrastructure work well, there are options to automate future improvement of the function: the functionality may improve automatically, and once significantly improved performance is achieved, you may deploy a new version of the product (after formal verification of the change). The described development framework, which may be applied to various aerospace AI applications, is depicted in the figure below.

Development framework for AI based applications.

The application of the framework and iterative improvement of the AI functionality can be demonstrated with the following example. Consider a training organization that manages a fleet of airplanes and trains pilots. The very first AI application may be an on-ground instructor assistant: based on recorded flight data, the assistant prepares feedback for a student. The instructor reviews the feedback and stores it, with corrections, in the organization’s data cloud. This function saves instructor time by helping to prepare post-flight briefings with students. It also helps the organization collect training records and, last but not least, it collects data for future AI improvements. As a side benefit, the data owned by the organization may help improve training efficiency, save costs, etc.

In the example above, there are the following actors:

Instructors – they benefit from pre-prepared post-flight briefings.
Training organizations – they benefit from detailed training data and related data analysis and services.
Supplier of the solution – data from multiple training organizations (anonymized, secured, …) can be used for further growth of the functionality. The large number of records from various training organizations can support deployment of an improved assistant that assists a pilot on board during solo flights.
Pilot – benefits from more effective training at first, and after training has an on-board assistant that helps recognize piloting mistakes and correct them before they evolve into a real problem.

The example is illustrated in the figure below.

Are you interested in how your AI application use case can be solved? Or in how our framework may help you get your AI function to market quickly? Then contact us. We can help you get certified AI functionality into the cockpit!
