When artificial intelligence comes into play, the role of Quality Assurance changes to ensure constant improvement.
If your company has developed software before, you know that it's never as simple as writing some code and putting it into production. Regardless of who the software is intended for (customers, employees, third parties), it is essential to carry out adequate Quality Assurance (QA). Otherwise, you would be blind to the limitations of the software and could even deliver broken or completely unusable products.
Quality assurance and quality control are now core processes of any software development project. The design, build, test, and deployment stages need to be done correctly and in that order to achieve success. As such, QA engineers work throughout the software development lifecycle using agile methodologies, testing all progress in small, iterative increments and making sure the product always meets the appropriate objectives.
One would expect Artificial Intelligence projects to implement quality control like this. However, this is rarely the case. While the standard 4-stage iterative process is maintained for the most part, AI-driven systems can't simply be put into production and then left alone. Why? Due to the inherent nature of AI: it is constantly learning and constantly evolving, and therefore requires continuous management.
This means you don't do quality control for AI projects the same way you would for any other project. Here's why.
The role of quality control and testing in AI projects
By definition, AI needs to be continually tested. If you want to develop AI that actually works, you can't just throw some training data at an algorithm and call it a day. The role of QA and testing is to check the “usefulness” of the training data and whether or not the resulting model does the job we ask of it.
How is this done? Through simple validation techniques. Basically, QA engineers working with AI need to set aside a portion of the training data to use in the validation stage. Then they put the model through elaborate scenarios and measure how the algorithm performs, how the data behaves, and whether the AI is returning accurate and consistent predictions.
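To make that more concrete, here is a minimal sketch of what a holdout validation check could look like. The synthetic dataset, the scikit-learn classifier, and the 80/20 split ratio are illustrative assumptions for the example; the approach itself is what matters, not the specific library or model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder dataset standing in for the project's real training data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Hold back 20% of the training data purely for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train on the remaining 80% only.
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Check how the model behaves on data it has never seen before.
print(f"Validation accuracy: {accuracy_score(y_val, model.predict(X_val)):.3f}")
```

Holding out a stratified slice keeps the label distribution the same in both splits, which makes the comparison between training and validation performance fairer.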
If the QA team detects significant errors during the validation process, AI will return to development, just as you would with any other software development project. After some adjustments here and there, the AI goes back to quality control until it delivers the expected results.
But unlike with standard software, this is not the end of the QA team's work. Using different test data, QA engineers have to repeat all of this an arbitrary number of times, depending on how thorough you want to be and how much time and resources you have at your disposal. And all of this happens before the AI model is put into production.
This is what most people know as the “training phase” of AI, where the development team tests the algorithm multiple times for different things. QA, however, never focuses on the actual code or the AI algorithm itself – they need to assume that everything is implemented as it should be and focus on testing that the AI actually does what it's supposed to do.
This approach leaves two main things for QA engineers to work with: the hyperparameter configuration and the training data. The former is tested primarily through the validation methods we discussed earlier, but may also involve other methods such as cross-validation. In fact, any type of AI development project must include validation techniques to determine whether the hyperparameter settings are correct. That's just a given.
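As a rough illustration, k-fold cross-validation over a grid of candidate hyperparameters can be wired up in a few lines. The candidate values, the classifier, and the synthetic data below are assumptions made for the sake of the example, not a recommendation from this article.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Placeholder dataset standing in for the project's real training data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Candidate hyperparameter settings to compare (illustrative values only).
param_grid = {"n_estimators": [50, 100, 200], "max_depth": [5, 10, None]}

# Each candidate is trained and validated on 5 different splits of the data.
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("Best settings:", search.best_params_)
print(f"Mean cross-validated accuracy: {search.best_score_:.3f}")
```

Because every setting is evaluated on several different splits, a single lucky (or unlucky) validation set is far less likely to mislead the QA team.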
After that, all that remains is to test the training data itself. But how do quality control engineers do this? They can't simply test data quality; they also need to test data integrity and ask plenty of questions to help them measure the results. These are always a good starting point:
- Does the training data accurately represent the reality the algorithm is trying to predict?
- Is there any chance that data-based or human biases are influencing the training data in some way?
- Are there blind spots that explain why some aspects of the algorithm work in training but don't work as expected in real-world contexts?
Testing the quality of your training data can generate many more questions like these as the project progresses. Keep in mind that to answer these accurately, your QA team will need access to representative samples of real-world data and a comprehensive understanding of what AI bias is and how it relates to AI ethics.
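Here is a hedged sketch of what such data-integrity and representativeness checks might look like with pandas. The DataFrame layout, the "label" column name, and the choice to compare feature means are all illustrative assumptions; real projects will have their own schemas and statistics.

```python
import pandas as pd

def audit_training_data(train_df: pd.DataFrame, real_world_df: pd.DataFrame,
                        label_column: str = "label") -> None:
    # Integrity: missing values and duplicate rows hint at collection problems.
    print("Missing values per column:\n", train_df.isna().sum())
    print("Duplicate rows:", train_df.duplicated().sum())

    # Bias: a heavily skewed label distribution can bake bias into the model.
    print("Label distribution in training data:\n",
          train_df[label_column].value_counts(normalize=True))

    # Blind spots: compare feature distributions against real-world data.
    shared_columns = [c for c in train_df.columns
                      if c != label_column and c in real_world_df.columns]
    comparison = pd.DataFrame({
        "train_mean": train_df[shared_columns].mean(numeric_only=True),
        "real_world_mean": real_world_df[shared_columns].mean(numeric_only=True),
    })
    print("Feature means, training vs. real world:\n", comparison)
```

Each check maps loosely onto one of the questions above: missing values and duplicates speak to integrity, the label distribution hints at bias, and the training-versus-real-world comparison helps surface blind spots.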
Artificial intelligence needs to be tested in production
In short, your QA team should know when your AI software is properly validated, when the training data is up to standard, and when the algorithm is proven to consistently deliver the expected results.
However, each AI project will always have a unique way of managing and processing data – and, as we all know, data is always growing and changing. This is why the quality control approach to AI development extends to the production stage.
Once all of the above gets the green light, Quality Assurance will begin a new cycle, testing the performance and behavior of the AI as it receives new real-world data. Regardless of the size or complexity of your AI project, you want to always closely monitor the evolution of your AI. And the best way to do this is through a proper quality control process.
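As a simple illustration of what that monitoring could look like, the sketch below tracks a rolling accuracy over recent predictions and raises a warning when it dips below a threshold. The window size, the threshold, and the logging setup are assumptions made for the example; a production ML pipeline would usually lean on dedicated monitoring tooling instead.

```python
from collections import deque
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ai-qa-monitor")

class ProductionMonitor:
    def __init__(self, window_size: int = 500, accuracy_threshold: float = 0.85):
        # Rolling window of correct/incorrect outcomes for recent predictions.
        self.results = deque(maxlen=window_size)
        self.accuracy_threshold = accuracy_threshold

    def record(self, prediction, actual) -> None:
        """Log each prediction once the true outcome becomes known."""
        self.results.append(prediction == actual)
        if len(self.results) == self.results.maxlen:
            rolling_accuracy = sum(self.results) / len(self.results)
            if rolling_accuracy < self.accuracy_threshold:
                # In practice this would trigger the next QA/retraining cycle.
                logger.warning("Rolling accuracy dropped to %.3f", rolling_accuracy)
```

When the warning fires, that's the cue for QA to start a new cycle with fresh real-world data rather than waiting for users to notice the degradation.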
Today, this is known as “Machine Learning Operations” or, more succinctly, MLOps. It involves version control, software management, cybersecurity, iteration processes, and discovery stages where QA engineers handle everything that might happen once the AI is in production.

I hope this article helped you expand your perspective on quality control and artificial intelligence. Good luck!