How can companies be confident that the algorithms they are employing are providing the service intended but are not reinforcing policies they did not intend?
By Andrew Clark
Machine learning, in which a computer learns to recognize patterns without being explicitly programmed, is revolutionizing many industries. It lets us find answers and unexpected relationships in data that were impossible to find with the “cookbook recipe” style of programming that powers most of today’s software.
However, there is a downside to the use of machine learning: the “black box effect.” In traditional programming that uses the recipe approach, if a decision-maker or assurance professional wanted to know why a decision was made, software engineers or analysts could peek inside the program and see that threshold X was reached, which triggered the effect. With many machine learning algorithms, however, it is extremely difficult to look inside and ascertain why a certain result was returned.
For instance, to borrow an example from Carlos Guestrin, the Amazon professor of machine learning at the University of Washington, a model “trained” on a set of images can tell whether a given image shows a husky or a wolf with a high degree of accuracy. Unfortunately, the algorithm has one major problem that its trainers are unaware of: All the wolf pictures on which the model was trained had snow in the background. The model may therefore have learned to detect snow rather than wolves, so when an image of a husky with snow in the background appears, it is classified as a wolf.
A misclassification in this direction would not necessarily be catastrophic. But if the same algorithm were used to decide which animals could enter a children’s park, for example, the result could be disastrous when a wolf appears on a sunny, snowless day and is classified as a husky.
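The snow problem can be reproduced with a toy example. The sketch below is an illustration only, not Guestrin's model: the feature names, the six training rows, and the one-feature "stump" learner are all assumptions made for the demonstration. Because every training wolf has snow behind it, the learner finds that snow alone separates the classes perfectly and ignores everything else.

```python
# Toy training set (hypothetical): every wolf photo has snow, no husky photo does.
TRAIN = [
    ({"snow": 1, "large": 1}, "wolf"),
    ({"snow": 1, "large": 1}, "wolf"),
    ({"snow": 1, "large": 0}, "wolf"),
    ({"snow": 0, "large": 0}, "husky"),
    ({"snow": 0, "large": 1}, "husky"),
    ({"snow": 0, "large": 0}, "husky"),
]


def fit_stump(rows):
    """Return the single feature that best predicts 'wolf' on the training rows."""
    best_feature, best_accuracy = None, -1.0
    for feature in sorted(rows[0][0]):
        # A row counts as a hit when "feature present" agrees with "is a wolf".
        hits = sum((feats[feature] == 1) == (label == "wolf") for feats, label in rows)
        accuracy = hits / len(rows)
        if accuracy > best_accuracy:
            best_feature, best_accuracy = feature, accuracy
    return best_feature, best_accuracy


def predict(feature, feats):
    """Classify using only the chosen feature."""
    return "wolf" if feats[feature] == 1 else "husky"


feature, train_accuracy = fit_stump(TRAIN)

# Two animals the training set never covered.
husky_in_snow = {"snow": 1, "large": 0}
wolf_in_sun = {"snow": 0, "large": 1}

print("feature the model relies on:", feature)
print("training accuracy:", train_accuracy)
print("husky in snow classified as:", predict(feature, husky_in_snow))
print("wolf in sun classified as:", predict(feature, wolf_in_sun))
```

The model scores perfectly on its own training data, which is exactly why the flaw goes unnoticed until an animal shows up against the "wrong" background.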
Are Algorithms Providing the Intended Service?
With algorithms increasingly dictating our credit scores, which jobs we can interview for, and even, heaven forbid, what our jail sentence is going to be, how can companies be assured that the algorithms they employ are providing the service intended and not implicitly reinforcing policies they never intended? With the majority of algorithms in use today, there is a significant “black box effect” and a dearth of assurance around how these models actually operate.
Going back to our wolf/husky example, it is extremely difficult to find out why the algorithm made the decisions it did. However, machine learning experts are addressing such scenarios.
Guestrin and his graduate students, Sameer Singh and Marco Tulio Ribeiro, recently released a paper titled “Why Should I Trust You? Explaining the Predictions of Any Classifier.” The paper introduced a framework, called LIME (Local Interpretable Model-agnostic Explanations), for estimating how much weight each factor carried in a given prediction, such as the classification of an image. Although far from perfect, this local approach can explain why a certain prediction was made “by learning an interpretable model locally around the prediction.”
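The idea of learning an interpretable model locally around a prediction can be sketched in a few lines. What follows is a simplified illustration, not the paper's implementation: the `black_box` function, the three feature names, the exhaustive enumeration of neighbors, and the exponential kernel on Hamming distance are all assumptions chosen to keep the example small. The sketch perturbs one input, weights each perturbed copy by its closeness to the original, and fits a weighted linear model whose coefficients serve as the explanation.

```python
import itertools
import math

FEATURES = ("snow", "pointed_ears", "thick_fur")  # hypothetical binary features


def black_box(x):
    """Stand-in for an opaque model: returns a 'wolf' score for binary features."""
    snow, ears, fur = x
    return 0.05 + 0.8 * snow + 0.1 * ears * fur


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def explain(instance, width=1.0):
    """Fit a proximity-weighted linear surrogate around `instance`."""
    rows, targets, weights = [], [], []
    for z in itertools.product((0, 1), repeat=len(instance)):
        distance = sum(a != b for a, b in zip(z, instance))  # Hamming distance
        rows.append((1.0,) + z)                 # leading 1.0 is the intercept term
        targets.append(black_box(z))            # query the opaque model
        weights.append(math.exp(-distance / width))  # closer neighbors count more
    n = len(rows[0])
    # Weighted least squares via the normal equations (X^T W X) beta = X^T W y.
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(n)]
            for i in range(n)]
    xtwy = [sum(w * r[i] * y for r, y, w in zip(rows, targets, weights))
            for i in range(n)]
    beta = solve(xtwx, xtwy)
    return dict(zip(FEATURES, beta[1:]))  # drop the intercept


husky_in_snow = (1, 1, 0)
explanation = explain(husky_in_snow)
for name, coefficient in sorted(explanation.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {coefficient:+.3f}")
```

In this toy setup the snow coefficient dominates the other two, which is the kind of signal that would prompt a reviewer to question the training data. A production explainer would sample perturbations at random rather than enumerate them, and for images would work on patches of pixels rather than hand-named features.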
It’s a step in the right direction. However, the immense difficulty of assessing an algorithm’s features raises a question: How many algorithms are out in the wild that appear to be working but may have been trained on snow-blotched data?
In April 2016, the European Parliament adopted the General Data Protection Regulation (GDPR), which gives citizens and regulators a “right to explanation” regarding algorithmic decision-making. This law lets citizens find out why they were rejected for a bank loan, for instance, when the decision was based on an algorithm.
For certain types of machine learning, such as logistic regression and decision trees, practitioners can obtain the model weights and input variables, which are readily interpretable to machine learning experts. However, neural networks and random forests, the sort of black-box models discussed earlier, make it difficult to explain why a given decision was made.
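To see why a model such as logistic regression is considered interpretable, consider the following minimal sketch. The weights, the intercept, and the applicant's values are invented for illustration and do not come from any real lender. Each input's contribution to the decision is simply its weight multiplied by its value, so an auditor can read off which factor drove a loan rejection.

```python
import math

# Hypothetical weights of an already-trained logistic regression loan model.
WEIGHTS = {"debt_to_income": -3.0, "years_employed": 0.2, "late_payments": -0.8}
INTERCEPT = 1.0


def explain_decision(applicant):
    """Return the approval probability and each feature's log-odds contribution."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


applicant = {"debt_to_income": 0.5, "years_employed": 2, "late_payments": 3}
probability, contributions = explain_decision(applicant)

print(f"approval probability: {probability:.2f}")
for name, value in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:+.2f}")
```

Here the late payments pull the score down the most, and that is an answer an applicant or regulator can be given directly. A neural network or random forest offers no comparably simple per-input breakdown.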
To fully comply with the GDPR, and with the comparable regulations likely to follow in other regions, current algorithmic development and deployment practices will need to change, and a new industry of objective algorithmic auditing will need to emerge. Internal and external auditors are in the perfect position to step into this role of providing algorithmic assurance to businesses and regulators.
Auditing the data flow to and from machine learning algorithms, reviewing the myriad assumptions, and examining model weights, when available, are key areas of growth that will allow the auditing profession to rise in prominence and facilitate our societal shift to the “Big Data Age.”
Andrew Clark is an IT auditor and internal audit data scientist at Astec Industries, as well as an ISACA expert and presenter on machine learning for auditors.