When we don't want to know

01 October 2016

FICO feeds its neural network every scrap of data it can in order to find credit transactions that look fishy. But when it computes credit scores, it purposefully excludes information that seems as if it might be highly relevant. It’s right to do so. Sometimes the best knowledge is not the full knowledge.

Software-based neural networks are awesome at drawing conclusions from massive amounts of data representing many, many variables, in situations where generalizations are too general. To make up a far-too-simple example out of whole cloth, a neural network might discover that it’s not suspicious when, say, a Ugandan credit card is used in France for a small purchase in a patisserie on a Sunday when the sun is out, provided the card has been used to buy sneakers many times in Uganda, but that a U.S. card used in the same way is highly suspicious if charges on that card have gone to online stores more than 11 percent of the time. Why might this make sense? Who knows!

Really complicated

Well, the neural network knows. It comes to this sort of decision by creating a complex network of nodes that trigger other nodes based on the weights accorded the incoming data. And then it gets really complicated, because deep learning systems layer neural networks, feeding the outputs from one into another.
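To make the idea of weighted nodes feeding other nodes concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names, the three made-up transaction features and the hand-picked weights are hypothetical, and nothing here reflects how FICO's fraud system actually works. A real network would learn its weights from data and would have vastly more nodes.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0..1 (how strongly a node 'fires')."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: each node sums its weighted inputs, adds a bias, then fires."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one, in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical, hand-picked weights (a real system learns these from data).
# Layer 1: three inputs feed two nodes. Layer 2: two nodes feed one output.
LAYERS = [
    ([[0.8, -0.4, 0.3], [-0.6, 0.9, 0.1]], [0.0, -0.2]),
    ([[1.2, -1.1]], [0.1]),
]

# Made-up features: foreign purchase?, sunny Sunday?, share of online charges.
features = [1.0, 0.0, 0.5]
suspicion = network_forward(features, LAYERS)[0]
print(f"suspicion score: {suspicion:.3f}")
```

Even in this toy, the output is a number between 0 and 1 with no built-in account of which input mattered or why. Multiply the nodes by thousands or millions and the difficulty of explaining a result becomes plain.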

The result can be quite precise fraud notifications, or the ability of a photo site to tell not only that a particular snap shows a child riding a dragon in a theme park, but also that the child is not too happy about it.

The drawback is that in many instances the human brain can’t understand how a deep learning system came up with its results. We simply cannot trace back through all the nodes, perhaps thousands or millions of them, that flipped based on data and weights.

And that’s a serious problem for credit scores, for laws and regulations say that the scoring agency has to enable a potential lender to explain why an applicant was turned down. In fact, the explanation has to make it clear how the applicant can boost her score. And the decision cannot be based on what the scoring agency knows about “protected classes,” that is, sets of people who by law are protected from being discriminated against simply because they are members of that class.

Simply unfair

For example, here is an acceptable explanation of why your bank turned down your application for a home equity line of credit: “Your credit score was too low because you were significantly late on your last six car payments.” This is an explanation a human can understand, no stats or computer science required. It indicates how you can improve your score: Start making those payments on time! And it has nothing to do with you as a member of a protected class.

Here are some unacceptable explanations: “Don’t ask me! The computer figured it out and it’s too complex for human beings.” Or: “It’s because Hispanics are poor credit risks.” Even if that were true, it’s not something you could change in order to improve your score.

There is, of course, a more important reason why FICO and other credit scoring companies are not permitted to consider ethnicity when computing a score. Nor are they allowed to consider anything that could be a proxy for ethnicity, such as subscribing to Spanish-language newspapers. The reason is simple: It’s unfair. Imagine having to tell a bank loan officer, “It may be that loan defaults are higher for my protected class, but don’t you see where it says on my application that I earn a lot of money, I’ve saved a lot of money, and I’ve paid every bill on time?” Conversely, one can imagine a different disappointed loan seeker expostulating, “How dare you! Yes, I am personally a deadbeat, but can’t you see that I am a white man?!”

I recently asked Andrew Jennings, the senior VP of scores and analytics at FICO, about these strictures. Although FICO uses neural networks extensively in its fraud division, for credit scores it carefully assembles models (mathematical formulas that express relationships among factors) that would make sense to a human and that exclude proxies for protected classes.
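A human-explicable model of this kind can be pictured as a short list of factors, each with a weight, from which both the score and the reasons for it fall out directly. The sketch below is purely illustrative and is not FICO's formula: the factor names, the point values, the base score and the list of excluded fields are all hypothetical assumptions.

```python
# Fields the model must never read (hypothetical list, for illustration only).
PROTECTED_OR_PROXY = {
    "ethnicity", "gender", "religion", "age", "spanish_language_newspaper",
}

# Hypothetical points per unit of each factor. Not FICO's actual formula.
SCORECARD = {
    "late_payments_last_12_months": -25.0,
    "years_of_credit_history": 4.0,
    "credit_utilization_pct": -1.0,
}
BASE_SCORE = 700.0

# Guard: the model may not use a protected attribute or a known proxy.
assert PROTECTED_OR_PROXY.isdisjoint(SCORECARD)


def score_applicant(applicant):
    """Return (score, reasons), with reasons ordered from most damaging down."""
    contributions = {
        name: points * applicant[name] for name, points in SCORECARD.items()
    }
    score = BASE_SCORE + sum(contributions.values())
    reasons = sorted(
        (name for name, value in contributions.items() if value < 0),
        key=lambda name: contributions[name],
    )
    return score, reasons


applicant = {
    "late_payments_last_12_months": 6,
    "years_of_credit_history": 5,
    "credit_utilization_pct": 40,
    "ethnicity": "Hispanic",  # present in the record, never read by the model
}
score, reasons = score_applicant(applicant)
print(score, reasons)
```

Because every factor's contribution is visible, the top entry in `reasons` doubles as the explanation a lender could give and as advice the applicant can act on. Changing the `ethnicity` field changes nothing, since the scorecard never looks at it.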

But suppose these manually created models are less predictive of credit risk than a neural network would be? Jennings says that they recently derived some prototype FICO scores using machine learning techniques and found that the difference between those scores and the ones from their manual models was insignificant.

“Best” knowledge

But one can wonder. Assume that the inscrutable neural network systems get better and better as the science improves and as more data pours into them. Assume that at some point they predict credit risk significantly better than the human-explicable models do. Lenders, wanting to lower their risks, would undoubtedly pressure companies like FICO to start using neural networks to derive credit scores. The lenders would be able to say things like, “You could lower the percentage of bad loans my institution makes by 5 percent.” Or 10 percent. Or 20 percent.

It doesn’t matter. You don’t want to be denied a loan you are qualified for because of your race, ethnicity, gender, sexual “preference,” religion, age or anything other than your assets, revenue and record. We don’t always want the “best” knowledge that money and computers can buy.

It’s one of the reasons we need governments to govern, including regulating what knowledge we’re allowed to use.