Episode 161 – AI-ML and Bias – Ismini Psychoula, OneSpan

George Peabody

January 13, 2022

POF Podcast

2022 has just started, and there is so much to see and learn this year. But we are not alone on this learning journey. Machine learning and artificial intelligence are here to stay, so we should learn why and how they work for the payments industry.

To start this season of Payments on Fire®, we have Ismini Psychoula discussing a fundamental technology, AI and ML, and its role in financial services. Ismini is a Research Scientist at OneSpan. OneSpan is a global organization focused on digital banking security and e-signatures, delivering trust and business productivity solutions for more than 10,000 customers in 100 countries. In the financial industry, more than half of the top 100 global banks rely on OneSpan solutions to protect their online, mobile, and ATM channels.

Ismini Psychoula holds a PhD in Computer Science, with a thesis on Privacy-Preserving Machine Learning for Smart Healthcare. At OneSpan, she researches and designs privacy-preserving and explainable machine learning models for financial systems.

In this episode, you'll hear about ethical AI, how AI systems are fed and monitored, machine learning bias, and how it all affects the financial system. We'll take a deep dive into the research to prevent fraud and increase transaction security.


George Peabody, Host, Glenbrook Partners:
Welcome to Payments on Fire®, the podcast from Glenbrook Partners about the payments industry, how it works, and trends in its evolution. I'm George Peabody, partner at Glenbrook and host of Payments on Fire®. Today we're taking a deep dive into a fundamental technology, AI and machine learning, and its role in financial services. We've had multiple guests here on Payments on Fire® who employ these tools to detect fraudulent behavior, and it turns out AI can be very effective as another tech layer joining rules-based systems in the expanding fight against fraud. But there are limitations, and despite the digital domain these systems operate in, it turns out that human biases and prejudice find their way into them.

A long-standing problem has been the fact that credit bureaus, an essential source of decisioning data for lenders, have long supported declining access to credit based on race through the use of historical data: zip codes, where you went to high school, perhaps payment history on making regular mortgage payments. Why is access to credit for Black and Hispanic customers so difficult to achieve? How does that work? There's a reason Black homeownership in the US stands at 44%, compared to 74.5% for non-Hispanic whites.

This use of historical data focuses on the wrong thing for these populations; they just don't have that lengthy history. Cash flow-based analytics, on the other hand, which look at current obligations, could make more sense. Automated decisioning, the risk assessment software used for underwriting by credit issuers, is a heavy user of AI and machine learning, and based on the data it consumes, which includes credit bureau data, these systems still favor wealthier white applicants. This isn't fair, equitable, or ethical. It automates the digital divide. It inhibits wealth building, often viewed as synonymous with homeownership in the U.S., among those communities, and it also leaves potential business on the table for lenders. So, to understand how bias plays a part in AI and the ethical concerns that result, we're very fortunate to have Ismini Psychoula, Research Scientist with cybersecurity firm OneSpan, join us today.


George Peabody, Host, Glenbrook Partners:
Welcome Ismini, glad to have you here.

Ismini Psychoula:
Thank you George, the pleasure is all mine. So happy to discuss such an important topic with you.

George Peabody, Host, Glenbrook Partners:
So you're with OneSpan. It's a global organization that's focused on digital banking security and e-signatures, obviously important security tools, including digital identity verification and authentication, big favorites of mine, mobile app security, and fraud analysis. I'll also do a quick shout-out, and I'll put it in the show notes: OneSpan did a tremendous amount of work publishing a really comprehensive look at global financial regulations, your Global Financial Regulations Report. It's a really thorough piece of work.

Ismini Psychoula:
Absolutely, my colleagues at OneSpan did a tremendous job there and we’re hoping this is going to help a lot of other people in the industry get this right.

George Peabody, Host, Glenbrook Partners:
Great, so back to the main thrust here, I really appreciate what you’re doing and looking at this in such depth. First of all, could you tell us a little bit about your own background and how you came to this work?

Ismini Psychoula:
Absolutely. My background is in Computer Science. Since a very early age I had an interest in computers, how they work, and how the algorithms behind them make decisions. This led me to do all my studies in IT, finishing with a PhD in Artificial Intelligence, particularly the privacy and trustworthiness aspects of it. Now at OneSpan I work in the innovation center in Cambridge, where the team I work with and I look at all the ethical and trustworthiness aspects of AI and how we can build them into all the services and products that we put out there.

George Peabody, Host, Glenbrook Partners:
Right, so was there a particular event or a set of circumstances that caused you to focus on the ethical concerns?

Ismini Psychoula:
I would not say that there was a particular event. Just by reading and educating myself more in this area, I saw how underserved it is and how much we lack, in both research and, let's say, practical technology, and I decided to put my focus and research there.

George Peabody, Host, Glenbrook Partners:
Great, so to set the stage, I know this is a big ask, but could you explain in a few sentences how Artificial Intelligence and machine learning actually work?

Ismini Psychoula:
So Artificial Intelligence is a very broad term, and it essentially describes a machine that is able to simulate human behavior. When we talk about machine learning, it is a subset of AI algorithms that allow a machine to learn from past data without being programmed explicitly; it learns on its own based on the data that we give it.
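
To make that concrete, here is a minimal sketch, using scikit-learn and an invented toy dataset, of what "learning from past data without explicit programming" looks like: the decision rule comes from the labeled examples, not from hand-written code. The amounts, hours, and labels below are all hypothetical.

```python
# Minimal sketch: a model "learns" a fraud rule from labeled past data.
# The dataset here is invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each row: [amount_usd, hour_of_day]; label 1 = fraud, 0 = legitimate.
past_transactions = [
    [12.50, 14], [899.00, 3], [40.00, 10], [1250.00, 2],
    [8.99, 18], [975.00, 4], [60.00, 12], [1500.00, 1],
]
labels = [0, 1, 0, 1, 0, 1, 0, 1]

model = DecisionTreeClassifier(max_depth=2)
model.fit(past_transactions, labels)  # no fraud rule is ever written by hand

# The trained model now scores transactions it has never seen.
print(model.predict([[1100.00, 3], [15.00, 13]]))  # e.g. [1 0]
```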

George Peabody, Host, Glenbrook Partners:
So the data that's being fed in is extremely important to its effectiveness, which I think we'll be talking about.

So, ideally, when these systems are applied to underwriting or fraud detection, they're really that combination of what should be a never-ending stream of data fed into algorithms that are specific to the particular use case, a particular question.

Ismini Psychoula:
Yes. So, we use data specific to the domain to be able to decide if a transaction was fraudulent or not, and this is mainly based on historic data, along with features that are most of the time engineered by data scientists, fraud analysts, or a combination of both, to achieve the best results.
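
As an illustration of what "engineered features" might look like, here is a hedged sketch in pandas that derives two common fraud-style signals, how unusual an amount is for a customer and how quickly transactions follow each other, from raw transaction records. The field names and values are invented.

```python
# Hypothetical feature engineering for fraud scoring; field names are invented.
import pandas as pd

raw = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "amount":      [20.0, 25.0, 900.0, 50.0, 55.0],
    "timestamp":   pd.to_datetime([
        "2022-01-10 09:00", "2022-01-10 09:05", "2022-01-10 09:06",
        "2022-01-10 12:00", "2022-01-11 12:00",
    ]),
})

# Feature 1: how unusual is this amount for this customer?
stats = raw.groupby("customer_id")["amount"].agg(["mean", "std"])
raw = raw.join(stats, on="customer_id")
raw["amount_zscore"] = (raw["amount"] - raw["mean"]) / raw["std"]

# Feature 2: velocity -- minutes since the customer's previous transaction.
raw = raw.sort_values(["customer_id", "timestamp"])
raw["mins_since_last"] = (
    raw.groupby("customer_id")["timestamp"].diff().dt.total_seconds() / 60
)

print(raw[["customer_id", "amount_zscore", "mins_since_last"]])
```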

George Peabody, Host, Glenbrook Partners:
And it’s also the case that the algorithms are also tuned for the particular use case as well.

Ismini Psychoula:
Yes, a lot of times they are tuned, and AI algorithms are often able to solve only a very narrow problem. We want a yes-or-no decision: is this fraud or not? They cannot generalize and tell us about other things that easily.

George Peabody, Host, Glenbrook Partners:
Yeah, so there’s no master AI out there.

Ismini Psychoula:
Not yet.

George Peabody, Host, Glenbrook Partners:
A master tool. I can wait; I mean, don't worry about that. So we know that these tools are being employed everywhere and they're working in many use cases, but where are they falling down in financial services, whether it is underwriting, extension of credit, fraud detection…?

Ismini Psychoula:
Like you said, George, in your introduction, right now in credit we are seeing a lot of bias based on this automated decision making, essentially because these systems are using historic data, and bias is already there in the data that we learn from, so it's really difficult to see when this is happening and how to mitigate it. Fraud is another use case completely, because it does not have a direct effect on people's financial circumstances or well-being, but at the same time fraud patterns change all the time, and we do not have the ways and technology to detect that directly, so bias can creep in there and we might miss fraud cases that are happening. So they are different types of use cases, but in both we see bias a lot of the time.

George Peabody, Host, Glenbrook Partners:
So, I think yesterday I was reading a report that a Google organization had put together on just this point, about the quality of the data that needs to be fed into these individual systems. It's often the case, with Mechanical Turk-style staffing where individuals compete for jobs doing the simplest of things, identifying whether there's a cat in an image or whatever, that the data generated by these underpaid, stressed individuals will carry their bias into the output as well.

Ismini Psychoula:
Absolutely. Collecting and labeling data is one of the most important tasks that we have in machine learning, and we want this to be as accurate as possible. Sometimes we can have sample bias or prejudice bias, and when people do not pay enough attention when labeling, or when two people label the same thing differently, this can confuse the end algorithm that we will put into production. So getting this step right is really important for any organization that plans to use the data for developing algorithms.

George Peabody, Host, Glenbrook Partners:
So I'm struck by what you said about how different fraud and underwriting are as use cases, and that the underwriting one has a real social impact. How is it that we're able, first of all, to detect this bias? How is that being exposed?

Ismini Psychoula:
So lately there has been a new research area, and it focuses on the explainability of AI algorithms. What this area is looking at is: we have an input, we give it to a machine learning algorithm, we get an output, but we have no insight into what happened inside the AI algorithm that led to that decision. The explainability research area looks exactly there: what were the important features that led to a decision? For example, we applied for credit and got a no decision. Was it because we don't have enough income? Was it because of our postcode, because no one in our postcode had ever gotten a loan approved before, so the AI learned that no one in this area ever gets a loan? These types of explanations that we get from the algorithm can help us detect if there is any bias in there and what we can do about it, and then we can take the necessary steps to improve the algorithm, whether that means changing the data or the features.
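
For a concrete flavor of this, here is a minimal sketch, assuming a scikit-learn model and an invented toy credit dataset, that uses permutation importance to ask which input features actually drove the model's decisions. Dedicated explainability libraries such as SHAP or LIME go further, but the idea is the same.

```python
# Minimal sketch: which features drove a credit model's decisions?
# The toy data is invented; in practice this runs on held-out real data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
income = rng.normal(50_000, 15_000, n)
postcode_risk = rng.integers(0, 2, n)    # proxy feature -- a potential red flag
approved = ((income > 45_000) & (postcode_risk == 0)).astype(int)

X = np.column_stack([income, postcode_risk])
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, approved)

result = permutation_importance(model, X, approved, n_repeats=10, random_state=0)
for name, imp in zip(["income", "postcode_risk"], result.importances_mean):
    print(f"{name}: {imp:.3f}")
# If "postcode_risk" dominates, the model may be encoding where you live,
# not your ability to repay -- exactly the bias explainability tries to expose.
```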

George Peabody, Host, Glenbrook Partners:
So you're pointing out that these systems are, or heretofore have been, essentially black boxes.

Ismini Psychoula:
Exactly. The black box issue is what we are trying to address right now with explainability, so we can see what is happening inside the black box and, if there is any bias, have the means to correct it.

George Peabody, Host, Glenbrook Partners:
I'm still troubled by the fact that an underwriting decision can be so glibly made on the basis of "oh, we've never done a loan in this zip code, this postal code, so let's reject the application."

Ismini Psychoula:
So that was a very simple example to highlight how these decisions can be made, but yes, a lot of times our algorithms can learn correlations that have no meaning. For example, if everyone that applied on a Wednesday has always been denied, that really has nothing to do with their potential to repay the debt, but if that's all the algorithm has ever seen, it will learn the correlation: everyone that comes for a loan on a Wednesday gets denied. A very simple example again, just to highlight that sometimes algorithms learn correlations that should not be there, and that's really easy to happen.
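
Here is a hedged sketch of that Wednesday example with invented data: because every Wednesday application in this tiny training set happens to be denied, a decision tree latches onto the day of the week rather than the applicant's finances.

```python
# Minimal sketch of a spurious correlation: in this invented training set,
# every Wednesday (day 2) application happens to be denied, so the model
# learns "Wednesday => deny" even though the day is meaningless.
from sklearn.tree import DecisionTreeClassifier

# Features: [income_thousands, day_of_week]  (0 = Monday ... 6 = Sunday)
X_train = [[60, 0], [45, 1], [65, 2], [55, 2], [40, 2], [70, 3], [50, 3]]
y_train = [1, 1, 0, 0, 0, 1, 1]   # 1 = approved; all day-2 rows denied

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A well-qualified applicant who happens to apply on a Wednesday:
print(model.predict([[80, 2]]))  # likely [0] -- denied for the wrong reason
```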

George Peabody, Host, Glenbrook Partners:
The correlations will be surfaced but they’re truly meaningless when it comes to making an ethical decision or even a useful decision.

Ismini Psychoula:
Yeah, and it comes back to how we select our data, what we put in there, and having the means to see, through explainability, which features were important in that particular decision.

George Peabody, Host, Glenbrook Partners:
Yeah, so that explainability has everything to do with trust in these systems. We are outsourcing, if you will, more and more of what was, well, it is the case that unethical decisions were being made by human beings, and now we're just transporting them over into a machine. But still, to trust the machines, we'd have to be able to actually explain what's going on. Where are we on that journey of opening up the black box?

Ismini Psychoula:
So this is a very new area. We've only just started to realize what's going on behind these black boxes and why we need this trustworthiness and explainability, so we are still at a very early stage, but we've seen a few very positive new regulations in this area. First of all, there's the European Union's trustworthy AI regulation, which categorizes several use cases based on risk; credit is one of its high-risk use cases. What that means is that all providers of such systems in Europe will have to be subject to a conformity assessment. They will have to show what data and what features they use, and any changes will again be subject to additional approvals, so this will enhance the levels of transparency. On the US side, we are seeing NIST also working on some frameworks to address bias, and we hope to see a lot more of this coming up in the following year.

George Peabody, Host, Glenbrook Partners:
What have you seen with respect to Data Quality?

Ismini Psychoula:
Data quality is really difficult to achieve. A lot of times we have imbalanced datasets. For example, in the fraud use case, we might in one day have 700,000 non-fraudulent transactions and one fraudulent one. This means that our algorithm will learn on the non-fraudulent ones and learn to recognize those, but it doesn't have enough samples to recognize the fraudulent one. So it's really difficult to get the data right, and that's the most important part of an algorithm; that's the same for credit as well. If we have a lot of samples in one area but not enough in another, that will make our algorithm imbalanced towards where we have the most samples.
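
As a hedged illustration of one common mitigation, here is a minimal scikit-learn sketch on an invented, heavily imbalanced dataset: class weighting tells the model that missing the rare fraud class is costly, rather than letting it score near-perfect accuracy by always predicting "not fraud".

```python
# Minimal sketch: handling class imbalance with class weights.
# The data is synthetic; real fraud sets can be far more imbalanced.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_legit, n_fraud = 5000, 10                      # heavy imbalance
X = np.vstack([
    rng.normal(50, 15, (n_legit, 1)),            # legitimate amounts
    rng.normal(900, 100, (n_fraud, 1)),          # fraudulent amounts
])
y = np.array([0] * n_legit + [1] * n_fraud)

# class_weight="balanced" re-weights errors inversely to class frequency,
# so the model cannot "win" by predicting the majority class every time.
model = LogisticRegression(class_weight="balanced").fit(X, y)
print(model.predict([[950.0], [45.0]]))          # e.g. [1 0]
```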

George Peabody, Host, Glenbrook Partners:
And today, I'm taking away that individual companies who are using a machine learning system are determining for themselves what is sufficient, from a quality point of view, from a quantity point of view, and in terms of the accuracy of the algorithm.

Ismini Psychoula:
Yeah, that's internal to the company itself, and they get to decide what is enough.

George Peabody, Host, Glenbrook Partners:
Do you have any recommendations for them in terms of improving Data Quality and certainly the fairness of their use of the tool?

Ismini Psychoula:
So, I would say that while we still lack explicit regulations in this area on what we should or should not do, there are several things companies can do to make sure they fulfill the requirements of trustworthy AI. The first would be to conduct risk assessments: the same as you do for cybersecurity or privacy, do for your AI systems. How are they going to impact the end users? Are there going to be adverse effects if something goes wrong? Do that for every AI system. Next, enhance any current procedures that are already there: embed AI into the procedures and compliance policies you already have, so that they cover AI systems. Data governance, like we said, is one of the most important aspects: selecting the right data and appropriate features, removing anything excessive that should not be there, and making sure the data is representative of the population you plan to apply the AI algorithm to.

George Peabody, Host, Glenbrook Partners:
What was that last piece about the population?

Ismini Psychoula:
So if you are building an AI system based on the US population and you've collected data from US citizens, it's not as easily applicable in the UK to UK citizens. They have different spending habits, different everything, so that's an important thing to consider. Another very important part is to monitor and control AI. This is an iterative process: if you've done it once, you need to do it again, over and over. Also, use additional resources. For example, there are tools out there like IBM's AI Fairness toolkit; they are open source, so apply them to your algorithms and check that they are explainable and fair. And one last thing is enhancing awareness around AI ethics. This is a very new area, and a lot of people up to this point think that with AI nothing can go wrong, but as with any technology it can make mistakes, so enhancing this awareness, from the customer-facing employees to the senior leadership teams, is important to be able to mitigate any potential mistakes that come up along the way.
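
To show the kind of check such tools automate, here is a minimal hand-rolled sketch, in plain pandas with invented decisions, of one standard fairness metric: the disparate impact ratio, the rate of favorable outcomes for the unprivileged group divided by the rate for the privileged group. Toolkits like IBM's AI Fairness 360 compute this and many other metrics out of the box.

```python
# Minimal sketch of the disparate impact ratio on invented decisions.
# Fairness toolkits (e.g., IBM's AI Fairness 360) provide this built in.
import pandas as pd

decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,    1,   1,   0,   1,   0,   0,   0],
})

rate = decisions.groupby("group")["approved"].mean()
disparate_impact = rate["B"] / rate["A"]   # unprivileged / privileged
print(f"approval rates:\n{rate}\ndisparate impact: {disparate_impact:.2f}")

# A common rule of thumb (the "80% rule"): ratios below 0.8 warrant review.
if disparate_impact < 0.8:
    print("potential adverse impact -- investigate the features and data")
```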

George Peabody, Host, Glenbrook Partners:
I remember 10 years ago how it all seemed like magic, right? I mean, of course, there are some AIs that are essentially harmless and very effective, image recognition, well, not always harmless, yeah, I misspoke there. Well, as you say that…

Ismini Psychoula:
Of course, there are also AI systems that are harmless. If I get the wrong recommendation for something I'm looking at while I'm browsing, it's okay, no big deal, but if I get it on my credit application, then that's a different story.

George Peabody, Host, Glenbrook Partners:
Of course you haven’t seen how much money I spend with a certain online retailer based on recommendations.

Ismini Psychoula:
Well, maybe it’s working way too well.

George Peabody, Host, Glenbrook Partners:
Exactly. Back to the data governance question before we wrap up: since data governance really is an internal role within a company, who should be on the data governance committee or team to get the best results?

Ismini Psychoula:
I would say that this is a collaborative area. In such a team you should have people from your senior leadership team, the head of data, data scientists, and domain experts; they're really important as well to get the results right. You also want people who can bring insight from the market on what is needed, what works well, and what doesn't. It really is a collaborative approach.

George Peabody, Host, Glenbrook Partners:
Yeah, that's a big job, and a new function, for a lot of people.

Ismini Psychoula:
Yeah, and it will have to be introduced into current processes so that it is taken care of, because as all our services move towards being data-oriented, how we select the data we put in there, and putting our customers' privacy and well-being first, is really important.

George Peabody, Host, Glenbrook Partners:
Is there a role for external auditors to look at the outcomes and/or the results of AI in terms of underwriting?

Ismini Psychoula:
So yes, in the EU case, if that regulation goes forward, there will be external auditors who will have to check the transparency, the fairness, and the data and training procedures, and assess the algorithm before it goes forward, particularly for high-risk cases like credit, recruitment, and anything like that.

George Peabody, Host, Glenbrook Partners:
You've referred multiple times, of course, to regulation in the EU, and as an American I'm looking at this market going, how's that going to work?

Ismini Psychoula:
Well, I think the US is also moving in this direction. We've seen NIST put forward their bias in AI framework; we've seen the Federal Trade Commission putting some really strong blog posts out there stating that vendors need to be truthful and not make claims that aren't backed up by evidence; and we've also recently seen a letter to the heads of the Federal Financial Institutions Examination Council telling them they need to prioritize the principles of transparency, enforceability, privacy, fairness, and equity, and that this needs to be done with explainability and appropriate governance, risk management, and controls over AI. So we can only hope they will take that into consideration.

George Peabody, Host, Glenbrook Partners:
You know, as you describe this, I'm also seeing an entire industry of people emerge to make sure that AI systems are indeed ethical and compliant with regulations, and that there are going to be experts who are all about helping organizations manage their data quality concerns.

Ismini Psychoula:
Absolutely. As we're seeing AI going into almost every industry, we need to start thinking about this now so we can get it right on our first try.

George Peabody, Host, Glenbrook Partners:
Well, Ismini, thank you so much for spending some time with us on Payments on Fire®. I really, really appreciate you talking about this really important issue.

Ismini Psychoula:
It’s been my pleasure and if there are any questions I’m always happy to answer them.

George Peabody, Host, Glenbrook Partners:
All right, well thank you very much and all the best for the new year.

Ismini Psychoula:
Thank you very much. Have a nice year as well.

George Peabody, Host, Glenbrook Partners:
Well, many thanks to Ismini Psychoula for this really interesting review of the impact of machine learning and artificial intelligence in the important areas of credit extension and underwriting, as well as fraud management. I particularly appreciate her insight that the data used by these systems, the quality of that data, and the specificity of that data to the problem set or task at hand have every bit as much impact as the algorithms themselves. So it's clear to me now why, around the world, different organizations, different governments indeed, are looking at the oversight and regulation of these systems, such that what comes out at the end is indeed ethical and also explainable. So again, Ismini, many thanks for your time, and many thanks to you for joining us on Payments on Fire® once again. Really appreciate your time and attention.

Do let us know if there's a particular person you'd like to see on Payments on Fire®; we always appreciate your suggestions. Until next time, then, I hope all's well. Do good work, and we'll see you next time.
