(Originally published in ABA Risk and Compliance, July/August 2023)
Bill Fair and Earl Isaac created the first modern credit scoring system, the Credit Application Scoring Algorithms, in 1958, and the first FICO score was introduced in 1989. Since then, the use of credit algorithms has become ubiquitous in the lending industry. These algorithms use statistical models to predict the likelihood of a borrower defaulting on a loan, which then informs the lender's decision to approve or deny the loan application. While credit algorithms offer potential advantages, such as improving consistency of applicant treatment and increasing access to credit, they also present significant potential fair lending risks, including continuation or acceleration of bias against protected classes of borrowers and rapid propagation of risks to broad consumer credit markets. Concerns about algorithmic bias and fair lending are being renewed by the growth of artificial intelligence (“AI”) and machine learning (“ML”) models and the use of alternative data.
These concerns were highlighted by a recent Joint Statement on Enforcement Efforts against Discrimination and Bias in Automated Systems (“Joint Statement”)[i] issued by the Consumer Financial Protection Bureau (“CFPB”), the Department of Justice (“DOJ”), the Equal Employment Opportunity Commission (“EEOC”), and the Federal Trade Commission (“FTC”). Referring to automated or AI systems, the Joint Statement noted that “Private and public entities use these systems to make critical decisions that impact individuals’ rights and opportunities, including fair and equal access to a job, housing, credit opportunities, and other goods and services. . . Although many of these tools offer the promise of advancement, their use also has the potential to perpetuate unlawful bias, automate unlawful discrimination, and produce other harmful outcomes.”
The CFPB has provided guidance on complying with the Equal Credit Opportunity Act (“ECOA”) and Regulation B, its implementing regulation, when using complex algorithms in credit processes. For example, Consumer Financial Protection Circular 2022-03[ii] concludes that adverse action notices (“AAN”) for credit decisions based on complex algorithms must include statements of specific reasons a credit application was denied. As the Official Interpretations of Regulation B[iii] discuss:
“If a creditor bases the denial or other adverse action on a credit scoring system, the reasons disclosed must relate only to those factors actually scored in the system. Moreover, no factor that was a principal reason for adverse action may be excluded from disclosure. The creditor must disclose the actual reasons for denial (for example, “age of automobile”) even if the relationship of that factor to predicting creditworthiness may not be clear to the applicant.”[iv]
Regulation B also mandates requirements for a credit scoring system to be deemed an “empirically derived, demonstrably and statistically sound, credit scoring system.”[v] These requirements include:

- The system is based on data derived from an empirical comparison of sample groups or the population of creditworthy and noncreditworthy applicants who applied for credit within a reasonable preceding period of time;
- The system is developed for the purpose of evaluating the creditworthiness of applicants with respect to the legitimate business interests of the creditor;
- The system is developed and validated using accepted statistical principles and methodology; and
- The system is periodically revalidated by the use of appropriate statistical principles and methodology and adjusted as necessary to maintain predictive ability.
In a recent panel discussion at the National Community Reinvestment Coalition’s Just Economy conference, Patrice Ficklin, the Associate Director of the CFPB’s Office of Fair Lending and Equal Opportunity, reminded the audience of regulatory expectations for model validation and usage in the context of fair lending. In her comments, Ficklin noted, “Rigorous searches for less discriminatory alternatives are a critical component of fair lending compliance management. I worry that firms may sometimes shortchange this key component of testing. Testing and updating models regularly to reflect less discriminatory alternatives is critical to ensure that models are fair lending compliant.” She added, “I want to express the importance of robust fair lending testing of models, including regular testing for disparate treatment and disparate impact, including searches for less discriminatory alternatives. And of course, testing should also include an evaluation of how the models are implemented, including cutoffs and thresholds, and other overlays that may apply to the model’s output.”
The Joint Statement noted algorithmic bias risks may arise from many sources, including unrepresentative, imbalanced, erroneous or biased data; lack of model transparency, especially with “black box” algorithms; and inappropriate design and usage.[vi] Robust systems for model risk governance and compliance management can address these and other issues. As a component of both a robust fair lending compliance management system and strong model risk management, model validation and revalidation should include testing for fair lending risk. In addition to common model validation and governance techniques,[vii] such testing should include at least the following scope to address fair lending risk:

- Review of model inputs for prohibited basis characteristics, or close proxies for them, and confirmation that any permitted use (such as age) is handled as Regulation B requires;[viii]
- Evaluation of the nexus between each model variable and creditworthiness;[ix]
- Disparate impact analysis of individual variables and of model outcomes across prohibited basis groups;[x] and
- Analysis of exceptions to model decisions, including high-side and low-side overrides.[xi]
As Ficklin noted, for any model with potential disparate impact, lenders must search for less discriminatory alternatives (LDAs) both before implementation and periodically throughout the model’s life. The concept of less discriminatory alternatives arises from the Discriminatory Effects Rule[xii] promulgated by the U.S. Department of Housing and Urban Development (“HUD”) under Fair Housing Act authority, but has also been applied to fair lending more broadly.[xiii] Under the Discriminatory Effects Rule, disparate impact is assessed with a three-pronged, burden-shifting approach:[xiv]

- First, the charging party or plaintiff must show that a challenged practice caused, or predictably will cause, a discriminatory effect;
- Second, the burden shifts to the defendant to prove that the practice is necessary to achieve one or more of its substantial, legitimate, nondiscriminatory interests; and
- Third, the charging party or plaintiff may still prevail by showing that those interests could be served by another practice with a less discriminatory effect.
In the case of models, the search for LDAs can be viewed as a form of champion-challenger testing that weighs both model performance and fairness. Developing challengers with debiasing approaches may involve intervention at the pre-processing, in-processing (model training), or post-processing stages.
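To make the idea concrete, the sketch below compares a hypothetical champion and challenger on both predictive power (AUC) and a simple outcome-disparity measure (the adverse impact ratio of approval rates). The data, column meanings, cutoff, and choice of metrics are illustrative assumptions, not a prescribed methodology.

```python
# Illustrative champion-challenger comparison on both performance and fairness.
# Assumes holdout arrays: y_true (1 = repaid, 0 = default), a score from each
# model, and a protected-class indicator -- all hypothetical toy data here.
import numpy as np
from sklearn.metrics import roc_auc_score

def adverse_impact_ratio(approved, protected):
    """Approval rate of the protected group divided by that of the control group."""
    return approved[protected].mean() / approved[~protected].mean()

def evaluate(y_true, score, protected, cutoff):
    approved = score >= cutoff                        # applicants at or above the cutoff are approved
    return {
        "auc": roc_auc_score(y_true, score),          # predictive power
        "air": adverse_impact_ratio(approved, protected),  # outcome disparity
    }

# Hypothetical holdout data for illustration only.
rng = np.random.default_rng(0)
n = 10_000
protected = rng.random(n) < 0.3
y_true = (rng.random(n) < 0.9).astype(int)            # toy repayment labels
champion_score = rng.random(n)
challenger_score = rng.random(n)

champion = evaluate(y_true, champion_score, protected, cutoff=0.5)
challenger = evaluate(y_true, challenger_score, protected, cutoff=0.5)

# A challenger may be preferred as a less discriminatory alternative if its AIR
# is materially higher while its AUC remains close to the champion's.
print("champion:", champion)
print("challenger:", challenger)
```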
Removing bias at the pre-processing stage involves techniques that increase the representativeness of training data or reduce the correlation between protected class membership and other data points. For example, sampling, augmenting, anonymizing sensitive data, or reweighting data may reduce bias if the available development data are skewed, imbalanced, or lack diversity.[xv] If the available data reflect evidence of historical discrimination or bias, other techniques, such as suppressing features highly correlated with protected class membership, relabeling similar target and control observations,[xvi] or transforming variables to reduce correlation with demographic groups, may reduce bias.[xvii] Adversarial training, in which an additional neural network is trained to detect and mitigate biases in the training data, may also be used.
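As a minimal sketch of the reweighting idea described by Kamiran and Calders, the example below assigns each training record a weight so that group membership and the outcome label are statistically independent in the weighted sample. The DataFrame, column names, and data are hypothetical.

```python
# Reweighting sketch: weight(group, label) = P(group) * P(label) / P(group, label),
# so that protected-class membership and the outcome appear independent when the
# weights are applied. Illustrative only; columns and data are hypothetical.
import pandas as pd

def reweighing_weights(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    n = len(df)
    p_group = df[group_col].value_counts(normalize=True)     # P(group)
    p_label = df[label_col].value_counts(normalize=True)     # P(label)
    p_joint = df.groupby([group_col, label_col]).size() / n  # P(group, label)

    expected = p_group.loc[df[group_col]].to_numpy() * p_label.loc[df[label_col]].to_numpy()
    observed = p_joint.loc[list(zip(df[group_col], df[label_col]))].to_numpy()
    return pd.Series(expected / observed, index=df.index, name="weight")

# Hypothetical training frame; the weights would then be passed to a learner
# that accepts sample weights (e.g., sample_weight in scikit-learn estimators).
train = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b", "a"],
    "approved": [1, 0, 0, 0, 1, 1],
})
train["weight"] = reweighing_weights(train, "group", "approved")
print(train)
```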
In-processing debiasing techniques introduce fairness constraints during model training. In-processing techniques that may reduce disparate impact include dual optimization for predictive power and fairness, regularization of the model’s parameters or its sensitivity to specific features, reweighting model features, dropping explanatory variables that have low importance but high contribution to disparate impact, and adversarial debiasing. Meta-algorithmic approaches that use a model suite and dynamically adapt the model to increase fairness are another in-processing debiasing method. Cost-sensitive learning, which assigns different misclassification costs to different groups and optimizes for the lowest cost rather than the greatest accuracy, is also an in-processing technique.
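The sketch below illustrates one of these in-processing ideas, a simple fairness regularizer: a toy logistic regression trained by gradient descent whose loss adds a penalty on the gap in mean predicted score between groups. The synthetic data, penalty form, and weighting are assumptions chosen only for illustration.

```python
# Toy logistic regression with a demographic-parity style regularizer: the loss is
# log-loss plus fairness_weight * (gap in mean predicted probability between groups)^2.
# Illustrative only; data are synthetic.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_fair_logreg(X, y, protected, fairness_weight=1.0, lr=0.1, epochs=500):
    w = np.zeros(X.shape[1])
    g = protected.astype(float)
    for _ in range(epochs):
        p = sigmoid(X @ w)
        grad_ll = X.T @ (p - y) / len(y)                     # gradient of the log-loss term
        gap = p[protected].mean() - p[~protected].mean()     # group gap in mean score
        dgap_dp = g / g.sum() - (1 - g) / (1 - g).sum()
        grad_fair = 2 * gap * (X.T @ (dgap_dp * p * (1 - p)))  # gradient of gap^2
        w -= lr * (grad_ll + fairness_weight * grad_fair)
    return w

# Synthetic example: one feature is correlated with group membership (a proxy-like feature).
rng = np.random.default_rng(1)
n = 5_000
protected = rng.random(n) < 0.4
x1 = rng.normal(size=n) + 0.8 * protected
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = (sigmoid(0.5 * x2 - 0.3 * x1) > rng.random(n)).astype(float)

w_fair = train_fair_logreg(X, y, protected, fairness_weight=5.0)
p = sigmoid(X @ w_fair)
print("score gap between groups:", p[protected].mean() - p[~protected].mean())
```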
Reducing bias during model development is costly, however, and may require many iterations of retraining. It also requires access to the complete algorithm and training data, and, for some techniques, may require incorporation of protected attributes into the model. These techniques are unlikely to be available in the case of vended models. In addition, incorporation of protected attributes, even for the purposes of model debiasing, raises ECOA concerns.
Post-processing debiasing may involve evaluating cut-off scores and reconsidering knock-out rules and model overlays. Model-predicted probabilities can also be recalibrated after scoring to reduce bias. In addition, Reject Option Classification, also called prediction abstention, allows the model to refuse to make a prediction for observations where model confidence is low. These methods do not require rebuilding the model or understanding the details of complex algorithms, and they can be used with vended models. Other techniques include equalized odds post-processing and individual plus group debiasing.
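The sketch below illustrates two of these post-processing ideas on synthetic scores: evaluating alternative cut-off scores for approval-rate disparity, and abstaining (referring to manual review) when the predicted probability falls in a low-confidence band around the cutoff. The data, cutoffs, and band width are assumptions.

```python
# Illustrative post-processing sketch: evaluate candidate cut-off scores for
# approval-rate disparity, and abstain (route to manual review) when the model's
# predicted probability sits near the cutoff. Data and band width are assumptions.
import numpy as np

def approval_disparity(scores, protected, cutoff):
    """Ratio of protected-group to control-group approval rates at a given cutoff."""
    approved = scores >= cutoff
    return approved[protected].mean() / approved[~protected].mean()

def with_abstention(scores, cutoff, band=0.05):
    """Return 1 = approve, 0 = decline, -1 = abstain / refer for manual review."""
    decisions = np.where(scores >= cutoff, 1, 0)
    low_confidence = np.abs(scores - cutoff) < band
    return np.where(low_confidence, -1, decisions)

# Hypothetical scored population (model-predicted probability of repayment).
rng = np.random.default_rng(2)
scores = rng.beta(5, 2, size=20_000)
protected = rng.random(20_000) < 0.3

for cutoff in (0.55, 0.60, 0.65):
    print(cutoff, round(approval_disparity(scores, protected, cutoff), 3))

decisions = with_abstention(scores, cutoff=0.60)
print("share referred for manual review:", (decisions == -1).mean())
```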
The Joint Statement emphasized federal resolve to monitor AI development and implementation and to use the full authority of its signatory agencies to protect individuals’ rights.[xviii] Given the regulatory focus on both model risk and fairness, lenders are on notice that fair lending testing of models and the search for LDAs are mandatory elements of a robust fair lending compliance management system. Regardless of the bias-reduction approach used, data scientists, model developers, and fair lending specialists must consider fairness as well as predictive power when developing and deploying credit models.
ABOUT THE AUTHOR
Lynn Woosley is a Managing Director with Asurity Advisors. She has more than 30 years’ risk management experience in both financial services and regulatory environments. She is an expert in consumer protection, including fair lending, fair servicing, community reinvestment, and UDAAP.
Before joining Asurity, Lynn led the fair banking practice for an advisory firm. She has also held multiple leadership positions, including Senior Vice President and Fair and Responsible Banking Officer, within the Enterprise Risk Management division of a top 10 bank. Prior to joining the private sector, Lynn served as Senior Examiner and Fair Lending Advisory Economist at the Federal Reserve Bank of Atlanta. Reach her at lwoosley@asurity.com.
[i] https://www.ftc.gov/system/files/ftc_gov/pdf/EEOC-CRT-FTC-CFPB-AI-Joint-Statement%28final%29.pdf
[ii] https://www.consumerfinance.gov/compliance/circulars/circular-2022-03-adverse-action-notification-requirements-in-connection-with-credit-decisions-based-on-complex-algorithms/.
[iii] https://www.ecfr.gov/current/title-12/chapter-X/part-1002/appendix-Supplement%20I%20to%20Part%201002
[iv] 12 CFR Part 1002 (Supp. I), sec. 1002.9, para. 9(b)(1)-4
[v] 12 CFR Part 1002 §1002.2(p)
[vi] https://www.ftc.gov/system/files/ftc_gov/pdf/EEOC-CRT-FTC-CFPB-AI-Joint-Statement%28final%29.pdf
[vii] See, for example, Federal Reserve SR 11-7 and the Comptroller’s Handbook on Model Risk Management.
[viii] Age may be included as a variable in a model as long as applicants ages 62 or over receive more favorable treatment. For example, an underwriting algorithm may include age as long as the age of an older applicant is not a negative value or factor.
[ix] In general, the weaker the nexus with creditworthiness, the greater the fair lending risk. See Carol A. Evans, “Keeping Fintech Fair: Thinking About Fair Lending And UDAP Risks,” Consumer Compliance Outlook, 2017 (second issue).
[x] For example, in 2021 the FDIC referred a bank to the Department of Justice for potential fair lending violations related to the use of CDR in originating and refinancing private student loans. CDR is typically higher at Historically Black Colleges and Universities (HBCUs), so the FDIC determined use of CDR had a disparate impact on the basis of race, given that graduates of HBCUs are disproportionately black. See Consumer Compliance Supervisory Highlights, March 2022, page 9.
[xi] High-side overrides decline applicants that were approved by the model, while low-side overrides approve applicants that were declined by the model.
[xii] https://www.hud.gov/sites/dfiles/FHEO/documents/6251-F-02_Discriminatory_Effects_Final_Rule_3-17-23.pdf
[xiii] See, for example, the discussion of Disproportionate Adverse Impact Violations in the Appendix to the Federal Financial Institutions Examination Council’s Interagency Fair Lending Examination Procedures.
[xiv] https://www.hud.gov/sites/dfiles/FHEO/documents/DE_Final_Rule_Fact_Sheet.pdf
[xv] Kamiran, F., Calders, T. “Data preprocessing techniques for classification without discrimination.” Knowledge and Information Systems, 33(1):1–33 (2012). https://doi.org/10.1007/s10115-011-0463-8
[xvi] Kamiran and Calders.
[xvii] Feldman, M., Friedler, S., Moeller, J., Scheidegger, C., and Venkatasubramanian, S. “Certifying and Removing Disparate Impact.” Proceedings of Knowledge Discovery and Data Mining ’15, 259-268 (2015). https://dl.acm.org/doi/10.1145/2783258.2783311
[xviii] https://www.ftc.gov/system/files/ftc_gov/pdf/EEOC-CRT-FTC-CFPB-AI-Joint-Statement%28final%29.pdf