His fellow panelists included Jennifer Jordan, thought leader and Techstars director; AI entrepreneur Kareem Saleh; and Slater Victoroff, CTO of Indico. The discussion explored bias and explainability in machine learning. Although the specifics often centered on financial use cases, the larger context applies to every industry deploying high-stakes AI and ML. Here are the themes and key moments that emerged in the session.
AI's explosive growth over the past decade has created the current moment, when large companies are relying on it to make or support big decisions about everyday people. Those decisions literally impact life and limb, creating opportunity and roadblocks for individuals.
Should I give Anthony Habayeb a life insurance policy? Does this patient have cancer? Should we give this patient this medicine? Should I give Anthony a loan? Should we hire Anthony – yes or no? Those are really important consequential decisions.
– Anthony Habayeb
Given the absence of oversight for such consequential systems to date, many consumers affected by these decisions have no idea that AI is behind them. In some ways, that is not substantially different from the decisions made about them by other proprietary models and systems.
However, those older decisioning methods have withstood deep scrutiny from internal and external reviewers, scrutiny that AI systems have largely sidestepped. Enterprise controls, built on clearly articulated principles and sound governance, were created to help organizations manage risk and increase predictability.
Writ large, the validations across financial providers have been designed over time to guarantee public good for consumers and protect the economic foundation that generates prosperity.
Leave no doubt, as Kareem notes, "The concerns about bias are real. The concerns about not being able to explain are real...And you cannot rebuild the traditional workflow." Machine learning imposes new imperatives for both risk management and cross-functional alignment around those risks.
Established companies with proven processes and workflows for understanding risks struggle with different standards. An unsustainable divergence has emerged with the advent of so-called intelligent models: one measuring stick for traditional models, and another for ML and AI decisioning.
The base necessity for responsible [AI] systems don't exist at most companies. Validating the model has to be a different person from the one building the model. Seems basic, very rare.
– Slater Victoroff
And yet the opacity and evolving nature of modern models require it, not to mention the limited usefulness of explainability tools across different model types (see below). Given those constraints, the goal of every company deploying ML today should be to unify how it approaches compliance and controls around these systems. Managing bias and delivering explainability are two important aspects of a broader, modernized assurance framework.
In the panel's estimation, we are far off from the Inception-like world in which models explain the decisions of other models to a level that satisfies business leaders and risk owners. An executive who greenlights such a project will be accountable for her action, and she will need to perform significant due diligence on the design, validation, and assurances behind the model. The company may face legal consequences, unanticipated or not, as a result of the choices of the model makers, informed or not.
Broadly, the panelists had serious questions about the reach of explainability. The prevailing concern was overconfidence on a couple of fronts.
Within the developer community, certainty about the ability to deliver true explainability set off alarm bells for the panel. At least in part, this reaction derives from the broad set of explainability solutions and the lack of clarity about how well (or poorly) they map to different model types. Slater even asked pointedly, "If you've got an explainability framework that doesn't apply across all model frameworks, then you don't really have any explainability framework."
For traditional practitioners, the questions concerned the rote application of standards developed for less sophisticated modeling techniques to more advanced ML models, without a proper evaluation of how well those standards work for the newer use case. Kareem brought up the questionable practice of embedding ML outputs as variables in more traditional models. Gaming explanations in this way upends the purpose of the second and third lines of defense, obscuring the nature of the decisioning for less technical audiences.
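To make that last point concrete, here is a minimal Python sketch of how wrapping an ML output inside a traditional model can hide what actually drives a decision. Everything in it is hypothetical and chosen only for illustration: the feature names, the weights, the `opaque_ml_score` stand-in for a complex model, and the `reason_codes` helper. None of it comes from the panel or from any real lender's practice.

```python
import math

def opaque_ml_score(applicant):
    """Hypothetical stand-in for a complex ML model; returns a risk score in (0, 1)."""
    z = (0.08 * applicant["utilization_pct"]
         - 0.05 * applicant["months_at_address"]
         + 0.9 * applicant["late_payments"])
    return 1.0 / (1.0 + math.exp(-z / 4.0))

# Hypothetical weights for a simple, "traditional" linear scorecard.
# The ML score is embedded as just another input variable.
SCORECARD_WEIGHTS = {"ml_score": -300.0, "income_k": 0.5, "debt_ratio": -40.0}
BASE_POINTS = 700.0

def scorecard(applicant):
    """Return the final score and each scorecard input's point contribution."""
    inputs = {
        "ml_score": opaque_ml_score(applicant),
        "income_k": applicant["income_k"],
        "debt_ratio": applicant["debt_ratio"],
    }
    contributions = {name: SCORECARD_WEIGHTS[name] * value
                     for name, value in inputs.items()}
    return BASE_POINTS + sum(contributions.values()), contributions

def reason_codes(contributions, top_n=1):
    """Names of the scorecard inputs that pulled the score down the most."""
    negatives = sorted((points, name)
                       for name, points in contributions.items() if points < 0)
    return [name for _, name in negatives[:top_n]]

# Illustrative applicant; all values are made up.
applicant = {
    "utilization_pct": 85, "months_at_address": 6, "late_payments": 3,
    "income_k": 60, "debt_ratio": 0.4,
}
score, contributions = scorecard(applicant)
print(reason_codes(contributions))
```

For this applicant the top reason code is simply `ml_score`. The features that actually drove that score (utilization, tenure at address, late payments) never appear among the scorecard's contributions, so a reviewer applying traditional scorecard standards sees a tidy linear model while the real decision logic stays opaque.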
In the fray of technical debate about the "right" explainability tools for the model at hand, an important dimension fades from view. Who exactly is the audience of an "explanation"?
Anthony launched into this territory by pointing out that, as consumers, we define explainability differently than any technologist would. Trust in the context of decisions made about you is personal and emotional in nature. Likewise, explainability has a different ring for the professionals tasked with evaluating the risks and benefits to the business. The distinction is even more pronounced when that person works in compliance, regulatory affairs, or audit, roles that carry a higher burden of independence and verifiability. "Trust me" will never be an acceptable response.
Adding another audience dimension, Jennifer has seen different company functions impacted by their lack of domain knowledge about how ML and AI operate. External-facing teams can become entranced by the possibilities of increased intelligence and personalization offered by processing new stores of data, never realizing they've wandered into a regulatory minefield.
To this largely technical audience, Anthony closed with a poignant example from his professional background as an "intrapreneur" working within a much larger organization. He encouraged developers to empathize with the individuals responsible for the big bets that could significantly impact the success of the entire business unit and even the larger enterprise.
The people that own the budgets and will be making decisions about investments in ML and AI are not going to be able to engage at a technical level with the team that is building those things. And that is terrifying...
Imagine you are a multi-billion dollar business line owner that is now entrusting those data scientists to build you an application to make critical decisions for your business, and you feel no personal empowerment to be able to know "Is that application doing what it's supposed to be doing? Have they put the proper controls in place? Do they have somebody objectively looking at their system? Can I take a peek now and then if I wanted to?"
– Anthony Habayeb
In the end, AI's potential for good is undeniable, but a leap of faith will never work for any of the stakeholders who own the process or the results of models' decisions.