AI governance for foundation models and generative AI

As businesses decide how and where to use generative AI, a central question, and often a sticking point, is how to properly evaluate and govern it. The purpose of this blog post is to unpack how to think about governance of foundation models and the generative AI products built from them, such as ChatGPT. We’ll also discuss specific governance scenarios in the context of third-party management and model risk management.

What is a foundation model?

Foundation models are the basis of generative AI: large machine learning or neural models trained on broad data that generalize to multiple use cases. Before we talk about the complexities of building foundation models, let’s compare them to the models your business might use today, like statistical and machine learning models. These models and complex modeling systems are built with a stated objective and purpose-built to achieve targeted outcomes. Modeling teams can select relevant inputs, balance the data, choose the optimal algorithm, optimize a selected loss function to train the model, evaluate it against performance metrics, then stress test and validate the system, all before deploying it.
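To make that contrast concrete, here is a minimal sketch of the purpose-built lifecycle described above, using scikit-learn on synthetic data; the features, metric, and thresholds are illustrative assumptions, not recommendations.

```python
# A minimal sketch (synthetic data, illustrative thresholds) of the purpose-built
# lifecycle: select inputs, train against a chosen loss, evaluate, stress test,
# and only then decide whether to deploy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# 1. Select relevant inputs for a single, stated objective (synthetic features here).
X = rng.normal(size=(5_000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=5_000) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 2. Train the model by optimizing a selected loss (log-loss for logistic regression).
model = LogisticRegression().fit(X_train, y_train)

# 3. Evaluate against an agreed performance metric on held-out data.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# 4. Stress test: a crude robustness check with noisy inputs before deployment.
X_noisy = X_test + rng.normal(scale=0.5, size=X_test.shape)
noisy_auc = roc_auc_score(y_test, model.predict_proba(X_noisy)[:, 1])

# 5. Deployment gate with purely illustrative thresholds.
deploy = auc > 0.75 and (auc - noisy_auc) < 0.10
print(f"holdout AUC={auc:.3f}, noisy-input AUC={noisy_auc:.3f}, deploy={deploy}")
```

Every step here is scoped to one stated objective, which is exactly what a general-purpose foundation model does not have.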

With that understanding, foundation model development is arguably more complex and fraught with challenges because the model must generalize to many use cases. Foundation models, or ‘General-Purpose AI’ as the EU AI Act defines them, are voracious consumers of data, not all of which may have been properly acquired for the products currently on the market. Even assuming all of the data is appropriate and has been evaluated for bias and other key quality considerations, developing a general model that performs usefully across multiple use cases is a challenge in itself. Stress testing and validation, both pre-deployment and continuously in production, are therefore extremely important to ensure these models are evaluated properly for responsible use.

The perceptions of generative AI versus outcomes that matter

The allure of foundation models, and of the generative AI applications built on them, is the perception that they do not require the aforementioned complexities of building performant, robust, and resilient modeling systems. This perception creates friction across organizations that want to adopt generative AI tools, and even go as far as adjusting the models to be proprietary to their business. Purchasing and risk teams assume they can use established practices to assess or “track” the risks of generative AI in a silo, while modeling teams understand the nuances of how and why the models, including the foundation models at the heart of these applications, need to be managed continuously and at varying levels of risk depending on the outcome.

To use generative AI foundation models responsibly, commensurate with the risk of the systems, you cannot succumb to the market perception of generative AI as an instant game-changer without knowing what part of your own game you want to change, to what outcome, and with what impact.

Generative AI might be the right tool for some productivity enhancements, like checking documents for grammar mistakes. It might not be the right type of modeling for other purposes, such as tasks where mathematical accuracy or factual correctness matters. That said, we’re entering a world where these models have dependent relationships, especially as businesses pursue proprietary innovation.

Where vision, standards, and governance can accelerate responsible generative AI use

Businesses with a real vision for how AI can fundamentally change their products, market relevance, and operational sustainability already know that they need governance over the whole modeling approach to understand impacts, outcomes, and where to invest more or less. Generative AI is just one modeling type of many that will comprise these visionary modeling systems.

Whether it's the NAIC or the EU AI Act, standards for how AI is governed are being written around the model-based system as a whole and its outcomes. While the explosion of generative AI motivated the need for regulation, it didn’t confine governance to generative AI alone. In fact, generative AI widened the lens for regulators to look at modeling systems across the board and ultimately regulate the outcomes.

Generative AI is one of many modeling types, and more importantly, these models will change. To realize a disruptive vision at any level of an organization, you need a forward-looking system of governance around the whole modeling lifecycle. This ensures your business can develop and deploy safely with the right models for the job.

For more information about planning and vision, check out episode 24: Rethinking LLMs and model risk strategies.

Purchasing and onboarding foundation models and generative AI products to your organization

When bringing foundation models and generative AI into your modeling system, start with the intended outcomes and impact of the system in mind. This default practice and mindset ultimately determines where the risks will be and to what degree: low, medium, or high. Here are some tips for what to consider during the purchasing process. We'll also illustrate several cases for onboarding models and outcomes that would ultimately categorize your modeling system by degree of risk, and how to govern accordingly.

The purchasing process

As we’ve established above, foundation models require just as much governance, if not more, than their narrower counterparts, given their wide range of inputs and potential uses. How the model is used and how risky it is are two considerations for determining the responsibilities of the purchasing entity. Below are general suggestions for buying a foundation model, covering how it will be used and how much risk you take on.

Third-party risk management for AI governance

When purchasing a foundation model for enterprise use, the following considerations are likely embedded in your existing purchasing process. We mention them because they are critical steps in the process before we can discuss specific risks of the models. In short, if your AI vendor can’t pass these steps, then model risk isn’t even a question yet:

  • ITGC and security considerations.
    • Is the vendor SOC 2 Type 2 compliant, does it have SSO, etc.?
  • Data security and data rights.
    • Does the contract explicitly state that the purchaser’s data won’t be used for the vendor’s model training, etc.?
    • An enterprise license with a single-tenant environment is ideal to protect against exposure of private data or processes into the modeling system.

If the first two considerations check out, then you’re in the right place to talk about AI governance in the context of third-party risk management and model risk:

  • Inquire about the vendor’s AI governance practices, validations, and development process to ensure compliance with the NIST AI RMF, NAIC guidance, or another standard framework the purchaser is tracking toward.
  • We recommend obtaining and reviewing documentation and validation results unless an objective assessment is available; such an assessment would require an independent audit firm’s opinion on NIST AI RMF compliance.
  • We also recommend performing internal testing against the vendor’s modeling system to become comfortable with its performance on your individual use cases (a sketch of such a checklist follows this list).
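One lightweight way to operationalize these checks is to capture them as a structured record that your onboarding workflow can evaluate. The sketch below is hypothetical; the field names and pass/fail logic are assumptions for illustration, not a standard schema.

```python
# A hypothetical due-diligence record for an AI vendor; the field names and
# pass/fail logic are illustrative assumptions, not a standard schema.
from dataclasses import dataclass

@dataclass
class VendorDueDiligence:
    vendor: str
    soc2_type2: bool                    # ITGC / security posture
    sso_supported: bool
    data_not_used_for_training: bool    # contractual data-rights clause
    single_tenant_available: bool
    governance_evidence_reviewed: bool  # e.g., NIST AI RMF / NAIC alignment docs
    internal_testing_passed: bool       # your own tests on your use cases

    def baseline_passed(self) -> bool:
        # The gates that come before model risk is even a question.
        return all([self.soc2_type2,
                    self.sso_supported,
                    self.data_not_used_for_training])

    def ready_for_model_governance(self) -> bool:
        return self.baseline_passed() and all([self.governance_evidence_reviewed,
                                               self.internal_testing_passed])

vendor = VendorDueDiligence("ExampleAI", soc2_type2=True, sso_supported=True,
                            data_not_used_for_training=True, single_tenant_available=True,
                            governance_evidence_reviewed=True, internal_testing_passed=False)
print(vendor.baseline_passed(), vendor.ready_for_model_governance())
```

Keeping the record structured makes it easy to show, per vendor, which gate a purchase is stuck on.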

How to estimate the risk and impact of generative AI

After you buy your foundation model license or AI product, you’ll want to think about governance from different angles and the risks that these models can bring. At any risk level, creating an acceptable use policy and defining what data is acceptable are the most common first steps.

Beyond that, the purpose, adjustments, and intended outcomes of the foundation model will determine your risk level. Here are examples of considerations for low and high-risk scenarios and how to approach governance:

Low risk

A good example of a low-risk scenario is a foundation model bought only to make work easier, where a human is always in charge during use and the company has no desire to adjust the code of its newly purchased foundation model.

If the company is comfortable with the AI governance practices and validations received as part of third-party due diligence with the model vendor, then inventorying the foundation model is sufficient from a model governance perspective. However, you will hear about companies that prefer to assess and define each use case and keep an inventory at that level of detail.
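For teams that do want that level of detail, an inventory entry can be as simple as one structured record per purchased model with nested use-case records. The sketch below is hypothetical; the names and fields are illustrative, not a required schema.

```python
# A hypothetical inventory entry: one record for the purchased foundation model,
# optionally broken out per use case. Names and fields are illustrative only.
foundation_model_inventory = {
    "model": "vendor-foundation-model-v1",   # placeholder model name
    "vendor": "ExampleAI",                   # placeholder vendor
    "license": "enterprise, single-tenant",
    "risk_tier": "low",
    "use_cases": [
        {"name": "grammar and style checking",
         "owner": "communications team",
         "human_in_the_loop": True},
        {"name": "meeting summarization",
         "owner": "operations",
         "human_in_the_loop": True},
    ],
}
print(len(foundation_model_inventory["use_cases"]), "use cases inventoried")
```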

Medium and high-risk

Foundation models are deemed medium or high-risk when used in specific use cases that either:

  • Automate business processes without a human in the loop, or
  • Require fine-tuning on internal data, essentially adjusting the model from its purchased form

In medium and high-risk use cases, keep everything you put in place for low-risk scenarios, but now your company also needs to fully test the modeling system and each use case for regulatory, data, performance, and robustness considerations, as if you built the model yourself. Standard model development lifecycle best practices, as defined in a company’s model risk management and AI governance policy, need to be applied at the modeling project level. Either approach can be used, commensurate with the model risk and posture of the enterprise.
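As a rough illustration, the two criteria above can be expressed as a simple triage rule. The sketch below is an assumption-laden reading of this post; in particular, the split between “medium” and “high” is illustrative, not a regulatory taxonomy.

```python
# Illustrative triage of a generative AI use case into a risk tier based on the
# two criteria above. The medium/high split is an assumption for illustration,
# not a regulatory taxonomy.
from dataclasses import dataclass

@dataclass
class UseCase:
    name: str
    human_in_the_loop: bool            # is a person always in charge during use?
    fine_tuned_on_internal_data: bool  # is the model adjusted from its purchased form?

def risk_tier(uc: UseCase) -> str:
    if uc.fine_tuned_on_internal_data:
        # Adjusted from its purchased form: govern it as if you built it yourself.
        return "high"
    if not uc.human_in_the_loop:
        # Automates a business process without human oversight.
        return "medium"
    return "low"

cases = [
    UseCase("grammar checking for internal docs", True, False),
    UseCase("auto-routing customer inquiries", False, False),
    UseCase("assistant fine-tuned on internal policy data", True, True),
]
for uc in cases:
    print(f"{uc.name}: {risk_tier(uc)}")
```

However your organization draws the tiers, the point is that the answer should fall out of the use case's attributes, not out of who is asking.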

Paradigm shift: When the model you buy turns into the model you build

The medium and high-risk scenarios in our earlier example likely started as low-risk experiments. But then the urge for customization and proprietary outcomes shifts the paradigm: a low-risk, generalized foundation model used as-is is modified to become a discrete, purpose-built modeling system. Traditional modeling best practices and controls are now required to validate the model and mitigate risk.

For lack of a better analogy, you’ve shifted from a ‘no code’ environment to a ‘code’ environment, i.e., from using software as purchased to building pieces of software yourself with a programming language. It’s important to work with a technology partner who knows how to manage and govern models of many types, and who can help you recognize when your strategy shifts your modeling paradigm from low to high-risk.

From GLM to LLM, get to know a full-lifecycle approach to modeling system governance.

Conclusion

To summarize, foundation models are large AI systems built to generalize across many use cases. Given their wide applicability and pervasiveness, how to govern these systems is a top-of-mind discussion point for many companies. When buying a foundation model, it is important to think about IT, security, data, and AI-specific model governance so the company can make sure the foundation model purchase meets its needs. Additionally, building a process for inventorying productivity-enhancement use cases, as well as model development best practices for fine-tuned modeling systems built on top of the foundation model, is critical to a well-executed AI Governance program.

Monitaur can help you get where you want to be with our define, manage, and automate journey. Through our software and expertise, we make sure your AI Governance program complies with the NAIC AI Model Bulletin, the NIST AI RMF, and other standards. Reach out today to learn more.


