A four-stage process for responsible AI practices:
1. Identify potential harms that are relevant to your planned solution.
2. Measure the presence of these harms in the outputs generated by your solution.
3. Mitigate the harms at multiple layers in your solution to minimize their presence and impact, and ensure transparent communication about potential risks to users.
4. Operate the solution responsibly by defining and following a deployment and operational readiness plan.
Stages of Identifying Potential Harms
1. Identify potential harms
2. Prioritize identified harms
3. Test and verify the prioritized harms
4. Document and share the verified harms (a simple harm-register sketch follows this list).
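A minimal harm-register sketch in Python, assuming no particular tooling: the structure and field names are hypothetical, chosen only to show how identified harms can be prioritized, verified through testing, and documented for sharing.

    from dataclasses import dataclass, field

    @dataclass
    class Harm:
        description: str          # e.g. "Generates plausible but incorrect advice"
        priority: int             # 1 = highest; assigned during prioritization
        verified: bool = False    # set True once testing confirms the harm
        test_notes: list[str] = field(default_factory=list)

    register = [
        Harm("Generates plausible but incorrect legal advice", priority=1),
        Harm("Reproduces offensive language from user prompts", priority=2),
    ]

    # After testing, record outcomes so they can be documented and shared.
    register[0].verified = True
    register[0].test_notes.append("Reproduced with 3 of 20 probe prompts")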
Red Teaming
A strategy often used to find security vulnerabilities or other weaknesses that can compromise the integrity of a software solution.
In "red team" testing, a team of testers deliberately probes the solution for weaknesses and attempts to produce harmful results.
Steps for Measuring Potential Harms
1. Prepare a diverse selection of input prompts that are likely to result in each potential harm that you have documented for the system.
2. Submit the prompts to the system and retrieve the generated output.
3. Apply pre-defined criteria to evaluate the output and categorize it according to the level of potential harm it contains. Regardless of the categories you define, you must determine strict criteria that can be applied to the output in order to categorize it; a minimal measurement loop is sketched below.
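A minimal sketch of this measurement loop, assuming a generate() function for your solution and a classify() stand-in for your pre-defined criteria (which could be manual review against a rubric or an automated classifier):

    SEVERITY_LEVELS = ("safe", "low", "medium", "high")

    def classify(output: str) -> str:
        # Apply your strict, pre-defined criteria here; "safe" is a placeholder.
        return "safe"

    def measure(prompts, generate):
        results = {level: 0 for level in SEVERITY_LEVELS}
        for prompt in prompts:
            output = generate(prompt)          # step 2: submit and retrieve
            results[classify(output)] += 1     # step 3: categorize by harm level
        return results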
Layers for Mitigating Potential Harms
1. Model
2. Safety System
3. Application
4. Positioning
Ways to Mitigate - the Model layer
1. Selecting a model that is appropriate for the intended solution use.
2. Fine-tuning a foundational model with your own training data so that the responses it generates are more likely to be relevant and scoped to your solution scenario (a training-data sketch follows this list).
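A sketch of preparing fine-tuning data in the JSONL chat format used by OpenAI and Azure OpenAI fine-tuning, with one training example per line; the example content and product names are illustrative only.

    import json

    examples = [
        {"messages": [
            {"role": "system", "content": "You answer questions about Contoso products only."},
            {"role": "user", "content": "What warranty comes with the X100?"},
            {"role": "assistant", "content": "The X100 includes a two-year limited warranty."},
        ]},
    ]

    # Write one JSON object per line, as the fine-tuning API expects.
    with open("training_data.jsonl", "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")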
Ways to Mitigate - the Safety System layer
1. Content filters - Azure OpenAI Service includes support for content filters that apply criteria to suppress prompts and responses based on the classification of content into four severity levels (safe, low, medium, and high) for four categories of potential harm (hate, sexual, violence, and self-harm). A sketch of reading these filter annotations follows this list.
2. Abuse detection algorithms
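A sketch of reading the content filter annotations that Azure OpenAI returns on a chat completions response, shown as a raw REST call; the endpoint, key, deployment name, and API version are placeholders for your own resource.

    import os, requests

    endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]  # e.g. https://<resource>.openai.azure.com
    key = os.environ["AZURE_OPENAI_KEY"]
    url = f"{endpoint}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01"

    resp = requests.post(
        url,
        headers={"api-key": key},
        json={"messages": [{"role": "user", "content": "Hello"}]},
    ).json()

    # Each choice carries per-category filter results (hate, sexual, violence,
    # self_harm), each with a severity of safe/low/medium/high.
    for choice in resp.get("choices", []):
        for category, result in choice.get("content_filter_results", {}).items():
            print(category, result.get("severity"), "filtered:", result.get("filtered"))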
Ways to Mitigate - the Application layer
1. Designing the application user experience (UX) to constrain inputs to specific subjects or types, or applying input and output validation.
2. Specifying metaprompts or system inputs that define behavioral parameters for the model (see the sketch after this list).
3. Applying prompt engineering techniques to add context to input prompts, maximizing the likelihood of a relevant, nonharmful output.
4. Citing sources of information in the generated output (or explicitly noting that no citation has been provided).
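An application-layer sketch combining simple input validation with a system message (metaprompt) that scopes the model's behavior, using the openai Python SDK's AzureOpenAI client; the endpoint, key, API version, and deployment name are placeholders.

    import os
    from openai import AzureOpenAI

    SYSTEM_MESSAGE = (
        "You are a customer support assistant for product questions only. "
        "Politely decline requests outside that scope, and cite the source "
        "of any factual claim, or state that no source is available."
    )

    def validate(user_input: str) -> bool:
        # Constrain inputs before they reach the model (length, format, subject).
        return 0 < len(user_input) <= 500

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY"],
        api_version="2024-02-01",               # placeholder version
    )

    def answer(user_input: str) -> str:
        if not validate(user_input):
            return "Sorry, I can't help with that request."
        response = client.chat.completions.create(
            model="<your-deployment-name>",     # placeholder deployment
            messages=[
                {"role": "system", "content": SYSTEM_MESSAGE},
                {"role": "user", "content": user_input},
            ],
        )
        return response.choices[0].message.content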
The Model layer
The model layer consists of the generative AI model(s) at the heart of your solution.
The Safety System layer
The safety system layer includes platform-level configurations and capabilities that help mitigate harm.
The Application layer
The application layer is the software application through which users interact with the generative AI model.
The Positioning layer
The positioning layer includes any documentation or other user collateral that describes the use of the solution to its users and stakeholders.
Ways to Mitigate - the Positioning layer
1. The AI solution should be appropriately transparent about the capabilities and limitations of the system.
2. It should disclose any potential harms that may not always be addressed by the mitigation measures you have put in place.
Common Compliance Reviews
Legal, Privacy, Security, Accessibility
Phased Delivery Plan
Enables you to release the solution initially to a restricted group of users, so you can gather feedback and identify problems before releasing it to a wider audience.
Incident Response Plan
Includes estimates of the time required to respond to unanticipated incidents.
Rollback Plan
Defines the steps to revert the solution to a previous state in the event of an incident.