Custom Provisions
Provisions are available from built-in libraries. Syncing the Provision Library to the publisher (AI provider) makes these provisions available for data extraction mapping. However, if your contract's provisions do not conform to any of these pre-configured provision types, you can name, build, populate, and train the AI on your own custom provision type.
Before You Begin
Custom provisions are a powerful approach to extracting data from contracts and supporting documents. Getting the most out of this feature requires familiarizing yourself with the underlying objects that Discovery AI can extract. All provisions must have a unique name and a defined data type. Available data types are: Date, Organization, Currency, Duration, Number, Percent, Picklist, Short Text, Multi Picklist, Text, Table, and Obligation. These describe CLM data fields to which Discovery AI must assign passages extracted from contracts and supporting documents. Your ability to define custom provisions depends on these data types being defined clearly in CLM.
To add a custom provision
Successive generations of Discovery AI have progressively reduced the resources required to train an AI to recognize and extract values from documents. Generative AI reduces the training requirement to zero. The administrator configures the generative AI search with two examples. Once the examples are added and the provision is saved and mapped to a worksheet, no additional user training input is expected: the provision extracts from imported documents automatically.
There is a known issue with custom provisions that use generative AI. When a user reviews clauses from an extracted document, the on-screen highlight may not match the information that the AI-generated provision extracted, even when the extraction itself is correct. This issue affects only clauses, not fields. It is scheduled for repair.
Before adding custom provisions with generative AI, alert users to this issue.
Your new custom provision is available in the Provision Library and is active by default.
When you activate the provision, it becomes available for inclusion in worksheets. Once the new provision is introduced to an operating worksheet, users can see the provision and train the AI on it.
There is no manual training flow. If the AI does not produce the desired accuracy, add more examples to the provision as described above.
Training
When the total count of annotations reaches 30 for a provision, training automatically begins. You can also trigger training for a provision before 30 annotations are entered by force-training the AI on a limited set of annotations.
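The trigger rule above can be pictured as a small decision function. This is only an illustrative sketch: the function name, the `force` flag, and the assumption that force-training needs at least one annotation are hypothetical and are not part of the product's API.

```python
# Annotation count at which training starts automatically (from the text above).
AUTO_TRAIN_ANNOTATION_COUNT = 30


def should_start_training(annotation_count: int, force: bool = False) -> bool:
    """Hypothetical helper: decide whether training would start for a provision."""
    if annotation_count < 0:
        raise ValueError("annotation_count cannot be negative")
    if force:
        # Assumption: force-training needs at least one annotation to learn from.
        return annotation_count > 0
    return annotation_count >= AUTO_TRAIN_ANNOTATION_COUNT


print(should_start_training(12))              # below 30, not forced
print(should_start_training(12, force=True))  # below 30, force-trained
print(should_start_training(30))              # automatic trigger
```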
To manage training tasks
Minimum Accuracy Threshold: Theory
It is important to have a working understanding of the accuracy and confidence scores, as these are the final criteria by which you will measure the AI's performance in recognizing your custom provision. As the F1 score is calculated from precision and recall, it is worthwhile to have a basic understanding of these foundational concepts. Once a provision is trained enough to produce a meaningful statistical sample, you can fine-tune the results based on the AI model's confidence score.
Accuracy
Accuracy describes the ratio of correct identifications (true positives and true negatives) to all samples. The ratio of true positives detected to all actual positives is called Recall. The ratio of true predicted positives to all (true and false) predicted positives is called Precision. Precision and recall are combined to form an F1 Score.
Precision
Precision is a measure of the AI's predictive correctness. For example, if the AI is asked to find apples in a mixed basket of fruit, a perfect precision score of 1.00 means every fruit it found was an apple. This does not indicate whether it found all the apples. Expressed mathematically:
precision = true positives / (true positives + false positives)
Recall
Recall is a measure of completeness. For example, if the AI is to find apples in a mixed basket of fruit, a perfect recall score of 1.00 means it found all the apples in the basket. This does not indicate it found only apples: it may have found some oranges too. Expressed mathematically:
recall = true positives / (true positives + false negatives)
Categorizing legal concepts has more variation than picking fruit, especially when the provisions are reviewed by different legal professionals; therefore, recall and precision may differ among annotators (one person's recall of 0.90 may be someone else's 0.80). Remember this when using built-in provisions and reviewing annotations.
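As a worked illustration of the two formulas, the sketch below scores a made-up fruit basket. The counts and the function names are assumptions chosen for the example, not output from Discovery AI.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of the items the AI picked that were actually correct."""
    picked = true_positives + false_positives
    return true_positives / picked if picked else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of the items that should have been found that the AI did find."""
    actual = true_positives + false_negatives
    return true_positives / actual if actual else 0.0


# Hypothetical basket: 10 apples among other fruit.
# The AI picks 8 fruits: 6 are apples, 2 are oranges, and 4 apples are missed.
tp, fp, fn = 6, 2, 4

p = precision(tp, fp)  # 6 / 8 = 0.75
r = recall(tp, fn)     # 6 / 10 = 0.6
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example most of what the AI picked was correct (precision 0.75), but it still left 4 of the 10 apples in the basket (recall 0.60).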
F1 Score
The F1 score is the harmonic mean of precision and recall. It provides a more sensitive measure of accuracy than precision or recall alone. Expressed mathematically:
F1 = 2 * (precision * recall) / (precision + recall)
The F1 score gives equal weight to both precision and recall for measuring a system's overall accuracy. Unlike accuracy, it does not count true negatives, so the total number of observations is not important. The F1 score value can provide high-level information about an AI model's output quality, enabling sensitive tuning to optimize both precision and recall.
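The short sketch below shows how the harmonic mean behaves. The input values are invented for illustration, and the function name is an assumption rather than anything exposed by the product.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


balanced = f1_score(0.75, 0.75)  # equal inputs: F1 equals that value, 0.75
lopsided = f1_score(1.0, 0.5)    # 2 * 0.5 / 1.5, about 0.667

print(f"balanced={balanced:.3f} lopsided={lopsided:.3f}")
```

Both pairs have the same simple average (0.75), but the lopsided pair scores lower. A model cannot reach a high F1 score by excelling at precision alone or recall alone.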
When setting up custom provisions, you are asked to enter a desired minimum accuracy threshold. This is an F1 score, and Conga recommends you set this value to 65, which we have found optimally weights AI precision vs. trainer time.
Prediction Threshold
Also called a confidence threshold, this is presented in our app as the Acceptable Confidence Score. This value describes a confidence level above which information is accepted and below which it is rejected. If the threshold is set to 0, all responses exceed the threshold and are accepted. If the threshold is set to 1, then no response exceeds the threshold and all are rejected.
In general:
- Increasing the prediction (confidence) threshold lowers recall and improves precision (i.e., biases towards true positives, but throws some good results away).
- Decreasing the prediction threshold improves recall and lowers precision (i.e., biases towards including more hits, but with more false positives).
If you find your results have a lot of false positives (Discovery AI identifies incorrect passages as matching results), raise the prediction threshold setting. If you find the AI is missing too many entries, i.e., not detecting passages as matching, lower this setting.
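The trade-off described above can be seen in a small simulation. The confidence scores and correctness labels below are invented sample data, and `evaluate` is a hypothetical helper. It assumes a result is accepted only when its confidence is strictly above the threshold.

```python
# Each tuple is (confidence score, whether the extracted passage was actually correct).
# Hypothetical sample data for illustration only.
predictions = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.40, True), (0.30, False),
]


def evaluate(threshold: float) -> tuple[float, float]:
    """Return (precision, recall) when only results above the threshold are accepted."""
    accepted = [correct for confidence, correct in predictions if confidence > threshold]
    true_positives = sum(accepted)
    false_positives = len(accepted) - true_positives
    all_correct = sum(correct for _, correct in predictions)
    false_negatives = all_correct - true_positives
    precision = true_positives / len(accepted) if accepted else 0.0
    recall = true_positives / all_correct if all_correct else 0.0
    return precision, recall


for threshold in (0.0, 0.5, 0.85):
    p, r = evaluate(threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With this sample, raising the threshold from 0.5 to 0.85 lifts precision from 0.60 to 1.00 while recall falls from 0.75 to 0.50. Lowering it to 0 accepts everything, giving full recall with the most false positives.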
Deleting Custom Provisions
- Open the Provision Library and find the custom provision you want to delete.
- Check the box adjacent to each provision you want to delete.
- Click the Delete button.
- The Provision Library screen refreshes automatically. Review the Provision Library to verify that the provision or provisions are removed.
Configuring a Custom Provision with Table Extraction
You can use AI to extract tables from documents. This is especially useful when contracts present such tabular data as bills of materials or delivery schedules. Because table extraction requires generative AI, these extractions do not require training. Users can extract tables as soon as the table extraction is configured and mapped to a worksheet.
Each table must have a columnar structure, with each row containing values for the columns named in the heading. Discovery AI extracts these values in much the same way it handles fields and clauses, presenting this data in a column-defined line-item format.
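One way to picture a column-defined line-item format is a list of records keyed by the table's headings. The sketch below is illustrative only: the function name, the sample delivery schedule, and the record layout are assumptions, not the format Discovery AI actually returns.

```python
from typing import Dict, List


def to_line_items(headings: List[str], rows: List[List[str]]) -> List[Dict[str, str]]:
    """Pair each row's values with the column headings (hypothetical representation)."""
    items = []
    for row in rows:
        if len(row) != len(headings):
            # A row that does not match the heading breaks the columnar structure.
            raise ValueError(f"expected {len(headings)} values, got {len(row)}: {row}")
        items.append(dict(zip(headings, row)))
    return items


# Made-up delivery schedule for illustration.
headings = ["Item", "Quantity", "Delivery Date"]
rows = [
    ["Widget A", "100", "2024-01-15"],
    ["Widget B", "250", "2024-02-01"],
]

line_items = to_line_items(headings, rows)
for item in line_items:
    print(item)
```

The length check reflects the requirement above: a row whose values do not line up with the heading cannot be mapped to columns reliably.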
