Custom Provisions
Provisions are available from built-in libraries. Syncing the Provision Library to the publisher (AI provider) makes these provisions available for data extraction mapping. If your contract's provisions do not conform to any of these pre-configured provision types, however, you can name, build, and populate your own custom provision type and train the AI to recognize it.
To add a custom provision
Deleting Custom Provisions
Training
Each provision must be trained on a body of documents to reach a desirable level of accuracy. This training is performed by users working through a conventional review process.
The Force Training feature shortens the training period to a configurable number of annotations, reducing training overhead.
When the total number of annotations for a provision reaches 30, training begins automatically. You can also trigger training before 30 annotations are entered by force-training the AI on a limited set of annotations.
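Conceptually, the training trigger behaves like this minimal sketch (plain Python for illustration only; the names are hypothetical, not a product API):

    AUTO_TRAIN_THRESHOLD = 30  # annotations; training starts automatically here

    def should_train(annotation_count: int, force_training: bool = False) -> bool:
        # Force-training starts training on a smaller annotation set.
        return force_training or annotation_count >= AUTO_TRAIN_THRESHOLD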
Force-Training a Provision
Minimum Accuracy Threshold: Theory
It is important to have a working understanding of the accuracy and confidence scores, as these are the final criteria by which you will measure the AI's performance in recognizing your custom provision. Because the accuracy score is an F1 score, which combines precision and recall, it is worthwhile to have a basic understanding of these foundational concepts. Once a provision is trained enough to produce a meaningful statistical sample, you can fine-tune the results based on the AI model's confidence score.
Accuracy
Accuracy describes the ratio of true positive and true negative identifications to all samples. The ratio of true positives detected to all actual positives is called Recall. The ratio of true predicted positives to all (true and false) predicted positives is called Precision. Precision and Recall are combined to form the F1 Score.
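In standard terms, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$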
Precision
Precision is a measure of the AI's predictive correctness. For example, if the AI is asked to find apples in a mixed basket of fruit, a perfect precision score of 1.00 means every fruit it found was an apple. This does not indicate whether it found all the apples. Expressed mathematically:
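$$\text{Precision} = \frac{TP}{TP + FP}$$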
Recall
Recall is a measure of completeness. For example, if the AI is to find apples in a mixed basket of fruit, a perfect recall score of 1.00 means it found all the apples in the basket. This does not indicate it found only apples: it may have found some oranges too. Expressed mathematically:
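$$\text{Recall} = \frac{TP}{TP + FN}$$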
Categorizing legal concepts has more variation than picking fruit, especially when the provisions are reviewed by different legal professionals; therefore, recall and precision may differ among annotators (one person's recall of 0.90 may be someone else's 0.80). Remember this when using built-in provisions and reviewing annotations.
F1 Score
The F1 score is the harmonic mean of precision and recall. It provides a more sensitive measure of accuracy than precision or recall alone. Expressed mathematically:
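$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$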
The F1 score gives equal weight to precision and recall when measuring a system's overall accuracy. Unlike simple accuracy, it is unaffected by the number of true negatives, so it stays meaningful even when most samples do not contain the provision. The F1 score can provide high-level information about an AI model's output quality, enabling sensitive tuning to optimize both precision and recall.
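For concreteness, here is a minimal sketch in plain Python (illustrative only, not part of the product) that computes precision, recall, and the F1 score from reviewer-confirmed counts:

    def f1_metrics(tp: int, fp: int, fn: int) -> dict:
        """Compute precision, recall, and F1 from raw annotation counts.

        tp: provisions the AI found that reviewers confirmed (true positives)
        fp: provisions the AI found that reviewers rejected (false positives)
        fn: provisions reviewers annotated that the AI missed (false negatives)
        """
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        return {"precision": precision, "recall": recall, "f1": f1}

    # Example: 26 confirmed hits, 9 false alarms, 5 misses
    print(f1_metrics(tp=26, fp=9, fn=5))
    # -> precision 0.74, recall 0.84, F1 0.79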
When setting up a custom provision, you are asked to enter a desired minimum accuracy threshold. This threshold is an F1 score expressed as a percentage. Conga recommends setting it to 65, a value we have found best balances AI precision against trainer time.
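To see how the model's confidence score interacts with the minimum accuracy threshold, consider the following sketch (plain Python; the prediction list and cutoffs are hypothetical, not a product API). It sweeps confidence cutoffs and flags those whose F1 meets the 65 threshold:

    # Hypothetical reviewed output: (model confidence, reviewer confirmed?)
    predictions = [(0.95, True), (0.90, True), (0.80, False), (0.75, True),
                   (0.60, True), (0.55, False), (0.40, False)]
    ACTUAL_POSITIVES = 5      # provisions the reviewers annotated in total
    MIN_ACCURACY = 0.65       # recommended threshold, as a fraction

    def f1_at_cutoff(cutoff: float) -> float:
        kept = [confirmed for conf, confirmed in predictions if conf >= cutoff]
        tp = sum(kept)                    # confirmed predictions kept
        fp = len(kept) - tp               # rejected predictions kept
        precision = tp / (tp + fp) if kept else 0.0
        recall = tp / ACTUAL_POSITIVES
        return (2 * precision * recall / (precision + recall)
                if (precision + recall) else 0.0)

    for cutoff in (0.9, 0.8, 0.7, 0.6, 0.5):
        f1 = f1_at_cutoff(cutoff)
        flag = "  <- meets threshold" if f1 >= MIN_ACCURACY else ""
        print(f"cutoff {cutoff:.1f}: F1 = {f1:.2f}{flag}")

Raising the cutoff tends to improve precision at the expense of recall; the F1 score shows where the trade-off still satisfies the minimum accuracy threshold.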