Data Extraction Setup
Discovery AI uses predefined mappings to extract provisions from uploaded documents. To enable this process, you must define the types of documents the system will process, such as master service agreement (MSA), statement of work (SOW), or non-disclosure agreement (NDA). For each document type, you must set up the data extraction mapping that Discovery AI uses to identify and extract provisions. Data extraction setup is the process of mapping built-in and custom provisions to the CLM schema so that extracted information, such as fields, clauses, tables, and obligations, can be published on the contract record.
Setting Up a New Data Extraction Mapping
If your organization uses custom provision models and has enabled X-Author on Salesforce, a data sync containing standard provisions and seed data (ID: 10ce17c4-ee2e-4351-8d1a-27c4e9e7877a) is already present for onboarding. In such cases, make sure the custom provision models are correctly mapped and included above the data sync configuration.
Editing a Data Extraction Mapping
Creating a Worksheet
Mapping Fields
- You have created, named, and saved a data extraction worksheet.
- You have Data Extraction Setup open to a mapping with the WORKSHEETS tab selected.
Mapping Clauses
- You have created, named, and saved a data extraction worksheet.
- You have Data Extraction Setup open to a mapping with the WORKSHEETS tab selected.
Mapping Tables
- You have created, named, and saved a data extraction worksheet.
- You have Data Extraction Setup open to a mapping with the WORKSHEETS tab selected.
- Table line items are defined in the CLM database.
- Select the Tables tab, then click NEW to open the corresponding mapping pop-up.
- Select a table provision line item from the preconfigured choices available in Conga CLM.
- Choose the extraction method.
- Click Save.
- Repeat these steps until all line items are mapped.
Mapping Insights
- You have created, named, and saved a data extraction worksheet.
- You have Data Extraction Setup open to a mapping with the WORKSHEETS tab selected.
- You have defined one or more risk scales as described in Risk Definition.
Insights are AI-driven evaluations that interpret extracted contract data to identify what is important, risky, or actionable within an agreement. They transform extracted provisions into meaningful intelligence, helping users quickly understand contract implications and make informed decisions. Powered by Conga's Discovery AI, Insights analyze contract language, terms, and clause structures, even when phrasing varies across third-party or legacy contracts. Using predefined standards, risk scales, and mapping logic, the AI assesses provisions, assigns risk scores, and provides guidance to mitigate potential exposure.
For example, if a company standard specifies a payment term of Net 30 days but the extracted contract includes Net 60 days, the AI identifies the deviation, flags it as high financial risk (if the defined threshold is over Net 45), and may recommend negotiating the term to Net 45 or less.
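The payment term example can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not how Discovery AI evaluates insights internally. The names `PaymentTermRule` and `assess_payment_term` are hypothetical, and the "unrated" outcome for smaller deviations is an assumption, because the example only describes the high-risk case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentTermRule:
    standard_days: int = 30        # company standard from the example (Net 30)
    high_risk_over_days: int = 45  # high-risk threshold from the example (over Net 45)


def assess_payment_term(extracted_days: int,
                        rule: PaymentTermRule = PaymentTermRule()) -> dict:
    """Compare an extracted payment term with the standard and rate the deviation."""
    deviates = extracted_days != rule.standard_days
    if extracted_days > rule.high_risk_over_days:
        risk = "high"
        recommendation = f"Negotiate the term to Net {rule.high_risk_over_days} or less"
    elif deviates:
        # Assumption: the example does not say how smaller deviations are rated.
        risk = "unrated"
        recommendation = "Review against the applicable risk scale"
    else:
        risk = "none"
        recommendation = None
    return {"deviates": deviates, "risk": risk, "recommendation": recommendation}


# Net 60 extracted against a Net 30 standard with a Net 45 threshold.
result = assess_payment_term(60)
print(result["risk"], "-", result["recommendation"])
```

In the product, the standard and the threshold would come from the risk scales and insight mappings you configure, not from hard-coded defaults.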
Mapping Obligations
- You have created, named, and saved a data extraction worksheet.
- You have Data Extraction Setup open to a mapping with the WORKSHEETS tab selected.
Thresholds and Best Practices
When configuring and using the Discovery AI tool, keep the following thresholds and best practices in mind.
Uploads
| Upload Property | Limit / Notes |
|---|---|
| Number of documents for OCR | 2,000 per day per org/instance (for all users) |
| Document size | 50 MB per document |
| Number of pages | 200 pages per document |
| Files in bulk upload | No more than 100 files |
| Files for extraction per day | 1,000 files |
| Supported file types | DOCX, PDF |
| Upload sources | |
| Folder/bulk upload requirement | All files must be of the same document type (MSA, NDA, SOW etc.) |
| Documents imported from Conga CLM | 200 documents |
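If you prepare bulk uploads with a script, a local check against these limits can catch problems before files reach Discovery AI. The following is a minimal sketch under stated assumptions: `UploadCandidate` and `check_bulk_upload` are hypothetical names, the page count and document type are supplied by the caller, and 50 MB is treated as 50 × 1024 × 1024 bytes. It does not call any Conga API and does not track the per-day limits.

```python
import os
from dataclasses import dataclass

MAX_SIZE_BYTES = 50 * 1024 * 1024   # 50 MB per document (binary megabytes assumed)
MAX_PAGES = 200                      # pages per document
MAX_BULK_FILES = 100                 # files in one bulk upload
SUPPORTED_EXTENSIONS = {".docx", ".pdf"}


@dataclass
class UploadCandidate:
    name: str        # file name including extension
    size_bytes: int  # file size in bytes
    pages: int       # page count, determined by the caller
    doc_type: str    # for example "MSA", "NDA", or "SOW"


def check_bulk_upload(files):
    """Return a list of human-readable problems; an empty list means the batch passes."""
    problems = []
    if len(files) > MAX_BULK_FILES:
        problems.append(f"batch has {len(files)} files; limit is {MAX_BULK_FILES}")
    if len({f.doc_type for f in files}) > 1:
        problems.append("batch mixes document types")
    for f in files:
        ext = os.path.splitext(f.name)[1].lower()
        if ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"{f.name}: unsupported file type")
        if f.size_bytes > MAX_SIZE_BYTES:
            problems.append(f"{f.name}: larger than 50 MB")
        if f.pages > MAX_PAGES:
            problems.append(f"{f.name}: more than {MAX_PAGES} pages")
    return problems


batch = [
    UploadCandidate("vendor-nda.pdf", 2_000_000, 12, "NDA"),
    UploadCandidate("partner-nda.docx", 900_000, 8, "NDA"),
]
print(check_bulk_upload(batch))
```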
Limitations
| Item | Threshold / Limit |
|---|---|
| Fields or clauses per document | 50 |
| Tables per worksheet | 10 |
| Columns per table | 10 |
| Fields or clauses per mapping | 70 |
| Obligations per mapping | 10 |
| Fields per obligation | 5 |
| Criteria per insight | 5 |
| Questions per insight mapping | 10 |
| Forms layout | Not supported |
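When planning a mapping, it can help to compare the counts you intend to configure with the limits in this table. The sketch below is a hypothetical planning aid, not part of Discovery AI: the key names in `LIMITS` and the `over_limit` function are invented for illustration, and only the numeric limits come from the table.

```python
# Numeric limits copied from the table; key names are illustrative only.
LIMITS = {
    "fields_or_clauses_per_mapping": 70,
    "obligations_per_mapping": 10,
    "fields_per_obligation": 5,
    "tables_per_worksheet": 10,
    "columns_per_table": 10,
    "criteria_per_insight": 5,
    "questions_per_insight_mapping": 10,
}


def over_limit(counts):
    """Return {key: (count, limit)} for every planned count that exceeds its limit.

    Keys that are not in LIMITS are ignored.
    """
    return {
        key: (count, LIMITS[key])
        for key, count in counts.items()
        if key in LIMITS and count > LIMITS[key]
    }


planned = {"columns_per_table": 12, "obligations_per_mapping": 10}
print(over_limit(planned))
```

A count equal to the limit is treated as acceptable here, which assumes the limits are inclusive.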
Best Practices
| Category | Best Practice |
|---|---|
| Document quality | |
| Table scan quality | Ensure the table is clearly visible and not distorted. |
| Layout requirements | Table should have a well-defined layout with clear headers and visible borders. |
| Header guidelines | |
