Why is Azure AI Document Intelligence Custom Classifier labeling a blank page with some numbers on the top as a "Prescription" with high confidence?

Question

Estacio, Pedro Vasconcelos 25

[Image: "Outros-Prescrição" (Others-Prescription)]

This image is classified by the model as a "Prescription". I reviewed all my training data and can't find a reason why this image would be classified as such. I even have a category called "Others" that contains images similar to this one: blank pages with some numbers at the top right.

Is there something I can do to prevent this classification error or this high confidence value?

Manas Mohanty 5,275 Reputation points Microsoft External Staff Moderator

2025-06-19T22:45:55.56+00:00

Hey Estacio, Pedro Vasconcelos

Please let us know whether the issue was resolved by adding more blank pages (the images with the wrong prediction) to the "Others" category or by tuning the confidence threshold.

Thank you.
Manas Mohanty 5,275 Reputation points Microsoft External Staff Moderator

2025-06-20T17:32:26.43+00:00

Hey Estacio, Pedro Vasconcelos

We have not heard back from you. We hope our input was useful.

Please take a few moments to let us know whether it helped address your misclassification issue.

Thank you.

Answer 1

Hey Estacio, Pedro Vasconcelos

It sounds like the Custom Classifier is misclassifying a blank page as a "Prescription" with high confidence.

Here are a few things you can do to troubleshoot this problem:

Review Training Data: Double-check that none of the samples labeled "Prescription" resemble this page (for example, near-blank pages or pages with only a few numbers at the top). Adding more explicit examples of blank or low-content pages to "Others" can help the model learn to classify them correctly.
Adjust Classification Fields: Make sure that each class is clearly defined, especially for ambiguous cases. Refining the descriptions of your categories can help minimize misclassification.
Set Confidence Thresholds: Route documents whose confidence score falls below a chosen threshold to manual review. Because this page is misclassified with a particularly high score, a low threshold will not flag it; the review threshold for "Prescription" would have to sit above the score this page receives for the error to be caught.
Use Human Review: Keeping a human in the loop for classes that are known to be misclassified with high confidence helps ensure that corrections are made before the output is used. If misclassifications keep appearing, you can retrain incrementally with the corrected samples.
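
To make the threshold and human-review suggestions concrete, here is a minimal sketch of the routing logic in plain Python. It does not call the Document Intelligence SDK: the `PagePrediction` type, the per-class threshold values, and the "Invoice" class mentioned in the note below are all hypothetical and only stand in for whatever your classifier returns. Treat it as one possible approach under those assumptions, not as the service's own mechanism.

```python
from dataclasses import dataclass


@dataclass
class PagePrediction:
    """One classifier result; the field names are illustrative only."""
    page: int
    doc_type: str
    confidence: float


# Hypothetical per-class review thresholds; tune them on held-out samples.
REVIEW_THRESHOLDS = {"Prescription": 0.95, "Others": 0.60}
DEFAULT_THRESHOLD = 0.80  # assumed fallback for classes not listed above


def needs_review(pred: PagePrediction) -> bool:
    """Return True when the prediction should go to a human reviewer."""
    threshold = REVIEW_THRESHOLDS.get(pred.doc_type, DEFAULT_THRESHOLD)
    return pred.confidence < threshold


def route(predictions):
    """Split predictions into (auto_accepted, review_queue)."""
    accepted, review = [], []
    for pred in predictions:
        (review if needs_review(pred) else accepted).append(pred)
    return accepted, review


if __name__ == "__main__":
    sample = [
        PagePrediction(page=1, doc_type="Prescription", confidence=0.99),
        PagePrediction(page=2, doc_type="Prescription", confidence=0.91),
        PagePrediction(page=3, doc_type="Others", confidence=0.72),
    ]
    accepted, review = route(sample)
    print("accepted pages:", [p.page for p in accepted])
    print("review pages:", [p.page for p in review])
```

Note that page 1 in the sample (0.99) is still auto-accepted: a threshold only catches errors that score below it, so a very confident misclassification also needs the training-data fix. A class without its own entry, such as a hypothetical "Invoice", falls back to `DEFAULT_THRESHOLD`.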

Reference: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/train/custom-classifier?view=doc-intel-4.0.0

If you're still having issues after trying these steps, some additional information would help. Here are a few follow-up questions you might want to answer:

What confidence score is being assigned to the classification of the blank page?
Can you share more details on how your training data is structured and the types of documents included?
Have you defined clear and specific categories for all potential classifications?

Thank you
