LEADTOOLS Forms Processing API Tutorial: Recognizing and Processing Forms


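LEADTOOLS Recognition Imaging SDK is a curated set of LEADTOOLS SDK features for building end-to-end document imaging applications within enterprise-level document automation solutions that require OCR, MICR, OMR, barcode, forms recognition and processing, PDF, print capture, archival, annotation, and image viewing functionality. This toolset uses LEAD's award-winning image processing technology to intelligently identify document features that can be used to recognize and extract data from any type of scanned or faxed form image.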


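A state-of-the-art forms processing API can automate data entry. Whether you are handling customer surveys, tax documents, or billing records, every industry uses forms daily to conduct business, and moving data from paper to a digital medium can be time-consuming. LEADTOOLS has therefore developed proprietary features that extract text from images containing any combination of machine-printed text, handwritten text, MICR, MRZ, and OMR fields. LEADTOOLS automatically detects and recognizes all of them. Below are the main steps for processing various form types quickly and accurately, regardless of how the data is formatted.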


First, we need to initialize the forms engines, which do all the hard work of reading and recognizing the data:

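// Static fields shared by all of the methods in this tutorial.
static RasterCodecs codecs;
static FormRecognitionEngine recognitionEngine;
static FormProcessingEngine processingEngine;
static IOcrEngine formsOCREngine;

static void InitFormsEngines()
{
    Console.WriteLine("Initializing Engines");
    codecs = new RasterCodecs();
    recognitionEngine = new FormRecognitionEngine();
    processingEngine = new FormProcessingEngine();

    // Start the LEAD OCR engine from its runtime folder.
    formsOCREngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD);
    formsOCREngine.Startup(codecs, null, null, @"C:\LEADTOOLS21\Bin\Common\OcrLEADRuntime");

    // Register an OCR objects manager with the recognition engine.
    OcrObjectsManager ocrObjectsManager = new OcrObjectsManager(formsOCREngine);
    ocrObjectsManager.Engine = formsOCREngine;
    recognitionEngine.ObjectsManagers.Add(ocrObjectsManager);
    Console.WriteLine("Engines initialized successfully");
}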
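Engine startup depends on the OCR runtime folder actually existing at the path passed to Startup. As a minimal sketch, a check like the one below could run before InitFormsEngines so that a wrong install path fails with a clear message. RequireDirectory is a hypothetical helper written for this illustration; it is not part of LEADTOOLS.

```csharp
using System;
using System.IO;

static class StartupChecks
{
    // Hypothetical helper (not a LEADTOOLS API): returns the full path when
    // the directory exists, otherwise throws with a descriptive message.
    public static string RequireDirectory(string path, string purpose)
    {
        if (!Directory.Exists(path))
            throw new DirectoryNotFoundException($"{purpose} directory not found: {path}");
        return Path.GetFullPath(path);
    }
}
```

For example, the value returned by StartupChecks.RequireDirectory(@"C:\LEADTOOLS21\Bin\Common\OcrLEADRuntime", "OCR runtime") could be passed as the last argument to Startup.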

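Forms recognition requires a master form and a filled form. A master form contains blank fields and serves as the template that defines the field regions. A filled form is a form whose fields contain data.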

The next step is to specify the master forms:

private static void CreateMasterFormAttributes()
{
    Console.WriteLine("Processing Master Form");
    string[] masterFileNames = Directory.GetFiles(@"C:\LEADTOOLS21\Resources\Images\Forms\MasterForm Sets\OCR", "*.tif", SearchOption.AllDirectories);
    foreach (string masterFileName in masterFileNames)
    {
        string formName = Path.GetFileNameWithoutExtension(masterFileName);
        // Load all pages of the master form image (first page 1, last page -1).
        using (RasterImage image = codecs.Load(masterFileName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
        {
            FormRecognitionAttributes masterFormAttributes = recognitionEngine.CreateMasterForm(formName, Guid.Empty, null);
            for (int i = 0; i < image.PageCount; i++)
            {
                image.Page = i + 1;
                recognitionEngine.AddMasterFormPage(masterFormAttributes, image, null);
            }
            recognitionEngine.CloseMasterForm(masterFormAttributes);
            // Save the attributes so the recognition step can reload them.
            File.WriteAllBytes(formName + ".bin", masterFormAttributes.GetData());
        }
    }
    Console.WriteLine("Master Form Processing Complete");
    Console.WriteLine("=============================================================");
}
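The code in this tutorial relies on a file-naming convention: each master image yields a form name, the recognition attributes are saved as that name with a .bin extension, and the recognition step looks for a fields file with the same name and an .xml extension in the master form folder. The sketch below makes that mapping explicit. MasterFormNaming and its For method are hypothetical names introduced for illustration, not LEADTOOLS APIs.

```csharp
using System;
using System.IO;

static class MasterFormNaming
{
    // Hypothetical helper: derives the companion file names this tutorial
    // uses for one master form image.
    //   masterImagePath: path to the master .tif image
    //   masterRoot:      folder that holds the field definition .xml files
    public static (string FormName, string AttributesFile, string FieldsFile) For(
        string masterImagePath, string masterRoot)
    {
        string formName = Path.GetFileNameWithoutExtension(masterImagePath);
        string attributesFile = formName + ".bin";                      // written relative to the working directory
        string fieldsFile = Path.Combine(masterRoot, formName + ".xml"); // read from the master form folder
        return (formName, attributesFile, fieldsFile);
    }
}
```

Because the form name is derived from the image file name alone, two master images with the same name in different subfolders would map to the same .bin file, which is worth keeping in mind given that the search uses SearchOption.AllDirectories.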

Finally, we are ready to read the filled form:

private static void RecognizeForm()
{
    Console.WriteLine("Recognizing Form\n");
    // The .bin files are looked up next to the executable; this assumes the
    // working directory used when they were written is that same folder.
    var GetProjectDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
    string formToRecognize = @"C:\LEADTOOLS21\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif";
    using (RasterImage image = codecs.Load(formToRecognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
    {
        // Build recognition attributes for every page of the filled form.
        FormRecognitionAttributes filledFormAttributes = recognitionEngine.CreateForm(null);
        for (int i = 0; i < image.PageCount; i++)
        {
            image.Page = i + 1;
            recognitionEngine.AddFormPage(filledFormAttributes, image, null);
        }
        recognitionEngine.CloseForm(filledFormAttributes);

        string resultMessage = "The form could not be recognized";
        string[] masterFileNames = Directory.GetFiles(GetProjectDirectory, "*.bin");
        foreach (string masterFileName in masterFileNames)
        {
            // Load the field definitions that belong to this master form.
            string fieldsfName = Path.GetFileNameWithoutExtension(masterFileName) + ".xml";
            string fieldsfullPath = Path.Combine(@"C:\LEADTOOLS21\Resources\Images\Forms\MasterForm Sets\OCR", fieldsfName);
            processingEngine.LoadFields(fieldsfullPath);

            FormRecognitionAttributes masterFormAttributes = new FormRecognitionAttributes();
            masterFormAttributes.SetData(File.ReadAllBytes(masterFileName));
            FormRecognitionResult recognitionResult = recognitionEngine.CompareForm(masterFormAttributes, filledFormAttributes, null);

            // Accept the first master form that matches with at least 80% confidence.
            if (recognitionResult.Confidence >= 80)
            {
                List<PageAlignment> alignment = new List<PageAlignment>();
                for (int k = 0; k < recognitionResult.PageResults.Count; k++)
                    alignment.Add(recognitionResult.PageResults[k].Alignment);
                resultMessage = $"This form has been recognized as a {Path.GetFileNameWithoutExtension(masterFileName)}";
                ProcessForm(image, alignment);
                break;
            }
        }
        Console.WriteLine(resultMessage);
        Console.WriteLine("=============================================================\n");
    }
}

private static void ProcessForm(RasterImage image, List<PageAlignment> alignment)
{
    processingEngine.OcrEngine = formsOCREngine;
    string resultsMessage = string.Empty;
    processingEngine.Process(image, alignment);

    // Collect "name = text" for every field; non-text results yield an empty value.
    foreach (FormPage formPage in processingEngine.Pages)
        foreach (FormField field in formPage)
            if (field != null)
                resultsMessage = $"{resultsMessage}{field.Name} = {(field.Result as TextFormFieldResult)?.Text}\n";

    if (string.IsNullOrEmpty(resultsMessage))
        Console.WriteLine("No fields were processed");
    else
        Console.WriteLine(resultsMessage);
}
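The recognition loop above accepts the first master form whose confidence reaches 80. When several master forms look alike, one possible alternative is to compare against all of them and keep the highest-scoring match. The sketch below shows only that selection logic on plain (name, confidence) pairs. MatchSelection and PickBest are hypothetical names, and the confidence values would come from the CompareForm results in the tutorial code.

```csharp
using System;
using System.Collections.Generic;

static class MatchSelection
{
    // Returns the name of the highest-confidence candidate at or above the
    // threshold, or null when no candidate qualifies. On a tie, the earlier
    // candidate wins.
    public static string PickBest(
        IEnumerable<(string Name, int Confidence)> candidates, int threshold = 80)
    {
        string bestName = null;
        int bestConfidence = threshold - 1; // anything >= threshold beats this
        foreach (var (name, confidence) in candidates)
        {
            if (confidence > bestConfidence)
            {
                bestName = name;
                bestConfidence = confidence;
            }
        }
        return bestName;
    }
}
```

The trade-off is that every master form must be compared before processing starts, whereas the first-match loop can stop early.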

That is all it takes to extract data from a filled form. For a deeper look, see the tutorial on how to recognize and process forms.

Free evaluation!
Download the LEADTOOLS SDK for free directly from our website. The trial is valid for 60 days and includes email support.


Source: 慧都
