This post shows how to create an OCR application in C# using Visual Studio 2013 and Puma.NET. Before creating the project, install Puma.NET on your computer. You can download Puma.NET from here.
Note: I suggest using Tesseract OCR instead of Puma.NET, since I find it more accurate and easier to use. If you would like to use Tesseract OCR, read this article about Optical Character Recognition in C# using Tesseract.
- Create a Windows Forms Application and add a Button and a RichTextBox to the form.
- Right-click the project and choose Add Reference.
- Select Browse and add Puma.Net.dll from the Assemblies folder under the installation path of Puma.NET.
- Click Build to build your project.
- Open the project's Debug folder. It should now contain puma.interop.dll and Puma.Net.dll.
- Copy dibapi.dll from C:\Program Files (x86)\Puma.NET\Assemblies to the Debug folder of your project. The Debug folder should then contain puma.interop.dll, Puma.Net.dll, and dibapi.dll.
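A missing DLL in the output folder is an easy mistake to make at this point, so it can help to check for the files at startup. The following is a minimal sketch, not part of Puma.NET: the class name `PumaDependencyCheck` and the method `FindMissing` are hypothetical, and the file list is simply the three names mentioned in the steps above.

```csharp
using System.Collections.Generic;
using System.IO;

public static class PumaDependencyCheck
{
    // File names taken from the steps above (an assumption for this sketch);
    // adjust the list if your Puma.NET installation ships different files.
    public static readonly string[] RequiredFiles =
    {
        "Puma.Net.dll",
        "puma.interop.dll",
        "dibapi.dll"
    };

    // Returns the names of the required files that are not present in the given directory.
    public static List<string> FindMissing(string directory)
    {
        var missing = new List<string>();
        foreach (string name in RequiredFiles)
        {
            if (!File.Exists(Path.Combine(directory, name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }
}
```

One way to use it is to call `PumaDependencyCheck.FindMissing(AppDomain.CurrentDomain.BaseDirectory)` when the form loads and show a message listing any missing files before the user tries to run recognition.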
- Double-click the Browse button to create a click event handler for it, and add this code to the handler.
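// Let the user pick a single image file.
OpenFileDialog file = new OpenFileDialog();
file.Multiselect = false;
if (file.ShowDialog() == DialogResult.OK)
{
    string path = file.FileName;
    if (!string.IsNullOrEmpty(path))
    {
        // Load the image and configure recognition for plain English text.
        Puma.Net.PumaPage input = new Puma.Net.PumaPage(path);
        try
        {
            input.FileFormat = Puma.Net.PumaFileFormat.TxtAscii;
            input.Language = Puma.Net.PumaLanguage.English;
            // Run OCR and show the recognized text in the RichTextBox.
            richTextBox1.Text = input.RecognizeToString();
        }
        finally
        {
            // Release Puma.NET resources even if recognition throws.
            input.Dispose();
        }
    }
}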
- Run the application, click the button, and select an image that contains text.
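If the selected file cannot be read or recognition fails, any exception from Puma.NET propagates out of the click handler. One possible way to handle this is a small wrapper that turns failures into a readable message. This is only a sketch: `OcrGuard` and `RecognizeOrExplain` are hypothetical names, and the recognizer is passed in as a delegate so that the wrapper itself makes no assumptions about the Puma.NET API.

```csharp
using System;
using System.IO;

public static class OcrGuard
{
    // Runs `recognize` on `path` and returns its text, or a readable message
    // when the path is unusable or the recognizer throws.
    public static string RecognizeOrExplain(string path, Func<string, string> recognize)
    {
        if (string.IsNullOrEmpty(path))
        {
            return "No file selected.";
        }
        if (!File.Exists(path))
        {
            return "File not found: " + path;
        }
        try
        {
            return recognize(path);
        }
        catch (Exception ex)
        {
            // This post does not say which exception types Puma.NET throws,
            // so this sketch catches everything and reports the message.
            return "Recognition failed: " + ex.Message;
        }
    }
}
```

In the click handler this could be called as `richTextBox1.Text = OcrGuard.RecognizeOrExplain(file.FileName, RecognizeWithPuma);`, where `RecognizeWithPuma` is a hypothetical method that wraps the `PumaPage` code from the handler and returns the result of `RecognizeToString()`. Showing the error in the RichTextBox keeps the example short; a message box would work just as well.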
Download Source code.