How to Extract Text from an Image in Python | Python OCR | Only 3 Lines of Source Code

Hello friends, how are you? In this post, "Extract text from image in Python", I am going to teach you how to extract text from an image using three lines of code. For a human being it is very easy to read text from an image, but for a computer it is a very difficult task. In Python we can do some very difficult tasks using very few lines of code. If you want to create something new and interesting in Python, then this post is for you.

Now I am going to explain everything step by step.

Step 1: Install Library (pytesseract)

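This library is an Optical Character Recognition (OCR) tool for Python, used to read and recognize the text embedded in images. It is a wrapper around the Tesseract OCR engine, which is why Tesseract itself is installed separately in Step 2. It is a free library and it can read the image types supported by the Pillow library, for example PNG, JPEG, TIFF and GIF. It supports Python 2.7 or Python 3.6+ (recent releases may require a newer Python 3 version). To install this library, open a command window, type the following command and press Enter.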

pip install pytesseract

Step 2: Download & Install Tesseract Application

After installing the library, you need to download the Tesseract application. Just click here to download this application to your system.

[Image: Extract text from image in Python]

When you visit the link, you will see two Tesseract exe files, one for 32-bit OS and another for 64-bit OS. If your system is a 32-bit OS, download the 32-bit exe; otherwise download the 64-bit exe. After downloading the Tesseract exe, install it on your system. It is very simple to install.
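
Before moving on, it can help to confirm that Python is able to locate the Tesseract executable. The following is a minimal sketch using only the standard library. find_tesseract is a hypothetical helper name, and the default Windows location is an assumption based on the install folder used in this post; on other systems or with a custom install folder the location will differ.

```python
import os
import shutil

# Assumed default install location on Windows; change it if you chose another folder.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(extra_paths=(DEFAULT_WINDOWS_PATH,)):
    """Return the path of the Tesseract executable, or None if it is not found.

    Hypothetical helper: checks the system PATH first, then each path in extra_paths.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in extra_paths:
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling print(find_tesseract()) shows the path that was found, or None if Tesseract could not be located, in which case the installation folder should be checked.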


Step 3: Python code

This program contains only three lines of code, so I am going to explain it line by line.
Line 1: import pytesseract - This is the first line of the program. It imports the pytesseract library, which is needed to get the text from the image.

Line 2: pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract' - In this line you have to pass the path where the Tesseract application is installed. Here I have given the default path, so if you did not change the directory when installing Tesseract, just type the same as above.

Line 3: print(pytesseract.image_to_string(r'intro.png')) - In this line the image_to_string function of the pytesseract module is used to get the text from the image, and the result is printed using the Python print function. intro.png is an image stored in the same folder as the Python file. If your image is in another location or directory, you have to pass the complete path of the image. I suggest storing the image in the same folder. First I am going to show you the image that I have used in this program.

[Image: Extract text from image in Python]

Here is the complete code of this program. You can type this code into your Python file or copy it for your personal use.
# import the needed library
import pytesseract
# installation path of the Tesseract application
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
# print the text content of the image
# here intro.png is an image stored in the current directory
print(pytesseract.image_to_string(r'intro.png'))
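
If the image is stored somewhere other than the current directory, one option is to build the full path explicitly and check that the file exists before running OCR. This is a minimal sketch and not part of the original three-line program: resolve_image and read_text are hypothetical helper names, and it assumes pytesseract and Tesseract are installed as described in Steps 1 and 2.

```python
from pathlib import Path


def resolve_image(name, folder=None):
    """Return the absolute path of an image, raising FileNotFoundError if it is missing.

    Hypothetical helper: folder defaults to the current working directory.
    """
    base = Path(folder) if folder is not None else Path.cwd()
    path = (base / name).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    return path


def read_text(name, folder=None):
    """Run OCR on the image and return the recognized text as a string."""
    import pytesseract  # imported here so resolve_image works without it

    # image_to_string accepts a file path as well as an image object
    return pytesseract.image_to_string(str(resolve_image(name, folder)))
```

On Windows, set pytesseract.pytesseract.tesseract_cmd as in the program above before calling read_text. A call such as read_text('intro.png', r'C:\Users\me\Pictures') would then read the image from that folder; the folder name here is made up for illustration.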

Step 4: Run Program

Type the above code into your Python file, or copy it for your personal use. When you run this code, you will get a screen like the one below.

[Image: Extract text from image in Python]

Here you can compare the output text with the text in the image; for this image it matches almost exactly. I hope you can now extract text from an image.
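
To put a number on how close the result is, the OCR output can be compared with the text you expect. The sketch below uses difflib from the Python standard library. similarity is a hypothetical helper name, and the sample strings are made up for illustration; they are not the real output for intro.png.

```python
from difflib import SequenceMatcher


def similarity(expected, actual):
    """Return a ratio between 0.0 and 1.0 describing how alike two strings are.

    Whitespace is normalized first, since OCR output often has extra line breaks.
    """
    a = " ".join(expected.split())
    b = " ".join(actual.split())
    return SequenceMatcher(None, a, b).ratio()


# Made-up example strings (not real OCR output)
expected_text = "Extract text from image"
ocr_text = "Extract text\nfrom image\n"
print(f"Similarity: {similarity(expected_text, ocr_text):.0%}")
```

In practice you would pass the string returned by pytesseract.image_to_string as the second argument. A ratio close to 1.0 means the recognized text is nearly identical to what you expected.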


Request: If you found this post helpful, let me know in the comments and share it with your friends.

If you have a question or a suggestion, type it in the comment box so that we can do something new for you all.

If you have not subscribed to my website, please subscribe. Try to learn something new and teach something new to others.

Thanks.😊
