Python Image To Text Using OCR (Simple Examples)

Welcome to a tutorial on how to convert an image to text using OCR in Python. So you are working on a project that needs to “extract” text from an image? A common solution is called Optical Character Recognition, and here are some possible ways to do it in Python. Read on!

 

 

DOWNLOAD & NOTES

Here is the download link to the example code, so you don’t have to copy-paste everything.

 

EXAMPLE CODE DOWNLOAD

Source code on GitHub Gist

Just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.

 

PYTHON IMAGE TO TEXT WITH OCR

All right, let us now get into the examples of converting images to text in Python using OCR.

 

QUICK SETUP

The “usual stuff”:

  • Create a virtual environment with virtualenv venv and activate it – venv\Scripts\activate (Windows) or source venv/bin/activate (Linux/Mac)
  • Install required libraries – pip install flask (only needed for the Flask server in solution 2)
  • For those who are new, the default Flask folders are –
    • static Public files (JS/CSS/images/videos/audio)
    • templates HTML pages

 

 

SOLUTION 1) TESSERACT

1A) DOWNLOAD & INSTALL TESSERACT

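There is a popular open-source OCR engine called Tesseract. Unfortunately, I can’t find a native Python port of it (wrappers such as pytesseract exist, but they still call the installed Tesseract program). Don’t worry though, we can still use it by running it from Python. First, install Tesseract on your system.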

 

1B) PYTHON RUN TESSERACT IN THE COMMAND LINE

1-tesseract.py
# (A) LOAD SUBPROCESS MODULE & SETTINGS - CHANGE TO YOUR OWN!
import subprocess
tes = "C:/Program Files/Tesseract-OCR/tesseract.exe"
img = "demo.png"
lang = "eng"
 
# (B) RUN TESSERACT COMMAND
# PASS THE ARGUMENTS AS A LIST - WORKS ON WINDOWS/LINUX/MAC, NO QUOTING ISSUES
# "-" AS THE OUTPUT NAME TELLS TESSERACT TO PRINT THE TEXT TO STDOUT
cmd = [tes, img, "-", "-l", lang]
res = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
 
# (C) GET TEXT
txt = res.stdout.decode("utf-8")
# @TODO - WHATEVER YOU NEED WITH THE TEXT
print(txt)

How to “gel” Tesseract and Python together:

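  • (B) Run PATH/TO/TESSERACT IMAGE.FILE - -l eng in the command line. The lone - in place of an output file name tells Tesseract to print the text to the console instead of saving it to a file.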
  • (C) Get the command line output as a string.
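
If you need to convert more than one image, the two steps above can be wrapped into a reusable function. The following is only a rough sketch and not part of the download. The names build_command and image_to_text are my own, and it assumes that Tesseract is installed and either on the system path or passed in as a full path.

```python
import shutil
import subprocess


def build_command(image, lang="eng", tesseract="tesseract"):
    """Build the argument list for: tesseract IMAGE - -l LANG."""
    # "-" as the output base makes Tesseract write the text to stdout.
    return [tesseract, str(image), "-", "-l", lang]


def image_to_text(image, lang="eng", tesseract="tesseract"):
    """Run Tesseract on an image file and return the recognized text."""
    # Assumption: Tesseract is installed. Pass a full path if it is not on PATH.
    if shutil.which(tesseract) is None:
        raise FileNotFoundError(f"Tesseract not found: {tesseract}")
    res = subprocess.run(
        build_command(image, lang, tesseract),
        capture_output=True,  # collect both stdout and stderr
        check=False,          # the return code is checked manually below
    )
    if res.returncode != 0:
        # Surface Tesseract's own error message to the caller.
        raise RuntimeError(res.stderr.decode("utf-8", errors="replace"))
    return res.stdout.decode("utf-8")
```

To use it, call image_to_text("demo.png"), or image_to_text("demo.png", tesseract="C:/Program Files/Tesseract-OCR/tesseract.exe") if Tesseract is not on the system path.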

 

 

SOLUTION 2) TESSERACT JS

2A) HTML PAGE

2A-tesseract-js.html
<!-- (A) FILE SELECTOR -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">

<!-- (B) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>

<!-- (C) INIT -->
<script>
window.addEventListener("load", async () => {
  // (C1) GET HTML FILE SELECTOR
  const hSel = document.getElementById("select");

  // (C2) CREATE ENGLISH WORKER
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");

  // (C3) ON FILE SELECT
  hSel.onchange = async () => {
    // (C3-1) IMAGE TO TEXT
    const res = await worker.recognize(hSel.files[0]);

    // (C3-2) UPLOAD TO SERVER
    let data = new FormData();
    data.append("text", res.data.text);
    fetch("/save", { method:"post", body:data })
    .then(res => res.text())
    .then(txt => console.log(txt))
    .catch(err => console.error(err));
  };
});
</script>

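If you cannot install anything on the server, here’s an alternative. Tesseract does not have a “Python version”, but there is Tesseract.js, a JavaScript port that runs in the browser using WebAssembly. The page above loads it, converts the selected image to text, and sends the text to the server.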

 

 

2B) FLASK HTTP SERVER

2B-tesseract-js.py
# (A) INIT
# (A1) LOAD MODULES
from flask import Flask, render_template, request
 
# (A2) FLASK SETTINGS + INIT
HOST_NAME = "localhost"
HOST_PORT = 80 # PORTS BELOW 1024 MAY NEED ADMIN RIGHTS ON LINUX/MAC
app = Flask(__name__)
# app.debug = True
 
# (B) VIEWS
# (B1) "LANDING PAGE"
@app.route("/")
def index():
  return render_template("2A-tesseract-js.html")
 
# (B2) SAVE CONVERTED TEXT
@app.route("/save", methods=["POST"])
def txt():
  # A MISSING "TEXT" FIELD BECOMES AN EMPTY STRING INSTEAD OF A SERVER ERROR
  text = request.form.get("text", "")
  # @TODO - WHATEVER YOU NEED WITH THE TEXT
  print(text)
  return "OK"
 
# (C) START
if __name__ == "__main__":
  app.run(HOST_NAME, HOST_PORT)

TesseractJS runs client-side in the browser, so how does it work with Python? This is unfortunately a little bit roundabout:

  • Save 2A-tesseract-js.html in the templates folder, create a simple HTTP server with Flask, and serve the page at http://localhost.
  • The page runs the OCR in the browser and sends the resulting text to http://localhost/save as a POST request.
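
What happens to the text on the server is up to your project. As one possibility, here is a small sketch of a helper that saves each result into its own text file. It is not part of the download. The function name save_ocr_text and the ocr-output folder are made up for illustration.

```python
from datetime import datetime
from pathlib import Path


def save_ocr_text(text, folder="ocr-output"):
    """Write the recognized text to a new timestamped file and return its path."""
    out_dir = Path(folder)
    # Create the output folder on first use.
    out_dir.mkdir(parents=True, exist_ok=True)
    # A timestamp with microseconds keeps file names unique in practice.
    name = datetime.now().strftime("%Y%m%d-%H%M%S-%f") + ".txt"
    path = out_dir / name
    # Trim the leading/trailing whitespace that OCR output often carries.
    path.write_text(text.strip() + "\n", encoding="utf-8")
    return path
```

In the /save view, call it with the posted text in place of the print.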

 

SOLUTION 3) GOOGLE CLOUD VISION

If all else fails, the final alternative is to use an online image-to-text recognition service, and Google Cloud Vision is a good option. At the time of writing, they offer 1000 free requests per month, but check their current pricing as this may change.
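
For those who want a starting point, here is a rough sketch of calling the Cloud Vision REST API with only the Python standard library. Treat it as an outline, not a tested integration. It assumes you have a Google Cloud project with the Vision API enabled and an API key for it. The function names are my own. Google also provides an official Python client library, which is not used here.

```python
import base64
import json
import urllib.request

# Public REST endpoint of the Cloud Vision API.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_bytes):
    """Build the JSON body for a single TEXT_DETECTION request."""
    # The API expects the image as a base64 string.
    content = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": content},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response):
    """Pull the full recognized text out of a parsed API response."""
    first = (response.get("responses") or [{}])[0]
    # No text found means there is no fullTextAnnotation in the response.
    return first.get("fullTextAnnotation", {}).get("text", "")


def image_to_text(path, api_key):
    """Send an image file to Cloud Vision and return the detected text."""
    with open(path, "rb") as f:
        body = json.dumps(build_request(f.read())).encode("utf-8")
    req = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",  # assumption: authentication by API key
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as res:
        return extract_text(json.load(res))
```

Call image_to_text("demo.png", "YOUR-API-KEY") to get the text. Keep the API key out of the source code in a real project, for example in an environment variable.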

 

 

THE END

Thank you for reading, and we have come to the end. I hope this guide has helped you to better understand OCR in Python. If you have anything to share about this guide, please feel free to comment below. Good luck and happy coding!