JavaScript OCR Image To Text (Simple Examples)

Welcome to a tutorial on how to convert an image to text in JavaScript using OCR. So you are working on a project that “scans an image to text”? Image-to-text has been around for quite some time, and it is called “Optical Character Recognition” (OCR). So how do we do that in JavaScript? Read on for the examples!

 

 


 

 

JAVASCRIPT OCR – IMAGE TO TEXT

All right, let us now get into the examples of converting images to text using OCR.

 

EXAMPLE 1) INPUT IMAGE TO TEXT

1-select.html
<!-- (A) FILE SELECTOR & RESULT -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">
<div id="result"></div>
 
<!-- (B) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>
 
<!-- (C) INIT -->
<script>
window.addEventListener("load", async () => {
  // (C1) GET HTML ELEMENTS
  const hSel = document.getElementById("select"),
        hRes = document.getElementById("result");
 
  // (C2) CREATE ENGLISH WORKER
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
 
  // (C3) ON FILE SELECT - IMAGE TO TEXT
  hSel.onchange = async () => {
    const res = await worker.recognize(hSel.files[0]);
    hRes.textContent = res.data.text; // plain text, so "<" or "&" in the OCR output is not parsed as HTML
  };
});
</script>

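The OCR library that we are going to use is Tesseract.js. You can either download and host the library yourself, or just load it off a CDN. Take note that the examples load version 4.0.6, and newer major versions of Tesseract.js changed how workers are created, so keep the version pinned or check the library's documentation before upgrading. The usage is otherwise very straightforward: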

  • (B) Load the Tesseract.js library. Captain Obvious at your service.
  • (C2) Create a Tesseract worker, and set the language that you want to recognize.
  • (C3) Lastly, just feed the image to the Tesseract worker and get the result.
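
Recognition can take a few seconds, so it helps to show the user what is going on. In version 4 of Tesseract.js (the version loaded above), createWorker() accepts an options object with a logger callback, and the messages passed to it carry status and progress fields. Here is a minimal sketch of a helper that turns such a message into a status line. The formatProgress name and the "status" element in the usage comment are made up for illustration and are not part of the original example.

```javascript
// Turn a Tesseract.js logger message into a short status line.
// Assumes the message looks like { status: "recognizing text", progress: 0.5 },
// where progress is a number between 0 and 1 (treated as 0 if missing).
function formatProgress(m) {
  const pct = Math.round((m.progress || 0) * 100);
  return `${m.status}: ${pct}%`;
}

// Browser usage (hypothetical <div id="status"> added to the page):
// const worker = await Tesseract.createWorker({
//   logger: m => {
//     document.getElementById("status").textContent = formatProgress(m);
//   }
// });
```

The helper is kept separate from the page so that it can be reused in all three examples.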

 

 

EXAMPLE 2) FETCH IMAGE TO TEXT

2-fetch.html
<!-- (A) RESULT -->
<div id="result"></div>
 
<!-- (B) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>
 
<!-- (C) FETCH IMAGE & SHOW TEXT -->
<script>
window.addEventListener("load", () => {
  // (C1) FETCH IMAGE
  fetch("text.png")
  .then(res => res.blob())
  .then(async (blob) => {
    // (C2) CREATE ENGLISH WORKER
    const worker = await Tesseract.createWorker();
    await worker.loadLanguage("eng");
    await worker.initialize("eng");

    // (C3) RESULT
    const res = await worker.recognize(blob);
    document.getElementById("result").textContent = res.data.text; // plain text, not parsed as HTML
  });
});
</script>

This is the same “create a Tesseract worker, feed image to worker” routine. But instead of getting the image file from an <input>, we get the image using fetch().

P.S. You can fetch images from another domain/server, but that involves CORS (Cross-Origin Resource Sharing). I will leave a link in the extras section below to another tutorial.
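
The example above assumes that the fetch always works. Below is a rough sketch of the same flow with two additions: it checks res.ok before reading the image, and it calls worker.terminate() once done to free the worker. The ocrFromUrl name and its fetchFn/createWorkerFn parameters are my own assumptions, added so that the function does not depend on browser globals.

```javascript
// Fetch an image and run OCR on it, with basic error handling.
// fetchFn and createWorkerFn are injected (hypothetical parameters);
// in a page, pass wrappers around fetch and Tesseract.createWorker.
async function ocrFromUrl(url, fetchFn, createWorkerFn) {
  // (1) Fetch the image, stop early on HTTP errors such as 404
  const res = await fetchFn(url);
  if (!res.ok) {
    throw new Error(`Failed to fetch ${url}: HTTP ${res.status}`);
  }
  const blob = await res.blob();

  // (2) Create an English worker (Tesseract.js version 4 style)
  const worker = await createWorkerFn();
  try {
    await worker.loadLanguage("eng");
    await worker.initialize("eng");

    // (3) Image to text
    const out = await worker.recognize(blob);
    return out.data.text;
  } finally {
    // (4) Always release the worker, even if recognition fails
    await worker.terminate();
  }
}

// Browser usage:
// ocrFromUrl("text.png", url => fetch(url), () => Tesseract.createWorker())
//   .then(text => { document.getElementById("result").textContent = text; })
//   .catch(err => console.error(err));
```

Terminating the worker makes sense here because the page only converts one image. If you convert many images, it is probably better to keep one worker around and reuse it.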

 

 

EXAMPLE 3) WEBCAM IMAGE TO TEXT

3A) THE HTML

3-cam.html
<!-- (A) WEBCAM & RESULT -->
<video id="camVid" autoplay></video>
<button id="camGo">Go!</button>
<div id="camRes"></div>
 
<!-- (B) LOAD TESSERACT & JS -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>
<script src="3-cam.js"></script>

We no longer live in the Stone Age, and there is a modern convenience called a “webcam”. We can use it to take a photo and convert that image to text. The HTML of this example should be pretty self-explanatory:

  • <video id="camVid"> Live feed from the webcam.
  • <button id="camGo"> “Take photo and convert to text” button.
  • <div id="camRes"> Results.

 

3B) THE JAVASCRIPT

3-cam.js
var webkam = {
  // (A) INITIALIZE
  worker : null, // tesseract worker
  hVid : null, hGo : null, hRes : null, // html elements
  init : () => {
    // (A1) GET HTML ELEMENTS
    webkam.hVid = document.getElementById("camVid");
    webkam.hGo = document.getElementById("camGo");
    webkam.hRes = document.getElementById("camRes");

    // (A2) GET USER PERMISSION TO ACCESS CAMERA
    navigator.mediaDevices.getUserMedia({ video: true })
    .then(async (stream) => {
      // (A2-1) CREATE ENGLISH WORKER
      webkam.worker = await Tesseract.createWorker();
      await webkam.worker.loadLanguage("eng");
      await webkam.worker.initialize("eng");

      // (A2-2) WEBCAM LIVE STREAM
      webkam.hVid.srcObject = stream;
      webkam.hGo.onclick = webkam.snap;
    })
    .catch(err => console.error(err));
  },

  // (B) SNAP VIDEO FRAME TO TEXT
  snap : async () => {
    // (B1) CREATE NEW CANVAS
    let canvas = document.createElement("canvas"),
        ctx = canvas.getContext("2d"),
        vWidth = webkam.hVid.videoWidth,
        vHeight = webkam.hVid.videoHeight;
  
    // (B2) CAPTURE VIDEO FRAME TO CANVAS
    canvas.width = vWidth;
    canvas.height = vHeight;
    ctx.drawImage(webkam.hVid, 0, 0, vWidth, vHeight);

    // (B3) CANVAS TO IMAGE, IMAGE TO TEXT
    const res = await webkam.worker.recognize(canvas.toDataURL("image/png"));
    webkam.hRes.textContent = res.data.text; // plain text, not parsed as HTML
  },
};
window.addEventListener("load", webkam.init);

This may look complex, but keep calm and look carefully:

  1. On window load.
    • Get HTML elements.
    • Get the user’s permission to access the webcam.
    • Create a Tesseract worker.
    • Set up webcam live feed into <video>.
  2. When the user hits the “go” button.
    • Capture the current video frame as an image.
    • Send the image to the Tesseract worker.
    • Output the result.
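
The “get the user's permission” step can fail, and the script above only logs the error to the console. Here is a small sketch of how the error could be turned into a message for the user instead. getUserMedia() rejects with an error whose name is, among others, NotAllowedError, NotFoundError, or NotReadableError. The describeCameraError function and the wording of the messages are my own and not part of the original script.

```javascript
// Map a getUserMedia() rejection to a human-readable message.
// Only a few common error names are handled; everything else
// falls through to a generic message.
function describeCameraError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Camera access was denied. Please allow it and reload the page.";
    case "NotFoundError":
      return "No camera was found on this device.";
    case "NotReadableError":
      return "The camera could not be read. It may be in use by another application.";
    default:
      return "Could not start the camera: " +
        (err && err.message ? err.message : "unknown error");
  }
}

// Possible usage in webkam.init(), replacing the console.error() line:
// .catch(err => { webkam.hRes.textContent = describeCameraError(err); });
```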

The end. It’s just long-winded.
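
One thing that the script does not do is release the webcam when it is no longer needed. Below is a minimal sketch of a cleanup helper, assuming that the stream is still attached to the <video> element through srcObject as in (A2-2). It stops every track of the stream and terminates the Tesseract worker. The releaseCamera name is hypothetical.

```javascript
// Stop the webcam stream attached to a <video> element and
// terminate the Tesseract worker, if there is one.
async function releaseCamera(video, worker) {
  const stream = video.srcObject;
  if (stream) {
    // Stopping every track turns the camera (and its indicator light) off
    stream.getTracks().forEach(track => track.stop());
    video.srcObject = null;
  }
  if (worker) {
    await worker.terminate();
  }
}

// Possible usage with the webkam object above:
// window.addEventListener("pagehide", () => {
//   releaseCamera(webkam.hVid, webkam.worker);
// });
```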

 

 

DOWNLOAD & NOTES

Here is the download link to the example code, so you don’t have to copy-paste everything.

 

QUICK NOTES

If you spot a bug, feel free to comment below. I try to answer short questions too, but it is one person versus the entire world… If you need answers urgently, please check out my list of websites to get help with programming.

 

EXAMPLE CODE DOWNLOAD

Click here for the source code on GitHub Gist. Just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.

 

 

EXTRA BITS & LINKS

That’s all for the tutorial, and here is a small section on some extras and links that may be useful to you.

 

LINKS & REFERENCES

 


 

 

THE END

Thank you for reading, and we have come to the end. I hope that it has helped you to better understand OCR in JavaScript, and if you have anything to share about this guide, please feel free to comment below. Good luck and happy coding!
