JavaScript Image To Text With OCR (Simple Examples)

Welcome to a tutorial on how to convert an image to text in JavaScript. Yes, image-to-text has been around for some time, and it is called “Optical Character Recognition” (OCR). There is an open-source OCR library called Tesseract.js that we can use:

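async function conv () {
  // create a worker, then load and initialize the English language data
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
  // IMAGE can be a file, blob, URL, or base64 data URL
  const res = await worker.recognize(IMAGE);
  const txt = res.data.text;
  return txt;
}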
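
If we only need to convert a single image, it may be worth releasing the worker once we are done. Here is a minimal sketch of that idea. The function name ocrOnce() and passing the Tesseract object in as a parameter are my own assumptions for illustration, and it sticks to the same version 4 calls used in this tutorial.

```javascript
// Sketch: run OCR once, then always release the worker.
// "tesseract" is expected to be the global Tesseract object (v4).
// Passing it in as a parameter is an assumption of this sketch.
async function ocrOnce (tesseract, image, lang = "eng") {
  const worker = await tesseract.createWorker();
  try {
    await worker.loadLanguage(lang);
    await worker.initialize(lang);
    const res = await worker.recognize(image);
    return res.data.text;
  } finally {
    // terminate() shuts the worker down, even if recognition failed
    await worker.terminate();
  }
}

// Possible use in the browser:
// ocrOnce(Tesseract, "text.png").then(txt => console.log(txt));
```

Creating a worker is the slow part, so when converting many images, it is probably better to keep one worker around (as in the examples below) and terminate it at the very end.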

That covers the quick basics. Read on if you need detailed examples!

 

 


DOWNLOAD & NOTES

Here is the download link to the example code, so you don’t have to copy-paste everything.

 

EXAMPLE CODE DOWNLOAD

Source code on GitHub Gist | Example on CodePen

Just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.

 


 

 

JAVASCRIPT OCR – IMAGE TO TEXT

All right, let us now get into the examples of converting images to text using OCR.

 


EXAMPLE 1) INPUT IMAGE TO TEXT

1A) THE HTML

1-select.html
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
 
<!-- (B) FILE SELECTOR & RESULT -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">
<textarea id="result"></textarea>
  • (A) Load the Tesseract.js library. You can either download and host the library yourself, or just load it off a CDN.
  • (B) For this example, we will read the image selected in <input id="select"> and output the text in <textarea id="result">.

 

1B) THE JAVASCRIPT

1-select.js
window.addEventListener("load", async () => {
  // (A) GET HTML ELEMENTS
  const hSel = document.getElementById("select"),
        hRes = document.getElementById("result");
 
  // (B) CREATE ENGLISH TESSERACT WORKER
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
 
  // (C) ON FILE SELECT - IMAGE TO TEXT
  hSel.onchange = async () => {
    const res = await worker.recognize(hSel.files[0]);
    hRes.value = res.data.text;
  };
});

This should be very straightforward.

  1. Get the HTML file select and text area.
  2. Create an English Tesseract worker.
  3. On selecting an image file, we pass the image to the worker. When Tesseract is done, we put the results into the text area.
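
One small catch with step 3: the accept attribute is only a hint to the file picker, and hSel.files[0] can be undefined if the selection gets cleared. Here is a rough sketch of a guard we could run before calling the worker. The pickImage() helper and the ALLOWED_TYPES list are hypothetical names of my own, not part of Tesseract.js.

```javascript
// MIME types copied from the accept attribute in 1-select.html
const ALLOWED_TYPES = ["image/png", "image/gif", "image/webp", "image/jpeg"];

// Returns the first selected file if it is an allowed image, otherwise null.
// "files" is a FileList, or any array-like of objects with a "type" string.
function pickImage (files, allowed = ALLOWED_TYPES) {
  const file = files && files.length > 0 ? files[0] : null;
  if (file === null) { return null; }
  return allowed.includes(file.type) ? file : null;
}

// Possible use inside the (C) handler:
// const file = pickImage(hSel.files);
// if (file === null) { hRes.value = "Please choose an image."; return; }
// const res = await worker.recognize(file);
```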

 

 

EXAMPLE 2) FETCH IMAGE TO TEXT

2A) THE HTML

2-fetch.html
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>

<!-- (B) RESULT -->
<textarea id="result"></textarea>

In this example, we will use an image on the server itself.

 

2B) THE JAVASCRIPT

2-fetch.js
window.addEventListener("load", () => {
  // (A) FETCH IMAGE
  fetch("text.png")
  .then(res => res.blob())
  .then(async (blob) => {
    // (B) CREATE ENGLISH WORKER
    const worker = await Tesseract.createWorker();
    await worker.loadLanguage("eng");
    await worker.initialize("eng");

    // (C) RESULT
    const res = await worker.recognize(blob);
    document.getElementById("result").value = res.data.text;
  });
});
  1. Use fetch() to get the image from the server.
  2. Same old “create an English Tesseract worker”.
  3. Feed the image to the worker, and output the results into the text area.

P.S. We can also fetch images from another domain/server, but that involves CORS (Cross-Origin Resource Sharing). I will leave a link in the extras section below to another tutorial.
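
As a rough sketch of what a safer fetch could look like, here is a small helper that checks the HTTP status before handing the blob to Tesseract. The name fetchImageBlob(), the fetchFn parameter, and the example.com URL are all assumptions made up for illustration.

```javascript
// Sketch: fetch an image as a blob, and fail loudly on HTTP errors.
// "fetchFn" defaults to the browser fetch(); it is a parameter only so
// that the helper can be tried out with a fake fetch.
async function fetchImageBlob (url, fetchFn = fetch) {
  // "cors" is already the default mode of fetch(), spelled out for clarity
  const res = await fetchFn(url, { mode: "cors" });
  // fetch() does not reject on a 404 or 500, so check the status ourselves
  if (!res.ok) {
    throw new Error("Failed to fetch " + url + ": HTTP " + res.status);
  }
  return res.blob();
}

// Possible use with an existing worker (hypothetical URL):
// const blob = await fetchImageBlob("https://example.com/text.png");
// const res = await worker.recognize(blob);
```

Take note that for a cross-domain request, the other server has to allow it with an Access-Control-Allow-Origin header. Otherwise, the browser blocks the request and the promise rejects with a TypeError.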

 

 

EXAMPLE 3) WEBCAM IMAGE TO TEXT

3A) THE HTML

3-cam.html
<!-- (A) LOAD TESSERACT  -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
 
<!-- (B) WEBCAM & RESULT -->
<video id="vid" autoplay></video>
<button id="go">Go!</button>
<textarea id="result"></textarea>

We no longer live in the Stone Age. There is a modern convenience called a “webcam”, and we can use it to take a photo and convert the image to text. The HTML of this example should be pretty self-explanatory:

  • <video id="vid"> Live feed from the webcam.
  • <button id="go"> “Take photo and convert to text” button.
  • <textarea id="result"> Results.

 

 

3B) THE JAVASCRIPT

3-cam.js
var webkam = {
  // (A) INITIALIZE
  worker : null, // tesseract worker
  hVid : null, hGo : null, hRes : null, // html elements
  init : () => {
    // (A1) GET HTML ELEMENTS
    webkam.hVid = document.getElementById("vid");
    webkam.hGo = document.getElementById("go");
    webkam.hRes = document.getElementById("result");

    // (A2) GET USER PERMISSION TO ACCESS CAMERA
    navigator.mediaDevices.getUserMedia({ video: true })
    .then(async (stream) => {
      // (A2-1) CREATE ENGLISH WORKER
      webkam.worker = await Tesseract.createWorker();
      await webkam.worker.loadLanguage("eng");
      await webkam.worker.initialize("eng");

      // (A2-2) WEBCAM LIVE STREAM
      webkam.hVid.srcObject = stream;
      webkam.hGo.onclick = webkam.snap;
    })
    .catch(err => console.error(err));
  },

  // (B) SNAP VIDEO FRAME TO TEXT
  snap : async () => {
    // (B1) CREATE NEW CANVAS
    let canvas = document.createElement("canvas"),
        ctx = canvas.getContext("2d"),
        vWidth = webkam.hVid.videoWidth,
        vHeight = webkam.hVid.videoHeight;
  
    // (B2) CAPTURE VIDEO FRAME TO CANVAS
    canvas.width = vWidth;
    canvas.height = vHeight;
    ctx.drawImage(webkam.hVid, 0, 0, vWidth, vHeight);

    // (B3) CANVAS TO IMAGE, IMAGE TO TEXT
    const res = await webkam.worker.recognize(canvas.toDataURL("image/png"));
    webkam.hRes.value = res.data.text;
  },
};
window.addEventListener("load", webkam.init);

This is seemingly complex, but keep calm and look carefully.

  1. On window load.
    • Get HTML elements.
    • Get the user’s permission to access the webcam.
    • Create a Tesseract worker.
    • Set up webcam live feed into <video>.
  2. When the user hits the “go” button.
    • Capture the current video frame as an image.
    • Send the image to the Tesseract worker.
    • Output the result.

The end. It’s just long-winded.
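
One thing the example does not do is clean up. The camera stays on and the worker stays in memory for as long as the page is open. Here is a minimal sketch of a cleanup step, assuming we want to release both when the user is done. The shutdown() function is a hypothetical helper of my own, built on the standard getTracks() and stop() calls plus the worker's terminate().

```javascript
// Sketch: release the webcam and the Tesseract worker.
// "stream" is the MediaStream from getUserMedia(), "worker" is the
// Tesseract worker. Either one may be null if it was never created.
async function shutdown (stream, worker) {
  if (stream) {
    // stopping every track turns the camera (and its indicator light) off
    stream.getTracks().forEach(track => track.stop());
  }
  if (worker) {
    await worker.terminate();
  }
}

// Possible use with the webkam object above:
// shutdown(webkam.hVid.srcObject, webkam.worker);
```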

 

 

EXTRAS

That’s all for the tutorial, and here is a small section on some extras and links that may be useful to you.

 

LINKS & REFERENCES

 

 

THE END

Thank you for reading, and we have come to the end. I hope this guide has helped you to better understand OCR in JavaScript. If you have anything to share, please feel free to comment below. Good luck and happy coding!
