Welcome to a tutorial on how to convert an image to text in Javascript using OCR. So you are working on a project that “scans an image to text”? Well, image-to-text has been around for quite some time, and it is called “Optical Character Recognition”. Just how do we do that in Javascript? Read on for the examples!
JAVASCRIPT OCR – IMAGE TO TEXT
All right, let us now get into the examples of converting images to text using OCR.
EXAMPLE 1) INPUT IMAGE TO TEXT
<!-- (A) FILE SELECTOR & RESULT -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">
<div id="result"></div>
<!-- (B) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>
<!-- (C) INIT -->
<script>
window.addEventListener("load", async () => {
// (C1) GET HTML ELEMENTS
const hSel = document.getElementById("select"),
hRes = document.getElementById("result");
// (C2) CREATE ENGLISH WORKER
const worker = await Tesseract.createWorker();
await worker.loadLanguage("eng");
await worker.initialize("eng");
// (C3) ON FILE SELECT - IMAGE TO TEXT
hSel.onchange = async () => {
const res = await worker.recognize(hSel.files[0]);
hRes.innerHTML = res.data.text;
};
});
</script>
The OCR library that we are going to use is Tesseract.js. You can either download and host the library yourself, or just load it off a CDN. Otherwise, the usage is very straightforward.
- (B) Load the Tesseract.js library. Captain Obvious at your service.
- (C2) Create a Tesseract worker, and set the language that you want to convert.
- (C3) Lastly, just feed the image to the Tesseract worker and get the result.
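One small thing to watch in (C3): the recognized text goes straight into innerHTML, so characters such as < or & in the scan get treated as HTML, and line breaks are collapsed. Here is a minimal sketch of a helper that escapes the text first. Take note that ocrTextToHtml is a hypothetical name of my own and not a part of Tesseract.js.

```javascript
// Hypothetical helper (not part of Tesseract.js).
// Escapes HTML-special characters so the OCR output is shown as plain text,
// then turns line breaks into <br> so the line layout of the scan is kept.
function ocrTextToHtml (text) {
  const escaped = String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
  return escaped.trim().replace(/\r?\n/g, "<br>");
}

// Usage inside (C3), replacing the direct innerHTML assignment:
// hRes.innerHTML = ocrTextToHtml(res.data.text);
```

If you do not care about keeping the line breaks, assigning to hRes.textContent instead of hRes.innerHTML is an even simpler way to show the result as plain text.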
EXAMPLE 2) FETCH IMAGE TO TEXT
<!-- (A) RESULT -->
<div id="result"></div>
<!-- (B) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>
<!-- (C) FETCH IMAGE & SHOW TEXT -->
<script>
window.addEventListener("load", () => {
// (C1) FETCH IMAGE
fetch("text.png")
.then(res => res.blob())
.then(async (blob) => {
// (C2) CREATE ENGLISH WORKER
const worker = await Tesseract.createWorker();
await worker.loadLanguage("eng");
await worker.initialize("eng");
// (C3) RESULT
const res = await worker.recognize(blob);
document.getElementById("result").innerHTML = res.data.text;
});
});
</script>
This is the same “create a Tesseract worker, feed image to worker”. But instead of getting the image file from an <input>, we get the image using fetch() instead.
P.S. You can fetch images on another domain/server, but that involves CORS (cross-domain). I will leave a link in the extras section below to another tutorial.
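On a related note, fetch() does not throw an error on a 404 or 500 response, so a missing image will still be passed to the worker as a blob. Below is a rough sketch of a check that can sit between fetch() and res.blob(). The name checkImageResponse is hypothetical, and it assumes the server sends a Content-Type header that starts with image/ for image files.

```javascript
// Hypothetical helper (not part of Tesseract.js or the original example).
// "res" is a standard fetch() Response; only res.ok, res.status, and
// res.headers.get() are used here.
function checkImageResponse (res) {
  // fetch() resolves even on HTTP errors, so check the status ourselves
  if (!res.ok) {
    throw new Error("Image request failed with HTTP status " + res.status);
  }
  // Assumption: the server labels images with an image/* content type
  const type = res.headers.get("Content-Type") || "";
  if (!type.startsWith("image/")) {
    throw new Error("Expected an image but got: " + (type || "unknown type"));
  }
  return res;
}

// Usage in (C1), before converting the response to a blob:
// fetch("text.png")
//   .then(checkImageResponse)
//   .then(res => res.blob())
//   .then(async (blob) => { /* same as (C2) and (C3) */ })
//   .catch(err => console.error(err));
```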
EXAMPLE 3) WEBCAM IMAGE TO TEXT
3A) THE HTML
<!-- (A) WEBCAM & RESULT -->
<video id="camVid" autoplay></video>
<button id="camGo">Go!</button>
<div id="camRes"></div>
<!-- (B) LOAD TESSERACT & JS -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.0.6/tesseract.min.js"></script>
<script src="3-cam.js"></script>
We no longer live in the Stone Age; there is a modern convenience called a “webcam”. We can use it to take a photo and convert the image to text. The HTML of this example should be pretty self-explanatory:
- <video id="camVid"> Live feed from the webcam.
- <button id="camGo"> “Take photo and convert to text” button.
- <div id="camRes"> Results.
3B) THE JAVASCRIPT
var webkam = {
// (A) INITIALIZE
worker : null, // tesseract worker
hVid : null, hGo :null, hRes : null, // html elements
init : () => {
// (A1) GET HTML ELEMENTS
webkam.hVid = document.getElementById("camVid");
webkam.hGo = document.getElementById("camGo");
webkam.hRes = document.getElementById("camRes");
// (A2) GET USER PERMISSION TO ACCESS CAMERA
navigator.mediaDevices.getUserMedia({ video: true })
.then(async (stream) => {
// (A2-1) CREATE ENGLISH WORKER
webkam.worker = await Tesseract.createWorker();
await webkam.worker.loadLanguage("eng");
await webkam.worker.initialize("eng");
// (A2-2) WEBCAM LIVE STREAM
webkam.hVid.srcObject = stream;
webkam.hGo.onclick = webkam.snap;
})
.catch(err => console.error(err));
},
// (B) SNAP VIDEO FRAME TO TEXT
snap : async () => {
// (B1) CREATE NEW CANVAS
let canvas = document.createElement("canvas"),
ctx = canvas.getContext("2d"),
vWidth = webkam.hVid.videoWidth,
vHeight = webkam.hVid.videoHeight;
// (B2) CAPTURE VIDEO FRAME TO CANVAS
canvas.width = vWidth;
canvas.height = vHeight;
ctx.drawImage(webkam.hVid, 0, 0, vWidth, vHeight);
// (B3) CANVAS TO IMAGE, IMAGE TO TEXT
const res = await webkam.worker.recognize(canvas.toDataURL("image/png"));
webkam.hRes.innerHTML = res.data.text;
},
};
window.addEventListener("load", webkam.init);
This is seemingly complex, but keep calm and look carefully.
- On window load.
  - Get HTML elements.
  - Get the user’s permission to access the webcam.
  - Create a Tesseract worker.
  - Set up webcam live feed into <video>.
- When the user hits the “go” button.
  - Capture the current video frame as an image.
  - Send the image to the Tesseract worker.
  - Output the result.
The end. It’s just long-winded.
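The example above leaves the webcam running for as long as the page is open. If you want to switch it off once you are done, here is a minimal sketch. Take note that stopCamera is a hypothetical helper that is not in the original example, and it assumes the video element got its feed through srcObject as in (A2-2).

```javascript
// Hypothetical helper (not part of the original example).
// Stops all tracks of the stream attached to a <video> element so the
// browser releases the camera. Returns the number of tracks stopped.
function stopCamera (video) {
  const stream = video.srcObject;
  if (!stream) { return 0; }
  const tracks = stream.getTracks();
  tracks.forEach(track => track.stop());
  video.srcObject = null;
  return tracks.length;
}

// Usage, for example in a "stop" button handler:
// stopCamera(webkam.hVid);
// await webkam.worker.terminate(); // also shut down the Tesseract worker
```

After worker.terminate(), the worker cannot be used again, so create a new one if the user restarts the camera.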
DOWNLOAD & NOTES
Here is the download link to the example code, so you don’t have to copy-paste everything.
QUICK NOTES
If you spot a bug, feel free to comment below. I try to answer short questions too, but it is one person versus the entire world… If you need answers urgently, please check out my list of websites to get help with programming.
EXAMPLE CODE DOWNLOAD
Click here for the source code on GitHub gist, just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.
EXTRA BITS & LINKS
That’s all for the tutorial, and here is a small section on some extras and links that may be useful to you.
LINKS & REFERENCES
- tesseract.js – GitHub
- Capture Photos With Webcam In Javascript – Code Boxx
- Javascript Cross-Origins CORS Fetch – Code Boxx
THE END
Thank you for reading, and we have come to the end. I hope that it has helped you to better understand, and if you have anything to share about this guide, please feel free to comment below. Good luck and happy coding!