Welcome to a tutorial on how to convert an image to text in Javascript. Yes, image-to-text has been around for some time, and it is called “Optical Character Recognition” (OCR). There is an open-source OCR library called Tesseract.js that we can use:
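async function conv () {
  // Create a worker and load the English language data
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
  // IMAGE is the image to read, e.g. a URL, File, or Blob
  const res = await worker.recognize(IMAGE);
  const txt = res.data.text;
  // Release the worker when done
  await worker.terminate();
  return txt;
}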
That’s all for the quick basics. These calls follow the Tesseract.js 4.x API, which is the version loaded in the examples below. Read on if you need detailed examples!
DOWNLOAD & NOTES
Here is the download link to the example code, so you don’t have to copy-paste everything.
EXAMPLE CODE DOWNLOAD
Source code on GitHub Gist | Example on CodePen
Just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.
JAVASCRIPT OCR – IMAGE TO TEXT
All right, let us now get into the examples of converting images to text using OCR.
EXAMPLE 1) INPUT IMAGE TO TEXT
1A) THE HTML
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
<!-- (B) FILE SELECTOR & RESULT -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">
<textarea id="result"></textarea>
- (A) Load the Tesseract.js library. You can either download and host the library yourself, or just load it off a CDN.
- (B) For this example, we will read the image selected in <input id="select"> and output the text in <textarea id="result">.
1B) THE JAVASCRIPT
window.addEventListener("load", async () => {
  // (A) GET HTML ELEMENTS
  const hSel = document.getElementById("select"),
        hRes = document.getElementById("result");
  // (B) CREATE ENGLISH TESSERACT WORKER
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
  // (C) ON FILE SELECT - IMAGE TO TEXT
  hSel.onchange = async () => {
    if (!hSel.files.length) { return; } // nothing selected
    const res = await worker.recognize(hSel.files[0]);
    hRes.value = res.data.text;
  };
});
This should be very straightforward.
- Get the HTML file select and text area.
- Create an English Tesseract worker.
- On selecting an image file, we pass the image to the worker. When Tesseract is done, we put the results into the text area.
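One small extra: the accept attribute only filters what the file picker shows, and users can usually still switch to “all files”. Here is a minimal sketch of a type check before the file goes to the worker. Take note that isSupportedImage() is a hypothetical helper (not part of Tesseract.js), and the MIME list simply mirrors the accept list in the HTML above.

```javascript
// Hypothetical helper: these MIME types mirror the accept="" list in the HTML above.
const SUPPORTED_TYPES = ["image/png", "image/gif", "image/webp", "image/jpeg"];

function isSupportedImage (file) {
  // File objects expose the browser-detected MIME type in .type
  return Boolean(file) && SUPPORTED_TYPES.includes(file.type);
}

// Possible usage inside the onchange handler:
// const file = hSel.files[0];
// if (!isSupportedImage(file)) {
//   hRes.value = "Please select a PNG, GIF, WebP, or JPEG image.";
//   return;
// }
// const res = await worker.recognize(file);
```

This only checks the type reported by the browser, so treat it as a convenience for the user rather than a guarantee that the file is a readable image.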
EXAMPLE 2) FETCH IMAGE TO TEXT
2A) THE HTML
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
<!-- (B) RESULT -->
<textarea id="result"></textarea>
In this example, we will use an image on the server itself.
2B) THE JAVASCRIPT
window.addEventListener("load", () => {
  // (A) FETCH IMAGE
  fetch("text.png")
  .then(res => res.blob())
  .then(async (blob) => {
    // (B) CREATE ENGLISH WORKER
    const worker = await Tesseract.createWorker();
    await worker.loadLanguage("eng");
    await worker.initialize("eng");
    // (C) RESULT
    const res = await worker.recognize(blob);
    document.getElementById("result").value = res.data.text;
    // (D) RELEASE THE WORKER
    await worker.terminate();
  })
  .catch(err => console.error(err));
});
- Use fetch() to get the image from the server.
- Same old “create an English Tesseract worker”.
- Feed the image to the worker, and output the results into the text area.
P.S. We can also fetch images from another domain/server, but that is subject to CORS (Cross-Origin Resource Sharing). I will leave a link in the extras section below to another tutorial.
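Also take note that fetch() only rejects on network failures. A “404 not found” still resolves, and the error page would end up being fed to the worker. Here is a minimal sketch of a check that can sit between fetch() and res.blob(). checkImageResponse() is a hypothetical helper, and it assumes the server sends images with an image/* content type.

```javascript
// Hypothetical helper (not part of Tesseract.js): validate a fetch() response
// before turning it into a blob for the worker.
function checkImageResponse (res) {
  // fetch() resolves even on HTTP errors such as 404, so check res.ok
  if (!res.ok) {
    throw new Error("Image request failed with HTTP status " + res.status);
  }
  // Assumption: the server labels images with an image/* content type
  const type = res.headers.get("content-type") || "";
  if (!type.startsWith("image/")) {
    throw new Error("Expected an image but got: " + (type || "unknown type"));
  }
  return res;
}

// Possible usage with the example above:
// fetch("text.png")
//   .then(checkImageResponse)
//   .then(res => res.blob())
//   .then(async (blob) => { /* create worker, recognize(blob) */ })
//   .catch(err => console.error(err));
```

Throwing inside the chain sends the error to the final catch(), so a bad response never reaches the worker.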
EXAMPLE 3) WEBCAM IMAGE TO TEXT
3A) THE HTML
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
<!-- (B) WEBCAM & RESULT -->
<video id="vid" autoplay></video>
<button id="go">Go!</button>
<textarea id="result"></textarea>
We no longer live in the Stone Age; there is a modern convenience called a “webcam”. We can use it to take a photo and convert the image to text. The HTML of this example should be pretty self-explanatory:
- <video id="vid"> Live feed from the webcam.
- <button id="go"> “Take photo and convert to text” button.
- <textarea id="result"> Results.
3B) THE JAVASCRIPT
var webkam = {
  // (A) INITIALIZE
  worker : null, // tesseract worker
  hVid : null, hGo : null, hRes : null, // html elements
  init : () => {
    // (A1) GET HTML ELEMENTS
    webkam.hVid = document.getElementById("vid");
    webkam.hGo = document.getElementById("go");
    webkam.hRes = document.getElementById("result");
    // (A2) GET USER PERMISSION TO ACCESS CAMERA
    navigator.mediaDevices.getUserMedia({ video: true })
    .then(async (stream) => {
      // (A2-1) CREATE ENGLISH WORKER
      webkam.worker = await Tesseract.createWorker();
      await webkam.worker.loadLanguage("eng");
      await webkam.worker.initialize("eng");
      // (A2-2) WEBCAM LIVE STREAM
      webkam.hVid.srcObject = stream;
      webkam.hGo.onclick = webkam.snap;
    })
    .catch(err => console.error(err));
  },
  // (B) SNAP VIDEO FRAME TO TEXT
  snap : async () => {
    // (B1) CREATE NEW CANVAS
    let canvas = document.createElement("canvas"),
        ctx = canvas.getContext("2d"),
        vWidth = webkam.hVid.videoWidth,
        vHeight = webkam.hVid.videoHeight;
    // (B2) CAPTURE VIDEO FRAME TO CANVAS
    canvas.width = vWidth;
    canvas.height = vHeight;
    ctx.drawImage(webkam.hVid, 0, 0, vWidth, vHeight);
    // (B3) CANVAS TO IMAGE, IMAGE TO TEXT
    const res = await webkam.worker.recognize(canvas.toDataURL("image/png"));
    webkam.hRes.value = res.data.text;
  },
};
window.addEventListener("load", webkam.init);
This is seemingly complex, but keep calm and look carefully.
- On window load.
  - Get HTML elements.
  - Get the user’s permission to access the webcam.
  - Create a Tesseract worker.
  - Set up webcam live feed into <video>.
- When the user hits the “go” button.
  - Capture the current video frame as an image.
  - Send the image to the Tesseract worker.
  - Output the result.
The end. It’s just long-winded.
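One possible gotcha: if the user hits “go” before the webcam has delivered its first frame, videoWidth and videoHeight are still 0 and the capture ends up as an empty canvas. Here is a minimal sketch of a guard for that case. hasVideoFrame() is a hypothetical helper, not something provided by the browser or Tesseract.js.

```javascript
// Hypothetical helper: true when the <video> has a decoded frame to capture.
function hasVideoFrame (video) {
  // readyState 2 is HAVE_CURRENT_DATA: data for the current frame is available
  return video.readyState >= 2 && video.videoWidth > 0 && video.videoHeight > 0;
}

// Possible usage at the top of webkam.snap():
// if (!hasVideoFrame(webkam.hVid)) {
//   webkam.hRes.value = "Camera is not ready yet.";
//   return;
// }
```

An alternative is to enable the “go” button only after the video fires its loadeddata event, but the check above keeps the change contained within snap().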
EXTRAS
That’s all for the tutorial, and here is a small section on some extras and links that may be useful to you.
LINKS & REFERENCES
- tesseract.js – GitHub
- Capture Photos With Webcam In Javascript – Code Boxx
- Javascript Cross-Origins CORS Fetch – Code Boxx
THE END
Thank you for reading, and we have come to the end. I hope that it has helped you to better understand OCR in Javascript, and if you have anything to share about this guide, please feel free to comment below. Good luck and happy coding!