Kindle Cloud Reader Translate Evolutions

How I started with a small userscript, turned it into a browser extension, and ended up dealing with OCR and image recognition.

First Iteration: Humble beginnings

Kindle Cloud Reader (KCR) is a product from Amazon that allows users to read books in the browser. For some reason, you can't copy-paste or translate with this app. While learning German, I was looking for a way to translate from KCR. That is how I came across an article by Andre Klein. The article described a way to translate text from KCR with a bookmarklet. The bookmarklet did the job, but to configure it you had to edit the code. Additionally, you had to click the bookmarklet before you could start using it.

I thought to myself: this should be an extension. A user could change the configuration without editing the code, and the translations would work without the need to click on the bookmarklet first. So the bookmarklet became a UserScript and I added a simple config window. In the original bookmarklet, selecting text would bring up a menu with a translate option. I made the translation window appear immediately after selecting the text. Just a little ergonomics.

I packed it up, published it on the Chrome Web Store, and commented on the original article. Andre was kind enough to add a link to the extension.

The first problems were soon to arrive. Amazon uses different subdomains for different regions. People in Mexico access KCR from https://leer.amazon.com.mx, but the ContentScript was set to run only on https://read.amazon.com/*. This is easy to fix: check in which countries the extension is downloaded and add the respective domain names. Canada, Japan, Australia, Mexico, Brazil, Germany, and many others.
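In the extension itself this list lives in the manifest's match patterns. As a rough illustration of the same allow-list idea, here is a minimal runtime check. The names `KCR_HOSTS` and `isKindleCloudReaderUrl` are hypothetical, and the list holds only the two hosts mentioned above; further regional hosts would be appended as users report them.

```javascript
// Hosts where the content script should run. Both entries come from the
// article; further regional hosts would be appended as users report them.
const KCR_HOSTS = [
  "read.amazon.com",
  "leer.amazon.com.mx",
];

// Returns true when `url` is an https page on one of the known KCR hosts.
function isKindleCloudReaderUrl(url, hosts = KCR_HOSTS) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch (err) {
    return false; // not a valid absolute URL
  }
  return parsed.protocol === "https:" && hosts.includes(parsed.hostname);
}
```

An exact hostname comparison is used here instead of a substring test, so a look-alike domain that merely contains "amazon" does not match.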

I was very happy to see the international reach, but I couldn't add every country in the world. Instead, I added a small note in the web store asking users to email me their domain in case the extension doesn't work.

The second problem was more elusive. At times the user script would run before the KCR code had finished loading. This would cause the method overrides to stop working. The proper way to solve this would be to use a callback. Unfortunately, such a callback was not easy to find. I ended up using a timer and checking every few seconds whether the KCR code was ready.
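The timer approach can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: `waitUntilReady` and its parameters are hypothetical names, and the readiness test is left to the caller because the article does not say which KCR object was checked.

```javascript
// Polls `isReady` every `intervalMs` until it returns true, then resolves.
// Rejects after `timeoutMs` so the script does not keep polling forever.
function waitUntilReady(isReady, intervalMs = 1000, timeoutMs = 60000) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const timer = setInterval(() => {
      if (isReady()) {
        clearInterval(timer);
        resolve();
      } else if (Date.now() - startedAt >= timeoutMs) {
        clearInterval(timer);
        reject(new Error("KCR did not become ready in time"));
      }
    }, intervalMs);
  });
}
```

A caller might write `waitUntilReady(() => typeof window.someKcrGlobal !== "undefined").then(installOverrides)`, where both `someKcrGlobal` and `installOverrides` are placeholders. The timeout is an addition for safety; the article only mentions checking every few seconds.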

Second Iteration: The troubles

One day it stopped working. The review tab is flooded. The uninstall graph spikes. The worst part: it still works on my machine. Fortunately, some of the users contacted me and helped with the initial debugging. Amazon had made some serious changes. Namely, all the methods I was overriding had been moved out of the global scope. I spent a whole weekend trying to get to the same methods. Alas, to no avail.

I tried a new approach. My thinking was: if the text is in the DOM (and I knew it was there) and the user selects it, I can find the selected text using the browser's selection API. The only problem is that KCR blocks text selection. As you might remember, copying text is also disabled by KCR. KCR does two things to prevent text selection: it overlays the text with a transparent div, and it changes the CSS property user-select.

After recognizing the problem, the solution is obvious. First, make the text lie on top.

$("#KindleReaderIFrame")
  .contents()
  .find("#kindleReader_content")
  .get(0).style.zIndex = 201;

Then change the CSS property user-select.

body.style.userSelect = "text";
body.parentElement.contentEditable = "true";

Finally, attach an onmouseup handler.

document.body.onmouseup = function () {
  let selectedText = document.getSelection().toString();
  // ... open translation window with selected text
};

You can see the full change in this commit.

Third Iteration: OCR

The empire strikes back. After a few quiet months, the translation is broken again. A quick check reveals a grim picture. The text is gone and Amazon displays only an image. They have gone this far just to prevent users from selecting text.

If it had been like this from the start, I would probably never have thought twice about it, but now I feel invested. They broke my extension, and it must be fixed. If the text is not there, the only option to extract it is OCR. A quick search yielded tesseract.js, a library for text recognition in JavaScript. They even had an example browser extension.
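A minimal sketch of the OCR step, under some assumptions: the helper name `extractText` is hypothetical, and the recognizer is passed in as a parameter so the sketch does not depend on how the library is loaded. It is expected to behave like `Tesseract.recognize` in recent tesseract.js versions, which takes an image and a language code and resolves to an object with the text under `data.text`. Older versions of the library returned results in a different shape.

```javascript
// Runs OCR on an image and returns the recognized text as a single line.
// `recognize` is assumed to have the shape
//   (image, lang) => Promise<{ data: { text: string } }>
// which matches Tesseract.recognize in recent tesseract.js releases.
async function extractText(image, lang, recognize) {
  const result = await recognize(image, lang);
  // Page images wrap lines; collapse line breaks and repeated spaces so the
  // translator receives one continuous passage.
  return result.data.text.replace(/\s+/g, " ").trim();
}
```

With the library loaded, a call could look like `extractText(canvas, "deu", Tesseract.recognize)`, where `"deu"` is the Tesseract language code for German and `canvas` stands for the cropped page image.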

All that is left is to find the selected text. For this, I used tracking.js. I had to modify the detection algorithm slightly to merge lines into one block.
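The article does not show the modified algorithm, so the following is only one plausible way to merge lines into one block: take the bounding box of all detected line rectangles. The function name `mergeRects` is hypothetical, and the `{x, y, width, height}` shape is assumed to match the rectangles that tracking.js reports.

```javascript
// Merges rectangles of the form {x, y, width, height} into the smallest
// rectangle that contains all of them. Returns null for an empty list.
function mergeRects(rects) {
  if (rects.length === 0) return null;
  let left = Infinity;
  let top = Infinity;
  let right = -Infinity;
  let bottom = -Infinity;
  for (const r of rects) {
    left = Math.min(left, r.x);
    top = Math.min(top, r.y);
    right = Math.max(right, r.x + r.width);
    bottom = Math.max(bottom, r.y + r.height);
  }
  return { x: left, y: top, width: right - left, height: bottom - top };
}
```

The merged rectangle can then be used to crop the page image before handing it to OCR, so that a multi-line selection is recognized in one pass.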

This solution is a bit slower but it is the best I could come up with to date. What do you think? Is there a more efficient way to achieve this?