Blind and Visually Impaired Community

r/blind

Full History - 2017 - 08 - 07 - ID#6s6x65

Automatically caption images with a Chrome extension (abhinavsuri.com)

submitted by julian88888888

johnathanjones1998 2 points 6y ago

Hi there. I'm the author of this extension. If any of you have advice on how to make it more accessible to visually impaired or blind users, please let me know! My testing was basically limited to the screen reader that came with my MacBook.

fastfinge 1 points 6y ago

With $1 on Windows, pressing Enter doesn't seem to remove the overlay. Also, both NVDA and the extension read out the caption. If it were me, I would probably set the alt text of the image the user right-clicked on to the returned caption, then just play a small sound to signal that the work is done. That way, the user can review the caption with their regular screen reader commands, without losing their place in the web page or having to listen to a third-party voice reading it.
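A minimal sketch of that alt-text idea, not the extension's actual code. `ImageLike`, `applyCaptionAsAlt`, `captionDone`, and the `playCue` callback are all hypothetical names made up for illustration; in a browser, a real `HTMLImageElement` would fit the `ImageLike` shape.

```typescript
// Hypothetical minimal shape of an image element (assumption for this sketch).
// A real HTMLImageElement has a writable `alt` property and satisfies it.
interface ImageLike {
  alt: string;
}

// Write the returned caption into the image's alt text so the user's own
// screen reader announces it. Returns true only if the alt text was changed.
function applyCaptionAsAlt(img: ImageLike, caption: string): boolean {
  const text = caption.trim();
  if (text === "") {
    return false; // empty caption: keep whatever alt text the page already had
  }
  img.alt = text;
  return true;
}

// Hypothetical completion cue: the caller supplies whatever plays the sound,
// so this function does not assume any particular audio API.
function captionDone(img: ImageLike, caption: string, playCue: () => void): void {
  if (applyCaptionAsAlt(img, caption)) {
    playCue();
  }
}
```

Because nothing is spoken by the extension itself, the caption is read only once, by the screen reader, when the user navigates back to the image.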

Also, if the image contains text, your extension doesn't seem to indicate that at all. It would be useful, even if it couldn't do OCR, if it returned something like "Also contains text". Many screen readers (NVDA and JAWS at least) have a command to perform OCR on an image. However, that doesn't describe the image at all; it just recognizes text. So, for example, the first image on $1, when I OCR it with NVDA, returns:
> Testimonial
>
> This is a test of the testimonial section.
>
> -- John Doe, US

When I ask your extension for a caption, it returns:
> a close up of a person holding a wii remote

I have no idea where it's getting that; I'm totally blind myself. Nevertheless, the point of the image is the text. So a hint that the user should try OCR would be useful.
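A rough sketch of how such a hint could be appended to a caption. This assumes some separate text detector supplies a `containsText` flag, which nothing in this thread says the extension has; `CaptionResult` and `describeImage` are hypothetical names.

```typescript
// Hypothetical result shape: the captioner's sentence plus a flag from a
// separate text detector (an assumption, not a known feature of the extension).
interface CaptionResult {
  caption: string;
  containsText: boolean;
}

// Build the string the user hears: the caption, followed by an OCR hint
// whenever text was detected in the image.
function describeImage(result: CaptionResult): string {
  const caption = result.caption.trim();
  if (!result.containsText) {
    return caption;
  }
  const hint = "Also contains text; try your screen reader's OCR command.";
  // If the captioner returned nothing, the hint alone is still useful.
  return caption === "" ? hint : `${caption}. ${hint}`;
}
```

With the example above, the user would hear the (wrong) scene caption followed by the hint, which is enough to know that OCR is worth trying.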

johnathanjones1998 1 points 6y ago

Hey fastfinge,
I'll definitely look into that Enter key issue. I'm not sure whether I will remove the overlay entirely, since I do have friends who are visually impaired but not so impaired that they need a screen reader yet. Regardless, do you think a "click to remove overlay" would work? Or do you think it should be a configurable preference (i.e. "show overlay" or "do not show overlay")?

Also, the reason it was all funky on images with text is that the neural network was trained only on images of scenes and common objects, not text. I'll look into setting up an OCR neural network as well.

fastfinge 1 points 6y ago

> do you think it should be a configurable preference

This is probably the way to go. If it's a preference, those of us who are totally blind can just turn it off. That way, we won't have to worry about losing our place in the web page when we dismiss the overlay.
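A small sketch of what that preference could look like, assuming a single boolean setting. `CaptionPrefs`, `showOverlay`, `defaultPrefs`, and `choosePresentation` are hypothetical names; the extension's real options are not shown in this thread.

```typescript
// Hypothetical settings object for the extension (assumption for this sketch).
interface CaptionPrefs {
  showOverlay: boolean;
}

// Default keeps the overlay on for low-vision users who rely on it.
const defaultPrefs: CaptionPrefs = { showOverlay: true };

type Presentation = "overlay" | "alt-text-only";

// Decide how to present a caption from whatever was saved; a missing or
// undefined setting falls back to the default.
function choosePresentation(saved: Partial<CaptionPrefs>): Presentation {
  const showOverlay = saved.showOverlay ?? defaultPrefs.showOverlay;
  return showOverlay ? "overlay" : "alt-text-only";
}
```

With the overlay turned off, there is nothing to dismiss, so focus never leaves the page.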

> setting up an OCR neural network

Neat! You could also use existing APIs, like ocr.space. Though I suspect you have the skills to just do it yourself, and that might end up being cheaper and faster.
