
Blind and Visually Impaired Community

Full History - 2016 - 11 - 01 - ID#5ai8gp
6
I built an app for the blind (self.Blind)
submitted by kingoftheapes
It describes text, colors, and general information about an image.

Here is the link to the app.

I'd really appreciate it if you guys can try it out and give some feedback. I'll take the information and release another version with you guys in mind.
Marconius 2 points 6y ago
The object recognition isn't too accurate, and the colors aren't particularly useful to me. The text recognition, however, is quite good and could potentially give KNFB Reader a run for its money, at least in terms of character recognition. It would be great if the text were available for direct interaction with VoiceOver rather than just being a replayed audio clip.

The current voice is the unenhanced version of Samantha; it would be great if the app used the voice already loaded by VoiceOver, or allowed us to tweak the playback and feedback options. Also, open up all of the camera controls rather than just giving us a "take picture" button. Give us the ability to use the flash, switch between front and back cameras, etc.
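For reference, exposing those two controls on iOS is fairly contained. Here is a minimal sketch assuming an AVCaptureSession + AVCapturePhotoOutput pipeline; the class and property names are illustrative, not anything from the actual app:

```swift
import AVFoundation

// Illustrative sketch only: a front/back camera toggle and an explicit
// flash setting, assuming an AVCaptureSession + AVCapturePhotoOutput
// pipeline (session/output setup omitted). Names are not the app's real code.
final class CameraController {
    private let session = AVCaptureSession()
    private let photoOutput = AVCapturePhotoOutput()
    private(set) var position: AVCaptureDevice.Position = .back
    var flashMode: AVCaptureDevice.FlashMode = .off

    // Swap the active input between the front and back cameras.
    func toggleCamera() throws {
        position = (position == .back) ? .front : .back
        guard let device = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video,
                                                   position: position) else { return }
        let input = try AVCaptureDeviceInput(device: device)
        session.beginConfiguration()
        session.inputs.forEach { session.removeInput($0) }
        if session.canAddInput(input) { session.addInput(input) }
        session.commitConfiguration()
    }

    // Take a photo with whatever flash mode the user picked.
    func capturePhoto(delegate: AVCapturePhotoCaptureDelegate) {
        let settings = AVCapturePhotoSettings()
        if photoOutput.supportedFlashModes.contains(flashMode) {
            settings.flashMode = flashMode
        }
        photoOutput.capturePhoto(with: settings, delegate: delegate)
    }
}
```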

There is nothing in the interface that tells me how much the additional charges cost, and I decided not to go past the Touch ID confirmation until I knew how much you were charging. Be aware that TapTapSee is pretty much free nowadays and offers pretty much the same functionality at the same speed, although it relies on a human description method.
kingoftheapes [OP] 1 points 6y ago
Hey Marconius, thanks for the solid feedback.

When you say direct interaction, what are you thinking of? Would you prefer it if the text were laid out so you could touch over it, with some playback options?

Besides the camera toggling and flash, do you want any other stuff for customizing camera controls?

When you say 'human description method', what are you referring to?

Once again, thanks for trying it out. For your information, the price I charge is 100 shots for $1.99. For clarification, you can go past the Touch ID stage to see the pricing without agreeing to purchase, but I totally understand how one might be hesitant when they don't know how much it costs before putting in their Touch ID.
Marconius 1 points 6y ago
Yes, if any text is found, it would be great if it were presented in such a way that VoiceOver can interact with the characters and words, rather than having to constantly replay a long audio clip to find the right bit of text.

For the camera, those would be the only controls that I would need being fully blind, but providing a grid overlay or specific color filters to help the visually impaired set up their photo would also be a great help.

TapTapSee takes a photo and sends it to the Amazon Mechanical Turk interface, where someone out there quickly writes an image description and sends it straight back to you. It's very, very accurate since it's a human providing the description, but the description style can be inconsistent. It would also be great if we could choose what specifically gets described by your application, so that I'd be able to turn off color and object recognition and simply keep it as a text character recognition app. That would definitely make the app much more robust and useful for everyone across the visual impairment spectrum.
kingoftheapes [OP] 1 points 6y ago
Okay! Thanks for the clarifications.

I'll take your feedback into consideration when I'm making edits to the app. Hopefully it won't take too long to come out with some of the changes you've mentioned.

I'll keep you updated!
Marconius 1 points 6y ago
Here are some additional notes for version 1.5:

Nothing in the interface tells me which camera I have turned on or whether the flash is on or off. The controls you exposed are just buttons with no VoiceOver toggle states telling me what I have selected.

The text and overall presentation of the image scan is all one chunk of text. This should be broken up so it's easier to navigate with the line rotor or selection. There currently doesn't seem to be a way to save the results, meaning I have to use another charge and take another picture if I need to go back to something I've done previously.

One definite problem with the way the charges work is that since I am blind, I will be prone to making photographic mistakes and may end up wasting a bunch of charges trying to get the information I'm looking for. Perhaps I forget that I don't have a light on and end up wasting charges in the dark. If possible, exposing more of the iOS camera description features that happen while you are pointing and aiming your camera would be nice. Potentially use some form of recognition to see if the image is super blurry or way too dark before processing a charge?
kingoftheapes [OP] 1 points 6y ago
Hey Marconius,

Thanks for the review. I worked on a couple things today:

1. Better button state notification with VoiceOver. When you 'hover' with VoiceOver, it should tell you the flash mode and which camera you are using (rough sketch after this list).
2. I split out the text with breaks.
3. I put a brightness detector in the app, so when it's too dark it will now prevent you from taking a photo until luminosity levels are at a good point.
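For point 1, the gist is just keeping each control's accessibilityValue in sync with its state. A minimal sketch, assuming plain UIKit buttons; the names here are illustrative, not the app's real code:

```swift
import UIKit

// Minimal sketch for point 1: VoiceOver reads a button's label, then its
// value, so keeping accessibilityValue in sync yields e.g. "Flash, on, button".
// Button and property names are illustrative only.
final class CameraControlsViewController: UIViewController {
    let flashButton = UIButton(type: .system)
    let cameraButton = UIButton(type: .system)

    var flashOn = false {
        didSet { flashButton.accessibilityValue = flashOn ? "on" : "off" }
    }

    var usingFrontCamera = false {
        didSet { cameraButton.accessibilityValue = usingFrontCamera ? "front camera" : "back camera" }
    }

    override func viewDidLoad() {
        super.viewDidLoad()
        flashButton.accessibilityLabel = "Flash"
        cameraButton.accessibilityLabel = "Camera"
        // Set initial values so VoiceOver has something to read right away.
        flashOn = false
        usingFrontCamera = false
    }
}
```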

As for camera blur, the app handles focusing for you. In most cases it will not take a photo until focus is at a good point, but if you are intentionally making it blurry, for instance by shaking the phone, it will take a photo after a second or so, when it gives up on focus. Unfortunately, users will have to trust that the app is doing what it can to control focus and blur, and of course do their best to keep the camera still.
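Both checks can happen before a charge is spent. Here is a rough sketch of a pre-capture gate, assuming the preview frames' EXIF brightness value is a good enough darkness signal and that autofocus has settled once isAdjustingFocus is false; the threshold and the delegate wiring are assumptions, not the app's actual implementation:

```swift
import AVFoundation
import ImageIO

// Rough sketch of a pre-capture gate: reject the (paid) shot while the
// scene is too dark or the camera is still hunting for focus.
// The brightness threshold is an assumed value, not the app's.
final class CaptureGate: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private(set) var sceneIsBrightEnough = false
    private let minimumBrightness: Double = -1.0  // EXIF BrightnessValue, roughly "dim indoor light"

    // Called for every preview frame when this object is set as the
    // AVCaptureVideoDataOutput's sample buffer delegate.
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard
            let attachments = CMCopyDictionaryOfAttachments(allocator: nil,
                                                            target: sampleBuffer,
                                                            attachmentMode: kCMAttachmentMode_ShouldPropagate) as? [String: Any],
            let exif = attachments[kCGImagePropertyExifDictionary as String] as? [String: Any],
            let brightness = exif[kCGImagePropertyExifBrightnessValue as String] as? Double
        else { return }
        sceneIsBrightEnough = brightness > minimumBrightness
    }

    // Only allow the shot once the scene is bright enough and autofocus has settled.
    func canTakePhoto(with device: AVCaptureDevice) -> Bool {
        return sceneIsBrightEnough && !device.isAdjustingFocus
    }
}
```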

The update should be out in a couple of days :)

Cheers and thanks for the great feedback,
kingoftheapes [OP] 1 points 6y ago
Hey Folks /u/Marconius, /u/fastfinge,

I took some recommendations by you guys and did a refresh on the app. Some new features include:

- Camera toggling/flash controls
- Text view for the result. The app doesn't speak the results anymore; I'm going to lean more on VoiceOver to do that (rough sketch below).
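A minimal sketch of roughly what a read-only result view looks like, assuming a plain UITextView; the names are illustrative, not the app's actual code. With editing disabled, VoiceOver users can read the recognized text by character, word, or line via the rotor, and select or copy it:

```swift
import UIKit

// Sketch of a result screen backed by a plain UITextView.
// `recognizedText` stands in for whatever the OCR step returns.
final class ResultViewController: UIViewController {
    private let textView = UITextView()

    func show(recognizedText: String) {
        textView.text = recognizedText
    }

    override func viewDidLoad() {
        super.viewDidLoad()
        textView.isEditable = false          // read-only, but still selectable
        textView.isSelectable = true         // allows copy to the clipboard
        textView.font = UIFont.preferredFont(forTextStyle: .body)
        textView.adjustsFontForContentSizeCategory = true
        textView.frame = view.bounds
        textView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(textView)
    }
}
```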

[link to the updated app]

Cheers,
fastfinge 1 points 6y ago
This afternoon I spent my 10 free shots evaluating this app, as compared to The KNFB Reader. I didn't compare it with TapTapSee, because I don't use that app; I'm not comfortable with pictures I take getting sent to random humans to be described. While the results may be better, I value my privacy more than I value the best results.

First off, two features that The KNFB Reader (KNFB for short) has that Eye of Providence (EoP for short) does not are tilt guidance and field of view report. In order to keep the evaluation fair, I turned off these features most of the time while comparing KNFB and EoP. This is a huge deal because, for whatever reason, I am the world's worst photographer. I find it challenging to hold my phone straight and steady, and other than "kinda...hold the phone a little bit above the thingy..." I really have no concept of what gets in any given photo, or how focus works, or anything like that. When using KNFB, I generally just press field of view while moving the phone vaguely at random, until it tells me all four edges are visible. With EoP, I can't do that, and I really expected this to be a huge issue. But read on!

I started by taking pictures in low light conditions. Well...that's a really fancy way of saying "I was in a dark room and forgot to turn the light on before starting this review." I did warn you that I'm pretty terrible at all things camera-related! KNFB was able to recognize some text anyway, though it was less accurate than if I'd had the light on. EoP, on the other hand, just said all my photos were about "lens flare". I had the flash in KNFB set to "auto", so I assume that's why. It would be nice if EoP could detect low light conditions and use flash as well, or if not, at least notify me if the photos were too dark to be useful. Especially when each shot I take costs money, though an admittedly tiny amount, it'd be nice if at least some sanity checking could be done on the device, to make sure what's about to be sent to the API actually might contain something useful. I also need to note that in order to get text that was accurate enough to be helpful with KNFB, I had to re-enable tilt guidance and field of view report. In low light conditions with these features off, I couldn't get KNFB to detect any text at all.

But it's when I turned the light on that the counter-intuitive magic began happening! I started off by taking a picture of a warranty card for some Turtle Beach wireless headphones. On my first shot, EoP recognized all the text, including the web address required to register the product, nearly perfectly. In KNFB, without tilt guidance and field of view reports, I was able to get some mangled text, but even after multiple tries, I couldn't get accurate enough recognition to be used if I had actually wanted to register my product. Once I re-enabled those features in KNFB, I was able to get accuracy that matched EoP. But of course, this meant my usual method of taking 2 or 3 shots to get field of view right, and being extremely delicate in keeping the phone perfectly level. This already makes EoP an order of magnitude faster for recognizing text, for me. However, to add to that, EoP is quicker at uploading the photo and getting the results back from Google than KNFB is at doing OCR directly on my iPhone 6. This might not be the case for newer phones, though.

Another object I tried was the back of one of those small, slightly rounded containers that water flavouring additives come in. The idea is you squeeze it once into your bottle of water, for flavour. This has always been a weakness of KNFB, at least for me. I have huge problems getting it to recognize small print, on rounded or curved surfaces. EoP was able to read most of the nutritional information, on the first shot. No matter what I did, tilt guidance and field of view or not, I couldn't get KNFB to recognize any text at all. It's a really tiny bottle, about half the size of my palm, so it could be that I wasn't getting it framed properly. But that didn't seem to matter to EoP.

Lastly, I compared results on a plaque for an award that I won years ago. Again, EoP got all of the text, on the first try. KNFB without tilt guidance or field of view reports, was able to get quite a bit of the text, but with several mistakes. Using those features, of course, I was able to get equivalent results. Though again, the fiddling meant it took much longer to get usable results with KNFB.

The next surprising thing was object identification. Based on Google's tech demos, I expected that this would be really impressive, and a major win for EoP, seeing as KNFB doesn't have this feature at all. However, disappointingly, it wasn't. It was completely useless. It identified a can of deodorant as a computer mouse. When I took a picture of my Bluetooth keyboard, it read back the text on it ("Logitech K-10 Wireless Keyboard"), and then proceeded to identify it as a laptop. Even though it could read the nutritional information of the flavouring bottle, it still thought the important thing in that picture was my floor. Apparently most people just have random information about calorie and fat content written on the floor, I dunno. Sadly, this feature could be removed entirely, without making the app any less useful. If this would make the text recognition cheaper, I'd be all for it. As it is, might as well leave it in if we're already paying for it. But it really is useless.

Unlike what /u/Marconius said, though, I did find the colour recognition useful. I know the colour of my floor and bed, and it was right about both of those things. I've tried colour identification apps in the past that can't even manage that much!

However, all of the other issues /u/Marconius mentioned are right on the nose. In the case of the warranty card example from above, if I'd wanted to register my product, I'd have had to replay the text multiple times, until I'd typed the included web address into my computer. What I'd like is the text displayed on-screen, so I can review parts of it character by character with VoiceOver, or copy it to the clipboard. What would also be nice is a "share" button that would open up the share sheet, and let me send the recognized text to other apps. That way, I could email the recognized text to myself, and open the email on the computer to click the link.
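Both of those requests map onto stock UIKit pieces. A quick sketch, assuming the recognized text is already available as a string; the helper names are made up for illustration:

```swift
import UIKit

// Sketch of the "share" and clipboard ideas. UIActivityViewController gives
// Mail, Messages, Notes, etc. for free; UIPasteboard covers copy-to-clipboard.
// Helper names are made up for illustration.
extension UIViewController {
    func shareRecognizedText(_ text: String, from sender: UIView) {
        let activity = UIActivityViewController(activityItems: [text],
                                                applicationActivities: nil)
        // Needed on iPad so the share sheet has somewhere to anchor.
        activity.popoverPresentationController?.sourceView = sender
        present(activity, animated: true)
    }

    func copyRecognizedText(_ text: String) {
        UIPasteboard.general.string = text
    }
}
```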

TL;DR: EoP is faster and more accurate at text recognition if you find KNFB fussy about not tilting your phone, and you need to try multiple times to get the object you want to recognize perfectly in frame. If that isn't an issue for you, they're equally good. The other features the app offers aren't compelling enough to make it "better" than KNFB. I might be switching to EoP myself, but YMMV.
kingoftheapes [OP] 2 points 6y ago
Hey fastfinge, thanks for the detailed review. It always helps to have more opinions when making decisions about a feature.

I agree with you on having more features for the camera. I have a personal belief that less is more when it comes to making a product, so I weigh every feature very carefully. This is also v1. With your guys' feedback, I'll build a v2.

I'm definitely thinking about taking the object recognition out, or at least making it toggleable, so that people don't have to hear unnecessary information all the time. However, the technology will improve over time; hence the term 'machine learning'.

I wanted to ask you about the text interaction. Like Marconius said, do you also think it would be useful to have an interactive text box with all the text in it?

So far, the two most important things are having written text that you can read with VoiceOver, and having more enhanced camera features.

Thanks again

[Edit: You already answered my question. I just didn't catch it. Thanks again.]
fastfinge 2 points 6y ago
With more camera features and an interactive text box displaying the text, your app really would have a shot at being considered the best on the market. As I said in the TL;DR section of my review, though, that's pretty subjective and depends a bit on the type of user. For example, if I were reading a book, I'd still stick with KNFB. Your pricing structure means reading a few 300-page books for class could get fairly expensive. I'm not asking you to lower the price, really; I know the Google API costs money. I'm just pointing out that one app can't satisfy every single user all the time. :-)
fastfinge 1 points 6y ago
Looks interesting! Is the tech it's using something you built yourself, or are you using the Microsoft or Google APIs?
kingoftheapes [OP] 2 points 6y ago
Initially, I tried to bake my own algorithm. Then I got sad and used Google's Vision API.
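For anyone curious what that involves: Google's public images:annotate endpoint takes a base64-encoded image plus a list of requested features. A rough sketch of the call; the API key and the exact feature set are assumptions, not taken from the app:

```swift
import Foundation

// Rough sketch of a Cloud Vision images:annotate request, asking for OCR,
// labels, and dominant colors -- roughly the text / object / color trio the
// app reports. The API key handling here is hypothetical.
struct VisionClient {
    let apiKey: String  // hypothetical; supply your own key

    func annotate(imageData: Data,
                  completion: @escaping (Result<Data, Error>) -> Void) {
        let url = URL(string: "https://vision.googleapis.com/v1/images:annotate?key=\(apiKey)")!
        var request = URLRequest(url: url)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")

        let body: [String: Any] = [
            "requests": [[
                "image": ["content": imageData.base64EncodedString()],
                "features": [
                    ["type": "TEXT_DETECTION"],
                    ["type": "LABEL_DETECTION", "maxResults": 5],
                    ["type": "IMAGE_PROPERTIES"]
                ]
            ]]
        ]
        request.httpBody = try? JSONSerialization.data(withJSONObject: body)

        URLSession.shared.dataTask(with: request) { data, _, error in
            if let error = error { completion(.failure(error)) }
            else if let data = data { completion(.success(data)) }
        }.resume()
    }
}
```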
fastfinge 2 points 6y ago
That's what I was hoping to hear! Google and Microsoft have the best APIs in the business for this, but they're so new that we haven't yet had many apps that take advantage of them other than as a tech demo.
kingoftheapes [OP] 2 points 6y ago
Thanks. I've come to accept that I won't ever bake an algorithm as good as Google's. They've done an amazing job with the vision platform.