HUMBLECAT.ORG

Blind and Visually Impaired Community

Full History - 2021 - 05 - 01 - ID#n2q6zq
3
Tools for searching the screen with limited vision / tunnel vision (self.Blind)
submitted by EffectiveYak0
Because I have tunnel vision, I find that trying to figure out where something lives on a screen is a real pain. I tend to toggle my screen reader on and off depending on what I am doing, so I don't always have a good starting point to cycle through. I like being able to use my limited vision for certain things, and then use the screen reader for reading long text.

What I really need is something like a search function that spawns an arrow from one of the corners of the screen so I can follow it and find things more easily. Does anyone know of anything like this? Maybe as a browser extension?
zersiax 3 points 2y ago
Take this with a grain of salt please, I am fully blind myself and always have been, but what is stopping you from trying to do more with a screenreader than just reading longform texts? What you are trying to do, to my limited ability to imagine it at least, sounds like it would cause headaches quite often and at that point, to me, the choice between straining your eyes to find stuff on the screen or using a screenreader which might have tools to help you up to a point is an easy one.

As for tools that do this ...I'm honestly not sure. How would you tell the arrow, as it were, where to point? Maybe some of the magnifiers out there have an option like this? ZoomText, Fusion, MaGiC, Windows Magnifier, ImmersiveReader etc to name a few :)
EffectiveYak0 [OP] 2 points 2y ago
So when I first lost my vision I was completely blind for four months. I found that using the screen reader all day was pretty exhausting. Maybe it's just a matter of building up stamina, but I really struggle listening to it all day. I work in tech, and if I can use the limited vision I have to some degree I want to try and preserve my mental capacity throughout the day.

I might have to build a tool like this myself. Finding text on the screen is probably fairly straightforward. Trying to figure out graphical elements is going to be nontrivial. I might just have to stick to toggling on and off as needed.
zersiax 1 points 2y ago
Hmm ...just regular old ctrl+f in a browser doesn't cut it, I take it? Not sure if that scrolls it into view properly, it never really works well for screenreaders to my knowledge, which is why a lot of screenreaders just roll their own.

I am a developer myself, so I absolutely understand trying to conserve concentration to the important things :)
EffectiveYak0 [OP] 1 points 2y ago
Nope unfortunately not. My visual field is too distorted to distinguish the highlighted portion. I mean, I can tell there is something on the screen by the blob of color for a window, but I can't make out details at all without looking directly at something.

But if I just had a better way to distinguish what is highlighted I think it would help a lot. The way I use a mouse, for example, is that I have a toggle that turns the pointer into a giant red cross that I can wiggle to find on the screen. I need some type of big pointer that I can identify with my peripheral vision and then home in on with my limited central vision. Hope that makes more sense.
zersiax 1 points 2y ago
It does :) It asks a lot of my powers of imagination but I think I get it :)
Can't really help you with the visual part of things, but what I can tell you is that some screenreaders have ways to highlight what is currently focused, and that there are hotkeys to, say, jump to the nearest heading, form field or graphic. I'm not sure if any of them might give you just the right type of highlight for your peripheral vision to pick up, but it's worth a quick search :) That's unfortunately where my usefulness runs out though; this is far outside my area of expertise :)
Rethunker 1 points 2y ago
Can you give some examples of things you need to find most often on screen? Or is it just about anything?

OCR tools to find text will take a bit of work, but it’s doable. But for text would you be looking primarily for window and dialog titles? Or any arbitrary text anywhere?

Identifying graphics can be a more difficult task in some ways, but if you narrow the scope a bit the work becomes more feasible. (I work in image processing, by the way.)

The more narrowly you can define the functionality, the more likely the service or app will be satisfactory. That’s obvious to state, but important to keep in mind because recognition via image processing can easily lead you down a rabbit hole of trying to solve overgeneralized problems.

What minimal implementation would convince you that the software will do what you need?
EffectiveYak0 [OP] 2 points 2y ago
Most of the time I'm just looking for some specific text. I want a faster way to find things. I can't see enough of the screen to figure out what is highlighted. Dialog boxes are still a constant point of annoyance, but eventually I figure out something popped up somewhere either by scanning long enough or toggling on my screen reader and tabbing through windows.

I think most useful would be text search with an arrow pointing to the match coming from a predictable location that I could somehow toggle once I have found where the match is located.
Rethunker 1 points 2y ago
Microsoft has an OCR sample that might get you started, if you're a developer. Here's the page:

$1

To quote directly from that page, the sample code shows how to do the following:

>Scenario 1: Load image from a file and extract text in user specified language.
>
>1. Determine whether any language is OCR supported on device.
>
>2. Get list of all available OCR languages on device.
>
>3. Create OCR recognizer for specific language.
>
>4. Create OCR recognizer for the first OCR supported language from GlobalizationPreferences.Languages list.
>
>5. Load image from a file and extract text.
>
>6. Overlay word bounding boxes over displayed image.
>
>7. Differentiate vertical and horizontal text lines.

To feed the OCR recognizer you'd need a screenshot, for which you should also find code in Microsoft docs or on StackOverflow.

Given the text and its bounding boxes, there are some additional problems to solve.

Setting focus on your app or service will change the focus for input, which could cause other windows to be hidden. Ways that you could launch your app would include double-clicking from the desktop, using the task switcher to activate your app, and perhaps even accessing the functionality as a service through some key combination. Each one of these could have an effect on what other windows appear on screen, and how those windows are ordered.

Doing an exact word search would be a good start, but then the results would need to be ordered by some logic if there are multiple matches. That could mean having to select from a list of results before your arrow points somewhere.
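
A minimal sketch of that search-and-order step, assuming the OCR stage has already produced a list of words with pixel bounding boxes. The `WordBox` type, the `find_matches` function, and the `line_tolerance` parameter are hypothetical names made up for illustration; they are not part of the Microsoft sample. Ordering by rough reading order (top to bottom, then left to right) is only one possible choice of "some logic".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordBox:
    """One recognized word and its bounding box in screen pixels (hypothetical type)."""
    text: str
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int


def find_matches(words, query, line_tolerance=10):
    """Return exact, case-insensitive matches in rough reading order.

    Words whose top edges fall in the same `line_tolerance`-pixel band are
    treated as being on the same line, then sorted left to right.
    """
    needle = query.casefold()
    hits = [w for w in words if w.text.casefold() == needle]
    return sorted(hits, key=lambda w: (w.y // line_tolerance, w.x))


if __name__ == "__main__":
    sample = [
        WordBox("Save", 400, 300, 40, 12),
        WordBox("save", 50, 12, 40, 12),
        WordBox("Cancel", 10, 12, 50, 12),
    ]
    for hit in find_matches(sample, "save"):
        print(hit)
```

Bucketing by a fixed pixel band is crude; words that straddle a band boundary can end up on different "lines", so a real tool would probably group by the line structure the OCR engine reports instead.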

Initially you might only want to find text currently visible, but eventually you may want a way to find text in some window that's currently hidden or partially obscured by another window. Then you might programmatically trigger the task switcher to cycle through windows, take a screenshot of the rectangle of each app window as it appears in front, run OCR on that screenshot, etc.

Having your arrow appear in the center of the screen seems a good start, whereas if the arrow were on an edge or in a corner it could be harder to follow to the target accurately. But I can also imagine that starting at some fixed point at the edge of the screen might be useful in other cases.

If you're going to draw an overlay graphic anyway, then I would suggest that, in addition to the arrow, you include some indication of how far away the text is. You could even show a faint, pulsing line connecting the arrow head to a box surrounding the targeted word. All those graphics may seem like overkill, but they would limit the effort and thinking needed to find the word being pointed at.
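
The geometry behind the arrow and the distance indicator is small. Here is a rough sketch, assuming a fixed anchor point (screen center or a corner) and a target bounding box given as `(left, top, width, height)` in pixels. The function name `arrow_to_target` is hypothetical, and drawing the overlay itself is left out.

```python
import math


def arrow_to_target(anchor, box):
    """Return (angle_degrees, distance_pixels) from an anchor point to a box center.

    anchor: (x, y) point where the arrow starts, in screen pixels.
    box:    (left, top, width, height) of the matched word.

    Screen coordinates have y growing downward, so a positive angle means the
    arrow rotates clockwise from pointing straight right.
    """
    center_x = box[0] + box[2] / 2
    center_y = box[1] + box[3] / 2
    dx = center_x - anchor[0]
    dy = center_y - anchor[1]
    angle = math.degrees(math.atan2(dy, dx))
    distance = math.hypot(dx, dy)
    return angle, distance


if __name__ == "__main__":
    # Arrow anchored at the top-left corner, word box centered at (300, 400).
    angle, distance = arrow_to_target((0, 0), (290, 390, 20, 20))
    print(f"rotate arrow by {angle:.1f} degrees, target is {distance:.0f} px away")
```

The distance could drive the arrow's length, a numeric label, or the pulse rate of the connecting line, whichever is easiest to pick up with peripheral vision.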

If you've followed the arrow to a word and found it wasn't the instance you wanted, then you'd need some way to point to the next instance. In that case you might repoint the arrow and/or give directions from the current instance.
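
One way to sketch the "next instance" interaction, assuming the matches are already available as a list of center points in the order they should be visited. `MatchCycler` and `direction_hint` are hypothetical names used only for this illustration.

```python
class MatchCycler:
    """Steps through match center points, wrapping around at the end."""

    def __init__(self, centers):
        self.centers = list(centers)
        self.index = 0

    def current(self):
        """Return the current match center, or None if there are no matches."""
        return self.centers[self.index] if self.centers else None

    def advance(self):
        """Move to the next match (wrapping to the first) and return it."""
        if not self.centers:
            return None
        self.index = (self.index + 1) % len(self.centers)
        return self.current()


def direction_hint(current, target):
    """Describe where `target` lies relative to `current`, e.g. 'down-left'.

    Both arguments are (x, y) screen points; y grows downward.
    """
    dx = target[0] - current[0]
    dy = target[1] - current[1]
    vertical = "down" if dy > 0 else "up" if dy < 0 else ""
    horizontal = "right" if dx > 0 else "left" if dx < 0 else ""
    return "-".join(part for part in (vertical, horizontal) if part) or "here"


if __name__ == "__main__":
    cycler = MatchCycler([(100, 100), (50, 200), (400, 200)])
    here = cycler.current()
    there = cycler.advance()
    print(f"next match is {direction_hint(here, there)} of the current one")
```

The hint string could be spoken by the screen reader or shown next to the repointed arrow, so the jump to the next match starts from a place that has already been found.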


In summary, I think that with a bit of work you'll be able to get OCR to do the work you need, but that you would end up spending most of your time working through various scenarios and trying to make the interaction simple.
©2023 HumbleCat Inc   •   HumbleCat is a 501(c)3 nonprofit based in Michigan, USA.