Detecting screen readers in analytics: Pros and cons


This article explores both sides of a longstanding debate: should screen readers be detectable in analytics? It’s an issue that pits accessibility against privacy, with legal complications thrown in for fun. When these aspects are taken into consideration, screen reader detection emerges as a technique which carries serious technical, ethical and privacy risks.

Analytics

Designers and developers live for analytics data. We can spend hours digging into fascinating insights on everything from shopping cart abandonment to social media conversions to browser use. Ignoring the insights residing within analytics can jeopardize a project; I learned this the hard way when I sat down to demo a client’s shiny new site on their office computers and discovered that their corporate network was still hard-wired to the dreaded Internet Explorer 8. Their old browser was not their fault, but my failure to know they were using it was very much mine.

For most of us, analytics create the blueprint for our work: we use visitor data to determine where to focus our attention, how to present information, and which audiences merit our time. Wouldn’t it logically follow, then, that developers might pay closer attention to the accessibility needs of their audiences if they had real-time statistical data about assistive technology use? After all, analytics data causes us to devote hours to fallbacks and workarounds for older browsers. Wouldn’t our time be better spent meeting the everyday accessibility needs of our real audiences? What if, for example, we could access data about screen reader usage in the same place as statistics about Chrome and Safari?

As with most things regarding accessibility, the question is not quite as simple as that. The prospect of tracking screen reader usage in analytics raises a host of technical, ethical, and privacy questions which have no easy answers. Fortunately, each aspect of the issue has been admirably covered by members of the accessibility community, and in this piece, we’ll bring them all together, along with a look at what the law says, to create an informed opinion on whether the technique is one worth considering.

Technical science-fiction

At the outset, it’s important to note that the prospect of tracking screen readers in analytics is largely theoretical. It is not technically possible to detect screen readers in the same ways we can detect browser version or screen resolution. This is because screen readers do not provide user agent strings like regular browsers. As accessibility specialist Chris Maury explained, “The user agent, which contains all the information about the OS, browser, etc, is set by the browser. Screen readers and screen magnifiers are software that live on top of the browser… Bridging this gap will take cooperation between the major screen readers and browsers, which may be difficult as there is no direct benefit to either.”
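To make the point about user agent strings concrete, here is a minimal sketch of the kind of parsing an analytics tool might do. The sample string and the tiny pattern tables are illustrative assumptions, not any real product's parser. The string names a browser and an operating system, and it contains no field that a screen reader running on top of the browser could fill in.

```python
import re

# Illustrative user agent string for Firefox on Windows (sample value).
SAMPLE_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
    "Gecko/20100101 Firefox/115.0"
)

# Hypothetical, deliberately tiny pattern tables; real parsers cover far more.
BROWSER_PATTERNS = [
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("Internet Explorer", re.compile(r"MSIE (\d+)")),
]
OS_PATTERNS = [
    ("Windows", re.compile(r"Windows NT")),
    ("macOS", re.compile(r"Mac OS X")),
    ("Android", re.compile(r"Android")),
]


def summarize_user_agent(ua: str) -> dict:
    """Return the browser, major version, and OS named in a user agent string."""
    summary = {"browser": None, "version": None, "os": None}
    for name, pattern in BROWSER_PATTERNS:
        match = pattern.search(ua)
        if match:
            summary["browser"] = name
            summary["version"] = int(match.group(1))
            break
    for name, pattern in OS_PATTERNS:
        if pattern.search(ua):
            summary["os"] = name
            break
    # The summary has no assistive technology entry because the string
    # carries no such token: the browser writes it, not the screen reader.
    return summary


print(summarize_user_agent(SAMPLE_UA))
```

Whether or not a screen reader is running, the browser sends the same string, so this summary is all that user agent parsing can recover.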

Screen readers, of course, are just one form of assistive technology. Many visually impaired web users use alternative methods on top of existing software, such as toolbars and screen magnifiers, which would not register as screen readers. Still other visually impaired users browse the web through specialized assistive hardware which, like any given make or model of laptop, cannot be detected on its own.

Even if screen readers were detectable as standard, the technique would not provide an accurate picture of how users interact with a web page. “Attempting to second guess how people actually interact with a site is a dangerous game,” warns accessibility expert Graham Armfield. “Everyone is different and if you went out to try to tailor the site, you might not be responding to how someone actually does use a website.” For example, one screen reader user debating the issue noted “none of this [tracking] will track my most common failures. When Dragon or my keyboard try to access an actionable link and fail because it has no role or tab index, that’s not going to be tracked.”

Screen reader detection would also be a security risk. Mozilla accessibility engineer Marco Zehe wrote “there have been repetitive inquiries about exposing whether a screen reader is running with the browser, and if so, also granting access to the accessibility APIs to content JavaScript to manipulate what screen readers read to the user… Granting a web site access to these APIs would open a level of access to a computer that is just unacceptable, and unnecessary, for a web site to have.”

What do we know about screen reader users?

In the absence of real-time analytics data on screen reader use, the main source of actionable data is WebAIM’s periodic survey of screen reader users. The most recent survey was carried out in July 2015 and received responses from over 2,500 screen reader users around the globe.

The survey indicated eleven popular combinations of screen reader and browser (JAWS with Internet Explorer, NVDA with Firefox, Window-Eyes with Internet Explorer 10+, and so forth) as well as hundreds of other combinations. There are also five popular combinations of mobile screen reader and browser usage. To complicate the mix further, 53% of desktop users and 13% of mobile users reported using more than one screen reader. Among screen reader users, Internet Explorer usage across all versions remains substantially higher (53.5%) than it does in the general population.

An interesting fact noted in the survey was that 18% of screen reader users had not updated their software in the past year. With some specialist screen readers costing hundreds of pounds, upgrading is not always an option, leaving users stuck with older software which has not evolved with the web.

The WebAIM data reveals several problems with the prospect of screen reader detection. The first is that detection would not add one line of data to the mix, but dozens. Collecting an ostensibly useful chunk of data, only to find you are actually dealing with hundreds of smaller pieces of information, simply defeats the purpose. Another problem is that the insights offered by detection will be difficult to identify as strictly accessibility issues; rather, they will appear as the same tired issues of browser compatibility snags which have bedeviled web developers for a decade. Finally, when the accessibility issues are caused by a user being stuck with an expensive older screen reader, there is nothing the developer can do to fix that.

Mozilla collects some basic and anonymized data about screen reader usage in the Firefox browser on its Mozilla Telemetry site; however, this data can be erratic and lacks useful context. Some real-time tracking is also possible through mobile apps. For example, in Android and iOS, it is possible to pass back a value indicating that the user is running the screen reader function. These assistive functions reside in the operating system of the mobile device rather than as separate standalone software. However, utilizing these options raises obvious privacy concerns.

Screen reader detection and privacy

The most troubling implication of tracking screen readers was eloquently described by blind developer Leonie Watson, who said the debate “isn’t really about screen reader detection…What is really being discussed is disability detection, and that is a very different thing altogether.” Identifying a screen reader user through analytics is, in most cases, identifying a previously anonymous individual as a person with a disability. That part of their identity, and whether they wish it to be detectable online, is and always should be a matter of their personal choice. Marco Zehe, another blind developer, was equally passionate in saying that tracking “would take away the one place where we as blind people can be relatively undetected without our white cane or guide dog screaming at everybody around us that we’re blind or visually impaired, and therefore giving others a chance to treat us like true equals.”

Indeed, it’s a measure of the delicate privacy territory we are treading on that in a lengthy GitHub thread discussing the issue, only two accessibility developers were willing to publicly admit that they have engaged in basic AT tracking. Let’s be clear here: their intentions in doing so were completely ethical and professional. That of course cannot be said for everyone. Many developers are justifiably convinced that analytics data on people with disabilities could be used as a form of soft discrimination. Analytics expert Marissa Goldsmith paints a scenario where “if screen readers were easy to track, it would not take much for an organization to see that a screen reader user in a certain city looked at a job application. They could then compare it to incoming CVs and opt not to call them in for an interview.” The disability, of course, would not be cited in the job rejection.

Screen reader detection also raises the ever-present issue of third-party data sharing. It is not hard to imagine a scenario where a health insurance company used analytics data indicating the use of assistive technology to raise a user’s premiums. Ad networks advertising through social media sites would automatically be passed data about users which could include additional personally identifiable details. And in a political environment which has all but declared war on people with disabilities, one could easily imagine a user with multiple complex disabilities having their benefits cut because the browser stats on a mandatory government web site did not detect assistive technology in use.

The argument for screen reader detection

Despite those active privacy concerns, some blind and visually impaired users feel the trade-off is a price worth paying. One of them is Amanda Rush, a blind web developer who is strongly in favor of screen reader detection. “I realize that there’s this enshrined idea of people with disabilities not having to disclose information about their disability,” she says, “but I also think that, if we’re going to move web accessibility forward, and get more developers to go along with it, we need to start providing some actual data to work with, not just the broad statistics we trot out during every ‘why accessibility is important’ conference talk.”

Rush notes that screen reader users provide unique technical insights beyond everyday visitor stats. “We can require some very significant time and training, because we’re not just dealing with plain HTML, we’re dealing with PHP, javascript, and we still need to take things like performance, translation-readiness, and actual project goals into consideration. Oh, and security too.” She feels this makes them uniquely poised to provide practical, actionable data for developers.

She is not alone. Some accessibility developers feel that exempting screen reader users from detection is itself a form of discrimination. “We analyze user behavior, and make improvements to the experience based on that data,” said one programmer. “Because screen readers are a black box, we can’t address the usability in the same manner.” In an industry which lives or dies on the data at hand, exempting information about real-life disability use from consideration is, to an understandable extent, cutting off your nose to spite your face.

The argument against screen reader detection

While visually impaired users like Amanda are comfortable trading some privacy for the possibility of a more accessible web, others note problems that are worth considering.

Outwith privacy issues, the strongest argument against the technique, explains Armfield, is that “in the scheme of things, blindness is a smaller section of the number of people who can be affected by web accessibility issues.” An emphasis on screen reader detection for the visually impaired would perversely exclude the needs of web users with other disabilities. For example, a woman with motor neurone disease using a Tobii Eyegaze system to browse the web is every bit as disadvantaged by popups and overlays as a blind user, despite having full visual acuity. Tracking only one form of disability risks creating a slippery slope where poor web practices are deemed acceptable for users with some disabilities but not others.

Furthermore, Adrian Roselli describes a scenario where the low numbers which would be detected through screen reader analytics would be used not to improve accessibility for the visually impaired, but to justify eliminating it. “Just as web devs would build features that blocked, say IE6, and then turn around to point out that IE6 usage had dropped on their sites, we can expect to see the same thing happen here,” he said. Amanda Rush observed that in a corporate environment, “the question of how many disabled people are using my website, because you’re asking me to allocate resources, becomes part of the cost-benefit analysis,” yet in an era of swingeing budget cuts, screen reader data could indeed be misused to rationalize removing the job altogether.

Another problem with the technique is that it would generate a considerable amount of bad data. While the use of a screen reader is a likely indication of a visual impairment, it is not proof on its own. For example, like many designers, I use the NVDA screen reader to test build sites for clients, and I have better than 20/20 vision; tagging me as a screen reader user, with all the anonymized analytics data my visit would involve, would distort the true picture. Indeed, 13% of respondents to the latest WebAIM survey indicated the use of screen readers for a disability other than a visual impairment - common ones include dyslexia and autism - but these users would still be labelled as blind or visually impaired in any tracked analytics. (Imagine that bad data being passed to a third party!)

What about the legal implications of detecting screen reader use? One question is whether analytics data indicating the use of a screen reader would qualify as medical data, a category rightfully afforded protections above and beyond normal data protection and privacy rules. As screen reader use is not in itself proof of a disability, it is highly unlikely that the resulting analytics would qualify as such.

However, legal issues still remain. For example, would detecting screen reader usage bring web sites into violation of HIPAA, the US privacy law for medical data? It certainly could be construed as a violation if the web site or app contained personal health information (PHI) which was being passed back to the server along with the analytics data. To be safe, any page collecting or displaying identifiable medical information, such as an electronic medical record, should not contain any calls to outside services such as analytics or social media. By that rule, all this being a theoretical discussion, it would still be possible to track screen reader usage as a general visitor trend on a web site’s landing pages without putting personal medical data at risk of being identified or passed to a third party. Additionally, any resulting visitor statistics - for example, a screen reader user from Michigan viewed these pages about disability benefits - would need to be separated from identifiable personal information and anonymized.
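The separation described above can be sketched in a few lines. This is a hedged illustration only: the field names, the allow-list, and the minimum group size are assumptions made for the example and do not reflect any real analytics product. Dropping fields and suppressing small groups also does not amount to legally sufficient anonymization on its own.

```python
from collections import Counter

# Hypothetical allow-list: only coarse, non-identifying fields are kept.
ALLOWED_FIELDS = ("region", "page_section")


def strip_identifiers(event: dict) -> dict:
    """Keep only allow-listed fields; user IDs, IP addresses, etc. are dropped."""
    return {field: event.get(field) for field in ALLOWED_FIELDS}


def aggregate(events: list, minimum_count: int = 5) -> dict:
    """Count visits per (region, page_section), hiding groups below the threshold."""
    counts = Counter(
        (clean["region"], clean["page_section"])
        for clean in map(strip_identifiers, events)
    )
    # Small groups are suppressed so a lone visitor cannot be singled out.
    return {key: n for key, n in counts.items() if n >= minimum_count}


# Made-up sample events for illustration.
events = [
    {"user_id": i, "ip": "203.0.113.7", "region": "Michigan",
     "page_section": "disability-benefits"}
    for i in range(5)
]
events.append({"user_id": 99, "ip": "203.0.113.9", "region": "Ohio",
               "page_section": "disability-benefits"})

print(aggregate(events))
```

In this sample the five Michigan visits are reported as one aggregate count, while the single Ohio visit is withheld because it falls below the threshold.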

While legal questions have simple solutions on medical sites, they are less clear on sites such as social networks. One indication of the way the proverbial wind is blowing lies in the GDPR’s protections for health and medical data. If any case could be made for screen reader use merely indicating a health condition, the risks under the new GDPR regime are enough to warn any prospective user off the technique.

In conclusion

So what’s the consensus? It would be truly wonderful if detecting anonymized screen reader usage in analytics provided a wealth of actionable data for conscientious web developers. By having non-identifiable screen reader statistics close at hand, developers would find it harder to ignore the everyday accessibility needs of their audiences too.

However, when all the aspects of the issue are taken into consideration, screen reader detection emerges as a technique which carries serious technical, ethical and privacy risks. Even when an individual developer’s aims are true, the risk of the collected data being misused by others, be they advertisers, social networks, or governments, is overwhelming. Detecting screen readers - which, after all, means identifying people with disabilities - also raises privacy concerns which only stand to grow as data protection obligations evolve. Finally, even if it were readily available, the collected data would not in fact provide the easy or actionable insights developers need. Added together, these factors create a gray area of risk, liability, and inconvenience which outweighs any good that the insights gained from the analytics might provide. On balance, then, the technique is sadly not worth adopting.

That conclusion leaves us exactly where we started: without analytics data, what is the best way to gain insights into the ways people with disabilities use the web? How can we, as designers and developers, better serve the accessibility needs of our audiences? The answer is as it always has been: just ask them. The answers those everyday users provide should inspire solutions through what the accessibility community rightly maintains as the way forward: in Watson’s words, “well structured, standards compliant and inclusive content delivery.” Direct engagement with screen reader users forces us to confront their needs as individuals, not as aggregated data points, and we all will be better off for the experience. “I know the argument in favor of screen reader detection is so that developers can know their audience and adapt,” concludes Goldsmith, “but if they are truly good, they make it work for all technology.”

