‘Testability and Validity of WCAG 2.0: The Expertise Effect’ at #assets10

Roll-up, roll-up for our paper at the ‘12th International ACM SIGACCESS Conference on Computers and Accessibility’, this year in Orlando, Florida!

We’ll – well Giorgio will – talk about expertise in the context of the testability and validity of WCAG 2.0:

Web Content Accessibility Guidelines 2.0 (WCAG 2.0) require that success criteria be tested by human inspection. Further, testability of WCAG 2.0 criteria is achieved if 80% of knowledgeable inspectors agree that the criteria has been met or not. In this paper we investigate the very core WCAG 2.0, being their ability to determine web content accessibility conformance. We conducted an empirical study to ascertain the testability of WCAG 2.0 success criteria when experts and non-experts evaluated four relatively complex web pages; and the differences between the two. Further, we discuss the validity of the evaluations generated by these inspectors and look at the differences in validity due to expertise.
In summary, our study, comprising 22 experts and 27 non-experts, shows that approximately 50% of success criteria fail to meet the 80% agreement threshold; experts produce 20% false positives and miss 32% of the true problems. We also compared the performance of experts against that of non-experts and found that agreement for the non-experts dropped by 6%, false positives reach 42% and false negatives 49%. This suggests that in many cases WCAG 2.0 conformance cannot be tested by human inspection to a level where it is believed that at least 80% of knowledgeable human evaluators would agree on the conclusion. Why experts fail to meet the 80% threshold and what can be done to help achieve this level are the subjects of further investigation.
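
To make the 80% threshold concrete, here is a minimal Python sketch of an agreement check. It assumes each evaluator gives a simple pass/fail verdict per success criterion, and that 'agreement' is the share of evaluators who side with the majority; the measure used in the paper may well differ. The verdicts and function names are made up for illustration and are not taken from our data.

```python
from collections import Counter

AGREEMENT_THRESHOLD = 0.8  # the 80% figure from the WCAG 2.0 testability requirement


def agreement(verdicts):
    """Share of evaluators who gave the most common verdict (an assumed measure)."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    most_common_count = Counter(verdicts).most_common(1)[0][1]
    return most_common_count / len(verdicts)


def is_testable(verdicts, threshold=AGREEMENT_THRESHOLD):
    """True if enough evaluators agree on whether the criterion is met or not."""
    return agreement(verdicts) >= threshold


# Hypothetical verdicts from ten evaluators on two success criteria.
illustrative_verdicts = {
    "criterion A": ["fail"] * 9 + ["pass"],      # nine of ten agree
    "criterion B": ["fail"] * 6 + ["pass"] * 4,  # only six of ten agree
}

for name, verdicts in illustrative_verdicts.items():
    print(name, agreement(verdicts), is_testable(verdicts))
```

Under these assumptions the first criterion clears the threshold and the second does not, which is the kind of split the study reports for roughly half of the success criteria.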

I’m looking forward to this year’s conference, but it’s a full programme, with a Doctoral Consortium on the Sunday and the Student Research Competition running through the entire event. I’ll be writing a trip report once I’m back, but I hope to see some really transformational research with concrete contributions this year, especially after the WWW Human Factors ‘let-down’!

Commercial / Community Scraping! #hhhmcr #a11y #accessibility

Screen Scraping and Transcoding

I was recently contacted by ‘ScraperWiki’, who have an event in Manchester called ‘Hacks and Hackers Hack Day’. They say:

We hope to attract ‘hacks’ and ‘hackers’ from all different types of backgrounds: people from big media organisations, as well as individual online publishers and freelancers… The aim is to show journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show programmers how to find, develop, and polish stories and features. All sorts of data was scraped and played with at our past events: in Liverpool, projects included mashes of police, libraries and courts data. Birmingham saw lots of health-related projects, as well as scraping of political party donor and leisure centre information.

However, looking further into their aims, it seems that ScraperWiki is intended “as a platform to scrape and store public data in a structured and usable format.” Now, we’ve seen data scraping before, from PiggyBank and the BBC RDF triple-store, but this seems to be an engine to scrape lots of resources and make them available for some end purpose.

In Web accessibility, scraping has been used for a long, long time. Initial attempts at screen reading technology only read the screen as presented to a visual user; this was called ‘screen scraping’ as it only produced superficial information regarding the text being translated. As the visual complexity of Web pages increased, these screen readers became inadequate because of the reliance of Web documents on context, linking, and the deeper document structure to convey information in a useful way. In response, Web browsers and Web page readers for visually impaired users were created to access this deeper document structure by directly examining the XHTML or the Document Object Model (DOM). By examining the precise linguistic meaning of the text it was hoped that more complex meanings (associated with style, colour etc.) could be derived. However, when interacting with complex Web documents these readers, although better than screen-scrapers, still do not enable an understanding of the meaning of the underlying structure, which is vital for the cognition of information.
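
The difference between scraping the rendered text and examining the document structure can be sketched in a few lines of Python. This is only an illustration using the standard library's `html.parser`, not how any particular screen reader or transcoder actually works; the sample page and the class names are made up.

```python
from html.parser import HTMLParser

# Made-up sample page, for illustration only.
SAMPLE_PAGE = """
<html><body>
  <h1>News</h1>
  <p>Read the <a href="/report">full report</a> today.</p>
  <h2>Sport</h2>
  <p>Results are in.</p>
</body></html>
"""


class FlatTextScraper(HTMLParser):
    """Screen-scraping style: keep only the text, discard all structure."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


class StructureReader(HTMLParser):
    """DOM-style reading: record headings and links together with their role."""

    def __init__(self):
        super().__init__()
        self.items = []    # (role, text) pairs in document order
        self._role = None  # role of the element currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._role = "heading level " + tag[1]
        elif tag == "a":
            self._role = "link to " + (dict(attrs).get("href") or "?")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "a"):
            self._role = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._role:
            self.items.append((self._role, text))


flat = FlatTextScraper()
flat.feed(SAMPLE_PAGE)
flat_text = " ".join(flat.chunks)

structured = StructureReader()
structured.feed(SAMPLE_PAGE)

print(flat_text)
for role, text in structured.items:
    print(role + ": " + text)
```

The flat scraper runs headings, links and body text together into one stream, while the structure reader at least knows which pieces are headings and where the links go. Neither gets at what that structure means to the reader, which is the gap described above.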

I wonder if there is some way we could now use their scraping technology in combination with our (community’s) own to build accessible and semantically structured data and store it centrally. Now it seems from their code:

<?php
######################################
# Basic PHP scraper
######################################

require 'scraperwiki/simple_html_dom.php';

# Fetch the page as a string and echo it
$html = scraperwiki::scrape("http://scraperwiki.com/hello_world.html");
print $html;

# Use the PHP Simple HTML DOM Parser to extract <td> tags
$dom = new simple_html_dom();
$dom->load($html);

foreach ($dom->find('td') as $data)
{
    # Store data in the datastore
    print $data->plaintext . "\n";
    scraperwiki::save(array('data'), array('data' => $data->plaintext));
}

that they are using a simple template-based approach, not really as heterogeneous as our stuff based on CSS, and maybe not as rich as IBM TRL’s stuff either. But nonetheless this is an interesting development.