web
This feed contains pages in the “web” category.
Over the last few years, video has become more and more commonplace on the Web, usually using Flash (other plugins, QuickTime et al., seem to be fading into obscurity).
I absolutely bloody hate this. There are a number of reasons. Firstly, Flash is a proprietary format (as are almost all the other formats commonly used for video; the non-proprietary ones are still exceedingly rare). That means that if I’m to view these videos I’m at the mercy of the proprietary developer. In practice, that means I can’t view these videos on my desktop computer, as I run an operating system that Adobe choose not to bother supporting (FreeBSD). I’m not willing to stop using the operating system I prefer just to watch the occasional video.
Secondly, I’m impatient. I want to get the information at my own pace. I want to speed up if necessary, or slow down. This is easy in text; less so with video. At best, it’s inconvenient, and with streaming video it may not even be possible to skip forward. It’s often also difficult to find the exact point in the video you’re looking for; in a text, there’s almost certainly a Find function. (Related: I probably can’t watch a video at university or at work, even if it’s relevant to what I’m doing, if I don’t have headphones with me. If I need the information, I’d have to wait until I got home then do work-related stuff there — no, thanks.)
Thirdly, and while this doesn’t affect me personally, it most likely affects many more people than the other two: accessibility. If you’re relying on videos to promote your cause, you’re excluding blind people (if the visual content is important) and/or deaf people (if the audio content is important). A textual version, with images if necessary (and descriptions of the images if possible), is usable by both blind and deaf people (with a screen reader if necessary).
If it’s the presentation that matters — a work of art, for example — then by all means use video if that’s what you want to use. If it’s the content that matters, however, if you want people to buy your product, or become involved in your project, or support your cause, then by using video to the exclusion of more accessible formats, you are limiting your audience; how is that at all productive when you’re trying to promote something?
(An addendum: I’m not necessarily saying you shouldn’t use video at all, though I’d not mind. But you absolutely should provide an alternative version if at all possible, for people for whom video is problematic.)
Two things that really bug me are images that have alt text when they shouldn’t (ones that are purely decorative and don’t add anything to the page) and ones that don’t when they should.
The first is annoying, because when I’m using a text-only browser (which is most of the time, except at work) this type of image just clutters up the page. A common sub-case is having a blank alt attribute (alt="") which is presumably meant to hide the image from text-only browsers or screenreaders; I’ve added a hook in ELinks to filter these out.
The second is annoying, because these are images that I might want to see, but having no alt text makes this needlessly difficult.
A particularly egregious example is Wikipedia, where thumbnails have a blank alt, and so don’t show up even though they’re relevant to the content of the page, but the surrounding anchor element has the relevant description in its title attribute. What on earth!? Why does this make sense? Obviously the developers realise this information is useful, so they put it in the page, but not in the place where it would be most useful.
It’s not a difficult concept: if an image is relevant to the content of the page, it needs a description in the alt attribute. If it’s not relevant, it should probably be defined via CSS anyway, or left out altogether (either way, I won’t see it).
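As a rough illustration of that rule, here is a minimal sketch in Python 3 (standard library only) that sorts the images on a page into "missing", "empty" and "described". The names `ImageAudit` and `audit_images` and the three labels are made up for this example; it can only report what the markup says, and deciding whether an image really is decorative still takes a human.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect (src, status) pairs for every <img> element in a page."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src", "")
        if "alt" not in attrs:
            status = "missing"    # no alt attribute at all
        elif not (attrs["alt"] or "").strip():
            status = "empty"      # alt="" or a valueless alt
        else:
            status = "described"  # alt carries some text
        self.results.append((src, status))


def audit_images(html):
    """Return a list of (src, status) tuples in document order."""
    parser = ImageAudit()
    parser.feed(html)
    parser.close()
    return parser.results
```

Anything reported as "missing" is the second annoyance described above; anything reported as "empty" is only a problem if the image turns out to be relevant to the page.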
The ELinks browser has the nice feature of being scriptable in a number of languages (Ruby, Perl, Guile, and Lua, plus Python in the development version). I was browsing Amazon and getting annoyed at the adverts and whatnot, and also single-pixel transparent GIFs on that and some other sites (I force all images to be displayed, so that I can open them in an image viewer even if they have no ALT text), so I decided to do something about it.
The really useful bit in this case is ELinks::pre_format_html_hook, which is passed a URL and the page content and expects the page content (modified or unmodified) to be returned.
I originally tried using regular expressions, which works fine for removing images (/<img [^>]*src="[^"]*(spacer|transparent-pixel).gif"[^>]*\/?>/ is what I used, for the record). However, for more complex structures this approach is less useful, as it’s not possible with regular expressions to make sure that the nesting of several elements is correct (i.e. that the number of elements that close is the same as the number that opened).
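To make the limitation concrete, here is a small sketch in Python 3 rather than Ruby. The first pattern is the one quoted above, with the dot before "gif" escaped; the `<div class="ad">` markup and both function names are hypothetical, chosen only to show what goes wrong once elements nest.

```python
import re

# The spacer-image pattern from the text, with the literal dot escaped.
SPACER = re.compile(
    r'<img [^>]*src="[^"]*(spacer|transparent-pixel)\.gif"[^>]*/?>'
)

# A naive attempt at removing a whole (hypothetical) advert block.
AD_DIV = re.compile(r'<div class="ad">.*?</div>', re.DOTALL)


def strip_spacers(html):
    """Remove spacer images; safe, because <img> has no nested content."""
    return SPACER.sub("", html)


def strip_ads_naively(html):
    """Remove advert divs; stops at the first </div>, nested or not."""
    return AD_DIV.sub("", html)
```

Removing a single tag works as expected, but with a nested div the non-greedy match ends at the inner closing tag: given `<div class="ad"><div>inner</div>tail</div><p>keep</p>`, `strip_ads_naively` leaves `tail</div><p>keep</p>`, with a stray `</div>` and part of the advert still in the page. A greedy match has the opposite problem and can swallow everything up to the last `</div>` in the document.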
Enter Hpricot. Hpricot is a very forgiving HTML parsing library, which allows you to use XPath or CSS selector syntax, as well as (I think) a DOM-style API. You can do things like doc.search('div.nonmemberEnclosure').remove, which removes the Amazon Prime nag box, or more complicated things.
My hooks.rb, therefore, contains the following:
def ELinks::pre_format_html_hook(url, html)
  require 'rubygems'
  require 'hpricot'
  doc = Hpricot(html)
  # Use =~ here: String#grep returns an array, and even an empty
  # array is truthy in Ruby, so the first branch would always run.
  if url =~ /amazon\.co(m|\.uk)/
    doc.search('div').collect!{|n| n if /A9Ads/ =~ n[:id] }.compact.remove # ads
    doc.search('img').collect!{|n|
      n if /(transparent-pixel|navPackedSprites)/ =~ n[:src]
    }.compact.remove # random pointless images
    doc.search('div.nonmemberEnclosure').remove # Amazon Prime nag box
    doc.search('div#more-buying-choice-content-div//table').remove # tl;dr
  elsif url =~ /smile\.co\.uk/
    doc.search('img').collect!{|n|
      n if /(blackarrow|littlesmile)\.(gif|png)/ =~ n[:src]
    }.compact.remove
  end
  # Spacer images and ad-server images, on any site.
  doc.search('img').collect!{|n|
    n if /(spacer\.gif|doubleclick\.net)/ =~ n[:src]
  }.compact.remove
  return doc.to_html
end
I also wanted to test out the Python scripting capabilities of ELinks 0.12, so came up with the following, using BeautifulSoup:
from BeautifulSoup import BeautifulSoup

def pre_format_html_hook(url, html):
    doc = BeautifulSoup(html)
    if "wikipedia.org" in url:
        # Drop the icon cells from Wikipedia's message boxes.
        for e in doc.findAll('td', {'class': 'mbox-image'}):
            e.extract()
    return doc.prettify()
Google are due to release a new browser, called Chrome, tomorrow. So, here are my initial (i.e., pre-release) thoughts on it, based solely on what they’ve said about it so far.
- Initial release will be Windows-only; Mac OS X and GNU/Linux versions coming Real Soon Now. Nice to see that even Google, with their thousands and thousands of GNU/Linux machines, treat anybody not using Windows as second-class citizens. Hardly unexpected by now, but still irritating.
- I’m glad to see that they’ll be releasing it as free software; hopefully, they’ve the sense to use the GPL or something, rather than a new licence of their own design.
- I’m not convinced by the tabs-as-separate-processes thing; is there really an advantage over tabs-as-separate-threads that outweighs the additional overhead of separate processes? I admit that I don’t know enough about memory management to seriously evaluate this one, though.
- They make a big deal about webapps, but say nothing of the client side; will there be anything along the lines of Firefox’s extensions API? Gem suggests supporting Firefox extensions directly, which may be difficult without XUL support; an equivalent API is a must, though.
Beyond that: we’ll see?
Update: um, today, apparently. About to try it out under WINE.
From http://news.bbc.co.uk/1/hi/technology/7593106.stm:
"The browser landscape is highly competitive, but people will choose Internet Explorer 8 for the way it puts the services they want right at their fingertips, respects their personal choices about how they want to browse and, more than any other browsing technology, puts them in control of their personal data online," he said in a statement.
—Dean Hachamovitch, general manager of Microsoft’s Internet Explorer.
Ah, good old Microsoft: never one to let reality get in their way. Of all the browsers to which those statements could apply, MSIE would be the least likely.
It’s not often you see something that manages to perpetuate stupid beliefs about gender at the same time as demonstrating yet another way to invade someone’s privacy with JavaScript, but this article manages it. Apparently, it looks at your browser history and guesses whether you’re male or female based on the sites you’ve visited. Now, I’m not convinced that there’s a significant gender bias for most sites, and looking at the results it looks like a sizable proportion of them were wrong ("oh noes ur site thinkz im a gurl!!!!111"). It bugs me that people even bother, though.
What’s more concerning, as Simon points out, is that apparently any site that can use JavaScript (i.e., any site you don’t disable it for) can find out what sites you’ve been to just by creating a link and checking whether it has been given the CSS :visited style. I think I’m going to have to install NoScript again, despite having to use JavaScript for work…
Recently I’ve been thinking about ease of use of interfaces. As you may know, I’ve a Flickr account where I post my photographs; I also have a deviantART account for the same purpose.
Generally, when I take pictures, I post any that are reasonably good to Flickr without even thinking about it. I can use the web form and upload five at a time, or I can mail them in; I have my own script for mailing them that I may post at some point.
Uploading to deviantART requires me to use the web form and upload one at a time, and go through a lot of rigmarole that’s not necessary with Flickr — for example, Flickr lets you set a default CC licence for your pictures; deviantART does allow you to specify one, but you must do it individually for each picture; it’s not possible to set a default.
I didn’t really consciously think about it; I just uploaded them to Flickr because it’s easy.
It’s also easy to do things with them; Flickr, like any good Web 2.0 site, has an API that I’ve hardly even begun to look at, but it means I can follow my friends’ activity from the comfort of my mail client, and any new photos get shown on my Facebook profile. deviantART has RSS feeds, but most of them are so well-hidden as to be completely useless.
Something to think about for my final-year project, or any other web stuff I happen to write in the future — multiple access methods for data, both incoming and outgoing.
Stuart Langridge writes about the possibility of reigniting the browser wars, and why it would be a bad thing:
When browser manufacturers are told "go ahead and innovate — we want to see progress", it’s jolly difficult for them to not think "hey, I know, why don’t we take this opportunity to provide something that we can do and other browsers can’t? Then, when people start using it, we’ve locked all their users into our browser!" There are corporate executives the world over furiously masturbating themselves into unconsciousness at the very thought of that technique being open to them again.
the browser’s navigation functions and keyboard shortcuts have been disabled for security reasons and because the internet banking service has been designed to be more accessible to all customers, including those with disabilities.
Apparently, because some people with disabilities may not be able to use the back/forward buttons, nobody can.
Bad bank! No biscuit!
(The point about security is fair enough, I suppose.)
Dan,
You mean there’s actually a significant number of people who use anything older than IE6? It’s come with pretty much every new computer sold for 5-6 years. I’d be perfectly happy with a 7-year-old computer, but as is often pointed out, I’m not a normal person, and I wouldn’t be using the OS it shipped with (or any OS that was released in 2001, for that matter; Debian Woody in 2004 was bad enough).
Personally, if I wasn’t allowed to reject anyone using IE0.1-6.9…, I’d dump them into a simpler version of the page (still similar-looking, but not necessarily identical). And it’s not like Netscape 4 would notice correct HTML if you beat jwz around the head with it.