Colin's Journal: A place for thoughts about politics, software, and daily life.

Colin's Journal in October 2003

Thursday, 30 October 2003

12:05 PM - Football news

It's very rare that I take any interest at all in the football world, but the news that Perugia is looking to sign a top female player is great news.  It's only a first step in what will be a very long process of making professional football accessible to women and men equally.

Wednesday, 29 October 2003

11:20 PM - Agonising over choice

One year and ten months ago I purchased a digital camera.  I had taken virtually no photos in my life prior to making the decision to buy a digital camera, but hoped this would change once I had bought one.  Since purchasing my A20 I've taken roughly two and a half thousand pictures, so I'm going to count this decision as a success.

I take photos mainly when going somewhere or during an event.  I don't carry my camera with me all the time, and I don't make trips specifically for the purpose of taking photographs.  My camera is completely automatic, does fairly well for portraits, and works poorly in low light.

In much the same way as moving from film to digital was the trigger to take more photos, I'm now contemplating a move to a better digital camera in order to turn photography into something of a hobby.  The number of good photos on the 'net has been part of the inspiration behind this, particularly the amazing photos on Sensitive Light.

A better digital camera means moving to something that allows full automatic, aperture priority, shutter priority, and full manual operation.  I also want something that works well in low light conditions, is fast (as in responsive), and can take different lenses.  The best camera that fits this description is the Canon Digital Rebel (AKA 300D).  This can be had for a rather large chunk of change and for $100 more you can get an 18-55mm lens with it.

My dilemma is whether I'm better off spending the, not inconsiderable, extra money to get a Canon 10D instead.  There are several advantages of getting the 10D:

  • There are several reports that the 300D underexposes pictures, particularly when using flash indoors.  I've not seen anyone reporting similar problems with the 10D.
  • The 10D has a higher quality case with better layout of controls (e.g. two dials versus one).
  • The 10D has a few extra combinations of features that the 300D is lacking.
  • The 10D can buffer more images before writing to the CF card.

There are also two downsides to the 10D: It's heavier and it doesn't come with a lens.  The lack of lens is not just an expense problem.  Affordable lenses start at 24 or 28mm, rather than the 300D's bundled 18mm, which on these digital cameras makes more of a difference than on a 35mm camera.  The challenge of trying to choose a good first lens for the 10D is also a problem.  I'm leaning towards the 28-105mm/3.5-4.5 USM because it's had better reviews than the 24-85mm/3.5-4.5 USM and is cheaper than the 28-135mm/3.5-5.6 USM IS.

With roughly a CAD$1000 (~£500) price difference between the two combinations (300D with 18-55mm and 10D with 28-105mm) it is proving to be an agonising choice.  I'm not averse (or rather not too averse) to spending extra for better equipment, but this is for a hobby that I don't yet have.

Sunday, 26 October 2003

10:56 AM - The fourth annual night of dread

Last night was the fourth annual night of dread.  Last year's account of the night of dread gives some idea of what it's about, and this year I managed to take some photos worth posting.

Taking photos of this event with my digital compact camera is very difficult.  The camera really doesn't work too well in low light due to a small lens and a maximum ISO of 150.  This means that I need to use the flash almost exclusively, which is hard to do in a large open space.  Consequently I took a lot of pictures, and very few actually turned out usable.

A small selection of these can now be found in a new public album "The fourth annual night of dread."

Thursday, 23 October 2003

9:54 AM - The VOIP Market

Vonage has received a considerable amount of press recently for its VOIP telephone service.  The offering is certainly very compelling: for $35 per month you get unlimited national and Canadian calls and your phone can have one of a selection of area codes (or even multiple numbers from different area codes).  The only thing you need is a high speed Internet connection, which incidentally means you can run this US phone number from anywhere in the world.

The shake-up that this will bring to the already very competitive US phone market is significant.  With 39% of all residential Internet connections in the US being broadband there is a large and growing number of people who can take advantage of this type of service.  Residential pricing is going to continue to fall, so businesses offering this type of VOIP service are going to have to keep a very tight lid on costs to maintain profitability.  Expect to see all billing being done on-line with simple price plans used to keep software complexity low.  There will also be a significant drive by VOIP providers to target small to medium businesses where extra services can be used to differentiate against competing products.

It is hard to predict how well this model will work in Europe, an equally sized, but very fragmented market.  The biggest challenge is going to be termination costs.  In Europe calls to mobiles are not paid for by the owner of the mobile, but rather the caller ("calling party pays").  This means offering a flat rate for all calls will be difficult, because the costs incurred when calls are made to mobile phones are significantly higher than those made to a land line.

The difference in termination charges is daunting: according to Oftel the termination cost to a land line is approximately 0.5ppm (pence per min), compared to 5ppm for a mobile termination.  US calls to European mobiles are usually charged at the same rate as landlines, which means that either carriers are subsidising international mobile calls with revenue from land line calls, or mobile carriers are charging less for international termination.  I doubt that this pricing can be extended to cover calls from within a country, even if the calls are routed over the Internet to the US, so mobile termination is likely to be just as expensive for VOIP carriers as everyone else.  With mobile penetration rates of 80% in parts of Europe (e.g. UK, Italy) will an offering of unlimited land line calls be attractive?  If priced correctly it could be, especially if the offer extends to unlimited calls to most of Europe and North America.
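
To get a feel for what those rates mean for a flat-rate plan, here is a rough Python sketch of the termination cost a carrier would carry for one subscriber.  The two rates are the Oftel figures quoted above; the 1000 minutes a month and the share of calls going to mobiles are made-up inputs chosen purely for illustration.

```python
# Termination rates quoted above from Oftel, in pence per minute.
LANDLINE_PPM = 0.5
MOBILE_PPM = 5.0


def monthly_termination_cost(minutes, mobile_share):
    """Return the termination cost in pounds for one subscriber's month.

    minutes and mobile_share are illustrative inputs, not measured data.
    """
    mobile_minutes = minutes * mobile_share
    landline_minutes = minutes - mobile_minutes
    pence = landline_minutes * LANDLINE_PPM + mobile_minutes * MOBILE_PPM
    return pence / 100.0


# Hypothetical subscriber making 1000 minutes of calls a month.
landline_only = monthly_termination_cost(1000, 0.0)    # 5.00 pounds
quarter_mobile = monthly_termination_cost(1000, 0.25)  # 16.25 pounds
```

Under these assumptions, sending just a quarter of the minutes to mobiles more than triples the termination bill, which is why a flat rate covering mobiles is so much harder to offer.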

Tuesday, 21 October 2003

9:36 PM - Writing web pages in OpenOffice

Writing web page content in OpenOffice is a lot easier than writing pages in a text editor, even though I've been using HTMLText rather than raw HTML.  The PubTal OpenOffice plugin (available here) works well enough that I could convert my remaining pages over to using PubTal.

I've been avoiding moving the last of my archived content over to PubTal because it's stuff that I don't really care about any more.  With OpenOffice I could just drag and drop existing pages out of my web browser, and then clean up a few things like the relative links.  The main benefit of having done this is that all of the pages on my site now validate, and they are all produced using the same template.

I haven't decided yet whether to convert other pages from HTMLText to OpenOffice, but it's tempting for ease of maintenance.

Saturday, 18 October 2003

5:55 PM - A change in direction

I am having to reconsider the use of AbiWord as an editor for web page content.  The reason for this is not due to a flaw in the idea itself, but rather the quality of the AbiWord software.  Even the stable version (2.0) has some significant bugs that make it untrustworthy for handling important content.

The two most serious problems I've hit are:

  • AbiWord sometimes generates invalid XML.
  • AbiWord occasionally writes files that it then cannot read.

I will continue to maintain and distribute the AbiWord plugin in the hope that future versions of the software will address these fatal defects, but I am now going to look at alternative editors.

The most promising is OpenOffice.  The software is well maintained by a large team, is regarded as being of high quality, and the file format is very well documented.  My initial impression of the file format is that it will be easier to handle than the AbiWord format turned out to be.

The biggest drawback to attempting an OpenOffice PubTal plugin is the huge number of features that OpenOffice has.  Most of these features will not translate well into a web page, and so will have to be ignored by the plugin.

Thursday, 16 October 2003

11:45 PM - First outing of the AbiWord Plugin

I've made available the first version of my AbiWord content plugin.  This is an experimental release which has undergone light testing.  It requires PubTal 2.0: Download AbiWord-Content Plugin.

Features currently supported include:

  • Heading 1, 2 and 3.
  • Text styles (bold, italic, etc).
  • Bullet and Number lists.
  • Margin-left offsets.
  • Hyper-links and anchors.
  • Endnotes and Footnotes (which become Endnotes).
  • Tables.
  • PlainText (which is wrapped in <pre><code>)

Things that I haven't been able to get working yet:

  • Images.  There's nothing particularly sensible I can do with these because the source file isn't preserved by AbiWord.
  • Different kinds of list (diamond, etc).  A bug in AbiWord stops good XML being produced for these in the version I'm using.
  • Numbered Headings and Sections.  The same XML bug is stopping support for these as well.
  • Page headers and footers.  I'm not sure what should be done with these, especially as they can change from page to page and section to section.
  • Font name, colour, etc.  I could add support for these but I don't know whether it's a good idea.  In the web world changing font should really be done through a site level CSS style sheet rather than on a per-page basis.  I might make this an option in a future version.

If you download and use this plugin please email me and let me know whether it works for you.  I'm using AbiWord 1.99.5, and aside from the unsupported stuff, it seems 100% reliable so far.

Monday, 13 October 2003

10:27 PM - Writing content for web pages

Last week I wrote a short article on the importance of a template based solution for web page maintenance, and the sort of innovations that could be made to ease template design.  At the end of that article I noted two other problems with web publication tools today: markup of the content, and the handling of non-journal style pages.  This article addresses some thoughts I've had on the first of these two problems.

The most popular template based systems today are those provided by blogging software.  They allow an author to enter new web content either using a web browser (thin client) or a small application (fat client).  When the author decides to include a link, make some text bold, or apply some other markup the most common solution is to have them enter HTML codes.

Alternatives to entering the HTML manually include utilising IE specific enhanced textarea widgets, using a different markup language such as Textile, or providing buttons that automatically insert the HTML tags.

The markup in which an author's content is written, and the markup in which it is published, must be treated separately, even if they happen to be the same.  The reason for this is that an evolving web also means evolving markup for web pages, for example the transition between HTML4 and XHTML1.  When an author of a site chooses to move their pages from HTML to XHTML the software they use needs to be able to rebuild old pages using XHTML.

For software to be able to perform transformations of markup from one language to another it needs to be able to parse the original markup perfectly.  If the original markup is HTML this poses two problems: writing and parsing correct HTML programmatically is fairly difficult, and if users enter markup by hand then there will be errors in it.  The inability to convert cleanly to a new publishing markup language is a major defect in all of the blogging tools today that store and accept content using HTML markup.  It is a hole that can be coded out of, but never in a 100% satisfactory way.
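
To make the distinction between authoring markup and publishing markup concrete, here is a minimal Python sketch.  It assumes the content is stored as well-formed XML; the `publish` function and its `xhtml` flag are invented for illustration and are not how PubTal or any particular blogging tool works.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Elements that never have content (only the ones this sketch knows about).
EMPTY_ELEMENTS = {"br", "img"}


def publish(element, xhtml):
    """Serialise a parsed element as HTML 4 or XHTML 1 style markup."""
    attrs = "".join(
        " %s=%s" % (name, quoteattr(value))
        for name, value in sorted(element.attrib.items())
    )
    if element.tag in EMPTY_ELEMENTS:
        # The two publishing formats differ in how empty elements close.
        closer = " />" if xhtml else ">"
        out = "<%s%s%s" % (element.tag, attrs, closer)
    else:
        out = "<%s%s>" % (element.tag, attrs)
        out += escape(element.text or "")
        for child in element:
            out += publish(child, xhtml)
        out += "</%s>" % element.tag
    # Text that follows this element inside its parent.
    return out + escape(element.tail or "")


# The stored content is parsed once and can be republished in either form.
source = "<p>First line<br/>second <b>line</b></p>"
tree = ET.fromstring(source)
as_html4 = publish(tree, xhtml=False)
as_xhtml = publish(tree, xhtml=True)
```

Because the stored form parses cleanly, moving the whole site from one output markup to the other is just a matter of running the second call over every page.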

The solution to this problem requires a combination of three things:

  1. The adoption of a strict, easy to parse, format for authoring content.
  2. The rejection of any content which does not adhere completely to this strict format.
  3. The use of tools, rather than users, to generate this strict format.

Of these three items, the critical piece missing today is the third one: a GUI tool that allows the markup of web content in a strict, easy to parse, format.  The bare minimum that such a tool should be able to support includes: links, text decoration (bold, italic, etc), lists (bullet and numeric), and images.  There are lots of other types of markup which would be very useful (e.g. tables), but for most web content this limited list would suffice.  Today there are many weblog authors who have tools and knowledge such that they don't use the most basic of markup in their content.  A GUI application supporting these features, and whose output is in a strict format, would be enough to bring painless, sustainable content authoring to a much wider audience.
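
The first two items in that list (a strict format, and rejecting anything that doesn't adhere to it) are easy to sketch in Python.  The example below assumes a small XML vocabulary covering the bare minimum described above; the element names and the `RejectedContent` exception are hypothetical, not taken from any existing tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist for a strict authoring format: links, text
# decoration, lists and images.
ALLOWED = {"content", "p", "a", "b", "i", "ul", "ol", "li", "img"}


class RejectedContent(Exception):
    """Raised when content does not adhere completely to the format."""


def load_strict(source):
    """Parse source, refusing anything malformed or outside the format."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError as err:
        raise RejectedContent("not well-formed: %s" % err)
    for element in root.iter():
        if element.tag not in ALLOWED:
            raise RejectedContent("unknown element: %s" % element.tag)
    return root


def is_accepted(source):
    """Return True only if the content passes the strict check."""
    try:
        load_strict(source)
        return True
    except RejectedContent:
        return False
```

With this approach `<content><p>Hello <b>world</b></p></content>` is accepted, while a mismatched closing tag or an unexpected `<font>` element is refused outright rather than being guessed at.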

Writing such a tool, while not technically difficult, does take time and effort.  I hope to one day soon find an open source tool to do this.  In the meantime however I have a partial solution: AbiWord.

AbiWord is an open source word processor.  A word processor isn't really the best choice of tool for editing web content, simply because it has too many features that are not needed or do not apply to the web.  For example AbiWord supports Mail Merge, multiple document sections with different headers and footers on the pages, and other such features that are needed for document creation, but not for editing web content.

Despite these drawbacks the use of AbiWord does bring some significant advantages:

  • It is a full GUI: the user has no opportunity or need to edit markup themselves.
  • It has many useful features such as spell check as you type.
  • The output format is in XML, making it fairly easy to convert into HTML/XHTML or any other markup language.

To see whether or not this can work I've written a plugin for PubTal which takes AbiWord documents, converts them to HTML markup, and then publishes them using PubTal templates.  There is still much testing to be done, but it now handles: headings, text decoration (bold, italic, underline, strikeout, overline, superscript, subscript), pre-formatted text, hyperlinks, bookmarks (anchors), bullet lists, numeric lists, footnotes/endnotes, and tables.

The biggest missing feature is the ability to include images in the content.  The problem here is that AbiWord doesn't record the original location of the image file - it just places the binary content (encoded using base64) into the XML file.  I can probably live with that restriction for most pages, at least until I can find a better solution.

12:29 PM - Spam in blogs

Spam in blog comments was always inevitable because it brings two benefits to spammers:

  1. It gets lots of people seeing their message (in the same way as Usenet spam).
  2. It spams search engines into giving their website a higher ranking.

As is clear from the discussion on Making Light it is a losing battle to try and block comment spammers based on their IP addresses.

I'm currently thinking that there are two likely approaches to blocking this kind of spam that might stand a chance.  The first approach is to show an image of a random letter in a hard to OCR font, and then ask the user to enter the letter (or series of letters) into the form with their comment.  This is used on several large sites today, but I don't know how effective it actually is.

The second approach would be to apply statistical filtering to comments in the same way as it is used for email.  This approach has been very successful in reducing email spam getting into in-boxes as can be seen by the technique's continued roll-out.  It seems like an easy enough extension to apply this kind of filtering to comments in weblogs.
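
As a rough illustration of the second approach, here is a tiny naive Bayes classifier in Python, which is one common form of statistical filtering.  The `CommentFilter` class and the training comments are invented for this sketch; a real filter would need far more training data and has to be trained on at least one comment of each kind before it can score anything.

```python
import math
import re
from collections import Counter


class CommentFilter:
    """A tiny naive Bayes classifier for comments (illustrative only)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.comments = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenise(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.words[label].update(self.tokenise(text))
        self.comments[label] += 1

    def score(self, text, label):
        """Log of P(label) times each P(word | label), add-one smoothed."""
        vocabulary = set(self.words["spam"]) | set(self.words["ham"])
        total_words = sum(self.words[label].values())
        total_comments = sum(self.comments.values())
        result = math.log(self.comments[label] / total_comments)
        for word in self.tokenise(text):
            count = self.words[label][word] + 1
            result += math.log(count / (total_words + len(vocabulary)))
        return result

    def is_spam(self, text):
        return self.score(text, "spam") > self.score(text, "ham")


# Made-up training comments.
comment_filter = CommentFilter()
comment_filter.train("cheap pills buy now", "spam")
comment_filter.train("buy cheap watches online now", "spam")
comment_filter.train("great post about digital cameras", "ham")
comment_filter.train("I agree with your thoughts on templates", "ham")
```

After this training, `comment_filter.is_spam("buy cheap pills online")` comes out true and `comment_filter.is_spam("thoughts about cameras")` comes out false.  The weblog software would keep training the filter each time the author marks a comment as spam or approves it.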

I'm sure we'll hear a lot more about weblog comment spam as time goes on.

Thursday, 9 October 2003

11:07 PM - Goings on

While it doesn't strike me as strange that Germany bans heavy lorries from its roads on Sundays, it does seem strange that the government is working hard to maintain this ban.  Germany is struggling economically and the government has accepted that it needs to reform labour markets.  Yet when presented with a politically easy opportunity to remove an obstacle to growth such as this, it fights to keep it.

In other (rather older) news there's tax competition at work in Denmark, where tax on alcohol has been reduced significantly.  This is in an attempt to reduce the amount of booze bought in the rest of the EU and (legally) imported.  We can hold out hope that similar pressure will eventually cap the tax we see on alcohol in the UK as well.  (In an ideologically inconsistent fashion I don't care how high tax on cigarettes gets!)

We could also do with some price competition here in Ontario, where the government run monopoly keeps prices significantly higher than the UK (e.g. £3 a pint!).

Wednesday, 8 October 2003

11:18 PM - The right to vote

Until I moved to Canada I had never really considered the question of when someone should be allowed to vote, and when they shouldn't.  When you are a citizen of a country, and almost all of the people you know are also citizens the question of eligibility does not arise.

The issue is of particular importance in Latvia because 21% of Latvian residents are not citizens, and they are currently excluded from all elections including local elections.  It seems clear to me that when nearly a quarter of the permanent residents in a country are disenfranchised in this way, something needs to change.

As a Brit in Canada I can't vote in any Canadian election whether National or Provincial.  If I "landed" (i.e. became a permanent resident) then I would still be excluded from voting, regardless of how long I lived here, unless I took up Canadian citizenship.  Conversely I'm eligible to vote in the UK despite being out of the country for the last 3 years.

In the EU any EU Citizen is allowed to vote (even stand for office) at the local level if they are a resident.  This logic hasn't been extended to voting in national elections, and I doubt it will be any time soon.

With the increased mobility (particularly in Europe) of people between countries I would like to see the right to vote being tied to permanent residency.  The latest EU directive on freedom of movement will bring an immediate right to permanent residency for EU citizens after 5 years in a member state.  This seems to me like an appropriate length of time before someone is able to make an informed electoral choice in a country.

Tuesday, 7 October 2003

10:54 PM - New release of PubTal and SimpleTAL

PubTal 2.0 and SimpleTAL 3.6 are now available for download!  Although the changes to SimpleTAL are minimal I need to do a simultaneous release to support a new feature in PubTal.

PubTal has had many changes made.  It now supports XHTML, has a simpler configuration syntax, more content types, and better character set support.

Thanks to Florian Schulze for all of the patches and ideas!

Monday, 6 October 2003

10:01 PM - Even static pages are dynamic

Coding web pages is difficult.  It has been difficult from the start of the web and has, in some respects, become harder as time has gone on and the technologies involved have grown.  The preferred approach to making web site design easier used to be WYSIWYG (what you see is what you get), the idea being that Desktop Publishing was easy for anyone to do, so why shouldn't web page publishing be the same way?

It is easy to denounce the WYSIWYG approach because of the poor quality HTML that it tends to generate, but this is to ignore its biggest flaw.  The problem with using WYSIWYG design is not that the resulting code is a mess, but rather that the result of the design is a page.

The problem with building a web page is that at some point you will want to change the content of that page.  Maybe you need to change your contact details that are at the bottom of the page.  Maybe the site navigation bar down the side now needs another entry.  Or it could simply be time to abandon the dark-purple on black colour scheme that looked so good when you first decided that you had something worth putting on the web.

Regardless of the motivation for wanting to update a web page there will certainly come a time when it needs to be done.  If you have one page this isn't a problem, if you have several hundred then it is a problem.  Part of the solution is to separate content from design, to keep the HTML in one place so that changes can be made once.  This solution has been known for a long time and yet it has not been a technique that many had access to.

The rise of blogging tools has brought this powerful technique to many, at least for journal style web pages such as this.  Blogging tools have made the process of publishing on the web easy enough that almost any web reader can now become a web writer, should they choose to do so.  There are still however many further improvements that can be made to make the task of publishing on the web easier.  As Felix Salmon explains in today's post, altering the templates of such blogging tools requires a significant technical ability.  My own contribution to the ease of web publication, PubTal, certainly requires users to be able to code in HTML in order to generate their own templates.

I think the problem of web page template design can be solved by allowing users to work with components that fit together to form templates.  Components can then be designed and built by those who know, or are willing to learn, the technologies behind them.  Meanwhile users can mix-and-match components to form individual designs.  Here's an example of how this might work:

  • Have a layout component as the basis for a template design.  The layout component defines areas of the screen, for example two columns and a heading, but not the content of those areas.  Multiple different layout components can be developed, and users can choose to use any one as the basis for a new template.
  • Multiple "item" components can be produced which interact with the underlying content management system to provide certain pieces of functionality, e.g. links to archived material, or the content of the latest posts.  These components can take parameters to allow some limited customisation.  Users would then specify which of the item components should go in which parts of the layout (e.g. latest posts goes in the middle column, news snippets followed by my links in the left column, etc)
  • Both the layout and item components would produce HTML with standard class and id attributes, so that the site can be "themed" using CSS (the CSS Zen Garden shows just how far CSS can take you).
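
The three points above can be sketched in a few lines of Python.  Everything here is hypothetical: the component functions, the `site` dictionary standing in for the content management system, and the `build_page` helper are all made up to show the shape of the idea, not an existing API.

```python
from functools import partial
from html import escape


def two_column_layout(areas):
    """Layout component: defines areas of the page, not their content."""
    return (
        '<div id="heading">%(heading)s</div>\n'
        '<div id="left">%(left)s</div>\n'
        '<div id="middle">%(middle)s</div>' % areas
    )


def site_title(site):
    """Item component: the site heading."""
    return "<h1>%s</h1>" % escape(site["title"])


def latest_posts(site, count=2):
    """Item component: titles of the most recent posts."""
    items = "".join("<li>%s</li>" % escape(t) for t in site["posts"][:count])
    return '<ul class="latest-posts">%s</ul>' % items


def links(site):
    """Item component: the author's links."""
    items = "".join(
        '<li><a href="%s">%s</a></li>' % (escape(url), escape(name))
        for name, url in site["links"]
    )
    return '<ul class="links">%s</ul>' % items


def build_page(layout, placement, site):
    """Fill each area of the layout with the output of its components."""
    areas = {
        area: "".join(component(site) for component in components)
        for area, components in placement.items()
    }
    return layout(areas)


# Stand-in for the content management system.
site = {
    "title": "My Journal",
    "posts": ["Third", "Second", "First"],
    "links": [("Example", "http://example.com/")],
}

# The user's choices: which item goes in which area, with one parameter set.
placement = {
    "heading": [site_title],
    "left": [links],
    "middle": [partial(latest_posts, count=2)],
}
page = build_page(two_column_layout, placement, site)
```

The `placement` dictionary is the only part a user would need to touch, and it is exactly the kind of thing a drag-n-drop tool could write out.  Because every component emits fixed `class` and `id` attributes, the resulting page can be themed entirely from a CSS file.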

Using the scheme outlined here a GUI tool could be developed that allows for easy template design using the drag-n-drop of components.  With components being distributed over the 'net there would soon be a huge variety of template designs possible, without any of the problems of normal WYSIWYG design.  The underlying technologies required to develop a system such as this are already in place, it's just a matter of writing the tools to use them (no small task).

There are at least two other problems with the current crop of web publication tools that I've not written about yet: markup of the content, and the handling of non-journal style pages.  That'll have to wait for another day.


Last Modified: Thu, 30 Oct 2003 17:09:26 GMT

Made with PubTal 3.2.0

Copyright 2008 Colin Stewart

Email: colin at owlfish.com