Colin's Journal

Colin's Journal: A place for thoughts about politics, software, and daily life.

October 26th, 2003

The fourth annual night of dread

Last night was the fourth annual night of dread. Last year's account of the night of dread gives some idea of what it’s about, and this year I managed to take some photos worth posting.

Taking photos of this event with my digital compact camera is very difficult. The camera really doesn’t work too well in low light due to its small lens and a maximum ISO of 150. This means that I need to use the flash almost exclusively, which is hard to do in a large open space. Consequently I took a lot of pictures, and very few actually turned out usable.

A small selection of these can now be found in a new public album “The fourth annual night of dread”.

October 23rd, 2003

The VOIP Market

Vonage has received a considerable amount of press recently for its VOIP telephone service. The offering is certainly very compelling: for $35 per month you get unlimited national and Canadian calls and your phone can have one of a selection of area codes (or even multiple numbers from different area codes). The only thing you need is a high speed Internet connection, which incidentally means you can run this US phone number from anywhere in the world.

The shake-up that this will bring to the already very competitive US phone market is significant. With 39% of all residential Internet connections in the US being broadband there is a large and growing number of people who can take advantage of this type of service. Residential pricing is going to continue to fall, so businesses offering this type of VOIP service are going to have to keep a very tight lid on costs to maintain profitability. Expect to see all billing being done on-line with simple price plans used to keep software complexity low. There will also be a significant drive by VOIP providers to target small to medium businesses where extra services can be used to differentiate against competing products.

It is hard to predict how well this model will work in Europe, an equally sized but very fragmented market. The biggest challenge is going to be termination costs. In Europe calls to mobiles are not paid for by the owner of the mobile, but rather by the caller (“calling party pays”). This means offering a flat rate for all calls will be difficult, because the costs incurred when calls are made to mobile phones are significantly higher than those made to a land line.

The difference in termination charges is daunting: according to Oftel, the termination cost to a land line is approximately 0.5ppm (pence per minute), compared to 5ppm for mobile termination. US calls to European mobiles are usually charged at the same rate as landlines, which means that either carriers are subsidising international mobile calls with revenue from land line calls, or mobile carriers are charging less for international termination. I doubt that this pricing can be extended to cover calls from within a country, even if the calls are routed over the Internet to the US, so mobile termination is likely to be just as expensive for VOIP carriers as for everyone else. With mobile penetration rates of 80% in parts of Europe (e.g. the UK and Italy), will an offering of unlimited land line calls be attractive? If priced correctly it could be, especially if the offer extends to unlimited calls to most of Europe and North America.
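To put those figures in perspective, here is a toy calculation of how the blended per-minute termination cost rises as more calls terminate on mobiles. The two rates are the Oftel figures quoted above; the mobile-share values are purely illustrative.

```python
# Rough illustration of why flat-rate pricing is harder in Europe:
# the blended per-minute termination cost grows quickly with the
# share of calls that terminate on mobiles.

LANDLINE_PPM = 0.5   # pence per minute, landline termination (Oftel figure)
MOBILE_PPM = 5.0     # pence per minute, mobile termination (Oftel figure)

def blended_cost(mobile_share):
    """Average termination cost per minute for a given mobile call share."""
    return mobile_share * MOBILE_PPM + (1 - mobile_share) * LANDLINE_PPM

for share in (0.0, 0.25, 0.5):
    print(f"{share:.0%} mobile -> {blended_cost(share):.2f}ppm")
```

Even at a 25% mobile share the average cost more than triples, which is why a single flat rate covering mobile calls is so hard to offer.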

October 21st, 2003

Writing web pages in OpenOffice

Writing web page content in OpenOffice is a lot easier than writing pages in a text editor, even though I’ve been using HTMLText rather than raw HTML. The PubTal OpenOffice plugin (available here) works well enough that I could convert my remaining pages over to using PubTal.

I’ve been avoiding moving the last of my archived content over to PubTal because it’s stuff that I don’t really care about any more. With OpenOffice I could just drag and drop existing pages out of my web browser, and then clean up a few things like the relative links. The main benefit of having done this is that all of the pages on my site now validate, and they are all produced using the same template.

I haven’t decided yet whether to convert other pages from HTMLText to OpenOffice, but it’s tempting for ease of maintenance.

October 18th, 2003

A change in direction

I am having to reconsider the use of AbiWord as an editor for web page content. The reason for this is not due to a flaw in the idea itself, but rather the quality of the AbiWord software. Even the stable version (2.0) has some significant bugs that make it untrustworthy for handling important content.

The two most serious problems I’ve hit are:

  • AbiWord sometimes generates invalid XML.
  • AbiWord occasionally writes files that it then cannot read.

I will continue to maintain and distribute the AbiWord plugin in the hope that future versions of the software will address these fatal defects, but I am now going to look at alternative editors.

The most promising is OpenOffice. The software is well maintained by a large team, is regarded as being of high quality, and the file format is very well documented. My initial impression of the file format is that it will be easier to handle than the AbiWord format turned out to be.

The biggest drawback to attempting an OpenOffice PubTal plugin is the huge number of features that OpenOffice has. Most of these features will not translate well into a web page, and so will have to be ignored by the plugin.

October 16th, 2003

First outing of the AbiWord Plugin

I’ve made available the first version of my AbiWord content plugin. This is an experimental release which has undergone light testing. It requires PubTal 2.0: Download AbiWord-Content Plugin.

Features currently supported include:

  • Headings 1, 2, and 3.
  • Text styles (bold, italic, etc.).
  • Bullet and numbered lists.
  • Margin-left offsets.
  • Hyperlinks and anchors.
  • Endnotes and footnotes (which become endnotes).
  • Tables.
  • PlainText (which is wrapped in <pre><code>).

Things that I haven’t been able to get working yet:

  • Images. There’s nothing particularly sensible I can do with these because the source file isn’t preserved by AbiWord.
  • Different kinds of list (diamond, etc). A bug in AbiWord stops good XML being produced for these in the version I’m using.
  • Numbered Headings and Sections. The same XML bug is stopping support for these as well.
  • Page headers and footers. I’m not sure what should be done with these, especially as they can change from page to page and section to section.
  • Font name, colour, etc. I could add support for these but I don’t know whether it’s a good idea. In the web world changing font should really be done through a site-level CSS style sheet rather than on a per-page basis. I might make this an option in a future version.

If you download and use this plugin please email me and let me know whether it works for you. I’m using AbiWord 1.99.5, and aside from the unsupported stuff, it seems 100% reliable so far.

October 13th, 2003

Writing content for web pages

Last week I wrote a short article on the importance of a template based solution for web page maintenance, and the sort of innovations that could be made to ease template design. At the end of that article I noted two other problems with web publication tools today: markup of the content, and the handling of non-journal style pages. This article addresses some thoughts I’ve had on the first of these two problems.

The most popular template based systems today are those provided by blogging software. They allow an author to enter new web content either using a web browser (thin client) or a small application (fat client). When the author decides to include a link, make some text bold, or apply some other markup the most common solution is to have them enter HTML codes.

Alternatives to entering the HTML manually include utilising IE-specific enhanced textarea widgets, using a different markup language such as Textile, or providing buttons that automatically insert the HTML tags.

The markup in which an author’s content is written, and the markup in which it is published, must be treated separately, even if they happen to be the same. The reason for this is that an evolving web also means evolving markup for web pages, for example the transition between HTML4 and XHTML1. When the author of a site chooses to move their pages from HTML to XHTML, the software they use needs to be able to rebuild old pages using XHTML.

For software to be able to perform transformations of markup from one language to another it needs to be able to parse the original markup perfectly. If the original markup is HTML this poses two problems: writing and parsing correct HTML programmatically is fairly difficult, and if users enter markup by hand then there will be errors in it. The inability to convert cleanly to a new publishing markup language is a major defect in all of the blogging tools today that store and accept content using HTML markup. It is a hole that can be partially coded around, but never in a 100% satisfactory way.

The solution to this problem requires a combination of three things:

  1. The adoption of a strict, easy to parse, format for authoring content.
  2. The rejection of any content which does not adhere completely to this strict format.
  3. The use of tools, rather than users, to generate this strict format.
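Requirement 2 can be sketched very simply: treat the authored content as XML and refuse anything that does not parse, rather than trying to repair it at publish time. This is a hypothetical helper, not part of PubTal itself:

```python
# A minimal sketch of requirement 2: reject any authored content that is
# not well-formed XML instead of attempting to repair it later.
import xml.etree.ElementTree as ET

def accept_content(source):
    """Return the parsed tree, or raise ValueError if the markup is invalid."""
    try:
        return ET.fromstring(source)
    except ET.ParseError as err:
        raise ValueError(f"Content rejected: {err}")

accept_content("<p>Well-formed <b>content</b> is published.</p>")   # accepted
# accept_content("<p>Unclosed <b>tag</p>")  # would raise ValueError
```

Once every stored document is guaranteed to parse, rebuilding the whole site in a new publishing markup becomes a mechanical transformation.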

Of these three items, the critical piece missing today is the third: a GUI tool that allows the markup of web content in a strict, easy to parse, format. The bare minimum that such a tool should be able to support includes: links, text decoration (bold, italic, etc), lists (bullet and numeric), and images. There are lots of other types of markup which would be very useful (e.g. tables), but for most web content this limited list would suffice. Today there are many weblog authors whose tools and knowledge are such that they don’t use even the most basic markup in their content. A GUI application supporting these features, and whose output is in a strict format, would be enough to bring painless, sustainable content authoring to a much wider audience.

Writing such a tool, while not technically difficult, does take time and effort. I hope to one day soon find an open source tool to do this. In the meantime however I have a partial solution: AbiWord.

AbiWord is an open source word processor. A word processor isn’t really the best choice of tool for editing web content, simply because it has too many features that are not needed or do not apply to the web. For example AbiWord supports Mail Merge, multiple document sections with different headers and footers on the pages, and other such features that are needed for document creation, but not for editing web content.

Despite these drawbacks the use of AbiWord does bring some significant advantages:

  • It is a full GUI: the user has no opportunity or need to edit markup themselves.
  • It has many useful features such as spell check as you type.
  • The output format is in XML, making it fairly easy to convert into HTML/XHTML or any other markup language.

To see whether or not this can work I’ve written a plugin for PubTal which takes AbiWord documents, converts them to HTML markup, and then publishes the result using PubTal templates. There is still much testing to be done, but it now handles: headings, text decoration (bold, italic, underline, strikeout, overline, superscript, subscript), pre-formatted text, hyperlinks, bookmarks (anchors), bullet lists, numeric lists, footnotes/endnotes, and tables.
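The core of that kind of conversion can be sketched like this: walk the AbiWord XML tree and emit HTML for each paragraph and character run. The <p>/<c> element names and the "props" attribute mirror my understanding of the .abw format, so treat the details here as illustrative rather than a faithful copy of the plugin.

```python
# Sketch: convert AbiWord-style XML (paragraphs <p> containing character
# runs <c props="...">) into HTML. Only bold and italic are handled here;
# the real plugin covers many more properties.
import xml.etree.ElementTree as ET

def run_to_html(c):
    """Convert one character run (<c>) to HTML, honouring bold/italic props."""
    text = c.text or ""
    props = c.get("props", "")
    if "font-weight:bold" in props:
        text = f"<b>{text}</b>"
    if "font-style:italic" in props:
        text = f"<i>{text}</i>"
    return text

def para_to_html(p):
    """Convert one paragraph, interleaving plain text and styled runs."""
    parts = [p.text or ""]
    parts += [run_to_html(c) + (c.tail or "") for c in p]
    return "<p>%s</p>" % "".join(parts)

doc = ET.fromstring(
    '<section><p>Some <c props="font-weight:bold">bold</c> text.</p></section>')
print("".join(para_to_html(p) for p in doc))  # <p>Some <b>bold</b> text.</p>
```

Because the source is well-formed XML, the same walk can just as easily emit XHTML or any future markup, which is the whole point of the approach.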

The biggest missing feature is the ability to include images in the content. The problem here is that AbiWord doesn’t record the original location of the image file – it just places the binary content (encoded using base64) into the XML file. I can probably live with that restriction for most pages, at least until I can find a better solution.
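One possible workaround would be to pull the base64-encoded image data back out of the document and write it to files the published page can reference. The <d> element name and its attributes below are assumptions about how the .abw format stores embedded data, not something I have verified against the plugin:

```python
# Sketch: recover embedded images from an AbiWord-style document by
# decoding each base64 data block into (filename, bytes) pairs.
import base64
import xml.etree.ElementTree as ET

def extract_images(tree, suffix=".png"):
    """Decode each embedded <d> data block into a (name, bytes) pair."""
    out = []
    for d in tree.iter("d"):
        raw = base64.b64decode(d.text or "")
        out.append((d.get("name", "image") + suffix, raw))
    return out
```

Even so, the original filename is gone for good, so the published page would have to use whatever internal name AbiWord assigned.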

October 13th, 2003

Spam in blogs

Spam in blog comments was always inevitable because it brings two benefits to spammers:

  1. It gets lots of people seeing their message (in the same way as Usenet spam).
  2. It spams search engines into giving their website a higher ranking.

As is clear from the discussion on Making Light, it is a losing battle to try to block comment spammers based on their IP addresses.

I’m currently thinking that there are two likely approaches to blocking this kind of spam that might stand a chance. The first approach is to show an image of random letters in a hard-to-OCR font, and then ask the user to enter the letter (or series of letters) into the form with their comment. This is used on several large sites today, but I don’t know how effective it actually is.
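The challenge half of that first approach amounts to very little code: pick some random letters, remember them server-side, and compare them with what the commenter types back. Rendering the letters as a distorted image (with an imaging library) is the part left out of this sketch.

```python
# Sketch of a letter-challenge for comment forms. The image-rendering
# step that makes the letters hard to OCR is omitted here.
import random
import string

def make_challenge(length=5):
    """Return the string of letters the commenter must type back."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))

def check_answer(challenge, answer):
    """Accept the answer case-insensitively, ignoring surrounding space."""
    return answer.strip().lower() == challenge
```

The open question the entry raises remains: whether OCR-equipped spammers defeat this in practice.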

The second approach would be to apply statistical filtering to comments in the same way as it is used for email. This approach has been very successful at keeping email spam out of in-boxes, as can be seen by the technique’s continued roll-out. It seems like an easy enough extension to apply this kind of filtering to comments in weblogs.
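A toy version of that idea: score a comment by how much more often its words appear in known spam than in known ham. Real filters (the Bayesian approach used for email) weigh probabilities far more carefully; this only illustrates the principle, and the training phrases are made up.

```python
# Toy statistical comment filter: count how many of a comment's words
# are seen more often in known spam than in known legitimate comments.
from collections import Counter

spam_words = Counter("cheap pills visit my site buy now cheap".split())
ham_words = Counter("great post I agree with your point".split())

def spam_score(comment):
    """Fraction of words that look spammier than they look legitimate."""
    words = comment.lower().split()
    if not words:
        return 0.0
    spammy = sum(1 for w in words if spam_words[w] > ham_words[w])
    return spammy / len(words)

print(spam_score("buy cheap pills now"))        # 1.0
print(spam_score("I agree with your point"))    # 0.0
```

The appeal for weblogs is the same as for email: each blocked comment makes the filter's word counts a little better.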

I’m sure we’ll hear a lot more about weblog comment spam as time goes on.

October 9th, 2003

Goings on

While it doesn’t strike me as strange that Germany bans heavy lorries from its roads on Sundays, it does seem strange that the government is working hard to maintain this ban. Germany is struggling economically and the government has accepted that it needs to reform labour markets. Yet when presented with a politically easy opportunity to remove an obstacle to growth such as this, it fights to keep it.

In other (rather older) news there’s tax competition at work in Denmark, where tax on alcohol has been reduced significantly. This is in an attempt to reduce the amount of booze bought in the rest of the EU and (legally) imported. We can hold out hope that similar pressure will eventually cap the tax we see on alcohol in the UK as well. (In an ideologically inconsistent fashion I don’t care how high tax on cigarettes gets!)

We could also do with some price competition here in Ontario, where the government-run monopoly keeps prices significantly higher than in the UK (e.g. £3 a pint!).

October 8th, 2003

The right to vote

Until I moved to Canada I had never really considered the question of when someone should be allowed to vote, and when they shouldn’t. When you are a citizen of a country, and almost all of the people you know are also citizens, the question of eligibility does not arise.

The issue is of particular importance in Latvia because 21% of Latvian residents are not citizens, and they are currently excluded from all elections including local elections. It seems clear to me that when nearly a quarter of the permanent residents in a country are disenfranchised in this way, something needs to change.

As a Brit in Canada I can’t vote in any Canadian election, whether national or provincial. If I “landed” (i.e. became a permanent resident) then I would still be excluded from voting, regardless of how long I lived here, unless I took up Canadian citizenship. Conversely, I’m eligible to vote in the UK despite being out of the country for the last 3 years.

In the EU any EU Citizen is allowed to vote (even stand for office) at the local level if they are a resident. This logic hasn’t been extended to voting in national elections, and I doubt it will be any time soon.

With the increased mobility (particularly in Europe) of people between countries I would like to see the right to vote being tied to permanent residency. The latest EU directive on freedom of movement will bring an immediate right to permanent residency for EU citizens after 5 years in a member state. This seems to me like an appropriate length of time before someone is able to make an informed electoral choice in a country.

October 7th, 2003

New release of PubTal and SimpleTAL

PubTal 2.0 and SimpleTAL 3.6 are now available for download! Although the changes to SimpleTAL are minimal, I needed to do a simultaneous release to support a new feature in PubTal.

PubTal has seen many changes: it now supports XHTML, has a simpler configuration syntax, more content types, and better character set support.

Thanks to Florian Schulze for all of the patches and ideas!

Copyright 2015 Colin Stewart

Email: colin at owlfish.com