Colin's Journal: A place for thoughts about politics, software, and daily life.

Colin's Journal in January 2003

Thursday, 30 January 2003

12:45 AM - Performance improvements in SimpleTAL

Yesterday I thought I would take a look at the performance of SimpleTAL, and look to see if there were any easy ways of improving it.  I took a small (one screen full) template consisting of lots of ordinary text, a repeat command, and a couple of content commands, and timed SimpleTAL expanding it 200 times.  The result was around 5 templates/sec.

I had an idea of pre-parsing the template and turning it into a series of events (start tag, data, and end tag).  I implemented this fairly quickly, and found that performance improved up to the 11 templates/sec mark.  I know, however, that Zope's TAL engine can go significantly faster than this, so I started looking at it again and trying to work out how I could improve things significantly.

The current SimpleTAL implementation uses OO methodology fairly heavily.  This means that for each tag in the template an object is created, along with at least one handler object (often more).  The tag is then passed to each handler, which does various things to it based on the evaluated expressions coming back from the simpleTALES module.  The result is that for a given run of the template, even with the HTML/XML parsing done beforehand, there is a significant amount of object creation (expensive), a large number of method calls rather than variable access (expensive) and text manipulation/parsing.

The Zope way of getting around this is to parse the template into an intermediate byte code.  This byte code is then used by an interpreter to generate the template, with very little in the way of object creation.  I'm now re-factoring SimpleTAL in a similar way to see how much improvement I can get, and so far it's looking promising.  I'm still a long way from finishing, but I have content and repeat working well enough to run my performance template, and the result is now around 90 templates/sec - roughly 18 times the original speed, or a near 95% reduction in the time taken per template!  The unfortunate side effect though is that the code is harder to understand because it's data structure driven instead of object driven, which will make maintaining the code a lot harder.
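To make the byte code idea concrete, here is a minimal sketch of the general technique: the template is turned once into a flat list of instructions, and a small interpreter loop then expands it as many times as needed without creating tag or handler objects.  The opcode names, the instruction layout, and the `expand` helper are all invented for illustration; this is not the instruction set used by SimpleTAL or Zope.

```python
# Opcodes for a toy instruction set (hypothetical, not SimpleTAL's or Zope's).
TEXT, CONTENT, REPEAT = 0, 1, 2


def run(program, context, out):
    """Interpret a compiled program, appending output fragments to `out`."""
    for opcode, arg in program:
        if opcode == TEXT:
            out.append(arg)                 # literal text is copied straight through
        elif opcode == CONTENT:
            out.append(str(context[arg]))   # look up a variable and emit its value
        elif opcode == REPEAT:
            var_name, seq_name, body = arg
            local = dict(context)           # one copy per repeat, not per item
            for item in context[seq_name]:
                local[var_name] = item
                run(body, local, out)


def expand(program, context):
    """Expand a compiled program against a context and return the text."""
    out = []
    run(program, context, out)
    return "".join(out)


# Hand-compiled form of a list template whose <li> repeats over "items".
PROGRAM = [
    (TEXT, "<ul>"),
    (REPEAT, ("item", "items", [
        (TEXT, "<li>"),
        (CONTENT, "item"),
        (TEXT, "</li>"),
    ])),
    (TEXT, "</ul>"),
]

print(expand(PROGRAM, {"items": ["a", "b"]}))
```

The trade-off mentioned above shows up even at this size: the behaviour lives in the shape of `PROGRAM` and one `if`/`elif` chain rather than in named classes, which is quick to run but less obvious to read.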

Monday, 27 January 2003

10:35 PM - Our wind turbine

We were on our way to the airport, on a long and indirect series of flights back home for Christmas, when I first saw our neighbourhood wind turbine.  The fact that up to then we had failed to notice a 94 meter tall wind turbine protruding out of the cityscape indicates how far away from it we live.  Since returning to Toronto I have spent the odd moment looking out over our deck to see whether or not we can see it from our flat, and thought several times that I could maybe see it, hiding behind a tree in the distance.

When I awoke this morning and glanced out over the deck I noticed, in the near distance, the turbine happily rotating away.  It is indeed behind a tree some distance away, but quite visible when it's turning.  Apparently it cost approximately C$1.2M to build (it's not clear if any maintenance is included in that figure) and will generate 1,800 MWh of electricity a year, which at market rates (now fixed by the government at 4.3c/kWh for most consumers) could bring in $77K per year.
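The revenue figure is simple arithmetic on the numbers quoted above, sketched here so that it is easy to re-run with different prices.  The simple payback line is an extra illustration, not a figure from the source: it ignores maintenance, financing, and any variation in output.

```python
ANNUAL_OUTPUT_MWH = 1800        # quoted annual generation
PRICE_CENTS_PER_KWH = 4.3       # quoted fixed consumer rate
BUILD_COST_DOLLARS = 1_200_000  # quoted approximate build cost

annual_kwh = ANNUAL_OUTPUT_MWH * 1000
annual_revenue = annual_kwh * PRICE_CENTS_PER_KWH / 100   # cents -> dollars

# Naive payback period: build cost divided by yearly revenue (assumption:
# no running costs and a constant price, which is optimistic).
simple_payback_years = BUILD_COST_DOLLARS / annual_revenue

print(f"Annual revenue: ${annual_revenue:,.0f}")
print(f"Simple payback: {simple_payback_years:.1f} years")
```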

As such it's only a symbolic and public awareness development, but it seems that when a large site is developed the cost of wind generated electricity is pretty comparable to a new coal station and close to that of a modern gas fired station (see the excellent British wind energy site and this rather more independent FT article).  Wind energy is only part of the answer (apparently only 10% of the UK's electricity needs can be met this way before reliability becomes an issue), but it is still nice to have our own neighbourhood turbine, and even nicer to see how quickly wind power generation is being built.

Sunday, 26 January 2003

10:50 PM - A new version of SimpleTAL released

I received an email this morning from Thomas Weholt which detailed an interesting problem he encountered when using SimpleTAL.  The source of the problem turned out to be that the path resolution rules being used would match an attribute before looking at the mapping an object provides (more details here).  I spent some time looking at what might be a good fix for this, and then found that Zope behaves the same way, so for now I've left the implementation as is.
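A minimal sketch of the kind of lookup order described above, assuming each path step tries an attribute first and only falls back to the mapping when no attribute exists.  The `resolve_path` function and the `Record` class are hypothetical; the real rules in SimpleTAL and Zope may differ in detail.

```python
def resolve_path(path, context):
    """Resolve 'a/b/c' against `context`, trying attributes before mapping keys."""
    parts = path.split("/")
    current = context[parts[0]]
    for name in parts[1:]:
        if hasattr(current, name):
            current = getattr(current, name)   # attribute wins if it exists
        else:
            current = current[name]            # otherwise treat it as a mapping
    return current


class Record(dict):
    """A mapping that also happens to carry an ordinary attribute."""
    title = "attribute title"


context = {
    "record": Record(title="mapping title", author="Thomas"),
    "data": {"keys": "my value"},
}

print(resolve_path("record/title", context))    # the attribute, not the key
print(resolve_path("record/author", context))   # no such attribute, so the key
print(callable(resolve_path("data/keys", context)))
```

The last line shows why the ordering can surprise people: a plain dictionary with a key called `keys` resolves to the dictionary's own `keys` method rather than to the stored value.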

The research however got me looking at another potential problem: when content is included using the "structure" keyword any TAL attributes included will be expanded.  This allows for some very cool and interesting things, but it does present a problem when you need to display user input strings using structure.  The problem is that the user's input has access to all of the attributes of all objects that are included in the context, which is a potential security problem.  I was in two minds as to whether or not I should provide a way of disabling this, so I again checked on how Zope handled this situation, and I found that it would not expand TAL included in this fashion.  So that both behaviours are available I've now added an "allowTALInStructure" parameter which will control whether any TAL found in "structure" content will be expanded.  I also found, during the creation of some unit test cases for XML templates, that SimpleTAL 1.0 could not handle content included using "structure".  Thankfully that turned out to be a one line fix.

The end result is that I've just uploaded version 1.1 of SimpleTAL.  I've run through all of the unit test cases I have, and compared the results of my weblog program using the new version to the old, and everything seems to still work.

Saturday, 25 January 2003

5:57 PM - Tech round up

Here are some links to a few tech articles that have caught my eye over the last few days.  First up, a problem with RSS - it seems that lots of sites out there are not creating valid XML files for their RSS feeds, and so aggregators are being modified to handle not just XML but malformed XML as well.  An article by Mark explains why this is happening, but provides no ideas on how to deal with it.

Why should anyone care whether their RSS feeds are valid XML?  Well if they are valid XML files it means that they can be used by other programs.  If they are not valid then they can only be used by certain programs, and so the cost of software rises (fewer features because people are spending their time writing parsers to handle bad XML, or more costly to cover the extra effort).  What was really surprising about the article (on xml.com) was to note that even Scripting News, a site run by someone who is responsible for one of the most popular RSS aggregators, occasionally publishes bad XML!  There really is no excuse for this lack of quality in RSS feeds: XML processing tools are freely available and easy to use, so why do people insist on rolling their own that don't work?

Another story, this time an interview on the art of programming, and how it might be improved  (via Slashdot).  It's a very theoretical discussion, but an interesting one that has some relevance to my previous thoughts on RSS.  The idea expressed is that programming doesn't scale to large systems well because you only need a small bug in one piece to cause a large failure, rather than a failure that is on the scale of the original defect.  The solution proposed is that systems should communicate using pattern recognition rather than via defined protocols.  This approach would endorse the idea of having XML parsers handling bad XML rather than complaining; software modules should extract whatever information they can out of what they are given rather than demanding that it matches a well defined protocol.

An alternative that I would promote instead is that software should demand all communication be done using well defined protocols, but that it should make no assumptions as to what the information means to others, or care about any extra information that may be present.  In practice this would mean that software should demand valid XML, and then it should extract from that XML whatever it finds interesting and ignore the rest.  This means that a bug in a software module is localised to a specific set of information; the rest of the system carries on running, with only modules that rely on that piece of information affected.
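As a small sketch of this "strict about the protocol, relaxed about the content" approach, the example below uses Python's standard `xml.etree.ElementTree` parser, which refuses XML that is not well-formed, and then picks out only the elements it cares about.  The feed layout and the `extract_titles` helper are made up for illustration, and namespaced feed formats would need extra handling.

```python
import xml.etree.ElementTree as ET


def extract_titles(xml_text):
    """Return item titles from a feed; raises ET.ParseError if not well-formed."""
    root = ET.fromstring(xml_text)       # strict: bad XML is rejected outright
    titles = []
    for item in root.iter("item"):
        title = item.findtext("title")   # take what we understand...
        if title is not None:
            titles.append(title.strip())
    return titles                        # ...and silently ignore everything else


GOOD = """<rss><channel>
  <item><title>First</title><mood>cheerful</mood></item>
  <item><title>Second</title></item>
  <item><link>http://example.org/no-title</link></item>
</channel></rss>"""

BAD = "<rss><channel><item><title>Broken</channel></rss>"

print(extract_titles(GOOD))
try:
    extract_titles(BAD)
except ET.ParseError as error:
    print("rejected:", error)
```

The unknown `mood` element and the item without a title do no harm, while the malformed feed is refused before any of it is used.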

Finally, as most people reading this will already have found out first hand, the Internet was struggling today thanks to the spread of an SQL Server worm.  The thing that this highlighted to me was not the number of people running un-patched versions of the software (not unexpected), but rather the number of people who have made their databases accessible from the Internet directly.  There seems little reason why anyone would do this, but the sheer volume of traffic generated by this thing shows that a very large number of people indeed have databases running open on the network.  It's also a classic example of a small defect in one module having a disproportionately large effect on the whole system.  It would be relatively easy for networking switches and firewalls to match patterns of network usage that could be deemed 'unusual' and so drop packets that fall into this category.  If this is what Jaron Lanier is referring to in his interview then I can see what he means, but I would think of it as just robust programming, rather than a huge change in how we think about software.

Wednesday, 22 January 2003

10:16 PM - European Convention

A fairly good article by the BBC on the recent strengthening of the French/German alliance.  The timing of these developments is interesting, and I'm not sure what to make of it.  My personal reaction is to think about the current work of the convention on the future of the EU, and to consider that any constitutional arrangement will have to ensure that a French/German alliance does not dominate policy. 

This is also likely to be the response of the leaders of the other members of the EU - and surely France and Germany know this.  So could it be that this is exactly the response that the pair (or one of the pair) is looking for?  If so why?  I suppose it might push the federal cause a little further ahead, but I'm not sure it works that much.  Another answer might be that they are trying to concentrate minds - France and Germany are moving forward on European integration, so other countries need to come forward with commitments on integration if they don't want to be left behind.

Hopefully I'll find some ideas on this out there somewhere...

7:52 PM - It's rather chilly out

It's been rather cold out recently.  It's not cold in the British sense of "it's been really cold recently, there was a frost on the ground this morning!", rather it's been cold in the "beware you don't freeze to death on your way to work".  This morning it was around -20C and, according to Environment Canada, it's currently -16C.  That's without the wind chill.  Thankfully this morning there was little in the way of wind, but tonight there is enough to put the forecast at a wind chill of -35C.

So it's cold.  Despite this coldness however I noticed, on the way home from work, that there are still a couple of shops in Chinatown that have their shop fronts completely open.  When I type "shop fronts" I really mean it - the whole front of the shop - open to the elements, which currently means -16C.  The increasing costs of energy in Ontario don't seem to be biting as hard as perhaps they should.

Tuesday, 21 January 2003

7:41 PM - Updates to SimpleTAL pages

I've had some great feedback on my SimpleTAL library, and a few questions.  The original pages that I put up were a little spartan, even by my standards, but I've been adding to them over the last couple of days to try and make them a little more informative.  I've added a couple of examples that show how to use the library, and a page documenting the differences between this implementation of TAL and the Zope version.

It would be nice to add pages demonstrating each of the different TAL attributes and how they work, but it's a fair amount of work, so for now I'm relying on the Zope documentation.  An aspect of the documentation that I will work on however is a description of the SimpleTAL API.  It is very easy to work out from the source, but it's much nicer and easier to have it put into a web page instead.

Monday, 20 January 2003

7:41 PM - Shoe laces

One of my shoe laces broke this morning, leaving just enough lace left to keep my shoe on my foot.  At lunch I went to purchase a replacement shoe lace, and thankfully the local chemist had them.  I was expecting that I would have to buy a pair of shoe laces, instead of the one that I needed, but I was wrong.  I had, in fact, to purchase two pairs of shoe laces instead.

Shoe laces also come in multiple lengths, with a handy (inaccurate) chart on the packaging indicating what length you may need based on the number of eyelets your shoes have.  Sod's law - my shoes fall at the upper end of one length recommendation.  Still I got the size indicated, and although they are a little on the short side, they will do.  The question remains however why you have to buy two pairs, with a single pair not being an option?  How many people have two pairs of identically coloured shoes, with the same number of eyelets, that suffer broken laces at the same time?  If shoe laces have to be sold four at a time, why can they not at least put two different sizes in the same packet, so that you can buy in the confidence that at least one of them will be correct?

Friday, 17 January 2003

12:20 AM - Release of simpleTAL

The weblog system that I have put together is based on the use of a template language called TAL.  TAL is part of Zope, the large Python-based CMS, and it relies on various C modules that come as part of Zope.  To use TAL I had to write my own implementation or work out a way of making the Zope version work without Zope (others have since done this using the original, but it's not widely available).

In case this library is of any use to other people I'm putting it up on my website.  If you've never heard of TAL and do CGI programming in Python, or have other needs for a simple template language for HTML and XML, then take a look.  Start with the TAL link above, and then play with my implementation, SimpleTAL; if you like it then check out the rest of Zope.

Tuesday, 14 January 2003

10:25 PM - Something I didn't know

I'm reading (or rather skimming) the UK Government's consultation document on identity cards as I try and think of how I can compose a suitable email on the subject.  If you've not done so already, and care about the subject, then take a look at the stand website.

While looking through the document I saw the table of minimum ages that you need to be before you can do certain things in the UK, and learnt that you have to be 17 not just to drive a car, but also to purchase a crossbow.

Sunday, 12 January 2003

4:37 PM - War with Iraq

I've not previously written anything about the upcoming war with Iraq, mostly because I hadn't yet developed a view other than a purely instinctive one.  That instinctive reaction was to be against going to war, primarily because of how the case for doing so has been put across.  The poor, and so far unsupported, attempts to link Iraq to Islamic terrorism put me off the idea completely because it seemed that Bush and Blair were simply looking for any possible excuse to justify a war against Iraq.

Looking beyond the cobbled together excuses that were initially attempted there are some more serious arguments as to why a war with Iraq may be justified.  The top two reasons, in my mind, to go to war with a country are:
1 - The other country poses a threat to you
2 - What is happening inside that country is repugnant to your sense of morality
These reasons then need to be compared against the cost of pursuing a war, in terms of lives lost or damaged, and in terms of political/social results.  If, as in the case of North Korea, there is good justification on both fronts for an offensive, you still may not pursue that route because of the cost of doing so.

In the case of Iraq it's the first reason that concerns me the most, although not as someone living on the American continent, but rather as a European.  With the expansion of the EU to include Turkey, Iraq would suddenly have a border with the EU, and if Iraq had the opportunity to develop nuclear weapons then it would have very serious consequences for the security of the EU as a whole.  The recent attempt by the UK to justify an attack on Iraq on the basis of the second point, that the Iraqi regime is a horrible and brutal one, has not been taken too seriously because there are so many other countries that would fall into this category.  It's only when the brutality of another country reaches a very high level indeed that we feel the need to act - the cost otherwise is seen as too large (military intervention is never a clean business).

My opinion is that the best way to prevent Iraq from threatening others is to maintain intrusive weapons inspections, based on the best intelligence that the west can gather.  If it's found that Iraq is determined, despite constant inspections, to develop weapons that the rest of the world has proscribed as unacceptable, then force should be used.  This opinion is based partly on the costs that are likely to result from an invasion of Iraq.  If the example of Afghanistan is taken, it seems that following a take-over of Iraq we can expect a weak government that cannot even provide law and order within the country.  This situation is not only dangerous in the long term, it's also morally repugnant - to take over a country and leave it a lawless mess should be unacceptable.

1:44 AM - A few enhancements

Last night I noted that I should add a spell checker to my weblog program, and now I have.  The code is fairly simple, with no custom dictionaries or other fancy features, just: replace, replace all, skip, and skip all.  The actual spell checking is done by aspell, with the Python classes controlling it through a pipe.
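For anyone curious what driving aspell through a pipe looks like, here is a minimal sketch; it is not the code from the weblog program.  It assumes aspell is installed and run as `aspell -a`, which speaks the ispell-style line protocol: `*` for a correct word, `& word count offset: suggestions` for a misspelling with suggestions, and `# word offset` for one without.  The function names are hypothetical, and this version starts a new process for each line rather than keeping one pipe open.

```python
import subprocess


def parse_result_line(line):
    """Parse one result line of the ispell-style pipe protocol.

    Returns (word, suggestions) for a misspelling, or None for anything else
    (correct words, the version banner, blank separator lines).
    """
    if line.startswith("&"):
        head, _, tail = line.partition(":")
        word = head.split()[1]
        return word, [suggestion.strip() for suggestion in tail.split(",")]
    if line.startswith("#"):
        return line.split()[1], []       # misspelt, but no suggestions offered
    return None


def check_line(text, command=("aspell", "-a")):
    """Run one line of text through aspell (which must be installed)."""
    # A leading '^' tells the checker to treat the line as text, not a command.
    completed = subprocess.run(
        command, input="^" + text + "\n",
        capture_output=True, text=True, check=True,
    )
    results = (parse_result_line(line) for line in completed.stdout.splitlines())
    return [result for result in results if result is not None]


print(parse_result_line("& teh 3 0: the, tea, ten"))
```

The replace, replace all, skip, and skip all choices would then sit on top of the list that `check_line` returns.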

Additionally I've put up a "favicon", one of those little icons that can sit next to bookmarks.  It's very hard to draw anything visible at that size - and my drawing skills are somewhat lacking - so I've gone for a simple OF logo instead.  I find it easier to find bookmarks that have these icons for other sites that I use, so hopefully someone, somewhere, will also find this one to be of benefit.

Friday, 10 January 2003

11:41 PM - Free music every day

It's nice to be able to get a free piece of music every day.  The quality, and bizarreness (is that a word?), of the music doesn't matter so much as the opportunity to listen to something that you will certainly not accidentally hear on the radio during your day.  The only show that I can imagine ever playing any of this stuff would be John Peel, and it's rare these days that I get to listen to his shows.

So, in case you haven't already discovered it elsewhere, take a look (and listen) to otisfodder.

Note to self: I must integrate a spell checker into my weblog software....

Tuesday, 7 January 2003

9:23 PM - Politicians versus independent experts

The Liberal Democrat home affairs spokesman Simon Hughes reckons in this BBC article that politicians should not set minimum sentences for crimes, only maximum sentences.  The article is about the recent proposals to set a minimum sentence for carrying a gun at 5 years, except in exceptional cases.  Simon's argument is that it should be up to judges and magistrates to ensure that the sentence given matches the severity of the crime, not for politicians (who can obviously only make laws based on their concept of an average case).

If parliament sets a maximum sentence for a crime it says, in effect, that judges and magistrates cannot be trusted to set a tariff that fits the severity of the offence in all cases.  It also says that regardless of how serious a particular offence was there is a limit to the punishment that society thinks should be attached to it (reflected through our elected representatives).  Surely then it is only reasonable that parliament can set a minimum sentence, that society can say that no matter how trivial the infringement a certain level of punishment is required?

There is a trend, commented on by others, of trying to keep politics out of large chunks of decision making processes.  This trend is aided by such examples as the independence of the central bank to set interest rates, which seems to be now universally seen as a success.  However there are very definite limits to when and how this can be applied.  In the case of monetary policy it is easy for parliament to instruct a group of people to target a particular inflation rate, and to give them the tool of interest rates with which to aim for this target.  It is not possible for parliament to give senior judges and magistrates the target of reducing crime and then give them the tool of sentencing by which to achieve this objective.  Unlike with monetary policy, there is no consensus on how crime behaves given different sentencing regimes, and so the structure of sentencing options is innately political in nature.

9:02 PM - Apple releases a web browser

I'm sure it's everywhere by now, but Apple have released a beta of a new web browser for MacOS X.  It's called Safari and I haven't seen it in action yet, because it only works on Jaguar onwards (that's 10.2).  The rendering engine is from Konqueror, so this is great news for the KDE project because it'll no doubt lead to some significant improvements in the quality of the software as bug fixes are sent back by Apple.

12:18 AM - Back home

I'm back home, feeling tired but refreshed from my holiday.  I'm not sure I've felt refreshed from a holiday before; I normally feel sad to see it go and normal life take over, but this time I actually feel like I want to get on with normal life.

Every now and again a country will be pursuing something that seems so small, in comparison to what is happening in the rest of the world, that it stands out.  Here's a classic example from Bjørn Stærk: "Yesterday, for instance, a man threatened to crash a plane into the building of the European Central Bank in Frankfurt. Meanwhile, three Norwegians are scheduled in court to dispute a $350 fine for hurling paper planes at the American Embassy in Oslo, over a year ago."

Saturday, 4 January 2003

4:33 PM - Network computing

It's remarkable that I can use my computer remotely from around the world (as shown by this post) via a simple modem connection.  I've used vnc before over a high speed network connection, where you can use the GUI of one machine on another, and barely notice the difference between the remote and the local version, but it's another thing to do this over a modem connection.  The display is certainly slow to update, but the fact that it's usable at all is a feat of software engineering.  If the whole screen, uncompressed, was sent with every key press then a chunk of data 768K in size would need to be sent over the modem connection.  With a connection of about 3.8K per second it would take nearly three and a half minutes just to send the one snapshot of the screen.  As it is, while typing this the delay is roughly one second for the text that I type to appear on the screen in front of me!
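A back-of-the-envelope version of the uncompressed case, using the 768K and 3.8K per second figures as quoted.  The assumption that 768K corresponds to a 1024 x 768 screen at one byte per pixel is a guess, not something stated in the post.

```python
SCREEN_BYTES = 768 * 1024          # assumed: 1024 x 768 pixels, one byte each
MODEM_BYTES_PER_SEC = 3.8 * 1024   # "about 3.8K per second"

seconds = SCREEN_BYTES / MODEM_BYTES_PER_SEC
minutes, remainder = divmod(seconds, 60)

print(f"{seconds:.0f} seconds, or about {minutes:.0f} min {remainder:.0f} s "
      "for one uncompressed full-screen update")
```

Set against the roughly one second delay seen in practice, that gap is the saving that comes from sending only the changed regions of the screen and compressing them.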

As network connections around the world improve the user experience of using a machine across ~3500 miles of ocean will get better, but given the restrictions that a modem places on us, we have already achieved an extremely good result.  (BTW If you wish to try this yourself then use TightVNC for use over a modem - the ordinary VNC requires too much bandwidth for bearable modem use).


Last Modified: Mon, 30 Jun 2003 23:20:59 BST

Made with PubTal 3.2.0

Copyright 2008 Colin Stewart

Email: colin at owlfish.com