{"id":146,"date":"2003-10-13T22:27:15","date_gmt":"2003-10-13T22:27:15","guid":{"rendered":"tag:owlfish.com,2004:colinweblog.20031013222715"},"modified":"2003-10-14T05:33:31","modified_gmt":"2003-10-14T05:33:31","slug":"13102003","status":"publish","type":"post","link":"https:\/\/www.owlfish.com\/weblog\/2003\/10\/13102003\/","title":{"rendered":"Writing content for web pages"},"content":{"rendered":"<p>Last week I wrote a short article on the <a href=\"https:\/\/www.owlfish.com\/weblog\/2003\/10\/06102003.html#22:01:58\">importance of a template based solution for web page<\/a> maintenance, and the sort of innovations that could be made to ease template design.  At the end of that article I noted two other problems with web publication tools today: markup of the content, and the handling of non-journal style pages.  This article addresses some thoughts I&#8217;ve had on the first of these two problems.<\/p>\n<p>The most popular template-based systems today are those provided by blogging software.  They allow an author to enter new web content either using a web browser (thin client) or a small application (fat client).  When the author decides to include a link, make some text bold, or apply some other markup, the most common solution is to have them enter HTML codes.<\/p>\n<p>Alternatives to entering the HTML manually include utilising IE-specific enhanced textarea widgets, using a different markup language such as <a href=\"http:\/\/www.textism.com\/tools\/textile\/\">Textile<\/a>, or providing buttons that automatically insert the HTML tags.<\/p>\n<p>The markup in which an author&#8217;s content is written, and the markup in which it is published, must be treated separately, even if they happen to be the same.  The reason for this is that an evolving web also means evolving markup for web pages, for example the transition between HTML4 and XHTML1.  
When an author of a site chooses to move their pages from HTML to XHTML, the software they use needs to be able to rebuild old pages using XHTML.<\/p>\n<p>For software to be able to perform transformations of markup from one language to another, it needs to be able to parse the original markup perfectly.  If the original markup is HTML, this poses two problems: writing and parsing correct HTML programmatically is fairly difficult, and if users enter markup by hand then there <b>will<\/b> be errors in it.  The inability to convert cleanly to a new publishing markup language is a major defect in all of the blogging tools today that store and accept content using HTML markup.  It is a hole that can be coded out of, but never in a 100% satisfactory way.<\/p>\n<p>The solution to this problem requires a combination of three things:<\/p>\n<ol>\n<li>The adoption of a strict, easy-to-parse format for authoring content.<\/li>\n<li>The rejection of any content which does not adhere completely to this strict format.<\/li>\n<li>The use of tools, rather than users, to generate this strict format.<\/li>\n<\/ol>\n<p>Of these three items, the critical piece missing today is the third one: a GUI tool that allows the markup of web content in a strict, easy-to-parse format.  The bare minimum that such a tool should be able to support includes: links, text decoration (bold, italic, etc.), lists (bullet and numeric), and images.  There are lots of other types of markup which would be very useful (e.g. tables), but for most web content this limited list would suffice.  Today there are many weblog authors whose tools and knowledge are such that they don&#8217;t use even the most basic markup in their content.  A GUI application supporting these features, and whose output is in a strict format, would be enough to bring painless, sustainable content authoring to a much wider audience.<\/p>\n<p>Writing such a tool, while not technically difficult, does take time and effort.  
I hope to one day soon find an open source tool to do this.  In the meantime however I have a partial solution: <a href=\"http:\/\/www.abisource.com\/\">AbiWord<\/a>.<\/p>\n<p>AbiWord is an open source word processor.  A word processor isn&#8217;t really the best choice of tool for editing web content, simply because it has too many features that are not needed or do not apply to the web.  For example AbiWord supports Mail Merge, multiple document sections with different headers and footers on the pages, and other such features that are needed for document creation, but not for editing web content.<\/p>\n<p>Despite these drawbacks the use of AbiWord does bring some significant advantages:<\/p>\n<ul>\n<li>It is a full GUI: the user has no opportunity or need to edit markup themselves.<\/li>\n<li>It has many useful features such as spell check as you type.<\/li>\n<li>The output format is in XML, making it fairly easy to convert into HTML\/XHTML or any other markup language.<\/li>\n<\/ul>\n<p>To see whether or not this can work I&#8217;ve written a plugin for <a href=\"http:\/\/www.owlfish.com\/software\/PubTal\/\">PubTal<\/a> which takes AbiWord documents, converts them to HTML markup, and then publishes them using PubTal templates.  There is still much testing to be done, but it now handles: headings, text decoration (bold, italic, underline, strikeout, overline, superscript, subscript), pre-formatted text, hyperlinks, bookmarks (anchors), bullet lists, numeric lists, footnotes\/endnotes, and tables.<\/p>\n<p>The biggest missing feature is the ability to include images in the content.  The problem here is that AbiWord doesn&#8217;t record the original location of the image file &#8211; it just places the binary content (encoded using base64) into the XML file.  
I can probably live with that restriction for most pages, at least until I can find a better solution.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week I wrote a short article on the importance of a template based solution for web page maintenance, and the sort of innovations that could be made to ease template design. At the end of that article I noted two other problems with web publication tools today: markup of the content, and the handling [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/posts\/146"}],"collection":[{"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/comments?post=146"}],"version-history":[{"count":0,"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/posts\/146\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/media?parent=146"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/categories?post=146"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.owlfish.com\/weblog\/wp-json\/wp\/v2\/tags?post=146"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}