htmlpp is a simple HTML pretty printer, based on nsgmls and SGMLS.pm. The code is pretty alpha, but gives attractive results for many HTML docs. Some things, like nested tables, are rendered only passably. Other deeply-nested structures may render badly as well.

Note that this pretty-printer is oldish, and alpha, and unlikely to be developed any further. It's not a bad illustration of some of the possibilities for SGML technology in web authoring. Perhaps someone will take up the challenge, and build the "right" tool!

Since htmlpp gets its input from nsgmls, invalid documents should not be expected to work. However, a side effect of this approach is that minor errors and inconsistencies are actually fixed. Attribute values are always quoted in the pretty-printed version. Characters like "<", ">" and "&" are converted into the appropriate SGML entities in attribute values and in document text. End tags are inserted automatically, which will surprise you if you thought it was legal to embed <pre> elements inside <p> elements, for example.
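
To make the character conversion concrete, here is a rough sketch of the same substitution done with sed. It is an illustration only: htmlpp gets this behaviour from nsgmls and SGMLS.pm, not from sed, and escape_sgml is a made-up helper name.

```shell
# Illustration only; htmlpp itself does not work this way.
# escape_sgml is a hypothetical helper name.
escape_sgml() {
    # Escape "&" first, so the ampersands introduced by the later
    # substitutions are not escaped a second time.
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

printf '%s\n' 'if a < b && b > c' | escape_sgml
# prints: if a &lt; b &amp;&amp; b &gt; c
```

Unlike a real parser, this sketch cannot tell markup from text, so it is only meaningful for raw character data that contains no tags or existing entities.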

Prerequisites

  • First, install James Clark's SP package, available at http://www.jclark.com/sp/, which provides the nsgmls parser.
  • Next you need David Megginson's SGMLS.pm package, available from the nearest CPAN archive.
  • Finally, you need the Text::Format Perl package, also available from CPAN.
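
A quick way to see whether the three prerequisites are already present might look like the sketch below. check_prereqs is a hypothetical helper, and it assumes that nsgmls and perl are expected on your PATH and that the Perl modules load under the names SGMLS and Text::Format.

```shell
# check_prereqs is a hypothetical helper; it only reports, it installs nothing.
check_prereqs() {
    missing=0
    # nsgmls comes from the SP package.
    command -v nsgmls >/dev/null 2>&1 ||
        { echo "missing: nsgmls (SP package)"; missing=1; }
    # Ask perl to load each module and do nothing else.
    for mod in SGMLS Text::Format; do
        perl -M"$mod" -e 1 >/dev/null 2>&1 ||
            { echo "missing: Perl module $mod"; missing=1; }
    done
    return "$missing"
}
```

Running check_prereqs prints one line per missing piece and returns non-zero if anything is absent.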

Getting htmlpp

You can download htmlpp from this site (about 25K). You can also download it from sunsite at http://sunsite.unc.edu/pub/Linux/apps/www/converters/.

Installation

	tar xvzf htmlpp-0.1.tar.gz
	cd htmlpp-0.1
	./configure && make install
    

Using htmlpp

htmlpp is pretty simple to use. It accepts input on stdin, or from the file specified as a command-line argument. The prettified output is directed to stdout.
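
As a small sketch of both calling styles, assuming htmlpp is installed and on your PATH. The wrapper name pretty_print and the file names are placeholders, not part of htmlpp.

```shell
# pretty_print is a hypothetical wrapper around htmlpp.
# Usage: pretty_print input.html output.html
pretty_print() {
    if ! command -v htmlpp >/dev/null 2>&1; then
        echo "htmlpp not found on PATH" >&2
        return 127
    fi
    # Input file given as a command-line argument; stdout captured in a file.
    htmlpp "$1" > "$2"
}

# The stdin form does the same job inside a pipeline:
#   htmlpp < page.html > page-pretty.html
```

Because the output goes to stdout, write it to a new file rather than redirecting it over the input.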

Contact

Send any patches, bugs, complaints, free beer, etc. to me, Len Budney <lbudney@pobox.com>.

 


Len Budney
lbudney@pobox.com
Copyright © 1998 - 2004
Page generated: 20:08:06 21-Dec-2004