Mapping DTDs: making validation usable

21 May 2004

21 comments

A Blue Perspective: Mapping DTDs: making validation usable

» DTD Mapper «

One of the qualms people have with validating pages is that they never know what they're validating against. The W3C publishes documents called Document Type Definitions (DTDs) that strictly specify the grammar and form that HTML/XHTML documents must follow in order to validate as HTML 4.01, XHTML 1.0, or XHTML 1.1. In the not too distant future, web sites will be able to supply their own DTDs, letting them create valid XML documents with a custom syntax all their own. (IBM already uses one.)

However, while DTDs are perfect for letting a computer know whether a document is valid, just try reading one with your own human eyes. I'd give you a cheque for $50.00 if you could look at the XHTML 1.0 Strict DTD and tell me in five minutes what elements are allowed to be nested inside a cite tag. (Cheques will not be honoured.)
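To show why this is hard to do by eye, here is a minimal JavaScript sketch of the lookup involved. It is not the DTD Mapper's code (that relies on the perlSGML modules); the function names and the sample DTD fragment are made up for illustration. It assumes the DTD only uses internal parameter entities with quoted values, and it ignores external entities, conditional sections and attribute lists. An element's content model is usually written as a reference to an entity, which in turn refers to other entities, so they all have to be expanded before the allowed children become visible.

```javascript
// Sketch only: handles internal parameter entities with quoted values,
// not external entities, conditional sections, or ATTLIST declarations.

// Collect <!ENTITY % name "value"> declarations into a Map.
function parseParameterEntities(dtd) {
  const entities = new Map();
  const re = /<!ENTITY\s+%\s+([\w.:-]+)\s+(?:"([^"]*)"|'([^']*)')\s*>/g;
  let m;
  while ((m = re.exec(dtd)) !== null) {
    // When an entity is declared more than once, the first declaration is binding.
    if (!entities.has(m[1])) {
      entities.set(m[1], m[2] !== undefined ? m[2] : m[3]);
    }
  }
  return entities;
}

// Replace %name; references repeatedly until nothing changes.
function expandEntities(text, entities, depth = 0) {
  if (depth > 50) throw new Error("Parameter entities nested too deeply");
  const expanded = text.replace(/%([\w.:-]+);/g, (ref, name) =>
    entities.has(name) ? entities.get(name) : ref
  );
  return expanded === text ? text : expandEntities(expanded, entities, depth + 1);
}

// Return the sorted element names allowed inside `element`, or null if undeclared.
function allowedChildren(dtdSource, element) {
  const dtd = dtdSource.replace(/<!--[\s\S]*?-->/g, ""); // strip comments first
  const entities = parseParameterEntities(dtd);
  const escaped = element.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const decl = new RegExp("<!ELEMENT\\s+" + escaped + "\\s+([^>]+)>").exec(dtd);
  if (!decl) return null;
  const model = expandEntities(decl[1], entities);
  // Pull out every name in the content model; #PCDATA is kept as one token
  // so that it can be filtered out along with the EMPTY and ANY keywords.
  const names = model.match(/#?[A-Za-z_][\w.:-]*/g) || [];
  const elements = names.filter(
    (n) => !n.startsWith("#") && n !== "EMPTY" && n !== "ANY"
  );
  return [...new Set(elements)].sort();
}

// Made-up DTD fragment in the style of XHTML, not the real thing.
const sampleDtd = `
<!ENTITY % phrase "em | strong | cite">
<!ENTITY % inline "a | %phrase;">
<!ENTITY % Inline "(#PCDATA | %inline;)*">
<!ELEMENT cite %Inline;>
<!ELEMENT br EMPTY>
`;

console.log(allowedChildren(sampleDtd, "cite")); // [ 'a', 'cite', 'em', 'strong' ]
```

Even in this toy fragment the answer for cite sits three entity references away from its declaration, and a real DTD nests considerably deeper.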

As Andrei pointed out, the W3C don't help matters with their deeply hidden documentation and under-described recommendations. So, I've written a program that takes the legwork out of it for you. Using the DTD Mapper, all you have to do is paste in the URL of the DTD that your pages are meant to be validating against (that URL in the weird DOCTYPE line at the top of your source code), and you'll get a nice, collapsible list of all valid elements, with their attributes, possible child elements and possible parent elements.
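If you're not sure which part of the DOCTYPE line is the URL, it's the last quoted string (the "system identifier"). A minimal sketch of pulling it out in JavaScript follows; the function name is hypothetical, and the pattern assumes the usual PUBLIC or SYSTEM form of the declaration with no internal subset.

```javascript
// Hypothetical helper: return the DTD URL (system identifier) from a
// DOCTYPE declaration, or null if the declaration doesn't include one.
function dtdUrlFromDoctype(source) {
  const pattern =
    /<!DOCTYPE\s+[^\s>]+\s+(?:PUBLIC\s+(?:"[^"]*"|'[^']*')\s+|SYSTEM\s+)(?:"([^"]*)"|'([^']*)')\s*>/i;
  const match = pattern.exec(source);
  if (!match) return null;
  // The URL may be wrapped in double or single quotes.
  return match[1] !== undefined ? match[1] : match[2];
}

const page =
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" ' +
  '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html></html>';

console.log(dtdUrlFromDoctype(page));
// http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
```

A DOCTYPE that only carries a public identifier and no URL gives null here, in which case you'd have to look the DTD's URL up yourself.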

Because there's quite a lot of repetition in attributes, nesting, etc., the HTML files that the mapper produces are rather large (> 500 kB), so a read-out takes a while to download. The expanding/collapsing JavaScript won't work until the whole page has arrived, so wait for the download to finish.

Of course, automated tools like this won't be able to tell you everything about the specifications for a document – there's a large gap that can only be filled by human description – but it should be handy for some of your queries. If you have any suggestions for improvements, just post them here!

Tested in IE6, Firefox 0.8 (Win). Thanks to Earl Hood for the perlSGML parsing modules.

Comments

  1. 1/21

    Simon Jessey commented on 21 May 2004 @ 03:07

    Wow, Cameron! That is a really nifty piece of work. You are getting a reputation for clever bits of programming.

    One possible enhancement would be to allow local files to be read, much like the markup validator does here: http://validator.w3.org/

  2. 2/21

    Joe Clark commented on 21 May 2004 @ 05:39

    Try pasting http://www.literarymoose.info/=/dtd/xhtml11e.dtd in. I can't get it to work.

    It seems the second field-- the pull-down menu with XHTML DTDs-- is not needed when trying to validate a custom DTD.

  3. 3/21

    ACJ commented on 21 May 2004 @ 10:42

    Dude, genius. This is so going into the resources chapter of my static bookmarks.

  4. 4/21

    The Man in Blue commented on 21 May 2004 @ 11:28

    Joe: I think the parser I'm using has trouble with the XHTML 1.1 method of importing different modules; that's why I used the XHTML 1.1 Flat file in the drop-down menu.

  5. 5/21

    Justin French commented on 21 May 2004 @ 14:53

    Mate, amazing idea, works well in Firefox, but doesn't work in Safari... I get the collapsed items, but can't expand any of the tree nodes... JS is definitely on.

  6. 6/21

    Matt Pennell commented on 21 May 2004 @ 18:26

    Incredible work - although I don't fancy accessing the page on my home dial-up!

  7. 7/21

    Giacomo commented on 22 May 2004 @ 00:24

    Definitely helpful. Given the nice tree structure your script builds, how hard would it be to put up a quick search for "which tags can contain this tag"? At that point, you could ask the W3C for some money for building one of the most useful tools ever ;)
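A "which tags can contain this tag" lookup boils down to inverting the child lists. The sketch below assumes the allowed children are already available as a plain object keyed by element name; that data shape, the function name and the sample data are all hypothetical and not taken from the mapper itself.

```javascript
// Hypothetical input: element name -> array of allowed child element names.
// Returns a Map of element name -> array of elements that may contain it.
function invertChildren(children) {
  const parents = new Map();
  for (const [parent, kids] of Object.entries(children)) {
    for (const kid of kids) {
      if (!parents.has(kid)) parents.set(kid, []);
      parents.get(kid).push(parent);
    }
  }
  return parents;
}

// Made-up sample data, far smaller than any real DTD.
const childrenOf = {
  ul: ["li"],
  ol: ["li"],
  li: ["ul", "ol", "em"],
};

const parentsOf = invertChildren(childrenOf);
console.log(parentsOf.get("li")); // [ 'ul', 'ol' ]
```

Because each parent list is derived from the child lists, the two views always agree with each other when built this way.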

  8. 8/21

    Marco "Bazzmann" Trevisan commented on 22 May 2004 @ 01:29

    That's the killer application!

    Good work. Given the complexity of the data that the mapper has to handle, and that humans have to use, I think a future improvement to the mapper's GUI could make it more usable than it is now. This is just my opinion and my two cents. :)

    If you wish, I could try to come up with something.

  9. 9/21

    Henrik Lied commented on 23 May 2004 @ 01:45

    Pretty nice, dude. I keep getting more and more amazed.

  10. 10/21

    Unearthed Ruminator commented on 26 May 2004 @ 00:06

    I agree with Giacomo, if you could search and/or filter this list, it would be even more amazing. Great work.

  11. 11/21

    Chris commented on 27 May 2004 @ 21:37

    Regarding the search, isn't that what the Parents tab does?

    This is a very handy utility indeed :)

  12. 12/21

    Chris (again) commented on 27 May 2004 @ 22:13

    Something strange I just noticed though - the ul element is showing as having no children on XHTML Transitional 1.0.

    However if you view the li it shows a parent of ul :)

  13. 13/21

    Hallvord R. M. Steen commented on 29 May 2004 @ 22:19

    Looks very nice but the script runs into problems with nextSibling being a whitespace node in Opera. This may be the problem someone mentioned with Safari as well.

  14. 14/21

    Anonymous commented on 30 May 2004 @ 01:51

    Hmmmm ... what's a whitespace node?

  15. 15/21

    Hallvord R. M. Steen commented on 30 May 2004 @ 02:25

    Spaces, linebreaks and TABs between tags may be parsed as individual nodes in the DOM. (I'm not sure what determines such parsing.)

    Cameron's JavaScript seems to fail in some browsers because the script changes the className of the node you click on and of the next node, expecting both to be elements. In some browsers the next node is in fact a text node containing just linebreaks and tabs, and changing its className does exactly nothing. It is a small thing to fix: just check whether the nodeType of the next node is 1 (an element) and keep looking if it is 3 (a text node).
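The nodeType check described above could look roughly like this. It is a sketch and not the mapper's actual script: the function names and the "collapsed"/"expanded" class names are assumptions made for illustration.

```javascript
// Walk forward from `node` until an element (nodeType 1) is found,
// skipping whitespace text nodes (nodeType 3) and anything else.
function nextElement(node) {
  let sibling = node.nextSibling;
  while (sibling && sibling.nodeType !== 1) {
    sibling = sibling.nextSibling;
  }
  return sibling || null;
}

// Hypothetical click handler body: flip the class on the clicked node
// and on the element that follows it. The class names are made up.
function toggle(clicked) {
  const target = nextElement(clicked);
  if (!target) return false; // nothing to expand or collapse
  const next = target.className === "collapsed" ? "expanded" : "collapsed";
  clicked.className = next;
  target.className = next;
  return true;
}
```

In a browser you would call toggle(this) from the click handler of each tree label. Browsers that don't create whitespace nodes simply find the element on the first step, so the same code works in both cases.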

  16. 16/21

    Kyrre commented on 29 July 2004 @ 08:04

    I can't seem to get it to work... Every DTD I try, I get an error message, here's the one from XHTML 1.0 Transitional from the drop-down list:

    ERROR: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd could not be parsed as a DTD.

  17. 17/21

    The Man in Blue commented on 29 July 2004 @ 11:05

    Yeah, I think one of the Perl packages went missing when I moved servers. I'll have to take a look.

  18. 18/21

    The Man in Blue commented on 10 August 2004 @ 02:22

    OK, it's back up now!

  19. 19/21

    The Man in Blue commented on 10 August 2004 @ 02:42

    And as an added extra bonus, the collapse/expand JavaScript now works in Opera :o]

  20. 20/21

    Tom Levine commented on 13 December 2004 @ 06:10

    What about the pull model, which provides non-cached, read-only access to the XML file, and the XmlReader object? Using XmlValidatingReader, a class derived from XmlReader, you can parse and validate against a DTD.

  21. 21/21

    Michael commented on 31 January 2005 @ 03:57

    Great stuff!

    The attribute listing is great, but I'm not getting parents or children for any element in any DTD.

    Just me?
