[html4all] Interview: HTML 5 Editor Ian Hickson discusses features, pain points, adoption rate, and more

Robert J Burns rob at robburns.com
Sun Aug 31 02:35:40 PDT 2008


Hi Leif,

On Aug 31, 2008, at 12:47 AM, Leif Halvard Silli wrote:

> Robert J Burns 2008-08-30 21.43:
>
>
>>> [...] error handling, if defined back then, could have differed
>>> from both XML and HTML 5.
>>
>>
> First you say "Only XML style error handling ...". Then you say
> "Ian-style error handling does not ..." Above we agreed that "it
> could differ", from both Ian and XML handling.
>
> Ian-style handling would not have been possible from the start,
> because so many errors have developed until now which we did not
> have from the start!

By Ian-style handling, I'm simply referring to silent (or at least
pretty quiet) error recovery. That certainly could have been there
from the start, and in fact was there from very early on. The other
thing Ian brings to the table now is an attempt at unambiguously
specifying the error handling and, hopefully, adoption of that silent
error recovery across UAs. My sense from your iCab example is that you
are talking about error recovery that is not so silent. So perhaps
every browser by default would provide some non-modal feedback
whenever a page has errors. I agree that this is neither Ian-style nor
draconian, but I couldn't tell what you were referring to until you
brought up iCab. My original point still remains, though: Ian-style
(nearly silent) error recovery would not lead to less tag soup.
Requiring UAs to give the user obvious feedback for errant documents,
even without specifying the error recovery, would have led to less tag
soup, but I still don't see Ian supporting something like that.

> [...]
>
>> So how could it cause a reduction in 'tag soup'?
>
>
> Answer:
>
>
>>> Still, if all browsers were required to handle errors and
>>> exceptions "mildly", but in the same way, then they would not have
>>> had to invent this handling by themselves. This would have led
>>> to fewer errors being accepted. Unlike today, when some browser
>>> does it that way, another that way, a third that way - and as a
>>> result, each browser has to support all the different ways of
>>> handling errors, in order to be compatible.
>>
>> So are you and Ian using the phrase 'tag soup' to describe the various
>> ways browsers handle errant HTML?
>
>
> To say "treat something as tag soup" has become synonymous with
> "to parse something as text/html". This, again means parse html
> with the undefined error handling that has developed over time.
>
> This is not how I used the term, however.

I'm not using it that way either.

> What I said was that the fact that the UAs handle, bless, excuse
> or don't accept tag soup/syntax errors based on largely common
> but undefined error handling rules, has let many more errors =
> much more tag soup to grow and develop than we would have seen if
> error handling had been defined from the start.

Defining error handling does not do it per se. It is the immediate  
feedback that errors exist across all UAs that would reduce tag soup.  
Again, Ian's contention was that defining error-handling would reduce  
tag soup. However, if you fully define error-handling, but silently  
render the page, then there's no way tag soup will be reduced. So  
specifying error alerts reduces tag soup, but specifying error
recovery (which is what I understand Ian to mean by well-specified
error handling) does not.

>> The reason XML error handling reduces
>> tag soup is that the author is made immediately aware of the errors
>> (the soupiness of their tags) even without using a validator or
>> conformance checker. Upon seeing their soup they fix it so that they
>> can publish their documents. Ian-style error handling masks the
>> errors of the authors (and their authoring tools) from the authors.
>
> Do you use 'Ian-style' as a synonym for "the different but
> somewhat similar error handling that all HTML UAs have had to
> develop on their own because it, until now, had not been defined"?

Yes, partially. I am talking about well-defined but silent error
recovery. For any one given browser the error-recovery is well  
defined. The key thing is that nearly all browsers are silent about  
errors (iCab being a notable exception).

> I don't think I or anyone else sees the problems any clearer just
> because you attach Ian's name to it.

I'm not trying to obscure anything; I thought we had already
differentiated these terms in the thread.

>> They do not know they are creating tag soup because they test it in
>> their browser and it works. With standardized Ian-style error handling
>> authors still can't see their errors upon testing in a browser.
>
>
> That very thing is nothing new. The only UA I have used which
> constantly warns about errors is iCab. And that's why I use it.

So again, it's the error feedback that reduces tag soup (which might
be one part of error handling), not error handling itself, and
certainly not the type of error handling Ian advocates.

>> In
>> fact they may test it in a dozen HTML5 browsers and still not see the
>> soupiness of their tags. Perhaps you (and Ian) are saying that the
>> errors are no longer tag soup because they are handled in a consistent
>> manner?
>
>
> Do I say that the errors are no longer errors because they now are
> handled in a consistent manner? No, of course not.
>
> Again, Ian spoke historically: He predicted there would have
> been fewer errors in the HTML code around the globe if error
> handling had been defined from the start.

Which is wrong if the error handling doesn't specify obvious error  
notification to users.

> Let's say we added error handling to HTML 4 *today*, then this
> would not have reduced the number of errors in existing code
> unless UAs as a result started to treat some errors in a way that
> made the designers annoyed. However, for the future, I have hope
> that it could stop the number of errors from growing.

Even if we had added such error handling to HTML 1 (or 2) originally,
it would not have done any good unless it annoyed authors (and even
other users).

> As for HTML 5, yes, it appears to be the case that some things
> that were considered tag soup (aka 'errors') in HTML 4, will
> become "well done" in HTML 5. But that is a completely different issue.

True, I'm not addressing that in this discussion.

> The crux seems to be that you think that only XML style handling
> can reduce the number of errors. There we disagree. But I may have
> failed to convince you about that ...

No, I agree that iCab style would also reduce tag soup. Again, my view
is that iCab-style error handling is also not something Ian would ever
support. So going back to my original point: how can Ian say in that
interview that specifying error handling (in a way that Ian would ever
support) could ever reduce 'tag soup'?

>> So then we have a new distinction (at least new to me):
>> conforming documents and errant documents where errant documents can
>> be broken down into those containing tag soup and those that don't.
>> For example <object>fallback</object data='file.mpeg' > would be a tag
>> soup errant document, but <b><i>some bold italics</b></i> would be an
>> errant document, with no tag soup. I'm just honestly trying to
>> understand how you're using the term tag soup here.
>
> Ah, yes, true, some link 'tag soup' to the use of <b> and <i>
> instead of <strong> and <em>. I've done so myself, I think. It is
> unclear to me what you mean here, though. If one takes the stance
> that <b> and <i> will become forbidden in HTML 5, then using them
> even if they are forbidden, would be an error and thus tag soup,
> which needs defined error handling. The common error handling of
> old tags seems to be to accept them and respect their meaning.

I'm not talking about distinguishing bold from strong emphasis or  
italics from emphasis. The example <b><i>some bold italics</b></i> is  
ill-formed HTML. It is precisely the type of tag soup that has  
proliferated due to silent error-recovery. Specifying that silent  
error-recovery from the start would have done nothing to reduce such  
errors. On the other hand, no browser had the 'genius' idea of
applying error recovery by looking for attributes on closing tags (at
least not that I'm aware of), so today such errors do not persist:
the error is immediately apparent to the author testing in any
major browser.
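
As a rough illustration of the difference, here is a minimal sketch
using Python's standard library. The helper names (xml_accepts,
html_events, TagLogger) are made up for this example, and html.parser
is only a lenient tokenizer standing in for silent recovery; it is not
a browser and not an HTML5-conforming parser. The XML parser refuses
the misnested markup outright, while the lenient one reports the tags
in order and never tells the author that anything is wrong.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The ill-formed example from this thread: <b> is closed before <i>.
MISNESTED = "<b><i>some bold italics</b></i>"


class TagLogger(HTMLParser):
    """Hypothetical helper: records start/end tags as they are seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def xml_accepts(markup):
    """Draconian handling: any well-formedness error rejects the input."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_events(markup):
    """Silent handling: tokenize without reporting any error."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    print("XML parser accepts it:", xml_accepts(MISNESTED))
    print("Lenient parser saw:", html_events(MISNESTED))
```

An author testing only against the second kind of parser gets no
signal at all that the tags are misnested, which is the point above.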

To sum up, then: error notification (as part of error handling) reduces
tag soup. However, error handling that is silent (which is what I
think Ian means) does nothing to reduce 'tag soup'. Instead he's
using the phrase as a sort of name drop. It doesn't even mean anything
in that sentence; it's just meant for the readers to go, "yes...
mmmm... tag soup, bad. HTML5 must be good. This Ian, smart." On the
other hand, perhaps Ian thinks that, if done from the start, XML-style
draconian error handling or iCab-style clear-notification error
handling should have been used. But then I wonder why, if it was
possible then, we can't transition to it now (hint: because
Microsoft wants to undermine W3C recommendations so that its horrible
and proprietary formats continue to keep it on top).

Take care,
Rob