Archive for October, 2004

Who cares about ampersands?

Monday, October 18th, 2004

It is about time for another corollary of Godwin’s law:

As an online discussion about validation grows longer, the probability of mentioning unencoded ampersands approaches one.

No kidding! The reason is that such ampersands are easily the most common validation error. I have heard “What does that damn validator want from me?” more than once. Some know the answer already, but don’t really care. Either it seems so innocent and unimportant that it is not worth wasting time on, or code production is so out of control that trying to fix it may bring the entire company down.

So who cares about ampersands? Only two? Roger says that unencoded ampersands can be a problem. Inspired by him I wrote a little demo to show how it works.

Nothing too complex: I simply try to pass 14 parameters to my PHP script, which displays their values. The first link has the parameter names separated by unencoded ampersands; the second link has a properly encoded href attribute. Try clicking them. What do we see? With the first link, instead of 14 values we get only two (more if you are using Opera; it is browser dependent), and they look weird…

Now, can this behaviour break your application? I’ll leave that for you to decide.
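The effect can be approximated outside a browser. The following is only a rough sketch, not the PHP demo itself: it uses Python's html.unescape as a stand-in for a forgiving HTML parser, and real browsers differ in how they treat such sequences. The names id and dummy come from this post; copy and reg, and all the values, are assumed for illustration.

```python
from html import unescape
from urllib.parse import parse_qs

# Hypothetical query string: "id" and "dummy" are named in the post,
# "copy" and "reg" are assumed names that also happen to be HTML entities.
RAW_QUERY = "id=1&copy=2&reg=3&dummy=4"


def received_params(attribute_value):
    """Return the parameters a script would see after entity decoding.

    html.unescape stands in for a lenient HTML parser here; it is not
    a model of any particular browser.
    """
    return parse_qs(unescape(attribute_value))


if __name__ == "__main__":
    # Unencoded ampersands: "&copy" and "&reg" are swallowed as entities.
    print(received_params(RAW_QUERY))
    # Encoded ampersands: each "&amp;" decodes back to a plain separator.
    print(received_params(RAW_QUERY.replace("&", "&amp;")))
```

With the unencoded string, the copy and reg parameters disappear into the value of id; with the encoded string, all four parameters arrive intact.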

One more point to add: valid pages can also behave like this. That is because the validator does not bark at ampersands in a URL as such. If an ampersand is followed by a known entity name, the validator will be happy. It is an unrecognized entity that produces the validation error.

You can check this here. All I did was remove the parameter dummy from the href. All remaining parameters (except for id, which is not preceded by an ampersand) have corresponding entities, so the validator will remain silent. However, the results produced by the script should make a programmer cry out loud.
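Parameter names that double as entity names can be spotted before they cause trouble. Here is a minimal sketch, assuming the HTML 4 entity table that ships with Python (html.entities.name2codepoint) is a fair approximation of what the validator knows. The helper name entity_collisions and the sample names copy, reg and sect are made up for this example; id and dummy are the ones mentioned above.

```python
from html.entities import name2codepoint


def entity_collisions(param_names):
    """Return the parameter names that are also HTML 4 entity names."""
    return [name for name in param_names if name in name2codepoint]


if __name__ == "__main__":
    # "id" and "dummy" come from the post; the other names are assumed.
    print(entity_collisions(["id", "copy", "reg", "sect", "dummy"]))
```

Any name this returns is silently accepted by the validator when it follows an unencoded ampersand, which is exactly the case where the page stays valid but the script receives garbage. The sketch does not account for the first parameter, which is harmless because no ampersand precedes it.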

So what can we do? There are some options:

  • Do nothing.
  • Encode them.
  • Avoid ampersands in our hrefs, especially if we pass parameters for the script to extract some content.
    Here is more on that.
  • Avoid ampersands by using a different separator. We may use a semicolon (;) for that purpose, as encouraged by the W3C.
    If you are using PHP, take a look at the arg_separator.input and arg_separator.output settings in your php.ini file.
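Two of these options, encoding the ampersands and switching to a semicolon separator, can be sketched in a few lines. This is an illustration in Python rather than PHP, under stated assumptions: show.php, the helper names and the parameter copy are hypothetical, and the separator argument of parse_qs is only available in reasonably recent Python versions.

```python
from html import escape
from urllib.parse import parse_qs, urlencode

# "id" and "dummy" appear in the post; "copy" is an assumed parameter name.
PARAMS = {"id": "1", "copy": "2", "dummy": "4"}


def encoded_href(path, params):
    """Build an href value whose ampersands are written as &amp;."""
    return escape(f"{path}?{urlencode(params)}", quote=True)


def semicolon_query(params):
    """Build a query string that uses ';' instead of '&' as the separator.

    urlencode percent-encodes any '&' or ';' inside names and values,
    so the only literal ampersands left to replace are separators.
    """
    return urlencode(params).replace("&", ";")


if __name__ == "__main__":
    # "show.php" is a made-up script name for this example.
    print(encoded_href("show.php", PARAMS))
    query = semicolon_query(PARAMS)
    print(query)
    # The receiving side has to be told about the separator as well.
    print(parse_qs(query, separator=";"))
```

The semicolon variant only works if the receiving script splits on semicolons too, which is what the arg_separator.input setting controls on the PHP side.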

Invalid Standards

Thursday, October 14th, 2004

An alternative title for this post could be “The crime and the punishment”.

So Mike did it again. Alongside his talent for design and coding, he has a talent for igniting flame wars about standards and validation.

Only this time it started on a much more cheerful occasion than the previous one (on which I will not comment right now): the redesign of another major site.

The news was great, the news was heard, and comments rolled in here and there. Then Ethan came and stirred it up a bit more. Keith picked it up and asked a very good question.

The problem is that Keith did not specify what he considers to be a “Web Standard”. Some (including me), also known as the “wrong gang”, think that this means HTML, XHTML, CSS, DOM, you name it. Others call “web standards” what I call “best practices”: clean, semantic code, separation of content and presentation, CSS layouts. Do I really have to list everything you have known for a long time already?

In the first case, whatever Mike has to say, I have news for you, straight from the W3C:

A Strictly Conforming XHTML Document is an XML document that
requires only the facilities described as mandatory in this
specification. Such a document must meet all of the following criteria:

  1. It must conform to the constraints expressed in one of the three DTDs found in DTDs and in Appendix B.

  2. (…)

How wrong am I in supposing that validation errors mean the code does not exactly follow the DTD rules?
So, the question is answered. Don’t believe me, believe the W3C.

Now, before you start kicking me and trying to dig out some invalid document on my site, read this: invalid code may be OK, there are many factors that can make validity difficult to achieve, and invalid code can be better than valid code, and often is.

Only do me a favour: do not say that invalid code conforms to standards. Whatever difficulties you experience with processes, a third-party CMS, uncontrollable content, ads and stuff like that, they make me feel sorry and compassionate, but they do not make the code any more standards compliant.

Actually, problems with complex systems, third-party software and difficult-to-control content were mentioned quite often in the discussion, alongside the “big picture” or the inability to see it. It is not exactly clear how that works, but my wild guess is that sitting on a huge pile of crappy software, wild content and a messy CM process widens your horizon quite a bit.

I consider myself experienced enough to see the devil in the details of that big picture. Have you ever thought about how this same broken CMS came to life? And why it produces such horrible code?

Right: because nobody cared. Everyone was busy with more important issues: deadlines, team management, stakeholders, you name it. No one had the time or the will to care whether that code was valid or invalid, let alone about unencoded ampersands. There are always more important things to do.

So let’s not complain: we are using software produced with exactly the same attitude towards standards that we have ourselves.
Sometimes I am not sure whether we are marching towards a better web or just spinning in a vicious circle.

However, Keith is absolutely right about this:

I just wish people would recognize that when a large corporate site (Which frankly has much bigger fish to fry than Web standards. Period.) like ABC News makes even the slightest move toward Web standards it should be hailed as a victory for standards and nothing less.

Still, I’d like to hear less advocacy for invalid code from big names. Yes, we all know how real work in the real world gets done. Let it be our little secret that sometimes it is OK to be not 100% OK. Big names should not forget that there are thousands of inexperienced web developers listening to them. Humans are lazy, and coders are especially lazy, so whatever can be used as an excuse will be used, and we will never break out of the vicious circle.

We should know the rules extremely well to be sure when it is ok to bend or break them.

Update: I assume it is resolved now.

Got a blog?

Wednesday, October 13th, 2004

Today I once again realised that I needed a blog, badly. So this 10-minute installation of WordPress will serve me for a while. Well, OK, it took a bit more than 10 minutes, but that is because my intent is to run this site in two languages, and I had to fiddle around to get content negotiation going.

My Lithuanian version was stillborn in February. Now the English version is rolling out and I risk having two corpses in my closet, but I’ll try my best to avoid this horror.

Quite a mess is waiting ahead, with interesting times to come. Be prepared for lots of dust and falling bricks over here while I am trying to shake this down. Some things will come and go, some will stay, others will not come at all. The text of a post may change a bit at first too, without any special mark-up or notice, but I will try to avoid that as much as possible. This very post has changed heavily from the original one-liner.

My English is far from perfect, but I expect you to understand what I mean :). Hopes are high that this site will help me improve my English as well as my general writing skills. I’ll appreciate any good soul who helps me with my spelling and grammar, and if such a soul appears and happens to be in a body able to use a keyboard, I’d ask it to write me an e-mail. Yes, I mean e-mail, not comments. Thank you.