Comments on Google Operating System: Yahoo Lets You Delimit Unimportant Content From a Page

Search engines are going to have to identify templ...

2007-05-03T23:16:00.000-07:00

Search engines are going to have to identify templates anyway. Not everyone is going to use the new attribute correctly or even use it at all.

I think that this new Yahoo idea just adds unnecessary complexity.

Maybe that's a “search engine's job”, but it's a h...

2007-05-03T06:30:00.000-07:00

Maybe that's a “search engine's job”, but it's a hard one. I think everyone agrees that giving webmasters the possibility to do this job is much faster and requires less effort.

I thought that making this a µf was a good idea; however, now that kjwa has mentioned using robots.txt, I think he's right.
But unfortunately, the current robots.txt standard doesn't allow using XPath. And by the way, what about pure HTML pages? XPath shouldn't be applied to SGML, only to XML. Also, what about other content (neither XML nor SGML)?
Now I think both methods should be used.
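
To illustrate the XPath point, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset. The helper name `count_matches` and both sample strings are made up for this example. The idea is simply that an XML parser rejects HTML-style markup with an unclosed `<br>`, so a selector can only be evaluated on the well-formed variant.

```python
import xml.etree.ElementTree as ET


def count_matches(markup, path):
    """Return how many elements match `path`, or None if `markup` is not well-formed XML."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # An XML parser cannot build a tree from tag soup, so there is nothing to query.
        return None
    return len(root.findall(path))


# Hypothetical sample pages: identical except for how <br> is written.
xhtml_page = '<div><p class="nav">menu<br/></p><p>body</p></div>'
html_page = '<div><p class="nav">menu<br></p><p>body</p></div>'

xhtml_result = count_matches(xhtml_page, ".//p[@class='nav']")
html_result = count_matches(html_page, ".//p[@class='nav']")
print(xhtml_result, html_result)
```

A crawler that wanted selector-based exclusion on plain HTML would need an HTML-aware parser first; that part is not shown here.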

I agree with Sergio. Webmasters have to help the s...

2007-05-03T06:15:00.000-07:00

I agree with Sergio. Webmasters have to help the search engine in order to optimize searches. The webmaster benefits from it (because the site receives a lot more visitors), and the search engine benefits too, because its search capability is improved.

Good article.

Template detection in web pages is an "old" and ac...

2007-05-03T01:25:00.000-07:00

Template detection in web pages is an "old" and active research topic first addressed by Broader.

See for instance this recent paper.
http://portal.acm.org/citation.cfm?id=1141534

This seems to be an interesting idea from Yahoo. Regarding your question about this "being a search engine's job", it's just like SiteMaps, introduced by Google and recently adopted by the whole industry.

It's a "search engine's job" that could benefit from a little help from webmasters ;)

If this has to be done at all, it ought to be done...

2007-05-03T00:50:00.000-07:00

If this has to be done at all, it ought to be done in a robots.txt file using CSS/XPath selectors.

This is the work for microformats. They even imple...

2007-05-02T18:03:00.000-07:00

This is a job for microformats. They even implemented it in a way similar to how microformats work (by attaching meaning to a class name). The idea is good.
However, there is already a draft for that, which they didn't follow.
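
For readers wondering what "attaching meaning to a class name" looks like from the crawler's side, here is a rough sketch, not Yahoo's actual implementation. It assumes the class name is `robots-nocontent` (the name Yahoo announced) and uses Python's standard `html.parser`. The names `ContentExtractor` and `indexable_text` and the sample page are invented for this example, and the depth counting assumes every non-void tag is properly closed.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ContentExtractor(HTMLParser):
    """Collect page text, skipping everything inside an element marked with `marker`."""

    def __init__(self, marker="robots-nocontent"):
        super().__init__()
        self.marker = marker
        self.skip_depth = 0  # greater than 0 while inside a marked element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start skipping at a marked element; keep counting nested tags while skipping.
        if self.skip_depth or self.marker in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a crawler honoring the marker class would keep."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: a navigation block marked as unimportant, then the article.
page = ('<div class="nav robots-nocontent"><a href="/">Home</a><br></div>'
        '<p>Main <b>article</b> text.</p>')
print(indexable_text(page))
```

Because the marker is just one of possibly several class names, it can sit next to existing styling classes, which is the same trick microformats rely on.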