1. What is the current listing status?
2. What happened when Google visited this site?
3. Has this site acted as an intermediary resulting in further distribution of malware?
4. Has this site hosted malware?
"Of the 274621 pages we tested on the site over the past 90 days, 4 page(s) resulted in malicious software being downloaded and installed without user consent. The last time Google visited this site was on 05/22/2008, and the last time suspicious content was found on this site was on 03/13/2008. Malicious software includes 4 scripting exploit(s), 4 trojan(s). Successful infection resulted in an average of 10 new processes on the target machine. Malicious software is hosted on 4 domain(s), including 58.65.239.0, truemaybe.com, abc-powers.com. 5 domain(s) appear to be functioning as intermediaries for distributing malware to visitors of this site, including xtraff.biz, x-traffic.ws, smartvideochannel.com."
Despite all of these findings, google.com is not listed as suspicious, probably because the domain is whitelisted or the suspicious content is not very significant. It's likely that the domains listed above are from Google's search results, so that means the anti-malware system doesn't respect robots.txt.
I've always thought that an image search engine should accept images as input and list identical or similar images from the web. This is useful if you have an image but don't remember what it depicts, or if you want to find a higher-quality version of it.
TinEye tries to do that and the best part is that it mostly succeeds. The new image search engine, powered by Idée's technology and currently in private beta, has an index of 487 million images (Google's index is at least 12 times bigger) and manages to find identical versions of an image or alterations. According to the FAQ, "TinEye frequently returns image results with colour adjustments, added or removed text, crops, and slight rotations. TinEye can also detect images that are part of a collage or have been blended with another image."
And the FAQ doesn't lie: I uploaded a screenshot of Flickr's homepage that included a Flickr image in the top-left corner. TinEye returned 6 results: 5 of them were different versions of the featured image (including the original image hosted by Flickr) and another result showed Flickr's homepage with a different featured image.
Then I uploaded an image from my computer that shows fingers in a book scanned by Google and TinEye pointed me to a TechCrunch article that included that image:
TinEye doesn't do a good job at ranking images, as it orders the images "by relevance i.e. how well the result image matches your query image". It can't figure out the most-likely original source of an image, so TinEye's algorithms could be combined with a traditional image search engine like Google's in order to determine the authority of each image. TinEye also doesn't recognize faces or objects in an image, so it just looks for similar images.
How does it work then? "TinEye uses sophisticated pattern recognition algorithms to find your image on the web without the use of metadata or watermarks. TinEye instantly analyzes your query image to create a compact digital signature or 'fingerprint' for it. TinEye searches for your image on the web by comparing its fingerprint to the fingerprint of every single other image in the TinEye search index."
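TinEye doesn't document its algorithm, so the following is only a rough sketch of the general "fingerprint" idea, not TinEye's method. It assumes a much simpler, well-known technique (an average hash): shrink a grayscale image to an 8x8 grid, record which cells are brighter than the mean, and compare two fingerprints by counting the bits that differ. The function names and the plain list-of-lists image format are made up for illustration.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint for a grayscale image.

    `pixels` is a 2D list of brightness values (0-255) whose rows all
    have the same length. This is an illustrative average hash, not
    the algorithm used by TinEye.
    """
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")

    # Shrink the image to a size x size grid by averaging each block.
    cells = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))

    # One bit per cell: 1 if the cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two images whose fingerprints differ in only a few bits are probably versions of the same picture. A uniform brightness change leaves this fingerprint untouched, but handling the crops, rotations and collages mentioned in the FAQ would need something far more sophisticated.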
The search engine is in private beta, but you can request an invite or watch this screencast:
Google Docs has a new way of displaying documents: fixed-width page view (or print layout in Microsoft Word). Documents are more readable and look closer to the way they appear when printed. The previous mode (plain view, also called web layout in Microsoft Word) is available in the new View menu.
The OpenOffice wiki explains the advantages and disadvantages of different word processing layouts: the web layout is useful for documents that are written for presentation, is suboptimal for editing because of the long lines and acts as a final preview. "The Print Layout implements WYSIWYG and tries to come as close as possible to the printed document. However, this layout is not particularly suited for numerous use cases. In many circumstances, more specific layouts fare definitely better." In addition to the web layout and the print layout, Microsoft Word 2007 includes a full-screen view for reading documents, an outline view for creating the document's structure and a draft layout that focuses on content, not on formatting.
As promised when the service was launched as part of Google Apps, now you can use Google Sites without having a domain. "A few months ago we launched Google Sites exclusively as part of Google Apps for companies and organizations that wanted to use the service on their own domains. Now we've made it easy for anyone to set up a website to share all types of information -- team projects, company intranets, community groups, classrooms, clubs, family updates, you name it -- in one place, for a few people, a group or the world," says Andrew Zaeske on the Google Blog.
The sites are available at: http://sites.google.com/site/SITENAME and there doesn't seem to be a limit for the number of sites you can create. You can create as many web pages as you like, but each site has a storage quota of only 100 MB.
By default, each site is public, but you can make it private in the settings or when you create it. The same as in Google Docs, you're able to invite people as collaborators or viewers, but a site can have more than one owner.
Google Sites offers the same basic customization options as Blogger (themes, layout editor) and a rich text editor similar to the one from Google Docs, except that you can also embed a small number of whitelisted objects (Google Docs documents, spreadsheets, gadgets, YouTube videos).
In addition to web pages, you can also create simple blogs, lists, file cabinets and iGoogle-like dashboards. Each page can be arranged in a hierarchy and has a revision history that allows you to revert to earlier versions of a page. Google automatically creates a sitemap and can notify you when someone makes changes to the site or to an individual page.
For now, Google Sites looks pretty basic and doesn't include all the powerful features from JotSpot, the service acquired by Google and transformed into Google Sites. But that shouldn't be surprising, if you take into account that all Google services started small and gradually became more powerful and useful. I think the future of Google Sites is to combine Google's collaborative services so you can share more documents in a single page or to create blogs the same way you create calendars, to-do lists and photo galleries.
At the recent "Search Factory Tour" event (slides, YouTube video), Google announced that its image search engine has received a lot of attention from users lately and that it intends to dramatically improve it. People no longer use Google Image Search only to find celebrity pictures or pretty images; they have started to use it to compare products, to choose vacations or to visualize unfamiliar situations.
To cope with the increasing number of images from the web and to provide better answers for the new use cases, Google promised that it will start to add features that use complex image analysis.
One of the new features will allow you to find similar images, given a selected image. Since it's difficult to describe pictures using words, you will be able to find a group of images that illustrate the same situation.
After adding face detection as a restriction for image search, Google prepares to expand it and actually recognize faces. This will improve the quality of results for searches that include person names. Google doesn't intend to limit image recognition to people's faces: finding objects in pictures is a difficult task, but it's a reliable way to filter irrelevant pictures.
Image search engines don't use the information from EXIF tags, which could offer a lot of interesting contextual details about location, date and image quality. Google will start to add information about geolocation from digital images.
The most controversial new feature tested by Google is the addition of display ads next to image results for commercial queries. Google's previous experiments with text ads weren't very successful, so adapting the ad format to the content could be a better idea. It depends on their usefulness and their prominence: the second mock-up displayed above puts too much emphasis on the image ads. For now, Ask.com Image Search is the only important search engine for images that displays ads, but they're text-only.
If we take into account that, in addition to all these enhancements, Google developed an improved algorithm for ranking images (VisualRank), we can expect an entirely new image search engine from Google in the near future.
There's a new layer for Google Earth that shows Google News stories related to a location. At the recent "Factory Tour of Search" event, Google explained the difficulties of automatically identifying the locations of a news story. For example, just because a news article includes "Paris" doesn't mean that the article talks about France's capital. It could be about the Texas city or Paris Hilton, so the algorithm needs to disambiguate names, identify complete addresses and determine the importance of an alleged location in a text. Google News also uses its automatically-generated clusters to validate locations and their importance to a news story.
"The launch of Google News on Google Earth is a milestone in the evolution of the geobrowser. By spatially locating the Google News' constantly updating index of stories from more than 4,500 news sources, Google Earth now shows an ever-changing world of human activity as chronicled by reporters worldwide. Zoom into areas of personal interest and peruse headlines of national, regional and, when fully zoomed in, even the most local of interest," says Brandon Badger, Product Manager of Google's Geo team.
To enable the Google Earth layer, go to the Layers sidebar, expand "Gallery" and select "Google News" from the impressive list of overlays. Another news-related layer that has been recently added to Google Earth is for The New York Times, but it's likely that its stories are geo-coded manually.
If you want to read news related to a location in your browser, add a local section to the personalized Google News homepage. "Adding a Local News section allows you to track news stories from and about a particular city or region. While this function is currently only available in our English language editions, we hope to add more languages and regions in the near future," explains the Google News help center.
YouTomb is an interesting project by MIT Free Culture that collects YouTube videos taken down because of copyright infringements. "More specifically, YouTomb continually monitors the most popular videos on YouTube for copyright-related takedowns. Any information available in the metadata is retained, including who issued the complaint and how long the video was up before takedown. The goal of the project is to identify how YouTube recognizes potential copyright violations as well as to aggregate mistakes made by the algorithm."
Since YouTube operates under the Digital Millennium Copyright Act, it's obliged to take down content if it receives a notification claiming infringement from a copyright holder. In some cases, videos are wrongly taken down because YouTube is in no position to judge the validity of a claim.
According to YouTomb's stats, the companies that have recently taken down the biggest number of popular YouTube videos are: TV TOKYO, Viacom, Warner Bros, World Wrestling and other media companies. "YouTomb is currently monitoring 157340 videos, and has identified 4389 videos taken down for alleged copyright violation and 13330 videos taken down for other reasons."
Now that FeedBurner is owned by Google, it makes sense to integrate with other Google services. One of the most requested features, a FeedFlare that shows the number of comments for each post from a Blogger blog, has been recently added. The FeedFlare links to Blogger's (ugly) comments page so you can easily add a comment if you read the post in a feed reader. Note that the comment feed has to be enabled in Blogger's settings page.
FeedBurner offers a FeedFlare API, but you need to do some server-side coding to create dynamic FeedFlares like the one that shows the comment count.
The homepages of Google China, Yahoo China and other sites turned black today to commemorate the victims of last week's devastating earthquake. "Construction workers put down their tools, drivers stopped suddenly in the street, and rescuers briefly paused in their increasingly vain search for survivors amid the rubble of China's earthquake devastation. China stopped for three minutes to mourn the estimated 50,000 people killed by the earthquake exactly one week earlier," notes The Press Association.
Google China links to a custom search engine for sites that include information about missing persons. Kai-Fu Lee, President of Google China, told Google Blogoscoped: "[Engineering lead Harry Ke] and two other engineers came up with the idea of helping people search for lost relatives or friends. Given there are... tens of thousands still trapped, and some 200,000 wounded, and many more homeless. Also, cell phone and land lines are mostly not working. Transportation is difficult because the disaster area is on mountains and valleys. So, there are many people frantically looking for lost relatives and friends."
There's also a page for offering donations that mentions "Google will donate $2 million, including $1.7 million from Google.org, to help assist in relief and rebuilding efforts". A special Google Earth layer shows recent high-resolution imagery from the areas affected by the earthquake.
A year and a half after the first announcement, the much-anticipated Google Health has been released at Google Factory Tour of Search. "Patients need to be able to better coordinate and manage their own health information. We believe that patients should control and own their own health information, and should be able to do so easily," said Adam Bosworth in November 2006.
Here's what you can do in Google Health:
* create a health profile with information about your health conditions, medications, allergies
* import medical records from US hospitals that use Google's APIs to make the conversion possible. Unfortunately, the list of partners is almost empty.
* read medical resources, information about diseases
* find a doctor using Google Local Search
* use other health services that integrate with Google Health and can import your data securely and use it for different purposes: calculate the heart attack risk, print your health history or share it with doctors.
According to the FAQ, "Google Health is a PHR (Personal Health Record), but it is also a bit of a different model. We believe it's not enough to offer a place where you can store, manage, and share your health information. You need to act on your health information to better manage your health needs on a daily basis. This is why we provide a directory of online health services to you. You must elect to sign up with a service and decide what level of personal data you want to share in exchange for the customized services those companies offer."
Google Health wants to become the central place where you organize your health information and share it with people or services you trust. Since this information is very sensitive, Google takes a lot of precautions by using SSL connections and a separate privacy policy that clearly states: "You control who can access your personal health information. By default, you are the only user who can view and edit your information. If you choose to, you can share your information with others."
This is a big test of trust for Google and probably the most personal service ever offered by the company. While you can also enter your credit card information in Google Checkout, your location in iGoogle or Google Maps, personal information in orkut, Blogger and YouTube, Google Health is about your existence.
"Health information is very fragmented today, and we think we can help. Google believes the Internet can help users get access to their health information and help people make more empowered and informed health decisions. People already come to Google to search for health information, so we are a natural starting point," says Google. But there's a big difference between offering general health information and storing health records, so it will be interesting to see if Google manages to convince users that this shift is beneficial to them. Google could integrate more health information in the search results, the same way Microsoft shows information from HealthVault in Live Search and lets you collect it in your account. TechCrunch says that "Google promises never to advertise on Google Health".
(Small tidbits: Google Health is one of the very few Google applications created using Google Web Toolkit and its codename seems to be Weaver.)
If you use Gmail and you'd like a simplified interface that doesn't use AJAX, loads fast and works well in most browsers, try the basic HTML view. Gmail links to this version at the bottom of the page, but you can also access it if you go to http://mail.google.com/mail/h/.
Until recently, Gmail didn't have the option to set the basic HTML version as the default interface and you had to bookmark its URL or click on the link from Gmail's footer. Now there's an option to always go to the HTML version, every time you load Gmail.
Please keep in mind that the simplified version lacks many useful features:
* integration with Google Docs and Google Calendar
* keyboard shortcuts
* integrated chat
* composing options: spell checker, rich formatting, address auto-complete, custom From address
* ads (useful feature?), related pages, tracking packages and addresses
* contact management
* web clips.
Google recommends using the basic HTML view for slow Internet connections, although you may find it useful when you use exotic browsers or when the standard version doesn't work. If you change your mind and you want to go back to the AJAX version, click on "Standard View". Apparently, the standard view loads much faster thanks to some aggressive optimization.
Google started to display a thumbnail in the Product Search OneBox, as you can see by searching for [usb flash]. The image illustrates the top search result, but links to the list of results. As part of the Universal Search, this OneBox can be displayed at the top of the search results page or at the bottom of the page, depending on its relevance to the query.
Another change is that the OneBox groups identical tech products and shows a range of prices. For queries that include clustered listings, Google no longer shows a link that restricts the results to products that can be bought using Google Checkout.
Here's an old screenshot of the Froogle/Product Search OneBox:
Google released Doctype (HTML version), an encyclopedia of the open web. "The open web is the web built on open standards: HTML, JavaScript, CSS, and more. The open web is a beautiful soup of barely compatible clients and servers. It comprises billions of pages, millions of users, and thousands of browser-based applications."
Google Doctype is an encyclopedia that can be edited by anyone who has a Google account and wants to keep it up-to-date or add new articles. The encyclopedia contains articles about web security, DOM manipulation, CSS, HTML best practices, references for HTML, DOM, CSS, complete with browser compatibility information. There's also previously-unreleased code used internally by Google that is now documented and available for anyone to use.
You could call it Wikipedia for web developers or cross-browser MSDN, but Google Doctype is a clear sign that Google wants to foster a community of developers and encourage building web applications using open standards. Mark Pilgrim, the author of "Dive into Python" and now a Googler, explains more about Google Doctype in the following interview:
Google Spreadsheets added an option in the sharing dialog that allows anyone to view or edit the spreadsheet just by knowing the URL. Until now, you had to send an invitation URL that contained a secret code and the people you invited had to login using a Google account. If you click on the Share tab and enable "Anyone can edit this document WITHOUT LOGGING IN", your spreadsheet becomes a wiki that can be edited by anyone.
To experiment with the new feature, I set up a spreadsheet that compares the features of two desktop office suites (MS Office, OpenOffice) and two online office suites (Google Docs, Zoho). The spreadsheet has three sheets: word processing, spreadsheets and presentations.
(Update: You can view a snapshot of the spreadsheet after the first day of editing. The comparison has been extended with information about voice recorders, clay tablets, smoke signals, telepathy, Jedi mind control and more.)
In other spreadsheet-related news, you can now embed forms in a web site by just copying some code, there's a new option to duplicate questions and users can add their own answer to a multiple-choice question.
A simple use for Google's iPhone interfaces is to add them as sidebars in Firefox or Opera. I mentioned last year some Google gadgets for Google Notebook, Google Talk, Google Docs, that could be displayed in a permanently-visible sidebar. Here are two interfaces optimized for iPhone that have permalinks:
iGoogle - all of the gadgets are displayed in a single column and you can switch between tabs at the bottom of the page.
Google Reader - a beautiful interface that lets you read posts inline, star them, share them and browse by tags or feeds. "This new version is designed to offer many of the same features as the desktop, while making it quick and easy to act on items," says the Google Reader blog.
Normally, if you click on the two links for iGoogle and Google Reader in Firefox or Opera, you should be asked if you want to bookmark the page. To open it in the sidebar, go to the Bookmarks menu and click on the corresponding item or press F4 in Opera. If the links don't automatically create a sidebar:
* bookmark them and select "Show in panel" (for Opera)
* bookmark them, then go to the Bookmarks menu, right-click on the bookmarks, select "Properties" and enable "Load this bookmark in the sidebar" (for Firefox).
Google prepares to launch Street View in Europe, Canada and Australia, where local laws regarding privacy in public places are less permissive than in the US. As promised, Google will blur faces for all the Street View imagery not just because of local laws, but also because the purpose of Street View is to show places, not people that happen to be there when Google's Street View cars go by.
To test the face detection algorithms, Google started with Manhattan, where Google replaced the Immersive Media imagery with newer and better panoramic images. "This effort has been a year in the making - working at Street View-scale is a tough challenge that required us to advance state-of-the-art automatic face detection, and we continue working hard to improve it as we roll it out for our existing and future imagery." The results are pretty good, although not all the faces are correctly detected and people can be identified even when their faces are blurred.
As Avi Bar-Zeev suggested, a better idea would be to remove people and cars from the images. "The main reason for removing people and cars is the same reason you'd want the base Google Earth imagery to lack clouds. These things tend to block your view. You can't really look behind them in a simple (essentially 2D) panoramic image. They only represent one snapshot in time vs. a broader/more virtualized essence of the place. They make it confusing to add dynamic versions of the same things on top of what's permanently baked into the imagery."
As a Google employee recently said, "Google Maps is evolving from a driving directions and business search tool, to a comprehensive representation of all the world's information, on a map." That's why Google Maps started to integrate different layers of information when you search for an address and it added a new "More" button to enable layers for photos and Wikipedia articles. Google Maps now includes in search results personalized maps, geolocated content from the web and mapped web pages.
There's a new option to search for real estate: click on "show search options" and select "real estate" from the drop-down. The search results come from Google Base. Google shows structured information about houses and lets you refine the results by price, number of bedrooms and bathrooms. Even if there aren't too many advanced features, it's interesting to see that Google Maps wants to index all the information that could be displayed on a map.
Many real estate sites use Google Maps API and the first Google Maps mashup was HousingMaps, a site that displays Craigslist housing listings on a map. Last month, Trulia was one of the first sites that integrated Google Street View "to add efficiency to the real estate search experience and help home buyers discover more information about particular neighborhoods".
(The post was updated to reflect that the data is obtained from Google Base.)
Any web site can be a container for OpenSocial, any web site can add social features even if it's not a social network - that seems to be the idea behind Friend Connect, a new piece from Google's social puzzle. Friend Connect will allow the users of a site to add profiles, to import their friends from other social networks, to use social applications in the context of a site.
Paul Buchheit wrote last year that "there's no such thing as a social network". The social aspect of a site is just one of its many features. "Real products need more functionality in order to somehow deliver value to their users. It is this other functionality that defines the real purpose of a product, not the social network, which exists only to enable or enhance the core purpose."
Friend Connect is an enabler for making web sites more social, since the barrier to entry is really low. "First, many website owners want to add features that enable their visitors to do things with their friends, but the technology and resource hurdles have been too high. Second, people are tiring of needing to create new logins and profiles and recreate their friends lists wherever they go on the web." Google will use OpenID or Google Accounts for authentication, OAuth or APIs like Facebook Connect, MySpace Data Availability to find your friends from other social networks and OpenSocial gadgets to interact with your friends.
"Social is in the air. It's the blossoming of a lot of work by a lot of people. We don't move in lockstep and don't need to. We converge on interoperable technology. There's more than one way to connect a site to the social Web. With Friend Connect, we're confident it's a good step forward. I'm sure there will be more ways to do that than what Friend Connect does. We wanted to start with easiest and safest starting point," said Google's David Glazer in a conference call.
A preview of Friend Connect will be available later today at http://www.google.com/friendconnect (update: the page is live) and the service should be launched in a couple of months.
If you publish a presentation at Google Docs, you'll receive a simple URL that can be used to view the presentation online. Unfortunately, if you go to that URL without being logged in to a Google Account, Google will ask you to log in:
The explanation is that Google Presentations offers some advanced features that require authentication: chatting with other people that view the presentation and joining a presentation that's already in progress. To view the presentation without logging in, click on the small link from the bottom of the page: "View published presentation in a new window".
If you want to link directly to the presentation and skip the authentication page, just add "&skipauth=true" to the URL provided by Google and replace "Presentation" with "Present". Here's an example.
URL provided by Google: http://docs.google.com/Presentation?docid=ID
Modified URL: http://docs.google.com/Present?docid=ID&skipauth=true (replace ID with the presentation's ID)
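If you need to rewrite several published URLs, the two manual steps above can be sketched as a small Python function. This is only an illustration: the function name is made up, and it assumes the published URL has exactly the form shown above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def skip_auth_url(published_url):
    """Turn a published presentation URL into one that skips the login page.

    Hypothetical helper that follows the two steps from the post: replace
    "Presentation" with "Present" and add "skipauth=true".
    """
    parts = urlsplit(published_url)

    # Step 1: swap the "/Presentation" path for "/Present".
    path = "/Present" if parts.path == "/Presentation" else parts.path

    # Step 2: append skipauth=true, dropping any existing copy so that
    # running the function twice gives the same result.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "skipauth"]
    query.append(("skipauth", "true"))

    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query), parts.fragment)
    )
```

Calling skip_auth_url("http://docs.google.com/Presentation?docid=ID") returns the modified URL shown above.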
Most people will also want to download the presentation, but Google doesn't offer this feature in the view-only interface. You can link to the PPT file by using this format:
Yahoo's strategy to increase its search market share is to add features that can't be found at Google or somewhere else. The problem is that these features need to be distinctive and useful enough to attract attention and make people switch to Yahoo or at least use it as a secondary search engine.
The first innovative feature added by Yahoo was Search Assistant, an integrated pane that combined autocomplete and related searches. Search Assistant was heavily inspired by Ask.com's left sidebar, but it included a distinctive feature that made it less obtrusive: the pane is only displayed if you stop typing for a couple of seconds or when your typing slows.
Google also tests a query suggestion feature and places a list of related searches at the top of the page, but Yahoo's implementation is more interesting.
This week, Yahoo started to add SiteAdvisor's warnings next to search results. "Safety ratings from McAfee SiteAdvisor are based on automated safety tests of Web sites and are enhanced by feedback from volunteer reviewers". Yahoo only shows warnings next to sites that use browser exploits, offer malicious software or send spam. Google also shows warnings next to web pages that may install malicious software, but McAfee SiteAdvisor seems to offer more comprehensive protection and more information about the potential threats (you can also install a plug-in for IE or Firefox that works with the most popular search engines or manually find the testing results for a site).
Probably the most impressive new feature in Yahoo Search and the only one that's not yet live is SearchMonkey (an unfortunate play on GreaseMonkey), a way for site owners to enrich the snippets with structured information. "Site owners will be able to provide all types of additional information about their site directly to Yahoo! Search. So instead of a simple title, abstract and URL, for the first time users will see rich results that incorporate the massive amount of data buried in websites -- ratings and reviews, images, deep links, and all kinds of other useful data -- directly on the Yahoo! Search results page."
Yahoo uses semantic web standards to retrieve structured information from web sites, but users are the ones who decide if they want richer search results from a site. Yahoo will support a small number of microformats (hCard, hCalendar, hReview, hAtom, XFN), "some of the vocabulary of Dublin Core, Creative Commons, FOAF, GeoRSS, and MediaRSS, as well as RDFa, eRDF, and the OpenSearch specification".
Google chose a different approach - plus boxes that show additional information automatically detected: addresses, stock symbols, products etc. Google also lets you add subscribed links to search results pages, but very few sites took advantage of this feature.
If Yahoo manages to promote these features and site owners build interesting applications for SearchMonkey, people might discover that Yahoo has a pretty good search engine and search is not synonymous with Google. Exploring different ways to present search results will lead to a better user experience and to an improvement for all search engines, since the best features are usually copied by all of them. Yahoo Search hopes to become a serious alternative to Google by having a distinctive voice, but the history of Ask.com or Opera shows that being innovative is not the only necessary ingredient for becoming popular.