Review: Googling Security: How Much Does Google Know About You?

Googling Security is an interesting examination of the privacy issues surrounding the mass use of web services. It’s not just about Google, but “covers many facets of the problem of web-based information disclosure as seen through the lens of Google’s tools and services.” The tone isn’t generally that of bashing a particular company – the author, for example, goes out of his way to praise Google’s “awesome suite of tools”. However, Google, as the biggest supplier of online services, is an obvious focus for this sort of analysis.

Early sections include a high-level overview of information flows and leakage, data retention and profiling. The book then moves on to chapters on individual types of web service – search, communications, mapping and advertising. The conclusion is a section on countermeasures and a look at the future.

It should come as no great surprise to any reasonably well-informed web user that online email, mapping and office applications, or cross-site web analytics tracking, can compromise their privacy. (However, many people may not realise that Google and other web-mail providers explicitly do not guarantee to delete your emails from “offline” backup systems when you delete them via the web interface.) The privacy case against Google Maps, and especially its Street View application, has been particularly well covered in the media.

The scarier part of the book for many will be the section on search, which reveals the extent to which people can compromise their privacy through day-to-day use of search engines. The examples which the author provides from the data set of search activity released by AOL are very effective at showing there’s a serious potential issue here. The details on fingerprinting techniques and the degree to which you can be personally identified over time by your search queries alone are also eye-opening. There is an emphasis on the need to think about your online activity in aggregate rather than as a series of single transactions. Each transaction may give little away on its own but could reveal a lot when examined alongside thousands of others.
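The point about aggregation can be illustrated with a minimal sketch. Everything in it is an assumption made up for illustration: the query log, the user identifiers and the clue keywords are invented and are not taken from the book or from the AOL data set. It only shows how queries that each reveal one small clue add up to a fuller profile once they are grouped under the same identifier.

```python
from collections import defaultdict

# Hypothetical clue categories and trigger words (assumptions for illustration).
CLUE_KEYWORDS = {
    "location": {"springfield", "elm street"},
    "health": {"arthritis", "insomnia"},
    "family": {"grandson", "daughter"},
}

# Invented query log of (identifier, query) pairs; not real data.
QUERY_LOG = [
    ("user-17", "plumbers in springfield"),
    ("user-17", "arthritis pain relief"),
    ("user-17", "birthday gift for grandson"),
    ("user-42", "cheap flights"),
]


def clues_in(query):
    """Return the clue categories whose keywords appear in a single query."""
    text = query.lower()
    return {
        category
        for category, words in CLUE_KEYWORDS.items()
        if any(word in text for word in words)
    }


def profile_by_user(log):
    """Union the clues from every query issued under the same identifier."""
    profiles = defaultdict(set)
    for user, query in log:
        profiles[user] |= clues_in(query)
    return dict(profiles)


if __name__ == "__main__":
    for user, clues in profile_by_user(QUERY_LOG).items():
        print(user, sorted(clues))
```

In this toy log, no single query from "user-17" gives away more than one category, yet the combined profile covers location, health and family.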

Suggested countermeasures include becoming a more informed user of web services, educating others and campaigning for regulatory changes or for companies themselves to take privacy more seriously. The technical suggestions include deleting cookies, employing proxies and encryption, avoiding registered accounts, etc. – but the downsides to all of these are also clearly stated.

In the end, the book is quite a depressing read, since the online privacy situation looks likely to get worse in the immediate future and there’s no easy way to improve things. Avoiding web services cripples your ability to exploit the potential of the web, as does obsessively employing privacy technologies. As the author points out: “A bulletproof, anonymous web-browsing experience doesn’t exist.”

Googling Security: How Much Does Google Know About You? by Greg Conti is published by Addison-Wesley.

Free hosted search solutions

If you want a free search engine for your website and don’t want too much technical hassle, then a remotely hosted search solution may be just the thing for you. Hosted search has a much easier set-up process than a search engine which requires installation on your server. All you generally need to do is a bit of simple configuration followed by cutting-and-pasting some supplied code into your pages.

However, there are some potential downsides to consider with the free offerings from hosted search providers:

  • Free hosted search engines generally come with adverts on results pages. If you want to remove them, you will need to upgrade to the paid-for version of the service.
  • Free services may not give you any control over how often or how thoroughly your site is indexed.
  • Search results pages may have little or no scope for customisation.
  • Some solutions may limit how many pages the free search will index.
  • The supplied cut-and-paste code may include formatting you want to amend, or code which won’t validate without a bit of work.
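As a rough way of checking a site against a page or size limit before signing up, the sketch below counts the HTML files in a local copy of a site and totals their size. It assumes you have such a copy on disk and that one HTML file corresponds to one indexed page, which will not hold for dynamically generated sites. The function names and the limit parameters are hypothetical; the actual limits are whatever the chosen provider states.

```python
from pathlib import Path

# File extensions treated as HTML pages (an assumption; adjust for your site).
HTML_SUFFIXES = {".html", ".htm"}


def site_footprint(root):
    """Return (number of HTML files, their total size in bytes) under root."""
    pages = [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in HTML_SUFFIXES
    ]
    return len(pages), sum(path.stat().st_size for path in pages)


def fits_free_tier(root, max_pages=None, max_bytes=None):
    """Check a local site copy against optional page-count and size limits."""
    count, size = site_footprint(root)
    if max_pages is not None and count > max_pages:
        return False
    if max_bytes is not None and size > max_bytes:
        return False
    return True


if __name__ == "__main__":
    # Example: a hypothetical 250-page limit applied to the current directory.
    count, size = site_footprint(".")
    print(f"{count} HTML files, {size} bytes")
    print("Within limit:", fits_free_tier(".", max_pages=250))
```

A hosted crawler follows links rather than reading files, so the figures are only an estimate of what would actually be indexed.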

If you want to explore free hosted search further, below are eight services currently offering a free solution:

Google Custom Search Engine

http://www.google.com/coop/cse/

As you’d expect from Google, CSE is a polished product with an extremely easy set-up procedure. It can also give you extra benefits if you combine its use with Google AdSense and Google Analytics. A major downside of CSE, though, is that it does not give you any control over the indexing of your site. Even with the paid-for version there is no guarantee that all your pages will ever get indexed and no way to schedule indexing. You essentially just get the results which Google serves up for your site on google.com. However, if you have a large site which is already well indexed by Google, the results produced can be better than those from other free search engines, since you benefit from Google’s ability to pull the most relevant results to the front of result sets. Also, a great benefit for not-for-profit organisations is that they don’t need to have ads on their results pages, even with the free version.

JRank

http://www.jrank.org/

Offers direct control over the crawling schedule, customisation of results and the ability to divide your website search into “contexts”, groups which you can specify.

Freefind

http://www.freefind.com/

The free edition offers almost all the features of the paid-for version, including no fixed page limit (although the size of site supported is limited to 64MB of HTML).

SiteLevel Basic

http://sitelevel.whatuseek.com

Includes reporting features, the ability to configure how relevancy is determined and customisable results screens. You get weekly automated re-indexing with the free SiteLevel Basic service. The (paid-for) Pro version has daily automated re-indexing. You can also specify categories within your site for more targeted searching.

Atomz

http://www.atomz.com

Includes customisable search results pages, control over indexing and search statistics. Indexes up to 10,000 pages and supports multiple languages.

Rollyo

http://rollyo.com

As with Google CSE, you can set this up to search across several sites. Unlike with CSE, or most other free solutions listed here, the results have to appear on Rollyo’s site rather than your own and you can’t customise the way they look. Includes neat social networking features which differentiate it from other solutions.

picosearch

http://www.picosearch.com

The free version offers some customisation of search results, as opposed to the complete templating available in the paid-for versions. The free version also has some control over indexing and indexes up to 250 pages – but only HTML and text, not PDF or Word.

FusionBot

http://www.fusionbot.com

FusionBot offers five different packages at different price points. The free package includes sitemap generation, search context control, the ability to create search regions/partitions and basic customisation of results pages. The free version does not index PDF or Word files or highlight keywords.