Content audits and inventories

Auditing your website content can seem an interminable task, but it’s long been regarded as an essential part of pre-redesign planning, and content inventories are increasingly recognised as vital long-term tools for the effective management of web content.

If you’re just beginning to grapple with a content audit, below are some articles, books and example spreadsheets which you should find helpful.

Articles

Doing a Content Inventory (Or, a Mind-Numbingly Detailed Odyssey through your Web Site)
This short 2002 article by Jeffrey Veen is a good starting place for learning about content audits.

The Content Inventory: Roadmap to a Successful CMS Implementation
Article by Kassia Krozser which depicts content auditing as an essential part of a CMS implementation process. Helpfully points out that content inventories ‘almost always take longer than anticipated’.

Doing a Content Audit or Inventory
This blog post by Scott Baldwin includes some useful suggestions for applications which can speed up an audit by automating some of the listing work.

How to do a Content Audit
Hilary Marsh provides practical tips on content auditing, including advice to start at the highest levels of the site before working downwards and to be careful when sorting in Excel that you reorder whole rows rather than just a single column.

A Map-Based Approach to a Content Inventory
Interesting article by Patrick C. Walsh, describing how he used Microsoft Access and Visio to create a maintainable site map and content inventory at the same time.

Why you shouldn’t start IA with a Content Inventory
A heretical article by Leisa Reichelt suggesting that starting redesign projects with a content inventory can be undesirable in that it immerses the designer in the existing way of doing things and constrains their ability to take a fresh approach. This provoked several responses, including an interesting rebuttal from Donna Spencer and The Rolling Content Inventory by Louis Rosenfeld, who champions content inventories as an ongoing process rather than a one-off exercise for redesign projects.

Books

Communicating Design: Developing Web Site Documentation for Design and Planning by Daniel M. Brown (Peachpit Press, 2007)
Contains a chapter on content inventories, with some helpful suggestions on formatting, linking an inventory into other website documentation and presenting the results of an inventory at meetings.

Content Strategy for the Web by Kristina Halvorson (New Riders, 2009)
Has detailed practical advice about auditing content and tying the findings into an effective content strategy for your site.

Managing Enterprise Content: A Unified Content Strategy by Ann Rockley (New Riders, 2003)
A thorough treatment of all aspects of content management. The chapter that covers Performing a Content Audit is available free online.

Sample spreadsheets for content inventories

I’ve already mentioned Jeffrey Veen’s article Doing a Content Inventory, which includes an Excel template for an inventory. It lists Page ID, Page Name, Link, Document Type, Topics, Owner, ROT (Redundant, Outdated or Trivial?) and Notes. It uses colour coding and indentation to reflect hierarchy.
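If you’d rather start from a plain CSV file than from Excel, the column list above can be sketched in a few lines of Python. This is only an illustration: the column names come from the description of the template, while the function name, the sample pages and their values are invented, and the colour coding and indentation of the original spreadsheet are not reproduced.

```python
import csv
import io

# Column names as listed in the paragraph above.
COLUMNS = ["Page ID", "Page Name", "Link", "Document Type",
           "Topics", "Owner", "ROT", "Notes"]


def build_inventory_csv(rows):
    """Return CSV text: a header row followed by one line per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Fields missing from a row are written as empty cells.
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical sample pages, invented for this example.
sample_rows = [
    {"Page ID": "1.0", "Page Name": "Home", "Link": "/",
     "Document Type": "Landing page", "Owner": "Web team"},
    {"Page ID": "1.1", "Page Name": "About us", "Link": "/about/",
     "Document Type": "Article", "Owner": "Comms", "ROT": "Outdated"},
]

inventory_csv = build_inventory_csv(sample_rows)
print(inventory_csv)
```

The resulting file opens directly in Excel or any other spreadsheet tool, where you can add the formatting by hand.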

Donna Spencer provides a simple content inventory spreadsheet on her blog. It includes fields for Navigation Title, Page Title, Files, Last Updated, Owner, Comments and whether the content needs to be deleted. Again, there’s use of indentation to indicate hierarchy and an example of freezing the Navigation Title column in Excel, so that it’s always visible as you scroll to the right – a nice technique to use for presenting large inventories.

Finally, the Seneb Consulting site has an example content inventory by Sarah A. Rice. It includes use of Excel’s Group and Outline features to allow the reader to expand and collapse groups of content, as well as instructions for using the Split Screen feature when dealing with larger inventories.

Review: Content Strategy for the Web

Textual content is a red-headed stepchild when it comes to website design and development. It’s left to the last minute in site redesigns, viewed as a commodity by most site owners and as a simple item in a to-do list for UX designers. Website text is rarely approached correctly in web projects as a ‘complex, ever-evolving body of information which needs ongoing care and feeding’.

This is the striking viewpoint of Kristina Halvorson’s book on content strategy which lays bare the complexities of content production. She offers plenty of common-sense advice about how to build website text into a key business asset, keep control of it over the long term and set measurable objectives for success.

Key to this is developing an appreciation of the political nature of content, engaging with content providers and giving reviewers plenty of notice for their contributions. ‘Don’t leave content management to your CMS’ is the clear message. You need people for meaningful, actionable content and the key person required is someone in overall charge of content – an editor-in-chief empowered to say no to the business when necessary.

The content audit is thoroughly explored as a content management tool. There are useful practical tips here, such as using indented outline numbers in your audit documentation – 1.0, 1.1, 1.2, etc. – so you can easily link specific pieces of content to matching references in the site map and other documents later. There’s also an interesting discussion of the use of page tables for content planning and advice on how to include qualitative judgments in your audit rather than just conducting a quantitative analysis of content.
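As a rough sketch of the outline-numbering tip, the Python below walks a nested list of pages and assigns each one a number. The exact scheme is an assumption on my part (top-level sections shown as 1.0, 2.0 and so on, with deeper levels as 1.1, 1.1.1), and the site structure and function name are made up for the example; the book may well lay its numbers out differently.

```python
def outline_numbers(sections):
    """Yield (label, depth, title) for a nested list of (title, children) pairs."""
    def walk(nodes, parent, depth):
        for index, (title, children) in enumerate(nodes, start=1):
            number = f"{parent}.{index}" if parent else str(index)
            # Assumed convention: top-level sections are written 1.0, 2.0, ...
            label = f"{number}.0" if depth == 0 else number
            yield label, depth, title
            yield from walk(children, number, depth + 1)

    yield from walk(sections, "", 0)


# Hypothetical site structure, invented for this example.
site = [
    ("Home", [
        ("About us", [("Staff", [])]),
        ("Contact", []),
    ]),
    ("News", []),
]

rows = list(outline_numbers(site))
for label, depth, title in rows:
    # Indent each title by its depth, as in an indented audit spreadsheet.
    print("  " * depth + f"{label} {title}")
```

Because each number encodes the page’s position in the hierarchy, the same labels can be reused in a site map or page table to point back at the matching row in the audit.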

There’s a whole chapter on content maintenance – a subject you rarely see people write much about. This advises developing a maintenance plan, having enforceable well-documented rules and using regularly-scheduled qualitative audits to question the ongoing purpose of each piece of content. The latter point draws on Gerry McGovern’s useful advice that all content ought to be regularly reviewed and removed if it’s not meeting a business objective or helping users achieve a task.

The book has a lively pugnacious style which makes it an easy read about a subject that could easily have come across as dull. The author makes a stack of suggestions which anybody working on websites could benefit from. However, reading it only confirmed my pre-existing assumption that content strategy can be a hard sell.

Improving the status of content creation in most organisations involves fighting against the general assumption of management that ‘anyone can write content’. Within the professional web world it’s up against the status of more exciting and saleable web disciplines in design and development and specialisms like SEO which contribute more transparently to improving the bottom line. In this context, long-term content maintenance is never going to be generally considered as important as implementing an exciting new content management system or launching a flashy new site design. Recognising the centrality of textual content to a successful web presence is therefore always going to be difficult to sell to a lot of organisations, but this book is one of the best pitches I’ve seen so far.

Content Strategy for the Web by Kristina Halvorson is published by New Riders.

Related posts

Letting Go of the Words is another recommended book on writing for the web which I reviewed last year.

Review: Advanced Web Metrics with Google Analytics

Advanced Web Metrics with Google Analytics by Brian Clifton is a good introduction and excellent long-term reference for anyone who needs to implement Google Analytics on their website.

Google Analytics has become a very popular web metrics tool – not least because it’s free (although there is a limit of five million page-views per month if you don’t have an AdWords account). It has a great feature set – including site and map overlay reports, customizable dashboards, easy cross-segmentation of data and two-click integration with AdWords. It’s also quite easy to set up and use – in terms of basic functionality at least. However, when you need to go beyond the basics you’ll need some guidance and this book certainly provides plenty of help for many of the issues you are likely to face.

Part one provides an introduction to web analytics, including discussions of the pros and cons of using page tags vs log files, the use of cookies and privacy issues. It concludes with a high-level overview of Google Analytics, describing its key features and how it works.

Part two continues this overview with an introduction to using the Google Analytics interface and a discussion of ten important first-level reports to ease the reader into the more detailed coverage of implementation issues in part three. This is the most technical section and includes advice on best practices for configuring Google Analytics for your site and a whole chapter of hacks for dealing with areas not covered by the default reports.

Part four is possibly the most useful part of the book, since it looks at how to use the data you’ve gained via Google Analytics to drive real-world website improvements. This includes helpful advice on how to engage non-technical colleagues in your improvement efforts. I particularly liked the section on monetizing a non-e-commerce website, which tells you how to get the most out of Google Analytics’ e-commerce features even if you don’t have an e-commerce site. There’s also a discussion of Google’s Website Optimizer – a tool for undertaking multivariate tests on your site which looks really useful.

The book ends with an appendix of recommended further reading including books, web resources and a long listing of web analytics blogs.

The author certainly gets very technical at some points – particularly when delving into the use of regular expressions and discussing complex modifications to the Google Analytics Tracking Code (GATC). However, most of the book should still be comprehensible and useful to a non-techie marketing or management audience. Indeed, if they persevere they can then use the book to beat their technical staff over the head by quoting the bits where particular implementation details are described as being easy for good webmasters to accomplish.

Admittedly, most of the information you’ll get here is also available online somewhere for free. However, it’s scattered around web analytics blogs and forum posts and many people who could benefit from it are not going to have the time and/or the perseverance to seek it all out. Even if you know all the hacks cited already, the convenience of having them collected in one reference book is still a great benefit. Over and above having a lot of neat tricks, the book presents a coherent approach to the whole business of analytics which makes it worth reading for anyone who needs to undertake web metrics on a professional basis.

Advanced Web Metrics with Google Analytics by Brian Clifton is published by Sybex.

Review: Refactoring HTML

There’s nothing more dispiriting than being stuck with maintaining an old website with hundreds of pages of rubbishy “Netscape 4”-era code. There can be an overwhelming number of things which need fixing or updating. You may well be tempted to think it not worth trying to improve things incrementally and instead plan for a major redesign at some indeterminate point in the future – which maybe you’ll never get the time or the money to undertake.

If you find yourself in this situation, then Elliotte Rusty Harold’s book Refactoring HTML: Improving the design of existing web applications will be a good antidote to “wait for the redesign” paralysis. The book encourages taking a gradual approach to converting your website to a modern standards-compliant state, rather than trying to do everything at once. I think this is a great area to write a book about, since it fulfils a need a lot of website managers will have. Designing and building new sites is fun and there are an awful lot of books available about this creation process. Maintenance and incremental improvement of old sites is distinctly unsexy in comparison and gets much less shelf-space.

The book is a compendium of stuff, most of which you’ll probably already know you should be doing. It’s arranged with subjects listed in a cookbook fashion – covering why you want to make each change, potential trade-offs and the mechanics of how to carry out each improvement. The first chapter is an introduction to the subject of refactoring, which is a programming concept that may be new to a lot of web designers.

“Refactoring is the gradual improvement of a code base by making small changes that don’t modify a program’s behaviour, usually with the help of some kind of automated tool. The goal of refactoring is to remove the accumulated cruft of years of legacy code and produce cleaner code that is easier to maintain, easier to debug, and easier to add new features to.” From “Refactoring HTML”

In the second chapter, there is a thorough overview of automated tools you can use for refactoring. A lot of the information here is going to be of more use if you’re a programmer. However, the discussion of regular expressions should be of use to anyone who has to deal with outdated HTML code. It’s backed up by an appendix which provides a beginner’s guide to regular expressions. Throughout the book there are specific regular expressions supplied for fixing particular problems which will be hugely useful to non-programmers like me who find writing their own regular expressions a pain.
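To give a flavour of the kind of regular-expression fix involved, here is a small Python sketch that lowercases upper-case tag names, a common first step when tidying old markup. The pattern and function name are my own illustration rather than anything taken from the book, and a regular expression like this is a blunt instrument: it will also touch anything that merely looks like a tag inside comments or scripts, so check the results before saving them.

```python
import re

# Matches "<" or "</" followed by a tag name; attributes are left untouched.
TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9]*)")


def lowercase_tag_names(html):
    """Return html with every tag name converted to lower case."""
    return TAG_NAME.sub(lambda m: m.group(1) + m.group(2).lower(), html)


old_markup = '<P>Hello <B>world</B>, see <A HREF="Index.html">Home</A></P>'
new_markup = lowercase_tag_names(old_markup)
print(new_markup)
```

Note that the attribute name HREF and the link target are deliberately left alone here; each would need its own, separately tested pattern.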

Chapters 3 and 4 cover all the aspects of well-formedness and validity in HTML documents. The author is sensibly not insistent upon validation for its own sake and on several occasions gives examples of times when it may be pragmatically better to go for an invalid option. He also points out where the standards don’t actually make much sense – the rule that block quotes can’t be within paragraphs is one example discussed which has always really annoyed me.

Chapter 5 covers layout, with some discussion of replacing table layouts and frames-based layouts with CSS. However, this is definitely not a design-oriented book and its CSS advice is limited to providing some basic layouts and advising that CSS is “very much a technique for full-time professionals”.

The book continues with a nice-to-see chapter on accessibility and a section on web applications. The latter includes an interesting section on Web Forms 2.0 types as well as solid advice on older topics, like when to use POST or GET and the need to escape all user input. Finally, there’s a chapter on content which – like the section on layout – is pretty basic. Still, it’s nice to see an emphasis on the need for correct spelling in a book that seems to be aimed primarily at coders.
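The advice to escape all user input is easy to illustrate. The sketch below uses the escape function from Python’s standard html module to neutralise markup in a visitor’s comment before it is written into a page; the render_comment function and the sample input are hypothetical, and this shows only the general idea rather than the book’s own examples.

```python
from html import escape


def render_comment(author, comment):
    """Build an HTML fragment, escaping both user-supplied values."""
    return f"<p><strong>{escape(author)}</strong>: {escape(comment)}</p>"


# Hypothetical input containing markup a visitor should not be able to inject.
fragment = render_comment("Ann & Bob", '<script>alert("x")</script>')
print(fragment)
```

The angle brackets, ampersand and quotation marks all come out as character entities, so the browser displays the text instead of running it.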

Refactoring HTML as a whole is certainly useful for anyone managing a badly-coded site, especially if they haven’t thought much about ways to semi-automate testing and improvements.

Refactoring HTML: Improving the design of existing web applications by Elliotte Rusty Harold is published by Addison-Wesley.