Content Audit IDs

Generations* of content strategists have developed methodologies for creating content audits. Most use Google Sheets or Excel, which allows for sorting and filtering of pages, and is particularly valuable with complex sites.

The only problem is the audit ID. But I may have a solution.

What’s a Content Audit ID?

Content audits tend to identify pages by unique IDs. The IDs follow a system to help understand where a page lives in the overall navigation of the website:

  • The number of periods (.) correspond to the level of the page
  • The number itself (1, 2, 3) corresponds to how many items are on that level

For example:

  • 0 is always the homepage
  • 1 is the first page under the homepage (e.g. Home > Products)
    • All pages beginning with 1 will be within Home > Products
  • 2 is also at the first level, but is the 2nd page listed within the top-level navigation (e.g. Home > Services)
  • 1.1 is the first page at the second level (e.g. Home > Products > Specific Product)
  • 1.2 is the second page at the second level (e.g. Home > Products > A Different Product)

What’s the Problem (and Solution)?

Here’s the problem: Excel doesn’t like to sort by ID. As far as Excel is concerned, numbers should be organized like so:

  • 1
  • 1.10
  • 1.2
  • 1.3
  • 2
  • 2.1
  • 3
  • 1.2.1
  • 2.1.1

But according to a content strategist, those same numbers should be organized:

  • 1
  • 1.2
  • 1.2.1
  • 1.3
  • 1.10
  • 2
  • 2.1
  • 2.1.1
  • 2.2
  • 2.3
  • 3

So my local neighborhood developer (and husband), threw together some hacked up Excel formulas which were never intended to see the light of day to fix the problem. If you also struggle with this, feel free to take this example and hack together something similar in Excel.

    • Insert 6 columns to the right of your ID column. Assuming your ID column is column A, this means creating blank columns in B-G.
    • In column B, insert this code:
      •  =SUBSTITUTE(CONCATENATE(A2,REPT(".0",7-(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))))),".", " ! ",4)
      • What we’re doing here is appending a series of trailing “.0″s to the ID in Column A to standardize everything to have 8 bits.  We have to figure out how many periods are in the first.  Excel sucks at this and we have to take the overall length of the string and subtract the length of the string without “.”s included.  That’s this bit: (LEN(A2)-LEN(SUBSTITUTE(A2,”.”,”")).  It’s hacky and terrible, but it’s the best option excel offers.
      • Now that we now the number of bits we’re working with CONCATENATE() along with REPT() allows us to append trailing “.0″s.
      • The Leading SUBSTITUTE() function closes at the end and is replacing the 4th “.” with a “!”.  This is just inserting a searchable character to make things easier down the road.
    • In column C, insert this code:
      • =LEFT(B2,FIND(" ! ",B2))
      • Here we’re looking for the special search character we inserted in step 1 and returning everything to the left of it so we can begin standardizing each bit to 3 digit leading 0 format.
    • In column D, insert this code:
      • =TEXT(LEFT(C2,FIND(".",C2,1)-1),"000")&"." & TEXT(MID(C2,FIND( ".",C2,1)+1,FIND(".",C2,FIND(".",C2,1)+1)-FIND(".",C2,1)-1),"000")&"."&TEXT(MID(C2,FIND(".",C2,FIND(".",C2,1)+1)+1,FIND(".",C2, FIND(".",C2,FIND(".",C2,1)+1)+1)-FIND(".",C2,FIND(".",C2,1)+1)-1), "000")&"."&TEXT(RIGHT(C2,LEN(C2)-FIND(".",C2,FIND(".",C2,FIND( ".",C2,1)+1)+1)),"000")
      • Using FIND() and either MID() or LEFT() we’re isolating each bit here and formating it as text with TEXT and providing a “000″ so we get everything to 3 digits. 1 will be 001, 20 will be 020, 100 will just be 100.  If you have more than 999 pages in a section you’ll have to go to “0000″
      • The &”.” is basically CONCATENTATE() shorthand.
      • This is the main reason we split the original octet into 2 quads with ” ! “.  The TEXT() MID() combo used to find the individual bits gets cumbersome after a certain point and this keeps our nesting to 4 and is somewhat readable.   But now we have the first half of our ID.
    • In column E, insert this code:
      • =RIGHT(B2,LEN(B2)-(FIND(" ! ",B2)+2))
      • Here again as in Column C we’re looking for the special search character we inserted in step 1 and this time returning everything to the right of it so we can begin standardizing each bit to 3 digit leading 0 format.
    • In column F, insert this code:
      • =TEXT(LEFT(E2,FIND(".",E2,1)-1),"000")&"." & TEXT(MID(E2,FIND( ".",E2,1)+1,FIND(".",E2,FIND(".",E2,1)+1)-FIND(".",E2,1)-1),"000")&"."&TEXT(MID(E2,FIND(".",E2,FIND(".",E2,1)+1)+1,FIND(".",E2, FIND(".",E2,FIND(".",E2,1)+1)+1)-FIND(".",E2,FIND(".",E2,1)+1)-1), "000")&"."&TEXT(RIGHT(E2,LEN(E2)-FIND(".",E2,FIND(".",E2,FIND( ".",E2,1)+1)+1)),"000")
      • See Column D sub bullets.
    • In column G, insert this code:
      • =TEXT(CONCATENATE(D2,".",F2),"@")
      • Here we’re finally joining together the two quads of our ID into one octet with CONCATENATE() and formatting the whole thing as generic text using TEXT() and the “@” format which is just placeholder text.
    • Known issues
      • Empty IDs will bubble to the top.  It’s as much a feature as it is a bug.
      • Trailing “.”s will break things (1.1. for example).  Easy enough to add error checking for that if you want by using an IF() to check the final character.

The resulting columns should look something like this:

Screen Shot 2017-09-18 at 11.45.25 AM

What you’ve done, is tell the Excel to treat the IDs as though they were IP addresses. It knows how to read those, and you can use it up to 8 levels and 1000 pages within a section.

Questions? Comments? Betters ways to do it? Let me know!

*Ok, maybe not generations, per se. But a lot of content strategists.

Marli Mesibov

Marli is a content strategist with a passion for the user experience. Her work spans websites, web applications, and mobile. Marli is the VP of Content Strategy at the UX design agency Mad*Pow, where she helps healthcare, finance, and educational organizations communicate with their audiences. Marli is a frequent conference speaker, and has spoken at conferences including Content Strategy Forum and LavaCon. She can also be found on Twitter, where she shares thoughts on content strategy, literature, and Muppets.

Leave a Reply

Your email address will not be published. Required fields are marked *