Friendly URL Management

Overview

FarCry manages content in a pseudo-object-oriented way through the so-called "content object application programming interface", or COAPI. This approach offers significant advantages for content management, but by default it generates parameterised URLs. Parameterised URLs are intimidating to users and difficult to remember and share.

For example, a typical parameterised URL in FarCry would be: http://www.farcrycms.org/index.cfm?objectid=A762FA21-EA69-0EC7-F9213952134B86E8

In contrast, Friendly URLs are defined as internet addresses that are both easy for human users of the web site to read, and also meaningful for automated spiders. FarCry has a special engine for automatically creating and managing Friendly URLs for content generated through the system.

For example, a typical Friendly URL in FarCry would be: http://farcry.daemon.com.au/go/documentation/daily-use-guides

Historical Mechanism

FarCry 2.3 Friendly URL (FU) invocation technology works like so:

  • FriendlyURL servlet looks for a specific mask on the URL.
  • mask is set in the JRun XML config and in the FarCry FU config settings. By default this is /go/* but it can be anything per project.
  • if the mask pattern is matched by the servlet, the URL is looked up in the FriendlyURL hash table (persisted on the drive as FriendlyURL.txt) and the parameterised URL is provided to the application
  • parameterised URL is passed to index.cfm for processing, eg. index.cfm?objectid=UUID
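
The lookup steps above can be sketched as follows. This is a minimal illustration in Python rather than the servlet or CFML code FarCry actually uses; the function name, the table contents and the pairing of this path with this objectid are assumptions made for the example.

```python
from typing import Optional

# Hypothetical in-memory copy of the FriendlyURL hash table:
# friendly path (the part after the mask) -> parameterised URL.
# The pairing of this path and objectid is illustrative only.
FRIENDLY_URLS = {
    "documentation/daily-use-guides":
        "index.cfm?objectid=A762FA21-EA69-0EC7-F9213952134B86E8",
}


def resolve_friendly_url(path: str, table: dict, mask_prefix: str = "/go/") -> Optional[str]:
    """Return the parameterised URL for a masked path, or None if nothing matches."""
    if not path.startswith(mask_prefix):
        return None  # URL does not match the mask, so it is not an FU request
    key = path[len(mask_prefix):].strip("/")
    return table.get(key)


print(resolve_friendly_url("/go/documentation/daily-use-guides", FRIENDLY_URLS))
```

A request outside the mask, or one with no entry in the table, yields None, which the caller would treat as a miss.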

FUs are managed through a component in the FarCry code base. By default, dmNavigation content items have an FU built from the ancestor title breadcrumb to home. There are limited utilities for reporting on active FUs and rebuilding FUs from the navigation tree. Extending FUs to other content types is possible but by no means obvious.

FarCry 3.0 Invocation Mechanism

The FU servlet is modified to simply match the FU pattern (eg. /go/*) and then redirect processing to a specific go.cfm template in the webroot.

A summary of invocation might include:

  • servlet matches FU pattern and redirects to go.cfm
  • go.cfm determines parameterised URL
  • go.cfm CFINCLUDES index.cfm to continue FarCry invocation

URL Passthrough

Despite the internal invocation process, the rewrite engine should perform a "passthrough". That is, the URL remains the same to the user in the browser, and page execution from the CFINCLUDE behaves like any standard FarCry invocation, except that Application.cfm/.cfc is invoked first, then go.cfm, and finally the central conjuror, typically index.cfm.

The default processing code within go.cfm is centralised in farcry_core through the use of a component function call. Individual projects will be able to supplement or replace this default behaviour directly within the go.cfm template of their project. For example, a different invocation template might be chosen, such as invoke.cfm instead of index.cfm.

FU management in FarCry has changed slightly:

  • rather than persisting FU information in FriendlyURL.txt, URLs are persisted in the CMS db and loaded into the application.fu.* scope on app initialisation.
  • a component in FarCry would set FUs into the database and memory as required
  • individual content types (including custom content types) would set their own FU during the action case of an edit handler, or perhaps within an extended setdata() method (details for core content type behaviour beyond that already known for dmNavigation will be determined for a later milestone).

Friendly URL Subsystem

All system level friendly URL management is handled through a specific component, ie. fu.cfc. fu.cfc has been re-engineered to seamlessly present the existing v2.3 FU functionality but also provide a richer interface for FarCry 3.0 extensions.

Database persistence will be done in the "refFriendlyURL" table with the following fields:

  • objectid (pk)
  • refobjectid (objectid of referenced content item)
  • friendlyurl (everything after /go/*, not including query_string)
  • query_string (the query_string of the parameterised URL)
  • status (0: redirect 301, 1: active, 2: active permanent)

The status field includes an "active permanent" state to accommodate FUs that have been manually created and should consequently not be automatically retired when content is edited. As content is edited, it is envisaged that old FUs will be flagged 0 while the current active FU will have a status of 1. That is, the combination of refobjectid and status 1 should be unique: each content item has at most one FU with status 1.
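
To make the status lifecycle concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in for the CMS database. The column types, the partial unique index, the index name and the set_friendly_url helper are all assumptions for illustration; only the table and field names come from the list above.

```python
import sqlite3
import uuid

STATUS_REDIRECT, STATUS_ACTIVE, STATUS_PERMANENT = 0, 1, 2


def create_schema(conn):
    # Field names follow the list above; the column types are assumed.
    conn.execute(
        """
        CREATE TABLE refFriendlyURL (
            objectid     TEXT PRIMARY KEY,
            refobjectid  TEXT NOT NULL,
            friendlyurl  TEXT NOT NULL,
            query_string TEXT NOT NULL DEFAULT '',
            status       INTEGER NOT NULL
        )
        """
    )
    # Assumption: enforce "one status-1 row per refobjectid" with a partial index.
    conn.execute(
        "CREATE UNIQUE INDEX idx_one_active "
        "ON refFriendlyURL (refobjectid) WHERE status = 1"
    )


def set_friendly_url(conn, refobjectid, friendlyurl, query_string=""):
    """Retire the current status-1 FU (if any) and record the new active one."""
    with conn:  # run both statements in one transaction
        conn.execute(
            "UPDATE refFriendlyURL SET status = ? "
            "WHERE refobjectid = ? AND status = ?",
            (STATUS_REDIRECT, refobjectid, STATUS_ACTIVE),
        )
        conn.execute(
            "INSERT INTO refFriendlyURL VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()).upper(), refobjectid, friendlyurl,
             query_string, STATUS_ACTIVE),
        )


conn = sqlite3.connect(":memory:")
create_schema(conn)
set_friendly_url(conn, "OBJ-1", "products/old-name")
set_friendly_url(conn, "OBJ-1", "products/new-name")
rows = conn.execute(
    "SELECT friendlyurl, status FROM refFriendlyURL ORDER BY status"
).fetchall()
print(rows)  # [('products/old-name', 0), ('products/new-name', 1)]
```

Rows with status 2 are never touched by the UPDATE, which matches the intent that manually created FUs are not retired automatically.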

In-memory storage of all active FUs (ie. status > 0) will take one of two routes:

  • stored as a structure of structures keyed by friendlyURL eg. application.fu[friendlyURL].* including the following keys: refobjectid, query_string, or;
  • stored as a query eg. application.fu.qFU (with additional metadata assigned to application.fu for management eg. datetimelastupdated for query)
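
The first option, a structure of structures keyed by friendlyURL, might look like the sketch below. It is written in Python for illustration, with a dict standing in for application.fu; the build_fu_cache name and the sample rows are hypothetical.

```python
def build_fu_cache(rows):
    """Build {friendlyurl: {"refobjectid": ..., "query_string": ...}} from active rows."""
    cache = {}
    for row in rows:
        if row["status"] > 0:  # 1: active, 2: active permanent
            cache[row["friendlyurl"]] = {
                "refobjectid": row["refobjectid"],
                "query_string": row["query_string"],
            }
    return cache


# Hypothetical rows as they might be read from refFriendlyURL.
sample_rows = [
    {"refobjectid": "OBJ-1", "friendlyurl": "products/old-name",
     "query_string": "", "status": 0},
    {"refobjectid": "OBJ-1", "friendlyurl": "products/new-name",
     "query_string": "", "status": 1},
    {"refobjectid": "OBJ-2", "friendlyurl": "promo/summer",
     "query_string": "view=teaser", "status": 2},
]

fu_cache = build_fu_cache(sample_rows)
print(sorted(fu_cache))  # ['products/new-name', 'promo/summer']
```

Retired FUs (status 0) stay out of memory and are only consulted in the database when an in-memory lookup misses.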

query of queries vs structkeyfinds()

We'll need to perform some analysis of the performance of query of queries vs structkeyfinds() on large data sets. Either way the internal storage mechanism should not affect developers.

Friendly URL Structure

The proposed FU changes would mean that any content type (core or custom) could record any FU based on both properties and logic stored in the content type, activated during editing, approval or any other stage of the content item lifecycle.

This will enable content types to accommodate strategies for developing SEO specific properties. How these SEO properties are populated would be wholly dependent on the edit handler for the content type in question. For example, such a property could be a freeform field, calculated or selectively populated from any form of metadata engine for specific keyword taxonomies.

An example format for the friendly URL might be as follows: http://myproject.com/product/SEO-URL-Keyword/DocID

Note that in this example we're using the /product/ prefix rather than /go/.

seoURLKeyword: a user-defined property. Users select a word or phrase from a predefined list of options. Multi-word phrases should be separated by hyphens (e.g. superfriendly-urls).

DocID: DocID might be an internal unique identifier. This could provide better options for backward compatibility for legacy content systems and so on.

For dynamic content that effectively lives in numerous places in the site, URLs could be set according to the methods described above rather than their proximity in the site tree.

/product/rag-doll/DG00123
/product/mp3-player/DG00234
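
A content type could assemble URLs of this shape from its own properties. The sketch below, in Python for illustration, assumes a simple hyphenating rule for the keyword; slugify and build_product_fu are hypothetical helper names, not part of FarCry.

```python
import re


def slugify(phrase):
    """Lower-case a phrase and join its alphanumeric words with hyphens."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    return "-".join(words)


def build_product_fu(keyword, doc_id, prefix="product"):
    """Assemble /prefix/seo-keyword/DocID from content type properties."""
    return "/{}/{}/{}".format(prefix, slugify(keyword), doc_id)


print(build_product_fu("Rag Doll", "DG00123"))    # /product/rag-doll/DG00123
print(build_product_fu("MP3 Player", "DG00234"))  # /product/mp3-player/DG00234
```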

Redirects

What happens when Friendly URLs change? FarCry holds onto old URLs, persisting them in the database and flagging them as defunct. That way we can do better 404 handling in the go.cfm template. By default, a missed FU lookup triggers a second lookup for deprecated URLs in the database, and the user is redirected if possible. As a last resort we pass the user through to a 404.cfm handler.
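
The fallback order described above can be sketched as follows. This is a Python illustration, not the go.cfm logic itself: two dicts stand in for the in-memory active FUs and the retired rows in the database, and resolve_request is a hypothetical name.

```python
def resolve_request(friendlyurl, active, retired):
    """Return (http_status, target) for a requested Friendly URL.

    active  -- {friendlyurl: refobjectid} for FUs with status > 0
    retired -- {friendlyurl: refobjectid} for FUs flagged 0 (stand-in for the db)
    """
    if friendlyurl in active:
        return 200, "index.cfm?objectid=" + active[friendlyurl]

    # Second lookup: was this once a valid FU for some content item?
    refobjectid = retired.get(friendlyurl)
    if refobjectid is not None:
        for current_fu, objectid in active.items():
            if objectid == refobjectid:
                return 301, "/go/" + current_fu  # redirect to the current FU

    # Last resort: hand over to the 404 handler.
    return 404, "404.cfm"


active = {"products/new-name": "OBJ-1"}
retired = {"products/old-name": "OBJ-1", "products/deleted-item": "OBJ-9"}

print(resolve_request("products/new-name", active, retired))
print(resolve_request("products/old-name", active, retired))
print(resolve_request("products/deleted-item", active, retired))
```

The third request falls through to the 404 handler because its content item no longer has an active FU to redirect to.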

Clients migrating content from old systems can write specific checks in go.cfm, post FU lookup, to map old content to new locations and fire the relevant 301 redirects. For historical URLs that don't match the FU pattern (probably most), we propose that the error handler or Application.cfm/.cfc look for the redirect before abandoning the user to a 404 or missing objectid error.

Redirect in ColdFusion

It is critical to consider the nature of the redirect being performed, namely a 301, in order to preserve search engine page rank and indexation during the move. Ideally this should be done using a specific HTTP header.

<cfheader statuscode="301" statustext="Moved Permanently">
<cfheader name="Location" value="http://www.new-url.com">

Extending refObjects

It's possible projects may wish to utilise a cross content-type document id that is different to a CF UUID. For example, the docid might carry through from a legacy system, or simply provide a simplified content key. At this time I can see no issue with extending the refObjects table to incorporate an additional column.

Sub-Navigation

FUs can really be any parameterised URL, including variable-value pairs beyond objectid=UUID. So it is feasible to incorporate sub-navigational context into an FU. However, the real issue is management of the FU and ensuring that accessories such as the buildLink.cfm custom tag properly understand the extended concepts.
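
As a rough illustration of an FU that carries extra variable-value pairs, the Python sketch below rebuilds a parameterised URL from a stored refobjectid and query_string. The parameterised_url helper and the view and tab parameter names are hypothetical; how FarCry combines these fields is not specified here.

```python
from urllib.parse import parse_qsl, urlencode


def parameterised_url(refobjectid, query_string="", template="index.cfm"):
    """Combine the objectid with any extra pairs stored in query_string."""
    params = [("objectid", refobjectid)]
    # Keep sub-navigation pairs, but never let them override the objectid.
    params += [(key, value) for key, value in parse_qsl(query_string)
               if key.lower() != "objectid"]
    return template + "?" + urlencode(params)


print(parameterised_url("A762FA21-EA69-0EC7-F9213952134B86E8", "view=specs&tab=2"))
# index.cfm?objectid=A762FA21-EA69-0EC7-F9213952134B86E8&view=specs&tab=2
```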