
Temporal Perturbation

I had a really interesting chat with one of my colleagues last night about keyword detection, keyword relevance algorithms and, more generally, self-learning computer systems for the web. These are becoming more and more important on the web today. Great examples are Google search results, Amazon recommendations and all the tagging technologies.

The problem becoming evident at a low level is described in natural search circles as "the rich get richer". Since links exist forever, older links may describe a site's content less accurately than newer ones, and yet those older links are still used heavily in determining keyword relevance for a page in Google.

In recommendation services, if I have browsed both Good to Great and Built to Last recently, Amazon will happily recommend the two together. Unsurprisingly, with that combo thousands of others will do the same over time, and the affinity between those books will grow strong in their database. Here is where I enter fiction, since I don't know how Amazon really works, but some recommendation services certainly work this way. What if Chris Anderson's upcoming book on the long tail has a stronger affinity with Good to Great (not implausible) and more people start visiting those two books in the same session? Many will still visit Built to Last and Good to Great, and because of the massive weight that affinity already has, the more recent change could take months or years to appear.
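To make that inertia concrete, here is a minimal sketch of the kind of co-visit counting I am imagining. It is not how Amazon works (I don't know that); the function name `affinity_counts`, the session lists and the visit numbers are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

def affinity_counts(sessions):
    """Count how often each pair of items shows up in the same session."""
    counts = Counter()
    for items in sessions:
        # Sort and de-duplicate so each pair is counted once per session
        # and always stored under the same key order.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical history: an established pairing and a newer, rarer one.
old_sessions = [["Built to Last", "Good to Great"]] * 1000
new_sessions = [["Good to Great", "The Long Tail"]] * 50

counts = affinity_counts(old_sessions + new_sessions)
top_pair = counts.most_common(1)[0][0]
```

With raw counts like these the older pairing stays on top no matter how recent the newer sessions are, because every visit carries the same weight forever.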

Advanced direct response paid search strategies optimize the keywords they buy based on historic ROI data, and brand campaigns optimize on historic brand uplift or shifts in user behaviour. At what point do those historic models harm you, and when do you need to downweight history in order to upweight changing trends in user behaviour?

A suggestion I have heard is to assign all data sets a relevance score and downweight that relevance logarithmically with time. I love that suggestion. There are certainly other possibilities out there, but don't be surprised if the internet back end needs another tweak within the next two years, as the big sites begin to realize their clever technology isn't quite as clever as it used to be: a little too tied to past results and a little slow to pick up on present trends. Interestingly close to human beings as they age, really!
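Here is one possible reading of that logarithmic downweighting, as a rough sketch only. The formula `1 / (1 + ln(1 + age))`, the function names and the sample ages are my own assumptions; the suggestion as I heard it did not come with a formula, and other decay curves (exponential, for instance) would forget the past much faster.

```python
import math

def decayed_weight(age_days):
    """Weight of one observation that is age_days old (1.0 when brand new).

    Assumed formula: 1 / (1 + ln(1 + age_days)).
    """
    return 1.0 / (1.0 + math.log1p(age_days))

def decayed_score(ages_in_days):
    """Sum the decayed weights of all observations for one data set."""
    return sum(decayed_weight(age) for age in ages_in_days)

# Hypothetical data: 200 co-visits from two years ago vs. 50 from yesterday.
old_pair_ages = [730] * 200
new_pair_ages = [1] * 50

old_score = decayed_score(old_pair_ages)
new_score = decayed_score(new_pair_ages)
```

In this made-up case the 50 recent visits outscore the 200 old ones, even though the raw counts say the opposite. Logarithmic decay is gentle, though: with a much larger pile of history the old data would still win, so the shape of the curve is a real design choice.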
