Thursday 30 November 2006

From Spices to Quirks...

Folks,

Corporate Spices has seen quite a few visitors here at Blogger. But then we always want more...

Keeping in mind my varied interests and my insanely infrequent posting habits, I have decided to rid Blogger of my troublesome anti-blog-social behavior and move to my own domain. I decided to merge all my online identities under my real one, and voila! We now have: 42 Quirks. All the current posts and comments have been imported and collected at the new address. I will stop answering comments on this blog from tonight, 0000 hrs IST (1830 GMT). Maybe I will delete the blog... Maybe I won't...

Thanks for all your support and the great conversations we've had until now. I hope we will continue them on 42 Quirks in the future...

You are cordially invited to visit my new home at 42 Quirks. Update your bookmarks (in case you did have one [:D]).

See you there!!

Tuesday 5 September 2006

Search Engine Redundancy: The Final Countdown

Until now, the process of consuming content was of a very primitive type - Search and consume. We searched for information using certain keywords and then converted it manually to knowledge. If we wanted to access the information at a later date, we simply printed it out. If we wanted to re-search it (pun unintended) we searched it again! There was no way of storing or retrieving this data for later usage.

Enter del.icio.us, one of the first social applications.

All of a sudden, you could bookmark pages you liked AND store them too! Searching for that page on Shark bites you saw two months ago simply transformed into searching through your list of bookmarks. Your bookmarks could now travel with you wherever you went!! The sharing feature meant that now your friend could easily send you that link to the direct downloads, bypassing all the popups and ads along the way. ;-)

The process of consuming information now became three-tiered: Search, Store and Retrieve.
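
The store-and-retrieve half of that loop can be sketched in a few lines of Python. This is only a toy, not how del.icio.us actually works: the `BookmarkStore` class, its method names and the example URLs are all made up for illustration.

```python
from collections import defaultdict


class BookmarkStore:
    """Toy tag-based bookmark store (hypothetical, for illustration only)."""

    def __init__(self):
        self._urls_by_tag = defaultdict(set)  # tag -> set of URLs

    def store(self, url, tags):
        """Save a URL under each of the given tags."""
        for tag in tags:
            self._urls_by_tag[tag.lower()].add(url)

    def retrieve(self, *tags):
        """Return the URLs that carry every one of the given tags, sorted."""
        if not tags:
            return []
        matches = set.intersection(
            *(self._urls_by_tag.get(tag.lower(), set()) for tag in tags)
        )
        return sorted(matches)


store = BookmarkStore()
# Placeholder URLs, not real pages.
store.store("http://example.com/shark-bites", ["sharks", "bites"])
store.store("http://example.com/shark-diet", ["sharks", "food"])

print(store.retrieve("sharks", "bites"))
```

Finding that page on shark bites two months later becomes a lookup by tag rather than a fresh trip to the search engine.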

Somewhere between then and now, we instinctively developed a habit of consuming content, gaining knowledge, and stashing it away for further usage. A lot of Web purists call this the River of News approach.

Drink hard, drink deep...

We live in a dynamic world that survives on a River of News.

The River of News concept, as described by Dave Winer, goes something like this:

Instead of having to hunt for new stories by clicking on the titles of feeds, you just view the page of new stuff and scroll through it. It's like sitting on the bank of a river, watching the boats go by. If you miss one, no big deal.

... which is exactly how we parse our daily newspapers for news! If a story is interesting enough, it will be back again the next day. If it ain't, down it goes...

The River of News concept assumes a relaxed outlook towards the consumption of content by any user. It relies on the fact that if an older item is to be revived, then it will be revived, no matter how or why*.
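
To make the idea concrete, here is a minimal Python sketch of a river: several feeds merged into one newest-first stream that you simply scroll through. The feeds, the titles and the `river_of_news` function are invented for illustration; a real aggregator would parse RSS or Atom instead of hard-coded lists.

```python
import heapq
from datetime import datetime
from itertools import islice

# Each feed is a list of (published, title) pairs, already sorted newest first.
# The feed contents are invented for the example.
feed_a = [
    (datetime(2006, 9, 5, 9, 0), "A: morning post"),
    (datetime(2006, 9, 4, 18, 0), "A: last night's post"),
]
feed_b = [
    (datetime(2006, 9, 5, 8, 30), "B: early post"),
    (datetime(2006, 9, 3, 12, 0), "B: weekend post"),
]


def river_of_news(*feeds, limit=10):
    """Merge feeds into one newest-first stream and keep the first `limit` titles."""
    merged = heapq.merge(*feeds, key=lambda item: item[0], reverse=True)
    return [title for _, title in islice(merged, limit)]


for title in river_of_news(feed_a, feed_b, limit=3):
    print(title)
```

Anything beyond the `limit` simply floats past unread, which is the whole point: if you miss one, no big deal.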

The only hitch to this concept is the duration of focus in an average human. Somehow, the concept of a limited attention span has seeped onto the Web. Conversations (a.k.a. posts, articles, etc.) have a specific life span depending upon a variety of factors, ranging from authority to popularity. The previous post touched upon four of these factors that I personally believe to be important.

As the River of News concept washes the Webosphere, the content generated by users (erm, I mean, the knowledge shared by the netizens) becomes outdated as soon as the attention span of the article ends. For some posts, the span is as short as 30 seconds; for others it might last for weeks.** The keep-alive time of a post is extended by a variety of parameters, with the element of chance sometimes playing a significant role too.

The Bottom-line: Find, not search

Traditional Search Engines search for content based upon classifications of keywords and various natively built algorithms. Earlier, when the internet was an array of 'webmaster-maintained' static displays, search engines had to be relevant. In the days to come, I foresee the River of News flooding the Blogosphere: Freshness of results will definitely be paramount, then.

The trade-off between freshness and relevancy is one of the factors that will see a good sound debate in the days to come. This, unless the Blogging trend tapers off suddenly instead of continuing to rise.***

One question I have purposefully (and successfully) avoided so far is this: Will we be able to match people to keywords?

A search engine will match content to your keywords. But there are three Shrikant Joshis and many Shrikants and many more Joshis who are regular bloggers. How do you differentiate them? Again, what happens when you are looking for a solution to a problem? Would search engines in the (near?) future also throw up results like:

"5 user(s) can solve your problem! Do you want to hire them?"

More importantly, if they did, would you believe them?

Footnotes:

*One of my reasons for posting this post so late (in spite of my previous assurances) was to check if there was any interest I could generate, and how it varied with time. However, I miscalculated one of the most important aspects: subscriptions. Since I never had any audience to begin with, there was no way I could anticipate anything. That's called counting your results before you have keywords. :)

** Wondering what category my posts fall into? Well, somewhere close to the 30 second limit.

***For more details read David Sifry of Technorati


powered by performancing firefox

Monday 21 August 2006

Part II - Why Search Engines will be redundant soon...

Part II - I Seek You, and your meta-data, too...


The story until now: Part I was a quick review into understanding Traditional Search Engines and their methods and relating them to human conversation - since the Web 2.0 is all about 'conversations in the marketplace'. On to the second part.

What does making sense out of data mean?

In Search Engine terms, it would refer to contextualizing the huge chunk of uncontextualized data that is the World Wide Web into information and eventually knowledge. To me, as a human, it simply means tagging certain keywords to any given chunk of data (e.g. a lecture, a passage, a book, a chapter, a conversation) in order to be able to recall it at any time, especially when one of these keywords is mentioned. For instance, the conversation in the previous post was about a traveller (an out-of-towner) looking for directions to a tobacconist.

As I keep reminding myself, Web 2.0 is not a product, it is a process. The process has a lot of conversational threads that keep getting picked up and dropped as newer and more interesting threads or new participants appear in their place. So what would a contemporary Search engine have to consider in Web 2.0?

'Weight'ing for Information.

From being a static display of items-for-sale behind elegant window panes, the Internet slowly transformed into a bazaar of sorts, with hawkers all around the place plying their wares. The markets grew to accommodate the new and the old. With the advent of Web 2.0, contextualization of information became the norm and not an option. It all began with a nifty bookmarking site called del.icio.us that allowed you to access your favorite sites across the web. Technorati extended the concept to Blogs and induced bloggers to 'tag' their posts with their choice of keywords/tags. With the Web evolving like a democracy, the obvious question of authority in the Web-democracy arose. Which voice among the loud babble was to be trusted? As the web evolved, so did the concept of its franchise. Only, in this virtual reality, links were deemed votes and tags were your campaign ads. Let's take a quick look at the four weights that influence your vote.
  1. Tags - Powerful Keywords
     Each tag is a keyword that associates a particular context, a topic, with a given chunk of data.

  2. Time - The 'other' Long Tail
     All topics & data have a peak presence time. The freshness of a particular keyword is of prime importance in its influence. Consider this simple example: When Iraq was attacked, almost all of the Search Engines across the world were buzzing with Search queries consisting of corresponding keywords, viz., "Iraq" "attack". The "hotness" of the Search cooled down as the days progressed, as the world got other topics to discuss.

  3. Trust & Authority
     Even in flat hierarchies like the Internet there are obvious positions of Trust and Authority. People who blog well and blog often gain a large following and, effectively, the crucial element of Trust.

  4. Authenticity
     A news item on a Microsoft blog would obviously be rated higher in all terms than one quoting a "trusted Source at Microsoft". The only exceptions to this rule are:

    • The news is a really good bit of juicy gossip - like a rant or a 'leaked' secret
    • The blogger has high levels of Trust & Authority
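
One way to picture how these four weights might combine is a toy scoring function. The Python sketch below is purely illustrative, not how any real search engine ranks posts: the `Post` fields, the one-week half-life, the second-hand penalty and the choice to multiply the factors together are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    tags: set
    age_days: float       # time since the post was published
    author_trust: float   # 0.0 (unknown blogger) to 1.0 (widely trusted)
    first_hand: bool      # True if the news comes straight from the source


def score(post, query_tags, half_life_days=7.0, second_hand_penalty=0.5):
    """Combine the four weights into one number (higher is better)."""
    if not query_tags:
        return 0.0
    # 1. Tags: fraction of the query tags that the post carries.
    tag_match = len(post.tags & query_tags) / len(query_tags)
    # 2. Time: the weight halves every `half_life_days`.
    freshness = 0.5 ** (post.age_days / half_life_days)
    # 3. Trust & Authority: taken directly from the author's trust level.
    trust = post.author_trust
    # 4. Authenticity: second-hand news is penalised.
    authenticity = 1.0 if post.first_hand else second_hand_penalty
    return tag_match * freshness * trust * authenticity


query = {"microsoft", "launch"}
official = Post("Official announcement", {"microsoft", "launch"}, 7.0, 0.8, True)
rumour = Post("A trusted source says...", {"microsoft", "launch"}, 7.0, 0.8, False)

print(score(official, query))
print(score(rumour, query))
```

With everything else equal, the first-hand post outscores the rumour. Because the factors multiply, a blogger with much higher Trust & Authority can still make up for the second-hand penalty, which matches the second exception above.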
There's a common thread that binds all of these... Do you see it yet?

(To be concluded)

Note: I profusely apologise for disappearing from the Blogging scene all of a sudden. I was forced into a short hiatus by unforeseen circumstances. We recently updated our website platform to a new version. Although the beta is pretty stable, we are still working on a better UI. As a result, I had to spend some sleepless nights and a few Blog-less weeks. ;) Once again, my sincere apologies for the same...


Monday 7 August 2006

Why Search Engines will be redundant soon...

Part 1: Search and the Web 1.0: Gorblimey!

Those of you who reached here through Google, Yahoo or MSN are probably laughing as you read this. But do go on, there's more. :-)

(Un)common Recurring Searches

Often our searches are simple keywords crafted with central themes in mind:
  • A name (e.g. Shrikant Joshi or Performancing)
  • A topic (e.g. Corporate Communications)
  • A context (e.g. "Spanish Omelette" +recipe)
Some of us might even burden the spartan box (or in the old days, the Butler) with an entire question. The faithful zombie then crawls its way through the innards of the webs, looking for that occasional diamond stashed away in the back alleys. Usually, in the common cases such as the ones defined above, results are returned in the correct context of our request. Often, the SERPs also throw results that are related yet not within context. Robert Scoble's post on Optimization had this line that caught my attention:
It all starts with the blog. Now, why can’t I put my blog on the map? When you go to Live.com and search on “Scoble” why can’t I customize my results there with more information for you?
Well, I don't agree wholly. Search for my name on Google. There are at least three different people called Shrikant Joshi who turn up in the top 3. We keep exchanging the first three ranks. And all of us are pretty active bloggers, it would seem. The see-sawing of rankings in the Organic Search results is not a matter of concern for me. Nor do I want to customise these search results so that I would get more result-space.

I am not a key-word

What are search engines?

Simply speaking, search engines are content-aggregators assigned the additional job of classification. As humans we need to have everything classified into a taxonomy so as to facilitate recollection. Our knowledge depends upon storage, which in turn depends upon collection and classification of data. Classification helps recollection and hence improves perceptive retention of knowledge. Or, in simple words: The more you know, the wiser you are. Hence, classify and remember.

Similar to how we retain knowledge, Search Engines classify the data they crawl according to keywords. A huge index is built up and referenced and cross-referenced until all the possible avenues of keywords linking to pages and vice-versa are covered. But you probably know all that and more already.

Keywords, mmmm... Aah!

The next step would be making sense out of the data, which eventually leads to contextualization. Don't get it? Well, simply put: "A search engine's job is to make sense out of all that data."

Let's take a simple case. Someone in your town happens to own a convenience store named Uncle Tom's Cabin. Let us imagine that an outsider in your city is searching for it. Here's how the conversation would go:

Outsider: "Where can I find a convenience store?"
You: "That would have to be Uncle Tom's cabin. Go straight down for about two blocks and then take a left. It's right across the street."
Outsider: "Would I be likely to get some cigarettes there?"
You: "Oh! If you simply want cigarettes, there's a tobacconist just round the corner!"

A normal conversation, eh? Well, let's take a look at it again. Only this time, we'll look at it the way a search engine would. Let's insert some keywords into it to understand the flow of the conversation:

  1. "Where can I find a convenience store?" [New Search Query, keyword: "convenience store"]
  2. "That would have to be Uncle Tom's cabin. Go straight down for about two blocks and then take a left. It's right across the street." [Response keywords: "Uncle Tom's cabin", "directions"]
  3. "Would I be likely to get some cigarettes there?" [Refine Search Query, keyword: "cigarettes"]
  4. "Oh! If you simply want cigarettes, there's a tobacconist just round the corner!" [Response keywords: "Tobacconist", "Round the corner"]

With me so far? Here's the stumper: If each of these sentences corresponded to an entire blog-post in the Blogosphere, how would you track this conversation? How would you rank each post with respect to the keywords? Would those keywords be enough to cover all aspects of the conversation? Would you call those keywords appropriate descriptors of the conversation? Where would these posts appear in SERPs for the combined keywords {"Your Name" +directions}?

To be continued...

Disclaimer: I am no Search Engine Expert. These opinions are simply my $0.02 worth. Or maybe less. :)
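
For the curious, the keyword-annotated conversation above can be fed into a toy keyword index to see the problem first-hand. This is a minimal Python sketch under simple assumptions: the `build_index` and `search` helpers are hypothetical, each turn of the conversation is treated as one post, and real search engines are far more elaborate.

```python
from collections import defaultdict

# The four turns of the conversation, treated as four blog posts,
# each tagged with the keywords from the annotated version above.
posts = {
    1: {"convenience store"},
    2: {"uncle tom's cabin", "directions"},
    3: {"cigarettes"},
    4: {"tobacconist", "round the corner"},
}


def build_index(posts):
    """Map every keyword to the set of post ids that carry it."""
    index = defaultdict(set)
    for post_id, keywords in posts.items():
        for keyword in keywords:
            index[keyword].add(post_id)
    return index


def search(index, *keywords):
    """Return the ids of posts tagged with all of the given keywords."""
    if not keywords:
        return []
    hits = set.intersection(*(index.get(k.lower(), set()) for k in keywords))
    return sorted(hits)


index = build_index(posts)
print(search(index, "directions"))
print(search(index, "cigarettes", "tobacconist"))
```

Each keyword finds its own post, but a query that spans two turns of the conversation comes back empty: the index knows the posts, not the thread that connects them.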


Wednesday 26 July 2006

Netscape.com says, "Hi to all Diggers!"

Surprised? Well, read on...

Early this morning, someone submitted a story on Netscape.com. And Digg fans all over the world erupted in laughter and glee. Ever since the story was submitted, two pop-up messages appear when Netscape is loaded into your browser: the first is a four-letter expletive, and the second greets "all you Diggers out there!"

The culprit? A story titled "Unbearable Cuteness". Ironic, eh? Here's the what and why of the entire fiasco.

Analysis: A quick check of the JavaScript on the page reveals this script:
via <a title="http://www.cute.com"><script>alert("fuck"); alert("Hi to all you Diggers out there ;)");