rediff.com
Friday, October 18, 2002, 1300 IST




Google wants to be part of journalism's future

Suleman Din in New York

It wasn't necessity that motivated Google research scientist Krishna Bharat, 32, to begin working on a programme to simplify searching for articles on the Internet.

Sheer frustration was more like it, as the task of searching for news amidst the Web's endless flow of information was very tiring, he says.

Who had the time, he thought, to flip through one Web page after another, scan and then click on umpteen hyperlinks and battle multiplying ads just to read a particular story?

And so he went to work, thinking of a way to bring news from different sources together in one place.

The result was an entirely computer-generated news site, http://news.google.com, which has some journalists wondering if they would be replaced.

That's because without the aid of any human editors, the site automatically updates every fifteen minutes with hundreds of stories culled from around the world. And as on any news site, stories are categorised into separate sections such as 'Top Stories' or 'Sports'.

"It's presumptuous to say this [Web site] will be the future of journalism itself," Bharat says. "But it will be part of the future of journalism."

Putting it all together was just a question of mathematics, he explains.

At the heart of his programme is a clustering algorithm, which functions like a librarian or clipping service, by searching out, matching and collecting articles based on one's reading interest.

Loosely explained, a clustering algorithm is a mathematical procedure that finds similarities between elements and groups them. It examines articles from different sources, analysing factors such as an article's information, page rank, and timeliness.
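The grouping idea described above can be illustrated with a minimal sketch. This is not Google's algorithm, whose details the report does not give; it assumes a simple word-count comparison of full article text and a greedy grouping rule, and it leaves out page rank and timeliness altogether. The function names, the 0.5 threshold and the sample headlines are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count how often each word appears in the article text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_articles(articles, threshold=0.5):
    """Greedy single-pass grouping (an assumed, simplified technique).

    Each article joins the first cluster whose first member is similar
    enough; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_vector, [article indices])
    for index, text in enumerate(articles):
        vector = term_vector(text)
        for seed, members in clusters:
            if cosine(seed, vector) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((vector, [index]))
    return [members for _, members in clusters]


# Hypothetical sample texts, invented for illustration only.
articles = [
    "Earthquake strikes coastal city, rescue teams respond",
    "Rescue teams respond after earthquake strikes city",
    "Cricket team wins the final match of the series",
]
groups = cluster_articles(articles)
print(groups)
```

With these sample texts, the two earthquake reports share most of their words and fall into one group, while the cricket report shares none and stands alone. A real system would need far more than this, including some way to weigh the source and the age of each story.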

"I wanted to automate the process [of searching for news]," he told rediff.com over the phone from Google's headquarters in Mountain View, California. "I wanted to gather all the articles in one spot."

The programme he created was for his personal use, but it worked so well that he decided to share it with his co-workers.

Emails were sent around, praising its usability and functionality. Even Google's founders Sergey Brin and Larry Page started using it.

Google's executives felt Bharat was on to something. "It was consistent with Google's goal of organising the information you find on the Web...," Bharat says.

What began as his pet project quickly turned into a 'beta' site, which was first linked to the outside world in March, scouring 150 news sources an hour.

This initial attempt was hidden within the layers of the site as Google cautiously tested the waters. Other search engines such as alltheweb.com and altavista.com already had news searches.

In order to compete, Google put five engineers on to the project. After eight months, the programme could crawl through 4,000 news sources in real time every fifteen minutes, posting 100,000 articles daily.

What is unique about the programme is that it scans the full text of the articles, rather than just headlines, allowing it to analyse and group stories according to the complete content.

Every featured story on the front page has a headline linked to its source, a blurb, the time it was last updated, headlines from other sources, and links to news sites with similar stories.

Google unveiled its news portal in September and it received cautious praise from the press.

"Meet Editor Al Gorithm," the headline of Online Journalism Review's report on the site said.

Competitors tried to brush it aside. "News cannot be 100 per cent automated and present a meaningful picture of what is happening in the world," Chris McGill, director of news and information at Yahoo, told The Washington Post.

Some mentioned possible legal issues if the site were to benefit monetarily from someone else's content, and raised the possibility of cutting their own sites off from Google's searches.

Others pointed out bugs on its front page, such as the incorrect categorisation and placement of stories.

Bharat says the site is still in 'beta' mode and that Google analyses all the feedback it receives. He does not say what kind of traffic the site is getting.

Legally, there have been no complaints, he says, adding that the site complies with copyright laws.

He says initially there were misunderstandings about its purpose. "It was not intended to say that editors don't have a role."

"We tap journalists to create our compendium, but that's not the last level. Others will take [this site] and build upon it."

He says that the site's automated article filtering and wide-ranging search are of benefit to the media as well as the public.

"It will broaden people's perspectives. They will be able to understand news from multiple sources [and become] more informed. Searches also avoid bias. Machines do that very well," he says.

Bharat says the site provides exposure to small, local sources that otherwise are not accessed, and culls news from both traditional and non-traditional sources.

For example, if there were to be an earthquake, the portal would find not only reports from news sites, but also articles from sources like the Disaster Relief network. "Though they are not a news site, they might have a valuable perspective to add," he says.

He says while creating the search, he also had in mind people who are not qualified as journalists, but have "something fresh" to contribute.

The search avoids bias by selecting all articles that are determined relevant by the mathematical grouping, "not just what [a human editor would] believe in", he says.

News sites will benefit from this facility because it will give them more exposure, he says, adding that Google gets 150 million unique visitors daily.

Bharat says the site is also good for writers lacking mainstream exposure. "We'll find you and put you up," he says, wondering about the possibility of the portal becoming a syndicate like Reuters or the Associated Press for 'small-time' journalists.

Such commercial applications were not wholly discussed when the product was being developed ["Don't focus on revenue, build a product that people will like," he says]. But the idea of monetising the product will be explored, he says.

Not surprisingly, Bharat has a passion for journalism.

The Bangalorean, who did his Ph D in computer science from Georgia Tech, says he wrote for the campus newspaper while earning a B Tech from IIT-Madras.

A news junkie, Bharat has tinkered with technology and news delivery before, developing an interactive, personalisable newspaper called the Krakatoa Chronicle, using Java in 1994.

He laughs when it is mentioned. "That was my first experience in online news," he says.

The father of one says he has nothing but the utmost respect for those in the media. "In a second life, I would like to be a journalist."
