The Semantic Web | ZDNet.com

March 28th, 2008

Why kill Google?

Posted by Paul Miller @ 3:56 am

Categories: Semantic Web, Semantic Web People, Semantic Web Companies, W3C

Tags: Google Inc., Semantic Web, Internet, Paul Miller

Technology journalists from the mainstream media appear obsessed with locating some magic bullet with which to topple Google from its dominant position in today’s Web, and use of violent language seems part and parcel of this obsession. Have Larry and Sergey done something to upset them? Did they all have AltaVista stock? Have they been playing too much Halo? Or can they just not handle the fact that a company is doing pretty well in the stock market whilst actually managing to deliver a valuable user experience?

Whatever the reason, ‘Google Killers’ crop up with depressing regularity, even if (allegedly) you need to put words in the mouths of your commentators to find one.

I spoke with Powerset CTO Barney Pell last night, and one of the topics we explored was his company’s billing last year as a ‘Google Killer.’ More on that conversation in a later post, because for now I want to turn to a related item from overnight; Tim Berners-Lee’s latest post to his low-volume blog.

Tim talks about the attention that one of his recent flurry of press interviews has attracted. In this case, he was talking to British broadsheet, The Times, and notes of the article;

“the Times online mis-states that I think ‘Google could be superseded’. Sigh. In an otherwise useful discussion largely about what the Semantic Web is and how it will affect people, a misunderstanding which ended up being the title of the blog.”

He continues,

“The Semantic Web will not supersede the current Web. They will coexist. The techniques for searching and surfing the different aspects will be different but will connect. Text search engines [like Google] don’t have to go out of fashion.”
(my emphasis)

Noting the speed with which news stories such as that from The Times spread into the blogosphere, Tim comments on the difficulty that he is experiencing in getting the paper to correct its misrepresentation of his words and uses this as a trail into a wider consideration of data re-use online.

This (the ability to combine, recombine, use, reuse, link, link, and link again), he would appear to suggest, is the Semantic Web’s (forgive my slip into the language of violence) ‘Killer App.’

“The benefit of the Semantic Web is that data may be re-used in ways unexpected by the original publisher. That is the value added. So when a Semantic Web start-up either feeds data to others who reuse it in interesting ways, or itself uses data produced by others, then we start to see the value of each bit increased through the network effect.”

Bravo. I couldn’t agree more.

I must also admit, though, to being surprised at the extent to which too many ‘Semantic Web’ companies appear not to get this. Too many of those I speak with are proudly, happily, and expensively building yet another data silo. Semantics may run through their applications, and World Wide Web Consortium (W3C) specifications may even get a mention. But the essential primacy of linkable reuse outside the carefully managed boundaries of their application is greeted - at best - with carefully spun hedging and - at worst - with outright horror. “Why would some poor misguided user want to do anything outside the Nirvana that my application gives them? Are you mad?”

Tim gives due credit to the great work going on in the Linked Open Data project, and trails the associated workshop (at which I’ll be speaking, along with several of my Talis colleagues) at the Web Conference in Beijing next month.

I, for one, want to see more of the commercial Semantic Web startups embracing a lot more of those ideas. Linked Open Data as a university research project is one thing. Linked Open Data at the heart of a business model is something else entirely, and it appears to be something that either the investors or the startups are not yet taking seriously enough. This is the promise of the Semantic Web; Linking. If the Semantic Web only results in yet another generation of silos then what’s the point? It’s probably easier to build a silo using MySQL and some PHP. The investment in enhancing the linkability, the citeability, of a data resource can only be realised once third parties can link, and can cite.

Rant over, for now. But I really do want to hear from those who get it. And, like Tim;

“So in scanning new Semantic Web news, I’ll be looking out for re-use of data. The momentum around Linked Open Data is great and exciting — let us also make sure we make good use of the data.”
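The kind of re-use Tim describes can be illustrated with a very small sketch. It is not taken from his post or from any Linked Open Data tool: the triples, the predicate names and the example.org URIs are all invented for illustration, and plain Python sets stand in for a real RDF store. The point is only that two independently published data sets answer a question that neither can answer alone, because both use the same URI for the same thing.

```python
# Triples are (subject, predicate, object).
# All URIs and predicate names below are hypothetical examples.
ALICE = "http://example.org/person/alice"
ACME = "http://example.org/org/acme"

# Data published by one party: who works where.
publisher_a = {
    (ALICE, "name", "Alice"),
    (ALICE, "worksFor", ACME),
}

# Data published independently by someone else: facts about organisations.
publisher_b = {
    (ACME, "name", "Acme Ltd"),
    (ACME, "basedIn", "Birmingham"),
}


def describe(graph, subject):
    """Collect every predicate/object pair recorded for one subject."""
    return {p: o for s, p, o in graph if s == subject}


def employer_location(graph, person):
    """Follow the worksFor link, then look up where that employer is based."""
    employer = describe(graph, person).get("worksFor")
    return describe(graph, employer).get("basedIn")


# Merging is just a set union, because both publishers share the ACME URI.
merged = publisher_a | publisher_b

alone = employer_location(publisher_a, ALICE)   # publisher A cannot answer
location = employer_location(merged, ALICE)     # the merged data can
```

A silo, in these terms, is a data set whose identifiers nobody else can point at, so the union never becomes possible.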

March 26th, 2008

illumin8-ing improvements for knowledge workers?

Posted by Paul Miller @ 6:54 am

Categories: Semantic Web

Tags: Journal, Elsevier, Illumin8, Productivity, Product Development, Research & Development, Business Operations, Paul Miller

The printing press was a pretty pivotal invention, challenging artificial limitations on the dissemination of ideas maintained by scriptoria and opening flood gates to the vibrant philosophical, social and technological innovations of the Renaissance, Reformation, and beyond.

From those early innovations, we entered a long period in which more and more of the advances in thought and practice were reported via papers printed in scholarly or professional journals; journals that were, for most people, too specialised and expensive to be accessed anywhere other than in a library. The number of journals grew, and various tools emerged to help us find the papers we needed. These tools tended to divide along subject or publisher lines, forcing the searcher to have prior knowledge of those journals (and their publisher) most likely to be of use. Various attempts were made to offer solutions capable of searching across more than one of these databases, but these were usually hampered by an unwillingness from the publishers to share sufficient data to drive any really useful searches. We only need to glance at the rather daunting lists of resources maintained by a University Library to see how far from ideal the current model is, with its emphasis upon the container (the journal) rather than the content (the article).

Having spent much of my own time at University finding excuses to avoid dealing with the wilfully (well, so it seemed!) obtuse way in which e-resources were carved up, I was of course interested when offered the opportunity to learn more about a ‘better way.’

Rafael Sidi, VP Product Development in the engineering and technology division at scientific publisher Elsevier, and Jens Tellefsen, VP Marketing & Product Strategy at semantic indexing company NetBase, spent some time on the phone, introducing me to their new joint venture; illumin8.

March 25th, 2008

Semantic Web Gang forms, debates Semantic Web ‘readiness’

Posted by Paul Miller @ 2:43 pm

Categories: Podcasts, Semantic Web, Semantic Web Gang

Tags: Semantic Web, Internet, Paul Miller

With the increasing cacophony from ‘semantic’ players in the technology space, it can be extremely difficult to work out what’s important, to identify the trends, and to make informed decisions about how any of this affects you and your business. As part of our contribution to bringing some clarity to the proceedings, I am delighted to announce the first (virtual) meeting of the new Semantic Web Gang. We’ll be recording a show each month, gathering the regular Gang and the occasional special guest to talk about the issues of the day, and taking a step back in order to consider the news in light of broader trends.

The first tranche of Gang members comprise;

I shall be adding additional members to this pool of regulars over the next few episodes, expanding the range of experience and insight represented here even further.

In this first meeting of the Gang, we talked about the current perception that the Semantic Web is ready for mainstream adoption, drawing upon recent statements from Sir Tim Berners-Lee, the announcement from Yahoo! of support for a number of Semantic Web specifications, and the SemanticHacker challenge that TextWise announced the day before our call.

We will be talking on the third Thursday of each month, so episode 2 will be recorded on April 17. I wonder what newsworthy items will come to our attention between now and then?

The audio for our conversation is available here, along with pointers to some of the resources mentioned during the call.

Update: ReadWriteTalk syndicates the Semantic Web Gang.

March 20th, 2008

Jim Hendler shares AI’s lessons for the Semantic Web

Posted by Paul Miller @ 2:34 am

Categories: Podcasts, Research, Standards, Semantic Web, Semantic Web People, W3C, Talking Semantics

Tags: Web, Vision, Hendler, Semantic Web, Internet, Paul Miller

Professor James A. Hendler goes by the daunting title of ‘Tetherless World Senior Constellation Professor’ at Rensselaer Polytechnic Institute (RPI) in Troy, New York. Behind the title stands a man who has been closely involved with Artificial Intelligence (AI) research for many years, and someone recognised as amongst the progenitors of the Semantic Web ideal. Hendler is also Associate Director of the Web Science Research Initiative (WSRI), an activity that is being pushed hard by Sir Tim Berners-Lee (a Director) and others.

I spoke to Jim recently, and in a wide-ranging conversation we touched upon early hype around the promise of Artificial Intelligence, conflicting aspirations for the Semantic Web…

March 19th, 2008

TextWise offers $1 million for an American semantic hack

Posted by Paul Miller @ 6:31 am

Categories: Commercialisation, Semantic Web, Semantic Web Companies

Tags: Concept, API, Paul Miller

Erick Schonfeld at TechCrunch draws my attention to SemanticHacker. The site details an invitation from Rochester, NY-based TextWise to suggest compelling applications powered by their API, in return for a guaranteed payment of $100,000 and up to $900,000 in revenue from subsequent commercialisation of the winning idea.

The challenge starts today, and runs until 18 June 2008. Entrants must be based in the United States, which rather unfortunately excludes the Semantic Web research powerhouses in Europe and Asia.

Quoting from the site;

“What will make you a winner in the SemanticHacker Innovators’ Challenge?

Develop a software prototype, business plan or both that will have demonstrable commercial viability and the potential for significant financial impact on the application space to which it is applied.

Focus your submission on a vertical market. Areas such as finance, health and pharmaceuticals are just a few of the industries that might be a good place to start.”

At the heart of TextWise’s technological offer is an API they describe as “the world’s first open API for Semantic Discovery.” That strikes me as an assertion that probably needs to be ringed with caveats and footnotes if it’s to stand up to closer scrutiny. This API enables developers to draw upon SemanticSignatures, defined as;

“a representation of ALL concepts covered in a block of text. Each block of text contains semantic dimensions (“concepts”) with associated weights. The dimensions capture the strength of each concept in the text.

Semantic Signatures provide a weighted representation of the concepts contained in a piece of text. The weight of each concept represents the strength of that concept in the text. The Semantic Signatures for two pieces of text that both address the same subject will share many common concepts with high weights. Our technology can therefore recognize that these two pieces of text are related even though they share no common keywords.”
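To make the quoted description a little more concrete, here is a minimal sketch of how two weighted concept vectors might be compared. It makes no claim about how TextWise actually works and does not call their API: the concept names and weights are invented, and cosine similarity is simply one common way of comparing weighted vectors, chosen here as an assumption.

```python
import math


def cosine_similarity(sig_a, sig_b):
    """Cosine similarity between two {concept: weight} dictionaries."""
    shared = set(sig_a) & set(sig_b)
    dot = sum(sig_a[c] * sig_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in sig_a.values()))
    norm_b = math.sqrt(sum(w * w for w in sig_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical signatures. Imagine the first text only ever says "car" and
# the second only ever says "automobile": no shared keywords, but an
# overlapping set of highly weighted concepts.
doc_car = {"road transport": 0.8, "engines": 0.6, "commuting": 0.3}
doc_automobile = {"road transport": 0.7, "engines": 0.5, "manufacturing": 0.4}
doc_cooking = {"food preparation": 0.9, "nutrition": 0.4}

related = cosine_similarity(doc_car, doc_automobile)
unrelated = cosine_similarity(doc_car, doc_cooking)
```

In this sketch the two motoring texts score close to 1 and the cooking text scores 0 against either, which is the behaviour the quoted passage describes. The hard part, of course, is assigning the concepts and weights in the first place, and that is the part the code above simply assumes.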

It will be interesting to see what sort of entries this competition attracts, and the fact that TextWise have taken this course speaks volumes to the increasingly crowded semantic text analysis market. There’s a lot of consolidation to come in this market segment, and the current players are working hard to draw attention to themselves. I, for one, would welcome some more effort devoted to explaining why they’re different.

March 18th, 2008

hakia licenses OntoSem technology to third parties

Posted by Paul Miller @ 12:23 am

Categories: Commercialisation, Semantic Web, Semantic Web Companies

Tags: Lexicon, Ontology, OntoSem, RiverGlass Inc., Hakia, Semantic Web, Strategy, Internet, Management, Paul Miller

New York-based semantic search company hakia will today use the Search Engine Strategies Conference to announce that their Ontological Semantic technology, OntoSem, is available for licensing. Illinois-based RiverGlass, Inc. is the first licensee, and will work to enhance their existing real-time analytics solutions with OntoSem.

I spoke to hakia Chief Scientific Officer, Dr. Christian F. Hempelmann, and Vice President of Search, Tim McGuinness, last night to learn more.

March 17th, 2008

Semantic Web sets conference data free?

Posted by Paul Miller @ 6:47 am

Categories: Podcasts, Standards, Semantic Web, Talking Semantics

Tags: Conference, Semantic Web, Internet, Paul Miller

Conferences can be useful for bringing a group of ‘interesting’ individuals together in one place for a few days, and giving them time and space to focus on a particular set of issues without the usual distractions of the working day. Blackberries, iPhones, and free venue wi-fi make the distractions a lot less removed than previously, but the conference remains a useful forum nevertheless.

However, many of the most useful elements - the people, the papers, the discussion - remain rigidly fixed in their silos and difficult to carry forward into our lives post-event. The pile of business cards waits, dustily, to be entered into address books or looked up on LinkedIn. The published papers languish on a CD in the bottom of a drawer, accumulate yet more dust as the printed tome that you misguidedly ditched child-appeasing schwag in order to create suitcase room to bring home, or hide away on some conference web site with yet another forgettable username and password. The most interesting sessions were circled on the paper programme, which you left in your hotel room to make room for those Google ice cubes you just know will go down well at home, and the chance encounters, bar room meetings of minds, and random eureka moments compete with jet lag and a groaning inbox upon your return home, doubtless consigned to be forgotten all too soon.

And with washing washed, ironing ironed, and children tested to see if they remember you, you head for the airport to repeat the exercise once more… doubtless having failed to spot or act upon any relationships between one event and the next.

I’m not for one minute suggesting that it solves all of these problems, but I had an interesting conversation with an old friend, Eric Miller of Zepheira, on Friday.

In our recorded conversation, we were talking about some work that Zepheira have been involved with to enhance the conference experience of delegates at one event in May. Demonstrating Sir Tim Berners-Lee’s assertion that the main building blocks for the Semantic Web really are in place, the Zepheira project uses a range of Open Source components and Semantic Web specifications to assemble a conference experience intended to be more personal, more interactive, and more lasting. By exposing information from the conference systems, opportunities are also created to share with vertical applications such as Dopplr and LinkedIn, or even with other (competing?) events. None of these have to buy a proprietary system, or implement odd proprietary formats. All they have to do is conform to widely recognised Semantic Web specifications, and some already do this to a degree.

Despite the hype, not everything about the Semantic Web has to be paradigm shifting and revolutionary. Many of the benefits will simply come as existing systems become a little more open, and as existing data moves a little more freely and a little more purposefully. The vision Eric paints in our conversation is a perfect example.

As for whether it works or not… well, I’ll tell you in May. After the conference. If I don’t leave my notes in the hotel in order to squeeze some great piece of conference schwag into my bag.

Disclosure: Talis are supporting this year’s Semantic Technology conference.

March 17th, 2008

Looking for a dominant Semantic Web search engine

Posted by Paul Miller @ 4:27 am

Categories: Podcasts, Research, Standards, Semantic Web

Tags: Search Engine, Semantic Web, Search, Internet, Paul Miller

Despite the continuing efforts of Microsoft, Yahoo! and others, Google remains the dominant horizontal search engine for most people, most of the time. In the United States, comScore reports 58.5% of searches during January were via a Google property. In the Semantic Web space, search is far less established and a number of much smaller sites offer their own solutions to the problem of locating appropriate semantic content from across the open web. Whether those sites are complementary or competitive seems to depend upon one’s perspective, and it is also interesting to ponder the extent to which Yahoo!’s recent announcement is an attempt to position themselves as the search engine of choice for the growing web of semantically enriched content.

March 14th, 2008

Commercial uses of the Semantic Web at WWW2008?

Posted by Paul Miller @ 6:34 am

Categories: Semantic Web, Semantic Web Companies, W3C

Tags: Conference, Semantic Web, Internet, Paul Miller

This year’s World Wide Web conference (WWW2008) is rapidly approaching, and all over the planet web researchers are grappling with the Chinese visa application process ahead of their trip to Beijing.

In contrast to a corporate event like Semantic Technology, the World Wide Web conferences tend to be geared more toward the research community, and .edu, .ac.uk and similar addresses definitely outweigh the .com’s on delegate business cards.

Corporate delegates certainly do attend, though, and as Semantic Web applications become increasingly relevant to the enterprise I would expect reasonable commercial representation in the Semantic Web tracks at the conference.

Talking this over with Danny Ayers and the Chair of the conference Developer Track, Google’s Jeffrey Korn, we wondered if there might be scope for a panel specifically looking at commercial uses of Semantic Web ideas and technology. I would certainly be interested in seeing that fusion of business thinking and cutting edge academic research.

So if you work in a .com that’s doing interesting stuff with the Semantic Web, plan to be in Beijing, and would be interested in sitting on a panel like that, please do get in touch. I’ll pass the names to Jeff, and he can then see if the idea is viable.

March 13th, 2008

Yahoo embraces the Semantic Web?

Posted by Paul Miller @ 9:40 am

Categories: Standards, Semantic Web

Tags: Yahoo! Inc., LinkedIn, Semantic Web, Internet, Paul Miller

Mike Arrington broke a story over at TechCrunch this morning, suggesting that Yahoo! are about to extend their Open Search Platform by embracing a number of Semantic Web specifications. A post on the Yahoo! Search blog confirms this.

Quoting from the Yahoo! blog post;

“While there has been remarkable progress made toward understanding the semantics of web content, the benefits of a data web have not reached the mainstream consumer. Without a killer semantic web app for consumers, site owners have been reluctant to support standards like RDF, or even microformats. We believe that app can be web search.

By supporting semantic web standards, Yahoo! Search and site owners can bring a far richer and more useful search experience to consumers. For example, by marking up its profile pages with microformats, LinkedIn can allow Yahoo! Search and others to understand the semantic content and the relationships of the many components of its site. With a richer understanding of LinkedIn’s structured data included in our index, we will be able to present users with more compelling and useful search results for their site. The benefit to LinkedIn is, of course, increased traffic quality and quantity from sites like Yahoo! Search that utilize its structured data.”
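As a rough illustration of what ‘marking up profile pages with microformats’ makes possible, the sketch below pulls a few hCard properties out of a fragment of HTML using only Python’s standard library. The HTML is invented and is not LinkedIn’s actual markup, the parser handles only three hCard class names (fn, title, org), and nothing here reflects how Yahoo! Search processes pages; it is a toy under those assumptions.

```python
from html.parser import HTMLParser

# A small, illustrative subset of hCard property class names.
HCARD_PROPERTIES = {"fn", "org", "title"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class names an hCard property."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._current = name

    def handle_data(self, data):
        # Store the first non-empty text that follows a matching start tag.
        if self._current and data.strip():
            self.properties[self._current] = data.strip()
            self._current = None


# Hypothetical profile fragment, invented for this example.
PROFILE_HTML = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="title">Research Engineer</span> at
  <span class="org">Example Corp</span>
</div>
"""

parser = HCardParser()
parser.feed(PROFILE_HTML)
profile = parser.properties
```

To a human reader the page still says ‘Jane Example, Research Engineer at Example Corp’. The class attributes simply tell a crawler which piece of text is the name, which is the job title and which is the employer.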

With this and Fire Eagle’s release last week, Yahoo! are clearly recognising the value that additional structure can bring to the search experience.

The tools to create and embed that structure need to follow, of course. And issues that efforts like Dublin Core struggled with over a decade ago need to be thrashed out in some more detail, as the malicious, the malevolent, the careless and the mischievous rush to ‘game’ the rich structured data with which their web pages will soon be filled.

This move from Yahoo! would appear to offer a significant step forward, but I’d like to know more before being too effusive.

At Talis, Paul Miller is active in raising awareness of new trends and possibilities arising from wider adoption of the Semantic Web. See his full profile and disclosure of his industry affiliations.