Natural Language Search - Playground...
The concept of keyword search - as we all do with Google today - works well for finding documents that contain the facts we are looking for. Personally, I have got better over the many years of using Google, e.g. at picking keywords to find things I don't even know a name for. But the idea of getting facts as results, rather than a list of documents (= web pages) that might contain the fact, is gaining ground. A new breed of start-ups is trying to extract facts from the web and let the user pose queries as natural language questions (like asking a friend for something).
I'm talking about two things here:
- 1st: fact extraction from information (semantic analysis)
- 2nd: ability to understand a natural language question
Questions are written in English - supporting multiple languages is probably the hardest part of natural language search (at least if the engine is supposed to really understand the question rather than use a statistical approach). None of the engines I tested understands anything other than English.
The response is either a straight fact or links to documents that contain the fact. The guys at true knowledge are building an engine that analyses the question and provides a straight answer - or a partial answer with the option to teach the site the missing facts. The response includes the detailed reasoning, so you can verify that the engine understood the question correctly.
- My question: Who is president of Switzerland?
- Computed question by true knowledge: Who is the president (head of a nation state) of switzerland, the country in Western Europe at the current time?
- Answer by true knowledge: Pascal Couchepin (born 1942), the Swiss Federal Councilor

Asking Google the same question also gives a straight answer (I was surprised), including links to the sources that back up the fact.
- Answer by Google: Switzerland — President: Micheline CALMY-REY
The team in the labs at Powerset uses Wikipedia content as the primary source for facts. Results are Wikipedia articles that contain the fact (with clever hit highlighting).
- Answer by Powerset: 1st result is link to http://en.wikipedia.org/wiki/President_of_the_Swiss_Confederation
Hakia serves semantically analysed results from the ask.com index. Same question:
- Answer by Hakia: Possible answer: Jean-Marie Musy was a lawyer, Swiss Federal Councilor and was twice elected president of Switzerland.
Funny detail - asking ask.com provides:
- Answer from ask.com: The Chief of State of Switzerland is President Samuel Schmid, who is also Head of State
This overview shows the 2nd hardest problem (on my list) in semantic extraction - facts that change over time. How do you figure out whether a fact is still true, how long it has been true, or what the date of the source is? The web has very weak infrastructure for providing this information.
Google and ask.com linked the correct sources - and most of them are up to date - so it looks like the engines are not re-indexing and re-analysing them that often. The true knowledge guys take extra care here - facts are always put in a time context (including dependencies on inherited facts). So they are able to distinguish between the 1st question and this one:
- My question: Who was president of Switzerland in 2006?
- Answer by true knowledge: If there are any answers, I couldn't find any.
- Answer by true knowledge: Micheline Calmy-Rey (born 1945), the Swiss politician
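To make the "facts in time context" idea a bit more concrete, here is a minimal sketch of how a fact could carry a validity period, so that "who is president" and "who was president in year X" become different lookups. This is only my own illustration, not how true knowledge actually implements it: the `Fact` class, the `lookup` function and the field names are all hypothetical, and the sample data just encodes the one-year terms of the two presidents mentioned above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical time-scoped fact: (subject, relation, value) plus a validity period."""
    subject: str
    relation: str
    value: str
    valid_from: date
    valid_to: Optional[date]  # None means "still true as far as we know"


def lookup(facts: List[Fact], subject: str, relation: str, on: date) -> List[str]:
    """Return every value of (subject, relation) that was valid on the given day."""
    return [
        f.value
        for f in facts
        if f.subject == subject
        and f.relation == relation
        and f.valid_from <= on
        and (f.valid_to is None or on <= f.valid_to)
    ]


# Illustrative sample data (assumption: each presidency covers one calendar year).
FACTS = [
    Fact("Switzerland", "president", "Micheline Calmy-Rey",
         date(2007, 1, 1), date(2007, 12, 31)),
    Fact("Switzerland", "president", "Pascal Couchepin",
         date(2008, 1, 1), date(2008, 12, 31)),
]

# The same question asked for two different points in time.
print(lookup(FACTS, "Switzerland", "president", date(2008, 6, 1)))
print(lookup(FACTS, "Switzerland", "president", date(2006, 6, 1)))
```

With this layout an engine that has no fact covering the requested date returns an empty list instead of a stale answer, which is roughly the difference between "I couldn't find any" and serving last year's president.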
