Search engines get you the results you want 80% of the time. That’s according to a South African startup called Gatfol (pronounced Gat fole, not like the Afrikaans Gatvol), which will soon launch its first product to help you get closer to 100%.
Gatfol may seem like an odd name for a startup, but its US-patented semantic search algorithm certainly isn’t. Rooted in natural language processing, or NLP, the free Gatfol browser plug-in will attempt to optimise your search queries by making them as concise as possible. NLP, if you’re not familiar with the subject, is the field of computer science and linguistics concerned with the interactions between computers and human languages.
What started out as an accounting system that analysed patterns in accounting data has become the basis of the present-day Gatfol technology. For the last nine years, Gatfol’s husband-and-wife team from Pretoria, South Africa, Carl and Sally Greyling, have been working on an algorithm they call Semantic Intelligence Filter Technology, or SIFT, which breaks down long, complicated search queries into concise keywords to mitigate what they call “long query loss”. According to the startup, 20% of our search queries are too long or simply not effective enough, and are therefore inadequately served.
Behind the scenes, thirty Dell computers (bought on the cheap from HSBC when the bank was going through an upgrade cycle) crawl the web 24 hours a day looking for word patterns, guided by Gatfol’s “multi-word equivalence” algorithm.
As an example, if you ask Google for “travel advice from a travel boutique” and click on the Gatfol “g” browser icon, Gatfol will rewrite the query as ‘“learning more” travel company’, which, according to its algorithm, should give you the most relevant results. Notice how it rewrites “travel advice” to “learning more”. It understands context.
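To make the idea of query rewriting concrete, here is a minimal sketch in Python. It is not Gatfol’s algorithm, which the company has not published in detail. It simply assumes a small hand-written table of multi-word equivalents and a list of filler words, both invented for this illustration, and shows how a long query could be shortened by substituting phrases and dropping filler.

```python
# Toy illustration only: the phrase table and stop-word list below are
# invented for this example and are not Gatfol's data or algorithm.
PHRASE_EQUIVALENTS = {
    "travel advice": '"learning more"',
    "travel boutique": "travel company",
}
STOP_WORDS = {"a", "an", "the", "from", "of", "for", "to"}

# Longest phrase (in words) that the table can match.
MAX_PHRASE_WORDS = max(len(p.split()) for p in PHRASE_EQUIVALENTS)


def rewrite_query(query: str) -> str:
    """Replace known multi-word phrases and drop filler words."""
    words = query.lower().split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(MAX_PHRASE_WORDS, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in PHRASE_EQUIVALENTS:
                out.append(PHRASE_EQUIVALENTS[phrase])
                i += n
                break
        else:
            # No phrase matched here: keep the word unless it is filler.
            if words[i] not in STOP_WORDS:
                out.append(words[i])
            i += 1
    return " ".join(out)


print(rewrite_query("travel advice from a travel boutique"))
# prints: "learning more" travel company
```

A real system would presumably learn such equivalences from crawled text rather than a fixed table, but the lookup-and-substitute step would look broadly similar.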
Understanding context is an important part of NLP. Apple’s digital personal assistant for the iPhone 4S, Siri, is probably the best-known commercial implementation of a context-aware NLP product. As more research goes into the field of NLP, new products are coming to the forefront. Carl Greyling says he likes to think of their project as “language talking to data”. Gatfol is only the Greylings’ first public-facing product based on their growing word-relationship database, and it is aimed at piquing the interest of investors.
Gatfol is mainly self-funded, with R5m invested so far. The project also received a European Union EMDA grant in 2006, seed funding and patenting in April 2011, and a further undisclosed amount from two anonymous investors based in New Zealand. The startup is also opening a further investment round, to conclude in June 2012.
NLP is an ambitious field to be in, especially since it is often referred to as AI-complete or AI-hard, implying that fully getting a handle on NLP is as hard as solving the core artificial intelligence problem: making machines reach human-level intelligence. Greyling acknowledges that investors are cautious about investing in NLP projects, saying, “the ‘messiness’ and ‘greyness’ of language analysis present a multitude of possible development paths at any one time – much more so than in the exact sciences. Funding providers that can understand and accept this are few and far between.”
Some of the longest and most in-depth research into the field of NLP was conducted under the “Cognitive Assistant that Learns and Organizes” project, or CALO. CALO is a result of more than 40 years of funding by the US military’s Defense Advanced Research Projects Agency (DARPA) and research by the Artificial Intelligence Center at SRI International (formerly the Stanford Research Institute), as well as six other US universities and institutions. This research became the basis for Siri, the company founded in 2007 that would later be acquired by Apple to become the Siri application on the iPhone 4S.
With this in mind, it’s interesting to observe some of Carl Greyling’s claims. As with CALO, and indeed with most NLP projects, Greyling’s goal is to eventually enable humans to converse naturally with machines. In the short term, however, the Gatfol project aims to improve on the keyword-driven technology of Google, Bing and Yahoo!, which he calls “an extreme disappointment and failure compared to how we have wanted to question computers through natural language since the 1960’s.”
Greyling follows the statement up by saying, “as opposed to common agreement my view is that after decades computers still cannot see our human world – and we cannot talk to them about it.” Hold on a minute. Common agreement? I don’t think so.
Greyling goes on to say that current search is “at caveman-speak three keyword level” and claims that Gatfol’s underlying technology makes it possible to “analyse everyday word relationships at power levels several magnitudes more subtle than that presently available.”
It’s good, however, to see research being conducted in this challenging field, and even better to see actual products coming out of emerging markets. Hopefully the Greylings have what it takes to back up the swagger. The Gatfol concept is intriguing, and I’ve yet to see a free commercial browser plug-in that utilises NLP for optimising search queries. The world needs fewer daily deal sites and more of this type of thing.
According to Carl Greyling, we are one week away from getting a look at a Gatfol prototype, and we’ll be sure to report back once we do.
Future plans to monetise the Gatfol technology include opening up its database to third parties through application programming interfaces (APIs), or licensing the technology for use with other private or corporate relational databases.