How does the search work?

Patent documents received from some jurisdictions must first be converted into searchable text by Optical Character Recognition (OCR) at CAMBIA. All documents from the various jurisdictions are then marked up in XML so that they can be searched simultaneously and in a common format. Initially, a fast full-text search engine, Dekko, was designed in-house at CAMBIA by Greg Quinn. It offered options for ranking search results by date, document number, or relevance (determined by the weighted frequency and proximity of occurrence of the search terms in each document). More recently, full-text search technology based on open-source libraries such as Lucene has been introduced.
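
To make the ranking options more concrete, here is a minimal Python sketch of sorting results by date, document number, or a relevance score that combines term frequency with term proximity. It is not Dekko's or Lucene's actual algorithm, which is not described here. The function names, the record fields (`number`, `date`, `text`), the weights, and the scoring formula are all assumptions chosen for illustration.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(text, terms, freq_weight=1.0, prox_weight=1.0):
    """Hypothetical score: weighted term frequency plus weighted proximity."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    # Positions at which each search term occurs in the document.
    positions = {t: [i for i, tok in enumerate(tokens) if tok == t] for t in terms}
    # Frequency component: share of tokens that match any search term.
    freq = sum(len(p) for p in positions.values()) / len(tokens)
    # Proximity component: only counted when every term is present.
    # It is based on the smallest window containing one occurrence of each term.
    # The brute-force product is fine for a sketch but too slow for long documents.
    prox = 0.0
    if len(terms) > 1 and all(positions.values()):
        span = min(max(combo) - min(combo) for combo in product(*positions.values()))
        prox = 1.0 / (1 + span)
    return freq_weight * freq + prox_weight * prox


def rank(docs, query, by="relevance"):
    """Order documents by 'relevance', 'date', or 'number' (assumed field names)."""
    if by == "relevance":
        terms = tokenize(query)
        return sorted(docs, key=lambda d: relevance(d["text"], terms), reverse=True)
    return sorted(docs, key=lambda d: d[by])


# Made-up sample records; dates are ISO strings so they sort chronologically.
DOCS = [
    {"number": "US1", "date": "2001-05-01",
     "text": "A method for gene transfer into plant cells"},
    {"number": "US2", "date": "1999-03-15",
     "text": "Gene expression is measured before heat transfer"},
    {"number": "US3", "date": "2003-01-10",
     "text": "A gene sequence"},
]

if __name__ == "__main__":
    for mode in ("relevance", "date", "number"):
        print(mode, [d["number"] for d in rank(DOCS, "gene transfer", by=mode)])
```

In this sketch a document in which the search terms appear next to each other scores higher than one in which they are far apart, even when both contain the same number of matches.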
