Azure Data Architect | DBA

Quick Tip: CouchDB-Lucene Function for Indexing Arrays and Attachments

CouchDB-Lucene is a full-text search engine that can index CouchDB documents and their attachments. The example below indexes a database of documents from an imaginary items catalog. Each item is made up of parts, and each part has a part name and a part number. An item document can also carry attachments, such as MS Word documents containing instructions, user manuals, or engineering diagrams. For example, an item might be a cell phone with the following parts: a charger, a spare battery, and a SIM card.
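To make the document shape concrete, here is a rough sketch of what one catalog item might look like. The field names item, components, part_number, part_name and _attachments are the ones the index function in this post reads; the _id, the part numbers, the file name and the attachment stub fields are hypothetical values made up for illustration.

```javascript
// Hypothetical catalog item. Only the field names come from the post;
// every value below is invented for illustration.
const cellPhone = {
  _id: "item-cell-phone",
  item: "cell phone",
  components: [
    { part_number: "P-1001", part_name: "charger" },
    { part_number: "P-1002", part_name: "spare battery" },
    { part_number: "P-1003", part_name: "SIM card" }
  ],
  // Attachments are keyed by file name (assumed stub shape).
  _attachments: {
    "user-manual.doc": { content_type: "application/msword", stub: true }
  }
};

// List the part names the index function would see.
console.log(cellPhone.components.map((c) => c.part_name).join(", "));
```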

The following function is stored in a design document called lucene in a database named bulk. Note that when you store the function in CouchDB, it must be written on one line, because the function is stored as a JSON string and JSON strings cannot contain raw line breaks. So everything from function(doc) through return ret;} should be on a single line. The textarea in the Futon app will wrap the line, but that’s OK. If the function is not entered as a single line, CouchDB will return JSON.parse() errors. The listing below is broken across lines only for readability.

{
"_id": "_design/lucene",
"_rev": "8-6c0b89a237702a44d959db6ddc091fa0",
"fulltext": {
"parts": {
"index": "function(doc) {var ret = new Document();
ret.add(doc.item,{'field':'item', 'store':'yes'});
for(var i in doc.components) {
ret.add(doc.components[i].part_number,{'field':'part_number', 'store':'yes'});
ret.add(doc.components[i].part_name,{'field':'part_name', 'store':'yes'});
}
for(var a in doc._attachments) {
ret.attachment('file', a);
}
return ret;
}"
}}}
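If hand-collapsing the function onto one line is error-prone, one possible approach is to write it as ordinary multi-line JavaScript and let a small script produce the design document. This is only a sketch: the names indexItem, oneLine, designDoc and body are made up here, and the whitespace collapse assumes the function contains no // comments and no string literals with significant whitespace. The _rev field is left out; when updating an existing design document you would need to supply its current revision.

```javascript
// The index function from the post, written across several lines.
// `Document` is supplied by CouchDB-Lucene at index time; this script
// only serializes the function and never calls it.
const indexItem = function (doc) {
  var ret = new Document();
  ret.add(doc.item, { field: "item", store: "yes" });
  for (var i in doc.components) {
    ret.add(doc.components[i].part_number, { field: "part_number", store: "yes" });
    ret.add(doc.components[i].part_name, { field: "part_name", store: "yes" });
  }
  for (var a in doc._attachments) {
    ret.attachment("file", a);
  }
  return ret;
};

// Collapse every run of whitespace (including line breaks) to one space.
const oneLine = indexItem.toString().replace(/\s+/g, " ");

// Build the design document and serialize it as JSON on a single line.
const designDoc = {
  _id: "_design/lucene",
  fulltext: { parts: { index: oneLine } }
};
const body = JSON.stringify(designDoc);
console.log(body);
```

The printed string can then be pasted into Futon or sent to CouchDB as the body of the design document.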

Once you have the function in place and CouchDB-Lucene has built the index, you can begin building query URLs. To search for attachments containing the string “audit,” use a URL similar to the following:

http://127.0.0.1:5984/_fti/local/bulk/_design/lucene/parts?q=file:audit*&force_json=true&include_docs=true

To search for CouchDB documents with part names like “charger,” use a URL similar to the following:

http://127.0.0.1:5984/_fti/local/bulk/_design/lucene/parts?q=part_name:charger*&force_json=true&include_docs=true

To search for items named “cell phone,” use a URL similar to the following:

http://127.0.0.1:5984/_fti/local/bulk/_design/lucene/parts?q=item:"cell phone"&force_json=true&include_docs=true
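Queries that contain spaces or quotes, like the phrase search above, are safer to send URL-encoded. The following is a minimal sketch of a helper that builds these URLs; BASE and buildSearchUrl are hypothetical names, and the host, database and index path are simply the ones used in this post.

```javascript
// Endpoint used in the examples in this post.
const BASE = "http://127.0.0.1:5984/_fti/local/bulk/_design/lucene/parts";

// Build a search URL for a Lucene query string such as 'part_name:charger*'.
// encodeURIComponent escapes spaces, quotes and colons in the query.
function buildSearchUrl(query) {
  return BASE + "?q=" + encodeURIComponent(query) +
    "&force_json=true&include_docs=true";
}

console.log(buildSearchUrl("file:audit*"));
console.log(buildSearchUrl("part_name:charger*"));
console.log(buildSearchUrl('item:"cell phone"'));
```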

All of these URLs return JSON documents which can be parsed with jQuery and presented in HTML as an online catalog.
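As a rough illustration of the presentation step, the sketch below turns a search response into an HTML list. It assumes the response has a rows array and that each row carries the full document under doc when include_docs=true is set; verify that against the JSON your server actually returns. The sample object and the names escapeHtml and renderCatalog are hypothetical.

```javascript
// Escape text before inserting it into HTML.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Render each matching item and its part names as a list entry.
// Assumes result.rows[n].doc holds the CouchDB document (an assumption).
function renderCatalog(result) {
  const items = (result.rows || []).map((row) => {
    const doc = row.doc || {};
    const parts = (doc.components || [])
      .map((c) => escapeHtml(c.part_name))
      .join(", ");
    return "<li><b>" + escapeHtml(doc.item) + "</b>: " + parts + "</li>";
  });
  return "<ul>" + items.join("") + "</ul>";
}

// Hypothetical response, invented for illustration.
const sample = {
  rows: [
    {
      id: "item-cell-phone",
      doc: {
        item: "cell phone",
        components: [
          { part_number: "P-1001", part_name: "charger" },
          { part_number: "P-1002", part_name: "spare battery" }
        ]
      }
    }
  ]
};

console.log(renderCatalog(sample));
```

In a browser, jQuery's $.getJSON could fetch the response from one of the URLs above and pass it to a function like renderCatalog.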
