This post represents the first steps towards creating a web-enabled document repository. In other words, this is the initial test of a proof-of-concept. The goal of this initial step is to create a record in CouchDB and attach a document to it. For me, being new to NoSQL, this started out much harder than I thought it should be. With any luck, I can clear up some of the confusion you might experience along the way.
Why CouchDB?
First off, I chose CouchDB because it’s free and relatively well-documented. It has a light footprint, so I can run it in a 64-bit Fedora 20 VM with just 1GB of RAM. Also, CouchDB has almost no limits on the size of attachments. Plus, CouchDB comes with its own web server. Finally, according to my research, CouchDB is highly scalable, which can be a factor when a project starts off storing a few gigabytes of documents and ends up storing terabytes and more.
Coming into NoSQL with a long history of experience with RDBMS can be confusing. I couldn’t just insert data. Views aren’t created by select statements. While I won’t address these issues immediately, they do represent the leap in understanding required to relax with CouchDB. That said, the CouchDB admin console, known as Futon, makes the transition much easier. Additionally, there are new tools such as Kan.so that help kickstart the experience.
CouchApps are essentially attachments to CouchDB records that are served up by the CouchDB web server. There’s a certain elegance to a database that comes with its own web server. This packaged approach simplifies development because you no longer need to specify datasources or have libraries or drivers to connect your web application to your database. With CouchDB, your database is your web server.
Getting Started
Briefly, in Fedora, get started by using sudo yum install couchdb. When that’s complete, run sudo couchdb. It’s just that easy. Point your browser to http://127.0.0.1:5984/_utils/ and relax on the “futon.”
When you first start out with CouchDB, it is an Admin Party. The Admin Party means everyone is an admin and can do anything to any database. For now, let’s just party like we’re admins. Also, eventually, you will want to access CouchDB from an external address. On Fedora, you’ll need to use a semicolon (;) to comment out the bind_address in the /etc/couchdb/local.ini file. For now, these are details just beyond the scope of this document.
For this example, I created a database called test and a document called test1. Then I attached the couchdbForm.html file (download below) to the test1 document and saved it. After that, all I had to do was click the attachment link.
Start Coding
This post is focused on the couchdbForm.html attachment. The document starts off with the HTML5 doctype and the basic CouchDB JavaScript includes. Again, the beauty of including the web server as part of the database is that you can reuse the native database code (JavaScript) in your web pages. And this isn’t just any JavaScript: the CouchDB scripts are built on jQuery, so everything jQuery offers is available to the page as well.
<!DOCTYPE html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
<script src="/_utils/script/json2.js"></script>
<script src="/_utils/script/jquery.js?1.2.6"></script>
<script src="/_utils/script/jquery.couch.js?0.8.0"></script>
<script src="/_utils/script/jquery.form.js?0.9.0"></script>
Next, there are two JavaScript functions that convert the form to JSON, create a new CouchDB document with key-value pairs and then attach a file.
The first function converts the HTML form into a JavaScript object in which the field name is the key and the field value is the value. So, a field named FirstName with a value of MARK converts to the following once it is written out as JSON: {"FirstName":"MARK"}
$.fn.serializeObject = function() {
    var o = {};
    var a = this.serializeArray();
    $.each(a, function() {
        if (o[this.name] !== undefined) {
            if (!o[this.name].push) {
                o[this.name] = [o[this.name]];
            }
            o[this.name].push(this.value || '');
        } else {
            o[this.name] = this.value || '';
        }
    });
    return o;
};
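The serializeObject function is fairly straightforward. The function takes this, which happens to be a form, and serializes it into an array of name/value pairs. Each pair is then copied into an object keyed by field name. If a field name appears more than once, its values are collected into an array. The object is turned into a JSON string later, by JSON.stringify().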
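To see the folding logic without a browser, here is a minimal sketch in plain JavaScript. It assumes the input has the same shape that jQuery's serializeArray() returns, an array of {name, value} pairs. The function name foldFields and the sample field names are my own, not part of jQuery or CouchDB.

```javascript
// Hypothetical helper: folds an array of {name, value} pairs
// (the shape returned by jQuery's serializeArray()) into one object.
function foldFields(pairs) {
  var o = {};
  pairs.forEach(function (field) {
    var value = field.value || '';
    if (o[field.name] !== undefined) {
      // Repeated field name: promote the existing value to an array.
      if (!Array.isArray(o[field.name])) {
        o[field.name] = [o[field.name]];
      }
      o[field.name].push(value);
    } else {
      o[field.name] = value;
    }
  });
  return o;
}

var folded = foldFields([
  { name: 'FirstName', value: 'MARK' },
  { name: 'Tag', value: 'draft' },
  { name: 'Tag', value: 'scan' }
]);
console.log(JSON.stringify(folded));
// prints {"FirstName":"MARK","Tag":["draft","scan"]}
```

A field that appears once stays a plain string, while a repeated field (checkboxes, for example) becomes an array.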
The next function can be broken down into three tasks. First, $.couch.db("test").saveDoc() creates a document in the database and returns the document id (_id) and revision (_rev). CouchDB exposes a RESTful HTTP API, so saveDoc() sends the document as JSON in an Ajax request rather than through an HTML form’s standard POST or GET submission. As far as I can tell, jquery.couch.js uses POST when the document has no _id yet and PUT once it has one. For this initial saveDoc(), an empty document is passed in as empty braces {}. If you have a SQL background, this is similar to fetching a primary key value and inserting it into a table to create a record which will then be populated with data.
NOTES: If you are more familiar with SQL than NoSQL, some of this will seem foreign. In a document database such as CouchDB, a “document” is like a record in a SQL database. The fields inside a document are similar to columns inside a table in a SQL database.
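A rough sketch of the HTTP requests behind the two saves may help here. It assumes CouchDB's document API, where a document without an _id is created with POST to the database URL and a document with a known _id is written with PUT to its own URL. The helper name describeSave and the sample id and revision values are made up for illustration. It only describes the request and does not send anything.

```javascript
// Hypothetical helper: describes the HTTP request for saving a document.
// Assumes CouchDB's document API: POST /db for a new document without an _id,
// PUT /db/docid for a document whose _id is already known.
function describeSave(dbName, doc) {
  var dbUrl = '/' + encodeURIComponent(dbName);
  if (doc._id === undefined) {
    return { method: 'POST', url: dbUrl, body: JSON.stringify(doc) };
  }
  return {
    method: 'PUT',
    url: dbUrl + '/' + encodeURIComponent(doc._id),
    body: JSON.stringify(doc)
  };
}

// First save: empty document, so CouchDB generates the _id.
var first = describeSave('test', {});
// Second save: same document, now carrying an _id and _rev (sample values).
var second = describeSave('test', { _id: 'abc123', _rev: '1-xyz', FirstName: 'MARK' });

console.log(first.method, first.url);   // prints POST /test
console.log(second.method, second.url); // prints PUT /test/abc123
```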
Before the data is added to the document, the following jQuery updates two form fields with the _id and _rev so that the next saveDoc() function adds the data to the correct document.
$('#_id').val(data.id);
$('#_rev').val(data.rev);
NOTES: In CouchDB, each record has a unique identifier which is like a Primary Key in SQL. While there are no Foreign Keys, this _id field ensures that the correct document is updated. CouchDB uses the _rev field to ensure consistency. Therefore, when you update a document (the equivalent of a row) in CouchDB, you must pass in the _id and _rev. If the _rev doesn’t match the current revision of the document, the update fails with a conflict error.
Once the form is populated with the _id and _rev, the following line uses the serializeObject() function together with JSON.stringify() to convert the entire form into a JSON string which can be passed to CouchDB as the data for the document.
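The _rev check is easier to picture with a toy model. The sketch below is an in-memory stand-in that I made up for illustration. It is not how CouchDB is implemented, and its revisions are plain counters ("1", "2") rather than CouchDB's real revision strings. It only mimics the rule that an update carrying a stale _rev is rejected, which CouchDB reports as HTTP 409 Conflict.

```javascript
// Toy in-memory model of the _rev check. This is NOT CouchDB's implementation;
// revisions here are simple counters stored as strings.
function ToyStore() {
  this.docs = {};
}

ToyStore.prototype.save = function (doc) {
  var current = this.docs[doc._id];
  if (current && current._rev !== doc._rev) {
    // A stale or missing _rev on an existing document is a conflict.
    return { ok: false, status: 409, error: 'conflict' };
  }
  var nextRev = String(current ? Number(current._rev) + 1 : 1);
  this.docs[doc._id] = Object.assign({}, doc, { _rev: nextRev });
  return { ok: true, id: doc._id, rev: nextRev };
};

var store = new ToyStore();
var created = store.save({ _id: 'test1' });                          // new document
var updated = store.save({ _id: 'test1', _rev: created.rev, n: 1 }); // current _rev
var stale = store.save({ _id: 'test1', _rev: created.rev, n: 2 });   // old _rev again

console.log(created.rev, updated.rev, stale.status);
// prints 1 2 409
```

This is why the form fields are refreshed with the returned _id and _rev before the next save: the second request has to carry the revision that the first one produced.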
var doc=JSON.stringify($('form').serializeObject());
Next, the second $.couch.db("test").saveDoc() updates the document using the JSON string. From a SQL perspective, this step is similar to updating a table row. The _id and _rev are included in the JSON string to identify which document should be updated.
Finally, adding the attachment uses jQuery’s ajaxSubmit() to POST the document stream to the server.
NOTES: SQL users may find the insert and update process easier to follow if they think of records as documents in the everyday sense. To update a document, it must be opened, changed and then closed. In programming, there are usually file handles to help with this. In CouchDB, the process is wrapped inside the saveDoc() function. When inserting a record, the first saveDoc() fetches the file handle and the second saveDoc() makes the changes. The main difference in CouchDB is that there is no explicit file close.
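The three tasks can be sketched as nested callbacks. This is only an outline of the flow described above, not the code from the downloadable file. The db argument is assumed to be anything with a saveDoc(doc, {success: ...}) method, such as $.couch.db("test"). The names createThenUpdate, formToObject, attachFile and fakeDb are hypothetical, and fakeDb exists only so the sketch can run outside a browser.

```javascript
// Sketch of the nested flow. `db` is anything with a saveDoc(doc, options)
// method that calls options.success({ id, rev }), such as $.couch.db("test").
// `formToObject` and `attachFile` are hypothetical callbacks supplied by the page.
function createThenUpdate(db, formToObject, attachFile) {
  // Step 1: create an empty document to obtain an _id and _rev.
  db.saveDoc({}, {
    success: function (data) {
      // Step 2: build the document from the form, including _id and _rev.
      var doc = formToObject(data.id, data.rev);
      db.saveDoc(doc, {
        success: function (updated) {
          // Step 3: upload the attachment against the latest revision.
          attachFile(updated.id, updated.rev);
        }
      });
    }
  });
}

// Stand-in for a real database object so the sketch runs anywhere.
var fakeDb = {
  rev: 0,
  saved: [],
  saveDoc: function (doc, options) {
    this.rev += 1;
    this.saved.push(doc);
    options.success({ ok: true, id: doc._id || 'generated-id', rev: String(this.rev) });
  }
};

var steps = [];
createThenUpdate(
  fakeDb,
  function (id, rev) {
    steps.push('fill ' + id + ' ' + rev);
    return { _id: id, _rev: rev, FirstName: 'MARK' };
  },
  function (id, rev) {
    steps.push('attach ' + id + ' ' + rev);
  }
);
console.log(steps.join(' | '));
// prints fill generated-id 1 | attach generated-id 2
```

In the page itself, formToObject would update the hidden _id and _rev fields and return the serialized form, and attachFile would be the place to call ajaxSubmit(). Note that the attachment step receives the revision from the second save, not the first.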
Summary
While I chose to nest the saveDoc() and ajaxSubmit() functions, it is possible to use each of these separately. This was purely a personal choice on my part because I wanted the form action to flow just like a standard file upload: enter text, choose file, click submit.
Also, it is important to note that this is an initial test of a proof-of-concept. In a production environment, it will need user authentication, document security, support for multiple attachments and the ability to search, view, update and delete the documents. (I hope to post more about this in the near future.)
Source Download
Download the complete file at the link below. You will need to change the database name from “test” to the name of your database. This document is intended to be accessed as an attachment to a CouchDB document. See the Getting Started section above. For example, you will access this file at an address like this:
http://myCouchServer:5984/test/test1/couchdbForm.html
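The address is simply the server, database name, document id and attachment name joined with slashes. A small sketch of that pattern follows; the helper name attachmentUrl is my own, and it assumes ordinary document ids that need nothing more than URL-encoding.

```javascript
// Hypothetical helper: builds the URL of an attachment from its parts.
function attachmentUrl(server, dbName, docId, fileName) {
  var base = server.replace(/\/+$/, ''); // drop any trailing slash
  return [base]
    .concat([dbName, docId, fileName].map(encodeURIComponent))
    .join('/');
}

console.log(attachmentUrl('http://myCouchServer:5984', 'test', 'test1', 'couchdbForm.html'));
// prints http://myCouchServer:5984/test/test1/couchdbForm.html
```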
Link to the complete file: HTML Form Data to CouchDB Record with Attachment.