Azure's document database broadens storage choices

Azure DocumentDB is the newest addition to Microsoft's data storage menagerie. What benefits does the document database offer developers and DBAs?

Microsoft Azure offers a range of data storage options, including file, blob and relational database storage. And now, the cloud service offers another way to store data with Azure DocumentDB. So how do this document database and its benefits compare to others?

For starters, DocumentDB can meet certain application development requirements better than relational databases. Document databases are a popular type of NoSQL database in which data is stored in JSON documents; they do not require a schema that specifies tables and columns. Instead, developers programmatically create collections, documents and fields as needed. Collections are similar to tables in relational databases. Documents are used like rows in relational tables, and fields are similar to columns.

Because document databases lack a fixed structure, developers can accommodate new data requirements quickly. For example, a relational table designed to store data about books would have columns for title, author, publisher, year of publication and page count. Suppose a developer then needs to support e-books, whose file size is measured in kilobytes. In a relational database, a DBA typically has to change the table definition to add a new column. But a document database automatically accommodates new fields, so the developer can simply include the size field (in KB) in a JSON document and store it in the database. This is ideal when the entities you track -- such as products, customers and devices -- have differing features.
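To make the book example concrete, here is a minimal Python sketch that uses a plain list of dictionaries as a stand-in for a collection. It is not DocumentDB client code; the `books` list, the sample titles and the `size_kb` value are illustrative assumptions.

```python
import json

# A plain list stands in for a document collection (assumption: no real database).
books = []

# A printed book with the original set of fields.
books.append({
    "title": "Example Print Title",
    "author": "A. Author",
    "publisher": "Example Press",
    "year": 2012,
    "pages": 320,
})

# An e-book adds a size_kb field; no table definition has to change first.
books.append({
    "title": "Example E-book Title",
    "author": "B. Writer",
    "publisher": "Example Press",
    "year": 2014,
    "pages": 210,
    "size_kb": 850,
})

# Code that reads the collection must tolerate documents that lack the new field.
sizes = [book.get("size_kb") for book in books]

# Each document serializes to JSON independently of the others.
serialized = [json.dumps(book) for book in books]
```

The flexibility moves some responsibility into application code: as the `get` call suggests, readers have to decide what a missing field means.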

Document databases are also fast to read and write. But for those accustomed to designing for relational databases, these benefits require a different approach to design.

These databases often are denormalized and store related content, such as an order and its items, in a single document rather than in two or more tables. Denormalization avoids joins, which can delay the response to a query. Joins are more efficient with properly indexed tables and well-crafted queries, but they cannot avoid the latency that comes from matching keys across tables and retrieving data from different parts of a disk.
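The difference between the two layouts can be sketched in a few lines of Python. This is an in-memory illustration only; the order data, the SKU values and both function names are hypothetical.

```python
# Normalized layout: orders and items live in separate "tables" linked by order_id.
orders = [{"order_id": 1, "customer": "Susan Washington"}]
order_items = [
    {"order_id": 1, "sku": "BOOK-001", "quantity": 2},
    {"order_id": 1, "sku": "BOOK-002", "quantity": 1},
]


def load_order_with_join(order_id):
    """Assemble an order by matching keys across both tables, as a join would."""
    order = next(o for o in orders if o["order_id"] == order_id)
    items = [
        {"sku": i["sku"], "quantity": i["quantity"]}
        for i in order_items
        if i["order_id"] == order_id
    ]
    return {**order, "items": items}


# Denormalized layout: the order document embeds its items.
order_documents = {
    1: {
        "order_id": 1,
        "customer": "Susan Washington",
        "items": [
            {"sku": "BOOK-001", "quantity": 2},
            {"sku": "BOOK-002", "quantity": 1},
        ],
    }
}


def load_order_document(order_id):
    """Fetch the whole order with a single lookup; no key matching is needed."""
    return order_documents[order_id]
```

Both functions return the same order, but the second does so with one lookup. The usual cost of embedding is duplication: details shared by many orders have to be copied into each document that uses them.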

DocumentDB is different, but the same

Azure DocumentDB has more features that are familiar to relational database developers than other popular document databases do. It supports SQL queries over collections, and the SQL engine has been adapted to work with JSON documents. As a result, some syntax differences exist between DocumentDB SQL and standard SQL. For example, DocumentDB SQL supports hierarchical referencing of data. Suppose you have a customer document in a collection called 'customers,' and the document has an address document within it that contains a city, such as:

{
  "customer_id": 139839,
  "customer_fname": "Susan",
  "customer_lname": "Washington",
  "customer_address": {
    "street": "1256 SE Main St",
    "city": "Portland",
    "state": "ME"
  }
}

In this scenario, you would reference the city as 'customers.customer_address.city' in your SQL statement. The query language is also extended to support arrays, which are often used in document databases.
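The idea behind hierarchical referencing can be illustrated with a small Python helper that walks a nested document one key at a time. This is a sketch of the concept, not of how DocumentDB evaluates queries; `resolve_path` is a hypothetical function, and the path is written relative to the document, so the 'customers' collection prefix is omitted.

```python
customer = {
    "customer_id": 139839,
    "customer_fname": "Susan",
    "customer_lname": "Washington",
    "customer_address": {
        "street": "1256 SE Main St",
        "city": "Portland",
        "state": "ME",
    },
}


def resolve_path(document, dotted_path):
    """Walk a nested dict one key at a time; return None if any key is missing."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


city = resolve_path(customer, "customer_address.city")
```

Returning `None` for a missing key mirrors the schema-free setting, in which not every document is guaranteed to carry every field.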

DocumentDB automatically indexes documents without the need to specify indexes. If you are accustomed to minimizing the number of indexes in a relational database to improve write performance, this may seem excessive. But it makes sense in a document database because fields can vary in documents. Any field, even one that's only seen in a few documents, could be used in a WHERE clause and benefit from having an index.
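One way to picture indexing every field is an index keyed by (path, value) pairs. The Python sketch below is a simplified illustration under that assumption and does not describe DocumentDB's actual index structure; `iter_paths`, `build_index` and the sample documents are hypothetical.

```python
from collections import defaultdict


def iter_paths(value, prefix=""):
    """Yield (path, scalar) pairs for every leaf field in a nested document."""
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            yield from iter_paths(child, child_prefix)
    elif isinstance(value, list):
        # Array elements share the path of the array itself in this sketch.
        for item in value:
            yield from iter_paths(item, prefix)
    else:
        yield prefix, value


def build_index(documents):
    """Map each (path, value) pair to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, document in documents.items():
        for path, scalar in iter_paths(document):
            index[(path, scalar)].add(doc_id)
    return index


documents = {
    "a": {"title": "Print book", "pages": 320},
    "b": {"title": "E-book", "pages": 210, "size_kb": 850},
    "c": {"title": "Atlas", "address": {"city": "Portland"}},
}
index = build_index(documents)

# A field seen in only one document can still be looked up without a scan.
ebooks_850 = index[("size_kb", 850)]
```

The sketch also shows the trade-off the paragraph mentions: every write touches one index entry per field, which is the overhead relational designers usually try to limit.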

DocumentDB supports stored procedures, user-defined functions and triggers -- a plus for developers who work with SQL Server. Instead of being written in Transact-SQL, however, they are written in JavaScript.

.NET programmers may prefer to use the LINQ libraries to query DocumentDB databases; the query processor maps LINQ queries to SQL queries and runs them on the database.

Microsoft bills DocumentDB in capacity units, which include the core resources needed for a data store. A standard capacity unit includes 10 GB of local SSD storage, 2,000 request units per second and 2 GB of BLOB storage. The 2,000 request units are enough for 2,000 reads per second or 500 inserts, replaces or deletes per second. During its preview phase, the cost per capacity unit of DocumentDB is $0.73 per day -- a 50% discount on the standard rate.
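The figures above imply some simple arithmetic, sketched here in Python. The per-operation costs and the standard rate are derived from the quoted numbers rather than stated by Microsoft. The sizing function assumes that reads and writes draw linearly on the same request-unit budget, and the workload is hypothetical.

```python
import math

# Figures quoted in the article for one standard capacity unit.
STORAGE_GB_PER_UNIT = 10
REQUEST_UNITS_PER_SECOND = 2000
READS_PER_SECOND = 2000
WRITES_PER_SECOND = 500
PREVIEW_PRICE_PER_DAY = 0.73
PREVIEW_DISCOUNT = 0.50

# Implied request-unit cost of each operation type (derived, not quoted).
RU_PER_READ = REQUEST_UNITS_PER_SECOND / READS_PER_SECOND    # 1.0
RU_PER_WRITE = REQUEST_UNITS_PER_SECOND / WRITES_PER_SECOND  # 4.0

# Implied undiscounted daily rate (derived, not quoted).
STANDARD_PRICE_PER_DAY = PREVIEW_PRICE_PER_DAY / (1 - PREVIEW_DISCOUNT)


def capacity_units_needed(storage_gb, reads_per_second, writes_per_second):
    """Return the smallest whole number of units covering storage and throughput."""
    required_ru = reads_per_second * RU_PER_READ + writes_per_second * RU_PER_WRITE
    by_storage = math.ceil(storage_gb / STORAGE_GB_PER_UNIT)
    by_throughput = math.ceil(required_ru / REQUEST_UNITS_PER_SECOND)
    return max(1, by_storage, by_throughput)


# Hypothetical workload: 25 GB of data, 1,500 reads/s and 400 writes/s.
units = capacity_units_needed(25, 1500, 400)
preview_cost_per_day = units * PREVIEW_PRICE_PER_DAY
```

In this example storage, not throughput, determines the number of units: 25 GB needs three units, while the 3,100 request units per second would fit in two.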

About the author:
Dan Sullivan holds a master of science degree and is an author, systems architect and consultant with more than 20 years of IT experience. He has had engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education. Dan has written extensively about topics that range from data warehousing, cloud computing and advanced analytics to security management, collaboration and text mining.

This was last published in November 2014
