Document Management 101 - Part 1

Exactly what is document management?  I get that question a lot.  So, to educate, inform, and possibly entertain you, this is the first in a series of posts to delve into the world of document management.

In some cases, this is also called “content management,” but I’m going to avoid that term, because content management can also refer to systems that manage web-based content, and the type of document management system (DMS) I’m going to talk about is not limited to web content.  In fact, web content is not often stored in enterprise document management systems - although it can be.  But I’m getting ahead of myself.  Let’s step back and start from the beginning.

To start, you need to grasp that there are two basic components to a document management system.  The first is a document repository.  Think of this as a file room, filled with filing cabinets, which are in turn filled with folders and files.  In a modern DMS, the repository is usually just disk storage with a specific hierarchy of folders.  The hierarchy can vary, and it isn’t visible to the end user.  And because we are sticking to basics, we won’t go there.

The second component is a database.  This could be one of several relational database offerings, including Microsoft SQL Server, Oracle, Sybase, DB2 - and others.  The database tracks the location of each document in the repository, along with the document’s metadata.  Metadata may include the author of the document, the last edit date, the document type (Microsoft Word, Microsoft Excel, Adobe PDF, TIFF, TXT, etc.) and any number of other classification types.  These classification types are also known as index values.  Think of this database as the card catalog in a library.  Find the card for the document you want, and it tells you the “room”, “filing cabinet”, “drawer”, and “folder” the file is in.
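To make the card catalog idea a little more concrete, here is a minimal sketch in Python using the built-in sqlite3 module.  The table name, the column names, the sample path, and the locate() helper are all my own assumptions for illustration; a real DMS uses its own (much richer) schema on one of the database servers mentioned above.

```python
import sqlite3

# In-memory database standing in for the DMS database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        doc_id      INTEGER PRIMARY KEY,
        repo_path   TEXT NOT NULL,   -- where the file lives in the repository
        author      TEXT,            -- index value
        doc_type    TEXT,            -- index value
        last_edited TEXT             -- index value, stored as an ISO 8601 date
    )
    """
)

# One "card" in the catalog: the metadata plus the file's location.
conn.execute(
    "INSERT INTO documents (repo_path, author, doc_type, last_edited) "
    "VALUES (?, ?, ?, ?)",
    ("cabinet03/drawer12/folder07/00001234.docx", "jsmith",
     "Microsoft Word", "2009-03-14"),
)


def locate(conn, author, doc_type):
    """Return the repository paths of documents matching the given index values."""
    rows = conn.execute(
        "SELECT repo_path FROM documents "
        "WHERE author = ? AND doc_type = ? ORDER BY doc_id",
        (author, doc_type),
    ).fetchall()
    return [row[0] for row in rows]


paths = locate(conn, "jsmith", "Microsoft Word")
print(paths)
```

The point is simply that the user never needs to know the path: they ask for a document by its index values, and the database answers with where the file sits in the repository.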

An example of this basic structure, a repository plus a database, is shown here:

Figure 1 - A two tier DMS

In reality, the architecture shown in Figure 1 is rarely seen, for a couple of reasons.  One is the need for more functionality.  The other is that a direct connection between the user and the DMS resources is not the most efficient or effective way to do things.  A better way is to insert another tier of service into the model, as shown here:

Figure 2 - A three tier DMS

In the three tier model, the DMS services broker all the database transactions, handle all the file operations between the repository and the user, and interconnect the other services, such as the web interface, the full text indexer, and any other services that may be present.
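Here is a rough Python sketch of what “brokering” means in practice.  The DmsService class, its store() and fetch() methods, and the file naming scheme are hypothetical names I made up for illustration, not the API of any real product.  The thing to notice is that the caller never touches the database or the repository directory directly; everything goes through the middle tier.

```python
import sqlite3
import tempfile
from pathlib import Path


class DmsService:
    """Hypothetical middle tier: the only component that touches the
    database and the repository."""

    def __init__(self, repo_root):
        self.repo_root = Path(repo_root)
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE documents ("
            "doc_id INTEGER PRIMARY KEY, repo_path TEXT, author TEXT)"
        )

    def store(self, filename, content, author):
        """Record the metadata, write the file to a path derived from the
        new document ID, then save that path in the database."""
        cur = self.db.execute(
            "INSERT INTO documents (repo_path, author) VALUES (NULL, ?)",
            (author,),
        )
        doc_id = cur.lastrowid
        rel_path = f"{doc_id:08d}{Path(filename).suffix}"
        (self.repo_root / rel_path).write_bytes(content)
        self.db.execute(
            "UPDATE documents SET repo_path = ? WHERE doc_id = ?",
            (rel_path, doc_id),
        )
        self.db.commit()
        return doc_id

    def fetch(self, doc_id):
        """Look up the file's location in the database and return its bytes."""
        row = self.db.execute(
            "SELECT repo_path FROM documents WHERE doc_id = ?", (doc_id,)
        ).fetchone()
        if row is None:
            raise KeyError(doc_id)
        return (self.repo_root / row[0]).read_bytes()


# A client (web interface, desktop client, etc.) only ever talks to the service.
with tempfile.TemporaryDirectory() as repo:
    service = DmsService(repo)
    doc_id = service.store("pleading.docx", b"draft text", author="jsmith")
    fetched = service.fetch(doc_id)
    print(doc_id, fetched)
    service.db.close()
```

Because every request passes through one place, that is also the natural spot to hang the other services, such as a web interface or a full text indexer.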

Now, those are the basic architectural pieces.  But how do users access a document in the system?

For DMS access, we can give the user a few different types of interfaces.  One is a web browser.  The user can log in to the document management system, search for documents using the metadata, select a document from the search results, and act on the document.  The possible actions include viewing the document, downloading a copy, or “checking out” the document, very much like checking out a library book.  In addition to a web browser interface, the DMS may have a standalone client.  Or it may have a client that is a “plug-in” for Microsoft Outlook, which is a very popular option.  And last, but not least, some DMS variants offer BlackBerry, Windows Mobile, and iPhone clients.
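The library book analogy for check-out can be sketched in a few lines of Python.  I am assuming here that a checked-out document cannot be checked out by anyone else until it is checked back in; the CheckoutRegistry class and its method names are hypothetical, and real products differ in the details.

```python
class CheckoutError(Exception):
    """Raised when a check-out or check-in request cannot be honored."""


class CheckoutRegistry:
    """Tracks which user, if any, currently holds each document."""

    def __init__(self):
        self._holders = {}  # doc_id -> user currently holding the document

    def check_out(self, doc_id, user):
        holder = self._holders.get(doc_id)
        if holder is not None and holder != user:
            raise CheckoutError(f"document {doc_id} is checked out by {holder}")
        self._holders[doc_id] = user

    def check_in(self, doc_id, user):
        if self._holders.get(doc_id) != user:
            raise CheckoutError(f"document {doc_id} is not checked out by {user}")
        del self._holders[doc_id]

    def holder(self, doc_id):
        return self._holders.get(doc_id)


registry = CheckoutRegistry()
registry.check_out(1234, "jsmith")

# A second user has to wait, just as with a library book that is already out.
try:
    registry.check_out(1234, "mjones")
    second_attempt = "allowed"
except CheckoutError:
    second_attempt = "refused"

registry.check_in(1234, "jsmith")
registry.check_out(1234, "mjones")  # succeeds now that the document is back
print(second_attempt, registry.holder(1234))
```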

Another part of the user interface may include application integration.  With application integration, a user opening a file is presented with an alternate dialog box rather than the standard one.  The alternate dialog box gives the application access to files stored in the DMS, and typically also provides search functions the user can employ to find the document they need.  So, for example, a user running Microsoft Word might open a file directly from the document management system.  Here is an example of the file open dialog a Microsoft Word user might see (this is a dialog from Autonomy’s Worksite DMS):

Figure 3 - Autonomy Worksite integrated file open dialog.

This particular view shows the user’s worklist - the documents the user has worked on recently.  This is roughly equivalent to the recent file list that most applications provide as part of their user interface.  That might not seem very impressive, but remember that a DMS is a multi-user library.  The real power of a DMS becomes more apparent when you start searching for a document.  To give you a little hint about that search power, see Figure 4.  This is a sample search dialog (again from Autonomy’s Worksite Server).  Notice the multitude of indexes that can be used as search parameters (the index field captions have not been customized in this example).  Being able to build a taxonomy with this level of detail lets a user narrow a search result set even in a repository of millions of documents - assuming, of course, that the documents were correctly classified when they were stored.  And this ignores the power of searching a full text index, which can be combined with the index search.

Figure 4 - Autonomy Worksite search dialog
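To show how stacking index values narrows a result set, here is a small Python sketch.  The sample documents, the “client” index, and the search() function are invented for illustration, and the plain substring match is only a stand-in for a real full text index.

```python
# Hypothetical catalog entries: index values plus the document's text.
documents = [
    {"doc_id": 1, "author": "jsmith", "doc_type": "Microsoft Word",
     "client": "Acme", "text": "motion to dismiss"},
    {"doc_id": 2, "author": "jsmith", "doc_type": "Microsoft Excel",
     "client": "Acme", "text": "general ledger 2008"},
    {"doc_id": 3, "author": "mjones", "doc_type": "Microsoft Word",
     "client": "Acme", "text": "engagement letter"},
    {"doc_id": 4, "author": "jsmith", "doc_type": "Microsoft Word",
     "client": "Globex", "text": "motion to compel"},
]


def search(docs, text=None, **index_values):
    """Return the IDs of documents that match every index value given and,
    if text is supplied, also contain that text (case-insensitive)."""
    hits = []
    for doc in docs:
        if any(doc.get(field) != value for field, value in index_values.items()):
            continue
        if text is not None and text.lower() not in doc["text"].lower():
            continue
        hits.append(doc["doc_id"])
    return hits


# Each additional index value narrows the result set.
by_author = search(documents, author="jsmith")
narrowed = search(documents, author="jsmith",
                  doc_type="Microsoft Word", client="Acme")

# An index search combined with a text search.
with_text = search(documents, text="motion", author="jsmith")

print(by_author, narrowed, with_text)
```

With four documents this is trivial, but the same idea is what makes a repository of millions of documents searchable: every correctly filled-in index value cuts the pile down further.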

That’s one example of document management, and in this case, it’s the type typically found in professional services firms, such as law firms and accounting firms.  In fact, the dialog boxes I have shown are from a product aimed at those environments.  But there is more to document management than tracking a legal pleading or a general ledger spreadsheet - as you’ll see in part 2.
