Workflow, Collaboration, Enterprise Content Management

A Special Tribute to Albert Einstein - The Father of ECM

by John Holliday 27. December 2008 14:17

E=MC^2 is probably the most famous equation in modern science, but little is known about its true origins. At least I didn’t know much about it until very recently. Of course, I understood, like everyone else, that it was formulated as part of Professor Einstein's Special Theory of Relativity, but I never really thought about it much beyond what we learned in school. It was not until I started digging into my own personal genealogy that I stumbled across something that shed some light on what had remained, until now, a mystery.

I had often wondered why Professor Einstein chose to express his now famous equation using those particular letters, E=MC^2.  Was it just pure coincidence that we’re now using those very same letters to describe a business problem that threatens the very fabric of modern-day society? Indeed, where would we be today without some form of Enterprise Content Management (ECM) system?

Following the war, there wasn’t a lot of work for singers in Harlem, so my distant cousin, Billie (on my father's side, four times removed), would supplement her income by filling in as a part-time assistant for the professor at his office in Princeton. While rummaging around in the attic of our old house, I recently came across a box of her personal items. In it, there was a diary that she kept during the years between 1947 and 1952, after her divorce from Jimmy Monroe.

With the Cold War still in full swing, it seems that the professor was literally swamped with letters and packages arriving daily from nearly every corner of the globe, containing white papers and pamphlets on a myriad of subjects ranging from Newtonian physics to quantum field theory. Keeping track of all that content often proved too much for the professor, who became increasingly frustrated while developing his unified field theory. In one entry, my cousin writes:

“E. was really upset today. The poor man just can’t keep track of all the information that keeps streaming into the office. I try my best to organize it all, but there’s just too much of it.”

More telling is the entry from March 15th, 1951 - the day after the professor's 72nd birthday.

“I overheard E. shouting at one of his students just as I was coming into the office today. ‘Why so difficult to separate ze metadata from ze content?! Teilen-Punkt! Teilen-Punkt! If only we had ein Teilen-Punkt!’ I wish I knew what he was talking about, so I could help him find it.  He really is starting to worry me.”

I found this entry particularly intriguing, because it seems to capture the essence of the same problem many of us are facing today. Managing enterprise content is quickly becoming one of the most important fields of study in the information age – especially the problem of dealing with metadata. Could it be that Einstein perceived a solution to this problem way back in the 1950s?

I kept digging through the box and came across lots of interesting stuff. There were photos of Billie on the road with Duke. Billie back-stage with a guy that looks like Miles and sharing a laugh with other hard-core musician types. But the real shocker was a small photo that was stuck to the side of the box with a piece of masking tape.  I almost missed it.  On the back there was one legible sentence scribbled in pencil along the edge so faint I could barely make it out. “Don’t ignore ze metadata. It is ze content. Without it, zer can be no Teilen-Punkt.”  I was still trying to comprehend what I was reading, and then I turned the picture over.  There, before my eyes, was the answer - scrawled on the chalkboard right behind him.

 

Einstein Photo

Was ECM part of Einstein’s grand vision of the unified field? Is metadata somehow connected with the theory of relativity? Are you buying any of this?  We know now that content management is at the heart of all business processes.  Is it possible that the father of modern scientific thought was also focused squarely on solving this most perplexing problem of modern-day society – namely, how do we deal with content overload without interfering with day-to-day productivity?

One can only imagine what might have happened had SharePoint been available in Einstein’s day. Teilen-Punkt.  Alas, all we can do now is speculate, and pay tribute to a wonderfully creative and brilliant mind. A true visionary – in many ways, the father of ECM.

All content is relative.  Teilen-Punkt, indeed!

Important Safety Tip for Office Open XML - Flatten Your Package!

by John Holliday 24. October 2008 18:18

Eric White -> Transforming Open XML Documents to Flat OPC Format

If you're doing anything with Office Open XML, then you need to read this article by Eric White.  In it, Eric describes a very nice approach that lets you gain full leverage out of XSLT when dealing with Office Open XML documents.  Basically, he starts by converting a .DOCX file (or .PPTX or .XLSX) to an intermediate format (Flat OPC) that retains the same structure as a standard package file, but in a single XML document.  Then you simply apply your favorite XSLT transform and convert it back into a package.  Voila! Instant transformation without all that messy System.IO.Packaging code.  Included with the article are the two conversion routines you need and lots of sample code to play with.  Great stuff!
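To give a feel for the package-to-Flat-OPC direction, here is a minimal sketch. It is not Eric's code; the class and method names (`FlatOpcSketch`, `ToFlatOpc`, `ToFlatPart`) are made up for illustration, and the processing instruction assumes a Word document. It uses `Stream.CopyTo` and `XElement.Load(Stream)`, so it assumes .NET 4 or later.

```csharp
using System;
using System.IO;
using System.IO.Packaging;   // WindowsBase.dll (or the System.IO.Packaging package)
using System.Linq;
using System.Xml.Linq;

public static class FlatOpcSketch
{
    // Namespace used by the Flat OPC wrapper elements.
    static readonly XNamespace Pkg =
        "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Turn a single package part into a <pkg:part> element.
    static XElement ToFlatPart(PackagePart part)
    {
        using (Stream stream = part.GetStream(FileMode.Open, FileAccess.Read))
        {
            if (part.ContentType.EndsWith("xml", StringComparison.OrdinalIgnoreCase))
            {
                // XML parts are embedded directly as markup.
                return new XElement(Pkg + "part",
                    new XAttribute(Pkg + "name", part.Uri.ToString()),
                    new XAttribute(Pkg + "contentType", part.ContentType),
                    new XElement(Pkg + "xmlData", XElement.Load(stream)));
            }

            // Everything else (images, fonts, ...) is stored as base64 text.
            using (var buffer = new MemoryStream())
            {
                stream.CopyTo(buffer);
                return new XElement(Pkg + "part",
                    new XAttribute(Pkg + "name", part.Uri.ToString()),
                    new XAttribute(Pkg + "contentType", part.ContentType),
                    new XAttribute(Pkg + "compression", "store"),
                    new XElement(Pkg + "binaryData",
                        Convert.ToBase64String(buffer.ToArray())));
            }
        }
    }

    // Read a package file and return it as one Flat OPC XML document.
    public static XDocument ToFlatOpc(string path)
    {
        using (Package package = Package.Open(path, FileMode.Open, FileAccess.Read))
        {
            return new XDocument(
                new XDeclaration("1.0", "UTF-8", "yes"),
                // Assumption: a Word document; other Office apps use other progids.
                new XProcessingInstruction("mso-application", "progid=\"Word.Document\""),
                new XElement(Pkg + "package",
                    new XAttribute(XNamespace.Xmlns + "pkg", Pkg.NamespaceName),
                    package.GetParts().Select(p => ToFlatPart(p))));
        }
    }
}
```

Once the document is in this form, an XSLT stylesheet can match on the `pkg:part` elements by name; going back to a package is the reverse walk, and that is covered by the routines in Eric's article.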

Streamline Your SharePoint Code Using Extension Methods

by John Holliday 22. October 2008 06:42

The SharePoint API can be vexing for developers who are not used to the little 'gotchas' that pop up here and there.  This can lead to lots of wasted time when all you want to do is determine if a list contains a certain field, or perform a standard operation on a list.  For example, say you have a task list and you want to get the list of overdue tasks or the list of tasks that will become due in the next week.  Of course, you could write the code in a separate utility library and pull it out when you need it.  But wouldn't it be great if you could simply extend the SPList object so that it included your custom properties and methods?

What I'm after here is a way to essentially derive a new class from SPList that includes my additional properties and methods without having to embed the original SPList object inside of it.  That way, I can pass the same object to code that expects a standard SPList object and avoid having to build additional infrastructure around my extended class to make it look and behave like a standard SPList.

Let's start with a simple example - determining if a list contains a given field.  Normally, we'd have to write this in a for loop or inside a try/catch block to avoid the exception that is thrown when testing for a non-existent field.  The code might look something like this:

image
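The original screenshot is not available here, so the following is only a minimal sketch of the loop-based approach described above. The names `ListUtilities` and `ListContainsField` are hypothetical; `SPList.Fields`, `SPField.InternalName` and `SPField.Title` come from Microsoft.SharePoint.dll, which must be referenced for this to compile.

```csharp
using Microsoft.SharePoint;

// Hypothetical utility class, kept apart from SPList itself.
public static class ListUtilities
{
    // Returns true if the list has a field whose internal name or
    // display name matches fieldName; never throws for a missing field.
    public static bool ListContainsField(SPList list, string fieldName)
    {
        foreach (SPField field in list.Fields)
        {
            if (field.InternalName == fieldName || field.Title == fieldName)
            {
                return true;
            }
        }
        return false;
    }
}
```

Calling code then has to remember the helper class: `ListUtilities.ListContainsField(list, "DueDate")`.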

Turns out that as of version 3.0 of the C# language (.NET Framework 3.5), it is possible to write your own custom code and attach it to any .NET class as an extension method.  The method appears as part of the IntelliSense for the extended class, and the compiler translates each call into an ordinary static method call, so it reads as though the method were part of the class.  This means you don't have to own the source code for the class, or modify it in any way, in order to extend it.  One caveat: C# 3.0 supports extension methods only, so custom properties have to be exposed as methods.

To add this method, you would start by declaring a namespace to hold your extension and then declaring a static class containing a static method that takes an SPList as the first parameter, prefixed with the "this" keyword.  So the same method from above could be packaged as follows:

image
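Since the screenshot is missing, here is a sketch of what such an extension might look like. The namespace `MyCompany.SharePoint.Extensions`, the class `SPListExtensions` and the method name `HasField` are assumptions chosen for illustration; it compiles only against Microsoft.SharePoint.dll.

```csharp
using Microsoft.SharePoint;

namespace MyCompany.SharePoint.Extensions   // hypothetical namespace
{
    // Extension methods must live in a non-nested static class.
    public static class SPListExtensions
    {
        // The "this" modifier on the first parameter is what makes
        // HasField show up on every SPList instance.
        public static bool HasField(this SPList list, string fieldName)
        {
            foreach (SPField field in list.Fields)
            {
                if (field.InternalName == fieldName || field.Title == fieldName)
                {
                    return true;
                }
            }
            return false;
        }
    }

    // Example caller: with the namespace in scope, the method reads
    // like a member of SPList.
    public static class TaskListChecks
    {
        public static bool SupportsDueDates(SPList taskList)
        {
            return taskList.HasField("DueDate");
        }
    }
}
```

Code elsewhere only needs a `using MyCompany.SharePoint.Extensions;` directive to pick up `HasField`, and the object being passed around is still a plain SPList.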

All you have to do to use the extension is include the namespace in your code.  The nice part is all that IntelliSense goodness and the handy illusion that your method is now encapsulated within the definition of SPList.

image

 

Extension methods can make it easier for novice SharePoint developers to approach the object model without having to divide their attention between the core API and custom class libraries, which tend to have a broader focus than simply extending a given component.

SharePoint Sessions at Microsoft PDC 2008

by John Holliday 9. October 2008 13:34

Back in the day, I would never miss a PDC.  It was always a great experience, and I'm sure this year will not disappoint.  Unfortunately, I won't be able to attend this time and I'm kind of bummed about it.  If you're planning to be there, then be sure to check out the following SharePoint sessions.

Also, I just learned that everyone will get their own PDC 2008 hard drive loaded with software that will include a pre-configured VPC with 10 hands-on SharePoint developer labs! 

Now I'm really bummed. :(

 

Here is a list of the SharePoint sessions I'll be missing...

  • SharePoint 2007: Creating SharePoint Applications with Visual Studio 2008

Chris Johnson will be talking about how to use Microsoft Silverlight and SharePoint together.

  • SharePoint Online: Extending your Service

Troy Hopwood will discuss ways to access and manipulate SharePoint files and data remotely with Web Services along with other extensibility points for SharePoint Online.

  • SharePoint 2007: Advanced Asynchronous Workflow Messaging

Alex Malek will build an employee on-boarding application that depends on a server located inside another company by constructing a document workflow and have it asynchronously message a business service hosted behind another company's firewall.

  • FAST: Building Search-Driven Portals with MOSS and Silverlight

Jan Helge Sageflåt and Stein Danielsen will do a deep dive into FAST search and the FAST ESP Search Web Parts, including the use of Silverlight to deliver unique search experiences.

So, if you're gonna be there and you don't really need that extra hard drive...

Oh, never mind!


JOG Meeting Reminder for September 2008: Enterprise Content Management - Document Retention

by John Holliday 16. September 2008 01:56

Jacksonville Office Geeks

Thursday, September 18th, 2008; 6p-8p

Bank of America, Building 500, 9000 Southside Blvd; 2nd Fl; Sea Oats Rm

Map: Meeting Logistics

RSVP: http://www.clicktoattend.com/?id=130639

Enterprise Content Management - Document Retention

Document retention is an important part of content management.  SharePoint provides out-of-the-box support for managing document retention using the built-in expiration policy feature of the information management policy framework.  The built-in expiration policy feature includes the ability to define expiration formulas based on document metadata, but there are many scenarios in which the expiration date depends on conditions external to a given document.  In this session, we'll explore the information management policy architecture in detail and learn how to extend the expiration policy feature by writing custom document expiration formulas that calculate the expiration date based on data pulled from elsewhere in the SharePoint farm.

Speaker: John Holliday, MVP Office SharePoint Server

 


Microsoft, IBM and EMC Announce new Enterprise Content Management Interoperability Specification

by John Holliday 11. September 2008 04:46

There is a new standards effort in the works that promises to unify the disparate ECM platforms currently available into a common set of interfaces.  The Content Management Interoperability Services (CMIS) specification is the result of a collaboration between Microsoft, IBM, EMC, Alfresco, OpenText, SAP and Oracle.  Click here to download a preview copy of the spec.

First impressions:

The specification tries to define the "core" components of any ECM system.  On the one hand, this is good because we at least need to agree on common terms.  For example, the whole "content type" versus "object type" debate is important to get resolved.  On the other hand, with such a young paradigm, it's not yet clear what functionality will be the winning market differentiator for any given vendor, which could thwart further attempts to refine the specification.  To counter that, the group has submitted the draft specification to the OASIS consortium so that others can join the effort.  This is a good move and will hopefully lead to faster refinement.

  1. The primary goal of the spec at this stage seems to be to provide a common set of SOA endpoints for the different ECM offerings from the vendors involved.  There are SOAP and REST/Atom protocol bindings for its proposed "domain model", both of which are explicitly required for conformance with the standard.
  2. The spec does not cover administration, configuration or security, and includes within its definition of "administration" the modification of Object Types.  This is a bit of a disappointment, but not surprising given the different interpretations of that term.  Yet, the explicit exclusion of such a fundamental set of methods again reflects the newness of the paradigm and the difficulty of defining a common object model.
  3. The data model focuses on the ability to create and access a repository, but explicitly excludes anything defined as "transient", which includes such important concepts as compound document, workflow, event and subscription.  Here again, we see what appears to be disagreement on first principles.  The result is that the data model describes only low-level components, such as object, property, document, folder, relationship and policy.
  4. CMIS supports versions for document objects, and also includes a standardized query model based on a subset of SQL.  This is very nice, because it means that clients will be able to submit SQL SELECT statements to retrieve objects from multiple ECM vendors.  This will include searches based on object properties and folder membership as well as full text searches.
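To make the query model in item 4 a little more concrete, here is a rough sketch of the kind of statement a client might submit. One big assumption: the property names (`cmis:name` and friends) and the `IN_FOLDER` and `CONTAINS` predicates follow the syntax of the later CMIS 1.0 standard, and the preview draft may spell things differently. The class `CmisQuerySketch`, the method `DocumentsInFolder` and the sample values are hypothetical.

```csharp
using System;

public static class CmisQuerySketch
{
    // Builds a query combining a property test, a folder-membership
    // test and a full-text test. The inputs are assumed to be trusted;
    // no quoting or escaping is attempted in this sketch.
    public static string DocumentsInFolder(string folderId, string namePrefix, string searchText)
    {
        return string.Format(
            "SELECT cmis:objectId, cmis:name, cmis:lastModificationDate " +
            "FROM cmis:document " +
            "WHERE cmis:name LIKE '{0}%' " +
            "AND IN_FOLDER('{1}') " +
            "AND CONTAINS('{2}') " +
            "ORDER BY cmis:lastModificationDate DESC",
            namePrefix, folderId, searchText);
    }
}
```

The appeal is that the same string could in principle be sent through either the SOAP or the REST/Atom binding to any conforming repository, with only the folder identifier being vendor-specific.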

The spec at this point is pretty low-level, similar to the Open XML Formats packaging API, but offers a strong first step toward cooperation among the ECM vendors in providing a common set of tools for accessing disparate content repositories. 

The query specification is interesting, and includes a few sample queries that I'm anxious to dig into.  There are also some good examples of how versioning, checkin and checkout will work.  Hopefully, I'll get a chance to go into these in some detail in the coming weeks.

Stay tuned.

SharePoint Permission Dependency Chart

by John Holliday 6. September 2008 13:16

When it comes to content management in SharePoint, half the battle is figuring out the best angle of attack.  Why?  Well, because there are so many options and combinations of options that come into play when designing a solution.

Take the SharePoint permissions architecture, for example.  There are 33 different permission masks divided into 3 categories: personal, list and site permissions.  These are combined into role definitions (permission levels) which may be assigned to users and groups and then associated with sites, lists and list items.   Often, you need to create a custom role definition for a particular user or group.  When doing so, it is important to understand permission dependencies, because that will determine the effective permissions being granted.

With the exception of the "Open" permission, which grants the ability to open Web sites, lists and folders to access their contents, every SharePoint permission depends on one or more of the other permissions.  This means, for example, that if you grant the "Manage Alerts" site permission, you are also automatically granting the "Create Item Alerts", "View List Items", and "Open List Items" list permissions as well as the "View Site Pages" and "Open" site permissions.
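In code, a custom role definition built around "Manage Alerts" might be sketched as follows, with the dependent permissions spelled out explicitly rather than left implicit. The mapping from the display names above to `SPBasePermissions` members is my own reading, and the class name, role name and description are made up for illustration. It needs a reference to Microsoft.SharePoint.dll and a site that defines its own role definitions.

```csharp
using Microsoft.SharePoint;

public static class AlertManagerRole   // hypothetical helper
{
    // "Manage Alerts" plus the permissions it depends on.
    public const SPBasePermissions Mask =
        SPBasePermissions.ManageAlerts |
        SPBasePermissions.CreateAlerts |
        SPBasePermissions.ViewListItems |
        SPBasePermissions.OpenItems |
        SPBasePermissions.ViewPages |
        SPBasePermissions.Open;

    // Adds the role definition to the given site. Assumes the site
    // does not inherit its role definitions from a parent.
    public static void Create(SPWeb web)
    {
        SPRoleDefinition role = new SPRoleDefinition();
        role.Name = "Alert Manager";
        role.Description = "Can manage alerts for all users of the site.";
        role.BasePermissions = Mask;
        web.RoleDefinitions.Add(role);
    }
}
```

Writing the full mask out this way makes the effective permissions visible at a glance, which is the same job the chart is meant to do.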

If you're like me, with so much information to digest and process, and so little time to do it in, a simple diagram can go a long way towards sharpening your focus.  To that end, here is a little chart I created for my ECM401 course (all praise to the Visio gods!) that you may find useful.

SharePoint Permissions

Enjoy.

Announcing MOSS 2007 ECM Developer Training

by John Holliday 28. August 2008 01:02

Let's face it, SharePoint is a huge development platform no matter what angle you approach it from.   As most of you know, I've spent the past year or so traveling and teaching SharePoint developer courses for the Ted Pattison Group, and my focus has been pretty broad, covering topics ranging from features to content types to workflow and everything in between. 

Now, as the platform continues to mature and as developers become more and more familiar with the basics of SharePoint development, it is increasingly evident that more specialized training is needed, particularly in the area of Enterprise Content Management (ECM).  To that end, I'm pleased to announce the immediate availability of two new courses I've developed that are being offered through the Ted Pattison Group:

  • ECM401 - Enterprise Content Management with SharePoint Server 2007 (hands-on)
  • WC-ECM401 - Enterprise Content Management with SharePoint Server 2007 (online)

Both versions of the course are available for immediate registration.  The online version is a great option for those unable to attend the hands-on course.

  • The hands-on version includes 12 modules instead of 10 with a deeper treatment of content modeling and using workflow to drive ECM solutions.
  • The hands-on course is 4 full days with lectures, demos and labs.  The online version is 5 days with about 3 hours of lectures and demos each day plus labs assigned as homework.
  • The online version is conducted via Live Meeting and students may submit questions during the lecture which are answered at the end of each session.  Questions may also be submitted directly to the instructor via email or live chat.
  • Students attending the hands-on course receive a student workbook with slides and labs.  Online students receive the workbook in electronic format.

Feel free to contact me directly with any questions you may have about either version of the course.  For additional details and to review the course outline, please visit the Ted Pattison Group website using the following links.

Here is the upcoming course schedule, with links to the registration form.

Course      Dates              Location     Availability
WC-ECM401   Sept 15-19, 2008   Your Desk    Register
ECM401      Oct 20-23, 2008    Reston, VA   Register

Aurora Concept: Future of the Web?

by John Holliday 7. August 2008 08:50

 

Mozilla Labs in partnership with Adaptive Path has released two concept videos of what the web browsing experience may look and feel like in the future.  It shows some pretty interesting concepts.  I particularly like the 3D interface where content continually recedes as time progresses – kind of like a push-down stack from front to back with the most recent items in the front.

The little hand-held device looks remarkably similar to the iPhone, and the design team appears to have made the same kinds of assumptions about interfaces that will appeal to most people.   I must say I’m surprised these guys didn’t focus any attention on alternative interface mechanisms, like speech, for example.  I would have thought that any concept of web browser futures must include at least some kind of speech recognition and speech synthesis.  The idea that common folk will remain content to push little icons around on a tiny little screen just seems stupid to me.  Why do I LOVE my Garmin?   One reason is because she TALKS to me!  (Did I say ‘she’?  I meant ‘it’!)

Anyway, have a look at these concept videos and muse for a moment about what might be around the bend in terms of human-machine interaction.  I’d be particularly interested in hearing your thoughts about how these kinds of ideas might influence the future direction of the SharePoint UI.

To learn more about the Aurora project, visit http://adaptivepath.com/aurora.


Aurora (Part 1) from Adaptive Path on Vimeo.
Aurora (Part 2) from Adaptive Path on Vimeo.

Crash and Burn, Part Two - Wake Up Call

by John Holliday 6. August 2008 11:53

Ok, so first I need to give a shout out to John Miller, who read my last post and promptly encouraged me to check out the VMWare Fusion 2.0 beta, which adds a nifty feature called “AutoProtect”.  The idea is to have Fusion automatically save snapshots of your virtual machines as backups for easy rollbacks in exactly the situation I find myself in now.  Unfortunately, I didn’t upgrade sooner; it looks like this is one of those “must have” features.  The other neat thing about the 2.0 version is that it now includes the multi-snapshot feature I like so much in the Workstation 6.0 product.  Now we can have the best of both worlds.

Now back to reality.  What to do about the corrupt disk image?  I’ve since learned that the smart guys at VMWare have known about this for a while and have pinpointed a potential bug in the Mac OS that can cause virtual machines to get fried because of some problem handling unbuffered I/O.  If you’re interested, you can read about it here.  Wish I’d known about this sooner!

First, I decided to upgrade to the 2.0 beta to see if perhaps by some fluke of grace (?), I might get lucky and Windows might magically be restored.  Not hardly.  But I have to say, the new Fusion interface is pretty slick.  Perhaps when I’m sane again, I can do a full treatment of the new features.  Right now, I’m still too wigged out to pay attention to such details.  I need to find a way to recover those lost VS2008 projects I was working on.

The next idea was to create a brand new VM and then try to connect to the old disk, perhaps by adding it as a second hard drive.  That way, even if the registry got fried, I might be able to retrieve the data.  I might even be able to repair the registry and somehow get Windows to boot up again.  First things first – how to access the old virtual disk drive?

Using the new Fusion beta 2.0 interface was a snap.  They have a feature called “Easy Install” that automates the entire installation and then installs the VMWare Tools package for you.  I decided to go with Windows Server 2003 R2 Standard Edition with SP2 – the same as the one I was running before.  (I’ve tried various flavors of Windows Server 2008 as well as the 64-bit versions, but I don’t really see that much of an improvement, especially since I’m using it mostly as a workstation for SharePoint development.)  I had forgotten how fast this machine is.  I allocated 3GB to the VM and configured it for 4 virtual CPUs.  Even though it says “Setup will complete in approximately 37 minutes”, the whole thing was done in less than 10 minutes on the Mac Pro.

Once I had the OS installed, I couldn’t wait to attach the old drive and start poking around.  So I opened the VM settings page and added a second hard disk, making a copy of the existing virtual disk from the other system.  Then I held my breath, crossed my fingers and stood by the window a-wishin’  for a miracle…

I couldn’t believe how long it took to boot up – but boot up it did!  Then I opened Windows Explorer and navigated to the My Computer node.  Lo and behold, there were recognizable files!  Sadly, the “My Documents” folder was completely empty, so most of my documents were off to Emerald City.  Not that I even remember what was there exactly.  But I know there was some good stuff in there.  Oh well.  Don’t get me wrong.  I’ll take what I can get.  And you can best believe this was an important wake-up call for me.

Here’s the deal:

  1. If you’re using VMWare Fusion – get the 2.0 beta NOW.
  2. Turn on AutoProtect to do periodic snapshots.  After this experience, I have mine set to take a new snapshot every hour and to keep 10 copies for safekeeping.  You can set it for every day, hour or half-hour and it’s smart enough to keep a range of snapshots to provide different restore options.
  3. From the VMWare Fusion Preferences menu, under the ‘General’ tab in the ‘Performance’ section, select the “Optimize for virtual machine disk performance” option.  This turns off unbuffered i/o.  If you choose the other option (optimize for Mac OS application performance) it writes directly to the disk, using less memory, but you run the risk of hitting that nasty OS X bug.

Next, I’ll start poking around and see what files I can recover.  Then I’ll revert to the prior snapshot and see if the “My Documents” folder is still intact there.  I should be able to get back most of my older document files.  Anything else that was lost, well, what can I say?  Ah, the cost of complacency. 

JFH

 


About Me

John Holliday

Independent author, consultant, trainer, and software developer specializing in enterprise content management, collaboration, workflow and business process automation.

MVP Profile SharePoint training for developers and administrators

Developer Resources

  • Fields WSS XSLT - Custom XSLT stylesheet that displays the default SharePoint column definitions in a table.
  • Custom Action Identifiers - A sortable table of default field definitions, including CAML declarations for writing content types.
  • CAML.NET Documentation - Online documentation for the CAML.NET class library.