We get up early so that you don't have to.
So, Microsoft has announced that Office 12 will use XML as its default file format, and everyone is swooning. You know, XML is like secret sauce, or enzymes: it makes everything it touches just oh-so-much better. Well, let's dig into this a bit further, OK? Facts first, then I'll shoot my mouth off for a bit. I'm sure you can't wait.
The basic announcement is that Word, Excel, and PowerPoint are getting new default file formats that will be XML-based. These are not the existing WordML and SpreadsheetML schemas, but brand new animals, which are being pitched as having better security and dramatically reduced file sizes. A big plus is that there will be free updates to Office 2000, Office XP, and Office 2003 to work with the new file formats. On the legal front, "Microsoft Office Open XML Formats are fully documented file formats with a royalty-free license."
One interesting sign of the times is that the file formats weren't just announced by Microsoft's PR team. Pretty much simultaneously with the official announcement, Brian Jones, a Program Manager on the Word team, launched a blog to discuss nothing but the new file formats. It's great to see the Office team adopting some of the openness and transparency we've grown used to from the Visual Studio team.
Boy howdy, is there a lot of positioning going on here or what? Start with the name: "Microsoft Office Open XML Formats". Remember the (probably apocryphal) story of Abe Lincoln asking how many legs a dog has if you call a tail a leg? Four, goes the answer: calling a tail a leg doesn't make it one. In the same way, you don't make a file format open just by tacking the word "Open" onto its name, especially when it comes with a license that is pretty much opaque if you're not an intellectual property lawyer, but that seems very likely to prohibit the use of these schemas in certain open-source projects.
I worry about the smaller file sizes business, too. Are smaller file sizes really all that relevant in these days of rapidly expanding bandwidth and disk storage? When was the last time you ran out of disk space because of the size of your Word documents and Excel spreadsheets? (Now, the size of your Visual Studio install, that's another story.) It looks to me like the smaller size comes at a cost: "The smaller file sizes are enabled by a combination of industry-standard ZIP compressed files technology that automatically compresses each component within the file as well as the reduced overhead of an XML format." So, you don't actually have a pure XML file to work with; you have a zip file containing a batch of XML stuff. This raises the bar in terms of the tools you need to actually interoperate with the new formats. It's not going to be as simple as just plugging them into the same XML tool stack you use for any other XML file format.
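To make that extra step concrete, here's a rough Python sketch of what "a zip file containing a batch of XML stuff" means for your tooling. Everything specific in it is my own invention for illustration: the part name `content/document.xml`, the `<doc>` and `<p>` elements, and the helper function names. The announcement doesn't spell out the internal layout, so don't read this as the actual Office 12 structure. The only point is that the XML parser can't touch the file until something has unzipped it first.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_package():
    """Build a tiny ZIP in memory that stands in for a zipped XML document."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Hypothetical part name and markup, made up for this sketch only.
        zf.writestr("content/document.xml", "<doc><p>Hello</p><p>World</p></doc>")
    return buf.getvalue()


def read_paragraphs(package, part_name):
    """Unzip one named part, then hand it to an ordinary XML parser."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        xml_bytes = zf.read(part_name)
    root = ET.fromstring(xml_bytes)
    return [p.text or "" for p in root.iter("p")]


def parse_directly(package):
    """Return True if a plain XML parser accepts the raw package bytes."""
    try:
        ET.fromstring(package)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    pkg = build_sample_package()
    print(read_paragraphs(pkg, "content/document.xml"))
    print(parse_directly(pkg))
```

Feed the raw bytes straight to the parser and it chokes, because what it sees is a ZIP archive, not markup. Any tool in your XML stack that expects to be pointed at a plain file needs this unwrapping done for it first.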
People are pitching this as "at last, all of Office is covered by XML" when in reality only PowerPoint is joining Word and Excel at the XML party. Hello? There are other Office applications, you know. Ever hear of Access? Outlook? OneNote? Publisher? If Microsoft is going to crow about how they're moving to XML for the Office 12 default, I wish they'd really make it the default for Office 12, not just for the three applications that they want to talk about.
So what's the bottom line? Well, XML is nice, sure, but honestly, most users will never care. I don't care if you save my document in XML, Braille, cuneiform, or chicken scratchings, so long as you can get it back when I ask for it. Some tool vendors, who can puzzle through the license and handle the legalities, will find their jobs marginally easier. Microsoft will have an additional flag to wave at confused legislators the next time open-source advocates argue for a switch to non-proprietary software ("Look, even the name says we're already open!"). All in all, I see this as a nice PR move, and an interesting technical achievement, but not of any particular significance to most users of Office.
Mike Gunderloy is the lead developer for Larkware and author of numerous books and articles on programming topics.