Archive for August, 2010

While I’ll need to cover this in more detail later, I did want to quickly explain how you can insert your own content controls. The first thing you’ll need to do is make sure that you have the “Developer” tab showing in the ribbon. You can do this by going to File -> Word Options, and under the view settings choose “Developer Tools”:

Developer Tools

Now, click on the “Developer” tab, and you’ll see a chunk called “content controls”.

Developer Tools

With this, you can insert new content controls, as well as modify the properties of existing ones. Go ahead and play around with that a bit, and I’ll post some more information later on ways to work with the controls. Some of the other topics I’ll try to cover in the future in this area are:

  1. Using XML mapping and schema to drive the content for drop down controls. If you have a schema restriction, we can automatically use those restrictions to populate the dropdown list.
  2. Using locking and groups to structure the document.
  3. Using building blocks to generate rich structured document fragments that can be easily inserted into a document and automatically bind to the custom XML already present.
  4. Binding content controls to document properties and SharePoint data. Have you ever had a document library in SharePoint and wanted the ability to map the column values directly into the content of the document? Well, now you can set it up so that if the values are changed in SharePoint they will be reflected directly in the document, and if they are changed in the document, they will be reflected in the SharePoint library.
  5. Programmatic access to the custom XML store. You can set up all the mappings with the content controls, and then just program directly against the XML data.
  6. Anytime the user changes the value of one of the controls, it’s automatically pushed back into the node it’s mapped to, and an event is thrown. If you make a change to a node programmatically, then any content control mapped to that node will be automatically updated.
  7. This allows you to write your solution directly against the data, instead of against Word’s objects.

Even without the XML mapping, the new set of features in Word called content controls makes it much easier to structure a rich Word solution. Go ahead and open the original document you’d downloaded again. Notice that in the 2nd paragraph, you can only edit within specific regions. In that 2nd paragraph, there are a number of “content controls”, and then the entire paragraph has been “grouped”. By grouping the 2nd paragraph when I created the document, I made it so that the look and boilerplate text couldn’t be changed, and instead only the content of the controls could be edited. Some of the controls are just plain text, but notice that there are other types of controls as well. The date, for example, has a calendar control that will drop down:

Developer Tools

There are a number of available content controls:

  1. Plain Text – The name is somewhat misleading. This control will take on the formatting that is applied to it while in design mode, so the template author can set up the look, and the end user can only edit the contents.
  2. Picture – This control can only contain a picture. When the user clicks on it, the “insert picture” dialog appears.
  3. Drop Down List – This one behaves similarly to the plain text control, since you can first set up what formatting you want applied, but in addition, you can also specify a list of values that the user is allowed to choose from.
  4. Calendar – The user will be given a calendar control to pick the date. You have a number of options here for how the date is formatted (M/d/yyyy; dddd, MMMM dd, yyyy; etc.).
  5. Combo Box – Just like a Drop Down List, except that the user can type in their own values as well as choose from a list you define.
  6. Rich Text – Behaves just like any other text in Word.
  7. Building Blocks – This is another new feature that I’ll talk about later since it really deserves its own post(s).

These new controls, and the new “grouping” functionality, make it really easy to design a template where you have some structured islands of information you want the user to fill out. Each control has its own independent settings as to whether it’s editable and whether or not it can be deleted. You can also specify placeholder text to be displayed when the control is empty.

If you are building a solution, the controls are also really helpful because they can be given unique names that you can use to easily address them in the Object Model. That also makes it really easy to get at them in the file format, since each control will be marked with XML structure. The part that I find most exciting about the controls, though, is that you can map these controls to XML nodes in your own schema as we saw in this example.

I hope everyone had a great new year. Sorry I’ve taken so much time off from blogging. I was pretty busy last week just getting caught up on e-mail. For those of you who posted comments, or sent comments to me directly, I’ll try to get to them all (sorry it’s taking so long). Last month was such a busy month with all the traveling for our work in Ecma and family time for the holidays that I quickly fell behind. Beta 1 of Office has been out for a couple months now, and I haven’t posted much content to help people use some of the new XML functionality in Office 12. Today, I want to post an example Word document that leverages the new storage we provide for custom XML and the integration of that XML store with a new feature called content controls. Anyone who has Beta 1 should be able to try this out.

There were a large number of scenarios we looked at when we first started our move towards strong custom XML support back towards the end of Office XP. Some of them were around making document generation much easier and more reliable. Other scenarios were around making the Office documents integrate in richer ways with business processes. There were a number of different exciting scenarios here, but this first example I’m going to show is really more around document generation. We often see people use the mail merge functionality in Word for more than just creating letters. It allows you to import data to create a document driven by that data. We’ve also seen people do this in Word 2003 using the XML file format in combination with XSLTs. We had been a bit naive in thinking that there would be a lot of folks out there building XSLTs for transforming their data into a rich Word document. There are plenty of people willing to do this, but it’s a lot of work, and often too advanced for the majority of people trying to build a solution.

Generate a rich document based on Custom XML without an XSLT!

I have an example I’ve demo’d at a number of conferences that I wanted everyone to get a chance to play around with. If you grab this ZIP file: http://jonesxml.com/resources/xmlMapping1.zip you’ll see a Word document and an XML file called “item1.xml”. Go ahead and open the Word document in Beta 1 and take a look. I have a couple things I’d like you to try:

  1. Close down the file in Word, and make a copy of the Word document. For the new copy, rename the extension of the file to “.zip”.
  2. Crack open the file and navigate to the “customXML” folder. Notice the part called “item1.xml”. If you open that you’ll see an XML file with a number of custom XML tags that I created, but they are all empty.
  3. Open the “item1.xml” file that was in the original ZIP file you downloaded. Notice that it’s in the same namespace as the XML file you looked at in step 2, but it has values for each XML node.
  4. Delete the item1.xml file from the Word document from step 2, and replace it with the extra one from step 3.
  5. Now, change the extension of the Word document back to .docx and open it in Word. Notice that the document now has all the values from that new item1.xml file displayed directly inline in the Word document (you can open the original Word document as well if you want to compare the differences).
  6. Make some changes to those values and save the file again. Change the extension back to ZIP and go to the “item1.xml” part again and you’ll see that the XML file has the updated values based on the changes you made.
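
If you would rather not rename files by hand, the swap in steps 1 through 5 can be scripted. Here is a minimal sketch in Python, assuming the package is an ordinary ZIP file and that you already know the exact name of the part you want to replace. The function name `replace_part` is just something I made up for illustration, and the part name is passed in rather than hard-coded because the folder may be cased differently inside your package.

```python
import io
import zipfile


def replace_part(package_bytes: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of a ZIP-based package with one part's bytes swapped out.

    part_name must match the entry name inside the ZIP exactly (names are
    case sensitive), for example the custom XML part from step 2.
    """
    source = zipfile.ZipFile(io.BytesIO(package_bytes))
    if part_name not in source.namelist():
        raise KeyError(f"no part named {part_name!r} in package")

    out_buffer = io.BytesIO()
    with source, zipfile.ZipFile(out_buffer, "w", zipfile.ZIP_DEFLATED) as target:
        # Copy every part across in its original order, substituting the
        # new bytes only for the part being replaced.
        for info in source.infolist():
            if info.filename == part_name:
                data = new_content
            else:
                data = source.read(info.filename)
            target.writestr(info.filename, data)
    return out_buffer.getvalue()
```

To use it, read the .docx file and the populated item1.xml as bytes, call `replace_part` with the name of the custom XML part, and write the result out as a new .docx file. Nothing in the sketch is specific to Word, so the same approach should work for any part in any package.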

This new functionality leverages a couple of new features. Content controls, the custom XML store, and the ability to map the content controls to nodes in the custom XML store all combine to give you this powerful separation of data and view.

We knew a long time ago that customers and the development community would ask what they could do with the new Office XML formats since they are specifically designed to address scenarios that go beyond the desktop. That is why we decided to take an open and royalty-free approach almost two years ago when we launched Office 2003. There has been a lot of back and forth in this blog on whether we went far enough and whether our motives are pure. It is sort of fun to question motives and pick apart licenses (personally, I’d rather be talking about the design of the formats), but I can tell you that our intent is to make the formats useful to customers and the development community. If we wanted to create a bunch of “gotchas” to trip people up, I think we could have done a better job.

A side benefit of this move is that now that we are creating a new format, we can do a lot of the other things our customers have wanted us to do within the binary formats for the past few releases (which we weren’t able to do since we didn’t want to break compatibility). Improved robustness, file size, and new features are all added side benefits. I already mentioned how Excel is now able to increase the limits on the number of rows and columns as well as other limitations they had when confined to the existing binary formats. We’ve also found that using ZIP and XML leads to a significantly more robust file. I’ve given demos where I delete whole blocks of bits from the files and we’re still able to recover the remainder of the content. We see so many benefits to this new format, we often forget to mention all the best parts.

We’ve been fortunate to get a lot of great support from the public sector for our work. We’ve been working for many years now with governments to understand their needs with XML and they understand what we’ve been doing and our commitment to being open.

Hope everyone has a great weekend. I’m going to the Seahawk game, and so hopefully I’ll have an extremely good weekend. One more Seahawk win and we’re in the SuperBowl!

Font Information

Often, you can’t rely on a specific font being on a user’s machine. In order to make sure a document being passed around still looks good on a machine that doesn’t have a font used in the document, additional information can also be stored in the document. The two ways that is done are via the font embedding functionality and the font type data that we write out. The font type data specifies characteristics of the font which are used to find a suitable replacement when the specified font is unavailable.

Document Settings

All settings that are pertinent to the document are stored in separate parts within the document package. The settings can really be divided into two groups: those that affect presentation, and those that are just pure application settings. The settings that affect presentation are things like compatibility options (e.g. layout tables like Word 97), as well as web settings such as div behaviors or frameset data. The pure application settings are things like view or zoom state. They may affect how the document appears within the application, but not the actual layout of the document.

Story Content

So, let’s get back into the concept of “stories” serving as the main building blocks of the document. Within each story, there is the actual content, which consists of block level structures:

  • Paragraphs
  • Tables
  • Structured Document Tags (custom XML; smartTags; content controls)
  • Range Permissions

And within each paragraph, there is a collection of inline structures:

  • Runs
  • Structured Document Tags (same as at the block level)
  • Comments, tracked changes, bookmarks
  • Drawings
  • Fields
  • Hyperlinks

There are a few basic structural rules that are in play here. First, all text in a word-processing document is contained within a run. A run is a region of text with a common set of properties. The second rule is that all runs must be contained within a paragraph. A paragraph, of course, is a collection of one or more runs that is displayed as a unit (this is analogous to the HTML <p> tag). So let’s look at an example. The following text: The quick brown fox.
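
To make the run and paragraph rules concrete, here is a rough sketch in Python that builds one paragraph out of three runs. It assumes, purely for illustration, that the word “quick” is bold, so the text has to be split wherever the properties change. The element names (`w:p`, `w:r`, `w:rPr`, `w:b`, `w:t`) and the namespace URI are the ones from the published WordprocessingML standard, which may not match what a beta build writes, and the helper names `make_run` and `make_paragraph` are made up.

```python
import xml.etree.ElementTree as ET

# Namespace from the published standard (an assumption for beta-era files).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def make_run(text: str, bold: bool = False) -> ET.Element:
    """Build a run: a region of text sharing one set of properties."""
    run = ET.Element(f"{{{W}}}r")
    if bold:
        props = ET.SubElement(run, f"{{{W}}}rPr")
        ET.SubElement(props, f"{{{W}}}b")
    text_node = ET.SubElement(run, f"{{{W}}}t")
    text_node.text = text
    if text != text.strip():
        # Keep leading or trailing spaces from being dropped by consumers.
        text_node.set(XML_SPACE, "preserve")
    return run


def make_paragraph(runs) -> ET.Element:
    """Build a paragraph: every run must live inside one."""
    paragraph = ET.Element(f"{{{W}}}p")
    paragraph.extend(runs)
    return paragraph


paragraph = make_paragraph([
    make_run("The "),
    make_run("quick", bold=True),  # hypothetical formatting for the example
    make_run(" brown fox."),
])
print(ET.tostring(paragraph, encoding="unicode"))
```

The point of the sketch is the shape: one paragraph, several runs, and a new run every time the formatting changes.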

 

If you look at the above diagram, you’ll see that the first type applied is the Table style type. This will affect Tables, Paragraphs, and Characters (or runs) within that paragraph. The next level is the List style type. This affects the paragraph properties. A list style can also bring in a paragraph style, but that’s a bit more complexity than I want to get into today. Paragraph and then Character styles are the next two applied, and the final piece is direct formatting, which will override everything else. That’s why folks involved in more complex documents like to avoid direct formatting if at all possible, since you can then manage the styles, and don’t have to worry about direct formatting overriding those styles.
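
The application order described above can be sketched in a few lines of Python. This is a simplification under stated assumptions: each layer is modeled as a flat dictionary of property names to values, the property names are invented for the example, and a later layer can only add or replace a property (the real rules also let a later style remove one).

```python
# Order in which formatting layers are applied; later layers win.
APPLICATION_ORDER = ["table", "list", "paragraph", "character", "direct"]


def resolve_formatting(layers: dict) -> dict:
    """Merge formatting layers in application order.

    `layers` maps a layer name from APPLICATION_ORDER to a dict of
    properties. Missing layers are simply skipped.
    """
    effective = {}
    for layer_name in APPLICATION_ORDER:
        effective.update(layers.get(layer_name, {}))
    return effective


example = resolve_formatting({
    "table": {"font": "Arial", "size_pt": 10},
    "paragraph": {"size_pt": 12},
    "direct": {"size_pt": 14, "bold": True},
})
print(example)
```

In the example, the direct formatting’s size wins over both the table style and the paragraph style, which is exactly why direct formatting makes style-managed documents harder to maintain.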

Now let’s talk about this at the XML level, and how a style is applied. The properties of the style are contained in the style definitions:

And the paragraph then just references the style via the style ID:

Bullets and Numbering

Although it’s not always obvious, any bullet/numbering definition consists of nine levels, each of which has Paragraph properties (e.g. margins) and Item properties (e.g. bullet vs. numbering, numbering type, etc.) defined. The behavior of the numbering is specified in two parts: the Bullets & numbering definition, and then the actual Bullets & numbering instance, which is a specific instance of a given definition.

The Bullets and numbering definition specifies the properties for any or all of the nine levels. The instance then specifies the properties for a specific numbering instance: its inheritance, which is a reference to a definition, and then any additional overrides for one or more levels.

Let’s get into an example of how this would look in XML. Here is what a numbering definition looks like:

 

Then, after the numbering definition, there is a numbering instance that references the definition, and itself has an ID.

And the paragraph then just references the numbering instance via the list property settings.
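
The chain from paragraph to instance to definition can be followed with a short Python sketch. The sample markup below is hand-written for illustration and uses the element names and namespace from the published standard (`w:abstractNum` for the definition, `w:num` for the instance, `w:numPr` on the paragraph), which may differ from what the beta emits. The function name `number_format_for` is made up, and level overrides on the instance are ignored to keep things short.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written sample: one definition with two of its nine levels,
# and one instance that points at it.
NUMBERING_XML = f"""
<w:numbering xmlns:w="{W}">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0"><w:numFmt w:val="decimal"/></w:lvl>
    <w:lvl w:ilvl="1"><w:numFmt w:val="lowerLetter"/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="1">
    <w:abstractNumId w:val="0"/>
  </w:num>
</w:numbering>
"""

# A paragraph that references instance 1 at the second level.
PARAGRAPH_XML = f"""
<w:p xmlns:w="{W}">
  <w:pPr><w:numPr><w:ilvl w:val="1"/><w:numId w:val="1"/></w:numPr></w:pPr>
  <w:r><w:t>List item</w:t></w:r>
</w:p>
"""


def w_attr(name: str) -> str:
    """Qualified attribute name in the w: namespace."""
    return f"{{{W}}}{name}"


def number_format_for(paragraph: ET.Element, numbering: ET.Element):
    """Follow paragraph -> instance -> definition and return the level's format."""
    num_pr = paragraph.find("w:pPr/w:numPr", NS)
    if num_pr is None:
        return None  # not a numbered paragraph
    num_id = num_pr.find("w:numId", NS).get(w_attr("val"))
    level = num_pr.find("w:ilvl", NS).get(w_attr("val"))

    # Step 1: find the instance the paragraph points at.
    abstract_id = None
    for instance in numbering.findall("w:num", NS):
        if instance.get(w_attr("numId")) == num_id:
            abstract_id = instance.find("w:abstractNumId", NS).get(w_attr("val"))
            break
    if abstract_id is None:
        return None

    # Step 2: find the definition the instance points at, then the level.
    for definition in numbering.findall("w:abstractNum", NS):
        if definition.get(w_attr("abstractNumId")) != abstract_id:
            continue
        for lvl in definition.findall("w:lvl", NS):
            if lvl.get(w_attr("ilvl")) == level:
                return lvl.find("w:numFmt", NS).get(w_attr("val"))
    return None


numbering = ET.fromstring(NUMBERING_XML)
paragraph = ET.fromstring(PARAGRAPH_XML)
print(number_format_for(paragraph, numbering))
```

A fuller version would check the instance for per-level overrides before falling back to the definition.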

Now that folks have had a chance to work with Beta 1 for a few months, I wanted to take some time to give a high level overview of the three different document formats. Today I’m going to focus on Word. Obviously there is a huge set of features and functionality in Word, and I won’t really be able to do much more than just scratch the surface today (but hopefully this will be a good start).

Document

There are a large number of pieces of information that we use to construct a Word document. If you want to just focus on the pieces that actually provide the content for the document, though, then you can break it out into a collection of multiple subdocuments. We call those subdocuments “stories”, and there are 6 top level stories that make up a document:

  • The main story – this is the core body of the document, and is really the only one that’s required to make a document.
  • Headers & Footers – There can be one or more of these, and they are tied to a section.
  • Footnotes & Endnotes – Anchors for the footnotes and endnotes live in the body, but the actual content is stored separately.
  • Subdocuments – There is a feature that allows for the document to be broken out into a collection of subdocuments.
  • Frames
  • Comments

Once you have the collection of stories, you then focus on the other parts of the file that help specify all the properties that should be used for those stories (e.g. layout; formatting; etc.). For the most part, all the stories in a document share a common set of properties. These properties are contained within:

  • Style information
  • Bullets and numbering information
  • Font information
  • Document settings

Style Information

A style defines a specific set of formatting properties that can then be referenced by a content object. A great example of a style would be the “Normal” paragraph style, which in Word 2003 is defined as having the following properties: Font = Times New Roman; Font Size = 12 point; Justification = Left; Line Spacing = Single.

Word supports five different style types:

  • Paragraph Styles
  • Character Styles
  • Linked Styles (both paragraph and character)
  • Table Styles
  • List Styles

Style cascading (or inheritance) is a fairly important and complex area. Multiple style types can be applied to the same part of a file, so the properties must be applied in a specific order. It’s possible for a property set by one style type to actually be removed or supplemented by other style types that follow it.

Styles of any given type can also inherit from other styles of that type. For example, the Heading 1 paragraph style is based on (and inherits from) the Normal paragraph style.
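
Here is a small Python sketch of that “based on” inheritance. The Normal properties are the Word 2003 values quoted earlier. The Heading 1 values, the dictionary layout, and the function name `effective_properties` are assumptions made up for the example, not the actual definitions Word ships with.

```python
# Each style lists the style it is based on plus only its own properties.
STYLES = {
    "Normal": {
        "based_on": None,
        "props": {
            "font": "Times New Roman",
            "size_pt": 12,
            "justification": "left",
            "line_spacing": "single",
        },
    },
    "Heading1": {
        "based_on": "Normal",
        # Hypothetical values, for illustration only.
        "props": {"size_pt": 16, "bold": True},
    },
}


def effective_properties(style_id: str, styles: dict) -> dict:
    """Walk the based-on chain and merge properties, base style first."""
    chain = []
    seen = set()
    current = style_id
    while current is not None:
        if current in seen:
            raise ValueError(f"circular style inheritance at {current!r}")
        seen.add(current)
        chain.append(styles[current])
        current = styles[current]["based_on"]

    merged = {}
    for style in reversed(chain):  # base first, most derived last
        merged.update(style["props"])
    return merged


print(effective_properties("Heading1", STYLES))
```

Heading 1 only has to state what makes it different. Everything else, such as the font and the justification, flows through from Normal.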

Here is a diagram that shows a simple view of how style information is applied. There are some additional complexities not outlined here, but this covers most cases.

Here’s another example of folks getting ready to take advantage of the Open XML formats for their business solutions. The newly announced BioIT alliance (http://www.medadnews.com/News/Index.cfm?articleid=328651) was formed to help connect the pharmaceutical, biotechnology, hardware and software industries. As you can imagine, Open XML formats can play a huge role here. Check out this quote:

“Through the BioIT Alliance, we are working closely with Microsoft to increase data access across our instrument systems and data analysis software tools using Ecma Open Office XML,” said Catherine M. Burzik, president of Applied Biosystems. “This format enables life science companies to access data using the familiar Microsoft Office Excel(R) interface, providing them with the insight they need to make decisions more quickly.”

As I said, this is yet another example of how these new Open XML formats really change the game when it comes to interacting with Office applications. In this case, they can use the SpreadsheetML format to automatically generate data in a much richer, interactive format with no vendor lock-in or worry about long term archivability. This particular organization was founded by the following members: Accelrys Software, Affymetrix, Amylin Pharmaceuticals, Applied Biosystems and The Scripps Research Institute.

I’m expecting that after Beta 2 ships, we’ll see more and more of these examples up on the openxmldeveloper.org site.

Here is what the XML for “presentation.xml” looks like:

<p:presentation xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/3/main">
    <p:sldMasterIdLst>
        <p:sldMasterId r:id="rId1"/>
    </p:sldMasterIdLst>
    <p:notesMasterIdLst>
        <p:notesMasterId r:id="rId5"/>
    </p:notesMasterIdLst>
    <p:handoutMasterIdLst>
        <p:handoutMasterId r:id="rId6"/>
    </p:handoutMasterIdLst>
    <p:sldIdLst>
        <p:sldId id="256" r:id="rId2"/>
        <p:sldId id="257" r:id="rId3"/>
        <p:sldId id="258" r:id="rId4"/>
    </p:sldIdLst>
    <p:sldSz cx="9144000" cy="6858000" type="screen"/>
    <p:notesSz cx="6858000" cy="9144000"/>
</p:presentation>

And the relationship file for the presentation.xml part looks like this:

<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
    <Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/viewProps" Target="viewProps.xml"/>
    <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" Target="slides/slide2.xml"/>
    <Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/presProps" Target="presProps.xml"/>
    <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" Target="slides/slide1.xml"/>
    <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideMaster" Target="slideMasters/slideMaster1.xml"/>
    <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/handoutMaster" Target="handoutMasters/handoutMaster1.xml"/>
    <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/notesMaster" Target="notesMasters/notesMaster1.xml"/>
    <Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/tableStyles" Target="tableStyles.xml"/>
    <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" Target="slides/slide3.xml"/>
    <Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml"/>
</Relationships>

Now, if you want to reorder the slides, you can either modify the relationship file, or modify the presentation.xml file. Let’s leave the rels file alone, and instead just change the order in the presentation.xml file. Modify it so that it now looks like this:

<p:presentation xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/3/main">
    <p:sldMasterIdLst>
        <p:sldMasterId r:id="rId1"/>
    </p:sldMasterIdLst>
    <p:notesMasterIdLst>
        <p:notesMasterId r:id="rId5"/>
    </p:notesMasterIdLst>
    <p:handoutMasterIdLst>
        <p:handoutMasterId r:id="rId6"/>
    </p:handoutMasterIdLst>
    <p:sldIdLst>

        <p:sldId id="258" r:id="rId4"/>
        <p:sldId id="257" r:id="rId3"/>
        <p:sldId id="256" r:id="rId2"/>

    </p:sldIdLst>
    <p:sldSz cx="9144000" cy="6858000" type="screen"/>
    <p:notesSz cx="6858000" cy="9144000"/>
</p:presentation>

Now, if you open the .pptx file, you should see that we’ve simply reversed the order of the slides. It’s a very basic example, but I think it serves as a pretty good first look at the structure of PresentationML.
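
The same edit can be made in code instead of by hand. Here is a minimal Python sketch that reverses the children of the slide list in a presentation part. It assumes the namespace URIs shown in the sample above (they are parameters worth checking against your own file), and the function name `reverse_slide_order` is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespaces as they appear in the sample presentation.xml above.
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/3/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def reverse_slide_order(presentation_xml: str, p_ns: str = P_NS) -> str:
    """Return the presentation part with the entries in sldIdLst reversed."""
    # Keep the familiar p: and r: prefixes when serializing.
    ET.register_namespace("p", p_ns)
    ET.register_namespace("r", R_NS)

    root = ET.fromstring(presentation_xml)
    slide_list = root.find(f"{{{p_ns}}}sldIdLst")
    if slide_list is None:
        raise ValueError("presentation part has no sldIdLst element")

    slides = list(slide_list)
    for slide in slides:
        slide_list.remove(slide)
    for slide in reversed(slides):
        slide_list.append(slide)
    return ET.tostring(root, encoding="unicode")
```

Only the order of the entries in the slide list changes. The relationship IDs stay attached to the same slide parts, which is why the rels file can be left alone.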

Go ahead and play around with that a bit, and let me know if you have any questions. The next concept I’ll cover is the slide content.

The primary start part or root node of a presentation is usually called “presentation.xml”, although if you are familiar with the open packaging conventions, you know that the part name is not significant, and instead it’s the relationships and content types that really determine how the file is interpreted.

The presentation part contains information about the presentation itself. It contains the following structural information:

  • Slide lists ( e.g., slides, masters, IDs, custom shows, etc. ) – While the contents for the various slides are stored in separate parts, the actual ordering information for the slides is stored in the presentation part.
  • Slide sizes (note that this applies to all slides).

In addition to the structural information, the presentation part also contains the following properties:

  • Text Properties ( e.g., embedded font list, Kinsoku settings, etc. )
  • Save Properties ( e.g., flags for embedding fonts, compressing pictures, etc. )
  • Editor Properties ( e.g., flags for using Right-to-Left mode, etc. )
  • Content Properties ( e.g., first slide number for footers, etc. ).

Example of editing the “presentation” part

Here’s a quick example of how you can modify the “presentation” part to change the order of your slides. Grab the following basic PresentationML file (*note* this file will work with Beta 1 technical refresh, and should also work with Beta 2 when that comes out):  http://jonesxml.com/labs/presentationML1/SlideReorder.pptx

(If you don’t have a copy of the beta, here is an equivalent file in the old binary format so you can see what it would look like when opened: http://jonesxml.com/labs/presentationML1/SlideReorder.ppt)

Let’s crack the SlideReorder.pptx file open and take a look at what’s inside:

  1. You can use any number of methods to get to the start part, but for simplicity’s sake, let’s just add a “.zip” to the end of the file name and open it using a ZIP tool (I’m just using the Windows shell).
  2. Navigate to the “ppt/presentation.xml” part, which is the start part (the way you tell the start part, of course, is by opening the “_rels/.rels” part, and from there you’ll see a pointer to the presentation part).
  3. If you are using a ZIP tool that lets you directly edit the files within, open the “presentation.xml” part in an XML editor or text editor (if you can’t edit it directly, just copy it out, and then make the edits).
  4. I prefer using an XML editor that lets you “pretty print” the files, otherwise they are a bit difficult to read through (see this post for more info on that).
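
Step 2 can also be done in code: rather than assuming the start part is named “ppt/presentation.xml”, you can read “_rels/.rels” and follow the pointer. Below is a rough Python sketch. It assumes the package relationships namespace shown earlier and that the start part’s relationship type ends in “/officeDocument”, which is true of the published standard but worth verifying against a beta file. The function name `find_start_part` is made up.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

PKG_RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Assumed suffix of the relationship type that marks the start part.
START_PART_TYPE_SUFFIX = "/officeDocument"


def find_start_part(package_bytes: bytes):
    """Return the name of the start part by reading _rels/.rels, or None."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        rels_root = ET.fromstring(package.read("_rels/.rels"))

    for relationship in rels_root.findall(f"{{{PKG_RELS_NS}}}Relationship"):
        if relationship.get("Type", "").endswith(START_PART_TYPE_SUFFIX):
            # Targets in the package-level rels are relative to the root.
            return relationship.get("Target").lstrip("/")
    return None
```

This is the programmatic version of the point made above about part names: the code never looks for a file called presentation.xml, it only follows the relationship.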