List strings with a particular style in a Microsoft Word document in C#


I’m writing a new book and the development editor wants me to make a list of key terms that I specifically want included in the index. I want to include all of the words marked with the style KeyTerm, so I wrote this program to find them.

The program requires a reference to the Word object library. To add the reference in Visual Studio 2008, open Solution Explorer, right-click the References entry, and select Add Reference. Then go to the COM tab and double-click Microsoft Word 14.0 Object Library.

The program’s code includes the following using directive, which gives the interop namespace the shorter alias Word.

using Word = Microsoft.Office.Interop.Word;

The following method searches a Word document for words that have a particular style.

// Find words with the given style in the Word file.
private List<string> FindWordsWithStyle(string file_name,
    string word_style)
{
    // Get the Word application object.
    Word._Application word_app = new Word.ApplicationClass();

    // Keep Word hidden (set this to true to watch it work).
    word_app.Visible = false;

    // Open the file.
    object filename = file_name;
    object confirm_conversions = false;
    object read_only = true;
    object add_to_recent_files = false;
    object format = 0;
    object missing = System.Reflection.Missing.Value;

    Word._Document word_doc =
        word_app.Documents.Open(ref filename, ref confirm_conversions,
            ref read_only, ref add_to_recent_files,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref format, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing);

    // Search.
    List<string> result = new List<string>();
    object style = word_style;
    word_app.Selection.Find.ClearFormatting();
    word_app.Selection.Find.set_Style(ref style);
    object obj_true = true;
    for (;;)
    {
        word_app.Selection.Find.Execute(ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref obj_true,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing);
        if (!word_app.Selection.Find.Found) break;
        result.Add(word_app.Selection.Text);
    }

    // Close the document without prompting.
    object save_changes = false;
    word_doc.Close(ref save_changes, ref missing, ref missing);
    word_app.Quit(ref save_changes, ref missing, ref missing);

    // Return the result.
    return result;
}

The method takes as parameters the name of the file to search and the name of the style to locate. The method starts with the usual tasks required to work with Word documents. It creates a Word application object. It then makes some objects holding values that it will use because Word interop generally only works with objects passed by reference. The method then opens the Word document.
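One thing the method does not do is guarantee that Word shuts down if something throws between opening and closing the document. The following is a minimal sketch of one way to handle that, not part of the original example. The WordSession class and RunWithDocument method are hypothetical names, and the sketch assumes the same Word 14.0 interop reference and Open arguments used above.

```csharp
using System;
using System.Reflection;
using Word = Microsoft.Office.Interop.Word;

public static class WordSession
{
    // Hypothetical helper: open file_name read-only, run the caller's
    // work, and always close the document and quit Word afterward.
    public static void RunWithDocument(string file_name,
        Action<Word._Application, Word._Document> work)
    {
        object missing = Missing.Value;
        object save_changes = false;

        Word._Application word_app = new Word.ApplicationClass();
        Word._Document word_doc = null;
        try
        {
            // Same arguments as the method above.
            object filename = file_name;
            object confirm_conversions = false;
            object read_only = true;
            object add_to_recent_files = false;
            object format = 0;

            word_doc = word_app.Documents.Open(
                ref filename, ref confirm_conversions,
                ref read_only, ref add_to_recent_files,
                ref missing, ref missing, ref missing, ref missing,
                ref missing, ref format, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing);

            work(word_app, word_doc);
        }
        finally
        {
            // Runs even if Open or work throws, so a hidden
            // Word process is not left behind.
            if (word_doc != null)
                word_doc.Close(ref save_changes, ref missing, ref missing);
            word_app.Quit(ref save_changes, ref missing, ref missing);
        }
    }
}
```

With a helper like this, the search code would go inside the delegate passed as the work argument.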

Next, the method performs the search. To do that, it uses the Selection object’s Find object. It first clears any formatting criteria from the Find object and then calls its set_Style method to make the search look for the style named by the word_style parameter. It then enters an infinite loop.

Inside the loop, the code calls the Selection object’s Find.Execute method. The only argument that is not omitted in that call is Format, which indicates whether the search should take into account the formatting specified for the Find object. This example sets that argument to true to make the search look for text that has the desired style.

After calling Find.Execute, the code checks the Selection object’s Find.Found property to see if it found a piece of text with the desired style. If Find.Found is not true, the code breaks out of its loop.

If Find.Found is true, the code adds the text that it found to its result list and continues its loop.

Each time the code calls Find.Execute, that method searches for the next piece of text that matches the style.
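The loop above leaves the search direction and wrapping behavior to whatever Word currently has set. As far as I know, arguments omitted from Find.Execute fall back to the Find object's current property values, so a variation is to set those properties explicitly before looping. The following is a sketch under that assumption. The StyleSearchSketch class and CollectStyledText method are hypothetical names, and the code assumes a document is already open in word_app.

```csharp
using System.Collections.Generic;
using System.Reflection;
using Word = Microsoft.Office.Interop.Word;

public static class StyleSearchSketch
{
    // Hypothetical helper: collect each run of text with the given
    // style from the document that is active in word_app.
    public static List<string> CollectStyledText(
        Word._Application word_app, string word_style)
    {
        List<string> result = new List<string>();
        object style = word_style;
        object missing = Missing.Value;

        Word.Find find = word_app.Selection.Find;
        find.ClearFormatting();
        find.set_Style(ref style);
        find.Text = "";       // Match on formatting only.
        find.Forward = true;  // Search toward the end of the document.
        find.Format = true;   // Take the style into account.

        // Stop at the end of the document instead of wrapping
        // around to the start and finding the same text again.
        find.Wrap = Word.WdFindWrap.wdFindStop;

        // Execute returns true while it keeps finding matches.
        while (find.Execute(ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing))
        {
            result.Add(word_app.Selection.Text);
        }
        return result;
    }
}
```

Setting Wrap to wdFindStop is the main design choice here: it makes the loop's exit condition independent of any search settings Word may have remembered.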

After the code breaks out of its loop, it closes the Word document without saving changes, quits the Word application, and returns the strings that it found.
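For an index list, the raw strings may need some tidying, because the same term can be marked more than once and a selection can include stray whitespace or a paragraph mark. The following is a small sketch of one way to clean the list, assuming .NET 3.5 or later for LINQ. The KeyTermCleaner class and CleanTerms method are hypothetical names, not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyTermCleaner
{
    // Trim whitespace (including paragraph marks), drop empty
    // entries, remove duplicates ignoring case, and sort the rest.
    public static List<string> CleanTerms(IEnumerable<string> raw_terms)
    {
        return raw_terms
            .Select(term => term.Trim())
            .Where(term => term.Length > 0)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .OrderBy(term => term, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

For example, passing the list returned by FindWordsWithStyle to KeyTermCleaner.CleanTerms would produce a sorted list in which each term appears once.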

Download the example to see additional details such as how the program lets the user browse for the Word document and how it displays the result.






About RodStephens

Rod Stephens is a software consultant and author who has written more than 30 books and 250 magazine articles covering C#, Visual Basic, Visual Basic for Applications, Delphi, and Java.