Extract text from a metafile in C#


The example “Enumerate the records in a metafile in C#” shows how to list the records in a metafile. This example also lists the records, but it looks for those that represent DrawString commands and extracts their text.

The following code shows the callback method executed for each metafile record.

// Look for string records.
private bool RecordCallback(EmfPlusRecordType record_type,
    int flags, int data_size, IntPtr data,
    PlayRecordCallback callback_data)
{
    // See if it is text.
    if (record_type == EmfPlusRecordType.DrawString)
    {
        // Copy the unmanaged record data.
        byte[] data_array = new byte[data_size];
        Marshal.Copy(data, data_array, 0, data_size);

        // See how many characters are in the string.
        int num_chars = BitConverter.ToInt32(data_array, 8);

        // Get the characters.
        string txt = Encoding.Unicode.GetString(
            data_array, 28, 2 * num_chars);

        lstStrings.Items.Add(txt);
    }

    // Continue the enumeration.
    return true;
}

If the record represents a DrawString command, the code makes an array and copies the command’s data into it. Bytes 8 through 11 of the data hold the length of the string drawn by the command, measured in characters, so the code uses BitConverter.ToInt32 to convert those four bytes into an integer.
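
To try the length lookup without a real metafile, here is a minimal sketch that builds a made-up record payload by hand and reads the character count from offset 8. The DrawStringLength class and its two methods are hypothetical names, not part of the example program, and the payload bytes are invented for the demonstration.

```csharp
using System;

public static class DrawStringLength
{
    // Offset of the character count within the record data,
    // as described in the text above.
    private const int LengthOffset = 8;

    // Hypothetical helper: returns the character count, or -1 if
    // the buffer is too short to contain one.
    public static int ReadCharCount(byte[] data)
    {
        if (data == null || data.Length < LengthOffset + sizeof(int))
            return -1;

        // BitConverter reads in the machine's byte order, which is
        // little-endian on typical Windows hardware.
        return BitConverter.ToInt32(data, LengthOffset);
    }

    // Builds an invented 28-byte payload whose length field is 5.
    public static byte[] MakeSamplePayload()
    {
        byte[] data = new byte[28];
        BitConverter.GetBytes(5).CopyTo(data, LengthOffset);
        return data;
    }
}
```

In the callback, data_array would take the place of the sample payload. The extra size check is an assumption about defensive style; the original code trusts the record to be long enough.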

The characters in the string are stored as Unicode starting at byte 28 of the data. The code uses System.Text.Encoding.Unicode.GetString to get the string. The parameters to that method give the array of bytes, the index of the first byte in the string (28), and the number of bytes in the string. Because this is Unicode, each character takes up 2 bytes, so the total number of bytes is 2 times the number of characters.
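
The decoding step can be sketched on its own in the same way. The code below is only an illustration: the DrawStringText class, Extract, and MakeSamplePayload are hypothetical names, the sample payload is invented, and the range checks are an added precaution that the original callback does not perform.

```csharp
using System;
using System.Text;

public static class DrawStringText
{
    private const int LengthOffset = 8;   // character count
    private const int TextOffset = 28;    // first byte of the string

    // Hypothetical helper: returns the text, or null if the data
    // is too short for the length it claims.
    public static string Extract(byte[] data)
    {
        if (data == null || data.Length < TextOffset) return null;

        int numChars = BitConverter.ToInt32(data, LengthOffset);
        if (numChars < 0) return null;

        // Each UTF-16 code unit takes 2 bytes.
        long numBytes = 2L * numChars;
        if (numBytes > data.Length - TextOffset) return null;

        return Encoding.Unicode.GetString(data, TextOffset, (int)numBytes);
    }

    // Builds an invented payload containing the given text.
    public static byte[] MakeSamplePayload(string text)
    {
        byte[] chars = Encoding.Unicode.GetBytes(text);
        byte[] data = new byte[TextOffset + chars.Length];
        BitConverter.GetBytes(text.Length).CopyTo(data, LengthOffset);
        chars.CopyTo(data, TextOffset);
        return data;
    }
}
```

For example, DrawStringText.Extract(DrawStringText.MakeSamplePayload("Hello")) gives back "Hello", while a payload cut off partway through the string gives null instead of throwing.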

After it gets the string, the example adds it to the form’s ListBox.

Unfortunately, DrawString is not the only command that can produce text. A metafile can also use EmfExtTextOutA and EmfExtTextOutW records to produce text. I haven’t yet figured out how to dig the text out of their data. Hopefully I’ll have a chance at some point. Stay tuned…



2 Responses to Extract text from a metafile in C#

  1. Gevy says:

    eg:
    EmrText
    A8 00 00 00 38 00 00 00 02 00 00 00 4C 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF 50 00 00 00 17 00 13 00 10 00 00 00 00 00 00 00
    and its OutputString is “40”, but its code is “17 00 13 00”.
    Is it Unicode?
