PFF is short for Personal Folder File and is mainly used by Microsoft Outlook
to store e-mails, appointments, contacts, tasks, etc. This specification is
based on the work by libpst [SMITH02]
started in 2002 and was enhanced by
analyzing test data in 2008 and 2009. In 2010 it was synced with Microsoft’s
official PST specification [MS-PST]
.
This document is intended as a working document for the Personal Folder File format specification. Which should allow existing Open Source forensic tooling to be able to process this file type.
Author(s): |
Joachim Metz <joachim.metz@gmail.com> |
Abstract: |
This document contains information about the Personal Folder File format. |
Classification: |
Public |
Keywords: |
PFF, Personal Folder File, OFF, Offline Folder File, PAB, Personal Address Book, PST, Personal Storage Table, OST, Outlook Storage Table |
Copyright (C) 2008-2020, Joachim Metz <joachim.metz@gmail.com>. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
Version | Author | Date | Comments |
---|---|---|---|
0.0.1 |
J.B. Metz |
June 2008 |
Initial version based on earlier notes. |
0.0.2 |
J.B. Metz |
June 2008 |
Added information about LZFu compression and arrays. |
0.0.3 |
J.B. Metz |
July 2008 |
Added information about allocation tables. |
0.0.4 |
J.B. Metz |
October 2008 |
Updated for initial release. |
0.0.5 |
J.B. Metz |
October 2008 |
Added reference about RTF compression. |
0.0.6 |
J.B. Metz |
October 2008 |
Addition about attachments of type embedded object. |
0.0.7 |
J.B. Metz |
October 2008 |
Addition about 0x85 0x85 index node type. |
0.0.8 |
J.B. Metz |
October 2008 |
Added information about descriptor list type. |
0.0.9 |
J.B. Metz |
November 2008 |
Added information about name-to-id map. |
0.0.10 |
J.B. Metz |
December 2008 |
Additional information about the header. |
0.0.11 |
J.B. Metz |
December 2008 |
Additional information about the header and item values types (property types). |
0.0.12 |
J.B. Metz |
December 2008 |
Additional information about item types (property names/identifiers) and the name-to-id map. |
0.0.13 |
J.B. Metz |
December 2008 |
Additional information about the file header and the item types. |
0.0.14 |
J.B. Metz |
December 2008 |
Additional information about the item types. |
0.0.15 |
J.B. Metz |
January 2009 |
Additional information about the allocation full maps. |
0.0.16 |
J.B. Metz |
January 2009 |
Additional information about HTML e-mail body type. |
0.0.17 |
J.B. Metz |
January 2009 |
Additional information about name-to-id map. |
0.0.18 |
J.B. Metz |
January 2009 |
Moved MAPI definitions to separate document. |
0.0.19 |
J.B. Metz |
March 2009 |
Added information encountered by C. Byington of the libpst project. |
0.0.20 |
J.B. Metz |
March 2009 |
Additional information and corrections. |
0.0.21 |
J.B. Metz |
May 2009 |
Update for recipient types based on patch by K. Mazur |
0.0.22 |
J.B. Metz |
May 2009 |
Update for attachment rendering position from finding by K. Mazur. |
0.0.23 |
J.B. Metz |
June 2009 |
Update for non UTF-16 strings in name-to-id map. |
0.0.24 |
J.B. Metz |
June 2009 |
Added information about the b5 table header table entries level. |
0.0.25 |
J.B. Metz |
June 2009 |
Added information about 6c and 8c table. |
0.0.26 |
J.B. Metz |
July 2009 |
Additional information and corrections. |
0.0.27 |
J.B. Metz |
September 2009 |
Added information about none encrypted pst files with encrypted data. |
0.0.28 |
J.B. Metz |
September 2009 |
Added information about the array (indirection) level. |
0.0.29 |
J.B. Metz |
September 2009 |
Added information about the bc table (indirection) level. |
0.0.30 |
J.B. Metz |
January 2010 |
Small changes |
0.0.31 |
J.B. Metz |
January 2010 |
Additional information based on [MS-PST]. |
0.0.32 |
J.B. Metz |
March 2010 |
Corrected information about table offset index entries. |
0.0.33 |
J.B. Metz |
April 2010 |
Updated remarks. |
0.0.34 |
J.B. Metz |
April 2010 |
Added codepage 1200 scenario. |
0.0.35 |
J.B. Metz |
June 2010 |
Email change |
0.0.36 |
J.B. Metz |
July 2010 |
Changes to local descriptors |
0.0.37 |
J.B. Metz |
August 2010 |
Changed table entry to record entry, and table entry definition to column definition for clarity. |
0.0.38 |
J.B. Metz |
January 2010 |
License version update |
0.0.39 |
J.B. Metz |
July 2012 |
Email change |
0.0.40 |
J.B. Metz |
August 2012 |
Updated references. |
0.0.41 |
J.B. Metz |
February 2013 |
Small changes. |
0.0.42 |
J.B. Metz |
February 2013 |
Changes for Outlook 2013 OST (64-bit 4k page) file with thanks to S. Gurjar. |
0.0.43 |
J.B. Metz |
July 2013 |
Additional information about 64-bit 4k page format. |
0.0.44 |
J.B. Metz |
August 2013 |
Additional information about corruption scenarios with thanks to J.M. Cabo. |
0.0.45 |
J.B. Metz |
August 2013 |
Additional information about Outlook 2013 OST (64-bit 4k page) format with thanks to I. Rogov. |
0.0.46 |
J.B. Metz |
August 2013 |
Additional information about Outlook 2013 OST (64-bit 4k page) format. |
0.0.47 |
J.B. Metz |
July 2018 |
Switched to asciidoc format. |
0.0.48 |
J.B. Metz |
July 2020 |
Changes for formatting. |
The PFF (Personal Folder File) and OFF (Offline Folder File) format is used to store Microsoft Outlook e-mails, appointments and contacts. The OST (Offline Storage Table), PAB (Personal Address Book) and PST (Personal Storage Table) file format consist of the PFF format. A PFF consist of the following distinguishable elements:
-
file header
-
file header data
-
index branch node
-
index leaf node
-
(file) offset index
-
(item) descriptor index
-
local descriptors
-
item table type
Characteristics | Description |
---|---|
Byte order |
little-endian |
Date and time values |
FILETIME in UTC |
Character strings |
ASCII strings are Single Byte Character (SBC) or Multi Byte Character (MBC) string stored with a codepage. Sometimes referred to as ANSI string representation. |
Certain elements of the PFF format are related to the Microsoft (Office) Outlook Messaging API (MAPI).
[MS-PST]
defines two types of the PFF:
-
the 32-bit ANSI format
-
the 64-bit Unicode format
A third variant was discovered in an Outlook 2013 OST file namely:
-
the 64-bit Unicode format with 4k (4096 bytes) pages.
The file header common to both the 32-bit and 64-bit PFF format consists of 24 bytes and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
"\x21\x42\x44\x4e" (!BDN) |
The signature (magic identifier) |
4 |
4 |
A weak CRC32 of the following 471 bytes |
|
8 |
2 |
The content type (client signature) |
|
10 |
2 |
The data version (NDB version) |
|
12 |
2 |
Content version (Client version) |
|
14 |
1 |
0x01 |
Creation Platform |
15 |
1 |
0x01 |
Access Platform |
16 |
4 |
0 |
Unknown (dwOpenDBID) |
20 |
4 |
0 |
Unknown (dwOpenClaimID) |
Value | Identifier | Description |
---|---|---|
"\x41\x42" (AB) |
Used for PAB files |
|
"\x53\x4d" (SM) |
Used for PST files |
|
"\x53\x4f" (SO) |
Used for OST files |
Value | Identifier | Description |
---|---|---|
14 |
32-bit ANSI format |
|
15 |
32-bit ANSI format |
|
21 |
64-bit Unicode format (by Visual Recovery) |
|
23 |
64-bit Unicode format |
|
36 |
64-bit Unicode format with 4k |
The 32-bit header data is 488 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
24 |
4 |
Next (available) index pointer |
|
28 |
4 |
Next (available) index back pointer |
|
32 |
4 |
Seed value |
|
36 |
128 (32 x 4) |
Descriptor index high water marks (NID high-water marks) |
|
Part of the header data to which |
|||
164 |
4 |
0 |
Unknown (Reserved) |
168 |
4 |
Total file size |
|
172 |
4 |
Last data allocation table offset |
|
176 |
4 |
Total available data size |
|
180 |
4 |
Total available page size |
|
184 |
4 |
The descriptor index back pointer |
|
188 |
4 |
The descriptor index file offset |
|
192 |
4 |
The (file) offset index back pointer |
|
196 |
4 |
The (file) offset index file offset |
|
200 |
1 |
Allocation table validation type |
|
201 |
1 |
0 |
Unknown (Reserved) |
202 |
2 |
0 |
Unknown (Reserved) |
End of the root |
|||
204 |
128 |
The initial data free map |
|
332 |
128 |
The initial page free map |
|
460 |
1 |
0x80 |
Senitinal |
461 |
1 |
Encryption type |
|
462 |
2 |
0 |
Unknown (Reserved) |
464 |
8 |
0 |
Unknown (Reserved) |
472 |
4 |
0 |
Unknown (Reserved) |
476 |
3 |
0 |
Unknown (Reserved) |
479 |
1 |
0 |
Unknown (Reserved) |
480 |
32 |
0 |
Unknown (Reserved) |
Data after file header data probably extended data for AMap
Offset | Size | Value | Description |
---|---|---|---|
512 |
4 |
Unknown value |
|
516 |
4 |
Unknown value |
|
520 |
4 |
Unknown value |
|
524 |
4 |
Unknown value |
|
528 |
16880 |
Empty values |
The 64-bit header data is 540 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
24 |
8 |
Unused (bidUnused) |
|
32 |
8 |
Next (available) index back pointer |
|
40 |
4 |
Seed value |
|
44 |
128 (32 x 4) |
Descriptor index high water marks (NID high-water marks) |
|
172 |
8 |
0 |
Unknown (qwAlign) |
Part of the header data to which |
|||
180 |
4 |
0 |
Unknown (cOrphans) |
184 |
8 |
Total file size |
|
192 |
8 |
Last data allocation table offset |
|
200 |
8 |
Total available data size |
|
208 |
8 |
Total available page size |
|
216 |
8 |
The descriptor index back pointer |
|
224 |
8 |
The descriptor index file offset |
|
232 |
8 |
The (file) offset index back pointer |
|
240 |
8 |
The (file) offset index file offset |
|
248 |
1 |
Allocation table validation type |
|
249 |
1 |
0 |
Unknown (bARVec) |
250 |
2 |
0 |
Unknown (cARVec) |
End of the root |
|||
252 |
4 |
0 |
Unknown (dwAlign) |
256 |
128 |
0xff |
The initial data free map |
384 |
128 |
0xff |
The initial free page map |
512 |
1 |
0x80 |
Senitinal |
513 |
1 |
Encryption type |
|
514 |
2 |
0 |
Unknown (bReserved) |
516 |
8 |
Next (available) index pointer |
|
524 |
4 |
A weak CRC32 of the previous 516 bytes |
|
528 |
3 |
0 |
Unknown (rgbVersionEncoded) |
531 |
1 |
0 |
Unknown (bLockSemaphore) |
532 |
32 |
0 |
Unknown (rgbLock) |
Data after file header data probably extended data for AMap
Offset | Size | Value | Description |
---|---|---|---|
538 |
8 |
Unknown value |
|
546 |
8 |
Unknown value |
|
552 |
… |
Value | Identifier | Description |
---|---|---|
0x00 |
INVALID_AMAP |
One or more allocation tables are invalid |
0x01 |
VALID_AMAP1 |
All allocation tables are valid |
0x02 |
VALID_AMAP2 |
All allocation tables are valid |
Unknown use NID => Node ID ? Seems to be the equivalent of the items identifiers type=n is the number in the array Under high-water mark, any object less than the user's security level can be opened, but the object is relabeled to reflect the highest security level currently open. Hence the name.
Value | Identifier | Description |
---|---|---|
0x00 |
NDB_CRYPT_NONE |
No encryption |
0x01 |
NDB_CRYPT_PERMUTE |
Compressible encryption |
0x02 |
NDB_CRYPT_CYCLIC |
High encryption |
[MS-PST]
defines a common structure for the allocation table, the index
B-tree, the free map and the density list as the page.
The 32-bit page is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
500 |
Page data |
|
500 |
1 |
Page type |
|
501 |
1 |
Copy of page type |
|
502 |
2 |
Signature |
|
504 |
4 |
The back pointer |
|
508 |
4 |
A weak CRC32 of the 496 bytes of the table data |
The 64-bit page is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
496 |
Page data |
|
496 |
1 |
Page type |
|
497 |
1 |
Copy of page type |
|
498 |
2 |
Signature |
|
500 |
4 |
A weak CRC32 of the 496 bytes of the table data |
|
504 |
8 |
The back pointer |
The 64-bit 4k page is 4096 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
496 |
Page data |
|
4072 |
1 |
Page type |
|
4073 |
1 |
Copy of page type |
|
4074 |
2 |
Signature |
|
4076 |
4 |
A weak CRC32 of the 4072 bytes of the table data |
|
4080 |
8 |
The back pointer |
|
4088 |
8 |
Unknown |
The PFF contains several allocation tables. These tables are used to describe what parts of the PFF are in use and free.
The 32-bit allocation is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
Unknown (Padding) |
|
4 |
496 |
The allocation table data |
|
Footer 12 bytes of size |
|||
500 |
1 |
0x83 |
Page type |
501 |
1 |
0x83 |
Copy of page type |
502 |
2 |
0 |
Signature |
504 |
4 |
The back pointer |
|
508 |
4 |
A weak CRC32 of the 496 bytes of the allocation table data |
The 64-bit allocation is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
496 |
The allocation table data |
|
Footer 16 bytes of size |
|||
496 |
1 |
0x83 |
Page type |
497 |
1 |
0x83 |
Copy of page type |
498 |
2 |
Signature |
|
500 |
4 |
A weak CRC32 of the 496 bytes of the allocation table data |
|
504 |
8 |
The back pointer |
Offset | Size | Value | Description |
---|---|---|---|
0 |
4072 |
The allocation table data |
|
Footer 24 bytes of size |
|||
4072 |
1 |
0x83 |
Page type |
4073 |
1 |
0x83 |
Copy of page type |
4074 |
2 |
Signature |
|
4076 |
4 |
A weak CRC32 of the first 4072 bytes of the allocation table data |
|
4080 |
8 |
Back pointer |
|
4088 |
8 |
Unknown |
For both the 32-bit ANSI format and the 64-bit Unicode format the behavior of the allocation tables is as following:
-
The allocation table at offset 0x4400 with page type 0x84 addresses 64 byte blocks. Where the first bit in the allocation table data refers to offset 0x4400. These are used for the data allocation. The tables repeat themselves every 496 x 8 x 64 = 253952 bytes.
-
The allocation table at offset 0x4600 with page type 0x83 addresses 512 byte blocks. Where the first bit in the allocation table data refers to offset 0x4400. These are used for the page allocation. The tables repeat themselves every 496 x 8 x 512 = 2031616 bytes.
For the 64-bit Unicode format with 4k (4096 bytes) pages format the allocation tables is as following:
-
The allocation table at offset 0x22000 with page type 0x84 addresses 64 byte blocks. Where the first bit in the allocation table data refers to offset 0x22000. These are used for the data allocation. The tables repeat themselves every 4072 x 8 x 512 = 16678912 bytes.
Note
|
Page type 0x83 not yet been seen to be used with the 64-bit Unicode format with 4k format. |
The PFF consists of multiple index B-trees.
-
The (file) offset index B-tree (Block B-Tree (BBT))
-
The (item) descriptor index B-tree (Node B-Tree (NBT))
These B-trees have a similar basic structure.
An index B-tree consists of:
-
branch nodes that point to branch or leaf nodes
-
leaf nodes that contain the index data
Both the 32-bit branch and leaf node have a similar structure which is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
496 |
Node entries |
|
Footer 16 bytes of size |
|||
496 |
1 |
The number of entries |
|
497 |
1 |
The maximum number of entries |
|
498 |
1 |
The size of an entry |
|
499 |
1 |
Node level |
|
500 |
1 |
0x80 |
Page type |
501 |
1 |
0x80 |
Copy of page type |
502 |
2 |
Signature |
|
504 |
4 |
Back pointer |
|
508 |
4 |
A weak CRC32 of the first 500 bytes of the index node |
The 32-bit index B-tree node entry is used in branch nodes. It is 12 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The index identifier of the first child node |
|
4 |
4 |
The back pointer |
|
8 |
4 |
The (file) offset |
The index B-tree node will contain the following values:
-
The maximum number of entries: 41
-
The size of an entry: 12
An index B-tree node can contain the same identifier value as a (file) offset index entry. This occurs when the leaf node is the lowest identifier in the branch node.
The 32-bit (file) offset index entry is used in leaf nodes. It is 12 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The identifier |
|
4 |
4 |
The (file) offset |
|
8 |
2 |
The size |
|
10 |
2 |
The reference count |
The index B-tree node will contain the following values:
-
The maximum number of entries: 41
-
The size of an entry: 12
The first LSB of the identifier is reserved.
The second LSB of the identifier is used to indicate if the block is internal or not.
-
0 = is not internal (external)
-
1 = is internal (used for array and local descriptors)
In an encrypted PFF the internal flag also indicates if the corresponding entry is encrypted or not. See section: Block types for more information.
When the index tree is searched make sure to clear the first LSB in the identifier.
The 32-bit descriptor index B-tree leaf node entry is 16 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The (descriptor) index identifier |
|
4 |
4 |
The (file) offset index identifier of the data |
|
8 |
4 |
The (file) offset index identifier of the local descriptors |
|
12 |
4 |
The parent (descriptor) index identifier |
The index B-tree node will contain the following values:
-
The maximum number of entries: 31
-
The size of an entry: 16
Both the 64-bit branch and leaf node have a similar structure which is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
488 |
Node entries |
|
Footer 24 bytes of size |
|||
488 |
1 |
The number of entries |
|
489 |
1 |
The maximum number of entries |
|
490 |
1 |
The size of an entry |
|
491 |
1 |
Node level |
|
492 |
4 |
Unknown (Padding) |
|
496 |
1 |
0x80 |
Page type See section: Page types |
497 |
1 |
0x80 |
Copy of page type |
498 |
2 |
Signature |
|
500 |
4 |
A weak CRC32 of the first 496 bytes of the index node |
|
504 |
8 |
Back pointer |
The 64-bit index B-tree node entry is used in branch nodes. It is 24 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
8 |
The index identifier of the first child node |
|
8 |
8 |
The back pointer |
|
16 |
8 |
The (file) offset |
The index B-tree node will contain the following values:
-
The maximum number of entries: 20
-
The size of an entry: 24
An index B-tree node can contain the same identifier value as a (file) offset index entry. This occurs when the leaf node is the lowest identifier in the branch node.
The 64-bit (file) offset index entry is used in leaf nodes. It is 24 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
8 |
The index identifier |
|
8 |
8 |
The (file) offset |
|
16 |
2 |
The size |
|
18 |
2 |
The reference count |
|
20 |
4 |
File offset of the data allocation table |
The index B-tree node will contain the following values:
-
The maximum number of entries: 20
-
The size of an entry: 24
The first LSB of the identifier is reserved.
The second LSB of the identifier is used to indicate if the block is internal or not.
-
0 = is not internal (external)
-
1 = is internal (used for array and local descriptors)
In an encrypted PFF the internal flag also indicates if the corresponding entry is encrypted or not. See section: 7.4 Block type for more information.
When the index tree is searched make sure to clear the first LSB in the identifier.
The 64-bit descriptor index B-tree leaf node entry is 32 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
8 |
The (descriptor) index identifier |
|
8 |
8 |
The (file) offset index identifier of the data |
|
16 |
8 |
The (file) offset index identifier of the local descriptors |
|
24 |
4 |
The parent (descriptor) index identifier |
|
28 |
4 |
Unknown |
The index B-tree node will contain the following values:
-
The maximum number of entries: 15
-
The size of an entry: 32
In Outlook 2013, at least for OST files, a 4k (4096 bytes) page version of the 64-bit index B-tree node was introduced.
Both the 64-bit branch and leaf node have a similar structure which is 4096 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4056 |
Node entries |
|
Footer 40 bytes of size |
|||
4056 |
2 |
The number of entries |
|
4058 |
2 |
The maximum number of entries |
|
4060 |
1 |
The size of an entry |
|
4061 |
1 |
Node level |
|
4062 |
10 |
Unknown (Padding) |
|
4072 |
1 |
0x80 |
Page type |
4073 |
1 |
0x80 |
Copy of page type |
4074 |
2 |
Signature |
|
4076 |
4 |
A weak CRC32 of the first 4072 bytes of the index node |
|
4080 |
8 |
Back pointer |
|
4088 |
8 |
Unknown |
The node entry structures are the same as those of the 64-bit index B-tree node (512 byte page) version.
The index identifier is 32-bit of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0.0 |
5 bits |
Identifier type |
|
0.5 |
27 bits |
Identifier value |
Note
|
The identifiers should be unique and are so for allocated descriptors. However unallocated descriptors can have identifiers that are in use. |
The node identifier is used in both the item descriptor identifier and the table value reference. It signifies the type of node the identifier is referencing.
Value | Identifier | Description |
---|---|---|
0x00 |
NID_TYPE_HID |
Table value (or heap node) |
0x01 |
NID_TYPE_INTERNAL |
Internal node |
0x02 |
NID_TYPE_NORMAL_FOLDER |
Folder item |
0x03 |
NID_TYPE_SEARCH_FOLDER |
Search folder item |
0x04 |
NID_TYPE_NORMAL_MESSAGE |
Message item |
0x05 |
NID_TYPE_ATTACHMENT |
Attachment item |
0x06 |
NID_TYPE_SEARCH_UPDATE_QUEUE |
Queue of changed search folder items |
0x07 |
NID_TYPE_SEARCH_CRITERIA_OBJECT |
Search folder criteria |
0x08 |
NID_TYPE_ASSOC_MESSAGE |
Associated contents item |
0x0a |
NID_TYPE_CONTENTS_TABLE_INDEX |
Unknown |
0x0b |
NID_TYPE_RECEIVE_FOLDER_TABLE |
Inbox item (or received folder table) |
0x0c |
NID_TYPE_OUTGOING_QUEUE_TABLE |
Outbox item (or outgoing queue table) |
0x0d |
NID_TYPE_HIERARCHY_TABLE |
Sub folders item (or hierarchy table) |
0x0e |
NID_TYPE_CONTENTS_TABLE |
Sub messages item (or contents table) |
0x0f |
NID_TYPE_ASSOC_CONTENTS_TABLE |
Sub associated contents item (or associated contents table) |
0x10 |
NID_TYPE_SEARCH_CONTENTS_TABLE |
Search contents table |
0x11 |
NID_TYPE_ATTACHMENT_TABLE |
Attachments item |
0x12 |
NID_TYPE_RECIPIENT_TABLE |
Recipients item |
0x13 |
NID_TYPE_SEARCH_TABLE_INDEX |
Unknown |
0x14 |
Unknown |
|
0x15 |
Unknown |
|
0x16 |
Unknown |
|
0x17 |
Unknown |
|
0x18 |
Unknown |
|
0x1f |
NID_TYPE_LTP |
Local descriptor value |
The free map contains information about the longest consecutive number of bytes in the data allocation tables.
According to [MS-PST]
the free maps should not be used. The density list
should be used instead.
Note
|
According to [MS-PST] the page free map only has a 64-bit format. However
32-bit PFF have been seen containing page type 0x85. See
notes below.
|
The 32-bit free map is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
Unknown (Padding) |
|
4 |
496 |
The free map data |
|
500 |
1 |
0x82 |
Page type |
501 |
1 |
0x82 |
Copy of page type |
502 |
2 |
Signature |
|
504 |
4 |
The back pointer |
|
508 |
4 |
A weak CRC32 of the 496 bytes of the free map data |
The 64-bit free map is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
496 |
The free map data |
|
496 |
1 |
0x82 |
0x85 |
Page type |
497 |
1 |
0x82 |
0x85 |
Copy of page type |
498 |
2 |
Signature |
500 |
4 |
|
A weak CRC32 of the 496 bytes of the free map data |
504 |
8 |
The free map with page type 0x82 addresses the maximum number of continuous free data blocks in the corresponding data allocation table. Every byte in the free map data represents a separate data allocation table.
The free map with page type 0x85 addresses free pages. Every bit in the free map data represents a separate page allocation table.
Page type 0x85 seen in 32-bit PFF.
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
Next node back pointer |
|
4 |
4 |
Next node offset |
|
8 |
488 |
Unknown values |
|
496 |
1 |
Unknown value |
|
497 |
1 |
Unknown value |
|
498 |
1 |
Unknown (Empty value) |
|
499 |
1 |
Unknown (Empty value) |
|
500 |
2 |
Type indicator |
|
502 |
2 |
Unknown (Node identifier?) |
|
504 |
4 |
Back pointer |
|
508 |
4 |
A weak CRC32 of the first 500 bytes of the index node |
The density list is used to maintain a list of the data allocation tables in order of density.
The list starts with the low-density (free) data allocation tables?
According to [MS-PST]
there is only a single density list at offset 0x4200.
Only found in newer PST and OST files?
The 32-bit density list is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
Flags |
|
1 |
1 |
Number of list entries |
|
2 |
2 |
0 |
Unknown (Padding) |
4 |
4 |
Next page index |
|
8 |
480 |
Density list entries |
|
488 |
12 |
Unknown |
|
Footer 12 bytes of size |
|||
500 |
1 |
0x86 |
Page type |
501 |
1 |
0x86 |
Copy of page type |
502 |
2 |
Unknown (Signature) |
|
504 |
4 |
Unknown (The back pointer) |
|
508 |
4 |
A weak CRC32 of the first 500 bytes of the density list |
The 64-bit density list is 512 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
Flags |
|
1 |
1 |
Number of list entries |
|
2 |
2 |
0 |
Unknown (Padding) |
4 |
4 |
Next page index |
|
8 |
476 |
Density list entries |
|
484 |
12 |
Unknown |
|
Footer 16 bytes of size |
|||
496 |
1 |
0x86 |
Page type |
497 |
1 |
0x86 |
Copy of page type |
498 |
2 |
Unknown (Signature) |
|
500 |
4 |
A weak CRC32 of the first 496 bytes of the density list |
|
504 |
8 |
Unknown (The back pointer) |
Seen at offset 0x21000
Offset | Size | Value | Description |
---|---|---|---|
Unknown |
|||
Footer 24 bytes of size |
|||
4072 |
1 |
0x86 |
Page type |
4073 |
1 |
0x86 |
Copy of page type |
4074 |
2 |
Unknown (Signature) |
|
4076 |
4 |
A weak CRC32 of the first 4072 bytes of the density list |
|
4080 |
8 |
Back pointer |
|
4088 |
8 |
Unknown |
Value | Identifier | Description |
---|---|---|
0x01 |
DFL_BACKFILL_COMPLETE |
Set if no backfill operation is in progress |
The density list entry is 32-bit of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0.0 |
20 bits |
The page number of the data allocation table |
|
2.4 |
12 bits |
Number of free entries in the data allocation table |
The corresponding file offset of a page number is determined by the following calculation:
offset = page number x page size (512)
[MS-PST]
defines a common structure for storing raw data, data arrays and
local descriptors as the block. Blocks should be 64 byte aligned, which is the
granularity of the data allocation map. Blocks contain other data types like
the local descriptor list, array and table.
The the maximum size of a block is 8192 bytes. The 64-bit 4k page block has a likely maximum size of 8 x 8192 = 65536 bytes, since the page size is 8 times larger than the non 4k page block variant (8 x 512 = 4096).
In a 32-bit PST the commonly used largest block size is 8192 - 12 = 8180, so the maximum of the block data size is the block size - size of the block footer.
The block footer contains a back pointer which refers back to the (file) offset index nodes. This value can be used to validate the integrity of the file.
The 32-bit block is variable of size in 64 byte increments and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
Block data |
|
… |
… |
Unknown (Padding) |
|
Footer 12 bytes of size |
|||
… |
2 |
Block data size |
|
… |
2 |
Signature |
|
… |
4 |
Back pointer |
|
… |
4 |
A weak CRC32 of the block data |
The 64-bit block is variable of size in 64 byte increments and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
Block data |
|
… |
… |
Unknown (Padding) |
|
Footer 16 bytes of size |
|||
… |
2 |
Block data size |
|
… |
2 |
Signature |
|
… |
4 |
A weak CRC32 of the block data |
|
… |
8 |
Back pointer |
The 64-bit 4k page block is variable of size in 512 byte increments and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
Block data |
|
… |
… |
Unknown (Padding) |
|
Footer 24 bytes of size |
|||
… |
2 |
Block data size |
|
… |
2 |
Signature |
|
… |
4 |
A weak CRC32 of the block data |
|
… |
8 |
Back pointer |
|
… |
2 |
2 |
Unknown |
… |
2 |
Uncompressed block data size |
|
… |
4 |
Unknown (empty values) |
In the 64-bit 4k page the block data can be compressed. The compression method used is deflate/RFC1951.
If the block is compressed note that the size in the corresponding 64-bit (file) offset index entry contains the compressed block data size and not that of the uncompressed data.
TODO: confirm encryption comes before decompression, for now it seems that compressed files are not encrypted. Can an array be compressed if so how.
Value | Identifier | Description |
---|---|---|
Data block |
The raw data block has the 'external' bit set on the (file) offset index identifier. In a raw data block the block data value can be encrypted. |
|
0x01 |
XBLOCK |
The array block is used to store raw data greater than 8176 bytes. [MS-PST] refers to this structure as the XBLOCK which is a single level array. |
0x01 |
XXBLOCK |
The array block is used to store raw data greater than 2\^16 x 8176 bytes. [MS-PST] refers to this structure as the XXBLOCK which is a two level array. |
0x02 |
SLBLOCK |
The local descriptors |
The array is used when a (file) offset index identifier contains more data than can fit in a single (descriptor) data block. The array contains a set of (file) offset index identifiers.
The array is used for both table as for item value data.
The total data size should equal the sum of all the (file) offset index entry sizes referenced by the array.
The data of the individual array entries should be concatenated to each other in order.
According to [MS-PST]
the maximum level of indirection is 2.
The 32-bit array is variable of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
0x01 |
The array signature (or block type) |
1 |
1 |
0x01 |
The array (indirection) level |
2 |
2 |
The number of array entries |
|
4 |
4 |
The total data size of the array entries |
|
8 |
(number of entries x 4) |
4 byte array entries containing (file) offset index identifiers |
The 64-bit array is variable of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
0x01 |
The array signature (or block type) |
1 |
1 |
The array (indirection) level |
|
2 |
2 |
The number of array entries |
|
4 |
4 |
The total (uncompressed) data size of the array entries |
|
8 |
(number of entries x 8) |
8 byte array entries containing (file) offset index identifiers |
The format for the 64-bit array for 512 and 4k page sizes is the same.
The total data size of the array entries is based on the uncompressed size of the data blocks.
The local descriptors identifier in the descriptor index B-tree leaf node entry refers to a (file) offset index entry which contains the file offset and data size of the local descriptors nodes.
The local descriptors nodes make up a tree, that most of the time consists of only one level, therefore it was initially considered as a local descriptor list.
According to [MS-PST]
the maximum level of indirection is 0x01.
The local descriptors contain descriptor (file) offset mappings for table data. The 32-bit local descriptors are variable in size.
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
0x02 |
The signature (or block type) |
1 |
1 |
The node (indirection) level |
|
2 |
2 |
The number of entries |
|
4 |
(number of entries x entry size) |
The entries |
The 32-bit local descriptors branch nodes have a level other than 0x00. An entry within the node is 8 bytes of size.
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The descriptor identifier |
|
4 |
4 |
The (file) offset index identifier of the sub node. |
The lower bit in data identifier should be cleared before searching the value in the (file) offset index.
If an attachment identifier is stored in a local descriptor branch node the corresponding the (file) offset index identifier of the data is in the sub node of the local descriptor branch node.
The 32-bit local descriptors leaf node has a level of 0x00. An entry within the node is 12 bytes of size.
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The descriptor identifier |
|
4 |
4 |
The (file) offset index identifier of the data |
|
8 |
4 |
The (file) offset index identifier of the local descriptors |
The lower bit in data identifier should be cleared before searching the value in the (file) offset index.
The (file) offset index identifier of the local descriptors are mainly used in email items for attachments. It refers to the local descriptors of the attachment item.
The local descriptors contain descriptor (file) offset mappings for table data. The 64-bit local descriptors are variable in size.
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
0x02 |
The signature |
1 |
1 |
The node (indirection) level |
|
2 |
2 |
The number of entries |
|
4 |
4 |
0 |
Unknown (Padding) |
8 |
(number of entries x entry size) |
The entries |
The format for the 64-bit local descriptors for 512 and 4k page sizes is the same.
The 64-bit local descriptors branch nodes have a level other than 0x00. An entry within the node is 16 bytes of size.
Offset | Size | Value | Description |
---|---|---|---|
0 |
8 |
The descriptor identifier |
|
8 |
8 |
The (file) offset index identifier of the sub node. |
The lower bit in data identifier should be cleared before searching the value in the (file) offset index.
If an attachment identifier is stored in a local descriptor branch node the corresponding the (file) offset index identifier of the data is in the sub node of the local descriptor branch node.
The 64-bit local descriptors leaf node has a level of 0x00. An entry within the node is 24 bytes of size.
Offset | Size | Value | Description |
---|---|---|---|
0 |
8 |
The descriptor identifier |
|
8 |
8 |
The (file) offset index identifier of the data |
|
16 |
8 |
The (file) offset index identifier of the local descriptor |
The lower bit in data identifier should be cleared before searching the value in the (file) offset index.
The (file) offset index identifier of the local descriptor is mainly used in email items for attachments. It refers to the local descriptors of the attachment item.
The table contains entries which make up the items like email or contact. If the encryption type was set in the file header data the entire table is encrypted. Note that the not encrypted flag in the offset identifier can overwrite the table being encrypted.
The data identifier in the descriptor index B-tree leaf node entry refers to a (file) offset index entry which contains the file offset and data size of the table.
The table is made up of one or more table blocks. These table blocks can be stored in a table array.
The table block is variable of size and consists of:
-
table block header
-
table block values
-
table block index
The table block header is 16 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The table block index offset |
|
2 |
1 |
0xec |
Signature |
3 |
1 |
The table type (or client signature) |
|
4 |
4 |
The table value reference |
|
8 |
4 |
Unknown (Fill level array) |
Only the first table block in a table array contains a table header.
According to [MS-PST]
the fill level array only applies to the 8 first table
blocks of the table array. The table block header of 2nd to 8th table array
entries is 2 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The table index offset |
This header is repeated every table array entry not needed to contain a fill level array.
The table header of the 9th table array entry is 66 bytes of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The table index offset |
|
2 |
64 |
Unknown (Fill level array) |
This header is repeated every 128 table array entries, e.g. in table array entry 137.
According to [MS-PST]
the fill level array entries for non existing table
array entries should be set to 0.
The following table types are currently known.
Table type | Description | Features |
---|---|---|
0x6c |
6c table |
Has GUID record entry identifiers |
0x7c |
7c table |
(Table context) |
0x8c |
8c table |
Has MAPI property (based) record entry identifiers |
0x9c |
9c table |
Has GUID record entry identifiers |
0xa5 |
a5 table |
Has MAPI property (based) record entry identifiers |
0xac |
ac table |
Has MAPI property (based) record entry identifiers |
0xb5 |
b5 table header |
(B-Tree on heap) |
0xbc |
bc table |
(Property context) |
0xcc |
cc table |
Unknown |
Value | Identifier | Description |
---|---|---|
0x0 |
FILL_LEVEL_EMPTY |
value >= 3584 bytes free or non-existent data block |
0x1 |
FILL_LEVEL_1 |
2560 >= value > 3584 bytes free |
0x2 |
FILL_LEVEL_2 |
2048 >= value > 2560 bytes free |
0x3 |
FILL_LEVEL_3 |
1792 >= value > 2048 bytes free |
0x4 |
FILL_LEVEL_4 |
1536 >= value > 1792 bytes free |
0x5 |
FILL_LEVEL_5 |
1280 >= value > 1536 bytes free |
0x6 |
FILL_LEVEL_6 |
1024 >= value > 1280 bytes free |
0x7 |
FILL_LEVEL_7 |
786 >= value > 1024 bytes free |
0x8 |
FILL_LEVEL_8 |
512 >= value > 786 bytes free |
0x9 |
FILL_LEVEL_9 |
256 >= value > 512 bytes free |
0xa |
FILL_LEVEL_10 |
128 >= value > 256 bytes free |
0xb |
FILL_LEVEL_11 |
64 >= value > 128 bytes free |
0xc |
FILL_LEVEL_12 |
32 >= value > 64 bytes free |
0xd |
FILL_LEVEL_13 |
16 >= value > 32 bytes free |
0xe |
FILL_LEVEL_14 |
8 >= value > 16 bytes free |
0xf |
FILL_LEVEL_FULL |
value < 8 bytes free |
The table block index is variable of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The number of index offsets |
|
2 |
2 |
The number of unused offsets items |
|
4 |
(number of items + 1) x 2 |
Array of index offsets |
Note that:
-
the first index offsets is referred to as number 1;
-
the index offsets are stored in order;
-
the last index offset does not have to match the table block index offset;
-
the number of index offsets can be 0.
The table value reference is formatted in different ways, it can point to data either in within the table block or in some other block.
The table value reference is 32-bit of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0.0 |
5 bits |
The value reference type |
|
0.5 |
11 bits |
The value reference index |
|
2.0 |
16 bits |
The value reference array index |
-
internal table value references have the all the low order 4 bits zero e.g. 0x0020, the value needs to be right shifted by 5 bits, e.g. 0x0001. This value is the first entry in the the table index (starts at 1), so it points to a table index value offset e.g. 12 (0xc). for internal table values references the high order 16 bits are used to indicate which table array entry should be used, e.g. a high order value of 1 points to the second table array entry;
-
external table value references have some of the low order 4 bits set (and the value reference array index is 0). They are descriptor list identifiers that refer to another location of data.
TODO: Check with [MS-PST] p 56
The table value reference is 32-bit of size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
0.0 |
5 bits |
The value reference type |
|
0.5 |
14 bits |
The value reference index |
|
2.3 |
13 bits |
The value reference array index |
An internal table value reference refers to the first table index value pair that contain the table values descriptor.
The internal table value reference for:
Table type | Use of internal table value reference |
---|---|
0x6c |
points to table specific 6c table header |
0x7c |
points to table specific 7c table header |
0x8c |
points to b5 table header |
0x9c |
points to table specific 9c table header |
0xa5 |
points to record entries |
0xac |
points to table specific ac table header |
0xbc |
points to b5 table header |
The b5 table header is used in all table types except the a5 table. It contains information how the record entries are formatted. It consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
0xb5 |
Table header type |
1 |
1 |
The size of the record entry identifier |
|
2 |
1 |
The size of the record entry value |
|
3 |
1 |
The level of record entries |
|
4 |
4 |
record entries reference |
The record entry index reference refers to the table index value pair that points to record entries. If the record entries reference is zero there are no record entries.
The level of the record entries is used to distribute the record entries over multiple table values. Intermediate level record entries are variable of size and consist of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
record entry identifier (key) |
|
… |
4 |
record entries sub reference |
Where leaf level record entries are variable of size and consist of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
record entry identifier (key) |
|
… |
4 |
record entry data |
The size of an individual record entry is the combination of the record entry identifier and value size.
The b5 table header values differs for different tables:
Table type | record entry identifier size | record entry value size | record entry size |
---|---|---|---|
0x6c |
16 |
2 |
18 |
0x7c |
4 |
2 |
6 |
0x7c |
4 |
4 |
8 |
0x8c |
8 |
4 |
12 |
0x9c |
16 |
4 |
20 |
0xac |
4 |
4 |
8 |
0xbc |
2 |
6 |
8 |
The individual table sections provide more information about the values in the record entries.
The bc table has table values that contain:
-
a b5 table header
-
a 6c table header
-
record entries that contain contain GUID descriptor values and the value array information
-
value array (table) entries that contain the item value information
The 6c table header consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The b5 table header index reference |
|
4 |
4 |
Values array entries index reference |
The 6c table uses the b5 table header with a record entry identifier size of 16 and a record entry value size of 2. The record entries reference refers to the record entries. If the record entries reference is zero there are no record entries.
A b5 table header with a record entry identifier size of 16 and a record entry value size of 2 refers to a specific type of record entry. This type of record entry consists of 18 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
16 |
A GUID |
|
16 |
2 |
Unknown |
The bc table has table values that contain:
-
a b5 table header
-
a 7c table header
-
7c column definitions that contain the item type information
-
-
record entries that contain the value array information
-
value array (table) entries that contain the item value information
The 7c table header consists of 22 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
0x7c |
Table header type |
1 |
1 |
The number of column definitions |
|
2 |
2 |
values array entry end offset 32-bit values |
|
4 |
2 |
values array entry end offset 16-bit values |
|
6 |
2 |
values array entry end offset 8-bit values |
|
8 |
2 |
values array entry end offset cell existence block |
|
10 |
4 |
The b5 table header index reference |
|
14 |
4 |
Values array entries index reference |
|
18 |
4 |
Unknown (hidIndex) |
If the b5 header table index reference is zero the table should not contain any record entries. If the value array entries index reference is zero the table does not contain any value array entries.
The record entries contain references to the table value array entries. So if the table contains no values the value array should be empty.
In some tables the b5 table header index reference contains a references to a b5 table header with an empty record entries reference. The value array entries index reference in the 7c table header is also empty.
It is possible for the table to have table header entries but no values array entries. The reverse is unknown.
The remaining data in the 7c table header contains multiple column definitions. The column definitions describe the format of the data in the values array entries. The 7c column definition consist of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The record entry value type |
|
2 |
2 |
The record entry type |
|
4 |
2 |
The values array entry offset |
|
6 |
1 |
The values array entry size |
|
7 |
1 |
The values array entry number |
If the table contains values array entries the values array entry offset contains the offset of the value in the value array (table) entries.
In case of a value reference the actual value is found by reading the value size number of bytes from the value array entries at the specified value array entries offset. A value array entries offset of 0 points to the beginning of the value array entries.
The 7c table uses the b5 table header with a record entry identifier size of 4 and a record entry value size of 2 or 4. The record entries reference refers to the record entries. If the record entries reference is zero there are no record entries.
The record entries branch has a record entries level value of 1 (and probably higher). The initial tables entries level is specified in the b5 table header. A record entry branch consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The first value in the lower level record entry array |
4 |
The record entry branch contains a reference to lower level record entries.
The record entries leaf has a record entries level value of 0. The initial tables entries level is specified in the b5 table header.
A b5 table header with a record entry identifier size of 4 and a record entry value size of 2 refers to a record entry consists of 6 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The first value in the value array |
|
4 |
2 |
Value array number |
A b5 table header with a record entry identifier size of 4 and a record entry value size of 4 refers to a record entry consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The first value in the value array |
|
4 |
4 |
Value array number |
The values array entries contain item entries values. The 7c header entries define the format of the entry/value data within an array entry. The value size and value array entries offset in the 7c header entries refer to the item value in the value arrays.
The value array consist of multiple values of different sizes.
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
The 4 and 8 byte values |
|
… |
… |
The 2 byte values |
|
… |
… |
The 1 byte values |
|
… |
… |
The cell existence block bitmap |
For record entry value types that fit into the specified size the record entry value is used directly, i.e. 32-bit, like Integer 32-bit signed (0x0003) or 64-bit, like Filetime (0x0040). Otherwise, the record entry value is a value reference, which is either a descriptor list identifier, or a table index reference. If the record entry value is 0 the value is empty. Unlike the bc table the 7c table does store values smaller than 32-bit in lesser number of bytes.
If a values array reference is an external reference and the values array is stored in a data array there is additional padding at the end of the last value array in a certain data array block. If the data in the data array is assumed continuous this causes a misalignment for the value array in the next data array block. The value array entry identifier in the record entries can be used to realign.
The 8c table has table values that contain:
-
a b5 table header
-
record entries that contain identifier to descriptor mappings
The 8c table uses the b5 table header with a record entry identifier size of 8 and a record entry value size of 4. The record entries reference refers to the record entries. If the record entries reference is zero there are no record entries.
A b5 table header with a record entry identifier size of 16 and a record entry value size of 2 refers to a specific type of record entry. This type of record entry consists of 18 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
8 |
Unknown (Identifier) |
|
8 |
4 |
Descriptor identifier |
The 9c table has table values that contain:
-
a b5 table header
-
a 9c table header
-
record entries that contain GUID descriptor values
The ac table header consists of 4 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
b5 table header index reference |
The 9c table uses the b5 table header with a record entry identifier size of 16 and a record entry value size of 4. The record entries reference refers to the record entries. If the record entries reference is zero there are no record entries.
The a5 table has table values that contain:
-
record entries that contain record entry values
The a5 table is used by the ac column definitions as an array of record entry values.
The internal table value reference for the a5 table is 0.
If the a5 table is empty it signifies NULL values;
The ac table has table values that contain:
-
a b5 table header
-
a ac table header
-
ac column definitions that contain the item type information
-
a5 tables containing the actual record entry values
-
-
record entries that contain the value array information
-
value array (table) entries that contain the item value information
The ac table header consists of 40 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1 |
"\xac" |
Table header type |
1 |
1 |
Unknown (Empty value) |
|
2 |
2 |
values array entry end offset 32-bit values |
|
4 |
2 |
values array entry end offset 16-bit values |
|
6 |
2 |
values array entry end offset 8-bit values |
|
8 |
2 |
values array entry end offset cell existence block |
|
10 |
4 |
B5 table header index reference |
|
14 |
4 |
Values array entry reference |
|
18 |
4 |
Unknown (Empty value) |
|
22 |
2 |
Number of column definitions |
|
24 |
4 |
column definitions reference |
|
28 |
8 |
Unknown (Empty value) |
|
36 |
4 |
Unknown value (Weak CRC?) |
The column definitions reference refers to the ac column definitions. The column definitions describe the format of the data in the values array entries. The ac column definition consist of 16 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The record entry value type |
|
2 |
2 |
The record entry type |
|
4 |
2 |
The values array entry offset |
|
6 |
2 |
The values array entry size |
|
8 |
2 |
The values array entry number |
|
10 |
2 |
Unknown (Empty value) |
|
12 |
4 |
The descriptor identifier of the record entry values table (a5 table) |
If the table contains values array entries the values array entry offset contains the offset of the value in the value array (table) entries.
In case of a value reference the actual value is found by reading the value size number of bytes from the value array entries at the specified value array entries offset. A value array entries offset of 0 points to the beginning of the value array entries.
The ac table uses the b5 table header with a record entry identifier size of 4 and a record entry value size of 4. The record entries reference refers to the record entries. If the record entries reference is zero there are no record entries.
It might be that the 4 + 2 variant like for the 7c table is also possible for the ac table.
The record entries branch has a record entries level value of 1 (and probably higher). The initial tables entries level is specified in the b5 table header. A record entry branch consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The first value in the lower level record entry array |
|
4 |
4 |
The value reference of the lower level record entry array |
The record entry branch contains a reference to lower level record entries.
The record entries leaf The record entries leaf has a record entries level value of 0. The initial tables entries level is specified in the b5 table header.
A b5 table header with a record entry identifier size of 4 and a record entry value size of 4 refers to a specific type of record entry. This type of record entry consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The first value in the value array |
|
4 |
4 |
Value array number |
The values array entries contain item entries values. The ac header entries define the format of the entry/value data within an array entry. The value size and value array entries offset in the ac header entries refer to the item value in the value arrays.
The value array consist of multiple values of different sizes.
Offset | Size | Value | Description |
---|---|---|---|
0 |
… |
The 4 and 8 byte values |
|
… |
… |
The 2 byte values |
|
… |
… |
The 1 byte values |
|
… |
… |
The cell existence block bitmap |
For record entry value types that fit into the specified size the record entry value is used directly, i.e. 32-bit, like Integer 32-bit signed (0x0003) or 64-bit, like Filetime (0x0040). Otherwise, the record entry value is a value reference, which is either a descriptor list identifier, or a table index reference. If the record entry value is 0 the value is empty. Unlike the bc table the ac table does store values smaller than 32-bit in lesser number of bytes.
Some column definitions have a descriptor identifier of the record entry values table. This descriptor identifier refers to an a5 table which contains an array of record entry values. In this case the value in the values array actually contains an item index of the a5 table.
If a values array reference is an external reference and the values array is stored in a data array there is additional padding at the end of the last value array in a certain data array block. If the data in the data array is assumed continuous this causes a misalignment for the value array in the next data array block. The value array entry identifier in the record entries can be used to realign.
The bc table has table values that contain:
-
a b5 table header
-
record entries that contain the item type/value information
-
record entry value data
The bc table uses the b5 table header with a record entry identifier size of 2 and a record entry value size of 6. The record entries reference refers to the record entries. If the record entries reference is zero there are no record entries.
The record entries branch has a record entries level value of 1 (and probably higher). The initial tables entries level is specified in the b5 table header. A record entry branch consists of 6 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The first value in the lower level record entry array |
|
2 |
4 |
The value reference of the lower level record entry array |
The record entry branch contains a reference to lower level record entries.
The record entries leaf has a record entries level value of 0. The initial tables entries level is specified in the b5 table header.
The record entries in the bc table contain item entries. This type of record entry consists of 8 bytes:
Offset | Size | Value | Description |
---|---|---|---|
0 |
2 |
The record entry type |
|
2 |
2 |
The record entry value type |
|
4 |
4 |
The record entry value or value reference |
For record entry value types that fit into 32-bit, like Integer 16-bit signed (0x0002), Integer 32-bit signed (0x0003), Boolean (0x000b), the record entry value is used directly. Otherwise, the record entry value is a value reference, which is either a descriptor list identifier, or a table index reference. If the record entry value is 0 the value is empty.
According to [MS-PST]
there should be a cc table, however
it is undocumented and has not yet been spotted in the wild.
The PFF items are stored in record entries. Different tables make up different PFF items.
Several of the PFF items have a predefined node identifier.
Value | Identifier | Description |
---|---|---|
33 (0x21) |
NID_MESSAGE_STORE |
The message store |
97 (0x61) |
NID_NAME_TO_ID_MAP |
The name-to-id-map |
161 (0xa1) |
NID_NORMAL_FOLDER_TEMPLATE |
The folder template |
193 (0xc1) |
NID_SEARCH_FOLDER_TEMPLATE |
The search folder template |
290 (0x122) |
NID_ROOT_FOLDER |
The root folder |
481 (0x1e1) |
NID_SEARCH_MANAGEMENT_QUEUE |
Pending search-related update queue |
513 (0x201) |
NID_SEARCH_ACTIVITY_LIST |
Active searches list |
577 (0x241) |
NID_RESERVED1 |
Unknown (Reserved) |
609 (0x261) |
NID_SEARCH_DOMAIN_OBJECT |
Search criteria list |
641 (0x281) |
NID_SEARCH_GATHERER_QUEUE |
Search gatherer queue |
673 (0x2a1) |
NID_SEARCH_GATHERER_DESCRIPTOR |
Search gatherer descriptor |
737 (0x2e1) |
NID_RESERVED2 |
Unknown (Reserved) |
769 (0x301) |
NID_RESERVED3 |
Unknown (Reserved) |
801 (0x321) |
NID_SEARCH_GATHERER_FOLDER_QUEUE |
Search gatherer folder queue |
2049 (0x801) |
Unknown (found in OST) |
|
2081 (0x821) |
Unknown (found in OST) |
|
2113 (0x841) |
Unknown (found in OST) |
|
3073 (0xc01) |
Unknown (found in PST, OST) |
The descriptor index identifier 33 (0x21) refers to the message store.
The message store is a bc table which can contain:
-
The display name: "Personal Folders"
-
Valid folder mask
-
Password checksum
The message store contains several entry identifiers of Outlook special folders. These are:
Folder | Entry identifier property |
---|---|
Outbox folder |
PidTagIpmOutboxEntryId |
Deleted Items folder |
PidTagIpmWastebasketEntryId |
Sent Items folder |
PidTagIpmSentMailEntryId |
IPM root folder |
PidTagIpmSubtreeEntryId |
Search-results root folder |
PidTagFinderEntryId |
Common views root folder |
PidTagCommonViewsEntryId |
Personal views root folder |
PidTagViewsEntryId |
Contacts root folder |
PidTagIpmContactEntryId |
Drafts root folder |
PidTagIpmDraftsEntryId |
Journal root folder |
PidTagIpmJournalEntryId |
Calendar root folder |
PidTagIpmAppointmentEntryId |
Notes root folder |
PidTagIpmNoteEntryId |
Tasks root folder |
PidTagIpmTaskEntryId |
Note
|
Some PFF files do not contain a message store. |
The descriptor index identifier 97 (0x61) refers to the the name-to-id map.
The name-to-id map is a bc table which contains the following entries:
-
0x0001 (Name-to-ID Map Number of Validation Entries)
-
0x0002 (Name-to-ID Map Class identifiers)
-
0x0003 (Name-to-ID Map Entries)
-
0x0004 (Name-to-ID Map Strings)
-
0x1000 and up (Name-to-ID Map Validation Entries)
The entry 0x0002 (Name-to-ID Map Class Identifiers) is of type 0x0102 (Binary data) and contains an array of class identifiers (CLSID).
The entry 0x0003 (Name-to-ID Map Entries) is of type 0x0102 (Binary data) and contains an array of name-to-id map entries. An name-to-id map entry consist of 8 bytes.
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The name-to-id map entry value or value reference |
|
4 |
2 |
The name-to-id map entry type |
|
6 |
2 |
The name-to-id map entry number |
The lowest bit in the name-to-id map entry type signifies where to find the name-to-id map value.
-
If set it contains an offset into the 0x0004 (Name-to-ID Map Strings) array. This type corresponds to MAPI MNID_STRING;
-
If not set it contains the entry type to which the name-to-id is mapped. This type corresponds to MAPI MNID_ID.
Type | Identifier | Description |
---|---|---|
0x0000 |
NAMEID_GUID_NONE |
No class |
0x0002 |
NAMEID_GUID_MAPI |
The name-to-id map entry type refers to the class MAPI (PS_MAPI 00020328-0000-0000-c000-000000000046) |
0x0004 |
NAMEID_GUID_PUBLIC_STRINGS |
The name-to-id map entry type refers to the class Public stings (PS_PUBLIC_STRINGS: 00020329-0000-0000-c000-000000000046) |
The remaining name-to-id map entry type value refers to a value in the class identifier array:
index number = ( type / 2 ) - 3
E.g. the value 0x0006 or 0x0007 refer to the first entry (entry: 0) in the class identifier array.
The correspondent item type is the name-to-id map number + 0x8000.
The entry 0x0004 (Name-to-ID Map Strings) is of type 0x0102 (Binary data) and contains an array of strings. An individual string consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The number of bytes in the string |
|
4 |
… |
The string in ASCII or Unicode without the end of string character (NUL-character) |
Note
|
The Name-to-ID Map Strings can be empty. |
Most of the time the Name-to-ID Map Strings are in Unicode (UTF-16) however sometimes the string consists of an ASCII string containing. Until now only ASCII strings containing a MAPI property identifier string (PR_) have been found. In particular in relation with a BlackBerry RIM server properties (PR_RIM_). Note that the last byte in such ASCII strings can be a 0 byte.
Could this be a mnemonic way of mapping MAPI identifiers?
The entries s0x1000 and up (Name-to-ID Map Validation Entries) contain values similar to those in the entry 0x0003 (Name-to-ID Map Entries). Except that these are used for validation.
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
The name-to-id map entry validation value |
|
4 |
2 |
The name-to-id map entry type |
|
6 |
2 |
The name-to-id map entry number |
The lowest bit in the name-to-id map entry type signifies where to find the name-to-id map validation value.
-
If set it contains a weak CRC32 of the string in the 0x0004 (Name-to-ID Map Strings) array;
-
If not set it contains a duplicate of the value in the 0x0003 (Name-to-ID Map Entries).
Note
|
Some PFF files do not contain a name-to-id map. |
The descriptor index identifier 290 (0x112) refers to the the root folder item.
The descriptor index entry of the root item refers to itself as its parent.
The child items can be found by the parent descriptor identifier in the descriptor index entry. The descriptor index entries that are not part of the item hierarchy should not contain parent identifiers (parent identifies of 0).
The folder item can contain:
-
Display Name
-
Number of content items
-
Number of unread content items
-
Has sub folders
-
Associate content count
The number of content items in a folder is made up from the item count and associated item count.
The Inbox folder contains several entry identifiers of Outlook special folders. These are:
Folder | Entry identifier property |
---|---|
Calendar |
PidTagIpmAppointmentEntryId |
Contacts |
PidTagIpmContactEntryId |
Journal |
PidTagIpmJournalEntryId |
Notes |
PidTagIpmNoteEntryId |
Tasks |
PidTagIpmTaskEntryId |
Drafts |
PidTagIpmDraftsEntryId |
Outbox special folders?
A folder descriptor index identifier + 11 (0x000b) refers to the related sub folders item, e.g. for the root folder this is 290 + 11 = 301. The related sub folders item consists of a 7c table.
The sub folders item can be used to determine which items in the folder are sub folders. The row identifier (PidTagLtpRowNid) value of each set contains the identifier of the sub folder item.
A folder descriptor index identifier + 12 (0x000c) refers to the related sub messages item, e.g. for the root folder this is 290 + 12 = 302. The related sub messages item consists of a 7c table.
The sub messages item can be used to determine which items in the folder are messages. The row identifier (PidTagLtpRowNid) value of each set contains the identifier of the sub message item.
A folder descriptor index identifier + 13 (0x000d refers to the related sub associated contents item, e.g. for the root folder this is 290 + 13 = 303. The related sub associated contents item consists of a 7c table.
The sub associated contents item can be used to determine which items in the folder are associated contents. The row identifier (PidTagLtpRowNid) value of each set contains the identifier of the sub associated contents item.
Special purpose folder descriptor index identifier + 3 => empty descriptor Special purpose folder descriptor index identifier + 4 => bc table A folder descriptor index identifier + 18 => 8c table (Inbox, Drafts, Sync Issues, Renamed By MAE) Contains an number of elements similar to the number of messages in the corresponding folder. A folder descriptor index identifier + 19 => 8c table (CommonViews, Inbox, Calendar) Calender folder descriptor index identifier + 23 => 7c table
When a folder contains X the local descriptors contains an entry 1718 (0x06b6). This local descriptor refers to a 7c table which contains the Y item. The Y contains multiple sets (1 per Z). It can contain the: (Used by: CommonViews, OutlookReminder, To-Do Search, Tracked Mail Processing, Shortcuts, Views, IPM SUBTREE, Deleted Items, Inbox, Sent Items, Calendar, Contacts, Drafts, Journal, Junk E-mail, Notes, RSS Feeds, Conflicts, ) Perhaps a Folder View or sort sub item? 0x67f2 ( : Row identifier) points to an entry type within the parent table
The message item is a generic variant of other items, like e-mail, contact, appointment, etc.
The message item at least has:
-
a message class containing: 'IPM'
Certain types of message items also can have the sub items:
-
attachments
-
recipients
Value | Identifier | Description |
---|---|---|
1682 (0x0692) |
NID_TYPE_RECIPIENT_TABLE |
Recipients item |
1649 (0x0671) |
NID_TYPE_ATTACHMNET_TABLE |
Attachments item |
NID_TYPE_ATTACHMNET |
Attachment item |
|
NID_TYPE_LTP |
Table value reference |
When an item contains attachments the local descriptors contains an entry 1649 (0x0671). This local descriptor refers to a 7c table which contains the attachments item.
The attachments item table contains multiple sets (1 per attachment). It can contain the attachments:
-
size
-
filename
-
attachment method
-
attachment item local descriptor
The local descriptor identifier refers to an entry in the local descriptors of the corresponding item. The list identifier of the local descriptor entry refers to the local descriptors of the attachment item.
The attachment can contain the attachments:
-
size
-
creation time
-
modification time
-
data object
-
filename
-
attachment method
-
mime type
-
rendering position
The Attachment data object (0x3701) can be of type Binary Data (0x0102) and Embedded Object (0x000d). The Embedded Object is used for bounced e-mails in which it contains an embedded PFF table. The Embedded Object can also contain other data like an OLE2 document.
Attachment method (0x3705) value | Attachment data object (0x3701) entry type | Attachment type |
---|---|---|
0x00000000 |
Unknown |
Unknown |
0x00000001 |
Binary Data (0x0102) |
(Attached) Data |
0x00000002 |
(Attachment is stored externally) |
Reference |
0x00000003 |
Unknown |
Unknown |
0x00000004 |
Unknown |
Unknown |
0x00000005 |
Embedded Object (0x000d) |
(Attached) Item |
0x00000006 |
Embedded Object (0x000d) |
(Attached) Data or item |
[yellow-background]*Found empty attachment data object (0x3701) value but attachment has size 206. But contains entry 0x0e27 with binary data of size 100.
Found attachment with attachment method (0x3705) 0x0001 (ATTACH_BY_VALUE) without an attachment data object (0x3701) entry. Attachment has size 54, however Outlook shows base64 encoded data of a larger size. Parent e-mail contains 'partial message' entries. This seems to be a split MIME RFC message, where the data is stored in the message body of the attached e-mails.
Outlook cannot access the attachment of the attached e-mails.*
When a folder contains attachments the local descriptors contains an entry 1682 (0x0692). This local descriptor refers to a 7c table which contains the recipients items.
The recipients item contains multiple sets (1 per recipient). It can contain the recipient:
-
type
-
e-mail address
-
search key
-
messaging username
The appointment item can contain the values:
-
message class: 'IPM.Appointment'
The appointment item also can have the sub items:
-
attachments
-
recipients
The distribution list item is also known as contact group item can contain the values:
-
message class: 'IPM.DistList'
-
0x8054 (PidLidDistributionListOneOffMembers)
-
0x8055 (PidLidDistributionListMembers)
The e-mail item can contain the values:
-
message class: 'IPM.Note'
-
transport headers
-
plain text message body
-
compressed RTF message body
-
HTML message body
The e-mail item also can have the sub items:
-
attachments
-
recipients
The message manager (associated) item can contain the values:
-
message class: 'IPM.MessageManager'
The migration status (associated) item can contain the values:
-
message class: 'IPM.Microsoft.MigrationStatus'
The rule organizer (associated) item can contain the values:
-
message class: 'IPM.RuleOrganizer'
The rule message (associated) item can contain the values:
-
message class: 'IPM.Rule.Message'
The extended rule message (associated) item can contain the values:
-
message class: 'IPM.ExtendedRule.Message'
The LZFu compression is used for RTF formatted data [ROTHMAN99].
Compressed LZFu data starts with a LZFu header
Offset | Size | Value | Description |
---|---|---|---|
0 |
4 |
Size of the compressed data |
|
4 |
4 |
Size of the uncompressed data |
|
8 |
4 |
Compression signature |
|
12 |
4 |
A CRC32 of the compressed data. |
The signature 0x75465a4c ("LZFu") that the data is compressed. The signature 0x414c454d ("MELA") that the data is uncompressed.
The CRC32 is similar to the standard CRC32 algorithm which is mentioned in RFC 1952 with a slight modification: the inversion (or xor with 0xffffffffL) before and after the CRC update is omitted. This inversion is applied in order to avoid a CRC weakness, which is that any number of leading or trailing zero bytes can be added or removed without the CRC detecting the change. For some reason, the compressed RTF CRC32 implementation is the weaker one, without this inversion. It is calculated on the compressed data bytes (excluding the LZFu header).
The compressed data is directly after the header. It consists of 8-unit chunks. Each chunk begins with a single flag byte. The bits within the flag byte are read in LSB order. Each bit in the byte flag is a flag for the corresponding unit in the chunk.
-
0 represent a 1 byte literal which should be copied as-is;
-
1 represent a 2 byte reference.
A 2 byte reference consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
1.4 |
Reference offset into the LZ buffer |
|
1.4 |
0.4 |
Reference size |
The reference offsets represent offsets into the LZ buffer. The size of the reference offset allows for 4096 possible values, which is the size of the LZ buffer. The LZ buffer wraps around as it is filled with the decompressed data. The LZ buffer is preloaded with a common RTF header string (found in RTFLIB32.LIB). The string is represented as a C string of 207 bytes.
{\\rtf1\\ansi\\mac\\deff0\\deftab720{\\fonttbl;}{\\f0\\fnil \\froman \\fswiss \\fmodern \\fscript \\fdecor MS Sans SerifSymbolArialTimes New RomanCourier{\\colortbl\\red0\\green0\\blue0\n\r\\par \\pard\\plain\\f0\\fs20\\b\\i\\u\\tab\\tx
The reference size is a 4 bit value that represents a value between 2 and 17. A reference size of 0 representing 2. Therefore the reference size needs to be corrected by 2.
The uncompressed size does not entail the 2 trailing zero bytes.
PST and presumably OST files created by Microsoft Entourage (Outlook for MacOS) seem to use the MacBinary encoding to store attachment data.
Three MacBinary standards are known:
-
version 1 (MacBinary)
-
version 2 (MacBinary II)
-
version 3 (MacBinary III)
The MacBinary format uses big-endian.
MacBinary data consists of:
Offset | Size | Value | Description |
---|---|---|---|
0 |
128 |
Header |
|
128 |
(secondary header size) |
Secondary header |
|
… |
(data fork size) |
The data fork |
|
… |
(resource fork size) |
The resource fork |
|
… |
(get info size) |
get info |
The MacBinary header is 128 bytes in size and consists of:
Offset | Size | Value | Description |
---|---|---|---|
As of MacBinary |
|||
0 |
1 |
0x00 |
Unknown (Reserved for version byte) |
1 |
1 |
Filename length |
|
2 |
63 |
Filename |
|
65 |
4 |
The file type |
|
69 |
4 |
The file creator |
|
73 |
1 |
Finder flags (bits 8-15) |
|
74 |
1 |
0x00 |
Unknown (Empty value) |
75 |
2 |
y coordinate |
|
77 |
2 |
x coordinate |
|
79 |
2 |
Folder identifier |
|
81 |
1 |
Protected flag |
|
82 |
1 |
0x00 |
Unknown (Empty value) |
83 |
4 |
Data fork size |
|
87 |
4 |
Resource fork size |
|
91 |
4 |
Creation date and time |
|
95 |
4 |
Modification date and time |
|
As of MacBinary II, in previous version these values would be all zeroed |
|||
99 |
2 |
Get info size |
|
101 |
1 |
Finder flags bits 0-7 |
|
As of MacBinary III, in previous version these values would be all zeroed |
|||
102 |
4 |
"mBIN" |
Signature |
106 |
1 |
Filename script |
|
107 |
1 |
Extended finder flags |
|
108 |
8 |
0x00 |
Unknown (Unused) |
As of MacBinary II, in previous version these values would be all zeroed |
|||
116 |
4 |
Size unpacked |
|
120 |
2 |
Secondary header size |
|
122 |
1 |
MacBinary II version (0x81) |
|
123 |
1 |
0x81 |
Minimum required MacBinary II (or later) version |
124 |
2 |
CRC-16 of the previous 124 bytes |
|
As of MacBinary |
|||
126 |
2 |
Unknown (Reserved for computer type and OS identifier) |
Value | Identifier | Description |
---|---|---|
0x0001 |
is on desk |
|
0x000e |
color (bits 1-3) |
|
0x0010 |
color reserved |
|
0x0020 |
requires switch launch |
|
0x0040 |
is shared |
|
0x0080 |
has no inits |
|
0x0100 |
inited |
|
0x0200 |
changed |
|
0x0400 |
budy |
|
0x0800 |
Unknown (bozo?) |
|
0x1000 |
system |
|
0x2000 |
bundle |
|
0x4000 |
invisible |
|
0x8000 |
locked |
The scenarios described below do not seem to be normal behavior.
Although encryption type is none some PFF files still contain compressible encrypted data. This was found in multiple times in PST files created by Microsoft Exchange Mailbox Merge Program (ExMerge). One of which created by ExMerge v6.5.7529.0.
One of these PST files also did not contain a message store and name-to-identifier-map.
Neither Outlook (MAPI) or scanpst was able to handle these files. Libpff is handles this scenario automatically.
In (at least) one of these files the entries of some data arrays all are encrypted except for the last one. Libpff is able to detect some of these unencrypted data array entries.
Encountered a 1.4 GiB PST file created by ExMerge (unknown version) with zero root index node pages and missing allocation table pages.
Neither Outlook (MAPI) or scanpst was able to handle these files. Libpff is able to handle this scenario only using recovery.
Encountered a PST file with an e-mail which has 4 e-mail messages attached. These 4 e-mail messages contained parts of a single message. The data of the single message was stored in the message body of the 4 attached e-mails. These 4 attached e-mails claim to have an attachment themselves, but Outlook is not able to access these attachments; neither do they contain any data.
Scanpst did not remark this as a problem. Libpff will not recombine the split attached message, and will tell you it cannot export the attachment of the attached e-mails.
In a PST file, with invalid encryption type, created by ExMerge v6.5.7529.0,from a ExChange 2003 server with BlackBerry RIM the codepage 1200 was found. Codepage 1200 is not defined by MAPI but used by Windows to indicate Unicode string. So the PST contains (extended) ASCII strings with the Unicode-'codepage'.
For now only UTF-8 strings have been found, but in the Extensible Storage Engine (ESE) Database File (EDB) format codepage 1200 is either used for UTF-8 or UTF‑16 little‑endian strings.
Libpff as of version 20100429 will handle this scenario automatically.
In a Unicode PST file, probably copied from an extended ASCII PST file, some messages contain values which are stored as Unicode strings but actually contains an extended ASCII string stored as 16-bit characters.
Libpff does not handle this scenario yet.
In a PST file, the string size of a name-to-id map entry is larger than the total name-to-id map strings data. Since the name-to-id map strings are stored without an end-of-string character it is not directly possible to tell the size of the string.
Libpff as of version 20130731 will detect this scenario and mark the name-to-id entry as corrupted.
An alternative approach could be to determine the first valid name-to-id entry string offset after that of the corrupted entry.
In a PST file in a 7c or ac table the value array number of a record entry is larger than the number of entries the value array holds.
Libpff as of version 20130804 will detect this scenario and ignore the corrupted record entry and mark the corresponding table as missing record entry data.
TODO: add info about the free map and free page map (header)
33 message store 97 name-to-id map 290 root folder 301 sub folders (7c table) 302 sub messages (7c table) 303 sub associated items (7c table) 481 empty descriptor 513 Active searches list (see below) 609 Search criteria list (see below) 641 Search gatherer queue (empty descriptor) 673 Search gatherer descriptor (see below) 801 empty descriptor 1549 empty 7c table (template of some kind?) 1550 empty 7c table (template of some kind?) 1551 empty 7c table (template of some kind?) 1552 empty 7c table (template of some kind?) 1579 7c table 1612 empty 7c table (template?) 1649 empty 7c table (also used by attachments sub item) (template?) 1682 empty 7c table (also used by recipients sub item) (template?) 1718 empty 7c table (also used by unknowns sub item) (template?) 1751 empty 7c table (template?) 1784 7c table (template?) 1840 emails/items table (ac table) 3073 guid lookup table (9c table) > 8194 related folder items 8742 empty descriptor (Search update queue) 8743 bc table (Search update queue) 8752 empty 7c table (Search contents table) > 32813 other kind of folders ? 32813 7c table (Sub folders) 32814 empty 7c table (Sub messages) 32815 empty 7c table (Sub associated contents) 32845 7c table (Sub folders) 32846 empty 7c table (Sub messages) 32847 empty 7c table (Sub associated contents) 32877 empty 7c table (Sub folders) 32878 empty 7c table (Sub messages) 32879 empty 7c table (Sub associated contents) 32909 empty 7c table 32910 emails/items table (7c table) 32911 empty 7c table > 524326 special folders ? 524326 empty descriptor (Search update queue) 524327 bc table (Search update queue) 524336 empty 7c table (Search contents table) > 1048648 messages ?
00000000: 23 22 00 00 23 00 08 00 43 00 08 00 63 00 08 00 #"..#... C...c... 00000010: 83 00 08 00 .... 00000000: 23 22 00 00 23 00 08 00 43 00 08 00 63 00 08 00 #"..#... C...c... 00000010: 83 00 08 00
List of descriptors?
00000000: 43 21 00 00 63 21 00 00 83 21 00 00 a3 21 00 00 C!..c!.. .!...!.. 00000010: c3 21 00 00 23 80 00 00 43 80 00 00 63 80 00 00 .!..#... C...c... 00000020: 83 80 00 00 a3 80 00 00 c3 80 00 00 e3 80 00 00 ........ ........ 00000030: 03 81 00 00 23 81 00 00 43 81 00 00 63 81 00 00 ....#... C...c... 00000040: 83 81 00 00 a3 81 00 00........ 00000000: 43 21 00 00 63 21 00 00 83 21 00 00 a3 21 00 00 C!..c!.. .!...!.. 00000010: c3 21 00 00 23 80 00 00 43 80 00 00 63 80 00 00 .!..#... C...c... 00000020: 83 80 00 00 a3 80 00 00 c3 80 00 00 e3 80 00 00 ........ ........ 00000030: 03 81 00 00 23 81 00 00 43 81 00 00 63 81 00 00 ....#... C...c... 00000040: 83 81 00 00 a3 81 00 00........
List of descriptors?
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000010: 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 ........ ........ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000050: 00 00 00 00 00 00 00 00........ 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000010: 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 ........ ........ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........ 00000050: 00 00 00 00 00 00 00 00........
33 message store 97 name-to-id map 290 root folder 301 sub folders (7c table) 302 sub messages (7c table) 303 sub associated items (7c table) 481 empty descriptor 513 (unknown) 609 (unknown) 641 empty descriptor 673 (unknown) 801 empty descriptor template tables or used to store empty tables for the specific purpose only once? 1549 empty 7c table (folder related template?) 1550 empty 7c table (message related template?) 1551 empty 7c table (message related template?) 1552 empty 7c table (message related template?) 1556 empty 8c table 1561 empty 7c table (identifier related template?) 1579 7c table with report data (contains IPC, IPM, REPORT.IPM) 1612 empty 7c table (Submit related template?) 1649 empty 7c table (attachments related template?) (attachments sub item) 1682 empty 7c table (recipients related template?) (recipients sub item) 1718 empty 7c table (identifier related template?) (unknowns sub item) 1751 empty 7c table (identifier related template?) 1784 empty 7c table (identifier related template?) 2049 6c table (contains GUID that map to other GUIDs?) 2081 8c table (folder identifier related table? 0x67f4 value related) 2113 7c table (folder identifier releated table? 0x36de value related) 3073 empty 9c table > 8194 related folder items 8194 folder (no root item) 8205 sub folders 8206 sub messages 8207 sub associated items 8212 empty 8c table 8213 empty 8c table ... 8739 SPAM search folder 2 (Outlook.ItemProcessor) (no root item) 8742 empty descriptor (8739+3) 8743 bc table (contains a single entry 0x660b 0x0003) (8739+4) 8752 sub associated items (empty 7c table) ... > 32813 other kind of folders ? 32802 Calendar (no root item) 32813 sub folders 32814 sub messages 32815 sub associated items 32820 empty 8c table 32821 empty 8c table 32825 empty 7c table (32802+23) ... > 524326 special folders ? 524323 Reminder folder (Outlook.Reminder) (no root item) 524326 empty descriptor 524327 bc table 524336 empty 7c table 524355 To-do search folder (IPF.Task) (no root item) 524358 empty descriptor 524359 bc table (IPM.Appointment) 524368 empty 7c table 524387 ItemProcSearch folder (Outlook.ItemProcessor) (no root item) 524390 empty descriptor 524391 bc table 524400 7c table (messages) 524419 Tracked Mail Processing folder (IPF.Note) (no root item) 524422 empty descriptor 524423 bc table 524432 empty 7c table > 1048648 messages ?
33 message store 39 bc table with "MAPIPDL" 41 0x01 0x00 data type 63 7c table with contacts 95 0x02 0x00 data type 255 empty 7c table 287 0x01 0x00 data type 319 bc table
41 (0x29) libpff_table_read: table:
00000000: 01 00 00 00 00 00 00 00 00 00 00 00 1e 00 01 30 ........ .......0 00000010: 00 00 00 00 ....
95 (0x5f) libpff_table_read: table:
00000000: 02 00 00 00 00 00 00 00 00 00 00 00 1e 00 00 e3 ........ ........ 00000010: 00 00 00 00 02 01 0b 30 00 00 00 00 .......0 ....
287 (0x11f) libpff_table_read: table:
00000000: 01 00 00 00 00 00 00 00 00 00 00 00 03 00 00 e4 ........ ........ 00000010: 00 00 00 00 ....
The record key in the message store item (33) is used as GUID in the entry identifiers for the within the PST file.
0x0ff9 (PidTagRecordKey : Record key) 0x0102 (PT_BINARY : Binary data) GUID : 4a2f9232-b9d7-4afc-a1d5-9634785b50af
The 0x6615 value in the message store item (33) is used as GUID in the entry identifiers for the within the OST file.
0x6615 (_UNKNOWN_ : Unknown) 0x0048 (PT_CLSID : GUID (128-bit)) GUID : 1b5d8add-509a-4cb8-af41-8ff3e6375fca
Note that the entryid object identifier data is different than those used in a PST file.
0x36d0 (PidTagIpmAppointmentEntryId : Calendar folder entry identifier) 0x0102 (PT_BINARY : Binary data) Entry identifier: Flags: 0x00, 0x00, 0x00, 0x00 Service provider identifier : 77c9cc1c-4915-4c72-bd2f-f28d487b92cc (Unknown) Object identifier data: 00000000: 01 00 00000000: 8e 5d f2 3a 78 32 7a 41 ae 4c 2c ce 44 e1 ...].:x2 zA.L,.D. 00000010: 8a 88 GUID: 3af25d8e-3278-417a-ae4c-2cce44e18a88 00000010: 00 00 00 19 50 c5 First part of the PRQ_ID_SECURE4 value 00000010:00 00......P. .. Corresponding item 0x65e2 (PR_CHANGE_KEY : Change key) 0x0102 (PT_BINARY : Binary data) GUID : af1252d6-dd91-4391-b1fa-ee82341e0c04 Unknown1: 00000000: 00 00 04 12 .... 0x65e3 (PR_PREDECESSOR_CHANGE_LIST : Predecessor change list) 0x0102 (PT_BINARY : Binary data) Size : 0x16 GUID : 3af25d8e-3278-417a-ae4c-2cce44e18a88 Unknown: 00000000: 00 00 00 19 60 c8 ....`. Size : 0x14 GUID : af1252d6-dd91-4391-b1fa-ee82341e0c04 Unknown: 00000000: 00 00 04 12 0x67f4 (PRQ_ID_SECURE4 : ) 0x0014 (PT_I8 : Integer 64-bit signed) integer 64-bit signed : -4228852562310201343 (0xc550190000000001) 6c table record entry guid: 3af25d8e-3278-417a-ae4c-2cce44e18a88 record entry values array number : 0x0001 record entry value guid : 3af25d8e-3278-417a-ae4c-2cce44e18a88 maps the guid to the last part of the PRQ_ID_SECURE4 8c table identifier: 0xc550190000000001 descriptor identifier : 0x00008080 maps the PRQ_ID_SECURE4 to a descriptor identifier, requires correction of the lower four bits
But what is record key used for?
0x0ff9 (PidTagRecordKey : Record key) 0x0102 (PT_BINARY : Binary data) GUID : e71f4b30-150f-410d-90f9-1e7d9204d76e
Server Failures 0x65e2 (PR_CHANGE_KEY : Change key) 0x0102 (PT_BINARY : Binary data) GUID : 5003c7c4-af96-4aa3-9d5b-e0c1e039d0ef Unknown1: 00000000: 00 00 00 12 76 38 ....v8 0x65e3 (PR_PREDECESSOR_CHANGE_LIST : Predecessor change list) 0x0102 (PT_BINARY : Binary data) Size : 0x16 GUID : 3af25d8e-3278-417a-ae4c-2cce44e18a88 Unknown: 00000000: 00 00 00 3c f4 8e ...<.. Size : 0x16 GUID : 5003c7c4-af96-4aa3-9d5b-e0c1e039d0ef Unknown: 00000000: 00 00 00 12 76 38 ....v8 0x67f4 (PRQ_ID_SECURE4 : ) 0x0014 (PT_I8 : Integer 64-bit signed) integer 64-bit signed : 929777818772963329 (0xce73c0000000001)
Inbox 0x65e2 (PR_CHANGE_KEY : Change key) 0x0102 (PT_BINARY : Binary data) GUID : af1252d6-dd91-4391-b1fa-ee82341e0c04 Unknown1: 00000000: 00 00 04 0f 0x65e3 (PR_PREDECESSOR_CHANGE_LIST : Predecessor change list) 0x0102 (PT_BINARY : Binary data) Size : 0x16 GUID : 5003c7c4-af96-4aa3-9d5b-e0c1e039d0ef Unknown: 00000000: 00 00 00 0c 9c d8 Size : 0x14 GUID : af1252d6-dd91-4391-b1fa-ee82341e0c04 Unknown: 00000000: 00 00 04 0f 0x67f4 (PRQ_ID_SECURE4 : ) 0x0014 (PT_I8 : Integer 64-bit signed) integer 64-bit signed : -2209002423085694974 (0xe1580c0000000002) 0xd89c0c0000000002
Archive Search 0x65e2 (PR_CHANGE_KEY : Change key) 0x0102 (PT_BINARY : Binary data) GUID : af1252d6-dd91-4391-b1fa-ee82341e0c04 Unknown1: 00000000: 00 00 04 1c .... 0x65e3 (PR_PREDECESSOR_CHANGE_LIST : Predecessor change list) 0x0102 (PT_BINARY : Binary data) Size : 0x16 GUID : 5003c7c4-af96-4aa3-9d5b-e0c1e039d0ef Unknown: 00000000: 00 00 00 0c a4 e4 ...... Size : 0x14 GUID : af1252d6-dd91-4391-b1fa-ee82341e0c04 Unknown: 00000000: 00 00 04 1c 0x67f4 (PRQ_ID_SECURE4 : ) 0x0014 (PT_I8 : Integer 64-bit signed) integer 64-bit signed : -2136944829047767038 (0xe2580c0000000002)
6c table record entry guid: 5003c7c4-af96-4aa3-9d5b-e0c1e039d0ef record entry values array number : 0x0002 record entry value guid : 5003c7c4-af96-4aa3-9d5b-e0c1e039d0ef
[SMITH02]
Title: | outlook.pst — format of MS Outlook .pst file |
---|---|
Author(s): |
David Smith, Joe Nahmias, Brad Hards, Carl Byington |
URL: |
[LIBESEDB]
Title: | Extensible Storage Engine (ESE) Database File (EDB) format |
---|---|
Author(s): |
Joachim Metz |
URL: |
[LIBFMAPI]
Title: | Message API (MAPI) definitions |
---|---|
Auhtor: |
Joachim Metz |
URL: |
https://github.com/libyal/libfmapi/blob/master/documentation/MAPI%20definitions.pdf |
[MSDN]
Title: | Microsoft Developer Network |
---|---|
URL: |
[MS-PST]
Title: | [MS-PST] Outlook Personal Folders File Format (.pst) structure specification |
---|---|
URL: |
[MACBINARY]
Title: | Macintosh Binary Transfer Format ("MacBinary") Standard Proposal |
---|
Title: | Macintosh Binary Transfer Format ("MacBinary II") Standard Proposal |
---|
Title: | Macintosh Binary Transfer Format ("MacBinary III") Standard Proposal |
---|
[OPENCHANGE]
Title: | Openchange MAPI library |
---|---|
URL: |
[RFC1950]
Title: | ZLIB Compressed Data Format Specification |
---|---|
Version: |
3.3 |
Author(s): |
P. Deutsch, J-L. Gailly |
Date: |
May 1996 |
URL: |
[RFC1951]
Title: | DEFLATE Compressed Data Format Specification |
---|---|
Version: |
1.3 |
Author(s): |
P. Deutsch |
Date: |
May 1996 |
URL: |
[ROTHMAN99]
Title: | The Compressed RTF Format |
---|---|
Author(s): |
Amichai Rothman |
URL: |
Version 1.3, 3 November 2008 Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc. http://fsf.org/
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.
This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.
This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.
A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.
The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.
The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.
A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.
The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.
The "publisher" means any person or entity that distributes copies of the Document to the public.
A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
-
Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.
-
List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.
-
State on the Title page the name of the publisher of the Modified Version, as the publisher.
-
Preserve all the copyright notices of the Document.
-
Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
-
Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
-
Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.
-
Include an unaltered copy of this License.
-
Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
-
Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.
-
For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.
-
Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
-
Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.
-
Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.
-
Preserve any Warranty Disclaimers.
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.
You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.
You may not copy, modify, sublicense, or distribute the Document except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, or distribute it is void, and will automatically terminate your rights under this License.
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, receipt of a copy of some or all of the same material does not give you any rights to use it.
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. If the Document specifies that a proxy can decide which future versions of this License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Document.
"Massive Multiauthor Collaboration Site" (or "MMC Site") means any World Wide Web server that publishes copyrightable works and also provides prominent facilities for anybody to edit those works. A public wiki that anybody can edit is an example of such a server. A "Massive Multiauthor Collaboration" (or "MMC") contained in the site means any set of copyrightable works thus published on the MMC site.
"CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0 license published by Creative Commons Corporation, a not-for-profit corporation with a principal place of business in San Francisco, California, as well as future copyleft versions of that license published by that same organization.
"Incorporate" means to publish or republish a Document, in whole or in part, as part of another Document.
An MMC is "eligible for relicensing" if it is licensed under this License, and if all works that were first published under this License somewhere other than this MMC, and subsequently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant sections, and (2) were thus incorporated prior to November 1, 2008.
The operator of an MMC Site may republish an MMC contained in the site under CC-BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible for relicensing.