SDF/Mol V2000 File Format Specification

Q

What are the SDF (Structure-Data File), or Mol, V2000 format specifications?

✍: Guest

A

Here is a summary of SDF/Mol V2000 format specifications:

1. Text File - An SDF file is a text file that stores multiple molecule structures.

2. Structure Separator Line ($$$$) - Each molecule structure is separated from others by the "$$$$" line.

3. Structure Header (3 lines) - Each molecule structure starts with a 3-line header. The first line should provide an ID for the structure. The second line should provide the source of the structure. The third line can be used for comments. Below is a 3-line SDF header example:

FYI-001
FYICenter.com
123456789012345678901234567890123456789012345678901234567890
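The first three points can be illustrated with a minimal Python sketch that splits SDF text on the "$$$$" line and reads the 3-line header of each structure. The function names split_records and read_header, the second record "FYI-002", and the placeholder body lines are assumptions made for this illustration only.

```python
def split_records(text):
    """Split SDF text into records, one list of lines per molecule."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    # Keep a trailing record that is not followed by a "$$$$" line.
    if any(line.strip() for line in current):
        records.append(current)
    return records


def read_header(record):
    """Return the 3 header lines of a record as a dict."""
    structure_id, source, comment = record[0:3]
    return {"id": structure_id, "source": source, "comment": comment}


# Two hypothetical records; "FYI-002" and the placeholder body lines are made up.
sample = (
    "FYI-001\nFYICenter.com\nfirst comment\n(counts, atoms, bonds ...)\n$$$$\n"
    "FYI-002\nFYICenter.com\n\n(counts, atoms, bonds ...)\n$$$$\n"
)
records = split_records(sample)
headers = [read_header(record) for record in records]
print(headers[0]["id"], headers[1]["id"])
```

The sketch keeps each record as a list of lines so that later steps can address the counts line and the blocks by position.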

4. The Counts Line - The 4th line of each structure, right after the 3-line header, is the counts line. It holds 12 fields, including the atom count and the bond count. Each field is 3 characters wide except the last one, which is 6 characters wide. The first field specifies the number of atoms. The second field specifies the number of bonds. Below is a counts line example that declares 13 atoms and 14 bonds:

 13 14  0  0  0  0  0  0  0  0  0     0
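Because the counts line is fixed-width, it is safer to slice it by column than to split it on spaces. The following is a minimal sketch of that idea, assuming the 11 x 3 + 6 character layout described above; parse_counts_line is a hypothetical helper name.

```python
def parse_counts_line(line):
    """Slice a counts line into eleven 3-character fields and one 6-character field."""
    widths = [3] * 11 + [6]
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width].strip())
        pos += width
    return fields


counts_line = " 13 14  0  0  0  0  0  0  0  0  0     0"
fields = parse_counts_line(counts_line)
num_atoms, num_bonds = int(fields[0]), int(fields[1])
print(num_atoms, num_bonds)
```

Slicing by column also works when a count fills all 3 characters and touches its neighbor, a case where splitting on whitespace would merge two values.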

5. The Atom Block - Following the counts line is the atom block with one atom per line. Each atom line starts with the x, y and z coordinates, taking 10 characters per coordinate. The coordinates are followed by a space and the atom's element symbol, which takes 3 characters. Additional atom properties can be specified after the element symbol. Below is an atom line example specifying the location of an "N" atom:

    0.8400   -0.1600    0.0000 N   0  0     0  0  0  0  0  0
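The atom line layout can be read the same way. This is a small sketch that assumes the column positions given above (three 10-character coordinates, one space, then a 3-character element symbol) and ignores the remaining properties; parse_atom_line is a hypothetical helper name.

```python
def parse_atom_line(line):
    """Return the element symbol and the (x, y, z) coordinates of an atom line."""
    x = float(line[0:10])
    y = float(line[10:20])
    z = float(line[20:30])
    # Column 30 is a space; the element symbol occupies the next 3 characters.
    symbol = line[31:34].strip()
    return symbol, (x, y, z)


atom_line = "    0.8400   -0.1600    0.0000 N   0  0     0  0  0  0  0  0"
symbol, coords = parse_atom_line(atom_line)
print(symbol, coords)
```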

6. The Bond Block - Following the atom block is the bond block with one bond per line. Each bond line starts with the indexes of the 2 atoms joined by the bond. The atom indexes are followed by the bond type, the bond stereo code and other properties. Each value in the bond line takes 3 characters. Below is a bond line example specifying a single, non-stereo bond between atom #2 and atom #1:

  2  1  1  0  2  0  0

Bond type codes are: 1 for single, 2 for double, 3 for triple.

Bond stereo codes are: 0 for none, 1 for pointing up, 6 for pointing down.

7. The Properties Block - Following the bond block is the properties block with one property per line. Each property line starts with "M  xxx", where "xxx" is the property ID. "M  END" indicates the end of the properties block. Below is a property line example that says "one entry follows: atom #1 has a charge of +2":

M  CHG  1   1   2

8. Custom Fields - After the properties block, multiple custom fields can be specified, with multiple lines per field. The first line identifies the field name in the form of "> <name>". The following lines specify the field value. An empty line ends the field.
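The bond line, charge property line, and custom field layouts described above can be sketched in Python as follows. This is an illustrative sketch, not a complete parser: the helper names and the sample field names "ID" and "Note" are assumptions, only the "CHG" property is handled, and the charge line is read as an entry count followed by (atom index, charge) pairs.

```python
def parse_bond_line(line):
    """Return (first atom, second atom, bond type, stereo code) from a bond line."""
    # Each value occupies 3 characters; only the first four values are read here.
    a1, a2, bond_type, stereo = (int(line[i:i + 3]) for i in range(0, 12, 3))
    return a1, a2, bond_type, stereo


def parse_charge_line(line):
    """Return {atom index: charge} from an "M  CHG" property line."""
    tokens = line.split()
    count = int(tokens[2])  # number of (atom, charge) pairs on this line
    values = [int(token) for token in tokens[3:3 + 2 * count]]
    return dict(zip(values[0::2], values[1::2]))


def parse_data_fields(lines):
    """Collect "> <name>" fields; each value ends at the first empty line."""
    fields, name, value = {}, None, []
    for line in lines:
        if name is None and line.startswith(">"):
            # The field name is the text between "<" and the last ">".
            name = line[line.index("<") + 1:line.rindex(">")]
        elif name is not None and line.strip() == "":
            fields[name] = "\n".join(value)
            name, value = None, []
        elif name is not None:
            value.append(line)
    if name is not None:  # last field not closed by an empty line
        fields[name] = "\n".join(value)
    return fields


bond = parse_bond_line("  2  1  1  0  2  0  0")
charges = parse_charge_line("M  CHG  1   1   2")
# Hypothetical custom fields, made up for this example.
data = parse_data_fields(["> <ID>", "FYI-001", "", "> <Note>", "line 1", "line 2", ""])
print(bond, charges, data)
```

Multi-line values are joined with newline characters here; a real reader may prefer to keep them as a list of lines.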

A good explanation of the SDF file format is given by Nonlinear Dynamics at nonlinear.com/progenesis/sdf-studio/v0.9/faq/sdf-file-format-guidance.aspx.

A more detailed description of the SDF file format is given by Accelrys Software Inc. at http://download.accelrys.com/freeware/ctfile-formats/

2020-04-16