"babel -o fpt ..." Command - Similarity Search

Q

How do I use the "babel -o fpt ..." command to do a similarity search?

✍: FYIcenter.com

A

You can use the "fpt" (fingerprint) output format with the "babel" command to run a similarity search, using the following syntax:

babel input_section -o fpt 

When you run the above command, Open Babel performs the following steps:

  1. Take the first molecule from the input data source as the query molecule, and calculate its fingerprint as the query fingerprint.
  2. Take the next molecule from the input data source as the object molecule and calculate its fingerprint as the object fingerprint.
  3. Calculate the Tanimoto coefficient of the query fingerprint and object fingerprint.
  4. Output the Tanimoto coefficient as the similarity score of the object molecule.
  5. Repeat from step 2 until no molecules remain in the input data source.
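
The scoring in the steps above can be sketched in a few lines of Python. This is only an illustration of the Tanimoto coefficient (bits set in both fingerprints divided by bits set in either), not Open Babel's own code. The function names and the toy fingerprints, written as sets of "on" bit positions, are assumptions made for this example.

```python
def tanimoto(query_bits, object_bits):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    common = len(query_bits & object_bits)   # bits set in both fingerprints
    total = len(query_bits | object_bits)    # bits set in at least one fingerprint
    # Two empty fingerprints have nothing to compare; 0.0 is a convention chosen here.
    return common / total if total else 0.0


def similarity_scores(fingerprints):
    """Score every fingerprint after the first one against the first one."""
    query, *objects = fingerprints
    return [tanimoto(query, obj) for obj in objects]


# Hypothetical toy fingerprints: the first one plays the role of the query molecule.
toy = [{1, 2, 3, 4}, {1, 2, 3}, {3, 4, 5, 6}, {7, 8}]
scores = similarity_scores(toy)
print(scores)
```

A score of 1.0 means the two fingerprints are identical, and 0.0 means they share no bits.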

Example 1 - If the query molecule is the first molecule in the same file as the object molecules, you can run the similarity search with a single input file, as shown below:

babel mymols.sdf -o fpt 

MOL_00000067
MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
MOL_00000105   Tanimoto from MOL_00000067 = 0.833333
MOL_00000296   Tanimoto from MOL_00000067 = 0.425926
MOL_00000320   Tanimoto from MOL_00000067 = 0.534884
MOL_00000328   Tanimoto from MOL_00000067 = 0.511111
MOL_00000338   Tanimoto from MOL_00000067 = 0.522727
MOL_00000354   Tanimoto from MOL_00000067 = 0.534884
MOL_00000378   Tanimoto from MOL_00000067 = 0.489362
MOL_00000391   Tanimoto from MOL_00000067 = 0.489362
10 molecules converted

The output shows that MOL_00000105 is the molecule most similar to the query molecule MOL_00000067, with the highest similarity score of 0.833333.
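
For longer result lists, picking out the best score by eye becomes tedious. The sketch below parses text in the layout shown above and sorts it by score. It assumes the output lines look exactly like the sample (with an optional leading ">" tolerated, since some Open Babel versions may print one). The helper names are made up for this example.

```python
import re

# Matches lines such as "MOL_00000083   Tanimoto from MOL_00000067 = 0.810811".
# The optional ">" prefix is an assumption about other output variants.
LINE_RE = re.compile(r"^>?(\S+)\s+Tanimoto from (.+?) = ([0-9.]+)\s*$")


def parse_scores(text):
    """Return (molecule_name, score) pairs, skipping lines that carry no score."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            results.append((match.group(1), float(match.group(3))))
    return results


def rank(text):
    """Sort the parsed pairs from most similar to least similar."""
    return sorted(parse_scores(text), key=lambda pair: pair[1], reverse=True)


# A few lines copied from the Example 1 output.
sample = """MOL_00000067
MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
MOL_00000105   Tanimoto from MOL_00000067 = 0.833333
MOL_00000296   Tanimoto from MOL_00000067 = 0.425926
10 molecules converted
"""
ranked = rank(sample)
print(ranked[0])  # ('MOL_00000105', 0.833333)
```

The query line and the "molecules converted" summary contain no score, so the parser skips them.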

Example 2 - If the query molecule is stored in a separate file from the object molecules, list the query file first and the object molecule file second, as shown below:

babel  query.sdf  mymols.sdf -o fpt 

MOL_00000067   Tanimoto from first mol = 0.0888889
MOL_00000083   Tanimoto from first mol = 0.0869565
MOL_00000105   Tanimoto from first mol = 0.0888889
MOL_00000296   Tanimoto from first mol = 0.0714286
MOL_00000320   Tanimoto from first mol = 0.0888889
MOL_00000328   Tanimoto from first mol = 0.0851064
MOL_00000338   Tanimoto from first mol = 0.0869565
MOL_00000354   Tanimoto from first mol = 0.0888889
MOL_00000378   Tanimoto from first mol = 0.0816327
MOL_00000391   Tanimoto from first mol = 0.0816327
11 molecules converted

The output shows that four molecules, MOL_00000067, MOL_00000105, MOL_00000320 and MOL_00000354, are tied as the most similar to the molecule stored in "query.sdf", each with the highest similarity score of 0.0888889.
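
When several molecules share the highest score, as in Example 2, a plain "maximum" lookup would report only one of them. The sketch below collects every molecule tied for the best score. The function name and the tolerance parameter are assumptions for illustration, and the (name, score) pairs are copied from the Example 2 output.

```python
def top_matches(scores, tolerance=1e-9):
    """Return the names of all molecules whose score equals the best score."""
    if not scores:
        return []
    best = max(score for _, score in scores)
    # Compare with a small tolerance rather than exact float equality.
    return [name for name, score in scores if abs(score - best) <= tolerance]


# (name, score) pairs taken from the Example 2 output.
example2 = [
    ("MOL_00000067", 0.0888889),
    ("MOL_00000083", 0.0869565),
    ("MOL_00000105", 0.0888889),
    ("MOL_00000296", 0.0714286),
    ("MOL_00000320", 0.0888889),
    ("MOL_00000328", 0.0851064),
    ("MOL_00000338", 0.0869565),
    ("MOL_00000354", 0.0888889),
    ("MOL_00000378", 0.0816327),
    ("MOL_00000391", 0.0816327),
]
print(top_matches(example2))
```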

 


2020-12-15