molprint

molprint is a high-performance molecular fingerprint computation and similarity search library written in Rust.

The goal is to be fast enough for large-scale virtual screening, with a target of 5–10× faster than RDKit for bulk workflows, while matching RDKit bit-for-bit on standard benchmarks.

What it does

  • Parses SMILES strings into molecular graphs
  • Computes MACCS-166 structural key fingerprints (100% RDKit-accurate on ChEMBL 10k)
  • Computes Morgan/ECFP circular fingerprints at configurable radius and bit width (512–4096)
  • Calculates Tanimoto, Dice, and Cosine similarity using POPCNT on u64 word arrays
  • Runs parallel threshold and top-k screening via Rayon
  • Reads and writes FPS, SDF, and SMILES file formats (FPS is chemfp-compatible)
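
As an illustration of the similarity bullet above, here is a minimal sketch of the three metrics over fingerprints stored as u64 word slices. It is not taken from molprint's source: the function names, the f64 return type, and the choice to return 0.0 when both fingerprints are empty are assumptions made for this example. The bit counting uses u64::count_ones, which typically compiles to a hardware population-count instruction when the target supports one.

```rust
/// Number of set bits in a fingerprint stored as u64 words.
fn popcount(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}

/// Number of bits set in both fingerprints.
fn common_bits(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "fingerprints must have the same width");
    a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum()
}

/// Tanimoto: |A ∩ B| / |A ∪ B|.
/// Assumption: two empty fingerprints score 0.0 (libraries differ on this).
fn tanimoto(a: &[u64], b: &[u64]) -> f64 {
    let c = common_bits(a, b);
    let union = popcount(a) + popcount(b) - c;
    if union == 0 { 0.0 } else { c as f64 / union as f64 }
}

/// Dice: 2 |A ∩ B| / (|A| + |B|).
fn dice(a: &[u64], b: &[u64]) -> f64 {
    let c = common_bits(a, b);
    let total = popcount(a) + popcount(b);
    if total == 0 { 0.0 } else { 2.0 * c as f64 / total as f64 }
}

/// Cosine: |A ∩ B| / sqrt(|A| * |B|).
fn cosine(a: &[u64], b: &[u64]) -> f64 {
    let c = common_bits(a, b);
    let product = popcount(a) as f64 * popcount(b) as f64;
    if product == 0.0 { 0.0 } else { c as f64 / product.sqrt() }
}
```

A 2048-bit fingerprint occupies 32 such words, so each comparison is a short loop of AND and bit-count operations.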

Benchmarks

Measured on Apple M-series, Rust 1.94, --release.

Operation                  Performance
Tanimoto (2048-bit)        36 ns
Morgan ECFP4 batch         ~700k mol/s
MACCS-166 batch            ~535k mol/s
Screening 100k compounds   826 µs / query
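
To make the screening figure concrete, the following sketch shows what one query against a library involves: score every fingerprint, then keep either the hits above a threshold or the k best. It is a sequential, standard-library-only illustration and not molprint's implementation, which runs in parallel via Rayon. The function names threshold_screen and top_k_screen, the (index, score) result type, and the tie-breaking rule are all hypothetical.

```rust
/// Tanimoto similarity over u64 words.
/// Assumption: two empty fingerprints score 0.0.
fn tanimoto(a: &[u64], b: &[u64]) -> f64 {
    assert_eq!(a.len(), b.len(), "fingerprints must have the same width");
    let common: u32 = a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum();
    let bits_a: u32 = a.iter().map(|w| w.count_ones()).sum();
    let bits_b: u32 = b.iter().map(|w| w.count_ones()).sum();
    let union = bits_a + bits_b - common;
    if union == 0 { 0.0 } else { common as f64 / union as f64 }
}

/// Threshold screening: every library entry scoring at least `threshold`,
/// returned as (library index, score) in library order.
fn threshold_screen(query: &[u64], library: &[Vec<u64>], threshold: f64) -> Vec<(usize, f64)> {
    library
        .iter()
        .enumerate()
        .map(|(i, fp)| (i, tanimoto(query, fp)))
        .filter(|&(_, score)| score >= threshold)
        .collect()
}

/// Top-k screening: the k best-scoring entries, highest score first.
/// Ties are broken by lower library index (an arbitrary choice for this sketch).
fn top_k_screen(query: &[u64], library: &[Vec<u64>], k: usize) -> Vec<(usize, f64)> {
    let mut scored: Vec<(usize, f64)> = library
        .iter()
        .enumerate()
        .map(|(i, fp)| (i, tanimoto(query, fp)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    scored.truncate(k);
    scored
}
```

Sorting the full score list keeps the sketch short. For large libraries and small k, a bounded heap would avoid the full sort, and because each score is independent the scoring step parallelizes naturally.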

Quick example

use molprint_core::smiles::parse_smiles;
use molprint_fp::{morgan::Morgan, traits::Fingerprinter};
use molprint_search::metrics::tanimoto;

fn main() {
    // Parse two SMILES strings into molecular graphs.
    let mol_a = parse_smiles("c1ccccc1").unwrap(); // benzene
    let mol_b = parse_smiles("c1ccncc1").unwrap(); // pyridine

    // Morgan fingerprinter: radius 2, 2048 bits.
    let fp = Morgan::new(2, 2048);

    // Fingerprint both molecules and compare them.
    let sim = tanimoto(&fp.fingerprint(&mol_a), &fp.fingerprint(&mol_b));
    println!("{:.3}", sim); // 0.600
}

Workspace layout

Crate             Role
molprint-core     SMILES parser, molecular graph, ring perception, SMARTS
molprint-fp       Morgan and MACCS fingerprint algorithms
molprint-search   Similarity metrics and parallel screening
molprint-io       FPS, SDF, SMILES file I/O
molprint-cli      End-user CLI tool

Crates form a strict dependency chain: molprint-core → molprint-fp → molprint-search + molprint-io → molprint-cli. No cycles, no dev-dependency shortcuts across this chain.

License

MIT