Molecular encoder/featurizer using RDKit and OCaml

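Chemical fingerprints are lossy encodings of molecules. molenc encodes molecules as unfolded, counted fingerprints: potentially very long but sparse vectors of positive integers, where each stored value counts how often a feature occurs in the molecule. "Unfolded" means that features are not hashed down into a short fixed-length vector.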
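As an illustration of the idea (not molenc's actual code), here is a minimal Python sketch of an unfolded, counted fingerprint. It assumes a shared feature dictionary that gives each new feature the next free integer index; the feature strings such as `"C-1-O"` and the function name `unfolded_counted_fp` are hypothetical.

```python
from collections import Counter

def unfolded_counted_fp(features, dictionary):
    """Return a sparse {feature_index: count} mapping for one molecule."""
    fp = Counter()
    for feat in features:
        # A feature gets the next free index the first time it is seen.
        # No hashing or modulo is applied, so distinct features never collide.
        index = dictionary.setdefault(feat, len(dictionary))
        fp[index] += 1
    return dict(fp)

# Hypothetical feature strings for two molecules sharing one dictionary.
feature_dict = {}
fp_a = unfolded_counted_fp(["C-1-C", "C-1-O", "C-1-C", "C-2-O"], feature_dict)
fp_b = unfolded_counted_fp(["C-1-O", "N-1-C"], feature_dict)
print(fp_a)  # {0: 2, 1: 1, 2: 1}
print(fp_b)  # {1: 1, 3: 1}
```

The vector grows with the number of distinct features seen across the whole dataset, which is why a sparse representation is used: only non-zero counts are stored.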

Currently, Faulon fingerprints and atom pairs are supported.
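To make the atom pair idea concrete: an atom pair feature is usually described as a triple (type of atom i, topological distance in bonds, type of atom j), counted over all pairs of atoms. The Python sketch below works on a hand-written heavy-atom graph of 1-propanol and uses bare element symbols as atom types for brevity; the function names and the data layout are assumptions for illustration, not molenc's API.

```python
from collections import Counter, deque

def topological_distances(adjacency, source):
    """Breadth-first search: number of bonds from source to each reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def atom_pairs(atom_types, adjacency):
    """Count (type, distance, type) triples; assumes a connected molecule."""
    counts = Counter()
    for i in range(len(atom_types)):
        dist = topological_distances(adjacency, i)
        for j in range(i + 1, len(atom_types)):
            # Sort the two types so that (C, d, O) and (O, d, C) are one feature.
            a, b = sorted((atom_types[i], atom_types[j]))
            counts[(a, dist[j], b)] += 1
    return counts

# 1-propanol heavy atoms: C0-C1-C2-O3
types = ["C", "C", "C", "O"]
bonds = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
pairs = atom_pairs(types, bonds)
print(pairs[("C", 1, "C")])  # 2
print(pairs[("C", 3, "O")])  # 1
```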

Atom types are currently the quadruplet (number of pi electrons, element symbol, number of heavy-atom neighbors, formal charge). In the future, pharmacophore features (a more abstract/fuzzy atom typing scheme) might be supported.
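The atom type quadruplet can be pictured as a small record. The Python sketch below is only an illustration: the class name `AtomType`, its field names, and the dash-separated label format are hypothetical, and the example values are assigned by hand rather than computed by RDKit or molenc.

```python
from typing import NamedTuple

class AtomType(NamedTuple):
    pi_electrons: int      # number of pi electrons contributed by the atom
    symbol: str            # element symbol
    heavy_neighbors: int   # number of non-hydrogen neighbors
    formal_charge: int

    def label(self) -> str:
        # Hypothetical text encoding of the quadruplet.
        return (f"{self.pi_electrons}-{self.symbol}-"
                f"{self.heavy_neighbors}-{self.formal_charge}")

# Hand-assigned examples: a benzene ring carbon and the nitrogen of methylammonium.
aromatic_c = AtomType(1, "C", 2, 0)
ammonium_n = AtomType(0, "N", 1, 1)
print(aromatic_c.label())  # 1-C-2-0
print(ammonium_n.label())  # 0-N-1-1
```

Two atoms receive the same type only if all four fields match, so an aromatic carbon with two heavy-atom neighbors and one with three are distinct features.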


