SDF to Excel file, in an automated fashion

Sharing SDF files between chemists is often a pain. The format is supposed to be vanilla and super-standard, but it sometimes still gives everyone involved a headache, especially when moving SDF files between two chemistry codes, and even more so if hydrogens are involved...

For this reason, and because some people ONLY work with Excel files, it's good to be able to convert an SDF file to an Excel file (especially xlsx) automatically. With pandas and rdkit, it's easy to do.
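A minimal sketch of such a conversion, assuming RDKit's PandasTools.LoadSDF is used to read the file and that a SMILES column is an acceptable stand-in for the structure. The function names `default_xlsx_path` and `sdf_to_xlsx` are made up for illustration, and the file names are placeholders.

```python
from pathlib import Path


def default_xlsx_path(sdf_path):
    """Return the SDF path with its suffix swapped for .xlsx (illustrative helper)."""
    return Path(sdf_path).with_suffix(".xlsx")


def sdf_to_xlsx(sdf_path, xlsx_path=None):
    """Load an SDF into a DataFrame and write its properties to an Excel file."""
    # Imported here so the helper above stays usable without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import PandasTools

    xlsx_path = Path(xlsx_path) if xlsx_path else default_xlsx_path(sdf_path)

    # One row per molecule; SD properties become columns, molecules go to "ROMol".
    df = PandasTools.LoadSDF(str(sdf_path), molColName="ROMol")

    # Mol objects cannot be stored in a cell, so keep a SMILES string instead.
    df["SMILES"] = df["ROMol"].apply(Chem.MolToSmiles)
    df = df.drop(columns=["ROMol"])

    # pandas picks an installed Excel engine (e.g. xlsxwriter or openpyxl).
    df.to_excel(xlsx_path, index=False)
    return xlsx_path
```

Calling `sdf_to_xlsx("molecules.sdf")` would then write `molecules.xlsx` next to the input. This version carries text only; it does not put molecule images into the sheet.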

pandas can use the xlsxwriter module to write the Excel format. There is no easy way to pass image objects embedded in a pandas.DataFrame down to xlsxwriter. The writer itself, however, supports an insert_image method that takes a filename as an argument. The easiest way is to make pandas detect that a cell contains a string ending with .png and use insert_image for that cell, which is a bit of a hack.

And here you go: molecule_data.xlsx has a beautiful column with molecule images. There is one catch: one needs to modify pandas a tiny-tiny bit...
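A minimal sketch of the same idea that leaves pandas untouched: instead of patching the pandas writer, the rows are written with xlsxwriter directly, and any cell whose value is a string ending in .png is handed to insert_image. This is a swapped-in alternative, not the pandas modification from the post. The names `is_image_path`, `write_rows` and `dataframe_to_xlsx_with_images` and the row height of 110 are assumptions for illustration. The PNG files are assumed to exist already (for example, written per molecule with RDKit's Draw.MolToFile).

```python
import math


def is_image_path(value):
    """True if a cell value looks like a PNG filename."""
    return isinstance(value, str) and value.lower().endswith(".png")


def is_missing(value):
    """True for None and float NaN, which are left as blank cells."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def write_rows(worksheet, header, rows, row_height=110):
    """Write rows to an xlsxwriter-style worksheet, embedding .png cells as images."""
    worksheet.write_row(0, 0, header)
    for r, row in enumerate(rows, start=1):
        has_image = False
        for c, value in enumerate(row):
            if is_missing(value):
                continue
            if is_image_path(value):
                # The cell holds a filename: insert the picture, not the text.
                worksheet.insert_image(r, c, value)
                has_image = True
            else:
                worksheet.write(r, c, value)
        if has_image:
            # Make the row tall enough for the picture (height is a guess).
            worksheet.set_row(r, row_height)


def dataframe_to_xlsx_with_images(df, xlsx_path):
    """Write a DataFrame to xlsx, turning .png filename cells into images."""
    import xlsxwriter  # imported here so write_rows can be tried without it

    workbook = xlsxwriter.Workbook(str(xlsx_path))
    worksheet = workbook.add_worksheet()
    write_rows(worksheet, list(df.columns), df.itertuples(index=False, name=None))
    workbook.close()
```

The trade-off is that pandas formatting options for to_excel are not available this way, but no change to pandas is needed. `write_rows` only relies on the worksheet having write_row, write, insert_image and set_row methods.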


  1. What modifications are required for pandas?

  2. Hi Christos, thanks for your question – I added the modification to the post, it's a rather dirty hack to pandas ;)


