The proposed new conversion framework for OpenBabel makes each format self-contained. The next logical step is to compile each format as a DLL or shared library. (I have to confess to knowing nothing about UNIX dynamic or shared libraries. Most of the discussion here is about DLLs for Windows, but I suspect something similar is possible, and maybe easier, on Unix systems.) Adding a format could then be done merely by adding its DLL to the directory containing the main program. New formats could be introduced and old ones updated without the user having to recompile. (I suspect this plugin approach is more of an advantage on Windows systems.) The Accord chemistry control, and maybe other systems, seem to use this method.
The main chemistry part of OB then also needs to be in a DLL, since static linking would lead to a copy of OBMol etc. in each format DLL. This proved to be more difficult than it should have been with Visual C++ 6, and the next few paragraphs refer only to this compiler. (This is pretty esoteric, but I needed to write it down somewhere.)
Making classes, functions and global variables in the DLL visible to an application which is using it is called exporting. There are two ways of doing this. (I found this article useful.) Either you can put __declspec(dllexport) in each declaration, or you can have entries in a .DEF file specifying each exported entity by its decorated name. Since we want to export everything in the chemistry part of OB, the first method would involve too many code modifications. These are undesirable anyway, because the same files are used for other builds and on other platforms. The second method requires you to use a program to convert a .MAP file, which is produced in a first build, into an appropriate .DEF file, and then to build a second time. Apparently the MFC library itself is handled like this, according to Mike Blaszczak's book on Visual C++. The program I have used is derived from a sample program, map2def, from Sam Blackburn's WFC library.
The two-stage process worked OK for the classes and functions, but it is also necessary to export a few global variables which are declared in mol.h (etab is one of them). The process didn't work for them because a __declspec(dllimport) is required on their declaration in the program which is using the DLL; it is optional for classes and functions. This means that it is sadly necessary to modify mol.h, but the changes are not large. The extern keyword on these global variables is replaced by a macro EXTERN which resolves to __declspec(dllexport) extern when the main DLL is being compiled, to __declspec(dllimport) when an application which will use the DLL is being compiled, and to just extern in other cases. A similar strategy seems to have been taken in porting other large libraries to Windows DLLs.
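As a rough illustration, the EXTERN macro described above might be written along the following lines. The macro names OBDLL_EXPORTS and USING_OBDLL and the ElementTableStub type are hypothetical, chosen only for this sketch; the real mol.h may select the three cases differently.

```cpp
// Sketch only. OBDLL_EXPORTS and USING_OBDLL are hypothetical macro names,
// assumed to be set in the project settings of the relevant build.
#if defined(_WIN32) && defined(OBDLL_EXPORTS)
  // The main chemistry DLL is being compiled: export the variable.
  #define EXTERN __declspec(dllexport) extern
#elif defined(_WIN32) && defined(USING_OBDLL)
  // An application or format DLL that uses the chemistry DLL is being compiled.
  #define EXTERN __declspec(dllimport)
#else
  // Static builds and other platforms: an ordinary declaration.
  #define EXTERN extern
#endif

// Hypothetical stand-in for the type of a global such as etab.
struct ElementTableStub {
    int NumElements() const { return 3; }
};

// Declaration, as it would appear in the shared header.
EXTERN ElementTableStub etab;

// Definition. In a real build this line belongs in exactly one source file
// of the DLL, not in the header and not in code that imports the variable.
ElementTableStub etab;
```

With neither hypothetical macro defined, the header reads exactly as it did before the change, which is why the same file can still be used for other builds and platforms.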
OBConversion has a member function LoadFormatFiles(), called from its constructor, which runs once per session. It looks for files with the extension .obf in the current directory and loads them. When each file is loaded, the global instance of the format class in it is constructed, just as it is at startup with normal static linking, and registers the format with OBConversion. This code is platform dependent and some different code needs to be added for non-Windows systems, although it should be OK for other Windows compilers.
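A minimal sketch of what such a loader could look like on Windows follows. It is not the actual OBConversion code: the function names LoadFormatFilesSketch and LoadFormatFilesOnce are hypothetical, and only the Win32 calls FindFirstFileA, FindNextFileA, FindClose and LoadLibraryA are real. On other platforms the sketch does nothing, reflecting the remark above that different code is needed there.

```cpp
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: loads every *.obf file in the current directory
// and returns the names of the files that were loaded successfully.
std::vector<std::string> LoadFormatFilesSketch()
{
    std::vector<std::string> loaded;
#ifdef _WIN32
    WIN32_FIND_DATAA found;
    HANDLE search = FindFirstFileA("*.obf", &found);
    if (search == INVALID_HANDLE_VALUE)
        return loaded;                      // no format files present
    do {
        // Loading the DLL runs the constructors of its global objects,
        // which is where a format would register itself.
        std::string path = std::string(".\\") + found.cFileName;
        if (LoadLibraryA(path.c_str()) != NULL)
            loaded.push_back(found.cFileName);
    } while (FindNextFileA(search, &found));
    FindClose(search);
#endif
    // Non-Windows systems: nothing is loaded in this sketch.
    return loaded;
}

// Runs the search once per session, however many times it is called.
const std::vector<std::string>& LoadFormatFilesOnce()
{
    static const std::vector<std::string> loaded = LoadFormatFilesSketch();
    return loaded;
}
```

The function-local static in LoadFormatFilesOnce is one simple way to get the once-per-session behaviour; a static flag checked in the constructor would do equally well.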
No changes to the code in the files containing the format classes are necessary to allow a DLL to be made. It seems that it is not necessary to provide a DllMain(). Nor does anything need to be exported, and no lib file needs to be made, since all the linking takes place when the DLL is loaded: the global instance of the format class is constructed, and its constructor calls OBConversion::RegisterFormat().
Formats in DLLs can co-exist with formats statically linked with the main program. Any format in a DLL with the same ID (file extension) will replace a built-in one. This gives the best of both worlds: the convenience of built-in code with the opportunity to update it easily later without doing anything to the main program.
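The self-registration and replacement behaviour described above can be sketched in a few lines. All class and function names here (FormatSketch, ConversionSketch, SimulateLoadingFormatDll and so on) are hypothetical stand-ins rather than the real OpenBabel classes, and the "smi" ID is only an example. Loading a DLL is simulated by constructing a second format object later in the run.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a format base class.
class FormatSketch {
public:
    virtual ~FormatSketch() {}
    virtual std::string Description() const = 0;
};

// Hypothetical stand-in for the registry side of OBConversion.
class ConversionSketch {
public:
    // A later registration under the same ID replaces the earlier one.
    static void RegisterFormat(const std::string& id, FormatSketch* format) {
        Formats()[id] = format;
    }
    static FormatSketch* FindFormat(const std::string& id) {
        auto it = Formats().find(id);
        return it == Formats().end() ? nullptr : it->second;
    }
private:
    // Function-local static, so the map exists before any global format
    // object tries to register itself.
    static std::map<std::string, FormatSketch*>& Formats() {
        static std::map<std::string, FormatSketch*> formats;
        return formats;
    }
};

// A format linked statically with the main program.
class BuiltInSmilesSketch : public FormatSketch {
public:
    BuiltInSmilesSketch() { ConversionSketch::RegisterFormat("smi", this); }
    std::string Description() const override { return "built-in smi"; }
};
BuiltInSmilesSketch theBuiltInSmiles;   // global instance registers at startup

// A newer version of the same format, as it might live in a DLL.
class UpdatedSmilesSketch : public FormatSketch {
public:
    UpdatedSmilesSketch() { ConversionSketch::RegisterFormat("smi", this); }
    std::string Description() const override { return "smi from DLL"; }
};

// Stands in for loading the DLL: the object is constructed on the first
// call, after the built-in format has already registered.
void SimulateLoadingFormatDll() {
    static UpdatedSmilesSketch theUpdatedSmiles;
}
```

Before SimulateLoadingFormatDll() is called, looking up "smi" gives the built-in format; afterwards it gives the replacement, with no change to the code that performs the lookup.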
OBConversion can also be put into a DLL of its own, rather than being compiled with the main chemistry. This is worth doing only if it can be sufficiently isolated from the chemistry that it does not have to #include mol.h. An external program could then do file format conversions by including only obconversion.h and not mol.h. The chemistry could then be changed without having to recompile the external program. All that would be necessary is to replace the OBDLL.dll containing the chemistry and possibly the DLLs containing the formats.
So now OpenBabel at runtime could be in several files:
It is possible to combine the parts in several ways, for instance all together in one exe, or with OBConversion and the chemistry in a single dll. But I wanted to overcome the problems involved in separating the parts completely. Putting it together again doesn't seem to lead to any additional problems.
Using STL in DLLs can cause difficulties arising from hidden static objects in the code. There are Microsoft Knowledge Base articles Q172396 and Q168958 on the subject, but the current code seems to have avoided the difficulties.
More details of the files comprising the code for these parts, together with project files for VC++6 are to be found here.
Note that the DLLs are likely to be different for each Windows compiler, since they use C++ interfaces, which are not portable between compilers. This means that all the components would have to be built with the same compiler, which is a pity. It may be possible to overcome this, but I haven't yet thought how.