A closer look at the standard Fortran C compatibility layer by exploring objects and linkers

Background

Derived types and their interoperability have been covered previously in the context of usage from Python. However, much of the focus of the previous approach revolved around the iso_c_binding intrinsic module. A closer inspection of the functionality provided therein is the first step towards extending beyond the standards to support calling type bound procedures. This is an often overlooked aspect of the derived type usage pattern, in terms of interoperability.

Series

Thoughts relating to the interoperability of Fortran with other languages can be loosely collated into the following series.

  1. NumPy Meson Fortran
  2. Simple Fortran Derived Types and Python
  3. Exploring ISO_C_BINDING and type-bound procedures <– You are here!

Setup

We would like to understand the effect of the intrinsic iso_c_binding module. So, at first, we will reuse the basic cartesian type of the previous post, along with a non-type-bound procedure which is to be called from C.

! vec.f90
module vec
  implicit none
  ! At least 15 decimal digits and a decimal exponent range of 9
  integer, parameter :: dd = selected_real_kind(15,9)

  type :: cartesian
    real(kind=dd) :: x,y,z
  end type cartesian

contains

  subroutine unit_move(vec)
    type(cartesian), intent(inout) :: vec
    print*, "Modifying the derived type now!"
    vec%x = vec%x + 1
    vec%y = vec%y + 1
    vec%z = vec%z + 1
  end subroutine unit_move

end module vec

There isn’t a whole lot going on, except that the iso_c_binding labels have been dropped, and correspondingly, instead of c_double we now have an approximate kind mapping.

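The associated C driver is exactly the same. In this particular situation the explicit prototype is, strictly speaking, optional, since pre-C99 dialects would accept the call through an implicit declaration; it is kept because C99 removed implicit declarations and recent compilers reject them.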

/* vecfc.c */
#include <stdlib.h>
#include <stdio.h>
#include "vecfc.h"

/* A Fortran subroutine returns nothing, so the C prototype is void */
void unit_move(cartesian *vec);

int main(int argc, char* argv[argc+1]) {
    puts("Initializing the struct");
    cartesian a={3.0, 2.5, 6.2};
    printf("%f %f %f",a.x,a.y,a.z);
    puts("\nFortran function with derived type from C:");
    unit_move(&a);
    puts("\nReturned from Fortran");
    printf("%f %f %f",a.x,a.y,a.z);
    return EXIT_SUCCESS;
}

To complete the translation unit, our header contains the struct definition.

#ifndef VECFC_H
#define VECFC_H

typedef struct {
  double x,y,z;
} cartesian;

#endif /* VECFC_H */

So as to not detract from the main focus of the post, instead of using meson or another build system, we will compile everything by hand.

Compilation

We will attempt to follow along the same logical process as in the bind(c) situation:

  1. Compile and assemble the Fortran module
  2. Compile, assemble and link the C driver with the module

A very literal attempt at satisfying the two step process above is perhaps as simple as:

gfortran -c vec.f90 # generates vec.o
gcc vec.o vecfc.c -I./ -lgfortran -o gf_vec
/usr/bin/ld: /tmp/ccOOdInK.o: in function `main':
vecfc.c:(.text+0x9a): undefined reference to `unit_move'
collect2: error: ld returned 1 exit status

Naturally, the linker will fail at this point. Recall that we can check which symbols are actually part of vec.o.

nm vec.o
                 U _gfortran_st_write
                 U _gfortran_st_write_done
                 U _gfortran_transfer_character_write
                 U _GLOBAL_OFFSET_TABLE_
                 U __stack_chk_fail
0000000000000000 T __vec_MOD___copy_vec_Cartesian
0000000000000000 B __vec_MOD___def_init_vec_Cartesian
000000000000002c T __vec_MOD_unit_move
0000000000000000 D __vec_MOD___vtab_vec_Cartesian

The first few undefined symbols are resolved by the -lgfortran directive to the linker (see footnote 1). Note that the function we care to call is actually named __vec_MOD_unit_move. We can also check the symbols required by our program.

gcc -c vecfc.c -I./ -o gf_vec.o
nm -u gf_vec.o
                 U _GLOBAL_OFFSET_TABLE_
                 U printf
                 U puts
                 U __stack_chk_fail
                 U unit_move

So our problem is essentially one of renaming to the right symbol.

Symbol renaming

A first approximation towards a solution is then evident: we will forcibly rename the symbol in the object generated from our Fortran code, thus allowing us to link and call the function. objcopy is fantastic for this.

gfortran -c vec.f90 # get vec.o
# Rename symbol
objcopy --redefine-sym=__vec_MOD_unit_move=unit_move vec.o
# Compile and link in one shot
gcc vec.o vecfc.c -I./ -lgfortran -o gf_vec
# Profit
./gf_vec
Initializing the struct
3.000000 2.500000 6.200000
Fortran function with derived type from C:
 Modifying the derived type now!

Returned from Fortran
4.000000 3.500000 7.200000

Indeed, apart from this very satisfying result, we can also verify that none of our own symbols are left undefined.

nm -u gf_vec
# nothing but library functions

Switching compilers

There are a number of caveats with the approach described so far, but perhaps one of the more striking ones has to do with changing the compiler. Consider the symbols generated by the Intel compiler.

ifort -c vec.f90
nm vec.o
                 U for_write_seq_lis
0000000000000000 r __STRLITPACK_0
0000000000000000 r __STRLITPACK_1.0.2
0000000000000000 T vec._
0000000000000010 T vec_mp_unit_move_

As Fortran remains one of the few languages with a rich and varied set of compilers, each with its own level of standards support, it would be a rather tall order to keep track of the symbol mangling approach used by every one of them. More problematically, it is not just that the symbols are mangled differently; the generated code differs as well, down to the runtime library it expects.

ifort vec.o vecfc.o -lc
ld: vecfc.o: in function `main':
vecfc.c:(.text+0x0): multiple definition of `main'; /opt/intel/oneapi/compiler/2021.2.0/linux/bin/intel64/../../compiler/lib/intel64_lin/for_main.o:for_main.c:(.text+0x0): first defined here
ld: cannot find -lirng
ifort -c vec.f90 -fPIE
gcc vec.o vecfc.c -I./ -o gf_vec
/usr/bin/ld: vec.o: in function `unit_move':
vec.f90:(.text+0x55): undefined reference to `for_write_seq_lis'
collect2: error: ld returned 1 exit status

In any case, we recall from the info pages of gfortran that:

Note that just because the names match does not mean that the interface implemented by GNU Fortran for an external name matches the interface implemented by some other language for that same name. That is, getting code produced by GNU Fortran to link to code produced by some other compiler using this or any other method can be only a small part of the overall solution–getting the code generated by both compilers to agree on issues other than naming can require significant effort, and, unlike naming disagreements, linkers normally cannot detect disagreements in these other areas.

Simplifying labels

The first piece of functionality that accompanies iso_c_binding is rather easy to implement from a vendor perspective, yet vastly simplifying for the end user: the ability to provide a single binding label. Recall the bind(c) variant of the previous post.

! vec_bind.f90
module vec
  use, intrinsic :: iso_c_binding
  implicit none

  type, bind(c) :: cartesian
     real(c_double) :: x,y,z
  end type cartesian

  contains

  subroutine unit_move(array) bind(c)
    type(cartesian), intent(inout) :: array
    print*, "Modifying the derived type now!"
    array%x=array%x+1
    array%y=array%y+1
    array%z=array%z+1
  end subroutine unit_move

end module vec

Which generates the following symbols.

gfortran -o vec.o -c vec_bind.f90
nm vec.o
                 U _gfortran_st_write
                 U _gfortran_st_write_done
                 U _gfortran_transfer_character_write
                 U _GLOBAL_OFFSET_TABLE_
                 U __stack_chk_fail
000000000000002c T unit_move
0000000000000000 T __vec_MOD___copy_vec_Cartesian
0000000000000000 B __vec_MOD___def_init_vec_Cartesian
0000000000000000 D __vec_MOD___vtab_vec_Cartesian

This holds true across compilers as well, all without any invocation of objcopy or similar tricks.

ifort -c vec_bind.f90 -o vec.o
nm vec.o
                 U for_write_seq_lis
0000000000000000 r __STRLITPACK_1
0000000000000000 r __STRLITPACK_2.0.2
0000000000000010 T unit_move
0000000000000000 T vec._

Type-bound procedures

We would like to extend the discussion beyond where the light of the standard reaches, that is, to consider calling type-bound procedures from C, which is not supported by the standard. We will begin with a suitable modification of our code. In an ideal world, we would simply annotate the type-bound procedure.

! vec_typeb.f90
module vec
  use, intrinsic :: iso_c_binding
  implicit none

  type, bind(c) :: cartesian
     real(c_double) :: x,y,z
     contains
       procedure, pass(self) :: unitmv
  end type cartesian

  contains

  subroutine unit_move(self) bind(c)
    class(cartesian), intent(in) :: self
    print*, "Modifying the derived type now!"
    self%x=self%x+1
    self%y=self%y+1
    self%z=self%z+1
  end subroutine unit_move

end module vec

However, this understandably does not go very well.

ifort -c vec_typeb.f90
vec_typeb.f90(7): error #8575: A derived type with the BIND attribute shall not have a type bound procedure part.
     contains
-----^
vec_typeb.f90(14): error #8224: A derived type used with the CLASS keyword shall not have the BIND attribute or SEQUENCE property.   [CARTESIAN]
    class(cartesian), intent(in) :: self
gfortran -c vec.f90
vec.f90:7:13:

    7 |      contains
      |             1
Error: Derived-type ‘cartesian’ with BIND(C) must not have a CONTAINS section at (1)

Removing the bind(c) attribute from both the type and the subroutine, and rearranging slightly, we get the following.

module vec
  use, intrinsic :: iso_c_binding
  implicit none

  type :: cartesian
     real(kind=8) :: x,y,z
     contains
       procedure, pass(self) :: unitmv
  end type cartesian

  contains

  subroutine unitmv(self)
    class(cartesian), intent(inout) :: self
    print*, "Modifying the derived type now!"
    self%x=self%x+1.0
    self%y=self%y+1.0
    self%z=self%z+1.0
  end subroutine unitmv

end module vec

The passed-object dummy argument has to be declared with class rather than type here, since otherwise the compiler complains:

# ifort
error #8264: The passed-object dummy argument must be a polymorphic dummy data object if the type being defined is extensible.

In any case, we are now in a position to inspect the generated object.

gfortran -c vec_typeb.f90
nm vec_typeb.o
                 U _gfortran_st_write
                 U _gfortran_st_write_done
                 U _gfortran_transfer_character_write
                 U _GLOBAL_OFFSET_TABLE_
                 U __stack_chk_fail
0000000000000000 T __vec_MOD___copy_vec_Cartesian
0000000000000000 B __vec_MOD___def_init_vec_Cartesian
000000000000002c T __vec_MOD_unitmv
0000000000000000 D __vec_MOD___vtab_vec_Cartesian

Which does not seem to be very different at all. We will make an attempt to directly modify the symbol-table as before.

objcopy --redefine-sym=__vec_MOD_unitmv=unit_move vec_typeb.o

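An attempt to call this is bound for failure, a segmentation fault to be exact. A plausible explanation is that a class(cartesian) dummy argument is not passed as a bare pointer to the data: gfortran, for instance, passes a small descriptor holding a pointer to the data together with a pointer to the type's vtable, so the address of the C struct gets misinterpreted. In order to be able to call the type-bound procedure then, we can begin by writing a wrapper subroutine.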

subroutine unit_move(cartobj)
  type(cartesian), intent(inout) :: cartobj
  call cartobj%unitmv()
end subroutine unit_move

This does allow for the existing C interface to work.

gfortran -c vec_typeb.f90
objcopy --redefine-sym=__vec_MOD_unit_move=unit_move vec_typeb.o
gcc vec_typeb.o vecfc.c -I./ -o gf_vec -lgfortran
./gf_vec
Initializing the struct
3.000000 2.500000 6.200000
Fortran function with derived type from C:
 Modifying the derived type now!

Returned from Fortran
4.000000 3.500000 7.200000

It is useful to note at this point that Fortran compilers generally pass arguments by reference and not by value, so no additional copy-related overhead is incurred by this “wrapper” approach.

It was correctly pointed out on the Fortran Discourse that the pass-by-reference calling convention is not mandated by the standards, though in practice many compilers do pass by reference when the VALUE attribute is missing.

On the other hand, it does leave the programmer in a world bereft of iso_c_binding, which also means that the mapping of types and precisions becomes rather fluid.

Conclusions

This brief interlude on derived types and functions is more relevant in the context of automated binding generators like f2py (Peterson 2009).

Perhaps a more formal approach to wrapper generation is the one elaborated upon in great detail by Gray, Roberts, and Evans (1999), which introduces the concept of a logical interface and a physical interface.

A similar but more pragmatic approach is the opaque pointer method used for derived types with pointers in Pletzer et al. (2008), which forms the basis for the implementation in f90wrap (Kermode 2020).

The (ab)use of type-bound procedures in the context of these binding generation methodologies is to follow at some point, since none of them make any explicit mention of the same. Given that type-bound procedures were not introduced before Fortran 2003 (Reid 2003), it is not surprising that they have been overlooked; however, their usage is foundational for long-lasting, sustainable Fortran code.

References

Gray, M. G., R. M. Roberts, and T. M. Evans. 1999. “Shadow-Object Interface Between Fortran 95 and C++.” Computing in Science & Engineering 1 (2): 63–70. https://doi.org/10.1109/5992.753048.

Kermode, James R. 2020. “F90wrap: An Automated Tool for Constructing Deep Python Interfaces to Modern Fortran Codes.” Journal of Physics: Condensed Matter 32 (30): 305901. https://doi.org/10.1088/1361-648X/ab82d2.

Peterson, Pearu. 2009. “F2py: A Tool for Connecting Fortran and Python Programs.” International Journal of Computational Science and Engineering 4 (4): 296. https://doi.org/10.1504/IJCSE.2009.029165.

Pletzer, Alexander, Douglas McCune, Stefan Muszala, Srinath Vadlamani, and Scott Kruger. 2008. “Exposing Fortran Derived Types to C and Other Languages.” Computing in Science & Engineering 10 (4): 86–92. https://doi.org/10.1109/MCSE.2008.94.

Reid, John. 2003. “The New Features of Fortran 2003,” 38. https://wg5-fortran.org/N1551-N1600/N1579.pdf.


  1. As in all things, info nm provides the details on interpreting the remaining symbols.