Written: 2021-12-19 16:25 +0000
Updated: 2024-08-06 00:53 +0000
Exploring ISO_C_BINDING and type-bound procedures
This post is part of the Bridging Fortran & Python series.
A closer look at the standard Fortran C compatibility layer by exploring objects and linkers
Background
Derived types and their interoperability have been covered previously in the
context of usage from Python. However, much of the focus of the previous
approach revolved around the iso_c_binding intrinsic module. A closer
inspection of the functionality provided therein is the first step
towards extending beyond the standards to support calling type-bound
procedures. This is an often overlooked aspect of the derived type usage
pattern in terms of interoperability.
Setup
We would like to understand the effect of the intrinsic iso_c_binding
module. So, to begin with, we will re-use the basic cartesian type of the
previous post, along with a non-type-bound procedure which is to be
called from C.
! vec.f90
module vec
  implicit none
  integer, parameter :: dd = selected_real_kind(15,9)

  type :: cartesian
     real(kind=dd) :: x,y,z
  end type cartesian

contains

  subroutine unit_move(vec)
    type(cartesian), intent(inout) :: vec
    print*, "Modifying the derived type now!"
    vec%x = vec%x + 1
    vec%y = vec%y + 1
    vec%z = vec%z + 1
  end subroutine unit_move

end module vec
There isn’t a whole lot going on, except that the iso_c_binding labels
have been dropped and, correspondingly, instead of c_double we now
have an approximate kind mapping.
The associated C driver is exactly the same, though in this particular
situation the function prototype is optional for compilers that still
accept implicit declarations.
/* vecfc.c */
#include <stdlib.h>
#include <stdio.h>
#include <vecfc.h>

/* A Fortran subroutine returns nothing, hence void */
void unit_move(cartesian *word);

int main(int argc, char* argv[argc+1]) {
  puts("Initializing the struct");
  cartesian a={3.0, 2.5, 6.2};
  printf("%f %f %f",a.x,a.y,a.z);
  puts("\nFortran function with derived type from C:");
  unit_move(&a);
  puts("\nReturned from Fortran");
  printf("%f %f %f",a.x,a.y,a.z);
  return EXIT_SUCCESS;
}
To complete the translation unit, our header contains the struct
definition.
#ifndef VECFC_H
#define VECFC_H

typedef struct {
  double x,y,z;
} cartesian;

#endif /* VECFC_H */
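Since the derived type no longer carries bind(c), nothing promises that its memory layout matches the C struct; the driver simply assumes three contiguous doubles. A minimal sketch of how that assumption could at least be pinned down on the C side follows. The struct is repeated so the snippet stands alone, and cartesian_layout_ok is a hypothetical helper name, not something from the original driver.

```c
#include <stddef.h>  /* offsetof */

/* Same definition as in vecfc.h, repeated so the snippet stands alone. */
typedef struct {
  double x, y, z;
} cartesian;

/* Compile-time check (C11): exactly three doubles, no padding. */
_Static_assert(sizeof(cartesian) == 3 * sizeof(double),
               "cartesian is expected to be three packed doubles");

/* Hypothetical helper: returns 1 when x, y, z sit back to back in memory. */
int cartesian_layout_ok(void) {
  return offsetof(cartesian, x) == 0
      && offsetof(cartesian, y) == sizeof(double)
      && offsetof(cartesian, z) == 2 * sizeof(double);
}
```

This only checks the C side; whether a given Fortran compiler lays out the non-bind(c) type in the same way remains an assumption.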
So as to not detract from the main focus of the post, instead of using
meson or another build system, we will compile everything by hand.
Compilation
We will attempt to follow along the same logical process as in the
bind(c) situation:
- Compile and assemble the Fortran module
- Compile, assemble and link the C driver with the module
A very literal attempt at satisfying the two step process above is perhaps as simple as:
gfortran -c vec.f90 # generates vec.o
gcc vec.o vecfc.c -I./ -lgfortran -o gf_vec
/usr/bin/ld: /tmp/ccOOdInK.o: in function `main':
vecfc.c:(.text+0x9a): undefined reference to `unit_move'
collect2: error: ld returned 1 exit status
Naturally, the linker will fail at this point. Recall that we can check
which symbols are actually part of vec.o.
nm vec.o
                 U _gfortran_st_write
                 U _gfortran_st_write_done
                 U _gfortran_transfer_character_write
                 U _GLOBAL_OFFSET_TABLE_
                 U __stack_chk_fail
0000000000000000 T __vec_MOD___copy_vec_Cartesian
0000000000000000 B __vec_MOD___def_init_vec_Cartesian
000000000000002c T __vec_MOD_unit_move
0000000000000000 D __vec_MOD___vtab_vec_Cartesian
Where the first few undefined functions are to be resolved by the
-lgfortran directive to the linker [1]. Note that the function we
care to call is actually called __vec_MOD_unit_move. We can also check
the symbols required by our program.
gcc -c vecfc.c -I./ -o gf_vec.o
nm -u gf_vec.o
                 U _GLOBAL_OFFSET_TABLE_
                 U printf
                 U puts
                 U __stack_chk_fail
                 U unit_move
So our problem is essentially one of renaming to the right symbol.
Symbol renaming
A first approximation towards a solution is then evident: we will
forcibly rename the symbol in the compiled Fortran object, thus allowing
us to link and call the function. objcopy is fantastic for this.
gfortran -c vec.f90 # get vec.o
# Rename symbol
objcopy --redefine-sym=__vec_MOD_unit_move=unit_move vec.o
# Compile and link in one shot
gcc vec.o vecfc.c -I./ -lgfortran -o gf_vec
# Profit
./gf_vec
Initializing the struct
3.000000 2.500000 6.200000
Fortran function with derived type from C:
 Modifying the derived type now!

Returned from Fortran
4.000000 3.500000 7.200000
Indeed, apart from this very satisfying result, we can verify that our symbols are not undefined as well.
nm -u gf_vec
# nothing but library functions
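Editing the object file is not the only way to reconcile the two names. A minimal sketch of an alternative, assuming the gfortran mangling seen in the nm output earlier, is a thin C shim that exposes unit_move and forwards to the mangled name. The body given to __vec_MOD_unit_move below is only a stand-in so that the snippet builds without vec.o; when linking against the real object it has to be deleted, leaving just the declaration.

```c
/* Same struct as in vecfc.h, repeated so the snippet stands alone. */
typedef struct {
  double x, y, z;
} cartesian;

/* The symbol gfortran emitted for unit_move in module vec (from `nm vec.o`).
   Names starting with a double underscore are reserved in C, so this
   relies on the compiler being lenient about it. */
void __vec_MOD_unit_move(cartesian *v);

/* STAND-IN ONLY: mimics the Fortran routine so this file links on its own.
   Remove this definition when linking against the real vec.o. */
void __vec_MOD_unit_move(cartesian *v) {
  v->x += 1.0;
  v->y += 1.0;
  v->z += 1.0;
}

/* The shim: gives the C driver the plain name it expects. */
void unit_move(cartesian *v) {
  __vec_MOD_unit_move(v);
}
```

The shim shares the main drawback of the objcopy route: the mangled name is specific to one compiler.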
Switching compilers
There are a number of caveats with the approach described so far, but perhaps one of the more striking ones has to do with changing the compiler. Consider the symbols generated by the Intel compiler.
ifort -c vec.f90
nm vec.o
                 U for_write_seq_lis
0000000000000000 r __STRLITPACK_0
0000000000000000 r __STRLITPACK_1.0.2
0000000000000000 T vec._
0000000000000010 T vec_mp_unit_move_
As Fortran remains one of the few languages to have a rich and varied set of compilers with varying levels of support and standardisation, it would be rather a tall order to keep track of the symbol mangling approaches used by every compiler. Indeed, problematically, it is not just that the symbols are mangled differently; the generated code is qualitatively different as well, for instance in the runtime library it depends on.
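To make the bookkeeping burden concrete, here is a rough sketch of what tracking just the two schemes seen so far might look like. The patterns are inferred purely from the nm listings in this post, the enum and function names are hypothetical, and the module and procedure names are assumed to be lowercase already.

```c
#include <stdio.h>   /* snprintf, size_t */

/* Hypothetical enum covering only the two compilers inspected in this post. */
typedef enum { COMPILER_GFORTRAN, COMPILER_IFORT } fcompiler;

/* Writes the expected symbol for a module procedure into buf.
   Returns the number of characters snprintf would have written,
   or -1 for an unknown compiler. */
int mangle_module_proc(fcompiler fc, const char *module, const char *proc,
                       char *buf, size_t n) {
  switch (fc) {
  case COMPILER_GFORTRAN: /* e.g. __vec_MOD_unit_move */
    return snprintf(buf, n, "__%s_MOD_%s", module, proc);
  case COMPILER_IFORT:    /* e.g. vec_mp_unit_move_ */
    return snprintf(buf, n, "%s_mp_%s_", module, proc);
  }
  return -1;
}
```

Every additional compiler, and potentially every compiler flag that alters name decoration, would need another case, which is exactly the maintenance problem described above.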
ifort vec.o vecfc.o -lc
ld: vecfc.o: in function `main':
vecfc.c:(.text+0x0): multiple definition of `main'; /opt/intel/oneapi/compiler/2021.2.0/linux/bin/intel64/../../compiler/lib/intel64_lin/for_main.o:for_main.c:(.text+0x0): first defined here
ld: cannot find -lirng
ifort -c vec.f90 -fPIE
gcc vec.o vecfc.c -I./ -o gf_vec
/usr/bin/ld: vec.o: in function `unit_move':
vec.f90:(.text+0x55): undefined reference to `for_write_seq_lis'
collect2: error: ld returned 1 exit status
In any case, we recall from the info pages of gfortran that:
Note that just because the names match does not mean that the interface implemented by GNU Fortran for an external name matches the interface implemented by some other language for that same name. That is, getting code produced by GNU Fortran to link to code produced by some other compiler using this or any other method can be only a small part of the overall solution–getting the code generated by both compilers to agree on issues other than naming can require significant effort, and, unlike naming disagreements, linkers normally cannot detect disagreements in these other areas.
Simplifying labels
The first piece of functionality associated with iso_c_binding is
actually rather easy to implement from a vendor perspective, yet vastly
simplifying for the end user: the ability to provide a single binding
label. Recall the bind(c) variant of the previous post.
! vec_bind.f90
module vec
  use, intrinsic :: iso_c_binding
  implicit none

  type, bind(c) :: cartesian
     real(c_double) :: x,y,z
  end type cartesian

  contains

  subroutine unit_move(array) bind(c)
    type(cartesian), intent(inout) :: array
    print*, "Modifying the derived type now!"
    array%x=array%x+1
    array%y=array%y+1
    array%z=array%z+1
  end subroutine unit_move

end module vec
Which generates the following symbols.
gfortran -o vec.o -c vec_bind.f90
nm vec.o
                 U _gfortran_st_write
                 U _gfortran_st_write_done
                 U _gfortran_transfer_character_write
                 U _GLOBAL_OFFSET_TABLE_
                 U __stack_chk_fail
000000000000002c T unit_move
0000000000000000 T __vec_MOD___copy_vec_Cartesian
0000000000000000 B __vec_MOD___def_init_vec_Cartesian
0000000000000000 D __vec_MOD___vtab_vec_Cartesian
This is true across compilers as well, all without any invocations of
objcopy or similar approaches.
ifort -c vec.f90 -o vec.o
nm vec.o
                 U for_write_seq_lis
0000000000000000 r __STRLITPACK_1
0000000000000000 r __STRLITPACK_2.0.2
0000000000000010 T unit_move
0000000000000000 T vec._
Type-bound procedures
We would like to extend the discussion beyond where the light of the standard reaches, that is, to consider calling type-bound procedures from C, which the standard does not support. We will begin with a suitable modification of our code. In an ideal world, we would simply annotate the type-bound procedure.
! vec_typeb.f90
module vec
  use, intrinsic :: iso_c_binding
  implicit none

  type, bind(c) :: cartesian
     real(c_double) :: x,y,z
   contains
     procedure, pass(self) :: unitmv
  end type cartesian

  contains

  subroutine unit_move(self) bind(c)
    class(cartesian), intent(in) :: self
    print*, "Modifying the derived type now!"
    self%x=self%x+1
    self%y=self%y+1
    self%z=self%z+1
  end subroutine unit_move

end module vec
However, this understandably does not go very well.
ifort -c vec_typeb.f90
vec_typeb.f90(7): error #8575: A derived type with the BIND attribute shall not have a type bound procedure part.
   contains
-----^
vec_typeb.f90(14): error #8224: A derived type used with the CLASS keyword shall not have the BIND attribute or SEQUENCE property. [CARTESIAN]
   class(cartesian), intent(in) :: self
gfortran -c vec.f90
vec.f90:7:13:

    7 |    contains
      |             1
Error: Derived-type ‘cartesian’ with BIND(C) must not have a CONTAINS section at (1)
Removing the attribute from both the type and the subroutine, and rearranging slightly, we get the following.
module vec
  use, intrinsic :: iso_c_binding
  implicit none

  type :: cartesian
     real(kind=8) :: x,y,z
   contains
     procedure, pass(self) :: unitmv
  end type cartesian

  contains

  subroutine unitmv(self)
    class(cartesian), intent(inout) :: self
    print*, "Modifying the derived type now!"
    self%x=self%x+1.0
    self%y=self%y+1.0
    self%z=self%z+1.0
  end subroutine unitmv

end module vec
The class keyword is required here instead of type, since otherwise:
# ifort
error #8264: The passed-object dummy argument must be a polymorphic dummy data object if the type being defined is extensible.
In any case, we are now in a position to inspect the generated object.
gfortran -c vec_typeb.f90
nm vec_typeb.o
                 U _gfortran_st_write
                 U _gfortran_st_write_done
                 U _gfortran_transfer_character_write
                 U _GLOBAL_OFFSET_TABLE_
                 U __stack_chk_fail
0000000000000000 T __vec_MOD___copy_vec_Cartesian
0000000000000000 B __vec_MOD___def_init_vec_Cartesian
000000000000002c T __vec_MOD_unitmv
0000000000000000 D __vec_MOD___vtab_vec_Cartesian
Which does not seem to be very different at all. We will make an attempt to directly modify the symbol-table as before.
objcopy --redefine-sym=__vec_MOD_unitmv=unit_move vec_typeb.o
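An attempt to call this from C is bound for failure, a segmentation fault to be exact. A likely reason is that the passed-object dummy argument is now polymorphic (class), which compilers such as gfortran appear to pass as a small descriptor rather than as a bare pointer to the data, so the C struct pointer is misinterpreted. In order to be able to call the type-bound procedure, then, we can begin by writing a wrapper subroutine.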
subroutine unit_move(cartobj)
  type(cartesian), intent(inout) :: cartobj
  call cartobj%unitmv()
end subroutine
This does allow for the existing C interface to work.
gfortran -c vec_typeb.f90
objcopy --redefine-sym=__vec_MOD_unit_move=unit_move vec_typeb.o
gcc vec_typeb.o vecfc.c -I./ -o gf_vec -lgfortran
./gf_vec
Initializing the struct
3.000000 2.500000 6.200000
Fortran function with derived type from C:
 Modifying the derived type now!

Returned from Fortran
4.000000 3.500000 7.200000
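The earlier segmentation fault becomes less mysterious with a rough picture of what a class(cartesian) dummy argument might look like. The sketch below rests on an assumption about gfortran's internal representation (a small descriptor holding a data pointer and a vtable pointer); neither standard specifies this, and class_cartesian and both function names are hypothetical. It shows what a callee expecting such a descriptor would find if handed a bare struct instead: the bits of x, read as an address.

```c
#include <stdint.h>  /* uintptr_t */
#include <string.h>  /* memcpy */

typedef struct {
  double x, y, z;
} cartesian;

/* ASSUMED shape of a polymorphic argument: data pointer plus vtable pointer. */
typedef struct {
  cartesian *data;
  const void *vptr;
} class_cartesian;

/* What a callee expecting a descriptor would take for the data pointer
   when handed a bare cartesian: its first pointer-sized bytes. */
uintptr_t misread_data_pointer(const cartesian *raw) {
  uintptr_t bits = 0;
  memcpy(&bits, raw, sizeof bits);
  return bits;
}

/* Roughly what the Fortran wrapper arranges on our behalf:
   a proper descriptor built around the plain struct. */
class_cartesian wrap(cartesian *raw, const void *vtab) {
  class_cartesian c = { raw, vtab };
  return c;
}
```

With x = 3.0 on a typical 64-bit IEEE 754 machine the misread value is 0x4008000000000000, which is the bit pattern of the double and not a pointer to anything; dereferencing it would plausibly produce the crash observed above.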
It is useful to note at this point that Fortran passes arguments by reference and not by value, so there are no additional copy-related overheads incurred by this “wrapper” approach.
It was correctly pointed out on the Fortran Discourse that the pass-by-reference calling convention is not mandated by the standards, though in practice many compilers do pass by reference when the VALUE attribute is missing.
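In C terms the distinction reads as follows. This is only an analogy, with hypothetical function names: a Fortran dummy argument without VALUE behaves, in practice, like the pointer parameter, while one with VALUE behaves like the struct passed as a copy.

```c
typedef struct {
  double x, y, z;
} cartesian;

/* Pointer parameter: the caller's struct is modified in place, no copy made. */
void move_by_reference(cartesian *v) {
  v->x += 1.0;
  v->y += 1.0;
  v->z += 1.0;
}

/* Struct parameter: the function works on a copy; the caller sees a change
   only if the returned value is assigned back. */
cartesian move_by_value(cartesian v) {
  v.x += 1.0;
  v.y += 1.0;
  v.z += 1.0;
  return v;
}
```

Calling move_by_value(a) leaves a untouched, whereas move_by_reference(&a) updates it directly, which is the behaviour the wrapper relies on.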
On the other hand, it does leave the programmer in a world bereft of
iso_c_binding, which also means that the mapping of types and
precisions becomes rather fluid.
Conclusions
This brief interlude on derived types and functions is more relevant in
the context of automated binding generators like f2py (Peterson 2009).
Perhaps a more formal approach to wrapper generation is the one elaborated upon in great detail by Gray, Roberts, and Evans (1999), who describe the concept of a logical interface and a physical interface.
A similar but more pragmatic approach is the opaque pointer method used
for derived types with pointers in Pletzer et al. (2008), which forms the
basis for the implementation in f90wrap (Kermode 2020).
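The opaque pointer idea can be sketched from the C side as follows. This is a generic illustration of the pattern, not the interface of Pletzer et al. or of f90wrap; every name is hypothetical, and the functions that would live in Fortran in a real binding are given C bodies here so that the snippet stands alone.

```c
#include <stdlib.h>  /* malloc, free */

/* Callers only ever see this incomplete type; the layout stays private. */
typedef struct cartesian_handle cartesian_handle;

/* In a real binding the definition and the functions below would sit on
   the Fortran side; they are written in C purely as a stand-in. */
struct cartesian_handle {
  double x, y, z;
};

cartesian_handle *cartesian_create(double x, double y, double z) {
  cartesian_handle *h = malloc(sizeof *h);
  if (h != NULL) {
    h->x = x;
    h->y = y;
    h->z = z;
  }
  return h;
}

void cartesian_unit_move(cartesian_handle *h) {
  h->x += 1.0;
  h->y += 1.0;
  h->z += 1.0;
}

double cartesian_get_x(const cartesian_handle *h) { return h->x; }

void cartesian_destroy(cartesian_handle *h) { free(h); }
```

Because the caller never touches the fields directly, the two sides no longer need to agree on the memory layout, only on the handful of accessor functions.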
The (ab)use of type-bound procedures in the context of these binding generation methodologies is to follow at some point, since none of them make any explicit mention of the same. Given that type-bound procedures were only introduced with Fortran 2003 (Reid 2003), it is not surprising that they have been overlooked; however, their usage is foundational for long-lasting, sustainable Fortran code.
References
Gray, M. G., R. M. Roberts, and T. M. Evans. 1999. “Shadow-Object Interface Between Fortran 95 and C++.” Computing in Science & Engineering 1 (2): 63–70. https://doi.org/10.1109/5992.753048.
Kermode, James R. 2020. “F90wrap: An Automated Tool for Constructing Deep Python Interfaces to Modern Fortran Codes.” Journal of Physics: Condensed Matter 32 (30): 305901. https://doi.org/10.1088/1361-648X/ab82d2.
Peterson, Pearu. 2009. “F2PY: A Tool for Connecting Fortran and Python Programs.” International Journal of Computational Science and Engineering 4 (4): 296. https://doi.org/10.1504/IJCSE.2009.029165.
Pletzer, Alexander, Douglas McCune, Stefan Muszala, Srinath Vadlamani, and Scott Kruger. 2008. “Exposing Fortran Derived Types to C and Other Languages.” Computing in Science & Engineering 10 (4): 86–92. https://doi.org/10.1109/MCSE.2008.94.
Reid, John. 2003. “The New Features of Fortran 2003,” 38. https://wg5-fortran.org/N1551-N1600/N1579.pdf.
[1] As in all things, info nm provides the details on interpreting the remaining symbols.