Cobol in 2025

Tags: misc

Kobol, ancestral home of Twelve Colonies, is a planet in the Milky Way galaxy, located in sector 728. While lush and life supporting, the planet has been abandoned by all sentient life for at least 2000 years, ever since the exodus of Thirteen Tribes. The surface is dotted by ruins of civilized life, now all but reclaimed by nature.

Oh wait, that’s the wrong cobol.

COBOL

“The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offense.” - Edsger W. Dijkstra

COBOL (“common business-oriented language”) is a compiled English-like computer programming language designed for business use. It is an imperative, procedural, and, since 2002, object-oriented language.

I’ve been intrigued by COBOL for a long time. It’s such a long-lived language, nearly as old as FORTRAN, but without the loving fanbase.

Why does COBOL persist despite getting no love? It is used today in banking systems all over. Hidden, in the background, making sure I can buy my luxury overpriced coffee beans (organic, single-origin, family run, co-fermented, unwashed, light roast, magical beans).

Where does it come from? (Other than the past.) COBOL derives from other English-adjacent, wordy, business-type languages popular when we were just getting started. (“We” - I mean those in the field of computer science.) Dr. Hopper’s FLOW-MATIC, amongst others, was the inspiration for the easily readable (but incredibly verbose) language.

As COBOL is a business language, the business machine company, IBM, was involved. IBM released IBM COBOL for the IBM 1400 series (from 1959) and continues to support it today. The latest IBM COBOL compiler is from 2023.

COBOL programs were punched onto cards, after first being carefully written out on templates like the one below.

IBM COBOL Programmer's card.

The punchcards are a wonder these days. Check out the fantastic examples in Tristan Davey’s archive.

With modern compilers and fancy technology we can do away with those pesky line numbers. In fact, my editor adds them automatically! How handy! The upshot? Leading blank spaces!

 IDENTIFICATION DIVISION.
 PROGRAM-ID. "Hello-World".
 PROCEDURE DIVISION.
     DISPLAY 'Hello, World!'
     STOP RUN.

Hello World in COBOL

With this snippet we see the reason why no-one likes COBOL, which I alluded to earlier: holy verbosity, Batman! We can’t quite match Python’s print('Hello, World!'), and we’re not even close to C’s relatively verbose global introduction. Our planetary greeting in COBOL is an incredible merging of verbose business speak and precise computer-ese. And this is after I’ve chopped out the usual boilerplate like the section name, the empty ENVIRONMENT DIVISION. etc.

Above we have a simple COBOL program. It prints “Hello, World!” to the screen. Not very business-like, but it’s a start. What else can we do in COBOL? This is what I want to know.

There is a GNU COBOL compiler (GnuCOBOL) which transpiles via C and gcc (or msvc on Windows). It can generate shared objects, which will become important later.

Slight tangent: COBOL datatypes

I didn’t think this would be so interesting, but COBOL has a fascinating way of handling variable declarations.

* usual boilerplate
 IDENTIFICATION DIVISION.
 PROGRAM-ID. "Hello-World".
* ooh this is different
 DATA DIVISION.
 WORKING-STORAGE SECTION.
 01 WS-X PIC 999 VALUE 1.
* right... what did that mean?
 PROCEDURE DIVISION.
     COMPUTE WS-X = WS-X + 1
     DISPLAY WS-X
     STOP RUN.

1 + 1 = 002

Above we can see a listing of a program with a single variable, WS-X, which has the value of 1, and the type 999. This means it’s a number with 3 digits. Yes, digits!
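To make the "three digits" idea concrete, here is a minimal Python sketch of how such a field behaves. `Pic999` is a hypothetical helper invented for illustration, not part of any COBOL runtime. It assumes that a value too large for the field loses its high-order digits, which is my understanding of what happens without an ON SIZE ERROR clause.

```python
class Pic999:
    """Toy model of an unsigned COBOL `PIC 999` field (hypothetical helper)."""

    DIGITS = 3

    def __init__(self, value=0):
        self.store(value)

    def store(self, value):
        # Assumption: digits that do not fit are dropped from the left.
        self.value = abs(int(value)) % 10 ** self.DIGITS

    def display(self):
        # DISPLAY shows every digit position, including leading zeros.
        return str(self.value).zfill(self.DIGITS)


ws_x = Pic999(1)
ws_x.store(ws_x.value + 1)   # COMPUTE WS-X = WS-X + 1
print(ws_x.display())        # 002
```

The type is really a picture of the value's layout: three digit positions, always shown in full.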

Let’s look at how GnuCOBOL interprets the COBOL code. It converts it to C first and we can interrogate this directly with cobc -C test.cobol. The resulting code is a little verbose (it is autogenerated after all) but it is surprisingly well commented! We can skip the setup to get to the part of the code which reflects the COBOL above:

    ...
  /* PROCEDURE DIVISION */

  /* Line: 6         : Entry  Hello-World : test.cobol */
  l_2:;

  /* Line: 7         : COMPUTE            : test.cobol */
  cob_decimal_set_field (d0, &f_8);
  cob_decimal_add (d0, dc_1);
  cob_decimal_get_field (d0, &f_8, 0);

  /* Line: 8         : DISPLAY            : test.cobol */
  cob_display (0, 1, 1, &f_8);

  /* Line: 9         : STOP RUN           : test.cobol */
  cob_stop_run (b_2);
    ...

GnuCOBOL-generated C code for the previous COBOL listing.

In the generated code we can see the literal lines of the original COBOL (I expect this is very handy for debugging). There are three functions most of interest at the moment: cob_decimal_set_field, cob_decimal_add, and cob_decimal_get_field.

f_8 is our WS-X variable in some form. First, it looks like a decimal d0 is being set to the value of f_8… what’s a decimal in this context? Let’s look up libcob.h:

typedef struct {
	mpz_t		value;			/* GMP value definition */
	int		scale;			/* Decimal scale */
} cob_decimal;

src Note: not exactly the same as GnuCOBOL which split off from opensourcecobol.

mpz_t? This turns out to be from the GNU Multiple Precision library and represents a multi-precision integer (Z or $\mathbb{Z}$, like the label for the set of integers). Okay so that first function is like a conversion then, it takes our variable and smushes it into a decimal from a cob_field - which has the form:

typedef struct {
	size_t			size;		/* Field size */
	unsigned char		*data;		/* Pointer to field data */
	const cob_field_attr	*attr;		/* Pointer to attribute */
} cob_field;

src Note: not exactly the same as GnuCOBOL which split off from opensourcecobol.

Now we’re getting somewhere! There’s a size and a pointer, and an attr struct, which has many fields including a field type! So the number is stored as raw bytes in an array, and is handled according to a type. At first this seemed insane, but it makes perfect sense. It’s a good way to handle how the types differ between COBOL and the native machine.
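Here is a rough Python sketch of that round trip, mirroring the three generated calls. The `Field` class and both helper functions are hypothetical stand-ins, not the libcob API. It also assumes that a `PIC 999` field holds one ASCII digit character per byte, and that overflowing digits are dropped from the left.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Rough stand-in for cob_field: raw bytes plus a type tag."""
    data: bytearray          # the digits as stored, e.g. b"001" (assumed layout)
    kind: str = "display"    # hypothetical stand-in for the attr struct

    @property
    def size(self):
        return len(self.data)


def decimal_set_field(field):
    # Decode the stored digit characters into a Python int.
    # Python ints are arbitrary precision, much like mpz_t.
    return int(field.data.decode("ascii"))


def decimal_get_field(value, field):
    # Encode back into the field's fixed width, zero padded.
    # Assumption: overflowing digits are dropped from the left.
    digits = str(value).zfill(field.size)[-field.size:]
    field.data[:] = digits.encode("ascii")


ws_x = Field(bytearray(b"001"))
d0 = decimal_set_field(ws_x)   # cob_decimal_set_field (d0, &f_8)
d0 = d0 + 1                    # cob_decimal_add (d0, dc_1)
decimal_get_field(d0, ws_x)    # cob_decimal_get_field (d0, &f_8, 0)
print(ws_x.data)               # bytearray(b'002')
```

The arithmetic happens on a big integer, and the field only matters when decoding and encoding at either end.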

FizzBuzz

We all know fizzbuzz - count up from one but say fizz on numbers divisible by 3 and buzz on numbers divisible by 5. Say fizzbuzz on numbers divisible by both. Solving this is trivial, but we will demonstrate being able to use a couple different parts of the language: branches, conditions, and variables.

* usual boilerplate
 IDENTIFICATION DIVISION.
 PROGRAM-ID. fizz-buzz.
 ENVIRONMENT DIVISION.
 DATA DIVISION.
 WORKING-STORAGE SECTION.
* Three variables this time:
* one for the counter and
* two for modulo results.
 01 WS-FB PIC 999 VALUE 1.
 01 WS-FC PIC 999 VALUE 1.
 01 WS-BC PIC 999 VALUE 1.
 PROCEDURE DIVISION.
 MAIN-PARA.
     PERFORM 21 TIMES
*      should it be Fizz?
       COMPUTE WS-FC = FUNCTION MOD(WS-FB, 3)
*      should it be Buzz?
       COMPUTE WS-BC = FUNCTION MOD(WS-FB, 5)
       IF WS-FC EQUAL 0
         IF WS-BC EQUAL 0
*          Both!
           DISPLAY "FIZZBUZZ"
         ELSE
*          Divisible only by 3
           DISPLAY "FIZZ"
         END-IF
       ELSE
         IF WS-BC EQUAL 0
*          Divisible only by 5
           DISPLAY "BUZZ"
         ELSE
*          Divisible by neither 3 nor 5
           DISPLAY WS-FB
         END-IF
       END-IF
       COMPUTE WS-FB = WS-FB + 1
     END-PERFORM.
     STOP RUN.

FizzBuzz
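For comparison, here is a sketch of the same nested branching in Python. It keeps the two modulo variables and the if/else shape of the COBOL listing rather than the usual Python shortcuts. The zero padding imitates how a `PIC 999` counter is displayed, and the function name is my own.

```python
def fizzbuzz_line(n):
    fc = n % 3   # plays the role of WS-FC
    bc = n % 5   # plays the role of WS-BC
    if fc == 0:
        if bc == 0:
            return "FIZZBUZZ"
        else:
            return "FIZZ"
    else:
        if bc == 0:
            return "BUZZ"
        else:
            # A PIC 999 counter is displayed with leading zeros.
            return str(n).zfill(3)


# PERFORM 21 TIMES, with the counter starting at 1
lines = [fizzbuzz_line(n) for n in range(1, 22)]
print("\n".join(lines))
```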

Grand, so we can loop and we can check conditions. How about jumping?

Factorial!

 IDENTIFICATION DIVISION.
 PROGRAM-ID. factorial.
 DATA DIVISION.
 WORKING-STORAGE SECTION.
 01 WS-NUM PIC 9(2) VALUE 4.
 01 WS-PROD PIC 9(10) VALUE 1.
 PROCEDURE DIVISION.
* A label to jump to.
 01-MAIN.
     IF WS-NUM EQUAL 1
       DISPLAY WS-PROD
     ELSE
       COMPUTE WS-PROD = WS-PROD * WS-NUM
       COMPUTE WS-NUM = WS-NUM - 1
*      The dreaded GO TO!
       GO TO 01-MAIN
     END-IF.
     STOP RUN.

Factorial using jumps. 4! = 0000000024

This could have been written with another loop, but where’s the fun in that? We can jump between parts of the program by placing labels and GO TOing between them, just like in C.
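One thing to watch with the fixed-width product field: factorials grow quickly, so `PIC 9(10)` runs out of digits fairly soon. A quick Python check, plain arithmetic rather than anything COBOL-specific, finds the largest factorial that fits in ten digits:

```python
import math

FIELD_DIGITS = 10                  # PIC 9(10)
limit = 10 ** FIELD_DIGITS - 1     # largest value ten digits can hold

n = 1
while math.factorial(n + 1) <= limit:
    n += 1

print(n, math.factorial(n))        # 13 6227020800
```

So anything past 13! will not fit in WS-PROD as declared.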

Language features

COBOL is a big language with a deep history. While it lacks a good standard library, it has some features which are nice. The decimal type we met above can be used to restrict computation to a fixed number of decimals - ideal for financial institutions. For the physicist there’s floating point math still.

 IDENTIFICATION DIVISION.
 PROGRAM-ID. datatypes-example.
 DATA DIVISION.
 WORKING-STORAGE SECTION.
* 5 digit number
 01 WS-DEC PIC 9(5).
* 5 digit number then 2 places after the decimal
* e.g. $99,000.00
 01 WS-MONEY PIC 9(5)V99.
* native IEEE float (single)
 01 WS-FLOAT USAGE COMP-1.
* native IEEE float (double)
 01 WS-DOUBLE USAGE COMP-2.
...

Some COBOL data types
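To see why fixed decimals appeal to financial institutions, here is a small Python sketch using the standard `decimal` module as a stand-in for a `PIC 9(5)V99` field. The `to_money` helper is hypothetical. It assumes extra decimal places are truncated rather than rounded, which is what I would expect from COBOL when ROUNDED is not specified.

```python
from decimal import Decimal, ROUND_DOWN

CENTS = Decimal("0.01")   # two places after the V in PIC 9(5)V99


def to_money(value):
    # Assumption: extra decimal places are truncated, not rounded.
    return Decimal(value).quantize(CENTS, rounding=ROUND_DOWN)


# Binary floating point cannot represent 0.1 or 0.2 exactly...
print(0.1 + 0.2 == 0.3)                                          # False
# ...while fixed decimal arithmetic gives the answer an accountant expects.
print(to_money("0.10") + to_money("0.20") == to_money("0.30"))   # True
print(to_money("19.999"))                                        # 19.99
```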

There’s also the ability to read and write files and you can store simple structs or row information as RECORDs. Oh, and there are binary data types too. You can execute a number of procedures in a row with PERFORM A THRU D. It has first class support for writing reports (you can really see the advantage for business). It has exception handling (sadly not error-as-value). You can (or could until 2002) get it to generate self-modifying code with the ALTER keyword. I’m just reading the Wikipedia page at this point, but you get the idea.

It’s definitely a useful language, even today. I’ve written a couple wee programs in COBOL now and I struggle to remember all the boilerplate. But then, it’s the same with any new language: there’s an adjustment period. My first experience with Rust was hell fighting with the borrow checker and now I barely think about it. I’m not sure if the positive things I feel about COBOL are more than amusement at the novelty, so we’ll have to see if that wears off in time.

COBOL and Python

One last thing before I close this post: can we call COBOL from Python?

It seems COBOL is a useful language, however verbose and archaic. If I want to do something useful I could use pure COBOL. However, if I want to do something useful while also being productive… I would want to use a less obtuse language to handle the dull operations.

As a primarily Python person, it would be great if I could call functions in a library written in COBOL from Python. Luckily, GnuCOBOL uses C as an intermediate (and C is not a programming language, it’s the universal protocol) so we can interface with the generated .so files using the ctypes or cffi Python libraries.

However, we do need to write some intermediary code - COBOL relies upon a runtime, which needs to be initialised. We could do this from the Python end, but for relative simplicity, let’s handle this with a C intermediary.

COBOL factorial library

 IDENTIFICATION DIVISION.
 PROGRAM-ID. FACTORIAL.
 DATA DIVISION.
* the data is not just for working!
* we are linking it between programs
 LINKAGE SECTION.
 01 LS-NUM PIC 9(2) USAGE COMP-5.
 01 LS-PROD PIC 9(10) USAGE COMP-5.
* need to tell COBOL what the relevant variables are
 PROCEDURE DIVISION USING LS-NUM, LS-PROD.
* same method as before
 01-MAIN.
     IF LS-NUM EQUAL 1
       GO TO 02-END
     ELSE
       COMPUTE LS-PROD = LS-PROD * LS-NUM
       COMPUTE LS-NUM = LS-NUM - 1
       GO TO 01-MAIN
     END-IF.
 02-END.
* exit program is like return
     EXIT PROGRAM.

factorial_function.cobol: Factorial calculation as a callable function/sub-program.

Intermediary C which initialises the runtime (and handles passing variables nicely):

#include <stddef.h>
#include <stdio.h>
#include <libcob.h>

extern int FACTORIAL(int *, long *);

long factorial(int input) {
  static int initialised = 0;
  if (!initialised) {
    /* start the COBOL runtime once, on the first call only */
    cob_init(0, NULL);
    initialised = 1;
  }
  long output = 1;
  FACTORIAL(&input, &output);
  return output;
}

glue.c

Compile these two together into a shared library with:

cobc factorial_function.cobol
gcc -fPIC -o glue.o -c glue.c
gcc -o libfact.so \
  -shared \
  glue.o \
  factorial_function.so \
  $(cob-config --libs) \
  -fPIC

Now for the Python part. We need to load our shared object, but, as it links to libcob, we also need to load libcob in Python first:

import ctypes

# load the libcob library
ctypes.CDLL('libcob.so', mode=ctypes.RTLD_GLOBAL)

# load cobol library
libfact = ctypes.CDLL('./libfact.so')

# set up function signature
libfact.factorial.argtypes = [ctypes.c_int]
libfact.factorial.restype = ctypes.c_long

# run cobol!
inp = 4
out = libfact.factorial(inp)
print(f'{inp}! = {out}')

run_cobol.py

Running the above, all going well, gives:

$ python run_cobol.py
4! = 24

Summary

Code is on GitHub.

COBOL is a fascinating language, albeit incredibly verbose. There’s something alluring about such an obtuse, specific language, and I hope that’s not just novelty that will wear off in a few days.

There are a few language features that seem useful (nothing like a modern language) so I could see myself making something useful with COBOL in future.

COBOL can be called from Python 🎉

Further reading: