Why Everyone Should Learn Functional Programming Today



In the world of programming languages, trends come and go. One trend that deserves consideration is the interest in functional programming that began earlier this decade. Functional programming is a style that emphasizes immutable data, functional primitives, and avoidance of state.

I know what you’re thinking. You wrote some Lisp in college and dreaded it. Or, you had to manage some awful Scala code at your last job and you’d rather not deal with that again.

I know, I know. But hear me out.

Functional programming is more than a trend. Understanding its concepts and appeal goes a long way toward understanding the problems facing software engineers in 2019 and on into the next decade.

In fact, it helps to understand the current state of the world, as data mining and Machine Learning algorithms become an issue of public concern.

Even if you don’t work in a functional language, the solutions offered by the functional way of thinking can help you solve difficult problems and understand the world of computing.

Imperative Style

Most programming languages in wide use today are Von Neumann languages. These are languages that mirror a Von Neumann computer architecture, in which memory, storage, control flow instructions, and I/O are parts of the language. A programmer creates a variable in memory, sets its value, deletes it, and controls what the next command will be.

Everyone who has written a program is familiar with these concepts. Indeed, all the most popular languages in use are Von Neumann family languages: Java, C++, Python, Ruby, Go.

Enter Functional Style

In August 1978, computer scientist John Backus published an article in the Communications of the ACM. Backus accused conventional Von Neumann style languages of being “fat and flabby.” He bemoaned the complexity of new generations of languages that required enormous manuals to understand. Each successive generation added more features that looked like enhancements but considerably degraded the language by adding complexity.

Furthermore, programs written in these languages couldn’t be composed into new programs because their components weren’t created in generic forms.

A sorry state of affairs, indeed.

Backus asked why we can’t create programs that are structured more like mathematical formulas. In such a language, data could be manipulated as in algebra. He proposed that this “functional style of programming” would be more correct, simpler, and more composable. Backus also stressed the importance of “clarity and conceptual usefulness” of programs.

It has been four decades since this paper was written, but I think we can all relate to this!

Languages like Java, Python, and JavaScript add new features intended to clarify syntax, but the overall trend of these languages is toward increasing complexity. Object-Oriented Programming (OOP) at least gives us modularity, but inheritance hierarchies lead to well-known design problems.

Models of Computing

The blame for all this complexity, according to Backus, goes back to the Von Neumann computer architecture itself. It served us well in the 1940s and ’50s, but by 1978, it had begun to show its age. He defines several conceptual models to demonstrate the limitations of Von Neumann’s ubiquitous model.

Turing machines and automata

These are conceptual models of computers used by computer scientists. They meet all the requirements for computing, but they’re too unwieldy for human beings tasked with designing software programs.

The dreaded Von Neumann model

Backus calls the Von Neumann model, exemplified by most of the conventional languages we use today, “complex, bulky, not useful.”

Backus concedes that Von Neumann languages can be “moderately clear,” but he calls out their lack of conceptual usefulness.

Indeed, how many of us have stared cross-eyed at a 1,000-line block of Python or Java, trying to suss out what all these loops and conditional statements are trying to do? And with multiple contributors, it can be a nightmare to understand highly procedural code.

Backus also notes that the Von Neumann model is designed for a single-CPU machine. Instructions are executed one at a time.

The functional model

Here, Backus identifies the lambda calculus, the Lisp language, and his own concept of “functional style programming” as the third category.

Programs written in this model have no state. Instead of setting variables directly, we bind values to symbols. Instead of looping, we transform collections. The result is programs that are concise and clear, as well as conceptually useful.
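To make the contrast concrete, here is a minimal sketch in Python. It computes the same result twice: once in the imperative style, where a variable is overwritten inside a loop, and once by transforming a collection. The function names and the sample data are my own, chosen only for illustration.

```python
# Imperative (Von Neumann) style: a variable is created, then overwritten
# step by step while a loop controls which instruction runs next.
def total_of_even_squares_imperative(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total = total + n * n
    return total


# Functional style: each name is bound once and the collection is
# transformed (filter, then map, then sum) instead of looped over.
def total_of_even_squares_functional(numbers):
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return sum(squares)


data = (1, 2, 3, 4, 5, 6)  # a tuple, so the input itself cannot be mutated
print(total_of_even_squares_imperative(data))  # 56
print(total_of_even_squares_functional(data))  # 56
```

Both versions are fine for six numbers. The difference is that the second one never says in what order the elements must be visited, which matters later when the work is spread across many machines.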

Put another way, the functional style makes a program’s intent obvious.

Indeed, a program written in a functional-style language is often quite short, and its concise definition makes it easier to understand than its non-functional equivalent.
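As a rough illustration of how short such a program can be, here is a Python sketch loosely modeled on the inner-product definition in Backus’s paper, which builds the whole program by composing three smaller functions. The helper names (compose, transpose, apply_to_all, insert) are my own approximations of his combining forms, not a faithful port of his notation.

```python
from functools import reduce
from operator import add, mul


def compose(*functions):
    """Right-to-left composition: compose(f, g, h)(x) == f(g(h(x)))."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(functions), x)


def transpose(pair_of_vectors):
    # ((1, 2, 3), (6, 5, 4)) -> ((1, 6), (2, 5), (3, 4))
    return tuple(zip(*pair_of_vectors))


def apply_to_all(f):
    # Lift a function on one element to a function on a whole sequence.
    return lambda items: tuple(f(item) for item in items)


def insert(op):
    # Fold a binary operator across a non-empty sequence.
    return lambda items: reduce(op, items)


def multiply_pair(pair):
    return mul(*pair)


# The whole "program" is one line: pair up, multiply each pair, add the results.
inner_product = compose(insert(add), apply_to_all(multiply_pair), transpose)

print(inner_product(((1, 2, 3), (6, 5, 4))))  # 28
```

There are no loop counters and no intermediate variables to track. Each piece is generic and can be reused in other compositions.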

Why Should I Care?

OK, so maybe we could make better programs if we all dropped Python and Java and started writing Haskell. Uh-huh. OK. Sure.

But who’s going to do that? How are we going to train developers fresh out of college in languages they don’t know? More importantly, why would we? Certainly, there has been a lot of quality software written in existing languages, and as C++ creator Bjarne Stroustrup once said:

“There are only two kinds of languages: the ones people complain about and the ones nobody uses.”

The reason we should care about all this beyond an academic exercise is that the present movement toward “Big Data”-driven products has led to problems in computing that the functional model is uniquely good at solving.


As Backus noted in 1978, the Von Neumann model is really oriented around simple computers that execute one instruction at a time. The flow of a Von Neumann style program puts the control of every instruction into the hands of the programmer.

Unfortunately, it didn’t take long before our computers became more complex. We now have computers with many CPUs, executing many instructions at the same time. Popular languages like Python and Java weren’t built from the ground up to take advantage of this. These languages have bolted on threading APIs to allow programmers to take advantage of multiple processors. Others rely on process forking, essentially pushing the problem down to the operating system.

Multi-threaded programs are hard to write correctly, and even very experienced programmers can make serious errors. Writing multi-threaded programs is so complex that there are entire books dedicated to doing it correctly.
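Here is a small Python sketch of where the difficulty comes from. In the first version, every thread updates the same object, so each update has to be guarded by a lock. In the second, each task just returns a value and there is nothing to guard. The Counter class and both function names are hypothetical, written only for this comparison.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


# Shared mutable state: every thread writes to the same counter, so each
# update must hold a lock or increments can be lost.
class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        with self._lock:  # leave this out and the total is no longer reliable
            self.value += amount


def count_with_shared_state(chunks):
    counter = Counter()
    threads = [
        threading.Thread(target=counter.add, args=(len(chunk),))
        for chunk in chunks
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


# No shared state: each task returns its own result and the results are
# combined afterwards, so there is nothing to lock.
def count_without_shared_state(chunks):
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(len, chunks))


chunks = [[1, 2, 3], [4, 5], [6]]
print(count_with_shared_state(chunks))     # 6
print(count_without_shared_state(chunks))  # 6
```

The first version is correct only as long as nobody forgets the lock. The second version has no such failure mode to begin with.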

What would our programming languages look like if computers with many CPUs had been commonplace in the 1940s? Would we choose to program each thread individually, or would we come up with different concepts for achieving the same goal?

Distributed Systems

Ten years ago, most software was written to run on an operating system on a customer’s PC. Any operations that the software needed to do were processed using the customer’s CPU, memory, and local disk.

But the early success of Gmail and other web-based tools proved that a sophisticated software system could be run over the internet.

Today’s commercial software doesn’t just run on a customer’s PC. It runs in the cloud, across perhaps hundreds of machines. Software-as-a-Service (SaaS) products are now commonplace, used by individuals and enterprises.

With the data taken off the customer’s PC and sent over the wire to our data center in the cloud, we can now look at the data for all customers in aggregate. That aggregate view can reveal trends, for example by detecting fraud in bank transactions.

But these systems are hard to write. Instead of running on a single-threaded computer with local memory and disk access, as the Von Neumann model presupposes, our programs now have to run across potentially hundreds of machines with many CPUs. Even worse, we’re now processing way, way more data than we could ever hope to store on a single machine. And we need to be working on this data. It can’t just be shoveled into a data warehouse and queried later.

A Naïve Solution

One approach is to keep using the threading or process-forking models we have been given to write our code, and then build a fleet of machines to scale it. Those machines will then process data and push that data somewhere (a database?) to keep it from filling up the local disk.

As you might guess, this solution is very operationally complex. We have to manually shard the data somehow — i.e., split our data set evenly across our n processing machines — and write the glue code for all these machines to talk to one another and perform some sort of leader election to figure out whose job it is to coordinate all of this.
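Even the sharding step alone is code we would have to own. A minimal Python sketch of one common approach, hashing a record key to pick a machine, might look like the following. The function names and the record layout are assumptions of mine, and hashing only gives a roughly even split, not an exact one.

```python
import zlib


def shard_for(key, n_machines):
    """Pick a machine index for a record key using a stable hash."""
    # zlib.crc32 is used instead of the built-in hash(), which is salted
    # per process for strings and so would differ between machines.
    return zlib.crc32(key.encode("utf-8")) % n_machines


def shard(records, n_machines, key_of):
    """Group records into n_machines lists, one list per machine."""
    shards = [[] for _ in range(n_machines)]
    for record in records:
        shards[shard_for(key_of(record), n_machines)].append(record)
    return shards


# Hypothetical records: (account id, amount).
transactions = [("acct-1", 20), ("acct-2", 35), ("acct-1", 5), ("acct-3", 90)]
for index, bucket in enumerate(shard(transactions, 3, key_of=lambda t: t[0])):
    print(index, bucket)
```

This says nothing yet about what happens when a machine is added or removed, or when one key is far busier than the others. Those are further pieces of glue code we would be signing up to write.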

In practical programmer terms, it also means we’re going to have to write, maintain, and version the following in code:

  • Complex multi-threaded code written in Java, for example.
  • A bunch of bash scripts to deploy and update this code on our n machines in the cloud.
  • Code to scale up and down our solution to more machines as our data volume grows or shrinks.
  • Some kind of scheduler and coordination system to get all these operations to work together and join their result somewhere.

Now imagine debugging and maintaining this system. Doesn’t sound fun, does it? Certainly, the resulting solution in code will not be obvious.

An Elegant Solution

In 2013, Berkeley’s AMPLab donated the Spark project to the Apache Software Foundation. Over the years, Spark has become one of the favored ‘big data’ cluster programming platforms, supplanting a variety of systems built by engineers at Twitter and Google.

Spark is written in Scala, a language with strong support for functional programming that runs on the Java Virtual Machine (JVM). I won’t get into the gory details of how Spark works or walk through Spark code here. You can find plenty of examples online for that.

Instead, I’ll present the Spark conceptual framework and show how the functional model of computing is crucial to its elegant solution.

What is “the program?”

Ask yourself this question. In our hypothetical distributed system described above, what is “the program?”

Is it the concurrent Java code that we wrote? Or is it the bash scripts that deploy that code out to the various machines in our “cluster?” Is it the scheduling algorithm?

I’d argue that all of these components put together contain pieces of “the program.” The program is the instructions for transforming the data. Details like thread management and managing resources are incidental to this goal.

Think of it this way. Say we have a dozen machines in the cloud, each with 4 CPUs and 16 GB of memory. Throw all those machines together into a big “cluster” of computing resources. Now we have one big “computer” with 4 * 12 = 48 CPUs, 16 * 12 = 192 GB of memory, and a certain amount of disk storage.

Now, imagine we write our data transformations in the functional style described by Backus. Each transformation is written like a mathematical function. There’s an input and an output. No state. All data is immutable, stored in stages on disk on each machine, and deleted when it’s no longer needed.

We could now have a scheduler that knows about the structure of our cluster. In other words, it knows it has 12 machines, each with 4 CPUs and 16 GB of memory. The scheduler dispatches a portion of the data to a machine, along with the data transformation function we’ve defined.

In fact, if we write our data transformation “program” in a purely functional style, the scheduler can dispatch many of these transformations at the same time, as many as can be fit in the cluster with its limited resources. That allows us to process our data in an efficient manner.
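A toy version of this idea fits in a few lines of Python. In the sketch below, a thread pool stands in for the 48 CPUs of the cluster, and a hypothetical run_on_cluster function plays the scheduler: it splits the data, dispatches a pure transformation to every slot, and combines the partial results. This illustrates the concept under those assumptions. It is not how Spark is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

MACHINES = 12
CPUS_PER_MACHINE = 4
TOTAL_SLOTS = MACHINES * CPUS_PER_MACHINE  # 48 tasks can run at once


def partition(data, parts):
    """Split a sequence into `parts` contiguous slices of near-equal size."""
    size, extra = divmod(len(data), parts)
    slices, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def sum_of_squares(chunk):
    # Pure: the result depends only on the chunk passed in.
    return sum(x * x for x in chunk)


def run_on_cluster(transform, combine, data, slots=TOTAL_SLOTS):
    """Stand-in scheduler: dispatch `transform` to every slot, then combine."""
    chunks = partition(data, slots)
    with ThreadPoolExecutor(max_workers=slots) as pool:
        partials = list(pool.map(transform, chunks))
    return combine(partials)


data = tuple(range(1, 101))
print(run_on_cluster(sum_of_squares, sum, data))  # 338350
```

Because sum_of_squares touches no shared state, the scheduler is free to run the chunks in any order, on any slot, or to rerun one that failed. The transformation code never mentions threads at all.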

Programming the Cluster in Functional Style

I’m not going to promote Spark as the end-all, be-all of cluster computing. Perhaps we’ll come up with something better in the future, and Spark isn’t good for every distributed system. It’s optimized for data processing and streaming, and not serving up live requests, for example.

But I want to emphasize the shift in perspective that allows this type of system to be built, namely functional programming style. And indeed, when we enter the realm of ‘big data,’ we tend to find that most solutions rely on the functional model of computing.

Spark offers Scala, Java, and Python APIs. Whatever language you choose, you’re going to be writing your Spark program in a functional style.
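To give a feel for the shape of such a program without using Spark itself, here is a word count written in plain Python as a chain of transformations: flatten lines into words, map each word to a count of one, and reduce the counts by key. The function names are my own, and none of this is the Spark API.

```python
from collections import Counter
from functools import reduce


def split_words(line):
    return line.lower().split()


def count_words(lines):
    # Flatten every line into its words.
    words = (word for line in lines for word in split_words(line))
    # Map each word to a one-entry count.
    singles = map(lambda word: Counter({word: 1}), words)
    # Reduce by key: adding two Counters merges their per-word totals.
    return reduce(lambda left, right: left + right, singles, Counter())


lines = ("a b a", "B c")
print(count_words(lines))
```

Every step takes a collection in and hands a new collection out, with no loop counters and no shared variables. That is the property a cluster scheduler can exploit.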

We also tend to find that the separation of transformation code from resource management is a theme. Apache Spark’s solution separates out the resource management aspects of our distributed system, leaving us to work with the data. Data transformation rules are clear and require no complex multithreaded code.

It seems that distributed systems are finally freeing us from the limitations of the Von Neumann model.


Functional programming languages may be falling out of favor as a popular replacement for languages like Java or Python. As a drop-in replacement for simple use cases, like a small web application, Scala or Haskell may be overkill.

But the functional model of computing has not gone away by a long shot. If anything, it’s more ascendant than ever. It’s hiding behind the scenes, powering the Machine Learning algorithms, business intelligence, and analytics engines that provide insights to modern organizations.

Software engineers and managers would do well to learn these concepts and understand why so many projects that run at the heart of the biggest tech companies rely on functional style projects like Apache Spark.

Functional style allows us to separate the “how” of computing resource management from the “what” of a program. It frees us from burdensome and complex multithreading APIs bolted on to languages that are based on a model of a simple computer conceived of in the 1940s.

The functional model is uniquely well-adapted to the data-rich world that we’re entering. It’s an indispensable tool for any software engineer working today.
