Introduction
Python is one of the most popular programming languages in the world. It is widely used for various applications: end-user applications, robotics, scripting, web servers, and A.I., just to name a few. It is easier to tell where Python is not used, and that would be more exotic niches like kernel drivers. Of course, it is also popular in quantitative finance. In fact, by this point it has pushed out the competition almost entirely and is used for quant research, trading algorithms, various tools, and so on.
But is it a good tool for the jobs one has to deal with in quantitative finance? I would say “no”, and there is no reason to expect the situation to improve. The problems are inherent to the language itself, and its popular infrastructure is unlikely to be replaced by anything else.
Despite this, Python reigns supreme and there is no viable competition available.
Requirements of quantitative finance
There are several kinds of work in quantitative finance, with some diversity depending on the kind of trading involved and the specific aspect of the work.
In general, one needs to deal with math-heavy matters during quantitative research, modelling, and backtesting.
Then there is algorithmic trading, which at a minimum would include execution, but would often (and I would say “must”) also include decision-making and basic risk controls.
Finally, trading would involve a suite of various tools which support operations, including the reconciliation of records and comprehensive risk controls.
Let us consider all three aspects in more detail.
Quantitative research, modelling, and backtesting
Research requires an ability to work with various data, mostly of time-series sort. The volume of data can be large.
When working with time-series data, it is important to be able to match two timestamped datasets so that, for each item in one of them, the most recent preceding entry from the other dataset is selected (an as-of join).
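In Python, this kind of as-of join is provided by pandas’ merge_asof; a minimal sketch with made-up trade and quote tables (both must be sorted by the key column):

```python
import pandas as pd

trades = pd.DataFrame({
    "time": pd.to_datetime(["2023-01-02 09:30:01", "2023-01-02 09:30:05"]),
    "qty": [100, 50],
})
quotes = pd.DataFrame({
    "time": pd.to_datetime(["2023-01-02 09:30:00", "2023-01-02 09:30:03"]),
    "bid": [99.5, 99.6],
})

# For each trade, pick the most recent quote at or before the trade's time.
merged = pd.merge_asof(trades, quotes, on="time")
print(merged["bid"].tolist())  # [99.5, 99.6]
```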
The activity involves statistical analysis, so access to quality statistical libraries is important.
The ability to represent linear algebra operations and work with multidimensional data in a natural way is valuable.
The results occasionally need to be visualised, and it would help if that visualisation is intuitive and effortless.
The documentation often needs to include complex formulae, so support for standard notations like LaTeX is useful.
Decent support for time, date, timestamps, time deltas, and missing values is often important.
And finally, and this is important, there is a strong demand for flexibility and agility in trying new things out.
Support tools
The tools need to do various jobs, but mostly they create reports and move or transform data.
The code is usually not complex but often needs to deal with many external systems and work with diverse data.
The data is often in a tabular or time-series form. It must be possible to represent operations on it with ease.
Reliability is more important than in research as many tools are meant to run automatically.
Algorithmic trading
Here it is assumed that algorithmic trading is running automatically and mostly autonomously. It is based on the results of quant research, modelling, and backtesting. Unlike the research environment, everything happens in real-time and while there is human oversight, people do not make decisions in every single case: the algorithm makes decisions independently, based on its model.
Reliability and precision are very important here, as human oversight happens only post-factum and on a fairly high level. Nobody babysits an algorithm.
Raw performance may also be important. This depends on the nature of the trading.
Documentation is important, but less so than in quant research. There is less demand for complex mathematics in comments.
The clarity of the algorithm’s code is very valuable. Quantitative parts of the code are hard to review at the best of times, but there is a higher chance of making a costly mistake if they are written poorly.
Python and its ecosystem
Python is known as a language that is easy to learn, full of libraries for anything you might need, and enjoys a large community behind it.
Let’s consider the strengths and weaknesses of different parts of Python’s ecosystem in the context of quantitative finance.
Pros and cons
… of the language
Pros:
- The basics of the language are easy to grasp for beginners.
- The language is fairly expressive and flexible, while also nudging its users towards structuring the code cleanly.
- There is some support for functional programming but the style is not enforced.
- Dynamic typing is usually helpful while prototyping.
Cons:
- “Easy” is a double-edged sword. Beginners tend to write code which mostly works but is extremely fragile and convoluted. Features like default arguments are often abused, leading to many bugs down the road.
- Making use of several CPU cores is not straightforward. Usually one needs to solve this problem by creating multiple processes and paying the cost of inter-process communications.
- Python is sort of slow. This is mostly invisible in 99% of cases, but sometimes it is visible and matters. This is a consequence of Python’s dynamic nature.
- Dynamic typing means some bugs would be seen only when the code runs. This is OK for research code, but much less so for algorithms which are meant to run autonomously.
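The point about default arguments deserves a quick illustration. A default value in Python is evaluated once, at function definition time, so a mutable default is shared between calls (the function and its arguments below are invented for illustration):

```python
def append_trade(trade, log=[]):  # the SAME list object is reused on every call
    log.append(trade)
    return log

print(append_trade("buy AAPL"))   # ['buy AAPL']
print(append_trade("sell MSFT"))  # ['buy AAPL', 'sell MSFT'] (the first trade is still there!)
```

The idiomatic fix is `log=None` with `if log is None: log = []` inside the function body.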
… of numpy
Numpy is a popular library which provides support for the efficient representation of arrays and operations on them. As of now, it is a de-facto standard. It underpins pandas and many other libraries. The reason why it is fast is that its storage is specialised for the data it works with, and various operations like matrix multiplication are implemented with the help of well-tuned libraries (rather than in Python itself). This makes a dramatic improvement when working with large data.
Pros:
- On large data, it can easily be around 100× faster than equivalent Python-native code.
- There is probably no comparable competitor that could offer various linear algebra operations, slicing of multidimensional data, sampling of random data, basic statistics, and so on.
Cons:
- Weak support for time-related data such as “date” and “time of day”, although there is some support for time deltas and timestamps.
- Weak or non-existent support for missing values.
- Sometimes the API changes in an unexpected way, making old code break or silently do unexpected things.
- Let’s say you have an array of integers. Its type is numpy.array, of course. What is the type of a single element of the array? No, it is not an int. It is a… numpy.int64. Goodbye, performance.
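The last bullet is easy to demonstrate (the exact scalar type depends on the platform; on most 64-bit systems it is numpy.int64):

```python
import numpy as np

a = np.array([1, 2, 3])
x = a[0]
print(type(x))             # a numpy integer scalar, not a plain Python int
print(isinstance(x, int))  # False
```

Any per-element Python loop over such an array pays boxing and unboxing costs on every access, which is one reason element-wise Python loops over numpy arrays are slow.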
… of pandas
Pandas is a library which provides an implementation of tables (called DataFrame) and series (which can be seen as vectors with an associated index).
Pros:
- Much of the data in quantitative finance can be naturally represented as a table. Pandas provides an implementation of this.
- The library is intuitive and fairly capable. For example, it supports operations like merging tables by keys.
Cons:
- Simple things like creating a column which is a sum of two other columns are a bit ugly.
- Most operations are accelerated (pandas uses numpy). However, operations for which there is no numpy-accelerated path are very slow.
- Just like with numpy, the API occasionally changes, breaking existing code.
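To make the first two cons concrete (column names invented): the vectorised form works but reads a little clumsily, and falling off the accelerated path is expensive:

```python
import pandas as pd

df = pd.DataFrame({"bid": [1.0, 2.0], "ask": [1.5, 2.5]})

# Vectorised: dispatched to numpy, fast even on millions of rows.
df["mid"] = (df["bid"] + df["ask"]) / 2

# Row-by-row apply: no numpy fast path, typically orders of magnitude slower.
df["mid_slow"] = df.apply(lambda row: (row["bid"] + row["ask"]) / 2, axis=1)

print(df["mid"].tolist())  # [1.25, 2.25]
```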
… of other libraries
Pros:
- There is a wealth of libraries for Python. SciPy, Matplotlib, TensorFlow, seaborn, and so on provide valuable and fairly easy-to-use functionality.
Cons:
- The API of many libraries changes often, and it does so with little regard for backward compatibility. In the best case, an upgrade can break your code in an obvious way. In the worst case, you can end up with obscure bugs observable only occasionally. This may be OK for research, but it is terrible for algorithmic trading.
- The quality of the libraries is not always stellar. For example, as of this writing, there is no way to draw a vertical or horizontal line in matplotlib, which is Python’s go-to library for making plots. No, vlines and hlines do not draw these lines; they draw segments.
- Python’s libraries are written mainly by software engineers rather than statisticians or specialists in other fields. This becomes obvious when you use something like an algorithm for linear regression. Yes, it solves the problem, and you do have a result. However, the result does not contain much information about the quality/reliability of the regression.
… of Jupyter Notebook/Lab
Jupyter Notebook/Lab is irreplaceable for quant work in Python.
Pros:
- The notebook interface is a must for quantitative research.
- It is possible to mix code with documentation, and the documentation can contain complex formulae (thanks to support for \(\LaTeX\)).
Cons:
- It is possible to use normal Python code from a Jupyter Notebook, but it is not possible to use the code in a Jupyter Notebook from normal Python code. This matters: you cannot do your research in a notebook and then use that research code elsewhere by simply importing it.
… of contributors
Pros:
- There are many people contributing to Python’s ecosystem; their number is probably on the scale of tens of millions.
Cons:
- Everyone can code, but not everyone should.
For example, years ago I spotted a bug in a library dealing with differences in sequences. One of its functions was called find_longest_match, and its documentation said that it finds the longest matching block. However, it did not. In fact, the code was sometimes unable to find a match at all, even though one was clearly there.
When I filed a bug report, I was told that:
- It was not a bug.
- The code is looking for the longest interesting match.
- The bug is not in the code, it is in the documentation.
This was back in 2007. Now it is 2023, the documentation still claims that the code looks for the longest match, and the bug is still there:
```python
import difflib

text1 = "a" + "b" * 210
text2 = "b" * 210 + "c"
m = difflib.SequenceMatcher(None, text1, text2)
(_, _, l) = m.find_longest_match(0, len(text1), 0, len(text2))
print("Text #1: ", text1)
print("Text #2: ", text2)
print("Longest match: ", l)
```
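For what it is worth, the cause is SequenceMatcher’s “autojunk” heuristic: in sequences longer than 200 items, elements occurring in more than 1% of the second sequence are silently ignored. Disabling it makes the function behave as documented:

```python
import difflib

text1 = "a" + "b" * 210
text2 = "b" * 210 + "c"

# autojunk=False disables the popular-element heuristic.
m = difflib.SequenceMatcher(None, text1, text2, autojunk=False)
match = m.find_longest_match(0, len(text1), 0, len(text2))
print(match.size)  # 210: the run of "b"s is found, as expected
```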
Other notable weaknesses
- Firstly, time representation is awkward.
For the idea of a timestamp you can use:
  - A floating point number, which would normally mean “seconds since the UNIX Epoch”.
  - numpy’s datetime64.
  - pandas’ Timestamp.
  - datetime’s datetime.
They are not equivalent, conversions between them are awkward, and important matters like time zones are sometimes handled differently depending on which version of a library you are using.
Speaking of time zones, there are datetime’s tzinfo and timezone, as well as pytz.
There are also datetime’s date and time with a mostly obvious meaning. But let’s not stare into the abyss for now.
The idea of a time difference can be represented in several ways too:
  - A floating point number, which would usually mean “time difference in seconds”.
  - numpy’s timedelta64.
  - pandas’ Timedelta.
  - datetime’s timedelta.
Again, conversions between them are not as obvious as one might want, and there are differences between the representations.
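A small sketch of the conversion zoo for a single point in time (the date is chosen arbitrarily):

```python
import pandas as pd

ts = pd.Timestamp("2023-05-01 12:00", tz="UTC")

as_dt = ts.to_pydatetime()   # datetime.datetime, timezone-aware
as_np = ts.to_datetime64()   # numpy.datetime64; the timezone information is dropped
as_float = ts.timestamp()    # float, seconds since the UNIX Epoch
print(as_float)  # 1682942400.0
```

Each conversion has its own method name, and one of them silently loses the time zone.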
- Secondly, there is no unified representation of missing values. There is None for “objects”, NaN may be used for floating point values (although strictly speaking it is not the same thing), and that is mostly it. There is no way to denote a missing integer value. If you have to deal with sparse data, this can be a big issue.
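A quick demonstration of the integer problem: introducing a single missing value silently changes the type of a whole pandas column:

```python
import pandas as pd

s = pd.Series([1, 2, 3])
print(s.dtype)  # int64

s = pd.Series([1, 2, None])
print(s.dtype)  # float64: the integers were promoted, None became NaN
```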
Alternatives
R
R is a very flexible and expressive language which was designed for statisticians. I have to admit, I like it.
Pros:
- Quant work in R is natural and easy.
As an example, you can run a linear regression in both Python and R, but R will give you much more diagnostics, helping you to understand whether your regression is reasonably trustworthy. Python will not.
- It has some pretty good plotting capabilities. In particular, ggplot2 is a fantastic way of visualising data, where the visualisation is defined declaratively.
- R has native support for formulae. The aforementioned regression can look like this:

  regression_results <- lm(heart_disease ~ biking + smoking, data = medical_data)

  Here, the data is taken from a table medical_data, where heart_disease, biking, and smoking are columns. A linear regression is fitted, where heart_disease is the predicted value while biking and smoking are the feature variables.
- Linear algebra in R looks fairly readable. For example, suppose you have three matrices \(A\), \(B\), and \(C\), and you wish to compute \(A B^{-1} C^T\). The corresponding code in R would be:

  A %*% solve(B) %*% t(C)

  In my humble opinion, this is easier to read than its equivalent in Python:

  np.matmul(np.matmul(A, np.linalg.inv(B)), C.T)

- R can be used for more than statistics. It has a massive collection of libraries called CRAN, covering various domains. It is not hard to spot that while the majority of libraries for Python are related to software engineering as such, most of the libraries for R cover science and engineering. Nevertheless, it is possible to use R for general-purpose programming, and things like databases, GUIs, and networks are accessible.
- Arrays of data (both vectors and multidimensional arrays) are native to R.
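As a side note not made in the comparison above: since Python 3.5 the @ operator (PEP 465) makes the numpy version of that product considerably less noisy, although whether it reaches R’s readability is a matter of taste (the matrices below are invented for illustration):

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 2.0]])
B = np.array([[2.0, 0.0], [0.0, 4.0]])
C = np.array([[1.0, 2.0], [3.0, 4.0]])

# Equivalent of R's  A %*% solve(B) %*% t(C):
result = A @ np.linalg.inv(B) @ C.T
```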
Cons:
- Integers in R are 32-bit, so their absolute value cannot exceed roughly \(2 \times 10^{9}\). As integers are also used for indexing, this causes problems on very large datasets. Various workarounds exist, but they are half-measures.
- Fewer people are familiar with R than with Python.
- R is object-oriented, but it may be confusing to some as there are several kinds of classes: S3, S4, and Reference Classes. That is two kinds too many for most people.
C# / Java / C++
These three languages are quite different, but I am grouping them here regardless. All three have a similar syntax and have inspired each other over decades.
Pros:
- The languages are statically typed, which helps to weed out some bugs.
- All three languages also naturally enforce some rigour in the way the code is written.
- Overall, all three languages can be successfully employed to implement trading algorithms and tools.
Cons:
- Working with time series and tabular data in these languages is hard.
- Some libraries are available, but they are not as good as numpy or pandas.
- Support for missing data is almost non-existent.
- Standardised support for various time-related ideas is either weak or missing.
- There are various libraries for linear algebra, such as BLAS and LAPACK. However, it takes a relatively large amount of effort to write the relevant code: where a single %*% operator would suffice in R, one would need to write multiple lines of code in C / C++. Although, to be fair, direct use of these libraries offers more control over how things are done.
Overall, Java, C++, and C# are quite good for many things, but they would be a terrible choice for research, modelling, and backtesting.
APL derivatives
APL is one of the oldest languages where the central datatype is the multidimensional array. It also makes use of a set of unusual symbols, which may have put off many people from using it.
Multiple implementations and derivatives of APL exist, such as q and J.
Pros:
- APL code is terse, perhaps too terse.
- The language nudges its users to use vectorisation by default. This is good for performance, although not every idea can be written in this form.
Cons:
- The code tends to be of a write-only kind. Some would disagree, of course. Here is a classic example:
life ← {⊃1 ⍵ ∨.∧ 3 4 = +/ +⌿ ¯1 0 1 ∘.⊖ ¯1 0 1 ⌽¨ ⊂⍵} ⍝ Conway's Game of Life
- Support for missing values and time-related data may be very basic (although there are exceptions).
While APL derivatives may look interesting in a theoretical sense, it is hard to see why someone would use them for research, modelling, actual trading, or tools. There is no obvious and strong-enough edge over other languages.
Wolfram Mathematica and MATLAB
Wolfram Mathematica and MathWorks MATLAB are commercial products which can be used for modelling, research, trading, and the creation of tools. They also have a wide range of libraries/toolkits available, covering many problem domains.
Pros:
- Both are robust commercial products.
- Wolfram Mathematica has a notebook interface, which has probably inspired Jupyter Notebook.
- Both Mathematica and MATLAB were designed for engineers and scientists rather than programmers.
Cons:
- Both are commercial products with licences that may be awkward to use in algo trading or tooling. They should be totally fine when used by individuals for research and modelling, however.
- I am not an expert in these products, but a cursory look suggests that support for missing values and time may be relatively weak.
- Mathematica’s notebook interface is lovely and the code can be documented, but just like with a Jupyter Notebook, there appears to be no way to wrap it up as a module and use it from elsewhere.
Overall, it seems that both may be fairly good for research and modelling, being as scientific as R while also being more polished. It is harder to see why someone would want to use them for algo trading or tools.
Conclusion
Python is not a perfect language for quantitative finance. Its lack of decent support for missing values makes it hard to work with partially unavailable data. Messy and poor support for time-related data does not help and sometimes actively causes problems. It is easy to write code quickly, but that code tends to be fragile, which is bad for algo trading. The available libraries were mostly created with programmers rather than scientists/statisticians/engineers in mind, which has an impact on their quality.
Nevertheless, Python is widely used. This is the result of many factors:
- Python and most of its ecosystem are free to use.
- There is a large number of people who know Python.
- There is a hope that models and signals created during research can be moved into trading with no modifications. Sometimes this works, although often it does not.
- Many believe that it is good to have just one language which is used for everything.
The best approach in your case depends on the kind of trading you are doing.
My view is that in most cases a single language won’t be enough to cover everything. Well… Perhaps OCaml could, although it is hard to see how research can be done in it. So I would rather bet on specialisation:
- Pick the best language for your research and modelling, so that discovery of signals and market inefficiencies would be as quick as possible. Python may be an OK choice here, but it is worth considering the alternatives as well.
- Pick the best language for your algo trading, so that you would be competitive in actual trading while also bearing in mind that safety is important.
- Establish a process by which you can implement your trading algorithm and signals, backtest them, and confirm that the results of backtesting match the numbers seen in research. Yes, that would likely mean two independent backtesting systems running off the same data.
- Implement your tools in whatever is easiest.
The classic combination is Python for research and Java / C# / C++ for algo trading. I am sure it can work well when properly implemented, although I am sceptical of C++, especially when it is chosen in order to minimise latencies. But that is a story for another post.
Happy trading!