Among all economic sectors impacted by Covid19, air traffic was hit the hardest. Airlines are suffering a lot, and consequently so do airports. Now that most EU countries are reopening their borders, air traffic is slowly going up again.
Obviously, current traffic is nowhere near pre-covid levels, and it does not make sense for an airport to open all its terminals.
The stakes are high: delaying a terminal reopening by even a few months can represent dozens of millions of euros of savings.
I was recently involved in a project with #1 EU airport to work on this problem:
if you’re an airport, given airline traffic forecasts over the next decade, and financial data on your terminals (fixed/variable costs/revenue), how can you optimize the reopening sequences for all terminals?
First of all, you have to define the objective. In our case, the objective was to maximize profit, and minimize capacity mismatch (between airlines traffic forecasts and open terminals capacity), which is reasonable. Concretely, my plan was to load traffic forecasts and financial data into Pytorch tensors, and then use Autograd to run an optimization with a carefully-built loss function (with a profit term, and a capacity mismatch term).
I had traffic forecasts from each airline over a 90-months period, so my traffic tensor had shape (nb_airlines, nb_months). I had airport terminals financial data separated into 6 metadata: fixed costs/revenue if closed/open, variable cost/revenue per PAX, so my terminal metadata tensor had shape (nb_terminals, 6).
The first problem I encountered was the problem constraints: an airport is not a simple organization, and you can’t just assign any kind of traffic to any terminal. Some traffic is international, other is Schengen (EU borderless area), all terminals do not have the same capacity for one or the other.
Moreover, not all companies can be assigned to any terminal, for operational reasons (a given company may not have available subcontractors in a given area) or historical reasons (companies with strong brands do not want to confuse regular customers).
As in all optimization situations, you have to dumb down the problem, model it, run optimizations, and once you grasp the dynamics, you can start adding constraints to increase realism.
In our case, I decided to ignore at first airlines/terminals assignments, and just focus on traffic allocation to terminals (schengen, and international). Concretely, this means I loaded airlines traffic forecast into Pytorch tensors, and sum them along the airlines dimension, to have aggregated traffic sums for international and Schengen traffic. I had then 2 traffic tensors (International/Schengen) with shape (1, nb_months).
The second problem I encountered was the modelization of the problem. At first, I wanted to have as variables open/close binary flags for each terminal, month by month. I would then compute the effective available capacity, the mismatch with airlines traffic forecasts, and compute fixed+variable costs and revenue for each terminal.
Problem is, Pytorch Autograd needs the loss function to be differentiable, which means it has to be continuous, almost everywhere. This does not work well with binary integer flags for terminals open/close states.
So, I then decided to use continuous variables. Concretely, I used a float32 tensor of size (nb_terminals, nb_months) of values between 0 and 1. I could then, with a few tensor operations, compute in the loss function the profit over time, and traffic capacity mismatch.
The third problem I encountered was about variables bounds enforcement. Obviously, you want terminal open/close rates to be between 0 and 1. If you’re not careful, the optimizer will output negative open rate for terminals that have bad cost structures (they lose money most of the time so the optimizer might assign a negative open rate = negative traffic to make it profitable, which of course does not make sense).
As it turns out, constraining a variable in [0,1] range with Autograd is not that simple, in practice, and in theory as well. In practice, Autograd does not have (as of June 2020) a built-in way to specify variables bounds. This could be overcome with the use of torch.min and torch.max in the loss function (= “soft clamp”, the “soft” part is important because you need to have gradients, don’t use torch.clamp).
But the problem was bigger than that. I first tried to be clever, and include a term in the loss function to ensure open/close rates were as close as possible to 0 or 1. Concretely the term was about minimizing square(square(x) – x). 0 and 1 are the only floats where “square = identity” so I expected the optimizer to try to have mostly 0/1 floats at the end of the process.
I put a strong multiplicator coefficient on this “bounds enforcement” term of the loss function, to make sure the optimizer took it seriously. Problem is, it totally discouraged exploration: the optimizer found it too costly to go too far from initialization point, and this “soft binary constraint” made it impossible for the optimizer to find paths that would change a terminal open/close state.
In the end I had to remove this constraint, and accept any value between 0 and 1 for terminal open/close state, to let the optimizer explore all possible states.
The fourth problem I encountered was normalization. This one was not too hard to solve, basically if different terms of the loss function have different orders of magnitude, the gradient will be biggest for the biggest term in the loss function, which means the other term will not impact much the optimization. The best solution is to make sure each term of the loss function is normalized. For traffic capacity mismatch, I assumed the biggest traffic mismatch is “all terminals closed” so I used the mismatch in that situation as the denominator to normalize the mismatch term of the loss function. For profit, I assumed the biggest profit is when all terminals are at full capacity, I used the profit in that situation as the normalization denominator for the profit term of the loss function.
If I sum up, at this stage, I was able to run an optimization and get the optimal reopening sequence of airport terminals, given the air traffic forecasts from each airline, and the terminals financial characteristics. Optimal is defined as: maximal profit, under the constraint of satisfying airline forecasted capacity needs.
Then came the question of companies/terminal allocations. This one is very tricky, because basically, the airport wants all companies to be able to fly right now, wants to open as few terminals as possible, and wants to move as few airlines as possible (because airlines hate switching terminals). Those needs are contradictory, which is not shocking in itself, it’s the basis of almost all real-world optimization problems. If there are no contradictory factors, then the problem is generally easy.
Taking into account airlines/terminal allocations has a very nasty problem: the airport operations come with many exceptions. Airlines from the same group (ex: StarAlliance, SkyTeam etc) have to be grouped together where possible. Some airlines cannot move, some others can, but only between terminals X and Y etc.
When it comes to optimization, this is not good news, especially in my situation where I use Pytorch Autograd, which needs the problem to be differentiable. All those constraints could be implemented with “if”-like code, but branching logic in your loss function really makes it hard for differentiability.
Moreover, I knew that given all those constraints, it was very likely that the optimal scenario would be very hard to find, and would probably consist in moving just a handful of airlines out of the almost 200, between a few terminals, to delay the reopening of a handful of terminals by a few months. In other words, finding the right small touch to optimally change airlines/terminals allocation was going to be like finding a needle in a haystack.
I arrived at the conclusion that given the time constraints, it was wiser to start from the usual airlines/terminal allocation, and have a human with good domain knowledge look at the previous optimization results (optimal terminals reopening sequence from a pure traffic/financial perspective), and use that as a compass to manually try a few airlines/terminals reallocation scenarios.
In the end, using Pytorch and Autograd was obviously not a magic wand that could take into account all airport constraints and output a universally optimal scenario. But by using Autograd on a simpler problem (traffic/terminal allocation to maximize profit and minimize effective/needed capacity mismatch), we were able to identify good terminal candidates whose reopening should ideally be delayed, and then that information was used by a consultant with expert airport knowledge to propose a few good airlines/terminal reallocation scenarios.
This project used Pytorch and Autograd but did not involve neural networks at all. Some might say that it is a bit of an overreach to use the “AI” word, and it’s true that this project would probably be better defined with the term “differentiable programming”.
Yann LeCun said a few years back “Deep Learning est mort. Vive Differentiable Programming!” and I agree that we’ll see more and more great demonstrations with Pytorch and Autograd without necessarily involving neural networks.