The Geo-redundant Core
As we mentioned in the last edition of our EAN Tech Update Series, 2020 was a tough year for the airline industry. We got busy and used the time to further improve our European Aviation Network. In this second part of the series, we look at geo-redundancy.
EAN has been successfully developed by Inmarsat and Deutsche Telekom as the world’s first and only dedicated aviation connectivity solution to combine a satellite link with a complementary ground component (CGC), a network of around 300 antenna sites, delivering resilient inflight broadband to airlines over European airspace.
All of EAN’s CGC antenna sites are connected to a joint core element called the Mobile Core, which serves as the heart of any mobile radio network. This Mobile Core element was built from the beginning to be redundant by duplicating all of its key components, from a dual power supply and two independent main routers all the way down to board-level redundancy. We even separated the major server elements by hosting the equipment in different fire compartments. With this concept, all technical outage scenarios such as malfunctioning equipment, fire or water damage, or other local accidents were mitigated right from day one.
Eliminating the last potential outage threat
With the aim of reaching the ultimate reliability standard for our end-to-end EAN service chain, we have now eliminated the last potential outage threat to our CGC Mobile Core: disaster scenarios. These unforeseen scenarios, while unlikely, go beyond technical root causes and include natural disasters and other rare events such as large-scale fires, earthquakes, major accidents or malicious third-party actions.
No matter how solid the design, the only adequate mitigation for such disaster scenarios is geo-redundancy, which is defined as the physical separation of the core elements supporting our services into multiple geographic locations. As a first step, we took the initiative in 2020 to build a twin of our existing EAN CGC Mobile Core, which is hosted several hundred kilometers out of harm’s way in a different city. In the future, all of our roughly 300 EAN CGC antenna sites will have the ability to connect to both cores simultaneously. If one core fails, the other one will immediately and autonomously take over.
Just recently, we proved the reliable operation of our geo-redundant CGC Mobile Core site by simulating a disaster scenario. The existing main Mobile Core was switched off at a time of day when normal passenger flights were operating and inflight connectivity usage would therefore be expected. This event was immediately detected by the network outage discovery mechanisms, which automatically re-routed the CGC antenna site connections to the new geo-redundant Mobile Core. In less than two minutes, the full service was transferred to the new Mobile Core site without the need for manual interaction. Traffic passing through our S-Band satellite link at the time of the transfer remained entirely uninterrupted by the maneuver, and all passenger inflight connectivity sessions onboard were maintained throughout.
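To make the test sequence more concrete, the following is a minimal sketch of the re-routing logic in Python. It is an illustration only and not the actual EAN implementation: the class names, the `healthy` flag and the `reroute_on_outage` function are all hypothetical, and the real outage discovery mechanisms are not described in this article.

```python
from dataclasses import dataclass


@dataclass
class MobileCore:
    name: str             # hypothetical label for a core site
    healthy: bool = True  # assumed to be set by some outage detection mechanism


@dataclass
class AntennaSite:
    site_id: int
    active_core: MobileCore


def reroute_on_outage(sites, primary, standby):
    """Move every site served by an unhealthy primary core to the standby core.

    Returns the number of sites that were moved.
    """
    if primary.healthy:
        return 0
    if not standby.healthy:
        raise RuntimeError("no healthy core available")
    moved = 0
    for site in sites:
        if site.active_core is primary:
            site.active_core = standby
            moved += 1
    return moved


# Simulated disaster test: switch off the main core, then re-route.
main_core = MobileCore("main")
twin_core = MobileCore("geo-redundant")
sites = [AntennaSite(i, main_core) for i in range(300)]

main_core.healthy = False
moved = reroute_on_outage(sites, main_core, twin_core)
```

In this sketch the switch-over is a single pass over all sites with no manual step, which mirrors the automatic behavior observed in the test.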
Geo-redundancy brings additional benefits
While the principal goal of any geo-redundancy initiative is superior service protection, even in disaster scenarios, its setup also brings a number of additional benefits, including increased performance. For example, as an operational principle for deployment of the two Mobile Cores, we chose an ‘active-active’ configuration, which means no ramp-up times apply in outage cases. A key advantage of this setup is that we can enable load sharing between both cores by sending one portion of the traffic to one core and another portion to the other. Aircraft will therefore be served by the most appropriate core at any given time, providing an even better service for our customers than a single Mobile Core could. This directly translates to an enhanced user experience for our airlines’ passengers. Furthermore, the two cores allow seamless lifecycle management. Until now, new software and release upgrades were introduced during night shifts, when few or no aircraft were flying. With the new dual-core principle, upgrades can be executed on one core while its twin temporarily takes over, allowing outage-free maintenance windows from now on. Improving the user experience in this way will remain part of any geo-redundancy feature design we consider.
As service reliability and performance are key attributes for EAN, there are various additional measures in place to minimize the risk of a complete outage. With the critical S-Band satellite link, for example, all of the active satellite sub-systems are fully redundant. In addition, an antenna redundancy solution has been built in at the EAN Satellite Access Station (SAS) in Nemea, Greece. Should a technical issue prevent us from managing this radio link through the dedicated EAN SAS element, the existing design and infrastructure would allow the traffic flow to be transferred to an alternative antenna as an interim solution, thereby avoiding a complete interruption of our service.
Our newly established EAN geo-redundant Mobile Core is ultimately a set of precautionary measures with added benefits for our peace of mind, although we hope that no disaster scenario will ever occur. We are fully committed to making EAN the best ‘Internet in the Skies’ service there is, and therefore have not stopped there. Next time, read about how we tweaked our radio network last year to optimize it even further.
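The load-sharing and per-core upgrade ideas can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the selection logic used in EAN: the article does not say how the "most appropriate" core is chosen, so the hash-based split, the `pick_core` function and the core and aircraft names below are hypothetical.

```python
import zlib


def pick_core(aircraft_id: str, cores: dict[str, bool]) -> str:
    """Return the name of the core that serves this aircraft.

    `cores` maps a core name to True if that core is in service.
    Assumption: traffic is split by a stable hash of the aircraft ID.
    """
    available = sorted(name for name, in_service in cores.items() if in_service)
    if not available:
        raise RuntimeError("no core in service")
    return available[zlib.crc32(aircraft_id.encode()) % len(available)]


fleet = [f"AC{i:03d}" for i in range(100)]  # hypothetical aircraft IDs

# Normal operation: both cores are active and share the load.
cores = {"core-a": True, "core-b": True}
normal = {ac: pick_core(ac, cores) for ac in fleet}

# Maintenance window: core-a is taken out of service for an upgrade,
# so its twin temporarily serves the whole fleet.
cores["core-a"] = False
during_upgrade = {ac: pick_core(ac, cores) for ac in fleet}
```

Because both cores are already carrying traffic in normal operation, taking one out of service only changes the selection result; no standby system has to be started first, which is the point of the active-active choice.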