The RARS2 Playbook (for TOCs who want to win)
The UK passenger railway industry comprises many Train Operating Companies (TOCs) that share a reservation system that has been ripe for innovation for years. This legacy system, the National Reservations System (NRS), was built at a time when rail travel was much less complex. As a result, NRS just wasn’t built with the flexibility TOCs demand in today’s travel environment. Enter RARS2.
What is RARS2?
RARS2 is the next-generation UK reservation system built to resolve many of these challenges. With increased capabilities over NRS including simple data integrations, scalable infrastructure, secure interfaces and business scalability at its core, RARS2 is the fit-for-purpose solution that will grow with UK TOCs into the future.
This scalability is driven by its flexible APIs for data elements such as bookings, train schedules, physical availabilities, logical availabilities, sales restrictions and many more. All of these API endpoints are protected by industry-standard security (OAuth authorization), but it’s how easily they integrate with third-party solutions that unlocks real value for TOCs. Whether it’s a next-generation Revenue Management solution, like Revenue Analytics’ FareVantage™, or your customer-centric CRM system, you have open and immediate access to your reservations and pricing in real time. No more day-long delays and no more limitations of a legacy solution.
Working with RARS2 really depends on two key processes: retrieving data from it and sending data to it. It’s really as simple as that, but a quick summary is below:
- Retrieving Data from RARS2: This includes extracting all data elements needed for downstream analysis and optimization processes. The basic form of extraction is batch extraction, which runs once every night or several times a day to limit the load on the RARS2 system. However, real-time updates are available using the Live Event Management System. It’s important to think through where you can leverage these real-time interfaces and what is better suited to batch updates.
- Sending Data to RARS2: The core functionality of RARS2 is to manage your customers’ reservations. As such, it is critical that inventory allocations are sent frequently and accurately. This is where your Revenue Management capabilities come into play. For example, FareVantage™ crunches all the numbers and generates optimal logical availabilities for all trains, all journeys and all departure dates. This equates to roughly 8 million distinct allocations, across all inventory buckets, every evening. Getting this integration correct is critical. If you get it wrong, your customers will be sure to let you know.
Let’s dive deeper into how retrieving data from and sending data to RARS2 works, and how simple the process really is. Below is a summary of how the RARS2 APIs are used in a Production environment.
- Request an auth token from RARS2 using a secret key.
- Extract the full train schedules (service segments). This means calling a REST API endpoint with the previously requested auth token and other input parameters. The result forms the basis for extracting availabilities.
- Extract booking revisions: a list of the booking numbers that were created, updated or cancelled since a timestamp provided as an input. This is a GET request to the booking revisions API endpoint, and the result forms the basis for extracting booking detail.
- Extract availabilities for the list of trains obtained in the service segments step. We extract Physical Availability (train capacity) and Logical Availability (logical limits for each class and bucket). Each has its own API endpoint, queried with a GET request.
- Extract booking detail for the list of booking numbers obtained in step 3. This contains all the booking-related information, such as number of passengers, status and origin/destination. It is important to note that this API endpoint does not expose any customer personal information.
- Finally, once the data has passed through the downstream analytical and optimization processes, the recommended logical availabilities they generate are exported back to RARS2 using a PUT request to the logical availability endpoint.
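To make the order of those six steps concrete, here is a minimal Python sketch of one batch cycle. Everything RARS2-specific in it is an assumption made for illustration: the endpoint paths, the payload shapes and the `api` and `optimise` callables are hypothetical placeholders, not the real RARS2 or FareVantage™ interfaces.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: api(method, path, token, payload) -> decoded JSON.
ApiCall = Callable[[str, str, str, Any], Any]


def run_batch_cycle(
    get_token: Callable[[], str],
    api: ApiCall,
    since: str,
    optimise: Callable[[Dict[str, Any]], List[dict]],
) -> int:
    """Run one extract/optimise/export cycle and return the number of allocations sent."""
    # Step 1: obtain an auth token (the secret key lives behind get_token).
    token = get_token()
    # Step 2: full train schedules (service segments).
    segments = api("GET", "/service-segments", token, None)
    # Step 3: booking numbers created/updated/cancelled since the timestamp.
    revisions = api("GET", "/booking-revisions", token, {"since": since})
    # Step 4: physical and logical availability for the extracted segments.
    physical = api("GET", "/physical-availability", token, {"segments": segments})
    logical = api("GET", "/logical-availability", token, {"segments": segments})
    # Step 5: booking detail for every booking number from step 3.
    bookings = [api("GET", "/bookings/" + str(n), token, None) for n in revisions]
    # Step 6: hand the data to the optimiser and export its recommendations.
    recommendations = optimise(
        {"segments": segments, "physical": physical,
         "logical": logical, "bookings": bookings}
    )
    api("PUT", "/logical-availability", token, recommendations)
    return len(recommendations)
```

Passing the transport and the optimiser in as callables keeps the sketch runnable without network access and makes the sequence easy to test with a fake `api`.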
It’s worth mentioning how simple it is to access live updates as well. But just because you can access data at this frequency doesn’t necessarily mean that you always should. In many cases, batch processing is sufficient, and real-time may actually result in higher costs and customer complaints. The live flow works as follows:
- Subscribe to live event notifications by adding a webhook in the RARS2 system. This is a one-time configuration.
- Whenever a new booking revision is created in RARS2, a notification is automatically sent. From there, a system can retrieve these updates. FareVantage™ has this automated retrieval built in.
- The incoming event is authorized using a custom authorization technique. If authorization succeeds, the event is stored; otherwise it is rejected.
- FareVantage™ uses a dedicated AWS Kinesis Firehose data stream to append each event to the reservations that have already been captured. These reservations are then exported as flat files to S3, where they are picked up by FareVantage’s analytical pipelines.
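The "authorize, then store or reject" part of that flow can be sketched in a few lines of Python. This is an illustration under assumptions rather than the FareVantage™ implementation: `is_authorised` and `store` are hypothetical callables, and the specific status codes are a choice made for the example.

```python
from typing import Callable, Tuple


def handle_event(
    body: bytes,
    signature: str,
    is_authorised: Callable[[bytes, str], bool],
    store: Callable[[bytes], None],
) -> Tuple[int, str]:
    """Store an incoming live event if it passes authorization, else reject it."""
    if not is_authorised(body, signature):
        # Rejected events are never stored; the sender sees a 4XX status.
        return 401, "rejected"
    # In a real deployment, store() might forward the raw body to a
    # delivery stream; here it is simply whatever callable the caller supplies.
    store(body)
    return 200, "stored"
```

Keeping authorization and storage as injected callables means the same handler works whether events end up in a delivery stream, a queue or a local list during testing.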
Technical Challenges During RARS2 Integration
It goes without saying that there will be technical challenges when you are trying to build something as complex as a reservation system. There are plenty of lessons learned, but there are three that all RARS2 users should plan for.
- Security
- Scaling
- Fault tolerance
Next, let’s talk about each one of these individually and what you can do to ensure a successful RARS2 implementation.
- Security: There are multiple areas in security that must be handled to make sure data doesn’t get leaked and TOC pricing isn’t manipulated.
A) Authorization
Authorization is a key aspect of building any integration. You must make sure that you are sending the right credentials the right way so that the reservation system authorizes your request. Not only that, but you also need to make sure that the live event notifications you receive through the webhook come from a legitimate source and that no one in the middle is trying to hack your system or network.
We solved the former challenge by obtaining system credentials from the reservation system. These credentials are stored in AWS Secrets Manager, a managed service built for storing this kind of content securely. Using these credentials, we request an access token every time we run our batch process. The token expires after 30 minutes, so we must refresh it when it is about to expire.
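A small token cache is one way to handle that 30-minute expiry. The sketch below is a minimal illustration, not the FareVantage™ code: `fetch_token` is a hypothetical callable that would read the credentials and call the token endpoint, and the 60-second safety margin is an assumed value.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Reuse an access token and refresh it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        lifetime_s: float = 30 * 60,   # token lifetime stated in the text
        margin_s: float = 60,          # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, requesting a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin_s:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime_s
        return self._token
```

Calling `cache.get()` before every API request keeps long batch runs from failing halfway through on an expired token, without requesting a new token for each call.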
For the latter challenge, authorizing live event notifications, we generate an MD5 hash of the request body that we receive. That MD5 hash is then combined with the secret key we received when adding the webhook to generate an HMAC signature. The generated HMAC signature is compared with the HMAC signature from the request header. We authorize the event when they match; when they don’t, we send back a 4XX response.
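Here is a minimal sketch of that check using Python’s standard library. Two details are assumptions, because the text does not specify them: the HMAC is computed over the hex-encoded MD5 digest, and SHA-256 is used as the HMAC hash function. Check the webhook documentation for the exact scheme before relying on either.

```python
import hashlib
import hmac


def expected_signature(body: bytes, secret: bytes) -> str:
    """Compute the HMAC signature for a request body.

    Assumptions: the message is the hex MD5 digest of the body, and the
    HMAC uses SHA-256. The real scheme may differ.
    """
    body_md5 = hashlib.md5(body).hexdigest().encode("ascii")
    return hmac.new(secret, body_md5, hashlib.sha256).hexdigest()


def is_authentic(body: bytes, secret: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid leaking timing information."""
    expected = expected_signature(body, secret)
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

Using `hmac.compare_digest` rather than `==` is a deliberate choice: it takes the same time whether the signatures differ in the first character or the last.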
B) DDoS Attacks
A denial-of-service (DoS) attack is a cyber-attack in which the perpetrator seeks to make a machine or network resource unavailable to its intended users by flooding it with superfluous requests in an attempt to overload systems. In a distributed (DDoS) attack, that flood comes from many sources at once.
To prevent these situations, we put AWS API Gateway in front of all our REST API endpoints. Its built-in rate limiter blocks any requests that exceed a preset threshold. The default limit is 10,000 requests per second across the account, which is still high enough to incur huge operational costs, so we lower it to a reasonable limit.
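AWS documents that API Gateway throttles requests using the token bucket algorithm. The sketch below is not API Gateway’s code; it is a small, self-contained Python illustration of that idea, showing how a steady rate and a burst allowance interact. The class name and parameters are made up for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate_per_s`."""

    def __init__(
        self,
        rate_per_s: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate_per_s
        self._capacity = float(burst)
        self._tokens = float(burst)   # start with a full bucket
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it is throttled."""
        now = self._clock()
        # Refill for the elapsed time, never beyond the bucket capacity.
        self._tokens = min(
            self._capacity, self._tokens + (now - self._last) * self._rate
        )
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

Lowering the rate and burst values is what "adjusting to a reasonable limit" amounts to: legitimate traffic fits inside the bucket, while a flood drains it and gets rejected.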
- Scaling: The amount of data that we receive varies between data elements and over time. You can have a day where you only get a couple of thousand bookings, whereas on another day you get hundreds of thousands. It’s very important that all systems can handle these variations.
At Revenue Analytics, we’ve developed an internal tool that handles job orchestration using AWS Step Functions, building and deploying Docker containers, spinning up EC2 servers using AWS Batch, invoking serverless compute environments using AWS Lambda, and scaling up and down depending on the load. This capability is known as “Pipeline Manager”, and is an industry-leading tool developed specifically by our Platform Engineering team.
Using these internal pipelines, we provide batch array JSON files that pair each index with a segment (think particular trains, departure days, etc.) whose analytics we want to run on a single machine. Pipeline Manager takes each of these indexes and runs it in its own Docker container, independent of other executions. Beyond this container-level parallelism, we also use multi-threading within each container, with one thread handling one API request. This level of parallel processing helps us finish our workloads in a near-optimal time.
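The two levels of parallelism can be sketched as follows. This is an illustration under assumptions, not Pipeline Manager itself: the JSON layout (a list of objects with `index` and `segment` keys) is hypothetical, as is `fetch_one`. The `AWS_BATCH_JOB_ARRAY_INDEX` environment variable is the one AWS Batch sets for each child of an array job.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def segment_for_this_container(batch_array_json: str) -> Dict[str, Any]:
    """Pick the entry whose index matches this container's array index."""
    index = int(os.environ.get("AWS_BATCH_JOB_ARRAY_INDEX", "0"))
    entries = json.loads(batch_array_json)
    return next(entry for entry in entries if entry["index"] == index)


def fetch_all(
    requests: List[Any],
    fetch_one: Callable[[Any], Any],
    max_threads: int = 8,
) -> List[Any]:
    """Issue one API request per worker thread; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(fetch_one, requests))
```

Threads suit this workload because each request spends most of its time waiting on the network, so many can be in flight at once inside a single container.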
Scaling live event capture is fundamentally different: the batch process sends requests to the reservation system, whereas the live process receives requests from it. Since we don’t control when those requests arrive, we need a way to scale up and down automatically depending on the current load.
Imagine a popular game is announced and everyone is trying to book tickets: you’ll see a spike in booking activity. Compare that with midnight on a regular weekday, when only a handful of people are trying to book. The best way to handle both is to use AWS API Gateway together with AWS Lambda for the live bookings API that we registered as the webhook. This transfers all the heavy lifting to AWS and its fully managed serverless technologies, and you only pay for the number of event notifications that you receive.
While FareVantage™ has been built with years of AWS experience and with proprietary components built to scale, this ability to increase and decrease computational power based on need is critical to leveraging RARS2 to its maximum. Without it, you’ll be forced to take a one-size-fits-all approach that will very likely hurt your costs and performance.
- Fault Tolerance: Since there are a lot of moving parts in this integration module, there are just as many ways it could go wrong. This is important to consider, since any analytics that leverage RARS2 data would be skewed if the data were erroneous or incomplete. We need to plan for the worst, so let’s discuss how we made FareVantage™ fault-tolerant for a couple of scenarios and how you can do the same.
How do we synchronize any two booking data extraction runs? What happens when the process fails during the day? – To solve this issue, we maintain a most-recent-run datetime that is updated every time we successfully extract the latest data. Every run extracts all updates that have happened since that datetime. If the current execution fails, the retry or the next run will still fetch all updates since the most recent successful run. This way we make sure that a fault in either system does not affect our data quality or coverage.
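The key property is that the stored datetime only moves forward after a successful run. A minimal Python sketch, with hypothetical `load_watermark`, `save_watermark` and `extract_since` callables standing in for real storage and extraction; recording the run’s start time (rather than its end time) as the new watermark is an assumption made here so that updates arriving mid-run are picked up next time.

```python
from datetime import datetime, timezone
from typing import Callable


def run_extraction(
    load_watermark: Callable[[], datetime],
    save_watermark: Callable[[datetime], None],
    extract_since: Callable[[datetime], None],
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> datetime:
    """Extract all updates since the last successful run, then advance the watermark."""
    since = load_watermark()
    started_at = now()
    # If this raises, the watermark is left untouched, so the retry or
    # the next scheduled run starts again from the same point.
    extract_since(since)
    save_watermark(started_at)
    return started_at
```

A failed run therefore costs some repeated work on the next attempt, but never a gap in the extracted data.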
How do you make sure you received all booking events? – Let’s say something goes wrong for a single booking while extracting batch bookings. We shouldn’t stop the whole process. The same applies if something goes wrong while processing a live booking event. So how do we make sure the data stays complete? At the end of the booking extraction process, we fetch all booking revisions (the list of booking numbers that need to be extracted) and compare it with the booking detail already extracted. This reveals the bookings that are missing so we can process them again. We configured the system to perform several retries in case of failures.
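That reconciliation step is essentially a set difference followed by bounded retries. A minimal sketch, assuming a hypothetical `fetch_detail` callable and a retry count of three (the text only says "several"):

```python
from typing import Any, Callable, Dict, Iterable, Set


def reconcile(
    revision_numbers: Iterable[str],
    extracted: Dict[str, Any],
    fetch_detail: Callable[[str], Any],
    max_attempts: int = 3,
) -> Set[str]:
    """Re-fetch bookings missing from `extracted`; return those still missing."""
    expected = set(revision_numbers)
    missing = expected - set(extracted)
    for number in sorted(missing):
        for _ in range(max_attempts):
            try:
                extracted[number] = fetch_detail(number)
                break  # this booking is recovered; move on to the next one
            except Exception:
                continue  # one failed booking must not stop the whole run
    return expected - set(extracted)
```

Anything still returned after the retries is a candidate for alerting or for the next scheduled run, rather than a reason to abort the current one.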
How do you deploy a RARS2 integration while mitigating risk?
Rolling out RARS2 was as exciting as it was challenging. The Revenue Analytics team, in partnership with LNER and Sqills, went through a very comprehensive testing approach, but with any system there’s always a risk of issues during go-live. To reduce this risk, we followed a few safety measures, detailed below.
Our first decision was to make sure that this was not a “big-bang” switch from NRS to RARS2. Future departure dates were split across the two systems so that the near dates (highest risk) continued to run on NRS and the remaining future dates (lower risk) ran on RARS2. This kept the highly active part of the booking horizon unaffected while the rest, where passenger demand is lower because departure is further away, ran in RARS2. As we move closer to that split line, we process more dates in RARS2 and fewer in NRS, and eventually we cut off NRS completely. To date there have not been any customer-reported issues, but this approach gives TOCs breathing room to put patches and hot fixes into production without large numbers of customers being aware.
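The split can be pictured as a fixed cut-over date on the departure calendar. The sketch below is an illustration only: the function names are made up, and treating the split date itself as the first RARS2 departure date is an assumption.

```python
from datetime import date, timedelta
from typing import Dict


def reservation_system_for(departure: date, split: date) -> str:
    """Departures before the split date stay on NRS; the rest run on RARS2."""
    return "NRS" if departure < split else "RARS2"


def dates_per_system(today: date, split: date, horizon_days: int) -> Dict[str, int]:
    """Count how many departure dates in the booking horizon each system handles."""
    counts = {"NRS": 0, "RARS2": 0}
    for offset in range(horizon_days):
        departure = today + timedelta(days=offset)
        counts[reservation_system_for(departure, split)] += 1
    return counts
```

Because the split date stays fixed while `today` advances, the NRS share of the horizon shrinks by one date per day until it reaches zero, which is the gradual cut-over described above.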
As an additional layer of safety during this gradual roll-out, we ran the system in production before the roll-out date while blocking reservations from occurring. This gave us a head start in seeing how the integrations and overall performance were trending in the live environment without exposing the trains to customers.
Lastly, and as mentioned earlier, this integration contains both batch and live data extraction. Since live data is an enhancement over the existing batch process in NRS, we planned the roll-out with only the batch process and made sure that the core functionality was working as expected before enabling the more advanced live integration.
All of these decisions enhanced confidence for all parties involved and resulted in a flawless go-live.
Why should TOCs be excited about the FareVantage™ RARS2 integration?
First and foremost, RARS2 is designed with the future of rail and the future of your customers in mind. True journey-level management lets you focus on your customers’ needs, increased flexibility allows you to manage your pricing and inventory in the best way possible, and improved performance lets you do all of this much more quickly and easily.
At Revenue Analytics, we’re experts on Revenue Management, so here’s specifically why we’re excited about RARS2 and what it could mean for you.
- Quick Implementation: Data integration is consistently one of the hurdles to implementing any software, but especially a Revenue Management solution. Not anymore. Since RARS2 leverages a standard set of APIs and we’ve incorporated all of these configurations into FareVantage™, a customer can be up and running, with recommendations being delivered to customers, all within 8 weeks!
- Real-time Integration: RARS2 can provide reservation data in real time and be updated in real time. With FareVantage™’s built-in AI and automation, TOCs can trigger new, optimal pricing with the click of a button. No more waiting for NRS processing time or using an overly cumbersome user interface.
- Customer Specificity: This next-generation Reservation System is a perfect pairing for a next-generation Revenue Management system like FareVantage™. RARS2 enables journey-specific management, not leg-level, allowing you to price your trains much more precisely. FareVantage™ automatically reserves seats for your highest-value customers while also decreasing price where additional demand exists.
In summary, RARS2 unlocks hidden value that has existed across every TOC’s network for years. While there is some effort involved in capturing this value, hopefully these tips and tricks will get you there just a little bit faster. And if you really want to maximize your investment in RARS2, it’s time to consider FareVantage™ as your next-generation Revenue Management System that has all this built in!