Thinkshifts Group

A Comprehensive Guide to Real-Time Big Data Analytics

Real-time analytics and big data are both trending, so their combination, real-time big data analytics, looks like a very promising initiative that many businesses should want. Let’s find out whether this is really so.

Most examples in this article come from real-time customer big data analytics. We chose them for simplicity and consistency, though real-time data analytics can be applied in many other areas.

Let’s start by defining the term

If you are about to skip this section because you think there cannot be two definitions of real-time, you may be surprised: there are. In fact, the definition of real-time is extremely vague, and it differs a lot from company to company or, to be more exact, from business task to business task.

Real-time big data analytics means that big data is processed as it arrives and either a business user gets consumable insights without exceeding a time period allocated for decision-making or an analytical system triggers an action or a notification.

Because real-time is often confused with instantaneous, let’s clarify the time frames for data input and response. For data input, a real-time processing engine can be designed to either receive pushed data or pull it. The most widespread case is the push option, with a continuous flow of high-volume data (also known as streaming). However, a real-time processing engine is not always capable of ingesting streaming data. Alternatively, it can be designed to pull data by periodically asking the source whether any new data has arrived. The time between such queries depends on business needs and can vary from milliseconds to hours.
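The pull option can be sketched as a simple polling loop. This is a minimal illustration rather than a production ingestion engine: the names `poll_for_new_data`, `fetch_new` and `handle` are hypothetical, and the loop is bounded by `max_polls` only so that the sketch terminates.

```python
import time
from typing import Callable, List


def poll_for_new_data(
    fetch_new: Callable[[], List[dict]],  # hypothetical: returns records that arrived since the last call
    handle: Callable[[dict], None],       # hypothetical: processes a single record
    interval_seconds: float,              # from milliseconds to hours, depending on business needs
    max_polls: int,                       # bounded here so the sketch terminates
) -> int:
    """Pull model: ask the source for new data at a fixed interval."""
    processed = 0
    for poll_number in range(max_polls):
        for record in fetch_new():
            handle(record)
            processed += 1
        # Wait before asking again, except after the final poll.
        if poll_number < max_polls - 1:
            time.sleep(interval_seconds)
    return processed
```

In the push (streaming) case, the source would call `handle` itself as each record arrives, and no interval would be involved.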

Correspondingly, the response time also varies. For instance, a self-driving car requires a very fast response time of just several milliseconds. Consider sensors installed on a wind turbine that report a slowly rising gearbox oil temperature, still below the critical level but higher than normal. Here a one-minute response time is enough to change the blade pitch, thus offloading the turbine and preventing machine breakdown or even fire. A bank’s analytical system, in turn, can take several minutes to assess the creditworthiness of an applicant, and a retailer’s dynamic pricing can take up to an hour to update. Still, all these examples are considered real-time.
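One way to make this concrete is to treat real-time as "within the time budget of the task". The sketch below is only an illustration: the wind turbine and dynamic pricing budgets follow the examples above, while the values for the self-driving car and credit scoring are assumed stand-ins for "several milliseconds" and "several minutes". The names `RESPONSE_BUDGET_SECONDS` and `is_real_time` are hypothetical.

```python
# Response-time budgets in seconds for the examples above.
RESPONSE_BUDGET_SECONDS = {
    "self_driving_car": 0.005,   # assumption: stand-in for "several milliseconds"
    "wind_turbine": 60.0,        # one minute
    "credit_scoring": 300.0,     # assumption: stand-in for "several minutes"
    "dynamic_pricing": 3600.0,   # up to an hour
}


def is_real_time(use_case: str, elapsed_seconds: float) -> bool:
    """A response counts as real-time if it arrives within the task's budget."""
    return elapsed_seconds <= RESPONSE_BUDGET_SECONDS[use_case]
```

Under these assumptions, a price update that takes half an hour is still real-time, while a one-second reaction from a self-driving car is not.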

Real-time big data analytics as a competitive advantage

Although organizations generally value managing data in real time, not all companies go for real-time big data analytics. The reasons vary: a lack of expertise, insufficient funds, fear of the associated challenges, or the management team’s overall reluctance. However, companies that implement real-time analytics can gain a competitive advantage.

Let’s say you are a fashion retailer who would like to gain this advantage by delivering top-notch customer service. Analyzing big data in real time can help bring this initiative to life. For example, when a customer passes by the retailer’s store, they get a push notification on their smartphone that incentivizes them to enter. Usually, it is a personalized promo offer based on the customer’s purchasing history or even their browsing history on the website. Once the customer is in the store, the staff get a notification in their mobile apps. It makes them aware of the customer’s latest purchases, overall style preferences, interest in promotions, typical spend, etc. It looks like a win-win situation for both customers and retailers, doesn’t it?

An ecommerce retailer can also achieve better performance by analyzing big data in real time. For instance, they can reduce the number of abandoned carts. Say a customer has got as far as the cart but, for some reason, has decided not to finalize the purchase. There is still a good chance to incentivize them to change their mind. The system turns to the customer’s profile data, as well as their purchasing and browsing history, and compares the customer’s behavior with that of other customers from the same segment and their responses to different actions in a similar situation. Based on the results, the system chooses the most suitable of all the possible actions, for example, offering a discount.
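The last step, choosing the most suitable action, can be sketched as picking the action with the best historical conversion rate in the customer’s segment. This is a deliberately simplified assumption about how such a system might decide; the function name, the action names and the data layout are all hypothetical.

```python
from typing import Dict, Tuple


def best_action(segment_history: Dict[str, Tuple[int, int]]) -> str:
    """Return the action with the highest conversion rate.

    `segment_history` maps an action name to (conversions, attempts) observed
    for customers of the same segment in a similar situation.
    """
    def conversion_rate(action: str) -> float:
        conversions, attempts = segment_history[action]
        return conversions / attempts if attempts else 0.0

    return max(segment_history, key=conversion_rate)


# Hypothetical history for one customer segment.
history = {
    "offer_discount": (30, 100),
    "free_shipping": (20, 100),
    "reminder_email": (5, 100),
}
chosen = best_action(history)
```

A real system would likely also weigh the cost of each action, for instance the margin lost to a discount, which this sketch ignores.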

A typical architecture for real-time big data analytics

Let’s have a look at how a typical real-time big data analytics solution works. To make the explanation more vivid, we will use an example that everybody can relate to, since we all take the role of a customer now and again.

Imagine a retailer that aims to deliver a personalized customer experience. The first step on this long road is to recognize customers once they are in the store. A retailer can achieve this in multiple ways, for example, by implementing face recognition.

With this single data source, the retailer can do a simple analysis, such as calculating how many male and female customers are currently in the store. However, the retailer will not settle for one data source. Even to know how many of the customers have come for the first time and how many are regulars, another data source is needed, for example, a CRM. The general context will also be helpful, for instance, information about the store’s opening hours.
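Combining the two sources can be as simple as a lookup. The sketch below assumes that the recognition step yields one customer ID per person in the store (or `None` when nobody is matched) and that the CRM can report how many past visits each ID has. Both the data layout and the function name are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional


def count_new_and_regular(
    recognized_ids: Iterable[Optional[str]],  # one entry per person currently in the store
    crm_past_visits: Dict[str, int],          # hypothetical CRM extract: customer ID -> past visits
) -> Dict[str, int]:
    """Split current visitors into first-time customers and regulars."""
    counts = {"first_time": 0, "regular": 0}
    for customer_id in recognized_ids:
        # Unrecognized people and IDs unknown to the CRM count as first-time visitors.
        past_visits = crm_past_visits.get(customer_id, 0) if customer_id else 0
        if past_visits > 0:
            counts["regular"] += 1
        else:
            counts["first_time"] += 1
    return counts
```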

After processing, real-time data finds its way to a real-time dashboard or turns into either a notification or a system action. We have already given an example of the first case: the retailer can see how many customers are in the store at the moment. Let’s look at another option in detail. Say a customer has put together a shopping list in the mobile app and is moving around the store. Based on the customer’s current location (gathered by beacons and processed by the same real-time analytics), the app can suggest the best route along the sales floor so that they can pick up everything on their list.
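As an illustration of the route prompt, the sketch below uses a greedy nearest-item heuristic on a grid: from the current position, always walk to the closest remaining item. This is an assumed simplification, not necessarily what such an app does, and a greedy route is not guaranteed to be the shortest one. The coordinates, item names and the function name `plan_route` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (aisle, shelf) position on a simplified floor grid


def plan_route(start: Point, item_locations: Dict[str, Point]) -> List[str]:
    """Order shopping-list items by repeatedly visiting the nearest one."""
    remaining = dict(item_locations)
    position = start
    route: List[str] = []
    while remaining:
        # Manhattan distance suits walking along aisles; the name breaks ties.
        nearest = min(
            remaining,
            key=lambda name: (
                abs(remaining[name][0] - position[0])
                + abs(remaining[name][1] - position[1]),
                name,
            ),
        )
        route.append(nearest)
        position = remaining.pop(nearest)
    return route
```

The starting point would come from the beacon-based location data, so the route can be recomputed whenever the customer moves.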

Let’s continue with the above-mentioned example to explain the contribution of machine learning. Note that the learning itself (model training) does not happen in real time. It is an elaborate process: the system needs significant time to analyze an enormous volume of data, usually covering more than a year, from different angles to come up with valuable models and patterns. These models then help the system make real-time decisions. Now, to the example: the system has already analyzed customer profiles and the segments they belong to, their behavior models, purchasing history, response to marketing campaigns, etc., and built a model that enables personalized recommendations. While the customer is walking along the aisles, the system can notify them about promo offers or related products that they are likely to find interesting.
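The split between slow training and fast decisions can be illustrated with a toy recommender. In this sketch, assumed purely for illustration, the "model" is just a table of how often products were bought together: it is built offline from historical baskets, and the real-time step is a cheap lookup. Real systems would use far richer models; the function names and product names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, List


def train_cooccurrence(baskets: List[List[str]]) -> Dict[str, Counter]:
    """Offline step: count how often each pair of products shares a basket."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for basket in baskets:
        for product, other in permutations(set(basket), 2):
            model[product][other] += 1
    return dict(model)


def recommend(model: Dict[str, Counter], product: str, top_n: int = 2) -> List[str]:
    """Real-time step: look up the products most often bought with `product`."""
    related = model.get(product, Counter())
    # Sort by descending count, then by name, so the result is deterministic.
    ranked = sorted(related.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]
```

Training may run over more than a year of purchases and take a long time, while `recommend` only reads a precomputed table and can answer while the customer is still in the aisle.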

Machine learning also requires model verification applications, as they enable constant improvement of the models’ accuracy. Additionally, they improve the quality of the input data by allowing basic filtering of erroneous or noisy data.
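Both roles can be sketched in a few lines: a range filter that drops missing or implausible input values, and an accuracy measure that compares the model’s predictions with what actually happened. This is a minimal sketch under assumed data shapes, and the thresholds and function names are hypothetical.

```python
from typing import List, Optional, Sequence


def filter_valid(values: Sequence[Optional[float]], low: float, high: float) -> List[float]:
    """Basic input filtering: drop missing values and values outside [low, high]."""
    return [v for v in values if v is not None and low <= v <= high]


def accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Share of predictions that matched the observed outcome."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)
```

Tracking a measure like `accuracy` over time is what would show whether a retrained model is actually better than the previous one.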

Now let’s turn to data storage. It consists of two components: a data lake and a data warehouse. The former stores all the raw data, or data that has undergone only very simple processing. The latter makes big data 2-10 times smaller by extracting, transforming and loading only some of the data from the data lake.
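The extract-transform-load step can be illustrated as follows. The sketch assumes the data lake holds raw event dictionaries with `type`, `store`, `timestamp` (ISO 8601 text) and `amount` fields. That layout and the function name are assumptions, and a real warehouse load would write to a database rather than return a dictionary.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def etl_daily_sales(raw_events: List[dict]) -> Dict[Tuple[str, str], float]:
    """Reduce raw lake events to daily sales totals per store."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for event in raw_events:
        # Extract: keep only purchase events and skip everything else.
        if event.get("type") != "purchase":
            continue
        # Transform: cut the ISO timestamp down to its date part (YYYY-MM-DD).
        day = event["timestamp"][:10]
        totals[(event["store"], day)] += event["amount"]
    # Load: the aggregated rows are what would go into the warehouse.
    return dict(totals)
```

Many raw events collapse into one row per store and day, which is how the warehouse ends up much smaller than the lake.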

In short, a retailer cannot live by real-time analytics alone. The architecture includes some other important components that fall outside real time. Still, they are critical if the retailer wants to get valuable and deep insights. For example, a data analytics module, which we haven’t mentioned yet, is responsible for running complex analysis by applying elaborate algorithms and statistical models driven by data analysts. Indeed, this process can take hours or more, but the results are worth waiting for. Correspondingly, the retailer’s analytical dashboards will always contain not only real-time but also historical data.

To sum it up

If thoroughly planned and properly implemented, real-time big data analytics can definitely become a competitive advantage. Given how different the interpretations of real-time can be, it is important to have a clear understanding of the company’s requirements for the analytical system.

In this article, we have described a typical architecture for a real-time data analytics solution. Before taking it as an example, check whether it will cover your short-term and long-term business needs. If for some reason it does not, you can always seek professional advice on how to tailor it.