How to Understand the Requirements of Modern Retail Analytics

By Coeo

By Gavin Payne - Head of Digital Transformation at Coeo

Analytics, which was once a back-office capability, is now becoming a business differentiator in the increasingly competitive online retail market.  To quote Kelsie Marian of Gartner, a leading technology research agency, “retailers will survive only if quality data is embedded into every decision, minute by minute, across the retail organisation.”

This article looks at why traditional reporting platforms are struggling to meet the expectations of the next generation of business, and shares how we are helping retailers solve this problem by deploying modern online retail analytics platforms.

The Evolution of Data

Modern analytics technologies give online retailers a better understanding of their customers, letting them provide more personalised recommendations and use data to improve their operational decision-making.  Yet each of these capabilities often uses an isolated analytics system that only tells part of the story.

Getting a complete view, whether it’s of a customer or an end-to-end online retail process, needs an analytics platform that can see and analyse all of an organisation’s data.  While this may seem like a straightforward requirement, the reporting systems still being used by many retailers are struggling to cope for three common reasons:

  • the volume of data retailers collect has increased significantly in recent years
    (Collecting gigabytes of data a day is increasingly becoming normal)
  • the sources and formats of the data they collect are a lot more varied than they used to be
    (Social media posts are collected and stored very differently to POS transaction data)
  • retailers increasingly need data to be available for analysis as soon as it’s been collected
    (Near-real-time analytics is often a prerequisite for online businesses)

Structured Data

Most business applications, such as CRM and ERP systems, have always stored their information as structured data in a database.  Structured data is stored as rows in a table that has a predefined set of columns.

When these columns are created, they’re each configured to only be able to store a specific type of data.  For example, an address column may allow text whereas a sale value column may only allow a decimal number.  This gives them the rigour and reliability retailers need when they store their most valuable data.
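As a rough illustration of typed columns, the sketch below uses SQLite, chosen only because it ships with Python. The table and column names are hypothetical. SQLite is more lenient about types than most transactional databases, so an explicit CHECK constraint is added here to mimic the strictness described above.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        id         INTEGER PRIMARY KEY,
        address    TEXT NOT NULL,
        sale_value REAL NOT NULL CHECK (typeof(sale_value) = 'real')
    )
    """
)

def try_insert(address, sale_value):
    """Return True if the row is accepted, False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO sales (address, sale_value) VALUES (?, ?)",
            (address, sale_value),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("1 High Street", 19.99)              # a decimal number
rejected = try_insert("2 High Street", "nineteen pounds")  # text in a number column
```

The second insert is refused by the database itself, so the bad value never reaches the table, whichever application tried to write it.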

Transactional database systems, which store structured data, are usually the revenue generating engines and business support systems for retailers.  They typically store information about a specific stock asset, financial transaction, or customer interaction as a single row in a table, which is why they’re often called a “system of record”.

While a transactional application’s data can be analysed using analytics queries in its mission critical database, it’s often not the most efficient way.  Analytics queries usually process large amounts of data and need to use large amounts of server resources, meaning they can easily affect the performance of business critical and revenue generating systems.

It’s common therefore for retailers to maintain a data warehouse database alongside their transactional systems.  Data warehouses are databases specifically designed to store large amounts of historical, structured, and transactional data in a way that makes it easy to analyse.

Data warehouses are also kept separate from their transactional source systems, meaning the two can’t interfere with each other’s performance. They therefore tend to be an important, albeit sometimes expensive, part of an analytics platform.
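The separation can be sketched with two in-memory SQLite databases standing in for the transactional system and the data warehouse. All table names, stores and figures are invented for illustration. In practice the copy step would be a scheduled load job rather than a single function call.

```python
import sqlite3

# Two separate databases stand in for the two systems.
# Table names and figures are made up for illustration.
transactional = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

transactional.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, order_date TEXT, store TEXT, amount REAL)"
)
transactional.executemany(
    "INSERT INTO orders (order_date, store, amount) VALUES (?, ?, ?)",
    [
        ("2024-01-06", "Leeds", 20.0),
        ("2024-01-06", "Leeds", 35.0),
        ("2024-01-06", "York", 15.0),
        ("2024-01-07", "York", 50.0),
    ],
)

warehouse.execute(
    "CREATE TABLE fact_sales ("
    "order_id INTEGER PRIMARY KEY, order_date TEXT, store TEXT, amount REAL)"
)

def load_warehouse():
    """Copy every order across in one batch and return the number of rows copied."""
    rows = transactional.execute(
        "SELECT order_id, order_date, store, amount FROM orders"
    ).fetchall()
    warehouse.executemany(
        "INSERT OR REPLACE INTO fact_sales VALUES (?, ?, ?, ?)", rows
    )
    return len(rows)

loaded = load_warehouse()

# The analytical query runs against the copy, not the live system.
sales_by_store = warehouse.execute(
    "SELECT store, SUM(amount) FROM fact_sales GROUP BY store ORDER BY store"
).fetchall()
```

Only the short batch read touches the transactional database. The aggregation, which is the expensive part on real data volumes, runs entirely in the warehouse.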

Un-structured Data / Big Data

The biggest benefit of structured data systems, the predefined data structures that give them their operational robustness, can also be their biggest weakness.

Not all the data retailers need to collect, store and analyse has a consistent structure or justifies being stored in fast, reliable but relatively expensive transactional database systems.  This then is the domain of un-structured data, or as it’s more commonly known, ‘Big Data’, which is defined as data that is often:

  • too large in volume to be stored efficiently or cost effectively in a traditional database system
    (E.g. Web clickstream logs)
  • too varied in its format, such as free text entry data, to be stored and then analysed using traditional database technologies
    (E.g. Customer feedback comments)
  • too varied in its structure, such as having different types of data in the same column of each row, to be stored as structured data
    (E.g. Social media posts)
  • generated too quickly for traditional database systems to capture it
    (E.g. People tracking systems in stores)

Another distinguishing feature of big data is that unlike structured data, a single row of big data on its own may be worthless and only when it’s combined with thousands of other rows does it provide any value.  Imagine for example traffic flow data from people counters in a physical store.

Knowing the route an individual shopper took may be irrelevant, but knowing the most common route taken on a Saturday afternoon may be invaluable.
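The people counter example can be sketched in a few lines. The zone names and routes below are invented, and a real data set would hold thousands of rows rather than six. The point is that the value comes from the count, not from any single route.

```python
from collections import Counter

# Each tuple is the sequence of store zones one shopper passed through.
# Zone names and routes are invented for illustration.
saturday_routes = [
    ("entrance", "menswear", "tills"),
    ("entrance", "homeware", "cafe"),
    ("entrance", "menswear", "tills"),
    ("entrance", "menswear", "tills"),
    ("entrance", "homeware", "cafe"),
    ("entrance", "toys", "tills"),
]

# Count how many shoppers followed each distinct route.
route_counts = Counter(saturday_routes)

# The single most common route and the number of shoppers who took it.
most_common_route, shoppers = route_counts.most_common(1)[0]
```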

Part of a tweet stored in a more modern big data format

The Challenge of Combining Data

Having now seen the differences between structured data and big data, it’s possible to understand why traditional analytics systems are struggling to collect, store and analyse both these types of data.

The systems that have traditionally provided tables for querying, multi-dimensional cubes for slicing and dicing, and reports for reading about finance, sales or stock activity are now struggling to show how web traffic related to sales revenue, how a voucher mailshot affected store visitors or whether price increases generated negative sentiment on social media.

The Requirements of a Modern Retail Analytics Platform

When designing and deploying modern online retail analytics platforms, we have seen that retailers have three recurring requirements:

  • Being able to query:
    • Real-time transactional data in POS and ecommerce systems
    • Historic transactional data stored in a data warehouse
    • Big data storage services, such as Hadoop platforms
    • Social media platforms, such as Twitter
    • Web services, such as Google Analytics
  • Providing business power users, rather than technical developers, with the analytics tools that let them discover and explore their data, find relationships between data from different sources and then let them use these data models as the source for interactive dashboards
  • Providing store staff and back-office leadership teams with interactive dashboards that show near-real-time data on a tablet

For the retailers still using late-2000s static reporting platforms together with some web based analytics services and Excel to manually integrate all the data, this can sound like a complex and excessive set of asks.  The reality in our experience though is that these are now the norm.  Not every retailer we work with deploys every component, but most deploy most.

The cost of some of the analytics technologies themselves is also significantly lower than it was just a few years ago, meaning that once some upfront development work is done, the operating costs can sometimes be near zero.

Building a Modern Retail Analytics Platform

When building a modern online retail analytics platform, we recommend using a layered solution architecture, such as the one shown below.  This allows several individual technologies to operate independently of each other yet still allows their data to be integrated using data discovery tools.

A typical modern retail analytics platform solution


Data Sources

When building a layer of data sources, it’s important to play to an individual database product’s strengths but also recognise that innovation is happening in this space.  New features in existing database products are starting to blur the lines between what were previously very different technologies.

Operational database engines, which for a long time have been the engine rooms of retailers, now often ship with credible big data querying capabilities built into them. That said, big data platforms, such as those using Apache Hadoop technology, can often be a lot cheaper to deploy and operate than the premium database engines from the likes of Oracle or Microsoft.

A cheaper option is still often the preferred option even if it’s not the better technical option.  Some of the products most commonly found in this layer are: Oracle Database Server, Microsoft SQL Server, SQL Server Analysis Services, Apache Hadoop and Google Analytics.

Data Discovery

Of all the layers in this architecture, this one has seen the most innovation in recent years.  The availability now of data discovery tools that let a business power user visually manage their data has changed how business users see the analytics industry.

Previously, the complexity of finding relationships between data sources and creating logical data models meant it was often a struggle even for analytics developers, who regularly had to resort to writing code to get the job done.

Now, drag and drop graphical tools allow business power users familiar with their data to explore their data sets and find and map relationships between data sources.  Perhaps most importantly, this is the layer of the analytics platform where the different data source technologies can join and integrate.

With the right data discovery tool, such as Microsoft Excel, Power BI or Tableau Desktop, business users can treat structured data, big data and web service data as peers, without needing to know how the underlying data is stored.

The output from this tier is often a logical data model, essentially a list of the relationships between related fields in the different data sources, which when viewed present a single view of the customer or an end to end retail process.

A simple logical data model
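To make the idea concrete, here is a minimal sketch of a logical data model in plain Python. The three sources, their field names and their contents are hypothetical. Data discovery tools build the equivalent mapping graphically rather than in code.

```python
# Three hypothetical sources, each identifying the customer differently.
crm_customers = [
    {"customer_id": 1, "name": "A. Shopper", "email": "a@example.com"},
    {"customer_id": 2, "name": "B. Buyer", "email": "b@example.com"},
]
pos_sales = [
    {"cust_ref": 1, "amount": 40.0},
    {"cust_ref": 1, "amount": 10.0},
    {"cust_ref": 2, "amount": 25.0},
]
web_sessions = [
    {"login_email": "a@example.com", "pages_viewed": 12},
    {"login_email": "b@example.com", "pages_viewed": 3},
    {"login_email": "a@example.com", "pages_viewed": 5},
]

# The logical data model: which field in each source relates to which CRM field.
relationships = {
    "pos_sales": ("cust_ref", "customer_id"),
    "web_sessions": ("login_email", "email"),
}

def single_customer_view(customer):
    """Follow the relationships to gather one customer's data from each source."""
    pos_key, crm_id = relationships["pos_sales"]
    web_key, crm_email = relationships["web_sessions"]
    return {
        "name": customer["name"],
        "total_spend": sum(
            s["amount"] for s in pos_sales if s[pos_key] == customer[crm_id]
        ),
        "pages_viewed": sum(
            w["pages_viewed"]
            for w in web_sessions
            if w[web_key] == customer[crm_email]
        ),
    }

views = [single_customer_view(c) for c in crm_customers]
```

The `relationships` mapping is the model itself. The sources stay where they are and keep their own field names, yet each customer can be viewed across all three.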


Data Visualisation

Once created, logical data models are often analysed by users in data discovery tools, to “slice and dice” them, ask “what if” questions or find trends between multiple data sets over time.  More visible, though, is their use as the data source for data visualisations: dashboards, charts and reports.

Far from offering static reports, this layer of the solution provides interactive dashboards where users can see top-level KPIs then drill down through the sub-levels right through to the raw underlying data that turned an indicator from green to red, or a dial from 65% to 90%.

Tablet-based dashboards are no longer confined to a web browser and are a core capability in the enterprise space.  The leading vendors, Tableau and Microsoft, as well as smaller vendors, now provide their own iOS and Android apps which provide a rich user experience.

A data visualisation tool being used to create a dashboard



How up-to-date the data on those dashboards is depends on how often the underlying source systems can provide new data.  Directly querying an operational source system will always provide the most up-to-date data, but potentially puts an unwelcome load on a business-critical system.

The data warehouse that sits next to it, however, may only be updated once a day.  There’s no single right way to keep data latency as low as possible, but the trend we are seeing is that intra-day data warehouse refreshes are increasingly becoming the norm while the most time-sensitive data comes straight from the source systems.
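One way to picture this trade-off is as a routing decision per dashboard. The sketch below is an assumption about how such a rule might look, not a description of any particular product. The function name, source labels and times are all hypothetical.

```python
from datetime import datetime, timedelta

def choose_source(max_age, warehouse_loaded_at, now):
    """Use the warehouse if its last load is fresh enough, else the live system."""
    warehouse_age = now - warehouse_loaded_at
    return "data_warehouse" if warehouse_age <= max_age else "source_system"

# Hypothetical times: the warehouse was last refreshed three hours ago.
now = datetime(2024, 1, 6, 15, 0)
last_load = datetime(2024, 1, 6, 12, 0)

# A stock dashboard that tolerates five minutes of delay must query the source.
stock_dashboard = choose_source(timedelta(minutes=5), last_load, now)

# A weekly trends report that tolerates a day of delay can use the warehouse.
weekly_trends = choose_source(timedelta(hours=24), last_load, now)
```

More frequent warehouse refreshes shrink the set of dashboards that have to put load on the operational systems.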



The data collected and used by retailers today is very different to just a few years ago, but so are the capabilities of the analytics technologies that are commonly deployed.  Collecting big data is the norm yet integrating it with traditional reporting systems is regularly a challenge.

However, the cost of implementing a modern online retail analytics platform is becoming easier to justify. Leadership teams are now seeing the potential to treat data as an operational asset, rather than just a tool to help them reflect on past performance. 

Takeaways

  • Structured data systems, such as CRM, ERP and POS systems, are an established and mature capability that is likely to remain the revenue generating engine for many retailers for a while yet
  • Find a competitive advantage with big data analytics systems, such as clickstream analysis
  • Analyse structured data alongside big data: the new generation of data discovery tools makes it a lot easier than it used to be
  • Use data visualisation technologies to offer rich and interactive dashboards on mobile devices
