Powering BI on big data across retail, healthcare, telco, financial services and online commerce
One of America's oldest and leading retailers generates close to 60% of its yearly revenue during the 6-week holiday season. Its digital business, worth $4 billion and growing 11% annually, depends heavily on a strong paid search presence between October 30 and December 31. Paid search over this 6-week period involves hundreds of thousands of keywords for online advertising, costs $100 to $1,000 per term, and generates millions to billions of rows of online activity data.
A digital pioneer, this retail giant saw the value of big data early and funneled ad and paid keyword data into a Hadoop cluster to help inform the keyword bidding process. Traditional analytics tools didn't perform as hoped directly against the big data, so a multi-step process emerged: the team would ETL (extract, transform, and load) the data into a data mart that tools like Tableau, Excel, and Cognos could query faster. Analysts would then analyze keyword performance and adjust bidding decisions based on the results.
This multi-step process kept analysts from acting quickly on expense-saving and revenue-generating opportunities. It could take hours to learn that a $100 term was outperforming a $2,000 term, and those hours cost millions in wasted spend and missed revenue. The issue wasn't the data, but the delays in access caused by data movement. The company saw a potential competitive advantage if its BI professionals could analyze data as it landed in Hadoop, leading it to consider the AtScale Intelligence Platform.
With AtScale, they eliminated the need to ETL data out of Hadoop for analysis. Analysts query and analyze paid keyword activity in Tableau, Excel, or Cognos as the data lands in the big data cluster. Despite querying massive and growing data generated by keywords clicked (or not) by would-be online buyers, analysts get insights as fast as, or faster than, with the old data warehouse. AtScale's machine-learning smart aggregates, which self-adjust based on query activity, mean marketing analysts identify keyword opportunities and make bidding changes within minutes, not hours, of a keyword being clicked.
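The self-adjusting aggregate idea can be sketched in a few lines. This is a toy illustration of the general technique (watch query activity, then materialize the rollups that are queried often), not AtScale's actual implementation; the class name, threshold, and data shapes are assumptions for the example.

```python
from collections import Counter

class AdaptiveAggregator:
    """Toy sketch: materialize a pre-aggregated rollup for a dimension
    combination once it has been queried often enough."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # queries before a rollup is materialized
        self.query_counts = Counter()   # dimension combo -> times queried
        self.aggregates = {}            # dimension combo -> cached rollup

    def query(self, rows, dims, measure):
        key = tuple(sorted(dims))
        self.query_counts[key] += 1
        if key in self.aggregates:      # hot combo: serve from the cached rollup
            return self.aggregates[key]
        result = {}
        for row in rows:                # cold combo: aggregate from detail rows
            group = tuple(row[d] for d in key)
            result[group] = result.get(group, 0) + row[measure]
        if self.query_counts[key] >= self.threshold:
            self.aggregates[key] = result   # "self-adjust": cache the hot rollup
        return result
```

In a real system the rollup would be written back to the cluster as an aggregate table and kept fresh as new data lands; the sketch only shows the query-driven decision of what to materialize.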
By coupling AtScale with its big data investments and existing BI tools, this leading retailer delivers better online customer experiences and drives value for the company.
"Adaptive aggregates is possibly one of the most meaningful breakthroughs in this space. We put the AtScale Adaptive Cache technology through a test on our 57 billion rows of data. The results were literally 10 – 20 times faster."
- Richard Langlois, Director, Enterprise Data
Consumers may remember this search and advertising legend as the large books dropped on doorsteps. It has since transformed into a digital business that delivers consumer value online and drives revenue through 200K+ advertisers and 400M+ digital visits per year.
To prove ROI and drive ad renewals, marketing analysts needed to deliver on-demand insight into ad performance, but queries and reports ran slowly directly on the Hadoop system. IT added a BI-ready data mart to aid self-service, but that introduced a delay in analysts' access to the data.
Tableau-wielding analysts began circumventing IT to create their own extracts, introducing governance issues in the process. IT grew frustrated with limited adoption of a big data investment that held huge value potential. Marketing and other business groups grew frustrated, unable to run key ad-activity reports, and advertisers grew frustrated with limited and delayed insight into ad results. Online activity, and the data with it, kept growing. Executives, IT, analysts, and clients agreed: something had to change.
Capitalizing on its existing investment in Hadoop and Tableau, and adding AtScale, the ad legend delivered analytic dashboards that give advertisers on-demand insight into their own ad activity data as it lands in Hadoop. Simultaneously, internal analysts run Tableau reports live on the same big data, empowering them to catch low-performing ads and recommend changes that help advertisers achieve maximum value.
Through secure, consistent, and immediate ad insights, this advertising veteran turned digital advertiser has improved customer service for advertisers, improved the consumer experience for those advertisers' customers, and driven corporate value by realizing ROI on existing big data and BI investments. Ultimately, it improves the bottom line, saving expense and generating revenue, for all involved.
"AtScale’s no-ETL and no-data movement approach is simply a gamechanger. This application should be required for anyone who wants to do BI on Hadoop."
- Kevin Johnson, CEO, Ebates
Processing close to 25% of all US credit card transactions, this global financial services leader has more than 120 million customers worldwide. To better anticipate and identify fraudulent activity, the company sought to analyze credit card activity in real time, but its traditional multi-hop data architecture could no longer keep up with such demands.
Historically, to analyze risk- and fraud-related activity across regions, vendors, products, days, periods, currencies, and other dimensions and measures, analytics professionals had to move anonymized credit card activity data into multidimensional cubes on a regular basis. The process entailed moving raw card data into Hadoop, processing it via ETL into a data warehouse, structuring the data via SQL Server, and updating a cube via Microsoft Analysis Services. This multi-step process not only introduced recurring cube rebuilds of four days or more, but also obstructed analysts' ability to identify and get ahead of fraudulent trends.
Achieving that objective meant analyzing transaction-level card data as it was collected, which in turn meant all the data had to be in the same place at the same time; enter Hadoop plus AtScale. Now analysts access and query the data where it lies in Hadoop, analyzing multiple years of credit card activity side by side rather than the single quarter that fit in a cube. When fraudulent or risky activity is identified, drilling to credit card transaction detail is immediate, versus filing an IT ticket for transaction detail and waiting hours or days. By removing the need to move data three times and wait four days for cube rebuilds, they can pinpoint fraudulent transactions almost immediately after they occur, and begin to use that information to predict and prevent future fraud.
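The contrast between a pre-aggregated cube and querying data in place comes down to where the detail rows live. The sketch below uses an in-memory SQLite database as a stand-in for a single shared store: the rollup that surfaces an outlier and the drill-down to transaction detail are answered from the same place, with no second system to wait on. The schema and values are made up for illustration.

```python
import sqlite3

# Stand-in for "all the data in one place": one store holds detail rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txns (card_id TEXT, region TEXT, amount REAL)")
conn.executemany("INSERT INTO txns VALUES (?, ?, ?)", [
    ("c1", "EU", 40.0), ("c1", "EU", 9000.0), ("c2", "US", 15.0),
])

# Rollup query: spot an outlier region, as a cube would.
rollup = conn.execute(
    "SELECT region, SUM(amount) FROM txns GROUP BY region ORDER BY region"
).fetchall()

# Drill-down on the same store, immediately: no data movement, no ticket.
detail = conn.execute(
    "SELECT card_id, amount FROM txns WHERE region = 'EU' AND amount > 1000"
).fetchall()
```

In the cube-based flow, only the rollup survives the pipeline, so the second query would have required a request back to the source systems.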
In addition to faster risk and fraud analysis, eliminating repetitive data moves means they can better track and prove data lineage. Rather than risking data interference through ETL and data movement, they confidently accommodate regulators by quickly identifying the when, where, and who behind the life of the data, from transaction to analyzed and reported results.
With real-time, complete, governed, self-service access to all credit card data in Hadoop, this global financial services company can better track and update its fraud-protection algorithms, delivering consistent protection and service to customers worldwide.
This leading digital eCommerce company provides a fast and reliable online experience through which customers receive digital cashback offers from retailers across the U.S. Working with 3,000+ merchants, it needed to deliver daily transaction and event-tracking reports across 5TB of usage data (growing to 30TB in 2017) and nearly $4M in cashback transactions per year. Retail merchant decision-makers depend on these reports to adjust offers to daily fluctuations in market demand.
Report creation entailed moving all online activity, offer redemption, and other data into Hadoop. Next came ETL (extract, transform, load) processes, turning unstructured data into structured data within SQL Server. Finally, the data landed in three separate SQL Server Analysis Services (OLAP) cubes (marketing acquisition, shopping, and merchant); there was so much data that it couldn't fit into a single cube.
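The pipeline above can be sketched as two stages: an ETL step that shapes raw activity events into structured rows, and a load step that splits those rows across the three subject-area cubes. This is a toy illustration only; the field names and routing rules are assumptions, not the company's actual schema.

```python
# Toy sketch of the flow described above: structure via ETL, then split
# across three subject-area cubes because no single cube could hold it all.

def etl_structure(raw_events):
    # Extract/transform: shape semi-structured events into uniform rows.
    return [{"subject": e["subject"], "merchant": e["merchant"],
             "cashback": float(e.get("cashback", 0.0))} for e in raw_events]

def load_into_cubes(rows):
    # Load: route each row to its subject-area cube and keep a running
    # cashback rollup per merchant inside each cube.
    cubes = {"marketing": {}, "shopping": {}, "merchant": {}}
    for r in rows:
        cube = cubes[r["subject"]]
        cube[r["merchant"]] = cube.get(r["merchant"], 0.0) + r["cashback"]
    return cubes
```

Every refresh re-runs both stages over the full data set, which is why the daily rebuild eventually stretched past 24 hours as volume grew.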
Data grew at such a pace and volume that delivering the latest day's data via data and cube refreshes began to take more than 24 hours. Not only did that exceed the reporting SLAs for retail clients, but the insights revealed were in some cases no longer relevant; consumer desires and demands fluctuate quickly in the online offer market. To keep pace with consumer behavior and merchant insight needs, this eCommerce leader recognized that eliminating multiple data writes and data movement was key to faster analytics and reporting.
With AtScale, they capitalized on their Hadoop and BI (Tableau and Excel) investments, both tools and skills. Analytics professionals can now run Tableau and Excel queries without IT moving data out of Hadoop into a relational database or cube. AtScale's single semantic layer adapts aggregates in response to user queries, so Tableau queries and reports respond at the interactive pace analysis requires. For the first time, analysts can analyze current and historical usage data, not just the subsets pulled into cubes, and they can do it immediately. They are now finding trends, outliers, and opportunities they never saw before.
By making analytics faster on data that never has to leave Hadoop, this digital cashback leader supports more offers that better match end customers' cashback demands. It delivers better service to retail clients and a better end-customer experience that truly differentiates it from the competition.