RYAX APPLIED TO RECOMMENDATION ENGINES

The right recommendations to grow your business

Thriving businesses usually have one thing in common: they know all about their customers' habits. Harnessing this knowledge to generate relevant and appropriate product suggestions can lead to tremendous growth opportunities. With the help of automated recommendation engines, retailers can systemize this approach and establish new sustainable revenue streams.

What we deliver for your recommendation projects:

  • Build new profitable business models

    Advanced high-value recommendations to open new profitable endeavours.

  • With top-of-the-line recommendation engines

    AI/ML/DL/Neural data science models to serve the right service/product to your customers.

  • Automating data prep, engine training and runs

    Self-running pipelines providing continuous performance improvement.

  • Delivered in days, not months

    Our community of experts leverages Ryax's platform to build better recommendation engines, faster.

Recommendation use-case

We're considering a use-case that is quite common across product and service vendors. We'll take advantage of existing datasets within the vendor's environment to build a self-running recommendation pipeline that improves over time.


Our typical pipeline ingests as much relevant data as possible to feed the recommendation engine. Interesting data sources include: products/services catalog, products/services content & assets, customer data & profiles, as well as any dataset sourced from external gateways such as social networks, open weather databases, etc.

The pipeline will automatically connect to these sources when available, clean & transform datasets and feed them to the engine. The recommendation algorithms will then generate customer-specific suggestion lists and deliver them to a target system (very often: a website or a CRM solution).
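
To make the flow concrete, here is a minimal, self-contained Python sketch of these stages. Every function, name and value below is illustrative, not an actual Ryax module:

def collect_sources():
    # In the real workflow, each source is a dedicated Ryax module.
    return {
        "catalog": [{"id": "p1", "name": "Product 1"}],
        "customers": [{"id": "c1", "history": ["p1"]}],
    }

def clean_and_transform(raw):
    # Data preparation: harmonize formats, drop invalid records, etc.
    return raw

def recommend(data):
    # Stand-in for the engine: one ranked suggestion list per customer.
    return {customer["id"]: ["p1"] for customer in data["customers"]}

def deliver(recommendations, target="website"):
    # Publisher step: push results to a website, CRM, database, etc.
    print(f"Delivered to {target}: {recommendations}")

deliver(recommend(clean_and_transform(collect_sources())))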

Here are the main goals we identified for such a project:

  • Allow vendors to substantially increase the average spend per customer by triggering preference-based purchases

  • Foster customer loyalty by suggesting surprisingly relevant offers

  • Increase customer knowledge by closely monitoring customer response to served recommendations

  • Allow vendor's business analysts to easily manipulate the results of their recommendation pipelines

  • Free up IT teams from hours of development by fully automating Machine Learning pipelines from end to end

Let's now dive a bit deeper into what these recommendation pipelines are made of.

1. Automating the collection of business data sources

One of the first challenges in establishing recommendation pipelines is getting continuous and reliable access to relevant data sources. In our situation, we'll make use of two different types of data inputs:
  • Internal data: generated & operated by the vendor, they include customer data, product & service catalogs, etc.

  • External data: generated outside the vendor's environment, they can be accessed via dedicated gateways (most often APIs) and used to refine recommendations

There's currently no way for the vendor to automatically collect and aggregate information from these various data sources into a single vantage point.
Here is how to do it with Ryax: each data entry point is accessed using a Ryax module (either built-in or, in the case of very vendor-specific data sources, developed for this purpose). Each of these modules is set up with the right credentials to either query or listen to a data source; these data flows can then be used in any downstream step of a workflow. At this point, here is how the workflow looks within the Ryax platform:
NB: for a detailed walkthrough of the Ryax platform and step-by-step guide on how to build such a workflow from scratch, see our dedicated resources here.
For each data collection point a specific Ryax module was reused (either from the client's module store or from Ryax's generic module store). A few new domain-specific modules had to be developed from scratch because of their technical distinctiveness.
Setting up this first part of the workflow allows for complete data collection automation: once the workflow is deployed, it is able to query the relevant data from these sources.
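For illustration, here is what a minimal data-collection step could look like in Python. The handler name, signature and API details are assumptions made for this sketch, not the documented Ryax module interface:

import json
import urllib.request

def handler(api_url: str, api_token: str) -> dict:
    # Hypothetical module querying an external API gateway with the
    # credentials it was configured with.
    request = urllib.request.Request(
        api_url, headers={"Authorization": f"Bearer {api_token}"}
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The returned value becomes available to downstream workflow steps.
    return {"dataset": payload}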
The work taken off the vendor's hands goes beyond data collection itself. It also covers the whole range of events that regularly occur when dealing with IT and electronic systems. For example:
  • If for some reason a database goes offline

  • If some file systems become unresponsive

  • If queries' output data formats change

  • If an externally sourced dataset becomes inconsistent

  • ...etc.

When such things happen, Ryax puts the process on hold and alerts users so they can focus their investigation on where the problem actually lies, instead of spending hours blindly debugging a custom script.

2. Automating data preparation & standardization

Now that our access to data sources is secured and reliable, we'll cover the process of data preparation. In most contexts, two separate steps are considered:
  • Data cleansing: to check and correct possible defects, deviations and inconsistencies in data sources

  • Data normalization: to harmonize data formats coming from heterogeneous sets

Data normalization will vary depending on the client's environment, and is usually implemented as a best practice to strengthen the workflow's robustness. It ensures that data is ingested in a reliable way and minimizes risks of failure for computations occurring downstream. This step of the process was built as a client-reusable Python module in Ryax.
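As an illustration, a cleansing-plus-normalization step of this kind could look like the following sketch (the column names are hypothetical):

import pandas as pd

def normalize_customers(df: pd.DataFrame) -> pd.DataFrame:
    # Cleansing: remove duplicates and rows missing a customer identifier.
    df = df.drop_duplicates(subset="customer_id")
    df = df.dropna(subset=["customer_id"])
    # Normalization: harmonize formats coming from heterogeneous sources.
    df["country"] = df["country"].str.strip().str.upper()
    df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")
    return df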
Here is the state of our workflow in the Ryax platform so far:
NB: for more details on how to build custom reusable modules with Ryax, see our dedicated resources here.
In addition to this process, we added a generic module to store the cleaned raw data before it is fed to the recommendation pipeline. This serves archiving purposes, and was done using a standard Ryax module that simply had to be configured.
This second part of our newly-built workflow fully automates the data preparation and normalization process so the rest of the automation runs smoothly.
Along with data prep automation, Ryax also frees vendors from having to handle temperamental Excel macros and other related troubles:
  • No more debugging scripts the user did not write themselves

  • No more struggling to get the right datasets in the right place, in the right format, with the right filename, etc.

  • No more errors due to intermittent data format changes

  • ...etc.


3. Automating recommendation training & computation

Now that our data has been ingested and its quality is guaranteed, we'll be able to use it to train our recommendation model. This step is the most critical, since the whole pipeline relies on a savvy algorithm to generate appropriate recommendations for the vendor's clients.
In Ryax, this part of the workflow takes as input the pre-cleaned / pre-normalized datasets stored on AWS, and uses them to feed the neural network model the Ryax Community developed previously. Our Machine Learning model uses an unsupervised learning system, meaning that the machine correlates customers and product offers by itself, with no prior examples to rely on. This is useful for a vendor that has no past experience with customer / offer matching, as well as for vendors seeking to detect hard-to-imagine associations.
As a general good practice, we'll split the training part of the Ryax workflow into 3 distinct steps. This clearly separates their functional scopes, making it easier for end users to understand what each step does, and also produces clearer log messages for better overall observability of the workflow.
  • Gather training data is a generic Python module, simply querying AWS for the latest training dataset

  • Model training is a client-reusable custom Python module, triggering a learning process upon ingested datasets

  • Upload model data is a generic Python reusable module, pushing datasets to a configurable AWS storage system.

Only one of these modules was developed for the needs of this project (the "training module"); the others are generic and available in the Ryax store for further use in any kind of workflow. The training module will remain proprietary and can be reused by the client only.
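The actual model is the neural network mentioned above; as a simple, self-contained stand-in for the unsupervised idea, the sketch below factorizes a customer x offer interaction matrix so that related customers and offers end up close in a latent space. All data, file names and parameters are illustrative:

import numpy as np
import joblib
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD

# One row per customer, one column per offer; values are interaction counts.
interactions = csr_matrix(np.array([
    [3, 0, 1, 0],
    [0, 2, 0, 1],
    [1, 0, 4, 0],
]))

# Unsupervised training: no labeled customer / offer pairs are needed.
model = TruncatedSVD(n_components=2, random_state=0)
model.fit(interactions)

# "Upload model data": persist the trained model so the inference leg can
# retrieve it (the actual push to AWS is a separate generic Ryax module).
joblib.dump(model, "recommendation_model.joblib")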
A glimpse of our workflow at this stage:
The frequency at which the algorithm needs to be trained depends on the vendor's objectives and data volumes. Often, new training is triggered when input datasets have reached a given threshold of newly-available data. This ensures that:
  • The algorithm keeps on improving over time, delivering the most relevant results possible

  • The model stays updated on fresh data, reducing the risk of irrelevant correlations

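A threshold-based trigger of this kind can be as simple as the following sketch (the threshold value and names are illustrative):

NEW_ROWS_THRESHOLD = 10_000  # hypothetical volume of fresh interaction data

def should_retrain(rows_since_last_training: int) -> bool:
    # Retrain only once enough new data has accumulated since the last run.
    return rows_since_last_training >= NEW_ROWS_THRESHOLD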
Now that we have a trained algorithm ready to be used, let's execute it:
This step consists of a separate leg of the recommendation pipeline: it is indeed often preferable to separate "training data" from "inference data", since some vendors may want to use 'frozen', dedicated high-quality datasets for training purposes only, while using 'live' operating data for day-to-day algorithm executions.
To execute the latest-trained model upon any given dataset and generate lists of all customer / offer recommendations, we'll use a 3-step workflow leg:
  • Retrieve latest recommendation model: a generic Python module pulling the algorithm's data from a given AWS storage

  • Model inference: a client-specific reusable module that runs the algorithm on the given input dataset and generates a ranked list of customer / offer recommendations (see the sketch after this list)

  • Store recommendations: a generic Python module pushing the result list to an SQL database

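Continuing the illustrative sketch from the training step (the model file and data are hypothetical, and sqlite3 stands in for the client's SQL database to keep the example self-contained), the three steps of this leg could look as follows:

import sqlite3
import numpy as np
import joblib

# Retrieve latest recommendation model (pulled from storage in the real flow).
model = joblib.load("recommendation_model.joblib")

# Model inference: project a live customer's interactions into the latent
# space and score every offer, producing a ranked recommendation list.
live_interactions = np.array([[2, 0, 0, 1]])
scores = model.transform(live_interactions) @ model.components_
ranking = np.argsort(scores[0])[::-1]

# Store recommendations: push the ranked list to an SQL database.
db = sqlite3.connect("recommendations.db")
db.execute("CREATE TABLE IF NOT EXISTS reco (customer_id TEXT, offer_id INTEGER, rank INTEGER)")
db.executemany(
    "INSERT INTO reco VALUES (?, ?, ?)",
    [("c1", int(offer), rank) for rank, offer in enumerate(ranking)],
)
db.commit()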
Here is a big-picture view of our workflow so far:
This part of our workflow completely automates the process of training algorithms, and computing operating data to generate recommendation lists:
  • Automating upstream steps of this workflow now enables robust processing of consistent and reliable datasets.

  • If one step of the workflow fails at some point, Ryax will point directly at it so the vendor knows where the issue is.

  • Intermediate results are available at any step for the users to check.

  • ...etc.

4. Automating recommendation delivery

The last - and very important - part of our automation use-case consists of serving the recommendation results to the customer. There are different ways to do so:
  • Asynchronous delivery, where recommendations are served to customers via email, postal mail, scheduled app notifications, etc.

  • Live delivery, where recommendations are delivered to customers while they interact with a platform, using instant channels such as website banners, forms, chatbots, in-basket suggestions, etc.

For this use-case we will consider the most "time-constrained" delivery system: live deliveries on a retail website. In this context, time-to-recommendation is critical since vendors need to serve the right suggestion at the right time to maximize customer adoption.
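To keep time-to-recommendation low, live delivery typically serves lists that were precomputed by the workflow, so the website only performs a fast lookup at request time. A minimal sketch, continuing the SQL example above:

import sqlite3

def get_recommendations(customer_id: str, limit: int = 5) -> list:
    # Fast lookup of precomputed, ranked recommendations for one customer.
    db = sqlite3.connect("recommendations.db")
    rows = db.execute(
        "SELECT offer_id FROM reco WHERE customer_id = ? ORDER BY rank LIMIT ?",
        (customer_id, limit),
    ).fetchall()
    return [offer_id for (offer_id,) in rows]

# e.g. a website banner or in-basket widget calls get_recommendations("c1")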
In the Ryax platform, modules that are used to send data at the end of a workflow branch are called 'publishers'. For this use-case we rely on three of them:
  • Target API module: modern retail websites and CRM systems often come with their own API layer Ryax can plug into to send data. This module is a client-reusable publisher specific to the target system.

  • File-share module: the same as the one we used for data collection, duplicated and reconfigured to send results in a specific shared folder.

  • SQL database module: the same as the one we used for previous database SQL queries, duplicated and reconfigured.

Here's how it looks now in the Ryax platform:
NB: to learn more about how Ryax can interconnect to existing IT ecosystems, see our dedicated resources here.
This last data-publishing step puts the workflow in full end-to-end automation mode. Once a new execution is triggered, the whole workflow runs automatically until the results are sent to the 3 destinations and the recommendations are served to customers.
At any step of the workflow, users can access intermediate data inputs/outputs, read the log statuses and interrupt the workflow if necessary.
To wrap up the achievements of this recommendation use-case:

"Having my workflow automated in the Ryax platform, I can now handle 40 samples in roughly 4 hours, instead of 25 samples in a good day"

"In some occurrences I could spend up to 5 days on this experiment, accounting for all the various queries to be made, the tedious file management, and all the waiting in between script runs. Now I just trigger it and let it run by itself. I'm usually done in less than an hour."

"There was always something wrong happening with the Excel Macro automations we had in place in the past, and I often had to spend more time sorting things out instead on making actual decisions with my experiments' results. Now I can focus on the output."

verbatim-banner-framework1400

'After' situation

The vendor has come all the way from a sloppy, scattered scripting environment to a centralized data analysis framework that allows for actionable, reusable and robust automations.

Our references in retail, art and hospitality

  • posaly

  • circle-arts

Read about other Ryax use cases

  • Industry

    Ryax offers a data engineering platform that facilitates the creation and deployment of Industry 4.0 data analytics workflows, from industrial automation to predictive maintenance.

  • Mobility

    Ryax addresses mobility challenges through its data engineering platform by enabling a seamless development, deployment and monitoring of workflows in hybrid edge-cloud computational environments.

  • Smart Agriculture

    Ryax is a software platform that addresses agricultural issues, allowing data scientists to create, deploy and manage data analytics workflows simply, by abstracting away the complex data engineering plumbing.

  • Pharmaceuticals

    Thanks to its abilities to orchestrate complex data processing over distributed infrastructures, Ryax can seamlessly address lab automation projects, AI-powered nano-molecules research or drug discovery endeavours using machine learning.

Ryax tackles new use cases every day.

Tell us about your projects.