Lab automation use-case
Data collection, transfer, preparation, processing and publishing in a pharma laboratory form, by definition, a complex set of tasks that lab operators easily get entangled in. This challenge must not be understated, and it can be addressed far more efficiently with Lab Automation approaches.
Very often, laboratory teams have taken it upon themselves to lighten their daily workload by automating bits of their workflows with the tools they already have: Excel Macros, VBA and Python scripts, for example. But these commendable local automation efforts come at the price of manually developing and maintaining hundreds of heterogeneous custom scripts. Even in low-turnover teams, this situation can consume a tremendous amount of time.
Here are the main goals we identified for such an endeavour:
- Allow lab operators to multiply the number of experiments they can safely conduct in a week
- Free lab scientists from the need to maintain tens of scripts and workbooks so they can focus on research
- Allow scientists to easily automate any lab workflow by themselves from end to end, using a module-based framework
- Allow lab teams to collaborate, share and reuse modules and workflows across sites
- Increase scientists' workflow quality by reducing human-related data manipulation errors
How can Lab Automation with Ryax cut the time a Lab Operator spends on lab experiments? How can we help them conduct X times more experiments in a single working day? How can we free up their time so they can focus on actual science (and not scripts)? We'll follow the implementation path of a rather complex laboratory experiment automation using Ryax. Each step of the way keeps the automation's end user in mind: the Lab Scientist.
1. Automating the collection of experiment data sources
One of the first obstacles to lab automation is getting a continuous and reliable access to experiment data sources. In our situation, we'll make use of the following data inputs:
- Upstream instrument data: mass spectrometers data sets.
- Team-scale archive databases: storing data from previous experiments.
- Team-scale reference databases: storing calibration values needed to normalise incoming raw data.
- Custom experiment parameters files: (mostly .xls and .csv workbooks) edited by lab technicians.
Currently scientists need to manually recover the spectrometer files from a local PC, manually query both databases and download corresponding .csv files, edit their experiment workbook and put all these inputs in the same folder for the Excel Macro to work.
To speed up and harden this tedious collection process (which accounts for hours of technicians' time every day), here are the changes brought by Ryax:
- Upstream instrument data output is redirected to a secure file share where Ryax can collect it directly
- Archive databases are queried on-the-fly by Ryax
- Reference databases are queried on-the-fly by Ryax
- Custom experiment parameters files can be dropped directly into Ryax by lab technicians
This last manual action (dropping a parameter file) can be further automated in the future in different ways, we'll discuss it later on. At this point, here is how the workflow looks within the Ryax platform:
For a detailed walkthrough of the Ryax platform and step-by-step guide on how to build such a workflow from scratch, see our dedicated resources here.
For each data collection point, a specific Ryax module was reused (either from the client's module store or from Ryax's generic module store). Only truly domain-specific modules had to be developed from scratch: in this case, just the "Reference DB query" module, because of the legacy proprietary format of the client's on-prem database.
To sum up:
- Instrument data is collected using a standard Ryax module that was simply configured (file-share-watcher)
- Archive databases queries are made with a standard Ryax SQL module that was simply configured
- Reference databases queries are made with a new reusable client module
- Parameters files are collected using a standard Ryax form module (with a file-drop field)
Setting up this first part of the workflow allows for complete data collection automation: once the workflow is up and running, scientists will be able to trigger a new experiment simply by inputting their configuration workbook. All other data queries and transfers are handled by the Ryax platform.
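Conceptually, the file-share-watcher trigger amounts to a polling loop over a shared folder. The sketch below is purely illustrative (the real Ryax module is configured through the platform, not hand-written); the folder layout, file extension and bookkeeping are assumptions:

```python
from pathlib import Path

def detect_new_files(share_dir: str, seen: set) -> list:
    """Return instrument data files that have not been processed yet.

    `seen` is a set of already-handled filenames kept between polls.
    """
    new_files = [p for p in sorted(Path(share_dir).glob("*.csv"))
                 if p.name not in seen]
    seen.update(p.name for p in new_files)  # remember them for the next poll
    return new_files
```

Each new file would then trigger a fresh workflow execution downstream.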
And the work taken off operators' hands goes beyond data collection itself. It also covers the whole range of incidents that regularly occur when dealing with IT and electronic systems. For example:
- If for some reason a database goes offline
- If the file share becomes unresponsive
- If queries' output data formats change
- If a parameter field is missing or incorrect in the triggering workbooks
- ... etc
Should any of these happen, Ryax will put the process on hold and alert users so they can focus their investigation on where the problem actually lies, instead of spending hours blindly debugging a custom script.
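The pause-and-alert behaviour can be pictured with a toy wrapper around a workflow step. This is a minimal sketch of the idea, not Ryax's actual mechanism (which handles this natively); the alert list stands in for the platform's notification system:

```python
def run_step(step, payload, alerts: list):
    """Run one workflow step; on failure, hold and record an alert
    instead of silently corrupting downstream data."""
    try:
        return {"status": "ok", "result": step(payload)}
    except Exception as exc:  # e.g. DB offline, unresponsive share, bad format
        alerts.append(f"{step.__name__} failed: {exc}")  # surfaced to the user
        return {"status": "on_hold", "result": None}
```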
2. Automating experiment data preparation & standardization
Now that our access to data sources is secured and reliable, we'll cover the process of data preparation. In our client's context, two distinct steps are considered:
- Data cleansing: to check and correct possible defects, deviations and inconsistencies in data sources
- Data normalization: to harmonize data formats coming from heterogeneous lab instruments and bases
Data normalization was actually implemented for the first time along with this automation project, as a long-wanted best practice that was previously absent because scientists lacked the time and resources to build and maintain such scripts. This step of the process was built as a client-reusable Python module in Ryax.
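Normalization boils down to rescaling raw readings with calibration values pulled from the reference database. The sketch below is an assumption about what such a module could look like; the field names (`instrument`, `value`) are hypothetical:

```python
def normalize_readings(readings: list, calibration: dict) -> list:
    """Scale raw instrument readings with per-instrument calibration
    factors fetched from the reference database.

    Field names here are illustrative, not the client's actual schema.
    """
    normalized = []
    for r in readings:
        factor = calibration[r["instrument"]]  # a missing key would put the workflow on hold
        normalized.append({**r, "value": r["value"] * factor})
    return normalized
```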
Before Lab Automation, data cleansing was carried out via a custom Excel Macro reproducing the human steps of moving, clearing and reorganizing cells across several workbooks. The resources allocated to the project allowed Ryax to build a client-reusable Python module achieving the same goal in a more robust, faster and IT-compliant way.
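A cleansing module of this kind typically trims stray whitespace, drops empty rows and harmonizes number formats. The following is a minimal sketch of that idea, assuming decimal-comma cells as one of the defects; the client's actual module covers many more cases:

```python
def _to_number(cell: str):
    """Convert '3,14'-style decimal-comma cells to float, else keep the text."""
    try:
        return float(cell.replace(",", "."))
    except ValueError:
        return cell

def cleanse_rows(rows: list) -> list:
    """Reproduce the old Excel-macro cleanup: trim whitespace,
    drop fully empty rows and normalise decimal commas."""
    cleaned = []
    for row in rows:
        cells = [c.strip() for c in row]
        if any(cells):                      # keep only non-empty rows
            cleaned.append([_to_number(c) for c in cells])
    return cleaned
```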
Here is the state of our workflow in the Ryax platform so far:
NB: for more details on how to build custom reusable modules with Ryax, see our dedicated resources here.
In compliance with the lab's rules, we also added a generic module to push cleansed raw data from the spectrometer into the client's SQL database, for archiving purposes. This was done using a standard Ryax module that simply had to be configured.
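The archiving step is a plain database append. A minimal sketch using SQLite as a stand-in (the client's database, table and column names are assumptions for illustration):

```python
import sqlite3

def archive_clean_data(db_path: str, rows: list) -> int:
    """Append cleansed spectrometer rows to an archive table and
    return the total row count. Table/column names are illustrative."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clean_spectra (sample TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO clean_spectra VALUES (?, ?)", rows)
    conn.commit()
    total = conn.execute("SELECT COUNT(*) FROM clean_spectra").fetchone()[0]
    conn.close()
    return total
```

In Ryax, the equivalent standard SQL module only needs connection parameters, not code.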
To sum up:
- The cleansing module is a custom-built, client-reusable Python module taking raw data from the spectrometer and cleaning it (ported from the former Excel Macros)
- The normalise module is a Python custom-built client-reusable module normalizing raw data from instruments based on standardization keys taken from a client database
- The store clean data to DB module is a standard Ryax reusable SQL module that was simply configured
This second part of our newly-built workflow fully automates the data prepping and normalization process for the rest of the automation to run smoothly. Before Lab Automation, end-users still had to trigger a set of Excel Macros manually and were lacking the data normalization part of the process, resulting in more manual actions needed later on to reuse the data collected by spectrometers.
Along with data-prepping automation, Ryax also frees scientists from having to handle temperamental Excel Macros and other related troubles:
- No more debugging an Excel Macro the scientist did not develop themselves
- No more trouble to have the right workbooks and .csv files in the right place, with the right format, with the right filename, etc
- No more errors due to intermittent data format changes
- ...etc.
3. Automating experiment computation by integrating scientists’ scripts
Now that our data has been ingested and its quality is guaranteed, we'll process it and generate the resulting instruction file. This step constitutes the core of the scientists' knowledge and is the highest-value part of the automation, where any mistake must be avoided. It is also the section of their workflow they have tried hardest to automate over the years, resulting in half a dozen interdependent workbooks and VBA scripts.
To process their data today, scientists need to selectively trigger different VBA scripts in different workbooks, waiting for some results to be generated before moving onto the next Macro. The overall process is error-prone, tedious and time-consuming (several hours for a single experiment).
In Ryax, this part of the workflow will take as an input the experiment configuration file given by the scientist, plus the normalized spectrometer data along with the reference data. It then needs to calculate the area under the spectrometer curves, process it using client-specific scripts, and generate an instruction file for an automated dilution system.
As a general good practice, we'll split the computation part of the workflow into 3 distinct Ryax modules. This clearly delineates their functional scopes, making it easier for end users to grasp what each one does, and it also yields clearer log messages for better overall observability of the workflow.
- "Area under the curve" is a client-reusable Python custom module, calculating curve integrals.
- "Mass correlation" is a client-reusable custom Python module, correlating integral calculations with given experiment parameters
- "Generate instruction file" is a Python custom client-reusable module, generating dilution plate maps to be used in a dilution system.
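The "Area under the curve" module amounts to numerical integration of spectrometer samples. A minimal trapezoidal sketch (the actual module may well use a more sophisticated scheme, such as baseline subtraction before integration):

```python
def area_under_curve(points: list) -> float:
    """Trapezoidal integration of (x, y) samples from a spectrometer
    trace — a common way to estimate peak areas."""
    pts = sorted(points)  # integrate in x order
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```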
All of these modules were developed for the needs of this project. The "Area under the curve" module being generic, it was added to the Ryax module store so it can be reused in other contexts by other customers. The 2 other modules will remain proprietary and can be reused by the client only.
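To give an idea of what "Generate instruction file" produces, here is a hypothetical plate-map sketch: a 96-well layout, a target concentration and a CSV schema are all assumptions for illustration, not the client's actual format:

```python
import csv
import io

def generate_plate_map(concentrations: list, target: float = 1.0) -> list:
    """Build a dilution plate map: one well per sample, with the dilution
    factor needed to reach a (hypothetical) target concentration."""
    rows = [("well", "sample", "dilution_factor")]
    for i, (sample, conc) in enumerate(concentrations):
        well = f"{chr(ord('A') + i // 12)}{i % 12 + 1}"  # 96-well plate layout
        rows.append((well, sample, round(conc / target, 3)))
    return rows

def to_instruction_csv(rows: list) -> str:
    """Serialize the plate map as a CSV instruction file."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```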
A glimpse of the looks of our workflow at this stage:
This part of our workflow completely automates the processing of instrument data and the generation of instruction files as a final result.
- Automating the upstream steps of this workflow now enables robust processing of consistent and reliable datasets.
- If one step of the workflow fails, Ryax points directly at it so the scientist knows where the issue is.
- Intermediate results are available at every step for the scientist to check.
- The scientist no longer needs to wait on the workflow, since it now runs the computation unattended.
- ...etc.
4. Sending configuration results to automated equipment, for experiment auto-triggering
Reaching the last parts of our automation use-case, we now have to send the data computation results to 3 distinct destinations:
- The dilution machine, to start the dilution process based on the generated instructions
- The local lab PC, for further use in other experiments
- An on-prem service database, for archiving
In the Ryax platform, modules that are used to send data at the end of a workflow branch are called 'publishers'. To deliver results in these 3 destinations, we'll use the following Ryax publishers:
- Machine API module: next-generation lab systems often come with their own API layer Ryax can plug into to send data. This module is a client-reusable publisher specific to the machine.
- File-share module: the same as the one we used for data collection, duplicated and reconfigured to send results to a specific shared folder.
- SQL database module: the same as the one we used for the previous SQL database queries, duplicated and reconfigured.
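The Machine API publisher essentially posts the instruction file to the machine's HTTP endpoint. A sketch of the request it might build, using only the standard library; the endpoint URL and JSON schema are invented for illustration:

```python
import json
import urllib.request

def build_machine_request(api_url: str, plate_map: list) -> urllib.request.Request:
    """Prepare the HTTP POST that pushes the instruction data to the
    dilution machine's API (endpoint and payload schema are illustrative)."""
    body = json.dumps({"instructions": plate_map}).encode()
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request object would then be sent with `urllib.request.urlopen` (omitted here, since no real machine is reachable in a sketch).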
Here's how it looks now in the Ryax platform:
NB: to learn more about how Ryax can interconnect with existing IT ecosystems, see our dedicated resources here.
This last data-publishing step puts the workflow in full end-to-end automation mode. Once scientists trigger a new execution by providing their experiment configuration workbooks, the whole workflow runs automatically until the results are sent to the 3 destinations and the actual, real-world dilution process is initiated on the machine.
At any step of the workflow, scientists can access intermediate data inputs/outputs, read the log statuses and interrupt the workflow if necessary.
To wrap-up on the achievements of this Lab Automation use-case:
"Having my workflow automated in the Ryax platform, I can now handle 40 samples in roughly 4 hours, instead of 25 samples in a good day"
"In some occurrences I could spend up to 5 days on this experiment, accounting for all the various queries to be made, the tedious file management, and all the waiting in between script runs. Now I just trigger it and let it run by itself. I'm usually done in less than an hour."
"There was always something wrong happening with the Excel Macro automations we had in place in the past, and I often had to spend more time sorting things out instead of making actual decisions with my experiments' results. Now I can focus on the output."
'After' situation
The lab has come all the way from a sloppy, scattered scripting environment to a centralized data analysis framework that allows for actionable, reusable and robust automations.
Lab Automation dramatically accelerates lab experiments, in any context, for any team.
More to come soon!
Read about other Ryax use cases
Ryax tackles new use-cases every day.
Tell us about your projects.