How to do a good Big Data Proof of Concept

Behind every completed Big Data project there is a proof of concept. This essential step lets us assess the feasibility of an idea, but it holds many pitfalls. To keep sight of the goal, which is putting the project into production, realism and efficiency are essential. In this article, we give you the keys to achieving a good proof of concept (PoC) in Big Data.



What is a proof of concept and why is it useful?

Specialists can rest assured: the aim here is not to fuel the debate over the exact definition of a proof of concept. Many people treat proof of concept and prototype as the same thing, and there are differences between the two, but those differences matter little for the purposes of this article.

The proof of concept is one of the first steps in the development of a Big Data project. This is when the feasibility of the project is evaluated from different angles, including the time and money to invest and the added value of the project. Many projects will be halted at this stage, and that is all for the better. If a Big Data project is doomed to fail or does not meet expectations, it is best to limit the damage as early as possible in the process.

Many observers deride the failure rate of Big Data projects. The small percentage of PoCs that actually reach production is often cited in a negative light. This is an error of reasoning. Finding the right Big Data project, one that meets the needs of the business and has real added value, often requires several PoCs, and that is normal. This step is critical to assessing the value of the project. On the other hand, it is crucial to carry out a proof of concept efficiently in order to avoid wasting time and to reach a conclusion on viability as quickly as possible. An attractive proof of concept is also crucial to convincing your customers, whether they are internal or external to the company.

What does a good PoC look like?

Designing a good proof of concept is sometimes a freestyle exercise, since there is no standard in this area. A good PoC must convince decision-makers of the project's added value and thus encourage them to devote time and money to its development.

When it comes to Big Data, a PoC will usually present a practical case on a test database to demonstrate the effectiveness of the algorithm or AI in question. In the absence of a standard, several characteristics or elements can still be considered in order to stack the odds in our favor.

Here are some recommendations for establishing a good proof of concept:

  • Ensure that the project meets a business need and has real added value. It is worth pitching the project to a few decision-makers in order to get a first opinion. This also means demonstrating the value of a data-driven solution over the alternatives;
  • Involve different departments or teams to ensure the project delivers cross-functional benefits;
  • Set a deadline for finalizing your proof of concept and don't get lost in the search for perfection. Always keep in mind that the goal is to demonstrate the value of the project. Biases are acceptable at the proof of concept stage as long as they are known and can be controlled thereafter;
  • Take the time to develop a cost-benefit analysis for the company, involving the product and marketing teams where appropriate. Most of the time, financial considerations are what hold decision-makers back. It is therefore necessary to convince them of the added value of the project over a medium- and long-term horizon. Be realistic in your budgets. It is also at this stage that we need to ensure that the data needed to make the model work is available. Also consider evaluating the model's training time to ensure an acceptable result. Don't forget to estimate the cost of collecting and organizing the data;
  • Do not neglect the presentation of the proof of concept. The PoC often has to convince in a few minutes. Even if aesthetic or design issues seem trivial when presenting your idea, you need to spend time on them. Keep in mind that you must convince different types of stakeholders, so avoid an overly technical presentation.
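
The cost-benefit analysis recommended above can start from a very small calculation. The sketch below is only an illustration: the names `PocEstimate`, `net_value` and `break_even_month` are hypothetical, the figures are invented, and a real analysis would need to account for risk, discounting and uncertainty in each estimate.

```python
import math
from dataclasses import dataclass


@dataclass
class PocEstimate:
    """Hypothetical inputs for a first-pass cost-benefit estimate."""
    build_cost: float        # one-off cost: development, data collection and preparation
    monthly_run_cost: float  # recurring cost: infrastructure, maintenance
    monthly_benefit: float   # expected recurring gain once in production


def net_value(estimate: PocEstimate, months: int) -> float:
    """Cumulative benefit minus cumulative cost after `months` in production."""
    margin = estimate.monthly_benefit - estimate.monthly_run_cost
    return margin * months - estimate.build_cost


def break_even_month(estimate: PocEstimate):
    """First month at which the project has paid for itself, or None if it never does."""
    margin = estimate.monthly_benefit - estimate.monthly_run_cost
    if margin <= 0:
        return None
    return math.ceil(estimate.build_cost / margin)


# Invented figures, for illustration only.
example = PocEstimate(build_cost=60_000, monthly_run_cost=2_000, monthly_benefit=7_000)
print(break_even_month(example))  # month at which costs are recovered
print(net_value(example, 36))     # net value over a three-year horizon
```

Running the same calculation over a short and a long horizon makes the medium- and long-term argument concrete for decision-makers, and it shows immediately when a project never breaks even.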

Some mistakes to avoid as part of a proof of concept

Most of the classic PoC errors have already been mentioned above. The main trap is to get stuck in your own idea and be unable to step back and assess the merits of your project. Often, simpler solutions exist that solve the problem without requiring a Big Data model. It is therefore crucial to be critical of your concept and to gather enough external feedback to avoid disappointment.
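
One way to stay critical of your own concept is to measure the model against a deliberately simple alternative on the same test data. The sketch below assumes a numeric prediction task and uses mean absolute error as the metric; the function names and the sample values are hypothetical, and another metric may suit your use case better.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def improvement_over_baseline(actual, model_pred, baseline_pred):
    """Relative error reduction of the model versus the baseline.

    0.0 means no gain over the simple solution, 1.0 means a perfect model,
    and a negative value means the model is worse than the baseline.
    """
    baseline_err = mean_absolute_error(actual, baseline_pred)
    model_err = mean_absolute_error(actual, model_pred)
    if baseline_err == 0:
        return 0.0  # the baseline is already perfect, nothing left to gain
    return (baseline_err - model_err) / baseline_err


# Invented test values, for illustration only.
actual = [100, 120, 90, 110]
baseline_pred = [105, 105, 105, 105]  # e.g. always predict a historical average
model_pred = [98, 118, 93, 111]

print(improvement_over_baseline(actual, model_pred, baseline_pred))
```

If the improvement is small, the simpler solution may be the better business choice, and that is a useful conclusion to reach at the PoC stage.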

The other classic mistake is to lose sight of the customer, whether internal or external. As noted above, a proof of concept can take various forms and be more or less polished. It is your responsibility to adjust your PoC to suit the people you will need to convince. Learn to evaluate their priorities and what will make a difference to their leadership. If you are not a born salesperson or speaker, surrounding yourself with people who have the right skills can help you pitch your idea. In 2020, neglecting the internal politics of the company or your external customers, and imagining that your idea will sell itself, is a serious mistake.

What are the steps after proof of concept has been validated?

Once your PoC has gained unanimous approval and you have the green light to deploy the project, above all, don't lose momentum! If this has not yet been done, now is the time to quickly develop the prototype and then begin the production phase.

To do this, you have different options. It is of course possible to build your data architecture in-house, or with the help of external data engineers, to end up with a clean system.

It is also possible to use industrialization software. Beyond the expertise built into them, the advantage of these on-demand tools is that, once mastered, they can be reused in other projects. Indeed, data model industrialization software offers great flexibility.

This is the case with the software designed by the Ryax teams, which allows you to industrialize your data through our intuitive and scalable platform. If you would like a demonstration or want to learn more about our product, please contact us.

The Ryax Team.