Startup Pitch: The Data Cleaners


Data preparation for machine learning applications is complex and laborious. WhatToLabel has developed a solution to automate the process. The team won a Fortune 500 company for its first PoC, and new projects were added even during the Corona crisis.


Machine learning is a very hot technology trend, but everyday practice often looks less glamorous. Machine learning specialists spend 80% of their time collecting, cleaning, and labelling data. This activity is not very popular and there are no tools available to support the process. Engineers would prefer to focus on designing and training models. With WhatToLabel's solution they can do exactly this. The product supports data preparation and helps to filter, clean, and optimize datasets. In addition, the solution enables customers to use more of their massive amounts of data, which would otherwise remain unusable.

The problem of complex data preparation is as new as machine learning itself. There are only a few players in this field and most of them are small startups out of research labs. However, in most cases the potential customers of WhatToLabel have developed their own software for the problem. Compared to these solutions, the product of the ETH and HSG spinoff has two clear advantages. "Firstly, we work with raw and unlabeled data such as images, which nobody else does," explains co-founder Igor Susmelj, "and secondly, we can leverage know-how from very different application areas."

This is very well received. WhatToLabel works for numerous companies in a wide range of fields, from autonomous driving to visual inspection and medical imaging. The pilot customers come from Switzerland, other European countries and the USA. "Even during the lockdown, we were able to attract new companies," explains Igor Susmelj. The company benefits from the advantages of its solution, in particular cost savings and a more efficient process, as well as from its Software-as-a-Service model. "SaaS, especially in this tech-savvy field, can be sold remotely very well," says Igor Susmelj.

Several MVPs to find the customer pain

The success with the Data Preparation Platform did not come overnight. "We went through several pivots until we discovered a problem that companies take so seriously that they are willing to pay money to solve it," says the founder. Before that, the team repeatedly developed MVPs and tested them on the market, and it did so at a high pace: it never took longer than two to three months from the idea to the test. The team's biggest surprise was a solution for detecting fake videos that it developed in late 2018: "Although this is a much-discussed topic, we couldn't find anyone who would pay for such a solution," recalls Susmelj.

The start-up was originally founded as Mirage Technologies AG in 2018 with a different idea and team composition. Today, four people work for WhatToLabel. In addition to ETH graduates specializing in deep learning, the team includes businesspeople with HSG degrees and consulting experience. The start-up is both an ETH and an HSG spinoff.

In 2019, the first proof-of-concept project started with a Fortune 500 company from Germany working on autonomous driving. Since then, the team has collected feedback from its partners and has also actively worked on identifying the real customer needs with the help of a white paper. A first round of financing is now planned to enable the next step and allow the start-up to gain a foothold in the market. Igor Susmelj describes one of the most important goals as follows: "We want to reach the CHF 100,000 recurring revenue threshold as soon as possible."

Apply for the next Startup Pitches

This is the third article of our start-up pitch series. Until the summer break, startupticker, in collaboration with Swisspreneur, will introduce Swiss start-ups in search of investors. Find out how to apply in a separate article.

(Stefan Kyora)