
Web-based AI analysis: Measuring the innovation performance of companies with InnoProb

Robert Dehghan

Web-based AI indicators offer a comprehensive, granular, up-to-date and cost-effective alternative to traditional data collection methods. We are pioneers in the field of web-based indicator technology. With webAI, we are developing an artificial intelligence that provides our customers with market and company information in real time.

20 November 2020

Market and company data help stakeholders in business and politics make informed decisions. As societal processes accelerate, traditional data collection methods often deliver outdated results because a lot of time passes between collection and analysis. The high manual workload and the associated personnel costs prevent more frequent data collection. As a result, relevant data is often not available in granular form for all levels of analysis, and coverage is limited.

webAI Deep Market Intelligence

Web-based AI indicators offer a comprehensive, granular, up-to-date and cost-effective alternative. We are pioneers in the field of web-based indicator technology. With webAI, we are developing an artificial intelligence that provides our customers with market and company information in real time. webAI searches large volumes of web data at high frequency and uses artificial intelligence to identify the most relevant information. The system extracts the relevant knowledge and makes it available to our clients in an easily accessible form. We call this “Deep Market Intelligence”. This automated process allows us to answer individual questions, update results daily and achieve significant cost savings compared to traditional data collection methods.

webAI InnoProb identifies innovative companies

The InnoProb Score is an example of such a webAI indicator. The score is calculated daily for all companies with their own website and is therefore available comprehensively and up to date. A direct comparison shows that InnoProb results are in no way inferior in quality to those of traditional data collection methods. In comprehensive scientific studies1, our method for the web-based collection of market information performs very well and delivers results comparable to those from traditional data sources. The map below, for example, shows the detailed distribution of innovative and non-innovative companies in Berlin: on the one hand as predicted by InnoProb (“Predictions”, top right in the chart) and on the other hand as collected in a traditional company survey (“Survey”, bottom right in the chart).

Illustration: Share of product innovators in the overall local firm population predicted by InnoProb (left). Comparison between web-based InnoProb predictions and survey-based results (Berlin Innovation Panel) of the distribution of innovative and non-innovative companies in Berlin (right).
Source: Kinne and Lenz 2021.1

The survey results shown for Berlin (Berlin Innovation Panel) are a special case. Because of the high survey effort, such granular information is available for very few places in Germany. With webAI InnoProb, all companies in all regions can be analyzed at high frequency, and innovative companies can then be identified. Such data is of great value for market analyses, but also for the evaluation of innovation policy measures. In contrast to traditional survey methods, all companies are evaluated, not just a random sample. Decisions can thus be made on the basis of a highly up-to-date and universally available database.
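To illustrate how company-level scores might be turned into regional shares of product innovators like those shown on the map, here is a minimal sketch. It is not the actual webAI pipeline: the function name, the 0.5 cut-off and the example scores are all assumptions made up for illustration.

```python
from collections import defaultdict

def innovator_share_by_region(companies, threshold=0.5):
    """Return the share of companies per region whose score reaches the threshold.

    `companies` is an iterable of (region, score) pairs. The threshold of 0.5
    is an illustrative assumption, not a documented InnoProb cut-off.
    """
    counts = defaultdict(lambda: [0, 0])  # region -> [predicted innovators, all companies]
    for region, score in companies:
        counts[region][1] += 1
        if score >= threshold:
            counts[region][0] += 1
    return {region: innovators / total
            for region, (innovators, total) in counts.items()}

# Made-up scores for two Berlin districts, purely for demonstration.
companies = [
    ("Mitte", 0.9), ("Mitte", 0.2), ("Mitte", 0.7),
    ("Spandau", 0.1), ("Spandau", 0.6),
]
shares = innovator_share_by_region(companies)
print(shares)
```

Because every company with a website receives a score, an aggregation of this kind can in principle be run for any region and repeated whenever the scores are updated.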

How does InnoProb work?

The webAI InnoProb innovation prediction model is based on publicly available web texts that companies publish on the Internet. These texts are analyzed and evaluated with a deep learning model, an artificial neural network. Before it can make predictions, the InnoProb network is trained with information from a traditional company survey. In this survey, companies were asked whether they had recently launched new products and could therefore be considered “product innovators”. The web texts of all participating companies were then downloaded using a web scraper, a tool that extracts texts from web pages. These texts were labeled with the information from the survey (product innovator “Yes”/“No”) and used as training data for the artificial neural network. During training, the neural network learns how web texts differ between product innovators and non-innovators, for example in terms of certain words and word combinations. Based on the learned correlations, the neural network can then predict the probability that a company is a product innovator (the InnoProb Score). The figure below shows the described process schematically.
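The training idea described above can be sketched in a few lines of Python. Note that this is a deliberately simplified stand-in: it uses a bag-of-words logistic regression instead of a deep neural network, and the texts, labels, function names and hyperparameters are invented for illustration. It only shows the principle of learning from survey-labeled web texts and then scoring an unseen text.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a web text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(texts, labels, epochs=200, lr=0.5):
    """Fit a bag-of-words logistic regression with stochastic gradient descent."""
    weights = Counter()  # one weight per word, missing words count as 0
    bias = 0.0
    docs = [Counter(tokenize(t)) for t in texts]
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            z = bias + sum(weights[w] * c for w, c in doc.items())
            err = y - sigmoid(z)  # gradient of the log-likelihood w.r.t. z
            bias += lr * err
            for w, c in doc.items():
                weights[w] += lr * err * c
    return weights, bias

def score(text, weights, bias):
    """Return the predicted probability that the text comes from a product innovator."""
    doc = Counter(tokenize(text))
    z = bias + sum(weights.get(w, 0.0) * c for w, c in doc.items())
    return sigmoid(z)

# Invented training data: web texts with survey labels (1 = product innovator, 0 = not).
texts = [
    "we launched a new product line this year",
    "introducing our novel sensor platform",
    "family bakery serving fresh bread since 1950",
    "traditional carpentry and repair services",
]
labels = [1, 1, 0, 0]

weights, bias = train(texts, labels)
print(score("our new product launch", weights, bias))
print(score("fresh bread and repair services", weights, bias))
```

As in the description above, the model picks up which words are associated with each group and returns a probability between 0 and 1 for a new text. A real system would need far more training data and a more expressive model.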

Illustration: Schematic representation of the InnoProb workflow.
Source: Kinne and Axenbeck 2020.3

What will come after InnoProb?

With more than thirty presentations, InnoProb has received a lot of interest from the scientific community, official statistics, and institutions with an economic policy focus. We are now continuously advancing the approach presented here. As a novel and potentially revolutionary method for collecting market information, it allows comprehensive analyses to be designed and carried out in a short time with minimal personnel effort. The approach shows its full potential especially in highly dynamic situations that require comprehensive data which is also quickly available and up to date. Within a short period of time, for example, we were able to adapt webAI to measure the impact of the coronavirus pandemic on German companies and thus identify regional clusters. In a current project, we are working, among other things, on adapting webAI to identify companies that use artificial intelligence. Because webAI has such a broad field of application, webAI data is already being used by scientists from various research areas in Germany and abroad. More about this in our next blog articles.

1 Kinne and Lenz 2021:

2 Berlin Innovation Panel:

3 Kinne and Axenbeck 2020:
