What do we measure?
The webAI Digital Layer Agent captures hyperlinks between companies’ websites to map their interconnectedness. Companies link to the websites of other economic actors for various reasons. For example, companies often name reference customers on their websites and usually link to those customers’ websites as well. In addition to quantitative measures (number of links, number of partners, etc.), webAI also analyzes the texts associated with the hyperlinks in order to assess the existing hyperlinks qualitatively. This qualitative assessment covers the “cognitive”, “organizational” and “geographical” proximity between the companies connected by a hyperlink.
The “cognitive proximity” describes the similarity of the companies in terms of their communicated website content as a reflection of their knowledge bases. If the knowledge bases of two actors are too far apart, they could learn a lot from each other, but in the worst case the distance is too great for fruitful cooperation. If the knowledge bases are too close, communication might be easy, but the knowledge exchanged hardly differs. Traditionally, cognitive distance is approximated by industry classifications. This assumes that only companies from the same or a very similar industry share a common knowledge base. In fact, however, a mechanical engineering company specializing in the construction of chemical plants, for example, might have a similar knowledge base to a traditional chemical company, but a “distant” industry classification. WebAI Digital Layer, on the other hand, would detect the proximity of these two companies and reflect it in a high “cognitive proximity.”
“Organizational proximity” in the webAI Digital Layer approach describes whether linked companies are in a business relationship or not. A high organizational distance indicates that the companies most likely do not have a business relationship. Instead, the hyperlink may reflect a purely “informational” relationship, such as when a daily newspaper reports on the success of a local business and provides a hyperlink to the local business’ website in the corresponding article. Hyperlinks from law firms to bar associations and hyperlinks in imprints to external legal texts also fall into this category. Business relationships, on the other hand, include hyperlinks to business partners, reference customers, suppliers, service providers, and companies in the same business group.
“Geographical proximity” describes the geographical distance between the company locations of the linked companies. It provides information on whether a company tends to network locally or nationally.
How do we measure?
Our webAI reads the website of the examined company and searches it for hyperlinks that target other websites (domains). To do so, webAI searches the entire corporate website or, for very extensive websites with hundreds of sub-webpages, the particularly relevant “top-level” sub-webpages (for more information, see the publication: Web mining for innovation ecosystem mapping: a framework and a large-scale pilot study).
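The link search described above can be illustrated with a small sketch. It is not webAI's crawler: the class `LinkCollector`, the function `external_domains`, and the example domains are hypothetical, and comparing plain hostnames is a simplifying assumption (a production system would likely also normalize subdomains such as `www.`).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_domains(page_url, html):
    """Return the sorted hostnames that a page links to, excluding its own."""
    own_host = urlparse(page_url).hostname
    collector = LinkCollector()
    collector.feed(html)
    domains = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL before inspecting them.
        target = urlparse(urljoin(page_url, href))
        if target.scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar
        if target.hostname and target.hostname != own_host:
            domains.add(target.hostname)
    return sorted(domains)


# Hypothetical page content for illustration only.
sample_html = """
<a href="/about">About us</a>
<a href="https://customer.example/case-study">Reference customer</a>
<a href="mailto:info@a.example">Mail</a>
<a href="https://a.example/contact">Contact</a>
<a href="http://partner.example">Partner</a>
"""

found = external_domains("https://a.example/", sample_html)
print(found)
```

Internal links, relative links and non-web schemes are dropped, so only hyperlinks that target other domains remain.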
WebAI thus finds a certain number of hyperlinks per company website that point to external websites. Then, webAI merges this information with content from all other corporate websites that were also searched. In this way, webAI identifies not only “outbound” hyperlinks, i.e., those pointing from Company A to Companies B, C, and D, but also “inbound” hyperlinks, i.e., hyperlinks from Companies B, E, and F pointing to Company A. From this information, webAI calculates the following variables that provide information about the interconnectedness of the company:
- (1) Outgoing Links: The destination domains of all outbound hyperlinks from a company’s website.
- (2) Outgoing Links Count: The total number of destination domains of all outbound hyperlinks from a company’s website.
- (3) Incoming Links: The origin domains of all inbound hyperlinks pointing to a company’s website.
- (4) Incoming Links Count: The total number of origin domains of all inbound hyperlinks pointing to a company’s website.
- (5) Links: The outgoing and incoming link domains from (1) and (3) combined.
- (6) Links Count: The number of domains in (5).
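As a rough illustration of how such variables could be derived once all websites have been searched, consider the following sketch. It is not webAI's implementation: the edge list `hyperlinks`, the function `link_variables`, and the example domains are hypothetical. Here "outgoing" follows the direction of the hyperlink (links from the company's own website), and the combined "links" set is assumed to be the union of both directions.

```python
from collections import defaultdict

# Hypothetical crawl result: (origin domain, destination domain) pairs.
hyperlinks = [
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("a.example", "d.example"),
    ("b.example", "a.example"),
    ("e.example", "a.example"),
    ("f.example", "a.example"),
    ("a.example", "b.example"),  # repeated link, counted once per domain
]


def link_variables(edges):
    """Derive per-domain link sets and counts from a hyperlink edge list."""
    outgoing = defaultdict(set)  # domain -> domains it links to
    incoming = defaultdict(set)  # domain -> domains that link to it
    for origin, destination in edges:
        if origin == destination:
            continue  # ignore links within the same domain
        outgoing[origin].add(destination)
        incoming[destination].add(origin)

    result = {}
    for domain in set(outgoing) | set(incoming):
        out_links = outgoing.get(domain, set())
        in_links = incoming.get(domain, set())
        links = out_links | in_links  # a partner linked both ways counts once
        result[domain] = {
            "outgoing_links": sorted(out_links),
            "outgoing_links_count": len(out_links),
            "incoming_links": sorted(in_links),
            "incoming_links_count": len(in_links),
            "links": sorted(links),
            "links_count": len(links),
        }
    return result


variables = link_variables(hyperlinks)
print(variables["a.example"])
```

In this example Company A links to B, C and D and is linked from B, E and F, so B appears in both directions but is counted only once in the combined set.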
In addition to these quantitative variables, webAI Digital Layer also calculates the following qualitative indicators that provide information about the nature of networking:
- Organizational Proximity: The organizational proximity between this company and all linked partners from the “Links” variable that webAI also analyzed. Based on the texts of both partners, webAI determines the probability of a business relationship. Relationships that have a high probability of being business relationships have a value close to 1.0, while relationships that are unlikely to be business relationships have a value close to 0.0.
- Mean Organizational Proximity: The mean of the “Organizational Proximity” values. This indicates whether a company is more likely on average to have business relationships (value close to 1.0) or not (value close to 0.0).
- Cognitive Proximity: The cognitive proximity between this company and all linked partners from the “Links” variable that webAI also analyzed. WebAI calculates this value based on the text similarity of the linked company websites. Companies with identical texts receive a “Proximity” of 0.0, whereas companies with theoretically maximally dissimilar texts would receive a value of 1.0. Despite its name, the value therefore behaves like a distance: the lower the value, the more similar the partners.
- Mean Cognitive Proximity: The mean of the “Cognitive Proximity” values. This indicates whether a company has, on average, rather similar partners (value close to 0.0) or not (value close to 1.0).
- Geographical Proximity: The geographical proximity between this company and all linked partners from the “Links” variable that webAI also analyzed. This is given as the distance between the company locations of the linked companies in meters.
- Mean Geographical Proximity: The mean of the “Geographical Proximity” values. This indicates whether a company tends to have geographically close partners on average (value close to 0.0) or not (high values).
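The value ranges of these indicators can be made concrete with a small sketch. The source does not specify how webAI computes text similarity or distances, so both techniques below are stand-ins chosen for illustration: a cosine distance over word counts for the cognitive value, and the haversine great-circle formula for the geographical value. The function names, texts and coordinates are hypothetical.

```python
import math
from collections import Counter
from statistics import mean

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def text_distance(text_a, text_b):
    """Cosine distance of word counts: 0.0 = identical, 1.0 = no shared words."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty text is treated as maximally dissimilar
    # Clamp to [0, 1] to absorb floating-point rounding.
    return min(1.0, max(0.0, 1.0 - dot / (norm_a * norm_b)))


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical company and two linked partners (texts and coordinates invented).
company = {"text": "chemical plant engineering and construction",
           "lat": 49.4875, "lon": 8.4660}
partners = [
    {"text": "chemical plant engineering and construction",
     "lat": 49.4875, "lon": 8.4660},
    {"text": "daily newspaper for local readers",
     "lat": 52.5200, "lon": 13.4050},
]

cognitive = [text_distance(company["text"], p["text"]) for p in partners]
geographic = [haversine_m(company["lat"], company["lon"], p["lat"], p["lon"])
              for p in partners]

mean_cognitive = mean(cognitive)
mean_geographic = mean(geographic)
print(mean_cognitive, mean_geographic)
```

The first partner has an identical text and location, so both of its values are close to 0.0. The second shares no words with the company and lies several hundred kilometers away, which pulls both means upward.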
How do you interpret the data?
The data collected with webAI Digital Layer maps the hyperlink connections between the examined company websites. For more complex analyses, this relational data can be examined with specialized network analysis software. For example, centrality measures or cluster detection algorithms can be applied. However, the data provided also allows valuable insights into the networking of individual companies without further analysis.
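To give a sense of what such a network analysis involves, the sketch below computes a normalized in-degree centrality and groups domains into connected components using only the Python standard library. Connected components are a deliberately simple stand-in for cluster detection, not a community detection algorithm, and the edge list and function names are hypothetical. Dedicated network analysis libraries offer far richer measures.

```python
from collections import defaultdict

# Hypothetical hyperlink edges: (origin domain, destination domain).
edges = [
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("b.example", "a.example"),
    ("c.example", "a.example"),
    ("d.example", "e.example"),
]


def in_degree_centrality(edge_list):
    """Share of all other domains that link to each domain."""
    nodes = {node for edge in edge_list for node in edge}
    linked_from = defaultdict(set)
    for origin, destination in edge_list:
        if origin != destination:
            linked_from[destination].add(origin)
    denominator = max(len(nodes) - 1, 1)
    return {node: len(linked_from[node]) / denominator for node in nodes}


def connected_components(edge_list):
    """Group domains that are connected by hyperlinks in either direction."""
    neighbors = defaultdict(set)
    for origin, destination in edge_list:
        neighbors[origin].add(destination)
        neighbors[destination].add(origin)
    seen, components = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from the start domain
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


centrality = in_degree_centrality(edges)
components = connected_components(edges)
print(centrality)
print(components)
```

In this toy network `a.example` receives links from two of the four other domains, and the data splits into two separate groups.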
For example, a glance at the “Links Count” column reveals how many other companies the company under consideration is connected to via a hyperlink. This variable combines all outgoing links from the company under consideration and all incoming links from other companies. In the DACH region (Germany, Austria, Switzerland), for example, companies have an average “Links Count” of 13.1 (median 6.0), with 9.4% of companies having no hyperlinks to other companies at all. The maximum value for “Links Count” is 559,203, reached by google.com. Depending on the question being asked, it is recommended to remove such outliers from the data.
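The effect of a single extreme value on such summary statistics can be seen in a short sketch. The counts below are invented for illustration and are not taken from the dataset. The function names and the fixed cutoff are hypothetical too, and a suitable outlier rule depends on the question being asked.

```python
from statistics import mean, median

# Invented "Links Count" values per domain, including one extreme hub.
links_count = {
    "a.example": 0,
    "b.example": 2,
    "c.example": 6,
    "d.example": 6,
    "e.example": 9,
    "f.example": 15,
    "hub.example": 559203,
}


def summarize(counts):
    """Mean, median and share of domains without any links."""
    values = list(counts.values())
    return {
        "mean": mean(values),
        "median": median(values),
        "share_without_links": sum(v == 0 for v in values) / len(values),
    }


def drop_above(counts, cutoff):
    """Keep only domains whose count does not exceed the cutoff."""
    return {domain: c for domain, c in counts.items() if c <= cutoff}


before = summarize(links_count)
after = summarize(drop_above(links_count, cutoff=1000))
print(before)
print(after)
```

The median is unaffected by the hub, while the mean changes by several orders of magnitude, which is why the median is often the more informative figure for link counts.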
Qualitative insights into the networking of companies are provided, for example, by the “Mean Geographical Proximity” column. This indicates how far away (in meters/kilometers) the companies networked with the company under consideration are on average. Note that only the distance to companies in the webAI database is calculated. On average, companies in the DACH region are 284.4 km away from their hyperlink partners. About 0.3% of companies in the DACH region have an average distance of 0.0 km to their partners, which means that they share a building (or postal address) with their partners.
How do we ensure the quality of the data?
Like all our webAI agents, the webAI Digital Layer was developed and validated together with independent subject matter experts. In this way, we ensure that the results of the webAI agents are validated in external studies at a high scientific level. For the development of this agent, ISTARI.AI collaborated with ZEW – Leibniz Centre for European Economic Research in Mannheim, the University of Salzburg and the Technical University of Berlin. In addition, the results of the agent have been used in scientific studies, including by researchers at the University of Groningen in the Netherlands and in a joint study with researchers at ETH Zurich. The latter study received the Best Paper Award at the renowned “R&D Management Conference 2022”.
Interested in our data?
Purchase this and other datasets for hundreds of regions in our Data Market.