In the data collection game, web scraping drives the most revenue
Tue, 17th Jan 2023

The right data at the right time can make or break a deal for financial institutions. Yet the scope and amount of data the industry collects are immense: it’s easy to get lost among the numerous data sources and collection methods. To illuminate these issues, Oxylabs asked key data decision-makers in the finance industry about their focus.

A few months ago, we partnered with Censuswide to survey senior data decision-makers at UK- and US-based financial services companies, with the aim of collecting insights on how they gather their data. The results revealed a clear trend: web scraping is seen as the most valuable external data collection method and is directly connected to company revenue generation.

26% of those surveyed believe that web scraping has the greatest impact on revenue out of all alternative data collection methods. It came just behind internal data collection, which 28% of respondents see as the most important revenue driver. Meanwhile, manual data collection and traditional third-party data aggregators (for example, statistics reports) each received only 12% of votes.

Shifting priorities

These numbers reveal a clear shift in the finance industry. Long dependent on internal data, with only small amounts of external data to support it, the industry now treats internal and external data as equals. Used in combination, they provide the most accurate picture for business decisions, specifically evaluating companies, discovering future investment opportunities and managing risks.

The finance industry was one of the earliest adopters of web scraping and external data collection, even when the technology was still young. However, the growth we have witnessed over the past few years has been exponential. Financial companies are investing significantly more in external data gathering methods and collecting more data in general.

Our survey reveals that 71% of financial services companies use alternative data collected via web scraping in their operations. Internal data, which used to be the dominant data source, is used by 74% of companies. The narrow gap shows that web scraping is seen as indispensable by those who base their decisions on data. Internal data, while still essential, is no longer sufficient on its own.

Racing for unique findings

The most obvious driver of the change is intense competition. Companies are fighting for original insights, adding new data sources to their pipelines to make their business decisions more accurate. As the scope of information publicly available online is now enormous, it is up to each company’s creativity to use it to its advantage.

Another trend fostering the growth of web scraping is changing investment priorities, for example, ESG (environmental, social and governance) investing. Applying ESG criteria requires a diligent review of a large set of criteria. While in most cases such a review could not be done manually, the online presence of most companies allows it to be automated with web scraping.

Web scraping makes it possible to find previously undiscovered information that could provide important context. It also allows seemingly unrelated pieces of information to be combined into one bigger picture. For example, weather reports, job postings and satellite imagery data could be brought together if a data decision-maker knows how to combine the different findings they provide into valuable business insights.

As web scraping technology is now used almost by default in the finance industry, the main differentiator between competitors is not the technology itself but the human factor. Even when two competitors are scraping the same websites, they might be focusing on different parts of the information. It’s up to the data decision-maker in each company to find niche sources, diversify them and extract the right insights from them.

UK and US treat data collection differently

Two global financial centres, the UK and the US, differ somewhat in their approach to data gathering, according to our survey.

In the UK, web scraping ranked as the leading collection method with the greatest impact on revenue - the sentiment was shared by 25% of respondents. Meanwhile, it took second place (27%) in the US, ranking behind internal data, which gathered 37% of votes.

Surprisingly, UK companies were significantly more likely to choose “None of the above” (14%), hinting at other data collection methods that are not publicly known. I would explain this by the tendency of UK companies to collect data from a wide variety of disparate sources rather than obtaining it from an aggregator.

Despite these minor differences, both markets share the same attitude towards the future: web scraping is the technology to focus on, and the investments directed to it speak for themselves. Our survey reveals that 80% of companies plan to focus on web scraping even more in the upcoming year.