Proxies beyond Internet Privacy


Svilen Madjov

As digital technology propels new forms of market demand, Internet users around the world have acquired the taste for new and original content. In most cases, the pursuit of exciting content disregards user location and access rights.

At the same time, a stable and growing share of tech-savvy consumers seek ways to maintain higher privacy levels. While legality is rarely the issue, most are wary of attempts by businesses and authorities to interfere with their free and informed choice, as well as their personal preferences.

The apparent incompatibility of the above two major motivators is quickly settled by a solution that a massive share of users already employ regularly – proxy servers for web connection or specific content access. Whether set up directly or software-assisted, proxy usage has seen a constant rise in popularity, both for personal and business purposes.

Proxies serve infinite cases and scenarios – avoiding restrictions or blocks, content filtering, download-upload screening or simply a general state of (relative) anonymity. This paper presents an overview of proxy usage basics and assesses the main trends that affect the global proxy market.

How Are Proxies Changing the Web?

The proxy server is simply another computer which serves as an intermediary machine in connecting to the web. Requests are processed on a case or rolling basis, with the majority of proxy servers representing a hub for a number of connected computers. The alternative is a dedicated service (which is explained below) but with identical go-between functions.

The bulk of proxy connections are rather plain, with third-party access not kept out and, therefore, remote monitoring still a possibility. Encryption standards, however, raise a further protection wall. In general terms, HTTPS encryption goes a long way towards protecting from external monitoring in general website access and other standard use cases. HTTPS proxies also tend to be a bit cheaper and faster to run. Alternatively, SOCKS proxies add flexibility rather than encryption: the protocol does not interpret the data it relays and can therefore give access to firewalled and traffic-intensive communication and resources. SOCKS does not encrypt traffic by itself, so any protection still has to come from the application protocol it carries.

Many users are also familiar with the benefits and limitations of accessing the web via a Virtual Private Network (VPN). Encryption and support are (generally) higher but so are prices and subscription requirements. Further clarifications on differences between a proxy and VPN service are provided below.

Widely proclaimed as an indispensable shield supporting privacy and anonymity, proxies turn out to be more useful for pro-active internet behavior than for a “defensive” one. Personal and business use cases, as detailed further in this report, point to an expanding applicability of proxies for work, entertainment, research and much more. Marketing and social media objectives quickly come to mind, as proxies give access to resources and insights that would otherwise be off-limits.

More importantly – and for whatever reasons consumers and enterprises choose to connect via a proxy – the global trend is unmistakable. The share of users who have regularly passed through a proxy is above a quarter of the world total, as shown by Statista.com data.

Share of online users using an IP-masking service by Region, 2018:

Although the above statistics milestone is from pre-Covid times, it is rather indicative of the perception and usage patterns of proxies around the world. Another clarification is due: tools that cover IP addresses and geo-sources are all placed under the same denominator – in industry publications, research, as well as this paper – as proxies. A proxy service, tool or connection might be delivered via VPN, an individual or enterprise proxy server setup. It might be free, paid, self-installed or assisted. For the purpose of obtaining a thorough understanding of the market, these will all be addressed as proxies.

More than a quarter of internet users, therefore, have used a proxy at least once in the last month. It is not a surprise to see that figure remain fairly stable for the past decade and we will see further updates when presenting our market projections. Crucially, based on almost 5 billion internet users globally, that means roughly 1.27 billion users interested in or actively using proxy services. At the very least, these number one billion.

Referring to the same period in time (2017-2018), Statista reveals the online markets with the higher usage penetration. As we have seen above, the Asia Pacific region is stably above the global average in proxy usage and the below graph shows just why:

Leading proxy markets (user penetration), 2018. Source: Statista

Significant markets – by size, purchasing power or economic potential – like Indonesia, India and China are all above 30% in user adoption of proxy services. Turkey, lying between Europe and Asia, is also a particularly strong market, as we will see again. Other emerging markets are also around or above the 25% mark.

These figures are indicative of the immense number of users who opt for an IP disguise. While many might use proxies via a software-as-a-service subscription, most do so by connecting to free or more affordable proxies. The more than 100 thousand respondents of the relevant global online survey were all in the 16-to-64 age group.

This geographical usage pattern might be interpreted not only for user needs but also limitations that the market or the local consumption culture poses to internet usage. However, this requires a level of understanding and representation of local socio-political realities that is superfluous to our study.

Implications are also multifaceted, particularly for businesses that end up on the “receiving end” of IP masking. Digital consumers that use a proxy are incorrectly geo-located by mainstream passive analytics, unless the website uses active proxy-revealing tools. In any case, with more than a quarter of all users (and a third of Asians) hiding their activity, this alters website traffic measurement significantly.

In the end, there can be a number of reasons and motivations to go down that road. Surprisingly to some, achieving web anonymity is not the leading cause. It has long been overtaken by the need to avert restrictions and blockages and access particular content.

Proxy usage has also become quite relevant in the business world, as companies take advantage of automation and machine learning increasingly over the past few years. Proxies are an essential requirement for web data extraction on a commercial scale. The so-called web scraping service is also addressed in more detail in this report.

What’s the Difference between Proxy and VPN

Having established the importance of proxies on a global scale, we need to take a step aside and eliminate any ambiguity between VPN and Proxy connections. Admittedly, on the supply side, these may seem contrasting in their function and scope. And, depending on the precise wording of terms, search volumes for VPN might be higher up the SERP list in particular markets.

However, a VPN is fundamentally a software service (client, SaaS or an otherwise setup access) that helps users connect to a proxy server. All internet protocols are compatible for both, and consumers or businesses have to choose among criteria such as speed, traffic volumes and server availability.

In the end, VPN users end up trusting such services because of their higher security and privacy needs. Encryption is elevated, anonymity is preset and taken care of professionally, while end-users need to do less on their end. Clearly, the extra service translates into higher costs for consumers.

Either one can claim to be faster but that depends on a number of technological and even local factors. Private proxies (see residential below), however, have a proven track record of higher speeds than those available via VPNs. Proxies tend to be used for rerouting traffic coming from a browser or a specific app. Thus, they end up being more responsive, easier and faster to use on an application basis. Businesses and individual researchers and professionals use them for automating a number of projects: from web data scraping to social media and various bot tasks. Being light, fast and uninterrupted is essential for such objectives and proxy network connections check all the boxes.

Once again, for the purposes of this publication, we ultimately designate both kinds of connections with the same appellations – proxy services or proxies.

Why Are Proxies Increasingly Popular?

Proxy usage inevitably ends up impacting some characteristics of online markets and their perception on behalf of businesses. Put simply, proxies might distort some of the relevant statistics and commercial insights (e.g. traffic, demand and other indicators). Then again, what is working in favor of online enterprises is the fact that big data tends to stabilize and even out outcomes in the long run, particularly for mainstream products and services.

One thing begins to transpire from what we have presented so far – there are two major underlying motivations for considering and using proxies. Anonymity and content access are not only driving proxy demand, they are the main variables that need to be taken into account when performing market analysis.

These “needs” are met by the millions of proxy servers operating at any given time. Understandably, most end-users like to start their experience with an open proxy, a server freely accessible by anyone that has its coordinates and parameters. No rules and requirements, on the other hand, lead to a huge amount of IP addresses handled by the same proxy servers, slowing speed and leaving users vulnerable to third-party interference.

Enterprises or individual professionals who employ proxy services tend to avoid free and lower quality servers. As the online dimension grew in commercial relevance over the past two decades, business use cases (see below example list) began having a bigger share and impact on the market, influencing the supply of higher quality proxy services. Still, private users make up the bulk of proxy usage figures globally.

Motivation for using a proxy service, 2019. Source: GWI.com

The marketing agency GlobalWebIndex carried out a survey in late 2019 among users who confirmed having used proxy services at least once in the past month. It shows the evolution of the marketing image of proxies – from a means of achieving anonymity and maintaining privacy to a valuable tool in accessing resources which would otherwise not be available. What’s more, entertainment content heads this ranking, meeting the needs of more than half of all proxy users. Access to social networks and various news outlets comes next, with just about the same weight as privacy concerns.

Proxy server connections have clearly and firmly surpassed the limited image of a privacy protection solution. They offer much more to those who are well-informed and able to use them to their full potential.

Business usage also follows that trend, as hinted above. Being able to “see” consumers from a different viewpoint (i.e. IP location) has become indispensable for researchers. The same logic is valid for positioning a marketing message in a “fenced” location or reaching out to a solution which is not otherwise available (i.e. purchasing or selling in particular markets).

The above preferences are not universal, neither for consumers nor for businesses. Quite naturally, separate markets (i.e. countries and regions) vary somewhat in their motivations. On one hand, these reflect societal preferences and the mindset of the domestic user. On the other hand, digging deeper one could add political considerations, economic development factors, business and media freedom levels and much more.

Grouped below, we can see the same two major motivational categories which determine local demand for proxy services.

Source: GWI.com, 2019

Generated with the contribution of 140 thousand users in a global online survey, the above data has substantial validity. It reveals that emerging markets have essentially different concerns when compared to “mature” digital markets. While Malaysia, Philippines, Vietnam, Thailand, Indonesia and India lead rankings in content access needs, consumers in the West (e.g. Europe, UK, Canada) are more interested in their privacy.

Demand for particular entertainment content is of paramount importance in Asia Pacific, with over 60% of proxy users in Asian democracies citing it as their top need. APAC and most other emerging markets also show quicker adoption patterns, translating into a more pro-active search for solutions. These countries have been the main driver for global growth of proxy markets for years.

Furthermore, the thirst for better entertainment content goes hand in hand with the need for more speed and connection quality, unlike encryption and other concerns related to privacy and anonymity. This makes feasible more connections that are direct, light and self-catered (while not necessarily free and low-quality).

A similar line of reasoning creates more demand for mobile apps that offer proxy services. In the second half of 2021, data provided by App Annie has shown that mobile proxy and VPN apps have seen incredible rise globally, placing in the top 10 by growth (excluding gaming).

Emerging Markets Drive Trends, Usage Growth

Regional distribution of consumer preferences merits some further exploration. It is evident by now that proxy usage is driven by fast-growing markets and their demand for content access located and offered elsewhere.

If we consider that APAC has about half of all internet users, the fact that about a third of those opt for a proxy connection is a game changer – for companies offering the contents, for marketers and third parties wishing to get a clear understanding of domestic consumption habits.

Indonesia tops rankings perennially, with proxy users nearing or surpassing 40% according to year and survey providers. India is also not far behind and some more recent reports place it on top of regional and global lists – certainly by volume and often by share of proxy users.

Considering their main motivation, many content consumers in apparently established digital markets, e.g. Europe or the US, might actually turn out to be users in the fast-growing APAC, Middle East and African (MEA) or Latin American (LatAm) markets. This might lead to trends and user needs being further overlooked, particularly when companies use “passive” data collection tools. Ultimately, the phenomenon contributes to deepening the existing digital divide by misplacing content creation and distribution efforts, ad placement and much more.

Emerging markets need better entertainment content. Source: GlobalWebIndex 2019

High-profile privacy breaches (by tech giants, government slip-ups and investigation leaks) end up disturbing public sentiment everywhere but they create more waves in the West. Privacy measures and proxy uptake end up obscuring some of those consumer preferences as well but they do not contribute to a further divide in content access. Indeed, global public awareness and proxy usage after every major privacy scandal do tend to rise but the lesser numerical weight of mature market users is overwhelmed by the need to bypass geo-blocking and other restrictions.

The above graph demonstrates that personal experiences and interests count less than market restrictions in emerging markets. The APAC region has a pronounced motivation in pursuing more and diverse entertainment content. All top markets are located in that region, with as much as two-thirds of Indonesians and Vietnamese claiming this is their key motive for proxy use.

Emerging markets peak and drive demand, trends (2019). Source: GWI.com

The above table confirms APAC countries as strong adopters and drivers for overall proxy growth. The trend is also visible in the Middle East and Africa Region (UAE, Saudi Arabia, Turkey, South Africa), and to a lesser extent in Latin America.

Popular audiovisual productions are cited as the reasons behind many of the surges in interest – with OTT services by Netflix, Disney, Hulu or Amazon Prime rarely available to those interested in emerging countries in identical or affordable terms when compared to mature markets. Sports broadcasts are also high on that list, with ESPN or Amazon as pertinent examples.

There are some obvious country specifics within the APAC macro-region as well. While more than half of all proxy users watch Netflix (and nearly 40% listen to Spotify), certain platforms have established clients only in some of the countries. Amazon is not available in China, while the Chinese market theoretically has exclusive access to platforms like iQiyi, YouKu & Tudou, Tencent Hollywood VIP and QQ Music. India has dedicated content provided by Hotstar and Sony Liv.

European and North American consumers, on the contrary, pursue and prefer anonymity over other aspects. The 44% proxy users among German online consumers is emblematic of such concerns, as Germany is the largest European economy with lots of purchasing power. Australian proxy users express similar preferences (43%).

Demographics is yet another aspect relevant in this market split – Western and mature markets present a broader age profile of users, having a greater proportion of older users with the technical means and interests of protecting their privacy. They are also less likely to attempt accessing restricted content, especially given the fact that most top-rated products are available to them and affordable according to their standard of living.

China is a case aside, as it often happens when considering free market logic. Users in the People’s Republic are particularly active via mobile proxies, as they give them access to social networks otherwise inaccessible or potentially monitored when available (41% of all proxy users cite it as a leading motive). Therefore, geo blocking tends to count as much as privacy and work-related reasons, since many are aware of the widespread tracking practice.

GWI, the same UK-based tech company cited above, reveals that throughout the series of proxy market surveys (2017–2021), the typical users tend to be young, male and likely to have a university-level education. Such demographics lead to the profile of an early adopter in many a field but are particularly determinant in digital and tech product niches.

Indeed, the majority of those using proxies say they are almost constantly online and owning the latest tech solutions is rated as rather important. They are quite knowledgeable about proxies, VPNs, encryption and various related tools, therefore needing little on the instructional end of proxy support.

Further confirmation is found in the 2019 demographic breakdown of proxy users by macro region. While the total usage share in APAC (and MEA) rose to ~34%, younger consumers (16-24 y.o.) are represented as high as 36% in MEA and 41% in APAC. Both macro-regions present sizeable younger, urban user groups. Metro users also tend to be more affluent than the average domestic consumer.

The most recent comparable data (October 2021) supports the majority of the above findings. Some of the country data, however, presents a mildly surprising updated picture of the global market.

Share of users (16-64) who use proxy services at some point. Source: Datareportal & GWI.com

At somewhat under 29%, the 2021 share of worldwide proxy users is in line with previous statistics, although it is lower than the 31% reported in 2020.

Indonesia and India remain at the top of the country ranking although they switch places as the Subcontinent comes out ahead for the first time with 43.2% in 2021. Interestingly enough, the biggest growth since 2017 has been reported in mature markets like the Netherlands (76% growth in 4 years) and Australia (a 69% rise).

Proxy Use Cases and Practical Scenarios

In pragmatic terms, end-users and most physical persons who use proxies want to hide their real IP address. Altering their manifest location allows them to achieve both of the major objectives identified above – access restricted content or navigate anonymously.

Individual users might have an infinity of case-specific applications for such behavior. Open proxies are mostly free and are a popular gateway to verify how much added value a proxy might generate for such users. However, open proxies give equal access to all connections, be those malicious or fair-use. There are some server administrators who, to the best of their ability, install monitoring tools and warning mechanisms against hackers, viruses and other kinds of breaches. Nevertheless, most open proxies are vulnerable to both automated and man-made malicious interventions against the server and the connected users.

Closed proxy servers step up security procedures by setting up online user communities which are more exclusive. Addresses, passwords and pre-defined network settings are known to community members only. Such information is shared after member application, verification or other form of formal acceptance. This allows proxy server administrators and the very communities to agree on customizations and other common interests, settings and functional changes. Unverified actors and identities are kept out and problems minimized; when the latter occur, they are tracked more easily.

For business use cases, however, both scenarios are not optimal. Online enterprises have highly competitive needs related to projects addressing market intelligence, digital asset development or brand security.

Database scanning, for example, does not really need anonymity at all costs but requires remote access efficiency and getting back meaningful results. The same is true for market research (e.g. verification and placement of ads, marketing intelligence, competitor benchmarking), where proxies make remote and masked access possible. These are all cases when technology delivers pertinent insights and useful business information without location restrictions.

Additionally, proxy connections facilitate tasks like trading, UX development and verification. However, one of the most commercially significant applications of proxy servers is their use in web scraping.

Web Scraping – an Industry Standard

Web scraping might also be defined as crawling in order to extract data at a large scale. Naturally, this entails task automation and a highly professional approach. In this context, proxy server management represents a cornerstone element in setting up and operating an efficient web scraping project.

Businesses and professionals with the necessary expertise in “spider” setup (logic), maintenance and output commercialization tend to delegate proxy support to external providers, more often than not. Thus, they avoid handling vertical risks and other time-consuming tasks indirectly related to the core nature of the project.

In its essence, web scraping is performed through repeated requests sent via a pool of proxy IPs. Request rates are limited through delays, while proxies that get banned by the target domains are “discarded” for the purposes of the project. Advanced users give more dynamic instructions to their proxy service provider, often via dedicated API software. The main parameters involve regional and domain positioning or other mimicked characteristics of the requests sent.
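The mechanism described above (a pool of proxy IPs, request delays and the discarding of banned proxies) can be illustrated with a minimal Python sketch. It is not a reference implementation: the ProxyPool class, the scrape function, the proxy labels and the choice of HTTP status codes 403 and 429 as a sign of a ban are all assumptions made for illustration, and the fetch step is a stub so that the example runs without any network access.

```python
import time
from collections import deque
from typing import Iterable, Optional

# Assumption: these status codes are treated as a sign that the proxy was banned.
BAN_STATUSES = {403, 429}


class ProxyPool:
    """Round-robin pool that sets aside proxies banned by the target domain."""

    def __init__(self, proxies: Iterable[str], delay_seconds: float = 1.0) -> None:
        self.active = deque(proxies)
        self.discarded = []
        self.delay_seconds = delay_seconds

    def next_proxy(self) -> Optional[str]:
        """Return the next active proxy, or None when the pool is exhausted."""
        if not self.active:
            return None
        proxy = self.active[0]
        self.active.rotate(-1)  # move the proxy just handed out to the back
        return proxy

    def discard(self, proxy: str) -> None:
        """Take a banned proxy out of rotation for the rest of the project."""
        if proxy in self.active:
            self.active.remove(proxy)
            self.discarded.append(proxy)


def scrape(urls, pool, fetch, sleep=time.sleep):
    """Request every URL through the pool; fetch(url, proxy) returns a status code."""
    results = {}
    for url in urls:
        results[url] = None  # stays None if every proxy ends up banned
        while (proxy := pool.next_proxy()) is not None:
            status = fetch(url, proxy)
            sleep(pool.delay_seconds)  # request delay between consecutive calls
            if status in BAN_STATUSES:
                pool.discard(proxy)
                continue  # retry the same URL through another proxy
            results[url] = (proxy, status)
            break
    return results


# Stubbed fetch for illustration only: pretends the target has banned "proxy-b".
def fake_fetch(url, proxy):
    return 403 if proxy == "proxy-b" else 200


pool = ProxyPool(["proxy-a", "proxy-b", "proxy-c"], delay_seconds=0.0)
results = scrape(["/page-1", "/page-2", "/page-3"], pool, fake_fetch)
```

In a real project the stub would be replaced by an actual HTTP client call and the delay would be tuned to the tolerance of the target domain; a managed provider would typically hide most of this logic behind its API.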

Whether it’s retail platforms, news outlets or other big data sources, the targeted domains provide valuable information on trends, products, consumers and competitors. A pool of well-managed proxies allows crawling to remain sustainable over time and provides more reliable results through diversification and backup connections. Crucially, proxies allow many concurrent sessions to one domain, something which greatly improves efficiency.

At this point, proxy-enabled scraping is an industry standard for online businesses which invest in tech-intensive intelligence and operations. Further examples include mobile IPs used to get insights on mobile-first platforms and markets (see more on mobile proxies below), particularly valuable when engaging in transactions with online retailers.

There have not been any indications in legislative or court practices that using a proxy is illegal. This is crucial for reputable businesses and for the digital sector as a whole. The same logic holds true for web scraping – the worst-case scenario is that it goes against internal company policies or the policies of targeted domains.

When dealing with copyrighted or sensitive material, or performing tasks which are illegal in their very nature, we have an entirely different scenario. That is to say, public data is not protected, with or without a proxy connection. Contrarily, accessing protected information and other illegal endeavors cannot be made legal simply by hiding one’s IP or identity.

The tech industry does expect data extraction to continue growing for the foreseeable future. Being highly automated, it requires few resources (especially for larger companies) and even fewer personnel. Web scraping is regarded as the “alternative” market data that helps not only current operations but also strategy planning and investment decisions.

Proxy Setup Options

Proxy server configuration can be performed manually, via automated tools or other dedicated software interface. The basic methods of making sure that the intermediary (server) really hides the original IP address and correctly executes user requests are structured around these main approaches to proxy setup.

Individuals and freelance professionals often choose manual configuration. The standard procedure entails adjusting the settings of the application that will actually send the requests, predominantly a browser.

Whether it’s Chrome, Firefox or another browser, proxy setup is done through the Options menu, looking for LAN settings or manual proxy settings. Besides the actual IP address of the proxy server, users need the port number. Correct entry of the above should preferably be checked through IP verification tools that are able to detect trouble-free proxy connections.
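The same two values, the proxy IP address and the port number, are all that a script needs for a manual setup as well. The following is a minimal sketch using Python's standard urllib.request module; the helper names build_proxy_map and build_opener are our own, and the address 203.0.113.10 with port 8080 is a placeholder from the range reserved for documentation, not a working proxy.

```python
import urllib.request


def build_proxy_map(host: str, port: int) -> dict:
    """Return a scheme-to-proxy-URL mapping for an HTTP proxy at host:port."""
    if not host:
        raise ValueError("proxy host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    proxy_url = f"http://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def build_opener(host: str, port: int) -> urllib.request.OpenerDirector:
    """Create an opener whose requests are routed through the given proxy."""
    handler = urllib.request.ProxyHandler(build_proxy_map(host, port))
    return urllib.request.build_opener(handler)


# Placeholder address and port, for illustration only.
opener = build_opener("203.0.113.10", 8080)
# With a real proxy, opener.open(<address of an IP verification page>) would
# show the proxy's IP rather than the user's own if the entry is correct.
```

The validation of the port range mirrors the check a user performs by eye when typing the values into a browser dialog; the final verification step still requires an external IP-checking service.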

Users at or above a semi-professional level who understand and trust automation usually prefer it for most tasks. This might also be the case for proxy settings and other related presets. The so-called Proxy Auto-Config files (or PAC files) contain the proxy URL and port number among other prebuilt values. PAC files automate and facilitate proxy configuration – especially when a browser, application or a machine is updated or changed – so that users do not have to perform these settings themselves.
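To make the contents of a PAC file more concrete: it is a short JavaScript file defining a FindProxyForURL(url, host) function, which returns a string such as "PROXY host:port" or "DIRECT". The sketch below, written in Python for consistency with the other examples in this paper, simply renders such a file as text. The render_pac helper and the host names are hypothetical and serve only as an illustration.

```python
def render_pac(proxy_host: str, proxy_port: int, direct_hosts=()) -> str:
    """Render a minimal PAC file that sends traffic through one proxy.

    Hosts listed in direct_hosts bypass the proxy and connect directly.
    """
    lines = ["function FindProxyForURL(url, host) {"]
    for name in direct_hosts:
        # dnsDomainIs is one of the helper functions available inside PAC files.
        lines.append(f'  if (dnsDomainIs(host, "{name}")) return "DIRECT";')
    # No "DIRECT" fallback here: if the proxy is down, requests fail instead
    # of silently exposing the user's own IP address.
    lines.append(f'  return "PROXY {proxy_host}:{proxy_port}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical host names, for illustration only.
pac_text = render_pac("proxy.example.net", 3128, ["intranet.example.net"])
print(pac_text)
```

Because the browser executes whatever such a file contains, the integrity of the PAC file and of the location it is served from matters as much as the proxy itself.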

Because PAC files automatically redirect the application to the proxy URL, there may be certain risks connected with malfunctioning or malicious redirects. For instance, when malware enters a machine, it could subsequently alter or recreate a PAC file, leading the host application to redirect to a phishing website or other malicious domain.

We discuss briefly some of the main risks tied to proxy usage below. Yet, inevitably, PAC deployment requires better cyber protection or entrusting such support to a paid service. The providers of the latter will assume most if not all risks and responsibilities related to proxy setup and usage settings.

Types of Proxies

An all-around report on proxies cannot lack at least a concise look at the types of proxy services that are generally available and, given the particular market demand and specifics, most sought-out. Cost-benefit comparisons of proxy connections abound, yet these are not the focus of our paper.

The simple task of performing a request and pulling a web resource can be done through a free proxy. Free and open web proxy services have their “fan base”, albeit among casual, mostly non-professional and technically less knowledgeable users.

With free proxies one does not typically know who operates them, whether it is a company, a criminal enterprise or even an intelligence agency. Trust issues are only natural, therefore, as hosts could see all the online activity, while they ask for nothing in return.

Residential vs Data Center Proxies

Users that have higher expectations about performance and quality standards can choose from a multitude of paid proxy services, providers and connection types. As expected, these offer more guarantees and better support along with the usage fees, typically based on traffic volumes.

Datacenter proxies are the most common type on the market. They give access to an IP address of a server which accommodates many other users. Still quite affordable as a solution, data center proxies allow the integration of many other business and personal solutions, from the very proxy management interface to a web crawling bot and much more.

Residential proxies, on the other hand, are the IPs of private users, based and operated within a residential network. Residential proxies are also associated with local ISPs. They are traceable to a physical location, if need be, and enable rather smooth and reliable online operations.

As they are more expensive to obtain, providers pass on these costs to proxy users. Despite that, they are in constant demand, since an individual’s network activity raises fewer concerns and is less often subjected to countermeasures by targeted domains and resources. Web servers “trust” them and are less likely to block or ban the IPs in question.

Residential proxies, in turn, can be segmented in two major types – static and rotating. Static proxy servers assign a single residential IP address to connected users. While relatively secure, they are easier to track down and access from an external location since they function via a single source. These features make them also more vulnerable to being flagged by cyber-defense systems for persistent questionable online behavior.

Rotating proxies, on the other hand, enable users to obtain different IPs at regular or preset intervals. The “rotation” is possible when there is a pool of proxy addresses readily available and assigned with every new user re-connection. This methodology makes rotating proxies more secure since their changing values make them harder to track.

All residential proxies entail an agreement by the device owner (with the app or software operator) to install software which allows other peers to connect to and through their device. In turn, these owners receive payment for the number of users or the volume of traffic handled through the peer/proxy access program.

Ultimately, data center proxies are cheaper and satisfy most of the mainstream needs and tasks. Residential IPs are prized, more reserved and better serviced for those who have more extensive or particular requirements.

Mobile Proxies Are the New Big Thing

Mobile proxies are essentially a popular sub-type of the “residential” category. These IPs belong to actual private mobile devices and are rarely used for big-data research or automated tasks at scale. They are more expensive and tend to be used for filtering results shown to mobile devices within a geo-fenced market.

Mobile proxies operate through mobile data (3G and above) and respond via authentic IP addresses of smartphones or tablets. Researchers and tech professionals use them on more powerful machines (desktops) to achieve a mobile “appearance”, develop and test mobile-first projects – UX, apps, ad integration and other products and services.

Given the importance of mobile devices on most markets (with many APAC countries recording market shares above 90%, e.g. India), mobile proxies have become an indispensable part of market research and intelligence for most tech-intensive enterprises.

The Benefits of a Rotating Proxy

With the above summary categorization in mind, it is readily apparent that rotating residential proxies offer the most advanced functionalities and performance standards. These have a set of parameters and settings which are customarily handled through a proxy management program.

The validity of an IP within a pool of rotating residential proxies could be set up for 1, 5, 10 or 30 minutes at a time. Session persistence (the so-called sticky session) rarely has a random duration and the frequency of IP changes is selected according to the sensitivity of tasks and target domains.
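To make the idea of a fixed validity window more concrete, the following is a minimal Python sketch of a sticky session: one IP is kept until its window expires, then the next address in the pool takes over. The class name `StickySession`, the `ttl_seconds` parameter and the documentation-range addresses are hypothetical and do not reflect any particular provider's software.

```python
import itertools
import time


class StickySession:
    """Keep one proxy IP for a fixed validity window, then move to the next."""

    def __init__(self, pool, ttl_seconds, clock=time.monotonic):
        if not pool:
            raise ValueError("proxy pool must not be empty")
        self._cycle = itertools.cycle(pool)  # endless round-robin over the pool
        self._ttl = ttl_seconds              # e.g. 60, 300, 600 or 1800 seconds
        self._clock = clock                  # injectable so the logic can be tested
        self._current = next(self._cycle)
        self._started = self._clock()

    def current_ip(self):
        """Return the session IP, switching once the window has expired."""
        now = self._clock()
        if now - self._started >= self._ttl:
            self._current = next(self._cycle)
            self._started = now
        return self._current


# Hypothetical pool; a 5-minute window corresponds to ttl_seconds=300.
session = StickySession(["192.0.2.1", "192.0.2.2"], ttl_seconds=300)
print(session.current_ip())
```

A shorter window suits sensitive target domains, while a longer one suits tasks that need a stable identity, such as a logged-in session.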

Actually, more often than not, ISPs set up residential users with a rotating IP address for their end-user needs. Session persistence in this case is rather extended, however, as users need to restart their machine or even modem to obtain a new IP. Static IP addresses are also a possibility, although these are more frequently destined for incoming commercial web requests to a business domain.

Hence, the dynamic benefits of a rotating proxy materialize only after an intentional switch of IP addresses among available proxies. This is traditionally achieved through a proxy rotator, a system that switches proxy addresses after each automated batch of bot requests, after a preset time or under other conditions. This approach prevents proxies in the same pool from being flagged by anti-bot systems for engaging in excessive and repeated requests to the target domain.
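The batch-based variant of a proxy rotator can be sketched in a few lines of Python. This is only an illustration under simple assumptions (plain round-robin, no health checks or ban detection); the function name `assign_proxies`, the addresses and the request paths are hypothetical.

```python
import itertools


def assign_proxies(batches, pool):
    """Pair each batch of requests with the next proxy in the pool (round-robin)."""
    if not pool:
        raise ValueError("proxy pool must not be empty")
    rotation = itertools.cycle(pool)
    return [(next(rotation), batch) for batch in batches]


# Hypothetical pool (documentation-range addresses) and request batches.
pool = ["198.51.100.10", "198.51.100.11", "198.51.100.12"]
batches = [["/page/1", "/page/2"], ["/page/3"], ["/page/4"], ["/page/5"]]

plan = assign_proxies(batches, pool)
for proxy, batch in plan:
    print(proxy, batch)
```

With three proxies and four batches, the fourth batch wraps around to the first address again, so no single IP carries consecutive batches. Commercial rotators typically add further conditions on top of this basic scheme.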

Proxy rotation services tend to be included in residential proxy support packages. While this may seem too much for low-key domestic proxy users, most businesses that employ web scraping and big data analysis of online markets rely primarily on such integrated services for their automated tasks.

CAPTCHA Limitations and Proxy Solutions

One example of a simple anti-bot defense is frequently called into question by users and businesses trying to automate their proxy tasks. Captchas (automated Turing tests to tell humans and machines apart) have been a major impediment to online marketers since their mainstreaming, roughly two decades ago. They could be bypassed with the contribution of cheap human labor, as well as by trying to find a glitch in the hosting system which avoids the pop-up of captchas entirely.

However, most professional automated projects choose not to rely on such approaches and turn to AI and machine learning to put together a captcha solver. Most certainly, repetitive and excessive requests to a popular website will trigger a captcha – graphic, text-based, requiring a selection click or otherwise. This is where external captcha solving services find proxies indispensable as they operate from a different IP than the one executing the vetted task.

Proxies need to deliver unsuspicious requests so that they do not get blocked before or during the verification process. Moreover, captcha developers and providers have evolved the recognition capacities of anti-spam systems and they increasingly manage to identify simple bot logic.

In practice, the so-called captcha proxies and solvers that aim to overcome the automated defenses of websites end up being capable of dealing with both reCaptcha v2 and reCaptcha v3. The former requests a single-click confirmation, at times followed by an image recognition challenge (e.g. fire trucks, planes), while the latter works in the background and requires a high enough score assigned to the visitor’s behavior. That is sufficient in most scenarios.

However, such dynamic and competitive settings make proxies facing captchas unsuitable for a number of large-scale data scraping projects, regardless of AI capacities and machine learning mechanisms. They are deployed in selected market access and online resource probing, e.g. UX projects or retail tasks. A prime example is the purchasing of high-end limited edition products to obtain and possibly resell (the infamous “sneaker bots” perform such tasks).

The above line of reasoning does not necessarily limit the quest for captcha-fenced resources to residential proxies. Data center proxy solutions are also quite relevant because of low costs and the smaller-scale request volumes usually handled. Setting up a limited range of domains and types of tasks to be performed also increases the chances of success.

Main Risks for Proxy Users

We have already made references to some of the common risks and vulnerabilities that come along with imprudent proxy usage. Consumers and businesses may end up being targeted by third-party surveillance, excessive advertisements, malware or hacker activity. The majority of these drawbacks are associated with open and free proxies that offer no protection to those connected.

Despite this being rather common knowledge, industry figures show that in developed markets such as the US and the UK up to 72% of users still choose to connect to free proxy and VPN servers. Only a third of all proxy users pay for their services (with some, apparently, employing both).

In order to upgrade security protocols, users need to consider data encryption whenever possible. Encryption is unrelated to proxies themselves and solutions are always external or additional to a proxy connection.

Moreover, the IP addresses of connected users are not hidden from the proxy operators. Unless there is an explicit agreement that guarantees higher quality standards (which, typically, comes with paid services), proxy users remain rather transparent in their behavior. Put plainly, while there might be a KYC policy for sign-ups and proxy community applications, there is rarely a similar reverse procedure to settle potential user concerns about proxy owners, operators or other third parties with extensive access rights.

This might not automatically mean that various account details, authentications and other sensitive information are exposed but a user’s digital identity is certainly at risk – again, if no express control and support mechanisms are agreed upon.

What else might happen when connecting to unverified “bad” proxies? An endless list of cyber-crimes might be associated with a user’s identity or activity but those that place people on alert are of a more subjective nature – selling their personal information, taking over their browsing sessions or infecting their devices with viruses.

As mentioned above, PAC files speed up proxy configuration and look like an attractive proposition for some users. Unverified sources, on the other hand, might infect host machines, relay fake DNS coordinates and expose users to a series of risks when connecting to a server dedicated to e-crimes.

Through collaboration with proxy industry providers, Global Web Index has compiled additional user preference statistics for Western markets. These reveal that users who trust only paid proxies place the avoidance of data sharing with third parties (54%) above all other motives for using such services. Connection quality and other performance characteristics remain second (47%), still quite relevant.

Nevertheless, marketing and sales campaigns dedicated to the same mature markets insist on qualities like “high speed” and “affordable” to lure consumers. Proxy providers are aware that privacy policies are at the top of user concerns, without a doubt. Yet they do not reveal too much on those aspects without insistent digging into small print and customer support interactions.

The majority of active users trust proxies and VPNs as possessing an in-built security that gives them impregnable protection. Around two-thirds (62%) associate proxies with being secure, above and beyond their being effective or efficient. Unfortunately this is not always the case, for all the above-listed reasons and additional requirements. Agencies and suppliers know there is a knowledge gap between perception and market reality but it is up to consumers to get proper information and avoid bad proxy service providers.

From the point of view of domains and web resources that are targeted by proxies, such servers do not have a particularly favorable reputation. Some administrators are wary of bot activity; almost all fear unlawful access, abuse and other cybercrimes. A 2021 research paper shows that the detection of unauthorized users can still present problems for many present-day cyber protection methods. Quite often, organizations (especially public) lack even the basic anti-bot detection tools, leaving masked IPs and bot activity beyond their oversight.

Businesses and researchers equipped with cutting edge computing devices and advanced neural network logic can, to a large extent, protect themselves against malicious proxy and bot access. Recent years have brought advances in this sense, as research in this field has grown, particularly driven and financed by business security needs. Certain IT solutions have shown a capacity to detect and keep out most unauthorized user access coming through proxies (the above paper cites 93.71%).

Ultimately, as proxy solutions and integrated services evolve, automated solutions and machine learning technology have been in continuous support for both ends of the proxy spectrum. As business logic commands, diminishing risks entails higher costs but is unavoidable for professional activities.

We must not forget that anonymous proxies and associated privacy services are not intended for illegal activities in the first place. Looking at earlier market segmentation data, users in emerging markets and those dominated by younger audiences tend to use them for entertainment purposes above all. Besides privacy, users in mature and post-industrial markets associate proxies increasingly with rendering their activities legal in the eyes of providers and peers. Both concepts are not completely correct without being inherently wrong. However, proxy usage alone does not legitimize anything – as observed earlier, activities and resources are legal or not according to national or international legislation which stands above any access means.

Even for users who do not intend to take part in prohibited digital behavior, connections through unverified proxies might expose them indirectly and unwittingly. As anonymous proxies have an indisputable role in cybercrimes, such activities often lead to the banning of IP addresses and servers being reported to authorities. Thus, even if a user connects to a proxy server for genuine and legitimate reasons, they might get blocked for being part of the same proxy community using particular IPs over time.

Major Markets and Top Proxy Locations

This overview of proxy markets has, so far, given us an understanding of some of the major drivers for proxy usage, as well as some basic operational protocols. While the need for more privacy or content access remains a fundamental end-user motivation (or business strategy), there are factors which influence proxy adoption at different rates in various countries and macro regions.

The fact that there are few or no concrete laws against proxy use is always an element in favor of global proxy market growth. Some countries may have vague definitions on the subject, others may have formulated rules or established practices against deceptive acquisition and delivery of goods and services.

Then again, there are some very clear and explicit restrictions imposed by governments that companies and ISPs have to follow. We have covered the concept that public security breaches, illegal acquisition of commercial products and copyright infringements cannot be legitimized by proxy usage in any way. In India, for example, authorities hand out (in theory) up to three years jail time for accessing torrent sites and blocked URL content as per current legislation.

In principle, EU countries have a completely different paradigm of geo-fencing and limitations between them. Restrictions within the common market are illegal, by definition, and some tech giants and online gaming companies have already been fined for blocking their content on purely territorial grounds. Content producers attempt to obtain higher prices from countries with higher purchasing power – something which, according to the European Union common market regulations, is not a legitimate commercial strategy.

Similar issues arise when, for example, OTT streaming users try to connect their device through a proxy or a VPN. Legally, not only within the EU, accessing the international catalogs of major global providers should not be treated as piracy, particularly if not used for commercial purposes or other re-distribution attempts that damage the company (resale or even torrent sharing). There have not been any emblematic legal proceedings against proxy users for such activities. Yet, this might be clearly stated as “undesirable” in the T&C of the agreement between companies and end-users. In the end, should the businesses possess the technical means (rather than legal) to enforce their internal policy, they simply cut off access when the platform detects unwanted behavior.

Listing the above market specifics – without going deeper into their real-world application – aims to help us narrow down the factors that shape global proxy market sizes.

Adoption rates we quoted above indicate an important starting point. Multiplying these by total online users in a given domestic market, we can outline the leading country markets. The mentioned influencing factors are harder to weigh and quantify – some proxy markets are easier to access (for end users and service providers), some have defense mechanisms and stricter response policies by both companies and authorities.
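The multiplication described above can be sketched as follows. The market names and input figures are placeholders chosen purely for illustration, not measured values, and the helper `estimate_proxy_users` is a hypothetical name.

```python
def estimate_proxy_users(online_users, adoption_rate):
    """Estimated proxy users = online users x share of users who report proxy use."""
    if not 0.0 <= adoption_rate <= 1.0:
        raise ValueError("adoption_rate must be a fraction between 0 and 1")
    return round(online_users * adoption_rate)


# Placeholder inputs for illustration only, not measured figures.
markets = {
    "Market A": (200_000_000, 0.50),
    "Market B": (80_000_000, 0.25),
    "Market C": (500_000_000, 0.10),
}

ranking = sorted(
    ((name, estimate_proxy_users(users, rate)) for name, (users, rate) in markets.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, estimate in ranking:
    print(f"{name}: ~{estimate:,} estimated proxy users")
```

Note how a large market with low adoption (Market C) can still outrank a smaller market with higher adoption (Market B), which is why both inputs matter when outlining leading country markets.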

We can still outline certain countries as high-profile examples of leading markets with significant adoption shares, size and even unexpressed potential. We have seen Indonesia and India, for example, top market penetration rankings – if not in terms of absolute quality, then at least considerably “fertile” markets for proxy server usage.

As a next step, we need to discuss quantity in overall numbers and market volumes. This could be done both on a country level, as well as a city level for the major tech hubs and user markets around the world. Quite significantly, such in-depth research has been carried out by a joint university team from South Korea and the US in 2020. Scholars have segmented geospatial distribution, market similarities and differences, absolute numbers and relative shares, as well as interesting features like blacklisting rates.

Similar research studies are also conducted by business-sponsored teams, particularly on spam and malicious behavior responses and the significance of proxy usage on the commercial performance of the main industry actors. Naturally, few to none of those organizations share such primary data.

The publicly accessible analysis cited above emphasizes the fact that, in essence, open and residential proxies can be used for the same or at least similar objectives. Even so, more advanced users know their distinct features and employ them according to current project needs. In turn, this shapes the proxy market differently and the online ecosystem responds in different ways to these two major categories (e.g. detection, blocking, spam and bot use).

What we revealed earlier is addressed as well – the so-called “country-level characteristics” that influence market adoption and volumes. The most relevant among these are, namely, Internet freedom (or censorship), political stability and GDP (per capita).

The following figure represents the open proxy distribution according to absolute numbers – i.e. market volumes:

The below illustration represents the market volumes for residential proxies and their global distribution by country:

Data source: Choi et al, 2018 (South Korea, USA)

We can clearly see that China and the USA host a large portion of open proxy servers, 29% combined. However, both global powers are out of the top 10 ranking for residential proxies. In a contrasting manner, Turkey hosts a large number of residential proxies and almost no open ones (528 thousand to 5 thousand, respectively). We can see the top 10 lists in both categories below.

Leading countries hosting proxy servers. Data: Choi et al.

The analyzed figures include nearly 7.5 million IP addresses. Interestingly, there are around 21 thousand IPs that are part of both datasets (open and residential), yet that is merely 0.3% and does not alter any major conclusions.
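The overlap share quoted above can be checked with a short calculation. The inputs are the rounded figures from the text ("nearly 7.5 million" and "around 21 thousand"), so the result is an approximation.

```python
total_ips = 7_500_000      # "nearly 7.5 million", rounded as in the text
overlapping_ips = 21_000   # "around 21 thousand", rounded as in the text

overlap_share = overlapping_ips / total_ips
print(f"{overlap_share:.2%}")
```

The ratio comes to roughly 0.28%, in line with the "merely 0.3%" stated in the text.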

The open proxy datasets are easier to identify, despite the fact that a large proportion of those may be unresponsive to connection requests at a given time. There is a sufficient number of public IP lists that identify open proxies. Additionally, over 13 million open proxy requests have been monitored over a 50-day period. Residential proxies are more difficult to obtain and an earlier research piece has been used for the purposes of the study. The combined IP list contains parameters allowing its geospatial analysis and subsequent distribution mapping.

Industry estimates speak of over 100 million available proxy IPs altogether – serving over a billion users given the above-listed market penetration figures for almost 5 billion online users. The 7.5 million IPs in the cited study certainly do not represent an exhaustive list of proxies but are assumed to be proportional to their country and city distribution. As per the Law of Large Numbers, and provided the sample was collected without systematic bias, a data set of this size should not deviate substantially from the actual spatial distribution of proxies and, therefore, the resulting trends and conclusions should be valid on a global scale.

Hence, besides China and the US being open proxy leaders, Turkey and India stand out as the countries hosting the highest number of residential proxies (a total of over 15%). Russia and especially Ukraine also surpass world averages but probably not tech-industry (and commonplace) expectations.

Moreover, while open proxy distribution is considerably skewed towards the market leaders (and particularly the top 2), residential ones are spread out among European, ex-Soviet territories and the Americas. The top 10 open proxy countries add up to roughly 70% of such IPs, while residential proxies in the top 10 nations are less than half of all (46.8%).

Looking at a city-level distribution, we see metropolitan areas like Bangkok and Amsterdam excel among open-proxy locations (with nearly 13% of the global total combined). However, most open proxies remain situated in Chinese cities, despite the fact that only Hangzhou is in the top 10.

As for residential proxies, Istanbul and Ankara amass nearly 89% of Turkey’s share (already a global leader), ending up 1st and 3rd in the rankings.

Open proxy distribution by city:

Residential proxy distribution by city:

Source: Choi et al.

In the end, the current proxy market overview has much more to gain from country-level distribution and macro-regional specifics rather than city-level segmentation.

Among the larger markets, China is a case aside. Search volumes, services, market access, content delivery specifics and many other limitations make it a very particular market, especially impenetrable to non-speakers of Sinitic languages. Perceptibly large (with around 300 million active proxy users and many more potential ones), it is a challenging territory to describe in plain terms and we will limit our observations to its dominant global share of open proxy servers.

India has the second largest share (and number) of residential proxies, with four of its tech hubs within the leading 30 cities. Bengaluru alone hosts an estimated 2.15% of the world’s residential proxies. The below graphic illustrates the residential proxy distribution in India among its major cities:

The larger circles, as anticipated, show higher concentration in Bengaluru, NCT Delhi, Mumbai and Hyderabad.

Bengaluru, on the other hand, has as high as 99.91% of the local residential IP addresses blacklisted by at least one content-delivery platform (see below), mostly by foreign domains. India’s leading tech hub also has more than 41% of its residential proxy IPs listed as used for proven spam activities, with a further 7% estimated to be “vulnerable” to future spam campaigns. Both figures are world-highest in their category. This clearly shows a certain ethical devaluation of local IPs and their overexploitation, particularly by tech businesses and various automation projects.

As for straightforward malicious attacks (e.g. hacking, viruses), the analysis of world averages shows that under 7% of open proxies and merely 0.27% of residential ones end up being used for such actions.

We have seen Indonesia stand high in terms of market penetration – with nearly 200 million online users, almost half admit to using proxies. Clearly, Indonesians needing a proxy connection to access foreign-based content do not do it with domestic servers. However, verified local datacenter proxies provide the necessary speed and stability that are in demand for some of the major sites – besides most global leaders, local favorites include the likes of tokopedia.com, hotstar.com and shopee.co.id.

Germany is another quite relevant market in terms of purchasing power, Internet freedom and privacy awareness. Notably, it is the top EU country on the residential proxy list with 3.65% of global IPs in the category. The EU also has some of the best privacy protection regulations under its renowned GDPR laws. This goes to show that privacy does not have to be compromised (and entertainment access limited) for users to seek more of the same.

On the other hand, often, some of the most internet-censored countries have less penetration of privacy tools and proxy access than others. Examples like North Korea are probably inapplicable but those of Iran (a large nation yet barely in the top 25), Eritrea and Turkmenistan (nowhere near the leaders) illustrate that there are many variables to building a strong domestic proxy server following and physical presence. Once more, China is likely to skew some of the rankings in this group of countries, as it boasts an immense tech-savvy population despite the presence of strong online censorship.

Matching these proxy market figures against blacklist services, the analysis brings up country-level indicators on proven spam, attack and vulnerability characteristics of local proxy IPs.

China, expectedly, has the most blacklisted IPs, as much as 94.24% of its open proxies. It also has the most proxies with proven involvement in spam and attack activities, ranking second only for IPs that remain vulnerable to future spam actions (despite those representing less than 1%).

Iran is the 10th most blacklisted country in terms of open proxies (~93%), while it rises to 6th in proven attacks and overall current vulnerability. Thailand (99.5%) and Taiwan (98%) also have an impressive share of their open proxies blacklisted. The USA is third in absolute numbers but these are “only” 55.5% of its open proxy IPs, making it the least blacklisted in the top 10.

As for residential IPs, 8 of the top 10 have more than 90% blockage of existing proxies, save for Australia and Ukraine. Indonesia, Turkey, Mexico and Germany all stand at above 95% blacklisting. Turkey alone hosts over 9% of the world’s blacklisted residential proxies.

Four countries are in the top 10 of both open and residential proxy blacklisting: Indonesia, Russia, Brazil and Australia. Thailand, Vietnam and Mauritius have provided the most proven attacks. India, Indonesia and Australia have the highest number of vulnerable IP addresses.

Proxy Market Prospects

The well-worn understanding that proxy servers improve user security has been long outgrown in the industry, yet it continues as the main mantra when addressing online communities. In fact, proxies do improve privacy but often need to be complemented by additional services and professional support. Many unaware consumers end up on the receiving end of malicious behavior and other forms of online abuse.

Considering, on one hand, the distinct operational settings of open and residential proxies and, on the other, actual market demands and socio-economic factors, the proxy market is characterized by a complex country and regional distribution. Political, economic and technological developments all add up to a better understanding of the dynamics of proxy market growth. The ambition of this overview is to help consumers and businesses appreciate proxies for their role in the evolution of the global online ecosystem.

We have seen that many of the national jurisdictions that impose stringent online censorship rules are proportionally more influential in the proxy market – both in terms of demand and supply. Among global leaders, Indonesia, India, Saudi Arabia, Turkey, Thailand and Vietnam all score below 50 in their Freedom on the Net score, making proxy usage an obvious choice for their large and technologically well-supplied consumer groups.

Last year’s GWI user penetration ranking traces the substantial growth in most of the leading proxy markets. Government campaigns and new laws (see the above Freedom House report for more details) have had contrasting effects – crackdowns in Turkey, China and Egypt have blunted the emergence of proxy usage; Russian laws have been largely ineffective and proxy usage reported substantial growth (still below 30%).

Proxy penetration among domestic Internet users – leading countries, 2021. Data: GWI.com

Somewhat surprisingly, countries with admirable levels of internet freedom posted double-digit growth, particularly in Europe (e.g. the Netherlands) and Australia. As detailed in this report, content access and online freedom have never been the issue there, rather the awareness of government and business surveillance that is urging users to take measures against ISPs, public agencies, hackers or even advertisers.

Therefore, low proxy and VPN adoption rates can serve as a good measure for a currently unrealized market potential in many countries. Privacy concerns are reported as high in nations like Hong Kong, South Korea, Spain or Israel, yet there is a considerable spread between such knowledge and the low adoption rates in these markets.

The saturated Indonesian and Indian markets, on the contrary, indicate that proxy usage is part of the very online culture of these markets.

While content access demand has spurred the growth of proxy markets in emerging markets, online privacy concerns tend to be cyclical and mostly sensitive to specific events, whether regional or global. Breaches in Facebook security, the revelations of Edward Snowden, various financial paper leaks, the Cambridge Analytica scandal and repetitive bans in China and elsewhere – all these have led to surges in interest towards privacy tools, only to subside gradually after a few weeks.

In the end, proxies are not as frequently used as ad blockers, private browsing sessions or cookie deletion. While there are some easy-setup incentives like PACs and browser extensions, proxies are still the least integrated and user-friendly among privacy and anonymity tools.

Despite that, the 14% overall growth in the past 4 years speaks volumes about the proven effects of proxies, when employed and set up well. Other privacy tools have not grown in the meantime, or not as much, indicating that consumers are gradually acquiring the know-how and habitual practice of considering more advanced privacy tools.

The undeniable and continuous awareness growth among online users about the importance of their privacy and protection leads industry experts to project a lasting and sustainable expansion of proxy markets. Users who generate demand for richer entertainment offers and internet freedom, similarly, can fuel the rise in proxy and VPN adoption across the globe.

Even when consumers are not conscious and knowledgeable enough on how to get quality proxy services, the penetration and popularity of the latter is only likely to grow, get more automated and possibly more user-friendly.
