Downloading bulk WHOIS data: why and how?
In spite of all its known deficiencies, WHOIS data still play a crucial role in areas such as IT security, marketing research, scientific investigations of the Internet and the World Wide Web, and the domaining business. They provide essential information on the ownership and registration details of a domain, including relevant dates such as those of registration and expiry.
It is hard to obtain WHOIS data directly. The WHOIS protocol itself, as defined by RFC 3912, is rather archaic and has its limitations. For instance, it returns data as free-form text whose layout is not strictly defined. There have been several proposals to overcome this issue by replacing WHOIS with a newer protocol that provides structured data (e.g. RDAP), but none of them has taken off so far. Hence, even to obtain WHOIS data in a structured form that needs no further parsing, one has to rely on services providing appropriate APIs.
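To illustrate why loosely defined text needs parsing before it can be used, here is a minimal Python sketch that collects "Key: value" lines from a raw WHOIS response. The function name parse_whois_text and the sample response are invented for illustration; real servers differ in field names, comment markers and layout, so this is a starting point under those assumptions rather than a general parser.

```python
from typing import Dict, List


def parse_whois_text(raw: str) -> Dict[str, List[str]]:
    """Collect 'Key: value' lines from a raw WHOIS response.

    Hypothetical helper: it assumes one field per line and treats
    lines starting with '%' or '#' as comments.
    """
    fields: Dict[str, List[str]] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "%#":
            continue
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue
        # A key may repeat (e.g. several name servers), so keep a list.
        fields.setdefault(key.strip().lower(), []).append(value)
    return fields


# Made-up sample text, not the output of any real WHOIS server.
SAMPLE = """\
% Illustrative response only
Domain Name: EXAMPLE.TEST
Creation Date: 2015-03-01T00:00:00Z
Registry Expiry Date: 2026-03-01T00:00:00Z
Name Server: NS1.EXAMPLE.TEST
Name Server: NS2.EXAMPLE.TEST
"""

record = parse_whois_text(SAMPLE)
print(record["domain name"], record["name server"])
```

Even this small sketch has to guess at comment markers and repeated keys, which is the kind of server-specific detail that makes large-scale parsing laborious.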
Another issue with WHOIS is that even though it is an open, distributed database, the operators of WHOIS servers have considerable freedom to limit the number or frequency of queries. If you just need the data of a few domains occasionally, there is no problem. However, if you need data on many domains within a short period, you quickly run into these limits. Again, using an API can be a solution. In some cases, however, it is tempting to maintain your own database of WHOIS data so that you do not depend on an API provider at all.
There are situations, however, when you need a large, possibly complete set of WHOIS data. Any complex research into trends or structures in registrations, whether for marketing, domaining, IT security, legal investigation, or even fundamental research, is typically a big data task with WHOIS data as input. This frequently also involves historic data, which are not available at all in the WHOIS system by default.
In several applications, statistics of domain registrations are sufficient to obtain the relevant information. These are also provided at domainnamestat.com, optionally even in the form of custom reports. But what if you have a particular, more sophisticated research goal that needs special tools to process the data? You may want to use a big data tool (such as Solr or Elasticsearch), or some custom piece of analytical software, to find the information you really need. There is no other option here: you need a complete set of WHOIS data, possibly including the historic part, available locally on your infrastructure.
DomainNameStat.com is there to provide you with these data. We have specialized in collecting and normalizing WHOIS data for several years. This activity requires a substantial infrastructure as well as special expertise and the development of custom software for WHOIS data collection and parsing. Many data sources have to be consulted, including WHOIS servers, proxies, zone files, DNS sensor networks, etc.
As a result, we can provide complete and structured WHOIS databases, including historic data, for download. Using our download services you can find the data you need. The data sets are organized along various aspects to meet your requirements. The level of detail ranges from listings of newly registered or dropped domains to fully detailed WHOIS information. The data are available for generic top-level domains as well as country-code top-level domains, possibly categorized by registrant country. We release quarterly WHOIS databases as well as daily updates.
You can also choose from different data formats. We provide MySQL dumps, as this popular relational database management system is very widely used. The simplest and most portable format, however, is CSV (comma-separated values). These files can be processed directly and imported by virtually any database management system or big data tool, including Elasticsearch or Solr.
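As a rough illustration of how portable CSV is, the following Python sketch loads a CSV file into a local SQLite table using only the standard library. The column names (domain, created, expires, registrant_country), the table name and the sample rows are assumptions made for this example; the actual layout of the downloadable files is described in their documentation and may differ.

```python
import csv
import io
import sqlite3

# Hypothetical column layout, assumed for illustration only.
COLUMNS = ("domain", "created", "expires", "registrant_country")


def load_whois_csv(csv_file, conn: sqlite3.Connection) -> int:
    """Load rows from a CSV file object into a 'whois' table.

    Returns the number of rows read from the file.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS whois ("
        "domain TEXT PRIMARY KEY, created TEXT, "
        "expires TEXT, registrant_country TEXT)"
    )
    reader = csv.DictReader(csv_file)
    rows = [tuple(row[col] for col in COLUMNS) for row in reader]
    # INSERT OR REPLACE lets a later file overwrite earlier records
    # for the same domain.
    conn.executemany("INSERT OR REPLACE INTO whois VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Made-up sample rows standing in for a downloaded file.
sample_csv = io.StringIO(
    "domain,created,expires,registrant_country\n"
    "example.test,2015-03-01,2026-03-01,US\n"
    "another.test,2020-07-15,2025-07-15,DE\n"
)

conn = sqlite3.connect(":memory:")
loaded = load_whois_csv(sample_csv, conn)
by_country = dict(
    conn.execute(
        "SELECT registrant_country, COUNT(*) FROM whois "
        "GROUP BY registrant_country"
    )
)
print(loaded, by_country)
```

For a real file, the io.StringIO object would be replaced by open(path, newline="", encoding="utf-8"), and the in-memory database by a file on disk.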
The data can be downloaded via the web or via FTP. Naturally, we provide detailed documentation for the data, along with client-side scripts that help with downloading the data, loading them into databases, etc. We have numerous clients worldwide, ranging from independent IT security investigators through scientific research groups to big companies, who have used our data successfully in many of the aforementioned applications. So if you are in need of bulk WHOIS data, you may also become one of them.