Bulk Data Files
Shodan provides a few different datasets as bulk data files:
- banners-daily, banners-hourly: contain all the banner/service information that crawlers collected during a given day/hour. Each file is compressed using zstd and contains a single JSON-encoded banner per line. The most recent 30 days are always available for download. This data powers the /shodan/host/ set of API endpoints. Visit the data dashboard to get a sense of what the latest banners-daily file contains.
- raw-daily: the legacy dataset containing the banner data. It is compressed using gzip. We continue to support this dataset, but for new projects we recommend the banners-daily/banners-hourly datasets.
- dnsdb: DNS data gathered using OSINT techniques. This data powers the /dns/domain/ endpoint of the API.
- internetdb: SQLite database file that contains minimal service information but is small enough to fit into memory. It powers the InternetDB API.
- cvedb: SQLite database containing information about the CVEs published to NVD. It powers our public CVEDB API.
- internet-scanners: contains a list of IPs that have been observed scanning the Internet within the past 24 hours. This data is used to add the scanner tag on the banners.
- ping: a CSV containing the results of a ping sweep of the Internet.
- whoisdb: MMDB database file containing Whois information for all IPs on the Internet.
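Because the banner files hold one JSON-encoded banner per line, they can be processed as a stream instead of being loaded whole. The following is a minimal sketch, not an official client: it assumes a gzip-compressed file (as used by raw-daily) that follows the same one-banner-per-line layout, and the function name iter_banners is purely illustrative. For the zstd-compressed banners-daily/banners-hourly files, a zstd decompressor would be needed in place of gzip.

```python
import gzip
import json
from typing import Iterator


def iter_banners(path: str) -> Iterator[dict]:
    """Yield one banner (a dict) per non-empty line of a gzip-compressed file.

    Assumption: the file contains one JSON document per line.
    """
    # gzip.open in text mode decompresses on the fly, so the whole file
    # never has to be held in memory at once.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

A caller would loop over iter_banners("2021-01-26.json.gz") and handle each banner as it arrives, which keeps memory use flat even for files that are hundreds of gigabytes in size.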
Bulk Data API
The Bulk Data API methods provide a programmatic way to discover and download all the raw data files that Shodan generates. The data itself is stored in the cloud for optimized delivery across regions. The current API methods are documented on the developer website.
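As a rough sketch of how these methods might be called from a script, the snippet below builds a request URL and decodes the JSON response using only the Python standard library. The base URL https://api.shodan.io and the key query parameter are assumptions here and should be checked against the developer documentation. The helper names build_url and get_json are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL; confirm it in the developer documentation.
API_BASE = "https://api.shodan.io"


def build_url(path: str, api_key: str) -> str:
    """Return the full request URL for an API path such as '/shodan/data'."""
    # Assumption: the API key is passed as the 'key' query parameter.
    query = urllib.parse.urlencode({"key": api_key})
    return f"{API_BASE}{path}?{query}"


def get_json(path: str, api_key: str):
    """Send a GET request to the given API path and decode the JSON body."""
    with urllib.request.urlopen(build_url(path, api_key), timeout=30) as resp:
        return json.load(resp)
```

With a valid key, get_json("/shodan/data", api_key) would return the list of datasets, and get_json("/shodan/data/raw-daily", api_key) the list of files in that dataset.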

The /shodan/data method returns a list of available datasets and metadata about them:
[
  {
    "scope": "monthly",
    "name": "internetdb",
    "description": "Minified database containing network information about all IPs on the Internet"
  },
  {
    "scope": "monthly",
    "name": "dnsdb",
    "description": "DNS data for active domains on the Internet"
  },
  {
    "scope": "daily",
    "name": "banners-daily",
    "description": "Data files containing all the information collected during a day"
  }
]

The /shodan/data/{dataset} method returns a list of URLs that can be used to download the files within a dataset. For example, the following shows part of the response for the /shodan/data/raw-daily request:
[
  {
    "url": "https://...",
    "timestamp": 1611711401000,
    "sha1": "5a91f49c90da5ab8856c83c84846941115c55441",
    "name": "2021-01-26.json.gz",
    "size": 104650655998
  },
  {
    "url": "https://...",
    "timestamp": 1611655444000,
    "sha1": "ea29acc25fc154ac64dde0ab294824ae7f1f64c9",
    "name": "2021-01-25.json.gz",
    "size": 152517565458
  },
  {
    "url": "https://...",
    "timestamp": 1611540775000,
    "sha1": "aed18f2a952df7731fec447d81ead8a96907000d",
    "name": "2021-01-24.json.gz",
    "size": 161275556509
  },
  ...
]

Downloading the Data
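Before downloading, the file listing returned by /shodan/data/{dataset} can be inspected to decide which file to fetch and how much disk space it needs. The sketch below works on the already-decoded listing (a list of dicts). It assumes that timestamp is a Unix timestamp in milliseconds and that size is given in bytes. Neither is stated explicitly above, and the helper names are illustrative only.

```python
from datetime import datetime, timezone


def newest_file(files: list[dict]) -> dict:
    """Return the listing entry with the largest timestamp."""
    return max(files, key=lambda entry: entry["timestamp"])


def to_utc(timestamp_ms: int) -> datetime:
    """Convert a millisecond Unix timestamp (assumed unit) to a UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def total_size_gib(files: list[dict]) -> float:
    """Sum the 'size' fields (assumed to be bytes) and express them in GiB."""
    return sum(entry["size"] for entry in files) / 2**30
```

Applied to the sample response above, newest_file would select the 2021-01-26.json.gz entry, whose url field is then the value to hand to the download tool.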
The Bulk Data API files are hosted on Backblaze B2, which supports downloading data in chunks, so you can use multiple connections to download a single file. Taking advantage of this significantly speeds up downloads, especially as the bulk data files continue to grow. The recommended tool for downloading the data is aria2c. The following sample command uses aria2c to download a file with 4 concurrent connections to the server:
aria2c -x 4 -s 4 -o filename.json.gz http://<bulk-data-url>

The aria2c process will pre-allocate the entire data file and then fill in the data as it is downloaded.
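When downloads are scripted, the aria2c invocation can be assembled from a file listing entry, and the finished file can be checked against the sha1 value from the listing. The following is a sketch under the assumption that the sha1 field is the hex-encoded SHA-1 digest of the file exactly as downloaded. The function names are hypothetical, and only the aria2c options shown above (-x, -s, -o) are used.

```python
import hashlib


def aria2c_command(url: str, output: str, connections: int = 4) -> list[str]:
    """Build the argument list for the aria2c command shown above."""
    n = str(connections)
    return ["aria2c", "-x", n, "-s", n, "-o", output, url]


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so very large files never sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The command list can be passed to subprocess.run(..., check=True). Afterwards, comparing sha1_of_file(output) with the entry's sha1 value gives a simple integrity check before any processing starts.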
Quickstart
If you’re just getting started and want to try out the Bulk Data API then check out the Shodan CLI. It supports all the Bulk Data API methods. For example, to get a list of the available datasets:
shodan data list

To get a list of files for a dataset:
shodan data list --dataset=banners-daily

And then to download a specific file within a dataset:
shodan data download internetdb internetdb.sqlite.bz2

However, the shodan data download command downloads the data using a single connection, which will be significantly slower than using a tool such as aria2c. Below is the equivalent aria2c command to download the InternetDB SQLite file using 4 concurrent connections:
aria2c -x 4 -s 4 -o internetdb.sqlite.bz2 https://f001.backblazeb2.com/b2ap...

Useful Links
- Datapedia: https://datapedia.shodan.io/
- aria2c: https://aria2.github.io/
- Developer documentation: https://developer.shodan.io/api
- Postman collection for REST API: https://www.postman.com/shodanhq/workspace/shodan/folder/5677612-ed460277-6845-4a40-9f5e-ba803cfa9f74
- Shodan CLI and Python library: https://github.com/achillean/shodan-python