Welcome to CollecTor, your friendly data-collecting service in the Tor network. CollecTor fetches data from various nodes and services in the public Tor network and makes it available to the world. If you're doing research on the Tor network, or if you're developing an application that uses Tor network data, this is your place to start.
The Tor network data provided here currently comes from five different sources, each of which is explained in more detail on a separate page:
We have over 10 years of Tor network data available for download in monthly tarballs. The latest tarballs are updated every few days. So, if you want to fetch data covering an extended period of time, monthly tarballs are for you. Just be careful: these tarballs can decompress to 20 times the compressed size or even more. Monthly tarballs can be browsed and downloaded in the archive/ subdirectory.
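Because a monthly tarball can expand to many times its compressed size, it can be convenient to read it as a stream instead of unpacking it to disk first. The following is a minimal sketch using only Python's standard `tarfile` module; the function names `iter_tar_members` and `total_uncompressed_size` are made up for this illustration and are not part of any CollecTor tooling.

```python
import tarfile


def iter_tar_members(fileobj):
    """Yield (name, content) for each regular file in a compressed tarball.

    The archive is read sequentially ("r|*" is tarfile's stream mode with
    automatic compression detection), so nothing is written to disk.
    Each member's content is read before the stream advances.
    """
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            yield member.name, extracted.read()


def total_uncompressed_size(fileobj):
    """Return the summed size in bytes of all regular files in the tarball."""
    return sum(len(content) for _, content in iter_tar_members(fileobj))
```

To use it, open a downloaded tarball in binary mode, for example `with open(path, "rb") as f: ...`, and pass `f` to either function. Only one member is held in memory at a time, which keeps the footprint small even when the unpacked archive would be large.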
If you're only interested in recently published data, we also have data from the last 72 hours available for you. In contrast to monthly tarballs, this data set is updated every hour. If you have already bootstrapped your application with monthly tarballs and want to stay up to date, or if you just want to take a peek at the latest data, this is your data set. If you're using special software to download these files, you may want to configure it to accept gzip-compressed data to save us all some bandwidth. The latest 72 hours of data are available in the recent/ subdirectory.
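For a custom downloader, accepting gzip-compressed data usually means sending an `Accept-Encoding: gzip` request header and decompressing the body when the server answers with `Content-Encoding: gzip`. Here is a rough sketch with Python's standard library; the helper names (`build_request`, `decode_body`, `fetch`) are invented for this example, and whether a given response is actually compressed depends on the server.

```python
import gzip
import urllib.request


def build_request(url):
    """Create a request that tells the server gzip-compressed responses are fine."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(body, content_encoding):
    """Decompress the body only if the server says it is gzip-encoded."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch(url):
    """Download a URL, transparently handling a gzip-encoded response."""
    with urllib.request.urlopen(build_request(url), timeout=60) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Calling `fetch()` with the URL of a file under the recent/ subdirectory returns the uncompressed bytes. The exact host and path are left to the caller here, since they are not spelled out in this text.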
We developed two parsing libraries, one for Java and one for Python:
If you developed a parsing library for another language and want it to be listed here, please let us know!
We have written a couple of applications, and researchers have written research papers, that use the Tor network data provided here. The following list is not at all exhaustive:
If you wrote an application or research paper that uses Tor network data and that is not yet listed here, please let us know! Please include a short description of what your application does or what your research was about.
If you have any questions about the Tor network data provided here, we'd like to hear from you! Of course, suggestions or other feedback are welcome, too.