Index of /datasets/supplement/2020-imc-hoiho

Name                      Last modified      Size   Description
Parent Directory                             -
20100129-orgs.txt         2020-09-20 21:41   35K
20100401-orgs.txt         2020-09-20 21:41   36K
201007-midar-iff.rou..>   2020-09-20 21:41   13M
201007-node2as.re         2020-09-20 21:41   9.4K
20100831-orgs.txt         2020-09-20 21:41   37K
201104-midar-iff.rou..>   2020-09-20 21:41   17M
201104-node2as.re         2020-09-20 21:41   9.3K
20110420-orgs.txt         2020-09-20 21:41   39K
201110-midar-iff.rou..>   2020-09-20 21:41   17M
201110-node2as.re         2020-09-20 21:41   11K
20111003-orgs.txt         2020-09-20 21:41   42K
20120629-orgs.txt         2020-09-20 21:41   46K
201207-midar-iff.rou..>   2020-09-20 21:41   18M
201207-node2as.re         2020-09-20 21:41   9.9K
201304-midar-iff.rou..>   2020-09-20 21:42   20M
201304-node2as.re         2020-09-20 21:42   12K
20130401-orgs.txt         2020-09-20 21:42   50K
201307-midar-iff.rou..>   2020-09-20 21:42   20M
201307-node2as.re         2020-09-20 21:42   13K
20130701-orgs.txt         2020-09-20 21:42   52K
201404-midar-iff.rou..>   2020-09-20 21:42   21M
201404-node2as.re         2020-09-20 21:42   15K
20140401-orgs.txt         2020-09-20 21:42   73K
201412-midar-iff.rou..>   2020-09-20 21:42   21M
201412-node2as.re         2020-09-20 21:42   17K
20150101-orgs.txt         2020-09-20 21:42   80K
20150701-orgs.txt         2020-09-20 21:42   86K
201508-midar-iff.rou..>   2020-09-20 21:42   21M
201508-node2as.re         2020-09-20 21:42   17K
201603-midar-iff.rou..>   2020-09-20 21:42   22M
201603-node2as.re         2020-09-20 21:42   19K
20160401-orgs.txt         2020-09-20 21:42   93K
201609-midar-iff.rou..>   2020-09-20 21:42   24M
201609-node2as.re         2020-09-20 21:42   21K
20161001-orgs.txt         2020-09-20 21:42   97K
201702-midar-iff.rou..>   2020-09-20 21:42   23M
201702-node2as.re         2020-09-20 21:42   22K
20170701-orgs.txt         2020-09-20 21:42   104K
201708-midar-iff.rou..>   2020-09-20 21:42   24M
201708-node2as.re         2020-09-20 21:42   23K
20170908-peeringdb.re     2020-09-20 21:42   8.0K
20170908-peeringdb.r..>   2020-09-20 21:42   875K
20171001-orgs.txt         2020-09-20 21:42   106K
201803-midar-iff.rou..>   2020-09-20 21:42   25M
201803-node2as.re         2020-09-20 21:42   24K
20180401-orgs.txt         2020-09-20 21:42   117K
201901-midar-iff.rou..>   2020-09-20 21:42   21M
201901-node2as.re         2020-09-20 21:42   22K
20190101-orgs.txt         2020-09-20 21:42   125K
201904-midar-iff.rou..>   2020-09-20 21:42   24M
201904-node2as.re         2020-09-20 21:42   26K
20190401-orgs.txt         2020-09-20 21:42   127K
202001-midar-iff.rou..>   2020-09-20 21:42   25M
202001-node2as.re         2020-09-20 21:42   30K
20200101-orgs.txt         2020-09-20 21:42   135K
20200215-peeringdb.re     2020-09-20 21:42   17K
20200215-peeringdb.r..>   2020-09-20 21:42   1.5M
README.txt                2020-09-21 13:46   1.6K
public_suffix_list.dat    2018-01-10 12:42   188K
web/                      2020-09-21 14:26   -

This public dataset contains the data used to train our system to
learn regular expressions that extract ASNs from router hostnames.  It
also includes the "best" regular expressions inferred for each suffix
with at least one training router.  Note that not all of the regular
expressions are useful, so exercise your own judgement when choosing
which to apply.  In our IMC 2020 paper, we used the regexes Hoiho
classified as "good" or "promising".
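
To make the idea of an ASN-extracting regular expression concrete, the
following is a minimal Python sketch.  The regex, the example.net suffix,
and the hostnames are all hypothetical illustrations and are not taken
from this dataset; 64500 is an ASN reserved for documentation.

```python
import re

# Hypothetical regex for a hypothetical suffix (example.net).
# The single capture group holds the digits of the ASN.
ASN_REGEX = re.compile(r"^as(\d+)\.[^.]+\.example\.net$")


def extract_asn(hostname):
    """Return the ASN embedded in hostname as an int, or None if no match."""
    match = ASN_REGEX.match(hostname)
    return int(match.group(1)) if match else None


hostnames = [
    "as64500.lon1.example.net",  # carries an ASN label
    "core1.lon1.example.net",    # carries no ASN label
]
for name in hostnames:
    print(name, extract_asn(name))
```

A regex like this is only as good as the operator's naming convention,
which is why the classification of each inferred regex matters.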

If you use this data, you are required to cite:

 M. Luckie, A. Marder, M. Fletcher, B. Huffaker, and k. claffy.
 Learning to Extract and Use ASNs in Hostnames.
 Proc. ACM Internet Measurement Conference 2020.

You are also required to cite the ITDK, from which this data is
derived.  The instructions for citing the ITDK are included at:

 http://data.caida.org/datasets/topology/ark/ipv4/

The data is designed to be used with sc_hoiho, which is included
as part of scamper:

 https://www.caida.org/tools/measurement/scamper/

To obtain the inferred regular expressions which are included in this
dataset release, you will need to build sc_hoiho by passing
--with-sc_hoiho and either --with-pcre or --with-pcre2 to configure.
When building sc_hoiho, ensure pcre (or pcre2) is in the path where
your compiler looks for header files and libraries.  For example:

CFLAGS='-I/usr/local/include' LDFLAGS='-L/usr/local/lib' ./configure \
 --with-sc_hoiho --with-pcre2

and then run:

sc_hoiho -O learnasn -d best-regex public_suffix_list.dat <training-set>.routers
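
If you want to run the command above over several training sets, a small
wrapper may help.  The sketch below is not part of the dataset or of
scamper: the function names are hypothetical, the training file name is
a placeholder, and it assumes sc_hoiho is on your PATH and writes its
results to standard output.  It uses only the options shown above.

```python
import shlex
import subprocess


def build_hoiho_command(suffix_list, training_file):
    """Build the sc_hoiho argument list shown above for one training set."""
    return ["sc_hoiho", "-O", "learnasn", "-d", "best-regex",
            suffix_list, training_file]


def run_hoiho(suffix_list, training_file):
    """Run sc_hoiho and return whatever it writes to standard output.

    Assumption: sc_hoiho is installed and on PATH.  check=True raises
    CalledProcessError if sc_hoiho exits with a non-zero status.
    """
    result = subprocess.run(
        build_hoiho_command(suffix_list, training_file),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Show the command that would be run; "training-set.routers" is a
# placeholder, substitute one of the .routers files from this directory.
print(shlex.join(build_hoiho_command("public_suffix_list.dat",
                                     "training-set.routers")))
```

Call run_hoiho() once per training file to collect the output for each
snapshot.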

Other options to sc_hoiho are documented in its manual page:

 https://www.caida.org/tools/measurement/scamper/man/sc_hoiho.1.pdf