Date posted: 18-Dec-2015

T.Sharon-A.Frank1

Internet Resources Discovery (IRD)

Internet/WWW

Technical Background

Thanks to Miki Even-Haim and Yoram Dahan


Measuring the Web

"When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of science." - Lord Kelvin


Internet/WWW Statistics

• Internet Size & Growth

• Population Sizes

• Various Activities

• Web Size & Growth

• Web Pages and Formats


The Domain Survey

The Domain Survey attempts to discover every host (i.e., every uniquely reachable connected computer) on the Internet by doing a complete search of the Domain Name System. The latest results, gathered in late January 2001, are listed together with Mark Lottor’s work in this area over many years. For more information see RFC 1296; for more data see the archive site at the Internet Software Consortium, http://www.isc.org/ds/

Beginning with the January 1998 survey, Lottor began using a new survey method to avoid the increasing blocking of DNS zone transfers. This method, which queries the DNS for known IP addresses, is explained at http://www.isc.org/ds/new-survey.html. It is not backward compatible with the old results. The old and new data are juxtaposed in the trend graphs, with dotted lines.
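Querying the DNS for a known IP address is presumably done as a reverse lookup: the address is turned into a name under in-addr.arpa and the DNS is asked for that name's PTR record. The following is a minimal sketch of the name construction only, assuming Python's standard ipaddress module. The helper name reverse_lookup_name and the sample address are hypothetical and not taken from the survey; no network query is made.

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Return the reverse-DNS name (in-addr.arpa or ip6.arpa) for an address."""
    return ipaddress.ip_address(ip).reverse_pointer


# 192.0.2.1 belongs to a range reserved for documentation; it is used here
# purely as an illustration and is not survey data.
print(reverse_lookup_name("192.0.2.1"))  # 1.2.0.192.in-addr.arpa
```

A survey built this way would walk the address space and count the addresses that have a name assigned, rather than relying on zone transfers.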


Internet Domain Survey Host Count

[Figure: Internet Hosts 1995-2001. Host count (0 to 120,000,000) by survey date, Jan-95 to Jan-01. Series: new survey data; adjusted old survey data.]

[Figure: Internet Hosts - Overall Trend. Host count on a logarithmic scale (10,000 to 1,000,000,000) by survey date, Jan-89 to Jan-07. Series: historical; projected.]


Trends in Internet Hosts

• The figure of 109 million hosts represents a significant new benchmark for the number of Internet hosts. The current annual growth rate stands at 51%, within the 46-67% range seen over the past two years.

• It shows continued strong exponential growth, with the 100 million host barrier being crossed in late 2000. If the same growth rate is sustained, the Internet would cross the 1 billion host mark in mid 2005.

• The Internet is now expanding at the rate of 63 new hosts and 11 new domains per minute worldwide.
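The arithmetic behind these trend statements can be sketched as follows, assuming growth compounds at a constant annual rate (a simplification; the slides do not state how the projection was fitted). The function names are hypothetical, and the inputs are the figures quoted above.

```python
import math

HOSTS_JAN_2001 = 109_000_000      # host count quoted on the slide
MINUTES_PER_YEAR = 60 * 24 * 365  # ignoring leap years


def years_to_reach(current: float, target: float, annual_growth: float) -> float:
    """Years of constant compound growth needed to grow from current to target."""
    return math.log(target / current) / math.log(1.0 + annual_growth)


def per_minute_to_per_year(rate_per_minute: float) -> float:
    """Convert a per-minute rate into a per-year count."""
    return rate_per_minute * MINUTES_PER_YEAR


# Try the current rate and both ends of the 46-67% range.
for rate in (0.46, 0.51, 0.67):
    years = years_to_reach(HOSTS_JAN_2001, 1_000_000_000, rate)
    print(f"{rate:.0%} per year: 1 billion hosts after about {years:.1f} years")

print(f"63 hosts/minute is about {per_minute_to_per_year(63):,.0f} hosts/year")
```

The projected crossing date is sensitive to the rate assumed: across the 46-67% range it moves by more than a year.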


Total domains registered

• Total domains registered worldwide: 33,293,791

• International (COM): 23,121,005

• International (EDU): 6,708

• International (GOV): 1,269

• International (NET): 4,343,150

• International (ORG): 2,671,279

• United Kingdom (CO.UK): 3,150,380
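As a quick check, the worldwide total above equals the sum of the six listed categories. A minimal sketch using only the figures from this slide (the dictionary name is hypothetical):

```python
# Registered domains per category, as listed on the slide.
registered = {
    "COM": 23_121_005,
    "EDU": 6_708,
    "GOV": 1_269,
    "NET": 4_343_150,
    "ORG": 2_671_279,
    "CO.UK": 3_150_380,
}

total = sum(registered.values())
print(f"Sum of listed categories: {total:,}")  # 33,293,791

# Share of each category in the listed total, largest first.
for name, count in sorted(registered.items(), key=lambda item: -item[1]):
    print(f"{name:6s} {count / total:6.1%}")
```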


Where the Internet hosts are by domain (Jan 2000)

• Germany: 2%

• Canada: 2%

• com: 35%

• net: 23%

• UK: 3%

• Japan: 4%

• US-dom: 3%

• USA-dom: 3%

• edu: 8%

• Others: 18%


Where the Internet hosts are by domain (Jan 2001)

• Germany: 2%

• mil: 2%

• com: 34%

• net: 28%

• Canada: 2%

• Japan: 4%

• US-dom: 3%

• UK: 2%

• edu: 6%

• Others: 18%


Hosts: Large Three-Letter Domains

[Figure: host counts (0 to 40,000,000) by survey date, Jan-96 to Jan-01, for the domains com, net, edu, mil, org and gov.]


Trends in Domains Growth

• The largest domain is COM, jumping 7.8 million hosts since January 2000 to a new high of 36.3 million hosts. That represents a current annualized growth rate of 32%. As a percentage of the entire Internet, the COM host count stayed about the same at 33.2% of all Internet hosts.

• The number of hosts in the NET domain - which is heavily used by ISPs for dialup customers - remained the fastest growing of all the large domains, expanding at an annual growth rate of 45% to 30.8 million hosts.


Internet/WWW Statistics

• Internet Size & Growth

• Population Sizes

• Various Activities

• Web Size & Growth

• Web Pages and Formats


Netizens


Internet Users around the Globe

Source: http://www.geocities.com/Eureka/Enterprises/6930/enstat.html


Internet Users Statistics

Source: http://www.geocities.com/Eureka/Enterprises/6930/enstat.html


Top 20 Countries in Internet Usage


US Internet Users by Age


European Internet Users by Age


US Internet Users by Gender


European Internet Users by Gender


Internet/WWW Statistics

• Internet Size & Growth

• Population Sizes

• Various Activities

• Web Size & Growth

• Web Pages and Formats


Language populations

Source: Global reach http://glreach.com/globstats/index.php3?goto


US Internet Users by Income

1998


Number of surfers vs. computer holders


Operations done by Americans on the Internet


Leading Search Engines (by entries)


Internet/WWW Statistics

• Internet Size & Growth

• Population Sizes

• Various Activities

• Web Size & Growth

• Web Pages and Formats


Number of Web Sites


• 1997: 1,570,000

• 1998: 2,851,000

• 1999: 4,882,000

• 2000: 7,399,000

• 2001: 8,745,000



Number of Unique Web Sites*

• 1998: 2,636,000

• 1999: 4,662,000

• 2000: 7,128,000

• 2001: 8,443,000


* If a site is located at multiple IP addresses, the site is retained in the sample only if the numerically lowest IP address is in the sample.


Types of Unique Web Sites

         Public       Private      Provisional

1998:    1,457,000      315,000      864,000

1999:    2,229,000      790,000    1,643,000

2000:    2,942,000    1,494,000    2,692,000

2001:    3,119,000    2,078,000    3,246,000

Public: Offers content that is freely accessible to the general public.

Private: Offers restricted access to content: for example, via fee payment or prior authorization.

Provisional: Is in a transitory or unfinished state (e.g., “under construction”).
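The three categories partition the unique sites: for each year, the public, private and provisional counts add up to the figure given under "Number of Unique Web Sites". A minimal sketch of that check, using only numbers from the slides (the variable names are hypothetical):

```python
# Unique web sites per year, from the "Number of Unique Web Sites" slide.
unique_sites = {1998: 2_636_000, 1999: 4_662_000, 2000: 7_128_000, 2001: 8_443_000}

# (public, private, provisional) per year, from the "Types of Unique Web Sites" slide.
site_types = {
    1998: (1_457_000, 315_000, 864_000),
    1999: (2_229_000, 790_000, 1_643_000),
    2000: (2_942_000, 1_494_000, 2_692_000),
    2001: (3_119_000, 2_078_000, 3_246_000),
}

for year, (public, private, provisional) in site_types.items():
    total = public + private + provisional
    matches = total == unique_sites[year]
    print(f"{year}: total {total:,} matches={matches} public share {public / total:.1%}")
```

The public share printed for each year shows how the mix shifts toward private and provisional sites over the period.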


Growth of Unique Web Sites

                 1997-2001   1997-1998   1998-1999   1999-2000   2000-2001

Sites:           457%        82%         71%         52%         18%

Unique Sites:    N/a         N/a         77%         53%         18%

Public Sites:    290%        82%         53%         32%         6%
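The "Sites" growth figures can be reproduced from the yearly counts on the "Number of Web Sites" slide. A minimal sketch, taking the 2001 count as 8,745,000 (an assumption about the intended value) and using a hypothetical helper name:

```python
# Total web sites per year, from the "Number of Web Sites" slide.
sites = {1997: 1_570_000, 1998: 2_851_000, 1999: 4_882_000,
         2000: 7_399_000, 2001: 8_745_000}


def growth_percent(old: int, new: int) -> int:
    """Percentage growth from old to new, rounded to a whole percent."""
    return round((new - old) / old * 100)


years = sorted(sites)
print(f"{years[0]}-{years[-1]}: {growth_percent(sites[years[0]], sites[years[-1]])}%")
for start, end in zip(years, years[1:]):
    print(f"{start}-{end}: {growth_percent(sites[start], sites[end])}%")
```

The same function applied to the unique-site and public-site counts gives the other two rows for the years where counts are listed.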


Web Pages Statistics (1)

Note: all numbers below (source data: "Accessibility of Information on the Web") refer to publicly indexable web pages. These exclude pages that are not normally considered for indexing by web search engines, such as pages with authorization requirements (including firewalls), pages excluded from indexing using the robots exclusion standard, and dynamic pages.

• 12/97: At least 320 million pages;

• 02/98:

– 2.8 million servers on the publicly indexable web;

– 289 average pages per server;

– 800 million publicly indexable web pages;

– 18.7 kilobytes is the mean size of a page;

– 3.9 kilobytes is the median size of a page;


Web Pages Statistics (2)

• 02/99:

– 7.3 kilobytes average size of textual content per page (after removing HTML tags, comments and extra white space);

– 0.98 kilobytes median size of the textual content;

– 15 terabytes of pages is the amount of data on the web;

– 6 terabytes is the amount of text data;

– 62.8 images per web server;

– 15.2 Kbytes - average image size;

– 5.5 Kbytes - median image size;

– 180 million images on the publicly indexable web;

– 3 terabytes - total amount of image data;
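The totals on these two slides follow from the per-server and per-page averages. A minimal sketch of the arithmetic, assuming decimal units (1 terabyte = 10**9 kilobytes) and combining the server and page figures from "Web Pages Statistics (1)" with the averages above; the constant names are hypothetical:

```python
# Figures quoted on the slides (publicly indexable web).
SERVERS = 2_800_000
PAGES_PER_SERVER = 289
PAGES = 800_000_000
MEAN_PAGE_KB = 18.7
MEAN_TEXT_KB = 7.3
IMAGES_PER_SERVER = 62.8
IMAGES = 180_000_000
MEAN_IMAGE_KB = 15.2

KB_PER_TB = 1_000_000_000  # assumption: decimal units

estimated_pages = SERVERS * PAGES_PER_SERVER        # compare with 800 million
page_data_tb = PAGES * MEAN_PAGE_KB / KB_PER_TB     # compare with 15 terabytes
text_data_tb = PAGES * MEAN_TEXT_KB / KB_PER_TB     # compare with 6 terabytes
estimated_images = SERVERS * IMAGES_PER_SERVER      # compare with 180 million
image_data_tb = IMAGES * MEAN_IMAGE_KB / KB_PER_TB  # compare with 3 terabytes

print(f"pages:      {estimated_pages:,}")
print(f"page data:  {page_data_tb:.2f} TB")
print(f"text data:  {text_data_tb:.2f} TB")
print(f"images:     {estimated_images:,.0f}")
print(f"image data: {image_data_tb:.2f} TB")
```

Each computed value rounds to the figure on the slide, which suggests the totals were derived from the averages rather than measured separately.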


Web Pages Statistics (3)

• As of 7/5/2000, the web has roughly:

– 2,170,000,000 pages;

– 40,800,000,000,000 bytes of text;

– 489,000,000 images;

– 8,160,000,000,000 bytes of image data;

• In the last 24 hours, the web added:

– 4,420,000 new pages;

– 82,800,000,000 new bytes of text;

– 994,000 new images;

– 16,600,000,000 new bytes of image data;

– 49,400,000 pages changed;

– 11,100,000 images changed;

• Average life span of a web page: 44 days;
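The daily change count and the 44-day life span appear to be two views of the same estimate. If "life span" is read as the average interval between changes to a page (an assumption; the slide does not define it), then dividing the page count by 44 should give roughly the number of pages changed per day. A minimal sketch with hypothetical constant names:

```python
# Figures quoted on the slide (as of 7/5/2000).
PAGES = 2_170_000_000
TEXT_BYTES = 40_800_000_000_000
IMAGES = 489_000_000
IMAGE_BYTES = 8_160_000_000_000
LIFE_SPAN_DAYS = 44
PAGES_CHANGED_PER_DAY = 49_400_000

# Pages expected to change per day if each page changes once per life span.
expected_changes = PAGES / LIFE_SPAN_DAYS
text_bytes_per_page = TEXT_BYTES / PAGES
bytes_per_image = IMAGE_BYTES / IMAGES

print(f"expected changes/day: {expected_changes:,.0f} (slide: {PAGES_CHANGED_PER_DAY:,})")
print(f"text bytes per page:  {text_bytes_per_page:,.0f}")
print(f"bytes per image:      {bytes_per_image:,.0f}")
```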


Internet/WWW Statistics

• Internet Size & Growth

• Population Sizes

• Various Activities

• Web Size & Growth

• Web Pages and Formats


Number of Web Pages


Growth of the Internet in Pages


What is the "average page" like?

Page sizes are highly variable, as illustrated in the table, which covers one snapshot of 1.524 million pages.

• Mean: 6518

• Median: 2021

• Standard deviation: 31678


Embedded Image Count

The Web is quite graphically rich. The table shows that just over 50% of all pages contain at least one image reference. It is interesting to note that about 15% of pages contain exactly one image. Quite likely, for many of the pages that contain large numbers of images, those images are in fact typographical marks of the "reddot.gif" variety.


What data formats are being used?


More data formats being used

References

• Internet Domain Survey

– http://www.isc.org/ds/

• Online Computer Library Center

– http://wcp.oclc.org/stats/size.html

• UCLA Center for Communication Policy

– http://www.ccp.ucla.edu/pages/InternetStudy.asp

• Network Facts

– http://www.netfactual.com/

• Internet Statistics

– http://www.mit.edu/people/mkgray/net/

• Domain Statistics

– http://www.domainstats.com