Planning & Managing an Internet Service
Introduction to the WWW

By Pauline M. Berry


An Introduction to the WWW: INDEX

1.5.0 What is the WWW?
1.5.1 The WWW Technology
1.5.2 The WWW Addresses: (URLs etc.)
1.5.3 The WWW Specifications: (HTML, HTTP & PICS)
1.5.4 The WWW Software
1.5.5 Available Servers
1.5.6 The First of the Browsers: Mosaic
1.5.7 The Leading Browser: Netscape

1.5 What is the WWW?

The World Wide Web (WWW or W3) is a "distributed heterogeneous collaborative multimedia information system". (Tim Berners-Lee)

However, the WWW, like the internet, can be viewed in a variety of ways by a variety of people:

As a concept:

The universal information database: a seamless world in which ALL information, from any source, can be accessed in a consistent and simple way. The principle is that data would not only be accessible to people around the world, but would also link easily to other pieces of information, so that a user could quickly find the most important data. The ideas stem from the 1960s, when people explored theoretically the idea of a "docuverse" that people could swim through, revolutionising all aspects of human-information interaction, particularly in the educational field. Now, through the WWW, technology is beginning to catch up with the ideas. The WWW depends on several basic principles, or concepts: universal readership, hypertext, searching, the client-server model, and format negotiation.

As a set of protocols:

The WWW, as with most network-based concepts, can also be viewed as the addressing scheme and set of protocols which it encompasses, the most important of which are URL and HTTP.

As a Web of Information:

The internet is often viewed as a collection of resources, many of which are publicly available information resources, from libraries to discussion groups, from software to video clips and sound bites. The WWW forms a virtual Web of links connecting many of these sources, allowing people to follow the strands or link in their own information.

The WWW Project was originally developed to allow information sharing within internationally dispersed teams, and the dissemination of information by support groups. The development of the WWW has stimulated the explosion of interest in the internet.


1.5.1 The WWW Technology

One of the basic aims of the WWW project was to achieve Universal Readership: that information available on one computer should be accessible by any authorised person using any type of computer in any country and all using one simple program.

There are three main enabling technologies:

Hypertext/Hypermedia

Definitions:

"Hypertext is text which is not constrained to be linear."

"HyperMedia is hypertext which is not constrained to be text: it can include graphics, video and sound"

Hypertext and Hypermedia are terms coined by Ted Nelson in around 1965. They are concepts and not products.

The WWW relies on hypertext as a way of communicating with the user. Basically, hypertext is the same as ordinary text (it can be stored, read, searched, or edited) except that it contains connections to other documents within the text. For example, consider the previous paragraph. If it had been written in hypertext, the name Ted Nelson might have been highlighted. By clicking on the highlighted text you could retrieve one or more documents about Ted Nelson: perhaps a biography, a note of his address and an entry in an on-line encyclopaedia. If the system were a hypermedia system, the link might also have returned a recording of his voice, an image of the man himself or even a video clip of a seminar about or by Ted Nelson.

Hypertext/Hypermedia can make a collection of text and other media into a complex "web of information". The WWW forms its web across the world via the internet.
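
As a rough illustration of text that "contains connections to other documents", the sketch below treats a hypertext fragment as ordinary text carrying embedded links and lists where each highlighted phrase leads. It assumes HTML anchor syntax and uses Python's standard html.parser module; the fragment and both target addresses (on example.org) are invented for the example.

```python
from html.parser import HTMLParser

# Hypothetical hypertext fragment: ordinary text with two highlighted phrases.
# Both target addresses are invented for illustration.
FRAGMENT = (
    'Hypertext is a term coined by '
    '<a href="http://example.org/nelson/biography.html">Ted Nelson</a> '
    'in around 1965 (see also his '
    '<a href="http://example.org/nelson/seminar.mpg">seminar</a>).'
)


class LinkCollector(HTMLParser):
    """Collects (link text, target address) pairs from a hypertext fragment."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (text, href) pairs
        self._href = None   # target of the link currently open, if any
        self._text = []     # text seen inside the open link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None


collector = LinkCollector()
collector.feed(FRAGMENT)
collector.close()
for text, href in collector.links:
    print(f"{text!r} leads to {href}")
```

The second link points at a video clip rather than a text document, which is all that separates the hypermedia case from the plain hypertext one in this model.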

The Client-Server model

The WWW uses the client-server model. A computer running a program called a Web client (browser) allows the user to view a hypertext/hypermedia document (Web page). When the user clicks on a hypertext link, the Web client sends a message (in a specific protocol) across the internet to the computer named in the link's address. That computer runs a program called a Web server, which understands the message and sends the text, image or other media back for display on the client's screen.

The protocol used to send these messages is called HTTP (HyperText Transfer Protocol).
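
The request/response exchange can be tried out on a single machine. The following is a minimal sketch, not a production server: it uses Python's standard http.server and http.client modules, runs a toy Web server on a free local port, and has a client fetch one page from it. The page content, the handler name PageHandler and the path /index.html are all invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical Web page held by the server.
PAGE = b"<html><body><a href='/next.html'>A hypertext link</a></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """A toy Web server: answers every GET request with the same page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start the server, fetch one page as a client, then shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: the OS picks one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The client side: send a GET message and read the reply.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/index.html")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch_once()
print(status, content_type, len(body), "bytes")
```

Here the "click" is the call to conn.request; a real browser would then render the returned page rather than print its size.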

Format Negotiation

There are many different and competing data formats. A feature of HTTP is that the client program sends a list of the representations it understands along with its request, and the server can then ensure that it replies in a suitable way if possible. This is called format negotiation, and it is designed to cope with, for example, the existing mass of graphics formats (GIF, TIFF and JPEG, to name but a few). Format negotiation allows the web to distance itself from the technical and political battles over data formats. Hopefully it will also enable the WWW to adapt to future innovations in data formats.
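
The server-side choice can be sketched in a few lines. The function below is a deliberately simplified illustration, not the full HTTP algorithm: it assumes the client's list arrives as a comma-separated Accept header, ignores preference weights and partial wildcards such as image/*, and simply returns the first representation the server holds that the client says it understands. The function name negotiate and the sample lists are assumptions.

```python
def negotiate(accept_header, available):
    """Return the first server-side format the client understands, or None.

    accept_header: comma-separated media types sent by the client.
    available: media types the server can supply, in its order of preference.
    Simplification: parameters after ';' are dropped, and only '*/*' is
    honoured as a wildcard.
    """
    accepted = [item.split(";")[0].strip().lower() for item in accept_header.split(",")]
    for media_type in available:
        if media_type in accepted or "*/*" in accepted:
            return media_type
    return None


# The server holds the same picture in three formats; the client cannot show TIFF.
server_formats = ["image/tiff", "image/jpeg", "image/gif"]
print(negotiate("image/gif, image/jpeg", server_formats))
print(negotiate("text/plain", server_formats))
```

Because the server's list is scanned in its own order of preference, adding a new format later only requires extending that list; clients that do not mention the new format keep receiving an older one.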

1.5.2 The WWW Addresses: (URLs etc.)

The general addressing scheme is one of the constituent technologies. It enables the naming, describing, and retrieving of resources (usually documents) on the Internet. It is based on the concept of a Uniform Resource Identifier (URI).

URI refers to the generic set of all names/addresses that are short strings that refer to objects on the Web. There are actually three subclasses of addresses: URLs, URNs and URCs.

URL:
Uniform Resource Locators are used to "locate" resources, by providing an abstract identification of the resource location. A URL has the form:

scheme://host/dir/subdir/filename

scheme:
is one of http, ftp, gopher, file, news or telnet
host:
is the regular internet hostname of the server
dir/subdir/filename:
is the full pathname of the file the link wishes to retrieve.

Its flexibility allows the WWW to access all the existing data in FTP archives, news articles, and WAIS and Gopher servers.
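
The scheme/host/path breakdown described above can be checked with Python's standard urllib.parse module. This is only a small sketch; the two addresses are invented examples on example.org, and the helper name describe is an assumption.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into the three parts named in the text."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}


# Hypothetical addresses: one Web document and one file in an FTP archive.
for url in (
    "http://www.example.org/dir/subdir/filename.html",
    "ftp://ftp.example.org/pub/README",
):
    print(describe(url))
```

The same function handles both addresses because only the scheme changes; that uniformity is what lets one client reach several kinds of server.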

URN:
Uniform Resource Name is a scheme under development which should provide for the resolution, using internet protocols, of names which have a greater persistence than that currently associated with internet host names or organizations.
URC:
Uniform Resource Citation. A type of resource description in the form of a set of attribute/value pairs. Some of the values may be URIs of various kinds. Others may include, for example, authorship, publisher, datatype, date, copyright status and shoe size.

Most WWW users are reasonably familiar with the URL; however, they are probably also familiar with the problems of dead links, overloaded sites, heavy trans-oceanic traffic, etc. These problems all arise because URLs confuse the name of a resource with its location. Thus the URI working group has established a scheme where a name (the URN) and an associated description (a URC) are assigned to a resource. That description will contain things like author, title and subject. It will also contain the possible locations for the resource as a set of URLs. Thus a browser can pick one of the URLs to try to retrieve the resource and, if that doesn't succeed, try another URL, and so on.
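
The fallback behaviour can be sketched as follows. Since the URN/URC scheme is still under development, everything here is hypothetical: the catalogue is a plain dictionary standing in for whatever resolution service eventually exists, the URN and mirror addresses are invented, and fake_fetch merely simulates one dead link instead of touching the network.

```python
def retrieve(urn, catalogue, fetch):
    """Try each location listed in the URC for `urn` until one succeeds."""
    urc = catalogue[urn]  # the description (URC) assigned to this name
    for url in urc["locations"]:
        try:
            return url, fetch(url)
        except OSError:
            continue  # dead link or overloaded site: try the next location
    raise LookupError(f"no reachable location for {urn}")


# Hypothetical catalogue: one name, its description, and two possible locations.
CATALOGUE = {
    "urn:example:sample-document": {
        "author": "An Example Author",
        "title": "A Sample Document",
        "locations": [
            "http://mirror-one.example.org/sample.html",
            "http://mirror-two.example.org/sample.html",
        ],
    }
}


def fake_fetch(url):
    """Stand-in for a real retrieval: the first mirror is simulated as dead."""
    if "mirror-one" in url:
        raise OSError("connection refused")
    return b"<html>sample</html>"


used_url, document = retrieve("urn:example:sample-document", CATALOGUE, fake_fetch)
print("retrieved from", used_url)
```

The caller only ever names the resource; which copy is actually used is decided at retrieval time, which is the separation of name from location that the text describes.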

1.5.3 The WWW Specifications: (HTML, HTTP & PICS)

http: allows the WWW to perform format negotiation.

HTTP (HyperText Transfer Protocol) "HTTP is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred." WWW Consortium on HTTP

html: allows a basic document to be structured and contain hypertext links.

HTML (HyperText Markup Language), now at specification version 3.2, is an evolving language used to construct documents which can be viewed by World Wide Web browsers. HTML has been standardized by the WWW Consortium. It is aimed at being a "simple scaleable document format that can be used for information exchange on virtually any platform". Possible platforms include:

pics: facilitates the control of access to material on the Internet.

PICS (Platform for Internet Content Selection) is being designed to enable supervisors (parents, teachers, or administrators) to block access from their computers to certain Internet resources, without censoring what is distributed to other sites. It is based on the principle of labelling internet resources with ratings. For more information:

1.5.4 The WWW Software

There is a mass of software available, a lot of it currently freeware. These are the main types:

Client software:
Programs to access the web directly from your computer, (e.g. Mosaic, Netscape..)

Server software:
Programs for publishing information on the Web.
Web Authoring Tools:
Programs for creating and editing Web pages (e.g. PageMill, HoTMetaL).
Tools for information providers:
Generate HTML from other things, analyse log files, make a telnet server, web-roaming robots, etc.
Gateway Servers:
To make other existing information systems visible on the web.
Mail Robot:
A server which will return any web document by mail, given a request sent by mail. Also manages mailing lists.
Common Code Library:
A public domain reference implementation forming the basis of many browsers and servers.
Shen:
Cryptographic security

1.5.5 Available Servers (The Web of Information)

In addition to the many WWW servers, it is also possible to access a variety of alternative server types. For example:

FTP: The directory structure of an ftp server is presented by WWW as a list of hypertext links to each file/directory. Thus, the user can browse the ftp directory and retrieve the required files. An example of an interesting anonymous ftp server is URL ftp://src.doc.ic.ac

GOPHER: Similar to FTP, except that there are no hypertext links: anything is either a plain document or a menu. An example of an interesting Gopher server can be found at: URL gopher://glas.apc.org/1.

NEWS: A WWW server can read Usenet news. It uses hypertext to give instant links between related articles and newsgroups. Look, for example, at WWW's own newsgroup, comp.infosystems.www, which can be found at: URL news:comp.infosystems.www.
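
How a directory listing might be presented as hypertext, as described for FTP above, can be sketched in a few lines. This is an illustration only, not how any particular browser does it: the function name listing_as_hypertext, the base address and the directory entries are all invented, and the listing is supplied by hand rather than read from a real FTP server.

```python
from html import escape
from urllib.parse import quote


def listing_as_hypertext(base_url, entries):
    """Render (name, is_directory) pairs as an HTML list of links."""
    items = []
    for name, is_dir in entries:
        # Directories get a trailing slash so that following the link lists them in turn.
        href = base_url.rstrip("/") + "/" + quote(name) + ("/" if is_dir else "")
        items.append(f'<li><a href="{escape(href, quote=True)}">{escape(name)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Hypothetical contents of an anonymous FTP directory.
print(listing_as_hypertext(
    "ftp://ftp.example.org/pub",
    [("README", False), ("docs", True)],
))
```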

Other interesting servers can be found at:


1.5.6 The First of the Browsers: Mosaic

"NCSA Mosaic is a networked information discovery, retrieval, and collaboration tool and World Wide Web browser developed at the National Center for Supercomputing Applications."

Mosaic was the first highly popular Web browser. It has since been overtaken by a plethora of new browsers, the most successful of which is Netscape. Estimates in 1995 gave Netscape 73% of the market and Mosaic only 8%. NCSA Mosaic was originally designed and programmed for the X platform by Marc Andreessen and Eric Bina at NCSA. Now there are versions available to run on Macintosh and IBM PC compatible platforms. It basically provides a nice interface to the Internet.

It is based on hypertext and the client/server model: any Mosaic client can communicate with any HTTP server. Thus, Mosaic provides a nice interface to the WWW protocols. It is also able to communicate with more traditional Internet protocols such as FTP, Gopher, WAIS and NNTP.

Mosaic also has some astonishing multimedia capabilities; in fact it is claimed to have "unlimited" capabilities. Many file types, such as inlined images, are handled by Mosaic internally. Other types, such as MPEG movies, sound files, PostScript documents, and JPEG images, are automatically sent to external applications to be played or viewed.
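
The split between internal handling and external applications amounts to a lookup on the file type. The sketch below illustrates that idea only and makes no claim about Mosaic's actual code or configuration: the file extensions chosen, the viewer names (movie-player and so on) and the function name dispatch are all hypothetical.

```python
import os

# Assumed examples of types a browser might display itself.
INTERNAL = {".html", ".gif"}

# Hypothetical external helper programs, keyed by file extension.
EXTERNAL_VIEWERS = {
    ".mpg": "movie-player",
    ".au": "sound-player",
    ".ps": "postscript-viewer",
    ".jpg": "image-viewer",
}


def dispatch(filename):
    """Decide whether a retrieved file is shown internally or handed to a helper."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in INTERNAL:
        return ("internal", None)
    if extension in EXTERNAL_VIEWERS:
        return ("external", EXTERNAL_VIEWERS[extension])
    return ("unknown", None)


for name in ("logo.gif", "talk.mpg", "paper.ps", "notes.xyz"):
    print(name, dispatch(name))
```

Supporting a new file type then means adding one entry to the table rather than changing the browser, which is one plausible reading of the "unlimited" claim.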

The development of interfaces to the WWW like Mosaic and Netscape has opened up the world of the internet. It is this ease of use that has driven the most recent explosion of interest, which has drawn TV, radio, newspaper and even political commentators to the "superhighway".

"The dramatic technological achievement of NCSA Mosaic and the World Wide Web are affecting the lives of people globally. The information superhighway also presents a new paradigm for education and provides an opportunity for the federal government to energize the nation in a positive way. The vision is that every individual will become both viewer and author with interactive participation and access to knowledge. This new method of communication may create a community around the world that will not know boundaries of wealth or poverty, age, sex, nationality, or race." [ref]


1.5.7 The Leading Browser: Netscape

Click here to find out more about Netscape