Planning & Managing an Internet Service
Introduction to the WWW
By Pauline M. Berry
1.5.0 What is the WWW?
1.5.1 The WWW Technology
1.5.2 The WWW Addresses: (URLs etc.)
1.5.3 The WWW Specifications: (HTML, HTTP & PICS)
1.5.4 The WWW Software
1.5.5 Available Servers
1.5.6 The First of the Browsers: Mosaic
1.5.7 The Leading Browser: Netscape
"The World Wide Web (WWW or W3) is a "distributed heterogeneous collaborative
multimedia information system".
Tim Berners-Lee
However, the WWW, like the internet, can be viewed in a variety of ways by a
variety of people:
The universal information database, a seamless world in which ALL information,
from any source, can be accessed in a consistent and simple way. The principle
is that data should not only be accessible to people around the world, but
should also be linked to other pieces of information, so that a user can
quickly find the data that matters most. The ideas stem
from the 1960s, when people explored theoretically the idea of a
"docuverse" that people could swim through,
revolutionising all aspects of human-information interaction, particularly in
the educational field. Now, through the WWW, technology is beginning to catch up
with the ideas. The WWW depends on several basic principles, or concepts:
universal readership, hypertext, searching, the client-server model, and format
negotiation.
The WWW, as with most network-based concepts, can also be viewed as the
addressing scheme and set of protocols which it encompasses, the most important
of which are URL and HTTP.
The internet is often viewed as a collection of resources, many of which are
publicly available information resources, from libraries to discussion
groups, from software to video clips and sound bites. The WWW forms a virtual
Web of links connecting many of these sources, allowing people to follow the
strands or link in their own information.
The WWW Project was originally developed to allow information sharing within
internationally dispersed teams, and the dissemination of information by
support groups. The development of the WWW has stimulated the
explosion of interest in the internet.
One of the basic aims of the WWW project was to achieve Universal Readership:
that information available on one computer should be accessible by any
authorised person using any type of computer in any country and all using one
simple program.
There are three main enabling technologies:
- Hypertext/Hypermedia
- The Client-Server model
- Format Negotiation
Definitions:
"Hypertext is text which is not constrained to be linear."
"HyperMedia is hypertext which is not constrained to be text: it can include
graphics, video and sound"
Hypertext and Hypermedia are terms coined by Ted Nelson in around 1965.
They are concepts and not products.
The WWW relies on hypertext as a way of communicating with the user. Basically,
hypertext is the same as ordinary text - it can be stored, read, searched, or
edited - except that it contains connections to other documents within the text.
For example, consider the previous paragraph. If it had been written in
hypertext, the name Ted Nelson might have been highlighted. By clicking on the
highlighted text you could retrieve one or more documents about Ted Nelson:
perhaps a biography of Ted Nelson, a note of his address and an entry in an
on-line encyclopaedia. If the system were a hypermedia system, then the link might
also have returned a recording of his voice, an image of the man himself or
even a video clip of a seminar about or by Ted Nelson.
Hypertext/Hypermedia can make the collection of text and other media into a
complex "web of information". The WWW forms its web across the world via the
internet.
The WWW uses the client-server model. Basically, a computer running a program
called a Web client (browser) allows the user to view a hypertext/hypermedia
document (Web page). The user then clicks on a hypertext link. This causes the
Web client to send a message (in a specific protocol) across the internet to
the computer specified by the address in the link. This computer will have a
program called a Web server running, which can understand the message and sends
the text, image or other media back to the client's screen.
The protocol used to send these messages is called HTTP (HyperText Transfer
Protocol).
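The exchange described above can be sketched without any networking at all, by treating the request and the reply as plain text. This is only an illustrative sketch: the function names `build_request` and `handle_request`, the host `www.example.org` and the tiny document table are assumptions made for the example, and a real browser and server would send these messages over a TCP connection rather than pass strings around.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Client side: turn a link's URL into the text of an HTTP GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return f"GET {path} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"


def handle_request(request, documents):
    """Server side: read the request line and return a response message.

    `documents` is a hypothetical table mapping paths to page text.
    """
    request_line = request.split("\r\n", 1)[0]
    method, path, _version = request_line.split(" ")
    if method != "GET":
        return "HTTP/1.0 501 Not Implemented\r\n\r\n"
    if path not in documents:
        return "HTTP/1.0 404 Not Found\r\n\r\n"
    body = documents[path]
    return ("HTTP/1.0 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body.encode())}\r\n\r\n{body}")
```

Calling `handle_request(build_request("http://www.example.org/index.html"), {"/index.html": "<p>Hello</p>"})` returns a response whose first line is the status and whose last part is the page text, which is what the browser would then display.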
There are many different and competing data formats. A feature of HTTP is that
the client program sends a list of the representations it understands along
with its request, and the server can then ensure that it replies in a suitable
way if possible. This is called format negotiation, and it is designed to cope
with, for example, the existing mass of graphics formats (GIF, TIFF and JPEG, to name
but a few). Format negotiation allows the web to distance itself from the
technical and political battles of the data formats. Hopefully it will also
enable the WWW to be adaptable to future innovations in data formats.
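As a rough illustration of format negotiation, the sketch below picks a representation from the list a client says it understands. The function name `negotiate` is an assumption made for the example, and the sketch is deliberately simplified: it ignores the quality values that a real Accept header may carry and only understands exact types and `*` wildcards.

```python
def negotiate(accept_header, available):
    """Return the first available media type the client says it accepts.

    accept_header: the client's list, e.g. "image/gif, image/jpeg"
    available: lowercase media types the server can supply, in the
               server's own order of preference.
    Returns None when nothing suitable can be offered.
    """
    # Drop any parameters after ";" and normalise case and spacing.
    accepted = [item.split(";")[0].strip().lower()
                for item in accept_header.split(",") if item.strip()]
    for media_type in available:
        main_type = media_type.split("/")[0]
        if (media_type in accepted
                or f"{main_type}/*" in accepted
                or "*/*" in accepted):
            return media_type
    return None
```

For instance, a server holding the same picture as TIFF and JPEG, asked by a client that lists only GIF and JPEG, would answer with the JPEG version.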
The general addressing scheme is one of the constituent technologies. It enables
the naming, describing, and retrieving of resources (usually documents) on the
Internet. It is based on the concept of a Uniform Resource Identifier (URI).
URI refers to the generic set of all names/addresses that are short strings
that refer to objects on the Web. There are actually three sub-classes of
addresses: URLs, URNs and URCs.
- URL:
- Uniform Resource Locators are used to `locate' resources, by providing
an abstract identification of the resource location. The format of a URL
contains:
scheme://host/dir/subdir/filename.
- scheme:
- is one of http, ftp, gopher, file, news or telnet
- host:
- is the regular internet hostname of the server
- dir/subdir/filename:
- is the full pathname of the file the link wishes to retrieve.
Its flexibility allows the WWW to access all the existing data in FTP
archives, news articles, and WAIS and Gopher servers.
- URN:
- Uniform Resource Name is a scheme under development which should provide
for the resolution using internet protocols of names which have a greater
persistence than that currently associated with internet host names or
organizations.
- URC:
- Uniform Resource Citation. A type of
resource description in the form of a set of attribute/value
pairs. Some of the values may be URIs of various kinds. Others may include, for
example, authorship, publisher, datatype, date, copyright status and shoe size.
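The URL format described above can be taken apart with Python's standard `urllib.parse` module. This is a small sketch; the helper name `describe_url` and the host `www.example.org` are assumptions made for the example.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the scheme, host and path parts described above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # http, ftp, gopher, file, news or telnet
        "host": parts.hostname,   # None for schemes with no host, such as news:
        "path": parts.path,       # keeps its leading "/" when a host is present
    }
```

Applied to `http://www.example.org/dir/subdir/filename.html` this yields the scheme `http`, the host `www.example.org` and the path `/dir/subdir/filename.html`. A `news:` URL has no host part, so only its scheme and path are filled in.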
Most WWW users are reasonably familiar with the URL. However, they are also
probably familiar with the problems of dead links, overloaded sites, lots of
trans-oceanic traffic, etc. These problems all arise because URLs confuse the
name of a resource with its location. Thus the URI working group
has established a scheme where a name (the
URN) and an associated description (a URC) are assigned to a resource. That
description will contain things like author, title and subject. It will also
contain the possible locations for the resource as a set of URLs. Thus a
browser can pick one of the URLs to try to retrieve the resource, and if that
doesn't succeed it will try another URL, and so on.
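The retry behaviour described above might look something like the following sketch. Everything here is hypothetical: the dictionary layout standing in for a URC, the `urn:example:...` name, and the `fetch` argument, which is whatever function the caller uses to retrieve a single URL and which is assumed to raise `OSError` when a location cannot be reached.

```python
def retrieve(urc, fetch):
    """Try each location listed in a resource description until one works.

    urc:   a dict with a "name" (the URN) and "locations" (a list of URLs).
    fetch: a caller-supplied function that returns the resource for one URL
           or raises OSError if that location is unavailable.
    Returns (url_that_worked, resource).
    """
    errors = []
    for url in urc["locations"]:
        try:
            return url, fetch(url)
        except OSError as exc:
            # A dead link or overloaded site: remember it and try the next one.
            errors.append((url, exc))
    raise LookupError(
        f"no location for {urc['name']} could be reached: {errors}")
```

Because the name stays fixed while the list of locations can change, a dead link at one site no longer makes the resource unreachable.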
- http: allows the WWW to perform format negotiation.
- HTTP (HyperText Transfer Protocol) "HTTP is an application-level protocol
with the lightness and speed necessary for distributed, collaborative, hypermedia
information systems. It is a generic, stateless, object-oriented protocol which can
be used for many tasks, such as name servers and distributed object management systems,
through extension of its request methods (commands). A feature of HTTP is the typing
and negotiation of data representation, allowing systems to be built independently of
the data being transferred."
WWW Consortium on HTTP
- html: allows a basic document to be structured and contain hypertext links.
- HTML (HyperText Markup Language), now at specification version 3.2,
is an evolving language which is used to construct documents which can be
viewed by World Wide Web browsers. HTML has been standardized by the
WWW Consortium.
It is aimed at being a
"simple scaleable document format that can be used for information exchange on
virtually any platform". Possible platforms include:
- Graphical User Interfaces, such as Windows, Macs and X11/Unix
- Text-only systems, for instance VT-100 terminals
- Text to Speech devices
- Rendering to Braille
- pics: facilitates the control of access to material on the Internet.
- PICS (Platform for Internet Content Selection) is being designed to enable
supervisors (parents, teachers, or administrators) to block access from their
computers to certain Internet resources, without censoring what is distributed to
other sites. It is based on the principle of labelling internet resources with ratings.
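To make the HTML entry above a little more concrete, here is a minimal sketch of a structured document containing one hypertext link, together with a few lines that pull the link targets out of it using Python's standard `html.parser` module. The page text, the address `www.example.org` and the names `LinkCollector` and `extract_links` are assumptions made for the example.

```python
from html.parser import HTMLParser

# A hypothetical page: a title, one paragraph and one hypertext link.
PAGE = """<html><head><title>Hypertext</title></head>
<body><p>The term was coined by
<a href="http://www.example.org/ted-nelson.html">Ted Nelson</a>.</p>
</body></html>"""


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the link targets found in a piece of HTML, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

A browser does something similar when it decides which pieces of text to highlight as links; a text-only or speech device can present the same document differently because the markup describes structure rather than appearance.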
There is a mass of software available, a lot of it currently freeware. These
are the main types:
- Client software:
- Programs to access the web directly from your computer (e.g. Mosaic, Netscape).
- Server software:
- Programs for publishing information on the Web.
- Web Authoring Tools:
- Programs for creating and editing Web pages (e.g. PageMill, HoTMetaL).
- Tools for information providers:
- Generate HTML from other things, analyse log files, make a telnet server, web-roaming robots, etc.
- Gateway Servers:
- To make other existing information systems visible on the web.
- Mail Robot:
- A server which will return any web document by mail, given a request sent by mail. Also manages mailing lists.
- Common Code Library:
- A public domain reference implementation forming the basis of many browsers and servers.
- Shen:
- Cryptographic security
In addition to the many WWW servers, it is also possible to access a variety of
alternative server types. For example:
FTP: The directory structure of an ftp server is presented by WWW
as a list of hypertext links to each file/directory. Thus, the user can browse
the ftp directory and retrieve the required files. An example of an interesting
anonymous ftp server is URL ftp://src.doc.ic.ac
GOPHER: Is similar to ftp except that there are no hypertext links;
everything is either a plain document or a menu. An example of an interesting
Gopher server can be found at: URL gopher://glas.apc.org/1.
NEWS: A WWW server can read Usenet news. It uses hypertext to
give instant links between related articles and newsgroups. For example,
WWW's own newsgroup, comp.infosystems.www, can be found at: URL
news:comp.infosystems.www.
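The way an ftp directory is presented as a list of hypertext links, as described above, can be sketched as follows. This is an assumption about how such a listing might be rendered, not a description of any particular browser: the function name `listing_to_html`, the host `ftp.example.org` and the `(name, is_directory)` entry format are all made up for the example.

```python
from html import escape
from urllib.parse import quote


def listing_to_html(base_url, entries):
    """Render directory entries as a list of hypertext links.

    base_url: URL of the directory being listed.
    entries:  (name, is_directory) pairs, e.g. from an ftp listing.
    """
    if not base_url.endswith("/"):
        base_url += "/"
    items = []
    for name, is_dir in entries:
        # Directories get a trailing "/" so the link leads to another listing.
        href = base_url + quote(name) + ("/" if is_dir else "")
        items.append(f'<li><a href="{escape(href)}">{escape(name)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Following a directory link produces another listing of the same kind, while following a file link retrieves the file itself.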
Other interesting servers can be found at:
"NCSA Mosaic is a networked information discovery, retrieval, and
collaboration tool and World Wide Web browser developed at the National Center
for Supercomputing Applications. "
Mosaic was the first highly popular Web browser. It has since been
overtaken by a plethora of new browsers, the most successful of which is
Netscape. Estimates in 1995 gave Netscape 73% of the market and Mosaic only 8%.
NCSA Mosaic was originally designed and programmed for the X platform by Marc
Andreessen and Eric Bina at NCSA. Now there are versions available to run on
Macintosh and IBM PC compatible platforms. It basically provides a nice
interface to the Internet.
It is based on hypertext and the client/server model: any Mosaic client can
communicate with any HTTP server. Thus, Mosaic provides a nice interface to the
WWW protocols. It is also able to communicate with more traditional Internet
protocols such as FTP, Gopher, WAIS and NNTP.
Mosaic also has some astonishing multimedia capabilities; in fact it is claimed
to have "unlimited" capabilities. Many file types, such as inlined images, are
handled by Mosaic internally. Other types, such as MPEG movies, sound files,
PostScript documents, and JPEG images, are automatically sent to external
applications to be played or viewed.
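The split between file types handled internally and those passed to external applications can be pictured as a simple lookup table. The sketch below is illustrative only and does not reproduce Mosaic's actual configuration: the table contents, the viewer program names and the function `choose_handler` are assumptions made for the example.

```python
import os

# Assumed for illustration: types the browser displays itself.
INTERNAL_TYPES = {".html", ".htm", ".txt", ".gif"}

# Assumed for illustration: file extension -> external viewer program.
EXTERNAL_VIEWERS = {
    ".mpeg": "mpeg_play",
    ".mpg": "mpeg_play",
    ".au": "showaudio",
    ".ps": "ghostview",
    ".jpeg": "xv",
    ".jpg": "xv",
}


def choose_handler(filename):
    """Decide how a retrieved file should be presented.

    Returns ("internal", None), ("external", viewer_name) or
    ("unknown", None) when no rule matches the file extension.
    """
    extension = os.path.splitext(filename)[1].lower()
    if extension in INTERNAL_TYPES:
        return ("internal", None)
    if extension in EXTERNAL_VIEWERS:
        return ("external", EXTERNAL_VIEWERS[extension])
    return ("unknown", None)
```

Supporting a new file type then only means adding a row to the table, which is one way to read the claim of "unlimited" capabilities.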
The development of interfaces to the WWW like Mosaic and Netscape has opened
up the world of the internet. It is the ease of use that has driven the most
recent explosion of interest, which has drawn the TV, radio, newspaper and even
political commentators to the "superhighway".
"The dramatic technological achievement of NCSA Mosaic and the World Wide
Web are affecting the lives of people globally. The information superhighway
also presents a new paradigm for education and provides an opportunity for the
federal government to energize the nation in a positive way. The vision is that
every individual will become both viewer and author with interactive
participation and access to knowledge. This new method of communication may
create a community around the world that will not know boundaries of wealth or
poverty, age, sex, nationality, or race."
[ref]