The WWW is the Internet's best-known application, offering information on just about everything under the sun...
The World Wide Web, or 'web', is a service that allows computer users to quickly and easily navigate the Internet, giving them access to hundreds of millions of multimedia documents ('pages'), interlinked by hypertext -- references to other documents (on the same or another computer), that might also be of interest to the user. Hypertext allows users to select a linked word or image (typically by clicking a mouse), and obtain ('visit') another web document. These documents may comprise text, images, sounds, animations, or movies.
The World-Wide Web, also known as the WWW or simply as the Web, consists of "pages" of information stored on computers all around the world. These pages are available to anyone with a connection to the Internet. They are viewed with a Web browser such as Netscape or Internet Explorer. A page can contain text, pictures, sounds, 3-D graphics, movies, applets, and even interactive features such as fill-in forms.
The web has rapidly become the best-known Internet application. Although most people might not be able to differentiate between the web and the underlying Internet, they are, strictly speaking, different things. The Internet comprises many other services besides the web, e.g. email, telnet, and FTP. The WWW also interfaces with other standard protocols (FTP, Telnet, NNTP, WAIS, gopher, ...) and their data formats.
The web is a distributed hypertext system. This means that words and pictures on a computer screen may be 'linked' to further information, e.g. like this. Mouse-clicking on these links 'takes' you directly to other pages on the Web. A hyperlink is a segment of text (word or phrase), or an inline image (an image displayed as part of the document) that refers to another document (text, sound, image, movie) elsewhere on the World-Wide Web.
Hyperlinks in a document are indicated in some way. In a graphical interface, linked text is typically colored and underlined, a linked image has a colored border, and an audio clip might be represented by a speaker icon. In a text-based interface, a link is marked by a number immediately after it.
When a hyperlink is selected (by a mouse click in a graphical user interface, or by entering the given number at a prompt in a text interface), the referenced document is fetched via the Internet and displayed appropriately (e.g. if it's audio, the sound is played through a speaker). You can also access other tools of the Internet, such as FTP and Gopher, to help you explore and access Web resources.
Although they created text and graphical browsers, it was the release of NCSA's Mosaic for X-windows by Marc Andreessen and Eric Bina in 1993 that ignited the web explosion. Mosaic was well-publicised, and its 'point and click' interface, familiar to PC users, made it a very pleasant and easy-to-use browser. Andreessen cofounded Netscape Communications Corporation in April 1994. Microsoft released Internet Explorer in 199?. Netscape's browser and Internet Explorer are the two most popular browsers, although they have been much criticised for their poor standards support.
The web has since expanded enormously, as an information resource, advertising and entertainment medium, for electronic commerce ('e-commerce'), and for personal home pages. Institutes and organisations with web sites range from the White House to NASA to 'virtual libraries'. Companies maintain web sites to promote products, offer services, and sell goods.
The Web is huge, and it has information on almost anything you can think of. There are millions of computers on the Internet. Each of those computers can run a "Web site" and publish Web pages. There is no central control or authority. In fact, you can publish your own information on the Web, using HTML.
People create web sites (such as EncycloZine), which consist of web pages (such as this one), using HyperText Markup Language (HTML), a standard format for describing the structure of web documents. HTML documents are ASCII files with embedded codes for logical markup, format (text styles, document titles, paragraphs, tables) and hyperlinks.
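To make the idea of "embedded codes" concrete, here is a minimal sketch in Python that reads a small HTML document and lists the hyperlinks it contains. The sample page and the names `SAMPLE_PAGE`, `LinkCollector` and `extract_links` are assumptions made up for illustration; only the standard library's `html.parser` module is used.

```python
from html.parser import HTMLParser

# Hypothetical sample page: the markup below is illustrative only.
SAMPLE_PAGE = """<html>
<head><title>Sample page</title></head>
<body>
<h1>Sample page</h1>
<p>Read about <a href="http://EncycloZine.com/WWW/">the Web</a>
or visit <a href="http://www.whitehouse.gov">the White House</a>.</p>
</body>
</html>"""


class LinkCollector(HTMLParser):
    """Collects the href target of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the hyperlink targets found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links
```

Calling `extract_links(SAMPLE_PAGE)` returns the two link targets in the order they appear. A browser does much more than this (layout, images, forms), but the principle is the same: the tags are plain text that a program can recognise and act on.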
There are currently some one billion (1000 million) web pages. How are they organised? How do you navigate them? Each page has a unique address, or URL (Uniform Resource Locator). If you know the URL of a page (e.g. you saw it advertised, or a friend told it to you), you can type it into your browser and it will fetch the page for you. If you have no specific URL then perhaps you'll use one or more portals or search engines.
Other resources, such as pictures and sounds, are also identified by URLs. When you are viewing a page with a Web browser, the URL of that page is usually displayed in a box near the top of the browser's display window. If you know the URL for a page, you can go directly to that page by entering the URL in that box (and pressing return).
A typical URL is http://EncycloZine.com/WWW/. This URL has several meaningful parts:
The protocol (e.g. http:). The HyperText Transfer Protocol (HTTP) is the most common method used for communication on the Web. Another common protocol is File Transfer Protocol (FTP), an older method for transferring files from one computer to another. You might also run across some other protocols in URLs.
A domain name, such as EncycloZine.com, identifies a particular computer on the Internet. Most of the computers that are used as "servers" of data on the Web have domain names that begin with "www", such as www.whitehouse.gov. You can often read some information about a computer from its domain name. The computer named math.hws.edu is in the Mathematics Department ("math") at Hobart and William Smith Colleges ("hws"), which is an educational institution ("edu"). The last part of the domain name, such as "gov" or "edu", is called the top-level domain. Top-level domains include:
- COM for commercial purposes
- EDU for educational institutions
- GOV for government computers
- MIL for the military
- ORG for other organizations
- NET for certain Internet services
These domains are usually used by computers in the United States. Computers in other countries generally use two-letter country codes for their top-level domains. For example, a domain name ending in "it" indicates a computer in Italy, and "ca" is used by Canadian computers.
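As a rough illustration of reading information from a domain name, the following Python sketch picks out the top-level domain and looks it up. The table `GENERIC_TLDS` simply restates the list above; the function names and the domain `example.it` in the usage note are hypothetical, and the two-letter test is only a heuristic, not a check against the real list of country codes.

```python
# Descriptions restated from the list above; the table itself is illustrative.
GENERIC_TLDS = {
    "com": "commercial purposes",
    "edu": "educational institutions",
    "gov": "government computers",
    "mil": "the military",
    "org": "other organizations",
    "net": "certain Internet services",
}


def top_level_domain(domain_name):
    """Return the last dot-separated part of a domain name, in lower case."""
    return domain_name.rstrip(".").rsplit(".", 1)[-1].lower()


def describe_domain(domain_name):
    """Give a rough description of a domain based on its top-level domain."""
    tld = top_level_domain(domain_name)
    if tld in GENERIC_TLDS:
        return GENERIC_TLDS[tld]
    # Heuristic assumption: any other two-letter ending is a country code.
    if len(tld) == 2 and tld.isalpha():
        return "two-letter country code"
    return "unknown"
```

For example, `top_level_domain("math.hws.edu")` gives `"edu"`, and `describe_domain("example.it")` reports a two-letter country code.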
Many companies, organizations, and institutions have "home pages" on the Web. If you know something about domain names, you can often guess the URL used by a given company, organization, or institution. For example, you might guess that the home page of the United States Senate is http://www.senate.gov or that the Coca-Cola corporation has a home page at http://www.cocacola.com. (When you use a URL that omits the directory and file name, you will usually get the home page, or index page, from the specified computer.)
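The parts of a URL described above can be separated by a program. The sketch below uses `urlsplit` from Python's standard `urllib.parse` module; the helper name `url_parts` and the dictionary keys are assumptions chosen for this illustration. Note that the `hostname` attribute reports the domain name in lower case, since domain names are not case-sensitive.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into its protocol, domain name, and path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        # An omitted directory and file name is treated as the root path "/",
        # which normally leads to the site's home page.
        "path": parts.path or "/",
    }
```

With this helper, `url_parts("http://EncycloZine.com/WWW/")` yields the protocol `http`, the domain `encyclozine.com` and the path `/WWW/`, while `url_parts("http://www.senate.gov")` has the path `/`.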
Portals are gateways to the WWW, featuring categorised directories, and supplementary services such as email, search, news, weather reports, stock quotes, chat rooms, etc. In the early web years (ca. 1993 - 1996) there were directories and search engines, which merged in a natural partnership and gradually added other useful services, such as news and shopping, and they began to call themselves 'portals'. They added personalisation features, such that you could have your own ('my') view of the portal, such as local weather, and your own selection of news sources.
They continued to expand the range of services, trying to be all things to all surfers. All this had to be presented on the home page at least, and to some extent on all other pages too, lest you fail to notice the richness of content. The result is that their pages are busy and crowded, and take longer to load because of complex table layouts and lots of banners and icons. So the term 'portal' now means less a directory and search function and more a plethora of services. These may indeed be very useful to a large proportion of the web surfing population, but they have, in my estimation, made portals more difficult to use as research tools, especially for the more academic subjects. Often it's not obvious where they have classified something, since the top level refers to 'Autos', 'Shopping', and other consumer-oriented fare, while topics such as 'Science' may have been relegated to 'Education' or 'Library'.
Locating information on the web is becoming more and more problematic. Portals and search engines overwhelm users with vast quantities of information, much or most of which is not precisely what was wanted. Only a few of them provide any rating or evaluations, so you may have to visit several sites before you find one that might be usable and trustworthy. Quite often, the sites have moved, died, or changed to something else.
The portals and search engines all look so much alike, and present you with an indigestible mountain of choices, leading to crummy, irrelevant sites - you come away with frustration, probably not having really found what you were hoping for. These sites make research and learning tedious, instead of fun.
At EncycloZine we depend very much on portal directories to help us locate the best web sites for you. This means that we've had to examine them all very critically to identify the most helpful and reliable ones. In particular, those that have a zillion links for any particular topic, but no kind of rating system, are almost entirely worthless. It costs far too much time to visit the links, only to find that they're no good: irrelevant, dead or moved, inaccurate, hard to use, etc. Otherwise, why not just use a search engine?
We expect a portal directory to add value by having experts review sites and rate them somehow, say by awarding a number of stars, or at the very least, by excluding sites that don't measure up to reasonably high standards. We have reviewed all the major portals and come up with a short-list of those we personally find the most useful. Our main criterion was quality over quantity; we also considered factors like objectivity and up-to-datedness.
Searching the Web
A search engine consists of an index of millions of Web pages and a program for searching the index. The index is made by a program that constantly downloads Web pages and adds their contents to the index. No index can include all the data on the Web because the Web grows so quickly. Also, some of the data in an index will be out of date because people change or delete their Web pages.
To use an index, all you have to do is type some words into a box and click on a button. (You can do more advanced searches, but most search engines allow you to do simple searches in this way.) You'll get back a list of Web pages that contain the words you entered.
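The idea of an index and a program that searches it can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular search engine works: the three pages in `PAGES` and their `example.com` URLs are made up, the function names are hypothetical, and a real index would also rank results and cope with far more data.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for documents downloaded from the Web.
PAGES = {
    "http://example.com/a": "The Web is a distributed hypertext system",
    "http://example.com/b": "A search engine builds an index of Web pages",
    "http://example.com/c": "Hypertext links connect pages on the Web",
}


def tokenize(text):
    """Break text into lower-case words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs of pages containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)
```

After `index = build_index(PAGES)`, the call `search(index, "hypertext web")` returns the URLs of pages a and c, the only two that contain both words.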