HTML Primer

Scott MacKenzie, April 2000


Table of Contents

  1. Introduction
  2. Tags
  3. Organization of a Web Page
  4. Headings
  5. Separators
  6. Emphasizing Text
  7. Colour
  8. Font Attributes
  9. Lists
  10. Definition Lists
  11. Images
  12. Links
  13. Tables
  14. Miscellaneous
  15. Frames
  16. Style Sheets
  17. Forms

Introduction

This brief introduction to HTML has been prepared for ITEC students working on homework or lab assignments.

Web pages are written and described in a language known as "HTML" for "hypertext markup language". HTML is not like other computer languages, since you do not compile and execute HTML "programs". HTML is a "markup" language. This means it describes the look of a web page. When a browser, like Netscape, reads in the web page, it interprets the HTML codes and displays the result on the screen. This is overly simplistic, but that's a good enough description for now.

Click here for a nice historical look at the web and hypertext.

Tags

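The basic building block of HTML is a tag. (Strictly speaking, a start tag, its content, and the matching end tag together form an element, but the two terms are often used interchangeably.) HTML tags begin with a less-than sign (<) and end with a greater-than sign (>). Many tags bracket some text and determine how the text appears, for example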

   <b>VERY IMPORTANT</b>

indicates that the text "VERY IMPORTANT" should appear in boldface type in the browser window. The first tag (<b>) turns on bold, and the second tag (</b>) turns off bold. The text would appear as follows:

   VERY IMPORTANT

Tags, for the most part, are not case sensitive, so <b> and <B> have the same effect.

Organization of a Web Page

The basic framework for a web page is

   <html>
   
   <head>
      <title>Title of the Web Page</title>
   </head>

   <body>
   Put the body of your web page here
   </body>

   </html>

Study this. Four pairs of HTML tags are illustrated. The <html></html> pair frames the web page. Within this pair, the <head></head> pair frames the header for the web page, and the <body></body> pair frames the body of the web page. Within the header, the <title></title> pair frames the title of the web page. Note that the <title></title> tags contain the text that appears in the title bar of the browser window. This text does not actually appear in the web page.

If the HTML code above is put in a text file named demo.html, it can be opened by a browser, like Netscape. The browser window will display

Put the body of your web page here

Note that HTML does not use end-of-line characters, as such. A browser treats a line break in the source as ordinary white space. So, the following body

   <body>
   Put
   the
   body
   of
   your web page here </body>

appears the same.

OK, let's look at some specific types of HTML tags.

Headings

Headings are useful to give emphasis to the titles of sections, sub-sections, etc. Six levels of headings are supported:

   <h1> </h1>  begin and end a level 1 heading
   <h2> </h2>  begin and end a level 2 heading
   <h3> </h3>  begin and end a level 3 heading
   <h4> </h4>  begin and end a level 4 heading
   <h5> </h5>  begin and end a level 5 heading
   <h6> </h6>  begin and end a level 6 heading

For example

   

Level 1 Heading

Level 2 Heading

Level 3 Heading

Level 4 Heading

Level 5 Heading
Level 6 Heading
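
The six headings above could be produced by source along the following lines. This is only a minimal sketch; the heading text is just the placeholder text used in the example.

```html
<!-- one heading at each of the six levels, largest first -->
<h1>Level 1 Heading</h1>
<h2>Level 2 Heading</h2>
<h3>Level 3 Heading</h3>
<h4>Level 4 Heading</h4>
<h5>Level 5 Heading</h5>
<h6>Level 6 Heading</h6>
```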

Separators

Separating sections of a web page is accomplished by the following tags:

   <p>    paragraph
   <br>   line break
   <hr>   horizontal rule

The <p> tag marks the beginning of a paragraph of text. It forces the paragraph to start on a new line following an appropriate amount of space. The matching </p> end tag is optional, although, for style, many authors prefer to include it at the end of each paragraph.

The <br> tag forces a break in the current line of text. Text continues immediately on the next line. This tag is useful in laying out addresses, for example,

York University
4700 Keele St.
Toronto, Ontario
M3J 1P3

The <hr> tag is used to create a horizontal line, or rule, across the page.
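
As a sketch of how the three separator tags might be combined, the address shown earlier could be marked up as follows, with a rule beneath it. The introductory sentence in the first paragraph is made up for illustration.

```html
<!-- first paragraph: an introductory sentence (invented for this sketch) -->
<p>Send correspondence to the following address:</p>

<!-- second paragraph: <br> breaks each line of the address -->
<p>
York University<br>
4700 Keele St.<br>
Toronto, Ontario<br>
M3J 1P3
</p>

<!-- a horizontal rule separating the address from whatever follows -->
<hr>
```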


Emphasizing Text

The most common tags for emphasizing text are

   <b>  </b>   begin and end boldface type
   <i>  </i>   begin and end italic type
   <u>  </u>   begin and end underlined type

Some examples follow:

   boldface
   italics
   underlined
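
The three examples above could come from source like the sketch below. The last line is an extra illustration, not taken from the examples, showing that these tags can be nested as long as they are closed in the reverse order.

```html
<b>boldface</b>
<i>italics</i>
<u>underlined</u>

<!-- nested tags: the inner tag is closed before the outer one -->
<b><i>boldface and italics together</i></b>
```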

Colour

Another way to emphasize text is by displaying it in a different colour, for example

RED GREEN BLUE

The HTML source code responsible for this beauty is shown below:

   <b>
   <font color = "red">RED</font>
   <font color = "green">GREEN</font>
   <font color = "blue">BLUE</font>
   </b>

This example uses the "font" tag, but the work is done using an attribute of the tag, in this case, the "color" attribute. (Note that American spelling is used.)

The following named colours are defined: black, silver, gray, white, maroon, red, purple, fuchsia, green, lime, olive, yellow, navy, blue, teal, and aqua. I'll leave it to you to experiment with these. But, please, try not to overuse colour. The effect can be awful. If you don't believe me, have a peek at ICQ's home page:

   http://www.icq.com/

What were they thinking? Ugh!

If you're into designer paints, then you'll likely want more control than that afforded by the named colours. No problem, you can specify a colour using a 24-bit hexadecimal value, with 8 bits (2 hexadecimal digits) for red, green, and blue, respectively. For example, the word "YORK" in York University's logo (shown later in this web page) is rendered in the following "deep red" colour:

YORK

created with the following HTML source code:

   <b><font color = "#CC0000">YORK</font></b>

Can you guess how I determined that this is the code for the colour?
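
To see how a hexadecimal colour value breaks down, here is a small sketch using the "deep red" value above plus a few other values chosen only for illustration. Each pair of digits runs from 00 (none of that colour) to FF (255, the maximum).

```html
<!-- #CC0000: red = CC (204 out of 255), green = 00, blue = 00 -->
<font color = "#CC0000">deep red</font>

<!-- #000000: no red, no green, no blue -->
<font color = "#000000">black</font>

<!-- #FFFF00: full red and full green, no blue -->
<font color = "#FFFF00">yellow</font>

<!-- #808080: the same medium amount of all three -->
<font color = "#808080">gray</font>
```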

Since we're talking about colour, and we've also introduced the idea of attributes, you might be interested in this. Web pages commonly use a non-white background colour, just to create a nice effect. This is accomplished using the "bgcolor" attribute of the "body" tag. This web page, for example, is set up using the following body tag:

   <body bgcolor = "#fff8e0">

Do you like the effect? Oh well, I tried.

Font Attributes

Yet another way to add emphasis to text is through font size. The size of the text is controlled by the "size" attribute of the "font" tag. There are a few ways to do this.

The font size can be changed in "relative" terms or in "absolute" terms. To insert one slightly bigger word in some text, like this, the following HTML source code is used:

   <font size = "+2">this</font>

The </font> tag restores the font to the previous size, so

smaller normal bigger bigger bigger

is created with the following HTML source code:

   <font size = "-1">smaller</font>
   normal
   <font size = "+1">bigger</font>
   <font size = "+2">bigger</font>
   <font size = "+3">bigger</font>

By omitting the "+" or "-" symbols, the specification is in absolute terms. I'll leave it with you to experiment with this.
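
As a starting point for that experiment, here is a minimal sketch of absolute sizes. In HTML 4, absolute sizes run from 1 (smallest) to 7 (largest), and browsers normally use 3 when no size is given. The words between the tags are made up for illustration.

```html
<!-- absolute sizes: no "+" or "-" in front of the number -->
<font size = "1">smallest</font>
<font size = "3">normal</font>
<font size = "5">bigger</font>
<font size = "7">largest</font>
```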

The colour and/or family, or "face", of a font is controllable using the color (note American spelling) and face attributes. For example,

Yaba Duba Doo

is created with the following HTML source code:

   <font size="+2" color="blue" face="Comic Sans MS">
   Yaba Duba Doo
   </font>

Lists

The most common tags for creating lists are

   <ul> </ul>  begin and end an unordered list
   <ol> </ol>  begin and end an ordered list
   <li>        begin a list entry

Note that list entries must be contained within either <ul></ul> tags or <ol></ol> tags, for unordered and ordered lists, respectively.

An unordered list is set up as follows:

   <ul>
   <li>apples
   <li>bananas
   </ul>

with the following result:

  * apples
  * bananas

An ordered list is set up as follows:

   <ol>
   <li>Maple Leafs
   <li>Canadiens
   </ol>

with the following result:

  1. Maple Leafs
  2. Canadiens

Definition Lists

A special type of list is a Definition List. Three tags are required to set up a definition list:

   <dl> </dl>   begin and end a definition list
   <dt>         term
   <dd>         definition

Let's show an example using the definitions for the computer terms "RAM" and "ROM". Here's the HTML code:

   <dl>
   <dt>RAM
   <dd>Stands for "Random Access Memory".  RAM is a type of 
   semiconductor memory that can be read or written. RAM 
   is volatile, meaning that its contents are lost when power 
   is shut off.
   <dt>ROM
   <dd>Stands for "Read Only Memory". ROM is a type of semiconductor 
   memory that can only be read.  ROM is non-volatile, meaning 
   that its contents are retained even in the absence of power.
   </dl>

The result is shown below:

RAM
Stands for "Random Access Memory". RAM is a type of semiconductor memory that can be read or written. RAM is volatile, meaning that its contents are lost when power is shut off.
ROM
Stands for "Read Only Memory". ROM is a type of semiconductor memory that can only be read. ROM is non-volatile, meaning that its contents are retained even in the absence of power.

Images

An important feature of HTML is the ability to include images in web pages. This is accomplished as follows:

   <img src = "filename">

where "filename" is the name of an image file. In this example, "src" is the source attribute of the "img" (image) tag. Typically, the image is stored either in gif format or jpg format, and the filename ends with .gif or .jpg. If the image file is stored in the same location (directory) as the HTML file, then just the filename is required. Otherwise, the path to the file must be given as well.

Let's attempt an example. If the York University logo is stored in a file called "yorklogo.gif" and that file is located in the same directory as this file, then the line

   <img src = "yorklogo.gif">

should display the York University logo:

   

If the logo appears above, whew! It worked. Too bad the white background in the image isn't transparent. This is possible with gif images, but that's another story.
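
Here is a sketch of the different ways the image file might be specified. The subdirectory name "images" and the www.example.com address are made up for illustration. The "alt" attribute, which also appears in the frames example later in this primer, supplies text to display if the image cannot be shown.

```html
<!-- image file in the same directory as the HTML file -->
<img src = "yorklogo.gif" alt = "York University logo">

<!-- image file in a subdirectory called "images" (hypothetical name) -->
<img src = "images/yorklogo.gif" alt = "York University logo">

<!-- image file on another web server (hypothetical address) -->
<img src = "http://www.example.com/images/yorklogo.gif" alt = "York University logo">
```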

Links

The "H" in HTML stands for "hypertext". Hypertext is a way of organizing documents so that an object within a document can refer to other objects. Typically, the object refers to objects in other documents, but, also, it may refer to an object within the same document. The object being referred to may even be an application program. The idea of "refer to" is implemented with a "hyperlink", usually just called a "link". Let's begin by illustrating links to other documents.

Links to Other Web Pages

By "other documents", we really mean "other web pages". A reference to another web page is made via an HTML link. An HTML link is created using the anchor tag as follows:

   <a href = "URI">
   click here
   </a>

where the term "href" is an attribute of the anchor tag. Clearly the anchor tag, a, is very important in HTML programming. You may want to spend some time reading about this tag (and its attributes) in an HTML reference guide.

The letters URI stand for Uniform Resource Identifier, a generic term for all types of names and addresses of objects on the World Wide Web. A URL (Uniform Resource Locator) is one kind of URI; it is an internet address. The attribute name "href" is short for "hypertext reference". Instead of "URI", above, you enter a web address, for example,

   http://www.yorku.ca/

is the web address for York University's home page. The text between the tags, for example "click here", appears in the web page with a distinct appearance to emphasize that it is a link. Clicking on the highlighted text advances the browser to the destination web location.

Let's try it. The text

   <a href = "http://www.yorku.ca/">click here</a>

appears in the web page as

   click here

The text above is a link, distinguished from regular text by the colour and underlining. If you click on the link, it will advance the browser to York University's home page.

Links Within a Web Page

The destination for a link may be within the same web page as the link itself. The destination is identified with a "bookmark". To do this, we again use the anchor tag, but this time with the "name" attribute to name the bookmark within the document:

   <a name = "foo">blah</a>

The HTML code above, or something similar, appears at the destination of the link. The text "blah" is optional. If used, it appears at the destination location, usually underlined, to show that it is the destination of a link.

The source of the link is constructed as shown earlier using the href attribute. However, since the destination is within the same document, only the named anchor is required. It must be preceded by a hash symbol:

   <a href = "#foo">click to go to blah</a>

Links within web pages are commonly used to create a table of contents. For example, the following is the HTML code for the first two entries of the table of contents shown at the top of this web page:

   <ol>
   <li><a href = "#i">Introduction</a>
   <li><a href = "#t">Tags</a>
   </ol>

At the location in this document where the sub-section "Tags" begins, the bookmark "t" is defined. The definition is coded like this:

   <a name = "t"></a>
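
The two kinds of link can also be combined, so that a link jumps to a bookmark inside a different web page. The sketch below assumes a second file called notes.html in the same directory; the file name and the bookmark name "summary" are made up for illustration.

```html
<!-- inside notes.html (hypothetical file): define the bookmark -->
<a name = "summary"></a>

<!-- inside some other web page: file name, then hash symbol, then bookmark -->
<a href = "notes.html#summary">go to the summary in the notes</a>
```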

E-Mail Links

A common variation of a link is an e-mail link. An e-mail link is set up as follows:

   <a href = "mailto:email_address">
   send me email
   </a>

This is a link to the default email program on the system that is viewing the web page. If this link is clicked, the system's email program is launched and the composition of an email message is initiated with "email_address" appearing in the "To:" field.

Here's an example:

   To send email to the TA for ITEC 1010, Section M, click here

Here's the HTML source code for the above link:

   To send email to the TA for ITEC 1010, Section M, 
   <a href = "mailto:ta1010m@math.yorku.ca">click here</a>

Tables

Information may be conveniently organized and presented in a web page using a table. The following four tags are the ones most commonly needed:

   <table> </table>   begin and end a table
   <tr>               table row
   <th>               table header
   <td>               table data 

Note that the header and data entries are organized within a row entry. It is considered good style to terminate each row, header, and data entry with </tr>, </th>, and </td>, respectively, but browsers generally do not require these end tags.

The best way to proceed is with an example. Here's the definition list given earlier for RAM and ROM in the form of a table:

   Type of Memory | Definition
   RAM            | Stands for "Random Access Memory". RAM is a type of semiconductor memory that can be read or written. RAM is volatile, meaning that its contents are lost when power is shut off.
   ROM            | Stands for "Read Only Memory". ROM is a type of semiconductor memory that can only be read. ROM is non-volatile, meaning that its contents are retained even in the absence of power.

It's a bit messy, but here's the HTML source code for the table above:

   <table align = "center" border=1 cellpadding=5 bgcolor="#f0f0f0" > 
   <tr align = "center" bgcolor="#e0e0e0">
      <th>Type of Memory
      <th>Definition
   </tr>
   <tr align = "left" valign = "top">
      <td>RAM
      <td>Stands for "Random Access Memory".  RAM is a type of 
      semiconductor memory that can be read or written. RAM 
      is volatile, meaning that its contents are lost when power 
      is shut off.
   </tr>
   <tr align = "left" valign = "top">
      <td>ROM
      <td>Stands for "Read Only Memory". ROM is a type of 
      semiconductor memory that can only be read.  ROM is 
      non-volatile, meaning that its contents are retained even 
      in the absence of power.
   </tr>
   </table>

Note the use of attributes to control the style of the table, or rows within the table. There is much more to tables than presented here, but we'll leave it to you to explore further.
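
For comparison, here is a stripped-down sketch of a table that uses only the four basic tags plus a border, with every row, header, and data entry explicitly terminated. The team names are borrowed from the ordered list example; the "City" column is added for illustration.

```html
<table border = "1">
<!-- header row -->
<tr>
   <th>Team</th>
   <th>City</th>
</tr>
<!-- one data row per team -->
<tr>
   <td>Maple Leafs</td>
   <td>Toronto</td>
</tr>
<tr>
   <td>Canadiens</td>
   <td>Montreal</td>
</tr>
</table>
```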

Miscellaneous

Below are a few miscellaneous HTML tags and features.

These are all used "in" this page. From Netscape, click on View | Page Source to have a peek at how the examples herein are constructed. In fact, this is an excellent way to learn HTML.

Frames

Frames allow you to present documents in multiple views, for example to keep certain information visible, while other information is scrolled or replaced. Within the same window, one frame might display a company logo, a second a menu, and a third a primary document. The main document might include scrollbars, or, additionally, it might be scrolled through or replaced by navigating in the second frame.

Normally, a web document contains one head element and one body element. With frames, a frameset element replaces the body element. Here's an example:

   <frameset cols="20%, 80%">
       <frameset rows="100, 200">
           <frame src="frame1.html">
           <frame src="frame2.gif">
       </frameset>
       <frame src="frame3.html">
       <noframes>
           <p>This frameset document contains:
           <ul>
              <li><a href="frame1.html">Some neat contents</a>
              <li><img src="frame2.gif" alt="A neat image">
              <li><a href="frame3.html">Some other neat contents</a>
           </ul>
       </noframes>
   </frameset>

This code could be used instead of a body element. It would create a frame layout something like this:

   ---------------------------------------
  |         |                             |
  |         |                             |
  | Frame 1 |                             |
  |         |                             |
  |         |                             |
  |---------|                             |
  |         |          Frame 3            |
  |         |                             |
  |         |                             |
  |         |                             |
  | Frame 2 |                             |
  |         |                             |
  |         |                             |
  |         |                             |
  |         |                             |
   ---------------------------------------

The first frameset tag declares a frameset with two columns. The first will occupy 20% of the screen width, the second 80%. Frameset tags can be nested. The contents of the first (i.e., left) column are specified with another frameset tag. This divides the left column into two rows, the top row 100 pixels high, and the bottom row 200 pixels high. Have a close look at the code and the line drawing above, and you can see how this works.

The contents of each cell are specified with the frame tag and a src attribute referencing another file. The noframes element is used for browsers not supporting frames, or in situations where the user has disabled frames in the browser's setup.
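
To show where the frameset sits in a complete web page, here is a minimal sketch of a simpler layout with just two rows. The file names logo.html and main.html are made up for illustration. The asterisk tells the browser to give the second row whatever height is left over.

```html
<html>

<head>
   <title>A Two-Row Frameset</title>
</head>

<!-- the frameset takes the place of the body -->
<frameset rows = "100, *">
   <!-- top row, 100 pixels high (hypothetical file) -->
   <frame src = "logo.html">
   <!-- bottom row, the remaining height (hypothetical file) -->
   <frame src = "main.html">
   <!-- shown only by browsers that do not display frames -->
   <noframes>
      <p>This page uses frames. The main document is
      <a href = "main.html">here</a>.
   </noframes>
</frameset>

</html>
```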

For another, more realistic, example of frames, click here, then navigate around the example page, and view the HTML source code.

Further technical details on frames may be found at the W3C Consortium's web page. Click here to go directly to the section on frames.

Style Sheets

Style sheets represent a major breakthrough for designing web pages. They allow the specification of the "look" of a web page to be separated from the "content". There are several important advantages of this approach:

As an example of in-line control of an element's properties, the following code

   <p style="color: red; font-size: 18pt; border: solid olive">
   This paragraph has unique properties.

yields the following result, but only for the current instance of the p (paragraph) element:

This paragraph has unique properties.

Header-defined style definitions occur within the head section of the web page and apply to all instances of the element in the page. For example, the headings for each section in this web page are Level-2 Headings, defined by the following code in this web page's head:

   <head>
      <style type="text/css">
      H2 {
         color: #cc3300;
         font-family: Arial,Helvetica;
      }
      </style>
   </head>

With this definition in the head, all instances of h2 elements (Level-2 Headings) appear with the properties indicated above. (If more than one font family is specified, the browser will use the first one available.)

The definition appears within the style element. The type attribute is set to "text/css", defining the style language as conforming to the "cascading style sheet" standard.

For each element, its properties are defined using the following syntax:

   element { 
      property: setting; 
      property: setting; 
      ... 
   }

A more powerful way to control styles is through a separate style sheet file. Let's assume you want Level-1 Headings to appear centred in blue using the Arial font. If a file named MyStyle.css contains

   H1 {
      font-family: "Arial";
      text-align:  center; 
      color: blue;
   }

then the desired effect is invoked in an HTML document as follows:

   <html>
      <head>
         <link href="MyStyle.css" rel="stylesheet" type="text/css">
      </head>
      <body>
         <h1>Level 1 Heading</h1>
         <p>This is a paragraph.
      </body>
   </html>

Click here to see the web page.

In the HTML document, the external style sheet is specified with the link element combined with three attributes: The href attribute specifies the style sheet file. The rel attribute specifies whether the style sheet is persistent, preferred, or alternate. The argument stylesheet means persistent. The type attribute specifies the style sheet language, in this case text/css for cascading style sheet.

In the style sheet file, the properties for the various HTML elements are defined. In the example, the Level-1 Heading element is stylized to appear (a) using the Arial font family, (b) centred, and (c) in blue. Other HTML elements are similarly stylized.

Some of the more common properties are given below:

   Property           Examples
   font-family        Courier, Times New Roman, serif, monospace
   font-style         normal, italic
   font-weight        normal, bold, bolder, lighter
   font-size          large, 10pt, larger, 90%, +5%
   color              blue, #ff00a0
   background-color   brown, #00aaff
   text-align         left, right, center, justify
   text-indent        25, 10%
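
As a sketch of how several of these properties might be used together, the header-defined style below applies some of the example settings from the table to paragraphs and Level-3 Headings. The particular combination is chosen only for illustration and is not taken from this web page's own style definitions.

```html
<head>
   <style type="text/css">
   /* every paragraph: serif font, justified, first line indented */
   P {
      font-family: "Times New Roman", serif;
      font-size: 10pt;
      text-align: justify;
      text-indent: 10%;
   }
   /* every Level-3 Heading: bold italic text on a coloured background */
   H3 {
      font-style: italic;
      font-weight: bold;
      color: #ff00a0;
      background-color: #00aaff;
   }
   </style>
</head>
```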

Further technical details on style sheets may be found at various web sites. For the W3C Consortium's page on this topic, click here. For the Web Design Group's page on this topic, click here.

Forms

An HTML form is a section of a document containing special elements called "controls". Controls are the HTML versions of checkboxes, radio buttons, menus, and other components familiar to users of graphical user interfaces (GUIs).

Typically, users "complete" a form by modifying its controls (entering text, selecting menu items, etc.), and then submit the form to an agent for processing (e.g., to a web server or a mail server).

Here's a simple form that includes labels, text fields, radio buttons, a drop-down list, and push buttons:

   <form action="http://somesite.com/prog/adduser" method="post"><p>
      Please complete this form:<p>
      <label for="firstname">First name: </label>
         <input type="text" id="firstname" name="firstname"><p>
      <label for="lastname">Last name: </label>
         <input type="text" id="lastname" name="lastname"><p>
      <label for="email">email: </label>
         <input type="text" id="email" name="email"><p>
      <label for="eyecolour">Eye Colour: </label>
         <select id="eyecolour" name="eyecolour">
            <option>blue
            <option>brown
            <option>hazel
         </select><p>
      <input type="radio" name="sex" value="Male"> Male<br>
      <input type="radio" name="sex" value="Female"> Female<p>
      <input type="submit" value="Send"> <input type="reset">
      </p>
   </form>

This code generates the following form:

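Please complete this form:

First name: [__________]
Last name: [__________]
email: [__________]
Eye Colour: [blue v]
( ) Male
( ) Female

[Send] [Reset]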

The form is contained within the <form></form> tags. Two attributes appear in the example above. The "action" attribute specifies a form processing agent. This is typically an HTTP URI specifying a program to submit the form to, or a "mailto" URI to email the form. The "method" attribute specifies the HTTP method used to submit the form data set. Possibilities include "get" (the default) and "post".

(Note: HTTP = hypertext transfer protocol, URI = uniform resource identifier)

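Most form controls are specified using the "input" element followed by the appropriate attributes. The "type" attribute specifies the type of control ("text", "radio", "submit", and "reset" are shown above). The drop-down list is the exception: it is built with the "select" and "option" elements. The "id" attribute identifies each control so that a label can refer to it, while the "name" attribute gives the name under which the control's value is submitted. Text controls typically include labels, while radio buttons typically include values.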
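
For contrast, here is a minimal sketch of a form that uses the "get" method and a checkbox, one of the controls mentioned at the start of this section. The program name "search" and the control names "query" and "exact" are made up for illustration.

```html
<form action="http://somesite.com/prog/search" method="get">
   <p>
   <label for="query">Search for: </label>
      <input type="text" id="query" name="query">
   </p>
   <p>
   <!-- a checkbox is submitted only if it is checked -->
   <input type="checkbox" id="exact" name="exact" value="yes">
   <label for="exact">Match exact phrase</label>
   </p>
   <p>
   <input type="submit" value="Search">
   </p>
</form>
```

With "get", the browser appends the form data to the action URI as name and value pairs. If the user types html and checks the box, the request goes to http://somesite.com/prog/search?query=html&exact=yes.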

Further technical details on forms may be found at the W3C Consortium's web site. Click here to go directly to the section on forms.