Getting Started on the Web

This page is intended to help you get started on the web by showing you how to set up and maintain a home page. It will suggest some free and low-cost tools to help in this, and some procedures that you can use to get started.

Tools You Will Need

The School of Computer Science and Information Systems at Pace University maintains two servers: csis.pace.edu in Westchester, and sol.pace.edu in New York City. Both of these systems run the UNIX operating system and have the usual range of UNIX tools available. Most users will choose not to work directly on these systems, however. You can effectively work on either a PC or a Macintosh and simply upload your work to the server.

Telnet

A telnet client lets you log in remotely to the server over the net. There is a telnet client that comes with Windows 95 that works fine. Look in the Windows directory for it. On the Macintosh, I recommend the NCSA Telnet client that you can get from the National Center for Supercomputing Applications at the University of Illinois. To log in to the remote machine you first have to establish a connection to it using the machine name, like sol.pace.edu. You then have to give your user ID and password. You will then be able to issue all of the usual command mode UNIX commands. You will need only a few of these and you will need to use telnet only rarely. See below.

FTP

An ftp client lets you move files from your local machine to a remote machine and back. You can actually use telnet to do this, but it is awkward. You can also use most web browsers to do ftp. You should get a proper ftp client to ease your work. I think that there is one bundled with Windows 95, but a better one is called WS_FTP and you can get it from the usual shareware sites online (see below). This program has a very nice two-window interface, one window for each machine. You simply highlight a file on one machine and click a transfer button to put a copy on the other machine. On the Macintosh, I recommend Fetch, from Dartmouth College (FTP Dartmouth). This permits drag and drop transfers between a window representing the remote machine and your desktop.

You will use ftp to transfer web pages and graphics developed on your local machine to your personal web site.

Image Tools

The web knows how to display jpeg and gif files, but not bitmaps and some other graphics formats. You need to be able to translate some files from foreign formats into a web native format before they can be displayed. There are some commercial tools (Photoshop) that do an excellent job of this. There are also some shareware tools that can be used effectively for these simple translation tasks. I use GraphicConverter on my Macintosh to translate file formats, and JPEGView to view graphics files. Note that gif is best for line drawings and graphics and jpeg is best for photographs. Remembering this will help keep your image files small and compact.

You can find lots of tools for both Macintosh and PC at TUCOWS. It is also nicely indexed. If you use shareware programs, however, you should really pay the shareware fees. The authors have put a lot of effort into creating these programs. You can encourage the creation of this low cost software by rewarding the authors appropriately.

HTML Composing Tools

Recent browsers, like Netscape Communicator, have a composing mode that lets you build an html document without looking at the html code. These are visual editors that let you work directly with the rendered elements. They are very convenient, but do not always provide all elements of the html language that you might like to use.

You need to be able to create web pages in html format. HTML is a page formatting computer language. It is not especially easy to use and is quite ugly. You can learn the simple elements of the language in a few days and expand your knowledge as you go. Alternatively, you can use a translator to create the html for you. If you want to write your own, you need any editor that will create a simple text (ascii) file. Most word processors can do this as well, but you lose all word processing formats when you export text.

Some word processors, like Microsoft Word for Windows, have an html export function built in to the file save function. This is effective, but if you want to later edit the html file itself, you will find that the form produced by Word is difficult to use.

Alternatively, you can use Word and some other word processors to save your document in RTF (Rich Text Format). This is sometimes called Document Interchange Format, but I think that there is another format by that name that is not compatible. Once you have the RTF file, you can use a tool called RTFtoHTML to translate it into HTML. I like this tool (both Macintosh and PC versions) because you can tailor the output to match your writing styles.

RTFtoHTML comes with a special formatting document that is used to lay out the HTML. It will translate Word style sheets into HTML styles. It isn't easy to learn the language of this format page, but it is quite powerful, and only needs to be modified occasionally.

RTFtoHTML will create graphics files from any graphics in your RTF document, but these files will probably be in the native format of your computer: bitmaps on the PC and pict files on the Macintosh. You need an image tool to translate these to jpeg or gif formats.

Another method of creating web documents is to avoid HTML altogether and use Adobe Acrobat (pdf) files instead. This requires that your user have the (free) Acrobat Reader or plug-in, but it gives beautiful results. To use it you need to pay money to Adobe to get the Adobe Acrobat Distiller. One way that it can be used is as a print driver. In this mode, you just write using your normal word processor and when you "print" it you get a pdf file that you can upload to the server. This is especially effective for publishing PowerPoint slide shows, for example.

 

What You Need to Do to Get Started

Get An Account

You need an account on one of the web servers in order to have a web page there. This will be a regular account with permissions to store files and send and receive mail.

Create the WWW Root Directory

Use your telnet client to log into your account. When you do so you should be in your personal root (home) directory. For example, if your user name is jonesm and your account is on sol, your root directory will be /export/home/jonesm. To the web, the public_html directory that you are about to create inside it is known as sol.pace.edu/~jonesm.

Use the mkdir command to create a new directory named "public_html"

mkdir public_html

This is your home site for the WWW. Anything within that directory can be accessed by web browsers. Things not in that directory are invisible to the web. You can organize that directory however you choose, but it is recommended that you use subdirectories to keep different projects separate. It is common, for example, to have one subdirectory named images that is used just to hold images to be shown on your pages.

NOTE that different web servers have different requirements. We use the Netscape Server Suite here at Pace, which requires users to use the public_html directory as a personal web site.

You can create subdirectories as needed with the mkdir command. You can switch to one of your directories with the cd command.

cd public_html
mkdir images

Create Your Pages and Collect Your Graphics

Use the editor or word processor to create your web page. In every directory on your web site, a file named index.html will be taken as the default document to show if the web user links to that directory. For example, if user jonesm has a file named index.html in the directory public_html of his or her home directory, then a user linking to sol.pace.edu/~jonesm will see that page. If you name your files otherwise, then the user has to give the document name as well. If they don't give a document name and there is no index.html file, then the server will generate a default directory page (ftp page) for them, but these pages are pretty ugly.

Remember that your files will be on a unix system eventually, where filenames are case sensitive. The name index.html is not the same as INDEX.HTML on a unix system.

The overall structure of an html document is as follows

 
<html>
<head>

</head>
<body>

</body>
</html>

In the head portion of the document you usually put a title, and perhaps an author. The title shows up as the title of the window in which the web browser displays the page. The title for this document is defined by:

<title> Getting Started on the Web </title>

All of the text and graphics that show up on the page itself are defined in the BODY portion of the page.

Most web pages show up as black print on a gray background. Pretty boring. This page has a white background because we put a bgcolor parameter in the BODY tag for the page:

<body bgcolor="#ffffff">

The color is given in RGB format as six hexadecimal digits, and ffffff is white. Because we have also decided that some of the text should be in red, we have enclosed various parts of the text in FONT tags of the form

<font color = red>
...
</font>
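
If you would rather generate the skeleton than type it, the pieces described above can be assembled by a few lines of Python. This is only a sketch: the function name make_page and the sample body text are my own inventions, and it assumes that the body you pass in is already valid HTML.

```python
from html import escape


def make_page(title, body_html, bgcolor="#ffffff"):
    """Return a complete HTML document as a string.

    title is plain text and is escaped; body_html is assumed to be
    valid HTML already and is inserted unchanged.
    """
    return (
        "<html>\n"
        "<head>\n"
        f"<title>{escape(title)}</title>\n"
        "</head>\n"
        f'<body bgcolor="{bgcolor}">\n'
        f"{body_html}\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical sample content, just to show the result.
page = make_page(
    "Getting Started on the Web",
    '<font color="red">Welcome to my home page.</font>',
)
print(page)
```

Save the printed text as index.html and open it in your browser to see the white background and the red text.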

Get your graphics into a form that will be acceptable to the web (jpeg or gif). Assemble them all together in a directory structure on your local machine that mimics the directory structure on the server. For example, if you intend to put graphics in an "images" directory, then create an images directory on your local machine and put the files there.

TEST TEST TEST

Now use your web browser to open the local files and make sure that they look as you expect. Check any embedded links (anchors) that you have in the document to make sure that you haven't mistyped something. Go back and edit as necessary. Do this before you publish your pages. Get the errors out before you go public.
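
Part of this checking can be automated. The following Python sketch collects the href and src values from a page and reports the relative ones that do not match a file on your local disk exactly, including upper and lower case. The function names and the file names in the demonstration are made up for illustration, and the check is deliberately simple: it ignores complete URLs (anything containing a colon) and links that start with a slash.

```python
import os
import tempfile
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href and src values found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def exists_exact_case(root, relative):
    """True if every part of the path matches a directory entry exactly."""
    current = root
    for part in relative.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            current = os.path.dirname(current)
            continue
        if not os.path.isdir(current) or part not in os.listdir(current):
            return False
        current = os.path.join(current, part)
    return True


def broken_local_links(page_path):
    """Return the relative links in page_path that have no matching file."""
    collector = LinkCollector()
    with open(page_path, encoding="latin-1") as f:
        collector.feed(f.read())
    folder = os.path.dirname(os.path.abspath(page_path))
    broken = []
    for link in collector.links:
        target = link.split("#")[0]
        # Skip complete URLs (http:, mailto:, ...) and server-rooted paths.
        if ":" in link or link.startswith("/") or not target:
            continue
        if not exists_exact_case(folder, target):
            broken.append(link)
    return broken


# Demonstration on a throwaway directory; all file names are made up.
with tempfile.TemporaryDirectory() as site:
    os.mkdir(os.path.join(site, "images"))
    open(os.path.join(site, "images", "Logo.GIF"), "wb").close()
    with open(os.path.join(site, "index.html"), "w") as f:
        f.write('<a href="http://csis.pace.edu/~bergin">home</a>\n'
                '<img src="images/logo.gif">\n'
                '<img src="images/Logo.GIF">\n')
    result = broken_local_links(os.path.join(site, "index.html"))

print(result)  # ['images/logo.gif']
```

Because the comparison is exact, the sketch reports a case mismatch even on a PC or Macintosh, where the browser would happily open the file anyway. It does not replace looking at the pages yourself.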

Upload Your Pages and Graphics

When your pages look ok, use your FTP client to upload your pages, including your subdirectories, to the web site. Make sure that your remote connection is to the public_html directory and not to your root directory.

Once your site is in place, you will upload only individual files to individual directories, but to start, you can upload directories as a whole.
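
If you prefer a script to a graphical FTP client, the same upload can be sketched with Python's standard ftplib module. The names upload_tree, publish, and RecordingFTP below are hypothetical, and I am assuming that the server accepts directory paths in its MKD and STOR commands, which most do. RecordingFTP is only a stand-in so that the sketch can be tried without a network connection.

```python
import os
import tempfile
from ftplib import FTP, error_perm


def upload_tree(ftp, local_root):
    """Copy local_root and its subdirectories into the current remote directory.

    ftp is a logged-in ftplib.FTP object, or anything else that offers
    the same mkd and storbinary methods.
    """
    sent = []
    for folder, _subdirs, files in os.walk(local_root):
        relative = os.path.relpath(folder, local_root)
        remote_dir = "" if relative == "." else relative.replace(os.sep, "/")
        if remote_dir:
            try:
                ftp.mkd(remote_dir)
            except error_perm:
                pass  # most likely the directory already exists
        for name in sorted(files):
            remote_name = f"{remote_dir}/{name}" if remote_dir else name
            with open(os.path.join(folder, name), "rb") as f:
                # Binary mode keeps image files intact.
                ftp.storbinary(f"STOR {remote_name}", f)
            sent.append(remote_name)
    return sent


def publish(host, user, password, local_root):
    """Log in, switch to public_html, and upload everything under local_root."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd("public_html")
        return upload_tree(ftp, local_root)


class RecordingFTP:
    """Stand-in that records commands instead of sending them."""

    def __init__(self):
        self.commands = []

    def mkd(self, path):
        self.commands.append(f"MKD {path}")

    def storbinary(self, command, fileobj):
        self.commands.append(command)


# Dry run against a throwaway directory; the file names are made up.
with tempfile.TemporaryDirectory() as site:
    os.mkdir(os.path.join(site, "images"))
    for name in ("index.html", os.path.join("images", "logo.gif")):
        open(os.path.join(site, name), "wb").close()
    fake = RecordingFTP()
    uploaded = upload_tree(fake, site)

print(fake.commands)
```

For a real upload you would call publish with the server name, your user ID, your password, and the local directory that mirrors your site. Note that publish changes to public_html first, for the reason given above.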

Test Again

Now that you are online, test everything again. You will probably have to redo some things because of the case sensitivity of unix. Edit locally and reupload as necessary.

Hints

Here are some hints that will help you with HTML.

One of the best ways to learn html is to look at html documents. Most browsers let you do this. When you find a page that you like and wonder how it was done, you can select an option from one of your menus (view perhaps) that lets you see the document source. You will see the raw html page unrendered. From this you can see the various elements that were used to compose it.

Tags

HTML formats a document by including formatting tags within the text to be displayed. These tags are enclosed in angle brackets: < and >. Most tags come in pairs with an opening tag like <H1> which is the Heading one tag, followed by the same tag but with the slash character after the opening angle bracket, like </H1>. Everything between these tags will get a certain kind of formatting. In this case it is the large bold headline such as you see at the top of this page, which was created with the format instruction:

<center><h1> Getting Started on the Web </h1></center>

Note that the line structure of your original is not preserved in the page as viewed on the web. This is to make sure that a page viewed on a small screen doesn't require excessive scrolling to read it and one on a large screen will show the maximum amount of information. You can explicitly break lines with the <BR> tag, which has no matching close tag. You can formulate paragraph breaks with the <P> tag which you can close with the matching </P> tag, but the latter is optional in this case.

Note that the tags are not case sensitive.

If you have some text that you would like formatted exactly on the viewed page as in your original, you can include it in the <pre>...</pre> tag pair.

One problem that this page format syntax introduces into your documents is that the angle bracket symbols themselves can't be represented directly in your page. You need to use special character formatting symbols for them. Therefore, this page, which is intended to show some of the tags themselves, must use these codes to prevent the angle brackets from being interpreted as formatting commands. The code for the left angle bracket is &lt; and the code for the right angle bracket is &gt;, by the way. The codes begin with ampersand and end with semicolon. There are also numeric forms for these characters and for accented characters. The ampersand itself is &#38;.
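
You do not have to make these substitutions by hand. As a small illustration, Python's standard html module converts in both directions; the sample strings are taken from this page, except for AT&T, which is just an example of text containing an ampersand.

```python
from html import escape, unescape

# A tag that should be shown on the page rather than obeyed by the browser.
source = "<center><h1> Getting Started on the Web </h1></center>"
shown = escape(source)
print(shown)

# The ampersand must be coded too, since it starts every code.
print(escape("AT&T"))                # AT&amp;T

# Going the other way, named and numeric codes both decode.
print(unescape("&#38; &lt; &gt;"))   # & < >
```

Note that escape writes the ampersand in its named form, &amp;, which means the same thing as the numeric form &#38;.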

Anchors

Hypertext linking between documents is achieved with an anchor tag. Like most tags, it has two parts, one to open the formatting and one to close it. Between the two parts is the visible text that is displayed, usually in a contrasting color. For example, here is a link to Prof. Bergin's home page.

Unlike simpler tags, however, the opening tag of an anchor has parameters. The important one here is the href parameter. A parameter is a named string that occurs within the tag itself. Achieving the above link requires the anchor:

<a href="http://csis.pace.edu/~bergin">Prof. Bergin's home page</a>

An anchor can take you to any page in the world.

HTTP

In the above anchor we have used the http protocol as you have seen. This is a link to another document on the web. Other protocols are possible, such as ftp and gopher.

By the way, the double slash marks after the protocol name are not an escape sequence. On unix systems a character is "escaped" to take away a special meaning, but the escape character there is the backslash, not the slash. The double slash simply marks the start of the name of the machine that holds the document (csis.pace.edu in the anchor above). The single slashes that come after the machine name separate the parts of the path on that machine.
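
To see how an address divides into protocol, machine name, and path, you can take one apart with Python's standard urllib.parse module. This is only an illustration: the http address is the one used in the anchor above, while the ftp address is a made-up example.

```python
from urllib.parse import urlsplit

web = urlsplit("http://csis.pace.edu/~bergin")
print(web.scheme)   # http
print(web.netloc)   # csis.pace.edu
print(web.path)     # /~bergin

# A hypothetical ftp address has the same three parts.
files = urlsplit("ftp://ftp.example.edu/pub/readme.txt")
print(files.scheme, files.netloc, files.path)
```

The scheme is what this page calls the protocol, and the netloc is the name of the machine that holds the document.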

MAILTO

Another protocol that you can use in the href parameter of an anchor is mailto. If the user's browser is correctly set up, the mailto anchor will open a send mail window to permit the user to send mail to the person whose email address is in the tag. It is very helpful if you put such a tag on your home page to enable readers to reach you easily. Include your own email address in it. For example, here is a mailto anchor to send mail to Prof. Bergin: Mail to Prof. Bergin. The complete anchor that accomplishes this is:

<a href="mailto:berginf@pace.edu">Mail to Prof. Bergin</a>

IMG

Showing an image file in an html document requires the <IMG> tag. This is a single part tag that has a SRC parameter. For example, showing the sol site logo requires an <IMG> tag like:

<center> <img src="../soli.jpg"> </center>

And here is the effect of embedding this tag in the document.

We combined the <IMG> tag with the <CENTER> tag, of course, to center the image. Note that we have given a relative path to the image file. The actual file soli.jpg resides in the parent directory (directory ".." ) of the one that contains this (index.html) file. We could also have given a complete path as http://sol.pace.edu/soli.jpg. Remember that the reference you give must be (case sensitive) consistent with the name of the file as stored on the server. The same is true in anchors as discussed above.
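
If you are unsure what a relative path will turn into, you can work it out the same way a browser does. The sketch below uses Python's standard urljoin function. The address given for this page is an assumption made for the sake of the example, as is the second image name.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the <IMG> tag.
page_url = "http://sol.pace.edu/start/index.html"

# ".." climbs to the parent of the directory holding the page.
logo = urljoin(page_url, "../soli.jpg")
print(logo)    # http://sol.pace.edu/soli.jpg

# A path without ".." is taken relative to the page's own directory.
photo = urljoin(page_url, "images/photo.jpg")
print(photo)   # http://sol.pace.edu/start/images/photo.jpg
```

Relative paths have the advantage that they keep working when you test the pages on your local machine, provided the local directory structure mimics the one on the server.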

Note that if you include an <IMG> tag between the <A> and </A> tags of an anchor, then clicking on the image will take you along the link.

 

Get Started Page

Here is a template page that you can use to edit as a basis for your own home page. It shows some of the elements discussed here. All of the links in it are dummy links, however. The page also makes reference to an image that we have not included here, so the image is replaced by a standard image icon by the browser.

Design Notes

Keep your pages simple and clean.

Keep your image files small. They will load faster and your readers will be happier. Some of the tools will remap colors in an image to keep the file size down. You can also compress (jpeg) files, though you lose some detail. You can also give the size (in pixels) of an image file in the <IMG> tag. That way the rest of the page can load and be displayed while the image is still loading. A good image editor can tell you the pixel size of your image.

Avoid state-of-the-art HTML elements. Not all browsers will support them. You can also use the ALT parameter in <IMG> tags to display a line of text in place of the image in case the user is using a text-only browser like LYNX.
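
If you have no image editor at hand, the pixel size of a gif file can be read from the file itself, since the width and height are stored right after the six-character signature at the start of the file. The Python sketch below does this and then builds an <IMG> tag with size and ALT parameters. The function names, the file name, and the ALT text are made up, and the ten-byte header stands in for a real file. For a simple single-image gif the size stored in the header is normally the size of the picture.

```python
import struct


def gif_size(data):
    """Return (width, height) from the first 10 bytes of a GIF file."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Two little-endian 16-bit integers follow the 6-byte signature.
    return struct.unpack("<HH", data[6:10])


def img_tag(src, width, height, alt):
    """Build an IMG tag that tells the browser the size in advance."""
    return f'<img src="{src}" width="{width}" height="{height}" alt="{alt}">'


# A made-up header standing in for a real 120 x 80 pixel file.
header = b"GIF89a" + struct.pack("<HH", 120, 80)
width, height = gif_size(header)
tag = img_tag("images/logo.gif", width, height, "Site logo")
print(tag)
```

With a real file you would read the first ten bytes in binary mode, for example with open("images/logo.gif", "rb").read(10), and pass them to gif_size.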

And of course, everything on your page and everything directly linked from your page should reflect well on you and on the University.

To learn more about web site design you can visit a site maintained at Georgia Institute of Technology. There is also an HTML Crash Course for Educators maintained by the EdWeb at GSN.org.

 

Legal and Ethical Issues

There are a few legal and ethical issues that constrain what you may properly do in cyberspace and in particular on your web pages. When you create a web site you become a publisher. Pace University does not maintain editorial control over the contents of the personal sites maintained by its students and faculty. If you put something on your own site it is your responsibility.

There are a number of laws that govern what publishers may lawfully publish. There are additional ethical concerns that will be taken seriously by any responsible professional. Generally speaking, what would be illegal offline is also likely to be illegal online.

For both legal and ethical reasons you should consider carefully what you say about individual persons. There are strict laws governing slander and libel in the U.S. and elsewhere. Accusing someone of something that might be criminal or unethical can result in lawsuits.

Invading someone's privacy can also be illegal. This can mean looking in someone else's directory or attempting to read their email. Sending email can also be illegal if it is determined to be harassing the other person. Learn more about privacy at the Privacy Forum.

The content of your site is also subject to the copyright laws of the U.S. and perhaps other jurisdictions. If you didn't create what you publish, you need to be sure that you have permission to publish it. It should be fairly clear that putting a copy of a commercial program on your site, available for download, is illegal and almost certainly wrong, but there are some more surprising consequences of the copyright laws. In at least one place, a link to an existing site was ruled in violation of copyright law. The problem was that a newspaper set up a web site with a home page that required visitors to register before going further. Someone registered and then created links to pages behind the home page, making it possible to visit the site without registering. This was declared illegal. The U.S. has recently passed a law that levies a penalty of up to $100,000 for a single violation of copyright law. To learn more about copyright you can look at a site maintained by the University of Iowa. You can also visit the Copyright Bay.

Obscenity is another area of concern for any publisher. Jurisdictions vary greatly in what is considered obscene, as opposed to merely offensive. The laws, however, are quite strict. It is possible to be obscene in both pictures and words. There are, of course, free speech and academic freedom issues that trade off against these issues. It is even possible that some of the laws are themselves unethical, but they are still laws, and subject the transgressor to penalties. Students should especially be aware that they will be judged by future employers based on what they publish online.

Attempting to break into (hack into) a computer without authorization is a crime with heavy penalties. So is disrupting a computer system by releasing viruses, worms, Trojan horses, etc. The penalties can be very severe for this kind of activity.

There are a number of places that have special laws for special reasons. For example, in Germany it is illegal to publish materials that might be considered Nazi propaganda. Most places have laws governing material that can be considered national secrets. There are also a number of treaties between countries that make it unclear whether a person is liable when they publish something somewhere that is illegal somewhere else.

You can learn more about some of these issues on the following web sites:
Various ethical issues from Don Gotterbarn of East Tennessee State University
"Cyberspace Law for Non-Lawyers"
"Law and the Web"
"10 Big Myths About Copyright Explained"
"Crash Course in Copyright"

I would like to thank Steven J. McDonald of the Office of Legal Affairs, Ohio State University, for an article he wrote in the Chronicle of Higher Education on these topics, and for follow up information he has supplied as well.



This page is maintained by Professor Joseph Bergin, berginf@pace.edu
Computer Science Department
Pace University
One Pace Plaza
New York, NY 10038 USA

Comments, suggestions, and bug reports are very welcome.