This page is intended to help you get started on the web by showing you how to set up and maintain a home page. It suggests some free and low-cost tools to help with this, and some procedures that you can use to get started.
You will use ftp to transfer web pages and graphics developed on your local machine to your personal web site.
You can find lots of tools for both Macintosh and PC at TUCOWS. It is also nicely indexed. If you use shareware programs, however, you should really pay the shareware fees. The authors have put a lot of effort into creating these programs. You can encourage the creation of this low cost software by rewarding the authors appropriately.
You need to be able to create web pages in html format. HTML is a page formatting computer language. It is not especially easy to use and is quite ugly. You can learn the simple elements of the language in a few days and expand your knowledge as you go. Alternatively, you can use a translator to create the html for you. If you want to write your own, you need any editor that will create a simple text (ascii) file. Most word processors can do this as well, but you lose all word processing formats when you export text.
Some word processors, like Microsoft Word for Windows, have an html export function built into the file save function. This is effective, but if you later want to edit the html file itself, you will find that the form produced by Word is difficult to use.
Alternatively, you can use Word and some other word processors to save your document in RTF (Rich Text Format). This is sometimes called Document Interchange Format, but I think that there is another format by that name that is not compatible. Once you have the RTF file, you can use a tool called RTFtoHTML to translate it into HTML. I like this tool (both Macintosh and PC versions) because you can tailor the output to match your writing styles.
RTFtoHTML comes with a special formatting document that is used to lay out the HTML. It will translate Word style sheets into HTML styles. The language of this format document isn't easy to learn, but it is quite powerful, and the document only needs to be modified occasionally.
RTFtoHTML will create graphics files from any graphics in your RTF document, but these files will probably be in the native format of your computer: bitmaps on the PC and pict files on the Macintosh. You need an image tool to translate these to jpeg or gif formats.
Another method of creating web documents is to avoid HTML altogether and use Adobe Acrobat (pdf) files instead. This requires that your user have the (free) Acrobat Reader or plug-in, but it gives beautiful results. To use it you need to pay money to Adobe to get the Adobe Acrobat Distiller. One way that it can be used is as a print driver. In this mode, you just write using your normal word processor and when you "print" it you get a pdf file that you can upload to the server. This is especially effective for publishing PowerPoint slide shows, for example.
What You Need to Do to Get Started
Use the mkdir command to create a new directory named "public_html":
mkdir public_html
This is your home site for the WWW. Anything within that directory can be accessed by web browsers. Things not in that directory are invisible to the web. You can organize that directory however you choose, but it is recommended that you use sub-directories to keep different projects separate. It is common, for example, to have one subdirectory named images that is used just to hold images to be shown on your pages.
NOTE that different web servers have different requirements. We use the Netscape Server Suite here at Pace, which requires users to use the public_html directory as a personal web site.
You can create subdirectories as needed with the mkdir command. You can switch to one of your directories with the cd command.
cd public_html
mkdir images
Remember that your files will eventually be on a unix system, where filenames are case sensitive. The name index.html is not the same as INDEX.HTML on a unix system.
The overall structure of an html document is as follows:
<html> <head> </head> <body> </body> </html>
In the head portion of the document you usually put a title, and perhaps an author. The title shows up as the title of the window in which the web browser displays the page. The title for this document is defined by:
<title> Getting Started on the Web </title>
All of the text and graphics that show up on the page itself are defined in the BODY portion of the page.
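Putting these pieces together, a minimal complete page might look like the sketch below. The title is the one used on this page; the sentence in the body is a placeholder invented for illustration.

```html
<html>
<head>
<!-- The title appears in the title bar of the browser window -->
<title>Getting Started on the Web</title>
</head>
<body>
<!-- Everything that shows up on the page itself goes here -->
<p>Welcome to my home page.</p>
</body>
</html>
```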
Most web pages show up as black print on a gray background. Pretty boring. This page has a white background because we put a bgcolor parameter in the BODY tag for the page:
<body bgcolor="#ffffff">
The color is given in hexadecimal RGB format, and ffffff is white. We have also decided that some of the text should be in red, so we have enclosed various parts of the text in FONT tags of the form:
<font color="red">
...
</font>
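As a rough sketch of how the background color and the FONT tag fit together in one page, consider the fragment below. The sentence is made up for illustration, and the attribute values are written with quotes and a leading # sign, which is the form the HTML specification describes.

```html
<body bgcolor="#ffffff">
<!-- White page background; only the enclosed words are shown in red -->
<p>Most of this sentence is black, but
<font color="red">these words are red</font>.</p>
</body>
```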
Get your graphics into a form that will be acceptable to the web (jpeg or gif). Assemble them all together in a directory structure on your local machine that mimics the directory structure on the server. For example, if you intend to put graphics in an "images" directory, then create an images directory on your local machine and put the files there.
Once your site is in place, you will upload only individual files to individual directories, but to start, you can upload directories as a whole.
Now that you are online, test everything again. You will probably have to redo some things because of the case sensitivity of unix. Edit locally and reupload as necessary.
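Mirroring the server's directory structure on your local machine pays off when your pages refer to other files by relative path, because the same references then work in both places. The fragment below is a sketch only: the file names logo.gif and projects/index.html are hypothetical, and the page is assumed to sit at the top of public_html.

```html
<!-- In public_html/index.html (and in the matching local copy) -->
<!-- The path is relative to this page, so it resolves the same way locally and on the server -->
<img src="images/logo.gif">
<!-- On unix the case must match the file name exactly: logo.gif is not Logo.GIF -->
<a href="projects/index.html">My projects</a>
```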
One of the best ways to learn html is to look at html documents. Most browsers let you do this. When you find a page that you like and wonder how it was done, you can select an option from one of your menus (view perhaps) that lets you see the document source. You will see the raw html page unrendered. From this you can see the various elements that were used to compose it.
<center><h1> Getting Started on the Web </h1></center>
Note that the line structure of your original is not preserved in the page as viewed on the web. This ensures that a page viewed on a small screen doesn't require excessive scrolling to read, and that one viewed on a large screen shows the maximum amount of information. You can explicitly break lines with the <BR> tag, which has no matching close tag. You can mark paragraph breaks with the <P> tag, which you can close with the matching </P> tag, though the closing tag is optional in this case.
Note that the tags are not case sensitive.
If you have some text that you would like formatted exactly on the viewed page as in your original, you can include it in the <pre>...</pre> tag pair.
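The three tags just described might be used as in the sketch below. All of the text in it, including the small timetable, is invented for illustration.

```html
<p>This paragraph is typed on
several short lines, but the browser
reflows it to fit the window.</p>

<p>A break tag forces a new line here:<br>
and the text continues on the next line.</p>

<!-- Inside pre, spacing and line breaks are kept exactly as typed -->
<pre>
Day       Hours
Monday    2-4
Thursday  10-12
</pre>
```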
One problem that this page format syntax introduces into your documents is that the angle bracket symbols themselves can't be represented directly in your page. You need to use special character codes for them. Therefore, this page, which is intended to show some of the tags themselves, must use these codes to prevent the angle brackets from being interpreted as formatting commands. The code for the left angle bracket is &lt; by the way. The codes begin with an ampersand and end with a semicolon. There are also numeric forms for these characters and for accented characters. The code for the ampersand itself is &amp;.
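For instance, a page that wants to display a tag as text could be written as in the sketch below. The sentences are invented for illustration; &gt; is the standard code for the right angle bracket, which is not mentioned above.

```html
<!-- Displays as: To start a paragraph, type <p> in your file. -->
<p>To start a paragraph, type &lt;p&gt; in your file.</p>

<!-- Displays as: Tips & tricks -->
<p>Tips &amp; tricks</p>
```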
Unlike simpler tags, however, the opening tag of an anchor has parameters. The important one here is the href parameter. A parameter is a named string that occurs within the tag itself. Achieving the above link requires the anchor:
<a href="http://csis.pace.edu/~bergin">Prof. Bergin's home page</a>
An anchor can take you to any page in the world.
By the way, the double slash after the protocol is a historical artifact. Tim Berners-Lee has said that he borrowed it from the naming convention of the Apollo Domain file system, where a leading double slash introduced the name of a machine on the network. It is not an escape sequence; on unix systems the usual escape character is the backslash, not the slash.
<a href="mailto:berginf@pace.edu">Mail to Prof. Bergin</a>
<center><img src="../soli.jpg"></center>
And here is the effect of embedding this tag in the document.
Note that if you include an <IMG> tag between the <A> and </A> tags of an anchor, then clicking on the image will take you along the link.
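A sketch of such an image link, reusing the address and the image file from the examples on this page:

```html
<!-- Clicking the picture follows the link, just as clicking linked text would -->
<a href="http://csis.pace.edu/~bergin">
<img src="../soli.jpg">
</a>
```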
Get Started Page
Here is a template page that you can use to edit as a basis for your own home page. It shows some of the elements discussed here. All of the links in it are dummy links, however. The page also makes reference to an image that we have not included here, so the image is replaced by a standard image icon by the browser.
Keep your image files small. They will load faster and your readers will be happier. Some of the tools will remap colors in an image to keep the file size down. You can also compress jpeg files, though you lose some detail. You can also give the size (in pixels) of an image in the <IMG> tag. That way the rest of the page can load and be displayed while the image is still loading. A good image editor can tell you the pixel size of your image.
Avoid state of the art HTML elements. Not all browsers will support them. You can also use the ALT parameter in <IMG> tags to display a line of text in place of the image in case the user is using a text only browser like LYNX.
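An image tag that follows both pieces of advice, a stated pixel size and alternate text, might look like the sketch below. The file name, the dimensions, and the wording are hypothetical; the size parameters are the standard width and height.

```html
<!-- width and height are in pixels, so the browser can lay out the page before the image arrives -->
<!-- alt is the text shown by a text-only browser in place of the image -->
<img src="images/campus.jpg" width="320" height="240" alt="Photo of the campus">
```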
And of course, everything on your page and everything directly linked from your page should reflect well on you and on the University.
To learn more about web site design you can visit a site maintained at Georgia Institute of Technology. There is also an HTML Crash Course for Educators maintained by the EdWeb at GSN.org.
Legal and Ethical Issues
There are a few legal and ethical issues that constrain what you may properly do in cyberspace and in particular on your web pages. When you create a web site you become a publisher. Pace University does not maintain editorial control over the contents of the personal sites maintained by its students and faculty. If you put something on your own site it is your responsibility.
There are a number of laws that govern what publishers may lawfully publish. There are additional ethical concerns that will be taken seriously by any responsible professional. Generally speaking, what would be illegal offline is also likely to be illegal online.
For both legal and ethical reasons you should consider carefully what you say about individual persons. There are strict laws governing slander and libel in the U.S. and elsewhere. Accusing someone of something that might be criminal or unethical can result in lawsuits.
Invading someone's privacy can also be illegal. This can mean looking in someone else's directory or attempting to read their email. Sending email can also be illegal if it is determined to be harassing the other person. Learn more about privacy at the Privacy Forum.
The content of your site is also subject to the copyright laws of the U.S. and perhaps other jurisdictions. If you didn't create what you publish, you need to be sure that you have permission to publish it. It should be fairly clear that putting a copy of a commercial program on your site, available for download, is illegal and almost certainly wrong, but there are some more surprising consequences of the copyright laws. In at least one place, a link to an existing site was ruled to be in violation of copyright law. The problem was that a newspaper set up a web site with a home page that required visitors to register before going further. Someone registered and then created links to pages behind the home page, making it possible to visit the site without registering. This was declared illegal. The U.S. has recently passed a law that levies a penalty of up to $100,000 for a single violation of copyright law. To learn more about copyright you can look at a site maintained by the University of Iowa. You can also visit the Copyright Bay.
Obscenity is another area of concern for any publisher. Jurisdictions vary greatly in what is considered obscene, as opposed to merely offensive. The laws, however, are quite strict. It is possible to be obscene in both pictures and words. There are, of course, free speech and academic freedom issues that trade off against these issues. It is even possible that some of the laws are themselves unethical, but they are still laws, and subject the transgressor to penalties. Students should especially be aware that they will be judged by future employers based on what they publish online.
Attempting to break into (hack into) a computer without authorization is a crime with heavy penalties. So is disrupting a computer system by releasing viruses, worms, Trojan horses, etc. The penalties can be very severe for this kind of activity.
There are a number of places that have special laws for special reasons. For example, in Germany it is illegal to publish materials that might be considered Nazi propaganda. Most places have laws governing material that can be considered national secrets. There are also a number of treaties between countries that make it unclear whether a person is liable when they publish something somewhere that is illegal somewhere else.
You can learn more about some of these issues on the following web sites:
Various ethical issues from Don Gotterbarn of East Tennessee State University
"Cyberspace Law for Non-Lawyers"
"Law and the Web"
"10 Big Myths About Copyright Explained"
"Crash Course in Copyright"
I would like to thank Steven J. McDonald of the Office of Legal Affairs, Ohio State University, for an article he wrote in the Chronicle of Higher Education on these topics, and for follow up information he has supplied as well.
This page is maintained by Professor Joseph Bergin, berginf@pace.edu
Computer Science Department
Pace University
One Pace Plaza
New York, NY 10038 USA