LIS650lecture 0 Introductory lecture Thomas Krichel 2005-09-11.
LIS650 lecture 0
Introductory lecture
Thomas Krichel
2004-01-23
administrative matters
• Course home page is at http://wotan.liu.edu/home/krichel/lis650p04s
• First quiz next lecture!
• Deadline to finish web site: one week after the end of the last lecture.
• You will not be able to change your web site between the deadline and the time that the grade is issued!
• Subscribe to class mailing list https://lists.liu.edu/mailman/listinfo/cwp-lis650-krichel
today
• introduction to the course
• talk about you
• the basic ingredients of the web, without html
• introduction to our basic technical set up
• introduction to html
Course history
• Course was first run as an institute 2002-05-13 to 2002-05-17
• Title was “Webmastering I: the static web site”.
• To the curriculum committee, this title did not sound academic enough.
• Since “Web Site Architecture and Design” is now the full title, WeSaD (pronounced like “wizard”) is the official abbreviation.
• Webmastering is still what we want to learn.
teaching WeSaD
• WeSaD combines many aspects:
– Authoring pages
– Work on the organization of data to fit onto pages
– Set display style of different pages
– Organize the contribution of data
– Maintain a technical web installation
• Some of them can be learned in a course, but others can not.
• Emphasis has to be on learnable elements.
teaching philosophy
• Pointing and clicking in a piece of software is not enough
• Explain underlying principles
• Promote standards
– HTML 4.01
– CSS level 2.1
• Avoid proprietary software
WeSaD contents
• Deals with the maintenance of a static web site. Such a web site remains the same whatever the user does with it.
• Topics include
– html
– css
– site usability and information architecture, as far as relevant for static web sites
– http, uri, web server
things this course does not do
• Forms: allow you to design forms that users fill in. But you do not have the programming skills to do something with the form.
• Any HTML elements that require executable contents are not covered.
• Frames: allow you to put several documents into one physical document. Most experts advise against them.
• We do not cover image maps.
• We don’t do some advanced CSS properties.
Other courses: webmastering II
• Deals with building dynamic web sites.
– Users fill in a form
– Users submit the form
– Web server returns a page that is specific to the request of the user.
• Teaches a language called PHP, which is widely used to generate such web sites.
– Gets you introduced to computer programming
– Trains your analytical thinking.
other courses: webmastering III
• Deals with XML
– XML is a syntax to encode any kind of data.
– XML can be constrained to only allow certain types of data (XML Schema)
– XML can be transformed to render the data in various ways (XSLT)
• Achieve a separation of contents and presentation of a web page.
• advanced course, has both Schema and Transformation
The world wide web
The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience:
– A uniform naming scheme for locating resources on the Web (i.e. URIs).
– Protocols, for access to named resources over the Web (e.g., HTTP).
– Hypertext, for easy navigation among resources (e.g., HTML).
URI introduction
• Every resource available on the Web -- HTML document, image, video clip, program, etc. -- has an address that may be encoded by a Universal Resource Identifier, or "URI".
• URIs typically consist of three pieces:
– The naming scheme of the mechanism used to access the resource.
– The name of the machine hosting the resource.
– The name of the resource itself, given as a path.
example URI
• http://openlib.org/home/krichel
This URI may be read as follows: There is a document available via the HTTP protocol, residing on the site openlib.org, accessible via the path "/home/krichel".
• mailto:krichel@openlib.org
This URI may be read as follows: There is email user krichel in a domain openlib.org to whom email may be sent.
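The three pieces of a URI can be pulled apart mechanically. The following is a minimal sketch in Python, not part of the course material; the function name describe_uri is made up for illustration, and the mailto address is the one implied by the description on the slide.

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Split a URI into the three pieces named on the slide."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,  # naming scheme, e.g. "http" or "mailto"
        "host": parts.netloc,    # machine hosting the resource (empty for mailto)
        "path": parts.path,      # name of the resource itself
    }


if __name__ == "__main__":
    print(describe_uri("http://openlib.org/home/krichel"))
    print(describe_uri("mailto:krichel@openlib.org"))
```

Note that a mailto URI has no host piece in this sense: everything after the scheme is treated as the path.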
client / server protocol
• The web operates mostly on http.
• This is a client-server protocol.
• The client software is run on the local PC that you are using.
– It is called a web browser or user agent.
• Our server is a piece of hardware called wotan.liu.edu
– It runs the Debian GNU/Linux operating system on an Intel architecture.
– It provides http daemon software that serves http requests. The particular software is called Apache.
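To make "client-server" concrete, here is a rough sketch in Python of the text a user agent might send to the server when it asks for a page. The function name build_get_request is hypothetical, and real browsers send many more header lines; this only shows the minimal shape of an HTTP/1.1 GET request.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a minimal HTTP/1.1 GET request for url."""
    parts = urlsplit(url)
    path = parts.path or "/"        # an empty path means the site root
    lines = [
        "GET %s HTTP/1.1" % path,   # request line: method, path, protocol version
        "Host: %s" % parts.netloc,  # which site on the machine is wanted
        "Connection: close",        # ask the server to close after answering
        "",                         # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_get_request("http://wotan.liu.edu/~user/empty.html"))
```

The server answers with a status line, some headers, and then the file contents.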
communication with the server
• The protocol for communicating with the server is the secure shell, ssh for short. It is based on public-key cryptography.
• We use two ssh clients
– For file editing and manipulation, we use putty.
– For file transfer, we use winscp.
– Both are available on the web.
• Telnet and ftp servers are not available on wotan.liu.edu. Telnet and ftp do not encrypt the communication stream; therefore they are not secure.
registration time
• As part of the course, you are being provided with web space on the server wotan.liu.edu, at the URL
http://wotan.liu.edu/~username
where username is a user name that you will choose now.
• It is my intention to maintain this web space for you into the foreseeable future.
• You should also choose a password, now.
• I will now register you.
login time
• Use putty, port 22 to wotan.liu.edu
• set other attributes of the session as you like, using the menu on the left, for example
– colors
– font shapes and sizes
– bell
• Save the session as “wotan” (in the first screen) to save all the customization.
• You do not normally need to log in to the machine, unless you want to work on it directly.
free software
• I maintain the wotan.liu.edu server, but you can build your own server if
– you have Internet access
– you have an old PC to spare
• All the server software, as well as putty and winscp, is free and open-source.
• It is one of my fundamental beliefs that free information should run on free software.
• The library community can learn a hell of a lot from the free software community.
• See my talk at http://openlib.org/home/krichel/presentations/new_york_2003-11-07.ppt
installing software at home
• Go to your favorite search engine to search for
– putty
– winscp
• Download and run windows-style installer software to install both pieces of software.
• Download and install a recent version of at least two browsers. I suggest
– Netscape Navigator at http://channels.netscape.com/ns/browsers/download.jsp
– Opera at http://www.opera.com
putty and winscp
• You can either maintain files on wotan.liu.edu
– by logging into wotan.liu.edu
– using a file editor there, for example nano
– past experience has shown that this is hard for students with no UNIX experience.
• Or you can maintain text files locally
– each time you make a change, you save the file and upload it to wotan.liu.edu using winscp.
– you can use Notepad locally to maintain text files
– I do not recommend using WordPad or Word.
create a web page in MS notepad
• Open Microsoft notepad. Type the text
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
</head>
<body>
<div></div>
</body>
</html>
Saving the web page
• save as “empty.html”.
• If you want to open it again in notepad
– open notepad
– select file/open
– list all files
– empty.html
• Don't click on the file.
• Don't choose edit in the context menu.
upload and view file
• Once you have your file “empty.html”, use the menus of winscp to upload it to the public_html directory of your home directory on wotan.liu.edu.
• It has to be in public_html !
• Once it is there, use a web browser to view it at http://wotan.liu.edu/~user/empty.html, where user is your user id.
• Then validate it at http://validator.w3.org.
– enter the URL of the page that you want to validate
– hit the validate button
• It has to be in public_html !
public_html
• Is your web directory. It is automagically created for you when Thomas registers you.
• The web server will map requests to http://wotan.liu.edu/~user/file to show the file /home/user/public_html/file.
• Here user stands for your user id, and file is the file name.
• If file ends with “.html” or “.htm” the web browser will be told that the file is an html file. It will be rendered accordingly by the browser.
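The mapping from URL to file can be written down as a small function. This is only a sketch in Python of the rule stated above, not the way Apache is actually configured on wotan; the names url_to_file and is_html are made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def url_to_file(url):
    """Map http://host/~user/file to /home/user/public_html/file."""
    path = urlsplit(url).path
    if not path.startswith("/~"):
        raise ValueError("not a per-user URL: " + url)
    # Everything between "/~" and the next "/" is the user id.
    user, _, rest = path[2:].partition("/")
    return posixpath.join("/home", user, "public_html", rest)


def is_html(filename):
    """True if the server would announce the file as html."""
    return filename.endswith((".html", ".htm"))


if __name__ == "__main__":
    print(url_to_file("http://wotan.liu.edu/~user/empty.html"))
    print(is_html("empty.html"))
```

A file that sits outside public_html has no URL under this rule, which is why the upload has to go into that directory.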
index.html
• The web server on wotan will map requests to http://wotan.liu.edu/~user to show the file ~user/public_html/index.html
• If this file is not there, the server will prepare a html document from the list of files that it finds in the directory and send it to the user agent.
• Once you have a file index.html, a request for http://wotan.liu.edu/~user no longer shows the list of files in your directory. Each file can still be reached through its own URL.
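The server's decision for a directory request can be sketched as follows. This is an illustrative Python model, not the server's real code; the function name respond_to_directory and the shape of the generated listing are assumptions.

```python
def respond_to_directory(filenames):
    """Decide what to send for a request to a directory.

    filenames is the list of file names found in the directory.
    Returns a pair: ("file", name) or ("listing", html_text).
    """
    if "index.html" in filenames:
        # index.html wins: the user agent gets that file.
        return ("file", "index.html")
    # Otherwise build an html list with one link per file.
    items = "".join(
        '<li><a href="%s">%s</a></li>' % (name, name)
        for name in sorted(filenames)
    )
    return ("listing", "<ul>%s</ul>" % items)


if __name__ == "__main__":
    print(respond_to_directory(["empty.html", "test.html"]))
    print(respond_to_directory(["empty.html", "index.html"]))
```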
HTML and XHTML
• HTML is the hypertext markup language
• HTML is a markup language that is widely used on the World Wide Web (WWW)
• The latest, and probably last version of HTML is at http://www.w3.org/TR/html4/
• The W3C, the standard making body for the WWW, has issued XHTML, a replacement of HTML that is compatible with XML.
• We will ignore XHTML for the rest of the course.
what is markup?
• Everything in a document that is not content. It can be given in two ways
• 1: Procedural
– Codes identify point size, style, font, etc.
– Usually only understood by the tool that defines them
– Example: Microsoft Word
• 2: Descriptive
– Describes purpose of text within the document
– Chapter head, Paragraph, Section Head, TOC
– Structure and Style are kept separate
– Example: LaTeX, SGML
SGML
• Standard Generalized Markup Language
• Descriptive approach with three separate layers
– structure: types of information in document
– content: the information itself
– style: matches typesetting with structure
• Developed for the publishing industry by a group around Goldfarb.
• So complicated that no software implements it fully
• Document Type Definition (DTD)
– Defines the structure
Document Type Definition (DTD)
• Describes information the document handles
– e.g. Title, TOC, Chapter, Section
• Relationships between fields
– e.g. A Chapter contains Sections
• Consistency
• Logical structure
• Information defined by tags
HTML
• HyperText Markup Language
• Defines an SGML DTD
– Head, Title, Body, Paragraph, etc.
– Headings, Bold, Italic, etc.
– Table, List, Image, etc.
– Links to other documents
– Forms
• Style applied by Web Browser
– User has some control
HTML history
• HTML was a very bare-bones language when first invented by Tim Berners-Lee. It did not describe pages with much of a visual appeal.
• In the 90s, successful browsers invented “extensions” that aimed to stretch the visual boundaries of HTML.
• Some of these extensions found their way into the official HTML spec issued by the W3C.
“my HTML”
• I will teach HTML 4.01. This version has two different DTDs:
– the loose DTD
– the strict DTD
• I will only cover the tags of the strict DTD
• The loose DTD has more tags, but all the functionality of these tags is best done with style sheets.
• Thus, the pages created with HTML only will look rather boring.
• But we do cover style sheets later.
HTML tags
• HTML markup is written as tags. Tags are written as pairs (typically)
– begin with <tag> "tag start"
– end with </tag> "tag end"
– tag is the tag name
• Can be nested
• Can contain non-markup data
• Tag names are case-insensitive, but it is best to use the same case, consistently, for human readability.
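Pairing and nesting can be checked mechanically. Below is a small sketch in Python using the standard html.parser module; the class and function names are made up for illustration. It is stricter than HTML 4.01 itself, which lets some end tags (such as the one for a paragraph) be left out, so treat it as a teaching aid and not as a replacement for the W3C validator.

```python
from html.parser import HTMLParser

# Elements that HTML 4.01 defines as empty: they never take an end tag.
EMPTY_ELEMENTS = {"area", "base", "br", "col", "hr", "img",
                  "input", "link", "meta", "param"}


class NestingChecker(HTMLParser):
    """Track open tags on a stack and record end tags that do not match."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # The parser hands over tag names in lower case,
        # so <DIV> and <div> are treated as the same tag.
        if tag not in EMPTY_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY_ELEMENTS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append("unexpected </%s>" % tag)


def nesting_errors(markup):
    """Return a list of nesting problems found in markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors + ["unclosed <%s>" % t for t in checker.open_tags]


if __name__ == "__main__":
    print(nesting_errors("<div><em>fine</em></div>"))
    print(nesting_errors("<div><em>crossed</div></em>"))
```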
attributes to tags
• <atag attribute_name_one="value_one" attribute_name_two="value_two">
• Here attribute_name_one and attribute_name_two are attribute names and value_one and value_two are attribute values.
• I will say: tag <tag> “requires” attribute "attribute" if the attribute is mandatory.
• I will say: tag <tag> “takes” attribute "attribute" if the attribute is optional.
Example
<a href="http://openlib.org/home/krichel" title="homepage of Thomas Krichel">Thomas Krichel</a>
– the whole thing is an <a> tag. (I surround tag names with <>)
– “href” is an attribute name
– “http://openlib.org/home/krichel” is the value of the "href" attribute (I surround attribute names with straight quotes)
– “Thomas Krichel” is character data.
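The same breakdown can be produced by a program. Here is a minimal sketch in Python with the standard html.parser module; the class name AnchorCollector and the function name anchors_in are made up for illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect attribute names, attribute values and character data of <a> tags."""

    def __init__(self):
        super().__init__()
        self.anchors = []      # one dictionary per <a> tag found
        self._inside_a = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (attribute name, attribute value) pairs.
            self.anchors.append({"attributes": dict(attrs), "text": ""})
            self._inside_a = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside_a = False

    def handle_data(self, data):
        # Character data between <a ...> and </a>.
        if self._inside_a:
            self.anchors[-1]["text"] += data


def anchors_in(markup):
    collector = AnchorCollector()
    collector.feed(markup)
    collector.close()
    return collector.anchors


if __name__ == "__main__":
    example = ('<a href="http://openlib.org/home/krichel" '
               'title="homepage of Thomas Krichel">Thomas Krichel</a>')
    print(anchors_in(example))
```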
Characters: concept
• A character set combines two things
– Character repertoire: a set of characters e.g. "A", "ض", "‼", "₣"
– Character code positions: defines a number for each character in the repertoire.
• Character encoding is a way to encode the code positions in bytes
• To correctly display a document, the user agent needs to know both!
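The difference between a code position and an encoding shows up as soon as one character is turned into bytes. A small sketch in Python follows, with made-up function names; the code positions are those of Unicode, and UTF-8 and UTF-16 are used as two example encodings.

```python
def describe(char):
    """Return the code position and two byte encodings of one character."""
    return {
        "code_position": ord(char),                   # number in the repertoire
        "utf-8": char.encode("utf-8").hex(),          # bytes under UTF-8
        "utf-16-be": char.encode("utf-16-be").hex(),  # bytes under UTF-16 (big endian)
    }


def misread_as_latin1(char):
    """Show what a user agent displays if it guesses the wrong encoding."""
    return char.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for c in ["A", "ض", "‼", "₣"]:
        print(c, describe(c))
    # One character written as UTF-8 but read as Latin-1 turns into several.
    print(len(misread_as_latin1("₣")))
```

The same code position gives different bytes under different encodings, which is why the user agent has to be told which encoding was used.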
playing safe with characters
• Only use the characters on the US keyboard, don't insert symbols.
• Save as ascii or utf-8.
• Never save as "Unicode" within MS Notepad.
• If you encounter a character that is not on your keyboard, use an SGML entity.
Special Characters
• Inserted as an entity reference
– Format can be &code;
• Ex. &amp; to insert an ampersand
– Codes are often abbreviations of the character names
– Codes can be in hex form
• Ex. &#x26; to insert an ampersand
• http://www.w3.org/TR/REC-html40/sgml/entities.html has the list
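Python's standard html module can translate in both directions, which is handy for checking what an entity stands for. This is a minimal sketch with made-up function names; the numeric form it produces is decimal, which user agents treat the same way as the hex form.

```python
import html


def decode_entities(markup):
    """Replace entity references by the characters they stand for."""
    return html.unescape(markup)


def encode_for_html(text):
    """Write text using only ASCII, with entities for everything else."""
    # First protect the characters that have a meaning in markup: & < >
    escaped = html.escape(text, quote=False)
    # Then turn every non-ASCII character into a numeric reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(decode_entities("&amp; &#x26; &#38;"))  # three ways to write an ampersand
    print(encode_for_html("AT&T costs ₣5"))
```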
classifying tags
• There is a whole bunch of different tags.
• We can group tags together in different ways.
• In the following, I will explain some of the ways.
– block-level vs text-level tags
– tags that require closing vs those that do not.
block-level vs text-level tags
• Block-level tags contain data that is aligned vertically by visual user agents.
• Text-level tags are aligned horizontally by visual user agents.
• There are a number of reasons behind this distinction
– Block-level tags can contain other block-level tags and text-level tags.
– Text-level tags can not contain block-level tags.
– Visual user agents start a new line at the beginning of block-level tags.
– Multidirectional text would be impossible without it.
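The rule that a text-level tag can not contain a block-level tag is easy to test by program. The sketch below is in Python; the two sets list only some of the HTML 4.01 strict tags, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser

# Partial lists, chosen for illustration only.
BLOCK = {"address", "blockquote", "div", "dl", "h1", "h2", "h3", "h4",
         "h5", "h6", "ol", "p", "pre", "table", "ul"}
TEXT_LEVEL = {"a", "abbr", "acronym", "cite", "code", "dfn", "em", "kbd",
              "q", "samp", "span", "strong", "sub", "sup", "var"}


class LevelChecker(HTMLParser):
    """Report block-level tags that start inside an open text-level tag."""

    def __init__(self):
        super().__init__()
        self.stack = []        # currently open tags from the two sets
        self.violations = []   # (text-level parent, block-level child) pairs

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            for parent in self.stack:
                if parent in TEXT_LEVEL:
                    self.violations.append((parent, tag))
                    break
        if tag in BLOCK or tag in TEXT_LEVEL:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the tag and anything that was left open inside it.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def level_violations(markup):
    checker = LevelChecker()
    checker.feed(markup)
    checker.close()
    return checker.violations


if __name__ == "__main__":
    print(level_violations("<div><em>allowed</em></div>"))
    print(level_violations("<em><div>not allowed</div></em>"))
```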
common frame for pages
• We look at empty.html again. Here is the start again
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
• This is an SGML document type declaration.
• It says which kind of HTML it is.
• Use empty.html as a start to compose all your pages.
special topic: images
• The appeal of the web to the masses has a lot to do with its capability to transport images.
• Image formats are independent of the web, but there are two classic formats that are widely supported by user agents.
– GIF
– JPEG
GIF
• stands for graphics interchange format.
• developed by CompuServe.
• unresolved copyright issues make the format abhorred by the free software community.
• 256 colors maximum
• uses a loss-less compression technique
GIF has three tricks
• interlacing:
– when downloading the file, the browser can show every eighth row first
– user gets an idea of the picture before it is sharp
• transparency
– some GIFs are transparent, so you can show them on top of already existing content
– technically, the GIF has one color as the background color, and pixels of that color are ignored by the user agent
• animation
– some GIFs are in fact sequences of GIFs that can be rendered one after the other.
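The interlacing trick comes down to the order in which the rows are stored in the file. As a sketch in Python, assuming the four-pass scheme of the GIF89a specification (the function name is made up for illustration):

```python
def interlaced_row_order(height):
    """Return the row numbers of an interlaced GIF in the order they are stored."""
    # (first row, step) for each of the four passes.
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


if __name__ == "__main__":
    # For an image that is 8 rows high.
    print(interlaced_row_order(8))
```

After the first pass the user agent has one row in eight and can already show a coarse picture; each later pass fills in rows in between.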
JPEG
• The Joint Photographic Experts Group is a standard-making body for images
• JPEG images can support millions of colors.
• The compression is lossy, i.e. the JPEG file will look like the original image, but not be the same.
• The compression does not work well with drawings.
• There are no copyright and patent problems with JPEG
working with wotan
• You can work with wotan directly if you like. Use putty to connect to wotan.liu.edu, then type
cd public_html
• You can start from empty.html, the file that validates, and copy it to test.html
cp empty.html test.html
nano test.html
• Then you can change test.html to try out the tags as I discuss them here.
working on the local machine
• Open empty.html on your web site and save as test.html
• edit it with notepad to be safe
• open with Internet Explorer to see the rendered html
• to validate
– you have to upload the file first to your public_html directory on wotan.liu.edu
– Then use the W3C validator at http://validator.w3.org
literature
• I work from the text of the official standard at http://www.w3.org/TR/html4/
• To work with it faster, I made a copy at http://wotan.liu.edu/~krichel/html4/
• You can work from any HTML book.
Homework
• Look at course home page http://wotan.liu.edu/home/krichel/lis650p04s
• Send [email protected] your secret word for course result delivery.
• Prepare a one-page max summary of the type of website that you want to build, bring printed copy with you next week.
• Prepare for quiz at the beginning of next lecture.
http://openlib.org/home/krichel
Thank you for your attention!