An AJAX Implementation of UVa Webmail

Austin Kennedy ([email protected])
Eugene Otto ([email protected])

CS-457
Dec. 1, 2005


On my honor, I have neither given nor received unauthorized aid on this assignment.

Signed, Austin Kennedy
Signed, Eugene Otto

Table of Contents

Introduction
The Origins of AJAX
    Server-Side Technologies
    Client-Side Technologies
    The Birth of AJAX
System Components
    ajax.html Login Form
    workhorsies.js
    popper.pl
    parse_email.pl
An Example
    Login
    workhorsies.js Requests Output of popper.pl
    popper.pl Executes
    popper.pl Returns Output to workhorsies.js
    workhorsies.js Prints Inbox Values to ajax.html
UVa Webmail Service Performance
AJAX Client Performance
Test Cases
    Generating Correct XML Files from the POP Server
    Handling XML Files
    Final Tests
Future Work
Conclusion
References

Introduction

Our project is an AJAX implementation of UVa’s Webmail service. Our goal is to reduce the amount of bandwidth transferred by the Webmail service and to increase the responsiveness of the Webmail client.

We chose this project because we both use UVa’s Webmail system as our only e-mail client for its convenience: we like to have all of our messages stored in one place where we can access them from anywhere in the world. However, we both feel that the Webmail system is too slow and that significant savings in bandwidth (and thus an increase in network quality) can be made by optimizing the way it delivers messages.

To illustrate the advantages of an AJAX-based Webmail system, we have created a prototype inbox that connects to the UVa Central Mailing Service POP server and displays the five most recent messages. This implementation is accessible through most JavaScript-enabled browsers (it has been tested on Firefox, Opera, and Internet Explorer). As we will show, it reduces bandwidth significantly by loading the subject lines of only the newly received messages and it increases responsiveness for the same reason.

The Origins of AJAX

AJAX stands for Asynchronous JavaScript and XML. It is not a new technology, but a conglomeration of two existing technologies that together provide a new and exciting way for web developers to deliver content to users. The most famous example of AJAX technology is Google Maps (http://maps.google.com). Using AJAX, Google was able to produce an extremely dynamic and responsive web-based application which is now used by millions and has sparked a grassroots web development revolution.

Server-Side Technologies

Dynamic pages have existed since the dawn of the World Wide Web in the form of server-side technologies such as CGI (Perl and C) and have become very prevalent due to the development of more web-friendly languages like PHP, Java, ASP, etc. These server-side technologies are very powerful, but with them it is difficult to provide an interactive experience. This is because every time you want to see a change in the webpage, the whole page must be downloaded from the server again. This generally takes quite a bit of time because of the size of a typical webpage, and it can be visually disorienting.

Client-Side Technologies

Client-side technologies are also used to create dynamic webpages. Flash, Shockwave, Java applets, and JavaScript are used to provide highly responsive interactive experiences to users. These technologies, however, are largely limited to trivial applications like games, advertisements, and basic visualizations because it is difficult to store information between uses and because they are often reliant on third party plugins (i.e., Flash, Shockwave, Java) or on technologies that tend not to be supported the same way on different platforms (i.e., JavaScript).


The Birth of AJAX

The magic behind AJAX is a JavaScript feature that went largely unnoticed until Google came along. By creating an object of type XMLHttpRequest, you can open HTTP connections to a server and retrieve data, often an XML document. This gives you the best of both server-side and client-side technology: the disorientation caused by a reloading webpage is avoided because data is now sent and retrieved in the background using JavaScript. JavaScript is also responsible for rendering the downloaded data, which makes the page feel much more responsive.

System Components

Our system is made of four files which are described here. Figure 1 illustrates the interactions between these files, the browser, webserver, and POP server.

Figure 1 – Interactions between ajax.html, workhorsies.js, popper.pl, parse_email.pl, the browser, webserver, and POP server.

ajax.html Login Form

This is an HTML file that houses the main user interface. A screenshot is shown below in Figure 2.


Figure 2 – Screenshot of inbox

workhorsies.js

This is a JavaScript file that contains all of the client-side logic. It is responsible for manipulating ajax.html as well as opening connections to the webserver and receiving and decoding XML documents.

popper.pl

This is a Perl file that does most of the server-side work. It is responsible for opening a connection to the POP server, logging in with the provided username and password, and retrieving new messages. Once the messages are parsed and encoded into XML by parse_email.pl, popper.pl prints the XML to standard output.

parse_email.pl

This is a Perl file that is responsible for parsing an e-mail message’s headers and body and returning certain values such as subject, date, and name in a string of XML tags.


An Example

We will now describe a typical execution flow.

Login

The first step is to provide the client with your username and password; a screenshot of the login screen is shown in Figure 3. A current limitation is that all data is transferred unencrypted: your username and password are sent in plaintext, which exposes them to anyone sniffing the network.

Figure 3 – The login screen

The moment the “click here to login” button is clicked, the getInfo() JavaScript function is called.

workhorsies.js Requests Output of popper.pl

getInfo() starts a chain-reaction of JavaScript. Its first order of business is to open a connection to the local webserver using the call shown here.

myRequest.open('get',
    'popper.pl?user=' + document.MyForm.user.value +
    '&pass=' + document.MyForm.pass.value +
    '&pop_server=' + document.MyForm.pop_server.value +
    '&last_message_received=' + document.getElementById("last_message_received").innerHTML,
    true);

This command requests the file popper.pl from the webserver and sends the following parameters:

1. user – the username entered in the form
2. pass – the password entered in the form
3. pop_server – the pop server entered in the form
4. last_message_received – a hidden variable used to tell popper.pl the id number of the last message received by the inbox
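The call above concatenates the raw form values into the URL, so a password containing a character such as & or = would corrupt the query string. The following is a minimal sketch of how the same four parameters could be escaped first. The helper name buildPopperUrl and the sample values (including pop.example.edu) are hypothetical, and the sketch assumes popper.pl URL-decodes its parameters, as CGI form parsing usually does.

```javascript
// Hypothetical helper: builds the popper.pl request URL from form values.
// The parameter names come from the report; everything else is illustrative.
function buildPopperUrl(fields) {
  const names = ["user", "pass", "pop_server", "last_message_received"];
  const query = names
    // encodeURIComponent escapes &, =, spaces and other reserved characters.
    .map((name) => name + "=" + encodeURIComponent(String(fields[name])))
    .join("&");
  return "popper.pl?" + query;
}

// Made-up sample values; the password contains characters that need escaping.
const url = buildPopperUrl({
  user: "eeo5w",
  pass: "p&ss word",
  pop_server: "pop.example.edu",
  last_message_received: 3972,
});
console.log(url);
```

The resulting string could be passed as the second argument to myRequest.open() in place of the hand-built concatenation.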

getInfo() then sets an event handler on myRequest to call the function onResponse() whenever myRequest’s state changes (i.e., when the webserver returns data).
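The report does not show the body of onResponse(), so the following is only a sketch of the usual pattern. An XMLHttpRequest fires its state-change event several times while a request is in flight, so a handler typically acts only once readyState reaches 4 (complete) and the HTTP status is 200. The name attachResponseHandler and the stand-in request object are assumptions made so the example runs outside a browser.

```javascript
// Sketch: invoke a callback only when the request is complete and successful.
// `request` is anything shaped like an XMLHttpRequest.
function attachResponseHandler(request, onResponse) {
  request.onreadystatechange = function () {
    // readyState 4 means the whole response has arrived.
    if (request.readyState === 4 && request.status === 200) {
      onResponse(request);
    }
  };
}

// Stand-in object (not a real XMLHttpRequest) used to exercise the handler.
const fakeRequest = { readyState: 1, status: 0, onreadystatechange: null };
const calls = [];
attachResponseHandler(fakeRequest, (req) => calls.push(req.status));

fakeRequest.onreadystatechange(); // still loading: callback is skipped
fakeRequest.readyState = 4;
fakeRequest.status = 200;
fakeRequest.onreadystatechange(); // complete: callback runs
console.log(calls.length);
```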


popper.pl Executes

popper.pl receives the login information in its argument list and opens a socket connection to the specified POP server with the command shown here:

use IO::Socket::INET;

# Open a TCP connection to the POP server on the standard POP3 port.
my $Socket = IO::Socket::INET->new(
    PeerAddr => $Form{'pop_server'},
    PeerPort => '110',
    Proto    => 'tcp',
);
die "Could not create socket: $!\n" unless $Socket;

POP servers generally communicate on port 110 using the TCP protocol, and as you can see, these are the values that we used to create the connection.

popper.pl then logs into the POP server by sending the username and password provided in its argument list. The interaction that occurs between popper.pl and the POP server is shown here (the USER and PASS commands are sent by popper.pl; the +OK lines are the server's responses):

+OK CommuniGate Pro POP3 Server 4.3.9 ready <[email protected]>
USER eeo5w
+OK please send the PASS
PASS ********
+OK 3977 messages (167102678 bytes)

We can see that my inbox has 3977 messages in it that take up 167102678 bytes. Note that while the password is all stars here (for the authors’ protection), in real life it’s plaintext, just like the username. popper.pl then downloads the most recent five messages by issuing the following commands one-by-one (the POP server’s responses have been omitted):

RETR 3977
RETR 3976
RETR 3975
RETR 3974
RETR 3973
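How popper.pl chooses which messages to request is not shown in this report. The sketch below is one plausible reading, written in JavaScript for consistency with the client code: read the message count from the server's reply to PASS, then request only messages numbered above last_message_received, newest first, up to five. The function names are hypothetical, and the reply pattern matches the CommuniGate Pro wording shown above, which other servers may not share.

```javascript
// Parse a reply such as "+OK 3977 messages (167102678 bytes)".
// Returns null when the line does not have that server-specific shape.
function parseMaildropStatus(line) {
  const match = /^\+OK (\d+) messages \((\d+) bytes\)/.exec(line);
  if (!match) return null;
  return { count: Number(match[1]), bytes: Number(match[2]) };
}

// List RETR commands for messages newer than lastSeen, newest first,
// capped at `limit` messages.
function retrCommands(count, lastSeen, limit) {
  const commands = [];
  for (let id = count; id > lastSeen && commands.length < limit; id--) {
    commands.push("RETR " + id);
  }
  return commands;
}

const maildrop = parseMaildropStatus("+OK 3977 messages (167102678 bytes)");
// First login: nothing has been seen yet, so the five newest are requested.
const firstLogin = retrCommands(maildrop.count, 0, 5);
console.log(firstLogin.join("\n"));
```

On a later refresh with last_message_received equal to the message count, the same function returns an empty list, which corresponds to the "no new mail" case discussed later in the report.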

The result of the first request is shown here (the rest are omitted for brevity):

+OK 1577 bytes will follow
Return-Path: <[email protected]>
Received: from [192.168.1.31] (HELO fork11.mail.virginia.edu) by cgatepro-4.mail.virginia.edu (CommuniGate Pro SMTP 4.3.9) with ESMTP id 160073762 for [email protected]; Thu, 01 Dec 2005 05:28:32 -0500
Received: from localhost (localhost [127.0.0.1]) by fork11.mail.virginia.edu (Postfix) with ESMTP id 41A121F53DD for <[email protected]>; Thu, 1 Dec 2005 05:28:32 -0500 (EST)
Received: from fork11.mail.virginia.edu ([127.0.0.1]) by localhost (fork11.mail.virginia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01024-07 for <[email protected]>; Thu, 1 Dec 2005 05:28:32 -0500 (EST)
Received: from cgatepro-4.mail.virginia.edu (tetra.mail.Virginia.EDU [128.143.2.219]) by fork11.mail.virginia.edu (Postfix) with ESMTP id 100EF1F539A for <[email protected]>; Thu, 1 Dec 2005 05:28:32 -0500 (EST)
Received: from [128.143.22.8] (account [email protected]) by cgatepro-4.mail.virginia.edu (CommuniGate Pro WebUser 4.3.9) with HTTP id 160073760 for [email protected]; Thu, 01 Dec 2005 05:28:32 -0500
From: "Eugene Ewe Otto" <[email protected]>
Subject: test -1
To: [email protected]: CommuniGate Pro WebUser Interface v.4.3.9
Date: Thu, 01 Dec 2005 05:28:32 -0500
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 8bit
X-UVA-Virus-Scanned: by amavisd-new at fork11.mail.virginia.edu

this is a test.

This text represents the headers and body of an e-mail message. It is sent to parse_email.pl, where regular expressions are used to extract the values of interest, such as the subject, the sender's name and address, and the components of the date. These extracted values are placed in a hash and sent to a function that generates XML incorporating the hash values. A weakness of our client is that we did not consult any RFCs to help us construct the regular expressions, and therefore the result of the parsing is spotty for e-mails sent from some providers.
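The regular expressions in parse_email.pl are not reproduced in this report. As an illustration of the approach only, here is a sketch in JavaScript of extracting the subject, sender, and date fields from the header block. The function name, the patterns, and the address sender@example.edu are assumptions. Like the approach described above, it ignores folded (multi-line) headers, encoded words, and MIME structure.

```javascript
// Hypothetical counterpart of the extraction step in parse_email.pl.
function parseHeaders(rawMessage) {
  // The header block ends at the first blank line.
  const headerText = rawMessage.split(/\r?\n\r?\n/)[0];
  const subject = /^Subject:[ \t]*(.*)$/m.exec(headerText);
  // Display name (optionally quoted) followed by <address>.
  const from = /^From:[ \t]*"?([^"<]*?)"?[ \t]*<([^>]+)>/m.exec(headerText);
  // Example: "Date: Thu, 01 Dec 2005 05:28:32 -0500"
  const date = /^Date:[ \t]*(\w{3}), (\d{1,2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2})/m.exec(headerText);
  return {
    subject: subject ? subject[1].trim() : "",
    name: from ? from[1].trim() : "",
    addy: from ? from[2] : "",
    day: date ? date[1] : "",
    date: date ? date[2] : "",
    month: date ? date[3] : "",
    year: date ? date[4] : "",
    hour: date ? date[5] : "",
    minute: date ? date[6] : "",
    second: date ? date[7] : "",
  };
}

// Shortened sample message with a placeholder address.
const sample = [
  'From: "Eugene Ewe Otto" <sender@example.edu>',
  "Subject: test -1",
  "Date: Thu, 01 Dec 2005 05:28:32 -0500",
  "",
  "this is a test.",
].join("\r\n");

const parsed = parseHeaders(sample);
console.log(parsed.subject, "|", parsed.name, "|", parsed.addy);
```

The keys of the returned object mirror the XML tag names used in popper.pl's output.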


popper.pl Returns Output to workhorsies.js

The final output of popper.pl looks like this (again, abbreviated):

Content-Type: text/xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<content>
<email>
  <headers>
    <subject>test -1</subject>
    <name>Eugene Ewe Otto</name>
    <addy>[email protected]</addy>
    <day>Thu</day>
    <date>01</date>
    <month>Dec</month>
    <year>2005</year>
    <hour>05</hour>
    <minute>28</minute>
    <second>32</second>
  </headers>
  <body>this is a test </body>
</email>
...
<num_new_messages>5</num_new_messages>
<lastmessagereceived>3977</lastmessagereceived>
</content>

Note the string at the top, “Content-Type: text/xml”, followed by a blank line. This is an HTTP header that must be sent to the browser to identify the incoming text as XML; the blank line marks the end of the headers. We spent about an hour and a half trying to figure out why our output wasn’t working and found that it was because we were sending the wrong header, “Content-Type: text/html”.

Notice also the string <?xml version="1.0" encoding="UTF-8" standalone="yes"?>; this identifies the XML file’s properties and is also required by most browsers for proper formatting. The rest of the file consists of invented XML tags and e-mail data as well as a tag for the number of incoming messages (<num_new_messages>) and the id number of the last message received (<lastmessagereceived>).

The XML output is then sent back to workhorsies.js by the webserver.
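The function that turns the hash of extracted values into XML is not shown in this report. The sketch below, in JavaScript for consistency with the client code, shows one way such a step could look using the tag names from the sample output. The escaping of &, <, and > is an assumption about what a robust version would need (an unescaped < in a subject line would make the document ill-formed); the report does not say whether popper.pl does this.

```javascript
// Escape the characters that would otherwise break XML element content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical generator: tag names follow the sample output above.
function emailToXml(fields) {
  const headerNames = [
    "subject", "name", "addy", "day", "date",
    "month", "year", "hour", "minute", "second",
  ];
  const headers = headerNames
    .map((tag) => "<" + tag + ">" + escapeXml(fields[tag] || "") + "</" + tag + ">")
    .join("");
  return (
    "<email><headers>" + headers + "</headers>" +
    "<body>" + escapeXml(fields.body || "") + "</body></email>"
  );
}

const xmlFragment = emailToXml({ subject: "a < b & c", name: "Eugene Ewe Otto", body: "this is a test" });
console.log(xmlFragment);
```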


workhorsies.js Prints Inbox Values to ajax.html

myRequest’s status has now changed and its event handler, onResponse(), is called. onResponse() receives the incoming XML file, parses the messages one-by-one, and stores them into an array of messages. Once this array is complete, the components of each message are printed to ajax.html, shown in Figure 4.

Figure 4 – View of the inbox after login
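The parsing step in onResponse() can be sketched as follows. In a browser the client would presumably walk the parsed document (for example with getElementsByTagName, which appears in the reference list); the version here is a regular-expression stand-in so that it runs without a browser DOM. The function names are hypothetical, and the sketch does not decode XML entities.

```javascript
// Return the text of the first <tag>...</tag> in `xml`, or "" if absent.
function tagText(xml, tag) {
  const match = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">").exec(xml);
  return match ? match[1].trim() : "";
}

// Collect each <email> element into an array of message objects.
function parseInbox(xml) {
  const messages = [];
  const emailPattern = /<email>([\s\S]*?)<\/email>/g;
  let match;
  while ((match = emailPattern.exec(xml)) !== null) {
    messages.push({
      subject: tagText(match[1], "subject"),
      name: tagText(match[1], "name"),
      body: tagText(match[1], "body"),
    });
  }
  return {
    messages: messages,
    lastMessageReceived: Number(tagText(xml, "lastmessagereceived")),
  };
}

const responseText =
  '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><content>' +
  "<email><headers><subject>test -1</subject><name>Eugene Ewe Otto</name></headers>" +
  "<body>this is a test </body></email>" +
  "<num_new_messages>1</num_new_messages>" +
  "<lastmessagereceived>3977</lastmessagereceived></content>";

const inbox = parseInbox(responseText);
console.log(inbox.messages.length, inbox.lastMessageReceived);
```

The lastMessageReceived value is what the client would store in the hidden last_message_received field for the next request.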

We’ve shown that we can make a connection using JavaScript, but what good does this do us? We still haven’t saved any bandwidth! Well, let’s explore what exactly is wrong with UVa’s Webmail system and how much we can potentially save.

UVa Webmail Service Performance

For every reload of UVa’s Webmail service inbox, around 11KB of data are transferred simply for page description (tables, formatting, etc.), not including message data. Typically, about 1KB of text is transferred per message displayed. Webmail also has a feature where you can choose the number of messages to display per page (ranging between 5 and 100,000). The data shown below in Table 1 and graphed in Figure 5 were collected by loading the inbox at each available messages-per-page setting up to 1,000 and saving the page locally. In run 1, messages were sorted by date, and in run 2, messages were sorted by status. Data were not collected beyond 1,000 messages per page because our data set was too small.

Messages Per Page    KB (run 1)    KB (run 2)
5                    17            16
7                    18            17
10                   21            20
15                   26            25
20                   32            30
25                   37            35
30                   43            40
50                   66            60
100                  122           111
300                  350           313
1000                 994           1018

Table 1

[Chart: Messages Per Page vs. Size. x-axis: Messages per Page; y-axis: Size (KB); series: KB (run 1), KB (run 2)]

Figure 5 – Graph illustrating the amount of bandwidth spent per message per page

UVa’s Webmail can be set to check for new mail every 30 seconds. In the worst-case scenario, for a standard Webmail user to check for new messages, he/she would have to download 11 KB + 100,000*1 KB = 100,011 KB of data – just to check for new messages! Of course, this scenario is not entirely realistic, but the fact that these settings are allowed means that this is possible.

By default, UVa’s Webmail sets the number of messages per page to 20, and the refresh rate to once every five minutes. This means that every five minutes, about 31KB of data is transferred from the server.
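The estimates above follow a simple linear model: a fixed page overhead of about 11 KB plus about 1 KB per listed message. The sketch below restates that arithmetic. The function names are hypothetical, and the daily figure assumes (as an illustration only) that the inbox page is left open and refreshing for a full 24 hours.

```javascript
const PAGE_OVERHEAD_KB = 11; // page description, per the estimate above
const PER_MESSAGE_KB = 1;    // approximate text per listed message

// Estimated size of one inbox page load, in KB.
function webmailPageKB(messagesPerPage) {
  return PAGE_OVERHEAD_KB + PER_MESSAGE_KB * messagesPerPage;
}

// Estimated KB transferred per day if the page refreshes continuously.
function dailyTransferKB(messagesPerPage, refreshSeconds) {
  const refreshesPerDay = (24 * 60 * 60) / refreshSeconds;
  return refreshesPerDay * webmailPageKB(messagesPerPage);
}

console.log(webmailPageKB(20));        // default settings
console.log(webmailPageKB(100000));    // worst-case page size
console.log(dailyTransferKB(20, 300)); // default settings, refreshing all day
```

For 20 messages per page the model gives 31 KB, which sits between the 32 KB and 30 KB measured in Table 1.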


AJAX Client Performance

When the inbox is refreshed for the AJAX client, performance largely depends on the amount of new mail that has been received since the last check. The client will always make a call to the webserver to request an XML file containing new emails. In the case that no new emails have been received, the webserver will return an XML output which only contains the number of the last message in the inbox (3977 in the example below).

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<content>
<lastmessagereceived>3977</lastmessagereceived>
</content>

This XML output contains 122 characters; therefore, 122 bytes will be sent back from the webserver. If there has been new mail, the webserver will return an XML output containing only the information needed to display these new messages, as opposed to returning data for the entire inbox, as Webmail would do. In the example below, one new message has been received and transferred back from the server.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<content>
<email>
  <subject>jason su has listed you as a friend...</subject>
  <name>facebookcom</name>
  <addy>confirmfacebookcom</addy>
  <day>Sat</day>
  <date>26</date>
  <month>Nov</month>
  <year>2005</year>
  <hour>19</hour>
  <minute>55</minute>
  <second>42</second>
  <body>this is a test</body>
</email>
</content>

The server will transfer 356 bytes of data for the above message. When several new messages have arrived, additional data will be transferred for each one (the size varies depending on the e-mail's content). The worst case for the AJAX client is when all messages are new, in which case the returned XML file will describe all of the messages. Even this case performs just as well as Webmail.

Since the formatting data and the already-retrieved messages do not have to be retransferred when the AJAX client checks for new mail, the amount of bandwidth required to inform the user about his/her inbox status can be up to a few orders of magnitude lower than the amount required by UVa’s Webmail client (depending on the number of new messages and the settings used in Webmail).
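To put rough numbers on this comparison, the sketch below divides the default Webmail refresh size by the two AJAX response sizes quoted above. It assumes 1 KB = 1024 bytes and ignores HTTP header overhead on both sides, so the results are approximations rather than measurements.

```javascript
const WEBMAIL_DEFAULT_BYTES = 31 * 1024; // about 31 KB per refresh at default settings
const AJAX_NO_MAIL_BYTES = 122;          // response when no new mail has arrived
const AJAX_ONE_MESSAGE_BYTES = 356;      // response carrying one new message

// How many times smaller the AJAX response is than the Webmail page.
function savingsFactor(webmailBytes, ajaxBytes) {
  return webmailBytes / ajaxBytes;
}

const noMailFactor = Math.round(savingsFactor(WEBMAIL_DEFAULT_BYTES, AJAX_NO_MAIL_BYTES));
const oneMessageFactor = Math.round(savingsFactor(WEBMAIL_DEFAULT_BYTES, AJAX_ONE_MESSAGE_BYTES));
console.log(noMailFactor, oneMessageFactor);
```

At default settings this gives a reduction of roughly two orders of magnitude; larger messages-per-page settings in Webmail push the ratio higher.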

Test Cases

When assembling a system with such varied components that involves encoding and decoding to and from network connections, there are a lot of places where errors can occur. Over the course of the project, we came up with several debugging techniques that helped the project progress more smoothly.

Generating Correct XML Files from the POP Server

While building popper.pl, we found it invaluable to be able to execute it on the command line and pipe its output into an XML file. This allowed us to fix buggy Perl code as well as to get an idea of exactly what the JavaScript file would eventually see.

Handling XML Files

To ensure that our JavaScript file could interpret XML files correctly, we took the XML files that we generated by running popper.pl on the command line and verified that they were standards compliant simply by opening each file in a browser and fixing the errors the browser reported.

Knowing that our XML files were formatted correctly allowed us to build a prototype in JavaScript to interpret them and be confident that if an error occurred, the problem was in the JavaScript.

Final Tests

When testing our final product, we wrote several e-mails to ourselves beforehand so that these e-mails could be downloaded and displayed. This allowed us to be sure that any problems we saw were due to bugs in our code rather than irregularities in arbitrary e-mails.

Future Work

To take this project from a prototype into a live application, a lot of improvements would have to be made. This is a list of current limitations and how we would fix them.

Transactions are currently performed in plaintext – this is very insecure because your login information and e-mails could be intercepted by anyone sniffing the network. A later iteration of this client would use encryption, most likely SSL, to transfer data.

Obviously, an e-mail client needs to be able to send messages as well as receive them. To do this, we would add SMTP support and the necessary HTML and JavaScript components. Also, the IMAP protocol has become very popular for checking e-mail, so we would implement that as well (indeed, UVa’s EE and CS depts. do not support POP at all, only IMAP!).


A fairly significant problem is that the regular expressions we use to parse e-mail messages were designed around the format that UVa’s SMTP server uses. We didn’t make reference to any RFCs, and because of this, we’ve noticed that a lot of messages we receive are not represented correctly through our interface. Therefore, the regular expressions we use to parse the e-mail messages would need to be improved to follow the relevant standards: RFC 2822 for the message format itself (the POP RFC, RFC 1939, only covers retrieval), and the MIME RFCs for messages that contain attachments.

Also, standard error messages like incorrect username/password are not displayed. Lots of little user-interface issues would have to be addressed.

Conclusion

We had a great time with this project; it allowed us to explore an emerging technology and get our hands dirty with some programming languages, file formats, and protocols that we’d never used before.

We feel like this project enabled our learning very much :-D

References

http://www.informit.com/articles/article.asp?p=425820&seqNum=6&rl=1
http://www.pixel2life.com/twodded/t_ajax_asynchronous_javascript_and_xml_using_php_to_send_data_/page1/
"Perl in a Nutshell", O'Reilly Publishing
"Network Programming with Perl", Lincoln Stein
http://www.mediacollege.com/internet/perl/form-data.html
http://www.unix.org.ua/orelly/perl/cookbook/ch10_06.htm
http://www.cs.mcgill.ca/~abatko/computers/programming/perl/howto/hash/
http://forums.devshed.com/archive/t-244820/responseXML-has-no-properties
http://www.devguru.com/Technologies/xmldom/quickref/document_getElementsByTagName.html
