Hardware Implementation of Video Streaming By Jorgen Peddersen School of Information Technology and Electrical Engineering, The University of Queensland. Submitted for the Degree of Bachelor of Engineering (Honours) in the Computer Systems Engineering Stream October 2001

Transcript of Thesis


Hardware Implementation of

Video Streaming

By Jorgen Peddersen

School of Information Technology and Electrical Engineering, The University of Queensland.

Submitted for the Degree of Bachelor of Engineering (Honours)

in the Computer Systems Engineering Stream

October 2001


62 Macalister Street
Carina Heights, Q 4152

Tel. (07) 3398 8424

October 19, 2001

The Head
School of Information Technology and Electrical Engineering
The University of Queensland
St Lucia, Q 4072

Dear Professor Kaplan,

In accordance with the requirements of the degree of Bachelor of Engineering

(Honours) in the Computer Systems Engineering stream, I present the following thesis

entitled “Hardware Implementation of Video Streaming”. This work was performed

under the supervision of Dr. Peter Sutton.

I declare that the work submitted in this thesis is my own, except as acknowledged in

the text and endnotes, and has not been previously submitted for a degree at the

University of Queensland or any other institution.

Yours sincerely,

Jorgen Peddersen


Abstract by Jorgen Peddersen


Abstract

This thesis describes a pure hardware implementation of simple real-time video

streaming using an FPGA (Field Programmable Gate Array). Video streaming is

presently performed using mainly software-based techniques on dedicated computers,

as pure hardware solutions are slower to design and harder to debug. The advantage

of hardware designs is in cost, as one chip could be mass produced to perform simple

video streaming tasks and used in areas such as security video cameras and other live

video feeds.

The implementation discussed herein uses the XSV-300 FPGA board designed by

XESS Corporation to implement a real-time video streaming system. The board

provides a simple video decoding chip, a network interface chip and a Xilinx XCV-300

FPGA. The FPGA is configured with code designed in VHDL that handles control of

the chips involved to implement a sturdy video streaming design. The resulting

implementation allows streaming of any RCA or S-Video data source into UDP packets

that can be transmitted over the network to a destination host.

The final result is a complete streaming design that does not require a PC. This design

has been fully tested and performs well. Possible sources that can be streamed are TV,

DVD and game consoles. At its present stage, the image quality and the network

bandwidth required for the design are not a match for software-based techniques,

although with some future work, the design could match these more expensive solutions

in quality and speed.


Acknowledgments

This thesis was the product of many hours of work. Long half-hour implementation runs that don’t end up working can be frustrating as well as interesting to debug. The final design could not have been completed without the help of many people; the author therefore wishes to thank:

Dr. Peter Sutton for his guidance and patience during many 5-minute meetings.

Mum and Dad for being so supportive and understanding.

Ashley Partis for proofreading and for co-writing the original VHDL IP stack.

James Brennan for writing the RAM code and helping with formatting.

Dave Vanden Bout for being a technical support genius who can actually solve

problems.

Alex Song for some brilliant inspiration.

Simon Leung for proofreading the thesis.

And last, but not least, Sri Parameswaran who inspired me to choose Computer

Systems Engineering.


Contents

Abstract .......... iv
Acknowledgments .......... v
Contents .......... vi
List of Tables .......... viii
List of Figures .......... ix

CHAPTER 1 – INTRODUCTION .......... 1
  1.1 Introduction to Video Streaming .......... 1
  1.2 The Problem .......... 2
  1.3 FPGA Solution .......... 2
  1.4 Overview .......... 3

CHAPTER 2 – REVIEW OF PREVIOUS WORK .......... 5
  2.1 Previous Work with Board .......... 5
    2.1.1 Video Decoder .......... 5
    2.1.2 Network Stack .......... 5
  2.2 Video Streaming Formats .......... 6
  2.3 Other work .......... 7
    2.3.1 Xilinx/MidStream Server .......... 7
    2.3.2 Axis Web Cameras and Servers .......... 8
    2.3.3 JPEG on FPGA .......... 9
    2.3.4 Ethernet Intellectual Properties .......... 9
  2.4 Summary .......... 9

CHAPTER 3 – PROBLEM DEFINITION .......... 11
  3.1 General Problem .......... 11
  3.2 Video quality .......... 11
  3.3 Network Issues .......... 12
    3.3.1 UDP or TCP? .......... 12
    3.3.2 Packet Format .......... 13
  3.4 PC Program .......... 14
  3.5 Summary .......... 14

CHAPTER 4 – HARDWARE ENVIRONMENT .......... 15
  4.1 Description of Board .......... 15
    4.1.1 FPGA .......... 15
    4.1.2 CPLD .......... 16
    4.1.3 Video Decoder Chip .......... 16
    4.1.4 Ethernet Port .......... 17
    4.1.5 SRAM .......... 17
    4.1.6 Flash RAM .......... 17
  4.2 VHDL / Foundation .......... 18
    4.2.1 VHDL .......... 18
    4.2.2 The Foundation Series .......... 18
  4.3 Summary .......... 19

CHAPTER 5 – VHDL IMPLEMENTATION .......... 20
  5.1 Video Decoding .......... 20
    5.1.1 Initialisation .......... 20
    5.1.2 RAM Format .......... 21
  5.2 Networking .......... 21
    5.2.1 Removal of IP Re-assembly .......... 21
    5.2.2 Fixing Ethernet .......... 21
    5.2.3 ICMP .......... 22
    5.2.4 RAM Arbitration .......... 22
    5.2.5 PC SRAM Viewer .......... 23
  5.3 Image Format .......... 23
  5.4 Video to Network Interface .......... 23
    5.4.1 Video-In to UDP Packet Converter .......... 24
    5.4.2 UDP Connection Handler .......... 25
  5.5 Complete FPGA Design .......... 26
  5.6 CPLD Alteration .......... 27
  5.7 Summary .......... 28

CHAPTER 6 – PC IMPLEMENTATION .......... 29
  6.1 Programming I.D.E. .......... 29
  6.2 OpenPTC .......... 29
  6.3 Winsock Sockets .......... 30
    6.3.1 Microsoft Foundation Classes .......... 31
    6.3.2 Blocking Sockets .......... 31
    6.3.3 Non-blocking Sockets .......... 31
  6.4 Protocol Definition .......... 31
  6.5 Graphical User Interface .......... 32
  6.6 Summary .......... 33

CHAPTER 7 – DESIGN EVALUATION .......... 34
  7.1 Streaming Results .......... 34
    7.1.1 Image Quality .......... 34
    7.1.2 Network Issues .......... 34
  7.2 Comparisons .......... 35
  7.3 Process Evaluation .......... 36
  7.4 Summary .......... 36

CHAPTER 8 – FUTURE DEVELOPMENTS .......... 38
  8.1 Image Format Changes .......... 38
  8.2 100Mb/s Upgrade .......... 38
    8.2.1 16-bit RAM Functionality .......... 39
    8.2.2 CRC Alteration .......... 39
  8.3 Fragment the UDP Packet .......... 40
  8.4 Image Compression .......... 40
  8.5 Audio streaming .......... 41
  8.6 Summary .......... 41

CHAPTER 9 – CONCLUSION .......... 42
References .......... 43
APPENDIX A – IMPLEMENTATION DATA .......... A-1
APPENDIX B – PARTIAL VHDL SOURCE CODE .......... B-1
APPENDIX C – PARTIAL PC SOURCE CODE .......... C-1


List of Tables

Table 1: TCP header .......... 13
Table 2: UDP header .......... 13
Table 3: Desired expectations and limitations .......... 14
Table 4: RAM memory map .......... 17
Table 5: Some alternate image formats .......... 38


List of Figures

Figure 1: MidStream’s streaming server .......... 7
Figure 2: Axis 2400 Video Server .......... 8
Figure 3: Axis 2100 Web Camera .......... 8
Figure 4: XSV-300 board and block diagram .......... 15
Figure 5: Example of image quality .......... 24
Figure 6: Block diagram of final design .......... 27
Figure 7: OpenPTC demonstrations .......... 30
Figure 8: GUI for PC program .......... 32


Chapter 1 – Introduction

1.1 Introduction to Video Streaming

Video streaming has become one of the most popular uses for the Internet in recent

times. Many Internet sites are dedicated to providing this service and it is used for

many different applications. The TV program Big Brother demonstrates how popular

video streaming has become, allowing Internet users to watch people in an enclosed

house 24 hours a day, 7 days a week. The demand for this type of media is growing and

new technologies must be developed to match this demand.

Initially, video media was played by downloading a movie file and displaying it on the

user’s computer after the download had completed. This method is very slow, and the

user had to wait for long periods while the video downloaded before they could play it.

The video streaming phenomenon began when real-time streaming was introduced. By

using a free commercial product such as RealPlayer, streaming could now be

accomplished in real-time. New compression technology allowed the video to be

viewed as it is downloaded to its destination. With this method there are only delays

while waiting for the first few frames to be transmitted, then further frames are

displayed as they are transmitted, as if the video were playing in real-time.

Unfortunately, the speed of downloading comes at the cost of losing some image

quality. Real-time video streaming technology was taken further to introduce live video

streaming. Live streaming is the real-time streaming of live images from an input

source. It allows a video source such as a webcam or TV to be displayed as a real-time

stream on the destination computer. Many uses for live video streaming have been

developed, including applications like video conferencing.

Real-time and live streaming require massive amounts of resources to operate at high

resolution, but anyone can do it at home with the right hardware and software. Even

though this is the case, there is much more that can be done in the field. As technology

advances, so will the quality and availability of streaming media, e.g. TV channels in

the future may be accessed through the Internet and this method could be used for video

phones and other similar applications.


1.2 The Problem

Currently, almost all streaming applications are performed by computers with

complicated, expensive hardware requirements. There are many computers around the

world dedicated purely to video and audio streaming. Streaming of live video also

requires very fast machines to compress and transmit the massive amounts of data

involved.

The problem addressed in this thesis was to design a hardware system capable of

streaming live video data through a standard Ethernet network. This area is a relatively

new direction for video streaming, and has only partially been explored before. The

final design is not expected to fully match current software techniques, but to

demonstrate that streaming can be accomplished in hardware.

There are many advantages to implementing a purely hardware system for video

streaming. Cost and size can be reduced significantly, while improving the quality and

speed of transmission of the video stream itself. Unfortunately, hardware methods take

longer to design and are much more difficult to debug.

The ideal product is a system that can be connected to a network port and a video source

(such as a TV, VCR, DVD or game console), and allow viewing of the source’s output

from anywhere on the network. The device should not require any computer at the

video source to run, making it a purely hardware solution. Possible applications are

viewing a video tape remotely, watching TV on a computer, and monitoring a security camera, among others.

1.3 FPGA Solution

One solution to the problems of debugging and testing of hardware is the FPGA1.

These are hardware chips that are reconfigurable, allowing many different functions to

be performed with one device. They are useful during development because partial designs can be tested in hardware to supplement simulation as the complexity increases.

1 Field Programmable Gate Array


The FPGA is a good choice for a live video streaming design as many designs can be

implemented on one piece of hardware, eliminating further hardware costs. FPGAs

have been around for some time, but they have only recently become an adequate size2

to build anything complicated. As the size increases, new uses for them are being

found, from neural networks to advanced digital signal processing.

This thesis discusses one implementation of FPGAs to produce a live video streaming

solution in pure hardware. For this purpose, the XSV-300 FPGA development board

from XESS Corporation [1] was provided. This board includes on-board hardware that

will make video streaming possible without requiring any external hardware to be

interfaced. This type of solution to the live streaming problem has not been previously

performed, so it is a very useful application for the FPGA and board.

1.4 Overview

The remainder of this thesis explores some earlier work performed in the field of video

streaming and FPGAs, and discusses a complete, working solution to the general

problem presented in this chapter. The design and implementation of each of the areas

required to achieve a working video streaming design are each described separately and

comparisons to existing solutions are made.

Chapter 2 discusses previous work in the field of video streaming and network

applications in both software and hardware. The state of current hardware

implementations of video streaming is also assessed in this chapter. Chapter 3 defines

a general solution to the problem discussed in section 1.2 using the method described in

section 1.3. The problem is split into its major tasks and each task’s desired

specifications for the final design are discussed.

Based on the specifications determined above, Chapters 4, 5 and 6 describe the

implementation of each of the major tasks in detail. Chapter 4 explains the

programming environment used, including the board and tools. Chapter 5 discusses the

2 Number of equivalent gates


implementation of the live streaming design in VHDL. Chapter 6 gives an overview of

the decisions made for the PC program that will display the resulting stream.

Chapter 7 evaluates the final design in areas such as performance, quality of streaming

and stability. The design described in Chapters 4, 5 & 6 is evaluated and compared to

other systems in use. Chapter 8 discusses future work that could be performed to make

the implementation produced more complete. Improvements that could be made are

defined with the method that needs to be taken to achieve those improvements. Finally,

Chapter 9 presents a conclusion to the thesis, summing up the major points and

discussing the results.


Chapter 2 – Review of Previous Work

This chapter describes some of the work done in the fields of video streaming and

network architectures on FPGAs. Advantages and disadvantages are discussed, as well

as their relevance to the thesis.

2.1 Previous Work with Board

Some VHDL implementations for components of the XSV-300 board mentioned in

section 1.3 were designed by the author and two other students before the

commencement of the thesis. These designs included a program to create digital images

from the standard video cables used for TV, DVD etc. and a network stack design that

includes the Ethernet, ARP3, IP4, ICMP5 and partial UDP6 layers of the TCP7/IP

protocol suite. These designs are documented on the supervisor’s web page [2]. A brief

description of these designs follows.

2.1.1 Video Decoder

The video decoder project utilises the video decoder chip on the XSV-300 board to

convert images from RCA or S-Video format into a digital format. These formats are

how TV signals are typically transmitted over short cables. This project stores images

into on-board SRAM; they are then read by a design that displays the images on a

VGA monitor. The project can be used to convert the images into a format valid for

network transmission with some minor editing.

2.1.2 Network Stack

The network stack design contains a partial implementation of the TCP/IP protocol suite

on the board. A network stack consists of multiple protocols that exist in theoretical

layers. Each layer provides services to the layers above, and utilises the layers below

for transmission purposes. The protocols that are implemented in this design include

Ethernet, ARP, IP, partial ICMP (request/reply support only) and a UDP receive

3 Address Resolution Protocol
4 Internet Protocol
5 Internet Control Message Protocol
6 User Datagram Protocol
7 Transmission Control Protocol


application. This allows the board to be ‘pinged’ from any computer on the network. It

is also possible to add further transport layers (such as UDP transmit and TCP) to the

design if required.
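The stack itself is written in VHDL, but its layering can be illustrated in C. The sketch below shows the standard on-the-wire IPv4 and UDP header layouts, and the RFC 1071 one's-complement checksum that the IP and ICMP layers must compute (it is what makes the board "pingable"). This is an illustrative sketch, not code from the design.

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal on-the-wire layouts; multi-byte fields kept as big-endian
   byte arrays so the structs match the wire format exactly. */
struct ipv4_header {
    uint8_t version_ihl;        /* 0x45: IPv4, 20-byte header */
    uint8_t tos;
    uint8_t total_length[2];
    uint8_t identification[2];
    uint8_t flags_fragment[2];
    uint8_t ttl;
    uint8_t protocol;           /* 1 = ICMP, 17 = UDP */
    uint8_t checksum[2];
    uint8_t src[4], dst[4];
};

struct udp_header {
    uint8_t src_port[2], dst_port[2];
    uint8_t length[2];
    uint8_t checksum[2];        /* optional (may be zero) for IPv4 UDP */
};

/* Internet checksum (RFC 1071): one's-complement sum of 16-bit words,
   carries folded back in, result inverted. */
uint16_t internet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    while (len > 1) {
        sum += (uint32_t)data[0] << 8 | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)               /* pad an odd trailing byte with zero */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)           /* fold carries into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

In hardware this same fold-and-invert sum is computed incrementally as bytes stream past, which is one reason the checksum was chosen for IP: it needs only an adder, not a table or a multiplier.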

This project also included a PC SRAM8 viewer for troubleshooting. At any time, the

PC can take a snapshot of the entire contents of RAM, downloading it into a file on the

PC. This file can be viewed in a hex editor for troubleshooting purposes. This feature

is an aid for designing new protocols to add to the stack as the data can be checked to

make sure that it is stored correctly.
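The hex-editor view used to inspect these RAM snapshots is straightforward to reproduce; the `hex_dump` helper below is a hypothetical sketch of such a viewer, not part of the project's tools.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <ctype.h>

/* Print a buffer hex-editor style: offset, 16 hex bytes, ASCII column. */
void hex_dump(FILE *out, const uint8_t *buf, size_t len) {
    for (size_t off = 0; off < len; off += 16) {
        fprintf(out, "%06zx  ", off);
        for (size_t i = 0; i < 16; i++) {
            if (off + i < len) fprintf(out, "%02x ", buf[off + i]);
            else fputs("   ", out);          /* pad a short final row */
        }
        fputc(' ', out);
        for (size_t i = 0; i < 16 && off + i < len; i++) {
            uint8_t c = buf[off + i];
            fputc(isprint(c) ? c : '.', out);
        }
        fputc('\n', out);
    }
}
```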

2.2 Video Streaming Formats

Video streaming can occur in many different formats. Many of these formats employ

some sort of compression algorithm to lower the amount of data that is transmitted by

the stream while affecting its quality as little as possible. Most commercial streaming software (e.g. RealPlayer, QuickTime, Media Player) uses MPEG9 or Motion-JPEG10 as the streaming formats. MPEG is a complicated format that is very hard to

encode in real-time. The reason for this is that it uses future frames as part of the

encoding scheme. MPEG can be used for real-time playback, with the information

decoded for future frames being stored for later use.

Motion-JPEG is a different type of streaming. This involves transmitting complete still

images, each encoded separately and displayed one by one. In the Motion-JPEG

scheme, images are encoded using the JPEG algorithm. Many applications use this type

of scheme as it is easier to encode and decode each image. Also, if an image is lost, it

won’t affect a large number of frames in the stream, whereas methods like MPEG may.
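The loss-isolation property comes from packetising each frame independently. The sketch below illustrates the idea in C; the header fields and 512-byte payload size are illustrative assumptions, not the packet format the thesis later defines in section 3.3.2. Tagging each packet with a frame number and byte offset means a lost packet corrupts at most part of one frame.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-packet header: identifies which frame the payload
   belongs to and where in that frame it goes. */
struct frame_packet {
    uint16_t frame_number;
    uint16_t offset;        /* byte offset of payload within the frame */
    uint16_t payload_len;
    uint8_t  payload[512];
};

/* Split one frame into packets; returns the number of packets filled. */
size_t packetize_frame(const uint8_t *frame, size_t frame_len,
                       uint16_t frame_number,
                       struct frame_packet *out, size_t max_packets) {
    size_t n = 0;
    for (size_t off = 0; off < frame_len && n < max_packets; off += 512, n++) {
        size_t chunk = frame_len - off < 512 ? frame_len - off : 512;
        out[n].frame_number = frame_number;
        out[n].offset = (uint16_t)off;
        out[n].payload_len = (uint16_t)chunk;
        memcpy(out[n].payload, frame + off, chunk);
    }
    return n;
}
```

A receiver can place each payload straight into its frame buffer at `offset` and simply skip missing packets, which is exactly why per-frame schemes degrade gracefully where inter-frame schemes like MPEG do not.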

Many other streaming types are also possible that may be simpler or have better

compression, with a lower quality image. The simplest form is to send an image

without any compression, which is termed raw formatting.
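Raw formatting makes the bandwidth cost easy to estimate. The figures in the comment below are hypothetical, not the design's actual resolution and frame rate; the point is that even a modest uncompressed stream approaches or exceeds a 10 Mb/s Ethernet link.

```c
/* Bandwidth of a raw (uncompressed) video stream, in bits per second. */
unsigned long raw_bits_per_second(unsigned width, unsigned height,
                                  unsigned bits_per_pixel, unsigned fps) {
    /* Example: a hypothetical 320x240 frame at 16 bits/pixel and
       10 frames/s needs 320*240*16*10 = 12,288,000 b/s, which is
       already more than a 10 Mb/s Ethernet link can carry. */
    return (unsigned long)width * height * bits_per_pixel * fps;
}
```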

8 Static Random Access Memory
9 Moving Picture Experts Group
10 Joint Photographic Experts Group


2.3 Other work

This section describes other work being undertaken in the fields of video streaming and

network interfaces on FPGAs.

2.3.1 Xilinx/MidStream Server

On October 1st 2001, Xilinx announced that “MidStream Technologies used Xilinx

Virtex-II Platform FPGAs to develop the world’s first true dedicated streaming server

that redefines performance in scalability, reliability and manageability while

dramatically lowering the cost of building and maintaining IP-based media delivery

infrastructures. The server is the first multi-gigabit streaming server capable of serving

all popular formats (Windows Media, Real Networks, Quick Time and MPEG-2) at

multiple bit rates simultaneously” [3].

Figure 1: MidStream's streaming server11

This server, pictured in Figure 1, was released three weeks before the completion of the

thesis. It demonstrates how hardware-based methods can match and greatly surpass

software-based designs. The MidStream server allows a high number of connections

for data transfer, and is much faster than its software counterparts. This device streams

purely from hardware, and does not require a computer to operate. Although the current

device does not support live video streaming from RCA or S-Video inputs, it is likely

11 Figure copied from MidStream’s home page [4]


that a future product will. Unfortunately, no white papers had been produced at the time

of writing, but product information can be found at [4].

2.3.2 Axis Web Cameras and Servers

Another approach to hardware-like systems is the use of embedded systems to provide

web server capabilities. Embedded systems involve using a small microprocessor

inside a product, running software to control the hardware. This method is employed in

a group of products manufactured by Axis Communications [5].

Axis offers two types of product that provide video streaming: video servers and web

cameras. The servers convert live video data into network streams with

high quality images and refresh rates of up to 30 frames/second. One of these servers,

the 2400, is pictured in Figure 2. It can support up to four inputs of video and can be

connected via modem or Ethernet network.

Figure 2: Axis 2400 Video Server

Each web camera is a video camera with a web server on board. These systems are

usually not as powerful as the servers, typically outputting 10 frames/second. The 2100

camera pictured in Figure 3 is an example of these. Some of the newer cameras can

match the refresh rates of the servers, such as the 2120.

Figure 3: Axis 2100 Web Camera


The only problem with these designs is their cost. Placing an entire web server

inside the device requires very complicated code and hardware. The servers are the

most expensive, with the Axis 2400 costing $A4305.0012. The Axis 2100 camera is the

cheapest product at $A1362, but the high quality Axis 2120 costs $A3212.

2.3.3 JPEG on FPGA

Several Motion-JPEG compression designs have been produced in FPGAs, such as the

Motion-JPEG CODEC13 from 4i2i Communications [7]. This performs high level

Motion-JPEG encoding and decoding. Unfortunately, these designs fill an XCV-600

FPGA, so there is no chance of fitting one into an XCV-300 FPGA, especially taking into

account the size of the rest of the design that would need to be included.

2.3.4 Ethernet Intellectual Properties

As an alternative to the Ethernet network interface mentioned in section 2.1.2, it is

possible to use one of the 10/100Mbit Intellectual Property cores that are available.

These cores can operate at 10 Mbps and/or 100 Mbps, and so would provide faster

operation than the existing interface.

One of these is the Paxonet CS-1100 Fast 10/100 Ethernet Media Access Controller.

This design would take about ¼ of the FPGA’s available space, so it is not very large.

Unfortunately, this core is not free; a licence fee is required to use it in any design. An

evaluation version of this core was requested, but was never provided.

A similar core to implement 10/100Mbit Ethernet can be found through OpenCores [8].

This is an almost-complete design written in Verilog. The code does not seem

as compact as the Paxonet design, but it is free to use.

2.4 Summary

The concept of designing a technique to stream video in hardware is a new one that has

just begun to be explored. The MidStream server claims to be the first hardware

12 All costs were provided by Webcam Solutions [6]
13 COder DECoder


streaming server on the market, and it was released at about the same time as this thesis.

This shows that hardware streaming is the way of the future.

The components involved in video streaming have been implemented in hardware, but

at the present time, designs only fit into the larger FPGAs. As FPGAs evolve, their

ability to perform digital signal processing will improve to a point where it is feasible to

perform high compression on live video for streaming, but for now, simplifications

must be made, which will unfortunately impact quality.


Chapter 3 – Problem Definition

This chapter defines the problem that was chosen for this thesis. The required task is

explained and separated into its major components. The problems inherent in each

component are then described and solutions are proposed. This chapter does not

demonstrate how the problem was solved, but rather how it could be solved. The implementation of each major component is discussed in later chapters.

3.1 General Problem

The solution described in this thesis does not provide a complete streaming solution that

would be commercially viable. Instead, the solution that it attempts to provide is a

demonstration that live video streaming is possible at a reasonable quality using

hardware. Producing real-time streams from a live source is still slow in software, due

to the large computation power required in compressing the data. Most video streaming

technology can produce real-time decoding, but real-time encoding is a much more

difficult task, and one that has not previously been performed completely in hardware.

Therefore, the task is to create a complete live streaming board that may not have the

same quality as a computer-based stream, but has the potential to show that with future

work it could match these software-based implementations. The development board's

features will aid the design to a great extent. It contains both a video decoding chip and

an Ethernet Physical Layer encoder chip that can be accessed by a Xilinx XCV-300

FPGA. The board is further discussed in section 4.1. The environment required for

programming the board is discussed in section 4.2.

Other problems associated with designing the solution involve how to store images, how

to transmit images over the network and how to view the stream at another computer.

These issues are discussed further in the remainder of this chapter.

3.2 Video Quality

Most image formats for streaming are based on MPEG or JPEG compression

algorithms. Unfortunately, these algorithms take a large amount of FPGA space, and

are difficult to perform in real time. RAM bandwidth may also limit the types of compression


available. As the priority is to achieve fast live streaming, it may not be possible to read

all the image data for compression from RAM within one refresh.

Another factor involved in the quality of the video is the frame rate. An expectation of

the design is that the refresh rate will be fast enough so that individual images are not

seen and the picture appears to move seamlessly. Software-based video streaming with

a fast dedicated server can often achieve this, and it would be good to

match it. The human eye usually stops seeing individual frames at 24Hz14, but it can

just barely detect separate frames at 12Hz.

3.3 Network Issues

As the solution must stream over an Ethernet network, a protocol for the data to be sent

must be specified. The most common family used for the Internet and most LANs15 is

the TCP/IP protocol suite. A stack of these protocols must be implemented to achieve a

complete networking design. This stack is composed of various layers, each being

composed of one or several protocols. The topmost layer required for transmission is

the transport layer, and for that there are two choices: TCP or UDP. The other layers

are fixed for most networks, in this case, IP with ARP and Ethernet. Apart from the

choice of protocol, the data format must also be defined.

3.3.1 UDP or TCP?

TCP, or Transmission Control Protocol, which is defined in RFC 793 [9], is a common

network protocol used when it is important that all data arrives at the destination

without errors. A three-way handshake is used to establish and close the connection to

avoid errors in communication. Acknowledgements are used to make sure that data

arrived at the destination, and retransmissions occur if this does not happen. This is

called a reliable connection. The header for TCP can be seen in Table 1. The sequence

number, acknowledgement number and control bit fields are used to acknowledge

packets transmitted via the connection. Other fields in the header are used for flow

control. The checksum field is required and will detect errors that occur due to

incorrect transmissions.

14 Hertz is frames per second
15 Local Area Networks


Table 1: TCP header

Source Port | Destination Port
Sequence Number
Acknowledgement Number
Offset | Reserved | Control bits | Window
Checksum | Urgent Pointer
Options | Padding

UDP, or the User Datagram Protocol, which is defined in RFC 768 [10], is usually used

for applications where it doesn’t matter if some or all of the data is lost. There are no

acknowledgements, so if a packet is lost, it is never re-transmitted. This is commonly

used in most streaming formats as one packet being lost should not affect the quality of

the entire stream excessively. UDP is also easier to implement and is fast, without the

retransmissions and timeouts of TCP. The header for UDP is shown in Table 2. The

checksum is optional; when used, it is calculated as a one's complement sum over the UDP header, the UDP data and a pseudo-header built from fields of the IP header.

Table 2: UDP header

Source Port | Destination Port
Length | Checksum
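For illustration, the optional checksum computation can be sketched in Python. The pseudo-header layout (source and destination IP addresses, protocol number 17 and UDP length) follows RFC 768; the helper names are my own, not code from the thesis.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum over 16-bit words, padding odd lengths."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return total

def udp_checksum(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> int:
    """Checksum over the RFC 768 pseudo-header plus the UDP header and data."""
    pseudo = src_ip + dst_ip + b"\x00\x11" + len(udp_segment).to_bytes(2, "big")
    csum = 0xFFFF ^ ones_complement_sum(pseudo + udp_segment)
    return csum or 0xFFFF  # a computed zero is transmitted as all ones
```

Inserting the result into the checksum field makes the one's complement sum of the whole segment equal FFFFh, which is how the receiver verifies it.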

Another technique that is often employed by real-time streaming systems is to use TCP

to set up and control a connection between the source and destination, while UDP is

used to transmit packets between the two endpoints. This method gives the benefits of

no retransmissions in UDP while also allowing the source to monitor the status of the

connection. In this case, it can stop the stream to avoid wasting network bandwidth.

3.3.2 Packet Format

The network packet format will provide many limitations. The maximum UDP or TCP

packet size is 65535 bytes including the header. Packets output on an Ethernet network

have a further restriction whereby the maximum payload is only 1500 bytes. IP

fragmentation can be used to cut the larger UDP packet size down into small fragments

which are re-assembled at the destination, so this will also need to be used.
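As a sketch of the arithmetic involved, the number of fragments needed for a large UDP datagram can be estimated as follows (assuming the minimum 20-byte IP header; fragment offsets are counted in 8-byte units, so every fragment but the last carries a multiple of 8 payload bytes):

```python
import math

def ip_fragments(payload_len: int, mtu: int = 1500, ip_header: int = 20) -> int:
    """Number of IP fragments for one datagram carrying `payload_len` bytes
    of IP payload (UDP header plus data) over a link with the given MTU."""
    per_fragment = (mtu - ip_header) // 8 * 8  # 1480 bytes with the defaults
    return math.ceil(payload_len / per_fragment)
```

Under these assumptions, a full 64-Kbyte packet would be cut into roughly 45 Ethernet-sized fragments.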

Another limitation is the network data rate. Data can be sent on an 802.3 network at

one of two speeds: 10Mb/s or 100Mb/s. Both these rates will limit how much data

can be transmitted over the network, and image quality may need to be sacrificed if the


speed used is too slow. Packet size and sending rate will need to be controlled to handle

this.

3.4 PC Program

The PC program needs to receive incoming data and convert it into images that can be

displayed on a standard PC. For the purpose of this thesis, this other endpoint is a

Windows-based PC, although programs for Unix, DOS etc. could also be written if

needed. This program is by no means the focus of the thesis and should simply allow

the video to be seen.

This program must include options to choose the IP address of the board, and allow

streaming to start and stop. In addition, it must include an image box or window where

the video stream will be displayed. Ideally, this window will be resizable without requiring much extra code. Access to network sockets will also be needed to

receive data from the network card on the computer.

3.5 Summary

Collating the information in this chapter provides several limitations and expectations

that the final design should be able to embody. These criteria are summarised in Table

3.

Table 3: Desired expectations and limitations

Criterion | Expectation/Limitation
Implementation Size | Must fit on XCV-300 FPGA
Image quality | At least recognisable
Refresh rate | At least 12Hz (seamless)
PC code image display rate | At least 12Hz
Network data rate | < 100Mb/s (< 10Mb/s on 10Mbit-only networks)
Packet size | < 65536 bytes
No. of concurrent users | 1


Chapter 4 – Hardware Environment

This chapter discusses the hardware environment required to solve the problem in the

manner specified in section 3.1. Factors affecting the hardware environment were the

FPGA board used and the method to program it.

4.1 Description of Board

As stated in section 3.1, the board and FPGA choice were already defined at the start of

the thesis. The design makes use of the XSV-300 board from XESS Corporation. This

board was chosen for its additional on-board features. The board contains an on-board

video decoder chip and an Ethernet PHY chip that are both used in the implementation.

Figure 4 shows a picture of the board and a layout diagram.

Figure 4: XSV-300 board and block diagram16

The choice of the board means that the FPGA will be the only device requiring a large

amount of programming. SRAM17 included on the board is also used for the temporary

storage of data that is being transmitted or received by the network stack. A brief

description of each of the components used on the board follows.

4.1.1 FPGA

The XCV-300 FPGA [12] included on the board is a standard 300k gate Virtex FPGA

from Xilinx. The FPGA is configured through the parallel port via an XC95108

CPLD18 which can also be seen in Figure 4. Multiple implementations can be

16 Images copied from XESS [11]
17 Static Random Access Memory
18 Complex Programmable Logic Device


programmed into the FPGA to utilise its various features including block RAM/ROM,

logic and three-state circuits on-chip.

4.1.2 CPLD

CPLDs are reprogrammable logic devices like FPGAs, with a few differences. Firstly, CPLDs are usually much smaller than FPGAs, so they cannot handle very complex designs. The other main distinction is that they are typically non-volatile, meaning that once programmed, they remember their configuration even

without power. FPGAs lose their configuration every time they are turned off, and they

must be reprogrammed if this occurs.

The XC95108 CPLD [13] is not only used to program the FPGA via the parallel port,

but also can control much of the hardware on the board such as the LEDs. The CPLD

also has exclusive access to several of the network interface chip inputs and therefore

must be programmed to control these inputs correctly.

4.1.3 Video Decoder Chip

The SAA7113H [14] chip is used to convert PAL, NTSC or SECAM19 data from an

RCA or S-Video source into digital data. The SAA7113 provides its own clock for data

transmission, but before decoding can take place, it must be programmed using an I2C

bus interface. The chip is capable of 9-bit precision, but is usually used with 8-bit

precision. The chip outputs data in Standard ITU 656 YUV 4:2:2 format [15].

Through extensive testing of this chip, and e-mail correspondence with the board

technical support team, it was discovered that there was a problem with the crystal used

by the decoder chip to lock on to colour data. Some boards had been manufactured and

shipped with the wrong crystal installed. The incorrect crystal needed to be replaced to

get the colour to be decoded correctly by the SAA7113H. Once this component was

received from XESS, the decoding started working perfectly.

19 PAL is the television standard used in Australia and Europe. NTSC is used in the US and Japan; SECAM is used in France.


4.1.4 Ethernet Port

The Ethernet port interface is controlled by the LXT970A Dual-Speed Fast Ethernet

Transceiver [16]. This chip can communicate at either 10Mbps or 100Mbps using the

MII20. The inputs that control the functionality of the chip (data rate, duplex mode etc.)

are connected from the CPLD. The FPGA controls the data lines for receiving and

transmitting data. Therefore, both the CPLD and FPGA have to be programmed to use

this device fully. The chip contains CSMA/CD21 error checking as well as plain full

duplex operation. By connecting to an RJ45 socket on the board, full network

functionality can be provided.

4.1.5 SRAM

The SRAM [17] contained on the XSV board is composed of four 4Mbit SRAM chips organised as two 512K×16-bit banks. This RAM is used in the design for

storing packets used by each layer of the network stack. Only one bank needs to be

used for the storage of data packets due to the size of each bank of RAM, so there is a

free bank that other applications on the board could use if required. A memory map for

how the RAM is allocated for the final design is in Table 4.

Table 4: RAM memory map

Memory Range (hex) | Memory Usage
00000 – 007FF | ARP send buffer – holds ARP packets to be sent (1500 bytes)
00800 – 00FFF | IP sending buffer – holds IP frames to be sent (1500 bytes)
01000 – 0FFFF | Free
10000 – 107FF | IP receive buffer (1500 bytes)
10800 – 2FFFF | Free
30000 – 3FFFF | UDP transmit buffer – holds outgoing images (64 Kbytes)
40000 – 4FFFF | ICMP reply buffer – holds outgoing echo replies (64 Kbytes)
50000 – 7FFFF | Free

4.1.6 Flash RAM

The board includes a 16MB Flash RAM [18] chip which can be used as a non-volatile

storage for bitstreams. The CPLD can be programmed to download the bitstream in two

ways: it can either wait for a configuration to arrive from the parallel port and download that,

or download the configuration stored in Flash RAM whenever the board is turned on. It

20 Media Independent Interface
21 Carrier Sense Multiple Access/Collision Detect


is easy to program the Flash RAM, and this method is necessary if the board must be used away from a computer. As the solution's main purpose is to eliminate the need for a computer, this capability is essential.

4.2 VHDL / Foundation

To program the internal configuration of the FPGA, a bitstream must be produced. A

bitstream determines exactly how each flip-flop and interconnect must be configured to

achieve the implementation that is required. To produce it, software tools are needed that, in effect, ‘compile’ code into the bitstream format. These tools

come in the form of VHDL and the Xilinx Foundation Series. VHDL or VHSIC22

Hardware Description Language is the language that the design is written in, while the

Foundation Series can ‘compile’ code into a bitstream. These two tools are described

further below.

4.2.1 VHDL

VHDL allows the programmer to describe the type of hardware to be created inside the

FPGA. Common items are state machines, counters, shift registers, three-state buffers

etc. The language is based on the Ada syntax and allows very complicated hardware

designs to be created with relative ease. All the code for the FPGA and CPLD is

written in VHDL.

4.2.2 The Foundation Series

The Foundation Series contains many tools to create and analyse designs for Xilinx

FPGAs. The most important features that it provides are “synthesis” and

“implementation”. These two tasks allow the creation of a bitstream that can then be

programmed into an FPGA or CPLD.

Synthesis involves reading the VHDL code and determining what sort of hardware,

logic and memory is involved, then outputting that in a form that is ready to be placed

in the hardware. Flip-flops and registers are identified and logic is simplified and

defined. The output can be thought of as being a schematic for the internals of the chip.

22 Very High Speed Integrated Circuit


Implementation takes the output from the synthesis stage and maps it to the specific

hardware being used. This stage routes the design inside the FPGA, configuring each of

the logic blocks inside to create the required design. The output of implementation is

the bitstream file which contains the configuration of every logic block within the

FPGA. Most logic blocks contain some logic and two flip-flops, so the required design is

mapped to achieve minimum distance between logic blocks and to eliminate clock

skew. This stage takes a very long time to complete due to the massive amounts of

computation that routing algorithms require.

Foundation also includes additional tools for analysing implementations after they have

been created. They include a tool to analyse the timing of the critical paths in the

design, tools to illustrate the relative placement of various parts of the design and tools

to program devices through a JTAG23 connector, among others.

4.3 Summary

This chapter has explained the hardware that is used and the software that is used to

design and program that hardware. The XSV-300 board is perfectly suited for this

project, and the design will make use of the on-board video decoder, network chip and

SRAM. The Foundation Series also allows simple programming of the board and some

simulation capabilities. With this environment, it should be easy to design and program

the live video streaming design into the FPGA on the board.

23 Joint Test Action Group


Chapter 5 – VHDL Implementation

This chapter details the choices made for the implementation of the design in VHDL.

The VHDL implementation is the main focus of the thesis, as this is how the video

streaming in hardware is performed. This implementation is designed to talk to the PC

program described in Chapter 6, but another design could easily be created to make

another XSV board read the incoming packets and convert them to VGA data for pure

hardware-to-hardware streaming.

5.1 Video Decoding

The video decoding circuit has not changed much from the Video In design mentioned

in section 2.1. Three main changes have been made to the code from this project to

allow it to be used in the streaming project. The first was the removal of the code which

displayed the images on a VGA monitor as it was no longer needed. The other two

changes are much more extensive to the functionality of the previous design to prepare

it for the new implementation.

5.1.1 Initialisation

The SAA7113H video decoder chip contains many configuration registers that are

initialised using the Philips I2C interface [19]. Previously, programming had been

performed by using the parallel port to emulate the I2C interface via software on a PC.

As this is not an option for a pure hardware implementation, the I2C interface had to be

created as part of the design.

Performing the initialisation as part of the design also meant that an initialisation table

of 72 8-bit entries had to be added. Using logic to implement this would require many

flip-flops and would waste a lot of space. Fortunately, the Virtex series of FPGAs

includes block RAM which can emulate a standard RAM or ROM module. By using

the Core Generator from the Foundation tools, it was possible to create a ROM which

stores the initialisation data. The rest of the I2C interface requires the design of an I2C

controller to drive the two serial lines used in the communication. The required state machine starts when the chip is reset, programs the SAA7113 and then stops, as


programming only needs to occur once. The initial values stored in the ROM can be

altered if NTSC or SECAM playback is preferred to PAL.
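The table-driven initialisation can be sketched as follows. The (subaddress, value) pairs below are placeholders only, not the real SAA7113H settings, and `i2c_write` stands in for the I2C controller state machine:

```python
# Placeholder register table -- the real ROM holds 72 8-bit entries of
# SAA7113H settings; these example values are NOT taken from the thesis.
INIT_TABLE = [(0x01, 0x08), (0x02, 0xC0), (0x03, 0x33)]

def program_decoder(i2c_write):
    """Walk the ROM-style table once at reset, as the I2C state machine does."""
    for subaddress, value in INIT_TABLE:
        i2c_write(subaddress, value)  # one bus transaction per register
```

Swapping the table contents is all that is needed to prefer NTSC or SECAM over PAL, exactly as the text describes.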

5.1.2 RAM Format

The video input decoding state machine accessed RAM in a way that was not very

optimised, using more clock cycles than were needed. This method would not be

suitable for adding to the network stack design, so the protocol was simplified. Instead

of using delay signals, the signals to write RAM are now clocked in a way that does not

require extra clock cycles to complete, unlike the original version.

5.2 Networking

Many upgrades have been made to the network stack discussed in section 2.1. Each of

these upgrades was completed to simplify the hardware, achieve better routing

capabilities and/or clean up the code. Most of them were required for the new design,

but others were simply completed to make the design more compact.

5.2.1 Removal of IP Re-assembly

The first of these changes was the removal of IP re-assembly. IP fragmentation and re-assembly are required if packet sizes of more than 1500 bytes of data are being sent by

the IP layer24. As the board does not require large amounts of data to be received, this

part of the IP stack can be removed. Fragmentation on the sending side needs to

remain, as the board may need to send large packets of data. Removing this also meant

altering the ICMP ping program so that it no longer chooses which of the IP buffers to

read from (there is now only one).

5.2.2 Fixing Ethernet

Another minor cosmetic change was to redesign the Ethernet sending layer. The design

had a case statement that did not synthesise well, causing several warnings. Some of

the logic in the case statement was removed and implemented as combinational logic,

fixing the design problem. The case statement was also rearranged to produce a design

that required less logic.

24 See Internet Protocol RFC 791 [20]


5.2.3 ICMP

The main implemented feature of the original IP stack was the partial ICMP layer.

Although it is not required to achieve video streaming, it is useful to leave it in the

design for testing capabilities. The ICMP layer allows the board to be ‘pinged’, a

mechanism to test whether the board is working on the network. ICMP is not officially

a transport layer protocol, but it can be treated as one and was previously the only layer

that controlled the IP sending layer. Some minor cosmetic changes were required in the

signals connecting these two layers so that the design could accept a new layer that

could also send data via IP.

5.2.4 RAM Arbitration

Many of the layers in the IP stack read and write to RAM to perform their functions.

Each layer also runs in parallel to the other layers in the design, so many different state

machines may need to access RAM at once. Unfortunately, only one access to RAM

can occur at a time, and these operations require multiple clock cycles. This means

some arbitration is required to only let one device access the address and data signals

for RAM at a time. Currently the stack design uses a simple multiplexing state machine

to determine which set of control signals should be allowed to access RAM.

Multiplexers require a lot of logic though, so some other designs were tested.

One of these designs made each layer normally output high impedance

and tie all the outputs for each signal on the bus together. Some arbitration logic is

used to determine which device is allowed to control the bus. Implementing this

method was quite difficult and involved changing almost every design in the system to

handle the new RAM format. When it was implemented however, it turned out to have

worse timing constraints than the multiplexer design so this format was not used.

Another arbitration mechanism that was attempted was to use a similar format to the

one just mentioned, except each layer would normally output logic zeroes until it was

told it could output its control values. All the outputs were connected via a logical OR.

This is similar to the three-state method, but does not require three-state elements to be

used. This design performed better than the three-state design, but it was slower than

the original design as well, so the original design was kept.
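The grant decision made by the retained multiplexer design can be modeled in a few lines. A fixed-priority policy is shown here as one plausible scheme; the thesis does not spell out the exact ordering used:

```python
def grant(requests):
    """Fixed-priority arbiter: grant the lowest-numbered requesting layer.

    `requests` holds one boolean per network-stack state machine.  Only one
    grant is issued per cycle, mirroring the single-port SRAM constraint;
    None means the bus is idle.
    """
    for layer, requesting in enumerate(requests):
        if requesting:
            return layer
    return None
```

The granted index would then select which layer's address and data signals the multiplexer passes through to the SRAM.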


5.2.5 PC SRAM Viewer

As mentioned in section 2.1.2, the network stack project includes an SRAM viewer that

can be accessed through a parallel port via the PC. This design aided the debugging

process for alterations made to the design, but is not required in the final version. This

part of the code was therefore removed before the final implementation to simplify logic

slightly.

5.3 Image Format

Due to the way the data in RAM is stored, it is very difficult to produce any sort of

compression. To achieve a good frame rate, there is less than half a frame’s delay between packets being sent, which is not enough time to perform full image compression. Formats like Motion-JPEG are impossible to implement in this design due to the way 2-dimensional blocks

are used to encode the data. As the data received moves left-to-right from the top to the

bottom of the image and the data is stored sequentially in RAM, there is not enough

time to load all the bytes needed from RAM for encoding before the next refresh time.

The only feasible algorithm to use would be one that only compresses each horizontal

line separately. This way, several of the previous bytes can be remembered until the

write to RAM occurs for those bytes. Also, the algorithm should have a fixed

compression ratio, as this will make it easier for the PC program to decode when it

receives the packet. Unfortunately, no good, simple to implement algorithm to fit these

criteria was found, so the problem was solved by sampling the image, e.g. taking every

2nd or 4th pixel of the image in the horizontal and vertical directions. This is extremely

lossy, but the image should still be adequate to demonstrate that video

streaming is occurring properly.
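The sampling scheme, keeping every 2nd or 4th pixel in each direction, amounts to a pair of strided slices. A Python illustration (not the VHDL, which selects pixels on the fly as they arrive):

```python
def subsample(frame, step_x=2, step_y=2):
    """Keep every step_y-th row and every step_x-th pixel within each row.

    `frame` is a list of rows of pixel values.  The compression ratio is
    fixed (1 in step_x * step_y), so the receiver can reconstruct the image
    dimensions without any side information.
    """
    return [row[::step_x] for row in frame[::step_y]]
```

The fixed ratio is what makes decoding trivial for the PC program: the output dimensions depend only on the step sizes.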

5.4 Video to Network Interface

The video-to-network interfacing program had to be written as the main part of the code.

This includes two state machines that interact with the video decoding interface and the

network interface to implement the streaming of the images from the input. The UDP

transport protocol was chosen for its ease of use and because packet losses are not detrimental to the system. The design issues that arose during implementation are now discussed.


5.4.1 Video-In to UDP Packet Converter

The video decoded from the input is sent one image at a time through the network. The

main state machine controls the decoder, and writes the required packet into RAM so

that it can be read and fragmented by the IP sending layer. The maximum data rate

allows the following packet types to be sent:

• 180×144 resolution at 25Hz refresh.

• 360×144 resolution at 12.5Hz refresh.

• 360×288 resolution interleaved at 12.5Hz refresh.
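As a rough check on these modes, the raw data rate of each can be computed. The 2 bytes/pixel figure assumes uncompressed 4:2:2 samples and is an assumption, not a number from the thesis; packet header overhead is ignored:

```python
def stream_rate_mbps(width, height, fps, bytes_per_pixel=2):
    """Raw image data rate in Mb/s (1 Mb = 10**6 bits), ignoring headers."""
    return width * height * bytes_per_pixel * fps * 8 / 1e6

# The three candidate modes under the 2 bytes/pixel assumption:
modes = {
    "180x144 @ 25Hz": stream_rate_mbps(180, 144, 25),
    "360x144 @ 12.5Hz": stream_rate_mbps(360, 144, 12.5),
    "360x288 @ 12.5Hz": stream_rate_mbps(360, 288, 12.5),
}
```

Under these assumptions the first two modes carry identical data rates and the interleaved mode twice as much, which is why the link speed constrains the choice of mode.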

By testing all these image formats, it was decided that 360×144 at 12.5Hz gave a good-looking image that updates quite frequently. A screenshot at this resolution can be

seen in Figure 5. This quality is often better than that of many software-based systems,

although as there is no compression, the image does look rather blocky. A compression

algorithm would greatly increase the quality of the transmitted images, but the

restrictions mentioned in section 5.3 do not allow that to be done in the time and scope

of this thesis.

Figure 5: Example of image quality

The UDP header shown in Table 2 must be included with each packet that is

transmitted. The source port is hard-coded into the design as port 2038025. The

25 This is an arbitrarily chosen number (my birthday is 20 March 1980)


checksum is not required, and would just add extra complexity to the design so it is set

to zero (which indicates unused). The data length is easy to calculate, and the

destination port is provided by the connection handler (see section 5.4.2).
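The header as described (fixed source port 20380, length covering the 8-byte header plus data, checksum zero) can be sketched with Python's struct module; the function name is illustrative:

```python
import struct

def build_udp_header(dst_port: int, data_len: int, src_port: int = 20380) -> bytes:
    """RFC 768 header as the board emits it: fixed source port, zero checksum.

    `data_len` is the payload size in bytes; the length field also counts
    the eight header bytes.
    """
    return struct.pack("!HHHH", src_port, dst_port, data_len + 8, 0)
```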

The conversion from the decoder chip to a packet in RAM is performed by another state

machine. Whenever streaming is enabled, this machine will tell the decoder state

machine to decode one frame, or a 720×576 interleaved image. The machine chooses

which pixels it wants to write to RAM and then writes these pixels into RAM as the

decoding continues. Writes to RAM have to be timed well, as the decoding machine

operates off a clock generated by the SAA7113 chip at about 27MHz. The main system

clock is 50MHz so some interface logic is needed.

When the packet is complete in memory, it tells the IP layer to transmit the data. The

extra logic added to the main stack program to allow ICMP communication is set up to

allow all UDP packets through in preference to ICMP. This means that some ICMP

packets may not be sent in time, but when the board is idle, it will still be possible to

ping the board from any location on the network to test its network status. When the

board is streaming data, ICMP replies will be sent during the pause between images

being sent, but replies to all requests will not necessarily be made.

5.4.2 UDP Connection Handler

Also included as part of the network interface is the connection handler. This controls

whether or not streaming should occur, and handles the destination IP and port. This

state machine reads all incoming UDP packets, listening on port 20380 for incoming

data. If the packet’s first byte is 5Bh26, the board will start streaming packets to the

source IP and port of the packet. If it receives a packet whose first byte is not 5Bh, it

will stop streaming after sending the current packet.

To determine if an incoming UDP packet is actually a control packet, the UDP header is

checked. The checksum field is ignored as it is valid to do so for UDP, and once again

it would just add unwanted complexity. If the destination port is 20380, the source IP

26 Also arbitrarily assigned


and port in the header are used to determine where data should be sent. The source IP is

provided by the IP layer so that the destination IP address can be determined.

This method is very useful for controlling the stream. The only problem occurs if the

source never sends a stop packet, or the stop packet is lost. If this were to occur, the

board would keep sending packets endlessly, denying service to the destination host.

Therefore a timeout is implemented: if another packet with 5Bh as its first byte is not received before 32 packets have been output from the image streamer,

streaming stops. The destination for the streaming data must therefore send these

‘keep-alive’ packets periodically so the board will not timeout and stop sending.
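The PC side of this control protocol can be exercised with a short sketch. The start byte 5Bh and port 20380 come from the text; the function names and socket handling are illustrative only:

```python
import socket

START_BYTE = 0x5B     # first byte of a start/keep-alive packet
CONTROL_PORT = 20380  # port the board listens on

def make_control_packet(start: bool) -> bytes:
    """Any packet beginning with 5Bh starts (or keeps alive) the stream;
    any other first byte stops it."""
    return bytes([START_BYTE if start else 0x00])

def send_control(board_ip: str, start: bool) -> None:
    """Fire one control datagram at the board."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(make_control_packet(start), (board_ip, CONTROL_PORT))
```

A viewer would call send_control with start=True, then repeat it often enough that fewer than 32 image packets elapse between keep-alives.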

An extra problem was noticed when the design was completed that would cause a

serious security hole. If an attacker on the network sent a start packet with their source

address set as the broadcast IP, the board would then stream 32 large packets onto the

network to whichever source port was specified in the UDP header! In essence,

they could force the board to deny access to a port of their choice on every machine

connected to the network. Some filtering was added to stop this obvious security flaw

from being a possibility.

5.5 Complete FPGA Design

By performing all the updates and implementing the new state machines described

above, the final design can be created. A block diagram showing all the main state

machines in the design is shown in Figure 6. This figure shows the state machines

present in the design, and the communication paths between them.


Figure 6: Block diagram of final design

5.6 CPLD Alteration

To remove the need for a computer altogether, the design needs to be programmed into

permanent storage on the board as the FPGA loses its configuration whenever the power

is turned off. The Flash RAM mentioned in section 4.1.6 will accomplish this task. To

get this method working, some alterations were required for the CPLD.

The CPLD controls the programming of the FPGA and also handles some configuration

options for the Ethernet PHY chip. These options include whether the PHY should

operate at 10Mb/s or 100Mb/s and similar choices, so they are very important to the

design. A new configuration for the CPLD was produced for the original design that

handled downloading through the parallel port, but not from Flash RAM. Therefore, a


new configuration had to be written to make the CPLD download from Flash and

control the network chip correctly. This design was modified from the source for the

normal Flash programmer available on the XESS website examples page [21].

5.7 Summary

The VHDL design is the heart of the implementation. It performs

all the required tasks for the hardware to produce the video stream. The state machines

involved are often similar in their design, yet each is very complicated, implementing its

part of the design and interfacing with the rest of the state machines to achieve the

overall goal.

Much of the complete design was written before this thesis began, but many updates

have been made to each of the previous designs to prepare them for the new project. The

new state machines are also the most complicated hardware in the design as they do

most of the work in the system.

The final design allows full video streaming output to be achieved with a very simple

input protocol. Although no compression other than sampling was used in the encoding

of transmitted images, the design matches all expectations imposed in the problem

description. A version of the source file that includes the video decoder, UDP packet

converter and connection handler state machines is given in Appendix B.


Chapter 6 – PC Implementation

This chapter describes the implementation issues that arose during the writing of the PC

side of the streaming software. This must provide a simple, easy to use interface that

reads the packets arriving from the board and displays the images on the screen. The

first main decision was the choice of I.D.E.27

6.1 Programming I.D.E.

As the design was to work on a Windows 9X system, there were only two real choices,

Visual Basic and Visual C++. Both I.D.E.s provide object orientation, but there are a

few large differences between them. Visual Basic is a much simpler I.D.E. to use, but is

slow and harder to control. There are many more functional modules provided for

Visual C++ and it also produces smaller code.

If the program were written in VB, it would be harder to directly handle sockets for

the network connection and to display images at a fast enough frame rate. Therefore,

Visual C++ was chosen as it is very fast and much more powerful than VB.

Although VC++ is harder to learn and program in, it can perform many more of the tasks

required by the program.

6.2 OpenPTC

OpenPTC gives programmers a surface of pixels to draw to and high speed routines to

copy these pixels to the display. It also provides some other useful features such as

basic keyboard input and a high resolution timer [22]. It contains many libraries for

Windows, DOS, Unix and many other programming environments. The OpenPTC for

Windows libraries are available on the web under the terms of the GNU Library

General Public License. Essentially, this means that applications may link to the

OpenPTC dynamic link library free of charge so long as any improvements made to the

library itself are submitted back to the OpenPTC community [23].

27 Integrated Development Environment


OpenPTC was chosen to display the incoming images for its ease of use and its features

that allow the program to display images without needing to handle window sizing etc.

The OpenPTC implementation uses two classes to display the image. These classes are

the console and the surface, and together they provide a very powerful imaging tool.

To use the libraries, the program must first create a console and a surface associated

with the console. The console is the window in which the images are displayed and can

be any size. The surface has a size determined by the programmer for the number of

pixels in the horizontal and vertical directions. Whenever an image needs to be

displayed, the surface pixels are updated and then mapped to the console. Changing the

console size will display the correct number of surface pixels no matter what, and this is

all taken care of in the libraries. This shows how easy the OpenPTC libraries are to use,

and the library also comes with many examples that make it easy to create other

implementations. Some samples of OpenPTC programs are shown in Figure 7.

Figure 7: OpenPTC demonstrations

6.3 Winsock Sockets

Windows sockets allow the Windows programmer to add network functionality to the

design. Windows sockets are required to receive UDP or TCP packets in a Visual C++

application. There are several different ways to use these that range from using MFC28

to very low-level socket reads and writes. Sockets can also be set to be either

synchronous or asynchronous, which determines whether they block or cause an event

when they are triggered.

28 Microsoft Foundation Classes


6.3.1 Microsoft Foundation Classes

Microsoft provides two types of sockets to be used for network communications. These

are CSocket and CAsyncSocket. These classes implement the common network

functions for a blocking socket and an asynchronous socket. There was a problem

in attempting to use these classes: the correct .dll needed to get them working did not

seem to be present. Therefore, the implementation was not made with these classes.

6.3.2 Blocking Sockets

The original implementation used blocking sockets as they are much easier

to create and use without much preparation. These are similar to the sockets in Unix

and other systems. By using SOCK_DGRAM as the socket type (this is the type for

UDP), it is possible to create a socket, then use the sendto() and recvfrom() functions to

transfer data. The problem with blocking sockets is that there is no timeout on receive,

so if the system goes into a blocking receive and nothing arrives, the whole program

stalls. This caused the program to lock up occasionally if there was a transmission error

which is an obvious design flaw. For this reason, a non-blocking asynchronous socket

was required for the receive. A blocking send could still be used as the system should

always be ready to send when we need it to.
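The blocking-socket flow described above follows the familiar BSD datagram pattern. This is a minimal POSIX sketch, not the thesis code (under Winsock the same sendto()/recvfrom() calls exist, but WSAStartup() must be called first and handles are of type SOCKET); it sends one datagram to itself and blocks until it is read back:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one datagram to 127.0.0.1:port and block until it is read back.
 * Mirrors the blocking sendto()/recvfrom() flow described above.
 * Returns the number of bytes received, or -1 on error. */
int udp_loopback(unsigned short port, const char *msg, char *buf, int buflen)
{
    int rx = socket(AF_INET, SOCK_DGRAM, 0);  /* SOCK_DGRAM selects UDP */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    int n = -1;

    if (rx < 0 || tx < 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;

    sendto(tx, msg, strlen(msg), 0, (struct sockaddr *)&addr, sizeof addr);

    /* recvfrom() blocks here; if the datagram were lost, the program
     * would stall forever -- exactly the lock-up described above. */
    n = (int)recvfrom(rx, buf, (size_t)buflen, 0, NULL, NULL);

done:
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return n;
}
```

Because the datagram never leaves the local host it cannot be lost here; on a real network the unguarded recvfrom() is the flaw that motivated the move to asynchronous sockets.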

6.3.3 Non-blocking Sockets

The final implementation uses non-blocking asynchronous sockets to control the

sending and receiving of UDP data. Asynchronous sockets allow the program to

employ Windows event handling. The WSAAsyncSelect() function is used to tell the

socket to trigger an event on the current window whenever a receive arrives. This event

is set up to call a specific function using similar code to that used in “Programming

Microsoft Visual C++” [24]. Using a non-blocking receive allows the system to

perform other tasks and not poll the socket waiting for a packet to arrive. The socket

code can be used to receive data whenever it arrives, and the function that is called

operates on that data.

6.4 Protocol Definition

The protocol that the PC must use to keep the streaming going had to match the one

used for the board (see section 5.4.2). This meant that a ‘keep-alive’ packet with the


data ‘5Bh’ must be sent to port 20380 of the board once to start transmission, and at

least once for every 32 packets received. This was implemented by sending the

‘keep-alive’ on every received packet. Although it is not required this often, it will

definitely keep the stream going. All the program has to do is listen on the port that it

sent the original start message on to receive all image packets from the board and

display them.

This protocol was chosen for its simplicity and the fact that it reduces bandwidth on the

network. A timer could also be used to determine when the ‘keep-alive’ messages should

be sent, but this would require more code than the chosen approach, and the implemented

method works just as well.

6.5 Graphical User Interface

The GUI for the device was designed to be as simple as possible, as the hardware

component is much more important than this software design. All the GUI needs is the

ability to select an IP address, and buttons to start and stop the stream. The OpenPTC

console acts as another window and is created at start-up. Therefore, the start and stop

buttons affect this window. Due to the power of OpenPTC, it is easy to resize the

console and still get the image at whatever window size is required. The GUI for the

design is shown in Figure 8.

Figure 8: GUI for PC program


6.6 Summary

The PC program discussed in this chapter was produced to demonstrate the output of

the video streaming. This was done for a Windows environment, but similar programs

can also be created for Unix, DOS and other environments. This program is limited as

it was designed simply to display the output of the hardware board, although issues

forced it to become more complicated than initially planned.

The use of a free library package to control the imaging was extremely helpful to ensure

the code is small. This design would have taken a lot more time to create if OpenPTC

was not available. There are also versions for Unix, DOS, etc. that could be used for

other versions of the PC program. Appendix C shows the main source file for this code.


Chapter 7 – Design Evaluation

This chapter discusses the final implementation’s results and how well they compare to

other streaming methods. Topics discussed are the video quality and network usage.

The implementation data for the final design can be viewed in Appendix A.

7.1 Streaming Results

The board's streaming performs rather well, but needs some work. The limitations of

a 10Mb/s network interface mean that raw data can only be sent with a very lossy

compression algorithm.

7.1.1 Image Quality

The streamed image is recognisable, but quite blocky. This is due to the

large lossy compression ratio used. Attempts were made to increase the quality of the

streamed image, but the bandwidth limitations prevented further improvement.

As Chapter 8 will discuss, there are many upgrades that can be made to the

program that would allow a much cleaner, less lossy image to be displayed.

Although the image quality is not perfect, it is sufficient to prove that hardware methods for

live video are possible, and could easily surpass software methods in the next few years.

The refresh rate of 12.5Hz is nearly undetectable, and this is much better than some

software methods that achieve only a few frames per second. A dedicated

computer with a high bandwidth network card, a good network connection and a very

good video card is required to get really high quality results at a high refresh rate.

7.1.2 Network Issues

There is one rare bug remaining in the final design. If one IP fragment is lost, a buffer

on the PC is wasted while the system waits for the packet to timeout before clearing the

buffer. After a long time of streaming, these partial packets start filling up all the

buffers on the system. This causes the system to start to lag, and the PC may not send

enough packets to keep the connection with the board alive. In this situation, streaming

stops and the start button must be pressed again.


Attempts have been made to bypass this error, but no solution that works perfectly has

been devised. Lowering packet size decreases this effect, but sacrifices must be made

in image quality. Some alternative solutions that would alleviate this problem are also

discussed in Chapter 8.

Apart from this bug, communication through the network is fine on reliable networks.

Unfortunately, due to the large UDP packet size, the system will not work very well at

all on an unreliable network, as every fragment of a packet must arrive for the viewing

program to display it.

7.2 Comparisons

Although this design may not be better than other commercially available products, it

does almost match up in most respects. Against software methods, the board is

adequate in some areas, but severely lacking in others. To achieve good live streaming

using a computer, the requirements are a Pentium III processor, good video card and

good network card. This would end up costing more than the board, but the quality of

the video would be much better in software. Such a system could match the refresh rates

of the board with less network bandwidth and clearer, more heavily compressed images.

The MidStream server introduced in section 2.3.1 provides capability to work on

Gigabit networks and comes with hard drives to store the data. It can accept multiple

connections at once and provides a much better quality than the thesis implementation.

It does not, however, do live streaming of RCA or S-Video data; the streaming data

must already exist on the drives in the hardware. This design also uses multiple

Virtex-II FPGAs in its construction. These FPGAs can provide much better digital signal

processing algorithms for tasks such as encryption. This server shows what the thesis

set out to do in the first place: provide a pure hardware solution which could match

software.

The Axis Webcams and video servers described in section 2.3.2 can also outperform the

thesis implementation in image quality, although they are also very expensive. The

main addition that allows this is compression which decreases the size of images. The

2100 webcam can only stream at rates up to 10 frames/second though, which is not as


good as the thesis board which can reach 25 frames/second at low resolution. The

webcam can also only output the images it sees; it can’t convert the output of a TV or a

game console like the thesis implementation does. By adding a proper compression

algorithm to the implementation, it may be possible to increase the board’s

specifications to match those of the other servers and cameras.

These comparisons may seem to say that this implementation is not at the level of other

methods but, as the MidStream server shows, hardware versions of streaming will

slowly become cheaper and better than the computer-based software methods. The

thesis implementation has provided a new method of implementing live video streaming

that could lead to a range of low cost products that implement this task in the future.

7.3 Process Evaluation

The performance of the final design has been discussed, but what of the process that it

took to get to the final design? This section evaluates the work methods used and

experience learned during the process of the thesis.

The methods used were mostly correct, and they have produced a final design which

meets all the specifications and has the ability to exceed them with some

extra work. The one design decision that was incorrect was to give up on compression

too early. The final image appears distorted and unclear as the sampling used is

extremely lossy. Some more time dedicated to this process would produce an image

with a much better quality. Apart from this, the other decisions made all seem correct

and helped to produce the final design. The VHDL and PC implementations give a

complete solution to the specified problem.

Much experience was gained through the process of completing the thesis design.

Many design methods were learnt, like the instantiation of block RAMs and ROMs as

described in section 5.1.1.

7.4 Summary

Considering these evaluations, it is plausible to say that the final design could be classed

as a success, but it is not the best result. The design has met all the goals defined in the


problem definition, and has proven that hardware streaming is possible, and should

improve to a point where hardware-based designs outperform software-based designs.

The experience gained through applying correct methods during the design will also aid

in future work. Several improvements to the final design were desired, but limited time

stopped them from being implemented. These improvements are presented in the next

chapter.


Chapter 8 – Future Developments

The final product was complete within the limitations of this thesis, but there are many

other upgrades that would make it into a viable product. Time constraints prevented

these upgrades, but the method for each upgrade is straightforward. This

chapter discusses each of the improvements that could be made, and details how much

they would improve the quality of the final design.

8.1 Image Format Changes

The current method only uses 8-bit colour data so there are only 256 possible colours.

This was chosen to keep packet sizes down. Higher colour densities can be achieved by

decreasing the resolution or frame rate so a trade off must be made. The maximum data

that can currently be sent must be less than 10Mb/s as this is all the network code can

handle. Ignoring the overhead of the network, the data rate can be found with the

following formula:

Data rate = Hori. Resolution × Vert. Resolution × Refresh Rate × Colour Depth (in bits)

The method that was chosen uses 360×144×12.5Hz with 8-bit colour depth, which

comes out to be about 5.184Mb/s. It is also possible to interleave the vertical lines, only

sending either the even or odd half per packet. It is easy to alter these values by a factor

of 2 so that the image formats shown in Table 5 can be used instead if desired. All

these formats result in a 5.184Mb/s data rate.

Table 5: Some alternate image formats

Horizontal    Vertical      Refresh    Colour    Interleaved?
Resolution    Resolution    Rate       Depth
360           144           12.5Hz     8-bit     No
180           144           25Hz       8-bit     No
180           144           12.5Hz     16-bit    No
360           288           12.5Hz     8-bit     Yes
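As a quick check of the formula, every format in Table 5 multiplies out to the same 5.184Mb/s budget. A small sketch (the function name is illustrative, not from the thesis):

```c
/* Data rate (bits/s) for an image format, per the formula above:
 * horizontal x vertical x refresh x colour depth, halved when the
 * vertical lines are interleaved (only half the lines are sent). */
double data_rate_bps(int hori, int vert, double refresh_hz,
                     int depth_bits, int interleaved)
{
    double rate = (double)hori * vert * refresh_hz * depth_bits;
    return interleaved ? rate / 2.0 : rate;
}
```

For example, the chosen format gives 360 × 144 × 12.5 × 8 = 5,184,000 bits/s, and halving the resolution while doubling the refresh rate or colour depth leaves the budget unchanged.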

8.2 100Mb/s Upgrade

The low image quality was mainly due to the network data rate being only

10Mb/s. The original network stack code that was provided only allowed this speed and


not the 100Mb/s speed common for most networks these days. The Ethernet PHY chip

is capable of running at 100Mb/s and, by altering the stack, it is possible to achieve this

data rate.

If this upgrade were made, it would be possible to send more data per image or use a

higher colour depth, therefore greatly improving the resulting video streaming quality.

The main difficulty in converting to a 100Mb/s design is that a nibble of data is required

every two clock cycles, whereas previously there were 20 cycles per nibble. This means much of

the design needs to be altered to allow 100Mb/s functionality to occur. The following

sections explain how these alterations should be performed.

8.2.1 16-bit RAM Functionality

In order to achieve the 100Mb/s control signal rate, more RAM must be read at one

time. The RAM is not fast enough to read or write only one byte at a time, as the

network chip requires the data faster at the new clock rate. To fix this problem, the

entire stack needs to be redesigned to read and write 16-bit data rather than 8-bit data.

The RAM code provided will accept 16-bit reads and writes, and most of the other fixes

to the code are relatively minor.

8.2.2 CRC Alteration

CRC29 is used in the Ethernet layer to check the integrity of packets transmitted over the

network. A CRC generator is required in the design for both incoming and outgoing

packets. This CRC generator is designed to accept one byte at a time and takes eight

clock cycles before it can accept another, i.e. 1 cycle per bit. At 100Mb/s, it must

handle a byte in every four cycles, so it would need to be redesigned. The best way to

do this is to use a look-up table in block ROM that stores the CRC generator vector

[25]. Using a nibble to index the table would require a 16×32-bit table that will allow

four bits to be encoded simultaneously, while minimizing the amount of block ROM

required to perform the alteration.

29 Cyclic Redundancy Check
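The nibble-indexed table can be prototyped in software before committing it to block ROM. The sketch below uses the standard Ethernet CRC-32 polynomial with a 16-entry table, consuming four bits per step just as the proposed hardware would:

```c
#include <stddef.h>
#include <stdint.h>

/* Software prototype of the proposed nibble-at-a-time CRC-32.  A
 * 16-entry table (one entry per 4-bit index) plays the role of the
 * 16x32-bit block ROM; each step consumes four bits, matching the
 * one-nibble-per-step rate the 100Mb/s design requires. */
uint32_t crc32_nibble(const uint8_t *data, size_t len)
{
    /* Reflected Ethernet polynomial: 0x04C11DB7 -> 0xEDB88320. */
    static uint32_t table[16];
    static int ready = 0;
    uint32_t crc = 0xFFFFFFFFu;

    if (!ready) {                       /* build the 16-entry "ROM" once */
        for (uint32_t n = 0; n < 16; n++) {
            uint32_t c = n;
            for (int k = 0; k < 4; k++)
                c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : (c >> 1);
            table[n] = c;
        }
        ready = 1;
    }

    while (len--) {
        uint8_t b = *data++;
        crc = (crc >> 4) ^ table[(crc ^ b) & 0x0Fu];        /* low nibble  */
        crc = (crc >> 4) ^ table[(crc ^ (b >> 4)) & 0x0Fu]; /* high nibble */
    }
    return crc ^ 0xFFFFFFFFu;           /* final inversion, as in Ethernet */
}
```

The result matches the standard byte-at-a-time Ethernet CRC-32, so the hardware table can be verified against any existing CRC-32 implementation.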


8.3 Fragment the UDP Packet

Streaming quality will be increased on non-reliable networks by segmenting the UDP

packet into multiple smaller UDP packets rather than relying on IP fragmentation.

This segmentation can alleviate

the problem described in section 7.1.2. If a packet is lost with this method, a buffer is

not left partially filled and there is no timeout while waiting for an entire image to

arrive, unlike in the implemented version. Only part of the image is lost, and that part will not update until

the next refresh cycle.

Segmentation is accomplished by reducing the size of UDP packets so that they fit

within the maximum packet size. The boundary can be placed on the end of a line so

that, for example, 4 lines can be transmitted per packet. This would mean that a line

width of 360 will give a packet size of 360×4 + 8 = 1448 bytes. One more byte would

be needed to indicate which set of lines this packet includes so that the PC side knows

where to place the lines on the OpenPTC surface.
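This packet layout can be sketched as follows. The one-byte line-group header is hypothetical — the thesis only notes that such a byte is needed — and the payload comes to 1 + 360×4 = 1441 bytes, or 1449 with the 8-byte UDP header:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the proposed per-line UDP segmentation.  Each payload
 * carries one index byte naming the group of lines it holds, followed
 * by LINES_PER_PACKET complete lines, so a lost packet costs only
 * those lines rather than a whole image.  The exact header layout is
 * hypothetical. */
#define LINE_WIDTH       360
#define LINES_PER_PACKET 4
#define PAYLOAD_SIZE     (1 + LINE_WIDTH * LINES_PER_PACKET)  /* 1441 */

/* Copy line group 'group' of 'image' (image_lines lines in total) into
 * 'payload'.  Returns the payload size, or 0 if the group index is out
 * of range. */
size_t build_line_packet(const uint8_t *image, int image_lines,
                         int group, uint8_t *payload)
{
    int first_line = group * LINES_PER_PACKET;
    if (group < 0 || first_line + LINES_PER_PACKET > image_lines)
        return 0;
    payload[0] = (uint8_t)group;  /* receiver uses this to place lines */
    memcpy(payload + 1, image + (size_t)first_line * LINE_WIDTH,
           (size_t)LINE_WIDTH * LINES_PER_PACKET);
    return PAYLOAD_SIZE;
}
```

A 360×144 image splits into 36 such packets; the receiver copies each group straight onto the OpenPTC surface at the row given by the index byte.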

This method would eliminate any problem of filling up the receive buffers, although

implementing it is difficult. Implementing this as part of the UDP layer is possible, but

tricky due to the amount of data that needs to be moved around to create each UDP

packet. It is much easier to alter the IP fragmenter to tack on the UDP header to any

outgoing fragments. This is fine as long as the board only ever sends UDP

packets. Unfortunately, it also has to recognise ICMP packets and not add the header to

these.

8.4 Image Compression

Another addition that would be useful is some sort of compression algorithm for the

images being transmitted. Due to the way the image is stored in RAM, there are only a

few options for compression. To be able to maintain live streaming, it is impossible to

do two-dimensional compression. It is possible to remember the last few values, so

some sort of purely horizontal line compression algorithm could be implemented and

still retain the live streaming. This will lower the amount of data sent per image, so it is

possible to increase the image size and not flood the network with data.
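One candidate for such a purely horizontal scheme is simple run-length encoding of each line, which needs only the previous pixel value to be remembered. The thesis does not commit to a specific algorithm; this is a hypothetical sketch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical run-length encoder for one image line -- an example of
 * the "purely horizontal" compression the text suggests.  Output is a
 * stream of (count, value) byte pairs, with runs capped at 255.  Only
 * the current and previous pixel values are needed, so it suits a
 * line-at-a-time hardware pipeline. */
size_t rle_encode_line(const uint8_t *line, size_t width, uint8_t *out)
{
    size_t n = 0;
    size_t i = 0;
    while (i < width) {
        uint8_t value = line[i];
        uint8_t count = 1;
        while (i + count < width && line[i + count] == value && count < 255)
            count++;
        out[n++] = count;   /* run length */
        out[n++] = value;   /* pixel value */
        i += count;
    }
    return n;               /* bytes written; worst case 2 x width */
}
```

On blocky 8-bit images with long runs of identical pixels this compresses well, though the worst case doubles the line size, so the packet format would need to allow for that.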


8.5 Audio Streaming

Included in the work mentioned in section 2.1 was an application to use the audio input

port on the board to record audio data and play it back. Part of this design could be

utilised to add audio data to outgoing packets. This would be easy to add on the board

side of the communication, but a way to play the music must be found at the

destination. Many software streams use audio, and it would be a useful addition to the

project.

8.6 Summary

The improvements mentioned in this chapter would all make the design work faster or

better than it currently performs. With a better compression algorithm and faster

network interface, the design could match its current software counterparts. Methods

for making all the improvements have been given; they would have been implemented

had time constraints not prevented it.


Chapter 9 – Conclusion

This thesis has presented an overview of the design and development of a hardware

implementation of live video streaming. This device performs streaming without

requiring a computer for decoding and transmitting the images.

A device capable of providing proof that video streaming in hardware is possible and

could be extended to overtake software in the market has been specified, designed and

implemented. An evaluation of the final design implementation is also performed to

demonstrate that it achieved its expectations. The project can be considered a success

due to the level of the final design.

Although the final implementation was complete, there are several areas of the design

that could be improved upon. These are listed, and possible methods for achieving

these improvements are suggested. This design proves that hardware implementations

of live video streaming are possible and will probably improve to a level to match

software techniques in future years.


References

[1] XESS Corp., XESS Corporation home page, http://www.xess.com (current 15 Oct, 2001).

[2] Brennan J., Partis A. and Peddersen J., VHDL XSV Board Interface Projects, http://www.itee.uq.edu.au/~peters/xsvboard (current 15 Oct, 2001).

[3] Xilinx, Xilinx Enables Breakthrough Video Streaming Technology in New Server from MidStream, http://www.xilinx.com/prs_rls/0190MidStream.html (current 15 Oct, 2001).

[4] MidStream, MidStream Technologies, http://www.MidStream.com (current 15 Oct, 2001).

[5] Axis, Axis Communications, http://www.axis.com (current 15 Oct, 2001).

[6] Webcam Solutions, Webcam Solutions Price List, http://www.webcamsolutions.com.au/Pricing.htm (current 15 Oct, 2001).

[7] Riley L., Motion JPEG Video/Still Image CODEC, http://www.4i2i.com/JPEG_Core.htm (current 15 Oct, 2001).

[8] Mohor I., Mahmud G. and Novan H., Ethernet MAC 10/100 Mbps, http://www.OpenCores.org/cores/ethmac/ (current 15 Oct, 2001).

[9] RFC 793, Transmission Control Protocol, University of Southern California, 1981.

[10] RFC 768, User Datagram Protocol, University of Southern California, 1980.

[11] XESS Corp., XSV-300 Virtex Prototyping Board, http://www.xess.com/prod014_3.php3 (current 15 Oct, 2001).

[12] Xilinx, Xilinx Home : Products : Devices : Virtex Series, http://www.xilinx.com/xlnx/xil_prodcat_landingpage.jsp?title=Virtex_Series (current 15 Oct, 2001).

[13] Xilinx, XC95108 In-System Programmable CPLD, http://www.xilinx.com/partinfo/95108.pdf (current 15 Oct, 2001).

[14] Philips, Philips Semiconductor; SAA7113H; 9-bit video input processor, http://www-us7.semiconductors.philips.com/pip/SAA7113H (current 15 Oct, 2001).

[15] Digital Video Coding, Digital Video Coding: Digital Video, http://umi.eee.rgu.ac.uk/umi/digvid/digvid.html (current 15 Oct, 2001).

[16] Dragon, Product Lines, http://www.dragonhk.com/products/intel/features/LXT970A.htm (current 15 Oct, 2001).

[17] Alliance Semiconductor, 5V/3.3V 512K × 8 CMOS SRAM, http://www.gaw.ru/doc/Alliance/as7c34096.pdf (current 15 Oct, 2001).

[18] Intel, 5 Volt FlashFile™ Memory; 28F004S5, 28F008S5, 28F016S5 (x8), http://developer.intel.com/design/flcomp/datashts/290597.htm (current 15 Oct, 2001).

[19] Philips, Philips Semiconductors; I2C, http://www.semiconductors.philips.com/buses/i2c/ (current 15 Oct, 2001).

[20] RFC 791, Internet Protocol, University of Southern California, 1981.

[21] XESS Corp., Example Designs, Tutorials, Application Notes, http://www.xess.com/ho03000.html#Examples (current 15 Oct, 2001).

[22] OpenPTC, OpenPTC, http://www.gaffer.org/ptc (current 15 Oct, 2001).

[23] Fiedler, G., OpenPTC for Windows, http://www.gaffer.org/ptc/distributions/Windows/index.html (current 15 Oct, 2001).

[24] Kruglinski, D., Programming Microsoft Visual C++, Microsoft Press, Redmond, Wash., 1998.

[25] Modicon, LRC/CRC Generation, http://www.modicon.com/techpubs/crc7.html (current 15 Oct, 2001).


Appendix A – Implementation Data

Included here is some of the implementation data for the design. These reports are

segments of the reports generated by Foundation during implementation.

Design Summary
--------------
 Number of errors:    0
 Number of warnings: 33
 Number of Slices:                       1,761 out of 3,072    57%
 Number of Slices containing
   unrelated logic:                          0 out of 1,761     0%
 Number of Slice Flip Flops:             1,538 out of 6,144    25%
 Total Number 4 input LUTs:              2,323 out of 6,144    37%
   Number used as LUTs:                  2,298
   Number used as a route-thru:             25
 Number of bonded IOBs:                     81 out of   166    48%
 Number of Block RAMs:                       1 out of    16     6%
 Number of GCLKs:                            4 out of     4   100%
 Number of GCLKIOBs:                         4 out of     4   100%
Total equivalent gate count for design: 45,896
Additional JTAG gate count for IOBs:     4,080

Timing summary:
---------------

Timing errors: 547  Score: 3115439

Constraints cover 2633556 paths, 0 nets, and 10578 connections (96.5% coverage)

Design statistics:
 Minimum period: 34.683ns  (Maximum frequency: 28.833MHz)
 Maximum path delay from/to any node: 14.813ns
 Minimum input arrival time before clock: 17.300ns
 Minimum output required time after clock: 20.523ns


Appendix B – Partial VHDL Source Code

Included here is the main file of the VHDL source code that performs all the video

decoding and network interface code. This version performs 180×144×25Hz video.

SAA7113.vhd :

--------------------------------------------------------------------------------- saa7113.vhd---- Author(s): Jorgen Peddersen-- Created: Jan 2001-- Last Modified: Sep 2001---------------------------------------------------------------------------------library ieee;use ieee.std_logic_1164.all;use ieee.std_logic_unsigned.all;use work.global_constants.all;

entity saa7113 isport(

clk: in std_logic; -- 50 MHz clockrstn: in std_logic;stop: in std_logic;scl: inout std_logic; -- bidirectional I2C clock signalsda: inout std_logic; -- bidirectional I2C data signalllck: in std_logic; -- SAA7113 video clock (27 MHz)vpo: in std_logic_vector(7 downto 0); -- data from SAA7113rts: in std_logic_vector(1 downto 0); -- real-time video statuscomplete: in std_logic;datagramSent: in std_logic;bar: out std_logic_vector(6 downto 2); -- status displaysrdRAM: out std_logic;rdAddr: out std_logic_vector(18 downto 0);rdData: in std_logic_vector(7 downto 0);rdComplete: in std_logic;newDatagram: in std_logic;protocolIn: in std_logic_vector(7 downto 0);sourceIP: in std_logic_vector(31 downto 0);wrRAM : out std_logic;wrData: out std_logic_vector(7 downto 0);wrAddr: out std_logic_vector(18 downto 0);sendDatagram: out std_logic;datagramSize: out std_logic_vector(15 downto 0);destinationIP: out std_logic_vector(31 downto 0)

);end saa7113;

architecture saa7113_arch of saa7113 is

component i2c isport (

clk: in STD_LOGIC;rstn: in STD_LOGIC;go: in STD_LOGIC;done: out STD_LOGIC;sda: out STD_LOGIC;scl: out STD_LOGIC

);end component;

-- states for the SAA7113 interface circuittype VIDEO_STATE_TYPE is (

stIdle,stWaitForEscape,stCheckEscape1,

Page 58: Thesis

Appendix B –Partial VHDL Source Code by Jorgen Peddersen

B-2

stCheckEscape2,stCheckForNewPage,stCheckForFirstLine,stChromaBlue,stLumaBlue,stChromaRed,stLumaRed,stCheckForEndLine,stCheckForNewLine,stError,stNew,stError2

);

type CONTROL_STATE_TYPE is (
    stResetC, stWrHeaderC, stIdleC, stGoC,
    stWriteFrameC, stWriteImageC, stSendFrameC, stSyncC
);
signal presStateC: CONTROL_STATE_TYPE;
signal nextStateC: CONTROL_STATE_TYPE;

type READ_STATE_TYPE is (stIdleR, stReadR, stSignalR);
signal presStateR: READ_STATE_TYPE;
signal nextStateR: READ_STATE_TYPE;

-- video in state signals
signal presState: VIDEO_STATE_TYPE;
signal nextState: VIDEO_STATE_TYPE;
signal returnState: VIDEO_STATE_TYPE;
signal nextReturnState: VIDEO_STATE_TYPE;

signal bclk: std_logic;    -- buffered version of the clock signal
signal gnd_net: std_logic; -- ground signal for connecting to unused ports

signal capture: STD_LOGIC; -- assert to capture a frame
signal error: STD_LOGIC;   -- is high while there is an error in decode
signal sda_out, sda_in, scl_out, scl_in: std_logic; -- internal I2C interface signals

signal grab_addr, disp_addr: std_logic_vector(18 downto 0); -- RAM addresses
signal grab_cntr_hori: std_logic_vector(9 downto 0); -- horizontal write counter
signal grab_cntr_vert: std_logic_vector(8 downto 0); -- vertical write counter
signal clr_grab_cntr: STD_LOGIC; -- clear both grab counters
signal inc_grab_hori: STD_LOGIC; -- increment horizontal counter for each 2 pixels
signal inc_grab_vert: STD_LOGIC; -- increment vert. counter and clear hori. counter

signal field: STD_LOGIC;     -- remember which field we are operating on
signal nextField: STD_LOGIC; -- signal to remember the field

signal write: STD_LOGIC;    -- non 50MHz signal to tell RAM to write
signal grab: std_logic;     -- assert to write current data when ready
signal nextGrab: STD_LOGIC; -- delays write signal by one clock cycle

signal vpoLatch: std_logic_vector(7 downto 0); -- synchronise data on the vpo bus to LLCK

-- signals to store and remember luminance and chrominance values
signal luminanceB: std_logic_vector(7 downto 0);
signal luminanceR: std_logic_vector(7 downto 0);
signal chrominanceB: STD_LOGIC_VECTOR(7 downto 0);
signal chrominanceR: STD_LOGIC_VECTOR(7 downto 0);
signal nextLuminanceB: std_logic_vector(7 downto 0);
signal nextLuminanceR: std_logic_vector(7 downto 0);
signal nextChrominanceB: STD_LOGIC_VECTOR(7 downto 0);
signal nextChrominanceR: STD_LOGIC_VECTOR(7 downto 0);

-- colour conversion signals from YUV to RGB
signal red: STD_LOGIC_VECTOR(17 downto 0);
signal green: STD_LOGIC_VECTOR(17 downto 0);
signal blue: STD_LOGIC_VECTOR(17 downto 0);
signal colour: STD_LOGIC_VECTOR(14 downto 0);
signal colour8: STD_LOGIC_VECTOR(7 downto 0);

signal divclk: STD_LOGIC;
signal divclkcnt: STD_LOGIC_VECTOR(4 downto 0);

signal vidindone: STD_LOGIC;

signal colourLatch: STD_LOGIC_VECTOR(7 downto 0);
signal nextColourLatch: STD_LOGIC_VECTOR(7 downto 0);

signal datagramReady: STD_LOGIC;

signal sdaint: STD_LOGIC;
signal sclint: STD_LOGIC;

signal captureLatch: STD_LOGIC;
signal sw: STD_LOGIC;
signal swLatch: STD_LOGIC;
signal busy: STD_LOGIC;
signal busyLatch: STD_LOGIC;
signal grabLatch: STD_LOGIC;
signal grabLatch2: STD_LOGIC;
signal i2cgo: STD_LOGIC;

signal wrCnt: STD_LOGIC_VECTOR(15 downto 0);
signal clrWrCnt: STD_LOGIC;
signal incWrCnt: STD_LOGIC;
constant FRAME_SIZE: STD_LOGIC_VECTOR(15 downto 0) := x"6548";
constant VIDEO_PORT: STD_LOGIC_VECTOR(15 downto 0) := x"4F9C";
-- NOTE: VIDEO_PORT is used below but its declaration is not part of the
-- original excerpt; x"4F9C" (UDP port 20380) matches the PC application.

signal rdCnt: STD_LOGIC_VECTOR(3 downto 0);
signal clrRdCnt: STD_LOGIC;
signal incRdCnt: STD_LOGIC;

signal LatchluminanceB: STD_LOGIC_VECTOR(7 downto 0);
signal Latchgrab_cntr_hori: STD_LOGIC_VECTOR(9 downto 0);
signal Latchgrab_cntr_vert: STD_LOGIC_VECTOR(8 downto 0);

signal destinationPortLatch: STD_LOGIC_VECTOR(15 downto 0);
signal sourcePortLatch: STD_LOGIC_VECTOR(15 downto 0);
signal latchSourcePort: STD_LOGIC;
signal latchSourceIP: STD_LOGIC;
signal latchDestinationData: STD_LOGIC;
signal sourceIPLatch: STD_LOGIC_VECTOR(31 downto 0);

signal keepAlive: STD_LOGIC;
signal stopTimer: STD_LOGIC;

signal timerCount: STD_LOGIC_VECTOR(4 downto 0);
signal sendDatagramInt: STD_LOGIC;

begin

-- divide the 50 MHz clock down for the I2C interface
process(clk, rstn)
begin
    if rstn = '0' then
        divclkcnt <= (others => '0');
    elsif clk'event and clk = '1' then
        divclkcnt <= divclkcnt + 1;
    end if;
end process;
divclk <= divclkcnt(4);

vidi2c: i2c port map(
    clk => divclk,
    rstn => rstn,
    go => i2cgo,
    done => vidindone,
    sda => sda,
    scl => sclint
);
scl <= '0' when sclint = '0' else 'Z';

sendDatagram <= sendDatagramInt;

gnd_net <= '0';      -- ground signal for unused inputs in components
bar(4) <= vidindone; -- shows when capture is occurring or not

-- state machine which decodes the data from the decoder chip
process(llck, rstn)
begin
    if rstn = '0' then -- reset signals asynchronously
        presState <= stIdle;
        grab <= '0';
        returnState <= stIdle;
        field <= '0';
    elsif llck'event and llck = '1' then -- processes are clocked by llck
        captureLatch <= capture;
        vpoLatch <= vpo;        -- synchronise asynchronous data
        presState <= nextState; -- go to next state
        grab <= nextGrab;       -- delay so colour can be calculated
        returnState <= nextReturnState;
        field <= nextField;
        chrominanceR <= nextChrominanceR;
        chrominanceB <= nextChrominanceB;
        luminanceR <= nextLuminanceR;
        luminanceB <= nextLuminanceB;
        -- operate on the grab counters for vert. and hori. movement
        if clr_grab_cntr = '1' then
            grab_cntr_hori <= (others => '0');
            grab_cntr_vert <= (others => '0');
        else
            if inc_grab_hori = '1' then
                grab_cntr_hori <= grab_cntr_hori + 1;
            end if;
            if inc_grab_vert = '1' then
                grab_cntr_vert <= grab_cntr_vert + 1;
                grab_cntr_hori <= (others => '0'); -- clear horizontal counter with each new line
            end if;
        end if;
    end if;
end process;

process(presState, vpoLatch, returnState, field, luminanceB, luminanceR,
        chrominanceB, chrominanceR, captureLatch)
begin
    -- default signal values
    clr_grab_cntr <= '0';
    inc_grab_hori <= '0';
    inc_grab_vert <= '0';
    nextGrab <= '0';
    nextReturnState <= returnState;
    nextField <= field;
    nextLuminanceB <= luminanceB;
    nextLuminanceR <= luminanceR;
    nextChrominanceB <= chrominanceB;
    nextChrominanceR <= chrominanceR;
    error <= '0';
    datagramReady <= '1';

    busy <= '1';
    bar(5) <= '0';
    bar(6) <= '0';
    case presState is
        when stIdle =>
            busy <= '0';
            -- wait until capture is asserted then write a frame
            if captureLatch = '1' then
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForNewPage;
            else
                nextState <= stIdle;
                -- display_frame <= '1';
            end if;

        -- The following three states form a sort of subroutine.
        when stWaitForEscape =>
            bar(5) <= '1';
            bar(6) <= '1';
            -- Look for the first character in the sequence
            if vpoLatch = X"FF" then
                nextState <= stCheckEscape1;
            else
                nextState <= stWaitForEscape;
            end if;

        when stCheckEscape1 =>
            -- Second character in the escape sequence is 0.
            if vpoLatch = X"00" then
                nextState <= stCheckEscape2;
            else
                nextState <= stError;
            end if;

        when stCheckEscape2 =>
            -- Third character in the escape sequence is 0.
            if vpoLatch = X"00" then
                nextState <= returnState;
            else
                nextState <= stError;
            end if;

        when stCheckForNewPage =>
            -- Wait for an SAV or EAV in field 0 while in blanking
            if vpoLatch(6 downto 5) = "01" then
                -- If it is then wait until the first line
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForFirstLine;
                clr_grab_cntr <= '1'; -- initialise counter
            else
                -- Look for another SAV/EAV
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForNewPage;
            end if;

        when stCheckForFirstLine =>
            -- Wait for an SAV in field 0 while in the active region
            if vpoLatch(6 downto 4) = "000" then
                -- start recording data
                nextState <= stChromaBlue;
                nextField <= '0'; -- initialise field
            else
                -- Look for another SAV/EAV
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForFirstLine;
            end if;

        when stChromaBlue =>
            -- This may be the start of another pair of pixels.
            -- If the byte is FF then it is the start of the EAV.
            if vpoLatch = X"FF" then
                nextState <= stCheckEscape1;
                nextReturnState <= stCheckForEndLine;
            elsif vpoLatch = X"00" then
                nextState <= stError;
            else
                -- latch data into register and continue
                nextState <= stLumaBlue;
                nextChrominanceB <= vpoLatch;
            end if;

        when stLumaBlue =>
            -- As long as valid data is present continue latching data
            if vpoLatch /= X"FF" and vpoLatch /= X"00" then
                nextState <= stChromaRed;
                nextLuminanceB <= vpoLatch;
            else
                nextState <= stError;
            end if;

        when stChromaRed =>
            -- As long as valid data is present continue latching data
            if vpoLatch /= X"FF" and vpoLatch /= X"00" then
                nextState <= stLumaRed;
                nextChrominanceR <= vpoLatch;
            else
                nextState <= stError;
            end if;

        when stLumaRed =>
            if vpoLatch /= X"FF" and vpoLatch /= X"00" then
                nextState <= stChromaBlue;
                nextLuminanceR <= vpoLatch;
                nextGrab <= '1';      -- set up a write
                inc_grab_hori <= '1'; -- increment hori counter
            else
                nextState <= stError;
            end if;

        when stCheckForEndLine =>
            -- Possible conditions here are the end of field 0, end of field 1,
            -- or an EAV code indicating a new line in the active region.
            if vpoLatch(6 downto 4) = "111" then -- end of field 1
                nextState <= stNew;
                datagramReady <= '1';
            elsif vpoLatch(6 downto 4) = "011" then -- end of field 0
                clr_grab_cntr <= '1'; -- reset counter for field 1
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForNewLine;
            elsif vpoLatch(5 downto 4) = "01" then -- end of line
                inc_grab_vert <= '1'; -- go to next line
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForNewLine;
            else -- EAV expected but SAV received
                nextState <= stError;
            end if;

        when stCheckForNewLine =>
            -- Wait until an SAV in the active video range arrives
            if vpoLatch(5 downto 4) = "00" then
                nextState <= stChromaBlue; -- capture next line
                nextField <= vpoLatch(6);
            else
                -- Wait for another code
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForNewLine;
            end if;

        when stError =>
            -- Wait until another capture is requested
            bar(5) <= '1';
            if captureLatch = '1' then
                nextState <= stWaitForEscape;
                nextReturnState <= stCheckForNewPage;
            else
                nextState <= stError;
                error <= '1'; -- indicate error
            end if;

        when stNew =>
            if captureLatch = '1' then
                nextState <= stNew;
            else
                nextState <= stIdle;
            end if;

        when stError2 =>
            bar(6) <= '1';
            nextState <= stError2;

        when others =>
            nextState <= stError2;
    end case;
end process;

process(clk, rstn)
begin
    if rstn = '0' then
        presStateC <= stResetC;
    elsif clk'event and clk = '1' then
        colourLatch <= nextColourLatch;
        presStateC <= nextStateC;
        busyLatch <= busy;
        grabLatch <= grab;
        grabLatch2 <= grabLatch;
        -- swLatch <= sw;
        LatchluminanceB <= luminanceB;
        Latchgrab_cntr_hori <= grab_cntr_hori;
        Latchgrab_cntr_vert <= grab_cntr_vert;
        if clrWrCnt = '1' then
            wrCnt <= (others => '0');
        elsif incWrCnt = '1' then
            wrCnt <= wrCnt + 1;
        end if;
    end if;
end process;

rdAddr <= "001000000000000" & rdCnt;
wrAddr <= "101" & wrCnt;
datagramSize <= FRAME_SIZE;

process(presStateC, wrCnt, complete, swLatch, busyLatch, grabLatch, grabLatch2,
        Latchgrab_cntr_hori, Latchgrab_cntr_vert, LatchluminanceB, vidindone,
        colourLatch, destinationPortLatch, datagramSent, colour8)
begin
    nextColourLatch <= colourLatch;
    clrWrCnt <= '0';
    incWrCnt <= '0';
    wrData <= (others => '0');
    wrRAM <= '0';
    sendDatagramInt <= '0';
    i2cgo <= '1';
    capture <= '0';
    bar(3) <= '0';
    bar(2) <= '0';
    case presStateC is
        when stResetC =>
            clrWrCnt <= '1';
            nextStateC <= stWrHeaderC;
            i2cgo <= '0';

        when stWrHeaderC =>
            i2cgo <= '0';
            -- Write a byte to RAM
            if wrCnt(3 downto 0) = x"8" then
                -- header has been fully written so go to data stage
                nextStateC <= stIdleC;
            elsif complete = '0' then
                case wrCnt(2 downto 0) is
                    when "000" => wrData <= VIDEO_PORT(15 downto 8);
                    when "001" => wrData <= VIDEO_PORT(7 downto 0);
                    when "010" => wrData <= destinationPortLatch(15 downto 8);
                    when "011" => wrData <= destinationPortLatch(7 downto 0);
                    when "100" => wrData <= FRAME_SIZE(15 downto 8);
                    when "101" => wrData <= FRAME_SIZE(7 downto 0);
                    when "110" => wrData <= x"00";
                    when "111" => wrData <= x"00";
                    when others => wrData <= (others => '0');
                end case;
                -- Wait for RAM to acknowledge the write
                nextStateC <= stWrHeaderC;
                wrRAM <= '1';
            else
                -- When it does, increment the counter
                nextStateC <= stWrHeaderC;
                incWrCnt <= '1';

            end if;

        when stIdleC =>
            bar(3) <= '1';
            bar(2) <= '1';
            if swLatch = '1' and vidindone = '1' then
                nextStateC <= stGoC;
            else
                nextStateC <= stIdleC;
            end if;

        when stGoC =>
            bar(3) <= '1';
            capture <= '1';
            if busyLatch = '1' then
                nextStateC <= stWriteFrameC;
            else
                nextStateC <= stGoC;
            end if;

        when stWriteFrameC =>
            bar(2) <= '1';
            if wrCnt = FRAME_SIZE then
                nextStateC <= stSendFrameC;
            elsif grabLatch = '0' and grabLatch2 = '1' and
                  Latchgrab_cntr_hori(0) = '0' and
                  Latchgrab_cntr_vert(0) = '0' then
                nextStateC <= stWriteImageC;
                nextColourLatch <= colour8;
            else
                nextStateC <= stWriteFrameC;
            end if;

        when stWriteImageC =>
            if complete = '0' then
                -- Wait for RAM to acknowledge the write
                nextStateC <= stWriteImageC;
                wrData <= colourLatch;
                wrRAM <= '1';
            else
                nextStateC <= stWriteFrameC;
                incWrCnt <= '1';
            end if;

        when stSendFrameC =>
            sendDatagramInt <= '1';
            nextStateC <= stSyncC;

        when stSyncC =>
            if busyLatch = '1' then
                nextStateC <= stSyncC;
            else
                nextStateC <= stResetC;
            end if;
    end case;
end process;

process(clk, rstn)
begin
    if rstn = '0' then
        presStateR <= stIdleR;
        -- sw <= '0';
        timerCount <= (others => '0');
    elsif clk'event and clk = '1' then
        presStateR <= nextStateR;
        if clrRdCnt = '1' then
            rdCnt <= (others => '0');
        elsif incRdCnt = '1' then
            rdCnt <= rdCnt + 1;
        end if;
        if latchSourceIP = '1' then
            sourceIPLatch <= sourceIP;
        end if;
        if latchSourcePort = '1' then
            sourcePortLatch <= sourcePortLatch(7 downto 0) & rdData;
        end if;
        if latchDestinationData = '1' then
            destinationIP <= sourceIPLatch;
            destinationPortLatch <= sourcePortLatch;
        end if;
        if keepAlive = '1' then
            timerCount <= (others => '1');
        elsif stopTimer = '1' or stop = '1' then
            timerCount <= (others => '0');
        elsif sendDatagramInt = '1' and timerCount /= 0 then
            timerCount <= timerCount - 1;
        end if;
    end if;
end process;

swLatch <= '0' when timerCount = 0 else '1';

process(presStateR, newDatagram, protocolIn, rdCnt, rdComplete, rdData)
begin
    clrRdCnt <= '0';
    incRdCnt <= '0';
    latchSourceIP <= '0';
    rdRAM <= '0';
    latchSourcePort <= '0';
    latchDestinationData <= '0';
    keepAlive <= '0';
    stopTimer <= '0';
    case presStateR is
        when stIdleR =>
            if newDatagram = '1' and protocolIn = 17 then -- 17 = UDP protocol number
                nextStateR <= stReadR;
                clrRdCnt <= '1';
                latchSourceIP <= '1';
            else
                nextStateR <= stIdleR;
            end if;

        when stReadR =>
            if rdCnt = 8 then
                nextStateR <= stSignalR;
            elsif rdComplete = '0' and rdCnt(3 downto 2) = "00" then
                nextStateR <= stReadR;
                rdRAM <= '1';
            else
                incRdCnt <= '1';
                nextStateR <= stReadR;
                if rdCnt(2 downto 0) = "010" and rdData /= VIDEO_PORT(15 downto 8) then
                    nextStateR <= stIdleR;
                end if;
                if rdCnt(2 downto 0) = "011" and rdData /= VIDEO_PORT(7 downto 0) then
                    nextStateR <= stIdleR;
                end if;
                if rdCnt(2 downto 1) = "00" then
                    latchSourcePort <= '1';
                end if;
            end if;

        when stSignalR =>
            if rdComplete = '0' then
                nextStateR <= stSignalR;
                rdRAM <= '1';
            elsif rdData = x"5b" then -- start/keep-alive byte from the PC
                nextStateR <= stIdleR;
                keepAlive <= '1';
                latchDestinationData <= '1';
            else
                nextStateR <= stIdleR;
                stopTimer <= '1';
            end if;

        when others =>
            nextStateR <= stIdleR;
    end case;
end process;

red   <= ("00" & luminanceB & x"00") + (("01" & x"24") * chrominanceR) - ("00" & x"7D00");
blue  <= ("00" & luminanceB & x"00") + (("10" & x"07") * chrominanceB) - ("00" & x"EE80");
green <= ("00" & luminanceB & x"00") + ("00" & x"9200")
         - (("00" & x"65") * chrominanceB) - (("00" & x"95") * chrominanceR);

-- eliminate overflow caused by the calculations above.
-- Comment out the colour set that isn't needed:
-- colour is 15-bit 5:5:5 RGB format; colour8 is 8-bit 3:3:2 RGB format.
with red(17 downto 16) select
--  colour(14 downto 10) <= red(15 downto 11) when "00",
--                          (others => '0') when "11",
--                          (others => '1') when "01",
--                          (others => '0') when others;
    colour8(7 downto 5) <= red(15 downto 13) when "00",
                           (others => '0') when "11",
                           (others => '1') when "01",
                           (others => '0') when others;

with green(17 downto 16) select
--  colour(9 downto 5) <= green(15 downto 11) when "00",
--                        (others => '0') when "11",
--                        (others => '1') when "01",
--                        (others => '0') when others;
    colour8(4 downto 2) <= green(15 downto 13) when "00",
                           (others => '0') when "11",
                           (others => '1') when "01",
                           (others => '0') when others;

with blue(17 downto 16) select
--  colour(4 downto 0) <= blue(15 downto 11) when "00",
--                        (others => '0') when "11",
--                        (others => '1') when "01",
--                        (others => '0') when others;
    colour8(1 downto 0) <= blue(15 downto 14) when "00",
                           (others => '0') when "11",
                           (others => '1') when "01",
                           (others => '0') when others;

end saa7113_arch;


Appendix C – Partial PC Source Code

Included here is the main file of the PC source code, which performs all the network communication, controls the GUI, and updates the images. This version displays 180×144 video at 25 Hz.

ThesisAsyncDlg.cpp:

// ThesisAsyncDlg.cpp : implementation file
//

#include "stdafx.h"
#include "ThesisAsync.h"
#include "ThesisAsyncDlg.h"
#include "ptc.h"

#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif

#define width 180
#define length 144

Console console;
Format format(32, 0x00FF0000, 0x0000FF00, 0x000000FF);
Surface surface(width, length, format);

WSAData wsaData;
SOCKET sd;
sockaddr_in sinBoard;

unsigned char acReadBuffer[30000];
int status = 0;

static int gnWSNotifyMsg = RegisterWindowMessage(__FILE__ ":wsnotify");

/////////////////////////////////////////////////////////////////////////////
// CAboutDlg dialog used for App About

class CAboutDlg : public CDialog
{
public:
    CAboutDlg();

// Dialog Data
    //{{AFX_DATA(CAboutDlg)
    enum { IDD = IDD_ABOUTBOX };
    //}}AFX_DATA

    // ClassWizard generated virtual function overrides
    //{{AFX_VIRTUAL(CAboutDlg)
    protected:
    virtual void DoDataExchange(CDataExchange* pDX); // DDX/DDV support
    //}}AFX_VIRTUAL

// Implementation
protected:
    //{{AFX_MSG(CAboutDlg)
    //}}AFX_MSG
    DECLARE_MESSAGE_MAP()
};

CAboutDlg::CAboutDlg() : CDialog(CAboutDlg::IDD)
{
    //{{AFX_DATA_INIT(CAboutDlg)
    //}}AFX_DATA_INIT
}

void CAboutDlg::DoDataExchange(CDataExchange* pDX)
{
    CDialog::DoDataExchange(pDX);
    //{{AFX_DATA_MAP(CAboutDlg)
    //}}AFX_DATA_MAP
}

BEGIN_MESSAGE_MAP(CAboutDlg, CDialog)
    //{{AFX_MSG_MAP(CAboutDlg)
        // No message handlers
    //}}AFX_MSG_MAP
END_MESSAGE_MAP()

/////////////////////////////////////////////////////////////////////////////
// CThesisAsyncDlg dialog

CThesisAsyncDlg::CThesisAsyncDlg(CWnd* pParent /*=NULL*/)
    : CDialog(CThesisAsyncDlg::IDD, pParent)
{
    //{{AFX_DATA_INIT(CThesisAsyncDlg)
        // NOTE: the ClassWizard will add member initialization here
    //}}AFX_DATA_INIT
    // Note that LoadIcon does not require a subsequent DestroyIcon in Win32
    m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);
}

void CThesisAsyncDlg::DoDataExchange(CDataExchange* pDX)
{
    CDialog::DoDataExchange(pDX);
    //{{AFX_DATA_MAP(CThesisAsyncDlg)
    DDX_Control(pDX, IDC_IPADD, m_IpAddress);
    //}}AFX_DATA_MAP
}

BEGIN_MESSAGE_MAP(CThesisAsyncDlg, CDialog)
    //{{AFX_MSG_MAP(CThesisAsyncDlg)
    ON_WM_SYSCOMMAND()
    ON_WM_PAINT()
    ON_WM_QUERYDRAGICON()
    ON_BN_CLICKED(IDC_STARTBUTTON, OnStartbutton)
    ON_BN_CLICKED(IDC_STOPBUTTON, OnStopbutton)
    //}}AFX_MSG_MAP
    ON_REGISTERED_MESSAGE(gnWSNotifyMsg, OnWinsockNotify)
END_MESSAGE_MAP()

/////////////////////////////////////////////////////////////////////////////
// CThesisAsyncDlg message handlers

BOOL CThesisAsyncDlg::OnInitDialog()
{
    CDialog::OnInitDialog();

    // Add "About..." menu item to system menu.

    // IDM_ABOUTBOX must be in the system command range.
    ASSERT((IDM_ABOUTBOX & 0xFFF0) == IDM_ABOUTBOX);
    ASSERT(IDM_ABOUTBOX < 0xF000);

    CMenu* pSysMenu = GetSystemMenu(FALSE);
    if (pSysMenu != NULL)
    {
        CString strAboutMenu;
        strAboutMenu.LoadString(IDS_ABOUTBOX);
        if (!strAboutMenu.IsEmpty())
        {
            pSysMenu->AppendMenu(MF_SEPARATOR);
            pSysMenu->AppendMenu(MF_STRING, IDM_ABOUTBOX, strAboutMenu);
        }
    }

    // Set the icon for this dialog. The framework does this automatically
    // when the application's main window is not a dialog
    SetIcon(m_hIcon, TRUE);  // Set big icon
    SetIcon(m_hIcon, FALSE); // Set small icon

    // Open the PTC output window and set up the UDP socket
    console.option("windowed output");
    console.open("FPGA Streaming", format);
    if ((WSAStartup(MAKEWORD(1, 1), &wsaData)) != 0) {
        MessageBox("WSAStartup error");
    }
    sd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sd != INVALID_SOCKET) {
        sockaddr_in sinInterface;
        sinInterface.sin_family = AF_INET;
        sinInterface.sin_addr.s_addr = htonl(INADDR_ANY);
        sinInterface.sin_port = htons(0);
        bind(sd, (sockaddr*)&sinInterface, sizeof(sockaddr_in));
    }
    WSAAsyncSelect(sd, m_hWnd, gnWSNotifyMsg, FD_READ);
    sinBoard.sin_family = AF_INET;
    sinBoard.sin_port = htons(20380);
    m_IpAddress.SetAddress(130, 102, 75, 192);
    unsigned int rcv_size = 30000;
    setsockopt(sd, SOL_SOCKET, SO_RCVBUF, (char*)&rcv_size, sizeof(rcv_size));

    return TRUE; // return TRUE unless you set the focus to a control
}

void CThesisAsyncDlg::OnSysCommand(UINT nID, LPARAM lParam)
{
    if ((nID & 0xFFF0) == IDM_ABOUTBOX)
    {
        CAboutDlg dlgAbout;
        dlgAbout.DoModal();
    }
    else
    {
        CDialog::OnSysCommand(nID, lParam);
    }
}

// If you add a minimize button to your dialog, you will need the code below
// to draw the icon. For MFC applications using the document/view model,
// this is automatically done for you by the framework.

void CThesisAsyncDlg::OnPaint()
{
    if (IsIconic())
    {
        CPaintDC dc(this); // device context for painting

        SendMessage(WM_ICONERASEBKGND, (WPARAM) dc.GetSafeHdc(), 0);

        // Center icon in client rectangle
        int cxIcon = GetSystemMetrics(SM_CXICON);
        int cyIcon = GetSystemMetrics(SM_CYICON);
        CRect rect;
        GetClientRect(&rect);
        int x = (rect.Width() - cxIcon + 1) / 2;
        int y = (rect.Height() - cyIcon + 1) / 2;

        // Draw the icon
        dc.DrawIcon(x, y, m_hIcon);
    }
    else
    {
        CDialog::OnPaint();
    }
}

// The system calls this to obtain the cursor to display while the user drags
// the minimized window.
HCURSOR CThesisAsyncDlg::OnQueryDragIcon()
{
    return (HCURSOR) m_hIcon;
}

void CThesisAsyncDlg::OnStartbutton()
{
    unsigned long address;
    if (m_IpAddress.GetAddress(address) != 4)
        MessageBox("Invalid IP Address");
    else {
        sinBoard.sin_addr.s_addr = htonl(address);
        status = 1;
        char msg = 0x5b; // 0x5b starts (and keeps alive) the video stream
        connect(sd, (struct sockaddr*)&sinBoard, sizeof(sockaddr_in));
        send(sd, &msg, 1, 0);
    }
}

void CThesisAsyncDlg::OnStopbutton()
{
    status = 0;
    char msg = 0x00; // any byte other than 0x5b stops the stream
    send(sd, &msg, 1, 0);
}

LRESULT CThesisAsyncDlg::OnWinsockNotify(WPARAM, LPARAM lParam)
{
    int nError = WSAGETASYNCERROR(lParam);
    if (nError != 0) {
        switch (nError) {
        case WSAECONNRESET:
            MessageBox("Connection was aborted.");
            closesocket(sd);
            break;
        case WSAECONNREFUSED:
            MessageBox("Connection was refused.");
            closesocket(sd);
            break;
        default:
            MessageBox("Async failure notification");
        }
        return 0;
    }

    int i;
    int nReadBytes;
    int32* pixels;
    char msg;

    switch (WSAGETSELECTEVENT(lParam)) {
    case FD_READ:
        nReadBytes = recv(sd, (char*)acReadBuffer, 29000, 0);
        if (nReadBytes < 0) {
            MessageBox("Receive error");
        } else if (nReadBytes == 0) {
            MessageBox("Received no data");
        }
        if (status != 1)
            return 0;
        msg = 0x5b; // acknowledge the frame to keep the stream alive
        send(sd, &msg, 1, 0);

        // lock surface pixels
        pixels = (int32*)surface.lock();
        for (i = 0; i < nReadBytes; i++) {
            const int x = i % width;
            const int y = i / width;
            const int colour = (unsigned long)acReadBuffer[i];
            // expand 3:3:2 RGB into the top bits of each 8-bit channel
            pixels[x + y*width] = ((colour & 0xE0) << 16) + ((colour & 0x1C) << 11) +
                                  ((colour & 0x03) << 6);
        }
        // unlock surface
        surface.unlock();
        // copy to console
        surface.copy(console);
        // update console
        console.update();
        break;
    default:
        MessageBox("WSEV: Unknown event received");
    }
    return 0;
}