Nic Shulver, [email protected]
Data Representation: Data
Usually computing systems are complex devices, dealing with a vast array of information categories.
Data Representation: Computing Systems Data
Computing systems store, present, and help us modify:
• Text
• Audio
• Images and graphics
• Video
Data Representation: Digital vs. Analog
Computing systems are finite machines. They store a limited amount of information, even if the limit is very big.
The information can be represented in one of two ways: analog or digital.
Data Representation: Digital vs. Analog (1)
Analog data is a continuous representation. A mercury thermometer is an analog device.
Digital data is a discrete representation, breaking the information up into separate (discrete) elements.
Computers can’t work with analog information directly, so the analog information has to be digitized.
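As a rough illustration of digitizing, the sketch below samples a continuous signal at fixed instants and rounds each sample to one of a fixed number of levels. The function names, the 1 Hz sine wave, the 8 Hz sampling rate, and the 16 levels are all assumptions chosen for the example, not values from these slides.

```python
import math

def digitize(signal, duration_s, sample_rate_hz, levels):
    """Sample a continuous signal and quantize each sample to an integer level.

    signal: function mapping time in seconds to a value in [-1.0, 1.0]
    levels: number of discrete values available (e.g. 16 for 4 bits)
    """
    n_samples = int(duration_s * sample_rate_hz)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz            # sampling: keep only discrete instants
        value = signal(t)
        # quantizing: map [-1, 1] onto the integers 0 .. levels - 1
        level = round((value + 1.0) / 2.0 * (levels - 1))
        samples.append(level)
    return samples

def sine_1hz(t):
    """Hypothetical analog source: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)

digital = digitize(sine_1hz, duration_s=1.0, sample_rate_hz=8, levels=16)
print(digital)   # eight integers, each between 0 and 15
```

Raising the sampling rate or the number of levels gives a closer approximation of the original signal, at the cost of more data.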
Data Representation: Why digital signals?
Electronic signals (analog and digital) degrade as they propagate. The strength of the signal fluctuates due to environmental effects.
Analog signals lose information. Since any voltage level within the range is valid, it is impossible to know whether the original signal was changed.
A digital signal can degrade quite a bit before the information is lost, because any value above a certain threshold is treated as a high value and any value below it as a low value.
Data Representation: Digital vs. Analog (3)
[Figure: a digital signal and an analog signal, each shown before and after degradation. A threshold line separates the 1s from the 0s in the digital signal.]
• You can still retrieve the information from a reasonably degraded digital signal
• Periodically a digital signal is reclocked to regain its original shape. As long as it is reclocked before too much degradation occurs, no information is lost.
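The threshold idea can be sketched in a few lines. This is only an illustration under assumed values: the 0 V and 1 V signal levels, the 0.5 V threshold, and the noise model are hypothetical and do not describe any particular hardware.

```python
import random

THRESHOLD = 0.5   # hypothetical decision level, halfway between 0 V and 1 V

def degrade(levels, noise, rng):
    """Add random noise of at most +/- `noise` volts to each level."""
    return [v + rng.uniform(-noise, noise) for v in levels]

def regenerate(levels):
    """Snap every level back to a clean 0 or 1 using the threshold."""
    return [1 if v > THRESHOLD else 0 for v in levels]

rng = random.Random(42)            # fixed seed so runs are repeatable
bits = [1, 0, 1, 1, 0, 0, 1, 0]
sent = [float(b) for b in bits]

received = degrade(sent, noise=0.3, rng=rng)
recovered = regenerate(received)

print(recovered == bits)   # True: the noise never crosses the threshold
```

If the same noisy voltages were an analog signal, every one of them would be a valid value and there would be no way to tell what had been sent. Raising `noise` above 0.5 in this sketch shows the point at which the digital signal starts to lose bits as well.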
Data Representation: Count like a computer
“There are 10 types of people – those who understand binary, and those who don’t.”
Data Representation: Digital Hardware Systems
Digital Binary System
Two discrete values:
• yes, on, non-zero volts, current flowing, “1”
• no, off, 0 volts, no current flowing, “0”
Advantages of binary systems:
• rigorous mathematical foundation based on logic
• easy to implement
Data Representation: Binary Representation (1)
Why binary representation (as opposed to decimal, octal, etc.)?
• Because the devices that store and manage digital data are far less expensive and complex for binary representation.
• They are also far more reliable when they have to represent one out of two possible values.
• Because the electronic signals are easier to maintain if they carry only binary data.
Data Representation: Binary Representation (2)
One bit can be either 0 or 1. Therefore, one bit can represent only two things.
To represent more than two things, we need multiple bits.
Two bits can represent four things because there are four combinations of 0 and 1 that can be made from two bits: 00, 01, 10, 11.
In general, n bits can represent 2^n things because there are 2^n combinations of 0 and 1 that can be made from n bits.
Note that every time we increase the number of bits by 1, we double the number of things we can represent.
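The doubling can be checked by listing the patterns directly. The helper name `bit_patterns` is made up for this sketch.

```python
from itertools import product

def bit_patterns(n):
    """Return every string of n bits, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

for n in range(1, 4):
    patterns = bit_patterns(n)
    print(n, len(patterns), patterns)
# 1 2 ['0', '1']
# 2 4 ['00', '01', '10', '11']
# 3 8 ['000', '001', '010', '011', '100', '101', '110', '111']
```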
Data Representation: Binary Bit and Group Definitions
• Bit - a single binary digit
• Nibble - a group of four bits
• Byte - a group of eight bits
• Word - depends on processor; 8, 16, 32, 64 or more bits
• LSB - Least Significant Bit (on the right)
• MSB - Most Significant Bit (on the left)
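To make the definitions concrete, this small sketch pulls the nibbles, the MSB, and the LSB out of a single byte using shifts and masks. The function name `describe_byte` and the sample value are chosen for illustration only.

```python
def describe_byte(value):
    """Split an 8-bit value into its named parts."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a value that fits in one byte")
    return {
        "bits": format(value, "08b"),       # all eight bits, MSB first
        "high_nibble": (value >> 4) & 0xF,  # left four bits
        "low_nibble": value & 0xF,          # right four bits
        "msb": (value >> 7) & 1,            # leftmost bit
        "lsb": value & 1,                   # rightmost bit
    }

print(describe_byte(0b10010110))
# {'bits': '10010110', 'high_nibble': 9, 'low_nibble': 6, 'msb': 1, 'lsb': 0}
```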
Data Representation: Binary Number System
Just like decimal numbers, except:
• The only valid digits are 0 and 1
• The base is 2 instead of 10
Binary to decimal conversion is just the explicit expression of the positional values. For the binary number 101, reading from the right:
1 x 2^0 = 1
0 x 2^1 = 0
1 x 2^2 = 4
Total = 5
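The same positional sum can be written as a short function. This is a teaching sketch with a made-up name; in practice Python's built-in `int(bits, 2)` does the same job.

```python
def binary_to_decimal(bits):
    """Sum digit * 2**position, with position 0 at the rightmost bit (LSB)."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit not in "01":
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("101"))   # 5
```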
Data Representation: Octal & Hexadecimal Number Systems
Systems with different radix and digits
Octal:
• Radix = 8
• Digits = 0,1,2,3,4,5,6,7
Hexadecimal:
• Radix = 16
• Digits = 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
The primary advantage of both is that they are easy to convert to and from binary.
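The conversion is easy because 8 and 16 are powers of 2: each octal digit stands for exactly three bits and each hexadecimal digit for exactly four. A minimal sketch of that grouping, with a hypothetical helper name:

```python
def regroup(bits, group_size):
    """Convert a bit string to another base by grouping bits from the right.

    group_size=3 gives octal digits, group_size=4 gives hexadecimal digits.
    """
    digits = "0123456789ABCDEF"
    # pad with leading zeros so the length is a multiple of group_size
    padded_len = -(-len(bits) // group_size) * group_size
    bits = bits.zfill(padded_len)
    groups = [bits[i:i + group_size] for i in range(0, len(bits), group_size)]
    return "".join(digits[int(group, 2)] for group in groups)

print(regroup("11010110", 3))   # 326  (011 010 110)
print(regroup("11010110", 4))   # D6   (1101 0110)
```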
Data Representation: Data Formats - How to Interpret Data
The internal representation must be appropriate for the type of processing taking place, e.g., images and sound have to be digitized:
• Images – need a detailed description of the data, including how colour is represented at each data point
• Sound – needs a sampling rate
Proprietary formats are unique to a product or company, e.g., PDF, FLA, H.264
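To see why the colour description and the sampling rate matter, the sketch below estimates how much raw, uncompressed data an image and a sound clip would need. The defaults (24 bits per pixel, 44,100 samples per second, 16 bits per sample, two channels) are common illustrative choices assumed here, not figures from these slides.

```python
def raw_image_bytes(width, height, bits_per_pixel=24):
    """Uncompressed size with one colour value per pixel.

    24 bits per pixel is assumed to mean 8 bits each for red, green, blue.
    """
    return width * height * bits_per_pixel // 8

def raw_audio_bytes(seconds, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Uncompressed size of sampled sound."""
    return seconds * sample_rate_hz * bits_per_sample * channels // 8

print(raw_image_bytes(640, 480))   # 921600
print(raw_audio_bytes(60))         # 10584000
```

A program reading these bytes back has to know the same parameters, which is the information a data format records.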
Data Representation: Why Standards?
Standards evolve two ways:
• Proprietary formats become de facto standards (e.g., Adobe PostScript, Apple QuickTime)
• A committee is formed to solve a problem (Moving Picture Experts Group, MPEG)
They exist because they are:
• Convenient – time to market is often very important when trying to finish a product. Existing standards may be used to save time.
• Efficient – most standards are put together by committees of experienced engineers and designers
Data Representation: Why Standards?
• Flexible – often allow for extensions
• Appropriate – solve a specific problem
• Interoperability of data
• Interoperability of hardware and software
But sometimes standards are arbitrary and have some content derived from “accidents of history”.
Data Representation: Standards Organizations
• ISO – International Organization for Standardization
• CSA – Canadian Standards Association
• ANSI – American National Standards Institute
• IEEE – Institute of Electrical and Electronics Engineers
Data Representation: Examples of Standards
Type of Data              Standards
Alphanumeric              ASCII, Unicode
Image                     JPEG, PNG, PCX, TIFF, etc.
Motion picture            MPEG-2, MPEG-4, etc.
Sound                     WAV, MP3, AAC, FLAC, etc.
Outline graphics/fonts    PostScript, TrueType, PDF
Data Representation: Alphanumeric Data
Three standards for representing letters (alpha) and numbers:
• ASCII – American Standard Code for Information Interchange (old)
• EBCDIC – Extended Binary-Coded Decimal Interchange Code (very old; used on IBM mainframes and rarely seen elsewhere)
• Unicode
Data Representation: Codes and Characters
The problem: representing text strings, such as “Hello, world”, in a computer
• Each character is coded as a byte (= 8 bits)
• The most common coding system was ASCII
• Defined in ANSI document X3.4-1977
Data Representation: “Hello, world” Example
Text:        Hello, world
Binary:      01001000 01100101 01101100 01101100 01101111 00101100 00100000 01110111 01101111 01110010 01101100 01100100
Hexadecimal: 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64
Decimal:     72 101 108 108 111 44 32 119 111 114 108 100
Note: 12 characters require 12 bytes, because each character requires 1 byte.
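The three rows can be generated from the string itself. The sketch below uses Python's built-in ASCII codec and prints the byte values of “Hello, world” in binary, hexadecimal, and decimal.

```python
text = "Hello, world"
data = text.encode("ascii")          # one byte per character

print(" ".join(format(b, "08b") for b in data))   # binary
print(" ".join(format(b, "02X") for b in data))   # hexadecimal
print(" ".join(str(b) for b in data))             # decimal
print(len(text), "characters,", len(data), "bytes")
```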
Data Representation: Unicode (1)
The extended version of the ASCII character set is not enough for international use.
The Unicode character set was originally designed around 16 bits per character, which can represent 2^16, or over 65 thousand, characters. Later versions extend the code space beyond 16 bits (code points run up to U+10FFFF), and encodings such as UTF-8 and UTF-16 store each code point in one or more bytes.
Unicode was designed to be a superset of ASCII. That is, the first 128 characters in the Unicode character set correspond exactly to ASCII, and the first 256 correspond to ISO 8859-1 (Latin-1), a common 8-bit extension of ASCII.
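The relationship between ASCII and Unicode can be checked with Python's built-in string functions. The three sample characters are arbitrary choices for this sketch: a letter from ASCII, an accented letter from the 8-bit range, and a symbol well outside it.

```python
# "A", e-acute, and the euro sign, written as escapes so the source is pure ASCII
for ch in ["A", "\u00e9", "\u20ac"]:
    code_point = ord(ch)                 # the character's Unicode number
    utf8 = ch.encode("utf-8")            # variable length: 1 to 4 bytes
    utf16 = ch.encode("utf-16-be")       # 2 bytes for each of these characters
    print(f"U+{code_point:04X}", len(utf8), len(utf16))
# U+0041 1 2
# U+00E9 2 2
# U+20AC 3 2

# ASCII text is byte-for-byte identical when encoded as UTF-8
print("Hello".encode("ascii") == "Hello".encode("utf-8"))   # True
```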
Data Representation: Unicode (2)
Version 2.1, 1998
• Improves on version 2.0
• Includes the Euro sign (U+20AC = “€” = “EURO SIGN”)
• From the standard: “…contains 38,887 distinct coded characters derived from the supported scripts. These characters cover the principal written languages of the Americas, Europe, the Middle East, Africa, India, Asia, and Pacifica.”
The 2012 update of Unicode was 6.2 (see http://www.unicode.org for the current version).