Presentation Transcript
2. Data Formats: 2. Data Formats Chapt. 3
Introduction: Introduction Examples pp. 59-61 Input device
Format must be appropriate: Format must be appropriate The internal representation must be appropriate for the type of processing to take place (e.g., text, images, sound)
Rules/Conventions: Rules/Conventions Proprietary formats
Unique to a product or company
E.g., Microsoft Word, Corel Word Perfect, IBM Lotus Notes
Standards
Evolve two ways:
Proprietary formats become de facto standards (e.g., Adobe PostScript, Apple Quick Time)
Committee is struck to solve a problem (Motion Pictures Experts Group, MPEG) pp. 61-62
Standards Organizations: Standards Organizations ISO – International Organization for Standardization
CSA – Canadian Standards Association
ANSI – American National Standards Institute
IEEE – Institute of Electrical and Electronics Engineers
Etc.
Examples of Standards: Examples of Standards
Why Standards?: Why Standards? Standards are “arbitrary”
They exist because they are
Convenient
Efficient
Flexible
Appropriate
Etc.
Alphanumeric Data: Alphanumeric Data Problem: Distinguishing between the number 123 (one hundred and twenty-three) and the characters “123” (one, two, three)
Four standards for representing letters (alpha) and numbers
BCD – Binary-coded decimal
ASCII – American standard code for information interchange
EBCDIC – Extended binary-coded decimal interchange code
Unicode pp. 63-69
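The difference between the number 123 and the characters “123” can be made concrete with a short Python sketch. It assumes the number is held in a single byte and the characters are coded in ASCII; the variable names are illustrative only.

```python
# The number 123 and the character string "123" are stored differently.
number = 123
text = "123"

# One byte holding the binary value of the number.
number_bits = format(number, "08b")

# Three bytes, one ASCII code per character.
text_bits = [format(b, "08b") for b in text.encode("ascii")]

print(number_bits)  # 01111011
print(text_bits)    # ['00110001', '00110010', '00110011']
```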
Standard Alphanumeric Formats: Next 2 slides Standard Alphanumeric Formats BCD
ASCII
EBCDIC
Unicode
Binary-Coded Decimal (BCD): Binary-Coded Decimal (BCD) Four bits per digit Note: the following bit patterns are not used:
1010 1011 1100 1101 1110 1111
Example: Example 7093₁₀ = ? (in BCD)
7 0 9 3
0111 0000 1001 0011
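The digit-by-digit conversion in this example can be sketched in Python. The function name `to_bcd` is hypothetical, and the sketch assumes non-negative integers with one four-bit group per decimal digit.

```python
def to_bcd(n: int) -> str:
    """Return the BCD encoding of a non-negative integer as four-bit groups."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each decimal digit is converted separately to a 4-bit pattern.
    return " ".join(format(int(digit), "04b") for digit in str(n))

print(to_bcd(7093))  # 0111 0000 1001 0011
```

Because each digit is at most 9 (1001), the patterns 1010 through 1111 never appear in the output.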
The Problem: The Problem Representing text strings, such as “Hello, world”, in a computer
Codes and Characters: Codes and Characters Each character is coded as a byte
Most common coding system is ASCII (Pronounced ass-key)
ASCII = American National Standard Code for Information Interchange
Defined in ANSI document X3.4-1977
ASCII Features: ASCII Features 7-bit code
8th bit is unused (or used for a parity bit)
2⁷ = 128 codes
Two general types of codes:
95 are “Graphic” codes (displayable on a console)
33 are “Control” codes (control features of the console or communications channel)
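The counts above can be checked with a short Python sketch. It assumes the usual split (codes 0 to 31 plus 127 are control codes, 32 to 126 are graphic codes) and, for the 8th bit, it assumes even parity; the slides do not say which parity convention is used. The helper name `with_even_parity` is illustrative.

```python
# Codes 0-31 and 127 (DEL) are control codes; 32-126 are graphic codes.
control = [c for c in range(128) if c < 32 or c == 127]
graphic = [c for c in range(128) if 32 <= c <= 126]

def with_even_parity(code: int) -> int:
    """Set the 8th bit so the byte has an even number of 1 bits (assumed convention)."""
    ones = bin(code).count("1")
    return code | 0x80 if ones % 2 else code

print(len(graphic), len(control))                 # 95 33
print(format(with_even_parity(ord("a")), "08b"))  # 11100001
```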
ASCII Chart: ASCII Chart
Slide18: Most significant bit Least significant bit
Slide19: e.g., ‘a’ = 1100001
Slide20: 95 Graphic codes
Slide21: 33 Control codes
Slide22: Alphabetic codes
Slide23: Numeric codes
Slide24: Punctuation, etc.
“Hello, world” Example: “Hello, world” Example
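As a sketch of this example, the following Python lines list the ASCII code of each character in the string, in hexadecimal. The variable names are illustrative.

```python
message = "Hello, world"

# One byte per character, coded in ASCII.
codes = message.encode("ascii")
hex_codes = " ".join(format(b, "02X") for b in codes)

print(hex_codes)  # 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64
```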
Common Control Codes: Common Control Codes (name, hexadecimal code, meaning)
CR 0D carriage return
LF 0A line feed
HT 09 horizontal tab
DEL 7F delete
NUL 00 null
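Several of these control codes have their own escapes in Python string literals, which gives a quick way to confirm the hexadecimal values. This is a minimal sketch; the dictionary name is illustrative.

```python
# Map each control code's meaning to the Python literal that produces it.
control_chars = {
    "carriage return": "\r",
    "line feed": "\n",
    "horizontal tab": "\t",
    "delete": "\x7f",
    "null": "\x00",
}

for meaning, char in control_chars.items():
    print(format(ord(char), "02X"), meaning)
```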
Terminology: Terminology Learn the names of the special symbols
[ ] brackets
{ } braces
( ) parentheses
@ commercial ‘at’ sign
& ampersand
~ tilde
Escape Sequences: Escape Sequences Extend the capability of the ASCII code set
For controlling terminals and formatting output
Defined by ANSI in documents X3.41-1974 and X3.64-1977
The escape code is ESC = 1B₁₆
An escape sequence begins with two codes: ESC [ (1B₁₆ 5B₁₆)
Examples: Examples Erase display: ESC [ 2 J
Erase line: ESC [ K
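The two example sequences can be built in Python as a sketch. The constant names (`ESC`, `PREFIX`, `ERASE_DISPLAY`, `ERASE_LINE`) are illustrative and not taken from the ANSI documents. The code prints the byte values instead of sending the sequences to a terminal.

```python
ESC = "\x1b"        # escape code, 1B hex
PREFIX = ESC + "["  # the two-code prefix: 1B 5B

ERASE_DISPLAY = PREFIX + "2J"
ERASE_LINE = PREFIX + "K"

# Show the codes that would be sent to the terminal.
print(ERASE_DISPLAY.encode("ascii").hex(" "))  # 1b 5b 32 4a
print(ERASE_LINE.encode("ascii").hex(" "))     # 1b 5b 4b
```

Writing `ERASE_DISPLAY` to a terminal that honors these sequences would clear the screen, which is why the sketch only prints the hexadecimal codes.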
EBCDIC: EBCDIC Extended BCD Interchange Code (pronounced ebb’-se-dick)
8-bit code
Developed by IBM
Rarely used today
IBM mainframes only
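To see how EBCDIC codes differ from ASCII codes, the following Python sketch encodes the same text both ways. It assumes code page 037 (Python codec `cp037`), which is only one of several EBCDIC variants.

```python
text = "Hello"

ascii_codes = text.encode("ascii").hex(" ").upper()
# cp037 is one EBCDIC code page; other variants assign some codes differently.
ebcdic_codes = text.encode("cp037").hex(" ").upper()

print(ascii_codes)   # 48 65 6C 6C 6F
print(ebcdic_codes)  # C8 85 93 93 96
```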
Unicode: Unicode 16-bit standard
Developed by a consortium
Intended to supersede older 7- and 8-bit codes
Unicode Version 2.1: Unicode Version 2.1 1998
Improves on version 2.0
Includes the Euro sign (20AC₁₆ = €)
From the standard: …contains 38,887 distinct coded characters derived from the supported scripts. These characters cover the principal written languages of the Americas, Europe, the Middle East, Africa, India, Asia, and Pacifica. http://www.unicode.org
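A short Python sketch shows the Euro sign's code and how it fits in 16 bits. It assumes the big-endian UTF-16 encoding to display the two bytes; the variable names are illustrative.

```python
euro = chr(0x20AC)  # the Euro sign, code 20AC hex

code_point = f"U+{ord(euro):04X}"
two_bytes = euro.encode("utf-16-be").hex(" ").upper()

print(code_point)  # U+20AC
print(two_bytes)   # 20 AC

# The first 128 Unicode codes match the 7-bit ASCII codes.
print(ord("a") == "a".encode("ascii")[0])  # True
```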
Keyboard Input: Keyboard Input Key (“scan”) codes are converted to ASCII
ASCII code sent to host computer
Received by the host as a “stream” of data
Stored in buffer
Processed
Etc. p. 69
Shift Key: Shift Key Inhibits (clears) bit 5 in the ASCII code, e.g., a with Shift gives A
Control Key: Control Key Inhibits (clears) bits 5 & 6 in the ASCII code, e.g., c with Ctrl gives a control code
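The effect of the two keys can be sketched as bit masks in Python. Bits are numbered from 0 (least significant), and the function names `shift` and `ctrl` are hypothetical. The sketch applies to lowercase letters only; real keyboards handle other keys with lookup tables.

```python
BIT5 = 0b0100000  # 20 hex
BIT6 = 0b1000000  # 40 hex

def shift(code: int) -> int:
    """Clear bit 5: lowercase letter -> uppercase letter."""
    return code & ~BIT5

def ctrl(code: int) -> int:
    """Clear bits 5 and 6: letter -> corresponding control code."""
    return code & ~(BIT5 | BIT6)

print(format(ord("a"), "07b"), "->", format(shift(ord("a")), "07b"))  # 1100001 -> 1000001
print(chr(shift(ord("a"))))                                           # A
print(format(ord("c"), "07b"), "->", format(ctrl(ord("c")), "07b"))   # 1100011 -> 0000011
```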
Other Input: Other Input OCR – optical character recognition
Bar code readers
Voice/audio input
Punched cards
Images / objects
Pointing devices
pp. 69-86
OCR: OCR Page of text (“Hello, world”) → optical scan → computer file (10110110…)
Bar Codes: Bar Codes An automatic identification (Auto ID) technology that streamlines identification and data collection
See
http://www.digital.net/barcoder/barcode.html
Voice/audio Input: Voice/audio Input Input device: microphone
Audio input is “digitized” and stored
Processed in two ways
As is (no recognition)
Recognized and converted to alphanumeric data (ASCII)
Diagram: audio input → digitize → 10110010…
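Digitizing can be sketched in Python as sampling a signal at regular intervals and rounding each sample to one of a fixed number of levels. Everything specific here is an assumption for illustration: the function name `digitize`, the 440 Hz test tone, the 8000 samples-per-second rate, and the 256 levels (8 bits per sample).

```python
import math

def digitize(signal, sample_rate, num_samples, levels=256):
    """Sample signal(t), with values in [-1, 1], and quantize to integers 0..levels-1."""
    samples = []
    for i in range(num_samples):
        value = signal(i / sample_rate)
        value = max(-1.0, min(1.0, value))  # clip out-of-range input
        samples.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return samples

def tone(t):
    """A 440 Hz sine wave standing in for microphone input."""
    return math.sin(2 * math.pi * 440 * t)

samples = digitize(tone, sample_rate=8000, num_samples=8)
# Each sample becomes one 8-bit pattern in the stored data.
print(" ".join(format(s, "08b") for s in samples))
```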
Punched Cards: Punched Cards Invented by Herman Hollerith (founder of the Tabulating Machine Company, a forerunner of IBM)
Each card holds 80 characters
Images: Images Typically images are pictures that are optically scanned and saved as a “bit map” or in some other format
Many formats
GIF, JPEG, …
Typical “Save As” Dialog: Typical “Save As” Dialog
Objects: Objects Images made of geometrically definable shapes
Offer efficiency, flexibility, small size, etc.
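The contrast between an object and a bit map can be sketched in Python. Here a circle is stored as three numbers (an object) and then converted to a 9 x 9 bit map. The dictionary layout, the function name `rasterize`, and the grid size are all illustrative assumptions.

```python
# Object representation: the circle is fully described by three numbers.
circle = {"cx": 4, "cy": 4, "r": 3}

def rasterize(shape, width, height):
    """Convert the circle object to a bit map: one '1' or '0' per pixel."""
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            dx, dy = x - shape["cx"], y - shape["cy"]
            inside = dx * dx + dy * dy <= shape["r"] ** 2
            row += "1" if inside else "0"
        rows.append(row)
    return rows

bitmap = rasterize(circle, 9, 9)
print("\n".join(bitmap))
print(len(bitmap) * len(bitmap[0]), "pixels versus", len(circle), "numbers")
```

The object form stays the same size however large the image is drawn, while the bit map grows with the number of pixels.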
Pointing Devices: Pointing Devices Originally used for specifying coordinates (x, y) for graphical input
Today used as general purpose device for “graphical user interfaces” (GUIs)
Thank you: Thank you