02 Dataformats

Added: February 27, 2008

2. Data Formats (Chapter 3)


Introduction: Examples (pp. 59-61). Input device.


Format must be appropriate: The internal representation must be appropriate for the type of processing to take place (e.g., text, images, sound).


Rules/Conventions: Proprietary formats are unique to a product or company (e.g., Microsoft Word, Corel WordPerfect, IBM Lotus Notes). Standards evolve in two ways: a proprietary format becomes a de facto standard (e.g., Adobe PostScript, Apple QuickTime), or a committee is struck to solve a problem (e.g., the Moving Picture Experts Group, MPEG). pp. 61-62


Standards Organizations: ISO – International Organization for Standardization; CSA – Canadian Standards Association; ANSI – American National Standards Institute; IEEE – Institute of Electrical and Electronics Engineers; etc.


Examples of Standards: [figure not included in transcript]


Why Standards?: Standards are “arbitrary”. They exist because they are convenient, efficient, flexible, appropriate, etc.


Alphanumeric Data: Problem: distinguishing between the number 123 (one hundred and twenty-three) and the characters “123” (one, two, three). Four standards for representing letters (alpha) and numbers: BCD – Binary-Coded Decimal; ASCII – American Standard Code for Information Interchange; EBCDIC – Extended Binary Coded Decimal Interchange Code; Unicode. pp. 63-69
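
The distinction between the number 123 and the characters “123” can be made concrete with a small sketch. The variable names are illustrative only; ASCII is assumed as the character code.

```python
# The same three symbols stored two different ways.
number = 123    # the integer one hundred and twenty-three
text = "123"    # the characters '1', '2', '3'

# Binary value of the integer.
number_bits = format(number, "b")

# One ASCII code per character, each shown as 8 bits.
text_bytes = text.encode("ascii")
text_bits = " ".join(format(b, "08b") for b in text_bytes)

print(number_bits)  # 1111011
print(text_bits)    # 00110001 00110010 00110011
```

The integer fits in 7 bits, while the character string needs one code per digit.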


Standard Alphanumeric Formats: BCD, ASCII, EBCDIC, Unicode.


Binary-Coded Decimal (BCD): Four bits per digit. Note: the following bit patterns are not used: 1010, 1011, 1100, 1101, 1110, 1111.


Example: 7093₁₀ = ? (in BCD). 7 → 0111, 0 → 0000, 9 → 1001, 3 → 0011, so 7093₁₀ = 0111 0000 1001 0011.
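
The digit-by-digit conversion can be sketched in a few lines. The function name `to_bcd` is hypothetical, and the sketch handles non-negative integers only.

```python
def to_bcd(n: int) -> str:
    """Return the BCD bit pattern of n, one 4-bit group per decimal digit."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each decimal digit 0-9 maps to its own 4-bit binary value.
    return " ".join(format(int(digit), "04b") for digit in str(n))

bcd_7093 = to_bcd(7093)
print(bcd_7093)  # 0111 0000 1001 0011
```

Because each group encodes a single digit 0-9, the six patterns 1010 through 1111 never appear in the output.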


The Problem: Representing text strings, such as “Hello, world”, in a computer.


Codes and Characters: Each character is coded as a byte. The most common coding system is ASCII (pronounced ass-key). ASCII = American Standard Code for Information Interchange, defined in ANSI document X3.4-1977.


ASCII Features: 7-bit code; the 8th bit is unused (or used as a parity bit). 2⁷ = 128 codes. Two general types of codes: 95 are “graphic” codes (displayable on a console) and 33 are “control” codes (they control features of the console or communications channel).
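
The counts above can be checked with a short sketch. It assumes the usual split of the code space: graphic codes run from 20₁₆ (space) to 7E₁₆ (tilde), and the control codes are 00₁₆ to 1F₁₆ plus 7F₁₆ (DEL).

```python
# All codes representable in 7 bits.
all_codes = range(2 ** 7)

# Graphic (displayable) codes: space through tilde.
graphic = [c for c in all_codes if 0x20 <= c <= 0x7E]

# Control codes: everything below space, plus DEL.
control = [c for c in all_codes if c < 0x20 or c == 0x7F]

print(len(all_codes), len(graphic), len(control))  # 128 95 33
```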


ASCII Chart: [chart not included in transcript] Annotations on the chart across slides 18-24: most significant bit and least significant bit; e.g., ‘a’ = 1100001; 95 graphic codes; 33 control codes; alphabetic codes; numeric codes; punctuation, etc.


“Hello, world” Example: [figure not included in transcript]
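
The original figure is not available here. As a stand-in, the following sketch lists the ASCII code for each character of the string in hexadecimal; it is an illustration and not a reconstruction of the slide.

```python
message = "Hello, world"

# Encode each character as one ASCII byte.
codes = message.encode("ascii")

# Show the codes as two-digit hexadecimal values.
hex_codes = " ".join(format(b, "02X") for b in codes)

print(hex_codes)  # 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64
```

Twelve characters give twelve bytes, including 2C for the comma and 20 for the space.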


Common Control Codes (name, hexadecimal code, meaning): CR 0D carriage return; LF 0A line feed; HT 09 horizontal tab; DEL 7F delete; NUL 00 null.
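
Most programming languages expose these control codes through string escapes. A minimal check in Python, where the dictionary names are illustrative only:

```python
# Hexadecimal codes from the list above.
CONTROL_CODES = {"CR": 0x0D, "LF": 0x0A, "HT": 0x09, "DEL": 0x7F, "NUL": 0x00}

# The Python string escape that produces each code.
PYTHON_ESCAPES = {"CR": "\r", "LF": "\n", "HT": "\t", "DEL": "\x7f", "NUL": "\0"}

for name, code in CONTROL_CODES.items():
    # ord() returns the numeric code of a one-character string.
    matches = ord(PYTHON_ESCAPES[name]) == code
    print(f"{name:<3} {code:02X} {PYTHON_ESCAPES[name]!r} {matches}")
```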


Terminology: Learn the names of the special symbols: [ ] brackets; { } braces; ( ) parentheses; @ commercial ‘at’ sign; & ampersand; ~ tilde.


Escape Sequences: Extend the capability of the ASCII code set, for controlling terminals and formatting output. Defined by ANSI in documents X3.41-1974 and X3.64-1977. The escape code is ESC = 1B₁₆. An escape sequence begins with two codes: ESC [ (1B₁₆ 5B₁₆).


Examples: Erase display: ESC [ 2 J. Erase line: ESC [ K.
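
The two examples can be built as byte strings. This is a sketch: the constant names are hypothetical, and whether a given terminal honours the sequences depends on the terminal.

```python
ESC = "\x1b"          # escape code, 1B hex
PREFIX = ESC + "["    # the two codes that begin a sequence: 1B 5B hex

ERASE_DISPLAY = PREFIX + "2J"
ERASE_LINE = PREFIX + "K"

def as_hex(sequence: str) -> str:
    """Show the ASCII codes of a sequence as two-digit hex values."""
    return " ".join(format(b, "02X") for b in sequence.encode("ascii"))

print(as_hex(ERASE_DISPLAY))  # 1B 5B 32 4A
print(as_hex(ERASE_LINE))     # 1B 5B 4B
```

Writing `ERASE_DISPLAY` to a terminal that supports these sequences (for example with `sys.stdout.write`) would clear the screen; the sketch prints only the codes.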


EBCDIC: Extended BCD Interchange Code (pronounced ebb’-se-dick). 8-bit code developed by IBM. Rarely used today (IBM mainframes only).
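
To see that EBCDIC assigns different codes than ASCII, the same text can be encoded both ways. The sketch assumes Python's `cp037` codec, which is one EBCDIC variant (US/Canada); other EBCDIC code pages differ in some positions.

```python
text = "A1"

def as_hex(data: bytes) -> str:
    """Show bytes as two-digit hexadecimal values."""
    return " ".join(format(b, "02X") for b in data)

ascii_hex = as_hex(text.encode("ascii"))
ebcdic_hex = as_hex(text.encode("cp037"))  # cp037: an EBCDIC code page

print(ascii_hex)   # 41 31
print(ebcdic_hex)  # C1 F1
```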


Unicode: Originally designed as a 16-bit standard. Developed by a consortium. Intended to supersede older 7- and 8-bit codes.


Unicode Version 2.1: Released in 1998. Improves on version 2.0. Includes the Euro sign (20AC₁₆ = €). From the standard: “…contains 38,887 distinct coded characters derived from the supported scripts. These characters cover the principal written languages of the Americas, Europe, the Middle East, Africa, India, Asia, and Pacifica.” http://www.unicode.org
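
The Euro sign's code can be checked directly. The sketch below also shows that the character fits in one 16-bit unit and has no ASCII encoding; the variable names are illustrative only.

```python
euro = "\u20ac"  # the Euro sign, written by its code

# Numeric code of the character.
code_point = ord(euro)

# As a single 16-bit unit (big-endian byte order chosen for readability).
utf16 = euro.encode("utf-16-be")

# A 7-bit code has no room for it.
try:
    euro.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

print(format(code_point, "04X"), utf16.hex().upper(), fits_in_ascii)  # 20AC 20AC False
```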


Keyboard Input: Key (“scan”) codes are converted to ASCII. The ASCII code is sent to the host computer, received by the host as a “stream” of data, stored in a buffer, processed, etc. p. 69


Shift Key: Inhibits (clears) bit 5 in the ASCII code, counting from bit 0 as the least significant bit. E.g., ‘a’ (1100001) with Shift becomes ‘A’ (1000001).


Control Key: Inhibits (clears) bits 5 and 6 in the ASCII code. E.g., ‘c’ (1100011) with Ctrl becomes the control code 0000011 (ETX).
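
The effect of the Shift and Control keys on a letter's code can be sketched with bit masks. The function names are hypothetical, bits are numbered from 0 (least significant), and the sketch applies to lowercase letters only; real keyboards handle digits and punctuation with lookup tables.

```python
BIT5 = 1 << 5  # 0100000
BIT6 = 1 << 6  # 1000000

def with_shift(letter: str) -> str:
    """Clear bit 5 of a lowercase letter's code, giving the uppercase letter."""
    return chr(ord(letter) & ~BIT5)

def with_ctrl(letter: str) -> int:
    """Clear bits 5 and 6 of a letter's code, giving a control code."""
    return ord(letter) & ~(BIT5 | BIT6)

shifted = with_shift("a")
control = with_ctrl("c")

print(format(ord("a"), "07b"), "->", format(ord(shifted), "07b"), shifted)  # 1100001 -> 1000001 A
print(format(ord("c"), "07b"), "->", format(control, "07b"))                # 1100011 -> 0000011
```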


Other Input: OCR – optical character recognition; bar code readers; voice/audio input; punched cards; images / objects; pointing devices. pp. 69-86


OCR: Page of text (“Hello, world”) → optical scan → computer file (10110110…).


Bar Codes: An automatic identification (Auto ID) technology that streamlines identification and data collection. See http://www.digital.net/barcoder/barcode.html


Voice/audio Input: Input device: microphone. Audio input is “digitized” (e.g., 10110010…) and stored. Processed in two ways: as is (no recognition), or recognized and converted to alphanumeric data (ASCII).
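
“Digitizing” means sampling the signal at regular intervals and storing each sample as a binary number. A minimal sketch, assuming a pure sine tone in place of a microphone signal and 8-bit samples; the function name, frequency, and sample rate are all hypothetical.

```python
import math

def digitize(freq_hz: float, sample_rate_hz: float, n_samples: int, bits: int = 8) -> list[int]:
    """Sample a sine tone and quantize each sample to an unsigned integer code."""
    top_level = 2 ** bits - 1
    samples = []
    for n in range(n_samples):
        # Signal amplitude at sample time n / sample_rate_hz, in the range -1..1.
        amplitude = math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        # Map -1..1 onto 0..top_level and round to the nearest level.
        samples.append(round((amplitude + 1) / 2 * top_level))
    return samples

samples = digitize(freq_hz=1000, sample_rate_hz=8000, n_samples=8)
print(" ".join(format(s, "08b") for s in samples))
```

Each printed group is one stored sample; a recognizer would work from these numbers, while “as is” processing simply stores or replays them.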


Punched Cards: Invented by Herman Hollerith (founder of the Tabulating Machine Company, a forerunner of IBM). Each card holds 80 characters.


Images: Typically, images are pictures that are optically scanned and saved as a “bit map” or in some other format. Many formats exist: GIF, JPEG, …


Typical “Save As” Dialog: [figure not included in transcript]


Objects: Images made of geometrically definable shapes. They offer efficiency, flexibility, small size, etc.


Pointing Devices: Originally used for specifying coordinates (x, y) for graphical input. Today used as a general-purpose device for “graphical user interfaces” (GUIs).

