数据表示

本文主要介绍了数据表示的基本概念分类及应用。首先学习了以比特表示信息,以及比特在算法上的运算定义。学习利用比特来代表数字与符号与二进制表示。通过例题学习二进制算法。正如我们所讨论的,2C编码的加法使用与二进制加法是相同的过程。
展开查看详情

1. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Idea 1: Bits Two states in a digital machine: 1. presence of a voltage – we’ll call this state “1” 2. absence of a voltage – we’ll call this state “0” A binary digit, or bit is the abstraction of a voltage reading Use multiple bits to represent information • Common grouping of bits: 4 bits – a nybble 8 bits - a byte, the basic unit • Word size – the number of bits typically handled as a unit in a machine 2-1

2. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Idea 2: Represent Information in Bits A collection of n bits has 2n possible states Interpret a collection of states as a binary number 22 21 20 Decimal # 0 0 0 0 0 0 1 1 0 1 0 2 0 1 1 3 1 0 0 4 1 0 1 5 1 1 0 6 1 1 1 7 2-2

3. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Idea 3: Define Operations Algorithmically Base-2 addition – just like base-10! • add from right to left, propagating carry carry 10010 10010 1111 + 1001 + 1011 + 1 11011 11101 10000 10111 + 111 Subtraction, multiplication, division,… also like base-10 Except for OVERFLOW 2-3

4. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Idea 4: Represent Signed Numbers In a collection of n bits, divide the 2n possible states between negative and positive numbers 22 21 20 Decimal # 0 0 0 0 Positive numbers 0 0 1 1 0 1 0 2 0 1 1 3 1 0 0 -4 1 0 1 -3 Negative numbers 1 1 0 -2 1 1 1 -1 2-4

5. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Two’s Complement Representation If number is positive or zero, • normal binary representation, zero in MSB If number is negative, • start with positive number • flip every bit (i.e., take the one’s complement) • then add one 00101 (5) 01001 (9) 11010 (1’s comp) (1’s comp) + 1 + 1 11011 (-5) (-9) 2-5

6. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Two’s Complement Shortcut To negate a two's complement number: • copy bits from right to left until (and including) the first “1” • flip remaining bits to the left 011010000 011010000 100101111 (1’s comp) (flip) (copy) + 1 100110000 100110000 2-6

7. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Operations on Two’s Complement As we’ve discussed, 2C encoded addition uses the same procedure as binary addition • assume all integers have the same number of bits • ignore carry out 01101000 (104) 11110110 (-10) + 11110000 (-16) + (-9) 01011000 (98) (-19) Assuming 8-bit 2C numbers. 2-7

8. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Subtraction in 2C Negate subtrahend (2nd no.) and add. • assume all integers have the same number of bits • ignore carry out 01101000 (104) 11110110 (-10) - 00010000 (16) - (-9) 01101000 (104) 11110110 (-10) + 11110000 (-16) + (9) 01011000 (88) (-1) But what about OVERFLOW 2-8

9. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Sign Extension To add two 2C encoded numbers, we must represent them with the same number of bits. But what if one is 4-bit word length and the other is 8-bits? If we just pad with zeroes on the left, it doesn’t work!: 4-bit 8-bit 0100 (4) 00000100 (still 4) 1100 (-4) 00001100 (12, not -4) Instead, replicate the MS bit -- the sign bit: 4-bit 8-bit 0100 (4) 00000100 (still 4) 1100 (-4) 11111100 (still -4) Called “Sign extension” 2-9

10. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Overflow If operands are too big, then sum cannot be represented as an n-bit 2C number. 01000 (8) 11000 (-8) + 01001 (9) + 10111 (-9) 10001 (-15) 01111 (+15) We have overflow if: • signs of both operands are the same, and • sign of sum is different. • In other words, you can’t add two positive numbers and get a negative answer, or two negative numbers and get a positive answer! Another test -- easy for hardware: • carry into MS bit does not equal carry out 2-10

11. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Hexadecimal Notation • fewer digits -- four bits per hex digit • less error prone -- easy to corrupt long string of 1’s and 0’s • Trivia: ‘hexadecimal’ is a mixture of Greek (hex) and Latin (decimal) because the full-Latin word for 16 makes people blush (‘sexadecimal’) Binary Hex Decimal Binary Hex Decimal 0000 0 0 1000 8 8 0001 1 1 1001 9 9 0010 2 2 1010 A 10 0011 3 3 1011 B 11 0100 4 4 1100 C 12 0101 5 5 1101 D 13 0110 6 6 1110 E 14 0111 7 7 1111 F 15 2-11

12. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Converting from Binary to Hexadecimal Every four bits is a hex digit. • start grouping from right-hand side! 011101010001111010011010111 3 A 8 F 4 D 7 NOTE! This is not a new machine code, just a convenient way for humans to write a long binary number!! 2-12

13. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Converting From Base 10 to Base 2 Recall the previous approach of identifying the largest powers of 2 that contribute to the base number: A better algorithm 2-13

14. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Fractions: Fixed-Point Code How can we represent fractions? • Use a “binary point” to separate positive from negative powers of two -- just like “decimal point.” • 2’s comp addition and subtraction still work.  if binary points are aligned 2-1 = 0.5 2-2 = 0.25 2-3 = 0.125 00101000.101 (40.625) + 11111110.110 (-1.25) 00100111.011 (39.375) No new operations needed -- same procedure as integer arithmetic. 2-14

15. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Very Large and Very Small: Floating-Point Code Large values: 6.023 x 1023 -- requires 79 bits Small values: 6.626 x 10-34 -- requires >110 bits Use equivalent of “scientific notation”: F x 2E Need to represent F (fraction), E (exponent), and sign. IEEE 754 Floating-Point Standard (32-bits): 1b 8b 23b Encoded S Exponent Fraction N  1S 1.fraction 2encodedexponent 127 , 1 encoded exponent 254 2-15

16. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Floating Point Example Single-precision IEEE floating point number: 10111111010000000000000000000000 sign encoded fraction exponent • Sign is 1 – number is negative. • Encoded exponent field is 01111110 = 126 (decimal). • Fraction is 0.100000000000… = 0.5 (decimal). Value = -1.5 x 2(126-127) = -1.5 x 2-1 = -0.75. 2-16

17. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Another Floating Point Example • Sign bit is 1, because number is negative. • 118.625 (decimal) is 1110110.101 (binary). • This is 1.110110101×26. • Exponent is 6: • encoded as 6+127 = 133 = 10000101 (binary). • Fraction is 11011010100000000000000: • first 1 is hidden, padded with zeroes on the right http://en.wikipedia.org/w/index.php?title=IEEE_754-1985 2-17

18. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Text: ASCII Characters ASCII: Maps 128 characters to 7-bit code. • both printable and non-printable (ESC, DEL, …) characters 00 nul10 dle20 sp 30 0 40 @ 50 P 60 ` 70 p 01 soh11 dc121 ! 31 1 41 A 51 Q 61 a 71 q 02 stx12 dc222 " 32 2 42 B 52 R 62 b 72 r 03 etx13 dc323 # 33 3 43 C 53 S 63 c 73 s 04 eot14 dc424 $ 34 4 44 D 54 T 64 d 74 t 05 enq15 nak25 % 35 5 45 E 55 U 65 e 75 u 06 ack16 syn26 & 36 6 46 F 56 V 66 f 76 v 07 bel17 etb27 ' 37 7 47 G 57 W 67 g 77 w 08 bs 18 can28 ( 38 8 48 H 58 X 68 h 78 x 09 ht 19 em 29 ) 39 9 49 I 59 Y 69 i 79 y 0a nl 1a sub2a * 3a : 4a J 5a Z 6a j 7a z 0b vt 1b esc2b + 3b ; 4b K 5b [ 6b k 7b { 0c np 1c fs 2c , 3c < 4c L 5c \ 6c l 7c | 0d cr 1d gs 2d - 3d = 4d M 5d ] 6d m 7d } 0e so 1e rs 2e . 3e > 4e N 5e ^ 6e n 7e ~ 0f si 1f us 2f / 3f ? 4f O 5f _ 6f o 7f del 2-18

19. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Fun tricks with ASCII Code What is relationship between a decimal digit ('0', '1', …) and its ASCII code What is the difference between an upper-case letter ('A', 'B', …) and its lower-case equivalent ('a', 'b', …)? Given two ASCII characters, how do we tell which comes first in alphabetical order? Are 128 characters enough? (http://www.unicode.org/) No new operations -- integer arithmetic and logic. Copyright © Thomas M Conte 2-19

20. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Other Data Types (codes)… millions of them! Text strings • sequence of characters, terminated with NULL (0) • typically, no hardware support Image • array of pixels  monochrome: one bit (1/0 = black/white)  color: red, green, blue (RGB) components (e.g., 8 bits each)  other properties: transparency • hardware support:  typically none, in general-purpose processors  MMX -- multiple 8-bit operations on 32-bit word Sound • sequence of fixed-point numbers 2-20