What are data types and codes in programming?

Introduction to Data Types

A data type is the set of values that a variable can hold, together with the set of operations that can be performed on it. The data type defines what kind of data is to be used in a program.

Integer: The simplest data type is the integer, which holds discrete whole numbers (both positive and negative) such as 2, -5, and so on. The range of integer values varies depending on the programming language. An integer value cannot have a fractional part or an exponent.

Real: The real data type holds any numeric representation, whether the value is signed or unsigned, fractional, or written in exponential form. Real numbers include 0, 0.5, 4.7234, 3.0e-4, and so on.

Characters: The character data type includes any printable alphanumeric character as well as special characters such as #, @, %, and so on.
Example: Alphabets: a to z / A to Z; Numbers: 0 to 9

Boolean (logical): The Boolean data type accepts only one of two possible values at any given time: true and false. It is useful for determining the state of a program.
Example: the logical operators AND, OR, and NOT work on Boolean values.
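
As a rough sketch, the snippet below (written in Python purely for illustration; the variable names are invented for this example) shows how the integer, real, character, and Boolean types might appear in practice:

```python
count = -5            # integer: a discrete whole number, positive or negative
price = 4.7234        # real: a number with a fractional part
rate = 3.0e-4         # real: exponential (scientific) notation
grade = 'A'           # character: Python stores a single character as a string of length 1
is_valid = True       # Boolean: only True or False

print(type(count), type(price), type(grade), type(is_valid))
# <class 'int'> <class 'float'> <class 'str'> <class 'bool'>
```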

Array: An array is an ordered collection of elements of the same type, such as integers, real numbers, or characters. It can have one or more dimensions; one-, two-, and three-dimensional arrays are the most common.
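
For illustration, a one-dimensional and a two-dimensional array might look like the following sketch, where Python lists play the role of arrays and the variable names are chosen only for this example:

```python
marks = [45, 67, 89, 72]            # one-dimensional array of integers

matrix = [[1, 2, 3],                # two-dimensional array (3 x 3)
          [4, 5, 6],
          [7, 8, 9]]

print(marks[2])       # 89 -> elements are accessed by index
print(matrix[1][2])   # 6  -> row 1, column 2 (indexing starts at 0)
```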

String: A string is an array of characters. Strings are used to store and manipulate text such as words, names, and sentences in programming. A string value is made up of one or more characters enclosed in double quotation marks, such as "welcome", "056-2345", and so on. Various string-processing operations, such as comparison and sorting, can be performed on strings.
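
The short sketch below, again in Python only for illustration, shows the string operations mentioned above (comparison and sorting) along with simple text manipulation:

```python
greeting = "welcome"
phone = "056-2345"

# string comparison
print(greeting == "welcome")   # True
print("apple" < "banana")      # True -> compared character by character

# string sorting
names = ["Sita", "Ram", "Hari"]
print(sorted(names))           # ['Hari', 'Ram', 'Sita']

# other text manipulation
print(greeting.upper())        # WELCOME
print(len(phone))              # 8 -> number of characters in the string
```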

Introduction to Codes

A computer can understand only binary numbers, which correspond to two electronic states: high voltage and low voltage. For the user's convenience, these binary patterns are organized into standard codes that describe data. Some of the popular codes are:

Absolute Binary (pure binary): In the absolute binary (sign-magnitude) technique, a positive number is represented by placing 0 before the binary number, while a negative number is represented by placing 1 before it. The most significant bit therefore acts as the sign bit, and the remaining bits hold the magnitude of the number. The binary number is expressed in widths of 8, 16, 32, 64 bits, and so on.
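
A minimal sketch of this sign-magnitude idea is shown below; the helper function sign_magnitude_8bit is hypothetical and written only to illustrate how the sign bit and the 7-bit magnitude combine into an 8-bit pattern:

```python
def sign_magnitude_8bit(n):
    # most significant bit is the sign: 0 = positive, 1 = negative
    sign = '0' if n >= 0 else '1'
    # remaining 7 bits hold the magnitude of the number
    magnitude = format(abs(n), '07b')
    return sign + magnitude

print(sign_magnitude_8bit(25))    # 00011001
print(sign_magnitude_8bit(-25))   # 10011001
```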

BCD (Binary Coded Decimal): It is a simple system for representing decimal numbers in binary, in which each decimal digit is converted to binary independently and a space is placed between the groups. Each decimal digit in BCD takes up four bits. For example, the decimal number 24 is written as (0010 0100)₂ in BCD.

Fig: Binary Coded Decimal
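
The BCD rule can be sketched in a few lines of Python; the helper to_bcd below is hypothetical and simply converts each decimal digit into its own 4-bit group:

```python
def to_bcd(number):
    # convert each decimal digit independently to a 4-bit binary group
    return ' '.join(format(int(digit), '04b') for digit in str(number))

print(to_bcd(24))    # 0010 0100
print(to_bcd(907))   # 1001 0000 0111
```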

ASCII (American Standard Code for Information Interchange):
ASCII is a coding system that assigns numeric values to letters, digits, punctuation marks, and control characters in order to ensure interoperability among a variety of hardware and peripherals. Standard ASCII, a 7-bit code with 128 characters, was standardized in 1968; Extended ASCII, an 8-bit code with 256 characters, came later. To represent foreign-language characters and other graphical symbols, most systems employ 8-bit extended ASCII.

Each ASCII character is represented by an integer value: 0 to 127 in standard ASCII, or 0 to 255 in extended ASCII. The values 0 to 31 represent non-printing control characters, while the values 32 to 127 represent letters of the alphabet, digits, and common punctuation marks. For example, the ASCII code for the capital letter A is 65, whereas the code for the symbol * is 42. Because extended ASCII employs 8 bits, each ASCII character takes up one byte of storage space in a computer.
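
In Python, for example, the built-in ord() and chr() functions look up these codes directly, as the short sketch below shows:

```python
print(ord('A'))    # 65 -> ASCII code of the capital letter A
print(ord('*'))    # 42 -> ASCII code of the asterisk
print(chr(97))     # a  -> character whose code is 97

# each standard ASCII character fits in one byte when encoded
print('A'.encode('ascii'))        # b'A'
print(len('A'.encode('ascii')))   # 1
```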

EBCDIC (Extended Binary Coded Decimal Interchange Code): It is an 8-bit code system that is commonly used on large IBM mainframe computers, most IBM minicomputers, and computers from many other manufacturers. It allows computers to represent 256 characters.
Because the letters of the alphabet occupy discontinuous positions in EBCDIC, there is no direct character-to-character match when converting from EBCDIC to ASCII and vice versa.
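
The sketch below illustrates this mismatch using Python's built-in code page 'cp037' (one common EBCDIC variant); the choice of that particular code page is an assumption made only for this example:

```python
text = "ABC"
ebcdic_bytes = text.encode('cp037')   # EBCDIC encoding
ascii_bytes = text.encode('ascii')    # ASCII encoding

print(list(ebcdic_bytes))   # [193, 194, 195] -> EBCDIC codes for A, B, C
print(list(ascii_bytes))    # [65, 66, 67]    -> ASCII codes for A, B, C

# converting back requires decoding with the matching code page
print(ebcdic_bytes.decode('cp037'))   # ABC
```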

Unicode: Unicode is a character code created by the Unicode Consortium and the International Organization for Standardization (ISO); its original 16-bit design can hold up to 65,536 characters. It enables the characters and symbols of every language in the world to be represented using a single coding system. The Chinese language, for example, includes about 10,000 characters that can be represented only through Unicode. If Unicode is widely used, multilingual software will be considerably easier to build and maintain.

Fig: Unicode in Python (Source: ian-albert.com)

Because Unicode originally employed 16 bits, each Unicode character takes up 2 bytes of storage space on the computer. This coding system was created to address the limitation of ASCII, which permits only 256 different characters; that is sufficient for English but not for languages such as Chinese and Japanese, which have more than 256 characters. The Unicode Worldwide Character Standard now provides for up to 4 bytes (32 bits) per character.
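
The short Python sketch below illustrates Unicode code points and how many bytes a character occupies under the common UTF-16 and UTF-8 encodings; the specific characters are chosen only as examples:

```python
print(ord('A'))          # 65   -> same code point as ASCII for basic Latin letters
print(ord('म'))          # 2350 -> Devanagari letter MA, beyond the ASCII range
print(chr(0x4E2D))       # 中   -> a Chinese character looked up by its code point

# encoding shows how many bytes a character actually occupies
print(len('A'.encode('utf-16-le')))   # 2 bytes in UTF-16
print(len('中'.encode('utf-16-le')))  # 2 bytes in UTF-16
print(len('中'.encode('utf-8')))      # 3 bytes in UTF-8
```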
