
Different Data Types in NumPy 

NumPy is a popular Python library used for scientific computing, particularly for working with arrays. In NumPy, there are several data types, also known as "dtypes," that are used to represent various kinds of data.

What is dtype in NumPy?

In NumPy, dtype (short for data type) refers to the data type or format of the elements in a NumPy array. It describes how the bytes in the fixed-size block of memory corresponding to each element of the array should be interpreted.

A dtype object can be created by passing np.dtype() either a short string code such as "i4" or a type such as np.int32.

Here are some examples of dtype objects:

import numpy as np
# Create dtype objects using shortcut codes
dt_int = np.dtype("i")   # C int, normally a 32-bit integer (int32)
dt_float = np.dtype("f")   # 32-bit floating-point data type (float32)
dt_complex = np.dtype("c8")   # complex64; a bare "c" would give a 1-byte character type
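
To make the idea of "how the bytes are interpreted" more concrete, here is a small sketch that inspects a dtype object. The variable names dt_short and dt_named are ours, chosen only for illustration.

```python
import numpy as np

# Build the same dtype from a short code and from an explicit name.
dt_short = np.dtype("f4")        # "f4" = floating point, 4 bytes per element
dt_named = np.dtype("float32")

print(dt_short == dt_named)      # True
print(dt_short.name)             # float32
print(dt_short.kind)             # f
print(dt_short.itemsize)         # 4
```

The itemsize attribute is the number of bytes each element occupies, and kind is the one-letter family the type belongs to.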

Different Data Types in NumPy with Examples

Here is the list of different data types in NumPy explained with examples:

1. Int: This data type is used to represent integer values. NumPy provides several subtypes of int dtype, depending on the number of bits used to represent the integer. For example, int8, int16, int32, and int64 represent integers with 8, 16, 32, and 64 bits respectively.

For Example

import numpy as np
a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([10, 20, 30], dtype=np.int64)

print(a.dtype)  # Output: int32
print(b.dtype)  # Output: int64

In this example, we created the NumPy arrays a and b with the int32 and int64 data types by passing the dtype parameter to the np.array() function.
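
The number of bits decides which values fit in each integer subtype. As a rough sketch, np.iinfo() can be used to look up those limits; the limits dictionary below is just a helper name we introduce for the example.

```python
import numpy as np

# Collect the smallest and largest value each signed integer type can hold.
limits = {}
for int_type in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(int_type)
    limits[int_type.__name__] = (int(info.min), int(info.max))
    print(int_type.__name__, info.min, info.max)
```

Choosing the smallest subtype that still covers your value range saves memory, but values outside the range cannot be stored correctly.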

2. Float: This data type is used to represent floating-point numbers. Similar to int dtype, NumPy provides several subtypes of float dtype, depending on the number of bits used to represent the number. For example, float16, float32, and float64 represent floating-point numbers with 16, 32, and 64 bits respectively.

For Example

import numpy as np
a = np.array([1.0, 2.5, 3.7], dtype=np.float32)
b = np.array([10.0, 20.5, 30.7], dtype=np.float64)

print(a.dtype)  # Output: float32
print(b.dtype)  # Output: float64
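
The float subtypes trade precision for memory. The following sketch, under the assumption that you only need a rough comparison, uses np.finfo() to show the approximate number of reliable decimal digits and nbytes to show the memory cost. The array names small and large are illustrative.

```python
import numpy as np

# bits = storage size, precision = approximate reliable decimal digits
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    print(float_type.__name__, info.bits, info.precision)

# The same 1000 values need half the memory in float32.
small = np.zeros(1000, dtype=np.float32)
large = np.zeros(1000, dtype=np.float64)
print(small.nbytes, large.nbytes)  # 4000 8000
```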

3. Bool: This data type is used to represent boolean values, which can be either True or False. The dtype for boolean values is called bool.

For Example

import numpy as np
# np.bool_ works across NumPy versions; the plain np.bool alias is missing from 1.24 up to 2.0
a = np.array([True, False, True], dtype=np.bool_)

print(a.dtype)  # Output: bool
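
Boolean arrays are most often produced by comparisons and then used as masks to select elements. A minimal sketch, with values and mask as illustrative names:

```python
import numpy as np

values = np.array([3, 8, 1, 9, 4])

# A comparison returns an array with the bool dtype.
mask = values > 3
print(mask.dtype)      # bool

# Use the mask to keep only the elements where it is True.
print(values[mask])    # [8 9 4]

# True counts as 1, so sum() gives the number of matches.
print(mask.sum())      # 3
```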

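4. String: This data type is used to represent strings of characters. NumPy stores strings in fixed-width fields, so the dtype is written as U followed by the maximum length in characters; for example, <U5 holds Unicode strings of up to 5 characters. When the strings vary widely in length, the object dtype is often used instead so that each element can have its own length.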

For Example

import numpy as np
a = np.array(["hello", "world", "numpy"], dtype=np.str_)

print(a.dtype)  # Output: <U5 (unicode string with length up to 5)
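
Because the width is fixed, a longer string assigned later is cut off without any warning. The sketch below shows this and one possible way around it, converting to a wider string dtype first. The names words and wide are illustrative.

```python
import numpy as np

words = np.array(["hello", "world", "numpy"])

# Assigning a longer string silently truncates it to 5 characters.
words[0] = "arrays and dtypes"
print(words[0])        # array

# Convert to a wider fixed-width dtype before storing longer text.
wide = words.astype("U20")
wide[0] = "arrays and dtypes"
print(wide[0])         # arrays and dtypes
```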

5. Object: This data type is used to represent any Python object. It is often used to create arrays of mixed types or arrays of arrays.

For Example

import numpy as np
# The np.object alias was removed in NumPy 1.24; the built-in object works everywhere
a = np.array([1, "two", [3, 4]], dtype=object)

print(a.dtype)  # Output: object
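
An object array stores references to ordinary Python objects, so each element keeps its own Python type. A short sketch, with mixed and element_types as illustrative names:

```python
import numpy as np

mixed = np.array([1, "two", [3, 4]], dtype=object)

# The array is one-dimensional; the inner list is stored as a single element.
print(mixed.shape)     # (3,)

# Each element is still a regular Python object.
element_types = [type(item).__name__ for item in mixed]
print(element_types)   # ['int', 'str', 'list']
```

Operations on object arrays fall back to Python-level code for each element, so they are usually slower than operations on numeric dtypes.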

6. Complex: This data type is used to represent complex numbers. Each complex value is stored as a pair of floats for the real and imaginary parts, so the precision follows the float subtype used: complex64 holds two 32-bit floats (64 bits in total) and complex128 holds two 64-bit floats (128 bits in total).

For Example

import numpy as np
a = np.array([1+2j, 2+3j, 3+4j], dtype=np.complex64)
b = np.array([10+20j, 20+30j, 30+40j], dtype=np.complex128)

print(a.dtype)  # Output: complex64
print(b.dtype)  # Output: complex128
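
To see how a complex dtype relates to the float subtypes, the following sketch looks at the real and imaginary parts of a complex64 array. The names z and magnitudes are illustrative.

```python
import numpy as np

z = np.array([1 + 2j, 3 + 4j], dtype=np.complex64)

# Each element takes 8 bytes: two 32-bit floats.
print(z.itemsize)      # 8
print(z.real.dtype)    # float32
print(z.imag)          # [2. 4.]

# The absolute value of a complex number is its magnitude.
magnitudes = np.abs(z)
print(magnitudes.dtype)  # float32
```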

7. Datetime: This data type is used to represent dates and times. NumPy provides several subtypes of datetime dtype, depending on the precision of the time measurement. For example, datetime64[D] represents dates with day-level precision, while datetime64[ns] represents dates with nanosecond-level precision.

For Example

import numpy as np
a = np.array(["2022-01-01", "2022-02-01", "2022-03-01"], dtype=np.datetime64)
b = np.array(["2022-01-01T00:00:00", "2022-02-01T01:00:00", "2022-03-01T02:30:00"], dtype=np.datetime64)

print(a.dtype)  # Output: datetime64[D]
print(b.dtype)  # Output: datetime64[s]
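
In the example above NumPy infers the unit from the strings. The unit can also be given explicitly, as in this sketch; days, months, and next_day are illustrative names.

```python
import numpy as np

# Request day-level precision explicitly.
days = np.array(["2022-01-01", "2022-02-01"], dtype="datetime64[D]")
print(days.dtype)      # datetime64[D]

# Converting to a coarser unit drops the finer detail.
months = days.astype("datetime64[M]")
print(months)          # ['2022-01' '2022-02']

# Adding a timedelta64 shifts every date.
next_day = days + np.timedelta64(1, "D")
print(next_day[0])     # 2022-01-02
```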

8. Timedelta: This data type is used to represent time durations. It is similar to datetime, but instead of representing a specific point in time, it represents the difference between two points in time. The dtype for timedelta values is called timedelta64, and it supports the same units as datetime64, such as timedelta64[D] for days.

For Example

import numpy as np
a = np.array(["2022-01-01", "2022-02-01", "2022-03-01"], dtype=np.datetime64)
b = np.array(["2022-01-02", "2022-02-03", "2022-03-04"], dtype=np.datetime64)

delta = b - a

print(delta.dtype)  # Output: timedelta64[D]
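
A timedelta64 array usually needs to be turned into plain numbers before further calculation. One possible way, sketched below, is to cast it to an integer dtype or to divide it by a unit-sized timedelta64. The names start, end, n_days, and hours are illustrative.

```python
import numpy as np

start = np.array(["2022-01-01", "2022-02-01"], dtype="datetime64[D]")
end = np.array(["2022-01-02", "2022-02-03"], dtype="datetime64[D]")
delta = end - start

# Cast to integers: the values are counts of the array's unit (days here).
n_days = delta.astype("int64")
print(n_days)          # [1 2]

# Divide by a unit to express the durations in another unit as floats.
hours = delta / np.timedelta64(1, "h")
print(hours)           # [24. 48.]
```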

9. Void: This data type is used to represent structured data. A structured data type consists of a sequence of named fields, each with its own data type. Each element of a structured array is a fixed-size record whose scalar type is np.void, and the kind code of the dtype is V.

For Example

import numpy as np
# Define a structured data type
dt = np.dtype([("name", np.str_, 16), ("age", np.int32)])

# Create an array of structured data
a = np.array([("Alice", 25), ("Bob", 30), ("Charlie", 35)], dtype=dt)

print(a.dtype)  # Output: [('name', '<U16'), ('age', '<i4')]
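
The named fields are what make a structured array useful: each field can be read as a regular array. A minimal sketch, with people and older as illustrative names:

```python
import numpy as np

dt = np.dtype([("name", "U16"), ("age", np.int32)])
people = np.array([("Alice", 25), ("Bob", 30), ("Charlie", 35)], dtype=dt)

# The field names are stored on the dtype.
print(dt.names)                # ('name', 'age')

# Indexing with a field name returns that column as an array.
print(people["age"].mean())    # 30.0

# Fields can be combined with boolean masks.
older = people[people["age"] > 26]["name"]
print(older)                   # ['Bob' 'Charlie']
```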

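10. Unicode: This data type is used to represent Unicode strings. It is the same fixed-width string type shown under String above: its scalar type is np.str_ and its type code is U. The number after U is the maximum length in characters, and it can be set explicitly.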

For Example

import numpy as np
# np.unicode is not available in current NumPy; use np.str_ or a "U" code with a length
a = np.array(["hello", "world", "numpy"], dtype="U10")

print(a.dtype)  # Output: <U10

Short Codes for Data Types in NumPy

NumPy groups its data types into kinds, each identified by a one-character code. These codes appear in the kind attribute of a dtype and, combined with a size in bytes (for example i4 or f8), in the strings accepted when specifying the data type of an array or when converting data with the astype() method. Here are the kind codes:

  1. b: Represents boolean data type
  2. i: Represents signed integer data type
  3. u: Represents unsigned integer data type
  4. f: Represents floating-point data type
  5. c: Represents complex floating-point data type
  6. m: Represents timedelta data type
  7. M: Represents datetime data type
  8. O: Represents object data type
  9. S: Represents fixed-length byte string data type
  10. U: Represents Unicode string data type
  11. V: Represents void data type (raw bytes, used for structured data)

Note that a few of these letters mean something different when passed on their own as a type string: np.dtype("b") is a signed 8-bit integer and np.dtype("c") is a 1-byte character. Use "?" for boolean and a sized code such as "c8" or "c16" for complex.
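
A quick way to check which code belongs to which type is to read the kind attribute of a dtype. The sketch below builds a small lookup table; the kinds dictionary is a helper name we introduce for the example.

```python
import numpy as np

type_names = (
    "bool", "int16", "uint8", "float64", "complex128",
    "timedelta64[D]", "datetime64[D]", "object", "S5", "U5",
)

# Map each type name to its one-character kind code.
kinds = {name: np.dtype(name).kind for name in type_names}
for name, kind in kinds.items():
    print(kind, name)

# As a type string, a bare "b" is a signed byte and "?" is boolean.
print(np.dtype("b"))   # int8
print(np.dtype("?"))   # bool
```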

What is astype() in NumPy?

In NumPy, astype() is a method used to convert the data type of a NumPy array. It creates a new array with the same data but with a different data type.

The astype() method is called on a NumPy array object, and its main argument is the target data type. That argument can be a dtype object, a Python or NumPy type such as float or np.float32, or a string code such as "f4".

Example of how to use astype():

import numpy as np
a = np.array([1, 2, 3])

# Convert to float data type
b = a.astype(float)

print(b.dtype)  # Output: float64

In this example, we created a NumPy array a with integer values and then used the astype() method to convert it to a float data type. The resulting array b has the same data as a, but the data type is now float64.

It's important to note that astype() creates a new array with the specified data type, and the original array remains unchanged. Therefore, if you want to change the data type of an array permanently, you need to assign the result of astype() back to the original array or to a new variable.

import numpy as np
a = np.array([1, 2, 3])

# Convert to float data type
a = a.astype(float)

print(a.dtype)  # Output: float64
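
Converting in the other direction can lose information. As a small sketch, casting floats to integers drops the fractional part, and the original array is left untouched. The names prices, as_int, and as_text are illustrative.

```python
import numpy as np

prices = np.array([1.9, -2.7, 3.2])

# Float to integer conversion truncates toward zero; it does not round.
as_int = prices.astype(np.int64)
print(as_int)          # [ 1 -2  3]

# Numbers can also be converted to fixed-width strings.
as_text = as_int.astype(str)
print(as_text)         # ['1' '-2' '3']

# The source array keeps its dtype and its values.
print(prices.dtype)    # float64
```

If rounding is what you want, one option is to call np.round() on the array before astype().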

What is the difference between dtype and astype() in NumPy?

In NumPy, dtype and astype() are related but distinct concepts.

dtype refers to the data type or format of the elements in a NumPy array. It describes how the bytes in the fixed-size block of memory corresponding to each element of the array should be interpreted. dtype can be specified when creating a NumPy array using the dtype parameter of the np.array() function.

astype() is a method used to convert the data type of a NumPy array. It creates a new array with the same data but with a different data type. The astype() method is called on a NumPy array object, and its main argument is the target data type, given as a dtype object, a type such as float, or a string code.

In summary, dtype is the property that defines the data type of an array, while astype() is the method that creates a new array with a different data type. In other words, dtype is used to specify the data type of an array, and astype() is used to change the data type of an existing array.

 

 

 

