Arrays Vs. Lists in Python

In the exploration of Python, I discovered a subtle but interesting difference between Arrays and Lists in Python.

Image result for python difference between array and list

Arrays and lists are both used in Python to store data, but they don’t serve exactly the same purposes. They both can be used to store any data type (real numbers, strings, etc), and they both can be indexed and iterated through, but the similarities between the two end there.

The array.array type is just a thin wrapper on C arrays. It can hold only homogeneous data, all of the same type, and so it uses only sizeof(one object) * length bytes of memory. Mostly, you should use it when you need to expose a C array to an extension or a system call (for example, ioctl or fctnl).

This is how you’d define an array:

import array
x = array.array('i', [1, 2, 3])

The array module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character. Like the 'i' type code corresponds to Python’s int and 'b' type code corresponds to char etc.

array.array is also a reasonable way to represent a mutable string in Python 2.x (array('B', bytes)). However, Python 3.x offers a mutable byte string as bytearray.

Python Lists, on the other hand, are very flexible and can hold completely heterogeneous, arbitrary data, and they can be appended to very efficiently, in amortized constant time. If you need to shrink and grow your array time-efficiently and without hassle, they are the way to go. But they use a lot more space than C arrays.

It does take an extra step to use arrays because they have to be declared while lists don’t because they are part of Python’s syntax, so lists are generally used more often between the two, which works fine most of the time.

However, if you want to do the math on a homogeneous array of numeric data, then you’re much better off using NumPy, which can automatically vectorize operations on complex multi-dimensional arrays.

To make a long story short: array.array is useful when you need a homogeneous C array of data for reasons other than doing the math, like you may consider using arrays if you’re storing a large amount of data since arrays will store your data more compactly and efficiently.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.