Arrays — Cost of flexible size

February 3, 2022

6 mins

command

How flexible are programming languages and and at what costs!

by Deepakumar M,

Let’s see the flexibility provided by programming languages and their associated costs.

Declaring arrays without size

Most of the programming languages supports initializing arrays without defining the size. For example, vectors in C++, arrays in many interpreted languages. So what’s special with this flexibility? Since the size is not provided explicitly, what’s the size to be set initially?

That’s a technique used in many places. Start with smaller value, grow very fast upto a threshold and then grow slowly. So for our array, initially the language compiler/interpreter will set the size as 1. When the size is insufficient it will just double the size(varies between languages). Based on your free memory and programming language, threshold differs. Lets create an array without defining the size and see what happens when we insert 5 elements.

Initially this is how our array will be.

Initial Array — Visualizing the Initial array

After inserting 10,

Visualizing the array after inserting one element

Now let’s insert one more element — 20. There is no enough memory and so the size has to be doubled now. But doubling the size is not simple always. Say your compiler has allocated the memory for the array where there is only space for storing one element and the memory cells next to it are already filled.

Here are steps that the compiler will do now.

Find a address(referred as base address) where 2 elements can be contiguously stored.
Point arr to the base address found.
Copy the data already stored in the older memory locations(first element 10 in our case) to the new memory locations.
Initialize default value for the new memory locations(specific to programming languages).

So after inserting 20 our array will be,

Visualizing the array after inserting 2 elements

Let’s insert 30 — again the size will be doubled now. This makes the array size to be 4 now.

Visualizing the array after inserting 3 elements

After inserting 40,

Visualizing the array after inserting 4 elements

Finally let’s add one more element 50.

Visualizing the array after inserting 5 elements

So now we can insert 3 more elements without doing any additional work. The same set of steps is applicable when we push the elements into the front of the array. Why can’t we just increase the size of array by 1 at each step? — if we do so, these steps has to be done every time when we push an element to the array which can be avoided.

Cost of flexible size

So far one point is very clear — defining the size of array while initializing it is always a best choice as it simplifies the work for the language interpreters. It is obvious that for inserting one element, the time complexity will be O(1) in the best case, O(N) in the worst case where N is the number of elements in the array before insertion. Since the complexity is not same for all insertions, we would do the amortized time complexity analysis here. In that way, when we insert N elements into an array, the amortized time complexity for inserting one element into the array will be O(1). We can discuss more on this later, may be as a separate blog.

Many compiled languages intentionally provides the flexibility with different data types. This would easily enforce the developer to use them only in places where they are necessary. For example, vectors in C++. In some interpreted languages you can’t see much difference in performance. That’s because the way the arrays were implemented in that language were different. Still it is good to define the size while initialization.

It’s clear that in arrays, the cost of flexible size over predefined size are unwanted I/O operations and extra CPU cycles.

How can we avoid this cost?

Initialize arrays with the size whenever possible.
When the random access is not necessary, prefer other data structures like linked lists, hash, set etc.

Of course there are places where the size of the array can’t be found upfront where we need worry about this cost factor.

More real world examples

When we visualize this as a technique, this can be applied in many places. Let’s discuss few scenarios where it is used or it can be used.

In Software applications, while performing the load testing, how could we find the maximum no. of requests per second that can be handled by the server and when should we stop increasing the load? We can’t directly try out a big value which would simulate a DOS attack. Let’s apply the technique we did on arrays. Start with a very small no. of requests — assume 10RPS(request per second). Monitor the average response time for 10RPS. Let’s increase the load by a multiplication factor of 3–30RPS, 90RPS, 270RPS, 810RPS and so on. Whenever the response time increases, that means the server is getting saturated by the requests. We should stop increasing the request at 3x rate. Cool! We have found out where to stop. From there we can increase slowly- may be we can increase 10RPS each time. 810RPS to 820RPS, to 830 RPS and so on.
Finding the appropriate congestion window size (no. of the packets) in computer networks — TCP congestion control.
Filling up a tank with petrol whose capacity is unknown. Say petrol is available in 1 liter bottles. Pour 1 bottle, then 2 bottles, 4 bottles, 8 bottles so on. When the tank reaches half of its capacity slow down the pouring speed(may be 1 bottle at a time). Considering petrol just because people don’t worry about the overflow if it’s water.
Finding the no. of tea spoons of salt to be added in the food when we cook for the very first time(though not a perfect analogy).

We might come up with different techniques and they might be appropriate in some situations too. For example like our brain we can learn initially and we can come up with a approximate values which is very much close. Like how we are guessing the capacity of a tank or no. of spoons of salt needed for the food, just by looking at them. Choosing between the techniques depends on the scenarios and its purely a personal choice even in programming languages .

With the clarity of understanding the difference between declaring arrays with or without size, in our next blog let’s discuss how in many interpreted languages, different data types can be added to an array without compromising the random access.

We at CaratLane are solving some of the most intriguing challenges to make our mark in the relatively uncharted omnichannel jewellery industry. If you are interested in tackling such obstacles, feel free to drop your updated resume/CV to careers@caratlane.com!

How flexible are programming languages and and at what costs!

Declaring arrays without size

Cost of flexible size

How can we avoid this cost?

More real world examples

Discussions

Leave a Reply Cancel reply