Tuesday, November 12, 2019

Database: How Indexing Works

You must have heard the word indexing in DB ? anywhere else ? Every place is our general life is filled with word indexing.

1. Indexing of Books
2. Indexing of Pages
3. Indexing of houses
4. Indexed diaries and etc...

Why do we index things around us ?

The simple answer is, we want to keep things in order to search easily and fast..

Same applies to database... database index does the same, it takes indexed columns or primary keys to sort the elements in ordered manner then start storing elements around that key pointed pages or buckets.

Let's take example of google map, if all addresses are stored in DB, how will those be stored ?

The search key would be area name + city name + country name, so all countries ->  cities -> areas are stored in ordered manner in DB after that all addresses belonging to that area will be stored somewhere in the memory block which will be somehow pointed by this combination  = area name + city name + country name.

Now any person searching for any address, when it will provide the area, city and country name combination -> all addresses of that area will be accessible immediately.

Now it can easily traverse through the area's addresses to get exact address details.

The question you might be thinking when I wrote 

all entries belongs to that area will be stored in a memory block which will be somehow pointed by this combination

How memory block is pointed by the combination ?

There are multiple techniques:

1. Ordered way
2. Hashing way.

In the sense, you have assigned addresses to various buckets (memory blocks) and those addresses can be assigned either in order to these combination or we can calculate the hash value ( by applying some algorithm to calculate the value) of the combination and assign the memory bucket available at that address to that combination.

So any address belonging to that combination's hash value will go in that bucket.

One major issue can happen is that in case a particular memory block getting more entries and it is filled up and overflowing, we might get trouble.

We can take care of it by many techniques, however major is double hashing. 

No comments: