Saturday, 19 April 2025

How to approach System design interview questions (HLD)

 Background

Design questions are an integral part of today's IT interview process. Design questions can generally be categorized into
  1.  High-level design (HLD) - High-level architecture design of a system
  2.  Low-level design (LLD) - Low-level class / Component design of the system

In high-level design, you would typically get a question to design a system like YouTube, Google Maps, or an e-commerce website, and you would provide the high-level architecture of the system, starting from the client all the way to the backend systems that process the request and return the response. The focus is on solving the functional use cases while also addressing non-functional ones like scalability, availability, performance, etc.

In low-level design, however, you focus on writing the classes or components that solve the problem statement. For example, if you are designing a chess game, you need to design classes for components like the Board, Player, etc., along with the methods and variables that implement their interactions. The focus is on implementing the solution while sticking to design principles like SOLID.

In this post, we will look at the common things to consider when approaching a high-level design question. We will take a specific design question and see how to approach it based on the aspects we discuss here.





How to approach System design interview questions

1. Summarize the requirements of the System and clarify open questions

The first thing to do once you know the problem statement is to summarize the requirements of the system (functional and non-functional) and clarify open questions. Once the requirements are summarized, check with the interviewer whether the list looks OK; if there is anything specific the interviewer wants to discuss, add it to the list of requirements. This builds the foundation for how you approach the question. It will help you with the following
  1. Understand all aspects of the problem that you are trying to solve
  2. You can refer to these when you are actually building your system to ensure you have addressed all the requirements you noted earlier.
  3. This shows you can systematically address the question and not jump to solutions.
When summarizing functional and non-functional requirements, also mention why you think each requirement is important.

Let's see some examples
  1. Let's say you are designing a YouTube video upload and viewing service. 
    1. Functional requirements can be 
      1. The user should be able to upload the video
      2. The user should be able to specify video metadata like title, description, thumbnail, etc.
      3. The user should be able to view the video
      4. The user should be able to generate captions, etc.
    2. Non-functional requirements can be
      1. The system should be available & fault-tolerant (no downtime)
      2. The system should be able to scale with load
      3. The system should be performant (Less video processing time, less rendering time, etc.)
      4. The system should be durable (Video once uploaded should be persisted)
With summarization, make sure you ask open questions, e.g., 
  1. Do we want to support video streaming?
  2. What is the expected system load like? Designing a system for a global user base will require a highly scalable system.
  3. Do we want to consider front-end system design (user experience) or just the backend design, etc.?

2. Take time to think about your high-level approach and design

Once you clarify the requirements, you should take some time to think through what your approach should be to implement a design that solves the problem at hand (Do not jump to giving random suggestions - that's a red flag).

A few things you can think of
  1. What components will be involved in the design (API gateway, web services, persistent storage options, event-driven communication, etc), and how will they interact with each other?
  2. Go through your non-functional requirements and see what is needed to handle those. E.g., for scalability, you might need auto-scaling infrastructure, for availability, you might need a cluster of DB nodes, for performance, you might need in-memory caches, etc.
  3. You can also think about aspects like what APIs are needed, what the API structure will look like, and what the expected inputs and outputs of the APIs are. If you have to store information, think about what the DB schema will look like.
  4. Think about what DB you need - Relational or non-relational, etc.
Standard components that you might need
  1. API Gateway: API gateway is probably the most common component in any architecture. API gateway is used for rate limiting, authentication, response caching, and data transformations.
  2. Microservices node cluster: Many modern systems use microservices rather than a monolithic architecture. You would typically want autoscaling enabled so that you can scale the microservice nodes on demand. Instead of actual server nodes, you could have a serverless architecture like AWS Lambda.
  3. Storage: You need storage options. It could be a database (relational or non-relational), a storage service like S3, etc.
  4. Event-driven system: If you need an asynchronous workflow, then you need to decouple your microservice nodes from the component that will process your request, e.g., image processing or video processing, which are expensive operations. For this, you can have a message queue system like ActiveMQ, AWS SQS, or Kafka.
  5. Caching system: To improve performance, you might want to have an in-memory caching system like Redis, Memcached, etc.
  6. Cron jobs/ batch infra: For use-cases where you need a periodic job to run, you can use a cron-based scheduler or batch job infrastructure.
  7. Monitoring/Analytics systems: Depending on the use cases, you might also need a monitoring setup for system health, system load, and security alerts in cases like a DoS attack. There could also be analytics requirements like monitoring logs, user access patterns, etc. E.g., AWS CloudWatch.
  8. CDN: Content delivery network or CDN is a set of servers at edge locations (near user locations) that are responsible for serving static content like CSS, images, etc., to reduce the latency.
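To make the rate limiting done by an API gateway more concrete, here is a minimal sketch of one common approach, a token bucket. The class name, its parameters, and the injectable clock are assumptions made for illustration; a real gateway would normally provide this as configuration rather than code you write yourself.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter; names and defaults are illustrative."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity              # maximum burst size
        self.refill_per_sec = refill_per_sec  # tokens added back per second
        self.clock = clock                    # injectable so the sketch is testable
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Top up tokens for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_per_sec=1, clock=lambda: fake_now[0])
print([bucket.allow() for _ in range(3)])  # [True, True, False]
fake_now[0] = 1.0                          # one second later, one token is back
print(bucket.allow())                      # True
```

In an interview it is usually enough to explain the idea: each client gets a bucket, requests spend tokens, and tokens refill at a fixed rate.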
Concepts that you need to consider for the design
  1. Authentication: As we saw above, the API gateway will handle authentication. You can have various authentication mechanisms like OAuth 2.0, JWT, etc.
  2. Relational vs NoSQL DB: You need to decide based on system requirements whether you need a relational DB like SQL Server, a NoSQL DB like MongoDB, or maybe something else. You can follow the general guidelines below to decide
    1. If you want to scale, like if you are dealing with big data, you would generally want to go to NoSQL, as relational DB might be difficult to scale. You can argue that you can have sharding to scale the DB, and while that is a valid point, it becomes difficult to scale relational DB after a threshold.
    2. If you do not have a fixed schema, then again, NoSQL is the way to go.
    3. If you need low latency, then again, NoSQL is the way to go.
    4. If you need ACID properties, then you should use a relational DB. For example, if you are designing a payment system, you should use a relational DB because you need strong consistency. If you are OK with eventual consistency, you can go with NoSQL.
    5. If your access patterns are known up front, a NoSQL store modeled around those queries can work well; if you need flexible ad hoc queries and joins, a relational DB is usually the better fit.

  3. CAP theorem: The CAP theorem, also known as Brewer's theorem, states that a distributed system can guarantee at most two out of three properties: Consistency, Availability, and Partition Tolerance. Since network partitions cannot be ruled out in practice, the real trade-off is usually between consistency and availability while a partition lasts. Designers must therefore prioritize the guarantees that matter most for their specific needs.
  4. ACID properties: ACID is an acronym that stands for atomicity, consistency, isolation, and durability. Together, ACID properties ensure that a set of database operations (grouped together in a transaction) leaves the database in a valid state even in the event of unexpected errors. Further, ACID transactions provide the level of transactional guarantees that many regulatory agencies require.
  5. DB concepts: If you do get into DB architecture discussions, you might need the following concepts handy
    1. Sharding
    2. Partitioning
    3. Indexes
    4. Batch writes
    5. Read replicas
  6. API design: You might have to design the APIs as well, for which you need to consider:
    1. API type (Socket communication / REST APIs)
    2. Versioning
    3. Pagination
    4. API protocol (HTTP / HTTPS)
    5. API timeout 
    6. Rate limiting, etc.
  7. UX concepts
    1. Server-side rendering vs client-side rendering: If SEO (search engine optimization) is a use case, then we need server-side rendering, as the first render should contain all the data, which is not the case with client-side rendering.
    2. Debouncing: Call the API or a function only after a delay, so that if a user mistakenly clicks the submit button 3 or 4 times, the call is made only once.
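Of the DB concepts listed above, sharding is the one interviewers most often ask you to explain. Below is a minimal sketch of hash-based shard routing. The function name shard_for and the key format are assumptions made for illustration, not part of any particular database.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a record key to a shard index using a stable hash (illustrative helper)."""
    # hashlib gives the same digest in every process, unlike the built-in hash(),
    # which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


user_ids = ["user-1", "user-2", "user-3", "user-4"]
for user_id in user_ids:
    print(user_id, "-> shard", shard_for(user_id, num_shards=4))
```

Note that with plain modulo routing, changing the number of shards remaps most keys, which is why consistent hashing is often brought up as a follow-up.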

3. Implement your design

Once you have thought about your design, you can start implementing it. You can do this on a whiteboard or in a Word doc (whatever is available). You should consider the following at this stage
  1. While suggesting a component, also give the reason why you are suggesting it and what use case it solves. Be as vocal as you can and constantly communicate your thought process.
  2. You can expect probing questions that check whether you understand the depth of the solution you are suggesting. So, do not suggest anything that you might have heard of but are not sure how it actually works or how it helps solve the use case at hand. You should be confident enough to back up your answers.
  3. Pause periodically during the discussion to check if the interviewer has any questions about what has been suggested or discussed so far.
  4. Be prepared for course corrections. The interviewer may ask what you would do if a requirement changes, and you should be able to adjust your approach to align with the new goal. This is again to check that you can make the necessary course corrections when needed.

4. Summarize your solution

Once implementation is done, go through your initial notes on functional and non-functional requirements and summarize how your system design handles them. 

NOTES:
  1. Design discussion is all about open discussion, unlike problem solving, where you have a specific solution/ direction for the problem.
  2. There can be more than one solution to the problem, so just be honest, confident, and vocal about your answers. Justify and back up your answers.
  3. Never make silent assumptions or jump to solutions. Clarify all the questions upfront. If you are suggesting something based on an assumption, mention it upfront as well.
  4. Do not propose or suggest anything that you are not aware of or cannot justify. If you suggest something but cannot justify it, it will be considered a red flag. It's ok to say if you are not aware of something specific, like how DB indexes work or how sharding works. 

Related Links

Friday, 11 April 2025

Understanding Global Interpreter Lock (GIL) in Python

 Background

If you have been using Python, you must have come across the term GIL or the fact that Python is primarily a single-threaded language. In this post, we will see what the Global Interpreter Lock (GIL) in Python is.



Understanding Global Interpreter Lock (GIL) in Python

The GIL, or Global Interpreter Lock, is a lock (mutex) in CPython that allows only one thread at a time to hold control of the Python interpreter. This means that only one thread can be executing Python code at any given moment, so the threads of a single process cannot use more than one CPU core at a time.

This essentially prevents Python's internal memory from being corrupted: the GIL ensures there are no dangling pointers or memory leaks caused by concurrent updates. Think of it as the equivalent of requiring a lock on each Python object before accessing it. If you did that on your own, it could cause deadlocks; the GIL prevents those, but it essentially makes Python single-threaded.

However, note that even with the GIL there can be race conditions, because not all operations are atomic. Let's say we have a method that takes an object and writes it at the end of an array. Thread 1 can enter the method, read the index at which it needs to add the new object, and then get suspended (releasing the GIL) before the write happens. Thread 2 gets the GIL, does the same for another object, and is suspended. When thread 1 comes back, it writes to the same index thread 2 wrote to, which overwrites the data and causes a race condition.
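Because the GIL does not make read-modify-write sequences atomic, shared state still needs its own lock. The following is a minimal sketch of that idea; SafeCounter and run_workers are hypothetical names used only for this illustration.

```python
import threading


class SafeCounter:
    """Illustrative counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a read, an add and a write; the lock makes
        # sure no other thread runs between those steps.
        with self._lock:
            self.value += 1


def run_workers(counter, num_threads, increments_per_thread):
    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


counter = SafeCounter()
run_workers(counter, num_threads=4, increments_per_thread=10_000)
print(counter.value)  # 40000
```

Without the lock the final value is not guaranteed to be 40000, although whether you actually observe lost updates depends on the interpreter version and on timing.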

How does GIL prevent memory corruption? 

Python does memory management by keeping a count of references to each object. When this count reaches 0, the memory held by that object is freed.

Let's see an example of this
import sys

a = ["A", "B"]
b = a
print(f"References to a : {sys.getrefcount(a)}")

The output is: References to a : 3
It is 3 because one reference is a, the second is b, and the third is a temporary reference created when the object is passed as an argument to sys.getrefcount. When the count reaches 0, the memory associated with this list object is released.
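To see the count move in both directions, here is a small sketch that compares counts instead of relying on absolute numbers, since the absolute value reported by sys.getrefcount can differ between interpreter versions.

```python
import sys

a = ["A", "B"]
before = sys.getrefcount(a)

b = a                          # a second name for the same list object
after_alias = sys.getrefcount(a)

del b                          # drop the extra reference again
after_del = sys.getrefcount(a)

print(after_alias - before)    # 1
print(after_del - before)      # 0
```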

Now if multiple threads were allowed then we could have two threads simultaneously increasing or decreasing the count. This can lead to
  1. There are no actual references to the object but the count is 1 due to race condition. This is essentially a memory leak i.e. the object is not referenced but cannot be garbage collected as well due to bad reference count.
  2. There are still references to objects but the count is 0 due to race conditions and Python frees the object memory. This will lead to a dangling pointer.
One way to handle the above case is to lock each object in Python before its reference count is updated. But as we know, locking comes with drawbacks like deadlocks: if two threads each hold a lock that the other one needs, neither can proceed. It can also hurt performance, as the threads will frequently acquire and release locks.

The alternative is the GIL: a single lock on the interpreter itself. Any thread that needs to execute code must acquire the GIL to be able to run it via the interpreter. This prevents deadlocks but essentially makes Python single-threaded.


NOTE: Many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. 
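A quick way to convince yourself of this is to overlap a few blocking waits in threads. In the sketch below, time.sleep stands in for a blocking I/O call (it releases the GIL while waiting); timed_overlap is a hypothetical helper written for this illustration, and the exact timing will vary by machine.

```python
import threading
import time


def fake_io(seconds):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting.
    time.sleep(seconds)


def timed_overlap(num_threads, seconds):
    """Run num_threads blocking waits in parallel threads and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(seconds,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


# Run one after another, five 0.2s waits would take about 1 second;
# overlapped in threads they take roughly as long as a single wait.
print(f"5 overlapping waits took {timed_overlap(5, 0.2):.2f}s")
```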

Related Links

Monday, 31 March 2025

Understanding Bisect module in Python

 Background

You would have often come across the use case of finding an element in a sorted list. Typically you would use binary search, which takes O(logN). Python provides a module that does this for us, called bisect. The methods in this API allow us to find the position in a sorted list where a new element can be inserted to keep the list sorted. In this post, we will explain how to use this module.


Understanding Bisect module in Python

Let's try to see the method bisect_left that gives us a position in a sorted list where a new element can be inserted to keep the list in sorted order. The syntax is
  • bisect_left(list, element , low, high)
The arguments are
  • list - the sorted list in which we want to find the insertion position
  • element - the element to be inserted
  • low - the index at which to start the search (optional, defaults to 0)
  • high - the index at which to stop the search, exclusive (optional, defaults to the length of the list)
See the "Related links" section at the end for the official documentation link.
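The low and high arguments are easiest to understand with a small example. This sketch passes them positionally and shows how restricting the search window changes the result.

```python
import bisect

nums = [1, 3, 5, 9, 10, 15]

# Search the whole list (low and high take their defaults).
print(bisect.bisect_left(nums, 4))        # 2

# Only consider nums[3:6] == [9, 10, 15]; 4 would go before 9.
print(bisect.bisect_left(nums, 4, 3, 6))  # 3

# Only consider nums[0:2] == [1, 3]; 4 is larger than both.
print(bisect.bisect_left(nums, 4, 0, 2))  # 2
```

The returned index is always relative to the full list, not to the window.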

Now let's see an example
import bisect

# Named nums rather than list to avoid shadowing the built-in list type
nums = [1, 3, 5, 9, 10, 15]
idx = bisect.bisect_left(nums, 6)
print(f"Index to insert 6 is {idx}")
nums.insert(idx, 6)
print(f"New list {nums}")

Above code prints: 
Index to insert 6 is 3
New list [1, 3, 5, 6, 9, 10, 15]

You can also do both of the steps shown above, finding the index that keeps the list sorted and inserting the element there, with a single API provided by bisect. The API is
  • insort_left(list, element)
Let's see how we can use above API to achieve the same result
import bisect

nums = [1, 3, 5, 9, 10, 15]
# Finds the position and inserts in place in one call
bisect.insort_left(nums, 6)
print(f"New list {nums}")

Output is: New list [1, 3, 5, 6, 9, 10, 15]

NOTE: bisect_left, as the name suggests, gives the left-most possible index to insert at, whereas a similar API called bisect_right gives the right-most possible index. Similar to insort_left, we also have insort_right, which does an in-place insertion of the given element after any existing equal elements.

Let's see an example of the above
import bisect

nums = [1, 3, 5, 5, 5, 9, 10, 15]
idx = bisect.bisect_left(nums, 5)
print(f"Left most index to insert 5 is {idx}")
idx = bisect.bisect_right(nums, 5)
print(f"Right most index to insert 5 is {idx}")

Above code prints:
Left most index to insert 5 is 2
Right most index to insert 5 is 5


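NOTE: The time complexity of bisect_left and bisect_right is O(logN), the same as binary search. insort_left and insort_right also find the position in O(logN), but the insertion into the list itself takes O(N).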


We can actually use this API for multiple use-cases, let's see them below

1. Binary Search

As we saw above, you can use bisect to implement binary search.
import bisect


def binary_search(arr, element, low, high):
    # bisect_left returns the left-most position where element could be inserted
    idx = bisect.bisect_left(arr, element, low, high)
    # The element is present only if that position is inside the searched range and holds it
    if idx < high and arr[idx] == element:
        return idx
    return -1


arr = [1, 3, 5, 7, 12, 20]
search_idx = binary_search(arr, 7, 0, len(arr))
print(f"Binary search index for 7 : {search_idx}")


Output is: Binary search index for 7 : 3

2. Prefix search

If your list contains only lowercase strings in sorted order, we can use bisect for prefix search as well:
import bisect


def prefix_search(arr, prefix):
    # The first word that is >= prefix is the only candidate for a match
    idx = bisect.bisect_left(arr, prefix)
    if idx >= len(arr):
        return None
    el = arr[idx]
    return el if el.startswith(prefix) else None


arr = ["apple", "cat", "dog", "elephant"]
print(prefix_search(arr, "ap"))

Output is: apple
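The function above returns only the first match. If you want every word with the prefix, one possible extension is to keep walking forward from the insertion point; all_with_prefix is a hypothetical helper written for this illustration.

```python
import bisect


def all_with_prefix(words, prefix):
    """Return every word in the sorted list that starts with prefix."""
    matches = []
    # In a sorted list all words sharing a prefix sit next to each other,
    # starting at the left-most insertion point for the prefix itself.
    idx = bisect.bisect_left(words, prefix)
    while idx < len(words) and words[idx].startswith(prefix):
        matches.append(words[idx])
        idx += 1
    return matches


words = ["apple", "apricot", "banana", "cat", "dog"]
print(all_with_prefix(words, "ap"))  # ['apple', 'apricot']
print(all_with_prefix(words, "z"))   # []
```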

3. Find no of repeating values 

If you have a sorted array and you want to find the number of times a number is repeated, we can use bisect again. Note that since the array is sorted, all occurrences of the number will be adjacent.
import bisect


def count_repeated_no(arr, no):
    # All copies of no sit between the left-most and right-most insertion points
    left_idx = bisect.bisect_left(arr, no)
    right_idx = bisect.bisect_right(arr, no)
    # The difference is 0 when the number is not present at all
    return right_idx - left_idx


# The input must be sorted for bisect to work correctly
arr = [1, 2, 5, 5, 5, 5, 9, 10]
print(f"Count of 5 in array {count_repeated_no(arr, 5)}")
Output: Count of 5 in array 4

Performance

Note that this API is really fast.
  1. First, it uses binary search, so its time complexity is O(logN).
  2. Second, it is implemented in C, so it's faster than implementing it yourself (see the SO question comparing it with direct lookup in a list).

Related Links

Wednesday, 26 March 2025

SOLID principles in Object Oriented Programming

 Background

In this post, we will look at the 5 SOLID design principles used in object-oriented software design. These principles are
  • S: Single Responsibility Principle (SRP)
  • O: Open/Closed Principle
  • L: Liskov’s Substitution Principle (LSP)
  • I: Interface Segregation Principle (ISP)
  • D: Dependency Inversion Principle (DIP)
Whenever you write new code or design new software, ensure you respect the SOLID principles mentioned above. This helps create clean and modular code that will help you maintain and scale it in the future. Let's see each of these in detail.




Single Responsibility Principle (SRP)

This principle states that - "A class should have a single responsibility".

Let us take an example to understand this better. Let's assume you are building software for a restaurant and you create a class called Worker which has methods like prepare_food & serve_food.
class Worker:
    def prepare_food(self):
        print("Preparing Food")

    def serve_food(self):
        print("Serving Food")


What do you think about the above design? It's bad, but why? It's bad because the class is tightly coupled to two responsibilities, preparing food and serving food, which are independent tasks. If you change how food is prepared in the future, you also need to test and ensure that the serve_food functionality is not impacted. This unnecessary coupling makes the code hard to maintain. A better design would be to have a separate class for each
  • Class Chef - For Preparing Food
  • Class Waiter - For serving Food
That way there is separation of concern, and each class has logic for the business use case it is responsible for. It's easy to maintain & scale.
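A minimal sketch of that split could look like the following. The methods return the message instead of printing it directly, which is a small change made here only to keep the example easy to check.

```python
class Chef:
    """Only responsible for preparing food."""

    def prepare_food(self):
        return "Preparing Food"


class Waiter:
    """Only responsible for serving food."""

    def serve_food(self):
        return "Serving Food"


print(Chef().prepare_food())
print(Waiter().serve_food())
```

A change to how food is prepared now touches only Chef, and Waiter does not need to be retested.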

Open/Closed Principle

This principle states that - "A class should be open for extension but closed for modification".

Let's see an example of this as well. Let's again take our above example of writing software for a restaurant business. Let's say we are implementing a Chef class as below.
class Chef:
    def prepare_chinese_food(self):
        print("Preparing Chinese Food")




Do you see any potential problems with the above? What happens when your restaurant starts preparing continental cuisine? You would add another method called prepare_continental_food to the Chef class. This increases the number of methods in the class, as we have to add one each time we support a new cuisine. This is bad because you are modifying code that is already tested and functioning, which means adding continental cuisine support has put the original implementation for Chinese cuisine at risk.

To fix this a better way would be to extend the Chef class to support new functionality
class Chef:
    def prepare_food(self):
        print("Preparing Chinese Food")


class ContinentalChef(Chef):
    def prepare_food(self):
        print("Preparing Continental Food")


chef = ContinentalChef()
chef.prepare_food()


As you can see, you can now extend the base class Chef and add new functionality without impacting the original implementation. If adding a feature requires a lot of modification to your original code, it is likely violating this principle, so you should step back and see how you can make your code generic so that very few code changes are needed for future enhancements.

Liskov’s Substitution Principle (LSP)

This principle states - "Derived or child classes must be substitutable for their base or parent classes".

For this, we can again use the above example. What would have happened if we had implemented our Chef class a little differently?
class Chef:
    def prepare_chinese_food(self):
        print("Preparing Chinese Food")


class ContinentalChef(Chef):
    # Overrides the parent method but does something the caller does not expect
    def prepare_chinese_food(self):
        print("Preparing Continental Food")

    def prepare_continental_food(self):
        print("Preparing Continental Food")


chef = ContinentalChef()
chef.prepare_chinese_food()


This is also bad because you have overridden the prepare_chinese_food functionality to actually prepare continental food. The caller of the Chef class will call prepare_chinese_food under the impression that Chinese food will be prepared, but due to the way the above classes are implemented, it will do the unexpected thing (which is preparing continental cuisine).


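The principle says you should be able to use ContinentalChef anywhere you have used the Chef class without breaking the software contract. If ContinentalChef is really not meant to prepare Chinese food, raising an exception from the prepare_chinese_food method of the child class at least makes the failure explicit, but callers written against Chef would still be surprised, so it is usually better to rethink the class hierarchy.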

Interface Segregation Principle (ISP)

This principle states - "Do not force any client to implement an interface which is irrelevant to them".

This principle is very similar to the Single Responsibility Principle we saw above, except that it talks about interfaces rather than classes.

For example, let's say you have a Chef class and a Cuisine interface with methods like create_chinese, create_continental, etc. Instead of having a single interface definition for all the abstract methods, it's better to have separate interfaces - ChineseCuisine, ContinentalCuisine, etc. That way your concrete classes can choose to implement the ones that are relevant for their use cases.

In the real world, a chef can know both cuisines or a single cuisine. By creating a single interface, you are forcing the concrete class to implement both even if it does not support both cuisines, which is bad. A concrete class should be free to choose which interfaces to implement based on the cuisines the chef can prepare.
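Python has no interface keyword, so one way to sketch the segregated interfaces is with abstract base classes. The concrete class names ChineseChef and MultiCuisineChef are assumptions made for this illustration.

```python
from abc import ABC, abstractmethod


class ChineseCuisine(ABC):
    @abstractmethod
    def create_chinese(self):
        """Prepare a Chinese dish."""


class ContinentalCuisine(ABC):
    @abstractmethod
    def create_continental(self):
        """Prepare a continental dish."""


class ChineseChef(ChineseCuisine):
    # Implements only the interface this chef actually supports
    def create_chinese(self):
        return "Preparing Chinese Food"


class MultiCuisineChef(ChineseCuisine, ContinentalCuisine):
    # A chef who knows both cuisines implements both interfaces
    def create_chinese(self):
        return "Preparing Chinese Food"

    def create_continental(self):
        return "Preparing Continental Food"


print(ChineseChef().create_chinese())
print(MultiCuisineChef().create_continental())
```

ChineseChef is never forced to provide a create_continental method it cannot honor.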

Dependency Inversion Principle (DIP)

This principle states - "High-level modules should not depend on low-level modules. Both should depend on abstractions"


This one is actually my favorite. Let's see an example to understand this. Let's say you have a Restaurant class and it has a ChinesePreparingChef as an instance member. Tomorrow, if the chef changes to a ContinentalPreparingChef, you need to modify the Restaurant code to use the new class.

Instead of this you should define a base class called Chef and create two concrete classes - ChinesePreparingChef & ContinentalPreparingChef. In Restaurant you will just have a reference to Chef instance which can be either of the concrete classes depending on the use case.
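Here is a minimal sketch of that arrangement. The take_order method and the constructor argument are hypothetical names chosen for this illustration.

```python
from abc import ABC, abstractmethod


class Chef(ABC):
    """The abstraction that both the Restaurant and concrete chefs depend on."""

    @abstractmethod
    def prepare_food(self):
        """Prepare a dish and return a description of it."""


class ChinesePreparingChef(Chef):
    def prepare_food(self):
        return "Preparing Chinese Food"


class ContinentalPreparingChef(Chef):
    def prepare_food(self):
        return "Preparing Continental Food"


class Restaurant:
    def __init__(self, chef):
        # The restaurant only knows about the Chef abstraction,
        # not about any concrete chef class.
        self.chef = chef

    def take_order(self):
        return self.chef.prepare_food()


print(Restaurant(ChinesePreparingChef()).take_order())
print(Restaurant(ContinentalPreparingChef()).take_order())
```

Swapping the chef is now a change at the place where the Restaurant is constructed, and the Restaurant class itself stays untouched.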




Friday, 7 February 2025

Handling missing values in DataFrame with Pandas

 Background

In the last few posts, we have been seeing the basics of Pandas - Series, DataFrame, how to manipulate data, etc. In this post, we will see how to handle missing values in a DataFrame.



Handling missing values in DataFrame with Pandas

Pandas accepts all of the following values as missing data
  • np.nan
  • pd.NA
  • None
We can use the isna or notna function to detect these missing data. 
  • The isna function evaluates each cell in a DataFrame and returns True to indicate a missing value. 
  • The notna function evaluates each cell in a DataFrame and returns True to indicate a non-missing value. 
Let's try to see an example for the above:

Code:
import pandas as pd
import numpy as np

df = pd.DataFrame({"Name": ["Aniket", "abhijit", pd.NA, "Anvi"],
                   "Role": ["IT Dev", None, "IT QA", np.nan],
                   "Joining Date": [20190101, 20200202, 20210303, 20220404]})

print(df.to_string())
print(df.isna())
print(df.notna())


Output:
      Name    Role  Joining Date
0   Aniket  IT Dev      20190101
1  abhijit    None      20200202
2     <NA>   IT QA      20210303
3     Anvi     NaN      20220404

    Name   Role  Joining Date
0  False  False         False
1  False   True         False
2   True  False         False
3  False   True         False

    Name   Role  Joining Date
0   True   True          True
1   True  False          True
2  False   True          True
3   True  False          True

You can now use this boolean result to filter rows as follows

import pandas as pd
import numpy as np

df = pd.DataFrame({"Name": ["Aniket", "abhijit", pd.NA, "Anvi"],
                   "Role": ["IT Dev", None, "IT QA", np.nan],
                   "Joining Date": [20190101, 20200202, 20210303, 20220404]})

print(df[df["Role"].notna()])

Output:

     Name    Role  Joining Date
0  Aniket  IT Dev      20190101
2    <NA>   IT QA      20210303


Dropping (dropna) & replacement (fillna) of missing data

  • dropna: The dropna function is used to drop rows and columns with missing values. It takes the following arguments
    • axis - 0 for rows and 1 for columns
    • how - any for dropping if any data point is missing, all for dropping only if all data points are missing
    • thresh - the minimum number of non-missing data points a row or column must have to be kept
    • inplace - instead of returning a new modified DataFrame, does the dropping in place on the same DataFrame
  • fillna: The fillna function is used to fill missing values with some data.
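The thresh argument is the easiest one to misread, so here is a small sketch of it on its own. The sample data is made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Aniket", None, "Anvi"],
                   "Role": ["IT Dev", None, None],
                   "Joining Date": [20190101, 20200202, 20220404]})

# Keep only rows that have at least 2 non-missing values.
# Row 1 has a single non-missing value (Joining Date), so it is dropped.
kept = df.dropna(thresh=2)
print(kept.index.tolist())  # [0, 2]
```

Per the pandas documentation, thresh cannot be combined with how, so pick one of the two in a given call.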

Example for dropna:
import pandas as pd
import numpy as np

df = pd.DataFrame({"Name": ["Aniket", "abhijit", pd.NA, "Anvi"],
                   "Role": ["IT Dev", None, "IT QA", np.nan],
                   "Joining Date": [20190101, 20200202, 20210303, 20220404]})

# Drop all columns that have any missing data
print(df.dropna(axis=1, how="any"))

# Drop all rows that have any missing data
print(df.dropna(axis=0, how="any"))
Output:
   Joining Date
0      20190101
1      20200202
2      20210303
3      20220404

     Name    Role  Joining Date
0  Aniket  IT Dev      20190101

Example for fillna:
import pandas as pd
import numpy as np

df = pd.DataFrame({"Name": ["Aniket", "abhijit", pd.NA, "Anvi"],
                   "Role": ["IT Dev", None, "IT QA", np.nan],
                   "Joining Date": [20190101, 20200202, np.nan, 20220404]})

# Replace missing values with some default value
print(df.fillna({"Name": "Default Name", "Role": "Default Role"}))

# Replace missing joining date with latest/max joining date in data
print(df["Joining Date"].fillna(value=df["Joining Date"].max()))

# Forward fill the missing data
print(df.ffill(limit=1))

Output:

           Name          Role  Joining Date
0        Aniket        IT Dev    20190101.0
1       abhijit  Default Role    20200202.0
2  Default Name         IT QA           NaN
3          Anvi  Default Role    20220404.0

0    20190101.0
1    20200202.0
2    20220404.0
3    20220404.0
Name: Joining Date, dtype: float64

      Name    Role  Joining Date
0   Aniket  IT Dev    20190101.0
1  abhijit  IT Dev    20200202.0
2  abhijit   IT QA    20200202.0
3     Anvi   IT QA    20220404.0

NOTE: Previously we could pass the method argument as bfill or ffill, but that is deprecated now. Warning message: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead
