Design Hit Counter | Interview Question | Discuss different approaches
It is a popular design interview question that has been asked frequently at companies like Google, Amazon, Dropbox, Microsoft and more. A hit counter system counts the number of hits (requests) it receives.
Clarifications
- Is there a time range given for which the counts need to be aggregated?
- Or, given a timestamp, should hits only for certain minutes be counted?
- What should the API calls look like?
- Are concurrent requests made to the system?
Based on the above clarifications, this question can take different turns, so it is important to understand the interviewer's expectations. Asking these questions also shows the depth of your problem analysis.
A simpler version of the question appears as LeetCode 362. Design Hit Counter. Let's start with the following assumptions and see how the explanation changes.
Assumptions
- Timestamp is passed as a parameter.
- API calls are made in the chronological order of increasing timestamp.
- Concurrent requests must be supported.
- For simplicity, let's count the number of hits in the last five minutes.
- Timestamps start at 1; this is the earliest possible timestamp.
Example
```python
hitCounter = HitCounter()

# hit at timestamp 1
hitCounter.hit(1)

# hits at timestamp 2 (assuming multiple requests were made)
hitCounter.hit(2)
hitCounter.hit(2)

# getHits at timestamp 30 should return a count of 3
hitCounter.getHits(30)
```
Approach 1 (Using dict):
```python
from collections import defaultdict

class HitCounter:
    def __init__(self):
        """Initialize an empty timestamp -> count map."""
        self.hits = defaultdict(int)

    def hit(self, timestamp: int) -> None:
        """Receive a hit at the given timestamp."""
        self.hits[timestamp] += 1

    def getHits(self, timestamp: int) -> int:
        """Return the total number of hits in the last five minutes."""
        totalHits = 0
        for i in range(max(1, timestamp - 299), timestamp + 1):
            if i in self.hits:
                totalHits += self.hits[i]
        return totalHits
```
Hit Counter implementation in Python3 using dict
Time and space complexity for aggregating hits over the last five minutes (300 seconds):
TC = O(300) ~ O(1)
SC = O(n), where n is the number of distinct timestamps stored
Based on this analysis, the above solution is not space optimal. Old entries are never evicted, so if the number of hits and the range of timestamps grow very large, the map grows without bound.
Note: A dict is preferred here because lookups are O(1) operations. This approach could also be advantageous in case hits need to be counted between two arbitrary timestamps, as sketched below.
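For instance, here is a minimal sketch of such a range query, assuming it is added as a method to the Approach 1 class above; getHitsBetween is a hypothetical extension, not part of the original API.

```python
def getHitsBetween(self, start: int, end: int) -> int:
    """Hypothetical extension: count hits whose timestamps fall in [start, end]."""
    # Iterating the stored keys costs O(number of distinct timestamps recorded).
    return sum(count for t, count in self.hits.items() if start <= t <= end)
```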
Approach 2 (Using Heap):
```python
from heapq import heappush, heappop

class HitCounter:
    def __init__(self):
        """Initialize an empty min-heap of hit timestamps."""
        self.minHeap = []

    def hit(self, timestamp: int) -> None:
        """Receive a hit: push its timestamp onto the heap."""
        heappush(self.minHeap, timestamp)
        self.clear(timestamp)

    def getHits(self, timestamp: int) -> int:
        """Aggregate the number of hits in the last five minutes."""
        self.clear(timestamp)
        return len(self.minHeap)

    def clear(self, t):
        """Remove timestamps older than five minutes from the heap."""
        while self.minHeap and self.minHeap[0] <= t - (5 * 60):
            heappop(self.minHeap)
```
The drawback of this approach shows up with concurrent requests: every hit pushes a separate entry onto the heap, so memory grows with the total number of hits. It only helps when a single request is made at a given timestamp; with many requests sharing the same timestamp, the heap keeps growing.
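A quick illustration of that memory growth, assuming the heap-based HitCounter defined above:

```python
# A burst of hits at a single timestamp keeps one heap entry per hit.
counter = HitCounter()
for _ in range(10_000):
    counter.hit(100)            # 10,000 "concurrent" hits at timestamp 100

print(len(counter.minHeap))     # 10000 entries held in memory
print(counter.getHits(100))     # 10000
```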
Approach 3 (Best):
```python
class HitCounter:
    def __init__(self):
        # 300 circular buckets, one per second: (most recent timestamp, count)
        self.hits = [(0, 0)] * 300

    def hit(self, timestamp: int) -> None:
        """Receive a hit."""
        i = (timestamp - 1) % 300
        t, count = self.hits[i]
        if timestamp - t < 300:
            # same second as the stored timestamp: increment its count
            self.hits[i] = (timestamp, count + 1)
        else:
            # stale bucket from a previous 300-second window: reset it
            self.hits[i] = (timestamp, 1)

    def getHits(self, timestamp: int) -> int:
        """Aggregate the number of hits in the last five minutes."""
        totalHits = 0
        for bucketTimestamp, count in self.hits:
            if timestamp - bucketTimestamp < 300:
                totalHits += count
        return totalHits
```
Most Optimal solution for Hit Counter
This solution is scalable: memory stays constant at 300 buckets regardless of how many hits arrive, and it takes care of concurrent requests in the sense that multiple hits carrying the same timestamp all land in the same bucket. It is essentially an optimized version of Approach 1.
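A quick usage sketch against the example above; the expected values in the comments follow from the 300-second window defined earlier.

```python
counter = HitCounter()
counter.hit(1)
counter.hit(2)
counter.hit(2)                  # two hits share timestamp 2

print(counter.getHits(30))      # 3 -> all three hits fall within the last 300 seconds
print(counter.getHits(301))     # 2 -> the hit at timestamp 1 has aged out of the window
```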
Follow Up:
If we analyze Approach 3 further, a race condition can occur when two requests try to update the list concurrently, and only one of the two updates might survive (a lost update).
How can the race condition be solved?
Well, the obvious solution is to protect the list with a locking mechanism. A thread that is trying to update the list acquires the lock and releases it once the update succeeds; a second thread has to wait until the lock is released. This sounds like it solves the race condition.
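A minimal sketch of that idea, assuming a single coarse-grained threading.Lock wrapped around the buckets of Approach 3; a real implementation might choose a finer locking granularity.

```python
import threading

class ThreadSafeHitCounter:
    def __init__(self):
        self.hits = [(0, 0)] * 300
        self.lock = threading.Lock()        # one coarse-grained lock for all buckets

    def hit(self, timestamp: int) -> None:
        with self.lock:                     # acquired here, released when the block exits
            i = (timestamp - 1) % 300
            t, count = self.hits[i]
            if timestamp - t < 300:
                self.hits[i] = (timestamp, count + 1)
            else:
                self.hits[i] = (timestamp, 1)

    def getHits(self, timestamp: int) -> int:
        with self.lock:
            return sum(count for t, count in self.hits if timestamp - t < 300)
```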
Not so fast, though. This solution will not work efficiently when a huge volume of requests arrives with the same timestamp. Latency and performance repercussions cannot be neglected either: locking is costly and can become a bottleneck with too many simultaneous requests.
Scaling (Think distributed)
Adding servers to handle more requests and distribute the load seems like the next viable option to resolve the performance bottleneck mentioned above. Nevertheless, distributing load across multiple machines comes with added complexity.
Assume the load balancer handles the nitty-gritty of distributing and managing traffic across the available nodes; this step cannot be neglected if we want the performance benefits. Traffic could be distributed using, for example, a hash of the userId.
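A very rough sketch of that routing idea; pick_node, the md5-modulo scheme, and "user-42" are assumptions made purely for illustration, since in practice the load balancer or a consistent-hashing layer would handle this.

```python
import hashlib

def pick_node(user_id: str, num_nodes: int) -> int:
    """Map a userId to one of num_nodes hit-counter machines (illustration only)."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % num_nodes

# All hits from the same user consistently land on the same node.
node = pick_node("user-42", 4)   # deterministic index in the range [0, 4)
```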
Now each machine aggregates its own total count based on the timestamps it receives. An aggregator function then collects these numbers from all the machines and returns their sum as the final result.
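A hedged sketch of that aggregation step, where node_counters is a hypothetical list of per-machine HitCounter instances; in a real deployment these would be remote calls to each node rather than local method calls.

```python
def get_total_hits(node_counters, timestamp: int) -> int:
    """Sum the last-five-minute counts reported by every machine (sketch only)."""
    return sum(counter.getHits(timestamp) for counter in node_counters)
```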
Originally published at https://raghavnyati.ghost.io on December 8, 2020.