# Solving Find Median from Data Stream

To see the question, click **here**.

**Naive Approach**

The idea is to store the elements in a list. At each stage of adding an element, we will sort the list because the elements may not be added to the list in a sorted fashion.

```
// TC: O(nlogn) for addNum and O(1) for findMedian
// SC: O(n)
import java.util.List;
import java.util.ArrayList;
import java.util.Collections;
class FindMedianfromDataStream {
List<Integer> list;
public FindMedianfromDataStream() {
list = new ArrayList<>();
}
public void addNum(int num) {
list.add(num);
Collections.sort(list);
}
public double findMedian() {
int size = list.size();
if (size % 2 != 0) {
return list.get(size / 2);
}
Double ans = (list.get(size / 2) + list.get(size / 2 - 1)) / 2.0;
return ans;
}
public static void main(String args[]) {
FindMedianfromDataStream fmd = new FindMedianfromDataStream();
fmd.addNum(1);
fmd.addNum(2);
System.out.println(fmd.findMedian());
fmd.addNum(3);
System.out.println(fmd.findMedian());
}
}
```

**Performance**

The time complexity of the `addNum`

method is O(nlogn) due to the sorting operation performed after each addition. The time complexity of the `findMedian`

method is O(1) because it only involves accessing elements by index and performing arithmetic operations. The space complexity is O(n) because we store all the numbers in the list.

**Refined Approach**

The idea is to maintain a balanced and sorted data stream structure in real time, allowing for efficient median retrieval. The data stream is divided into two halves: a lower half (`low`

) and an upper half (`high`

). The `low`

half contains the smaller elements stored in descending order. The `high`

half contains the larger elements stored in ascending order. The `TreeMap`

data structure is used to maintain order and allow quick access to the smallest and largest elements. Each `TreeMap`

also tracks the count of each element to handle duplicates. After every insertion, the sizes of the two halves are balanced to ensure that the difference in size is at most 1. This balancing ensures that the median can be easily calculated based on the top elements of the two halves.

```
// TC: O(logn) for addNum and O(1) for findMedian
// SC: O(n) for TreeMap
import java.util.TreeMap;
import java.util.Collections;
class FindMedianfromDataStream {
private TreeMap<Integer, Integer> low;
private TreeMap<Integer, Integer> high;
private int lowSize;
private int highSize;
public FindMedianfromDataStream() {
low = new TreeMap<>(Collections.reverseOrder());
high = new TreeMap<>();
lowSize = 0;
highSize = 0;
}
public void addNum(int num) {
if (lowSize == 0 || num <= low.firstKey()) {
low.put(num, low.getOrDefault(num, 0) + 1);
lowSize++;
} else {
high.put(num, high.getOrDefault(num, 0) + 1);
highSize++;
}
balance();
}
public double findMedian() {
if (lowSize > highSize) {
return low.firstKey();
} else if (lowSize < highSize) {
return high.firstKey();
} else {
return (low.firstKey() + high.firstKey()) / 2.0;
}
}
private void balance() {
if (lowSize > highSize + 1) {
moveFirstElement(low, high);
lowSize--;
highSize++;
} else if (highSize > lowSize) {
moveFirstElement(high, low);
highSize--;
lowSize++;
}
}
private void moveFirstElement(TreeMap<Integer, Integer> from, TreeMap<Integer, Integer> to) {
int firstKey = from.firstKey();
int count = from.get(firstKey);
if (count == 1) {
from.remove(firstKey);
} else {
from.put(firstKey, count - 1);
}
to.put(firstKey, to.getOrDefault(firstKey, 0) + 1);
}
public static void main(String args[]) {
FindMedianfromDataStream fmd = new FindMedianfromDataStream();
fmd.addNum(1);
fmd.addNum(2);
System.out.println(fmd.findMedian());
fmd.addNum(3);
System.out.println(fmd.findMedian());
}
}
```

**Performance**

The time complexity for `addNum`

is O(log n) due to the insertion and potential rebalancing of the `TreeMap`

, and O(1) for `findMedian`

because it is implemented efficiently using TreeMap's firstKey/lastKey. The space complexity is O(n) as it stores each input number in the `TreeMap`

, with `n`

being the number of elements added. The `TreeMap`

ensures ordered keys, allowing efficient median finding and insertion operations.

**Efficient Approach 1**

Instead of two treemaps, we can utilize two heaps. The concept remains the same i.e., `maxHeap`

stores the smaller half of the numbers while `minHeap`

stores the larger half of the numbers. When a new number is added, it is compared with the root of `maxHeap`

. If the new number is smaller than or equal to the root of `maxHeap`

, it is added to `maxHeap`

. Otherwise, it is added to `minHeap`

. After adding the number, the sizes of the two heaps should balance. If `maxHeap`

has more than one extra element compared to `minHeap`

, the root of `maxHeap`

is moved to `minHeap`

. If `minHeap`

has more elements than `maxHeap`

, the root of `minHeap`

is moved to `maxHeap`

. If `maxHeap`

has more elements, the median is the root of `maxHeap`

. If `minHeap`

has more elements, the median is the root of `minHeap`

. If both heaps have the same size, the median is the average of the roots of `maxHeap`

and `minHeap`

.

```
// TC: O(logn) for addNum and O(1) for findMedian
// SC: O(n) for maxHeap and minHeap
import java.util.PriorityQueue;
import java.util.Collections;
class FindMedianfromDataStream {
private PriorityQueue<Integer> maxHeap; // For the smaller half
private PriorityQueue<Integer> minHeap; // For the larger half
public FindMedianfromDataStream() {
maxHeap = new PriorityQueue<>(Collections.reverseOrder()); // Max-Heap
minHeap = new PriorityQueue<>(); // Min-Heap
}
public void addNum(int num) {
if (maxHeap.isEmpty() || num <= maxHeap.peek()) {
maxHeap.add(num);
} else {
minHeap.add(num);
}
// Balance the heaps
if (maxHeap.size() > minHeap.size() + 1) {
minHeap.add(maxHeap.poll());
} else if (minHeap.size() > maxHeap.size()) {
maxHeap.add(minHeap.poll());
}
}
public double findMedian() {
if (maxHeap.size() > minHeap.size()) {
return maxHeap.peek();
} else if (maxHeap.size() < minHeap.size()) {
return minHeap.peek();
} else {
return (maxHeap.peek() + minHeap.peek()) / 2.0;
}
}
public static void main(String args[]) {
FindMedianfromDataStream fmd = new FindMedianfromDataStream();
fmd.addNum(1);
fmd.addNum(2);
System.out.println(fmd.findMedian());
fmd.addNum(3);
System.out.println(fmd.findMedian());
}
}
```

The time complexity for adding a number is O(logn) because we add the number to one of the heaps and then potentially balance the heaps, which involves removing and adding elements to the heaps. The time complexity for finding the median is O(1) because we can directly access the top elements of the heaps. The space complexity is O(n) because we store all the heaps' elements.

Thank you for reading!

Check out other DSA patterns **here**.

You can support me by **buying me a book****.**