Please see github.com/anusha-murali for all of my repositories.
This is a beautiful problem from LeetCode (Problem #2542). Although this problem is categorized as a “medium” difficulty problem, I believe it should be moved to the “hard” category.
Problem: You are given two $0$-indexed integer arrays nums1 and nums2 of equal length $n$ and
a positive integer $k$. You must choose a subsequence of indices from nums1 of length $k$.
For chosen indices $i_0, i_1, \ldots, i_{k - 1}$, your score is defined as:
The sum of the selected elements from nums1 multiplied with the minimum of the selected elements
from nums2.
It can be defined simply as:
Return the maximum possible score.
A subsequence of indices of an array is a set that can be derived from the set ${0, 1, \ldots, n-1}$ by deleting some or no elements.
Example 1:
Input: nums1 = [1,3,3,2], nums2 = [2,1,3,4], $k = 3$; Output: 12.
Explanation: The four possible subsequence scores are:
Therefore, we return the max score, which is 12.
Example 2:
Input: nums1 = [4,2,3,1,1], nums2 = [7,5,10,9,6], $k = 1$; Output: 30.
Explanation: Choosing index 2 is optimal: nums1[2] * nums2[2] = 3 * 10 = 30 is the maximum possible score.
Solution Using MinHeap
We are asked to select $k$ indices from the set ${0, 1, \ldots, n-1}$ in such a way that the sum of
integers from nums1 corresponding to those $k$ indices multiplied by the minimum value among the $k$ integers from nums2 corresponding to those $k$ indices is the maximum possible.
There are $\binom{n}{k}$ distinct sets of $k$ indices that can be selected from $n$ indices. This is exponential on $n$, which means, a brute-force approach examining all the $\binom{n}{k}$ possiblities will lead to exponential runtime.
We use the following approach to significantly reduce the runtime.
(nums1[i], nums2[i]).
list(zip(nums1, nums2))
nums2 in descending order. This ensures that we prioritize elements with larger num2 values.
pairedSortedList = sorted(list(zip(nums1, nums2)), key=lambda x: -x[1])
(nums1[i], nums2[i]), the current minimum value of nums2 in the min-heap is nums2[i] and the current sum of the $k$ values in the min-heap is simply sum(min-heap). The min-heap allows us to efficiently maintain the su of the largest $k$ elements.Our Python code is as follows:
def maxScore(nums1, nums2, k):
import heapq
pairedSortedList = sorted(list(zip(nums1, nums2)), key=lambda x: -x[1])
currSum = currMax = 0
minHeap = []
for n1, n2 in pairedSortedList:
heapq.heappush(minHeap, n1)
currSum += n1
if len(minHeap) == k:
currMax = max(currMax, currSum * n2)
currSum -= heapq.heappop(minHeap)
return currMax
Intuition: The pairedSortedList in our algorithm is a sorted list on nums2 in descending order, which allows us to maximize the minimum value of nums2. Let us consider a concrete example, with $n = 5$ and $k = 3$. The first full min-heap will have the three (nums1, nums2) pairs, where the nums2 correspond to the largest 3 of the 5. The smallest of the 3 nums1 will be found at the root of the min-heap, which will be removed in the next iteration and will be replaced with another (nums1, nums2) pair. Since pairedSortedList is sorted in descending order, the nums2 in the newly added pair will be smaller than either the one that was just removed or smaller than the two that are still present in the min-heap. Since we are dealing with (nums1, nums2) pairs, iterating through the pairedSortedList dictates which of the 3 nums1 could be present in the min-heap during each iteration of the for loop.
Therefore, we see that this algorithm explores all possible values of the minimum nums2 in a systematic way (from largest to smallest). For each value of the minimum nums2, it maximizes the sum of the corresponding nums1 elements.
Proof of Correctness
The score is calculated as:
\[\text{score} = \left ( \sum_{i \in S} \text{nums1}[i] \right ) \times \min_{i \in S} \text{nums2}[i].\]The algorithm ensures that (a) the sum of the selected nums1 elements is maximized for the current nums2 value (since we use a min-heap to keep the largest nums1 elements). (b) The nums2 value used in the calculation is the minimum in the current subset (because the tuples are sorted in descending order of nums2).
Proof of Optimality
nums2
nums2 ensures that when we consider a subset of size $k$, the minimum nums2 value in the subset is as large as possible.nums2 values are processed in descending order, so the minimum nums2 value in any subset is determined by the smallest nums2 value in the subset.nums1 Elements
nums1, we ensure that the sum of the selected nums1 elements is maximized for the current nums2 value.nums1 elements for the same nums2 value), the min-heap ensures that we always keep the largest possible sum.nums1 elements and (2) the minimum nums2 value in the subset.nums2 value is as large as possible.nums2 in a systematic way (from largest to smallest). (2) For each value of the minimum nums2, it maximizes the sum of the corresponding nums1 elements.Inductive Argument
Base Case:
nums1[i] * nums2[i], which is clearly optimal.Inductive Step:
nums1 and nums2).nums1 elements is maximized for the current nums2 value.Runtime: The time complexity is $O(n \log n)$ due to sorting and the min-heap operations. The space complexity is $O(n)$ for storing the pairs and the min-heap.
Back to Heap and Priority Queue Problems