
Subset sum problem

In computer science, the subset sum problem is an important decision problem in complexity theory and cryptography. There are several equivalent formulations of the problem. One of them is: given a set (or multiset) of integers, is there a non-empty subset whose sum is zero? For example, given the set {−7, −3, −2, 5, 8}, the answer is yes because the subset {−3, −2, 5} sums to zero. The problem is NP-complete, meaning roughly that while it is easy to confirm whether a proposed solution is valid, it may inherently be prohibitively difficult to determine in the first place whether any solution exists.

The problem can be equivalently formulated as: given the integers or natural numbers w_1, …, w_n, does any subset of them sum to precisely W? Subset sum can also be thought of as a special case of the knapsack problem. One interesting special case of subset sum is the partition problem, in which W is half of the sum of all elements in the set (i.e., W = (w_1 + ⋯ + w_n)/2).
The complexity of the subset sum problem can be viewed as depending on two parameters: N, the number of decision variables, and P, the precision of the problem (stated as the number of binary place values that it takes to state the problem). (Note: here the letters N and P mean something different from what they mean in the NP class of problems.) The complexity of the best known algorithms is exponential in the smaller of the two parameters N and P. Thus, the problem is most difficult when N and P are of the same order; it becomes easy only if either N or P is very small. If N (the number of variables) is small, then an exhaustive search for the solution is practical. If P (the number of place values) is a small fixed number, then there are dynamic programming algorithms that can solve it exactly. Efficient algorithms for both the small-N and small-P cases are given below.

There are several ways to solve subset sum in time exponential in n. The most naïve algorithm cycles through all subsets of the n numbers and, for each one, checks whether the subset sums to the right number. The running time is of order O(2^n · n), since there are 2^n subsets and checking each subset requires summing at most n elements.

A better exponential-time algorithm is known which runs in time O(2^{N/2}). The algorithm arbitrarily splits the N elements into two sets of N/2 each. For each of these two sets, it stores a list of the sums of all 2^{N/2} possible subsets of its elements. Each of these two lists is then sorted. Using a standard comparison sorting algorithm for this step would take time O(2^{N/2} · N). However, given a sorted list of sums for k elements, the list can be expanded to two sorted lists with the introduction of a (k + 1)st element, and these two sorted lists can be merged in time O(2^k). Thus, each list can be generated in sorted form in time O(2^{N/2}).
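As a concrete illustration, the brute-force search just described can be sketched in Python (the function name and the choice to return a witnessing subset rather than just a yes/no answer are mine):

```python
from itertools import combinations

def subset_sum_naive(nums, target=0):
    """Try every non-empty subset; O(2^n * n) time.

    Returns one subset (as a list) that sums to target, or None
    if no such subset exists.
    """
    n = len(nums)
    for r in range(1, n + 1):                  # subset sizes 1..n
        for combo in combinations(nums, r):    # all subsets of size r
            if sum(combo) == target:
                return list(combo)
    return None

# Example from the text: {-7, -3, -2, 5, 8} contains {-3, -2, 5},
# which sums to zero.
print(subset_sum_naive([-7, -3, -2, 5, 8]))
```

Checking a proposed subset is a single pass of addition, which is why verifying a solution is easy even though finding one this way takes exponential time.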
Given the two sorted lists, the algorithm can check whether an element of the first array and an element of the second array sum to s in time O(2^{N/2}). To do that, the algorithm passes through the first array in decreasing order (starting at the largest element) and the second array in increasing order (starting at the smallest element). Whenever the sum of the current element in the first array and the current element in the second array is more than s, the algorithm moves to the next element in the first array; if it is less than s, the algorithm moves to the next element in the second array. If two elements with sum s are found, it stops. Horowitz and Sahni first published this algorithm in a technical report in 1974.

The problem can be solved in pseudo-polynomial time using dynamic programming. Suppose the sequence is
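A minimal Python sketch of the Horowitz–Sahni meet-in-the-middle procedure described above, written as a decision procedure for a target s (the function name is mine; note that this version also counts the empty subset, whose sum is 0):

```python
def subset_sum_mitm(nums, s):
    """Meet-in-the-middle decision procedure: O(2^(n/2) * n) time,
    O(2^(n/2)) space. Includes the empty subset (sum 0)."""
    half = len(nums) // 2
    left, right = nums[:half], nums[half:]

    def sorted_subset_sums(arr):
        # Grow the sorted list of subset sums one element at a time:
        # merging the current list with a copy shifted by x keeps it
        # sorted, in O(2^k) per step, as described in the text.
        sums = [0]
        for x in arr:
            shifted = [v + x for v in sums]
            merged, i, j = [], 0, 0
            while i < len(sums) and j < len(shifted):
                if sums[i] <= shifted[j]:
                    merged.append(sums[i]); i += 1
                else:
                    merged.append(shifted[j]); j += 1
            merged += sums[i:] + shifted[j:]
            sums = merged
        return sums

    a = sorted_subset_sums(left)    # all subset sums of the first half
    b = sorted_subset_sums(right)   # all subset sums of the second half

    # Two-pointer scan: a in decreasing order, b in increasing order.
    i, j = len(a) - 1, 0
    while i >= 0 and j < len(b):
        total = a[i] + b[j]
        if total == s:
            return True
        elif total > s:
            i -= 1
        else:
            j += 1
    return False
```

The scan terminates after at most |a| + |b| steps because each iteration advances exactly one of the two pointers, which gives the claimed O(2^{N/2}) bound for this phase.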
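The dynamic-programming idea mentioned above can be sketched as follows: grow the set of sums reachable by some subset, one element at a time. This is a hedged illustration, not the exact table formulation the text goes on to describe; like the sketch above, it counts the empty subset (sum 0).

```python
def subset_sum_dp(nums, target):
    """Pseudo-polynomial dynamic programming over reachable sums.

    Runs in time proportional to n times the number of distinct
    attainable sums, which is bounded by the range of the inputs.
    Includes the empty subset (sum 0).
    """
    reachable = {0}                              # empty subset sums to 0
    for x in nums:
        # Every previously reachable sum s gives a new sum s + x.
        reachable |= {s + x for s in reachable}
    return target in reachable
```

The set of reachable sums is bounded by the precision parameter P rather than by 2^n, which is exactly why this approach is efficient when P is small.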
