Commit

text formatting
sangupta committed Nov 29, 2016
1 parent 27b27ef commit ee12032
Showing 2 changed files with 36 additions and 28 deletions.
28 changes: 16 additions & 12 deletions solutions/2016/fastest-duplicates-integer-sets.md
@@ -1,14 +1,16 @@
# Problem

You are given multiple integer sets and need to find the duplicates among them,
i.e. the intersection of all the sets.

# Solution

Suppose we have 3 arrays, each holding a set of integers, and we need to
intersect them. This can be done in `O(n1 + n2 + n3)` time, where `n1`, `n2`,
`n3` are the lengths of the three arrays respectively. The solution uses two
bit-vectors (also called bit-sets or bit-arrays): one holds the intersection
computed so far, and the other collects its intersection with the next array,
and so on.

* Construct a running bit-set and populate it with the first array
* Using a second bit-set, intersect the first bit-set with the second array
@@ -56,14 +58,16 @@ public void findDuplicates(int[] array1, int[] array2, int[] array3) {
}
```

The solution above can be extended to as many arrays as the problem provides.
The time to intersect remains `O(N)`, where `N` is the total number of elements
across all provided arrays.
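
As a rough illustration of that extension, here is a minimal sketch that
intersects any number of arrays with two `java.util.BitSet` instances. The class
name `BitSetIntersection` and the method `intersectAll` are hypothetical names
chosen for this example, not part of the original solution, and the sketch
assumes all values are non-negative (`BitSet` rejects negative indices).

```java
import java.util.BitSet;

public class BitSetIntersection {

    // Hypothetical helper: intersects any number of arrays of
    // non-negative integers using two bit-sets.
    public static BitSet intersectAll(int[]... arrays) {
        BitSet running = new BitSet();
        if (arrays.length == 0) {
            return running;
        }

        // Populate the running bit-set with the first array.
        for (int value : arrays[0]) {
            running.set(value);
        }

        // The second bit-set is reused for every remaining array.
        BitSet next = new BitSet();
        for (int i = 1; i < arrays.length; i++) {
            next.clear();
            for (int value : arrays[i]) {
                // Keep only values already present in the running result.
                if (running.get(value)) {
                    next.set(value);
                }
            }
            // Swap roles: the fresh intersection becomes the running result.
            BitSet tmp = running;
            running = next;
            next = tmp;
        }
        return running;
    }

    public static void main(String[] args) {
        BitSet common = intersectAll(
                new int[] {1, 4, 7, 9},
                new int[] {4, 9, 12},
                new int[] {9, 4, 30});
        System.out.println(common); // prints {4, 9}
    }
}
```

Swapping the two bit-sets avoids allocating a new one per array. Note that
`clear()` still walks the bit-set's backing words, so the cost per array also
depends on the largest value seen, not only on the array length.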

## Optimizations available

* One can use sparse bit-arrays to reduce memory consumption. Refer to
[brettwooldridge/SparseBitSet](https://github.com/brettwooldridge/SparseBitSet)
for one such implementation.

* If the arrays are really huge, an implementation that persists the bit-array
to a file can be used. Refer to [one such implementation](https://github.com/sangupta/jerry-core/blob/master/src/main/java/com/sangupta/jerry/ds/bitarray/MMapFileBackedBitArray.java)
available in the [jerry-core](https://github.com/sangupta/jerry-core) project.
36 changes: 20 additions & 16 deletions solutions/2016/fastest-sorting-integers.md
@@ -1,20 +1,20 @@
# Problem

We are given a huge array of bounded, non-duplicate, positive integers that need
to be sorted. What's the fastest way to do it?

# Solution

Most of the interview candidates I have talked to about this problem quickly
answer with `MergeSort` or another divide-and-conquer sort. Such a sort costs
`O(N * Log(N))`, which in this case is not the fastest sorting time.

Because the integers are bounded and unique, the array can be sorted in `O(N)`
time. Let me explain how.

* Construct a boolean array whose length covers the bound of the integers (the
largest possible value plus one)
* For every integer `n` in the input array, mark the boolean at index `n` in the
boolean array as `true`
* Once the input array has been processed, iterate over the boolean array and
print every index whose value is `true`

```java
public void sortBoundedIntegers(int[] array) {
@@ -45,18 +45,22 @@ public void sortBoundedIntegers(int[] array) {

## Additional constraints and optimizations available

* A `boolean` in an array typically occupies one byte of memory. Thus, switching
to a bit-vector (also called a bit-array) reduces memory consumption by a factor
of about 8. Check code sample 2.

* In case the integers can also be negative, a second bit-array can track the
negatives, and both are then iterated one after the other to produce the result.
Check code sample 2.

* To further reduce memory consumption, one can use sparse bit-arrays. This can
lead to a huge drop in memory consumption if the integers are spaced far apart.
Check code sample 2. Refer to [brettwooldridge/SparseBitSet](https://github.com/brettwooldridge/SparseBitSet)
for one such implementation.

* In case the integer array contains duplicates, use a `short` array instead of
the `boolean` array to hold the number of times each integer has been seen, thus
still sorting in `O(N)` time.
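
The duplicate-counting idea in the last bullet can be sketched roughly as
follows. This is an illustrative counting sort, not the author's code: the class
`CountingSortSketch`, the method `sortWithDuplicates`, and the `maxValue`
parameter are hypothetical names, and the sketch assumes every value lies in
`[0, maxValue]`. It uses an `int` array for the counts because a `short` counter
would overflow once a value repeats more than 32,767 times.

```java
import java.util.Arrays;

public class CountingSortSketch {

    // Hypothetical helper: sorts integers in [0, maxValue], duplicates allowed.
    public static int[] sortWithDuplicates(int[] array, int maxValue) {
        // counts[v] holds how many times v occurs in the input.
        int[] counts = new int[maxValue + 1];
        for (int value : array) {
            counts[value]++;
        }

        // Walk the value range in order, emitting each value as often as seen.
        int[] sorted = new int[array.length];
        int position = 0;
        for (int value = 0; value <= maxValue; value++) {
            for (int c = 0; c < counts[value]; c++) {
                sorted[position++] = value;
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] sorted = sortWithDuplicates(new int[] {5, 3, 5, 0, 3, 5}, 5);
        System.out.println(Arrays.toString(sorted)); // prints [0, 3, 3, 5, 5, 5]
    }
}
```

The running time is proportional to the input length plus the size of the value
range, so the approach pays off when the bound is not much larger than the
number of elements.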

### Code Sample 2
