Bags
A bag is a data structure that can contain the same element multiple times which is why it is often also called a multiset. The order of adding elements is not necessarily given, this depends on the implementation. Common operations on a bag are adding elements, removing elements and searching for a specific element.
Implementing a Bag
One of the simplest ways of implementing data structures is by using arrays. When implementing a data structure the time complexities can be different on whether the data is always in a sorted state or not.
// TODO
When implementing a sorted collection in Java you can either implement your own binary search or you can use java.util.Arrays.binarysearch(a, from, to, key)
which returns the index of the key, if it is contained and otherwise with insertion point being the point where the key would be inserted, i.e the index of the first element greater than the key.
// TODO
Time Complexities
Operation | UnsortedBag | SortedBag |
---|---|---|
add(E e) | no search, or shift | search + shift right |
search(Object o) | linear search | binary search |
remove(Object o) | search + remove | search + shift left |
Ideal use case | When adding a lot | When searching a lot |
Bag of Words
What is a bag of words? How is it used in NLP?