๐ง Contemplating a True Set
๐ง Equivalence in JavaScript: Contemplating a True Set
In JavaScript, equivalence behaves differently depending on whether you're working with primitive or non-primitive types.
For primitives—such as numbers, booleans, strings, and symbols—equivalence is determined by comparing actual values. If two variables hold the same value, they are considered equivalent.
However, for objects and arrays, equivalence is by reference. Two arrays [0] and [0] are not considered equal because they are separate instances, even though they contain identical content.
๐ฌ Snippet 1: Equivalence by Value and by Reference
new Set([0, 0]); // => Set { 0 }
new Set([[0], [0]]); // => Set { [0], [0] }
JavaScript’s Set treats the two 0 values as equivalent but considers [0] and [0] as distinct because they are different object instances.
This reveals JavaScript’s minimalist view of equivalence: only primitives benefit from value-based equivalence.
But that's not the full story.
๐งฉ Contemplating a True Set
Even if Set implemented value-based equivalence for arrays or objects by deeply inspecting all their properties, it still wouldn’t capture the true essence of a Set.
Equivalence can go beyond mere value: it can be based on length, structure, content, or other contextually relevant attributes.
A "true set" should support any meaningful notion of equivalence.
๐ Snippet 2: Equivalence by Length
import { TrueSet } from "@fizzwiz/sorted";
const byLength = new TrueSet(equivalenceByLength);
byLength.add("what");
byLength.add("that"); // Considered duplicate: same length as "what"
In this example, "that" is considered equivalent to "what" because both have length 4.
But how is this equivalence logic passed to the set?
๐ ️ Defining Equivalence
Let’s explore three ways to define equivalence in code:
❌ Boolean Equality Function
A general approach is to define a boolean function:
function equals(a, b) { return ...; }
However, this requires comparing each item with every other—inefficient for large collections.
This approach is often paired with:
function hashCode(x) { return ...; }
The hashCode() function narrows comparisons by assigning each item to an integer that represents an equivalence class larger than the intended equivalence equals(a, b). Such broader equivalence is often further enlarged by the modulo operation hashCode(x) % n. While this method is efficient, it’s conceptually disputable. Why not define the intended equivalence directly, using its natural representation function?
✅ Representation Function
A clean and conceptually elegant solution uses a repr(item) function directly to map an item to a representative of its equivalence class.
function repr(word) {
return word.length;
}
Here, words of the same length are considered equivalent.
However, we still need to define how to compare two representations.
✅ Comparator Function
A numeric compare(a, b) function defines equivalence by returning 0 for equivalent values:
function compare(a, b) {
return a < b ? -1 : a > b ? 1 : 0;
}
This enables efficient O(log n) comparisons, removing the need for auxiliary hashing. Comparators are composable, making them powerful for sorting and equivalence alike.
๐ ️ Multivariate Representation
Since the value returned by the representation function is not relevant as long as it is the same for all equivalent items, we can assume — without loss of generality — the representation function returns arrays of values. This makes it ideal for expressing compound equivalences.
๐ฆ Snippet 3: Equivalence by Length and First Character
import { TrueSet } from "@fizzwiz/sorted";
const repr = item => [item.length, item[0]];
const byLengthAndFirstCharacter = new TrueSet(repr);
byLengthAndFirstCharacter.add("what");
byLengthAndFirstCharacter.add("that"); // -> true
๐ ️ TrueSet as a Sorted Collection of Entries
From this perspective, a TrueSet of items appears to be implementable as a collection of entries in the form:
[repr(item), item]
The constructor should accepts:
- A
reprfunction - A
keyComparatorfunction for comparing keys in the representation - A
itemComparatorfunction for comparing items within each equivalence class.
The last comparator would determine whether two items with the same representation are allowed to coexist.
๐ฆ Snippet 4: A Rich Set (Multiple Equivalent Items)
import { TrueSet } from "@fizzwiz/sorted";
const repr = item => item.length;
const byLength = new TrueSet(repr); // by default, duplicate items are allowed within
byLength.add("what");
byLength.add("that"); // -> true
Here, equivalent items can coexist and be enumerated.
๐ ️ Classifier as a Rich Map
Therefore, the TrueSet is a specialized case of a more general data structure: the Classifier as a rich Map allowing multivariate keys and multiple values per key.
import { Classifier } from "@fizzwiz/sorted";
const classifier = new Classifier();
classifier.add([4, "w"], "what");
classifier.add([4, "t"], "that");
classifier.add([3, "w"], "who");
classifier.get([4]); // -> "what", "that"
classifier.get([undefined, "w"]); // -> "who", "what"
๐งต Consequences of This Approach
- ๐ก Representation-based equivalence reduces to simple comparators.
- ✨ Collections of items are intrinsically sorted.
- ✅ Any iterable can represent an item as long as a comparator exists for its components.
๐งญ What’s Next
In future posts, we’ll explore the internal mechanics of Classifier and TrueSet, as well as advanced applications.
Comments
Post a Comment