Native-Category Flattening¶
Core Idea¶
A source system carries its own meaning-bearing partition of the world. When an external observer recodes it into a foreign taxonomy without first preserving the source partition, the source's distinctions are silently collapsed: cells it kept apart are merged, cells it kept together are split, and the result is reported as if it were the original. The failure is not classification as such but premature commitment to a foreign partition, and the loss cannot be recovered from the recoded labels alone.
Broad Use¶
- Ethnography: coding participants' phenomena into an a-priori codebook erases the kin-term, illness, or moral distinctions they maintain.
- Clinical coding: a patient's narrative of distress is recoded into diagnostic codes, losing their own "good days / the wave" partition.
- Translation and NLP: source-language distinctions of aspect, evidentiality, or kinship collapse into target-language defaults.
- Database interoperability: a source's category values without target analogues are bucketed into "other," losing distinctions downstream uses depended on.
- Colonial administration: state taxonomies for caste, tribe, or occupation overwrite the lived partitions populations use.
- Machine learning: annotators apply a fixed label set to data carrying finer native cuts, constraining every downstream model.
Clarity¶
Separates classification (necessary) from premature classification into a foreign partition (avoidable), turns the question from "what bucket does this go in?" to "did the source already have a bucket, and have I preserved it?", and itemizes the cost as specific merges and splits.
Manages Complexity¶
Decomposes a single "coding" act into two engineerable stages — preserve the source partition first, then map deliberately and reversibly — converting an opaque irreversible collapse into an inspectable transformation.
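The two stages can be sketched in code. This is a minimal illustration, not a prescribed implementation: the `CodedRecord` type, the `preserve` and `apply_mapping` helpers, the version labels, and the kinship codes in both mappings are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical mappings from native source codes to analyst codes.
# V1 merges two native categories; V2 keeps them apart.
MAPPING_V1: Dict[str, str] = {
    "mothers_brother": "uncle",
    "fathers_brother": "uncle",
}
MAPPING_V2: Dict[str, str] = {
    "mothers_brother": "maternal_uncle",
    "fathers_brother": "paternal_uncle",
}


@dataclass(frozen=True)
class CodedRecord:
    record_id: str
    source_code: str              # stage 1: the source's own category, stored verbatim
    analyst_code: Optional[str]   # stage 2: derived label, None while unmapped
    mapping_version: str          # which mapping produced analyst_code


def preserve(record_id: str, source_code: str) -> CodedRecord:
    """Stage 1: keep the source partition and commit to no foreign label yet."""
    return CodedRecord(record_id, source_code, None, "unmapped")


def apply_mapping(rec: CodedRecord, mapping: Dict[str, str], version: str) -> CodedRecord:
    """Stage 2: derive the analyst code; the source code is carried along unchanged."""
    return CodedRecord(rec.record_id, rec.source_code, mapping.get(rec.source_code), version)


v1 = apply_mapping(preserve("r01", "mothers_brother"), MAPPING_V1, "v1")
# Because source_code survived stage 2, the record can be recoded later
# under a different mapping without going back to the raw material.
v2 = apply_mapping(v1, MAPPING_V2, "v2")
print(v1)
print(v2)
```

The design choice that matters is that `analyst_code` is always derived from `source_code` and never replaces it, so a lossy mapping such as `MAPPING_V1` stays inspectable and can be revised.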
Abstract Reasoning¶
Exposes an asymmetry between cheap labels and expensive partitions (once lost, the cut is gone from the data), and a feedback loop in which an imposed taxonomy reshapes the population's self-description so later measurement "finds" it as a self-fulfilling artifact.
Knowledge Transfer¶
- Across substrates: the roles correspond (source partition, external partition, recoding step), so the same moves transfer — carry source codes alongside analyst codes, build a documented reversible mapping, audit "other"-bucket heterogeneity, defer commitment.
- Ethnography → ML/clinical: the preserve-first-then-map fix transfers without modification from kinship coding to dataset annotation to chart coding.
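One of the transferable moves listed above, auditing "other"-bucket heterogeneity, might look like the following sketch. It assumes source codes were carried alongside analyst codes as `(source_code, analyst_code)` pairs; the function names, the `min_distinct` threshold, and the occupation values are illustrative assumptions.

```python
from collections import Counter, defaultdict


def audit_buckets(pairs):
    """Count the source codes found under each analyst code.

    `pairs` is an iterable of (source_code, analyst_code) tuples.
    """
    by_bucket = defaultdict(Counter)
    for source_code, analyst_code in pairs:
        by_bucket[analyst_code][source_code] += 1
    return by_bucket


def heterogeneous(by_bucket, min_distinct=2):
    """Return the buckets that hide at least `min_distinct` different source codes."""
    return {b: dict(c) for b, c in by_bucket.items() if len(c) >= min_distinct}


# Illustrative rows only; the occupation values are made up.
rows = [
    ("weaver", "other"),
    ("ferryman", "other"),
    ("weaver", "other"),
    ("farmer", "agriculture"),
]
report = heterogeneous(audit_buckets(rows))
print(report)  # {'other': {'weaver': 2, 'ferryman': 1}}
```

A bucket that appears in the report is a candidate for a finer target category, or at least for a documented note that the mapping merges distinct source cells there.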
Example¶
Coding interviews about kinship into an English-based codebook merges the native mother's-brother / father's-brother contrast into "uncle" and splits a single native cousin-category across several English types — and once coded "uncle," no downstream analysis can recover which native category was meant.
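The merges and splits in this example can be itemized mechanically when both labels are available for each coded item. The sketch below is a hypothetical illustration: `itemize_flattening` is a name introduced here, and the native and English labels are placeholders rather than terms from any particular kinship system.

```python
from collections import defaultdict


def itemize_flattening(pairs):
    """Return (merges, splits) for a collection of (native, foreign) label pairs.

    merges: foreign label -> set of native labels collapsed into it
    splits: native label  -> set of foreign labels it was scattered across
    """
    native_by_foreign = defaultdict(set)
    foreign_by_native = defaultdict(set)
    for native, foreign in pairs:
        native_by_foreign[foreign].add(native)
        foreign_by_native[native].add(foreign)
    merges = {f: n for f, n in native_by_foreign.items() if len(n) > 1}
    splits = {n: f for n, f in foreign_by_native.items() if len(f) > 1}
    return merges, splits


# Placeholder labels standing in for the native terms and English types.
observed = [
    ("mothers_brother", "uncle"),
    ("fathers_brother", "uncle"),
    ("native_cousin", "first_cousin"),
    ("native_cousin", "second_cousin"),
]
merges, splits = itemize_flattening(observed)
print(merges)
print(splits)
```

Each entry in `merges` marks a place where the foreign label alone cannot say which native category was meant: given only "uncle", both native terms remain candidates.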
Relationships to Other Primes¶
Parents (2) — more general patterns this builds on
- Native-Category Flattening is a kind of (typical) Translation and Conceptual Bridging: it is a pathology of translating one category scheme into another, where the translation is lossy, asymmetric, and one-way. Whether the primary lineage runs through classification or translation is left to the owner.
- Native-Category Flattening presupposes (typical) Classification: the failure is a lossy recoding of a source's meaning-bearing partition into a foreign taxonomy, so it presupposes a classification or recoding act and names that act's destructive special case, in which merges and splits cannot be undone. It is built on the recoding step.
Path to root: Native-Category Flattening → Classification
Not to Be Confused With¶
- Native-Category Flattening is not Segmentation because flattening destroys an already-drawn partition by recoding, whereas segmentation creates boundaries in an un-partitioned domain.
- Native-Category Flattening is not Decomposition because flattening is lossy and asymmetric (its labels cannot reconstruct the source cut), whereas decomposition's parts reassemble into the original.
- Native-Category Flattening is not Interleaving because flattening is the one-way overwriting of one partition by another, whereas interleaving is the alternation of coexisting streams that can be unwoven.