OffsetFusionFuzzy (Distance-Tolerant Fusion)
Idea
OffsetFusionFuzzy merges boundaries whose character offsets differ by at most max_distance.
This is useful when engines report slightly misaligned offsets for the same boundary.
Method
- Collect all boundaries.
- Sort them.
- Merge boundaries within max_distance characters.
- Convert merged boundaries to segments.
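
The four steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name fuse_offsets_fuzzy, the input shape (one list of integer boundary offsets per engine), and the choice to keep the first boundary of each cluster as its representative are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def fuse_offsets_fuzzy(
    engine_boundaries: Iterable[Iterable[int]],
    max_distance: int = 1,
) -> List[Tuple[int, int]]:
    """Hypothetical sketch of distance-tolerant boundary fusion."""
    # 1. Collect all boundaries from every engine (duplicates collapse in the set).
    collected = set()
    for boundaries in engine_boundaries:
        collected.update(boundaries)

    # 2. Sort them.
    ordered = sorted(collected)

    # 3. Merge boundaries within max_distance characters.
    #    Assumption: each boundary is compared with the last *kept* boundary,
    #    and the earlier offset wins, so no cluster spans more than max_distance.
    merged: List[int] = []
    for offset in ordered:
        if merged and offset - merged[-1] <= max_distance:
            continue  # absorbed into the previous boundary
        merged.append(offset)

    # 4. Convert merged boundaries to (start, end) segments.
    return list(zip(merged, merged[1:]))


# Two engines disagree by one character on the middle boundary (12 vs 13).
segments = fuse_offsets_fuzzy([[0, 12, 30], [0, 13, 30]], max_distance=1)
print(segments)  # [(0, 12), (12, 30)]
```

Keeping the earliest offset of a cluster is only one possible policy; a real implementation might instead use the median or prefer a particular engine.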
Parameter
max_distance: int (default: 1). Maximum character distance for merging.
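
To give a feel for how the parameter behaves, the sketch below applies three values of max_distance to the same sorted boundary list. The helper merge_close is hypothetical and assumes that each boundary is compared with the last kept boundary; the real merging rule may differ.

```python
from typing import List


def merge_close(boundaries: List[int], max_distance: int = 1) -> List[int]:
    """Hypothetical helper: drop boundaries within max_distance of the last kept one."""
    merged: List[int] = []
    for offset in sorted(boundaries):
        if merged and offset - merged[-1] <= max_distance:
            continue  # too close to the previous kept boundary
        merged.append(offset)
    return merged


boundaries = [0, 12, 13, 17, 30]
print(merge_close(boundaries, max_distance=0))  # [0, 12, 13, 17, 30]
print(merge_close(boundaries, max_distance=1))  # [0, 12, 17, 30]
print(merge_close(boundaries, max_distance=5))  # [0, 12, 30]
```

Under this assumption, a value of 0 disables fuzzy merging, while larger values absorb increasingly distant boundaries and risk removing boundaries that are genuinely distinct.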
Effect
- Tolerant to minor offset drift.
- Prevents fragmentation due to near-identical boundaries.
Recommended for
- LLM offset-based engines.
- Mixed statistical + LLM pipelines.