Show HN: FlashTokenizer – 10x faster C++ tokenizer for Python (https://github.com/NLPOptimize/flash-tokenizer)
5 points | springkim | 11 days ago | 0 comments

I built a tokenizer in C++ with a Python binding that outperforms HuggingFace tokenizers by 10x on large inputs. It's optimized for minimal memory usage and latency.

Benchmarks and comparison included in README. Would love feedback or contributions!