Unifying your spelling dictionaries with merge-dictionaries

If you work across multiple IDEs, you’ve probably noticed that each one maintains its own spelling dictionary. Add a word in PyCharm, and PhpStorm still underlines it. Teach Hunspell a word, and your IDE doesn’t know it.

That’s where merge-dictionaries comes in.

This tool automatically discovers your dictionaries, extracts the words, and merges them into a single unified set. You can:

  • Merge dictionaries with --merge
  • Extract words safely with --extract
  • Format output for Hunspell or IDE-specific formats with --format
  • Delete unwanted words with --delete-words word1 word2 …

You can even sync your dictionary with a Git repository, ensuring consistency across machines and teammates.
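
To make the idea of a "single unified set" concrete, here is a minimal Python sketch of the merge step. It is an illustration, not the tool's actual code: the function name merge_word_lists and the sample word lists are assumptions made up for this example.

```python
def merge_word_lists(*word_lists):
    """Merge several iterables of words into one sorted, de-duplicated list.

    Hypothetical helper for illustration; not part of merge-dictionaries.
    """
    merged = set()
    for words in word_lists:
        for word in words:
            word = word.strip()
            if word:  # skip blank entries
                merged.add(word)
    return sorted(merged)


# Made-up sample data standing in for three separate dictionaries.
pycharm_words = ["Nasqueron", "strlen"]
phpstorm_words = ["strlen", "luser"]
hunspell_words = ["Nasqueron", "dotfiles", ""]

print(merge_word_lists(pycharm_words, phpstorm_words, hunspell_words))
```

Using a set means a word known to several tools ends up only once in the result, whatever order the dictionaries are read in.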

Getting started

Install with pip

$ pip install merge-dictionaries

Example: merge all your IDE dictionaries

$ merge-dictionaries --merge

That command finds your dictionaries in known paths and syncs them all.
It also publishes the result to Git, if so configured.

Example: build a Hunspell personal dictionary

$ merge-dictionaries --extract > ~/.hunspell_default

Did you know? Hunspell is the spell checker used by vim, LibreOffice / OpenOffice, Firefox, gedit, Eclipse, Scribus, Texmaker, etc.
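
For the curious, the extraction idea can be sketched in a few lines of Python. This is a sketch under assumptions, not the tool's implementation: it assumes the IDE stores its words as <w> elements in an XML file (the sample layout below is an assumption) and that a Hunspell personal dictionary is a plain list with one word per line. The names SAMPLE_XML, extract_words and to_personal_dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed layout of an IDE dictionary file; a real file may differ.
SAMPLE_XML = """\
<application>
  <component name="CachedDictionaryState">
    <words>
      <w>strlen</w>
      <w>Nasqueron</w>
      <w>strlen</w>
    </words>
  </component>
</application>
"""


def extract_words(xml_text):
    """Return the sorted, de-duplicated text of every <w> element."""
    root = ET.fromstring(xml_text)
    words = {w.text.strip() for w in root.iter("w") if w.text and w.text.strip()}
    return sorted(words)


def to_personal_dictionary(words):
    """Format words as plain text, one word per line."""
    return "".join(f"{word}\n" for word in words)


print(to_personal_dictionary(extract_words(SAMPLE_XML)), end="")
```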

Example: sync with Git

Create a repository. Keep the name simple, like dictionary, then write a $HOME/.config/merge-dictionaries.conf file with a YAML list of repositories:

git:
  - git@github.com:luser/dictionary.git

Tip: you can also use a dotfiles-like repository: merge-dictionaries will add a dictionary.txt file in the top folder.

Tip: to help you know which word comes from which machine, the sync commit includes the name of the machine the script ran on.
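
As an illustration of that tip, a commit message carrying the machine name could be built as follows. This is only a sketch: the function name build_sync_commit_message and the message wording are assumptions, and the real commit message may look different.

```python
import socket


def build_sync_commit_message(hostname=None):
    """Build a commit message that names the machine doing the sync.

    Hypothetical format; the wording used by the real tool is not shown here.
    """
    if hostname is None:
        # Fall back to the name of the machine running the script.
        hostname = socket.gethostname()
    return f"Sync dictionary from {hostname}"


print(build_sync_commit_message())
```

With such a message, a plain git log on the dictionary repository is enough to see which machine introduced a given word.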

Example: delete unwanted words

Deleting a word from a dictionary can be tricky: each time you run the merge command, the word may be restored from Git, or from a dictionary you forgot about, for example if you moved from CLion to RustRover but the CLion dictionary is still there.

So the --delete-words option finds every local dictionary, plus your Git repository if configured, and removes the specified words from each of them.

$ merge-dictionaries --delete-words teh alowed
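
The reason a word has to be removed from every source at once can be shown with a small Python sketch. It is a simplified model, not the tool's code: the delete_words helper and the source names are hypothetical, and each dictionary is reduced to a set of words.

```python
def delete_words(sources, unwanted):
    """Return a copy of `sources` with the unwanted words removed everywhere.

    `sources` maps a (hypothetical) source name to its set of words.
    """
    unwanted = set(unwanted)
    return {name: words - unwanted for name, words in sources.items()}


# Made-up sample data: two IDE dictionaries and the Git copy.
sources = {
    "clion": {"teh", "strlen"},
    "rustrover": {"strlen", "alowed"},
    "git": {"teh", "alowed", "Nasqueron"},
}

cleaned = delete_words(sources, ["teh", "alowed"])

# Merging the cleaned sources can no longer bring the typos back.
merged = sorted(set().union(*cleaned.values()))
print(merged)
```

If "teh" were removed from only one source, the next merge would copy it back from the others, which is exactly the situation described above.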

Current state of the project and future plan

merge-dictionaries currently supports:

  • All JetBrains IDEs (application-level dictionary)
  • Hunspell personal dictionaries
  • Git repositories for syncing

If you want to contribute, I’ve documented in the project README where to add a new back-end to support more formats. The project is open source, released under the BSD-2-Clause license.

This project is supported by the Nasqueron open source project, as part of the Nasqueron development tools.

Why it matters

Having consistent spelling across tools improves focus, reduces duplication, and eliminates those “why is this word still red?” moments.

It also saves you from adding, on every new machine, your name, the name of your project or company, your username, method names like strlen, etc.


