Manually set up nltk-related data to start the toolkit

Error: nltk resource not found

T Miyamoto
1 min read · Dec 31, 2020

When you start using the nltk module, you need to download and install the data it relies on. For example:

>>> import nltk
>>> nltk.download('punkt')

If you then get an error message like

[nltk_data] Error loading punkt: <urlopen error [SSL:
[nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed:
[nltk_data] unable to get local issuer certificate

you need to go to the NLTK data website and download the models manually. In this case, unzip the downloaded .zip file and move the extracted files into the directory /usr/local/share/nltk_data/tokenizers/, assuming you use macOS.
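The unzip-and-move step can also be scripted. The following is only a minimal sketch using the Python standard library: the function name `install_nltk_zip` and the download location in the example call are hypothetical, and the default directory is simply the macOS path mentioned above.

```python
import zipfile
from pathlib import Path

# Assumption: the macOS data directory mentioned in this post.
DEFAULT_DATA_DIR = Path("/usr/local/share/nltk_data")


def install_nltk_zip(zip_path, category, data_dir=DEFAULT_DATA_DIR):
    """Extract a manually downloaded NLTK .zip into <data_dir>/<category>/.

    `category` is the subdirectory name, e.g. "tokenizers" or "corpora".
    Hypothetical helper for illustration; not part of nltk itself.
    """
    target = Path(data_dir) / category
    # Create the category directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Example call (hypothetical download location; writing under /usr/local
# may require elevated permissions):
# install_nltk_zip(Path.home() / "Downloads" / "punkt.zip", "tokenizers")
```

If you extract the data somewhere other than a default location, nltk also needs to be told where to look, for example by appending that directory to the list `nltk.data.path`.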

Similarly, if you run

>>> from nltk.book import *

and get error messages like “Resource gutenberg not found”, “Resource genesis not found”, “Attempted to load corpora/genesis”, “Resource inaugural not found”, “Attempted to load corpora/inaugural”, “Resource nps_chat not found”, “Attempted to load corpora/nps_chat”, “Resource webtext not found”, “Attempted to load corpora/webtext”, “Resource treebank not found”, “Attempted to load corpora/treebank/combined”, then you can resolve the issue by manually downloading those nltk_data corpora and putting them into /usr/local/share/nltk_data/corpora/.
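To see which of these corpora still need to be downloaded, you could check the corpora directory before importing nltk.book. This is a rough sketch under stated assumptions: the helper `missing_corpora` is hypothetical, the corpus names are taken from the error messages above, and it only checks that an extracted folder with that name exists.

```python
from pathlib import Path

# Corpus names taken from the error messages quoted above.
BOOK_CORPORA = ["gutenberg", "genesis", "inaugural", "nps_chat", "webtext", "treebank"]


def missing_corpora(data_dir="/usr/local/share/nltk_data", names=BOOK_CORPORA):
    """Return the corpus names that have no extracted folder under <data_dir>/corpora/.

    Hypothetical helper for illustration; it does not validate the folder contents.
    """
    corpora_dir = Path(data_dir) / "corpora"
    return [name for name in names if not (corpora_dir / name).is_dir()]


if __name__ == "__main__":
    for name in missing_corpora():
        print(f"Still missing: corpora/{name}")
```

An empty result means every listed folder is present, so `from nltk.book import *` should no longer complain about those particular resources.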
