puremoe ships three
MeSH reference tables: a thesaurus of descriptors and entry terms, a
tree of hierarchical classifications, and a bundled table of MeSH
annotation counts per descriptor across PubMed.
data_mesh_thesaurus() downloads and combines the MeSH
Descriptor Thesaurus and Supplementary Concept Records (SCR). The result
has one row per term, including synonyms and entry terms for each descriptor.
data_mesh_trees() provides the hierarchical
classification structure. Each descriptor can appear in multiple
branches; tree_location encodes the full path (e.g.,
I01.880.604 = Social Sciences > Political Science >
Political Systems).
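
As a minimal sketch of how a dotted tree_location can be read as a path, the snippet below expands a location into its ancestor prefixes, root first. The helper name tree_ancestors() is hypothetical and not part of puremoe; it uses only base R and the example value quoted above.

```r
# Hypothetical helper (not part of puremoe): expand a MeSH tree location
# into its ancestor prefixes, ordered from the root to the full path.
tree_ancestors <- function(tree_location) {
  # Split on literal dots: "I01.880.604" -> c("I01", "880", "604")
  parts <- strsplit(tree_location, ".", fixed = TRUE)[[1]]
  # Re-join the first i segments for each depth i
  vapply(
    seq_along(parts),
    function(i) paste(parts[seq_len(i)], collapse = "."),
    character(1)
  )
}

ancestors <- tree_ancestors("I01.880.604")
print(ancestors)
```

Each prefix corresponds to one level of the hierarchy, so for this example the three prefixes line up with Social Sciences, Political Science, and Political Systems.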
data_mesh_frequencies is a bundled dataset giving the
annotation frequency of each MeSH descriptor across the full PubMed
corpus (39.7 million PMIDs, April 2026). Counts are the number of records
that NLM curators indexed with each descriptor, not how often a term occurs
in text, which makes them suitable as a baseline for enrichment analyses
against arbitrary PubMed subsets.
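
One way such a baseline might be used is sketched below: a fold-enrichment ratio and an upper-tail hypergeometric test for a single descriptor. This is an illustration only. All counts except the corpus size are hypothetical, the variable names are invented here, and the test assumes the subset is drawn from the same corpus; nothing here reflects the actual column layout of data_mesh_frequencies.

```r
# Illustration only: every count except corpus_n is hypothetical.
corpus_n <- 39.7e6  # PMIDs in the baseline corpus, as stated in the text
corpus_k <- 50000   # hypothetical: corpus records indexed with the descriptor
subset_n <- 2000    # hypothetical: records in the PubMed subset of interest
subset_k <- 40      # hypothetical: subset records indexed with the descriptor

# Fold enrichment: descriptor rate in the subset relative to the corpus rate
fold <- (subset_k / subset_n) / (corpus_k / corpus_n)

# Upper-tail hypergeometric p-value, P(X >= subset_k), treating the subset
# as a draw of subset_n records from the corpus without replacement
p_value <- phyper(
  subset_k - 1,         # q: strictly more than subset_k - 1 annotated records
  corpus_k,             # m: annotated records in the corpus
  corpus_n - corpus_k,  # n: unannotated records in the corpus
  subset_n,             # k: number of records drawn
  lower.tail = FALSE
)

print(fold)
print(p_value)
```

In practice the same calculation would be repeated per descriptor, with a multiple-testing correction such as p.adjust() applied to the resulting p-values.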
By default, the MeSH thesaurus and tree tables are fetched from GitHub
on each call. To avoid repeated downloads, set
use_persistent_storage = TRUE; the files are then cached in a
system data directory and reused on subsequent calls.
data_mesh_frequencies is bundled with the package and does
not need to be downloaded.