---
title: "MeSH Tables"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{MeSH Tables}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE,
  comment = "#>"
)
```

`puremoe` ships three MeSH reference tables: a thesaurus of descriptors and entry terms, a tree of hierarchical classifications, and a bundled table of MeSH annotation counts per descriptor across PubMed.

```{r libs}
library(puremoe)
library(dplyr)
library(DT)
```

## MeSH thesaurus

`data_mesh_thesaurus()` downloads and combines the MeSH Descriptor Thesaurus and Supplementary Concept Records (SCR). One row per term, including synonyms and entry terms for each descriptor.

```{r thesaurus}
thesaurus <- puremoe::data_mesh_thesaurus()
```

```{r thesaurus-table}
thesaurus |>
  head(20) |>
  DT::datatable(rownames = FALSE, options = list(scrollX = TRUE))
```

## MeSH trees

`data_mesh_trees()` provides the hierarchical classification structure. Each descriptor can appear in multiple branches; `tree_location` encodes the full path (e.g., `I01.880.604` = Social Sciences > Political Science > Political Systems).

```{r trees}
trees <- puremoe::data_mesh_trees()
```

```{r trees-table}
trees |>
  head(20) |>
  DT::datatable(rownames = FALSE, options = list(scrollX = TRUE))
```

## MeSH descriptor frequencies

`data_mesh_frequencies` is a bundled dataset giving the annotation frequency of each MeSH descriptor across the full PubMed corpus (39.7 M PMIDs, April 2026). Counts reflect the number of records indexed with each descriptor by NLM curators, not text frequency, making them suitable as a baseline for enrichment analyses against arbitrary PubMed subsets.

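
The baseline idea can be sketched with toy numbers before looking at the real table. The chunk below is a minimal illustration in base R, not the package's method: the column names (`descriptor`, `n_corpus`, `n_subset`), the three descriptors' counts, and the 1,000-record subset are all made up for the example and are not the schema of `data_mesh_frequencies`. Only the corpus size is taken from the text. It compares each descriptor's share in a subset with its share in the corpus and adds an upper-tail hypergeometric p-value.

```{r enrichment-sketch}
# Toy baseline: corpus-wide annotation counts per descriptor.
# Column names and counts are hypothetical, chosen for illustration only.
baseline <- data.frame(
  descriptor = c("Humans", "Neoplasms", "Political Systems"),
  n_corpus   = c(20000000, 500000, 10000)
)
corpus_size <- 39.7e6  # total PMIDs in the corpus, as stated in the text

# Toy subset: counts of the same descriptors among 1,000 retrieved records.
subset_counts <- data.frame(
  descriptor = c("Humans", "Neoplasms", "Political Systems"),
  n_subset   = c(600, 10, 40)
)
subset_size <- 1000

# Join baseline and subset counts on the descriptor name.
enrich <- merge(baseline, subset_counts, by = "descriptor")

# Enrichment ratio: subset proportion divided by corpus proportion.
enrich$ratio <- (enrich$n_subset / subset_size) /
  (enrich$n_corpus / corpus_size)

# Upper-tail hypergeometric test, P(X >= n_subset), treating the subset
# as a draw without replacement from the corpus.
enrich$p_value <- phyper(
  enrich$n_subset - 1,
  enrich$n_corpus,
  corpus_size - enrich$n_corpus,
  subset_size,
  lower.tail = FALSE
)

# Most over-represented descriptors first.
enrich[order(-enrich$ratio), ]
```

With real data, the baseline counts would come from `data_mesh_frequencies` and the subset counts from the MeSH annotations of the retrieved records; p-values across many descriptors would also need a multiple-testing correction such as `p.adjust()`.
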
```{r frequencies}
puremoe::data_mesh_frequencies |>
  head(20) |>
  dplyr::mutate(prop_total = round(prop_total, 4)) |>
  DT::datatable(rownames = FALSE, options = list(scrollX = TRUE))
```

## Persistent storage

The downloaded MeSH thesaurus and tree tables are fetched from GitHub on each call by default. To avoid re-downloading every session, set `use_persistent_storage = TRUE`; the files are cached to a system data directory and reused on subsequent calls. `data_mesh_frequencies` is bundled with the package and does not need to be downloaded.

```{r persistent, eval=FALSE}
thesaurus <- puremoe::data_mesh_thesaurus(use_persistent_storage = TRUE)
trees <- puremoe::data_mesh_trees(use_persistent_storage = TRUE)
```