in development
pubtator_sentences() for mapping PubTator3 entity annotations to the
sentence in which they occur. The function uses PubTator's own passage text
and offsets for alignment, preserves empty PubTator title/abstract placeholder
rows for transparency, and returns a clean annotation table with
sentence_id and sentence columns.

pubtator_cooccurrence() for entity co-occurrence counts from the
sentence-mapped table returned by pubtator_sentences(). Counts unordered
entity pairs within the same sentence (window = 0) or within window
sentences of each other, aggregated by entity type or by specific entity.
De-duplicates entities per sentence and drops same-entity pairs. With
evidence = TRUE, returns one row per co-occurrence instance with the joined
sentence context, so every count is traceable to concrete text.

citation_snowball() for citation-based corpus expansion. Takes an icites
data.table and follows one-hop citation links using the NIH Open Citation
Collection data already embedded in every iCite response. Supports
max_nodes (hard ceiling on corpus size), direction, and min_links.
Returns a candidate table with seed flags and citation-link counts, and does
not make a second iCite call; pass snowball$pmid explicitly to
get_records() when metadata for the expanded corpus is needed.

citation_network() for citation network analysis. Takes an
icites data.table from get_records() and returns a
named list with nodes (full iCite metadata as node attributes, including
relative_citation_ratio and is_clinical) and edges
(from_pmid, to_pmid), filtered to within-corpus pairs only. Output is
ready for igraph or tidygraph.

get_records(endpoint = "pubtations") now includes PubTator passage text and
passage offsets in its raw output, allowing downstream sentence mapping to use
the same text that PubTator annotated.

2026-04-21
data_mesh_frequencies, a bundled dataset of MeSH descriptor frequencies
across the full PubMed corpus (39.7 M PMIDs, April 2026). Columns DescriptorUI,
DescriptorName, n_pmids, and prop_total. Intended as a baseline for
MeSH term enrichment analyses.

pmid_to_ftp() updated to use the PMC Cloud Service on AWS S3
(pmc-oa-opendata.s3.amazonaws.com) in response to NCBI's migration away from
the legacy PMC FTP Service (transition period February–August 2026; FTP
decommissioned August 2026). The function interface is unchanged.

2026-01-26
pmid_to_ftp() to convert PMIDs to full-text download URLs for open-access PMC
articles; pass $url to get_records(endpoint = 'pmc_fulltext').

endpoint_info() to provide schema, columns, and rate limits for each endpoint.

data_mesh_embeddings() function).

2024-10-15
biocjson format from the API endpoint.

2024-05-13
pmc_fulltext endpoint.
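
Illustration of the sentence-window counting described in the pubtator_cooccurrence() entry under "in development". This is a minimal base-R sketch, not the package's implementation: the helper name count_cooccurrence(), the pmid and entity column names, and the toy data are all hypothetical (only sentence_id comes from the notes above). It also assumes that, for window > 0, each qualifying pairing of sentences counts as one co-occurrence instance, which may differ from what the package does.

```r
# Hypothetical helper, for illustration only; not part of the package.
# ann: data.frame with columns pmid, sentence_id (numeric), entity (character).
count_cooccurrence <- function(ann, window = 0L) {
  # De-duplicate so each entity appears at most once per sentence.
  ann <- unique(ann[, c("pmid", "sentence_id", "entity")])

  # Self-join within each document to enumerate candidate pairs.
  pairs <- merge(ann, ann, by = "pmid", suffixes = c("_a", "_b"))

  # entity_a < entity_b keeps each unordered pair once and drops
  # same-entity pairs; the sentence distance enforces the window.
  keep <- pairs$entity_a < pairs$entity_b &
    abs(pairs$sentence_id_a - pairs$sentence_id_b) <= window
  pairs <- pairs[keep, , drop = FALSE]

  if (nrow(pairs) == 0L) {
    return(data.frame(entity_a = character(), entity_b = character(),
                      n = integer(), stringsAsFactors = FALSE))
  }

  # Count instances per entity pair, most frequent first.
  out <- aggregate(n ~ entity_a + entity_b,
                   data = transform(pairs, n = 1L), FUN = sum)
  out[order(-out$n, out$entity_a, out$entity_b), ]
}

# Toy annotations (made up): BRCA1 is mentioned twice in one sentence.
ann <- data.frame(
  pmid        = c(1, 1, 1, 1, 2, 2),
  sentence_id = c(1, 1, 1, 2, 1, 1),
  entity      = c("BRCA1", "TP53", "BRCA1", "EGFR", "BRCA1", "TP53"),
  stringsAsFactors = FALSE
)

w0 <- count_cooccurrence(ann, window = 0)  # same-sentence pairs only
w1 <- count_cooccurrence(ann, window = 1)  # also adjacent sentences
print(w0)
print(w1)
```

With window = 0 only the BRCA1/TP53 pair is counted, once per document; widening the window to 1 additionally pairs EGFR with the entities in the neighbouring sentence of the first document.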