Package: textpress 1.1.1

Jason Timm

textpress: A Lightweight and Versatile NLP Toolkit

An R toolkit for building text corpora and searching them. No custom object classes, just plain data frames from start to finish. Covers the full arc from URL to retrieved passage through a consistent four-step API: Fetch, Read, Process, Search. Traditional tools (KWIC, BM25, dictionary matching) sit alongside modern ones (semantic search, LLM-ready chunking), all compatible with the native R pipe ('|>').
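
The "plain data frames from start to finish" idea can be illustrated with a minimal base-R sketch. It deliberately does not call any textpress function, since their signatures are not shown here; the toy corpus, the column names (`doc_id`, `sentence_id`, `text`), and the helpers `split_sentences()` and `search_pattern()` are all hypothetical stand-ins for a Process step and a Search step chained with the native pipe.

```r
# Hypothetical toy corpus: one row per document, held in a plain data frame.
corpus <- data.frame(
  doc_id = c("1", "2"),
  text = c("Semantic search finds passages. BM25 ranks them.",
           "KWIC shows keywords in context."),
  stringsAsFactors = FALSE
)

# Process step (hypothetical helper): one row per sentence.
# Splits on whitespace that follows sentence-final punctuation.
split_sentences <- function(df) {
  parts <- strsplit(df$text, "(?<=[.!?])\\s+", perl = TRUE)
  data.frame(
    doc_id = rep(df$doc_id, lengths(parts)),
    sentence_id = sequence(lengths(parts)),
    text = unlist(parts),
    stringsAsFactors = FALSE
  )
}

# Search step (hypothetical helper): keep rows whose text matches a regex.
search_pattern <- function(df, pattern) {
  df[grepl(pattern, df$text, ignore.case = TRUE), , drop = FALSE]
}

# Every step takes and returns a data frame, so the steps chain with |>.
hits <- corpus |> split_sentences() |> search_pattern("bm25")
print(hits)
```

Because each step returns an ordinary data frame, intermediate results can be inspected, filtered, or joined with standard tools at any point in the chain. The native pipe requires R 4.1 or later.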

Authors: Jason Timm [aut, cre]

textpress_1.1.1.tar.gz
textpress_1.1.1.zip (r-4.7), textpress_1.1.1.zip (r-4.6), textpress_1.1.1.zip (r-4.5)
textpress_1.1.1.tgz (r-4.6-any), textpress_1.1.1.tgz (r-4.5-any)
textpress_1.1.1.tar.gz (r-4.7-any), textpress_1.1.1.tar.gz (r-4.6-any)
textpress_1.1.1.tgz (r-4.6-emscripten)

# Install 'textpress' in R:
install.packages('textpress', repos = c('https://jaytimm.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/jaytimm/textpress/issues

Pkgdown/docs site: https://jaytimm.github.io

corpus-search, nlp, web-scraping

3.78 score, 3 stars, 6 scripts, 498 downloads, 18 exports, 31 dependencies

Last updated from: 33c73e76ae. Checks: 7 WARNING, 2 OK. Indexed: yes.

Target                  Result    Time
linux-devel-x86_64      WARNING   170
source / vignettes      OK        158
linux-release-x86_64    WARNING   164
macos-release-arm64     WARNING   133
macos-oldrel-arm64      WARNING   136
windows-devel           WARNING   79
windows-release         WARNING   95
windows-oldrel          WARNING   73
wasm-release            OK        108

Exports: abbreviations, dict_generations, dict_political, fetch_urls, fetch_wiki_refs, fetch_wiki_urls, nlp_cast_tokens, nlp_index_tokens, nlp_roll_chunks, nlp_split_paragraphs, nlp_split_sentences, nlp_tokenize_text, read_urls, search_dict, search_index, search_regex, search_vector, util_fetch_embeddings

Dependencies: askpass, cli, cpp11, curl, data.table, generics, glue, httr, jsonlite, lattice, lifecycle, lubridate, magrittr, Matrix, mime, openssl, pbapply, pillar, pkgconfig, R6, rlang, rvest, selectr, stringi, stringr, sys, tibble, timechange, utf8, vctrs, xml2