
arrow - Integration to 'Apache' 'Arrow'
'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.
Last updated 10 hours ago
arrow curl openssl cpp
19.33 score 15k stars 82 dependents 11k scripts 346k downloads

stringdist - Approximate String Matching, Fuzzy Text Search, and String Distance Functions
Implements an approximate string matching version of R's native 'match' function. Also offers fuzzy text search based on various string distance measures. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding, or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'openMP'. An API for C or C++ is exposed as well. Reference: MPJ van der Loo (2014) <doi:10.32614/RJ-2014-011>.
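To illustrate what edit-based approximate matching means, here is a minimal Python sketch of the Levenshtein distance and of a lookup that returns the closest table entry within a distance cutoff. It is not the package's code or API: the names `levenshtein`, `approx_match` and `max_dist` are hypothetical, and the tie-breaking rule (first closest entry wins) is an assumption made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions and substitutions
    needed to turn `a` into `b` (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def approx_match(x: str, table: list[str], max_dist: int = 1):
    """Hypothetical helper: index of the first closest entry in `table`
    whose distance to `x` is at most `max_dist`, or None if none qualifies."""
    best_index, best_dist = None, max_dist + 1
    for idx, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:
            best_index, best_dist = idx, d
    return best_index


fruits = ["banana", "apple", "cherry"]
print(levenshtein("kitten", "sitting"))
print(approx_match("aple", fruits))
print(approx_match("xyz", fruits))
```

An exact lookup would miss the misspelled "aple"; with a cutoff of one edit it resolves to the "apple" entry, while "xyz" stays unmatched.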
Last updated 3 months ago
openmp
15.54 score 327 stars 179 dependents 2.0k scripts 67k downloads

pdftools - Text Extraction, Rendering and Converting of PDF Documents
Utilities based on 'libpoppler' <https://poppler.freedesktop.org> for extracting text, fonts, attachments and metadata from a PDF file. Also supports high quality rendering of PDF documents into PNG, JPEG, TIFF format, or into raw bitmap vectors for further processing in R.
Last updated 14 days ago
pdf-files pdf-format pdftools poppler poppler-library text-extraction cpp
13.11 score 529 stars 48 dependents 3.3k scripts 29k downloads

rocnp - Work with Romanian Personal Numeric Codes PNC / CNP
A set of tools for working with Romanian personal numeric codes. The core is a validation function which applies several verification criteria to assess the validity of numeric codes. This is accompanied by functionality for extracting the different components of a personal numeric code. A personal numeric code is issued to all Romanian residents either at birth or when they obtain a residence permit.
Last updated 3 years ago
2.70 score 2 scripts 148 downloads
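
The rocnp description mentions validation and component extraction without listing the criteria it applies. As a rough illustration only, the Python sketch below uses the publicly documented CNP layout (S YY MM DD CC NNN C) and its control-digit rule (weights 2 7 9 1 4 6 3 5 8 2 7 9, sum modulo 11, with a remainder of 10 mapped to 1). The function names are hypothetical, this is not the package's code, and the checks shown are assumed to overlap only partly with what the package verifies. The sample number is synthetic.

```python
from datetime import date

# Weights applied to the first 12 digits when computing the control digit.
WEIGHTS = (2, 7, 9, 1, 4, 6, 3, 5, 8, 2, 7, 9)

# First digit -> century of birth; 7, 8 and 9 (residents and foreigners)
# do not determine the century, so no date is derived for them here.
CENTURIES = {1: 1900, 2: 1900, 3: 1800, 4: 1800, 5: 2000, 6: 2000}


def control_digit(first12: str) -> int:
    """Control digit implied by the first 12 digits of a CNP."""
    remainder = sum(int(d) * w for d, w in zip(first12, WEIGHTS)) % 11
    return 1 if remainder == 10 else remainder


def parse_cnp(cnp: str):
    """Hypothetical helper: return the components of `cnp` as a dict,
    or None if it fails the format, control-digit or date checks."""
    if len(cnp) != 13 or not (cnp.isascii() and cnp.isdigit()):
        return None
    first = int(cnp[0])
    if first == 0 or control_digit(cnp[:12]) != int(cnp[12]):
        return None
    birth = None
    century = CENTURIES.get(first)
    if century is not None:
        try:
            birth = date(century + int(cnp[1:3]), int(cnp[3:5]), int(cnp[5:7]))
        except ValueError:  # impossible month or day
            return None
    return {
        "first_digit": first,
        "birth_date": birth,
        "county_code": cnp[7:9],
        "serial": cnp[9:12],
        "control_digit": int(cnp[12]),
    }


print(parse_cnp("1800101221144"))  # synthetic number with a matching control digit
print(parse_cnp("1800101221145"))  # same digits, wrong control digit
```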