Linguistics Library
The functions in this library provide simple linguistic processing capabilities.
Functions
clinical-context
(concept: String, sentence: String) →
Determines the context of a concept within a sentence. The returned vector contains the following elements in order:
- concept
- sentence
- negation context (“affirmed”, “negated”, “possible”)
- temporality context (“recent”, “hypothetical”, “historical”)
- experiencer context (“patient”, “other”)
Examples
clinical-context("pneumonia", "The patient denied a history of pneumonia.") =
< "pneumonia",
"The patient denied a history of pneumonia.",
"negated",
"historical",
"patient"
>
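The following Python sketch shows, in a hedged way, how a trigger-word approach can produce the five-element result above. It is not the library's implementation: the trigger sets, the function name clinical_context, and the rule of scanning only the words before the concept are all assumptions made for illustration.

```python
import re

# Hypothetical trigger sets; a real lexicon would be far larger.
NEGATED = {"no", "not", "denied", "denies", "without"}
POSSIBLE = {"possible", "probable", "suspected"}
HISTORICAL = {"history", "previous", "prior"}
HYPOTHETICAL = {"if", "should"}
OTHER = {"mother", "father", "brother", "sister", "family"}


def clinical_context(concept, sentence):
    """Return [concept, sentence, negation, temporality, experiencer]."""
    lowered = sentence.lower()
    position = lowered.find(concept.lower())
    # Assumption: only the words preceding the concept are inspected.
    scope = lowered[:position] if position >= 0 else ""
    words = set(re.findall(r"[a-z']+", scope))

    if words & NEGATED:
        negation = "negated"
    elif words & POSSIBLE:
        negation = "possible"
    else:
        negation = "affirmed"

    if words & HISTORICAL:
        temporality = "historical"
    elif words & HYPOTHETICAL:
        temporality = "hypothetical"
    else:
        temporality = "recent"

    experiencer = "other" if words & OTHER else "patient"
    return [concept, sentence, negation, temporality, experiencer]
```

With no trigger words present, the sketch falls back to "affirmed", "recent", and "patient".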
Implementation
This is implemented internally using Lingua::ConText.
classify-text-language
(text: String, languages -> «en») → String
Classifies the text as one of the listed languages and returns the most likely one. If no languages are given, only «en» is considered.
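As a rough illustration of picking the most likely language from a candidate list, here is a Python sketch that scores each candidate by how many of its common words appear in the text. This word-overlap scoring is a stand-in chosen for brevity, not the method the library uses; the PROFILES table and the function name classify_text_language are hypothetical.

```python
import re

# Hypothetical, tiny word profiles; real classifiers use much richer models.
PROFILES = {
    "en": {"the", "and", "of", "is", "to"},
    "de": {"der", "die", "und", "ist", "das"},
    "fr": {"le", "la", "et", "est", "les"},
}


def classify_text_language(text, languages=("en",)):
    """Return the candidate language whose profile matches the most words."""
    words = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(word in PROFILES.get(lang, set()) for word in words)
        for lang in languages
    }
    # On a tie, the language listed first wins.
    return max(languages, key=lambda lang: scores[lang])
```

Because the result is always drawn from the candidate list, a call with the default list can only answer "en".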
Implementation
This is implemented internally using Lingua::YALI.
sentences
(text: String) →
Splits a text into a vector of sentences. Uses a list of common abbreviations for the text’s language to avoid breaks in the middle of sentences.
Examples
sentences("The big black bug bit the big black bear. Suzy sold seashells by the sea shore. The lazy dog jumped over the crazy cow.") =
< "The big black bug bit the big black bear.",
"Suzy sold seashells by the sea shore.",
"The lazy dog jumped over the crazy cow."
>
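The role of the abbreviation list can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the library's algorithm: the ABBREVIATIONS set is a hypothetical English sample, and a break is considered only where ".", "!" or "?" is followed by whitespace.

```python
import re

# Hypothetical sample of English abbreviations (lowercase, without the period).
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "etc", "vs"}


def sentences(text):
    """Split text into sentences, skipping breaks after known abbreviations."""
    result = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        # A period right after an abbreviation does not end the sentence.
        if text[match.start()] == "." and last_word in ABBREVIATIONS:
            continue
        result.append(text[start:match.start() + 1])
        start = match.end()
    tail = text[start:].strip()
    if tail:
        result.append(tail)
    return result
```

Under these assumptions, "Dr. Smith arrived. He sat down." yields two sentences rather than three, because the period after "Dr" is skipped.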
Implementation
This is implemented internally using Lingua::Sentence.
stop-words
(language: String) →
Lists the default stop words for the given language. Available languages are listed in stop-word-languages.
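A typical use of a stop word list is to filter common words out of a text before further processing. The Python sketch below illustrates that idea only; the STOP_WORDS table is a hypothetical, heavily shortened list and content_words is an illustrative helper, not a function of this library.

```python
import re

# Hypothetical, heavily shortened list; real lists hold many more words.
STOP_WORDS = {
    "en": {"a", "an", "and", "in", "is", "of", "on", "the", "to"},
}


def content_words(text, language="en"):
    """Return the lowercased words of text that are not stop words."""
    if language not in STOP_WORDS:
        raise ValueError(f"no stop word list for {language!r}")
    stop_words = STOP_WORDS[language]
    return [w for w in re.findall(r"\w+", text.lower()) if w not in stop_words]
```

Rejecting an unknown language up front mirrors the idea that only the languages with available lists are valid arguments.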
Implementation
This is implemented internally using Lingua::StopWords.
Streams
language-classifier-languages
A vector of languages recognized by the language classifier.
Implementation
This is implemented internally.
stop-word-languages
A vector of languages for which stop word lists are available.
Implementation
<<da nl en fi fr de hu it no pt es sv ru>>