go get -u github.com/OpenSystemsLab/stopwords
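package main

import (
	"fmt"
	"log"

	"github.com/OpenSystemsLab/stopwords"
)

func main() {
	// Register a language first; registration returns an error for unsupported codes
	if err := stopwords.RegisterLanguage("fr"); err != nil {
		log.Fatal(err)
	}

	// Check if a word is a stopword
	fmt.Println(stopwords.IsStopWord("fr", "avec")) // true
}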
The library includes a Registry system that loads only the languages you register, which reduces memory use and startup time. This is especially useful when working with multiple languages or in memory-constrained environments.
package main

import (
	"fmt"
	"log"

	"github.com/OpenSystemsLab/stopwords"
)

func main() {
	// Register a single language
	err := stopwords.RegisterLanguage("en")
	if err != nil {
		log.Fatal(err)
	}

	// Check stopwords
	fmt.Printf("Is 'the' a stopword? %t\n", stopwords.IsStopWord("en", "the"))
	fmt.Printf("Is 'car' a stopword? %t\n", stopwords.IsStopWord("en", "car"))
}
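In practice the check is usually applied to a whole list of tokens rather than a single word. The following is a minimal sketch of that pattern. `removeStopWords` and its predicate parameter are hypothetical helpers, not part of the library, and the demo uses a stand-in predicate so it runs without the package installed. Whether the library's lookup is case-sensitive is not stated here, so the stand-in lowercases tokens itself.

```go
package main

import (
	"fmt"
	"strings"
)

// removeStopWords is a hypothetical helper (not part of the library).
// It returns the tokens for which isStop reports false, preserving order.
func removeStopWords(tokens []string, isStop func(word string) bool) []string {
	kept := make([]string, 0, len(tokens))
	for _, tok := range tokens {
		if !isStop(tok) {
			kept = append(kept, tok)
		}
	}
	return kept
}

func main() {
	// Stand-in predicate so the sketch runs on its own.
	// With the library registered for "en", this could instead be:
	//   func(w string) bool { return stopwords.IsStopWord("en", w) }
	demo := map[string]bool{"the": true, "is": true, "a": true}
	isStop := func(w string) bool { return demo[strings.ToLower(w)] }

	tokens := strings.Fields("The car is a classic")
	fmt.Println(removeStopWords(tokens, isStop)) // [car classic]
}
```

Passing the check as a function keeps the filtering logic independent of which registry or language is in use.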
// Register multiple languages at once
err := stopwords.RegisterLanguages("en", "fr", "es", "de")
if err != nil {
	log.Fatal(err)
}
// Now you can check stopwords in any registered language
fmt.Printf("English: %t\n", stopwords.IsStopWord("en", "the"))
fmt.Printf("French: %t\n", stopwords.IsStopWord("fr", "le"))
fmt.Printf("Spanish: %t\n", stopwords.IsStopWord("es", "el"))
fmt.Printf("German: %t\n", stopwords.IsStopWord("de", "der"))
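When language codes come from user input or configuration, it can be useful to register them one at a time and collect the failures instead of aborting on the first error. The sketch below assumes only the `RegisterLanguage(lang string) error` signature. `registerEach` and `errUnsupported` are hypothetical names, and the demo substitutes a stand-in registration function so it runs without the library.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is a hypothetical error used only by the stand-in below.
var errUnsupported = errors.New("unsupported language")

// registerEach is a hypothetical helper (not part of the library).
// It calls register once per language code and collects the failures,
// so one bad code does not prevent the others from loading.
func registerEach(register func(lang string) error, langs ...string) (loaded []string, failed map[string]error) {
	failed = make(map[string]error)
	for _, lang := range langs {
		if err := register(lang); err != nil {
			failed[lang] = err
			continue
		}
		loaded = append(loaded, lang)
	}
	return loaded, failed
}

func main() {
	// Stand-in for stopwords.RegisterLanguage so the sketch runs on its own.
	supported := map[string]bool{"en": true, "fr": true}
	register := func(lang string) error {
		if !supported[lang] {
			return errUnsupported
		}
		return nil
	}

	loaded, failed := registerEach(register, "en", "xx", "fr")
	fmt.Println("loaded:", loaded)      // loaded: [en fr]
	fmt.Println("failed:", len(failed)) // failed: 1
}
```

With the real package, the call would presumably be `registerEach(stopwords.RegisterLanguage, "en", "fr")`.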
// Check which languages are currently loaded
loaded := stopwords.LoadedLanguages()
fmt.Printf("Loaded languages: %v\n", loaded)
// Check if a specific language is loaded
fmt.Printf("Is English loaded? %t\n", stopwords.IsLanguageLoaded("en"))
// Unregister a language to free memory
stopwords.UnregisterLanguage("de")
// Clear all loaded languages
stopwords.Clear()
For isolated use cases, you can create your own registry:
// Create a custom registry
customRegistry := stopwords.NewRegistry()
// Load languages only in this registry
err := customRegistry.RegisterLanguage("ja")
if err != nil {
	log.Fatal(err)
}
// Use the custom registry
fmt.Printf("Japanese stopword: %t\n", customRegistry.IsStopWord("ja", "の"))
// The default registry remains unaffected
fmt.Printf("Japanese in default registry: %t\n", stopwords.IsStopWord("ja", "の"))
Get a list of all supported languages:
supported := stopwords.GetSupportedLanguages()
fmt.Printf("Total supported languages: %d\n", len(supported))
fmt.Printf("Languages: %v\n", supported)
The library supports 60+ languages including English, French, Spanish, German, Japanese, Chinese, Arabic, and many more.
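The supported-language list can also be used to validate a requested code before registering it. The following is a small sketch under stated assumptions: `pickLanguage` is a hypothetical helper, the hard-coded slice stands in for the result of `stopwords.GetSupportedLanguages()`, and codes are assumed to be lowercase, as in the examples above.

```go
package main

import (
	"fmt"
	"strings"
)

// pickLanguage is a hypothetical helper (not part of the library).
// It trims and lowercases the requested code and returns it only if
// it appears in the supported list.
func pickLanguage(supported []string, requested string) (string, bool) {
	code := strings.ToLower(strings.TrimSpace(requested))
	for _, s := range supported {
		if s == code {
			return code, true
		}
	}
	return "", false
}

func main() {
	// Stand-in for the slice returned by stopwords.GetSupportedLanguages().
	supported := []string{"de", "en", "es", "fr", "ja"}

	for _, req := range []string{" EN ", "xx"} {
		if code, ok := pickLanguage(supported, req); ok {
			fmt.Printf("%q -> register %q\n", req, code)
		} else {
			fmt.Printf("%q -> not supported\n", req)
		}
	}
}
```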
- `RegisterLanguage(lang string) error` - Load a single language
- `RegisterLanguages(langs ...string) error` - Load multiple languages
- `IsStopWord(lang, word string) bool` - Check if a word is a stopword
- `IsLanguageLoaded(lang string) bool` - Check if a language is loaded
- `LoadedLanguages() []string` - Get list of loaded languages
- `UnregisterLanguage(lang string)` - Remove a language from memory
- `Clear()` - Remove all languages from memory
- `GetSupportedLanguages() []string` - Get all supported language codes
- Memory Efficient: Only load the languages you need
- Thread Safe: All operations are protected by read/write mutexes
- Flexible: Use the default registry or create custom ones
- Error Handling: Registering an unsupported language returns an error
- Performance: Faster startup and reduced memory footprint
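To illustrate the thread-safety and memory points above, here is a minimal sketch of how a registry guarded by a read/write mutex could be structured. It is not the library's implementation. `sketchRegistry` and its methods are hypothetical, and word lists are passed in directly rather than loaded from bundled data.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// sketchRegistry is a hypothetical stand-in, not the library's Registry type.
// It maps a language code to a set of stopwords, guarded by an RWMutex.
type sketchRegistry struct {
	mu    sync.RWMutex
	langs map[string]map[string]struct{}
}

func newSketchRegistry() *sketchRegistry {
	return &sketchRegistry{langs: make(map[string]map[string]struct{})}
}

// register stores a word set under lang; writers take the exclusive lock.
func (r *sketchRegistry) register(lang string, words []string) {
	set := make(map[string]struct{}, len(words))
	for _, w := range words {
		set[w] = struct{}{}
	}
	r.mu.Lock()
	r.langs[lang] = set
	r.mu.Unlock()
}

// unregister drops a language so its word set can be garbage collected.
func (r *sketchRegistry) unregister(lang string) {
	r.mu.Lock()
	delete(r.langs, lang)
	r.mu.Unlock()
}

// isStopWord takes only the read lock, so lookups can run in parallel.
func (r *sketchRegistry) isStopWord(lang, word string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.langs[lang][word] // indexing a missing language yields false
	return ok
}

// loaded returns the registered language codes in sorted order.
func (r *sketchRegistry) loaded() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]string, 0, len(r.langs))
	for lang := range r.langs {
		out = append(out, lang)
	}
	sort.Strings(out)
	return out
}

func main() {
	reg := newSketchRegistry()
	reg.register("en", []string{"the", "a"})
	reg.register("fr", []string{"le", "avec"})

	// Concurrent readers share the read lock.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = reg.isStopWord("en", "the")
		}()
	}
	wg.Wait()

	reg.unregister("fr")
	fmt.Println(reg.loaded(), reg.isStopWord("fr", "le")) // [en] false
}
```

A read/write mutex suits this workload because lookups vastly outnumber registrations, and readers do not block each other.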
stopwords is licensed under the MIT license.