
Bag-of-words model

The bag-of-words model is a simplifying representation used in natural language processing and information retrieval (IR). In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity. The bag-of-words model has also been used for computer vision. It is commonly used in methods of document classification, where the occurrence (or frequency) of each word is used as a feature for training a classifier. An early reference to "bag of words" in a linguistic context can be found in Zellig Harris's 1954 article "Distributional Structure".
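As a rough illustration of the representation described above, the following Python sketch builds a bag (multiset) of words for two short texts and turns each bag into a count vector over a shared vocabulary. The helper names `bag_of_words` and `to_vector`, the sample sentences, and the simple regular-expression tokenizer are assumptions made for this example; real systems typically use more careful tokenization.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return a multiset (Counter) of the lowercase word tokens in text."""
    # Assumed tokenizer: runs of letters, digits, or apostrophes count as words.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


def to_vector(bag, vocabulary):
    """Map a bag to a list of counts, one entry per vocabulary word."""
    # Counter returns 0 for words that do not occur in the bag.
    return [bag[word] for word in vocabulary]


# Hypothetical sample documents.
doc_a = "John likes to watch movies. Mary likes movies too."
doc_b = "Mary also likes to watch football games."

bag_a = bag_of_words(doc_a)
bag_b = bag_of_words(doc_b)

# A shared, sorted vocabulary gives every document a vector of the same length.
vocabulary = sorted(set(bag_a) | set(bag_b))
vec_a = to_vector(bag_a, vocabulary)
vec_b = to_vector(bag_b, vocabulary)

print(vocabulary)
print(vec_a)
print(vec_b)
```

Because only counts are kept, two texts with the same words in a different order produce the same bag. Vectors such as `vec_a` and `vec_b` are the kind of per-word frequency features that could be passed to a document classifier.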

[ "Machine learning", "Artificial intelligence", "Pattern recognition", "Natural language processing", "Image spam" ]