What is term frequency-inverse document frequency (TF-IDF)?


Term frequency-inverse document frequency (TF-IDF) is a numerical statistic used in information retrieval to measure how important a term is to a document relative to a collection of documents (a corpus). It is calculated by multiplying the term frequency (the number of times the term appears in the document, often normalized by document length) by the inverse document frequency (the logarithm of the total number of documents divided by the number of documents that contain the term). A term therefore receives a high weight when it appears frequently in a document but rarely in the rest of the collection, and a weight near zero when it appears in almost every document.
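
The calculation described above can be illustrated with a minimal sketch. It assumes the simplest textbook variant: raw counts for term frequency, the natural logarithm, and unsmoothed inverse document frequency. Libraries often use smoothed or normalized variants, so their numbers will differ. The function name `tf_idf` and the three toy documents are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes raw-count TF and unsmoothed IDF = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


# Toy corpus (illustrative only), tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(docs)

# Print the first document's terms, highest weight first.
for term, weight in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:>4}  {weight:.3f}")
```

In this toy corpus, "the" occurs in every document, so its weight in the first document is zero even though it is the most frequent word there, while "mat" occurs in only one document and receives the highest weight.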