School of Computer Science
Technical Workshop Series: Information Retrieval Techniques
Presenter: Zahra Taherikhonakdar
Date: Wednesday, November 29th 2:00pm – 3:00pm
Location: 4th Floor (Workshop space) at 300 Ouellette Avenue (School of Computer Science Advanced Computing Hub)
LATECOMERS WILL NOT BE ADMITTED once the presentation has begun.
Abstract:
Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections. These days we frequently think first of web search, but there are many other cases: searching your laptop, corporate knowledge bases, and legal information retrieval. An information retrieval process begins when a user or searcher enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval, a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevance.
An object is an entity that is represented by information in a content collection or database. User queries are matched against the database information. However, as opposed to classical SQL queries of a database, in information retrieval the results returned may or may not match the query, so results are typically ranked. This ranking of results is a key difference of information retrieval searching compared to database searching.[2]
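As a small illustration of ranked results, the sketch below scores three toy documents against a query using the Jaccard coefficient, one of the techniques listed in the outline. The document IDs, sample texts, and the helper names `jaccard` and `rank` are hypothetical and chosen only for this example; whitespace tokenization is a simplifying assumption.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(query: str, docs: dict) -> list:
    """Return (doc_id, score) pairs sorted by descending Jaccard score."""
    q_terms = set(query.lower().split())
    scored = [
        (doc_id, jaccard(q_terms, set(text.lower().split())))
        for doc_id, text in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection (not taken from the workshop material).
docs = {
    "d1": "legal information retrieval",
    "d2": "web information retrieval systems",
    "d3": "searching your laptop",
}
ranking = rank("legal information retrieval", docs)
for doc_id, score in ranking:
    print(doc_id, round(score, 2))
```

Unlike a database query, which would simply return or omit each record, every document here receives a score, so partial matches such as "d2" still appear in the list, below the exact match.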
Workshop Outline:
In this workshop I will introduce the techniques that are used in IR:
- Introducing ranked retrieval
- Scoring with the Jaccard coefficient
- Term frequency weighting
- Inverse document frequency weighting
- The vector space model
- Calculating TF-IDF cosine scores
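To give a flavour of how the last four outline items fit together, here is a minimal sketch of TF-IDF cosine scoring. It assumes one common weighting variant, log-scaled term frequency (1 + log10 tf) multiplied by idf = log10(N / df); the workshop may present a different variant. The toy corpus, the query, and the function names `tf_idf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log10(n_docs / count) for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: (1 + log10 tf) * idf.
        vectors.append({t: (1 + math.log10(c)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus, tokenized on whitespace.
corpus = [
    "web search engines rank documents".split(),
    "legal information retrieval".split(),
    "corporate knowledge bases support search".split(),
]
doc_vectors, idf = tf_idf_vectors(corpus)

# Weight the query with the same scheme, ignoring terms unseen in the corpus.
query_tf = Counter("legal search".split())
query_vec = {t: (1 + math.log10(c)) * idf[t] for t, c in query_tf.items() if t in idf}

scores = [cosine(query_vec, d) for d in doc_vectors]
best = max(range(len(corpus)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

In this sketch the rare term "legal" carries a higher idf than "search", which occurs in two of the three documents, so the document containing "legal" ranks first even though two documents match "search".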
Prerequisites:
Computer Science knowledge
Biography:
I am a PhD student at the University of Windsor. My research is in the area of Information Retrieval. In particular, I study how to improve query refinement as a technique that helps search engines retrieve the most relevant documents based on a user's initial query.