TECHNICAL WORKSHOP SERIES - Information Retrieval Techniques - Part 2 (2nd Offering) by: Zahra Taherikhonakdar

Wednesday, June 26, 2024 - 15:30

Technical Workshop Series

Information Retrieval Techniques - Part 2 (2nd Offering)

 

Presenter: Zahra Taherikhonakdar

Date: Wednesday, June 26th, 2024

Time: 3:30 PM

Location: 4th Floor (Lecture Space) at 300 Ouellette Avenue (School of Computer Science Advanced Computing Hub)

 

Reminder: This is the 2nd part of a 2-part workshop. If you previously registered for Part 1 (2nd Offering), you were automatically registered for Part 2 (2nd Offering).

Note for MAC students only: MAC students must attend both parts to get any points (no points will be given if you only attend one).

 

Abstract: Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within an extensive collection. These days, we frequently think of web search first, but there are many other cases, such as searching your laptop, corporate knowledge bases, and legal information retrieval. An information retrieval process begins when a user or searcher enters a query into the system. Queries are formal statements of information needs, such as search strings in web search engines. In information retrieval, a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevance.

An object is an entity that is represented by information in a content collection or database. User queries are matched against the database information. However, unlike classical SQL queries of a database, the results returned in information retrieval may or may not match the query exactly, so results are typically ranked. This ranking of results is a key difference between information retrieval and database searching.
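
As a rough illustration of ranking by degree of relevance, the sketch below scores each document against a query with the Jaccard coefficient (one of the techniques in the outline) and sorts the results. The toy collection, the whitespace tokenizer, and the function names `jaccard` and `rank` are assumptions made for this example, not material from the workshop.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by descending Jaccard score."""
    q_terms = query.lower().split()  # naive whitespace tokenizer (assumption)
    scored = [
        (doc_id, jaccard(q_terms, text.lower().split()))
        for doc_id, text in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "legal information retrieval",
    "d2": "information need of a user",
    "d3": "corporate knowledge bases",
}

ranking = rank("information retrieval", docs)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Unlike an exact-match database lookup, every document receives a score: d1 shares both query terms, d2 shares one, and d3 shares none, so they come back in that order rather than as a plain match/no-match set.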

Workshop Outline:

In this workshop, I will introduce the techniques that are used in IR:

  • Introducing ranked retrieval
  • Scoring with the Jaccard coefficient
  • Term frequency weighting
  • Inverse document frequency weighting
  • The vector space model
  • Calculating TF-IDF cosine scores
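
To give a flavour of how the last four topics fit together, here is a minimal sketch of TF-IDF cosine scoring in the vector space model. It assumes one common weighting variant (log-scaled term frequency, 1 + log10(tf), multiplied by idf = log10(N / df)); the workshop may use a different variant. The toy documents and the function names `tfidf_vectors`, `cosine`, and `score` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(t for terms in tokenized.values() for t in set(terms))
    idf = {t: math.log10(n_docs / df_t) for t, df_t in df.items()}
    vectors = {}
    for d, terms in tokenized.items():
        tf = Counter(terms)
        vectors[d] = {t: (1 + math.log10(c)) * idf[t] for t, c in tf.items()}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score(query, vectors, idf):
    """Rank documents by cosine similarity to the query's TF-IDF vector."""
    q_tf = Counter(query.lower().split())
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: (1 + math.log10(c)) * idf[t] for t, c in q_tf.items() if t in idf}
    scored = [(d, cosine(q_vec, vec)) for d, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "ranked retrieval ranked results",
    "d2": "boolean retrieval",
    "d3": "vector space model",
}

vectors, idf = tfidf_vectors(docs)
results = score("ranked retrieval", vectors, idf)
for doc_id, s in results:
    print(doc_id, round(s, 3))
```

In this toy collection, "ranked" appears in only one document, so it carries a higher idf than "retrieval" and pulls d1 well ahead of d2, while d3 shares no query terms and scores zero.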

 

Prerequisites:

Computer Science knowledge.

 

Biography: 

Zahra is a PhD student at the University of Windsor. Her research is in the area of Information Retrieval, particularly how to improve query refinement as a technique that helps search engines retrieve the most relevant documents for a user's initial query.