Implement several neural information retrieval methods


Information Retrieval and Web Search

Aim

The aim of this project is for you to implement several neural information retrieval methods, evaluate them, and compare them in the context of a multi-stage ranking pipeline.

The specific objectives of Part 2 are to:

Set up your infrastructure to index the collection and evaluate queries.
Implement neural information retrieval models (inference only).
Examine your ability to perform evaluation and analysis when different neural models are used.

The Information Retrieval Task: Web Passage Ranking

As in Part 1 of the project, in Part 2 we consider the problem of open-domain passage ranking in response to web queries. In this context, users pose queries to the search engine and expect answers in the form of a ranked list of passages (up to a maximum of 1000 passages retrieved per query).

The provided queries are actual queries submitted to the Microsoft Bing search engine. There are approximately 8.8 million passages in the collection, and the goal is to rank them based on their relevance to the queries.

What we provide you with:

Files from the practicals

A collection of 8.8 million text passages extracted from web pages (collection.tsv, provided in Week 1).
PyTorch file for the ANCE model (refer to week10-prac).

Standard DPR model: use BertModel.from_pretrained("ielabgroup/StandardBERT-DR").eval() to load this model.
Extra files for this project

A query dev file that contains 30 queries for you to perform retrieval experiments (data/dev_queries.tsv).
A query dev file that contains the same 30 queries (same query IDs as the previous file), but with typos in the query text (data/dev_typo_queries.tsv).

A qrel file that contains relevance judgements, which can be used to tune your methods on the dev queries (data/dev.qrels).

A leaderboard system for you to evaluate how well your system performs.

A test query file that contains 60 queries for you to generate run files to submit to the leaderboard (data/test_queries.tsv).

This Jupyter notebook, in which you will include your implementation, evaluation and report.

An hdf5 file that contains TILDEv2 pre-computed term weights for the collection. Download it from this link.
Typo-aware DPR model: use BertModel.from_pretrained("ielabgroup/StandardBERT-DR-aug").eval() to load this model.

Put this notebook and the provided files in the same directory.
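For reference, a minimal sketch of loading the two provided DPR checkpoints with Hugging Face transformers is shown below. The tokenizer choice (bert-base-uncased) is an assumption; use whatever tokenizer the Week 10 practical pairs with these checkpoints.

```python
# Hedged sketch: load the provided DPR checkpoints for inference only.
from transformers import BertModel, BertTokenizer

# Assumption: the checkpoints use the standard BERT-base uncased vocabulary.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Standard DPR encoder (checkpoint name given in this spec).
standard_dpr = BertModel.from_pretrained("ielabgroup/StandardBERT-DR").eval()

# Typo-aware DPR encoder: same architecture, fine-tuned with typo-augmented queries.
typo_dpr = BertModel.from_pretrained("ielabgroup/StandardBERT-DR-aug").eval()
```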

What you need to produce
You need to produce:

Correct implementations of the methods required by this project's specifications.

An explanation of the retrieval methods used, including the formulas that represent the models you implemented and the code that implements those formulas; an explanation of the evaluation settings followed; and a discussion of the findings. Please refer to the marking sheet to understand how each of these requirements is graded.

You are required to produce both of these within this Jupyter notebook.

Required methods to implement
In Part 2 of the project, you are required to implement the following retrieval methods as two-stage ranking pipelines (BM25 + one dense retriever). All implementations should be based on your own code (except for BM25, for which you can use the Pyserini built-in SimpleSearcher).
1. ANCE Dense Retriever: use ANCE to re-rank BM25 top-k documents. See the practical in Week 10 for background information.
2. Standard DPR Dense Retriever: use standard DPR to re-rank BM25 top-k documents. See the practical in Week 10 for background information.
3. Typo-aware DPR Dense Retriever: typo-aware DPR is a DPR model fine-tuned with typos augmented into the training samples. Use this model (provided in the project) to re-rank BM25 top-k documents; inference is the same as for the standard DPR Dense Retriever.
4. TILDEv2: use TILDEv2 to re-rank BM25 top-k documents. See the practical in Week 10 for background information.
For TILDEv2, unlike what you did in the practical, we provide pre-computed term weights for the whole collection (for more details, see the Initial packages and functions cell). This means you can achieve fast re-ranking with TILDEv2. Use this advantage to trade off effectiveness and efficiency in your ranking pipeline implementation; a hedged sketch of scoring with the pre-computed weights follows.
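The exact layout of the provided hdf5 file is not specified here, so the sketch below only illustrates the general pattern of re-ranking from pre-computed weights. The per-passage token_ids/weights datasets, the file name, and the max-pooling of repeated tokens are all assumptions to verify against the actual file (e.g. by inspecting it with h5py).

```python
# Hedged sketch of TILDEv2 re-ranking with pre-computed term weights.
# ASSUMPTION: the hdf5 file stores, per passage id, parallel arrays of
# BERT token ids and their weights. Inspect the provided file and adapt.
import h5py
import numpy as np

def tildev2_score(weights_file, passage_id, query_token_ids):
    """Sum, for each unique query token, its best matching weight in the passage."""
    token_ids = np.asarray(weights_file[passage_id]["token_ids"])  # assumed dataset name
    weights = np.asarray(weights_file[passage_id]["weights"])      # assumed dataset name
    score = 0.0
    for tid in set(query_token_ids):
        matched = weights[token_ids == tid]
        if matched.size > 0:
            score += float(matched.max())  # max-pool repeated occurrences
    return score

# Usage (ids are illustrative only):
# with h5py.File("tilde_weights.hdf5", "r") as f:      # assumed filename
#     score = tildev2_score(f, "7187158", [2023, 2003])
```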

You should have already attempted many of these implementations above as part of the computer prac exercises.
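As a refresher, the sketch below shows the general shape of such a two-stage pipeline: BM25 top-k retrieval with Pyserini's SimpleSearcher, followed by a DPR-style dense re-ranker. The index path, tokenizer, [CLS] pooling, maximum sequence length, and raw-passage handling are all assumptions to adapt to your own setup.

```python
# Hedged sketch of a two-stage pipeline: BM25 retrieval + DPR-style re-ranking.
import json
import time
import torch
from pyserini.search import SimpleSearcher
from transformers import BertModel, BertTokenizer

searcher = SimpleSearcher("indexes/msmarco-passage")            # assumed index path
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # assumed tokenizer
encoder = BertModel.from_pretrained("ielabgroup/StandardBERT-DR").eval()

@torch.no_grad()
def encode(text):
    inputs = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")
    # Assumption: [CLS] pooling, as is standard for DPR-style bi-encoders.
    return encoder(**inputs).last_hidden_state[:, 0, :].squeeze(0)

def rerank(query, k=100):
    start = time.perf_counter()
    hits = searcher.search(query, k=k)  # stage 1: BM25 top-k
    q_emb = encode(query)
    scored = []
    for hit in hits:
        try:
            # Depending on how the index was built, hit.raw may be a JSON string.
            passage = json.loads(hit.raw)["contents"]
        except (json.JSONDecodeError, KeyError, TypeError):
            passage = hit.raw
        scored.append((hit.docid, float(q_emb @ encode(passage))))
    scored.sort(key=lambda pair: pair[1], reverse=True)  # stage 2: dense re-ranking
    latency = time.perf_counter() - start  # per-query latency for the efficiency report
    return scored, latency
```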

Required evaluation to perform

In Part 2 of the project, you are required to perform the following evaluation. We consider two types of queries: one type contains typos (i.e. typographical mistakes, like writing iformation for information), and the other has the typos resolved. An important aspect of the evaluation in this project is to compare the retrieval behaviour of search methods on queries with and without typos (note this is the same as in project Part 1).

1. For all methods, evaluate their performance on data/dev_typo_queries.tsv (queries with typos) and data/dev_queries.tsv (the same queries, but with the typos corrected), using data/dev.qrels with four evaluation metrics (see below).

2. Report every method's effectiveness and efficiency (average query latency) on data/dev_queries.tsv (no need for the typo queries), along with the corresponding re-ranking cut-off k, in a table. Perform statistical significance analysis across the results of the methods and report it in the tables.

3. Produce a gain-loss plot that compares the most and the least effective of the four required methods above on data/dev_typo_queries.tsv (a sketch is provided at the end of this section).

4. Comment on trends and differences observed when comparing your findings.

Does the typo-aware DPR model outperform the others on the data/dev_typo_queries.tsv queries?

When evaluating the data/dev_queries.tsv queries, is there any indication that this model loses its effectiveness?

Is this gain/loss statistically significant? (Remember to also perform a t-test for this task.)
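One possible way to obtain per-query metric values and run the paired t-test is sketched below, assuming pytrec_eval for evaluation and scipy for the test; the metric (nDCG@10) is illustrative only.

```python
# Hedged sketch: per-query evaluation + paired t-test between two methods.
import pytrec_eval
from scipy import stats

def per_query_ndcg10(qrels, run):
    # qrels: {qid: {docid: relevance}}, run: {qid: {docid: score}} (pytrec_eval format)
    evaluator = pytrec_eval.RelevanceEvaluator(qrels, {"ndcg_cut.10"})
    results = evaluator.evaluate(run)
    return {qid: vals["ndcg_cut_10"] for qid, vals in results.items()}

def paired_ttest(scores_a, scores_b):
    # scores_a / scores_b: per-query values for two methods over the SAME queries.
    qids = sorted(set(scores_a) & set(scores_b))
    t, p = stats.ttest_rel([scores_a[q] for q in qids], [scores_b[q] for q in qids])
    return t, p  # p < 0.05 is the usual significance threshold
```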

(Optional) Submit runs on data/test_queries.tsv, generated with the methods you implemented on the dev sets, to the leaderboard system. This does not count towards your mark for this assignment, but the top-ranked student on the leaderboard can request a recommendation letter from Professor Guido Zuccon.
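Finally, for the gain-loss plot required above, a common presentation sorts the per-query differences between two methods and draws them as bars. In the sketch below, scores_a and scores_b are assumed to be per-query metric dicts (e.g. as produced by the evaluation sketch earlier).

```python
# Hedged sketch of a gain-loss plot: sorted per-query metric differences.
import matplotlib.pyplot as plt

def gain_loss_plot(scores_a, scores_b, metric_name="nDCG@10"):
    # scores_a / scores_b: assumed {qid: per-query metric value} dicts.
    qids = sorted(scores_a, key=lambda q: scores_a[q] - scores_b[q], reverse=True)
    deltas = [scores_a[q] - scores_b[q] for q in qids]
    colors = ["tab:green" if d >= 0 else "tab:red" for d in deltas]
    plt.figure(figsize=(10, 4))
    plt.bar(range(len(deltas)), deltas, color=colors)
    plt.xticks(range(len(qids)), qids, rotation=90, fontsize=6)
    plt.ylabel(f"gain/loss in {metric_name}")
    plt.xlabel("query id")
    plt.tight_layout()
    plt.show()
```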