# Field model score problem

## Details

• Type: Bug
• Status: Open
• Priority: Minor
• Resolution: Unresolved
• Affects Version/s: 3.5
• Fix Version/s: None
• Component/s:
• Labels: None

## Description

Hello,

I think there is a problem in the score calculation of the field model. For example, suppose we index two fields (the first field contains simple terms, the second field contains bigrams).
If we set the parameter w.1 of the second field (bigrams) to zero, I would expect the score to be computed from the simple terms only, because the bigram score is weighted by zero. The problem is that we do not get the same score as when we index only simple terms.

P(Q|D) = w.0 P(ti|D) + w.1 P(ti_tj|D). In this example, if we set w.1 to zero, we do not get the same score as when we calculate:

P(Q|D) = w.0 P(ti|D)

So I think there is a problem.
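
The expectation described above can be written out as a minimal sketch. It assumes the final score really is a plain weighted sum of independent per-field scores, which is the reporter's simplified model and not necessarily how BM25F or PL2F combine fields. The function name and the per-field score values are hypothetical and chosen only for illustration.

```python
def combined_score(field_scores, weights):
    """Weighted linear sum of per-field scores (simplified model, an assumption)."""
    return sum(w * s for w, s in zip(weights, field_scores))


# Hypothetical per-field scores for one document: unigram field, bigram field.
unigram_score, bigram_score = 0.42, 0.17

# Two-field index with the bigram weight w.1 set to zero.
two_field = combined_score([unigram_score, bigram_score], weights=[1.0, 0.0])

# Unigram-only index with w.0 = 1.
unigram_only = combined_score([unigram_score], weights=[1.0])

print(two_field == unigram_only)
```

Under this model the two scores are identical, so any difference seen in practice would have to come from how the per-field statistics are computed or combined, not from the weighted sum itself.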

## Activity

Craig Macdonald added a comment -

Which field model?

Craig

chedi bechikh added a comment -

I tested the BM25F model, but I think the problem is in the way the scores of the two fields are added together.
Please correct me if I am wrong: if I set the second field's score to zero, should I not get the same score as when I use only single-term queries and a single-term index?
Thank you

Craig Macdonald added a comment -

For BM25F, I don't think this would be the case, as the Nt (the number of documents in which the term appears) does not count occurrences in the different fields. It might work for PL2F, as we record F (the number of occurrences in each field) separately for fields.
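
The point about Nt can be illustrated with a small sketch. It uses a simplified BM25F-like formula (weighted pseudo-frequency, no length normalisation, one common IDF variant) and is an assumption for illustration only, not Terrier's actual implementation. All numbers and function names are hypothetical.

```python
import math


def idf(num_docs, doc_freq):
    # One common BM25 IDF variant; the real model may use a different form.
    return math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25f_like(field_tfs, weights, num_docs, doc_freq, k1=1.2):
    # Per-field frequencies are weighted and summed BEFORE saturation.
    # Length normalisation is omitted to keep the sketch short.
    tfn = sum(w * tf for w, tf in zip(weights, field_tfs))
    return idf(num_docs, doc_freq) * tfn / (k1 + tfn)


num_docs = 100
# Hypothetical term: occurs in field 0 of 2 documents, and only in field 1
# of 3 further documents.
nt_two_field_index = 5   # Nt counts documents regardless of field
nt_field0_only_index = 2  # Nt on an index that contains field 0 alone

# Same document, term frequency 3 in field 0 and 1 in field 1.
score_two_field = bm25f_like([3, 1], [1.0, 0.0], num_docs, nt_two_field_index)
score_single = bm25f_like([3], [1.0], num_docs, nt_field0_only_index)

print(score_two_field, score_single)
```

Even with the second field's weight at zero, the two-field score is lower here because the document frequency, and therefore the IDF, still reflects occurrences in both fields.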

chedi bechikh added a comment -

Thank you, Craig.

I made the same runs with the PL2F model, and the MAP values were:
1) with two fields (unigram field and bigram field, w.0=1 and w.1=1): MAP = 0.1922
2) with w.0=1 and w.1=0: MAP = 0.1904
3) using only unigrams in the query (w.0=1 and w.1=1): MAP = 0.2174
4) the same query with w.0=1 and w.1=0: MAP = 0.2204

I also tested PL2 with only unigrams: MAP = 0.2172

The results are confusing, because I think that with w.1=0 we should retrieve the same result as PL2, and not 0.1904.

Best regards


## People

• Assignee: Craig Macdonald
• Reporter: chedi bechikh
• Watchers: 0

## Dates

• Created:
Updated: