Package org.terrier.utility
Class Distance
java.lang.Object
  org.terrier.utility.Distance
public class Distance extends java.lang.Object

Class containing useful utility methods for counting the number of occurrences of two terms within windows, etc.

- Since: 3.0
- Author: David Hannah and Craig Macdonald
-
-
Constructor Summary
Constructors
- Distance()
-
Method Summary
Modifier and Type, Method, Description
- protected static int countTrue(boolean[] in)
- static int findSmallest(int[] x, int[] y)
  Find smallest difference between two elements of two arrays
- static int noTimes(int[][] blocksForEachTerm, int windowSize, int documentLengthInTokens)
  Counts number of blocks where all given terms occur within a block of windowSize in length, in a document of length documentLengthInTokens where the blocks for the terms are as given
- static int noTimes(int[] blocksOfTerm1, int[] blocksOfTerm2, int windowSize, int documentLengthInTokens)
  Counts number of blocks where two terms occur within a block of windowSize in length, in a document of length documentLengthInTokens where the blocks for the terms are as given
- static int noTimes(int[] blocksOfTerm1, int start1, int end1, int[] blocksOfTerm2, int start2, int end2, int windowSize, int documentLengthInTokens)
  Counts number of blocks where two terms occur within a block of windowSize in length, in a document of length documentLengthInTokens where the blocks for the terms are as given
- static int noTimesNEW(int[] term0Positions, int[] term1Positions, int windowSize, int documentLengthInTokens)
  Returns the number of windows that have the both terms occurring, in the order specified.
- static int noTimesSameOrder(int[][] blocksOfAllTerms1, int documentLengthInTokens)
  Deprecated.
- static int noTimesSameOrder(int[] term0Positions, int[] term1Positions, int windowSize, int documentLengthInTokens)
- static int noTimesSameOrder(int[] posTerm1, int start1, int end1, int[] posTerm2, int start2, int end2, int windowSize, int documentLength)
- static int noTimesSameOrderOLD(int[] blocksOfTerm1, int[] blocksofTerm2, int windowSize, int documentLengthInTokens)
  number of blocks where
- static void windowsForTerms(int[] blocksOfTerm, int windowSize, int numberOfNGrams, int[] windows_for_term)
  Sets the number of occurrences of a term in each window, given the specified window size, the number of n-grams in the document, and the blocks of the term.
- static void windowsForTerms(int[] blocksOfTerm, int start, int end, int windowSize, int numberOfNGrams, int[] windows_for_term)
  Sets the number of occurrences of a term in each window, given the specified window size, the number of n-grams in the document, and the blocks of the term.
-
-
-
Method Detail
-
noTimes
public static final int noTimes(int[] blocksOfTerm1, int start1, int end1, int[] blocksOfTerm2, int start2, int end2, int windowSize, int documentLengthInTokens)

Counts number of blocks where two terms occur within a block of windowSize in length, in a document of length documentLengthInTokens where the blocks for the terms are as given

- Parameters:
  blocksOfTerm1 -
  start1 - The start index for the correct blockIds in blocksOfTerm1
  end1 - The end for the correct blockIds in blocksOfTerm1
  blocksOfTerm2 -
  start2 - The start index for the correct blockIds in blocksOfTerm2
  end2 - The end index for the correct blockIds in blocksOfTerm2
  windowSize -
  documentLengthInTokens -
-
noTimes
public static final int noTimes(int[] blocksOfTerm1, int[] blocksOfTerm2, int windowSize, int documentLengthInTokens)

Counts number of blocks where two terms occur within a block of windowSize in length, in a document of length documentLengthInTokens where the blocks for the terms are as given

- Parameters:
  blocksOfTerm1 -
  blocksOfTerm2 -
  windowSize -
  documentLengthInTokens -
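
The description above leaves the window semantics implicit. The following is a minimal sketch of one plausible reading, not the Terrier implementation: it assumes 0-based token positions, sliding windows where window i covers positions i to i + windowSize - 1, and that a document shorter than the window counts as a single window. The class and method names (WindowCooccurrenceSketch, countWindowsWithBoth, mark) are hypothetical.

```java
// Hypothetical sketch; names and window semantics are assumptions, not Terrier's code.
final class WindowCooccurrenceSketch {

    /** Counts sliding windows that contain at least one position of each term. */
    static int countWindowsWithBoth(int[] pos1, int[] pos2, int windowSize, int docLength) {
        if (pos1 == null || pos2 == null) {
            return 0;
        }
        // Assumption: a document shorter than the window forms exactly one window.
        int numWindows = docLength < windowSize ? 1 : docLength - windowSize + 1;
        boolean[] has1 = mark(pos1, windowSize, numWindows);
        boolean[] has2 = mark(pos2, windowSize, numWindows);
        int count = 0;
        for (int w = 0; w < numWindows; w++) {
            if (has1[w] && has2[w]) {
                count++;
            }
        }
        return count;
    }

    /** Marks every window that covers at least one of the given positions. */
    static boolean[] mark(int[] positions, int windowSize, int numWindows) {
        boolean[] hit = new boolean[numWindows];
        for (int p : positions) {
            // Position p lies in windows (p - windowSize + 1) .. p, clipped to the valid range.
            int first = Math.max(0, p - windowSize + 1);
            int last = Math.min(numWindows - 1, p);
            for (int w = first; w <= last; w++) {
                hit[w] = true;
            }
        }
        return hit;
    }
}
```

Under these assumptions, positions {0, 5} and {1} with a window of 2 tokens in a 6-token document share only the first window, so the sketch returns 1.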
-
noTimes
public static final int noTimes(int[][] blocksForEachTerm, int windowSize, int documentLengthInTokens)

Counts number of blocks where all given terms occur within a block of windowSize in length, in a document of length documentLengthInTokens where the blocks for the terms are as given

- Parameters:
  blocksForEachTerm - array of int[] of blocks for each term
  windowSize -
  documentLengthInTokens -
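
For more than two terms, the same idea can be sketched by counting, per window, how many distinct terms occur in it. This is an illustrative sketch under the same assumptions as before (0-based positions, sliding windows, a short document treated as one window); MultiTermWindowSketch and countWindowsWithAll are hypothetical names and this is not the Terrier code.

```java
// Hypothetical sketch; names and window semantics are assumptions, not Terrier's code.
final class MultiTermWindowSketch {

    /** Counts sliding windows in which every term has at least one position. */
    static int countWindowsWithAll(int[][] positionsPerTerm, int windowSize, int docLength) {
        if (positionsPerTerm == null || positionsPerTerm.length == 0) {
            return 0;
        }
        int numWindows = docLength < windowSize ? 1 : docLength - windowSize + 1;
        int[] termsSeen = new int[numWindows]; // distinct terms found in each window
        for (int[] positions : positionsPerTerm) {
            if (positions == null) {
                return 0; // a term without positions can never co-occur
            }
            boolean[] hit = new boolean[numWindows];
            for (int p : positions) {
                int first = Math.max(0, p - windowSize + 1);
                int last = Math.min(numWindows - 1, p);
                for (int w = first; w <= last; w++) {
                    hit[w] = true;
                }
            }
            // Each term contributes at most once per window.
            for (int w = 0; w < numWindows; w++) {
                if (hit[w]) {
                    termsSeen[w]++;
                }
            }
        }
        int count = 0;
        for (int seen : termsSeen) {
            if (seen == positionsPerTerm.length) {
                count++;
            }
        }
        return count;
    }
}
```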
-
windowsForTerms
public static final void windowsForTerms(int[] blocksOfTerm, int start, int end, int windowSize, int numberOfNGrams, int[] windows_for_term)

Sets the number of occurrences of a term in each window, given the specified window size, the number of n-grams in the document, and the blocks of the term. The start and end parameters control how much of the array is examined.

- Parameters:
  blocksOfTerm - block occurrences for term
  start - start index to consider in blocksOfTerm
  end - end index to consider in blocksOfTerm
  windowSize - size of each window
  numberOfNGrams - number of windows in document
  windows_for_term - array of length numberOfNGrams
-
windowsForTerms
public static final void windowsForTerms(int[] blocksOfTerm, int windowSize, int numberOfNGrams, int[] windows_for_term)

Sets the number of occurrences of a term in each window, given the specified window size, the number of n-grams in the document, and the blocks of the term. To control how much of the array is examined, see windowsForTerms(int[], int, int, int, int, int[]).

- Parameters:
  blocksOfTerm - block occurrences for term
  windowSize - size of each window
  numberOfNGrams - number of windows in document
  windows_for_term - array of length numberOfNGrams
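
To make "the number of occurrences of a term in each window" concrete, here is a minimal sketch of how such an output array could be filled. It assumes 0-based positions and sliding windows where window w covers positions w to w + windowSize - 1, and it overwrites the output array rather than accumulating into it; whether the real method does the same is not stated in this page. WindowsForTermSketch and fillWindowCounts are hypothetical names.

```java
// Hypothetical sketch; names and window semantics are assumptions, not Terrier's code.
final class WindowsForTermSketch {

    /** Writes into counts[w] how many of the given positions fall inside window w. */
    static void fillWindowCounts(int[] positions, int windowSize, int numWindows, int[] counts) {
        // Assumption: the output is reset first, so earlier contents are discarded.
        for (int w = 0; w < numWindows; w++) {
            counts[w] = 0;
        }
        for (int p : positions) {
            // Position p lies in windows (p - windowSize + 1) .. p, clipped to the valid range.
            int first = Math.max(0, p - windowSize + 1);
            int last = Math.min(numWindows - 1, p);
            for (int w = first; w <= last; w++) {
                counts[w]++;
            }
        }
    }
}
```

With positions {1, 2}, a window of 2 tokens and 4 windows, the sketch fills the array with {1, 2, 1, 0}: the window covering positions 1 and 2 contains both occurrences.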
-
noTimesSameOrder
public static final int noTimesSameOrder(int[] term0Positions, int[] term1Positions, int windowSize, int documentLengthInTokens)
-
noTimesSameOrder
public static final int noTimesSameOrder(int[] posTerm1, int start1, int end1, int[] posTerm2, int start2, int end2, int windowSize, int documentLength)
-
noTimesNEW
public static final int noTimesNEW(int[] term0Positions, int[] term1Positions, int windowSize, int documentLengthInTokens)

Returns the number of windows in which both terms occur, in the order specified. New version, implemented 10/6/2010 by craigm.
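
"In the order specified" is open to interpretation. The sketch below takes it to mean that, inside a sliding window, some occurrence of the first term is followed by a strictly later occurrence of the second term. That reading, the 0-based sliding-window layout, and the names OrderedWindowSketch and countOrderedWindows are all assumptions made for illustration; the sketch is not the Terrier implementation.

```java
// Hypothetical sketch; names and ordering semantics are assumptions, not Terrier's code.
final class OrderedWindowSketch {

    /** Counts windows where a term0 position precedes a term1 position within the window. */
    static int countOrderedWindows(int[] term0Pos, int[] term1Pos, int windowSize, int docLength) {
        if (term0Pos == null || term1Pos == null) {
            return 0;
        }
        int numWindows = docLength < windowSize ? 1 : docLength - windowSize + 1;
        int count = 0;
        for (int start = 0; start < numWindows; start++) {
            int end = start + windowSize - 1; // last position inside this window
            // The earliest term0 occurrence in the window leaves the most room for term1.
            int earliest0 = Integer.MAX_VALUE;
            for (int a : term0Pos) {
                if (a >= start && a <= end && a < earliest0) {
                    earliest0 = a;
                }
            }
            if (earliest0 == Integer.MAX_VALUE) {
                continue; // term0 does not occur in this window
            }
            for (int b : term1Pos) {
                if (b > earliest0 && b <= end) {
                    count++;
                    break; // count each window at most once
                }
            }
        }
        return count;
    }
}
```

The nested loops keep the sketch short and do not require sorted input; a production version would more likely advance two indices over sorted position arrays.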
-
countTrue
protected static final int countTrue(boolean[] in)
-
noTimesSameOrder
@Deprecated public static final int noTimesSameOrder(int[][] blocksOfAllTerms1, int documentLengthInTokens)

Deprecated. Number of blocks where terms occur in an adjacent manner. Do not use this method; it has no concept of windows.
-
noTimesSameOrderOLD
public static final int noTimesSameOrderOLD(int[] blocksOfTerm1, int[] blocksofTerm2, int windowSize, int documentLengthInTokens)

number of blocks where
-
findSmallest
public static final int findSmallest(int[] x, int[] y)

Find smallest difference between two elements of two arrays
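
One common way to find the smallest difference between elements of two arrays is a two-pointer scan. The sketch below assumes both arrays are sorted in ascending order and that "difference" means absolute difference; neither is confirmed by this page, and SmallestDifferenceSketch and smallestDifference are hypothetical names rather than the Terrier code.

```java
// Hypothetical sketch; assumes sorted input and absolute difference, not Terrier's code.
final class SmallestDifferenceSketch {

    /** Returns the smallest |x[i] - y[j]|, or Integer.MAX_VALUE if either array is empty. */
    static int smallestDifference(int[] x, int[] y) {
        int best = Integer.MAX_VALUE;
        int i = 0;
        int j = 0;
        while (i < x.length && j < y.length) {
            int diff = Math.abs(x[i] - y[j]);
            if (diff < best) {
                best = diff;
            }
            // Advance the pointer at the smaller value; moving the other could only widen the gap.
            if (x[i] < y[j]) {
                i++;
            } else {
                j++;
            }
        }
        return best;
    }
}
```

For x = {1, 10, 20} and y = {7, 14} the sketch returns 3, from the pair (10, 7). The scan takes time linear in the combined length of the arrays.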
-
-