[TR-364] TwitterJSONCollection incorrectly assumes all files are gzipped
Created: 15/Sep/14  Updated: 01/Dec/15  Resolved: 04/Nov/15

Status: Resolved
Project: Terrier Core
Component/s: None
Affects Version/s: 4.0
Fix Version/s: 4.1

Type: Bug
Priority: Trivial
Reporter: Craig Macdonald
Assignee: Craig Macdonald
Resolution: Fixed
Labels: None

The documentation for the TwitterJSONCollection class says: "Each file is assumed to be in gzip format, with one tweet per line."

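Line 109 wraps every input file in a GZIPInputStream unconditionally, so files that are not gzipped cannot be read: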
 currentTweetStream = new BufferedReader(new InputStreamReader(new GZIPInputStream(new FileInputStream(file)),"UTF-8"));

This can simply be replaced with
currentTweetStream = Files.openFileReader( file, "UTF-8" );
and the documentation updated.
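
For illustration only, here is a minimal standalone sketch of opening a tweet file that may or may not be gzipped. It is not Terrier's implementation: how Files.openFileReader decides whether to decompress is not shown in this issue. The class name TweetReaderSketch and the method openMaybeGzipped are hypothetical, and the approach (peeking at the two gzip magic bytes, 0x1f 0x8b) is an assumption chosen to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.util.zip.GZIPInputStream;

public class TweetReaderSketch {

    /**
     * Hypothetical helper (not part of Terrier): opens a file as a reader,
     * decompressing only if the content starts with the gzip magic bytes.
     */
    public static BufferedReader openMaybeGzipped(String file, String charset) throws IOException {
        // A pushback buffer of 2 bytes lets us peek at the header and put it back.
        PushbackInputStream in = new PushbackInputStream(new FileInputStream(file), 2);
        try {
            byte[] magic = new byte[2];
            int n = 0;
            while (n < 2) {
                int r = in.read(magic, n, 2 - n);
                if (r < 0) {
                    break; // file shorter than two bytes
                }
                n += r;
            }
            if (n > 0) {
                in.unread(magic, 0, n); // restore the bytes we peeked at
            }
            boolean gzipped = n == 2
                    && (magic[0] & 0xff) == 0x1f
                    && (magic[1] & 0xff) == 0x8b;
            InputStream body = gzipped ? new GZIPInputStream(in) : in;
            return new BufferedReader(new InputStreamReader(body, charset));
        } catch (IOException e) {
            in.close(); // do not leak the file handle if opening fails
            throw e;
        }
    }
}
```

With a helper of this kind, a call such as openMaybeGzipped(file, "UTF-8") returns one tweet per readLine() for both compressed and uncompressed input.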

Comment by Craig Macdonald [ 04/Nov/15 ]

Committed r4015

Generated at Tue Feb 25 13:01:32 GMT 2020 using JIRA 7.1.1#71004-sha1:d6b2c0d9b7051e9fb5e4eb8ce177ca56d91d7bd8.