Apache Lucene Java coursework
$30-80 USD
Paid on delivery
I want you to use the Lucene library to implement a Java application
for information retrieval. The user will type a query, e.g.
+George +Rice -eat -pudding
Apple "pie" +Tiger
animal:monkey AND food:banana
The application will then display the retrieved texts ranked by relevance (most relevant first).
To test the application, you should use the CACM archive.
The above archive contains 3,200 articles from a magazine (Communications
of the ACM). The most important fields are the article title, the article
authors and the article summary.
The same archive contains 64 queries, in the file query.text.
A file containing the answers to the queries is also included, to help
you become familiar with the technology.
I want you to use only the article title, article authors and article
summary, ignore common words (the words stored in the common-words file),
use the Porter stemming algorithm, evaluate the efficiency and
produce an accuracy diagram for each query. Then I want the average
efficiency across all 64 queries. Finally, for each query the "Average
Precision at Seen Relevant Documents" and the "R-precision" must be
calculated. Excel can be used to produce the graphs.
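To make the two per-query measures concrete, here is a minimal plain-Java sketch, independent of Lucene. It assumes the usual textbook definitions: "Average Precision at Seen Relevant Documents" averages the precision observed at each rank where a relevant document appears in the result list (dividing by the number of relevant documents actually seen), and "R-precision" is the precision in the top R results, where R is the number of documents judged relevant for the query. The class name `EvalMetrics`, the method names and the `d1`, `d2`, ... document IDs are hypothetical and only for illustration; the ranking is assumed to contain no duplicate IDs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative evaluation helpers; names are hypothetical, not part of Lucene. */
final class EvalMetrics {
    private EvalMetrics() {}

    /**
     * Averages the precision measured at every rank that holds a relevant
     * document. Divides by the number of relevant documents seen in the
     * ranking (assumed definition), not by the total number of relevant ones.
     */
    static double averagePrecisionAtSeenRelevant(List<String> ranking, Set<String> relevant) {
        int seen = 0;
        double sum = 0.0;
        for (int i = 0; i < ranking.size(); i++) {
            if (relevant.contains(ranking.get(i))) {
                seen++;
                sum += (double) seen / (i + 1); // precision at rank i + 1
            }
        }
        return seen == 0 ? 0.0 : sum / seen;
    }

    /** Precision within the top R results, where R = number of relevant documents. */
    static double rPrecision(List<String> ranking, Set<String> relevant) {
        int r = relevant.size();
        if (r == 0) {
            return 0.0;
        }
        int limit = Math.min(r, ranking.size());
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranking.get(i))) {
                hits++;
            }
        }
        return (double) hits / r;
    }

    /** Arithmetic mean, e.g. of one measure over all queries. */
    static double mean(List<Double> values) {
        if (values.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    public static void main(String[] args) {
        // Made-up ranking and relevance judgments for a single query.
        List<String> ranking = Arrays.asList("d1", "d2", "d3", "d4", "d5");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3", "d6"));
        System.out.printf(Locale.ROOT, "AP at seen relevant: %.4f%n",
                averagePrecisionAtSeenRelevant(ranking, relevant));
        System.out.printf(Locale.ROOT, "R-precision: %.4f%n",
                rPrecision(ranking, relevant));
    }
}
```

In this made-up example the relevant documents appear at ranks 1 and 3, so the first measure is (1/1 + 2/3) / 2 = 5/6, and with R = 3 the R-precision is 2/3. The per-query values can be passed through `mean` to get an average over all 64 queries and then exported to Excel for the graphs.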
**The code of this project is 99% done.** I need you to improve it a bit and redesign the user interface from scratch. Then provide a Word report with a short explanation and the graphs mentioned in the paragraph above, and implement a small algorithm that will improve the results (I will provide more information on that algorithm in due course).
Project ID: #3882757