Text Auto Summarisation Tool - giriv - 10-04-2017

Text Auto Summarisation Tool

Motivation

Ideas from NLP and AI motivated our team to choose this topic. Though we haven't been able to implement NLP techniques directly, we still hold on to the goal of automatic summarisation.

Related Work

Several institutes across the world have worked on related projects, such as:
1. Newspaper Headline generators.
2. Search Engines
3. Making Formal Reports and Presentations
4. Notes Generation (by Professors, Speakers etc.)
And many others.

Deliverables
a) Minimum guaranteed:
Summaries will be generated for texts that are not already too concise. The summary will consist of sentences taken directly from the main text, and its length will be a user-chosen fraction of the original text. There will also be an option to generate summaries based on keywords supplied by the user.
b) Safe fall back:
The length of the summary may deviate from the requested fraction by up to 10% of the original text, and this error is bound to increase if the document contains fewer than about fifty sentences. Only text will be output; all tabular data and similar details will be dropped.
c) Optimistic Wish List:
We would have loved to extract keywords from sentences and frame entirely new sentences in the summary, for better effectiveness. We will still try to divide the text into smaller units using punctuation marks other than the full stop (comma, colon, semicolon etc.), for better efficiency. Summaries can be generated better if the program knows what kind of document it is summarising, such as a newspaper article or a report; we will try to add this feature. We would also like a document database search engine that looks for keywords throughout a directory and outputs the results in order of relevance. Hashing has been widely used in ranking; however, the speed of summary generation is not expected to be impressive, and we intend to look into this matter seriously.
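
The minimum deliverable (a summary built from original sentences, sized as a fraction of the text, with optional user keywords) can be sketched roughly as below. This is only an illustration under assumptions that the proposal does not state: the fraction is measured in sentences, sentences are split naively on '.', '!' and '?', and sentences are ranked by keyword hits alone. The names split_sentences and summarise are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def summarise(text, fraction, keywords=()):
    """Return (summary, achieved_fraction) built from whole sentences.

    Assumption: 'fraction' counts sentences, not words or characters.
    """
    sentences = split_sentences(text)
    if not sentences:
        return "", 0.0
    wanted = {k.lower() for k in keywords}

    def keyword_hits(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(1 for w in words if w in wanted)

    # Keep at least one sentence and never more than the document has.
    count = max(1, min(len(sentences), round(fraction * len(sentences))))
    # Stable sort: sentences with equal hits keep their document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -keyword_hits(sentences[i]))
    chosen = sorted(ranked[:count])  # restore original order for output
    summary = " ".join(sentences[i] for i in chosen)
    return summary, count / len(sentences)
```

Because only whole sentences can be picked, the achieved fraction moves in steps of 1/n for a document of n sentences. That may be one reason the error grows for short texts, although the proposal itself does not give the cause.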

Work to be Done
1. Analyse texts and create summaries manually, and try to build the better tactics into the program. This demands a lot of time and research; the project depends mostly on this task.
2. Create a dictionary database and an appropriate list of stop words.
3. Decide how much weightage to give to each kind of word, phrase, sentence, paragraph etc. This is highly crucial, as it determines the sentence scores.
4. Design a good hashing function, or otherwise create an efficient way to rank the sentences by score and print them accordingly.
5. Work on the interface.
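
Tasks 2 to 4 (stop words, scoring, ranking) could look roughly like the sketch below. It is not the team's method: it swaps in plain word-frequency scoring for the weightage scheme, which the proposal leaves undecided, and it ranks with a heap (heapq.nlargest) rather than a hashing function. STOP_WORDS is a tiny hypothetical list, and tokenize, score_sentences and top_sentences are made-up names.

```python
import heapq
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one is far longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "is", "it", "at"}


def tokenize(text):
    """Lower-case the text and return its alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def score_sentences(sentences):
    """Score = mean document-wide frequency of a sentence's non-stop words."""
    content = [[w for w in tokenize(s) if w not in STOP_WORDS]
               for s in sentences]
    freq = Counter(w for words in content for w in words)
    return [sum(freq[w] for w in words) / len(words) if words else 0.0
            for words in content]


def top_sentences(sentences, k):
    """Pick the k best-scoring sentences and return them in document order."""
    scores = score_sentences(sentences)
    best = heapq.nlargest(k, range(len(sentences)), key=lambda i: scores[i])
    return [sentences[i] for i in sorted(best)]
```

Dividing by the number of content words stops long sentences from winning on length alone. Any weights for phrases, sentence position or paragraphs would multiply into the score at the same place.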

ii) Time Estimates
1. Logic and Algorithm: 20 hours
2. Manual practice: 10 hours
3. Project Report: 2 hours
4. Databases and storing scores for dictionary words: 5 hours
5. Scoring of the sentences: 15 hours
6. Ranking and hashing of the sentences for summary generation: 20 hours
7. Work on interface: 10 hours
8. Testing in order to decide on output percentage and validity: 10 hours

iii) Load Sharing
1. Rajdeep Mandal
Algorithm planning and scoring implementation with respect to the database.
2. Aditya Gandhi
Creation and maintenance of databases and dictionary
3. Zibran Shaikh
Hashing program to rank sentences and output handling
The load-sharing plan is not yet final.