Overview

TextWonder is an online application that I developed on my own to provide a quick overview of any piece of writing. It is published at www.textwonder.com.

As a second-year university student, I felt that I was falling behind on my reading, whether it was the news or a large chunk of text. While studying graph algorithms in the Algorithm and Data Structure course taught in second year, I realised that a text, once broken down into sentences, can be represented as a weighted undirected graph. In a nutshell, a graph is just a collection of nodes connected by edges. By a weighted undirected graph, I mean that the edges have no direction and that each edge has a real number associated with it.

In the case of a text, if one breaks it down into its sentences and lets those sentences be the nodes of the graph, then the whole text can be represented as a weighted undirected graph, where the weight of each edge measures the similarity between the two sentences it connects.
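The representation above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not TextWonder's actual code: the regex sentence splitter, the helper names (`split_sentences`, `tokenize`, `build_sentence_graph`) and the choice of Jaccard word overlap as the similarity measure are all hypothetical stand-ins.

```python
import re
from itertools import combinations


def split_sentences(text):
    # Assumption: a naive splitter that breaks after '.', '!' or '?'.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Lowercased set of word tokens; punctuation is dropped.
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a, b):
    # Assumed similarity measure: shared words divided by all words.
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_sentence_graph(text):
    # Nodes are sentence indices; each undirected edge (i, j) with i < j
    # carries the similarity of sentences i and j as its weight.
    sentences = split_sentences(text)
    words = [tokenize(s) for s in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        weight = jaccard(words[i], words[j])
        if weight > 0:
            edges[(i, j)] = weight
    return sentences, edges


if __name__ == "__main__":
    sentences, edges = build_sentence_graph(
        "Cats chase mice. Mice fear cats. Dogs bark."
    )
    for (i, j), weight in edges.items():
        print(sentences[i], "<->", sentences[j], round(weight, 2))
```

Storing each edge once under the key (i, j) with i < j is what makes the graph undirected: the similarity of sentence i to sentence j is the same as that of j to i.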

With this connection between texts and graphs in mind, it was clear to me that a text overview is just a subgraph of the whole graph, one whose nodes are highly similar to each other. In my definition, a text overview is a collection of very similar sentences from a piece of writing.
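One simple way to pick such a subgraph is to rank each sentence by the total weight of its edges and keep the top few. The sketch below assumes exactly that heuristic (weighted degree); it is not necessarily how TextWonder selects sentences, and the function name `select_overview` and the sample edge weights are made up for illustration.

```python
from collections import defaultdict


def select_overview(sentences, edges, k=2):
    # edges maps an undirected pair (i, j) to a similarity weight.
    # A sentence's strength is the sum of the weights of its edges.
    strength = defaultdict(float)
    for (i, j), weight in edges.items():
        strength[i] += weight
        strength[j] += weight
    # Rank by strength (highest first), breaking ties by position.
    ranked = sorted(range(len(sentences)), key=lambda i: (-strength[i], i))
    # Return the chosen sentences in their original order.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sentences = ["A", "B", "C", "D"]
    # Hypothetical similarity weights between the four sentences.
    edges = {(0, 1): 0.5, (1, 2): 0.4, (0, 2): 0.1, (2, 3): 0.05}
    print(select_overview(sentences, edges, k=2))
```

Returning the chosen sentences in their original order keeps the overview readable as a short version of the text.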

Service and Implementation

TextWonder provides the following services:

  • Find KeyWords: parses a file and extracts its key words.
    • I have sometimes used this to find the vocabulary mentioned in a text so that I can look it up before reading.
  • Text Overview: the purpose TextWonder was created for. Given a piece of writing, it returns a collection of sentences that are highly similar to each other and that best describe the whole text.
  • Text Comparison: given several texts, compares them against each other.
  • Question Generator: generates questions based on the text. The current implementation is naive, and I will definitely improve it.
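To give a feel for the keyword and comparison services, here is a rough sketch using only the standard library. It assumes keywords are simply the most frequent non-stopwords and that texts are compared by cosine similarity of their word counts; the tiny stopword list and all function names are hypothetical, and the real TextWonder algorithms may differ.

```python
import math
import re
from collections import Counter

# Assumption: a deliberately small stopword list for illustration.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def word_counts(text):
    # Count lowercased words, ignoring stopwords.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def find_keywords(text, n=5):
    # Most frequent words first; ties are broken alphabetically.
    counts = word_counts(text)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


def cosine_similarity(text_a, text_b):
    # Cosine of the angle between the two word-count vectors.
    a, b = word_counts(text_a), word_counts(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_texts(texts):
    # Similarity score for every unordered pair of texts.
    return {
        (i, j): cosine_similarity(texts[i], texts[j])
        for i in range(len(texts))
        for j in range(i + 1, len(texts))
    }


if __name__ == "__main__":
    sample = "The graph is a graph of nodes and edges. Nodes connect."
    print(find_keywords(sample, n=2))
    print(compare_texts(["graph nodes", "graph edges", "cats dogs"]))
```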

 

I chose to implement TextWonder in Python because Python has great libraries for working with text. The one I used in particular is the Natural Language Toolkit (NLTK, http://www.nltk.org/). I came across it during the Natural Language Processing course at the University of Edinburgh.

The implementation involved coding the algorithms behind each of the services described above, along with a customized, fairly lightweight web server that I wrote myself.

But as I have gained more knowledge in Computer Science, in particular Machine Learning, I am going to reimplement the majority of those algorithms to suit the vision I have for TextWonder.

Vision

My vision for TextWonder is for it to be an evaluator and an assistant for students or anyone who reads a lot of text. As an evaluator, I want TextWonder to be able to hold a dialogue with its users about a particular text or about a record of their past readings. As an assistant, I want it to remind its users about their courses and events. In a nutshell, the end goal is for TextWonder to be an A.I.

My first step towards this goal is restructuring the backend of TextWonder, which I am currently doing, because I want it to be accessible from many platforms, from desktops to mobile phones.

I call the project TextWonder Reboot and it is structured like this:

textwonder_reboot
