Computer Science

ESSE: A New Interface for Searching the Blogosphere

James Buerger ('10); Lucy Vasserman ('10); Sara Sood

Unlike its more widely used counterpart, web search, blog search has struggled to gain widespread use. Although similar to web search, blog search must do more: a successful blog search engine finds a relevant, timely, and diverse set of opinions that fit the user’s tastes. To find such opinions, this system classifies a set of between fifty and two hundred blog posts by three factors: date, emotion, and political affiliation. After classification, the blog posts are ordered along each dimension: from newest to oldest, happiest to least happy, and most conservative to most liberal. The user can then traverse search results along these dimensions to find a recent blog post written from a specific emotional or political stance. The interface will be extended to classify blog results by other factors, including reading level and popularity.
Funding provided by: The Norris Foundation
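
The ordering step described in the ESSE abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BlogPost` fields, the numeric `happiness` and `conservatism` scores, and the `order_along` helper are hypothetical names, and the classifiers that would produce the scores are not shown.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BlogPost:
    title: str
    posted: date
    happiness: float     # hypothetical classifier score, higher = happier
    conservatism: float  # hypothetical classifier score, higher = more conservative


# One sort key per dimension named in the abstract.
DIMENSIONS = {
    "date": lambda post: post.posted,
    "emotion": lambda post: post.happiness,
    "politics": lambda post: post.conservatism,
}


def order_along(posts, dimension):
    """Return posts sorted from newest, happiest, or most conservative downward."""
    return sorted(posts, key=DIMENSIONS[dimension], reverse=True)


# Made-up example data, not results from the project.
posts = [
    BlogPost("Budget woes", date(2009, 6, 1), happiness=0.2, conservatism=0.8),
    BlogPost("Summer festival", date(2009, 7, 15), happiness=0.9, conservatism=0.4),
    BlogPost("Election recap", date(2009, 5, 20), happiness=0.5, conservatism=0.1),
]

by_date = [p.title for p in order_along(posts, "date")]
by_emotion = [p.title for p in order_along(posts, "emotion")]
by_politics = [p.title for p in order_along(posts, "politics")]
```

Keeping one precomputed ordering per dimension would let an interface move to a post's neighbor along any axis without reclassifying anything.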

Using Robots to Introduce High School Students to Computing Concepts

Maribel Gonzalez ('10); Lucy Vasserman ('10); Sara Owsley Sood; Tzu-Yi Chen

To expose local high school students to the field of Computer Science, we developed two programs to serve the participants of the Pomona Academy for Youth Success. The first was an afternoon activity in which students learned to write Python programs to control a robot called the “Scribbler.” The second was a research opportunity in which rising seniors learned Python in greater depth and used this knowledge to complete projects for the Scribbler robot. We found that, given a limited amount of time, it was very difficult for students to understand the concepts behind their code. As a result, the participants of the research group came away with a greater understanding of programming concepts than the activity group. Both programs succeeded in introducing students to the field of Computer Science, making them more likely to consider it as a viable career option.
Funding provided by: The Norris Foundation

Exploring Document Structure to Improve Emotion Classification

Lucy Vasserman ('10); Sara Sood

Affective computing, the building of machines with emotional intelligence, is an important field of Artificial Intelligence research because future machines will need to connect with users on an emotional level in addition to performing complex computations. An emotionally intelligent computer must be able both to identify emotions in its user and to express emotions itself. My previous work was a system that recognizes emotion in text, identifying the mood of a document using a traditional “bag of words” representation that ignores word order. As an extension, this project examines the emotional structure of documents to identify trends that may be used to improve overall mood recognition. It also investigates ways to use document structure to pinpoint sarcasm and other obstacles to text classification. In addition, by adding an ‘intensity’ dimension, I have adapted my previous work to identify neutral text as well as the original happy, sad, and angry moods.
Funding provided by: The Norris Foundation
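
As a rough illustration of the “bag of words” representation mentioned in the abstract above, the sketch below counts words without regard to order and feeds the counts to a small naive Bayes classifier. The choice of naive Bayes, the `NaiveBayesMood` class, and the four toy training sentences are assumptions made for this example; the abstract does not say which classifier or data the project used.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """Count lowercase word occurrences; word order is discarded."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


class NaiveBayesMood:
    """Multinomial naive Bayes with add-one smoothing (an assumed classifier)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # mood -> word -> count
        self.doc_counts = Counter()              # mood -> number of documents
        self.vocab = set()

    def train(self, text, mood):
        bag = bag_of_words(text)
        self.word_counts[mood].update(bag)
        self.doc_counts[mood] += 1
        self.vocab.update(bag)

    def classify(self, text):
        bag = bag_of_words(text)
        total_docs = sum(self.doc_counts.values())
        best_mood, best_score = None, -math.inf
        for mood in self.doc_counts:
            total_words = sum(self.word_counts[mood].values())
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[mood] / total_docs)
            for word, n in bag.items():
                p = (self.word_counts[mood][word] + 1) / (total_words + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best_mood, best_score = mood, score
        return best_mood


# Toy training data, invented for this sketch.
model = NaiveBayesMood()
model.train("what a wonderful sunny day i love it", "happy")
model.train("i miss her and i feel so alone", "sad")
model.train("i hate this awful traffic", "angry")
model.train("the meeting starts at noon in room four", "neutral")

mood_a = model.classify("love sunny day")
mood_b = model.classify("i hate traffic")
```

Because only counts are kept, “love sunny day” and “day sunny love” produce the same bag, which is the kind of limitation that looking at document structure is meant to address.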

Topic Segmentation With a Self-Updating Hidden Markov Model

Christopher Wienberg ('10); Sara Sood

In the age of search utilities such as Yahoo! and Google, searching by topic has become a routine task. Though excellent resources for finding topical information, these tools do not isolate the relevant part of a document: one must wade through each relevant result to find the section applicable to the initial search. We propose a system that segments text (a blog post, for example) by topic, so that topical segments can be identified. Our system is trained on news stories and updated with new data each day; this keeps the system up to date on new and emerging topics. The training data is compared and grouped based on similarity of word usage; as new documents are added, they join the cluster containing the documents that are most similar. Our solution performs very well when given data similar to the training data, and admirably when no guarantees can be made about data similarity.
Funding provided by: The Norris Foundation
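
The grouping step described in the abstract above, in which each new document joins the most similar cluster, might look something like the following sketch. It covers only the clustering, not the hidden Markov model itself. Cosine similarity over word counts, the small `STOPWORDS` set, the `threshold` for opening a new cluster, and the `TopicClusters` class are all assumptions for illustration, not details given in the abstract.

```python
import math
import re
from collections import Counter

# Assumed stopword list; very common words would otherwise dominate similarity.
STOPWORDS = {"the", "a", "of", "in", "to", "and", "on"}


def word_counts(text):
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TopicClusters:
    def __init__(self, threshold=0.2):
        self.threshold = threshold  # assumed cutoff for starting a new cluster
        self.centroids = []         # summed word counts, one Counter per cluster
        self.members = []           # documents assigned to each cluster

    def add(self, text):
        """Assign text to the most similar cluster and return its index."""
        counts = word_counts(text)
        scores = [cosine(counts, c) for c in self.centroids]
        if scores and max(scores) >= self.threshold:
            idx = scores.index(max(scores))
        else:
            self.centroids.append(Counter())
            self.members.append([])
            idx = len(self.centroids) - 1
        self.centroids[idx].update(counts)
        self.members[idx].append(text)
        return idx


# Invented headlines standing in for the daily news stories.
clusters = TopicClusters()
labels = [
    clusters.add("senate passed the budget bill"),
    clusters.add("team won the championship game"),
    clusters.add("senate debates new budget vote"),
    clusters.add("championship game goes to overtime"),
]
```

Calling `add` on each day's stories updates the cluster word counts in place, which is one simple way to keep the groups current as new topics emerge.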
