Researchers Seek Improved Sequence Searches in Large Databases

August 28, 2008

A cell phone company touts the ability of its product to identify the name of a song simply by capturing a brief section of the melody. An American Sign Language user searches video databases for occurrences of specific signs. These are just two examples of a process called subsequence matching.

Now researchers at The University of Texas at Arlington are working to improve and expand subsequence matching capabilities for wider and scientifically productive uses in areas such as stock market modeling, seismic activity analysis and sensor-based health monitoring.

Computer Science & Engineering Professors Vassilis Athitsos and Gautam Das, along with collaborator Professor George Kollios at Boston University, have secured a three-year, $450,000 grant from the National Science Foundation to support their project titled “Time Series Subsequence Matching for Content-based Access in Very Large Multimedia Databases.” In it, they will develop methods for efficient subsequence matching in large time-series databases.

Dr. Athitsos is also collaborating with Professors Stan Sclaroff and Carol Neidle from Boston University on another large database project: a sign-spotting system for looking up complex American Sign Language gestures. Work on this project provided the motivation to consider other uses for retrieving the “best matching sequences” in a time series for a given query sequence.

Mathematical techniques (embeddings) will be designed that partially convert the subsequence matching problem into the much more manageable problem of similarity search in a vector space. This conversion will allow leveraging of the full battery of vector and metric indexing methods to speed up subsequence matching.
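
The general idea of converting sequence search into vector search can be illustrated with a small sketch. This is not the team's method, which the article does not describe in detail. It assumes a simple reference-based embedding: each window of the time series is mapped to a vector of its distances to a few hand-picked reference sequences, candidate windows are ranked cheaply in that vector space, and only the top few are checked with the exact distance. The function names, the reference sequences and the use of Euclidean distance are all illustrative assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def embed(seq, references):
    """Map a sequence to the vector of its distances to each reference."""
    return [euclidean(seq, r) for r in references]

def sliding_windows(series, length):
    """Yield (start_index, window) for every contiguous window."""
    for start in range(len(series) - length + 1):
        yield start, series[start:start + length]

def best_match(series, query, references, keep=3):
    """Filter-and-refine search for the window closest to the query."""
    q_vec = embed(query, references)
    # Filter: rank windows by distance in the embedded vector space.
    candidates = sorted(
        sliding_windows(series, len(query)),
        key=lambda item: euclidean(embed(item[1], references), q_vec),
    )[:keep]
    # Refine: compare only the surviving candidates with the exact distance.
    return min(candidates, key=lambda item: euclidean(item[1], query))[0]

# Hypothetical toy data: the query occurs in the series at index 2.
series = [0, 0, 1, 3, 5, 3, 1, 0, 0, 2, 2, 0]
query = [1, 3, 5, 3, 1]
references = [[0, 0, 0, 0, 0], [5, 4, 3, 2, 1], [1, 2, 3, 4, 5]]
start = best_match(series, query, references)
print(start)
```

In this toy version the window embeddings are recomputed for every query. A practical system would presumably compute them once and store them in a vector index, which is where the speedup mentioned above would come from.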

To showcase the commercial, social and educational impact of this research, the team will produce three demonstration systems: a query-by-humming system, a handwritten document search-by-keyword system and a sign-spotting system. All three are examples of efficient subsequence matching over large amounts of data.

“We are at a stage, technologically, where we can create very large music/video/multimedia databases, but it can be really hard for users to find what they are looking for,” said Dr. Athitsos. “There is a clear need for better search methods for multimedia databases. Our goal is to design such methods and to integrate these methods into real-world, demonstrable systems.”
