Project: Web Based Prosodic Labeling Chain
An overview
Most people probably don't think about it, but when you are speaking with someone and your tone doesn't match theirs, they will start to pick up on it, and it can cause awkwardness or distance between the two of you. This is especially true for non-native English speakers, who may have a hard time not only speaking the language but also picking up on aspects of it such as sarcasm, sincerity, compassion, or emphasis. That's where our project comes into play: the goal is to analyze prosody, which is the pattern of stress and intonation in speech, so that speakers and researchers can better understand these parts of speech and learn from the results. Dr. Okim, our client and sponsor, is interested in comparing two frameworks currently used to analyze prosody. One is David Brazil's model, which she has been studying and considers the most accurate for conversational speech. The other is ToBI (Tones and Break Indices), a model for analyzing prosody that was developed by researchers at MIT and is more recognized and established than David Brazil's model.
An analysis of two separate speakers saying the same thing.
Dr. Okim believes the framework she studies, Brazil's model, is the more accurate of the two despite ToBI's wider recognition, and she wants to prove or disprove that. Currently, using both frameworks and comparing them is tedious and complex. For example, the tool chain is spread across several independent software products running on different platforms, which is inconvenient and time consuming. Our project simplifies this process by creating a web-based prosodic labeling toolchain that lets researchers do this work in one place, so that each framework can be tested and analyzed much more conveniently. This will save our sponsor time and effort: she will have a simple way to run these tests herself, and other people will have a way to test and compare results, which supports her effort to evaluate her claim.
Technologies
The project has two main parts: the Prosodic Labeling Chain and the web portal.
Prosodic Labeling Chain:
- The Weka Java machine learning API will be used for this part of the application, mainly because of the group's programming experience with Java. Java's many libraries and APIs should streamline translating the AuToBI output into prosodic measures and calculating each audio sample's proficiency using a neural network.
Web portal:
- Hosting will be done on DigitalOcean. The load on the server will be very small, since only Dr. Okim and a few of her colleagues will be using the application, at least until further testing and publicity.
- Web languages such as HTML5, CSS3, and JavaScript will be used to write our front end, as they provide the structure, styling, and behavior of most web applications today. We have plenty of experience with these languages, as the Team Inventory shows, so this was a natural choice.
- Node.js, a runtime environment for server-side JavaScript. This choice keeps the front end and back end of our web application in one language, which helps consistency and code maintainability, since no change of syntax is needed when moving between client and server code. With server-side JavaScript, as opposed to the traditional approach of PHP on the server reached from the client via AJAX, we gain access to the roughly 475k modules available through NPM, the Node Package Manager. Nearly any simple function we need can be imported from a widely used, tested package, which lets us build this web application quickly.
- MongoDB for our database. A document database for user accounts and stored sound files, with JSON-like documents and queries, again helps us stay consistent in language choice for our web application. Instead of having to switch between the syntax of JavaScript and SQL (for example), we can use JavaScript throughout the web application. While a SQL database and MongoDB could likely do a similar job of storing and retrieving client data, working with MongoDB can be aided by JavaScript's built-in APIs and by some of the NPM packages, making our coding even more streamlined. With this group's heavy experience with JavaScript, making a full-stack Node.js application is the best way to complete the project to LingoPros' high quality standards.
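As a rough illustration of the point about avoiding a switch between JavaScript and SQL, the sketch below shows how a MongoDB-style query filter can be an ordinary object built in the same language as the rest of the application. It is written in TypeScript and does not connect to a database. The `Recording` shape, its field names, the `recordingsFilter` helper, and the owner name are all hypothetical and are not taken from the project's actual schema.

```typescript
// Hypothetical shape of a stored recording; the field names are assumptions
// made for illustration, not the project's real schema.
interface Recording {
  owner: string;
  fileName: string;
  framework: "ToBI" | "Brazil";
}

// Build a MongoDB-style equality filter as a plain object.
// Nothing here talks to a database; it only shows that the query itself
// is ordinary JavaScript/TypeScript data rather than a SQL string.
function recordingsFilter(
  owner: string,
  framework?: Recording["framework"]
): Record<string, unknown> {
  const filter: Record<string, unknown> = { owner };
  if (framework !== undefined) {
    filter["framework"] = framework;
  }
  return filter;
}

// Example: all ToBI-labeled recordings for one (hypothetical) user.
const exampleFilter = recordingsFilter("researcher1", "ToBI");
console.log(JSON.stringify(exampleFilter));
```

An object like this is the kind of value a MongoDB driver's `find` call accepts as its filter, so the same object syntax is used in the browser, on the server, and in the query.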
Schedule
This is our schedule for the Spring semester, in which we created a functional prototype of the software for our client, then tested and updated it so that it works as intended and satisfies the client's needs.