CPR Voice Assistant

Category : Virtual Agents


Sudden cardiac arrest is a leading cause of death in the United States [6]. The majority of cardiac arrests occur outside the hospital (out-of-hospital cardiac arrest, OHCA), where the survival rate is less than 8% [1]. Even so, 70% of Americans feel helpless in an emergency because they are unsure how to perform CPR [2], and few tools exist that give instructions on how to perform CPR during an emergency. A recent study showed that half of the nation’s emergency dispatchers do not have public safety answering points for performing CPR, and very few provide Hands-Only instructions [7]. Also, when asked for the steps to perform CPR, Siri searches the Web, which may cause delays in an emergency.

Therefore, we built a Hands-Only Cardiopulmonary Resuscitation (CPR) Assistant, a system designed to guide users through the Hands-Only CPR process while they perform the procedure. The American Heart Association promotes Hands-Only CPR in an emergency OHCA situation [2], since it has been shown to be as effective as traditional CPR with rescue breaths, and bystanders are more likely to step in when they do not have to perform mouth-to-mouth. Our system provides clear spoken instructions for Hands-Only CPR in an emergency OHCA situation through a dialogue-based system. The CPR Assistant is a .NET Web-based application written in C#, JavaScript, and HTML. The system takes speech from the user as input and presents information using synthesized speech, along with audio and visuals that aid in performing CPR. We refined the CPR Assistant through iterative user testing and evaluated the final system with a total of 23 participants (M = 24.4, SD = 3.9, 4 females). Overall, the participants were satisfied with our system (M = 3.7, SD = 0.55) on a five-point Likert scale.

Hands-Only CPR Assistant

The Hands-Only Cardiopulmonary Resuscitation (CPR) Assistant is a system designed to guide users through the Hands-Only CPR process while they perform the procedure. It is a .NET Web-based application written in C#, JavaScript, and HTML. The CPR Assistant assesses the situation and provides instructions through dialogue with the user. When opened, the system immediately addresses the user by stating “Hello I am Mike. I can help you perform CPR. Is this an emergency?”. Based on the user’s response, the CPR Assistant either provides simple steps on how to perform CPR (if it is not an emergency) or assesses the situation and walks the user through performing CPR (if it is an emergency). The instructions are based on the steps published by the American Red Cross and the American Heart Association [2,4]. The system walks the user through checking whether they are in a safe location, checking whether the person is breathing, determining whether CPR should be performed, calling 911, and providing compressions.
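The branching described above can be sketched in a few lines. This is only an illustration in Python, not the C#/JavaScript code of the actual application: the greeting and the step order come from the text, while the step identifiers, the keyword-based yes/no check, and the function names are assumptions made for the example.

```python
GREETING = "Hello I am Mike. I can help you perform CPR. Is this an emergency?"

# Step order taken from the text; the identifiers themselves are hypothetical.
GUIDED_STEPS = [
    "check_safe_location",
    "check_breathing",
    "decide_if_cpr_needed",
    "call_911",
    "give_compressions",
]

# Assumed keyword list standing in for real speech understanding.
YES_WORDS = {"yes", "yeah", "yep"}


def is_yes(utterance: str) -> bool:
    """Crude keyword check: strip punctuation, then look for a yes-word."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in utterance.lower())
    return any(word in YES_WORDS for word in cleaned.split())


def choose_mode(reply_to_greeting: str) -> str:
    """Emergency -> guided walkthrough; otherwise -> simple overview of the steps."""
    return "guided" if is_yes(reply_to_greeting) else "overview"


def next_step(current):
    """Return the step after `current` (the first step if None, None when done)."""
    if current is None:
        return GUIDED_STEPS[0]
    index = GUIDED_STEPS.index(current)
    return GUIDED_STEPS[index + 1] if index + 1 < len(GUIDED_STEPS) else None
```

A real dialogue manager would also need to handle answers such as “I don’t know” at each step, which this sketch leaves out.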

Our system provides visual feedback as well as audio feedback. The system tracks the conversation and prints it out in real time, allowing the user to see their own statements as well as the system’s. This gives the user a way to follow the conversation and to reread the instructions in a noisy or busy environment. It also lets the user see whether the system misunderstood them, since a misunderstanding could lead the system to give the wrong instructions during CPR. The system also displays images in real time, based on the current step and situation, to instruct the user. For example, it shows images illustrating how to check for breathing, when to call 911, and how to position the hands on the chest of the person who needs CPR. For audio feedback, the system plays a beat to follow along to, so that the user delivers 100 compressions per minute in an emergency situation. This makes compressions easier to perform, since the user does not have to count and can simply follow the beat. By providing images and sound as well as audio instructions, we were able to reach our stretch goals and develop a well-rounded system that provides feedback to the user.
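To make the timing of the beat concrete: at 100 compressions per minute, beats fall 60 / 100 = 0.6 seconds apart. The short Python sketch below only computes that schedule. The function name and its parameters are hypothetical, and it does not show how the application actually plays its audio.

```python
from typing import List


def beat_times(rate_per_minute: int = 100, beats: int = 5) -> List[float]:
    """Return the start time in seconds of each of the first `beats` beats."""
    interval = 60.0 / rate_per_minute  # 0.6 s at 100 compressions per minute
    return [round(i * interval, 3) for i in range(beats)]


print(beat_times())  # [0.0, 0.6, 1.2, 1.8, 2.4]
```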

Pilot Testing

While developing the system, we iteratively tested it with different users to assess the speech recognition, speech synthesis, and functionality of the system.

We first tested the speech recognition by having 5 participants (M = 24 yrs, 3 female) say six preset sentences. The sentences were selected from a recording of a CPR 911 dispatch call [14], for example, “He is not breathing” and “I don’t know how to perform CPR”. We then calculated the Word Error Rate (WER), which was 22% overall: 51% for non-native English speakers (N = 2) and 2.8% for native English speakers (N = 3). To test the speech synthesis, we played four different sentences (e.g., “Are you in a safe location?”) at different speeds and with a male and a female voice. We then asked the participants which voice and which speed they preferred. The majority preferred the male voice played at -5% speed.
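For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the intended sentence and the recognized one, divided by the number of words in the intended sentence. The Python sketch below shows that standard definition. It is not the scoring script used in the study, and the misrecognized sentence in the usage line is a made-up example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # ref word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical misrecognition: one of four words is dropped.
print(word_error_rate("He is not breathing", "He is breathing"))  # 0.25
```

Whether the reported figures pool the errors over all sentences or average the per-sentence rates is not stated in the text; either can be built on top of this function.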

After developing the initial prototype, we tested its functionality with 10 participants (M = 23 yrs, 6 female). We allowed the participants to interact freely with the system and did not constrain their input. This allowed us to find bugs, implement functionality we had missed, and gather more examples of dialogue on which to train and test the system’s recognition. For example, we added more support for “I don’t know” statements, for cases where the user moved to a safe location, and for cases where help arrived.

User Testing

The final user testing used a round-robin method, in which different people walked around and interacted with the systems. It took place in a small sub-area of a classroom and was therefore a noisy and busy environment.

A total of 24 participants interacted with our system. However, one participant skipped questions on the post-survey used for analysis, so we examined the responses of only 23 participants (M = 24.4, SD = 3.9, 4 females). Ten of the participants were non-native English speakers, and only nine of the participants knew CPR before using the system.


To compute the Word Error Rate (WER), we transcribed the recordings to obtain each user’s intended response. The transcripts were then compared with what the system recognized, as recorded in the logs. In total we had an 8.9% WER: 10.3% for non-native English speakers (N = 10) and 8% for native English speakers (N = 13). This is an improvement over the 22% WER from the pilot testing. Since the testing occurred in a noisy and busy environment, the WER would probably improve further in a quieter environment.

For the 11 Likert questions, the responses were converted to numbers (e.g., Strongly agree = 5 and Strongly disagree = 1). The scores for negative statements, such as “I thought there was too much inconsistency in this system” and “I found the system cumbersome to use”, were then inverted. We calculated the average score for each participant and then averaged across participants, giving an overall average of 3.7 (SD = 0.55), which indicates that participants were generally satisfied with our system.
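The scoring can be sketched as follows in Python. On a 1 to 5 scale, inverting a negatively worded item means replacing a score s with 6 - s. Only the two endpoint labels are given in the text, so the three middle labels, the function name, and the sample responses are assumptions for illustration.

```python
from statistics import mean

# Endpoints are from the text; the three middle labels are assumed.
LIKERT = {
    "Strongly agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}


def participant_score(responses, negative_items):
    """Average one participant's answers, inverting negatively worded items.

    responses: list of Likert labels, one per question.
    negative_items: set of 0-based question indices whose wording is negative.
    """
    scores = []
    for index, label in enumerate(responses):
        value = LIKERT[label]
        scores.append(6 - value if index in negative_items else value)
    return mean(scores)


# Hypothetical participant: question 1 is a negative statement.
answers = ["Agree", "Disagree", "Agree", "Agree"]
print(participant_score(answers, negative_items={1}))  # 4
```

The overall figure is then the mean of these per-participant scores.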

Multiple regression analysis was performed to determine how individual factors affected average user satisfaction. The dependent variable was average user satisfaction, a continuous variable on a scale from 1 to 5, calculated as the average of a participant’s individual ratings on the usability scale. The seven independent variables were English as first language (Yes/No), previous knowledge of CPR (Yes/No), WER, duration of the conversation (in seconds), number of clarification requests from the system to the user (e.g., the system does not know), number of user and system dialogue turns, and task success (Yes/No). Task success indicates whether the system was able either to assist the user in performing CPR in an emergency or to help them learn it. The regression coefficients were estimated using SPSS. Although no statistically significant results were obtained, three variables had negative coefficients (English as first language, WER, and number of clarification requests from the system to the user), suggesting that they reduced user satisfaction. Considering the weight of each coefficient, WER and the user’s first language were the two major factors in lower user satisfaction. These results suggest that improving the automatic speech recognition (ASR) could increase average user satisfaction. Word errors during speech recognition can have several causes, such as a noisy environment or the user’s English accent. Reducing background noise or training the ASR on different English accents could improve the system.

The number of clarification requests from the system to the user also negatively affected user satisfaction. Given the scope of this project, the system was trained on a very limited number of responses. Training the system on a larger number of possible inputs, together with a lower WER, could reduce the number of clarification requests. Although the system was implemented with multiple fallback responses for clarification requests, adding more extensive just-in-time instructions could help reduce the number of clarifications.
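Written out, the regression described above has the usual linear form shown below. The symbol names are chosen here for illustration and are not taken from the SPSS output: for participant i, S_i is average satisfaction, and the seven predictors appear in the order listed in the text, with the Yes/No variables coded as 0 or 1.

```latex
\[
S_i = \beta_0
    + \beta_1\,\mathrm{English}_i
    + \beta_2\,\mathrm{KnowsCPR}_i
    + \beta_3\,\mathrm{WER}_i
    + \beta_4\,\mathrm{Duration}_i
    + \beta_5\,\mathrm{Clarifications}_i
    + \beta_6\,\mathrm{Turns}_i
    + \beta_7\,\mathrm{Success}_i
    + \varepsilon_i
\]
```

A negative coefficient, as reported for English as first language, WER, and clarification requests, means that the predicted satisfaction falls as that variable increases while the other predictors are held fixed.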


We created a Hands-Only Cardiopulmonary Resuscitation (CPR) Assistant, a system designed to guide users through the Hands-Only CPR process while they perform the procedure. It is a .NET Web-based application that assesses the situation and provides instructions through dialogue with the user. Overall, the participants were satisfied with our system and thought it was easy to use. This system demonstrates the applicability of a spoken dialogue system in helping individuals in emergency scenarios. This proof of concept can be extended to a real-world application by improving it further and making it a hands-free CPR assistant.