VOICE
ASSISTANT
Experimental Research on Student Satisfaction with the Intelligent Voice Assistants Siri and Google Assistant

OVERVIEW
Voice assistants (VAs) have iterated through many improvements, delivering increasingly valuable and accurate performance for consumers. They have developed into incredibly intelligent and useful tools for many everyday users, handling general search, scheduling, calculations, checking messages and emails, and more.

However...
Are users satisfied with current
voice assistants?
Our mission is to conduct experimental research to understand user satisfaction with intelligent voice assistants and to compare satisfaction between the two most popular voice assistants on smartphones, Siri and Google Assistant.

MY ROLE
UX Researcher

METHODS
Literature Review
Experimental Design
Questionnaire
Qualitative Interview

DURATION
Two Months
Oct - Dec, 2017

• Project Timeline •
RESEARCH DESIGN
Design Process of the Experimental Research from Preparation to Data Collection
Siri and Google Assistant are the native VAs of iOS and Android respectively, so users of each platform may have different experiences with voice assistants, which may result in different levels of satisfaction. In our experimental research, we therefore used a between-subjects design to explore whether users of Google Assistant report higher overall satisfaction than users of Siri.
• Preparation •

DEVICES
Public iPhone with Siri
Public Android Phone with Google Assistant

PARTICIPANTS
12 participants in total
Including 3 RIT staff members who handle student affairs and 9 RIT students

CONTENTS
Semi-structured interview
10 prepared questions for students
4 prepared questions for staff

RECORD
iOS Screen Recorder: QuickTime Player
Android Screen Recorder: AZ Screen Recorder
Google Form: Answers to the Questionnaire
Paper: Results of Observation and Follow-up Questions

DURATION
20 minutes per participant
• Experiment Process •

• Tasks •

• Questionnaire & Follow-up Questions •
The questionnaire included three parts: the consent form, basic demographics, and Likert questions for each task. The Likert scales were separated into four aspects, as follows. For the structured dialogue, the questions were prescribed for each sub-task.

Before each task, there was a detailed description of the task and its requirements. After each section, there was an informal interview with several follow-up questions that depended on the participant's performance on the two tasks.
What problems did you come across when you were doing these tasks?
FINDINGS
Data Analysis with ANOVA Tests
We collected all the answers to the Likert questions, which we then used to run ANOVA tests.
We also broke down each question in our task groups by user goal (Completion, Effort, Recognition, and User Satisfaction) to determine whether any aspect of a tested voice assistant was viewed in a significantly different way from its counterpart.
According to the analysis, all p-values are above 0.05, meaning there is no statistically significant difference between Google Assistant and Siri in user satisfaction, task completion, speech recognition, or effort taken, nor in any of the sub-goals. We cannot reject our null hypothesis: there is no significant difference in user satisfaction between Google Assistant and Siri among student users.

DISCUSSION
Discuss the Possible Reasons for the Results and Get Insights from the Results.
Why did our results not match our initial expectations?
We think THE MOST IMPORTANT reason is that our experimental design tested Siri only with iOS users and Google Assistant only with Android users. Participants experienced only the system they were already most comfortable with. Without providing our participants with a comparison between the two VAs, it is highly likely that they were bound to feel satisfied with the VA they tested.
There are also some other possible reasons, as follows:
- At present, Siri and Google Assistant are established intelligent systems capable of handling users' basic needs and expectations. This translated into consistently high scores across the majority of our experimental tests.
- Given the age and technical environment our participants came from, our test subjects were nearly all advanced VA users with several years of experience interacting with either Siri or Google Assistant. Our basic tasks, like setting a reminder and performing a general search, required little effort on their part and generated high marks in completion, satisfaction, and recognition.
- Our use of the Likert scale, which collected subjective responses from the participants, may have generated a positive response bias. Although we took several steps to reduce biased responses, such as using neutral wording, posing questions instead of statements, providing response labels, and administering the questionnaires on a computer, we cannot rule out the potential for bias. For instance, the participants we recruited from the class were prone to a 'can do' attitude toward the tasks because they were volunteering for the research and possibly had more patience and confidence to accomplish the tasks.
Although we could not demonstrate a statistically significant difference in user satisfaction between Siri and Google Assistant based on the Likert-scale survey results above, we did find some differences and problems through our observations and informal interviews during the experiment.
Task 6 - Structured Dialogue: After searching the flight to NYC, ask “What is the weather like there?”

This may suggest that Google Assistant is better at interpreting contextual dialogue than Siri, though future work is needed to explore this hypothesis.
What insights did we get from the results?
Speech Recognition
Task 3 - Web Search: Explain ‘experimental research’

Task 5 - Structured Dialogue: Ask for movie times for Thor and navigate to the theater

Google Assistant may do better than Siri at responding with systematically formatted answers, asking follow-up questions, and giving one-step answers that solve problems directly. However, further work is needed to establish whether users actually prefer Google Assistant's response style and, if such a preference exists, for which tasks.
Response Dialogue
CONCLUSION & FINAL DOCS
From our experimental research, although we did not find a statistically significant difference in satisfaction between Google Assistant and Siri, we summarized four possible reasons for the results: 1) the established technology of the two voice assistants; 2) our participants' familiarity with the voice assistant they tested; 3) the lack of comparison between the two voice assistants; 4) positive response bias in the Likert scale.
Based on our observations and informal interviews, we also proposed plausible differences and problems between Siri and Google Assistant in two aspects:
1) Speech recognition.
2) Response dialogue.
If you want more details about the project, see Final Paper.