- Open Access
An experimental result of estimating an application volume by machine learning techniques
© Hasegawa et al.; licensee Springer. 2015
Received: 25 August 2014
Accepted: 4 January 2015
Published: 1 February 2015
In this study, we improved the usability of smartphones by automating a user’s operations. We developed an intelligent system using machine learning techniques that periodically detects a user’s context on a smartphone. We selected the Android operating system because it has the largest market share and highest flexibility of its development environment. In this paper, we describe an application that automatically adjusts application volume.
Adjusting the volume can be easily forgotten because users need to push the volume buttons to alter the volume depending on the given situation. Therefore, we developed an application that automatically adjusts the volume based on learned user settings. Application volume can be set differently from ringtone volume on Android devices, and these volume settings are associated with each specific application including games. Our application records a user’s location, the volume setting, the foreground application name and other such attributes as learning data, thereby estimating whether the volume should be adjusted using machine learning techniques via Weka.
Use of mobile devices is remarkably growing all over the world. In 2013, according to a report from Strategy Analytics Inc.a, the smartphone market share continues to increase and has reached 60% globally. These smartphones have many functions, such as calling, e-mail, camera, games, media players, calendars, and alarm clocks. Thus, smartphones have almost the same abilities as personal computers and users always carry a smartphone. In addition, smartphones have many sensors as part of their standard equipment, unlike personal computers. Android smartphones generally have many sensors such as a light sensor, a microphone, an accelerometer, a gyroscope, and a magnetic field sensor. Android smartphones are suitable for recording lifelogs because they are equipped with many sensors, users always carry their smartphones, and smartphones offer high computing power similar to personal computers.
The term lifelog is composed of the words life and log; we define a lifelog as recorded individual behavior and information. As an example of a lifelog application, Hori and Aizawa (2003) and Sellen et al. (2007) have developed a system that records user’s memories, events, and experiences as a movie; therefore users can easily create such movies of their life stories. Up, a system developed by Jawboneb can record our daily exercise and sleep patterns via an accelerometer installed in a wristband device; therefore users can use this recorded information to keep track of their health. Adami et al. (2003) used various types of sensors such that users could improve their quality of life via lifelog records. By simply recording the user’s location, we can develop an activity management system. Lifelogs are often manually recorded by user; however, it is generally desirable for the system to automatically record such information. To this end, although users should carry many devices on their, such as sensors and cameras, it makes users’ burdensome. In contrast, smartphone makes a much smaller burden for users because various sensors are readily available, and users always carry such devices of which lifelogs can automatically be recorded. Therefore, smartphones are suitable devices for recording lifelogs.
A smartphone is useful for users’ daily life; however, manners and morals of users have become a problem. For example, if someone played a video game at a loud volume in a public library, others may get disturbed. Setting the volume control of a smartphone is generally necessary in public places such as public libraries, theater, and so on. Moreover, in one’s social life, such as in school and at the workplace, it is desirable to set a suitable volume; however, a user needs to manually adjust the volume setting. Some users intentionally do not adjust volume settings depending on their morals, whereas others often forget to adjust the volume accordingly. This problem not only causes disturbances in the user’s surroundings, but also adversely affects a user’s own performance (Cutrell et al. 2001). If a user’s smartphone makes loud sounds during a class because he/she forgot to set it to silent mode, it not only causes confusion but also interrupts the class. This may also adversely affect a user’s reputation. Unnecessary noise during work adversely affects work performance and may increase stress and the likelihood of operational errors (Eyrolle and Cellier 2000). Although sound volume is an important setting, users need to manually adjust it according to their situation; therefore, automatic sound adjusting system is desirable.
In this study, we propose a system that automatically switches application volume ON or OFF for Android smartphones via machine learning techniques using users’ lifelogs. Furthermore, application volume can be set differently from ringtone volume on Android smartphone. There are two key reasons why we focused on application volume. First, application volume is not associated with silent mode preference, whereas ring tone volume is connected with silent mode preference. More specifically, unless application volume is set to zero, the smartphone makes noise even if the silent mode preference is set to silent, and this is uncomfortable for an inexperienced smartphone user. Second, there are users who adjust application volume on a regular basis without switching their silent mode preferences. In our experiment, five out of nine subjects adjusted application volume on a regular basis, but only one subject switched silent mode preference on a regular basis. Our nine subjects were composed of postgraduates, undergraduates and office workers. Because of the small number of subjects, we cannot be certain that most smartphone users do not switch silent mode preferences on a regular basis: however, users who do not set silent mode preferences do tend to adjust application volume on a regular basis. For this study, focusing on the volume settings of a smartphone, we propose an intelligent system that automatically sets suitable application volume by first learning a user’s ordinary volume setting patterns. Because an automatic application volume setting system prevents the smartphone from suddenly making noise, we achieve improved smartphone usability. Our proposed system runs in the background of an Android smartphone, constantly records terminal operational logs, and learns from these recorded logs. While changing foreground applications, a suitable application volume is estimated and automatically set. In general, application volume can be set to one of sixteen-gradations; however, proposed system estimates whether application volume is ON or OFF as the first step because the objective of this study is to prevent extreme negative influence for users and disturbance to their surroundings.
We investigated studies related to our proposed system, including a study that enables users to automatically switch silent mode preferences and a study that prevents unwanted interruptions from incoming calls.
A lifelog is recorded individual behavior or information. We consider a user’s lifelog to be an individual’s behavior or information observed by a smartphone alone (i.e., without any other special devices). Various studies have focused on lifelogs observed by smartphones (Kobayashi et al. 2011; Kwapisz et al. 2010; Lee and Cho 2011). In these studies, systems inferred the user’s movement states by using analysis and classification of values observed via a three-dimensional accelerometer. In particular, in Kobayashi et al. (2011), successfully inferred how a user moves, including walking, running, riding a bicycle, stopping, driving a car, riding a bus, and riding a train. Further, we can also observe a user’s state via new activity recognition APIs on Android by Google. These studies make it possible to track how the user moves and are active as the user’s lifelogs. Other studies also make it possible to record various types of users’ lifelog information. Ouchi and Doi (2012) focused on indoor activity recognition, inferring an indoor activity such as cleaning or brushing one’s teeth. Chen et al. (2013) and Hao et al. (2013) measured sleep duration by using values observed on a smartphone’s three-dimensional accelerometer. We consider user’s lifelogs to include these user states inferred from sensor values as well as location information acquired via GPS, environmental noisiness, and the smartphone’s state (e.g., foreground application name, silent mode state, screen orientation, and Wi-Fi connection state); all such information as well as timing of such information is important to the composition of lifelogs.
Sound volume setting
A study using an alternative approach was proposed by Khail and Connelly (2005), who proposed a method that automatically switches silent mode preferences depending on calendar information. This method learns silent mode preferences that depend on calendar information, including meetings, lunch dates, and shopping excursions, set by the user in advance. They analyzed the relationship between calendar information and silent mode preferences through an experiment in which users input suitable silent mode preferences using the PDA terminal as a cell phone while receiving incoming calls in daily life. The experimental results indicate that silent mode preferences could be set depending on calendar information to a certain degree; further, in the questionnaire, almost all subjects stated that they would like to use the system if it were available on their smartphone, although, there is no relationship between the users. Since this method focused on users who constantly switch silent mode preferences based on their schedule, this method may be suitable for users who can minutely update their calendars. In (Dekel et al. 2009), by using additional information, such as the day of the week, time, and the active profile, suitable silent profiles (e.g., general, silent, meeting, outdoor, and pager) are estimated using the k-nearest neighbor algorithm.
Context awareness is a research field related to our study. Context awareness is defined as a technology or concept in which a computer observes contexts such as users’ behavior, surrounding environment, and situations. From (Abowd et al. 1999) context awareness service is a general system that provides relevant services to users by automatically observing contexts without input from the user; because it provides a service by observing the user’s current situation and environment, it is often studied with mobile devices that are always or often carried by their users. Previously, a context-aware computer had to be connected to other sensing devices to collect contexts (Gellersen et al. 2002; Ho and Intille 2005), because sensing technologies on mobile devices had not yet been developed. Recent smartphones (e.g., Android and iPhone) are equipped with various sensors as standard functions, and have, therefore, been easily able to automatically collect contextual information without using other devices.
In this study, we prevent disturbance to those in the surroundings and prevent decrease in user performance due to users forgetting to adjust the application volume. Therefore, we developed an automatic application volume setting system by learning contexts collected by using context-awareness techniques. As described above, related work regarding automatically switching silent mode preferences has been investigated using various methods; in most of these methods, users need to set specific criteria terms such as time, location, and detailed schedules in advance. Conversely, in this study, we provide a feature for automatically setting application volume by learning contexts collected by a smartphone, thereby gradually learning suitable volume settings without requirng additional user inputs such as time, location and schedule in advance. Moreover, unlike much of the related work, we define a correct setting based on a user’s daily operations without having the user input correct (or incorrect) settings, thereby preventing trouble for the user’s volume settings, forgetting application volume settings, causing surrounding disturbances, and decreasing the user’s work performance.
Automatic adjusting application volume
Patterns of recorded data
Whether the screen is on or off
Whether the displayed screen is landscape or portrait
Application name displayed in the foreground
Time of day
Day of week
Day of week
Silent mode preference
What type of silent mode preference is set
Whether a headset is connected
An index number of clustered GPS data
(For inner process) A long value of datetime
(For inner process) A latitude value observed with GPS
(For inner process) A longitude value observed with GPS
(For inner process) An accuracy value observed with GPS
(For inner process) An application volume setting
Timing of fetching data
Data fetching timing
Turning on the screen
Turning off the screen
Connecting the headset
Disconnecting the headset
Connecting the battery
Disconnecting the battery
Reacting to the proximity sensor
Losing reaction with the proximity sensor
Setting the silent mode
Canceling the silent mode
Dialing a phone call
Receiving an incoming call
Connecting a phone call
Returning to standby
Setting Wi-Fi on
Setting Wi-Fi off
Idle for 1 min
Charging a remaining battery
Pushing the home button
Before using recorded contexts as learning data for application volume estimation, they need to be preprocessed. One of the preprocessing steps involves GPS data, whereas another involves correct data generation for machine learning.
Subsequently, our system preprocesses clustering data using completed latitude and longitude for extracting significant places. Significant places are meaningful areas that the user has visited, for example a user’s home, a station, and the workplace. There are numerous clustering algorithms to choose from, including k-means clustering, which is a non-hierarchical clustering method that divides a dataset into k clusters using a center-of-gravity approach; however, significant places cannot be identified by simply applying this algorithm to location information (i.e., latitude and longitude). In this study, we decided to use the time-based clustering method proposed by Kang et al. (2004), which labels locations as significant places when the distance from the measured location is less than a constant distance, and the measured interval is more than a constant interval time using a time series incidental to latitude and longitude. We decided to use this method because this algorithm is simple and processes data sequentially. Simplicity is especially important for development of mobile devices to ensure low battery consumption and little use of limited CPU resources. The time and distance threshold values are set to 300 s and 50 m respectively.
A portion of sample recorded data
On or Off
On or Off
Unlike the conventional application and related work, our system focuses on application volume estimation and does not require users to provide correct data inputs (e.g., time, location, and calendar) in advance. Therefore, it is difficult to simply compare our proposed system with related works having automatic silent mode setting. For this study, we showed our system’s usability by considering the accuracy with which our system can estimate the amount of burden our system can reduce. To evaluate accuracy of application volume estimation via our proposed system, we constructed an experiment to collect contexts from daily smartphone operations. For our experiment, we developed an application that records contexts (see Table 1) when specific events occur (see Table 2), and then sends recorded contexts to a web server on a regular basis. Since we can estimate users’ situations using observed contexts and what settings a user is using, we can analyze the amount of burden, caused by the users’ need to adjust application volume settings manually that this system can reduce. Since our application frequently records various contexts (e.g., GPS), battery consumption increased in this experiment, and it was difficult to convince users to participate in this experiment for a long time. For this experiment, we had nine participants, including undergraduates, postgraduates, office workers. The experimental period was 7-79 days, depending on the user, with an average period of 30 days per person. Among them, five users adjusted application volume on a regular basis; their experimental period was 13-79 days, with an average period of 41 days per person. Since only users who constantly adjust application volume constantly are the target audience for our proposed system, for evaluation accuracy, we analyzed data for only these five users. We measured estimation accuracy via the following steps: (1) we generated correct data using the steps described in Section 2; (2) we generated estimation results from our proposed system through a simulation that was the same as the actual application’s steps; and (3) we compared correct data with estimation results for evaluation. More specifically for (2), our simulation involved the following steps: (a) loading the recorded context one at a time; (b) preprocessing the complementation of GPS data and generating correct data; (c) estimating suitable application volume; (d) proceeding to the next context. Through simulation, we obtained results that were the same as actual data observed on the smartphone with actual usage.
Our system used machine learning techniques provided by the data mining software Weka for application volume estimation.
Further, to study suitable classifiers as leaf nodes in Figure 5, we compared the accuracies calculated by simulation using the following types of classifiers: random Forest (RF), support vector machine (SVM), naive Bayes (NB), PART. RF (Breiman 2001) is a method that calculates a result by majority vote in the generated trees’ classified results. SVM (Cortes and Vapnik 1995) is a supervised learning model that can efficiently perform nonlinear classification using a technique named kernel trick, implicitly mapping inputs into high-dimensional feature spaces. NB (Lewis 1998) is a simple probabilistic classifier based on Bayes’ theorem with strong independence assumptions. PART (Frank and Witten 1998) is a rule-learning classifier that infers rules by repeatedly generating partial decision trees, thus combining the two major paradigms for rule generation, creating rules from decision trees and the separate-and-conquer technique. We selected these algorithms because of their high accuracy and well-established learning models. We used these algorithms with following parameters which are initial values on Weka. In RF, maximum depth of the tree and the number of features were not limited, and the number of trees was 10. In SVM, the kernel function was RBF kernel, the cost was 1.0, and the gamma was 0.0. In NB, kernel estimator was unavailable. In PART, the minimum number of objects for making rules was two instances, pruning was unavailable.
Sensitivity refers to the number of times that the system could automatically adjust the application volume on out of the number of times the user manually adjusted the application volume on (i.e., how much the system could estimate the application volume on). Specificity refers to the number of times that the system could automatically adjust the application volume off out of the number of times that the user manually adjusted the application volume off (i.e., how much the system could estimate the application volume off). Positive Predictive Value (PPV) refers to the number of times that the user wanted to adjust the application volume on out of the number of times that the system automatically adjusted application volume off (i.e., the rate of correct estimation of application volume on versus adjusted by the system). Negative Predictive Value (NPV) refers to the number of times that the user wanted to adjust the application volume off out of the number of times that the system automatically adjusted the application volume off (i.e., the rate of correct estimation the application volume off versus adjusted by the system). Accuracy refers to the rate of correct estimation of all. The higher the sensitivity, the fewer times the user needs to adjust the application volume on. The higher the specificity, the fewer times the user needs to adjust the application volume off.
To describe our proposed system’s usability, we compared our proposed system accuracy with the accuracy of the simple method where application volume is connected with silent mode preferences. We selected this method for comparison because it is easy to realize and can be used for simple application volume estimation.
Comparison of accuracy between our proposed system and a simple method connected with silent mode preferences
First, we compare Simple with our proposed system. In terms of accuracy, while Simple had 80.1% accuracy, our proposed system was 94.6%, 95.2%, 94.6% and 95.0%, consistently higher than Simple.
As a result, our proposed system could estimate more accurately than Simple. In terms of sensitivity and specificity, Simple had the highest specificity; however it had the lowest sensitivity. This indicates that Simple strongly related to the application volume off; however, only using simple method can not automatically adjusted when the user want to adjust the application volume on with silent mode on. In contrast, our proposed system can estimate both application volume on and off, moreover the specificity was not greatly different from Simple’s one. Therefore, our proposed system has better usability than Simple, because it hardly obstructs volume setting and often supports user volume setting. Second, we compared the accuracy of each classifier. There is no difference in each algorithm. We considered that the reason of this result is that we used the classifier with the structure shown in Figure 5.
Consequently, our proposed system can automatically switch application volume with higher accuracy than the method simply connected with silent mode preferences.
The objective of this study was to support user operations by learning contexts observed on a smartphone. In this section, we described our proposed system that automatically adjusts application volume to prevent interruptions in the surrounding and decrease in user performance caused by forgetting to adjust application volume settings. Our proposed method estimates a suitable application volume and automatically adjusts it by observing and learning contexts on a regular basis. As a result of our experimental accuracy evaluation via simulation using actual contexts, we found that our proposed method could set application volume more effectively than the simple method connected with silent mode preferences. In terms of a classifier, we obtained the best accuracy through the tree structure based on initial sound volume and foreground application with the PART classifier.
To evaluate our proposed system through a simulation using contexts observed by actual usage, we considered the accuracy with which our system could estimate if the user utilized our proposed system. In this paper, we evaluated our method through a simulation. We will continue our work by improving estimation accuracy through developing an application that users all over the world can easily use; further, we will consider our proposed system usability and battery consumption by machine learning on the smartphone.
a[Strategy Analytics Inc] http://www.strategyanalytics.com/
b[Up by Jawbone] https://jawbone.com/up
e[Google Play] https://play.google.com/
- Abowd, GD, Dey AK, Brown PJ, Davies N, Smith M, Steggles P (1999) Towards a better understanding of context and context-awareness. Comput Syst 40: 304–307.Google Scholar
- Adami, AM, Hayes TL, Pavel M (2003) Unobtrusive monitoring of sleep patterns In: Proceedings of the 25th annual international conference of the IEEE engineering in medicine and biology society (IEEE Cat. No.03CH37439), 1360–1363.. vol. 2 IEEE, NJ, USA.View ArticleGoogle Scholar
- Breiman, L (2001) Random forests. Mach Learn 45: 5–32.View ArticleGoogle Scholar
- Chen, Z, Lin M, Chen F, Lane ND, Cardone G, Wang R, Li T, Chen Y, Choudhury T, Campbell AT (2013) Unobtrusive sleep monitoring using smartphones In: Pervasive computing technologies for healthcare (pervasive health) 2013 7th international conference on, 145–152.. IEEE, NJ, USA.Google Scholar
- Cortes, C, Vapnik V (1995) Support-vector networks. Mach Learn 20: 273–297.Google Scholar
- Cutrell, E, Czerwinski M, Horvitz E (2001) Notification, disruption, and memory: Effects of messaging interruptions on memory and performance In: Conference on human computer interaction (Interact 2001).. IOS Press, Amsterdam, The Netherlands.Google Scholar
- Dekel, A, Nacht D, Kirkpatrick S (2009) Minimizing mobile phone disruption via smart profile management In: 11th international conference on human-computer interaction with mobile devices and services - MobileHCI ’09 vol 43.. ACM, NY, USA.Google Scholar
- Eyrolle, H, Cellier JM (2000) The effects of interruptions in work activity: Field and laboratory results. Appl Ergon 31(5): 537–543.View ArticleGoogle Scholar
- Frank, E, Witten IH (1998) Generating accurate rule sets without global optimization In: Machine Learning: proceedings of the fifteenth international conference, 144–151.. Morgan Kaufmann Publishers Inc, CA, USA.Google Scholar
- Gellersen, HW, Schmidt A, Beigl M (2002) Multi-sensor context-awareness in mobile devices and smart artifacts. Mobile Netw Appl 7(5): 341–351.View ArticleGoogle Scholar
- Hao, T, Xing G, Zhou G (2013) iSleep: Unobtrusive sleep quality monitoring using smartphones In: Proceedings of the 11th ACM conference on embedded networked sensor systems, 4–1414.. ACM, Rome, Italy.Google Scholar
- Hori, T, Aizawa K (2003) Context-based video retrieval system for the life-log applications In: Proceedings of the 5th ACM SIGMM international workshop on multimedia information retrieval - MIR ’03, 31–38.. ACM, CA, USA.View ArticleGoogle Scholar
- Ho, J, Intille SS (2005) Using context-aware computing to reduce the perceived burden of interruptions from mobile devices In: Proceedings of the SIGCHI conference on human factors in computing systems CHI 05 Portland, 909–918.. ACM, CA, USA.View ArticleGoogle Scholar
- Kang, JH, Welbourne W, Stewart B, Borriello G (2004) Extracting places from traces of locations In: 2nd ACM International Workshop on Wireless Mobile Applications and Services on WLAN Hotspots vol. 9, 110–118.. ACM, CA, USA.View ArticleGoogle Scholar
- Khalil, A, Connelly K (2005) Context-aware configuration: a study on improving cell phone awareness In: Modeling and using context vol. 3554, 197–209.. Springer Berlin Heidelberg, Berlin, Germany.View ArticleGoogle Scholar
- Kobayashi, A, Muramatsu S, Kamisaka D, Watanabe T, Minamikawa A, Iwamoto T, Yokoyama H (2011) Shaka: User movement estimation considering reliability, power saving, and latency using mobile phone. IEICE Trans Inf SystE94-D(6): 1153–1163.View ArticleGoogle Scholar
- Kwapisz, JR, Weiss GM, Moore SA (2010) Activity recognition using cell phone accelerometers. ACM SIGKDD Explorations Newsletter 12(2): 74–82.View ArticleGoogle Scholar
- Lewis, DD (1998) Naive (Bayes) at forty: The independence assumption in information retrieval. Mach Learn: ECML-981398: 4–15.Google Scholar
- Lee, YS, Cho SB (2011) Activity recognition using hierarchical hidden markov models on a smartphone with 3D accelerometer. Hybrid Artif Intell Syst6678: 460–467.Google Scholar
- Metsis, V, Androutsopoulos I, Paliouras G (2006) Spam filtering with naive Bayes? which naive Bayes? In: Third conference on email and anti-spam (CEAS), California, USA.Google Scholar
- Ouchi, K, Doi M (2012) Indoor-outdoor activity recognition by a smartphone In: Proceedings of the 2012 ACM conference on ubiquitous computing - UbiComp ’12, 600–601.. ACM, CA, USA.View ArticleGoogle Scholar
- Sellen, A, Fogg A, Aitken M, Hodges S, Rother C, Wood K (2007) Do life-logging technologies support memory for the past? an experimental study using sensecam In: Proceedings of the SIGCHI conference on human factors in computing systems, 81–90.. ACM, CA, USA.View ArticleGoogle Scholar
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.