Dataset

Activity Prediction

Last Updated: Dec. 2, 2012

This dataset contains data collected under controlled laboratory conditions. If you are interested in "real world" data, please consider our Actitracker Dataset.

The data in this file corresponds with the data used in the following paper:

Jennifer R. Kwapisz, Gary M. Weiss and Samuel A. Moore (2010). "Activity Recognition using Cell Phone Accelerometers," Proceedings of the Fourth International Workshop on Knowledge Discovery from Sensor Data (at KDD-10), Washington DC. [PDF]

When using this dataset, we request that you cite this paper.
You may also want to cite our other relevant articles, which can be found here.

When sharing or redistributing this dataset, we request that the readme.txt file is always included.

Statistics

  • Raw Time Series Data
    • Number of examples: 1,098,207
    • Number of attributes: 6
    • Missing attribute values: None
    • Class Distribution
      • Walking: 424,400 (38.6%)
      • Jogging: 342,177 (31.2%)
      • Upstairs: 122,869 (11.2%)
      • Downstairs: 100,427 (9.1%)
      • Sitting: 59,939 (5.5%)
      • Standing: 48,395 (4.4%)
  • Transformed Examples
    • Number of transformed examples: 5,424
    • Number of transformed attributes: 46
    • Missing attribute values: None
    • Class Distribution
      • Walking: 2,082 (38.4%)
      • Jogging: 1,626 (30.0%)
      • Upstairs: 633 (11.7%)
      • Downstairs: 529 (9.8%)
      • Sitting: 307 (5.7%)
      • Standing: 247 (4.6%)

Download Latest Version
  • Changelog:
    • (v1.1)
      • about files updated with summary information
      • file naming convention updated to include version numbers
      • readme.txt updated to include relevant papers
      • WISDM_ar_v1.1_trans_about.txt updated with citation to paper describing the attributes.
    • (v1.0)
      • user names masked with ID numbers 1-36
      • dataset initialized

  • Files:
    • readme.txt
    • WISDM_ar_v1.1_raw_about.txt
    • WISDM_ar_v1.1_trans_about.txt
    • WISDM_ar_v1.1_raw.txt
    • WISDM_ar_v1.1_transformed.arff

Actitracker

Last Updated: Oct. 22, 2013

This dataset contains "real world" data. If you are interested in controlled testing data, please consider our Activity Prediction Dataset.

This data has been released by the Wireless Sensor Data Mining (WISDM) Lab. The data in this set were collected with our Actitracker system, which is available online for free at and in the Google Play store. The system is described in the following paper:

Jeffrey W. Lockhart, Gary M. Weiss, Jack C. Xue, Shaun T. Gallagher, Andrew B. Grosner, and Tony T. Pulickal (2011). "Design Considerations for the WISDM Smart Phone-Based Sensor Mining Architecture," Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data (at KDD-11), San Diego, CA. [PDF]

When using this dataset, we request that you cite this paper.
You may also want to cite our other relevant articles, which can be found here, specifically:

Gary M. Weiss and Jeffrey W. Lockhart (2012). "The Impact of Personalization on Smartphone-Based Activity Recognition," Proceedings of the AAAI-12 Workshop on Activity Context Representation: Techniques and Languages, Toronto, Canada.

Jennifer R. Kwapisz, Gary M. Weiss and Samuel A. Moore (2010). "Activity Recognition using Cell Phone Accelerometers," Proceedings of the Fourth International Workshop on Knowledge Discovery from Sensor Data (at KDD-10), Washington DC.

When sharing or redistributing this dataset, we request that the readme.txt file is always included.

Statistics

  • Demographics
    • Number of examples: 563
    • Number of attributes: 6
    • Missing attribute values: No
  • Raw Data
    • Number of examples: 2,980,765
    • Number of attributes: 6
    • Missing attribute values: No
    • Class Distribution:
      • Walking: 1,255,923 (42.1%)
      • Jogging: 438,871 (14.7%)
      • Stairs: 57,425 (1.9%)
      • Sitting: 663,706 (22.3%)
      • Standing: 288,873 (9.7%)
      • Lying Down: 275,967 (9.3%)
  • Raw Data (Unlabeled)
    • Number of examples: 38,209,772
    • Number of attributes: 6
    • Missing attribute values: No
  • Transformed Data
    • Number of examples: 5,435
    • Number of attributes: 46
    • Missing attribute values: No
    • Class Distribution:
      • Walking: 2,185 (40.2%)
      • Jogging: 130 (2.4%)
      • Stairs: 251 (4.6%)
      • Sitting: 1,410 (25.9%)
      • Standing: 840 (15.5%)
      • Lying Down: 619 (11.4%)
  • Transformed Data (Unlabeled)
    • Number of examples: 1,369,349
    • Number of attributes: 46
    • Missing attribute values: No
    • Class Distribution:
      • Walking: 281,169 (20.5%)
      • Jogging: 2,130 (0.2%)
      • Stairs: 31,268 (2.3%)
      • Sitting: 655,362 (47.9%)
      • Standing: 158,457 (11.6%)
      • Lying Down: 240,963 (17.6%)

Download Latest Version
  • Changelog:
    • (v2.0)
      • activity label predictions added to unlabeled_transformed

  • Files:
    • readme.txt
    • WISDM_at_v2.0_raw_about.txt
    • WISDM_at_v2.0_transformed_about.arff
    • WISDM_at_v2.0_unlabeled_raw_about.txt
    • WISDM_at_v2.0_unlabeled_transformed_about.arff
    • WISDM_at_v2.0_demographics_about.txt
    • WISDM_at_v2.0_raw.txt
    • WISDM_at_v2.0_transformed.arff
    • WISDM_at_v2.0_unlabeled_raw.txt
    • WISDM_at_v2.0_unlabeled_transformed.arff
    • WISDM_at_v2.0_demographics.txt


  • Both labeled and unlabeled data are contained in this dataset. Labeled data was collected while the user trained Actitracker in "Training Mode," in which the user explicitly specifies which activity is being performed. In both the raw and transformed files for labeled data, the activity label is determined by the user's input. Unlabeled data was collected while the user was running Actitracker for regular use, when the user does not specify which activity is being performed. In the unlabeled raw data file, the activity label is "NoLabel". In the unlabeled transformed file, the activity label is the activity that our system predicted the user to be performing.

Dataset Transformation Process

Last Updated: Jul. 14, 2014

The data transformation process in this file corresponds with the one used in the following paper:

Jeffrey W. Lockhart, Gary M. Weiss, Jack C. Xue, Shaun T. Gallagher, Andrew B. Grosner, and Tony T. Pulickal (2011). "Design Considerations for the WISDM Smart Phone-Based Sensor Mining Architecture," Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data (at KDD-11), San Diego, CA. [PDF]

When using this dataset, we request that you cite this paper.
You may also want to cite our other relevant articles, which can be found here.

Gary M. Weiss and Jeffrey W. Lockhart (2012). "The Impact of Personalization on Smartphone-Based Activity Recognition," Proceedings of the AAAI-12 Workshop on Activity Context Representation: Techniques and Languages, Toronto, Canada.

Jennifer R. Kwapisz, Gary M. Weiss and Samuel A. Moore (2010). "Activity Recognition using Cell Phone Accelerometers," Proceedings of the Fourth International Workshop on Knowledge Discovery from Sensor Data (at KDD-10), Washington DC.


These files implement the data transformation process, in which files of raw accelerometer data are converted to Attribute-Relation File Format (ARFF) files for use with the WEKA machine learning software.

standalone_public_v1.0.jar is called with two arguments: a filepath to the input file (i.e. the raw data file to read) and a filepath to the output file (i.e. the arff file to be written).
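
As a rough illustration of the two-argument call described above, the following Python sketch builds and runs the command. It assumes the JAR is an executable JAR that can be launched with `java -jar`, which this page does not state; the helper names `build_command` and `transform` and the example file names are hypothetical.

```python
import subprocess

JAR = "standalone_public_v1.0.jar"

def build_command(raw_path, arff_path, jar=JAR):
    """Return the argument list for one run: input file first, output file second.

    Assumption: the JAR is launched with `java -jar`; this is not confirmed here.
    """
    return ["java", "-jar", jar, raw_path, arff_path]

def transform(raw_path, arff_path):
    """Run the transformation; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(raw_path, arff_path), check=True)

# Hypothetical file names, for illustration only.
command = build_command("my_raw_data.txt", "my_transformed_data.arff")
```

Keeping the command construction separate from the call makes it easy to print and inspect the command before anything is executed.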

The source code for standalone_public_v1.0.jar is also provided with:
StandAloneFeat.java
TupFeat.java
FeatureLib.java

Descriptions of the features produced by this process can be found in the literature mentioned above as well as the about files for the transformed data of our published datasets.

For our transformation process, we take 10 seconds' worth of accelerometer samples (200 records/lines in the raw file) and transform them into a single example/tuple of 46 values. Most of the features we generate are simple statistical measures.
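
The windowing idea can be sketched in a few lines of Python. This is not the code in the provided Java sources, and it does not reproduce the 46 published features: it only computes a per-axis mean and standard deviation as examples of "simple statistical measures". The non-overlapping windows, the dropping of an incomplete final window, the use of the population standard deviation, and the function names are all assumptions made for illustration.

```python
from statistics import mean, pstdev

WINDOW_SIZE = 200  # records per example, as stated above (10 seconds of samples)

def windows(records, size=WINDOW_SIZE):
    """Yield consecutive, non-overlapping windows of exactly `size` records.

    Assumption: a trailing window with fewer than `size` records is dropped.
    """
    for start in range(0, len(records) - size + 1, size):
        yield records[start:start + size]

def summarize(window):
    """Return per-axis mean and population standard deviation for (x, y, z) tuples."""
    features = {}
    for name, values in zip("xyz", zip(*window)):
        features[f"{name}_mean"] = mean(values)
        features[f"{name}_std"] = pstdev(values)
    return features

# Synthetic readings for illustration only: 450 records give 2 full windows.
records = [(float(i % 5), 1.0, -float(i % 2)) for i in range(450)]
examples = [summarize(w) for w in windows(records)]
```

Each element of `examples` corresponds to one transformed tuple; the real process emits 46 values per tuple rather than the 6 shown here.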

Things to note:
An error concerning the number of tuples saved was recently found and corrected in the source code, so this particular version of the JAR file is not the same one used to create the transformed data from the raw data that is currently published on our site.

During the transformation process, only the first character of the activity label from the raw data files is used when creating the arff files. Because some of our activities begin with the same letter (Stairs, Standing, Sitting), if these labels are present in the raw files when the JAR file is called, the activities cannot be distinguished in the arff files: the activity label will be the same for all of them. WISDM uses a single-character labeling system to represent the activities we recognize, and simple Perl scripts are called when it is necessary to translate between the full activity label and our single-character system.

Walking - A
Jogging - B
Stairs - C
Sitting - D
Standing - E
LyingDown - F
NoLabel - G
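
The mapping above, and the first-letter collision it avoids, can be expressed as a small Python sketch. The WISDM translation scripts are written in Perl and are not included here, so this is a stand-in rather than the lab's own code; the names `ACTIVITY_CODES`, `CODE_TO_ACTIVITY`, and `first_letter_collisions` are hypothetical.

```python
# Full activity label -> single-character code, copied from the table above.
ACTIVITY_CODES = {
    "Walking": "A",
    "Jogging": "B",
    "Stairs": "C",
    "Sitting": "D",
    "Standing": "E",
    "LyingDown": "F",
    "NoLabel": "G",
}

# Reverse lookup for translating codes in the arff files back to full labels.
CODE_TO_ACTIVITY = {code: name for name, code in ACTIVITY_CODES.items()}

def first_letter_collisions(labels):
    """Group labels by first character and keep only groups with several labels."""
    groups = {}
    for label in labels:
        groups.setdefault(label[0], []).append(label)
    return {ch: names for ch, names in groups.items() if len(names) > 1}

# Shows which full labels would become indistinguishable if passed to the JAR as-is.
collisions = first_letter_collisions(ACTIVITY_CODES)
```

Translating the full labels to their single-character codes before running the JAR keeps every activity distinct in the resulting arff file.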

Download Latest Version
  • Files:
    • readme.txt
    • FeatureLib.java
    • StandAloneFeat.java
    • TupFeat.java
    • standalone_public_v1.0.jar