This repository provides a PIN input keystroke dataset collected through controlled experiments for research on behavioral biometrics and typing dynamics. Participants repeatedly entered six-digit PINs on both the numeric row and numeric keypad under a consistent environment. Each dataset file contains key press/release timestamps and input sequence logs, enabling detailed analysis of typing patterns and individual behavior.
The dataset was newly collected by recruiting voluntary participants. A custom logging application simulated a login screen, where participants repeatedly entered six-digit numeric PINs. Each input event was recorded as a log containing the key press/release timestamps and input count number.
PINs were randomly generated using the Excel RANDBETWEEN function.
All participants entered the same six PINs (three on the numeric row and three on the numeric keypad), 80 times each, with a 5-second pause before the last five trials.
Incorrect entries involving missing Enter key presses or use of Backspace were excluded, and only valid entries were retained.
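The exclusion rule above can be sketched as a small validity check. This is only an illustration, not the authors' cleaning script: the key labels `"Enter"` and `"Backspace"`, the helper name `is_valid_entry`, and the sample entries are all assumptions, and the check looks only at the shape of an entry (six digit keys followed by Enter), not at whether the digits match the target PIN.

```python
DIGITS = set("0123456789")

def is_valid_entry(pressed_keys, pin_length=6, enter="Enter", backspace="Backspace"):
    """Return True if the entry is `pin_length` digit keys followed by one Enter.

    `pressed_keys` is the ordered list of key names taken from the press
    events of a single input count. The labels "Enter" and "Backspace" are
    assumed; adjust them to whatever the Name column actually contains.
    """
    if backspace in pressed_keys:
        return False  # Backspace was used during the entry
    if len(pressed_keys) != pin_length + 1 or pressed_keys[-1] != enter:
        return False  # missing Enter, or too many / too few keys
    return all(key in DIGITS for key in pressed_keys[:-1])

# Synthetic entries for illustration only (not taken from the dataset).
entries = {
    1: ["6", "6", "9", "2", "7", "0", "Enter"],                    # valid shape
    2: ["6", "6", "9", "Backspace", "9", "2", "7", "0", "Enter"],  # Backspace used
    3: ["6", "6", "9", "2", "7", "0",
        "6", "6", "9", "2", "7", "0", "Enter"],                    # Enter missed once
}
valid = {count: keys for count, keys in entries.items() if is_valid_entry(keys)}
```

Here only entry 1 passes; entry 3 shows how a missed Enter press merges two attempts into one over-long input.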
- Location: Fixed desk in a university lab environment
- Laptop: Dynabook SJ73/KU
- Input devices:
  - Numeric row input: Built-in keyboard (Dynabook)
  - Numeric keypad input: FUJITSU KU-0325
- Each CSV file contains preprocessed (cleaned) data.
- Preprocessing includes:
  - Sorting by press/release order
  - Removing over-inputs caused by a missing Enter key press
  - Removing header rows
Original CSV header structure: [countnum, Name, State, time]
| Column | Description |
|---|---|
| countnum | Input count number |
| Name | Pressed key |
| State | Press or release |
| time | Timestamp (elapsed time since 2001/01/01) |
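Because the header rows were removed, the column names have to be supplied when reading a file. The following is a minimal sketch of loading the four columns and deriving key hold times (release time minus press time). Several details are assumptions rather than facts about the dataset: the `State` labels `"press"` and `"release"`, the integer form of `countnum`, the key labels in `Name`, and the unit of `time`. The rows in `SAMPLE` are synthetic.

```python
import csv
import io
from collections import defaultdict

COLUMNS = ["countnum", "Name", "State", "time"]

# Synthetic rows for illustration only; real files may label keys and
# states differently, and the time unit is not stated in this README.
SAMPLE = """\
1,6,press,100.000
1,6,release,100.080
1,3,press,100.250
1,3,release,100.340
1,Enter,press,100.600
1,Enter,release,100.670
"""

def read_events(handle):
    """Parse headerless CSV rows into dicts keyed by the documented columns."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS)
    return [
        {
            "countnum": int(row["countnum"]),
            "Name": row["Name"],
            "State": row["State"].strip().lower(),
            "time": float(row["time"]),
        }
        for row in reader
    ]

def hold_times(events, press="press", release="release"):
    """Pair each press with the next release of the same key within one input.

    Returns {countnum: [(key, hold_duration), ...]} in release order.
    """
    pending = {}
    result = defaultdict(list)
    for ev in events:
        key = (ev["countnum"], ev["Name"])
        if ev["State"] == press:
            pending[key] = ev["time"]
        elif ev["State"] == release and key in pending:
            result[ev["countnum"]].append((ev["Name"], ev["time"] - pending.pop(key)))
    return dict(result)

events = read_events(io.StringIO(SAMPLE))
holds = hold_times(events)
```

To read an actual file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")`, where `path` is whichever CSV you want to inspect. Flight times between consecutive keys can be derived the same way from adjacent press timestamps.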
- Do not use the Backspace key when a mistake occurs.
- If an input mistake is made, press Enter and continue.
- After the 75th input, a warning message appears; from the 76th input onward, entries are made every 5 seconds.
| PIN | Excluded Participants | Remarks |
|---|---|---|
| 669270 | 11 | Data from participant #11 was excluded |
| 631215 | 11, 24 | Data from participants #11 and #24 were excluded |
| 960148 | 11 | Data from participant #11 was excluded |
| 797362 (NumPad) | 11 | Data from participant #11 was excluded |
| 495471 (NumPad) | 11 | Data from participant #11 was excluded |
| 659114 (NumPad) | None | No issues |
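For scripted analysis, the exclusions in the table can be transcribed into a lookup. This is a convenience sketch: the names `EXCLUDED`, `is_included`, and `pins_for` are hypothetical, and the `"row"` device label for the PINs not marked "(NumPad)" is an assumption based on the dataset covering both numeric row and numeric keypad input.

```python
# (PIN, device) -> set of excluded participant numbers, copied from the table.
EXCLUDED = {
    ("669270", "row"): {11},
    ("631215", "row"): {11, 24},
    ("960148", "row"): {11},
    ("797362", "numpad"): {11},
    ("495471", "numpad"): {11},
    ("659114", "numpad"): set(),
}

def is_included(pin, device, participant):
    """Return True if this participant's data for the given PIN was kept."""
    return participant not in EXCLUDED[(pin, device)]

def pins_for(participant):
    """List the PINs for which this participant's data was kept."""
    return sorted(
        pin for (pin, _device), excluded in EXCLUDED.items()
        if participant not in excluded
    )
```

For example, `pins_for(11)` yields only `"659114"`, the one PIN for which participant #11 was not excluded.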
This dataset was collected in accordance with the research ethics regulations of the author’s institution and the Cybersecurity Research Ethics Checklist. Participants were fully informed of the research purpose and procedure before participation, and provided consent for data collection and public release. All data were anonymized to protect participant privacy, and no personal or identifiable information was collected. Only keystroke logs of PIN inputs are included in this dataset.
@inproceedings{yamaguchi_css2025,
author = "Shuji Yamaguchi and Win Myat Mon Khin and Hidehito Gomi and Kotaro Kominami and Tetsutaro Uehara",
title = "Toward a Risk-based Multimodal FIDO Authenticator: Proposal and Empirical Evaluation of Keystroke Dynamics-Based Authentication",
booktitle = "Proc. of Computer Security Symposium",
year = "2025",
}