| Title: | Data Sets for Craig Starbuck's Book, "The Fundamentals of People Analytics: With Applications in R" |
|---|---|
| Description: | Data sets associated with modeling examples in Craig Starbuck's book, "The Fundamentals of People Analytics: With Applications in R". |
| Authors: | Craig Starbuck [aut, cre] |
| Maintainer: | Craig Starbuck <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-22 08:01:00 UTC |
| Source: | https://github.com/cran/peopleanalytics |
Fictitious benefits data for employees in a mid-size company
data("benefits")
A data frame with 1471 observations on the following 3 variables.
employee_id: Unique identifier for each employee
stock_opt_lvl: Stock option level
trainings: Number of trainings completed within the past year
data(benefits)
Fictitious demographics data for employees in a mid-size company
data("demographics")
A data frame with 1470 observations on the following 7 variables.
employee_id: Unique identifier for each employee
age: Employee age in years
commute_dist: Commute distance in miles
ed_lvl: Education level, where 1 = 'High School', 2 = 'Associate Degree', 3 = 'Bachelor's Degree', 4 = 'Master's Degree', and 5 = 'Doctoral Degree'
ed_field: Education field associated with most recent degree
gender: Gender self-identification
marital_sts: Marital status
data(demographics)
Fictitious data on employees in a mid-size company
data("employees")
A data frame with 1470 observations on the following 36 variables.
employee_id: Unique identifier for each employee
active: Flag set to 'Yes' for active employees and 'No' for inactive employees
stock_opt_lvl: Stock option level
trainings: Number of trainings completed within the past year
age: Employee age in years
commute_dist: Commute distance in miles
ed_lvl: Education level, where 1 = 'High School', 2 = 'Associate Degree', 3 = 'Bachelor's Degree', 4 = 'Master's Degree', and 5 = 'Doctoral Degree'
ed_field: Education field associated with most recent degree
gender: Gender self-identification
marital_sts: Marital status
dept: Department of which an employee is a member
engagement: Employee engagement score measured on a 4-point Likert scale, where 1 = 'Highly Disengaged' and 4 = 'Highly Engaged'
job_lvl: Job level, where 1 = 'Junior' and 5 = 'Senior'
job_title: Job title
overtime: Flag set to 'Yes' if the employee is nonexempt and works overtime and 'No' if the employee does not work overtime
business_travel: Business travel frequency
hourly_rate: Hourly rate, calculated for all employees whether hourly or salaried
daily_comp: Hourly rate * 8
monthly_comp: Hourly rate * 2080 / 12
annual_comp: Hourly rate * 2080
ytd_leads: Year-to-date (YTD) number of leads generated for employees in Sales Executive and Sales Representative positions
ytd_sales: Year-to-date (YTD) sales measured in USD for employees in Sales Executive and Sales Representative positions
standard_hrs: Expected working hours over a two-week payroll cycle
salary_hike_pct: The percent increase in salary for the employee's most recent compensation adjustment (whether due to a standard merit increase, off-cycle adjustment, or promotion)
perf_rating: Most recent performance rating, where 1 = 'Needs Improvement', 2 = 'Core Contributor', 3 = 'Noteworthy', and 4 = 'Exceptional'
prior_emplr_cnt: Number of prior employers
env_sat: Environment satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
job_sat: Job satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
rel_sat: Colleague relationship satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
wl_balance: Work-life balance score measured on a 4-point Likert scale, where 1 = 'Poor Balance' and 4 = 'Excellent Balance'
work_exp: Total years of work experience
org_tenure: Years at current company
job_tenure: Years in current job
last_promo: Years since last promotion
mgr_tenure: Years under current manager
interview_rating: Average rating across the interview loop for the onsite stage of the employee's recruiting process, where 1 = 'Definitely Not' and 5 = 'Definitely Yes'
data(employees)
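Several of the smaller data sets in this package, such as benefits and demographics, carry an employee_id column alongside variables that also appear in employees. The sketch below shows one way such tables could be combined on that key. It uses toy data frames with made-up values rather than the packaged data, and it assumes that employee_id values line up across tables, which the documentation suggests but does not state.

```r
# Toy stand-ins for two component tables (hypothetical values,
# not taken from the packaged data sets)
toy_benefits <- data.frame(
  employee_id = c(1001, 1002, 1003),
  trainings = c(3, 0, 2)
)
toy_demographics <- data.frame(
  employee_id = c(1003, 1001, 1002),
  age = c(29, 41, 35)
)

# Join on the shared key; the row order of the inputs does not matter
toy_combined <- merge(toy_benefits, toy_demographics, by = "employee_id")
print(toy_combined)
```

By default merge() keeps only keys present in both inputs; passing all = TRUE would keep unmatched rows and fill the missing columns with NA.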
Fictitious job data for employees in a mid-size company
data("job")
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
dept: Department of which an employee is a member
job_lvl: Job level, where 1 = 'Junior' and 5 = 'Senior'
job_title: Job title
overtime: Flag set to 'Yes' if the employee is nonexempt and works overtime and 'No' if the employee does not work overtime
business_travel: Business travel frequency
data(job)
Fictitious payroll data for employees in a mid-size company
data("payroll")
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
hourly_rate: Hourly rate, calculated for all employees whether hourly or salaried
daily_comp: Hourly rate * 8
monthly_comp: Hourly rate * 2080 / 12
annual_comp: Hourly rate * 2080
standard_hrs: Expected working hours over a two-week payroll cycle
data(payroll)
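The three compensation variables are all simple multiples of hourly_rate. As a worked illustration, the sketch below applies the documented formulas to a few made-up hourly rates. The values are hypothetical, and the reading of 2080 as 40 hours per week over 52 weeks is an assumption, since the documentation gives only the constant.

```r
# Toy hourly rates (hypothetical values, not taken from the packaged data)
hourly_rate <- c(25, 40, 61.5)

# Daily compensation: 8 hours per working day
daily_comp <- hourly_rate * 8

# Annual compensation: 2080 hours per year
# (assumed here to mean 40 hours per week * 52 weeks)
annual_comp <- hourly_rate * 2080

# Monthly compensation: the annual amount spread evenly over 12 months
monthly_comp <- hourly_rate * 2080 / 12

toy_payroll <- data.frame(hourly_rate, daily_comp, monthly_comp, annual_comp)
print(toy_payroll)
```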
Fictitious performance data for employees in a mid-size company
data("performance")
A data frame with 1470 observations on the following 3 variables.
employee_id: Unique identifier for each employee
salary_hike_pct: The percent increase in salary for the employee's most recent compensation adjustment (whether due to a standard merit increase, off-cycle adjustment, or promotion)
perf_rating: Most recent performance rating, where 1 = 'Needs Improvement', 2 = 'Core Contributor', 3 = 'Noteworthy', and 4 = 'Exceptional'
data(performance)
Fictitious prior employment data for employees in a mid-size company
data("prior_employment")
A data frame with 1470 observations on the following 2 variables.
employee_id: Unique identifier for each employee
prior_emplr_cnt: Number of prior employers
data(prior_employment)
Fictitious sentiment data for employees in a mid-size company
data("sentiment")
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
env_sat: Environment satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
engagement: Employee engagement score measured on a 4-point Likert scale, where 1 = 'Highly Disengaged' and 4 = 'Highly Engaged'
job_sat: Job satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
rel_sat: Colleague relationship satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
wl_balance: Work-life balance score measured on a 4-point Likert scale, where 1 = 'Poor Balance' and 4 = 'Excellent Balance'
data(sentiment)
Fictitious data on the active status of employees in a mid-size company
data("status")
A data frame with 1470 observations on the following 2 variables.
employee_id: Unique identifier for each employee
active: Flag set to 'Yes' for active employees and 'No' for inactive employees
data(status)
Fictitious survey responses for anonymized employees in a mid-size company
data("survey_responses")
A data frame with 400 observations on the following 12 variables.
belong: Belonging score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
effort: Discretionary Effort score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
incl: Inclusion score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
eng_1: Engagement score on item 1 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Disengaged' and 5 = 'Highly Engaged'
eng_2: Engagement score on item 2 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Disengaged' and 5 = 'Highly Engaged'
eng_3: Engagement score on item 3 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Disengaged' and 5 = 'Highly Engaged'
happ: Happiness score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
psafety: Psychological Safety score measured on a 7-point Likert scale, where 1 = 'Highly Unfavorable' and 7 = 'Highly Favorable'
ret_1: Retention score on item 1 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ret_2: Retention score on item 2 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ret_3: Retention score on item 3 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ldrshp: Senior Leadership score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
data(survey_responses)
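Engagement and retention are each measured with three items (eng_1 to eng_3 and ret_1 to ret_3). One common way to work with multi-item scales is to average the items into a single composite per respondent. The sketch below does this for the engagement items using toy responses; the values and the eng_mean column name are hypothetical, and the book's own examples may combine the items differently.

```r
# Toy responses on the three engagement items (hypothetical values,
# each on the documented 1 to 5 scale)
toy_survey <- data.frame(
  eng_1 = c(4, 2, 5),
  eng_2 = c(5, 3, 4),
  eng_3 = c(3, 1, 5)
)

# Average the three items row by row to get one score per respondent
eng_items <- c("eng_1", "eng_2", "eng_3")
toy_survey$eng_mean <- rowMeans(toy_survey[, eng_items])

print(toy_survey)
```

Because psafety is measured on a 7-point scale while the other items use 5 points, it would need rescaling before being averaged with them.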
Fictitious tenure data for employees in a mid-size company
data("tenure")
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
work_exp: Total years of work experience
org_tenure: Years at current company
job_tenure: Years in current job
last_promo: Years since last promotion
mgr_tenure: Years under current manager
data(tenure)
Fictitious monthly employee turnover rates by several dimensions
data("turnover_trends")
A data frame with 3000 observations on the following 6 variables.
year: Integer representing the year, which ranges from 1 (earliest) to 5 (most recent)
month: Integer representing the month, which ranges from 1 (January) to 12 (December)
job: Job title
level: Job level, where 1 = 'Junior' and 5 = 'Senior'
remote: Flag set to 'Yes' for a remote worker and 'No' for a non-remote worker
turnover_rate: Monthly turnover rate, calculated by dividing the termination count by the average headcount ((beginning headcount + ending headcount) / 2) for the respective month
data(turnover_trends)
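As a worked illustration of the turnover_rate definition, the sketch below computes a monthly rate from made-up counts. The headcount and termination figures are hypothetical, and whether the packaged column stores the rate as a proportion or a percentage is not stated in the documentation.

```r
# Hypothetical monthly figures for a single job/level/remote group
terminations <- 3
begin_headcount <- 120
end_headcount <- 114

# Average headcount: sum the two headcounts first, then halve
avg_headcount <- (begin_headcount + end_headcount) / 2

# Turnover rate as a proportion of average headcount
turnover_rate <- terminations / avg_headcount
print(round(turnover_rate, 4))
```

The parentheses around the sum matter: without them only the ending headcount would be halved, which would inflate the denominator.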