Business Intelligence
Analytics Horsepower
For this paper, the database designed from the previous study using MySQL Workbench would be used to describe and reflect upon the work in a paper that describes the process for developing insights from the data.
This approach will be the essence of creating analytics horsepower from data. Analytical horsepower requires a researcher to have the expertise to make the right decision on what data to use, how to use that data and draw insights from the analysis for guided decision-making within an organization (Chen, 2021). Analytical horsepower also requires the researcher to know how to work with unstructured data and understand the distinction between having the right balance of human lead and system lead. It also includes being able to determine the best answer to address a business problem, for instance, recognizing whether descriptive analytics or prescriptive analytics needs to be used or both.
Problem Statement: Employee Retention
Regardless of the industry an organization is in, they would have experienced some attrition of employees which tends to impact the retention level of skilled workers. Attrition describes the number of employees who leave an organization within a specified time frame either voluntarily or involuntarily. Employee retention refers to an organization’s capacity to retain its staff. A review on employee retention by Das et al (2013) states that although most organizations are technology driven in recent times, human resources are still required to run these technologies.
This makes the employees the backbone of these organizations. It is apparent that a company’s ability to secure and retain these skilled workers plays a significant role within the corporation. And for them to stay competitive economically, these businesses need to understand that employees’ knowledge and skills are critical (Das, 2013). Maintaining employees in the workforce has become a challenge for many organizational leaders. Aside from keeping employees happy, companies must find ways to retain their talent to ensure they get to keep the best staff as long as possible.
Although employee turnover is part of any business, an organization’s ability to reduce attrition will cut down on the hassle, boost morale, and maintain a good customer experience while saving them time and money (Guerranti & Dimitri, 2023). This is because when employees leave a company, employers not only lose valuable talent but spend more money to hire, train, and replace the lost workers. Higher-performing workers usually know the processes of the department and can achieve results with the best utilization of available resources.
When these higher-performing team members leave the company, only non-performing employees end up being retained. This affects the quality of work since the work culture becomes different after the expertise of the employee who left had acquired over the years is lost. There are several reasons why employee retention is important. The first benefit is the increase in employee loyalty to the goals of the organization. Having workforces that have been kept around longer advances loyalty thus increasing productivity and ethical work.
The second benefit to employee retention is saving the company money (Das, 2013). It costs a lot of money (Guerranti & Dimitri, 2023) for these employers to find new team members through recruiters, train them and offer them sign-on bonuses as well. Thirdly, skilled employees continue to build skill sets and experience levels within the organization. This offers management the benefit of promoting dexterous personnel into new roles rather than the corporation starting over with a newly recruited staff.
There are numerous ways attrition occurs in companies that impact the retention of workers (Guerranti & Dimitri, 2023). Sometimes it is involuntary where an employee does not get to decide to leave but is let go due to position elimination, termination, or layoffs. Other times, voluntary attrition occurs due to a new job offered to the employee by a competitor with a better work environment, or due to family reasons. Also when a worker reaches their retirement stage, attrition via retirement happens. In addition to the above, some other factors could influence the retention rate of employees.
These include the culture within the organization, ongoing training, workplace environment, mission and values alignment, and flexibility in schedule for working families. The higher an organization’s retention rate, the lower its employee attrition rate would be. The research problem to be addressed in this study is the rise of employee attrition in the workplace. According to a study carried out by Madsen et al (2023), the rate of attrition from male-dominant workplaces seems higher among women than among men (Madsen et al., 2023).
Most workers including teachers are believed to be subjected to high amounts of stress due to issues in their work environment and sometimes also experience burnout (Farmer, 2020). Although teacher attrition was observed to be lower than in other occupations, it has been postulated that their retention rate might fall if dramatic improvement in the organizations, management, and funding of public schools is not observed (Boe et al., 2008). In the corporate workplace, the most common job experience levels are entry-level, intermediate/senior staff, first-level management, middle management, and executive/senior management which may have different levels of attrition rates.
The following is a list of research questions and hypotheses that will be used to probe the dataset within the designed database.
-
Which category within the job experience levels has a higher attrition rate?
Null Hypothesis (Ho): All job experience levels have the same attrition rate
Alternative Hypothesis (Ha): Entry-level and senior staff have a higher attrition rate among the 5 job levels
-
Does the working environment impact the attrition rate of employees
Null Hypothesis (Ho): There is no correlation between the working environment and the attrition rate
Alternative Hypothesis (Ha): There is a correlation between the working environment and the attrition rate
-
What is the impact of employee tenure on their monthly salary
Null Hypothesis (Ho): The number of years of work makes no difference in an employee’s monthly income
Alternative Hypothesis (Ha): The number of years of work makes a difference in an employee’s monthly income
Statement of Significance
The study about employee retention is very relevant because a better understanding of how to improve it can save an organization from losing productivity (Das, 2013). When companies experience higher retention rates and lower attrition, they can get more work done. This is because more engaged workers are employed, who in turn get more work done to improve their customer relationships. When more employees leave an organization, it can dampen employee morale, and result in lost connections, including heavier workloads and responsibilities being passed onto the remaining employees. These experiences can impact a department negatively due to the contagious nature of the turnover.
From this study, senior leaders will appreciate the need to strengthen their employee retention programs. This will go a long way to lift the morale of employees and enhance connectedness and engagement as contagious positive emotions in the workplace are evoked. Corporate culture is usually formed from the conduct of the people, their perceptions, and preferences (Wang et al., 2016). This culture greatly influences recruiting and retaining the right people within the organization. From this study, senior leaders and departmental heads would be reminded that when a person leaves a team, others begin to wonder what the reason was and begin to examine their own loyalty towards the department.
But in instances where engaged employees are aligned with the company culture continue to stay, this experience goes a long way to strengthen the organizational philosophy and environment. Ultimately, a positive employee experience increases the level of connection and dedication they have towards their role and organization. There are several factors that impact experiences workers have that may be outside human resources control (Delen, 2020). Through this research, organizational leaders will be directed to focus on what the employees need to be most efficient at work.
And as they start to implement these and keep a lot of their best talents within the team, it improves the employee experience. This will control the retention rate immensely within the corporation. Furthermore, revenue gains come from reduced hiring costs, and increased productivity, in addition to improved customer service experience. From this study, it will be emphasized that when revenue increases are traced from retention policies, it could be a very significant human resource metric that could demonstrate the return on investment on some of these proposed initiatives.
Database Design Process
Designing a database systematically supports an effective development process which is very important to thriving database functionality and performance. This process entails developing and refining a database structure based on the requirements of the business (Cross et al., 2009). The three stages of database design are Conceptual database design, Logical database design, and Physical database design. In the Conceptual design, the data requirements for the business are defined by answering questions like the following. What is the type of information the business currently uses, needs, or wants from the system? It also addresses the kind of report that needs to be generated for insights that would be drawn from the data.
The Logical design also helps further define and assess the business information requirement and the relationships amongst them. As part of the steps in the Logical design, the tables will be defined, relationships between the tables will be determined, and columns will be populated before normalizing the tables (Hingorani et al., 2017). The primary keys will be determined as well as the values for each column. The final stage which is the Physical design phase is a refinement of the logical design. It is supposed to map the logical design to a relational database management system. This step examines how users will access the database to determine this information – which data would be used commonly, what columns require indexes for data access, and whether denormalization of the database will improve performance.
Creating a database in MySQL
An attrition schema was created for this research database design in MySQL database. This will enable the capture of qualitative and quantitative data on employees. The tables that will be created will be the education table, employee history table, job table, salary table, about employee table, records table, and another information table.
SQL Code Used in MySQL Workbench: Creating Education Table
DROP TABLE IF EXISTS education_table
;
CREATE TABLE education_table
(
employee_no
int NOT NULL,
education
int DEFAULT NULL,
education_field
varchar(45) DEFAULT NULL,
envronment_satisfaction
int DEFAULT NULL,
PRIMARY KEY (employee_no
)
)
INSERT INTO education_table
VALUES (1,2,‘Life Sciences’,2),(2,1,‘Life Sciences’,3),(4,2,‘Other’,4),(5,4,‘Life Sciences’,4),(7,1,‘Medical’,1),(8,2,‘Life Sciences’,4),(10,3,‘Medical’,3),(11,1,‘Life Sciences’,4),(12,3,‘Life Sciences’,4),(13,3,‘Medical’,3),(14,3,‘Medical’,1),(15,2,‘Life Sciences’,4),(16,1,‘Life Sciences’,1),(18,2,‘Medical’,2),(19,3,‘Life Sciences’,3),(20,4,‘Life Sciences’,2),(21,2,‘Life Sciences’,1),(22,2,‘Medical’,4),(23,4,‘Life Sciences’,1),(24,3,‘Life Sciences’,4),(26,2,‘Other’,1),(27,4,‘Life Sciences’,3),(28,4,‘Life Sciences’,1),(30,2,‘Life Sciences’,3),(31,1,‘Medical’,2),(32,3,‘Other’,3),(33,1,‘Life Sciences’,2),(35,4,‘Marketing’,3),(36,4,‘Medical’,1),(38,4,‘Marketing’,2),(39,3,‘Medical’,3),(40,4,‘Other’,4),(41,2,‘Medical’,4),(42,3,‘Technical Degree’,4),(45,3,‘Medical’,2),(46,2,‘Medical’,4),(47,2,‘Marketing’,1),(49,3,‘Marketing’,4),(51,4,‘Life Sciences’,2),(52,3,‘Life Sciences’,3),(53,2,‘Other’,3),(54,4,‘Life Sciences’,4),(55,3,‘Life Sciences’,1),(56,3,‘Life Sciences’,4),(57,2,‘Medical’,3),(58,3,‘Technical Degree’,2),(60,4,‘Marketing’,2),(61,2,‘Life Sciences’,2),(62,4,‘Marketing’,1),(63,1,‘Life Sciences’,4),(64,2,‘Life Sciences’,1),(65,4,‘Technical Degree’,3),(68,5,‘Marketing’,2),(70,2,‘Medical’,3),(72,3,‘Marketing’,3),(73,2,‘Life Sciences’,1),(74,5,‘Life Sciences’,2),(75,4,‘Medical’,3),(76,4,‘Life Sciences’,4),(77,4,‘Life Sciences’,1),(78,3,‘Medical’,1),(79,5,‘Life Sciences’,4),(80,2,‘Medical’,2),(81,3,‘Life Sciences’,1),(83,3,‘Technical Degree’,3),(84,3,‘Medical’,4),(85,3,‘Life Sciences’,2),(86,3,‘Life Sciences’,2),(88,3,‘Medical’,2),(90,3,‘Medical’,4),(91,1,‘Life Sciences’,1),(94,3,‘Life Sciences’,3),(95,4,‘Medical’,3),(96,3,‘Life Sciences’,2),(97,3,‘Life Sciences’,2),(98,4,‘Life Sciences’,3),(100,4,‘Marketing’,3),(101,4,‘Other’,4),(102,4,‘Medical’,1),(103,2,‘Medical’,2),(104,1,‘Life Sciences’,4),(105,3,‘Medical’,2),(106,2,‘Life Sciences’,1),(107,3,‘Medical’,2),(110,2,‘Medical’,1),(112,3,‘Life Sciences’,4),(113,1,‘Technical Degree’,3),(116,4,‘Life Sciences’,4),(117,3,‘Life Sciences’,3),(118,2,‘Medical’,3),(119,4,‘Life Sciences’,3),(120,4,‘Marketing’,3),(121,2,‘Medical’,3),(124,3,‘Medical’,3),(125,4,‘Medical’,2),(126,4,‘Technical Degree’,1),(128,2,‘Other’,1),(129,3,‘Medical’,2),(131,4,‘Medical’,4),(132,3,‘Medical’,2),(133,4,‘Human Resources’,3);
With the attributes in the attrition database, the attrition rate of the organization could be measured and the factors that contributed to it could be analyzed to draw insight for management decisions. The attrition database will be created in MySQL Workbench which is a visual database tool for designing, developing, and administering MySQL databases. The process will start by launching MySQL Workbench and opening the Setup New Connection wizard.
The name for the connection and username will be entered to connect to the localhost and root. The required connection in SQL will be clicked on in the Workbench to start the page. When the MySQL Workbench window is opened, a new schema will be created in the connected server. To populate the schema with tables, the Table entry will be right-clicked to select Create Table. This opens a dashboard that allowed the name of the entity to be entered along with the column names.
The datatypes for each column which range from integer, varchar, timestamp, and date will be selected before saving. This step will be repeated for all the tables that will be created. The next step after creating the tables will be to add data. This step requires the table to be populated to be right-clicked in the SCHEMA pane to allow the modification of the table. Selected rows will have data entered into them to complete the database creation.
Insights Planned to Be Achieved By The Design
The design would be used to aid in carrying out future analysis to gather insights into determining which categories within the job levels of an organization tend to experience a higher attrition rate. With a bar chart distribution by job level, this estimation of the attrition rate based on job level could be achieved. A correlation matrix plot will bring to light how the working environments are impacting the attrition rate of employees. With a clearer understanding of employees who are satisfied with their working environment score and their retention level within the company, senior leaders will be able to explore better ways to create the most conducive environment that could boost the retention rate of their employees.
From this design, a comparison between the monthly salaries of employees and the number of years spent working within the company could be established. Based on the trend achieved from the analysis that would be carried out, the morale of employees could be boosted once they are made aware of the progression in a salary increase for staff members who remain with the company for a longer duration. In instances where the trend was not as impressive as most employees might have hoped, senior leaders will be able to use the insights to make modifications to the incentives awarded to employees. These modifications might result in an increase in the retention rate within the organization.
Another analysis that could be carried out with the design is estimating the percentage of employees by age within the organization and using that to determine the attrition rate of the various age categories. From this insight, management and human resources would have a better segmentation of the employee demography and get to target the age group of employees who are most likely to leave the organization. Thereby designing better incentive packages that would be most attractive to ensure the retention of this group of workforce. In addition, this design would also help in discovering whether the marital status of employees impacts their attrition rate at all.
Factors for Referential Integrity
Data integrity ensures the precision, completeness, reliability, and authenticity of a company’s dataset (Sivathanu et al., 2005). This is even more necessary for organizations as volumes of data increase exponentially. There are different types of data integrity which are Physical integrity, and Logical integrity which is further subdivided into Entity integrity, Referential integrity, Domain integrity, and User-defined integrity. Physical integrity ensures the precision, correctness, and wholeness of the data are guarded when it is collected and retrieved. This prevents compromises that might arise from issues such as natural disasters, power outages, or storage erosion.
With Logical integrity, the data is protected from any changes that might occur while being used in different ways through relational databases. Entity integrity is the first format of Logical integrity. It structures relation systems that store data within a table to be used and linked in various ways. The primary keys together with unique values that are created help in the identification of data. This step avoids multiple entries of data and a table without a null. The second format of Logical integrity is Referential integrity which is a series of processes that ensure data remains stored and utilized in a uniform way.
This format allows only appropriate data deletion, changes, or amendments to prevent duplication and ensure data accuracy. Domain integrity is the third format of Logical integrity. It is a series of steps that ensure data accuracy within a domain that is classified by a set of values. These values of the table columns are permitted to contain constraints and measures that limit the amount, format, and type of data that can be entered. Finally, the User-defined integrity being the last Logical integrity requires rules and constraints to be created around data by using alignment with their specific requirements.
Future Research
Granting new and old staff training and development programs is a great investment any organization could offer (Das, 2013). Although most new hires receive onboarding training, old employees also need to expand their knowledge base to stay well-informed. If an employee already working in a specific department is offered training programs, it enhances motivation and increases the productivity of the workforce. The new skills that they acquire from the training make them much more efficient and make the organization competitive. Whilst some companies design and develop this training in-house using Subject Mater Experts (SMEs), others outsource the training and development deployment to external vendors.
For future research, when an additional table on Learning and Development records of employees is included in the existing schema, the database could be used to research how the training of employees impacts their retention rate. The new additional table name is “L&D records table” with column names such as employee number, training modality, training duration, and employee role. With such data, most corporate organizations will be able to track the kind of training that is best embraced by staff. According to Stewart (2014), learning in a professional community is considered to be more effective than traditional professional development methods (Stewart, 2014).
Typically whether it is Upskilling, Reskilling, or Certification programs in the Learning and Development program, management and departmental heads will be able to use this study to analyze the most impactful approaches for their team members. Thereby offering them a chance to close the skill gap that exists due to technological advancements. Data on the delivery format like self-paced training records, live webinars, or in-person training could be used to predict which modality of the delivery arrangement results in higher completion rates of the courses employees are enrolled in.
Recently, many organizations are moving away from live in-person training to online training due to the high-cost implication of live in-person training per employee. But with an online version, more employees are able to enroll in the program at a minimal cost. To ensure interactivity, some online training programs have been re-designed to be self-paced with a combination of occasional instructor touchpoints to address any concerns learners might have. With this database, further study into the consistency and quality nature of these training modalities could be carried out.
Assumptions and limitations in the database
It was assumed that the database records will help in resolving all of the impacts in the database design. From this study, it was also assumed the entity relationship diagram of the attrition database shows a one-to-one relationship (1:1), one-to-many relationship (1:M), and many-to-many relationship (N:M). In a one-to-one relationship (1:1) within a database design, there will be two tables and the first table could be associated with zero or one instance of the second table, and the second table is also associated with zero, or one instance of the first table. It was assumed the employee and the employee history table in the Entity relationship diagram in Figure 1 shows a 1: 1 relationship between them.
In a one-to-many relationship (1:M) as captured in Figure 1, it was again assumed the job_table is associated with zero, one, or many instances of the employee _table. And the employee_table on the other hand is associated with zero, or one instance of job_table. For the many-to-many relationship in Figure 1, both the about_employee table and education table are associated with zero, one, or many instances of each other. This type of relationship, although very common in the business world, is not permitted in a relational database and is usually converted into one-to-many relationships. Hence the need to insert a common entity which in this diagram was the employee table in between the about_employee table and the education table.
Author: Adwoa Osei-Yeboah
References
Boe, E. E., Cook, L. H., & Sunderland, R. J. (2008). Teacher Turnover: Examining Exit Attrition, Teaching Area Transfer, and School Migration. Council for Exceptional Children, 75(1) https://journals.sagepub.com/doi/epdf/10.1177/001440290807500101
Chen, H. (2021). HorsePower: An Array-based Optimization Framework for Query Processing and Data Analytics (Ph.D. ed.). McGill University (Canada).
Cross, S. S., Palmer, I. R., & Stephenson, T. J. (2009). How to design and use a research database. Diagnostic Histopathology (Oxford, England : 2008), 15(10), 490-495. 10.1016/j.mpdhp.2009.07.003
Das, B. L. (2013). Employee Retention: A Review of Literature. IOSR Journal of Business and Management, 14(2), 8-16. 10.9790/487X-1420816
Delen, D. (2020). Predictive Analytics: Data Mining, Machine Learning and Data Science for Practitioners (2nd ed. ed.). Pearson FT Press.
Farmer, D. (2020). Teacher Attrition: The Impacts of Stress. The Delta Kappa Gamma Bulletin: International Journal for Professional Educators, 87(1), 41-50. https://go.openathens.net/redirector/ncu.edu?url=https://www.proquest.com/scholarly-journals/teacher-attrition-impacts-stress/docview/2457214546/se-2
Guerranti, F., & Dimitri, G. M. (2023). A Comparison of Machine Learning Approaches for Predicting Employee Attrition. Applied Sciences, 13(1), 267. 10.3390/app13010267
Hingorani, K., Gittens, D., & Edwards, N. (2017). REINFORCING DATABASE CONCEPTS BY USING ENTITY RELATIONSHIPS DIAGRAMS (ERD) AND NORMALIZATION TOGETHER FOR DESIGNING ROBUST DATABASES. Issues in Information Systems, 18(1), 148. 10.48009/1_iis_2017_148-155
Madsen, A. Å, Brekke, I., & Fekjær, S. B. (2023). Women’s Attrition from Male-Dominated Workplaces in Norway: The Importance of Numerical Minority Status, Motherhood and Class. Work, Employment and Society, 37(2), 333-351. 10.1177/09500170211004247
Sivathanu, G., Wright, C. P., & Zadok, E. (2005). Ensuring Data Integrity in Storage: Techniques and Applications. Paper presented at the 2005 ACM Workshop on Storage Security and Survivability, pp. 26-36.
Stewart, C. (2014). Transforming Professional Development to Professional Learning. Journal of Adult Education, 43(Number 1), 142-165. 10.4324/9781315235400-8
Wang, G., Gunasekaran, A., Ngai, E. W. T., & Papadopoulos, T. (2016). Big data analytics in logistics and supply chain management: Certain investigations for research and applications. International Journal of Production Economics, 176, 98-110. 10.1016/j.ijpe.2016.03.014