Student Identification Based on Patterns of Association For Student Misbehaviour Using Frequent Pattern Growth Algorithms

ABSTRACT


Introduction
Artificial intelligence (AI) is a field of computer science that aims to create systems capable of mimicking and surpassing human cognitive abilities. Using advanced algorithms and computing power, artificial intelligence can learn, analyse and make decisions based on the data provided. The purpose of developing artificial intelligence is to increase the efficiency, productivity and adaptability of systems in various fields so that they can assist humans in decision-making [1][2][3][4]. Data mining is a process that uses statistical, mathematical, artificial intelligence and machine learning techniques to extract and identify useful information from large databases [4][5][6][7].
Juvenile delinquency and crime cannot be separated from the socio-cultural conditions of the era of globalization. Each period is distinctive and presents its own challenges, especially for the younger generation, and young people respond in distinctive ways to the social and cultural stimuli around them [8][9].
Adolescence is a time of psychological upheaval, a transitional period or a wavering bridge that connects dependent childhood with mature, independent adulthood. At this stage teenagers begin to feel responsible for themselves: they are able to account for their actions and to accept the philosophy of life of the society in which they live, in keeping with the emotional changes they have gone through. It is undeniable that teenagers at this stage already have the potential to be religious [10].
Student delinquency is behavior that often causes unrest in the community, school, and family. It refers to behavior that deviates from or violates established norms; in legal terms, delinquency is a violation of the law that cannot be prosecuted under criminal law because of the offender's age [3][11].
Islamic counseling not only helps students overcome learning-related problems but also addresses their religious life, since religion plays a very important role in guiding people away from bad deeds and toward a useful life, both in this world and in the hereafter [11][12].
In order to address the privacy concerns in data mining, the subfield of data mining called privacy preserving data mining (PPDM) has developed greatly in recent years. The purpose of PPDM is to protect sensitive information from unwanted or unauthorized disclosure and to preserve the utility of the data. The considerations for PPDM are twofold. First, sensitive raw data, such as individual ID card numbers and mobile phone numbers, should not be used directly for mining. Second, sensitive mining products, the disclosure of which would lead to an invasion of privacy, should be excluded [13][14] [5].
Privacy is also a concern in distributed kernel-based data mining (DKBDM), such as distributed support vector machines. Among the known data breach problems, those related to insider attacks have increased significantly, making them one of the fastest growing types of security breach. Once considered a minor issue, insider attacks have risen to become one of the top three types of data breach. Research on insiders in distributed kernel-based data mining is limited, which leaves significant vulnerabilities when designing protections for cooperating organisations. Previous work often falls short because it relies on multifactor models that are rather limited in scope and implementation [15][16][17][18].
Motion capture is an important technique with a wide range of applications in fields such as computer vision, computer animation, film production and medical rehabilitation. Even with a professional motion capture system, the raw data obtained inevitably contains noise and outliers. Many methods have been developed to denoise the data, although the problem remains challenging because of the high complexity of human movement and the diversity of real-world situations. Data-driven robust human motion denoising is an approach that identifies spatio-temporal patterns and structural sparsity in motion data [19].
Empirical data analysis methods based on random matrix theory (RMT) and time series analysis have been proposed for energy systems. In the context of Big Data in energy systems, there is a great need for new mathematical tools to describe and analyse the data. The results showed that empirical power system data modelled by RMT as time series are highly sensitive to dynamically characterised system states and allow more efficient system analysis than conventional equation-based methods [20][21][22].
Extracting frequent route patterns from personal trajectory data is the basis for location awareness and location-based services. However, most approaches can find only short, incomplete route patterns. A new approach has been proposed that finds frequent route patterns based on abstractions of trajectories. First, path partitioning, location extraction, data simplification and general segment detection are used to abstract the trajectory data, converting the trajectories into Common Temporal Segment (STS) sequences and generating the frequent 1-itemsets. A pattern search algorithm is then proposed based on the temporal-spatial proximity relationship of STAR [23][24][25][26].
This study discusses one of the algorithms in data mining, namely the Frequent Pattern Growth (FP-Growth) algorithm, which belongs to the association techniques of data mining. FP-Growth is used as an alternative for determining the sets of items that occur frequently in a data set (frequent itemsets). Its distinguishing feature is a tree data structure called the FP-Tree, from which the frequent itemsets can be extracted directly [24]. The purpose of using the FP-Growth algorithm in this case is to determine the behavior patterns of students of SMA Negeri 1 Lembang Jaya who frequently commit violations.
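The way FP-Growth extracts frequent itemsets from an FP-Tree can be sketched in a few lines of Python. This is a minimal illustration only, not the WEKA implementation used later in the study. The violation labels ("late", "uniform", "truant"), the function names and the minimum count of 3 are all hypothetical and chosen purely for the example.

```python
from collections import defaultdict


class Node:
    """One FP-Tree node: an item, its count, and a link to its parent."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-Tree; return (header table, counts of frequent items)."""
    freq = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            freq[item] += weight
    frequent = {i: c for i, c in freq.items() if c >= min_count}

    root = Node(None, None)
    header = defaultdict(list)  # item -> every tree node holding that item
    for items, weight in weighted_transactions:
        # Keep only frequent items, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header, frequent


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Yield (itemset, count) for every frequent itemset."""
    header, frequent = build_tree(weighted_transactions, min_count)
    for item, count in frequent.items():
        itemset = suffix | {item}
        yield itemset, count
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in header[item]:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Recurse on the conditional base to grow longer itemsets.
        yield from fp_growth(conditional, min_count, itemset)


def frequent_itemsets(transactions, min_count):
    """Convenience wrapper: each transaction gets weight 1."""
    return dict(fp_growth([(t, 1) for t in transactions], min_count))


# Hypothetical violation records, one set of violations per student.
records = [
    {"late", "uniform"},
    {"late", "truant"},
    {"late", "uniform", "truant"},
    {"uniform"},
    {"late", "uniform"},
]
for itemset, count in frequent_itemsets(records, min_count=3).items():
    print(sorted(itemset), count)
```

The tree is built once from the records, and longer itemsets are grown by recursing on the prefix paths of each item rather than by generating and testing candidate combinations.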

Research Methods
The research methodology is the set of steps carried out to make the research easier to prepare and to guide the researcher in carrying it out. The methodology used in this study is presented as a framework. The framework starts with problem identification, followed by problem analysis, setting the objectives, studying the literature, data collection, data analysis and processing, implementation, testing the results, and decision making. The phases of the framework are designed to ensure that the research is purposeful and that the objectives set in the study are achieved.
The objective of this study is to identify students based on association patterns of student violations using frequent pattern growth algorithms that can serve as a reference for teachers when making decisions about student violations in school.
This framework presents the steps taken to solve the problems under discussion. The framework of this research is illustrated in the figure below:

Introduction
The system implementation phase is one of the phases in the system development life cycle. In this phase the information system is put in place so that it is ready for use. For a system to run properly, it must first be determined where it will be implemented.

System Implementation
Implementation requires a computer, and operating the computer in turn requires three supporting components, as follows:
1. Hardware. The hardware is used to design and run the application programs that have been created. It consists of a computer unit with memory as a storage medium and a printer unit for printing reports.
2. Software. Running the designed application program requires some supporting software.

Figure 5. File Retrieval
This phase retrieves the violation data file, which has been converted to ARFF format using Notepad, from the folder in which it was saved.
e. The input file is displayed in WEKA as shown in figure 6. The Explorer menu is used to test the student violation data and obtain the resulting rules.
g. Then click Open File, locate the saved ARFF file, and click Open, as shown in figure 8.
The display in the figure above is used to select the association method and configure the settings used in testing the student violation data.
i. Then select the Association and its settings. Click FP-Growth to configure the settings, then click OK, as shown in figure 10.

Figure 10. Display for Association Selection and Settings
j. Then click Start to get the output, as shown in figure 11.

Figure 11. Display of the output result
The output shown above contains the student violation rules produced by the Frequent Pattern Growth algorithm, which are useful for making the right decisions.
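The ARFF file loaded in the steps above is a plain-text file with a header section followed by a data section. Instead of editing it by hand in Notepad, it could also be generated with a short script. The sketch below assumes a hypothetical layout with one binary attribute per violation type (values f and t); the relation name, attribute names and sample rows are illustrative and are not taken from the study's data.

```python
def to_arff(relation, attributes, rows):
    """Return ARFF text with one {f, t} attribute per violation type.

    `rows` is a list of sets; each set holds the violations of one student.
    """
    lines = [f"@relation {relation}", ""]
    for name in attributes:
        lines.append(f"@attribute {name} {{f, t}}")
    lines += ["", "@data"]
    for row in rows:
        # "t" if the student committed this violation, otherwise "f".
        lines.append(",".join("t" if name in row else "f" for name in attributes))
    return "\n".join(lines) + "\n"


# Hypothetical violation types and student records.
violation_types = ["late", "uniform", "truant"]
student_rows = [{"late", "uniform"}, {"late", "truant"}, {"uniform"}]

arff_text = to_arff("student_violations", violation_types, student_rows)
print(arff_text)
```

The resulting text can be saved with an .arff extension and opened from the Open File dialog described above.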

Conclusion
Based on the results of the analyses and tests performed, the following can be concluded:
1. The greater the minimum support, the fewer rules are generated.
2. The Frequent Pattern Growth data mining method can be used to identify the violations frequently committed by students at SMA 1 Lembang Jaya, Solok.
3. The patterns obtained can be used to help guidance counselors make decisions about violations that occur.
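The first conclusion, that a larger minimum support yields fewer rules, can be illustrated on a toy data set. The sketch below counts association rules by brute-force enumeration rather than with FP-Growth, which is acceptable only because the example is tiny. The violation labels, the thresholds and the minimum confidence of 0.6 are hypothetical values chosen for illustration.

```python
from itertools import combinations


def count_of(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def count_rules(transactions, min_support, min_confidence):
    """Count rules A -> B whose itemset (A and B together) meets both thresholds."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    n_rules = 0
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            both = count_of(frozenset(combo), transactions)
            if both / n < min_support:
                continue  # itemset is not frequent enough
            for k in range(1, size):
                for antecedent in combinations(combo, k):
                    # confidence = count(A and B) / count(A)
                    if both / count_of(frozenset(antecedent), transactions) >= min_confidence:
                        n_rules += 1
    return n_rules


# Hypothetical violation records, one set of violations per student.
records = [
    {"late", "uniform"},
    {"late", "truant"},
    {"late", "uniform", "truant"},
    {"uniform"},
    {"late", "uniform"},
]
for min_support in (0.1, 0.3, 0.5, 0.7):
    print(min_support, count_rules(records, min_support, min_confidence=0.6))
```

On this toy data the rule count falls from 4 at a minimum support of 0.1 to 3, 2 and finally 0 at 0.7, because every increase in the threshold can only remove itemsets from consideration.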