2019 Introduction of Balinese Script Handwriting Using Zoning and Multilayer

Abstract
Handwriting recognition has been the subject of many studies. Handwriting can now also be entered in real time by the user with a mouse (online character recognition). Research on handwriting recognition for traditional scripts continues to develop, and one such script is Balinese. Balinese characters have distinctive shapes compared with the scripts of other regions. Many Balinese characters closely resemble one another, and some can be distinguished only by a small stroke. This study uses an Artificial Neural Network with the Backpropagation algorithm to recognize Balinese characters, with zoning as the feature extraction method. The feature extraction variations used are Image Centroid and Zone (ICZ), Zone Centroid and Zone (ZCZ), and normalization of features. The study determines which of these variations is best for Balinese character recognition. In the tests of the extraction methods, the combination of ICZ, ZCZ and normalization of features was the most effective for recognizing Balinese characters. The accuracy obtained was 71.28% for online testing and 72.31% for offline testing, with Backpropagation parameters of a learning rate of 0.03, a momentum value of 0.5 and the number of neurons in the hidden layer of


INTRODUCTION
Technological developments have a strong influence on science and bring convenience to users. Pattern recognition is a widely developed field that is applied in tasks such as fingerprint recognition, speech recognition and handwriting recognition. Pattern recognition is a scientific discipline that classifies objects into classes or categories based on predetermined parameters, so that computers can recognize patterns as humans do. Among these applications, handwriting recognition has its own distinct characteristics and level of difficulty.
Indonesia is culturally diverse and therefore has a variety of regional scripts. The object of this study is the Balinese script, which has its own distinctive features compared with other regional scripts. Many of its characters have almost the same shape, and some are distinguished only by small strokes. Because of these problems, an application is needed that can recognize Balinese script well. Such an application is also expected to contribute to cultural preservation. In the last few years, many techniques have been used in handwriting recognition, such as the Modified Discrimination Function (MDF) Classifier and Learning Vector Quantization [1], and Support Vector Machine and Nearest Neighbor [2]. Zoning is a popular and simple method for extracting features from an image. With zoning, the image is divided into several zones of the same size, and features are then taken from each zone. In this study, an Artificial Neural Network (ANN), namely the Multilayer Perceptron with the Backpropagation algorithm, is used for recognition, and zoning is used as the feature extraction method. The feature extraction variations used are Image Centroid and Zone (ICZ), Zone Centroid and Zone (ZCZ) [3] and normalization of features. The system is developed to perform online recognition, in which the user writes Balinese script with the mouse in real time.

Zoning Feature Extraction
Image features are extracted using the zoning method, specifically the distance-metric feature extraction methods Image Centroid and Zone (ICZ) and Zone Centroid and Zone (ZCZ) [3]. In this method, the centroid of the input image is calculated first. Its coordinates can be expressed as the point $(x_c, y_c)$:
$$x_c = \frac{\sum_x \sum_y x \, f(x,y)}{\sum_x \sum_y f(x,y)}, \qquad y_c = \frac{\sum_x \sum_y y \, f(x,y)}{\sum_x \sum_y f(x,y)}$$
where $f(x,y)$ is the pixel value of the image at position $(x,y)$. The image is then divided into n zones. An input image of size m x n is divided into n zones so that the size of each zone is m/n x n/n. This stage produces n features for each Balinese character. Combining ICZ and ZCZ produces 2n features.
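The centroid and zone-distance features described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes a binary image stored as a list of rows, a zone grid that divides the image evenly, distances measured only to foreground pixels, and a feature value of 0 for empty zones. The function names (`centroid`, `zone_pixels`, `mean_distance`, `icz_zcz_features`) are hypothetical.

```python
import math


def centroid(pixels):
    """Centroid (x_c, y_c) of (x, y, value) pixels, weighted by pixel value."""
    total = sum(v for _, _, v in pixels)
    if total == 0:
        return None  # no foreground pixels, so no centroid
    xc = sum(x * v for x, _, v in pixels) / total
    yc = sum(y * v for _, y, v in pixels) / total
    return xc, yc


def zone_pixels(image, zone_rows, zone_cols):
    """Yield the foreground pixels of each zone, row by row across the grid."""
    height, width = len(image), len(image[0])
    zh, zw = height // zone_rows, width // zone_cols  # assumes even division
    for zr in range(zone_rows):
        for zc in range(zone_cols):
            yield [(x, y, image[y][x])
                   for y in range(zr * zh, (zr + 1) * zh)
                   for x in range(zc * zw, (zc + 1) * zw)
                   if image[y][x] > 0]


def mean_distance(pixels, center):
    """Average Euclidean distance from center to the pixels (0.0 if empty)."""
    if not pixels or center is None:
        return 0.0
    cx, cy = center
    return sum(math.hypot(x - cx, y - cy) for x, y, _ in pixels) / len(pixels)


def icz_zcz_features(image, zone_rows, zone_cols):
    """Return the n ICZ features followed by the n ZCZ features (2n in total)."""
    foreground = [(x, y, v)
                  for y, row in enumerate(image)
                  for x, v in enumerate(row) if v > 0]
    image_centroid = centroid(foreground)
    icz, zcz = [], []
    for pixels in zone_pixels(image, zone_rows, zone_cols):
        icz.append(mean_distance(pixels, image_centroid))    # ICZ: image centroid
        zcz.append(mean_distance(pixels, centroid(pixels)))  # ZCZ: zone centroid
    return icz + zcz
```

With a 2 x 2 zone grid, `icz_zcz_features(image, 2, 2)` returns 8 values: 4 ICZ features followed by 4 ZCZ features.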
The Zoning Image Centroid and Zone (ICZ) algorithm is as follows:
a) Calculate the centroid of the input image
b) Divide the image into n zones
c) Calculate the distance between the image centroid and each pixel in the zone
d) Repeat step (c) for all pixels in the zone
e) Calculate the average of these distances
f) Repeat steps (c) to (e) for all zones in sequence
g) Save the n features for the next stage

Artificial Neural Network
Artificial Neural Network (ANN) is a concept of knowledge engineering in the field of artificial intelligence that is designed by adopting the human nervous system, whose processing takes place mainly in the brain [4]. The general ANN design is shown in Figure 1. In the figure, the input vector consists of a number of values given as input to the ANN. The input vector has three values (x1, x2, xi), which are the features of the data to be processed. Each input value passes through a connection with weight w, and all the weighted values are then combined. The combined value is processed by the activation function to produce the signal y as output.
The activation function uses a threshold value to limit the output so that it always stays within the specified range.
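The single neuron described above can be sketched as a weighted sum followed by a threshold activation. This is only an illustration: the step function, the default threshold of 0 and the names `step` and `neuron_output` are assumptions, since the text does not specify the exact activation used in Figure 1.

```python
def step(net_input, threshold=0.0):
    """Threshold activation: output 1 when the net input reaches the threshold."""
    return 1 if net_input >= threshold else 0


def neuron_output(inputs, weights, threshold=0.0):
    """Combine the weighted inputs, then apply the activation to get signal y."""
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return step(net_input, threshold)
```

For three inputs, `neuron_output([1.0, 0.5, -1.0], [0.2, 0.4, 0.1], threshold=0.25)` combines the values into a net input of about 0.3, which is above the threshold, so the neuron outputs 1.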

Figure 1 ANN Architecture
Based on the number of layers, ANNs can be divided into two types: single-layer ANNs and multi-layer ANNs. A single-layer ANN has one layer of processing neurons, and that layer can contain many neurons. Figure 1 is an example of a single-layer ANN with one neuron. Examples of single-layer ANN algorithms are Perceptron, Delta, and so on. A multi-layer ANN has a number of intermediate neurons that connect the input vector with the output layer. The intermediate layer is called the hidden layer. Examples of multi-layer ANN algorithms are Constructive Backpropagation, Recurrent Neural Network, Backpropagation, and so on.

Multilayer Perceptron with Backpropagation
The Multilayer Perceptron (MLP) is an ANN derived from the Perceptron. It is a feedforward ANN with one or more hidden layers. Usually, the network consists of one input layer, at least one hidden layer of computing neurons, and an output layer of computing neurons. The input signal is propagated forward, layer by layer [4]. The MLP architecture is shown in Figure 2. Many training algorithms are available, but the most popular is Backpropagation. This method was first proposed in 1969 (Bryson and Ho, 1969), but was later set aside because of its heavy computation. In the mid-1980s the algorithm was taken up again. The Backpropagation algorithm trains the network in the same way as the Perceptron. A number of training data are given to the network as input patterns, and the network calculates the output pattern. If there is an error (a difference between the desired target output and the actual output), the weights in the network are updated to reduce the error.
In MLP Backpropagation, the training algorithm has two phases. In the first phase, the input vector (pattern) is presented to the input layer. The network propagates the input pattern from the input layer to the first hidden layer, then to the next hidden layer, until the output layer generates the output value. In the second phase, if the output pattern differs from the desired output, the error is calculated and propagated backward from the output layer to the input layer. The weights are modified during this backward propagation [5].
The backpropagation training algorithm with the binary sigmoid activation function is as follows:
0) Initialize the weights with small random numbers.
1) While the stopping condition is not reached, do steps 2-9.
2) For each pair of training data, do steps 3-8. In these steps, calculate the weight correction (used to update w_jk) with the learning rate α, calculate the error information δ by multiplying by the derivative of the activation function, and calculate the weight change for v_ji (used to update v_ji).
3) Update all weights to the output units and hidden units.
4) Test the stopping condition, using either the RMSE or the maximum number of iterations.
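The two training phases and the stopping test can be sketched for a network with one hidden layer as follows. This is a minimal sketch under stated assumptions, not the authors' code. It assumes per-pattern (online) weight updates, one bias weight per neuron, initial weights drawn from [-0.5, 0.5], and the common momentum form in which each weight change adds the momentum value times the previous change. The names `MLP`, `train_pattern`, `rmse` and `train` are hypothetical. The defaults for the learning rate, momentum, target error and epoch limit follow values mentioned in this paper.

```python
import math
import random


def sigmoid(x):
    """Binary sigmoid activation; its derivative is f(x) * (1 - f(x))."""
    return 1.0 / (1.0 + math.exp(-x))


class MLP:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # v[j][i]: input i -> hidden j; w[k][j]: hidden j -> output k.
        # The last column of each row is the bias weight (assumption).
        self.v = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                  for _ in range(n_hidden)]
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                  for _ in range(n_out)]
        self.dv = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # previous changes
        self.dw = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(self, x):
        """Phase 1: propagate the input pattern forward, layer by layer."""
        xb = list(x) + [1.0]
        zb = [sigmoid(sum(a * b for a, b in zip(row, xb))) for row in self.v] + [1.0]
        y = [sigmoid(sum(a * b for a, b in zip(row, zb))) for row in self.w]
        return xb, zb, y

    def train_pattern(self, x, t, lr, momentum):
        """Phase 2: propagate the error backward and update the weights."""
        xb, zb, y = self.forward(x)
        # Error information at the output units: error times sigmoid derivative.
        delta_k = [(tk - yk) * yk * (1.0 - yk) for tk, yk in zip(t, y)]
        # Error information at the hidden units, using the not-yet-updated w.
        delta_j = [sum(delta_k[k] * self.w[k][j] for k in range(len(self.w)))
                   * zb[j] * (1.0 - zb[j]) for j in range(len(self.v))]
        for k, row in enumerate(self.w):
            for j in range(len(row)):
                self.dw[k][j] = lr * delta_k[k] * zb[j] + momentum * self.dw[k][j]
                row[j] += self.dw[k][j]
        for j, row in enumerate(self.v):
            for i in range(len(row)):
                self.dv[j][i] = lr * delta_j[j] * xb[i] + momentum * self.dv[j][i]
                row[i] += self.dv[j][i]


def rmse(net, data):
    """Root mean squared error over all output units and all patterns."""
    errors = [(tk - yk) ** 2
              for x, t in data for tk, yk in zip(t, net.forward(x)[2])]
    return math.sqrt(sum(errors) / len(errors))


def train(net, data, lr=0.03, momentum=0.5, target_rmse=0.01, max_epochs=10000):
    """Train until the RMSE target or the epoch limit is reached; return epochs run."""
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        for x, t in data:
            net.train_pattern(x, t, lr, momentum)
        if rmse(net, data) <= target_rmse:
            break
    return epoch
```

In the setting of this paper, the input size would be the number of zoning features and the output size the number of character classes; how the authors encoded the target vectors is not stated, so one output unit per class is only one possible choice.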

Research Design
This study took the title "Introduction to Balinese Script Online Using Artificial Neural Network". It is experimental research [6]. The object of the research is Balinese characters, and the method used is the Multilayer Perceptron with the Backpropagation algorithm. The research object is subjected to treatments. In the recognition process, the characters are recognized with different artificial neural network architectures. The treatments differ in the number of neurons in the hidden layer, the learning rate and the momentum value. The results of these treatments show the recognition accuracy of each MLP network architecture for Balinese characters. In this research, the independent variable (the manipulated factor) is the architecture of the Multilayer Perceptron network, which changes in the number of neurons in the hidden layer, the learning rate and the momentum. The dependent variable is the accuracy of Balinese script recognition. Different numbers of hidden-layer neurons, learning rates and momentum values give different recognition accuracy.

Data Collection
This study uses primary data. The data were obtained by scanning the handwriting of each respondent. The respondents wrote each character themselves, so 65 characters were obtained from each respondent. There were 13 respondents, and the images were stored in .bmp format. Data were also collected from the Bali Simbar Dwijendra font. Because this study performs online recognition, data were also collected from handwriting written with a mouse.

Preprocessing
Before information can be extracted from an image, the image must first be processed. This stage is called initial data processing (preprocessing). Figure 3 shows the flowchart of this initial stage in the system.

Methods Used
In this study, zoning is used for feature extraction, and the Multilayer Perceptron with the Backpropagation learning method is used for Balinese script pattern recognition. The scheme of the research method is shown in Figure 4.

Testing and Evaluation
Character samples differ in their characteristics from one writing to another and from one respondent to another. The data were collected from 10 respondents, men and women with an average age of 22 years, and 3 respondents who are Balinese language teachers. From the collected dataset, 1170 samples are used for training and 195 samples for testing.
The testing process measures the accuracy, or level of character recognition, of the program. Testing was carried out by varying the number of neurons in the hidden layer, the learning rate and the momentum value. The planned number of neurons in the hidden layer is 80 to 130. The learning rate and momentum take several values in the range from 0.01 to close to 1.
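Several changes were made during testing, and the accuracy of the system was calculated for each test. Accuracy can be calculated with the following equation:
$$\text{accuracy} = \frac{\text{number of correctly recognized test samples}}{\text{total number of test samples}} \times 100\%$$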
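The accuracy calculation can be sketched in Python as the percentage of test samples whose predicted label matches the true label. The function name `accuracy_percent` and the use of string labels are assumptions made for illustration.

```python
def accuracy_percent(predicted, actual):
    """Percentage of test samples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```

As a check on the arithmetic, 141 correct samples out of 195 give 72.31% and 139 out of 195 give 71.28%, which is consistent with the offline and online accuracies reported in this paper. These counts are inferred from the percentages and are not stated in the source.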

Feature Extraction Method
The feature extraction methods were tested to determine the accuracy produced by each method. The method with the highest accuracy is then used to analyze the backpropagation network. The training data come from scanned handwriting, handwriting written with the mouse, and the Bali Simbar Dwijendra font. 18 samples were used for each character, so there are 1170 training samples. For testing there are 3 samples per character, so there are 195 test samples. The test data are different from the training data. The methods compared in this study are Image Centroid and Zone (ICZ) alone, a combination of ICZ and Zone Centroid and Zone (ZCZ), and a combination of ICZ, ZCZ and normalization. The backpropagation parameters used for this test are a learning rate of 0.01, 130 neurons in the hidden layer, a momentum value of 0.5 and a target error of 0.01. The maximum epoch limit is 10000.
The results in Table 1 show that the choice of extraction method influences the accuracy, because the features extracted from an image represent the identity of that image.

Backpropagation Parameter Analysis
In this analysis, several tests were carried out on the research variables, namely the learning rate, the number of neurons in the hidden layer and the momentum, in training with the backpropagation algorithm. The feature extraction method used is the combination of ICZ, ZCZ and normalization. Each test uses 1170 training samples and 195 test samples. The target error is 0.0189 and the iteration limit is 10000.
The first test is the learning rate test. It was conducted to determine the effect of the learning rate on system accuracy. The initial momentum value and the initial number of hidden layer neurons are 0.5 and 130.
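The one-parameter-at-a-time testing described above can be sketched as follows. This is an illustration only. The `evaluate` callback is hypothetical and stands for training a network with the given parameters and returning its test accuracy. The candidate values in the usage note are examples, not the values tested in the study.

```python
def sweep_one_parameter(evaluate, base_params, name, candidates):
    """Vary one parameter while the others stay fixed; return the best value.

    evaluate is a hypothetical callback that trains a network with the given
    keyword parameters and returns its accuracy on the test data.
    """
    results = []
    for value in candidates:
        params = dict(base_params, **{name: value})  # override a single parameter
        results.append((value, evaluate(**params)))
    best_value, _ = max(results, key=lambda item: item[1])
    return best_value, results
```

For example, with the base parameters `{"learning_rate": 0.01, "momentum": 0.5, "hidden_neurons": 130}`, calling `sweep_one_parameter(evaluate, base, "learning_rate", [0.01, 0.03, 0.05])` tests each learning rate in turn with the momentum and hidden-layer size held at 0.5 and 130. The best value found can then be fixed before the next parameter is tested.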

Testing on Test Data
This test is done after the best values of the learning rate, the number of neurons in the hidden layer and the momentum have been obtained. The best values obtained in this study are a learning rate of 0.03, 130 neurons in the hidden layer and a momentum of 0.5. The feature extraction method used is the combination of ICZ, ZCZ and normalization of features. Testing is done both offline and online. Offline testing is carried out on 195 test samples of scanned handwriting and gives an accuracy of 72.31%. Online testing is also carried out, because this study develops online recognition. Users write Balinese characters in real time with the mouse in the frame provided by the system, and the characters are then recognized. Each character is tested 3 times, so there are 195 online test samples in total. The accuracy obtained is 71.28%.

Analysis of Other Factors
Another cause of the still-low accuracy is the data used in the study. The variance of the data affects the accuracy. The images used in the study are handwritten, so they vary more than character images rendered from fonts. Accuracy on handwriting is lower because everyone has a different writing style. In addition, some Balinese characters resemble other characters. For example, in the test results for the character A-building, 2 samples were predicted correctly and 1 sample was predicted as a-kara.

CONCLUSION
The research shows that the Artificial Neural Network method with the Backpropagation algorithm and the zoning feature extraction method can be used for Balinese script handwriting recognition, with 71.28% accuracy in online testing and 72.31% in offline testing, using the combination of ICZ, ZCZ and normalization of features as the feature extraction method.
The combined feature extraction method of ICZ, ZCZ and normalization of features produced the highest accuracy in this study, so it is the most effective of the tested methods for recognizing Balinese script.
The analysis of the backpropagation network parameters shows that they influence the accuracy, so the parameters must be chosen carefully to get the best results. In this study the best parameters are a learning rate of 0.03, a momentum value of 0.5 and 130 neurons in the hidden layer.
The low accuracy obtained in this study is caused by the zoning method, which sometimes gives similar feature values for different characters.