1 Introduction
2 Algorithm
2.1 Input extract
2.2 Convolution neural network transformation
2.3 Ouput load
The aim of this project is the implementation of the convolution neural network operation in MIPS assembly language with the satisfaction of the provided input and output requirements. For the former, It is file named input_matrix.txt which contain three row of number where the numbers is seprate by space. The first row include four number N, M, p, s, where N is the dimension’s length of the input matrix, M is the dimension’s length of the kernel matrix, p is the number of padding that is applied to the input matrix which value is 0, and s is the stride, which is will be the same for both dimension. The second and third line is the sequence of number which is the entry value for matrix input and kernel respectively. Additionally, all the matrix content are floating-point number rounded to one decimal point while the stride and padding are integer numbers. However, all of the numbers are represented as floats with 1 digit after the decimal point. The latter is the file named output_matrix.txt which contain the convolution matrix result which is a sequence of floating-point numbers separate by space.
The source code of the program is organized into three part, namely, input extract, convolution neural network transformation, and output load. The organization was based on the ETL (extract - transform - load) process that commonly see in data warehouse, data lakehouse, etc. The functionality of each part are, for the “input extract”, the extraction of the input values from the input file; convolution neural network transformation, the CNN operation; the “output load”, the loading of the convolution matrix result into output file.
The “input extract”, due to the different in operation and logic for handling each line of the input file, is further divided into three part, each of them is in charge for each line respectively. However, despite the difference, there is a common mechanism that they all ultilize which is the reading of the reading of number values.
For the algorithm, we first chose five register, let call them a1, a2, a3, a4, a5, a6, a7. the a1, a2 and a3 hold the integer, decimal part and the decimal place of the floating-point number. Where the a4, a5 serve as the flag for the reading and converting of the number from the string of ASCII character to floating-point format (a4 is sign flag where 0 for positive and 1 for negative, a5 is State flag, it help determine wether the current readed digit belong to the integer or decimal part).
The register a6 is used to hold the current read character and a7 is used to store the address of the input buffer. The program will read the file byte by byte.
The algorithm will be follow:
Step 1: read a character, if it is the end line character(‘\n’), space character (‘ ‘) or the null terminator (‘\0’), then go to step 3. If it is the negative sign character, then set a4 to one. If it is the decimal point character(‘.’), then set a5 to one -> move to step 2.
Step 2: substract the ASCII value of the character that is stored in a6 to ‘0’ -> multiply the accumulator by 10 -> add it to accumulator (a1 or a2 depend on the state flag) -> if , in case of state flag is set to one, then we add one to a3. Return to step 1.
Step 3: convert the value in a1 and a2 to floating-point format -> check the sign flag a4, if it is set to one, then substract both integer part and decimal part by zero to obtain negative value -> calculate the divisor (10^a3) for decimal part -> divide the decimal part for the divisor -> add the integer and decimal part. move to step 4.
Step 4: store the read number, finnish.
Let call the Number reading mechanism as ReadNum. For the algorithm, we will define a register a0 which is used as a flag for the parsing of value into N, M, p, s variable( 0 for N, 1 for M, 2 for p and 3 for s).
The algorithm would be follow:
Step 1: use ReadNum to read the value in input file -> convert it to the integer.
Step 2: check the flag if 0 then store to N, if 1 then store to M, if 2 then store to p,if 3 then store to s. Else, go to step 3.
Step3: finish the reading of the first line.
The algorithm would be follow: Step 1: calculate the length of the input array (which is equal to (N+2p)^2 ) -> store the length -> allocated an dynamic array base on the calculated length -> store the address of the array.
Step 2: iterrate over the array for p(N + 2p) time and load padding value (in this case is 0.0).
Step 3: load the padding value p time -> read and load N number from the input file -> load the padding value p time. If finish reading the line, then step 4. Else, repeat step 3.
Step 4: finnish.
The algorithm would be follow: Step 1: calculate the length of the input array (which is equal to M^2 ) -> store the length -> allocated an dynamic array base on the calculated length -> store the address of the array -> step 2.
Step 2: read number -> load number -> if the end of line, then step3. Else move to next number and repeat step 2.
Step 3: finnish.
The algorithm would be follow: Step 1: Loads the input image dimensions (N), kernel size (M), padding (P), and stride (S) into registers - > Calculates the output dimensions based on the input and kernel parameters - > Dynamically allocates memory for the output matrix -> step 2.
Step 2: For each output position: Initializes an accumulator to store the result of the convolution.
Iterates over the kernel elements.
For each kernel element: Calculates the corresponding input value index.
Loads the input value and kernel weight.
Multiplies the input value and kernel weight Adds the product to the accumulator. Stores the accumulated result in the output matrix. finish.
The algorithm would be follow: Step 1: Checks if the input integer is negative, If negative, stores a '-' character in the output string and negates the integer.
Step 2: Repeatedly divides the integer by 10, pushing the remainder onto the stack. This process continues until the integer becomes zero, then move to step 3.
Step 3: Pops digits from the stack - > Converts each digit to its ASCII character equivalent - > Stores the character in the output string - > step 4.
Step 4: Add null terminator ('\0') to the end of the string to mark its end - > step 5 Step 5: finish.
The algorithm would be follow: Step 1: Checks if the float is negative - > Sets a flag to indicate the sign (0 for positive, 1 for negative) - > step 2.
Step 2: Round the floating-point number to one decimal place -> Convert the float to an integer (obtain intege part) -> Convert the integer back to a float - > Subtract the integer part from the original float - > multiply the substraction result by 10 (obtain the decimal part) -> step 3.
Step 3: convert integer part and decimal part using convertion of integer to string mechanism and store them -> finish.
The algorithm would be follow: Step 1: Opens the specified output file in write mode.
Step 2: For each float in the array: Using convertion of float to string mechanism to obtain the sign flag, integer part, decimal part.
If the sign flag is 1, then write the negative sign
Write the integer part
Write the dot character
Write the decimal part
If the last element is not reached, then Write the space character
Close file Finnish.