File: imld/v3.0.0/AAREADME.txt
Tool: ISIP Machine Learning Demo
Version: v3.0.0

#IMLD v3.0.0

-------------------------------------------------------------------------------
Change Log:

(20240124) updated README for v3.0.0
(20220407) updated README for v1.8.0
-------------------------------------------------------------------------------

This directory contains all the Python code needed to run our machine
learning demo tool. This tool is used as an aid for learning machine
learning topics.

A. WHAT'S NEW

Version 3.0.0 includes these enhancements:

 - Retooled backend computations to be done by the NEDC ML Tools library
 - Included algorithms from the ML Tools library
 - Added the ability to create custom algorithms in Python
 - Added the ability to save created models and model parameters to a file
 - Created "Train" and "Evaluate" options to streamline analysis
 - Added an option to view confusion matrices
 - Small formatting changes
 - Bug fixes and overall performance improvements

B. INSTALLATION REQUIREMENTS

Python code unfortunately often depends on a large number of add-ons,
making it very challenging to port code into new environments. This
tool has been tested extensively on Windows, Linux, and Mac machines
running Python v3.9.x.

Because this software has a large number of dependencies, we also offer
pre-compiled binaries that make installation simple. If you download a
pre-compiled distribution, NO ADDITIONAL SOFTWARE TOOLS ARE REQUIRED.

We offer pre-compiled distributions for the following systems:

 o Windows (x64)
 o macOS (ARM)
 o Linux (x64)

If a pre-compiled distribution is not offered for your system, please
download the base distribution and follow your system's guide to
install the required dependencies.
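If you are unsure which distribution applies to your machine, the short
Python sketch below reports the operating system and CPU architecture
and compares them against the three pre-compiled distributions listed
above. It is only an illustration and is not part of IMLD: the function
name precompiled_available and the architecture aliases in the table
are assumptions, since the strings reported by platform.machine() vary
between systems.

```python
import platform

# Architectures covered by the pre-compiled distributions, keyed by the
# value of platform.system(). The alias strings are assumptions; the
# value reported by platform.machine() differs between systems.
PRECOMPILED = {
    "Windows": {"amd64", "x86_64"},  # Windows (x64)
    "Darwin": {"arm64"},             # macOS (ARM)
    "Linux": {"x86_64", "amd64"},    # Linux (x64)
}


def precompiled_available(system, machine):
    """Return True if a pre-compiled distribution matches this system."""
    return machine.lower() in PRECOMPILED.get(system, set())


if __name__ == "__main__":
    system, machine = platform.system(), platform.machine()
    if precompiled_available(system, machine):
        print(f"{system} ({machine}): a pre-compiled distribution should apply")
    else:
        print(f"{system} ({machine}): use the base distribution")
```

If the script suggests the base distribution, follow the base
distribution guide for your system in section C.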
If you choose to download the base software, you must install the
required Python interpreter and its dependencies:

 o Python 3.9.x (we recommend installing Anaconda)
 o PyQt5: https://www.riverbankcomputing.com/software/pyqt/download5
 o NumPy/SciPy: http://www.numpy.org/
 o PyQtGraph: http://www.pyqtgraph.org/
 o Matplotlib: https://matplotlib.org/
 o scikit-learn: https://scikit-learn.org/stable/
 o PyInstaller: https://pyinstaller.org/en/stable/
   (only if you wish to compile the base download)

A requirements.txt file is included in the release to help you automate
the process of updating your environment.

C. USER'S GUIDE

C.1 WINDOWS USERS

As mentioned, Windows users have the option to download either the base
or the pre-compiled distribution. Below are detailed guides for getting
started with both distributions.

C.1.1 WINDOWS EXECUTABLE DISTRIBUTION

The Windows executable distribution should work on all 64-bit (x64)
Windows systems. For the executable distribution, no installation is
required. The program can be started by simply running the file:

 nedc_imld.exe

Once it is run, a blank terminal window will open, and the IMLD GUI
will appear shortly afterwards. IMLD is now running on the user's
system.

IMLD can be downloaded or moved to anywhere on the user's system.
However, the IMLD executable must be in the same directory as the
"_internal\" folder, as the executable depends on this folder. Users
may also create shortcuts to the "nedc_imld.exe" executable for ease
of use.

C.1.2 WINDOWS BASE DISTRIBUTION

The base Windows distribution includes the Python source code. In order
for the Python code to work, the proper dependencies must be installed.
The easiest way to install dependencies is by using the Python "venv"
tool, which creates a virtual environment where the proper dependencies
can be installed.

First, make sure Python is installed on your device. Python can be
installed from the Windows Store, or directly from the Python website.
Python 3.9.x is recommended for this installation. All Python versions
3.3 and higher include the "venv" library, which is needed for this
installation. If your installation does not have "venv", you can
install it by using the command:

 $ pip install venv

To create a virtual environment, right-click on the extracted
distribution. Then select the "Open in Terminal" option. THIS GUIDE
WILL FOCUS ON USING WINDOWS POWERSHELL, so make sure to use PowerShell
rather than Command Prompt. Inside the terminal, a virtual environment
can be created using the command:

 $ python -m venv

This command will generate a folder with the same name as the one you
specified in the command. The virtual environment can be activated
using the command:

 $ .\\Scripts\Activate.ps1

After this command, the user's command prompt should change to display
the virtual environment name before the current path, for example:

 () PS C:\Users\NEDC\Desktop\imld_base>

With the environment activated, the dependencies can be installed. The
following command will automatically install the required dependencies
from the "requirements.txt" file:

 $ pip install -r .\requirements.txt

If any errors occur, it is recommended to install each package listed
in "requirements.txt" by itself using the command:

 $ pip install

With the required dependencies installed, IMLD can now be run. To run
IMLD, run the following command in the terminal:

 $ python .\lib\nedc_imld.py

Make sure the virtual environment is activated whenever this Python
script is run. IF THE VIRTUAL ENVIRONMENT IS NOT ACTIVE, IMLD WILL NOT
WORK. The script will also not work if it is run through Windows File
Explorer, because the virtual environment will not be active in that
case.

To deactivate the virtual environment, run the following command:

 $ deactivate

If the user would like to generate an executable on their machine for
ease of use, the following command can be run in the terminal:

 $ pyinstaller.exe --add-data ".\lib\config\*;.\config" .\lib\nedc_imld.py

This command will generate a ".\dist\" folder in the working folder,
which contains the executable and the required ".\_internal\" folder.
This installation can be treated as if it had been downloaded as one of
the compiled distributions.

C.2 MAC/LINUX USERS

Since Macs and Linux machines are Unix based, they share the same
commands in their respective shells (Bash/Zsh). This guide serves as
instructions for both Mac and Linux systems.

C.2.1 MAC/LINUX EXECUTABLE DISTRIBUTION

The Linux executable distribution should work on all 64-bit (x64) Linux
systems. For Mac systems, the distribution should work on all ARM-based
Macs, that is, Macs with M-series (M1, M2, ...) processors. The
executable will unfortunately not work on Intel (x64) based Macs.

For the executable distribution, no installation is required. The
program can be started by simply running the file:

 ./bin/nedc_imld

IMLD can be downloaded or moved to anywhere on the user's system.
However, the IMLD executable must be in the same directory as the
"_internal/" folder, as the executable depends on this folder. Users
may also create shortcuts to the "nedc_imld" executable for ease of
use.

C.2.2 MAC/LINUX BASE DISTRIBUTION

The base Mac/Linux distribution includes the Python source code. In
order for the Python code to work, the proper dependencies must be
installed. The easiest way to install dependencies is by using the
Python "venv" tool, which creates a virtual environment where the
proper dependencies can be installed.

First, make sure Python is installed on your device. Python is often
already installed on macOS and most Linux distributions.
Python 3.9.x is recommended for this installation.
All Python versions 3.3 and higher include the "venv" library, which is
needed for this installation. If your installation does not have
"venv", you can install it by using the command:

 $ pip install venv

To create a virtual environment, open the extracted directory in a
terminal. Inside the terminal, a virtual environment can be created
using the command:

 $ python3 -m venv

This command will generate a folder with the same name as the one you
specified in the command. The virtual environment can be activated
using the command:

 $ source ./venv/bin/activate

After this command, the user's command prompt should change to display
the virtual environment name before the current path, for example:

 () NEDC :>

With the environment activated, the dependencies can be installed. The
following command will automatically install the required dependencies
from the "requirements.txt" file:

 $ pip install -r ./requirements.txt

If any errors occur, it is recommended to install each package listed
in "requirements.txt" by itself using the command:

 $ pip install

With the required dependencies installed, IMLD can now be run. To run
IMLD, run the following command in the terminal:

 $ python ./lib/nedc_imld.py

Make sure the virtual environment is activated whenever this Python
script is run. IF THE VIRTUAL ENVIRONMENT IS NOT ACTIVE, IMLD WILL NOT
WORK. The script will also not work if it is run through a GUI file
explorer, because the virtual environment will not be active in that
case.

To deactivate the virtual environment, run the following command:

 $ deactivate

If the user would like to generate an executable on their machine for
ease of use, the following command can be run in the terminal:

 $ pyinstaller --add-data "./lib/config/*:./config" ./lib/nedc_imld.py

This command will generate a "./dist/" folder in the working folder,
which contains the executable and the required "./_internal/" folder.
This installation can be treated as if it had been downloaded as one of
the compiled distributions.
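Since IMLD will not start without an active virtual environment and the
required packages, a quick check before launching can save some
troubleshooting on any of the base distributions. The script below is a
minimal sketch and is not shipped with IMLD. The helper names
in_virtual_env and missing_modules are hypothetical, and the import
names in REQUIRED_MODULES are assumptions derived from the dependency
list in section B (for example, scikit-learn is imported as "sklearn").

```python
import importlib.util
import sys

# Import names for the packages listed in section B. These are
# assumptions: an import name can differ from the name used by pip.
REQUIRED_MODULES = [
    "PyQt5", "numpy", "scipy", "pyqtgraph", "matplotlib", "sklearn",
]


def in_virtual_env():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_env())
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("missing modules:", ", ".join(missing))
    else:
        print("all required modules were found")
```

Run the script with the same interpreter used to start IMLD. If it
reports missing modules, activate the virtual environment and install
the dependencies from "requirements.txt" again.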
C.3 CUSTOM ALGORITHMS

New in IMLD version 3.0.0, users can import their own custom algorithms
into IMLD. These algorithms are written in Python, and users may
include any additional Python libraries they like. To create a custom
algorithm, refer to the "custom_algorithm.py" script, which can be
found in these locations:

 Executable Distributions: .//_internal/config/custom_algorithm.py
 Base Distributions: .//lib/config/custom_algorithm.py

This Python script includes the methods and libraries needed to make
your algorithm work in IMLD. Here are the steps for creating a custom
algorithm:

 1. Create a Class:

    The name of your class is very important, as it must be identical
    to the name that is added to the parameters file (more on this
    later).

 2. Add the Required Methods to Your Class:

    Each custom algorithm must include three methods that have strict
    requirements for the algorithm to work in IMLD. The TEMPLATE class
    in the custom algorithm script acts as a template of the required
    methods for each class, including the arguments and return values
    of each method. The three methods are:

     1. __init__() -> None
     2. train(data, write_train_labels, fname_train_labels)
          -> self.model_d, score
     3. predict(data, model) -> p_labels, posteriors

    The arguments and return values of these methods are described in
    more detail in the custom_algorithm.py script.

 3. Test the Custom Algorithm:

    Make sure to test the custom algorithm before deploying it into
    IMLD. It is much easier to test the code as a standalone Python
    script than inside IMLD.

 4. Add the Custom Algorithm to the Parameters File:

    Before the custom algorithm appears in IMLD, it must be added to
    the IMLD algorithm parameters file, which can be found here:

     Executable Distributions: .//_internal/config/imld_alg_params.txt
     Base Distributions: .//lib/config/imld_alg_params.txt

    The parameter file contains a highly detailed guide at the
    beginning of the file.
    Include a parameter block for the custom algorithm that adheres to
    the requirements of the parameter file. Make sure the key of the
    parameter block is EXACTLY THE SAME AS THE NAME OF THE PYTHON
    CLASS.

    With a parameter block created, make sure to add the custom
    algorithm to the list of algorithms at the top of the parameters
    file. Again, make sure the name is EXACTLY THE SAME AS THE NAME OF
    THE PYTHON CLASS AND PARAMETER BLOCK.

With all of these steps completed, the custom algorithm should now
appear in the "Algorithms" drop-down menu. More than one custom
algorithm can be added to IMLD by repeating the steps above and adding
to the aforementioned custom algorithm script and parameters file.

D. CLOSING REMARKS

IMLD is designed as educational software to help students and learners
understand the fundamentals of machine learning. IMLD is not intended
to be a professional data analysis tool.

If you have any issues, bugs, or requests, feel free to reach out to
help@nedcdata.org.

Thank you for installing IMLD and supporting open source software!

Best regards,

Joe Picone