EasyML: Making Machine Learning More Accessible - Developer X

About the project

Machine learning is an increasingly important tool in society, but most ML models are prohibitively difficult for people to use. Even machine learning researchers often struggle to use each other's code! I created a website platform to make it easy to use different ML algorithms. I chose two of my favorite applications, a protein classifier and a pet classifier, to demonstrate the potential of this platform.

This was my final project for Harvard's CS50 course, which I took in Fall 2021. A video that walks through the website is here.

Project execution

This project had several parts that all had to be integrated into one cohesive website. First, there were the machine learning algorithms, which were mostly written by ML researchers in Python but had to be edited for the website. Next, there was the web application, which used Python and allowed the algorithms to be run from the website. Lastly, there was the website itself, which was written in HTML and CSS.

Machine Learning Algorithm Design

First, I had to find and connect relevant machine learning algorithms for this website. I used an algorithm that takes in an amino acid sequence and classifies it into one of many different protein types. Connecting this code to the website took the large majority of my time on this project. The machine learning code was extremely convoluted, and it was designed for batch file inputs, not single amino acid sequences. I had to write a completely new input program that took in a single amino acid sequence as a string, treated it like a list, converted it to float values, and then ran it through the model. Changing the number of features of the model finally got it to work with this system.
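A minimal sketch of what such an input step might look like. This is not the project's actual code: the function name `encode_sequence`, the integer encoding, and the fixed length `max_len` are all assumptions made for illustration. It shows the general idea of turning one sequence string into a fixed-size list of floats that a model expecting a set number of features could accept.

```python
from typing import List

# The 20 standard amino acids, one letter each. Index 0 is reserved for
# padding and for any character that is not a standard residue.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(sequence: str, max_len: int = 100) -> List[float]:
    """Turn a single amino acid string into a fixed-length list of floats.

    Hypothetical helper: the encoding scheme and max_len are illustrative
    assumptions, not the values used by the original model.
    """
    # Treat the string like a list of single-letter residues.
    residues = list(sequence.strip().upper())
    # Map each residue to a float code, truncating to max_len.
    codes = [float(AA_TO_INDEX.get(r, 0)) for r in residues[:max_len]]
    # Pad short sequences with zeros so every input has the same length.
    codes += [0.0] * (max_len - len(codes))
    return codes
```

The resulting list always has `max_len` entries, which is why the number of input features on the model side has to match whatever length the encoder produces.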

While I was frustrated with the protein algorithm, I planned to switch to a simpler model that classifies images as dogs or cats. This model was much easier to connect to the website because it was already written with a web application in mind: I had to connect it to my web application, but the body of the code was mostly unchanged. The code for both of these algorithms was found on Kaggle.

After I got the protein algorithm to work, I thought it would be best to include both of these algorithms on the website. I think that having both algorithms significantly improved the project, as the site can now serve as a platform for adding more ML programs in the future.

Web Application Design

After finishing the code for the machine learning algorithms, I had to find a way to present those algorithms in a user-friendly way on the website. I had originally planned to run the Python code on the website itself, but I learned that there were web app platforms that would streamline that process and improve the UI/UX.

I chose Anvil, a web app platform, because it was Python-based and free. On Anvil, I created the user interface for both of the web apps, which connected directly to my machine learning algorithms (which were running in my Deepnote notebook). The Anvil code was written in Python: it took in the input string or file, sent it to the Deepnote notebook, and returned the algorithm's output.
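One common way to connect a notebook to an Anvil app is the Anvil Uplink (the `anvil-uplink` package). The sketch below assumes that mechanism; the write-up does not say exactly how the connection was made. The function `classify_protein`, its placeholder return value, and the `serve` helper are hypothetical stand-ins for the notebook's real prediction code.

```python
def classify_protein(sequence: str) -> str:
    """Hypothetical stand-in for the notebook's real prediction function."""
    cleaned = sequence.strip().upper()
    if not cleaned:
        return "Please enter an amino acid sequence."
    # A real notebook would encode `cleaned` and call the trained model here.
    return f"Received {len(cleaned)} residues."


def serve(uplink_key: str) -> None:
    """Expose classify_protein to an Anvil app over the Uplink.

    Assumes `pip install anvil-uplink`; the key comes from the Anvil app's
    Uplink settings and is not part of the original write-up.
    """
    import anvil.server  # imported here so the function above works without it

    anvil.server.connect(uplink_key)
    # Register the function so the Anvil app can call it by name.
    anvil.server.callable(classify_protein)
    # Keep the notebook cell running so it can answer incoming calls.
    anvil.server.wait_forever()
```

With `serve("<your Uplink key>")` running in the notebook, the Anvil form code could fetch a result with `anvil.server.call("classify_protein", text)` and display the returned string.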

I embedded the Anvil apps in my website through an HTML code block.

Website Design

Initially, I designed the HTML and CSS for my website completely by myself, using parts of code from my homework and other projects. The website had a homepage, which linked to pages for the two programs, Protein Classifier and Pet Classifier. A platform called nicepage.com was then recommended to me for designing the website, and it helped a lot. I adapted my HTML and CSS code to their HTML template and CSS files, which improved the aesthetics significantly.

Project results

Because the website was built in the CS50 IDE, the code should be compiled there and the server started in VS Code. To run the machine learning algorithms, run the attached .ipynb files (titled "petclassifier.ipynb" and "proteinclass.ipynb") in a notebook before using the website. You can also run the code in my Deepnote notebook here: https://deepnote.com/project/EasyML-0IwyrG_BREGxLcx3wtQ0dA/%2Fproteinclass.ipynb. The Protein Classifier takes about 15 minutes to train at first, but once it has finished, the web application should run almost instantaneously.

The first block on the homepage presents the "Protein Classification" tool. Clicking the hyperlink takes you to the Protein Classification page, which has more details about the model as well as a web app connected to the program. When you type (or more likely, paste) an amino acid sequence into the text box, the model tells you what type of protein that sequence belongs to. The second block presents the "Pet Classification" tool. Clicking on that hyperlink takes you to the Pet Classification page, which has information about the model being used and the web app for that classification program. You can upload a photo of a dog or cat to the web app, and the algorithm will classify which animal your pet is. The third block gives my email so that users can send feedback or questions about the platform, as well as information about the website's future.

As this was a course project, the site was live for the semester but has been taken down as website hosting is expensive. All my code is available on GitHub if you want to try downloading and running it on your own!

Code on GitHub here: github.com/changbenjamin/EasyML
