Complete Getting Started
From creating project to model training.
Last updated
In this guide, we will use a sample problem to go over all steps necessary for federated learning with FELT. This process consists of:
Initial setup
Preparing datasets
Starting local training
Aggregating results from local training
Using final model
For this guide you need an Ethereum address. We recommend using the MetaMask wallet:
For this guide we will be using the Polygon Mumbai test network. In case you don't have the Polygon Mumbai network in your MetaMask, you can add it by following this guide.
Right now, the app is deployed on the Polygon Mumbai testnet. First, you will need some MATIC tokens to pay for the transaction fees. You can obtain these using a Polygon faucet. Just visit the following link and paste your wallet address:
You will also need OCEAN tokens to pay for datasets and algorithms. You can collect them through the OCEAN faucet by submitting your wallet address here:
For the demonstration of federated learning, let’s imagine two towns collaborating on analyzing housing data. The data might contain sensitive information. Therefore, they can’t fully disclose the data. Each town publishes its dataset on Ocean, allowing only computation over data without direct access. We will try to predict a house price based on house parameters (size in square feet, number of bedrooms, bathrooms, material, etc.). Below you can see a demonstration of our data (original data file).
We already have the data published on Ocean (using the Mumbai chain) as the following assets, which we will use in this guide:
In this guide we will be using FELT algorithms, which require the data to be in the correct format. Right now, we support only the CSV format, with the following rules:
CSV contains only numerical data
CSV doesn't contain the header row
All datasets used during training must have the same number of columns
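The three rules above can be checked locally before publishing a dataset. The following is a minimal sketch using only the Python standard library; the helper names check_csv and check_datasets are hypothetical and are not part of FELT.

```python
import csv


def check_csv(path):
    """Return the column count of a CSV file, or raise ValueError if it breaks a rule."""
    width = None
    with open(path, newline="") as handle:
        for line_no, row in enumerate(csv.reader(handle), start=1):
            if not row:
                continue  # ignore blank lines
            for cell in row:
                try:
                    float(cell)
                except ValueError:
                    # A text cell on line 1 most likely means a header row is present.
                    raise ValueError(
                        f"{path}: non-numeric value {cell!r} on line {line_no}"
                    ) from None
            if width is None:
                width = len(row)
            elif len(row) != width:
                raise ValueError(
                    f"{path}: line {line_no} has {len(row)} columns, expected {width}"
                )
    if width is None:
        raise ValueError(f"{path}: file is empty")
    return width


def check_datasets(paths):
    """Check every file and confirm that all of them share one column count."""
    widths = {path: check_csv(path) for path in paths}
    if not widths:
        raise ValueError("no files given")
    if len(set(widths.values())) > 1:
        raise ValueError(f"column counts differ: {widths}")
    return next(iter(widths.values()))
```

For example, check_datasets(["house-prices-part1.csv", "my-second-file.csv"]) returns the shared column count or raises an error naming the offending file (the second file name is a placeholder).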
You can check the file house-prices-part1.csv, which is used in this article. For more details about publishing your datasets on the Ocean marketplace, please read:
If you are using your own data, don't forget to allow the "Local Training - FELT" algorithm or just all published algorithms.
Now that we have our data ready, it's time to start the training! Head to app.feltoken.ai. Before you begin, you need to sign in to FELT.
Then you select between training on a single dataset or on multiple datasets. For our case we will use the multiple datasets option. In the first step you fill in the name of the training (you can pick an arbitrary one) and search for our datasets with the following DIDs:
Then you proceed to the next step, where you select the algorithm you want to run. Pick Local Training - FELT (DEV).
In the final step you customize the parameters of the algorithm. The FELT algorithm lets you pick from different models and customize their parameters. Right now, you can pick from scikit-learn models or analytics (mean, variance...). For our case we can pick any regression model, for example Ridge regression.
One of the most important options is the target column index. This is the index of the column that we want to predict. Setting the value to -1 will use the last column. You can click Submit once you have selected your hyperparameters.
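To make the meaning of the target column index concrete, here is a small sketch of how a header-less numeric table could be split into features and a target. It assumes Python-style indexing, where -1 means the last column. The function name and the sample rows are made up for illustration and do not come from FELT or from the house price data.

```python
def split_features_target(rows, target_index=-1):
    """Split each row into (features, target) using a column index.

    Negative indexes count from the end, so -1 selects the last column.
    """
    features, targets = [], []
    for row in rows:
        if not -len(row) <= target_index < len(row):
            raise IndexError(f"target index {target_index} is out of range")
        position = target_index % len(row)  # turn -1 into the last position
        targets.append(row[position])
        features.append(row[:position] + row[position + 1:])
    return features, targets


# Made-up rows: size, bedrooms, bathrooms, price
rows = [
    [1200.0, 3.0, 2.0, 250000.0],
    [850.0, 2.0, 1.0, 180000.0],
]
X, y = split_features_target(rows, target_index=-1)
print(X)  # [[1200.0, 3.0, 2.0], [850.0, 2.0, 1.0]]
print(y)  # [250000.0, 180000.0]
```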
Once you hit the Submit button, you will see the summary and then two options for starting the training. Each option is further described here:
For this guide, we will be using the browser flow without auto-sign. You will have to approve a few transactions using MetaMask to start the training. Make sure that you have connected your MetaMask and that you are connected to the Polygon Mumbai test network. Here is the list of all transactions you will have to confirm:
Approve OCEAN token spend to purchase the dataset
Purchase the dataset (now approve and purchase are separate transactions)
Approve OCEAN token spend to purchase FELT algorithm for training
Purchase the algorithm
Sign request to start the compute job (training)
We are starting the training on two datasets; therefore, you will have to approve those transactions twice.
Once you start the local training, you can go to the Launched jobs page (you can use the Launched jobs button). Here you can monitor the progress.
Once both jobs finish, you can start the aggregation. On the right side of each local training there is a checkbox that you can use to select which local trainings should be aggregated (you need to select at least 2). After selecting the jobs to aggregate, click the Aggregate button to start the aggregation.
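This guide does not describe how FELT combines the local models internally. As a rough intuition only, a common approach in federated learning is to average model parameters, weighted by how many samples each participant trained on. The sketch below shows that idea for linear models such as Ridge regression; the dictionary layout, the function name and all numbers are hypothetical.

```python
def average_linear_models(local_models):
    """Combine linear models by a sample-weighted average of their parameters.

    Each entry is a dict with "coef" (list of floats), "intercept" (float)
    and "n_samples" (int). This layout is hypothetical, not FELT's format.
    """
    if len(local_models) < 2:
        raise ValueError("need at least two local models to aggregate")
    n_features = len(local_models[0]["coef"])
    if any(len(m["coef"]) != n_features for m in local_models):
        raise ValueError("all models must have the same number of coefficients")
    total = sum(m["n_samples"] for m in local_models)
    coef = [
        sum(m["coef"][i] * m["n_samples"] for m in local_models) / total
        for i in range(n_features)
    ]
    intercept = sum(m["intercept"] * m["n_samples"] for m in local_models) / total
    return {"coef": coef, "intercept": intercept, "n_samples": total}


# Made-up local results from two towns
town_a = {"coef": [100.0, 5000.0], "intercept": 20000.0, "n_samples": 300}
town_b = {"coef": [120.0, 3000.0], "intercept": 10000.0, "n_samples": 100}
final_model = average_linear_models([town_a, town_b])
print(final_model)
```

Only the fitted parameters are exchanged in such a scheme, which is why neither town has to disclose its raw data.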
After starting the aggregation, the progress bar will pop up. You will have to approve the following transactions:
Sign URLs to access local models
Approve OCEAN token to pay for provider fees
Order dataset for the compute job
Approve OCEAN token spend to purchase FELT algorithm for aggregation
Purchase the algorithm
Sign request to start the compute job (aggregation)
You can watch the aggregation progress. Once it finishes, you will see the Download final model button. You will sign the request and download the final model (in our case final-model-House Prices.json). The file is not in a standard machine learning file format. You will have to use the FELT library to import it.
First, install the FELT Python library using pip (it requires Python 3.9 or newer):
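Assuming the package is published under the same name as the module imported later in this guide (feltlabs), the install command would be pip install feltlabs. The short sketch below checks both prerequisites from Python; the helper names are hypothetical.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in this guide


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def felt_is_installed(module_name="feltlabs"):
    """Return True if the module can be found (module name is an assumption)."""
    return importlib.util.find_spec(module_name) is not None


print("Python 3.9 or newer:", python_is_supported())
print("feltlabs importable:", felt_is_installed())
```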
Then you can load the model using the feltlabs.model.load_model(model_path) function. This function takes the path of the model file as an argument and returns the model object.
When you use the federated learning option, the load_model(...) function returns a model that can be used as a standard scikit-learn model object. The model can then be used for prediction with model.predict(data). You can check the following code for sample usage:
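A minimal sketch of that usage, built on the two calls named above (load_model and model.predict). The helper functions are hypothetical, and the feature rows are placeholders: replace them with rows that have the same number of feature columns as the training data, without the target column.

```python
def load_final_model(model_path):
    """Load the downloaded model file with the FELT library."""
    # Imported here so the helper below also works without feltlabs installed.
    from feltlabs.model import load_model

    return load_model(model_path)


def predict_prices(model, rows):
    """Call model.predict on rows shaped (number_of_samples, number_of_features)."""
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError("every row must have the same number of features")
    return list(model.predict(rows))


def main():
    model = load_final_model("final-model-House Prices.json")
    # Placeholder rows; the real feature count depends on the training data.
    rows = [
        [1200.0, 3.0, 2.0],
        [850.0, 2.0, 1.0],
    ]
    print(predict_prices(model, rows))
```

Call main() once the final model file has been downloaded into the working directory.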
That's it. You just trained your first model on a distributed dataset! Now it's up to your imagination to find projects where you can use this technology.
These are the main steps for getting started with FELT. In the following guides, you will find more detailed instructions for specific tasks.