.. _quickstart:

Quick Start
===========

.. note::

   Make sure that you have LineaPy installed. You can find relevant instructions in the :ref:`Installation ` section.

Once LineaPy is installed, we are ready to start using the package. We start with a simple example using the `Iris dataset `_ to demonstrate how to use LineaPy to 1) store a variable's history, 2) get its cleaned-up code, and 3) build an executable pipeline for the variable.

.. code:: python

   import lineapy
   import pandas as pd
   from sklearn.linear_model import LinearRegression, ElasticNet

   # Load data
   df = pd.read_csv("https://raw.githubusercontent.com/LineaLabs/lineapy/main/examples/tutorials/data/iris.csv")

   # Some very basic feature engineering
   color_map = {"Setosa": 0, "Versicolor": 1, "Virginica": 2}
   df["variety_color"] = df["variety"].map(color_map)
   df2 = df.copy()
   df2["d_versicolor"] = df["variety"].apply(lambda x: 1 if x == "Versicolor" else 0)
   df2["d_virginica"] = df["variety"].apply(lambda x: 1 if x == "Virginica" else 0)

   # Initialize two models
   model1 = LinearRegression()
   model2 = ElasticNet()

   # Fit both models
   model1.fit(
       X=df2[["petal.width", "d_versicolor", "d_virginica"]],
       y=df2["sepal.width"],
   )
   model2.fit(
       X=df[["petal.width", "variety_color"]],
       y=df["sepal.width"],
   )

Now, we reach the end of our development session and decide to save the ElasticNet model. We can store the model as a LineaPy :ref:`artifact ` as follows:

.. code:: python

   # Store the model as an artifact
   lineapy.save(model2, "iris_elasticnet_model")

A LineaPy artifact encapsulates both the value *and* the code that produced it, so we can easily retrieve the model's code, like so:

.. code:: python

   # Retrieve the model artifact
   artifact = lineapy.get("iris_elasticnet_model")

   # Check code for the model artifact
   print(artifact.get_code())

which will print:

.. code:: none

   import pandas as pd
   from sklearn.linear_model import ElasticNet

   df = pd.read_csv(
       "https://raw.githubusercontent.com/LineaLabs/lineapy/main/examples/tutorials/data/iris.csv"
   )
   color_map = {"Setosa": 0, "Versicolor": 1, "Virginica": 2}
   df["variety_color"] = df["variety"].map(color_map)
   model2 = ElasticNet()
   model2.fit(
       X=df[["petal.width", "variety_color"]],
       y=df["sepal.width"],
   )

Note that these are the minimal essential steps to produce the model. That is, LineaPy has automatically cleaned up the original code by removing extraneous operations that do not affect the model (here, everything related to ``df2`` and ``model1``).

Say we are now asked to retrain the model on a regular basis to account for any updates in the source data. We need to set up a pipeline to train the model, and LineaPy makes it as simple as a single function call:

.. code:: python

   lineapy.to_pipeline(
       artifacts=[artifact.name],
       pipeline_name="iris_model_pipeline",
       output_dir="output/",
       framework="AIRFLOW",
   )

which generates several files that can be used to execute the pipeline from the UI or CLI.

In sum, LineaPy automates time-consuming, manual steps in a data science workflow, helping us move our work into production more quickly.

.. note::

   To learn more about LineaPy's API, check out `this `_ tutorial.
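
To build some intuition for the code cleanup step described above, here is a minimal sketch of backward slicing over top-level statements using Python's standard ``ast`` module. This is a toy illustration under stated assumptions, not LineaPy's actual implementation: the helper names (``defined_names``, ``used_names``, ``backward_slice``) and the sample snippet are hypothetical, and the sketch ignores in-place mutation (e.g., ``df["col"] = ...`` or ``model.fit(...)``), control flow, and function bodies, all of which a real tool has to account for. It requires Python 3.9+ for ``ast.unparse``.

.. code:: python

   import ast


   def defined_names(stmt: ast.stmt) -> set:
       """Names bound by a top-level statement (assignments and imports)."""
       names = set()
       for node in ast.walk(stmt):
           if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
               names.add(node.id)
           elif isinstance(node, ast.alias):
               # `import a.b` binds `a`; `import a as x` binds `x`.
               names.add((node.asname or node.name).split(".")[0])
       return names


   def used_names(stmt: ast.stmt) -> set:
       """Names read by a top-level statement."""
       return {
           node.id
           for node in ast.walk(stmt)
           if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
       }


   def backward_slice(source: str, target: str) -> str:
       """Keep only the statements that `target` (transitively) depends on.

       Deliberately conservative: every earlier binding of a needed name is
       kept, even if a later statement overwrites it.
       """
       needed = {target}
       kept = []
       # Walk from the last statement to the first, growing the set of
       # needed names as relevant statements are found.
       for stmt in reversed(ast.parse(source).body):
           if defined_names(stmt) & needed:
               kept.append(stmt)
               needed |= used_names(stmt)
       return "\n".join(ast.unparse(stmt) for stmt in reversed(kept))


   # Hypothetical sample session: only some lines contribute to `area`.
   source = """
   import math
   import random
   radius = 2.0
   noise = random.random()
   area = math.pi * radius ** 2
   label = f"noise={noise}"
   """.replace("\n   ", "\n")

   print(backward_slice(source, "area"))

Running the sketch keeps only ``import math``, the ``radius`` assignment, and the ``area`` computation; the ``random`` import and the unrelated ``noise`` and ``label`` lines are dropped, mirroring how ``df2`` and ``model1`` disappear from the artifact code above.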