    Introduction

    Welcome to the Fiddler API! You can use our API to access Fiddler platform API endpoints.

    Currently we support language bindings in Python. You can view code examples in the dark area to the right.

    Installation

    Fiddler's primary SDK is the Python package fiddler-client. To install fiddler-client, run the commands to the side in your shell:

    # install python3
    brew install python3
    
    # run the `curl` command to download `get-pip.py`
    curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
    
    # install pip3
    python3 get-pip.py
    
    # install `fiddler-client` from the command line
    # (or download it from https://pypi.org/project/fiddler-client/):
    
    pip3 install fiddler-client
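    
    To verify the installation, you can import the client and print its version (a quick sanity check; this assumes the package exposes a __version__ attribute):
    
    # confirm the client imports correctly
    python3 -c "import fiddler; print(fiddler.__version__)"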
    

    Authentication

    import fiddler as fdl
    
    # NOTE: typically the API url for your running instance of Fiddler will be
    # "https://xxxxx.fiddler.ai" (or "http://localhost:4100" for onebox). However, use
    # "http://host.docker.internal:4100" as our URL if Jupyter is running in a docker
    # VM on the same macOS machine as onebox
    url = 'http://host.docker.internal:4100'
    
    # see <Fiddler URL>/settings/credentials to find, create, or change this token
    auth_token = '<FIDDLER_API_TOKEN>'
    
    # see <Fiddler URL>/settings/general to find this id (listed as "Organization ID")
    org_id = '<FIDDLER_ORG_ID>'
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    

    The Fiddler Client uses API keys to allow access to your Fiddler instance.

    Projects

    Projects are used to organize your models and datasets. A project represents a machine learning task, e.g. predicting house prices, assessing creditworthiness, or detecting fraud.

    A project can contain one or more models for the ML task, e.g. LinearRegression-HousePredict, RandomForest-HousePredict.

    More information can be found here

    Get All Projects

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    client.list_projects()
    

    The command above returns a structure like this:

    [
        'project_a',
        'project_b',
        'project_c'
    ]
    

    This endpoint retrieves the ids of all projects accessible by the user.

    Returns

    List[str], List of strings containing the ids of each project

    Create a Project

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project = 'example_project'
    client.create_project(project_id=my_project)
    

    The command above returns a structure like this:

    {
        'project_name': 'example_project'
    }
    

    This endpoint creates a project under the ID project_id.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.

    Returns

    Server response for action.

    Delete a Project

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project = 'example_project'
    client.delete_project(project_id=my_project)
    

    The command above returns a structure like this:

    true
    

    This endpoint deletes a specific project using the passed project_id.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.

    Returns

    Server response for action.

    Datasets

    A dataset in Fiddler is a data table containing features and targets for machine learning models. They can optionally have metadata and “decisions” columns, which can be used to segment the dataset for analyses, track business decisions, or as protected attributes in bias-related workflows. Users typically upload a representative sample of their model’s training data. Often a holdout test set is also included.

    The sample should be unbiased, faithfully capturing the statistical moments of the parent distribution. Further, values appearing in dataset columns should be representative of their full ranges (or, for categorical variables, of all possible values).
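    
    For example, a minimal sketch of preparing such a sample for upload (df_full and the sample sizes are assumptions for illustration):
    
    # draw an unbiased random sample of the full training data
    df_train = df_full.sample(n=10000, random_state=0)
    
    # hold out a disjoint test sample
    df_test = df_full.drop(df_train.index).sample(n=2000, random_state=0)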

    More information can be found here

    Get All Datasets

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project="example_project"
    client.list_datasets(project_id=my_project)
    

    The command above returns a structure like this:

    [
        'dataset_a',
        'dataset_b',
        'dataset_c'
    ]
    

    List the names of all datasets in the project.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.

    Returns

    List[str], List of strings containing the ids of each dataset.

    Upload a Dataset

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    dataset_id = 'example_dataset'
    
    df_schema = fdl.DatasetInfo.from_dataframe(pandas_df, max_inferred_cardinality=1000)
    
    client.upload_dataset(
                  project_id=project_id,
                  dataset={'train': df_train,
                            'test': df_test},
                  dataset_id=dataset_id,
                  info=df_schema)
    
    

    The command above returns a structure like this:

    {
      'row_count': 9134,
      'col_count': 24,
      'log': [
          'Importing dataset example_dataset',
          'Creating table for example_dataset',
          'Importing data file: train.csv',
          'Importing data file: test.csv'
      ]
    }
    

    Uploads a dataset into Fiddler.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    dataset Dict[str, pd.DataFrame] A dictionary mapping split names (e.g. 'train', 'test') to the corresponding pd.DataFrame objects to upload.
    dataset_id str The unique identifier of the dataset within the specified project.
    info Optional[fdl.DatasetInfo] A fdl.DatasetInfo object specifying all the details of the dataset. If not provided, a fdl.DatasetInfo will be inferred from the dataset and a warning raised.
    size_check_enabled bool True Flag to enable the dataframe size check. Default behavior is to raise a warning and present an interactive dialogue if the size of the dataframes exceeds the default limit. Set this flag to False to disable the checks.

    Returns

    Server response for action.

    Delete a Dataset

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project = 'example_project'
    my_dataset = 'example_dataset'
    
    client.delete_dataset(project_id=my_project, dataset_id=my_dataset)
    

    The command above returns a structure like this:

    'Dataset deleted example_dataset'
    

    Deletes a dataset within a project.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    dataset_id str The unique identifier of the dataset within the specified project.

    Returns

    Server response for action.

    Get DatasetInfo

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    dataset_id = 'example_dataset'
    
    df_schema = client.get_dataset_info(project_id=project_id, dataset_id=dataset_id)
    

    Gets a fdl.DatasetInfo object for the specified dataset. This object is used in creating the fdl.ModelInfo that includes the feature column names and data types.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    dataset_id str The unique identifier of the dataset within the specified project.

    Returns

    fdl.DatasetInfo, a fdl.DatasetInfo object for the specified dataset.

    Get Dataset Slice

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    query_str = r"""select * from "bank_churn"."bank_churn" limit 10"""
    
    slice_df = client.get_slice(sql_query=query_str, project_id=project_id)
    

    Uses the passed query string to generate a table containing the sliced data.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    sql_query str None A special SQL query that begins with the keyword "SLICE".
    columns_override Optional[List[str]] None A list of columns to return even if they are not specified in the slice.

    Returns

    pd.DataFrame, A table containing the sliced data (as a Pandas DataFrame)

    Models

    A model in Fiddler represents a machine learning model. A project will have one or more models for the ML task (e.g. a project to predict house prices might contain LinearRegression-HousePredict and RandomForest-HousePredict).

    At its most basic level, a model in Fiddler is simply a directory that contains three key components:

    1. The model file (e.g. *.pkl)
    2. model.yaml: The YAML file containing all the metadata needed to describe the model, what goes into the model, and what should come out of it. This model metadata is used in Fiddler’s explanations, analytics, and UI.
    3. package.py: The package.py contains all of the code needed to standardize the execution of the model. Please see the package.py section below for details.

    More information can be found here

    Get All Models for a Project

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project="example_project"
    client.list_models(project_id=my_project)
    

    The command above returns a structure like this:

    [
        'model_a',
        'model_b',
        'model_c'
    ]
    

    List the names of all models in the specified project.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.

    Returns

    List[str], List of strings containing the ids of each model.

    Register a Model

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    model_id = 'example_model'
    dataset_id = 'example_dataset'
    target = 'e_target'
    continuous_features = ['con_a', 'con_b']
    categorical_features = ['cat_a', 'cat_b']
    
    feature_columns = list(continuous_features + categorical_features)
    metadata_cols = ['meta']
    decision_cols = ['high_value']
    outputs = ['predicted_output']
    
    model_info = fdl.ModelInfo.from_dataset_info(
                dataset_info=client.get_dataset_info(project_id=project_id, dataset_id=dataset_id),
                target=target,
                features=feature_columns,
                metadata_cols=metadata_cols,
                decision_cols=decision_cols,
                outputs=outputs,
                input_type=fdl.ModelInputType.TABULAR,
                model_task=fdl.ModelTask.REGRESSION,
                display_name='My Model',
                description='this is a model for the example',
    )
    
    client.register_model(project_id=project_id, model_id=model_id, dataset_id=dataset_id, model_info=model_info)
    

    The command above returns a structure like this:

    'Model successfully registered on Fiddler. \n Visit http://your-org.fiddler.ai/projects/example_project'
    

    Register a model in Fiddler. This will generate a surrogate model (tabular models are supported for now), which can later be replaced with the original model using the update_model() API below.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    dataset_id str The unique identifier of the dataset within the specified project.
    model_info fdl.ModelInfo A schema object describing the model. Refer to the fdl.ModelInputType and fdl.ModelTask helper classes for details.

    Returns

    Server response for action.

    Upload Model Package

    import fiddler as fdl
    from pathlib import Path
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    model_id = 'example_model'
    artifact_path = Path('/tmp/foo')
    
    client.upload_model_package(artifact_path=artifact_path,  # expects a model.yaml, package.py, and model.pkl
                                project_id=project_id,
                                model_id=model_id)
    

    Creates and uploads a custom model using a model.yaml file along with custom glue code for running the model. Expects a model.yaml, package.py, and model.pkl within the specified artifact_path. This API supports tabular and NLP models, as well as models packaged in a container.

    To ingest and register a model container into Fiddler, upload_model_package, when given parameters such as image_uri and deployment_type, creates a set of Kubernetes objects (a deployment, service, ingress, etc.) in the background.
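    
    For example, a hypothetical sketch of such a containerized deployment, using only the parameters documented below (all values are illustrative):
    
    client.upload_model_package(artifact_path=artifact_path,
                                project_id=project_id,
                                model_id=model_id,
                                deployment_type='executor',    # Fiddler needs the model internals
                                image_uri='repo/my-model:v1',  # illustrative image URI
                                namespace='default',
                                port=5100,
                                replicas=1,
                                cpus=0.25,
                                memory='128m',
                                gpus=0,
                                await_deployment=True)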

    Parameters

    Parameter Type Default Description
    artifact_path Path None A path to a directory containing all of the model artifacts needed to run the model. This includes a package.py file with the glue code needed to run the model.
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    deployment_type Optional[str] 'predictor' One of {'predictor', 'executor'}.
    - 'predictor': the model only exposes a /predict endpoint (typically simple scikit-learn-style models).
    - 'executor': Fiddler needs access to the model internals (typically deep models, e.g. TensorFlow or PyTorch).
    image_uri Optional[str] None A URI of the form 'registry/image:tag' which, if specified, will be used to create a new runtime and then serve the model.
    namespace Optional[str] 'default' The kubernetes namespace to use for the newly created runtime.
    port Optional[int] 5100 The port to use for the newly created runtime.
    replicas Optional[int] 1 The number of replicas running the model.
    cpus Optional[float] 0.25 The number of CPU cores reserved per replica.
    memory Optional[str] '128m' The amount of memory reserved per replica.
    gpus Optional[int] 0 The number of GPU cores reserved per replica.
    await_deployment Optional[bool] True Whether to block until deployment completes.

    Returns

    Server response for action.

    Update Model

    import fiddler as fdl
    from pathlib import Path
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    model_id = 'example_model'
    model_dir = Path('/tmp/foo')
    
    client.update_model(project_id, model_id, model_dir)
    

    Update the specified model with the model binary and package.py from the specified model_dir. No changes to the model schema are allowed; this simply replaces the model with another model that has the same input features, target, and outputs.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    model_dir Path A path to a directory containing all of the model artifacts needed to run the model. This includes a package.py file with the glue code needed to run the model.
    force_pre_compute bool True If True, refresh the pre-computed values. This can also be done manually by calling trigger_pre_computation.

    Returns

    Server response for action.

    Delete a Model

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project = 'example_project'
    my_model = 'example_model'
    
    client.delete_model(project_id=my_project, model_id=my_model)
    

    Deletes a model within a project.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    delete_prod bool False Boolean value to delete the production table. By default this table is not dropped.
    delete_pred bool True Boolean value to delete the prediction table. By default this table is dropped.

    Returns

    Server response for action.

    Get ModelInfo

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    my_project = 'example_project'
    my_model = 'example_model'
    
    client.get_model_info(project_id=my_project, model_id=my_model)
    

    Gets a fdl.ModelInfo object describing the specified project's model.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.

    Returns

    fdl.ModelInfo, object describing the specified project's model.

    Explainability

    API related to explainability

    Run Model

    import fiddler as fdl
    import pandas as pd
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id='example_project'
    dataset_id='example_dataset'
    model_id='example_model'
    
    event_sample_df = client.get_dataset(dataset_id, splits=['test'], max_rows=10000)
    event_sample_df = event_sample_df.sample(n=500).reset_index(drop=True)
    
    # get prediction result
    result = client.run_model(project_id, model_id, event_sample_df, log_events=False)
    result_df = pd.concat([event_sample_df, result], axis=1)
    result_df.head()
    

    Executes a model in the Fiddler platform on the specified pd.DataFrame, returning the outputs of the model in a pd.DataFrame object.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    df pd.DataFrame A dataframe containing model inputs as rows.
    log_events bool False Variable determining if the predictions generated should be logged as production traffic.
    casting_type bool False Indicating if Fiddler should try to cast the data in the event with the type referenced in model info.

    Returns

    pd.DataFrame, a DataFrame containing the model outputs, one row of predictions per input row.

    Run Explanation

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id='example_project'
    dataset_id='example_dataset'
    model_id='example_model'
    
    # select a single row to explain
    event_sample_df = client.get_dataset(dataset_id, splits=['test'], max_rows=10000)
    row_df = event_sample_df.sample(n=1).reset_index(drop=True)
    
    # get explanation result for the single instance
    explanation = client.run_explanation(project_id=project_id,
                                         model_id=model_id,
                                         df=row_df,
                                         dataset_id=dataset_id,
                                         explanations='shap')
    
    

    Explains a model's prediction on a single instance.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    df pd.DataFrame A dataframe containing model inputs as rows.
    explanations Union[str, Iterable[str]] 'shap' A single string or list of strings specifying which explanation algorithms to run.
    Possible algo values: 'ig_flex', 'fiddler_shapley_values', 'ig', 'shap', 'mean_reset', 'zero_reset', 'permute'
    dataset_id str The unique identifier of the dataset within the specified project.
    casting_type bool False Indicating if Fiddler should try to cast the data in the event with the type referenced in model info.
    return_raw_response bool False If False (the default), return the response as an explanation object; set to True if the raw explanation response is needed.

    Returns

    Union[AttributionExplanation, MulticlassAttributionExplanation, List[AttributionExplanation], List[MulticlassAttributionExplanation]] A single fdl.AttributionExplanation if explanations was a single string, or a list of AttributionExplanation objects if a list of explanations was requested.

    Run Feature Importance

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    # calling feature importance with kwargs parameters
    result = client.run_feature_importance('lending',
                                        'logreg-all',
                                        'p2p_loans',
                                        dataset_splits=('test.csv',),
                                        n_iterations=100,
                                        n_references=100,
                                        impact_not_importance=True,
                                        ci_confidence_level=0.99
                                        )
    
    # using slice query
    query_str = r"""select * from "p2p_loans"."logreg-all" where sub_grade = 'A3' limit 100"""
    
    result = client.run_feature_importance('lending',
                                        'logreg-all',
                                        'p2p_loans',
                                        slice_query=query_str)
    

    Computes global feature importance for a model over a dataset or a slice.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    dataset_id str The unique identifier of the dataset within the specified project.
    dataset_splits Optional[List[str]] None If specified, importance will only be computed over these splits. Otherwise, all splits will be used. Only a single split is currently supported.
    slice_query Optional[str] None A special SQL query.
    kwargs ** Additional parameters to be passed to the importance algorithm. For example:
    - n_inputs, n_iterations, n_references, ci_confidence_level, and impact_not_importance.

    Returns

    Dict[str, Any], a dictionary containing the feature importance results.

    Trigger Pre-Computation

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    model_id = 'example_model'
    dataset_id = 'example_dataset'
    
    client.trigger_pre_computation(project_id=project_id, model_id=model_id, dataset_id=dataset_id)
    

    Triggers various pre-computation steps within the Fiddler service based on the input parameters. Use the computation flags to fit your needs, or leave them at their default of True to run all pre-computation steps.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    dataset_id str The unique identifier of the dataset within the specified project.
    overwrite_cache Optional[bool] True Boolean indicating whether to overwrite previously cached information.
    batch_size Optional[int] 20 Batch size of global PDP calculation.
    calculate_predictions Optional[bool] True Boolean indicating whether to pre-calculate and store model predictions.
    cache_global_pdps Optional[bool] True Boolean indicating whether to pre-calculate and cache global partial dependence plots.
    cache_global_impact_importance Optional[bool] True Boolean indicating whether to pre-calculate and cache global feature impact and global feature importance.

    Returns

    Server response for action.

    Get Mutual Information

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    dataset_id = 'example_dataset'
    features = ['f_a', 'f_b', 'f_c', 'f_d']
    
    result = client.get_mutual_information(project_id=project_id, dataset_id=dataset_id, features=features)
    

    Mutual information (MI) measures the dependency between two random variables. It is a non-negative value: MI is zero if the two random variables are independent, and higher MI values mean stronger dependency.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    dataset_id str The unique identifier of the dataset within the specified project.
    features List[str] List of features to compute mutual information.

    Returns

    Dict[str, Any], a dictionary of mutual information w.r.t the given features.

    Run Fairness

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'census_income'
    model_id = 'model_census_income'
    dataset_id = 'census_income_data'
    
    client.run_fairness(project_id=project_id,
                    model_id=model_id,
                    dataset_id=dataset_id,
                    protected_features=['Sex'],
                    positive_outcome=' >50K',
                    slice_query=None,
                    score_threshold=0.5)
    
    # using slice query
    slice_query = r"""SELECT * FROM census_income_data.model_census_income WHERE __source_file='test.csv'"""
    client.run_fairness(project_id=project_id,
                    model_id=model_id,
                    dataset_id=dataset_id,
                    protected_features=['Race', 'Sex'],
                    positive_outcome=' >50K',
                    slice_query=slice_query,
                    score_threshold=0.5)
    

    Get fairness metrics for a model over a dataset.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    dataset_id str The unique identifier of the dataset within the specified project.
    protected_features List[str] List of protected features.
    positive_outcome Union[str, int] Name or value of the positive outcome.
    slice_query Optional[str] None If specified, slice the data.
    score_threshold Optional[float] 0.5 Positive score threshold applied to get outcomes.

    Returns

    Dict[str, Any], A dictionary with the fairness metrics, technical_metrics, labels distribution and model outcomes distribution.

    Monitoring

    API related to monitoring

    Add Monitoring Config

    import fiddler as fdl
    import pandas as pd
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id='example_project'
    model_id='example_model'
    
    # sample config info
    config = {
        'min_bin_value': 3600, # possible values 300, 3600, 7200, 43200, 86400, 604800 secs
        'time_ranges': ['Day', 'Week', 'Month', 'Quarter', 'Year'],
        'default_time_range': 7200,
        'tag': 'anything you want'
    }
    
    client.add_monitoring_config(config_info=config, project_id=project_id, model_id=model_id)
    

    Adds a monitoring config for an entire org, a project, or a model.

    Parameters

    Parameter Type Default Description
    config_info Dict Monitoring config info for an entire org, a project, or a model.
    project_id Optional[str] The unique identifier for the project in the Fiddler platform.
    model_id Optional[str] The unique identifier of the model within the specified project.

    Returns

    Server response for action.

    Publish Event

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    # publish_event to log prediction
    client.publish_event(project_id='bank_churn',
                        model_id='bank_churn',
                        event = {
                           "CreditScore": 650,      # data type: int
                           "Geography": "France",   # data type: category
                           "Gender": "Female",
                           "Age": 45,
                           "Tenure": 2,
                           "Balance": 10000.0,      # data type: float
                           "NumOfProducts": 1,
                           "HasCrCard": "Yes",
                           "isActiveMember": "Yes",
                           "EstimatedSalary": 120000,
                           "probability_churned": 0.105
                           },
                        event_id='some_unique_id',      # optional
                        update_event=False,             # optional
                        event_timestamp=1609462800000,  # optional
                        dry_run=False
                        )
    
    # publish_event to log label
    client.publish_event(project_id='bank_churn',
                        model_id='bank_churn',
                        event = {
                            "Churn Result": 'Churned',  # data type: category
                            "Discount": 'True'',  # data type: category
                            },
                        event_id='some_unique_id', #optional
                        update_event=True, #optional
                        event_time_stamp=1609462800000      #optional
                        )
    

    Publishes a monitoring event to the Fiddler platform.

    Take a look at the ingestion code samples and notebook example to learn more.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    event Dict Dictionary of event details, such as features and predictions
    event_id Optional[str] None Unique id that refers to this event
    update_event Optional[bool] None Indicates whether to update an existing event, e.g. to log a label or decision for it.
    event_timestamp Optional[int] None The UTC timestamp of the event in epoch milliseconds (e.g. 1609462800000). Defaults to current time.
    casting_type Optional[bool] False Indicating if Fiddler should try to cast the data in the event with the type referenced in model info. Default to False
    dry_run Optional[bool] False If True, the event isn't published; instead, the user gets a report showing whether the event (along with the model) would face any problems with respect to monitoring.
    return_raw_response bool False If True, return the raw server response instead of a parsed object. Defaults to False.

    Returns

    Server response for action.
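    
    Since event_timestamp expects epoch milliseconds, a minimal sketch of producing one for the current time:
    
    import time
    
    # current UTC time as epoch milliseconds, e.g. 1609462800000
    event_timestamp = int(round(time.time() * 1000))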

    Publish Events Batch

    import fiddler as fdl
    import pandas as pd
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    model_id = 'example_model'
    
    event_log = pd.read_csv('./event_log.csv')
    event_df = event_log.sample(n=100)
    
    aws_par_file = 's3://bucket/events.parquet'
    gcp_avro_file = 'gs://bucket/events.avro'
    local_csv_file = 'event_log.csv'
    
    batch_sources = [
        event_df,        # Pandas DataFrame
        aws_par_file,    # Parquet file hosted on S3
        gcp_avro_file,   # Avro file hosted on GCP
        local_csv_file,  # CSV file on local disk
    ]
    
    for batch_source in batch_sources:
        client.publish_events_batch(
                        project_id=project_id,
                        model_id=model_id,
                        batch_source=batch_source,
                        timestamp_field='timestamp')
    

    Publishes a batch of events to Fiddler.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    model_id str The unique identifier of the model within the specified project.
    batch_source Union[pd.DataFrame, str] Batch object to be published. Can be one of:
    - Pandas DataFrame (pd.DataFrame)
    - CSV file (*.csv),
    - AVRO file (*.avro),
    - PKL Pandas DataFrame (*.pkl),
    - gzipped CSV file (*.csv.gz),
    - Parquet file (*.pq)
    id_field Optional[str] None Column to extract id value from.
    update_event Optional[bool] False Bool indicating if the events are updates to previously published rows.
    timestamp_field Optional[str] None Column to extract timestamp value from. Timestamp must match the specified format in timestamp_format.
    timestamp_format Optional[FiddlerTimestamp] FiddlerTimestamp.INFER Format of timestamp within batch object. Can be one of:
    - FiddlerTimestamp.INFER
    - FiddlerTimestamp.EPOCH_MILLISECONDS
    - FiddlerTimestamp.EPOCH_SECONDS
    - FiddlerTimestamp.ISO_8601
    data_source Optional[BatchPublishType] None Source of the batch object. Specify one of the following in case automatic inference fails:
    - BatchPublishType.DATAFRAME
    - BatchPublishType.LOCAL_DISK
    - BatchPublishType.AWS_S3
    - BatchPublishType.GCP_STORAGE
    casting_type Optional[bool] False Indicating if Fiddler should try to cast the data in the event with the type referenced in model info. Default to False.
    credentials Optional[dict] None Dictionary containing authorization for AWS or GCP.
    For AWS S3, list of expected keys are
    ['aws_access_key_id', 'aws_secret_access_key', 'aws_session_token']
    with 'aws_session_token' being applicable to the AWS account being used.

    For GCP, list of expected keys are
    ['gcs_access_key_id', 'gcs_secret_access_key', 'gcs_session_token']
    with 'gcs_session_token' being applicable to the GCP account being used.

    Returns

    Server response for action.
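    
    For example, a hedged sketch of publishing a CSV file whose timestamps are already in epoch milliseconds, assuming FiddlerTimestamp is exposed on the fiddler package as the parameter table above suggests:
    
    client.publish_events_batch(
                    project_id=project_id,
                    model_id=model_id,
                    batch_source=local_csv_file,
                    timestamp_field='timestamp',
                    timestamp_format=fdl.FiddlerTimestamp.EPOCH_MILLISECONDS)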

    Model Package

    To upload your model to Fiddler, you need to create a model package.

    A model package is essentially a directory that contains:
    
    - a set of instructions for how the model will operate in production (called package.py)
    - some metadata about the model (called model.yaml)
    - your model
    - additional assets, such as preprocessing pipelines

    package.py

    At the heart of a model package is the package.py script.

    package.py is a Python module that the user provides which:

    1. Facilitates model loading
    2. Implements interfaces necessary for the Fiddler platform to interact with models.

    This module provides the flexibility that enables Fiddler to support a wide variety of complex models. For certain common highly-standardized frameworks, the Fiddler client provides helper upload methods to auto-generate this module (e.g. scikit-learn).

    package.py will be invoked within the model's specific assets directory and must implement a get_model() function. This function returns an instance of a model class that implements the transform_input() and predict() methods described below.
    
    Additionally, there's the business of loading the model and any serialized preprocessing objects. This is most commonly performed in the model object's initializer, __init__().

    To the side is what the package.py should look like in the most general sense:

    %%writefile package.py
    
    import pickle
    from pathlib import Path
    import pandas as pd
    
    PACKAGE_PATH = Path(__file__).parent
    
    """
    Here, we create a ModelPackage object that contains all the necessary
        components for the model to run smoothly.
    """
    
    class ModelPackage:
    
        def __init__(self):
            """
            Here we can load in the model and any other necessary
                serialized objects from the PACKAGE_PATH.
            """
    
        def transform_input(self, input_df):
            """
            The transform_input() function lets us apply any necessary
                preprocessing to our input before feeding it into our model.
            It should return the transformed version of input_df.
            """
    
        def predict(self, input_df):
            """
            The predict() function should return a DataFrame of predictions
                whose columns correspond to the outputs of your model.
    
            For regression models, this DataFrame typically has a single column
                that stores the continuous output of your model.
    
            For binary classification models, this DataFrame typically has a
                single column that stores the probability prediction for the
                positive class.
    
            For multiclass classification models, this DataFrame typically has
                the same number of columns as it does classes (one for each
                class probability prediction).
            """
    
    
    def get_model():
        return ModelPackage()
    

    Some additional notes about package.py:
    
    - Every package.py should have a predict() function and a transform_input() function. These are the only functions that will be invoked by Fiddler directly.
    - You can write your own functions and add them to package.py. The complexity of package.py will depend on the framework you are using and the task you are performing.
    - You can incorporate other .py scripts into your package.py with relative imports. Just add them to the model directory along with package.py.
    - Generally, you should call transform_input() from within the predict() function before making predictions.
    - If you don't need to transform your input in any way, you can just return the original input from the transform_input() function as shown to the side and leave it out of the predict() function.

    # Example `transform_input` with no changes made to the `input_df`
    def transform_input(self, input_df):
        return input_df
    

    model.yaml

    Fiddler also requires that you save some model metadata in YAML form and include it in the model package.

    You can harvest this metadata by creating a fdl.ModelInfo object. This can be generated from an existing Fiddler dataset. An example is to the side.

    # Using an existing Fiddler dataset to generate a fdl.ModelInfo object
    import fiddler as fdl
    
    model_info = fdl.ModelInfo.from_dataset_info(
        dataset_info=dataset_info,  # your fdl.DatasetInfo object
        target=target,              # the name of your target column
        outputs=outputs,            # the names of your model output columns
    )
    

    Once you have your ModelInfo object, you can call its to_dict() function to store it as a YAML file as shown to the side.

    # Storing ModelInfo as a YAML
    import yaml
    
    with open('model.yaml', 'w') as yaml_file:
        yaml.dump({'model': model_info.to_dict()}, yaml_file)
    

    Model Assets

    The final step in creating a model package is to include any assets you need in order to make predictions.

    This includes your model artifact and any serialized preprocessing objects that you may need.

    These serialized objects should be loaded by the __init__() function you wrote inside package.py.
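    
    For example, a minimal sketch of loading a pickled model and preprocessing pipeline inside __init__() (the file names here are assumptions):
    
    import pickle
    from pathlib import Path
    
    PACKAGE_PATH = Path(__file__).parent
    
    class ModelPackage:
    
        def __init__(self):
            # load the serialized model artifact
            with open(PACKAGE_PATH / 'model.pkl', 'rb') as f:
                self.model = pickle.load(f)
    
            # load a serialized preprocessing object, if your model needs one
            with open(PACKAGE_PATH / 'preprocessor.pkl', 'rb') as f:
                self.preprocessor = pickle.load(f)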

    The Complete Model Package

    At this point, your model package (directory) should contain:
    
    - package.py
    - model.yaml
    - any model assets
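    
    As a concrete sketch, the directory might look like this (the model file name is an assumption; it depends on your framework):
    
    example_model/
    ├── package.py
    ├── model.yaml
    └── model.pkl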

    Sklearn Model

    Here is another example of a package.py for a simple SKLearn model:

    from pathlib import Path
    # with the `sklearn_wrapper.py` file dropped into the top-level org directory
    from ...sklearn_wrapper import SimpleSklearnModel
    
    PACKAGE_PATH = Path(__file__).parent
    MODEL_FILE_NAME = 'model.pkl'
    PRED_COLUMN_NAMES = ['setosa', 'versicolor', 'virginica']
    
    def get_model():
        return SimpleSklearnModel(PACKAGE_PATH / MODEL_FILE_NAME,
                                  PRED_COLUMN_NAMES, is_classifier=True,
                                  is_multiclass=True)
    

    TensorFlow Text Classifier
    
    To the side is a nontrivial example of a package.py for a TensorFlow text classifier. The model's raw input is a text string, which is tokenized by a pretrained tokenizer and padded to provide fixed-length sequences of token ids.

    import pathlib
    import pickle as pk
    import pandas as pd
    import tensorflow as tf
    from tensorflow.keras.models import load_model
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    
    MODEL_DIR = pathlib.Path(__file__).parent
    
    SAVED_MODEL_PATH = MODEL_DIR / 'spam_keras.h5'
    TOKENIZER_PATH = MODEL_DIR / 'tokenizer.pkl'
    
    TEXT_FIELD = 'text'
    OUTPUT_COLUMNS = ['spam_probability']
    
    MAX_SEQUENCE_LENGTH = 50
    
    class MyModel:
      def __init__(self):
          # load the tokenizer from a pickle file
          with open(TOKENIZER_PATH, 'rb') as handle:
              self.tokenizer = pk.load(handle)
    
          # create persistent TensorFlow session
          self.sess = tf.Session()
    
          # load model into that session
          with self.sess.as_default():
              self.model = load_model(SAVED_MODEL_PATH)
    
      def transform_input(self, input_df):
          # tokenize the raw string
          input_tokens = [self.tokenizer.texts_to_sequences([x])[0]
            for x in input_df[TEXT_FIELD].values]
    
          # pad the token list to fixed length
          input_tokens = pad_sequences(input_tokens, MAX_SEQUENCE_LENGTH)
    
          return pd.DataFrame(input_tokens.tolist())
    
      def predict(self, input_df):
          # transform the raw input
          transformed_input_df = self.transform_input(input_df)
          # apply model to transformed input
          with self.sess.as_default():
              predictions = self.model.predict(transformed_input_df)
    
          return pd.DataFrame(data=predictions, columns=OUTPUT_COLUMNS)
    
    def get_model():
      return MyModel()
    

    This framework, which is quite simple in cases where the dataset schema closely resembles the model input, provides a great deal of flexibility to compose pipelines involving user code, saved processing modules, and library functions where necessary.

    PyTorch Model
    
    One final example of a package.py, this time for a PyTorch model:

    import pathlib
    import pickle
    import fastai.text
    import pandas as pd
    import torch
    import torch.nn.functional as F
    
    PACKAGE_PATH = pathlib.Path(__file__).parent
    PREPROCESSOR_PATH = PACKAGE_PATH / 'preprocessor.pkl'
    MODEL_WEIGHTS_PATH = PACKAGE_PATH / 'model_weights.pth'
    OUTPUT_NAME = 'sentiment'
    
    class MyModel:
        def load_model(self):
            # set up preprocessing functions
            with PREPROCESSOR_PATH.open('rb') as f:
                self.preprocessor = pickle.load(f)
    
            # load the fast.ai pytorch model and set to eval mode
            model = fastai.text.learner.get_text_classifier(
                arch=fastai.text.models.awd_lstm.AWD_LSTM,
                vocab_sz=...,  # placeholder: set to your preprocessor's vocabulary size
                n_class=2)
            loaded_state_dict = torch.load(MODEL_WEIGHTS_PATH, map_location='cpu')
            state_dict = dict(zip(model.state_dict().keys(),
                                  loaded_state_dict['model'].values()))
            model.load_state_dict(state_dict)
            model.eval()
            self.model = model
    
        def transform_input(self, input_df):
            return self.preprocessor(input_df)
    
        def predict(self, input_tensor):
            with torch.no_grad():
                pre_softmax_pred, _, _ = self.model.forward(input_tensor)
                pred = F.softmax(pre_softmax_pred, dim=1)[:, 1].numpy()
            return pd.DataFrame({OUTPUT_NAME: pred})
    
    def get_model():
        return MyModel()
    

    Additional Examples

    Additional examples can be found in the models folder of Fiddler Samples and embedded in the tutorial examples.

    Access (RBAC)

    This section lists the Role-Based Access Control (RBAC) APIs that administrators and non-administrators can use to create and list projects and control user access to them.

    Share Project

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    # administrators can share any project with someone else, even if they do not OWN it
    # a project OWNER can share their project with someone else
    res = client.share_project(project_name='bank_churn', role='READ', user_name='user1@fiddler.ai')
    

    Share a project with other users and/or teams.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    role str one of the ['READ', 'WRITE', 'OWNER'] roles.
    user_name Optional['str'] Username, typically an email address.
    team_name Optional['str'] Name of the team.

    Returns

    Server response for action.

    Unshare Project

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    # administrators can unshare any project from someone else, even if they do not OWN it
    # a project OWNER can unshare their project from someone else
    res = client.unshare_project(project_name='quickstart', role='WRITE', user_name='user2@fiddler.ai')
    

    Remove share access of a project from other users and/or teams.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.
    role str one of the ['READ', 'WRITE', 'OWNER'] roles.
    user_name Optional['str'] Username, typically an email address.
    team_name Optional['str'] Name of the team.

    Returns

    Server response for action.

    List Org Roles

    import json
    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    org_roles = client.list_org_roles()
    print(json.dumps(org_roles, indent=2))
    

    List the users and their roles in the organization.

    Returns

    List[str], List of users and their roles in the organization.

    List Project Roles

    import json
    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_roles = client.list_project_roles('bank_churn')
    print('users and teams and their roles on project: bank_churn', json.dumps(project_roles, indent=2))
    

    List the users and teams with access to a given project.

    Parameters

    Parameter Type Default Description
    project_id str The unique identifier for the project in the Fiddler platform.

    Returns

    List[str], List of users and teams with access to a given project.

    List Teams

    import json
    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    teams = client.list_teams()
    print(json.dumps(teams, indent=2))
    

    List the teams and the members in each team.

    Returns

    Dict, Dict with teams as keys and list of members as values.

    DatasetInfo

    class fdl.DatasetInfo()
    

    Information about a dataset. Defines the schema.

    Parameter Type Default Description
    display_name str A name for user-facing display (different from an id).
    columns List[fdl.Column] List of fdl.Column objects comprising the columns of the dataset.
    files Optional[List[str]] None List of files
    dataset_id str None The unique identifier of the dataset within the specified project.

    DatasetInfo from_dataframe()

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    df_schema = fdl.DatasetInfo.from_dataframe(df=pandas_df, max_inferred_cardinality=1000)
    

    Creates a fdl.DatasetInfo object from a Pandas pd.DataFrame.

    Parameter Type Default Description
    df Union[pd.DataFrame, Iterable[pd.DataFrame]] Either a single DataFrame or an iterable of DataFrame objects. If an iterable is given, all dataframes must have the same columns.
    display_name str '' A name for user-facing display (different from an id).
    max_inferred_cardinality Optional[int] None If not None, any string-typed column with fewer than max_inferred_cardinality unique values will be inferred as a category (useful for cases where use of the built-in CategoricalDtype functionality of Pandas is not desired).
    zero_one_as_int bool False If True, skip inferring integer-valued columns containing only {0, 1} values as type bool (keep them as integers).

    Returns

    fdl.DatasetInfo, A fdl.DatasetInfo object.
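    
    As a small illustration of max_inferred_cardinality, in the sketch below the string column 'color' has only three unique values, so it should be inferred as a category (the column names and data are illustrative):
    
    import pandas as pd
    import fiddler as fdl
    
    df = pd.DataFrame({'color': ['red', 'green', 'blue', 'red'],
                       'value': [1.0, 2.0, 3.0, 4.0]})
    
    schema = fdl.DatasetInfo.from_dataframe(df=df, max_inferred_cardinality=1000)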

    DatasetInfo to_dict()

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    dataset_info_obj = fdl.DatasetInfo.from_dataframe(df=pandas_df, max_inferred_cardinality=1000)
    
    dataset_info_dict = dataset_info_obj.to_dict()
    

    Converts a fdl.DatasetInfo object into a Python dictionary format.

    Returns

    Dict, A Dict object representation of a fdl.DatasetInfo.

    DatasetInfo from_dict()

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    dataset_info_obj = fdl.DatasetInfo.from_dict(deserialized_json=dataset_info_dict)
    

    Converts a Python dictionary format into a fdl.DatasetInfo object.

    Returns

    fdl.DatasetInfo, A fdl.DatasetInfo object.

    ModelInfo

    class fdl.ModelInfo()
    

    Information about a model.

    Parameter Type Default Description
    display_name str A name for user-facing display (different from an id).
    input_type ModelInputType Specifies whether the model is in the tabular or text paradigm.
    model_task ModelTask Specifies the task the model is designed to address.
    inputs List[Column] A list of Column objects corresponding to the dataset columns that are fed as inputs into the model.
    outputs List[Column] A list of Column objects corresponding to the table output by the model when running predictions.
    metadata Optional[List[Column]] None A list of Column objects corresponding to metadata information that does not contribute to model predictions.
    decisions Optional[List[Column]] None A list of Column objects corresponding to decisions based off the model.
    targets Optional[List[Column]] None A list of Column objects corresponding to the dataset columns used as targets/labels for the model. If not provided, some functionality (like scoring) will not be available.
    framework Optional[str] None A string providing information about the software library and version used to train and run this model.
    description Optional[str] None A user-facing description of the model.
    datasets Optional[List[str]] None A list of the dataset ids used by the model.
    mlflow_params Optional[MLFlowParams] None MLFlow parameters.
    model_deployment_params Optional[ModelDeploymentParams] None Model Deployment parameters.
    artifact_status Optional[ArtifactStatus] None Status of the model artifact
    preferred_explanation_method Optional[ExplanationMethod] None Specifies a preference for the default explanation algorithm. The front end will choose an explanation method if unspecified (typically Fiddler Shapley). Providing ExplanationMethod.CUSTOM will cause the first of the custom_explanation_names to be the default (which must be defined in that case).
    custom_explanation_names Optional[Sequence[str]] [] List of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py.

    ModelInfo from_dataset_info()

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    project_id = 'example_project'
    model_id = 'example_model'
    dataset_id = 'example_dataset'
    target = 'e_target'
    continuous_features = ['con_a', 'con_b']
    categorical_features = ['cat_a', 'cat_b']
    
    feature_columns = list(continuous_features + categorical_features)
    metadata_cols = ['meta']
    decision_cols = ['high_value']
    outputs = ['predicted_output']
    
    model_info = fdl.ModelInfo.from_dataset_info(
                dataset_info=client.get_dataset_info(project_id=project_id, dataset_id=dataset_id),
                target=target,
                features=feature_columns,
                metadata_cols=metadata_cols,
                decision_cols=decision_cols,
                outputs=outputs,
                input_type=fdl.ModelInputType.TABULAR,
                model_task=fdl.ModelTask.REGRESSION,
                display_name='My Model',
                description='this is a model for the example',
    )
    

    Creates a fdl.ModelInfo object from a fdl.DatasetInfo object and additional parameters.

    Parameter Type Default Description
    dataset_info DatasetInfo A DatasetInfo object describing the training dataset.
    target Optional[str] The column name of the target the model predicts.
    features Optional[Sequence[str]] None A list of column names for columns used as features.
    metadata_cols Optional[Sequence[str]] None A list of column names for columns used as metadata.
    decision_cols Optional[Sequence[str]] None A list of column names for columns used as decisions.
    display_name Optional[str] None A model name for user-facing display (different from an id).
    description Optional[str] None A user-facing description of the model.
    input_type ModelInputType ModelInputType.TABULAR Specifies the paradigm (tabular or text) of the model.
    model_task Optional[ModelTask] None Specifies the prediction task addressed by the model. If not explicitly provided, this will be inferred from the data type of the target variable.
    outputs Optional[Sequence[str]] None A list of column names for columns used as model outputs.
    model_deployment_params Optional[ModelDeploymentParams] None Model Deployment parameters.
    preferred_explanation_method Optional[ExplanationMethod] None Specifies a preference for the default explanation algorithm. The front end will choose an explanation method if unspecified (typically Fiddler Shapley). Providing ExplanationMethod.CUSTOM will cause the first of the custom_explanation_names to be the default (which must be defined in that case).
    custom_explanation_names Optional[Sequence[str]] [] List of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py.

    Returns

    fdl.ModelInfo, A fdl.ModelInfo object.

    ModelInfo to_dict()

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    # model_info_obj is a fdl.ModelInfo object, e.g. created via fdl.ModelInfo.from_dataset_info()
    
    model_info_dict = model_info_obj.to_dict()
    

    Converts a fdl.ModelInfo object into a Python dictionary format.

    Returns

    Dict, A Dict object representation of a fdl.ModelInfo.

    ModelInfo from_dict()

    import fiddler as fdl
    
    client = fdl.FiddlerApi(url=url, org_id=org_id, auth_token=auth_token)
    
    model_info_obj = fdl.ModelInfo.from_dict(deserialized_json=model_info_dict)
    

    Converts a Python dictionary format into a fdl.ModelInfo object.

    Returns

    fdl.ModelInfo, A fdl.ModelInfo object.

    ModelInputType

    @enum.unique
    class fdl.ModelInputType(enum.Enum)
    

    Supported model paradigms for the Fiddler platform.

    ModelInputType Enum Values

    import fiddler as fdl
    
    model_input_tab = fdl.ModelInputType.TABULAR
    model_input_text = fdl.ModelInputType.TEXT
    
    Enum Value Description
    fdl.ModelInputType.TABULAR Tabular model
    fdl.ModelInputType.TEXT Text-based model

    ModelTask

    @enum.unique
    class fdl.ModelTask(enum.Enum)
    

    Supported model tasks for the Fiddler platform.

    ModelTask Enum Values

    import fiddler as fdl
    
    model_task_binary = fdl.ModelTask.BINARY_CLASSIFICATION
    model_task_multi = fdl.ModelTask.MULTICLASS_CLASSIFICATION
    model_task_regression = fdl.ModelTask.REGRESSION
    
    Enum Value Description
    fdl.ModelTask.BINARY_CLASSIFICATION Binary classification model
    fdl.ModelTask.MULTICLASS_CLASSIFICATION Multiclass classification model
    fdl.ModelTask.REGRESSION Regression model