    API Reference

    The Fiddler API provides tools for sending data to and receiving data from the Fiddler platform.

    Currently, we have a Python SDK that allows you to connect to Fiddler directly from a Jupyter notebook or automated pipeline.

    For each API, a description can be found on the left of the page, along with usage information. Additionally, code examples can be found on the right of the page.

    Installation

    Installation

    # Install python3
    brew install python3
    
    # Run the curl command to download get-pip.py
    curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
    
    # Install pip3
    python3 get-pip.py
    
    # Install `fiddler-client` from the command line
    pip3 install fiddler-client
    

    The Fiddler Python SDK is available as the fiddler-client package on PyPI.
    To install the package, run the commands on the right in your shell.

    Client Setup

    fdl.FiddlerApi

    The API client object used to communicate with Fiddler.

    In order to use the client, you'll need to provide authentication details as shown below.

    Usage

    import fiddler as fdl
    
    URL = 'https://app.fiddler.ai'
    ORG_ID = 'my_org'
    AUTH_TOKEN = 'p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58'
    
    client = fdl.FiddlerApi(
        url=URL,
        org_id=ORG_ID,
        auth_token=AUTH_TOKEN
    )
    

    Proxy URLs

    proxies = {
        'http' : 'http://proxy.example.com:1234',
        'https': 'https://proxy.example.com:5678'
    }
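
    To route requests through these proxies, pass the dictionary to the client via its proxies parameter. A minimal sketch, reusing the URL, org ID, and token placeholders from the setup example above:

    client = fdl.FiddlerApi(
        url=URL,
        org_id=ORG_ID,
        auth_token=AUTH_TOKEN,
        proxies=proxies
    )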
    
    Parameters
    Parameter Type Default Description
    url str None The URL used to connect to Fiddler.
    org_id str None The organization ID for a Fiddler instance. Can be found on the General tab of the Settings page.
    auth_token str None The authorization token used to authenticate with Fiddler. Can be found on the Credentials tab of the Settings page.
    proxies Optional[dict] None A dictionary containing proxy URLs.
    verbose Optional[bool] False If True, API calls will be logged verbosely.

    Writing fiddler.ini

    %%writefile fiddler.ini
    
    [FIDDLER]
    url = https://app.fiddler.ai
    org_id = my_org
    auth_token = p9uqlkKz1zAA3KAU8kiB6zJkXiQoqFgkUgEa1sv4u58
    

    If you want to authenticate with Fiddler without passing this information directly into the function call, you can store it in a file named fiddler.ini, which should be stored in the same directory as your notebook or script.

    Usage

    client = fdl.FiddlerApi()
    

    Projects

    Projects are used to organize your models and datasets. Each project can represent a machine learning task (e.g. predicting house prices, assessing creditworthiness, or detecting fraud).

    A project can contain one or more models (e.g. lin_reg_house_predict, random_forest_house_predict).

    For more information on projects, click here.

    client.list_projects

    Usage

    client.list_projects()
    

    Response

    [
        'project_a',
        'project_b',
        'project_c'
    ]
    

    Retrieves the project IDs of all projects accessible by the user.

    Returns
    Type Description
    list A list containing the project ID string for each project.

    client.create_project

    Usage

    PROJECT_ID = 'example_project'
    
    client.create_project(
        project_id=PROJECT_ID
    )
    

    Response

    {
        'project_name': 'example_project'
    }
    

    Creates a project using the specified ID.

    Parameters
    Parameter Type Default Description
    project_id str A unique identifier for the project. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character.
    Returns
    Type Description
    dict A dictionary mapping project_name to the project ID string specified.

    client.delete_project

    Deletes a project.

    Usage

    PROJECT_ID = 'example_project'
    
    client.delete_project(
        project_id=PROJECT_ID
    )
    

    Response

    True
    

    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    Returns
    Type Description
    bool A boolean denoting whether deletion was successful.

    Datasets

    Datasets (or baseline datasets) are used for making comparisons with production data.

    A baseline dataset should be sampled from your model's training set, so it can serve as a representation of what the model expects to see in production.

    More information can be found here.

    For guidance on how to design a baseline dataset, click here.
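
    As a rough sketch of how a baseline might be drawn before upload (df_train below is a placeholder for your own training DataFrame):

    import pandas as pd

    # Placeholder: load your model's training data
    df_train = pd.read_csv('training_data.csv')

    # Randomly sample rows from the training set to serve as the baseline
    baseline_df = df_train.sample(n=10000, random_state=42)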

    client.list_datasets

    Retrieves the dataset IDs of all datasets accessible within a project.

    Usage

    PROJECT_ID = "example_project"
    
    client.list_datasets(
        project_id=PROJECT_ID
    )
    

    Response

    [
        'dataset_a',
        'dataset_b',
        'dataset_c'
    ]
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    Returns
    Type Description
    list A list containing the string ID of each dataset.

    client.upload_dataset

    Uploads a dataset from a pandas DataFrame.

    Usage

    import pandas as pd
    
    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    client.upload_dataset(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID,
        dataset={
            'baseline': df
        },
        info=dataset_info
    )
    

    Response

    {
      'row_count': 10000,
      'col_count': 20,
      'log': [
        'Importing dataset example_dataset',
        'Creating table for example_dataset',
        'Importing data file: baseline.csv'
      ]
    }
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    dataset dict A dictionary mapping dataset slice names to pandas DataFrames.
    dataset_id str A unique identifier for the dataset. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character.
    info Optional[fdl.DatasetInfo] None The Fiddler DatasetInfo object used to describe the dataset. Click here for more information.
    size_check_enabled Optional[bool] True If True, will issue a warning when a dataset has a large number of rows.
    Returns
    Type Description
    dict A dictionary containing information about the uploaded dataset.

    client.delete_dataset

    Deletes a dataset from a project.

    Usage

    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    
    client.delete_dataset(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID
    )
    

    Response

    'Dataset deleted example_dataset'
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    dataset_id str The unique identifier for the dataset.
    Returns
    Type Description
    str A message confirming that the dataset was deleted.

    client.get_dataset_info

    Retrieves the DatasetInfo object associated with a dataset.

    Usage

    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    
    dataset_info = client.get_dataset_info(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    dataset_id str The unique identifier for the dataset.
    Returns
    Type Description
    fdl.DatasetInfo The DatasetInfo object associated with the specified dataset.

    Models

    A model is a representation of your machine learning model. Each model must have an associated dataset to be used as a baseline for monitoring, explainability, and fairness capabilities.

    You do not need to upload your model artifact in order to register your model, but doing so will significantly improve the quality of explanations generated by Fiddler.

    More information can be found here.

    client.list_models

    Retrieves the model IDs of all models accessible within a project.

    Usage

    PROJECT_ID = 'example_project'
    
    client.list_models(
        project_id=PROJECT_ID
    )
    

    Response

    [
        'model_a',
        'model_b',
        'model_c'
    ]
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    Returns
    Type Description
    list A list containing the string ID of each model.

    client.register_model

    Registers a model without uploading an artifact. Requires an fdl.ModelInfo object containing information about the model.

    Usage

    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    MODEL_ID = 'example_model'
    
    dataset_info = client.get_dataset_info(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID
    )
    
    model_task = fdl.ModelTask.REGRESSION
    model_target = 'target_column'
    model_output = 'output_column'
    model_features = [
        'feature_1',
        'feature_2',
        'feature_3'
    ]
    
    model_info = fdl.ModelInfo.from_dataset_info(
        dataset_info=dataset_info,
        target=model_target,
        outputs=[model_output],
        model_task=model_task
    )
    
    client.register_model(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID,
        model_id=MODEL_ID,
        model_info=model_info
    )
    

    Response

    'Model successfully registered on Fiddler. \n Visit https://app.fiddler.ai/projects/example_project'
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str A unique identifier for the model. Must be a lowercase string between 2-30 characters containing only alphanumeric characters and underscores. Additionally, it must not start with a numeric character.
    dataset_id str The unique identifier for the dataset.
    model_info fdl.ModelInfo A ModelInfo object containing information about the model.
    deployment Optional[fdl.core_objects.DeploymentOptions] None A DeploymentOptions object containing information about the model deployment.
    cache_global_impact_importance Optional[bool] True If True, global feature impact and global feature importance will be precomputed and cached when the model is registered.
    cache_global_pdps Optional[bool] False If True, global partial dependence plots will be precomputed and cached when the model is registered.
    cache_dataset Optional[bool] True If True, histogram information for the baseline dataset will be precomputed and cached when the model is registered.
    Returns
    Type Description
    str A message confirming that the model was registered.

    client.upload_model_package

    Registers a model with Fiddler and uploads a model artifact to be used for explainability and fairness capabilities.

    Usage

    import pathlib
    
    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    artifact_path = pathlib.Path('model_dir')
    
    client.upload_model_package(
        artifact_path=artifact_path,
        project_id=PROJECT_ID,
        model_id=MODEL_ID
    )
    
    Parameters
    Parameter Type Default Description
    artifact_path pathlib.Path None A path to the directory containing all of the model files needed to run the model.
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    deployment_type Optional[str] 'predictor' The type of deployment for the model. Can be one of
    • 'predictor' — Just a predict endpoint is exposed.
    • 'executor' — The model's internals are exposed.
    image_uri Optional[str] None A URI of the form '<registry>/<image-name>:<tag>'. If specified, the image will be used to create a new runtime to serve the model.
    namespace Optional[str] 'default' The Kubernetes namespace to use for the newly created runtime. image_uri must be specified.
    port Optional[int] 5100 The port to use for the newly created runtime. image_uri must be specified.
    replicas Optional[int] 1 The number of replicas running the model. image_uri must be specified.
    cpus Optional[float] 0.25 The number of CPU cores reserved per replica. image_uri must be specified.
    memory Optional[str] '128m' The amount of memory reserved per replica. image_uri must be specified.
    gpus Optional[int] 0 The number of GPU cores reserved per replica. image_uri must be specified.
    await_deployment Optional[bool] True If True, will block until deployment completes.
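
    As an illustrative sketch, a call that serves the model from a custom runtime image. The registry path, namespace, and resource values below are placeholders:

    client.upload_model_package(
        artifact_path=artifact_path,
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        deployment_type='predictor',
        image_uri='registry.example.com/example-model:v1',
        namespace='models',
        replicas=2,
        memory='512m'
    )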

    client.update_model

    Replaces the model artifact for a model.

    Usage

    import pathlib
    
    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    model_dir = pathlib.Path('model_dir')
    
    client.update_model(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        model_dir=model_dir
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    model_dir pathlib.Path A path to the directory containing all of the model files needed to run the model.
    force_pre_compute bool True If True, re-run precomputation steps for the model. This can also be done manually by calling client.trigger_pre_computation.
    Returns
    Type Description
    bool A boolean denoting whether the update was successful.

    client.delete_model

    Deletes a model from a project.

    Without deleting production data

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    client.delete_model(
        project_id=PROJECT_ID,
        model_id=MODEL_ID
    )
    

    Deleting production data

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    client.delete_model(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        delete_prod=True
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    delete_prod Optional[bool] False If True, production data will also be deleted.
    delete_pred Optional[bool] True If True, prediction data will also be deleted.

    client.trigger_pre_computation

    Runs a variety of precomputation steps for a model.

    Usage

    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    MODEL_ID = 'example_model'
    
    client.trigger_pre_computation(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID,
        model_id=MODEL_ID
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    dataset_id str The unique identifier for the dataset.
    overwrite_cache Optional[bool] True If True, will overwrite existing cached information.
    batch_size Optional[int] 10 The batch size used for global PDP calculations.
    calculate_predictions Optional[bool] True If True, will precompute and store model predictions.
    cache_global_pdps Optional[bool] True If True, will precompute and cache partial dependence plot information.
    cache_global_impact_importance Optional[bool] True If True, will precompute and cache global feature impact and global feature importance metrics.
    cache_dataset Optional[bool] False If True, will precompute and cache histogram information for the baseline dataset.

    client.get_model_info

    Retrieves the ModelInfo object associated with a model.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    model_info = client.get_model_info(
        project_id=PROJECT_ID,
        model_id=MODEL_ID
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    Returns
    Type Description
    fdl.ModelInfo The ModelInfo object associated with the specified model.

    Monitoring

    client.publish_event

    Publishes a single production event to Fiddler asynchronously.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    example_event = {
        'feature_1': 20.7,
        'feature_2': 45000,
        'feature_3': True,
        'output_column': 0.79,
        'target_column': 1
    }
    
    client.publish_event(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        event=example_event,
        event_id='event_001',
        event_timestamp=1637344470000
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    event dict A dictionary mapping field names to field values. Any fields found that are not present in the model's ModelInfo object will be dropped from the event.
    event_id Optional[str] None A unique identifier for the event. If not specified, Fiddler will generate its own ID, which can be retrieved using the get_slice API.
    update_event Optional[bool] None If True, will only modify an existing event, referenced by event_id. If no event is found, no change will take place.
    event_timestamp Optional[int] None A timestamp for when the event took place. The format of this timestamp is given by timestamp_format. If no timestamp is provided, the current time will be used.
    timestamp_format Optional[fdl.FiddlerTimestamp] fdl.FiddlerTimestamp.INFER The format of the timestamp passed in event_timestamp. Can be one of
    • fdl.FiddlerTimestamp.INFER
    • fdl.FiddlerTimestamp.EPOCH_MILLISECONDS
    • fdl.FiddlerTimestamp.EPOCH_SECONDS
    • fdl.FiddlerTimestamp.ISO_8601
    casting_type Optional[bool] False If True, will try to cast the data in event to be in line with the data types defined in the model's ModelInfo object.
    dry_run Optional[bool] False If True, the event will not be published, and instead a report will be generated with information about any problems with the event. Useful for debugging issues with event publishing.
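
    As an illustrative sketch, the event above could be published with an explicit epoch-seconds timestamp, then updated later with a delayed ground-truth label by referencing the same event ID:

    client.publish_event(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        event=example_event,
        event_id='event_001',
        event_timestamp=1637344470,
        timestamp_format=fdl.FiddlerTimestamp.EPOCH_SECONDS
    )

    # Later, once the label arrives, update the stored event
    client.publish_event(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        event={'target_column': 0},
        event_id='event_001',
        update_event=True
    )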

    client.publish_events_batch

    Publishes a batch of events to Fiddler synchronously.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    path_to_batch = 'events_batch.csv'
    
    client.publish_events_batch(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        batch_source=path_to_batch
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    batch_source Union[pd.DataFrame, str] Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are
    • CSV (.csv)
    • Parquet (.pq)
    • Pickled DataFrame (.pkl)
    id_field Optional[str] None The field containing event IDs for events in the batch. If not specified, Fiddler will generate its own IDs, which can be retrieved using the get_slice API.
    update_event Optional[bool] False If True, will only modify existing events, referenced by IDs from id_field. If an ID is provided for which there is no event, no change will take place for that row.
    timestamp_field Optional[str] None The field containing timestamps for events in the batch. The format of these timestamps is given by timestamp_format. If no timestamp is provided for a given row, the current time will be used.
    timestamp_format Optional[fdl.FiddlerTimestamp] fdl.FiddlerTimestamp.INFER The format of the timestamps passed in timestamp_field. Can be one of
    • fdl.FiddlerTimestamp.INFER
    • fdl.FiddlerTimestamp.EPOCH_MILLISECONDS
    • fdl.FiddlerTimestamp.EPOCH_SECONDS
    • fdl.FiddlerTimestamp.ISO_8601
    data_source Optional[fdl.BatchPublishType] None The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of
    • fdl.BatchPublishType.DATAFRAME
    • fdl.BatchPublishType.LOCAL_DISK
    • fdl.BatchPublishType.AWS_S3
    • fdl.BatchPublishType.GCP_STORAGE
    casting_type Optional[bool] False If True, will try to cast the data in the events to be in line with the data types defined in the model's ModelInfo object.
    credentials Optional[dict] None A dictionary containing authorization information for AWS or GCP.

    For AWS, the expected keys are
    • 'aws_access_key_id'
    • 'aws_secret_access_key'
    • 'aws_session_token'

    For GCP, the expected keys are
    • 'gcs_access_key_id'
    • 'gcs_secret_access_key'
    • 'gcs_session_token'
    group_by Optional[str] None The field used to group events together when computing performance metrics (for ranking models only).
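
    Since batch_source also accepts a pandas DataFrame, events can be published directly from memory. A sketch assuming the batch carries its timestamps in a column named event_time:

    import pandas as pd

    events_df = pd.read_csv('events_batch.csv')

    client.publish_events_batch(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        batch_source=events_df,
        timestamp_field='event_time',
        timestamp_format=fdl.FiddlerTimestamp.EPOCH_MILLISECONDS
    )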

    client.publish_events_batch_schema

    Publishes a batch of events to Fiddler synchronously using a schema for locating fields within complex data structures.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    path_to_batch = 'events_batch.avro'
    
    schema = {
        '__static': {
            '__project': PROJECT_ID,
            '__model': MODEL_ID
        },
        '__dynamic': {
            'feature_1': 'features/feature_1',
            'feature_2': 'features/feature_2',
            'feature_3': 'features/feature_3',
            'output_column': 'outputs/output_column',
            'target_column': 'targets/target_column'
        }
    }
    
    client.publish_events_batch_schema(
        batch_source=path_to_batch,
        publish_schema=schema
    )
    
    Parameters
    Parameter Type Default Description
    batch_source Union[pd.DataFrame, str] Either a pandas DataFrame containing a batch of events, or the path to a file containing a batch of events. Supported file types are
    • CSV (.csv)
    • Avro (.avro)
    publish_schema dict A dictionary used for locating fields within complex or nested data structures.
    data_source Optional[fdl.BatchPublishType] None The location of the data source provided. By default, Fiddler will try to infer the value. Can be one of
    • fdl.BatchPublishType.DATAFRAME
    • fdl.BatchPublishType.LOCAL_DISK
    • fdl.BatchPublishType.AWS_S3
    • fdl.BatchPublishType.GCP_STORAGE
    credentials Optional[dict] None A dictionary containing authorization information for AWS or GCP.

    For AWS, the expected keys are
    • 'aws_access_key_id'
    • 'aws_secret_access_key'
    • 'aws_session_token'

    For GCP, the expected keys are
    • 'gcs_access_key_id'
    • 'gcs_secret_access_key'
    • 'gcs_session_token'
    group_by Optional[str] None The field used to group events together when computing performance metrics (for ranking models only).

    client.add_monitoring_config

    Adds a custom configuration for monitoring.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    monitoring_config = {
        'min_bin_value': 3600,
        'time_ranges': ['Day', 'Week', 'Month', 'Quarter', 'Year'],
        'default_time_range': 7200
    }
    
    client.add_monitoring_config(
        config_info=monitoring_config,
        project_id=PROJECT_ID,
        model_id=MODEL_ID
    )
    
    Parameters
    Parameter Type Default Description
    config_info dict Monitoring configuration info, which can apply to an entire organization, a single project, or a single model.
    project_id Optional[str] None The unique identifier for the project.
    model_id Optional[str] None The unique identifier for the model.

    Explainability

    client.run_model

    Runs a model on a pandas DataFrame and returns the predictions.

    Usage

    import pandas as pd
    
    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    df = pd.read_csv('example_data.csv')
    
    predictions = client.run_model(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        df=df
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    df pd.DataFrame A pandas DataFrame containing model input vectors as rows.
    log_events bool False If True, the rows of df along with the model predictions will be logged as production events.
    casting_type bool False If True, will try to cast the data in df to be in line with the data types defined in the model's ModelInfo object.
    Returns
    Type Description
    pd.DataFrame A pandas DataFrame containing model predictions for the given input vectors.
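
    To also record the inputs and predictions as production events, the same call can set log_events. A sketch reusing the placeholders above:

    predictions = client.run_model(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        df=df,
        log_events=True
    )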

    client.run_explanation

    Runs a point explanation for a given input vector.

    Usage

    import pandas as pd
    
    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    df = pd.read_csv('example_data.csv')
    
    explanation = client.run_explanation(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        df=df
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    df pd.DataFrame A pandas DataFrame containing a model input vector as a row. If more than one row is included, the first row will be used.
    explanations Union[str, list] 'shap' A string or list of strings specifying which explanation algorithms to run.
    Can be one or more of
    • 'fiddler_shapley_values'
    • 'shap'
    • 'ig_flex'
    • 'ig'
    • 'mean_reset'
    • 'zero_reset'
    • 'permute'
    dataset_id Optional[str] None The unique identifier for the dataset.
    casting_type Optional[bool] False If True, will try to cast the data in df to be in line with the data types defined in the model's ModelInfo object.
    return_raw_response Optional[bool] False If True, a raw output will be returned instead of explanation objects.
    Returns
    Type Description
    Union[fdl.AttributionExplanation, fdl.MulticlassAttributionExplanation, list] A fdl.AttributionExplanation object, fdl.MulticlassAttributionExplanation object, or list of such objects for each explanation method specified in explanations.

    client.run_feature_importance

    Calculates global feature importance for a model over a specified dataset.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    DATASET_ID = 'example_dataset'
    
    feature_importance = client.run_feature_importance(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        dataset_id=DATASET_ID
    )
    

    With a SQL query

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    DATASET_ID = 'example_dataset'
    
    slice_query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" WHERE feature_1 < 20.0 LIMIT 100 """
    
    feature_importance = client.run_feature_importance(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        dataset_id=DATASET_ID,
        slice_query=slice_query
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    dataset_id str The unique identifier for the dataset.
    dataset_splits Optional[list] None A list of dataset splits taken from the dataset argument of upload_dataset. If specified, feature importance will only be calculated over the provided splits. Otherwise, all splits will be used.
    slice_query Optional[str] None A SQL query. If specified, feature importance will only be calculated over the dataset slice specified by the query.
    **kwargs Additional arguments to be passed.
    Can be one or more of
    • n_inputs
    • n_iterations
    • n_references
    • ci_confidence_level
    • impact_not_importance
    Returns
    Type Description
    dict A dictionary containing feature importance results.
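
    To restrict the computation to particular splits, pass the slice names used when the dataset was uploaded. A sketch assuming a split named 'baseline', as in the upload_dataset example above:

    feature_importance = client.run_feature_importance(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        dataset_id=DATASET_ID,
        dataset_splits=['baseline']
    )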

    client.get_mutual_information

    Calculates the mutual information (MI) between variables over a specified dataset.

    Usage

    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    
    mutual_information_features = [
        'feature_1',
        'feature_2',
        'feature_3'
    ]
    
    mutual_information = client.get_mutual_information(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID,
        features=mutual_information_features
    )
    

    With a SQL query

    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    MODEL_ID = 'example_model'
    
    mutual_information_features = [
        'feature_1',
        'feature_2',
        'feature_3'
    ]
    
    slice_query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" WHERE feature_1 < 20.0 LIMIT 100 """
    
    mutual_information = client.get_mutual_information(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID,
        features=mutual_information_features,
        slice_query=slice_query
    )
    
    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    dataset_id str The unique identifier for the dataset.
    features list A list of features for which to compute mutual information.
    normalized Optional[bool] False If True, will compute normalized mutual information (NMI) instead.
    slice_query Optional[str] None A SQL query. If specified, mutual information will only be calculated over the dataset slice specified by the query.
    sample_size Optional[int] None If specified, only sample_size samples will be used in the mutual information calculation.
    seed Optional[float] 0.25 The random seed used to sample when sample_size is specified.
    Returns
    Type Description
    dict A dictionary containing mutual information results.
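
    For large datasets, normalization and sampling can be combined. A sketch with an arbitrary sample size of 5000:

    mutual_information = client.get_mutual_information(
        project_id=PROJECT_ID,
        dataset_id=DATASET_ID,
        features=mutual_information_features,
        normalized=True,
        sample_size=5000
    )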

    Analytics

    client.get_slice

    Retrieves a slice of data as a pandas DataFrame.

    Querying a dataset

    import pandas as pd
    
    PROJECT_ID = 'example_project'
    DATASET_ID = 'example_dataset'
    MODEL_ID = 'example_model'
    
    query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" """
    
    slice_df = client.get_slice(
        sql_query=query,
        project_id=PROJECT_ID
    )
    

    Querying production data

    import pandas as pd
    
    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    
    query = f""" SELECT * FROM "production.{MODEL_ID}" """
    
    slice_df = client.get_slice(
        sql_query=query,
        project_id=PROJECT_ID
    )
    
    Parameters
    Parameter Type Default Description
    sql_query str The SQL query used to identify the slice.
    project_id str The unique identifier for the project.
    columns_override Optional[list] None A list of columns to include in the slice, even if they aren't specified in the query.
    Returns
    Type Description
    pd.DataFrame A pandas DataFrame containing the slice returned by the specified query.

    Fairness

    client.run_fairness

    Calculates fairness metrics for a model over a specified dataset.

    Usage

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    DATASET_ID = 'example_dataset'
    
    protected_features = [
        'feature_1',
        'feature_2'
    ]
    
    positive_outcome = 1
    
    fairness_metrics = client.run_fairness(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        dataset_id=DATASET_ID,
        protected_features=protected_features,
        positive_outcome=positive_outcome
    )
    

    With a SQL query

    PROJECT_ID = 'example_project'
    MODEL_ID = 'example_model'
    DATASET_ID = 'example_dataset'
    
    protected_features = [
        'feature_1',
        'feature_2'
    ]
    
    positive_outcome = 1
    
    slice_query = f""" SELECT * FROM "{DATASET_ID}.{MODEL_ID}" WHERE feature_1 < 20.0 LIMIT 100 """
    
    fairness_metrics = client.run_fairness(
        project_id=PROJECT_ID,
        model_id=MODEL_ID,
        dataset_id=DATASET_ID,
        protected_features=protected_features,
        positive_outcome=positive_outcome,
        slice_query=slice_query
    )
    

    Parameters
    Parameter Type Default Description
    project_id str The unique identifier for the project.
    model_id str The unique identifier for the model.
    dataset_id str The unique identifier for the dataset.
    protected_features list A list of protected features.
    positive_outcome Union[str, int] The name or value of the positive outcome for the model.
    slice_query Optional[str] None A SQL query. If specified, fairness metrics will only be calculated over the dataset slice specified by the query.
    score_threshold Optional[float] 0.5 The score threshold used to calculate model outcomes.
    Returns
    Type Description
    dict A dictionary containing fairness metric results.

    Access Control

    client.share_project

    Shares a project with a user or team.

    Usage

    PROJECT_ID = 'example_project'
    
    client.share_project(
        project_name=PROJECT_ID,
        role='READ',
        user_name='user@example.com'
    )
    
    Parameters
    Parameter Type Default Description
    project_name str The unique identifier for the project.
    role str The permissions role being shared. Can be one of
    • 'READ'
    • 'WRITE'
    • 'OWNER'
    user_name Optional[str] None A username with which the project will be shared. Typically an email address.
    team_name Optional[str] None A team with which the project will be shared.

    client.unshare_project

    Revokes a user's or team's permissions for a project.

    Usage

    PROJECT_ID = 'example_project'
    
    client.unshare_project(
        project_name=PROJECT_ID,
        role='READ',
        user_name='user@example.com'
    )
    
    Parameters
    Parameter Type Default Description
    project_name str The unique identifier for the project.
    role str The permissions role being revoked. Can be one of
    • 'READ'
    • 'WRITE'
    • 'OWNER'
    user_name Optional[str] None A username for which the project permissions will be revoked. Typically an email address.
    team_name Optional[str] None A team for which the project permissions will be revoked.

    client.list_org_roles

    Retrieves the names of all users and their permissions roles.

    Usage

    client.list_org_roles()
    

    Response

    {
        'members': [
            {
                'id': 1,
                'user': 'admin@example.com',
                'email': 'admin@example.com',
                'isLoggedIn': True,
                'firstName': 'Example',
                'lastName': 'Administrator',
                'imageUrl': None,
                'settings': {'notifyNews': True,
                    'notifyAccount': True,
                    'sliceTutorialCompleted': True},
                'role': 'ADMINISTRATOR'
            },
            {
                'id': 2,
                'user': 'user@example.com',
                'email': 'user@example.com',
                'isLoggedIn': True,
                'firstName': 'Example',
                'lastName': 'User',
                'imageUrl': None,
                'settings': {'notifyNews': True,
                    'notifyAccount': True,
                    'sliceTutorialCompleted': True},
                'role': 'MEMBER'
            }
        ],
        'invitations': [
            {
                'id': 3,
                'user': 'newuser@example.com',
                'role': 'MEMBER',
                'invited': True,
                'link': 'http://app.fiddler.ai/signup/vSQWZkt3FP--pgzmuYe_-3-NNVuR58OLZalZOlvR0GY'
            }
        ]
    }
    
    Returns
    Type Description
    dict A dictionary of users and their roles in the organization.

    client.list_project_roles

    Retrieves the names of users and their permissions roles for a given project.

    Usage

    PROJECT_ID = 'example_project'
    
    client.list_project_roles(
        project_name=PROJECT_ID
    )
    

    Response

    {
        'roles': [
            {
                'user': {
                    'email': 'admin@example.com'
                },
                'team': None,
                'role': {
                    'name': 'OWNER'
                }
            },
            {
                'user': {
                    'email': 'user@example.com'
                },
                'team': None,
                'role': {
                    'name': 'READ'
                }
            }
        ]
    }
    
    Parameters
    Parameter Type Default Description
    project_name str The unique identifier for the project.
    Returns
    Type Description
    dict A dictionary of users and their roles for the specified project.

    client.list_teams

    Retrieves the names of all teams and the users and roles within each team.

    Usage

    client.list_teams()
    

    Response

    {
        'example_team': {
            'members': [
                {
                    'user': 'admin@example.com',
                    'role': 'MEMBER'
                },
                {
                    'user': 'user@example.com',
                    'role': 'MEMBER'
                }
            ]
        }
    }
    
    Returns
    Type Description
    dict A dictionary containing information about teams and users.

    Objects

    fdl.DatasetInfo

    Stores information about a dataset.

    Usage

    columns = [
        fdl.Column(
            name='feature_1',
            data_type=fdl.DataType.FLOAT
        ),
        fdl.Column(
            name='feature_2',
            data_type=fdl.DataType.INTEGER
        ),
        fdl.Column(
            name='feature_3',
            data_type=fdl.DataType.BOOLEAN
        ),
        fdl.Column(
            name='output_column',
            data_type=fdl.DataType.FLOAT
        ),
        fdl.Column(
            name='target_column',
            data_type=fdl.DataType.INTEGER
        )
    ]
    
    dataset_info = fdl.DatasetInfo(
        display_name='Example Dataset',
        columns=columns
    )
    
    Parameters
    Parameter Type Default Description
    display_name str A display name for the dataset.
    columns list A list of fdl.Column objects containing information about the columns.
    files Optional[list] None A list of strings pointing to CSV files to use.
    dataset_id Optional[str] None The unique identifier for the dataset.
    **kwargs Additional arguments to be passed.

    fdl.DatasetInfo.from_dataframe

    Constructs a DatasetInfo object from a pandas DataFrame.

    Usage

    import pandas as pd
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    Parameters
    Parameter Type Default Description
    df Union[pd.DataFrame, list] Either a single pandas DataFrame or a list of DataFrames. If a list is given, all DataFrames must have the same columns.
    display_name str '' A display name for the dataset.
    max_inferred_cardinality Optional[int] None If specified, any string column containing fewer than max_inferred_cardinality unique values will be converted to a categorical data type.
    dataset_id Optional[str] None The unique identifier for the dataset.
    Returns
    Type Description
    fdl.DatasetInfo A DatasetInfo object constructed from the pandas DataFrame provided.
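
    To have low-cardinality string columns inferred as categorical, set max_inferred_cardinality. A sketch using an arbitrary cutoff of 100 unique values:

    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df,
        display_name='Example Dataset',
        max_inferred_cardinality=100
    )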

    fdl.DatasetInfo.to_dict

    Converts a DatasetInfo object to a dictionary.

    Usage

    import pandas as pd
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    dataset_info_dict = dataset_info.to_dict()
    

    Response

    {
        'name': 'Example Dataset',
        'columns': [
            {
                'column-name': 'feature_1',
                'data-type': 'float'
            },
            {
                'column-name': 'feature_2',
                'data-type': 'int'
            },
            {
                'column-name': 'feature_3',
                'data-type': 'bool'
            },
            {
                'column-name': 'output_column',
                'data-type': 'float'
            },
            {
                'column-name': 'target_column',
                'data-type': 'int'
            }
        ],
        'files': []
    }
    
    Returns
    Type Description
    dict A dictionary containing information from the DatasetInfo object.

    fdl.DatasetInfo.from_dict

    Converts a dictionary to a DatasetInfo object.

    Usage

    import pandas as pd
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    dataset_info_dict = dataset_info.to_dict()
    
    new_dataset_info = fdl.DatasetInfo.from_dict(
        deserialized_json={
            'dataset': dataset_info_dict
        }
    )
    
    Parameters
    Parameter Type Default Description
    deserialized_json dict The dictionary to be converted.
    Returns
    Type Description
    fdl.DatasetInfo A DatasetInfo object constructed from the dictionary.

    fdl.ModelInfo

    Stores information about a model.

    Usage

    inputs = [
        fdl.Column(
            name='feature_1',
            data_type=fdl.DataType.FLOAT
        ),
        fdl.Column(
            name='feature_2',
            data_type=fdl.DataType.INTEGER
        ),
        fdl.Column(
            name='feature_3',
            data_type=fdl.DataType.BOOLEAN
        )
    ]
    
    outputs = [
        fdl.Column(
            name='output_column',
            data_type=fdl.DataType.FLOAT
        )
    ]
    
    targets = [
        fdl.Column(
            name='target_column',
            data_type=fdl.DataType.INTEGER
        )
    ]
    
    model_info = fdl.ModelInfo(
        display_name='Example Model',
        input_type=fdl.ModelInputType.TABULAR,
        model_task=fdl.ModelTask.BINARY_CLASSIFICATION,
        inputs=inputs,
        outputs=outputs,
        targets=targets
    )
    
    Parameters
    Parameter Type Default Description
    display_name str A display name for the model.
    input_type fdl.ModelInputType A ModelInputType object containing the input type of the model.
    model_task ModelTask A ModelTask object containing the model task.
    inputs list A list of Column objects corresponding to the inputs (features) of the model.
    outputs list A list of Column objects corresponding to the outputs (predictions) of the model.
    target_class_order Optional[list] None A list denoting the order of classes in the target.
    metadata Optional[list] None A list of Column objects corresponding to any metadata fields.
    decisions Optional[list] None A list of Column objects corresponding to any decision fields (post-prediction business decisions).
    targets Optional[list] None A list of Column objects corresponding to the targets (ground truth) of the model.
    framework Optional[str] None A string providing information about the software library and version used to train and run this model.
    description Optional[str] None A description of the model.
    datasets Optional[list] None A list of the dataset IDs used by the model.
    mlflow_params Optional[fdl.MLFlowParams] None A MLFlowParams object containing information about MLFlow parameters.
    model_deployment_params Optional[fdl.ModelDeploymentParams] None A ModelDeploymentParams object containing information about model deployment.
    artifact_status Optional[fdl.ArtifactStatus] None An ArtifactStatus object containing information about the model artifact.
    preferred_explanation_method Optional[fdl.ExplanationMethod] None An ExplanationMethod object that specifies the default explanation algorithm to use for the model.
    custom_explanation_names Optional[list] [] A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py.
    binary_classification_threshold Optional[float] None The threshold used for classifying examples for binary classifiers.
    ranking_top_k Optional[int] None Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG.
    group_by Optional[str] None The column by which to group events for certain performance metrics like MAP and NDCG.
    **kwargs Additional arguments to be passed.

    fdl.ModelInfo.from_dataset_info

    Constructs a ModelInfo object from a DatasetInfo object.

    Usage

    import pandas as pd
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    model_info = fdl.ModelInfo.from_dataset_info(
        dataset_info=dataset_info,
        features=[
            'feature_1',
            'feature_2',
            'feature_3'
        ],
        outputs=[
            'output_column'
        ],
        target='target_column',
        input_type=fdl.ModelInputType.TABULAR,
        model_task=fdl.ModelTask.BINARY_CLASSIFICATION
    )
    
    Parameters
    Parameter Type Default Description
    dataset_info fdl.DatasetInfo The DatasetInfo object from which to construct the ModelInfo object.
    target str The column to be used as the target (ground truth).
    dataset_id Optional[str] None The unique identifier for the dataset.
    features Optional[list] None A list of columns to be used as features.
    metadata_cols Optional[list] None A list of columns to be used as metadata fields.
    decision_cols Optional[list] None A list of columns to be used as decision fields.
    display_name Optional[str] None A display name for the model.
    description Optional[str] None A description of the model.
    input_type fdl.ModelInputType fdl.ModelInputType.TABULAR A ModelInputType object containing the input type for the model.
    model_task Optional[fdl.ModelTask] None A ModelTask object containing the model task.
    outputs Optional[list] None A list of columns containing model outputs (predictions).
    categorical_target_class_details Optional[list] None A list denoting the order of classes in the target.
    model_deployment_params Optional[fdl.ModelDeploymentParams] None A ModelDeploymentParams object containing information about model deployment.
    preferred_explanation_method Optional[fdl.ExplanationMethod] None An ExplanationMethod object that specifies the default explanation algorithm to use for the model.
    custom_explanation_names Optional[list] [] A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py.
    binary_classification_threshold Optional[float] None The threshold used for classifying examples for binary classifiers.
    ranking_top_k Optional[int] None Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG.
    group_by Optional[str] None The column by which to group events for certain performance metrics like MAP and NDCG.
    Returns
    Type Description
    fdl.ModelInfo A ModelInfo object constructed from the DatasetInfo object provided.

    fdl.ModelInfo.to_dict

    Converts a ModelInfo object to a dictionary.

    Usage

    import pandas as pd
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    model_info = fdl.ModelInfo.from_dataset_info(
        dataset_info=dataset_info,
        features=[
            'feature_1',
            'feature_2',
            'feature_3'
        ],
        outputs=[
            'output_column'
        ],
        target='target_column',
        input_type=fdl.ModelInputType.TABULAR,
        model_task=fdl.ModelTask.BINARY_CLASSIFICATION
    )
    
    model_info_dict = model_info.to_dict()
    

    Response

    {
        'name': 'Example Model',
        'input-type': 'structured',
        'model-task': 'binary_classification',
        'inputs': [
            {
                'column-name': 'feature_1',
                'data-type': 'float'
            },
            {
                'column-name': 'feature_2',
                'data-type': 'int'
            },
            {
                'column-name': 'feature_3',
                'data-type': 'bool'
            },
            {
                'column-name': 'target_column',
                'data-type': 'int'
            }
        ],
        'outputs': [
            {
                'column-name': 'output_column',
                'data-type': 'float'
            }
        ],
        'datasets': [],
        'targets': [
            {
                'column-name': 'target_column',
                'data-type': 'int'
            }
        ],
        'custom-explanation-names': []
    }
    
    Returns
    Type Description
    dict A dictionary containing information from the ModelInfo object.

    fdl.ModelInfo.from_dict

    Converts a dictionary to a ModelInfo object.

    Usage

    import pandas as pd
    
    df = pd.read_csv('example_dataset.csv')
    
    dataset_info = fdl.DatasetInfo.from_dataframe(
        df=df
    )
    
    model_info = fdl.ModelInfo.from_dataset_info(
        dataset_info=dataset_info,
        features=[
            'feature_1',
            'feature_2',
            'feature_3'
        ],
        outputs=[
            'output_column'
        ],
        target='target_column',
        input_type=fdl.ModelInputType.TABULAR,
        model_task=fdl.ModelTask.BINARY_CLASSIFICATION
    )
    
    model_info_dict = model_info.to_dict()
    
    new_model_info = fdl.ModelInfo.from_dict(
        deserialized_json={
            'model': model_info_dict
        }
    )
    
    Parameters
    Parameter Type Default Description
    deserialized_json dict The dictionary to be converted.
    Returns
    Type Description
    fdl.ModelInfo A ModelInfo object constructed from the dictionary.

    fdl.ModelInputType

    Represents supported model input types.

    Usage

    model_input_type = fdl.ModelInputType.TABULAR
    
    Enum Values
    Enum Value Description
    fdl.ModelInputType.TABULAR For tabular models.
    fdl.ModelInputType.TEXT For text models.

    fdl.ModelTask

    Represents supported model tasks.

    Usage

    model_task = fdl.ModelTask.BINARY_CLASSIFICATION
    
    Enum Values
    Enum Value Description
    fdl.ModelTask.REGRESSION For regression models.
    fdl.ModelTask.BINARY_CLASSIFICATION For binary classification models.
    fdl.ModelTask.MULTICLASS_CLASSIFICATION For multiclass classification models.
    fdl.ModelTask.RANKING For ranking models.

    fdl.DataType

    Represents supported data types.

    Usage

    data_type = fdl.DataType.FLOAT
    
    Enum Values
    Enum Value Description
    fdl.DataType.FLOAT For floats.
    fdl.DataType.INTEGER For integers.
    fdl.DataType.BOOLEAN For booleans.
    fdl.DataType.STRING For strings.
    fdl.DataType.CATEGORY For categorical types.

    fdl.Column

    Represents a column of a dataset.

    Usage

    column = fdl.Column(
        name='feature_1',
        data_type=fdl.DataType.FLOAT,
        value_range_min=0.0,
        value_range_max=80.0
    )
    
    Parameters
    Parameter Type Default Description
    name str The name of the column.
    data_type fdl.DataType The DataType object corresponding to the data type of the column.
    possible_values Optional[list] None A list of unique values used for categorical columns.
    is_nullable Optional[bool] None If True, will expect missing values in the column.
    value_range_min Optional[float] None The minimum value used for numeric columns.
    value_range_max Optional[float] None The maximum value used for numeric columns.
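
    For a categorical column, possible_values carries the set of allowed categories. A minimal sketch with placeholder values:

    categorical_column = fdl.Column(
        name='feature_4',
        data_type=fdl.DataType.CATEGORY,
        possible_values=['low', 'medium', 'high'],
        is_nullable=False
    )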