PublicData Class

Defines the base class of public data.

The PublicData class contains the properties and methods common to all open datasets.

Initialize with columns.

Constructor

PublicData(cols: List[str] | None, enable_telemetry: bool = True)

Parameters

Name Description
cols
Required

A list of column names to enrich.

enable_telemetry

Indicates whether to send telemetry.

Default value: True

Methods

get_enricher

Get the enricher for this dataset.

to_pandas_dataframe

Return the data as a pandas DataFrame.

to_spark_dataframe

Return the data as a Spark DataFrame.

get_enricher

Get the enricher for this dataset.

get_enricher()

to_pandas_dataframe

Return the data as a pandas DataFrame.

to_pandas_dataframe()

to_spark_dataframe

Return the data as a Spark DataFrame.

to_spark_dataframe()

Attributes

cols

Get the column name list to retrieve.

env

Return the runtime environment.

id

Get the location ID of the open data.

registry_id

Get the registry ID of this public dataset registered at the backend.

Azure uses this registry ID to get the latest metadata, such as the storage location. All PublicData subclasses are expected to assign _registry_id.

Returns

Type Description
str

The registry ID.
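
The subclass contract described above can be illustrated with a minimal, self-contained sketch. It does not import or reproduce the library: `PublicDataSketch`, `ExampleWeatherData`, the ID string `"example-weather"`, and the `NotImplementedError` raised when no ID is assigned are all hypothetical and chosen for illustration only. The sketch mirrors only the constructor signature and the `cols` and `registry_id` attributes documented on this page.

```python
from typing import List, Optional


class PublicDataSketch:
    """Hypothetical stand-in for PublicData; not the library's implementation."""

    # Subclasses are expected to assign this (per the registry_id description).
    _registry_id: Optional[str] = None

    def __init__(self, cols: Optional[List[str]], enable_telemetry: bool = True):
        self._cols = cols
        self._enable_telemetry = enable_telemetry

    @property
    def cols(self) -> Optional[List[str]]:
        """Column name list passed to the constructor."""
        return self._cols

    @property
    def registry_id(self) -> str:
        """Registry ID assigned by the subclass."""
        # Assumption: failing loudly when no ID is assigned. The real
        # library's behavior in this case is not documented here.
        if self._registry_id is None:
            raise NotImplementedError("subclass must assign _registry_id")
        return self._registry_id


class ExampleWeatherData(PublicDataSketch):
    """Hypothetical dataset subclass; the ID is invented for illustration."""

    _registry_id = "example-weather"


data = ExampleWeatherData(cols=["temperature", "wind_speed"])
print(data.registry_id)
print(data.cols)
```

Keeping the ID as a class-level attribute means each subclass declares it once, and the base class property exposes it read-only.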

logger

logger = <Logger azureml.opendatasets (DEBUG)>

mandatory_columns

mandatory_columns = []