NycTlcYellow Class
Represents the NYC Taxi & Limousine Commission yellow taxi trip public dataset.
The yellow taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts. For more information about this dataset, including column descriptions, different ways to access the dataset, and examples, see NYC Taxi & Limousine Commission - yellow taxi trip records in the Microsoft Azure Open Datasets catalog.
Initialize filtering fields.
Inheritance
azureml.opendatasets._nyc_taxi_base.NycTaxiBase
NycTlcYellow
Constructor
NycTlcYellow(start_date: datetime = datetime.datetime(2015, 1, 1, 0, 0), end_date: datetime = datetime.datetime(2024, 10, 18, 0, 0), cols: List[str] | None = None, limit: int | None = -1, enable_telemetry: bool = True)
Parameters
Name | Description |
---|---|
start_date | The date at which to start loading data, inclusive. If None, the default value is used. Default value: 2015-01-01 00:00:00 |
end_date | The date at which to end loading data, inclusive. If None, the default value is used. Default value: 2024-10-18 00:00:00 |
cols | A list of column names to load from the dataset. If None, all columns are loaded. For information on the available columns in this dataset, see NYC Taxi & Limousine Commission - yellow taxi trip records. Default value: None |
limit | A value indicating the number of days of data to load with to_pandas_dataframe(). Default value: -1 |
enable_telemetry | Whether to enable telemetry on this dataset. Default value: True |
start_date | The start date of the query, inclusive. |
end_date | The end date of the query, inclusive. |
cols | A list of column names to retrieve. None retrieves all columns. |
limit | to_pandas_dataframe() loads only "limit" months of data. -1 means no limit. |
enable_telemetry | Indicates whether to send telemetry. |
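As an illustration of how the constructor parameters combine, the sketch below validates a date range and assembles keyword arguments before passing them to NycTlcYellow. The helper names build_query_args and load_yellow_trips are hypothetical, and the column names are assumed from the dataset's catalog page rather than confirmed here. The library import is deferred so the argument-building logic can be checked without the package installed.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def build_query_args(
    start_date: datetime,
    end_date: datetime,
    cols: Optional[List[str]] = None,
    limit: int = -1,
) -> Dict[str, Any]:
    """Validate inputs and return keyword arguments for NycTlcYellow (hypothetical helper)."""
    if end_date < start_date:
        raise ValueError("end_date must not be earlier than start_date")
    if limit < -1:
        raise ValueError("limit must be -1 (no limit) or a non-negative integer")
    return {
        "start_date": start_date,
        "end_date": end_date,
        "cols": cols,    # None loads all columns
        "limit": limit,  # -1 means no limit
    }


def load_yellow_trips(**query_args: Any):
    """Load the matching trips into a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so build_query_args works without azureml-opendatasets installed.
    from azureml.opendatasets import NycTlcYellow

    return NycTlcYellow(**query_args).to_pandas_dataframe()


# Column names below are assumptions; check the catalog page for the actual schema.
args = build_query_args(
    start_date=datetime(2018, 5, 1),
    end_date=datetime(2018, 6, 6),
    cols=["tpepPickupDateTime", "tripDistance", "totalAmount"],
)
# df = load_yellow_trips(**args)  # requires azureml-opendatasets and network access
```

Restricting cols to the fields you need is likely to reduce memory use when the result is materialized as a DataFrame.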
Remarks
The example below shows how to access the dataset.
from azureml.opendatasets import NycTlcYellow
from dateutil import parser

# Both bounds are inclusive.
start_date = parser.parse('2018-05-01')
end_date = parser.parse('2018-06-06')

# Select the date range, then load the matching trips into a pandas DataFrame.
nyc_tlc = NycTlcYellow(start_date=start_date, end_date=end_date)
nyc_tlc_df = nyc_tlc.to_pandas_dataframe()
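Because a wide date range can produce a large DataFrame, one option is to split the range into calendar-month windows and load them one at a time. The following is a minimal sketch of that idea using only the standard library. The helper month_windows is hypothetical and not part of azureml.opendatasets, and ending each window one second before the next month begins is an assumption about how the inclusive bounds are applied, so verify the boundaries against your own results.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def month_windows(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (window_start, window_end) pairs covering [start, end], one per calendar month."""
    current = start
    while current <= end:
        # Compute the first instant of the following month.
        if current.month == 12:
            next_month = datetime(current.year + 1, 1, 1)
        else:
            next_month = datetime(current.year, current.month + 1, 1)
        # Stop one second before the next month so consecutive windows do not overlap.
        window_end = min(end, next_month - timedelta(seconds=1))
        yield current, window_end
        current = next_month


windows = list(month_windows(datetime(2018, 5, 1), datetime(2018, 6, 6)))
for window_start, window_end in windows:
    print(window_start, "to", window_end)
    # Each pair could be passed as start_date/end_date to NycTlcYellow.
```

Each yielded pair can then be used as the start_date and end_date of a separate NycTlcYellow query, so that only one month of trips is held in memory at a time.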