dask_geopandas.GeoDataFrame
- class dask_geopandas.GeoDataFrame(dsk, name, meta, divisions, spatial_partitions=None)
Parallel GeoPandas GeoDataFrame
Do not use this class directly. Instead, use functions like dask_geopandas.read_parquet() or dask_geopandas.from_geopandas().
- __init__(dsk, name, meta, divisions, spatial_partitions=None)
Methods
__init__
(dsk, name, meta, divisions[, ...])
abs
()Return a Series/DataFrame with absolute numeric value of each element.
add
(other[, axis, level, fill_value])Get Addition of dataframe and other, element-wise (binary operator add).
add_prefix
(prefix)Prefix labels with string prefix.
add_suffix
(suffix)Suffix labels with string suffix.
affine_transform
(matrix)Return a GeoSeries with translated geometries.
align
(other[, join, axis, fill_value])Align two objects on their axes with the specified join method.
all
([axis, skipna, split_every, out])Return whether all elements are True, potentially over an axis.
any
([axis, skipna, split_every, out])Return whether any element is True, potentially over an axis.
append
(other[, interleave_partitions])Append rows of other to the end of caller, returning a new object.
apply
(func[, axis, broadcast, raw, reduce, ...])Parallel version of pandas.DataFrame.apply
applymap
(func[, meta])Apply a function to a Dataframe elementwise.
assign
(**kwargs)Assign new columns to a DataFrame.
astype
(dtype)Cast a pandas object to a specified dtype dtype.
bfill
([axis, limit])Synonym for DataFrame.fillna() with method='bfill'.
buffer
(distance[, resolution])Returns a GeoSeries of geometries representing all points within a given distance of each geometric object.
calculate_spatial_partitions
()Calculate spatial partitions
categorize
([columns, index, split_every])Convert columns of the DataFrame to category dtype.
clear_divisions
()Forget division information
clip
(mask[, keep_geom_type])Clip points, lines, or polygon geometries to the mask extent.
clip_lower
(threshold)
clip_upper
(threshold)
combine
(other, func[, fill_value, overwrite])Perform column-wise combine with another DataFrame.
combine_first
(other)Update null elements with value in the same location in other.
compute
(**kwargs)Compute this dask collection
contains
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that contains other.
copy
()Make a copy of the dataframe
corr
([method, min_periods, split_every])Compute pairwise correlation of columns, excluding NA/null values.
count
([axis, split_every, numeric_only])Count non-NA cells for each column or row.
cov
([min_periods, split_every])Compute pairwise covariance of columns, excluding NA/null values.
covered_by
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that is entirely covered by other.
covers
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that is entirely covering other.
crosses
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that crosses other.
cummax
([axis, skipna, out])Return cumulative maximum over a DataFrame or Series axis.
cummin
([axis, skipna, out])Return cumulative minimum over a DataFrame or Series axis.
cumprod
([axis, skipna, dtype, out])Return cumulative product over a DataFrame or Series axis.
cumsum
([axis, skipna, dtype, out])Return cumulative sum over a DataFrame or Series axis.
describe
([split_every, percentiles, ...])Generate descriptive statistics.
diff
([periods, axis])First discrete difference of element.
difference
(other, *args, **kwargs)Returns a GeoSeries of the points in each aligned geometry that are not in other.
disjoint
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry disjoint to other.
dissolve
([by, aggfunc, split_out])Dissolve geometries within groupby into a single geometry.
distance
(other, *args, **kwargs)Returns a Series containing the distance to aligned other.
div
(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator truediv).
divide
(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator truediv).
dot
(other[, meta])Compute the dot product between the Series and the columns of other.
drop
([labels, axis, columns, errors])Drop specified labels from rows or columns.
drop_duplicates
([subset, split_every, ...])Return DataFrame with duplicate rows removed.
dropna
([how, subset, thresh])Remove missing values.
eq
(other[, axis, level])Get Equal to of dataframe and other, element-wise (binary operator eq).
eval
(expr[, inplace])Evaluate a string describing operations on DataFrame columns.
explode
()Explode multi-part geometries into multiple single geometries.
ffill
([axis, limit])Synonym for DataFrame.fillna() with method='ffill'.
fillna
([value, method, limit, axis])Fill NA/NaN values using the specified method.
first
(offset)Select initial periods of time series data based on a date offset.
floordiv
(other[, axis, level, fill_value])Get Integer division of dataframe and other, element-wise (binary operator floordiv).
ge
(other[, axis, level])Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).
geohash
([as_string, precision])Calculate geohash based on the middle points of the geometry bounds for a given precision.
geom_almost_equals
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True if each aligned geometry is approximately equal to other.
geom_equals
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry equal to other.
geom_equals_exact
(other, tolerance)Return True for all geometries that equal aligned other to a given tolerance, else False.
get_dtype_counts
()
get_ftype_counts
()
get_partition
(n)Get a dask DataFrame/Series representing the nth partition.
groupby
([by, group_keys, sort, observed, dropna])Group DataFrame using a mapper or by a Series of columns.
gt
(other[, axis, level])Get Greater than of dataframe and other, element-wise (binary operator gt).
head
([n, npartitions, compute])First n rows of the dataset
hilbert_distance
([total_bounds, level])Calculate the distance along a Hilbert curve.
idxmax
([axis, skipna, split_every])Return index of first occurrence of maximum over requested axis.
idxmin
([axis, skipna, split_every])Return index of first occurrence of minimum over requested axis.
info
([buf, verbose, memory_usage])Concise summary of a Dask DataFrame.
interpolate
(distance[, normalized])Return a point at the specified distance along each geometry
intersection
(other, *args, **kwargs)Returns a GeoSeries of the intersection of points in each aligned geometry with other.
intersects
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that intersects other.
isin
(values)Whether each element in the DataFrame is contained in values.
isna
()Detect missing values.
isnull
()Detect missing values.
items
()Iterate over (column name, Series) pairs.
iterrows
()Iterate over DataFrame rows as (index, Series) pairs.
itertuples
([index, name])Iterate over DataFrame rows as namedtuples.
join
(other[, on, how, lsuffix, rsuffix, ...])Join columns of another DataFrame.
kurtosis
([axis, fisher, bias, nan_policy, ...])Return unbiased kurtosis over requested axis.
last
(offset)Select final periods of time series data based on a date offset.
le
(other[, axis, level])Get Less than or equal to of dataframe and other, element-wise (binary operator le).
lt
(other[, axis, level])Get Less than of dataframe and other, element-wise (binary operator lt).
map_overlap
(func, before, after, *args, **kwargs)Apply a function to each partition, sharing rows with adjacent partitions.
map_partitions
(func, *args, **kwargs)Apply Python function on each DataFrame partition.
mask
(cond[, other])Replace values where the condition is True.
max
([axis, skipna, split_every, out, ...])Return the maximum of the values over the requested axis.
mean
([axis, skipna, split_every, dtype, ...])Return the mean of the values over the requested axis.
melt
([id_vars, value_vars, var_name, ...])Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set.
memory_usage
([index, deep])Return the memory usage of each column in bytes.
memory_usage_per_partition
([index, deep])Return the memory usage of each partition
merge
(right[, how, on, left_on, right_on, ...])Merge the DataFrame with another DataFrame
min
([axis, skipna, split_every, out, ...])Return the minimum of the values over the requested axis.
mod
(other[, axis, level, fill_value])Get Modulo of dataframe and other, element-wise (binary operator mod).
mode
([dropna, split_every])Get the mode(s) of each element along the selected axis.
morton_distance
([total_bounds, level])Calculate the distance of geometries along the Morton curve
mul
(other[, axis, level, fill_value])Get Multiplication of dataframe and other, element-wise (binary operator mul).
ne
(other[, axis, level])Get Not equal to of dataframe and other, element-wise (binary operator ne).
nlargest
([n, columns, split_every])Return the first n rows ordered by columns in descending order.
notnull
()Detect existing (non-missing) values.
nsmallest
([n, columns, split_every])Return the first n rows ordered by columns in ascending order.
nunique
([split_every, dropna, axis])Count number of distinct elements in specified axis.
nunique_approx
([split_every])Approximate number of unique rows.
overlaps
(other, *args, **kwargs)Returns True for all aligned geometries that overlap other, else False.
persist
(**kwargs)Persist this dask collection into memory
pipe
(func, *args, **kwargs)Apply func(self, *args, **kwargs).
pivot_table
([index, columns, values, aggfunc])Create a spreadsheet-style pivot table as a DataFrame.
pop
(item)Return item and drop from frame.
pow
(other[, axis, level, fill_value])Get Exponential power of dataframe and other, element-wise (binary operator pow).
prod
([axis, skipna, split_every, dtype, ...])Return the product of the values over the requested axis.
product
([axis, skipna, split_every, dtype, ...])Return the product of the values over the requested axis.
project
(other, *args, **kwargs)Return the distance along each geometry nearest to other
quantile
([q, axis, method])Approximate row-wise and precise column-wise quantiles of DataFrame
query
(expr, **kwargs)Filter dataframe with complex expression
radd
(other[, axis, level, fill_value])Get Addition of dataframe and other, element-wise (binary operator radd).
random_split
(frac[, random_state, shuffle])Pseudorandomly split dataframe into different pieces row-wise
rdiv
(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator rtruediv).
reduction
(chunk[, aggregate, combine, meta, ...])Generic row-wise reductions.
relate
(other, *args, **kwargs)Returns the DE-9IM intersection matrices for the geometries
rename
([index, columns])Alter axes labels.
rename_geometry
(col)Renames the GeoDataFrame geometry column to the specified name.
repartition
([divisions, npartitions, ...])Repartition dataframe along new divisions
replace
([to_replace, value, regex])Replace values given in to_replace with value.
representative_point
()Returns a GeoSeries of (cheaply computed) points that are guaranteed to be within each geometry.
resample
(rule[, closed, label])Resample time-series data.
reset_index
([drop])Reset the index to the default index.
rfloordiv
(other[, axis, level, fill_value])Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).
rmod
(other[, axis, level, fill_value])Get Modulo of dataframe and other, element-wise (binary operator rmod).
rmul
(other[, axis, level, fill_value])Get Multiplication of dataframe and other, element-wise (binary operator rmul).
rolling
(window[, min_periods, center, ...])Provides rolling transformations.
rotate
(angle[, origin, use_radians])Returns a GeoSeries with rotated geometries.
round
([decimals])Round a DataFrame to a variable number of decimal places.
rpow
(other[, axis, level, fill_value])Get Exponential power of dataframe and other, element-wise (binary operator rpow).
rsub
(other[, axis, level, fill_value])Get Subtraction of dataframe and other, element-wise (binary operator rsub).
rtruediv
(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator rtruediv).
sample
([n, frac, replace, random_state])Random sample of items
scale
([xfact, yfact, zfact, origin])Returns a GeoSeries with scaled geometries.
select_dtypes
([include, exclude])Return a subset of the DataFrame's columns based on the column dtypes.
sem
([axis, skipna, ddof, split_every, ...])Return unbiased standard error of the mean over requested axis.
set_crs
(value[, allow_override])Set the Coordinate Reference System (CRS) of a GeoSeries.
set_geometry
(col)Set the GeoDataFrame geometry using either an existing column or the specified input.
set_index
(*args, **kwargs)Set the DataFrame index (row labels) using an existing column.
shift
([periods, freq, axis])Shift index by desired number of periods with an optional time freq.
shuffle
(on[, npartitions, max_branch, ...])Rearrange DataFrame into new partitions
simplify
(*args, **kwargs)Returns a GeoSeries containing a simplified representation of each geometry.
sjoin
(df[, how, predicate])Spatial join of two GeoDataFrames.
skew
([xs, ys, origin, use_radians])Returns a GeoSeries with skewed geometries.
sort_values
(by[, npartitions, ascending, ...])Sort the dataset by a single column.
spatial_shuffle
([by, level, ...])Shuffle the data into spatially consistent partitions.
squeeze
([axis])Squeeze 1 dimensional axis objects into scalars.
std
([axis, skipna, ddof, split_every, ...])Return sample standard deviation over requested axis.
sub
(other[, axis, level, fill_value])Get Subtraction of dataframe and other, element-wise (binary operator sub).
sum
([axis, skipna, split_every, dtype, out, ...])Return the sum of the values over the requested axis.
symmetric_difference
(other, *args, **kwargs)Returns a GeoSeries of the symmetric difference of points in each aligned geometry with other.
tail
([n, compute])Last n rows of the dataset
to_bag
([index, format])Create Dask Bag from a Dask DataFrame
to_crs
([crs, epsg])Returns a GeoSeries with all geometries transformed to a new coordinate reference system.
to_csv
(filename, **kwargs)Store Dask DataFrame to CSV files
to_dask_array
([lengths, meta])Convert a dask DataFrame to a dask array.
Create a dask.dataframe object from a dask_geopandas object
to_delayed
([optimize_graph])Convert into a list of dask.delayed objects, one per partition.
to_feather
(path, *args, **kwargs)See dask_geopandas.to_feather docstring for more information
to_hdf
(path_or_buf, key[, mode, append])Store Dask Dataframe to Hierarchical Data Format (HDF) files
to_html
([max_rows])Render a DataFrame as an HTML table.
to_json
(filename, *args, **kwargs)See dd.to_json docstring for more information
to_orc
(path, *args, **kwargs)See dd.to_orc docstring for more information
to_parquet
(path, *args, **kwargs)Store Dask.dataframe to Parquet files
to_records
([index, lengths])Create Dask Array from a Dask Dataframe
to_sql
(name, uri[, schema, if_exists, ...])See dd.to_sql docstring for more information
to_string
([max_rows])Render a DataFrame to a console-friendly tabular output.
to_timestamp
([freq, how, axis])Cast to DatetimeIndex of timestamps, at beginning of period.
to_wkb
([hex])Encode all geometry columns in the GeoDataFrame to WKB.
to_wkt
(**kwargs)Encode all geometry columns in the GeoDataFrame to WKT.
touches
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that touches other.
translate
([xoff, yoff, zoff])Returns a GeoSeries with translated geometries.
truediv
(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator truediv).
union
(other, *args, **kwargs)Returns a GeoSeries of the union of points in each aligned geometry with other.
var
([axis, skipna, ddof, split_every, ...])Return unbiased variance over requested axis.
visualize
([filename, format, optimize_graph])Render the computation of this object's task graph using graphviz.
where
(cond[, other])Replace values where the condition is False.
within
(other, *args, **kwargs)Returns a Series of dtype('bool') with value True for each aligned geometry that is within other.
Attributes
area
Returns a Series containing the area of each geometry in the GeoSeries expressed in the units of the CRS.
attrs
Dictionary of global attributes of this dataset.
axes
boundary
Returns a GeoSeries of lower dimensional objects representing each geometry's set-theoretic boundary.
bounds
Returns a DataFrame with columns minx, miny, maxx, maxy values containing the bounds for each geometry.
centroid
Returns a GeoSeries of points representing the centroid of each geometry.
columns
convex_hull
Returns a GeoSeries of geometries representing the convex hull of each geometry.
The Coordinate Reference System (CRS) represented as a pyproj.CRS object.
Coordinate based indexer to select by intersection with bounding box.
divisions
Tuple of npartitions + 1 values, in ascending order, marking the lower/upper bounds of each partition's index.
dtypes
Return data types
empty
envelope
Returns a GeoSeries of geometries representing the envelope of each geometry.
exterior
Returns a GeoSeries of LinearRings representing the outer boundary of each polygon in the GeoSeries.
geom_type
Returns a Series of strings specifying the Geometry Type of each object.
geometry
has_z
Returns a Series of dtype('bool') with value True for features that have a z-component.
iloc
Purely integer-location based indexing for selection by position.
index
Return dask Index instance
interiors
Returns a Series of List representing the inner rings of each polygon in the GeoSeries.
is_empty
Returns a Series of dtype('bool') with value True for empty geometries.
is_ring
Returns a Series of dtype('bool') with value True for features that are closed.
is_simple
Returns a Series of dtype('bool') with value True for geometries that do not cross themselves.
is_valid
Returns a Series of dtype('bool') with value True for geometries that are valid.
known_divisions
Whether divisions are already known
length
Returns a Series containing the length of each geometry expressed in the units of the CRS.
loc
Purely label-location based indexer for selection by label.
ndim
Return dimensionality
npartitions
Return number of partitions
partitions
Slice dataframe by partitions
shape
Return a tuple representing the dimensionality of the DataFrame.
sindex
Need to figure out how to concatenate spatial indexes
size
Size of the Series or DataFrame as a Delayed object.
spatial_partitions
The spatial extent of each of the partitions of the dask GeoDataFrame.
total_bounds
Returns a tuple containing minx, miny, maxx, maxy values for the bounds of the series as a whole.
type
Return the geometry type of each geometry in the GeoSeries
unary_union
Returns a geometry containing the union of all geometries in the GeoSeries.
values
Return a dask.array of the values of this dataframe
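The morton_distance method listed above orders geometries along a Morton (Z-order) curve. The general idea can be sketched in pure Python as below. This is an illustration of bit interleaving under stated assumptions, not dask_geopandas's implementation: the helper names interleave_bits and morton_code_sketch are hypothetical, as is the choice to place x in the even bits and to scale coordinates onto a 2**level grid.

```python
def interleave_bits(x: int, y: int, level: int = 16) -> int:
    """Interleave the lowest `level` bits of x and y into one Z-order code.

    Assumption: x occupies the even bit positions and y the odd ones.
    """
    code = 0
    for bit in range(level):
        code |= ((x >> bit) & 1) << (2 * bit)
        code |= ((y >> bit) & 1) << (2 * bit + 1)
    return code


def morton_code_sketch(px, py, total_bounds, level=16):
    """Map a point to a Z-order code within the given (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = total_bounds
    cells = (1 << level) - 1  # largest integer coordinate on the grid
    # Guard against zero-width or zero-height bounds.
    width = (maxx - minx) or 1.0
    height = (maxy - miny) or 1.0
    # Scale the point onto the integer grid, then interleave the bits.
    ix = int((px - minx) / width * cells)
    iy = int((py - miny) / height * cells)
    return interleave_bits(ix, iy, level)


bounds = (0.0, 0.0, 10.0, 10.0)
print(morton_code_sketch(0.0, 0.0, bounds, level=2))
print(morton_code_sketch(10.0, 10.0, bounds, level=2))
```

Points that are close in space tend to receive nearby codes, which is why a curve-based distance is a reasonable sort key for methods such as spatial_shuffle.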