Reduce ImageCollection#
The Earth Engine API provides two methods to reduce an image over regions: reduceRegion and reduceRegions. geetools makes these methods available for ee.ImageCollection objects as well.
Set up environment#
Install the required libraries if necessary, then run the import statements before anything else.
# uncomment if installation of libs is necessary
# !pip install earthengine-api geetools
import ee
import geetools #noqa: F401
import geopandas as gpd
from matplotlib import pyplot as plt
import pandas as pd
# uncomment if initialization is required
# ee.Initialize()
Example data#
The following examples rely on an ee.FeatureCollection composed of three ecoregion features that define the regions over which the image data is reduced. The ee.ImageCollection holds the MODIS vegetation indices, subset to the images acquired in January and February 2010.
# Import the example feature collection and drop the data property.
ecoregions = (
    ee.FeatureCollection("projects/google/charts_feature_example")
    .select(["label", "value", "warm"])
)
# Load the MODIS vegetation indices and keep a subset of 4 images.
vegIndices = (
    ee.ImageCollection("MODIS/061/MOD13A1")
    .filter(ee.Filter.date("2010-01-01", "2010-02-28"))
    .select(["NDVI", "EVI"])
)
Reduce over single region#
Using reduceRegion you can reduce an ee.ImageCollection over a single region.
The function returns an ee.Dictionary in which the reduced values of each band are grouped under the ID of the image they come from. It has the following shape:
{
    "image1": {"band1": value1, "band2": value2, ...},
    "image2": {"band1": value1, "band2": value2, ...},
}
where image* is the ID of the image, read from the specified property and cast to a string, and band* is the name of the band.
result = vegIndices.geetools.reduceRegion(
    reducer=ee.Reducer.mean(),
    idProperty="system:time_start",
    idType=ee.Date,
    geometry=ecoregions.filter(ee.Filter.eq("label", "Forest")).geometry(),
    scale=500,
)
result.getInfo()
{'2010-01-01T00-00-00': {'EVI': 1912.5637702562262, 'NDVI': 3273.672377532786},
'2010-01-17T00-00-00': {'EVI': 3276.7642398350026, 'NDVI': 7331.223758333469},
'2010-02-02T00-00-00': {'EVI': 2963.2602251579947, 'NDVI': 7845.514550793475},
'2010-02-18T00-00-00': {'EVI': 3276.4948281435295, 'NDVI': 7951.898544727663}}
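The keys in the output above look like the system:time_start timestamps rendered as year-month-day, a literal "T", then hour-minute-second separated by hyphens. The following is a minimal sketch in plain Python, independent of Earth Engine, of how such a key relates to an epoch-millisecond timestamp. The key pattern is inferred from the output rather than taken from the geetools source, and the helper names millis_to_key and key_to_datetime are hypothetical.

```python
from datetime import datetime, timezone

# Assumed key pattern, inferred from the printed output (not from the geetools source).
KEY_FORMAT = "%Y-%m-%dT%H-%M-%S"


def millis_to_key(millis: int) -> str:
    """Format epoch milliseconds (the unit of system:time_start) as a dictionary key."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).strftime(KEY_FORMAT)


def key_to_datetime(key: str) -> datetime:
    """Parse a key back into a timezone-aware UTC datetime."""
    return datetime.strptime(key, KEY_FORMAT).replace(tzinfo=timezone.utc)


# 1262304000000 ms after the Unix epoch is 2010-01-01 00:00:00 UTC.
key = millis_to_key(1262304000000)
print(key)
print(key_to_datetime(key).isoformat())
```

Parsing the keys back into datetime objects is useful as soon as the values need to be sorted or plotted along a time axis.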
The dictionary returned by reduceRegion can easily be converted into a pandas DataFrame, which opens the door to any tool from the Python ecosystem:
df = pd.DataFrame(result.getInfo()).transpose()
df.head(15)
| | EVI | NDVI |
|---|---|---|
| 2010-01-01T00-00-00 | 1912.563770 | 3273.672378 |
| 2010-01-17T00-00-00 | 3276.764240 | 7331.223758 |
| 2010-02-02T00-00-00 | 2963.260225 | 7845.514551 |
| 2010-02-18T00-00-00 | 3276.494828 | 7951.898545 |
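As a sketch of a possible next step, the DataFrame can be turned into a proper time series. The example below hard-codes the values printed above so that it runs without an Earth Engine connection. It assumes the documented MOD13A1 scale factor of 0.0001 for NDVI and EVI, and the helper name to_time_series is hypothetical.

```python
import pandas as pd

# Values copied from the reduceRegion output above so the sketch runs offline.
sample = {
    "2010-01-01T00-00-00": {"EVI": 1912.563770, "NDVI": 3273.672378},
    "2010-01-17T00-00-00": {"EVI": 3276.764240, "NDVI": 7331.223758},
    "2010-02-02T00-00-00": {"EVI": 2963.260225, "NDVI": 7845.514551},
    "2010-02-18T00-00-00": {"EVI": 3276.494828, "NDVI": 7951.898545},
}


def to_time_series(reduced: dict, scale_factor: float = 0.0001) -> pd.DataFrame:
    """Build a date-indexed DataFrame and rescale the raw band values."""
    df = pd.DataFrame.from_dict(reduced, orient="index")
    # The keys use hyphens instead of colons in the time part.
    df.index = pd.to_datetime(df.index, format="%Y-%m-%dT%H-%M-%S")
    df.index.name = "date"
    return df.sort_index() * scale_factor


ts = to_time_series(sample)
print(ts)
```

With a DatetimeIndex in place, calls such as ts.plot() or ts.resample("MS").mean() work directly on the reduced values.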
Reduce over multiple regions#
Using reduceRegions you can reduce an ee.ImageCollection over multiple regions.
The result is an ee.FeatureCollection with 2 primary keys:
- The idProperty, used as the key for images and stored in the final features as image_id.
- The id of the feature, stored in the final features as feature_id.
Each feature has the same properties as in the original feature collection, plus the reduced values of the corresponding image over the feature geometry. The user can set all the parameters of the reduction and choose which image property is used as the id of the image.
result = vegIndices.geetools.reduceRegions(
    reducer=ee.Reducer.mean(),
    idProperty="system:time_start",
    idType=ee.Date,
    collection=ecoregions,
    scale=500,
)
# we can display the result as a table using geopandas
gdf = gpd.GeoDataFrame.from_features(result.getInfo()["features"])
gdf.head(15)
| | geometry | EVI | NDVI | feature_id | image_id | label | value | warm |
|---|---|---|---|---|---|---|---|---|
| 0 | POLYGON ((-109.21 31.42, -108.3 31.42, -108.3 ... | 1014.455318 | 1797.930408 | 00000000000000000000 | 2010-01-01T00-00-00 | Desert | 0 | 1 |
| 1 | POLYGON ((-109.21 31.42, -108.3 31.42, -108.3 ... | 971.166098 | 1900.816413 | 00000000000000000000 | 2010-01-17T00-00-00 | Desert | 0 | 1 |
| 2 | POLYGON ((-109.21 31.42, -108.3 31.42, -108.3 ... | 973.384019 | 1825.576545 | 00000000000000000000 | 2010-02-02T00-00-00 | Desert | 0 | 1 |
| 3 | POLYGON ((-109.21 31.42, -108.3 31.42, -108.3 ... | 986.248517 | 1790.578384 | 00000000000000000000 | 2010-02-18T00-00-00 | Desert | 0 | 1 |
| 4 | POLYGON ((-122.73 43.45, -122.28 43.45, -122.2... | 1912.563770 | 3273.672378 | 00000000000000000001 | 2010-01-01T00-00-00 | Forest | 1 | 1 |
| 5 | POLYGON ((-122.73 43.45, -122.28 43.45, -122.2... | 3276.764240 | 7331.223758 | 00000000000000000001 | 2010-01-17T00-00-00 | Forest | 1 | 1 |
| 6 | POLYGON ((-122.73 43.45, -122.28 43.45, -122.2... | 2963.260225 | 7845.514551 | 00000000000000000001 | 2010-02-02T00-00-00 | Forest | 1 | 1 |
| 7 | POLYGON ((-122.73 43.45, -122.28 43.45, -122.2... | 3276.494828 | 7951.898545 | 00000000000000000001 | 2010-02-18T00-00-00 | Forest | 1 | 1 |
| 8 | POLYGON ((-101.81 41.7, -100.53 41.7, -100.53 ... | 704.321312 | 1057.387456 | 00000000000000000002 | 2010-01-01T00-00-00 | Grassland | 2 | 0 |
| 9 | POLYGON ((-101.81 41.7, -100.53 41.7, -100.53 ... | 1231.157133 | 2044.010150 | 00000000000000000002 | 2010-01-17T00-00-00 | Grassland | 2 | 0 |
| 10 | POLYGON ((-101.81 41.7, -100.53 41.7, -100.53 ... | 1111.532233 | 1770.138801 | 00000000000000000002 | 2010-02-02T00-00-00 | Grassland | 2 | 0 |
| 11 | POLYGON ((-101.81 41.7, -100.53 41.7, -100.53 ... | 1055.246577 | 1797.979795 | 00000000000000000002 | 2010-02-18T00-00-00 | Grassland | 2 | 0 |
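The table above is in long form, with one row per (region, image) pair. As a minimal sketch of one way to reshape it, pandas can pivot it into one column per region. The rows below are copied by hand from the table (two dates only, geometry omitted) so that the example runs without Earth Engine; the variable names are illustrative.

```python
import pandas as pd

# A few rows copied from the table above; the geometry column is left out.
records = [
    {"label": "Desert", "image_id": "2010-01-01T00-00-00", "NDVI": 1797.930408},
    {"label": "Desert", "image_id": "2010-01-17T00-00-00", "NDVI": 1900.816413},
    {"label": "Forest", "image_id": "2010-01-01T00-00-00", "NDVI": 3273.672378},
    {"label": "Forest", "image_id": "2010-01-17T00-00-00", "NDVI": 7331.223758},
    {"label": "Grassland", "image_id": "2010-01-01T00-00-00", "NDVI": 1057.387456},
    {"label": "Grassland", "image_id": "2010-01-17T00-00-00", "NDVI": 2044.010150},
]

long_table = pd.DataFrame(records)

# One row per image, one column per region, NDVI as the cell value.
wide_table = long_table.pivot(index="image_id", columns="label", values="NDVI")
print(wide_table)
```

The wide layout makes it straightforward to compare regions date by date, for instance with wide_table.plot().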
From the reduceRegions result you can also build a chronological sequence of maps of the regions, or other custom figures that are not covered by the plot_* methods:
# Create a figure with 1 row and 3 columns
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(10, 6))  # adjust figsize as needed
# Flatten the array of axes for easier access
axes_flat = axes.flatten()
# Get the list of all the available dates
dates = vegIndices.aggregate_array("system:time_start").distinct()
# Plot the mean NDVI of each region for the first 3 dates
for i in range(3):
    ax = axes_flat[i]
    image_id = ee.Date(dates.get(i)).format("YYYY-MM-dd'T'HH-mm-ss")
    fc = result.filter(ee.Filter.eq("image_id", image_id))
    fc.geetools.plot(ax=ax, cmap="viridis", property="NDVI")
    ax.set_title(image_id.getInfo())