Abstract
Brazil is the world's largest producer of sugarcane (Saccharum officinarum), accounting for approximately 40% of global production, and the state of São Paulo is responsible for more than half of the national output due to its high level of mechanization. Despite the crop's economic importance, publicly available datasets that integrate sugarcane yield with production environment information remain scarce. This is the first freely available dataset to combine crop yield, meteorological, and production environment data with a large number of observations from multiple plots, harvest cycles, and time steps, and to identify the exact locations of the fields: 12 commercial plots in the northeast of São Paulo State, Brazil. Crop yield and production environment data were shared by a sugar and ethanol mill operating in the region and were measured at the plot level on six farms, with two plots per farm. The number of harvests recorded differs among plots. These records are complemented by Sentinel-2 satellite images, obtained based on the plot shapefiles, and by meteorological data covering the same locations and the same periods of sugarcane cultivation. For each period between the planting and harvest dates, the Sentinel-2 RGB bands were downloaded as single-band images and combined into a single image. The same procedure was applied to a meteorological dataset: the closest meteorological station was selected to obtain data for the same days between the planting and harvest dates. Given the lack of integrated sugarcane datasets, this resource provides a valuable foundation for studies on crop yield prediction, analysis of production environments, and the development and evaluation of data-driven models in precision agriculture.
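The meteorological matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "closest" means smallest great-circle (haversine) distance from a plot coordinate, that the planting and harvest dates are both included in the window, and that station and weather records are simple dictionaries. The function names, station identifiers, coordinates, and rainfall values are all hypothetical and do not come from the dataset.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius


def nearest_station(plot_lat, plot_lon, stations):
    """Return the station dict with the smallest distance to the plot."""
    return min(
        stations,
        key=lambda s: haversine_km(plot_lat, plot_lon, s["lat"], s["lon"]),
    )


def records_in_cycle(records, planting, harvest):
    """Keep daily records dated from planting to harvest (inclusive; an assumption)."""
    return [r for r in records if planting <= r["date"] <= harvest]


# Hypothetical values for illustration only; none come from the dataset.
stations = [
    {"id": "ST_A", "lat": -21.20, "lon": -47.80},
    {"id": "ST_B", "lat": -20.50, "lon": -47.40},
]
daily = {
    "ST_A": [
        {"date": date(2020, 3, 1), "rain_mm": 4.2},
        {"date": date(2020, 6, 15), "rain_mm": 0.0},
        {"date": date(2021, 9, 30), "rain_mm": 1.1},
    ],
    "ST_B": [],
}

# Match one hypothetical plot to its nearest station, then trim to the crop cycle.
station = nearest_station(-21.10, -47.75, stations)
cycle = records_in_cycle(daily[station["id"]], date(2020, 4, 1), date(2021, 8, 31))
print(station["id"], len(cycle))
```

A plot centroid derived from the shapefile would be a natural choice for the plot coordinate, although the text does not state which reference point was used.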