Abstract
Modern proteins are remarkable polymers built from a 20-amino-acid alphabet and shaped by billions of years of evolution. Yet in Earth's prebiotic era, several amino acids, particularly the canonical basic residues lysine, arginine, and histidine, were likely scarce, whereas acidic amino acids were more readily available. Moreover, protein-length polymers were inaccessible before ribosomal synthesis emerged, and peptides were probably short, statistical, and non-templated. How the earliest proteins and enzymes emerged under these constraints remains a central question in origins-of-life research. Here, we synthesize random peptide libraries that span a broad electrostatic spectrum and systematically interrogate their properties. The data indicate that a prebiotically plausible acidic alphabet stands out in its propensity for secondary structure and for higher-order soluble assembly via formation of β-sheets. These assemblies arise from highly heterogeneous sequences, plausibly reflecting the statistical diversity of early Earth peptides, and differ from amyloid structures in both solubility and morphology. Our results also show that the acidic random peptides have an inherent capacity to bind certain metal ions, implying a potential to contribute to prebiotic catalysis. Using a large language model for structural prediction, we further show that peptides composed of this acidic alphabet exhibit a strong propensity for compact conformations. Altogether, this study shows that unevolved sequences of prebiotically abundant amino acids can readily produce foldable, self-assembling polymers, potentially providing a stepping stone toward the first proteins prior to the onset of purifying selection.